Beware Hyped Claims of WMD
Cathy O’Neil, a blogger who is a weekly guest on the Slate Money podcast, has written a book called Weapons of Math Destruction, which argues that algorithms and mathematical models are destructive and specifically harm society’s most vulnerable members, because the models are inaccurate yet people trust them blindly. O’Neil singles out teacher value-added models, which measure the impact that teachers have on student test scores, as a prime example of destructive data analysis. Here at the Ed Impact Lab, we know all about the drawbacks and weaknesses of value-added models. I’ve built and used these models myself for teachers in the District of Columbia and Charleston, South Carolina. It’s an important public service to be on the lookout for uses that harm teachers or students. The problem is, many of the book’s claims about value-added models are hyped and don’t stand up to scrutiny.
O’Neil first misleads her readers by implying that value-added scores are the sole measure used for evaluating teachers and even firing them. I would challenge O’Neil to find a school district or state anywhere in the country that relies exclusively on value-added scores for evaluating teachers or making personnel decisions. DC Public Schools is the example offered in the book and in a Slate podcast promoting it. But value added was not the sole measure there. It initially made up 50% of a teacher’s evaluation score, while the other 50% came from classroom observations and other measures. The book notes the 50% figure but fails to mention that the percentage was lowered to 35%, and its discussion proceeds as if value added were used in isolation for high-stakes decisions. Furthermore, only about 20% of teachers receive value-added scores, so it’s misleading to describe value added as the key driver of most teachers’ ratings or of decisions based on those ratings. The idea that a mindless algorithm with no human intervention determined people’s lives is just incorrect.
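To make the weighting concrete, here is a minimal sketch of a two-part weighted score. It is not the actual IMPACT formula: the function name, the 1.0 to 4.0 scale, the two example scores, and the lumping of all non-value-added measures into a single number are assumptions made for illustration. Only the 50% and 35% weights come from the discussion above.

```python
def composite_score(value_added, other_measures, va_weight):
    """Weighted average of a value-added score and a combined score
    for all other measures (observations and so on).

    Both inputs are assumed to be on the same scale. va_weight is the
    share of the total assigned to value added (0.50 or 0.35 here).
    """
    if not 0.0 <= va_weight <= 1.0:
        raise ValueError("va_weight must be between 0 and 1")
    return va_weight * value_added + (1.0 - va_weight) * other_measures


# Hypothetical teacher with a weak value-added score and a strong
# score on the other measures. The scale and both numbers are invented.
va, other = 1.5, 3.5

score_50 = composite_score(va, other, va_weight=0.50)
score_35 = composite_score(va, other, va_weight=0.35)
print(round(score_50, 2), round(score_35, 2))  # prints: 2.5 2.8
```

Under these made-up numbers, the other measures pull the overall score well above the value-added score at either weight, which is the sense in which value added is one input among several rather than the whole rating.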
The second reason O’Neil thinks value-added models are WMDs, that they have no feedback loop, is also wrong. She argues that because there is no external information that can be used to validate or refute the outcome of the value-added model, the model leads to a self-perpetuating loop where “bad” teachers are fired and the model is deemed successful because it has gotten rid of bad teachers. It’s a disturbing claim, but it isn’t true. DC’s IMPACT system has always relied on multiple measures of teacher quality, including ratings from outside experts who observed teachers in the classroom. These other measures do serve as a feedback mechanism: if value added were consistently at odds with the other measures, that would be clear to district leaders. There is also outside pressure. Just as baseball teams that use analytics badly will start to lose games, a district that uses value added badly risks losing enrollment and funding: DCPS faces intense competition from charter schools, which have a 44% market share in DC, as well as from suburban schools. The competition for teachers is just as fierce. Recent papers by researchers at Stanford and the University of Virginia (here and here), as well as our own study, have shown that the quality of teaching in DCPS continues to increase, including among all those teachers for whom value added is not applicable.
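As a rough illustration of what “consistently at odds” could look like in data, the sketch below computes a Pearson correlation between value-added scores and observation ratings for the same teachers. This is not a description of how DCPS monitors IMPACT. The function names, the zero threshold, and the toy scores are all assumptions chosen for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


def measures_at_odds(value_added, observations, threshold=0.0):
    """Flag the case where the two measures do not move together.

    The threshold is an arbitrary assumption; a real review would pick
    it deliberately and look at far more than a single correlation.
    """
    return pearson(value_added, observations) <= threshold


# Toy scores for four hypothetical teachers (invented numbers).
va_scores = [1.0, 2.0, 3.0, 4.0]
obs_agreeing = [1.5, 2.0, 3.5, 3.0]   # roughly tracks value added
obs_reversed = [4.0, 3.0, 2.0, 1.0]   # ranks teachers the opposite way

print(measures_at_odds(va_scores, obs_agreeing))  # prints: False
print(measures_at_odds(va_scores, obs_reversed))  # prints: True
```

A persistent result like the second case would be the kind of signal that tells district leaders one of the measures is not working as intended.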
The book is also wrong about transparency. Its claim that “the model itself is a black box, its contents a fiercely guarded corporate secret” is false. The DC value-added model is well documented for anyone to examine. At this link you can find five technical reports for the model, documenting its evolution from 2010 through 2014, plus several working papers and journal articles that cover different aspects of value-added modeling. The Ed Impact Lab even makes value-added program code, very similar to what was used in DC, available with documentation for free download on our website. O’Neil never contacted me or anyone at Mathematica, but it’s not too late! Drop us a line at email@example.com. Or browse our publications here, which demonstrate the transparency of value-added models used at other sites around the country.
O’Neil has taken on an important issue in how statistical algorithms are used in decisions that affect people’s livelihoods. There is some agreement among experts about how to set up value-added models, but she is right to suggest that more discussion is needed about how best to make use of the information these models provide. Value-added models can indeed be used in ways that are unfair and counterproductive, but they provide important data points on teacher impact that can complement other kinds of information to improve teaching. We should be working to make sure the information is used appropriately, not throwing it out entirely.