The lessons of “The Accountability Plateau”

“Consequential accountability,” à la No Child Left Behind and the high-stakes state testing systems that preceded it, corresponded with a significant one-time boost in student achievement, particularly in primary and middle school math. Like the meteor that led to the decline of the dinosaurs and the rise of the mammals, results-based accountability appears to have shocked the education system. But its effect seems to be fading now, as earlier gains are maintained but not built upon. If we are to get another big jump in academic achievement, we’re going to need another shock to the system—another meteor from somewhere beyond our familiar solar system.

So argues Mark Schneider, a scholar, analyst, and friend whom we once affectionately (and appropriately) named “Stat stud.” Schneider, a political scientist, served as commissioner of the National Center for Education Statistics from 2005 to 2008, and is now affiliated with the American Institutes for Research and the American Enterprise Institute. In a Fordham-commissioned analysis released yesterday, he digs into twenty years of trends on the National Assessment of Educational Progress (NAEP), aka the “Nation’s Report Card.”

We originally asked Schneider to investigate the achievement record of the great state of Texas. At the time—it feels like just yesterday—Rick Perry was riding high in the polls, making an issue of education, and taking flak from Secretary Arne Duncan for running an inadequate school system. We wondered: Was Duncan right to feel “very, very badly” for the children of Texas? Had the state’s schools—once darlings of the standards movement and prototypes for NCLB—really slipped into decline since Perry took office? What do the NAEP data really show?

Schneider agreed to take on the project but quickly concluded that there’s a larger and more interesting story to tell than simply the saga of Texas. It was true, he noted, that Texas’s achievement gains slowed during the Perry years, particularly as compared to the rest of the country. But rather than pin that development on the governor, Schneider saw a more likely explanation: As an early adopter of standards, testing, and accountability, Texas got a head start on the big achievement gains that these initiatives brought in many places, and realized most of these gains in the 1990s, when George W. Bush was governor.

Take fourth-grade math, for example:

In 1992, fourth graders in Texas were performing on par with their peers nationwide on NAEP’s math assessment. In the 1993-94 school year, however, Texas introduced its strict “consequential accountability” system, and its effects can be seen in NAEP’s next iteration in 1996: Texas fourth graders scored about five points above students nationwide. That gap persisted through the 2000 NAEP assessment as well, and not just for Texas students as a whole. Rather, Texas’s black, Hispanic, and lowest-performing fourth graders—those groups that served as particular focal points of NCLB and accountability systems more generally—all scored at least thirteen points above their national peers in 2000. That translates to roughly a year’s worth of schooling, or more.

Indeed, the Lone Star State made Texas-sized gains from the early- to mid-1990s, as its accountability system got traction. But as other states followed suit in the late 90s, and as all remaining states introduced NCLB-style accountability systems after 2001, they too jumped onto the achievement fast-track, leading to sizable national gains. Between 2000 and 2003 alone, fourth-grade performance in math nationwide jumped a full nine points on NAEP’s scale.

But much as Texas led the nation in making positive strides throughout the 90s, so did it lead the nation in petering out thereafter. Texas’s fourth graders peaked on the NAEP math exam in 2005; their average score has wavered between scale scores of 240 and 242 in the last four iterations of NAEP. Students nationwide caught up to Texas by 2009. In that year, and in the 2011 assessment that followed, there was no significant difference in the scores posted by students in Texas and those nationwide. That’s true of sub-groups, too. While black, Hispanic, and low-performing students in Texas still led their peers nationwide in math by eight, six, and five points in 2011, respectively, their progress and the progress of those groups nationally has stalled over the last few assessment cycles. The same general pattern can be traced in eighth-grade math, though there the progress built upon fourth-grade gains, both in Texas and nationally, and thus occurred in later years; eighth-grade scores show no clear signs of stalling as yet. (Overall reading scores, however, have changed little over the last two decades, both nationally and in Texas, though some subgroups have shown significant gains.)

So while Texas’s progress has cooled in recent years, the same pattern can be observed in the country as a whole, only offset by about half a decade. It’s not that Perry was a worse “education governor” than Bush (or, for that matter, Ann Richards) before him, but that he presided over an accountability strategy that was running out of steam. Like the meteor that set in place a new state of equilibrium for our Earth, consequential accountability shocked our education system and improved math scores, but that system has now begun to settle into a new stasis. So sayeth the Stat stud.

It’s an intriguing argument, and one that deserves serious consideration, even more so as the U.S. marks the tenth anniversary of the enactment of NCLB and tries to figure out what the next version of that law should entail. If school-level accountability, as currently practiced, is no longer an effective lever for raising student achievement, then what is? If we need another “meteor” to disrupt the system, where should we look? Mark suggests that the Common Core and rigorous teacher evaluations have potential. We also see promise in the digital-learning revolution. But other shocks to the system might work even better. What are they?