Results Say More About the Way Test Makers Decide to Measure Children's Knowledge

By Jo Craven McGinty

When New York and Kentucky rolled out the first tests aligned with the Common Core State Standards, the results were dismal: Most students failed the new standardized tests, in stark contrast to the old assessments, which the vast majority passed.

The results alarmed parents, but the scores on these new tests, just like those on earlier forms of assessment, reveal less about what children know than about the way the test makers decide to measure that knowledge.

The National Governors Association and Council of Chief State School Officers unveiled the Common Core standards in 2010, saying they were intended to raise academic standards, and the test scores so far appear to reflect the increased expectations.

When New York first administered Common Core tests in 2013, fewer than a third of students demonstrated "proficiency," considered a pass, in math and English. Kentucky recorded similar scores when it launched its assessments in 2012. And results are expected to be comparable in other states as they roll out new tests next year.

Experts warned, however, against reading too much into the proficiency labels or pass-fail numbers in assessing how much students were actually learning.

"It's the same set of kids," said Jonah Rockoff, an economist at Columbia Business School who studies school accountability. "It's the same set of teachers. Their school didn't change dramatically overnight."

In devising the grading scale for the new tests, New York used a "bookmark" method to identify four levels of achievement from "well below proficient" to "excels," according to Ken Wagner, deputy commissioner for assessment and curriculum at the New York Department of Education.

"We brought together 95 teachers from across the state," Mr. Wagner said. "We gave them the test the students took in order of difficulty from easiest to most difficult. You keep going until you say, gosh, I don't think a student at level one would get this correct, but someone at level two would."

At that point, each teacher dropped a "bookmark" and continued until the threshold for each performance level was identified. The panel, divided into math and English groups, repeated the process four times before arriving at final cut scores.
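
As a rough illustration of the procedure Mr. Wagner describes, the sketch below combines several panelists' bookmarks into one cut position per threshold. The article does not say how New York aggregated the bookmarks, so taking the median is an assumption made here for illustration, and the function name and placements are hypothetical.

```python
from statistics import median

def bookmark_cut_scores(bookmarks_by_panelist):
    """Combine panelists' bookmarks into one cut position per threshold.

    Each inner list holds one panelist's bookmark positions (1-based index
    into the booklet ordered from easiest to hardest item), one position per
    boundary between adjacent performance levels.
    Assumption: the panel's cut is the median placement; the article does
    not state the aggregation rule actually used.
    """
    n_thresholds = len(bookmarks_by_panelist[0])
    cuts = []
    for t in range(n_thresholds):
        placements = [panelist[t] for panelist in bookmarks_by_panelist]
        cuts.append(median(placements))
    return cuts

# Hypothetical placements from three panelists.
# Four performance levels imply three boundaries between them.
panel = [
    [12, 25, 40],
    [10, 27, 41],
    [14, 26, 38],
]
cuts = bookmark_cut_scores(panel)
print(cuts)
```

In practice a panel would repeat this over several rounds, as the article notes, with panelists revising their bookmarks between rounds.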

"It's a deep concept," Mr. Rockoff said. "How do you send a message to kids about what is good enough?"

Setting cut scores can indeed be somewhat arbitrary, and New York provides another example of how manipulating the scale changes the perception of how well children are performing.

In 2009, before the Common Core was introduced, 86% of students taking New York's standardized tests scored proficient or better in mathematics and 77% in English. Those impressive results led the state and outside observers to conclude that the tests were too easy.

To address the criticism, New York simply raised the cutoffs, so that passing required a higher score. With the new scale, 61% of students ranked proficient in math and 53% in English.
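
The effect of moving a cutoff can be sketched with a few lines of arithmetic. The scores and cutoffs below are made up for illustration and are not New York data; the point is only that the same set of scores yields a different "proficient" rate once the cutoff moves.

```python
def percent_proficient(scores, cut_score):
    """Share of students, in percent, scoring at or above the cut score."""
    passed = sum(1 for s in scores if s >= cut_score)
    return 100 * passed / len(scores)

# Hypothetical scale scores for ten students (not real data).
scores = [610, 640, 655, 670, 685, 700, 715, 730, 745, 760]

old_cut = 650  # hypothetical original cutoff
new_cut = 690  # hypothetical raised cutoff

old_rate = percent_proficient(scores, old_cut)
new_rate = percent_proficient(scores, new_cut)
print(old_rate, new_rate)
```

Nothing about the students changes between the two calls; only the label attached to their scores does.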

"We use words like 'proficient' that carry a lot of baggage," said Daniel Koretz, a Harvard professor who is an expert on educational assessment and testing policy. "People believe they know what these labels mean. It has nothing to do with how well kids are doing. It is a way of making a judgment about how performance is going to be labeled."

Various groups object to the Common Core standards, which were drafted with the aim of better preparing students in all states for college and career. Some regard the uniform approach as an intrusion on states' rights. Others object to the frequent testing. Teachers dislike being evaluated based on student test scores.

But such debates over Common Core have diverted attention from legitimate questions about whether U.S. schools adequately educate children, and there is compelling evidence that they do not.

The federal government's National Center for Education Statistics compared individual states' assessments with the "nation's report card," an assessment mandated by Congress to provide data on student achievement, and found that, by its standards, most students weren't proficient in reading or math. A respected international assessment of 65 countries shows U.S. students scoring below average in math and average in reading and science. And a study by the National Center for Education Statistics found that 20% of college freshmen in the U.S. need remedial classes.

"We have 210,000 students," Mr. Wagner said of New York. "Only 35% graduate-college- and career-ready. That means almost 140,000 students every year, year after year, are leaving the fourth year of high school not ready for what's next. That's a real stark reality."

Still, experts caution that concerned parents who want to understand more about their children's education should look beyond testing labels.

"The deeper question parents ought to be asking themselves," Mr. Rockoff said, "is 'Did I know what my kid was learning last year, and if I compare it to the new Common Core curriculum, am I happy or sad?'"------------------------------------Write to Jo Craven McGinty at Jo.McGinty@wsj.com***********************************************