Failing the grade

A recent report that American students rank lower than their international counterparts in math and science proficiency should give education professionals pause. The results, based on a standardized test administered every four years to students around the globe, have long-term ramifications for the future of American workers in the globalized economy. The reason students in the U.S. lag behind those in Asia and elsewhere, according to experts, is that state math and science curricula are not as rigorous as those in the countries where students excel. What educators in this country do with such data remains to be seen, but the study offers an objective measure of achievement that can serve as a basis for future reforms. And that's what educational testing should be: an evaluative tool to help guide strategies for improvement.

North Carolina's school testing program has the potential to serve students in a similar way. But its punitive, high-stakes design serves more to validate superficial state administrative goals than to really benefit the kids who take the tests.

The situation has improved somewhat over the past few years. In particular, the legislature eliminated the odious mandate to hold back any child who failed to pass the end-of-grade tests, substituting a more flexible model. Parents and teachers still complain, however, that showing continued improvement in standardized test scores remains the focal point of the public education system at the expense of other, more important goals. The phrase "teaching to the test," which describes the common practice of gearing months of lessons primarily toward test preparation, has become as prominent in educational parlance as the spelling bee.

The development of critical thinking and leadership skills, surely more determinative of success as an adult than, say, an intimate knowledge of North Carolina history, is practically irrelevant in a test-driven environment. "We're creating a class of worker kids who don't think," says Daniella Cook, education fellow at the N.C. Child Advocacy Institute and a critic of the state's testing program.

On the other hand, you won't find many teachers or administrators lobbying to kill the most destructive aspect of the state testing program--the awarding of bonus pay to those teachers whose schools show the "expected" level of progress or better in test scores from year to year. Last year, 75 percent of the state's schools improved sufficiently to qualify their teachers for the bonus; in 2002-03, the figure was a hefty 94 percent.

On its face, the bonus plan rewards those teachers whose students achieve measurable results. Based on relative improvement rather than an absolute score, it doesn't unfairly benefit those teachers who have the good fortune to work in the Chapel Hill-Carrboro schools or other systems where the students traditionally do well on standardized tests. And with budget-crunched legislators unwilling to raise teacher pay to more competitive levels, the bonus program puts a chunk of much-needed cash in the pockets of front-line educators.

Those facts may quiet opposition to test-related bonuses, but they don't offset the insidious negatives. The bonus program only encourages teaching to the test--if a healthy fraction of a teacher's take-home pay hinges on test scores, how many will subordinate their self-interest and avoid the temptation to focus on the tests, thereby shortchanging other priorities? It also encourages good teachers who work in schools with historically under-performing kids--the same schools, of course, with the highest percentages of poor and minority students--to flee to wealthier schools within their districts, or to more privileged districts altogether. That leaves poor and minority children with a disproportionate number of novice teachers, which in turn perpetuates the longstanding achievement gap between wealthy white and poor minority students.

Tying teacher and administrator pay to test results also encourages cheating, and not the classic copying-off-the-smart-kid kind of cheating. While illicitly enhancing scores school-wide is undoubtedly rare, the opportunity certainly exists on several fronts, as school administrative sources contacted for this column readily confirmed. Each school district sets test schedules, and completed tests often sit in individual schools for several days before being turned over to the district's accountability director for scanning. In Houston, which employs a similar incentives structure, the principal of a school that had performed poorly over the years was discovered to have sequestered herself in her office with the tests and changed many of the answers, which resulted in her school achieving the exalted--and lucrative--"exemplary" status.

If salaries (especially in poor districts that can't afford to supplement teacher pay as some wealthier districts do) aren't sufficient to recruit and retain good teachers, resources need to be found for that purpose. Paying them off to keep quiet about the shortcomings of the testing program is just a cynical way for the state to avoid its obligations.

The testing program offers other evidence of malfunction. Tests are created by outside consulting firms in conjunction with the Department of Public Instruction, and then tweaked internally before being administered. The tests must meet professional standards of reliability and validity, but within that framework it would be relatively easy to craft a test that would show the kind of improvements the state has seen annually. Does it really make sense that the majority of students in every system do substantially better every year?

State Director of Accountability Services Lou Fabrizio argues that it does make sense, since schools that do not perform up to expectations are given direct assistance by state and district personnel. DPI also distributes "pacing guides" and other curriculum aids to help teachers, ah, teach to the test. But that simple explanation doesn't entirely answer the question. And to what extent those interventions address the critical question of teaching method is unclear--as everyone has experienced, a skilled teacher can impart the same set of information more successfully than one who lacks those skills.

The low scores required to pass the tests have also come under recent scrutiny. To pass the end-of-grade math test, for example, eighth graders must answer only 33 percent of the questions correctly. In 2001, students had to answer only 28 percent of the questions correctly in order to pass; almost everyone did. This prompted Sam Miller, who chairs UNC-Greensboro's department of curriculum and instruction, to comment that the bar has been set so low "you could almost trip over it." The passing scores are set by the state and differ from test to test and grade to grade, furthering the impression that the bars are set to ensure a rosy result.

Even the best-laid plans to achieve "growth" in test scores sometimes go awry. In 2002, reading test scores across North Carolina dropped 20 percentage points. No matter: The state discarded the results, even while maintaining that the scores were valid. On the other hand, the board recently decided to let last year's sixth-grade reading scores stand, even though only two of the state's 388 middle schools met expected improvements. Still, a third of those schools qualified for teacher bonuses based on positive gains in other subjects.

These aberrations only prove that testing is at best an inexact science and should be treated that way. But the state education bureaucracy clings to testing as an end in and of itself, instead of as a means to an end. Don't expect changes any time soon--though one might expect the state to seek an outside evaluation of its testing program, the assessments to date have all been conducted by DPI. Not surprisingly, the agency has given itself high marks.

High-stakes testing was instituted to address real problems. Social promotion was rampant before the current testing program was devised in 1997, and stories of schools graduating functional illiterates were shockingly common. The program provided a measure of teacher as well as student performance that had hitherto been lacking. And it offered hard evidence that the achievement gap was more than a myth and spanned wealthy as well as poor school districts.

Nor do the system's flaws negate the intrinsic value of testing, which even the most jaded opponents agree is an important indicator of how well teachers, individual schools, school systems and the state are serving students. But until the system is overhauled and subjected to an independent evaluation, the testing program will have little value beyond propping up the political and professional aspirations of those who created it.