Unmasking the Low Standards of High-Stakes Tests

In an apparent attempt to reclaim the offensive in the
standards-and-testing debate, Robert Schwartz and Matthew Gandal,
writing in these pages, have seriously misstated the position of those
raising the alarm about the consequences of high-stakes testing now
under way in several states. ("Higher Standards, Stronger
Tests: Don't Shoot the Messenger," Jan. 19, 2000.) They contend
that the critics "come out of the woodwork when passing rates on tests
are low" and "blame the messenger, rather than take on the shortcomings
that standards and assessment expose." The charge is not only untrue,
but seriously distorts the deep and troubling concerns expressed by
large numbers of parents, teachers, researchers, and legal experts
about the limitations of high-stakes-testing policies. We believe
passionately in raising student achievement, but we disagree that
raising standards requires standardization.

To set the record straight: We critics are not, as Messrs. Schwartz
and Gandal suggest, Johnny-come-latelies to the controversy over
standards and high-stakes testing. Nor are we back-benchers. Many of us
are front-line, in-the-trenches practitioners who work in schools and
whose track record with real students, not statistical proxies, attests
to our long-term commitment to high standards. The debate has been
one-sided. We don't command the same access to the media as do the
writers, politicians, or state education commissioners, but we have
consistently and urgently expressed alarm regarding the consequences of
linking single-event tests, whatever their nature, to critical decisions
such as graduation and diploma granting.

We have argued to whoever will listen that one size fits few:
Common sense tells us that there are, as Howard Gardner puts it,
multiple forms of excellence and multiple ways of finding out what kids
know and can do.

We have argued that since learners are complex, assessments should
be too, that no single instrument or set of instruments should be used
to make life-determining decisions. While proponents of
high-stakes tests admit that their instruments are imperfect, they seem
to be willing to live with rates of failure ("casualties," as one
state commissioner of education put it) that are alarming to us. There
are some predictions of dropout rates as high as 50 percent!

We have argued that high-stakes tests devalue high-quality
education. Just listen to teachers across the country detail how these
tests have trivialized curriculum, or pay a visit to schools throughout
New York state and quickly see how the tests have taken control of the
school day. Already, music and art classes have been pushed to the side
to make way for more test-oriented English classes.

Maybe this is why none of the high-performing independent schools in
New York state gave the state regents' tests in the past or plan to use
them now. Maybe this explains why the high-performing Westchester
County districts have been seeking a variance from the tests. Perhaps
this explains why, at a recent legislative hearing, the spokesman for
New York's archdiocesan schools reluctantly admitted that Roman
Catholic schools were giving the tests because they had been
"coerced."

And we have done more than argue. We have developed schools that
require students to demonstrate a high level of competence in ways that
meet and exceed state standards, and we have done it in ways that
engage young people far more effectively than high-stakes tests. We
have shown that, unlike high-stakes testing, which statistically favors
the more privileged, performance assessment results in genuine equity
by leveling the playing field even with youngsters for whom
standardized testing is often an obstacle.

So, enough of this talk about "shooting the messenger." Let's open
the envelope and examine what's inside. Let's see what these "tougher
standards" look like, as revealed on the English-language-arts regents'
exam required in New York for high school graduation and see whether
these high-stakes tests adequately or fairly represent high
standards:

Question No. 1 on the English-language-arts regents' exam (June
1999), which apparently is intended to test listening skills, asks
students to respond to a speech on the Suzuki method of teaching the
violin. Following the speech, which is read aloud by test proctors,
students are required to answer six multiple-choice questions like this
one: According to the Suzuki method, which step comes first? (1)
playing by ear, (2) reading written notes, (3) listening to music, (4)
writing original tunes. Then they are told to write a letter to their
"school board recommending whether or not the Suzuki violin method
should be taught in your district."

The content focus of this item (that is, the Suzuki violin method)
raises serious issues with respect to relevance. Some experts argue
that using such remote and artificial topics undermines students'
capacity to demonstrate the higher-order-thinking skills the test is
designed to measure. Students are expected to listen to the proctor
read a speech on a subject quite remote to most students, then write a
letter arguing a position about which they have only the most
superficial knowledge. This contrasts sharply with how we actually want
students to behave.

In our classes, students are taught to form opinions by conducting
research on multiple perspectives, asking questions, and analyzing the
findings. This test item totally obviates such rigor. In fact, it
raises serious questions of validity: What exactly is it that this test
item is supposed to measure? Is it content or test-taking expertise?
And how do test proponents defend the reliance on individual proctors
to read aloud the selection? It may be a small thing, but, as we have
observed in this election year, delivery counts a lot.

Question No. 2 is apparently designed to test for "information and
understanding." Using information on the history of child labor,
students are instructed to "write a report summarizing some provisions
of current New York state law regarding the employment of children and
discussing the conditions that may have led to those provisions."

Test-takers are provided with a chart and a 3½-page reading on
the topic and are required to answer 16 multiple-choice questions on
the reading material. Of these 16 questions, only one requires students
to apply inference skills, another is a "main idea" question, and the
remaining 14 involve recall of specific detail from the reading. Here's
a typical question: As a result of the 1938 Wages and Hours Act,
children are not allowed to (1) earn minimum wage, (2) work after
school, (3) hold dangerous jobs, (4) pay income taxes.

While we obviously want students to be able to locate and retrieve
information from a reading selection, this activity could never qualify
as the primary focus of an intellectually rigorous curriculum. The
research papers we assign our students require, in addition to in-depth
understanding of factual information, an emphasis on how to apply such
information for a given purpose, how to assess its value, and how to
weigh it against additional evidence, not merely to restate it.

Is this what we mean by higher standards? In our own schools,
students are taught to read and analyze materials reflecting multiple
perspectives, and they are expected to demonstrate the ability to
develop a logical argument based on those perspectives. Our schools
focus on teaching students ways to assess diverse points of view, to
act as historians, to identify reliable sources of information, to
debate ideas, and to demonstrate these skills in thoughtful, rigorously
argued research papers.

Consider these research papers recently completed by students in our
performance-assessment schools: "Did Lincoln Free the Slaves?" "How
Should Columbus Be Regarded Today?" "Why Did the United States Become
Involved in Vietnam?" "Did King Make the Movement or Did the Movement
Make King?" As these titles suggest, performance assessment provides a
vehicle by which to achieve the high standards called for by the state
board of regents in more authentic and effective ways than does the
English-language-arts exam.

Question No. 3 on that test asks students to read an essay and a
poem on "the influence of teachers on the lives of students." Students
are required to answer 10 multiple-choice questions before beginning their
own essays. Again, the multiple-choice questions place a far greater
emphasis on the recall of specific factual information than they do on
inferential or implied understandings (the more sophisticated reading
skills college work requires).

Both the content, distinguished by its artificiality, and the
language used in the task's instructions (write a "unified essay" with
a "controlling idea") require students to suspend their normal learning
behaviors, move into "test mode," and write about something quite
artificial, within time constraints that undermine quality work, in a
manner that is formulaic.

Finally, Question No. 4 on the test asks students to provide a
"valid interpretation" of a statement proposed as a "critical lens" by
comparing two works of literature the student may have read. Leaving
aside the ambiguous use of such terminology as "critical lens," it is
the exact quote that is even more troubling: "In literature, evil often
triumphs, but never conquers."

Students in schools using performance assessment are frequently
required to write literary essays in which they compare two works of
literature with respect to genre, period, literary technique, or style.
Yet, even our most sophisticated readers and writers might be
handicapped by this "critical lens" statement. In one of our classes,
for example, students' reading includes Catch-22, The Painted Bird, An
American Tragedy, Madame Bovary, The Hunchback of Notre Dame, One Flew
Over the Cuckoo's Nest, and "Agamemnon." Since one could effectively
argue that, indeed, evil both triumphs and conquers in these literary
works, our students would be penalized for not answering the question
posed.

And what is the connection between questions like these and the
standards themselves? Close examination suggests very little. Many
critical skills listed in the New York state standards are simply
ignored in the examination itself. How did the test-makers choose which
standards to emphasize? No one has provided that information.

"Speaking," for example, is emphasized in all four of the state
English-language standards. According to the regents who set the
standards, students should "present orally" well-developed analysis of
issues, ideas, and texts. Considering the skills expected of students
in most college and work settings, such a standard is well-chosen. Yet,
nowhere in the exam is oral presentation evaluated. Thus, schools that
take the standards seriously and emphasize discussion, question-asking,
oral analyses, presentation, and the development of succinct, informed,
and thoughtful oral responses are undermined by the state's own
assessment system.

There are other omissions. Despite the prominent place given to
multicultural literature (it is listed at the top of
English-language-arts Standard 2), there is no exam question requiring
students to draw on such experience or knowledge. Moreover, despite
Standard 2's detailed discussion of literary terminology, there are no
exam questions relating to this point.

The English-language-arts exam is six hours long (making it six
times longer than the SAT II exams and three times longer than a
typical Ph.D. defense). Given over two days, it requires students to
apply a set of test skills (pacing, format, suspension of one's belief
system, test-taking terminology, and an understanding of
multiple-choice-question structure) that are quite unlike skills
required of students in classrooms where engagement with and ownership
of material, reflective behaviors, revision skills, and consideration
of multiple perspectives are emphasized.

To prepare students for such tests requires teachers to exchange a
rigorous curriculum (one in which students prepare analytic essays,
thoughtful research papers, original science experiments, and
sophisticated math applications) for repetitious drill on timed
practice tests. Eventually, as such exams become embedded in the
schools, students, understanding that less is required, will reject the
more rigorous efforts demanded by performance assessments. High
standards will be replaced by test-driven lower ones.

The results are already evident: fewer entries in writing
competitions, less time for in-depth analysis during classroom
discussions, disregard for subjects that won't be tested,
depersonalization of teacher-student relationships, and the beginning
signs of student alienation and climbing dropout figures. (Already,
some schools that serve the most marginal students have reported that
their registers have declined by a third.)

A high-quality education encourages students to set long-range
goals, learn persistence and time management, and practice reflection
and revision skills that further education and lifelong learning. High
standards will not be achieved if these goals are neglected.

Ann Cook is the director of the Urban Academy Laboratory School,
Cece Cunningham is the director of Middle College High School, and
Phyllis Tashlik is the director of the Center for Inquiry in Teaching
and Learning, all in New York City. The schools are members of the New
York Performance Standards Consortium, a network of 40 New York state
schools that have developed and use a system of performance
assessment.
