The A-List

Before this guide, the only way to judge a program was by its
advertising, says one official.

Architects of school reform models may have reason to worry. On the
heels of controversy over a federal list of suggested programs, a new
independent study has concluded that just three of 24 popular models
have strong evidence that they improve student achievement.

An Educators' Guide to Schoolwide Reform, a 141-page report
from the Washington, D.C.-based American Institutes for Research, found
that only Direct Instruction, High Schools That Work, and Success for
All made the grade. Commissioned by five education groups, including
the National Education Association and the American Federation of
Teachers, the report is the most comprehensive rating of school reform
programs.

Such information is desperately needed, says Paul Houston, executive
director of the American Association of School Administrators, another
sponsor of the report. "Before this guide came along, about the only
way educators could judge the worth of some of these programs was by
the quality of the developers' advertising and the firmness of their
handshakes. Now, superintendents, principals, and classroom teachers
can sit down together and make reasonable decisions about which is best
for their district's needs."

But some developers are questioning how AIR decided which studies to
include as evidence of a program's effectiveness. Others maintain that
they have more evidence of positive results than AIR gives them credit
for. Henry Levin, a Stanford University economist whose Accelerated
Schools program received only a "marginal" rating, described the study
as "fairly amateurish."

"Basically, they discounted anything, as far as I can tell, that
comes in and changes test scores over time for a particular school," he
says. "And [any program] that said it had a comparison group was given
a gold standard."

Such criticism echoes the recent hubbub over the federal list of
suggested reform models. [See "Who's In, Who's Out," March.] That
list, which included 17 programs, was intended to guide schools seeking
some of the $150 million that's available as part of the Comprehensive
School Reform Demonstration Program. But reformers who didn't make the
cut contested Congress' selection process.

After the release of the AIR report, some developers welcomed the
new scrutiny. More than anything, they say, the AIR study underscores
the need for strong, third-party evaluations of schoolwide reform
models. Similar studies are either completed or in the works.

"The fact is that the capacity to do this kind of research is very
limited in this country," says Marc Tucker, a founder of America's
Choice, one of the 24 models reviewed by AIR. "I believe that it's very
important for the federal government to put a fair amount of money on
the table to make this kind of research possible."

Ellen Condliffe Lagemann, president of the National Academy of
Education, a group of education researchers and scholars, agrees. "It's
amazing how little evaluation there is," she says. "Since the early
20th century, the people who have peddled the educational reform
strategies that we all hear about tend to be successful because they're
the best entrepreneurs. It doesn't necessarily have to do with any
research capability."

AIR's consumer-oriented guide rates 24 whole-school reform
models according to whether they improve achievement in such measurable
ways as higher test scores and attendance rates. It also evaluates the
assistance provided by the developers to schools that adopt their
strategies and compares the first-year costs of such programs. "We
wanted to have a document that really, critically evaluated the
evidence base underpinning these programs," says Marcella Dianda, a
senior program associate at the NEA, which helped underwrite the
$90,000 study. "We felt that our members really wanted that. They
wanted us to get to the bottom line."

The evaluators used a multistep process to rate whether the programs
had evidence that they raised student achievement, according to Rebecca
Herman, the project director. First, AIR gathered almost any document
that reported student outcomes, including articles in scholarly
journals, unpublished case studies and reports, and changes in raw test
scores reported by the developers. More than 130 studies were then
reviewed and rated for their methodological rigor in 10 categories,
based on such criteria as the quality and objectivity of the
measurement instruments used, the period of time over which the data
were collected, the use of comparison or control groups, and the number
of students and schools included.

Studies that met AIR's criteria for rigor were used to judge whether
a program was effective in raising student achievement. For example, a
number of developers submitted changes in state or local test scores as
evidence that their programs were working. But, says Herman, "we really
didn't consider test scores alone, without some sort of context,
because there are a lot of things that can explain changes in test
scores."

In its final analysis, the study gave a "strong" rating to the
programs with the most conclusive supporting research, specifically
four or more studies that used rigorous methodology and found improved
achievement. The gains had to be statistically significant in at least
three of those studies. A "promising" rating went to models with three
or more rigorous studies that showed some evidence of success. A
"marginal" rating went to reform models that had fewer rigorous studies
with positive findings or a higher proportion of studies showing
negative or no effects. A "mixed or weak" label was assigned to
programs with study findings that were ambiguous or negative. And AIR
gave a "no research" rating to programs for which there were no
methodologically rigorous studies. Eight programs received the "no
research" rating, which is not surprising, according to Herman, given the newness
of many of the models.

"It takes a good three years to implement a reform model across a
school, and another two years to come up with a decent study," she
says. "What we're looking at is the first wave of research, and we're
hoping for an ocean to follow it."

The study comes as districts around the country seek proven,
reliable solutions to the problem of low-performing schools. But as
they spend greater amounts of tax dollars on the various reform models,
questions remain about how well the programs work. About 8,300 schools
nationwide (roughly 10 percent) were using one of the 24 designs rated in
the study as of October 30, the report says. Yet it notes that "most of
the prose describing these approaches remains uncomfortably silent
about their effectiveness."

-Lynn Olson

Copies of the report, An Educators' Guide to Schoolwide
Reform, are available from the sponsoring organizations for $15.95
each for nonmembers and $12.95 each for members. The full text of the
report is also available on the World Wide Web at www.aasa.org/Reform/index.htm.

The Research section is underwritten by a grant from the Spencer
Foundation.
