Whole-School Projects Show Mixed Results

When the superintendent in Memphis, Tenn., announced he was
scuttling that district's long-running effort to install schoolwide
improvement programs in every school in the city, the decision seemed
unusual enough to merit national attention.

But the July closing of the city's closely watched experiment was
just the latest in a string of setbacks in the nationwide movement
known as comprehensive or "whole school" reform.

Since 1998, districts in San Antonio and Miami-Dade County have
abandoned some efforts to adopt well-known, "off the shelf" improvement
models on a large scale. Early reports from New Jersey, where 30 poor
districts are under a 1998 court order to adopt schoolwide improvement
models, also suggest that implementation in that state, while still on
track, is running into obstacles and pockets of resistance.

And a report by the RAND Corp., published earlier this year,
suggests that in districts that are trying to "scale up" their
schoolwide reform programs, such efforts may be producing significant
gains in only about half the schools that try them.

Of the 163 schools that the think tank's researchers had tracked
over two to three years, only about half made bigger gains in
mathematics and reading achievement than their districts did
overall.

Experts say those developments, taken together, suggest that the
movement for whole-school improvement could be entering a new, more
challenging phase.

"Implementation is much harder than any of us expected," said Henry
M. Levin, who founded an improvement model known as Accelerated
Schools. "I think this is probably just a long, slow process, and too
much was claimed for it too soon."

Industry 'Shakeout'

The movement reflects the long-held recognition that piecemeal
attempts to improve schooling by putting in a reading program here or a
new management strategy there were not working—especially in
some of the nation's poorest and lowest-achieving schools. To improve
learning for all of the children under their care, proponents of the
schoolwide approach argued, schools and districts had to come up with
more coherent programs that could deal with all aspects of
schooling.

But rather than start from scratch to create such programs, the
thinking went, schools might be better off trying some models that had
already accumulated successful records.

Often-cited examples of such programs included Success for All, a
reading-based program for elementary schools devised by researchers at
Johns Hopkins University in Baltimore; Mr. Levin's Accelerated Schools
program; and Direct Instruction, an approach developed in the 1960s by
Siegfried Engelmann.

With $100 million in backing since 1992 from New American Schools, a
private, nonprofit organization in Arlington, Va., program developers
began to move their models out into more and more schools. And, with
Memphis at the head of the pack, a few districts began experimenting
with the programs on an even larger scale.

In the 118,000-student Memphis district, all 163 schools were
required to adopt an improvement program of their own choosing.

Nationwide, the movement got an added boost in 1997, when Congress
approved the Comprehensive School Reform Demonstration Program, a grant
program aimed at helping mostly poor schools put in place
research-proven improvement models. Since then, federal lawmakers have
sunk $480 million into the program, which is underwriting improvement
efforts in around 2,000 schools across the country.

"Comprehensive school reform has really changed the way that
educators come around the table," said Karen Hinton, the vice president
for external affairs for New American Schools. "It's not a faddish
trend."

That's why proponents suggest that the setbacks they see now may be
more of a transitional phase than a sign of failure.

"The situation, as I see it, is typical of things that happen
probably with any innovation in any field, where at the outset there's
a great deal of enthusiasm, and then people get realistic about what
the methods can and cannot do and there's a shakeout," said Robert E.
Slavin, a co-developer of the Success for All model used by 1,800
schools. "The stronger programs continue, and then it becomes part of
the landscape."

The analogy he draws is to the Internet industry, which saw its
stock prices drop as the field underwent a shakeout.

"Is that failing? Of course not," said Mr. Slavin, who is also
co-director of the Center for Research on Students Placed at Risk, a
federally funded research center based at Johns Hopkins. "It's actually
the stabilization of something that is a major change for society."

To longtime critics of the whole-school approach, however, the
problems cropping up are proof of the movement's wrongheadedness.

"The whole edifice that was constructed around the notion of
schoolwide reform models and approaches being better was not correct,"
said Stanley Pogrow, an associate professor of education at the
University of Arizona in Tucson. "I'm not saying we shouldn't have
experimented with them. I'm saying they should not have been, to the
exclusion of almost everything else, what the [U.S.] Department of
Education pushed and promoted and funded research around."

Flawed Studies?

Mr. Pogrow's criticism has been characterized as sour grapes because
he markets a more narrowly focused, computer-based program for teaching
critical-thinking skills that has suffered in the rush to embrace
broader strategies.

But he, like other critics, also points to a growing body of
research suggesting that some of the studies favoring schoolwide
programs such as Success for All have been flawed, and that the
programs are not meeting their initial promise.

Early studies of Memphis' programs suggested that, from 1995 to
1999, students in the schools that were restructuring were making
greater achievement gains than students in schools with demographically
similar enrollments that had not yet undertaken the changes. That
study, produced by the University of Memphis, spurred other districts,
such as Atlanta, to follow Memphis' example.

However, when district researchers conducted their own study at the
request of their new superintendent, Johnnie B. Watson, they came to a
more pessimistic conclusion. Using different study methods, they found
that, over the first five years of the initiative, students' test
scores were stagnant or declining in mathematics, reading, and English.
("Memphis Scraps Redesign Models in All Its Schools," April 18, 2001.)

In a new critique paid for by New American Schools, though, an East
Tennessee State University professor expresses some skepticism about
the district's findings. James E. McLean, an education professor at the
university's Johnson City campus, said the district study was flawed
because the researchers failed to use comparison groups and because
they measured progress in terms of changes in the percentages of
students who scored above the 50th percentile on tests.

"While considering the percentage of students above the 'national
average' sounds impressive," he said, "it may not be appropriate." The
reason: The methodology fails to pick up subtler changes in students'
overall average achievement, particularly for students who may have
started out with bottom-hugging scores but could not quite pass the
"high jump" the researchers set out for them.

Schools'-Eye Views

In a similar vein, supporters of comprehensive improvement projects
criticize the RAND study for measuring the progress of students in
schools undergoing restructuring against the averages for their
districts. That methodology poses problems, they say, because the
schools trying to incorporate new strategies were among the poorest in
their districts. ("RAND Finds Mixed Results for School Reform Models," April 18, 2001.)

They also contend those schools turned out to be unrepresentative of
schools nationwide that were getting support from New American Schools
to "scale up" schoolwide changes, because they were concentrated within
a few states or among the less successful reform models.

The problem with the RAND study, as with most evaluations of such
improvement initiatives, said Steven M. Ross, the University of Memphis
researcher who conducted the first studies in that district, is that
they draw on data collected at the school level, which is a cruder
measure than data collected on individual students.

"Hardly any program in the history of education would show
sustainable gains with school-level data," he said. "Educational
research has to be extremely sensitive to factor out extraneous
differences."

Mark Berends, a co-author of the RAND study, does not disagree.

"Because there is so much variation in implementation even in the
same school, it's very difficult to look for achievement effects," he
said.

Researchers conducting such evaluations also disagree over how to
address the fact that in many inner-city schools, 20 percent to 40
percent of students might be new to the school and new to the programs
under study. One school of thought contends that it's unfair to the
programs to include achievement data on those students; another argues
for including those students because schools, too, have to include them
and adjust their teaching accordingly.

"There's just very little research to date on the variety of models
out there in different settings that show deep implementations and
sustained implementations, so that we can even think about achievement
in student learning," Mr. Berends said.

The hope is that the picture will become clearer over the next five
years through some of the newer research efforts being underwritten by
the federal demonstration program. Last year, the Department of
Education awarded $21 million in grants to six research groups to study
the progress and effectiveness of federally financed schoolwide
reforms.

In the meantime, the growing pressure to show research-backed
results is producing a steady trickle of studies—most of them
positive—on individual school reform designs, such as Accelerated
Schools, Success for All, James P. Comer's School Development Program,
E.D. Hirsch Jr.'s Core Knowledge approach, and America's Choice.

"There's evidence across all the reports that a whole bunch of
models can have a positive impact on student achievement," said Mary
Anne Schmitt, the president and chief executive officer of New American
Schools. "But there's also evidence that we can't guarantee that any
given model will have an impact on student achievement."

What her organization and others have learned, however, from all of
the studies is what kinds of conditions must be in place for
comprehensive improvement efforts to succeed. Districts that look to
New American Schools for support now have to put together portfolios
showing that they're willing to stay with the program for the long
haul, provide the necessary teacher training and financial support, and
meet other criteria that studies say may be important to sustaining
schoolwide programs.

"We've all learned a lot about how to create those conditions so we
can increase the probability of success to something much greater in
the future," Ms. Schmitt said.

No One Size Fits All

In fact, some experts contend that the setbacks the movement is
experiencing now have little to do with anything researchers have to
say about the reform models' overall efficacy. Rather, the problems
reflect management missteps, political opposition, outside pressure,
and practical impediments that bedevil school systems on a day-to-day
basis.

One of the biggest mistakes, many proponents of schoolwide
improvement programs say, may have been attempting to impose reform
models on a large scale, much as Memphis, the New Jersey districts, and
Miami-Dade County have done or are trying to do to one degree or
another.

"I think for schoolwide reform to really work, you have to have the
buy-in of the entire staff," said Nereida Santa-Cruz, the assistant
superintendent for curriculum support services in the 361,000-student
Miami-Dade system.

Of the 45 schools in that district that began working with Success
for All, only seven are still using the program.

"We were not successful with Success for All," Ms. Santa-Cruz said.
"For whatever reason, it was not a program for which we could show the
enormous increases promised by the developer."

Some of the resistance to the improvement models in New Jersey has
come from schools that were doing well on their own before the court
order, according to Bari Anhalt Ehrlichson, an assistant professor of
policy at Rutgers University in New Brunswick, N.J., who has been
following the attempts at whole-school reform in that state.

"When people say resistance, you often think of the lazy teacher who
doesn't want to change, but in some of these cases, these were
protective faculties who were excited about what was already going on
in their schools and had the data to back it up," she said.

A more garden-variety problem plaguing many schoolwide reforms is a
change of leadership at the top. New principals and new superintendents
are often more eager to make their own mark on a school system than
they are to continue initiatives their predecessors launched. Without
continuing support from the top, whether financial or otherwise, many
schoolwide programs tend to wither away.

"For all the reforms, that's really the death knell," said Mr.
Levin, a professor of economics and education at Teachers College,
Columbia University.

Ms. Ehrlichson says turnover at the staff level can also hinder
schools' progress. In some of the schools she has been tracking for
three years, only 10 percent of the staff members have been there from
the beginning.

"Developers almost have to offer year-one training every year," she
said. Program developers also complain they are given limited time to
work with teachers to try to bring about deep, lasting changes in their
instruction.

Another difficulty is that policymakers tend to show little patience
for long-term change—a problem when many program creators say
their models take two to three years or longer to show results.

The pressure to perform has also heightened in many states as
policymakers rely increasingly on standardized tests to determine which
students can graduate on time and which schools qualify for rewards or
punishments. That can be a problem, experts say, if the tests are not
compatible with the curricula and teaching strategies the restructuring
programs are using.

"I've seen many schools end up dropping their efforts because, while
they might see some improvement on other measures, they don't see it on
state standardized tests," said Amanda L. Datnow, an assistant
professor in the department of theory and policy studies at the
University of Toronto's Ontario Institute for Studies in Education, who
has been tracking several such reform projects across the United
States.

But, she said, it's too soon to count out such models. "I think
there's some hope that, under the right conditions and, if used for the
right reasons," she said, "these models can produce some improvement
for student learning."

Vol. 21, Issue 10, Pages 1, 24-25

Published in Print: November 7, 2001, as Whole-School Projects Show Mixed Results
