The Best and Worst Performing School Districts in Massachusetts

The Massachusetts Education Assessment Model was developed by economists at the Beacon Hill Institute to permit policy makers to determine how well schools perform on MCAS tests, given such important factors as socioeconomic characteristics, past test performance, and changes in spending per student and in class size [1].

The model makes it possible to determine how well schools are doing, given that factors such as these, while important determinants of performance, are nevertheless beyond the control of the schools themselves. Educators can use the results to identify schools that outperform the model and to observe the teaching and administrative methods those schools employ. Because the model does a good job of predicting school performance, schools that perform substantially better (or worse) than predicted are worth studying for the good (or bad) example they provide.

We created two ratings of school districts based on our findings. The first, “District Rankings for Achieving Good Performance,” lists school districts according to their ability to exceed the model’s predictions of the fraction of students scoring in the “good” (G) category (Advanced or Proficient on MCAS). Districts with lower numbers, i.e. a rating close to “1,” outperform those with higher ones. The closer a district comes to a “1” rating, the more the actual fraction of its students falling in the “good” (G) category exceeds that predicted by the model.

The second rating, “District Rankings for Reducing Poor Performance,” reflects a district’s success in reducing the fraction of students doing poorly, i.e. falling in the poor (P) or MCAS “Failing” category. The closer a district comes to a “1” rating here, the more successful it was in keeping the fraction of students who fail below what the model predicted for that district.

Each school district therefore gets six different ratings: the G and P ratings are each applied to the three grades taking the MCAS (4th, 8th and 10th). The ratings run from 1 (best) to as high as 220 (worst), depending on the number of individual or combined districts for which there were complete data. Districts falling among the top ten for either the G or P rating can be seen as doing a superior job in that grade; those falling among the bottom ten can be seen as doing a very poor job.

One way to identify the best or worst performing districts is to find those that, for at least two of the six possible opportunities, fell in the top ten or the bottom ten of the rating scale. Table 1 identifies the 15 districts that, by that measure, registered superior performance.
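The ranking logic described above can be sketched in a few lines of code. The sketch below is purely illustrative, not BHI’s actual implementation: the district names and the actual and predicted G fractions are made-up inputs, and the model’s predictions are taken as given.

```python
# Illustrative sketch of a BHI-style G ranking: districts are ranked by how far
# their actual fraction of "good" (G) scores exceeds the fraction the model
# predicted, with rank 1 assigned to the largest positive gap.

def rank_districts(results):
    """results: dict mapping district -> (actual_G_fraction, predicted_G_fraction).
    Returns a dict mapping district -> rank, where 1 = best."""
    # Sort districts by actual minus predicted, largest surplus first.
    ordered = sorted(results, key=lambda d: results[d][0] - results[d][1],
                     reverse=True)
    return {district: rank for rank, district in enumerate(ordered, start=1)}

# Hypothetical data: (actual, predicted) fraction of students scoring G.
sample = {
    "District A": (0.62, 0.48),   # 0.14 above prediction
    "District B": (0.55, 0.57),   # 0.02 below prediction
    "District C": (0.40, 0.31),   # 0.09 above prediction
}

ranks = rank_districts(sample)
print(ranks)  # District A ranks 1, District C ranks 2, District B ranks 3
```

Note that District B has the second-highest actual fraction but ranks last, because the ranking rewards the gap over the prediction, not the raw level.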

Table 1

The 15 Best-Performing Massachusetts School Districts

(The number in parentheses is the number of ratings, out of the six G and P ratings for the 4th, 8th and 10th grades, for which the district fell in the top 10.)

Hadley (5)
Clinton (3)
Methuen (3)
Stoneham (3)
Tyngsborough (3)
Nantucket (2)
Chelsea (2)
Dighton-Rehoboth (2)
Eastham (2)
Everett (2)
Hanover (2)
Oxford (2)
Provincetown (2)
Shrewsbury (2)
Sutton (2)
We see that the Hadley district stands out as the best of the best, falling among the top ten for five of the six ratings, the exception being the 4th grade P rating. Clinton, Methuen, Stoneham and Tyngsborough were next in performance, falling in the top ten for three of the six ratings.

Table 2 identifies the 12 districts that fell in the bottom ten for at least two of the six ratings and that, by the same measure, registered poor performance. We see that, by this measure, Narragansett was the worst of the worst, falling in the bottom ten under both the G and P ratings for both 4th and 10th graders. Next in order were Gateway and Somerset.

Table 2

The 12 Worst-Performing Massachusetts School Districts

(The number in parentheses is the number of ratings, out of the six G and P ratings for the 4th, 8th and 10th grades, for which the district fell in the bottom 10.)

Narragansett (4)
Gateway (3)
Somerset (3)
Chesterfield-Goshen (2)
Adams-Cheshire (2)
Hudson (2)
Leicester (2)
Millis (2)
Mount Greylock (2)
Randolph (2)
Swampscott (2)
Watertown (2)
Department
of Education Ratings

Due to the changes
mandated by the Education Reform Act of 1993, Massachusetts public high school
sophomores will be required, beginning in 2003, to pass the MCAS test as a condition
for graduation. In view of the poor performance shown on past MCAS tests,
there is a growing fear that many of these students will not pass and will therefore
be denied graduation. Various groups are calling for the elimination or
at least the weakening of the MCAS requirement – a concession that would
represent a serious compromise of the principles on which the state has invested
some $6 billion in education reform. It
thus becomes imperative that Massachusetts school administrators determine as
quickly as possible what they can do to improve performance on the MCAS tests.

To this end, they need to determine which schools are doing a good job and which a poor job of preparing students for the MCAS tests. They need to learn what the good schools are doing right and what the bad schools are doing wrong.

On January 9, 2001, the Massachusetts Department of Education took what it intended to be a step in this direction by issuing its “cornerstone” School Performance Rating Process (SPRP) report. The report rated Massachusetts public schools for their improvement on MCAS tests over the period 1998 to 2000 according to whether they had (1) exceeded, (2) met, (3) approached or (4) failed to meet DOE expectations for improvement.

Under the DOE rating system, schools were subjected to a varying standard, depending on how they performed on the 1998 tests. The poorer they did on the 1998 tests, the higher the standard of improvement required.

Schools whose 1998 performance was “very high” had to improve their average MCAS score by 1-3 points, while schools whose 1998 performance was “critically low” had to improve by 5-7 points to meet expectations.

Schools identified as failing to meet DOE expectations can receive warnings or “referrals for review.” School districts found to be “chronically under-performing” can be put in state receivership. Conversely, schools identified as exceeding DOE expectations become eligible for recognition as “exemplary” schools or role models for others to emulate.

Unfortunately, the DOE rating system is a badly flawed and misleading interpretation of MCAS test results. Despite claims of statistical rigor, the DOE ratings amount to nothing more than the application of subjective and arbitrary standards to the data. Merely saying that a given school “failed” to meet or “exceeded” some predetermined standard for improvement does not in and of itself tell us anything about how good a job that school did teaching its students. The deeper problem lies in the DOE’s failure to consider socioeconomic and other factors that are known to enter importantly into the determination of school performance. Applying similar standards to schools with widely varying socioeconomic characteristics loads the dice against schools that are disadvantaged by virtue of those characteristics.
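The difference between judging raw scores against a fixed standard and judging them against what circumstances predict can be illustrated with a toy regression. Everything below is hypothetical: the socioeconomic index, the scores, and the single-variable model are invented for illustration, and the least-squares fit stands in for the much richer BHI model.

```python
# Hypothetical illustration: a one-variable ordinary-least-squares fit predicts
# each district's score from a socioeconomic index; the residual (actual minus
# predicted) is the adjusted, BHI-style measure of performance.

def ols_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# Made-up data: socioeconomic index (higher = more advantaged) and MCAS score.
ses = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [225, 230, 240, 250, 255]

slope, intercept = ols_fit(ses, scores)
residuals = [y - (slope * x + intercept) for x, y in zip(ses, scores)]

# The ses=1.0 district has the lowest raw score yet a positive residual (+1):
# it beats what its circumstances predict, which a fixed standard would miss.
print(residuals)
```

The point of the sketch is the final comparison: ranking by raw score puts the disadvantaged district last, while ranking by residual shows it performing above expectation.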

The BHI rating is a better indicator

Looking at Table 1 of Best-Performing Schools, we see that BHI’s top-rated school district is Hadley. The DOE, however, rates Hadley as “failing” at the 4th and 10th grade levels. Similarly, the DOE rated Chelsea’s 4th and 10th grades and Oxford’s and Tyngsborough’s 4th grades as failing, where the BHI model reveals that these schools are in fact exceeding their potential. This highlights the pitfalls of the DOE method of assessing improvement solely according to MCAS scores and a preconceived standard of performance.

Comparing Table 2 of Worst-Performing Schools to the DOE ratings again shows some significant discrepancies. Millis’ 4th grade, Hudson’s 8th grade and Watertown’s 8th and 10th grades were all labeled as meeting the DOE’s improvement expectations. But, as shown in Table 2, the BHI model reveals that these schools are not meeting their performance potential at these grade levels.
It is important to recognize that BHI’s rating system is intended to show
how well or poorly a school performed, given the socioeconomic character
of the community it serves and other factors beyond the control of the school’s
administrators and teachers. The
BHI rating system therefore reflects the quality of teaching and utilization
of school resources, not the level of results produced by the school.
Performing well on the BHI model is indicative of exceeding or achieving what
can be expected of a district, given socioeconomic factors and the like.

Schools given a high rating by the BHI model might do poorly on MCAS when their raw scores, unadjusted for these factors, are compared with those of other schools. Conversely, schools given a low BHI rating might turn out to do well when their raw MCAS scores are compared with those of other schools.

Table 3 lists schools according to their combined English, Mathematics and Science rankings for each grade level in the good (G) category (Advanced or Proficient), with schools with lower numbers, i.e. a rank close to 1, outperforming schools with higher ones. If a school district is ranked close to 1, then that district’s actual proportion of students in the good (G) category is substantially higher than that predicted by the model. We see, for example, that for 4th graders the Sutton school district did the best job (with a 1 ranking) of outperforming the model and that the Chesterfield-Goshen Regional district did the worst job (with a 215 ranking) of measuring up to what the model predicted.

Table 4 provides a second ranking, reflecting a district’s success in reducing the fraction of students doing poorly, i.e. falling in the Poor (P) or Failing category. The closer to 1 that a district is ranked, the more successful it was in keeping the fraction of students who perform poorly below what the model predicted for that district. Thus, of all districts, the Everett district did the best job of reducing poor performance for 4th graders.

Finally, in Table 5, we list districts alphabetically, providing the G and P rankings for each district. Again, for both categories, the closer the rank is to 1, the better the district performed.