
3/22/12

Why The "Tests" Are Actually Completely Useless

...

So, what’s the problem? In short, it’s that year-to-year changes in proficiency rates are not valid evidence of school or policy effects. These measures cannot do the job we’re asking them to do, even on a limited basis. This really has to stop.

The literature is replete with warnings and detailed expositions of these measures’ limitations. Let’s just quickly recap the major points, with links to some relevant evidence and previous posts.

- Proficiency rates may be a useful way to present information accessibly to parents and the public, but they can be highly misleading measures of student performance, as they only tell you how many test-takers are above a given (often somewhat arbitrary) cutpoint. The problems are especially salient when the rates are viewed over time – rates can increase while average scores decrease (and vice versa), and rate changes are heavily dependent on the choice of cutpoint and the distribution of cohorts’ scores around it. They are really not appropriate for evaluating schools or policies, even using the best analytical approaches (for just two among dozens of examples of additional research on this topic, see this published 2008 paper and this one from 2003);

- The data are (almost always) cross-sectional, and they mask changes in the sample of students taking the test, especially at the school and district levels, where samples are smaller (note that this issue can apply to both rates and actual scores; for more, see this Mathematica report and this 2002 published article);

- Most of the change in raw proficiency rates between years is transitory – i.e., it is not due to the quality of a school or the efficacy of a policy, but rather to random error, sampling variation (see the second bullet), or factors, such as students’ circumstances and characteristics, that are outside of schools’ control (see this paper analyzing Colorado data, this one on North Carolina and our quick analysis of California data).
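To make the first bullet concrete, here is a minimal sketch using entirely made-up scores and a hypothetical cutpoint of 300: a handful of students just below the cut inch across it while high scorers slip, so the proficiency rate jumps even as the average score falls.

```python
# Hypothetical data for illustration only -- not from any real test.
# A proficiency "cutpoint" of 300 is assumed.
CUTPOINT = 300

year1 = [250, 260, 295, 295, 298, 320, 360, 380]
year2 = [240, 250, 301, 302, 303, 305, 330, 340]


def proficiency_rate(scores, cut=CUTPOINT):
    """Share of test-takers at or above the cutpoint."""
    return sum(s >= cut for s in scores) / len(scores)


def mean(scores):
    """Average score for the cohort."""
    return sum(scores) / len(scores)


# The rate rises (3/8 -> 6/8) while the mean score falls,
# because several students crossed the cutpoint from just below it.
print(f"Year 1: rate = {proficiency_rate(year1):.1%}, mean = {mean(year1):.2f}")
print(f"Year 2: rate = {proficiency_rate(year2):.1%}, mean = {mean(year2):.2f}")
```

The direction of the "change" thus depends entirely on whether you look at the rate or the average, and on where the cutpoint happens to sit relative to the bulk of the score distribution.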