SHANKER BLOG

How Often Do Proficiency Rates And Average Scores Move In Different Directions?

New York State is set to release its annual testing data today. Throughout the state, and especially in New York City, we will hear a lot about changes in school and district proficiency rates. The rates themselves have advantages – they are easy to understand, comparable across grades and reflect a standards-based goal. But they also suffer severe weaknesses, such as their sensitivity to where the bar is set and the fact that proficiency rates and the actual scores upon which they’re based can paint very different pictures of student performance, both in a given year and over time. I’ve discussed this latter issue before in the NYC context (and elsewhere), but I’d like to revisit it quickly.

Proficiency rates can only tell you how many students scored above a certain line; they are completely uninformative as to how far above or below that line the scores might be. Consider a hypothetical example: A student who is rated as proficient in year one might make large gains in his or her score in year two, but this would not be reflected in the proficiency rate for his or her school – in both years, the student would just be coded as “proficient” (the same goes for large decreases that do not “cross the line”). As a result, across a group of students, the average score could go up or down while proficiency rates remained flat or moved in the opposite direction. Things are even messier when data are cross-sectional (as public data almost always are), since you’re comparing two different groups of students (see this very recent NYC IBO report).
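The hypothetical above can be sketched in a few lines of Python. All of the numbers here are invented for illustration – the cut score of 650 and the four student scores are not from any real test – but they show how large gains that don’t cross the line leave the rate unchanged while the average rises.

```python
CUT = 650  # hypothetical proficiency cut score (invented for illustration)

year1 = [600, 640, 660, 700]
year2 = [590, 645, 695, 740]  # two students gain a lot, but no one crosses the line

def proficiency_rate(scores, cut=CUT):
    """Share of students at or above the cut score."""
    return sum(s >= cut for s in scores) / len(scores)

def mean(scores):
    """Average scale score."""
    return sum(scores) / len(scores)

# The average score rises by 17.5 points...
print(mean(year1), mean(year2))                          # 650.0  667.5
# ...while the proficiency rate is identical in both years.
print(proficiency_rate(year1), proficiency_rate(year2))  # 0.5  0.5
```

The same mechanics can run in reverse: scores can fall substantially while the rate holds steady, or a few students clustered just below the line can cross it and push the rate up even as the average declines.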

Let’s take a rough look at how frequently rates and scores diverge in New York City.

Unlike rates (and unlike the hypothetical example above), scale scores on New York State’s tests are only comparable within grades – e.g., fourth graders in one year can be compared with fourth graders the next year, but not with students in any other grade. This means we have to compare scores and rates not only school-by-school, but grade-by-grade as well.

In the scatterplot below, each dot (roughly 4,000 of them) is a single tested grade in a single school. The horizontal axis is the change in this grade’s ELA proficiency rate between 2010 and 2011. The vertical axis is the change in its actual scale score. The red lines in the middle of the graph are zero change. So, if the dot is located in the upper right quadrant, both the score and the rate increased. If it’s in the lower left quadrant, both decreased.
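The quadrant logic described above can be expressed as a small function. This is just a sketch of the classification, not the code behind the actual graph, and the sample changes in the usage note are made up.

```python
def quadrant(rate_change, score_change):
    """Place a grade/school's (rate change, score change) pair in the scatterplot.

    Horizontal axis: change in proficiency rate. Vertical axis: change in scale score.
    """
    if rate_change > 0 and score_change > 0:
        return "upper right"   # rate and score both increased
    if rate_change < 0 and score_change < 0:
        return "lower left"    # rate and score both decreased
    if rate_change < 0 and score_change > 0:
        return "upper left"    # score rose while the rate fell
    if rate_change > 0 and score_change < 0:
        return "lower right"   # rate rose while the score fell
    return "on an axis"        # one of the measures did not change at all
```

For example, `quadrant(3.0, 2.5)` lands in the upper right (both improved), while `quadrant(2.0, -1.5)` lands in the lower right – the divergent case where a rising rate masks a falling average score.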

You can see that most of the dots are in one of these two quadrants, as we would expect. When scores increase or decrease, rates also tend to increase or decrease. But not always. You’ll also notice that there are a large number of dots in the upper left and lower right quadrants of the graph (particularly the latter), which means that these ELA scores and rates moved in opposite directions.

(The scatterplot looks extremely similar for math, and if I exclude grade/school combinations with smaller numbers of tested students.)

Let’s see if we can sum this up to give you a very rough idea of how many grade/school combinations exhibited such a trend (note that these figures don't count extremely small changes – see the first footnote for details).*

We’ll start with ELA.

Around 30 percent of grade/school groups had disparate trends in their scores and rates. In one in five cases, the two moved in opposite directions. In another 11 percent of grades, either the score or the rate moved while the other was relatively stable.

So, if you were summarizing student performance (at least cohort changes for individual grades) based solely on changes in rates between 2010 and 2011, there’s a 30 percent chance you’d reach a different conclusion if you checked the scores too.

Here’s the same graph for math.

The situation is similar. About one in four grades saw their rates and scores either move in opposite directions, or one was stable while the other moved (once again, the figures don't change much if I exclude grades with small numbers of students).

Certainly, these results for NYC are not necessarily representative of all districts, or even of NYC results in other years. It all depends on the distribution of successive cohorts’ scores vis-à-vis the proficiency line.

That said, what this shows is that changes in proficiency rates can give a very different picture than trends in average scores. They measure different things.**

Yet, when states and districts, including New York, release their testing results for the 2011-2012 school year (if they haven’t already), virtually all of the presentations and stories about these data will focus on trends in the proficiency rates. It’s not unusual to see officials issue glowing proclamations about very small changes in these rates.

If, however, policymakers and reporters fail to look at the scores too (grade-by-grade, if necessary), they risk drawing incomplete, potentially misleading conclusions about student performance. And states and districts, such as the D.C. Public Schools, that don’t release their scores should do so.

- Matt Di Carlo

*****

* Since many of the grades are based on relatively small samples, I (somewhat crudely) coded minor changes in the rate or score (one point or less) as “stable,” so as to avoid characterizing grades as moving in opposite directions when the changes are very small. If, for example, the score and rate move in opposite directions, but both changes are small, this does not “count” as divergence. Furthermore, if it just so happens that a “stable” score or rate (i.e., one that changes only a little) moves in the same direction as a larger change in the other measure, that grade/school change is coded as convergent. This recoding reduces the number of divergent changes by a small but noteworthy amount.
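A minimal sketch of this recoding rule, assuming the one-point “stable” band described in the footnote; the category labels are mine, and the rule for stable-but-same-direction changes follows my reading of the footnote.

```python
STABLE_BAND = 1.0  # changes of one point or less are coded as "stable"

def sign(change, band=STABLE_BAND):
    """Direction of a change: 0 if within the stable band, else +1 or -1."""
    if abs(change) <= band:
        return 0
    return 1 if change > 0 else -1

def classify(rate_change, score_change, band=STABLE_BAND):
    """Code a grade/school's pair of changes as convergent, divergent, or stable."""
    r, s = sign(rate_change, band), sign(score_change, band)
    if r != 0 and s != 0:
        return "convergent" if r == s else "divergent"
    if r == 0 and s == 0:
        return "both stable"
    # One measure is within the stable band. Per the footnote, if its small raw
    # change points the same way as the larger change, code the pair convergent.
    mover = rate_change if r != 0 else score_change
    small = score_change if r != 0 else rate_change
    if small * mover > 0:
        return "convergent"
    return "one stable, one moved"
```

So, for instance, a rate gain of 2 points paired with a score drop of 3 points is divergent, while that same rate gain paired with a 0.5-point score drop falls into the “one stable, one moved” category rather than counting as divergence.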

** I would argue that the scores are better for looking at changes over time, since they give a better idea of the change in performance of the "typical student." But, as mentioned above, scores usually aren't comparable between grades, so score changes would have to be examined on a grade-by-grade basis. In addition, of course, the scores don't mean much to most people.
