The hype of 'value-added' in teacher evaluation

By Valerie Strauss

The issue of evaluating teacher performance has been in the news a lot lately, including yesterday, when a school in Rhode Island fired every one of the educators in the building. Here today to discuss teacher evaluation methods is Lisa Guisbond. She is a policy analyst for the National Center for Fair and Open Testing, known as FairTest, a Boston-based organization that aims to improve standardized testing practices and evaluations of students, teachers and schools.

By Lisa Guisbond
As a rookie mom, I used to be shocked when another parent expressed horror about a teacher I thought was a superstar. No more. The fact is that your kids’ results will vary with teachers, just as they do with pills, diets and exercise regimens.

Nonetheless, we all want our kids to have at least a few excellent teachers along the way, so it’s tempting to buy into hype about value-added measures (VAM) as a way to separate the excellent from the horrifying, or at least the better from the worse.

It’s so tempting that VAM is likely to be part of a reauthorized No Child Left Behind. The problem is, researchers urge caution because of the same kinds of varied results featured in playground conversations.

Value-added measures use test scores to track the growth of individual students as they progress through the grades and see how much “value” a teacher has added.

Policymakers want to use this data to evaluate teachers and make decisions about pay, tenure or termination.

Tracking an individual child’s progress is clearly better than what we have now: a kind of apples-to-oranges comparison of, say, the average scores of this year’s fourth graders to last year’s. It’s easy to see how such comparisons could get muddled by a large influx of kids with autism, for example. That’s why value-added measures initially seem so attractive.

“A great deal is unknown about the potential and limitations of alternative statistical models for evaluating teachers’ value added contributions to student learning. BOTA [the National Research Council’s Board on Testing and Assessment] agrees with other experts who have urged the need for caution and for further research prior to any large-scale, high-stakes reliance on [value-added approaches].”

Here are four cautions, among many:

*First, value-added rests on the shaky assumption that math and English test scores tell us what we need to know about student progress. No matter how good a test may be, it can’t measure all of what parents want their kids to be learning and doing in school. In short, value added would intensify the existing unhealthy pressure on teachers to teach to the test.

*Second, as I pointed out in my last post, it’s impossible to tease out the effect of one teacher from those who came before, or from a music teacher, for example, who is the linchpin in a musical student’s school week (but is not measured by any test). It’s also difficult to separate a teacher’s influence from the influence of a chaotic home, poor nutrition, lack of sleep or a host of other factors.

*Third, the validity of this approach rests on the false assumption that students and teachers are assigned randomly. In reality, senior teachers can and do choose better schools and classes, while parents in affluent towns fight to get their kids into classrooms of teachers with good reputations. Think this might skew test results a bit?

*Fourth, value added doesn’t give us any information about what practices distinguish good teachers from bad. All we know is good teachers get better test scores, not what they did to achieve this.

Oh, and here’s an interesting twist: Researchers looking at math test results saw more variation within one teacher’s “effectiveness” than from one teacher to another. Turns out “good” teachers aren’t consistently good, and “bad” teachers aren’t consistently bad. As I was saying, your results will vary.

For more in-depth analysis of value-added measures and why growth should be assessed using multiple measures, see one of FairTest’s analyses of VAM.

-0-

Follow my blog all day, every day by bookmarking washingtonpost.com/answer-sheet. And for admissions advice, college news and links to campus papers, please check out our new Higher Education page at washingtonpost.com/higher-ed. Bookmark it!

This may come as a shock to the testing crowd, but many students figure out that their results on these tests have no impact on them. How much effort would you put into answering the questions on a four-hour census form? The whole idea of making any judgments based on test results is pointless if there is nothing in it for the test taker.

I think you have confused "growth models" with "value added" models, as has the author of the article you cite. Growth models are much simpler: they calculate the change in student achievement from the beginning of the year to the end of the year. With value added, "statistical models can control for these demographic factors [that you cite above as issues]. The models can also control for the influences of a school or of classmates on a student’s performance. The goal is to isolate that part of a student’s performance gains that result from his or her teacher’s skill and effort through the course of a year.[*]" In other words, these are much more sophisticated models for calculating a student's growth in learning, and they take into account socioeconomic status, the school, the student's previous trajectory of learning and more. See this article for a more accurate depiction of the methodology and considerations when using value added: http://www.educationreport.org/pubs/mer/9598. As for teacher assignment, that is a separate issue and one that needs to be addressed through district policy regarding placement and seniority. One benefit of using value added is that teachers should no longer feel "punished" for taking on challenging students, especially when they are successful, and this can be a platform for rewarding them for taking on that challenge. One final point: districts that are beginning to use this approach are also incorporating other measures, including portfolios of work, teacher observations and more, in evaluating teachers.