Over the last couple of months I have noticed your dedication to your drawings. You sit at your desk and at every spare moment you grab a drawing tool (pen, pencil, pencil crayon, or felt) and paper. You draw what you feel, and I love it! I need to ask you a favor.

Can you please decorate my desk? My desk needs a personality, and I think it needs yours. Draw whatever is in your heart. There is only one rule: you draw what your heart wants to draw, not what you think I want to see.

If you agree to my request, please fill out this form and return it to me later today.

Thank you Madysen!

A gift of validation. An emphasis on strength rather than weakness. An unnecessary but memorable kindness.

For a policy that practically every public school system in the nation is pursuing, the lack of evidence to support [the] effectiveness [of using student test scores to rank and evaluate teachers] is staggering. The number of high-performing education systems that use such an approach: zero. The number of peer-reviewed scientific studies that support this approach: zero.

Mike Wiser at The Quad-City Times reported today on the controversy here in Iowa around connecting student test scores to teacher evaluations (aka ‘value-added modeling’ or ‘VAM’). Last week I shared the research and prevailing opinion of scholars supporting why this should not be done.

In the article, notes that ‘teacher accountability has to be part of it, or it’s not reform.’ This is consonant with policymakers’ general willingness to ignore the rating volatility concerns associated with VAM. As Amrein-Beardsley et al. (2013) noted:

Policymakers have come to accept VAM as an objective, reliable, and valid measure of teacher quality. At the same time, [they ignore] the technical and methodological issues.

There appears to be a blind faith by many legislators in the objectivity of VAM, even though the actual data show that there is extremely high volatility in teacher ratings from year to year. Somehow policymakers are able to dismiss that rating instability as unimportant, even though it has tremendous impacts on teachers’ lives and reputations and public faith in the educational system. When Teachers of the Year are being rated ‘unsatisfactory’ by VAM systems, parents are rightfully suspicious. When high-achieving schools are rated as ‘needing improvement’, the public rightfully suspects that something’s not right. It’s important to note that legislators are not asking other professions to accept evaluation schemes in which 30 to 50 percent (or more) of their ratings fluctuate widely and completely randomly.

Of greater concern to me, however, is the response of Tom Narak, lobbyist for the School Administrators of Iowa (SAI). SAI represents all of the principals and superintendents in the state and is supposed to be knowledgeable about educational research and policy. Yet Mr. Narak says about VAM, “Why wouldn’t you? It’s the way (evaluations) are going now.”

Well, Mr. Narak, here are a few big reasons why we wouldn’t:

Because year-to-year ratings for teachers vary randomly by 30%, 40%, 50%, or even more [Di Carlo; Economic Policy Institute; Baker; National Education Policy Center]. In other words, extremely high percentages of teachers’ evaluations have absolutely nothing to do with their actual performance. As lobbyist for the administrators responsible for evaluating teachers, you should find this alarming, not dismiss it out of hand. Do you want principals and superintendents to send the message to their teaching staffs that they don’t care if evaluations are fair?

Because even when student test scores are averaged over 3 to 5 years, random variation in teacher ratings still results in over 25% to 48% of teachers being rated inaccurately [U.S. Department of Education; Di Carlo]. In other words, when it comes to rating instability, looking over a longer time frame helps some but not a lot.
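To make the averaging point concrete, here is a toy simulation. It is only a sketch under stated assumptions, not a reproduction of the figures cited above: each hypothetical teacher has a fixed "true" effect, each year's rating adds independent random noise of the same size as the spread in true effects (the `noise_sd=1.0` default is my assumption, not an estimate from the research), and a teacher counts as "misrated" when the averaged rating puts them on the wrong side of the median. The function name `misrated_share` and all parameters are invented for illustration.

```python
import random


def misrated_share(n_teachers=20000, years=1, noise_sd=1.0, seed=0):
    """Share of simulated teachers whose multi-year average rating
    falls on the wrong side of the median (toy model, assumed noise)."""
    rng = random.Random(seed)
    # Assumed: true teacher effects are normally distributed with sd 1.
    true_effect = [rng.gauss(0.0, 1.0) for _ in range(n_teachers)]
    # Observed rating = true effect + average of `years` independent noise draws.
    observed = [
        t + sum(rng.gauss(0.0, noise_sd) for _ in range(years)) / years
        for t in true_effect
    ]
    true_median = sorted(true_effect)[n_teachers // 2]
    obs_median = sorted(observed)[n_teachers // 2]
    # A teacher is misrated if truth and rating disagree about above/below median.
    wrong = sum(
        (t >= true_median) != (o >= obs_median)
        for t, o in zip(true_effect, observed)
    )
    return wrong / n_teachers


if __name__ == "__main__":
    for years in (1, 3, 5):
        print(years, "year(s):", round(misrated_share(years=years), 3))
```

Under these assumptions the misrated share shrinks as more years are averaged, but it does not come close to zero, which is the pattern described above: a longer time frame helps some but not a lot. Changing `noise_sd` changes the numbers, so treat the output as an illustration of the mechanism rather than as evidence about any real VAM system.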

Because when VAM systems are implemented, predictably ludicrous and harmful results occur. These policy decisions have real consequences for our teachers for whom we supposedly have such great respect.

Because even if we could devise a fair VAM system (which right now no one seems to be able to do), research shows consistently that the contribution of teachers to overall student test scores is 10% to 15% at most. The rest is attributable to other school factors or non-school factors. Any VAM system that imputes greater teacher responsibility than that small percentage would be highly unethical.

If Mr. Narak and SAI are going to take a policy position on teacher evaluation, they should be up on the research I cited last week. In fact, on April 21 I e-mailed Mr. Narak the research noted above. Apparently, like many legislators, he and SAI don’t seem to care that the teacher evaluation systems for which they’re expressing support are inherently unfair and probably illegal. Would they feel the same if we were talking about the principals and superintendents whom they represent?

“Dear principal, 33% of your year-to-year evaluation will be completely random. Even though what you did this year isn’t substantially different from what you did last year, you may end up being rated highly or you may be rated near the bottom. Despite the extreme rating instability, there will be real consequences for you depending on the results. Good luck.”

Our teachers deserve evaluation systems that are fair. If they’re not fair, they’re unethical. If they’re not fair, they’re illegal. And right now, despite their intuitive appeal and legislative popularity in certain circles, VAM systems are unable to meet the basic principle of fairness and thus should not be supported by SAI or any other knowledgeable educational organization or policymaker.

[I’ll also note as an aside that some states are starting to talk about evaluating administrators based on student test scores. If we are rightfully concerned about volatility in teacher ratings, wait until we remove the connection to students one additional step and try to tie scores to administrators. In other words, SAI, be careful what you advocate for because the principals and superintendents you represent are next…]

Finally, I’ll close with a plea to Jason Glass, Director of the Iowa Department of Education (DE), to publicly release the research that he has which supposedly supports VAM. Over the past months Jason has said repeatedly that DE and the Governor were not advocating for VAM approaches. And yet, here at the end of the legislative session, we somehow find ourselves discussing VAM systems and both DE and the Governor are supporting them. Whatever research Jason has, it’s going to somehow have to address the concerns noted above. Given that leading scholars and our most respected educational research/policy organizations are familiar with and have summarized the literature base and yet still strongly advocate against VAM, I’m skeptical. But, hey, maybe he’s got a bunch of dispositive studies with which both I and they are unfamiliar…

—–

I recognize that this post likely is going to make me unpopular with SAI (and even more unpopular than I already am with DE), which I regret because I’ve had good relations with them for a long time. But when the weight of the evidence is overwhelmingly against the policy position for which they’re advocating, I can’t just sit by and say nothing, not when it has very real, negative consequences for Iowa educators. John Ewing, President of Math for America, notes:

Of course we should hold teachers accountable, but this does not mean we have to pretend that mathematical models can do something they cannot.

I’ll state emphatically that under no circumstances should we pretend that mathematical imprecision in evaluative processes has no impact on teachers’ lives and the fairness of our educational systems.

Should teachers be evaluated by students’ standardized test scores? While that idea seems to make intuitive sense, my newest resource on value-added measures (VAM) highlights the rating volatility, legal issues, and other concerns that have led our most trusted assessment experts and educational research/policy organizations to vehemently advocate against evaluating teachers with student test scores:

Gates’ – and other reformers’ – dismissal of value-added’s problems with the “it’s just one of multiple measures” line is akin to saying, “Look, the entire ice cream cone isn’t made of cow patties; the manure is just one scoop along with three scoops of real ice cream.” Either way, you’re asking people to eat crap.