Grade Inflation Seen in Evaluations of Teachers, Regardless of System

Few parents, principals, or even teachers themselves agree that all teachers are equally effective at helping children learn. Yet formal teacher evaluations tell a different story, one that looks a bit like something out of Lake Wobegon.

In many districts, nearly all tenured teachers—like the children in author Garrison Keillor’s fictional town—are deemed above average, concludes a report issued last week.

Conducted by the New Teacher Project, a New York City-based teacher-training organization, the report analyzes the results of a survey of more than 15,000 teachers and 1,300 administrators across four states and 12 districts. It also incorporates records maintained by those districts’ human-resources departments. The records show that more than nine in 10 tenured teachers met local standards in recent evaluation cycles.

Although the study results don’t reflect a nationally representative sample of districts, they do suggest that norms of egalitarianism remain powerful in the teaching profession, sometimes to the detriment of students.

“This is a cultural problem, a problem of not having a commitment to recognizing key differences in performance,” said Timothy Daly, the president of the New Teacher Project.

The problem extends beyond teacher-evaluation and disciplinary procedures into other policy areas, Mr. Daly added. Because distinctions in effectiveness aren’t formally documented, districts are missing opportunities to link evaluation systems to professional-development tools, tenure decisions, and bonuses or career-ladder initiatives.

[Chart caption: Whether using binary or multiple-rating systems, districts in the New Teacher Project study routinely gave teachers high marks. Source: New Teacher Project]

The districts that employed a binary rating system granted 99 percent of tenured teachers a “satisfactory” rating. In systems with more than two categories of ratings, 94 percent of teachers received one of the two highest ratings.

The evaluations also appear to have failed as a method for offering professional development tailored to individual teachers’ needs. Seventy-three percent of the teachers surveyed said their evaluations did not identify an area for development. Only 43 percent said the evaluations helped them improve.

Nor did the systems serve to remove ineffective teachers. For instance, only 10 percent of Denver schools missing standardized-testing goals under the federal No Child Left Behind Act issued an unsatisfactory rating to a teacher over the past three years. Yet 81 percent of administrators and 58 percent of teachers in the districts surveyed said a tenured teacher in their school was performing poorly, and 43 percent of teachers said a colleague should be dismissed for poor performance.

The New Teacher Project supplemented the survey data with interviews of teachers and principals, and convened an advisory group of district administrators and teachers’ union officials from the districts studied to provide input into the report.

In lieu of strong evaluation systems, some districts have created “shadow systems” to differentiate performance, Mr. Daly said.

For instance, although Denver is widely known for its ProComp differentiated-pay system, it maintains a binary rating system that deems teachers to be either satisfactory or unsatisfactory. Those ratings aren’t connected to ProComp, professional development, or strategies for school improvement.

The report recommends that all districts adopt performance-based evaluation systems based on agreed-upon standards of good teaching, multiple rating options, and a process for offering feedback to teachers on their strengths and weaknesses.

Food for Thought

In interviews, few advisory-board members disagreed with the report’s basic findings. Several underscored the report’s recommendation that retooled evaluation systems be used to help teachers improve.

“Professional development should not be the same year in and year out for all teachers, because the kids are different,” said Beverly Williams, the assistant commissioner of education in Arkansas. “You’ve got to fine-tune it to the population you have and the teachers’ needs.”

But in a sign of the difficulty that local teacher groups and district officials will face in revising their evaluation systems, several advisory-group members brought up issues not emphasized in the report.

Union representatives, for instance, underscored that the systems should give teachers support from their peers and an appropriate chance to improve before facing dismissal. One way to do so would be via peer-assistance and -review programs, said Deb I. Tully, the director of professional issues for the Ohio Federation of Teachers.

“The formative, supportive process is the most important to identify who is doing a good job and who will need more support,” Ms. Tully said. “If teachers aren’t performing, it is a way to counsel them out of the profession.”

Officials also held differing opinions about whether a teacher-evaluation system could be used simultaneously to guide the improvement of teachers and serve as an accountability tool.

Beverly Ingle, the president of the 39,000-member Colorado Education Association, said her union favors evaluations to support teacher improvement. But she suggested that the rapport between a teacher and a professional-development coach might be jeopardized if the coach is also the person who conducts formal evaluations.

“I’ve seen coaches who are not successful because they run to an administrator. They don’t get teachers wanting to confide in them about what’s not working in their classrooms so they can have the real dialogue about increasing student achievement,” she said.

Performance-Based Pitfalls

Administrators familiar with performance-based evaluation said that even well-designed systems hinge on strong training for evaluators.

Hamilton County, Tenn., a district not in the study, evaluates teachers several times a year using a set of measures that describes escalating levels of performance. Administrators in the 40,000-student district begin with a pre-evaluation meeting with a teacher to discuss the process. Actual observation periods are lengthy, and the observations themselves are scripted so the administrator can document instances of success or struggle.

The observations are followed by more conversations and coaching on effective instructional methods, said Jennifer Spates, an assistant principal at Harrison Elementary School, in Chattanooga.

“I have an ethical responsibility to address concerns that are not best practices,” Ms. Spates said. “It really damages the morale of the faculty when you don’t.”

But Ms. Spates said principals in the district vary in how effectively they use the instrument.

“I wish there was some way we could train administrators to be more consistent and bring more fidelity to the process,” she said. “Teachers know which administrators take this process seriously, which ones are there to clean house, and which ones don’t care.”

Despite such challenges, some states are moving forward.

Ms. Williams, the Arkansas official, said her state has set up a task force to design a model evaluation instrument, and several districts, including Jonesboro, have expressed interest in adopting a finalized instrument.

The goal, she said, is to align the evaluation system with a state focus on teacher performance that now begins with entry into the profession: Arkansas is the only state to use the Praxis III, a performance-based teacher-licensing test.

“We don’t have the answers in Arkansas, but we realize what the problems are,” Ms. Williams said. “We are ready to step up to the plate and get started.”
