With only a few weeks of training, Teach for America teachers are as effective with elementary students as traditionally trained, far more experienced teachers at the same high-poverty schools, concludes a new Mathematica study.

In pre-K through second grade, TFA teachers’ students gained an extra 1.3 months of reading, the study found.

For all the talk of “value-added” performance measures, most teachers can’t be evaluated by gains in their students’ test scores because they don’t teach tested subjects or no prior test scores are available, write Grover J. “Russ” Whitehurst, Matthew M. Chingos and Katharine M. Lindquist in Education Next. That makes it important to get classroom observations right.

“Teacher evaluations should include two to three annual classroom observations, with at least one of those observations being conducted by a trained observer from outside the teacher’s school,” they recommend.

In addition, classroom observations “should carry at least as much weight as test-score gains in determining a teacher’s overall evaluation score when both are available.”

The researchers also call for statistical adjustments: “Districts should adjust teachers’ classroom-observation scores for the background characteristics of their students, a factor that can have a substantial and unfair influence on a teacher’s evaluation rating.”

Scores can be adjusted for “the percentages of students who are white, black, Hispanic, special education, eligible for free or reduced-price lunch, English language learners, and male,” they write.
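One common way to make this kind of adjustment is a simple regression: predict each teacher’s raw observation score from his or her classroom’s demographic mix, then keep only the part of the score the demographics don’t explain. The sketch below is illustrative, not the researchers’ actual method; the data are simulated and the variable names are my own.

```python
import numpy as np

# Hypothetical data: one row per classroom. Columns are classroom shares of
# students who are Black, Hispanic, in special education, eligible for
# free/reduced-price lunch, English language learners, and male.
rng = np.random.default_rng(0)
n = 200
demographics = rng.uniform(0, 1, size=(n, 6))

# Simulated raw observation scores that are unfairly depressed in
# high-poverty classrooms (column 3 = free/reduced-price lunch share).
raw_scores = 3.0 - 0.8 * demographics[:, 3] + rng.normal(0, 0.3, n)

# Fit a linear model of raw score on demographics (with an intercept).
X = np.column_stack([np.ones(n), demographics])
coef, *_ = np.linalg.lstsq(X, raw_scores, rcond=None)

# Adjusted score: the residual (what demographics don't explain),
# re-centered on the district's overall mean score.
predicted = X @ coef
adjusted = raw_scores - predicted + raw_scores.mean()
```

After the adjustment, a teacher’s score no longer moves with her classroom’s poverty rate: the adjusted scores are, by construction, uncorrelated with each demographic share while keeping the same district-wide average.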

Productivity ratings are adjusted for factors including “cost-of-living differences and higher concentrations of low-income, non-English-speaking, and special education students,” according to the report.

Of the more than 400 twin districts studied, we found the higher-spending twin spent on average $1,600 more per student to educate similar groups of students to similar achievement levels. . . . We also found a number of districts that spent equal amounts of money, had the same demographics, but ended up with different levels of student achievement.

The highest score of 6 might go to a gritty teacher who was a “member of the cross-country team for four years and voted MVP in senior year” and was also “founder and president for two years of the university’s Habitat for Humanity chapter.” The unnamed teacher-training organization that provided the data for the study is now using a version of this rating system as one of multiple tools to help make hiring decisions.

The study used the teacher-training group’s assessment of effectiveness, which was based on several different measures of student achievement.

When Eva Kellogg’s bosses evaluated her performance as a teacher, they observed her classes. They reviewed her lesson plans. They polled her students, their parents and other teachers. And then they took a look at her students’ standardized test scores.

When the lengthy process was over, the eighth-grade English teacher at Aspire Lionel Wilson College Preparatory Academy in Oakland had received the highest rank possible.

She was a master teacher.

And based on her job performance, she got a $3,000 bonus as well as a metaphorical front-row seat at one of the biggest battles in public education: how to evaluate teachers and whether to give good ones a bigger paycheck.

Forty percent of a teacher’s score is based on observation by the principal, 30 percent on students’ standardized test scores and the rest on student, colleague and family feedback, as well as the school’s overall test scores.

Teachers are ranked as emerging, effective, highly effective or master. Bonuses range from $500 to $3,000.
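The weighting described above is straightforward arithmetic, and a short sketch makes the mechanics concrete. The 40/30 split is from the article; how the remaining 30 percent divides between feedback and schoolwide scores, and the cutoffs between ranks, are my assumptions — Aspire’s actual bands aren’t published here.

```python
def composite_score(observation, test_growth, feedback, schoolwide):
    """Weighted evaluation score on a 0-100 scale.

    Weights from the article: 40% principal observation, 30% student
    test-score growth. The remaining 30% covers student/colleague/family
    feedback and schoolwide test scores; the 20/10 split is an assumption.
    """
    return (0.40 * observation + 0.30 * test_growth
            + 0.20 * feedback + 0.10 * schoolwide)

# Hypothetical cutoffs -- the article names the four ranks but not the bands.
def rank(score):
    if score >= 90:
        return "master"          # eligible for the top $3,000 bonus
    if score >= 75:
        return "highly effective"
    if score >= 60:
        return "effective"
    return "emerging"
```

For example, a teacher with strong observations but middling schoolwide scores, `composite_score(95, 90, 85, 70)`, lands at 89 and would rank “highly effective” under these assumed cutoffs.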

Guidelines released in August required states to use teacher-evaluation data, starting in October 2015, to see that “poor and minority students are not taught by ineffective teachers at a higher rate than their peers,” writes Michele McNeil. The Education Department will drop that rule.

Civil rights groups have fought for better teachers in high-poverty schools. Teachers’ unions have opposed the use of evaluation data to rate teachers.

The Education Department claims it will deal with the issue next year by putting “teeth” into NCLB. But the law deals only with “inexperienced, unqualified or out-of-field teachers,” notes Sawchuk. “The effectiveness language came later and only applied to stimulus funds.”

Most highly effective teachers turned down the transfers, notes Sawchuk.

The top 20 percent of teachers in each district were identified using each district’s own “value added” measure. They were offered a $20,000 bonus to switch, paid out over a two-year period. (Effective teachers already in those schools got $10,000.)

Of 1,500 eligible teachers, only 81 decided to transfer to qualify for bonuses.

Transferring teachers were more likely than colleagues to stay at their new schools during the two years when bonuses were paid. After that, they left at the same rate as other teachers.

Students in high-poverty, low-performing schools are much less likely to be taught by experienced and highly effective teachers, say advocates. But it’s not clear whether a teacher who’s effective with easy-to-teach students will be effective with high-risk students.

Most states are using student achievement to evaluate teachers, according to Connect the Dots from the National Council on Teacher Quality. “What is occurring more slowly are the policy changes that will connect the rich performance data from these systems to tenure decisions, professional development, compensation, teacher preparation, and consequences for ineffectiveness.”

NCTQ looks at teacher evaluation policies across the 50 states and Washington, D.C. Louisiana is “connecting the most dots,” followed closely by Florida and Tennessee, NCTQ concludes. Colorado, Delaware, Illinois, Michigan, Rhode Island and DCPS are also ahead of the curve.

Just two percent of Syracuse teachers were rated highly effective, and an additional 58 percent were deemed effective. Seven percent were classified as ineffective, and 33 percent as developing, categories that suggest low levels of teaching performance, the need for teacher improvement plans, and the threat of eventual dismissal.

On average, Syracuse teachers were rated effective on the state’s metric for student growth. They were rated effective or highly effective by the principals and peers who observed their teaching. But the school-wide measures of student achievement used by the district lowered scores significantly.

That’s because teachers had to raise test scores from 2012 to 2013 to be rated effective. But the 2013 tests, aligned with Common Core standards, were much harder. Scores went down in Syracuse — and everywhere else in the state. That was inevitable.

I wonder how State Commissioner John King, Jr. would like it if his performance evaluation were based on the same criteria applied to teachers in Syracuse. The percentage-point increase in students statewide scoring at levels 3 and 4 in ELA from 2012 to 2013? That share actually fell, from 55 percent to 31 percent. The Commissioner gets a zero. The increase in students scoring at levels 3 and 4 in math? That fell from 65 percent to 31 percent. The Commissioner gets a zero. The decrease in students statewide scoring at level 1 in ELA? That share actually rose, from 10 percent to 32 percent. The Commissioner gets a zero. And the decrease in students scoring at level 1 in math? That rose from 8 percent to 33 percent. The Commissioner gets a zero.

A good school requires a good principal, nearly everyone agrees. But most states collect little or no information about how their principals are prepared, licensed, supported and evaluated, concludes Operating in the Dark, an analysis by the Dallas-based George W. Bush Institute.