Teacher Preparation Programs: Measuring Success

Blog Post

Aug. 1, 2016

Last month, nine organizations that prepare new teachers—including Urban Teachers, Teach for America, and TNTP—signed a joint statement directed at the U.S. Department of Education and Congress. The issue at hand? Ensuring teacher preparation programs are publicly held accountable for their graduates’ outcomes.

Currently, there are no federal requirements for states to collect or publish data on teacher preparation program outcomes: the Department of Education has yet to finalize its controversial 2014 draft regulations requiring states to develop systems for holding preparation programs publicly accountable. Despite this, at least 11 states are experimenting with requiring preparation programs to collect and report on their graduates’ outcomes. And more states could soon be joining them, as the newly minted Every Student Succeeds Act (ESSA) explicitly allows states to use federal funds to reform their teacher preparation programs. But state education agencies may struggle to determine how best to create and use such outcomes-based data systems without additional guidance and support. Addressing this issue, the nine organizations’ statement calls for federal policymakers to provide guidance, funding, and technical assistance to state education agencies to help teacher preparation programs collect and publicize information on the performance of their graduates.

The published statement argues that requiring all preparation programs to report outcomes data will benefit three groups of stakeholders: future teacher candidates searching for great programs; hiring managers scouting for well-trained teachers; and the teacher preparation programs looking to improve. (Although the list of beneficiaries does not explicitly mention the students assigned to novice teachers, the group’s letter does mention how program outcome transparency will also serve students.) The letter also highlights sample outcomes that programs could be required to report to help drive decisions by these stakeholders, such as teacher retention rates, principal satisfaction with new teachers, and performance on teacher evaluations.

Two aspects of this letter are especially notable. First, the letter is not signed by any “traditional” teacher preparation programs based in institutions of higher education, which raises questions about the current willingness of many programs to be held publicly accountable by the measures mentioned here. Second, the letter suggests using one type of data that has been particularly controversial in preparation program accountability debates: gains in student achievement.

The contention over using measures of teachers’ student learning gains to assess preparation programs’ performance is an extension of the contention over using them to assess individual teachers’ performances. Over the last five years, the majority of states have developed multi-measure teacher evaluation systems that, at a minimum, include measures of student learning growth and observations of teachers’ classroom practice. As a result, researchers increasingly have access to various types of data about teachers’ performance—and can use it to understand how performance is related to teachers’ preparation.

While several academic studies have explored whether there are differences in preparation program performance based on teachers’ impact on student learning gains, a new study from the University of Michigan’s Matthew Ronfeldt and Shanyce Campbell is the first to also examine whether there are differences between preparation programs based on the classroom observation ratings received by their graduates during their individual performance evaluations. The study used three years of teacher evaluation data from elementary and secondary teachers in Tennessee to examine the performance of recent graduates from 39 Tennessee university, college, and alternative teacher preparation providers. Specifically, the researchers looked at individual preparation programs within each institution (undergraduate elementary or graduate secondary, for example) that graduated more than ten teacher candidates who were subsequently employed as full-time teachers.

Ronfeldt and Campbell used a method of data analysis that attempts to account for the components of teacher evaluation that are outside the control of the teacher or the program they attended. (For example, if a preparation program sends teachers to work in a school that systematically inflates its teachers’ evaluation ratings, this would be controlled for in the analysis.) The researchers then looked at the relationship between the average observation rating and the average value-added rating (a measure of teachers’ impact on their students’ state test score growth) for each preparation program’s recent graduates. They found that programs whose graduates performed best on measures of observed classroom practice also performed best on measures of teachers’ impact on student achievement. Programs with graduates who, on average, had classroom observation ratings in the top 25% had significantly higher value-added ratings than programs whose graduates, on average, scored in the bottom 25% on observation ratings. The difference in value-added ratings was similar to what we might expect if teachers from the top-performing programs had an additional year of teaching experience compared to teachers from the poorest-performing programs.

However, Ronfeldt and Campbell also found that different programs landed in the top or bottom quartiles when they ranked programs by teachers’ impact on student learning (as measured through a value-added rating) instead of observation ratings. Only 40% of institutions and programs ended up in the same quartile based on both measures. As a result, they recommend that states hold preparation programs accountable using the “checks and balances” of multiple measures (like observation ratings and growth in student learning).
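To make the quartile comparison concrete, here is a minimal Python sketch of how one might check whether programs land in the same quartile on two measures. It is an illustration only, not the study's method: the program names and scores are invented, the function names are hypothetical, and it skips the statistical adjustments the researchers applied before ranking.

```python
from statistics import quantiles

def quartile_labels(scores):
    """Assign each program a quartile (1 = lowest, 4 = highest) by its score."""
    cuts = quantiles(scores.values(), n=4)  # three cut points between quartiles
    return {name: 1 + sum(score > cut for cut in cuts)
            for name, score in scores.items()}

def share_in_same_quartile(measure_a, measure_b):
    """Fraction of programs placed in the same quartile by both measures."""
    labels_a = quartile_labels(measure_a)
    labels_b = quartile_labels(measure_b)
    matches = sum(labels_a[name] == labels_b[name] for name in labels_a)
    return matches / len(labels_a)

# Invented average ratings for eight hypothetical programs (not real data).
observation = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0,
               "E": 5.0, "F": 6.0, "G": 7.0, "H": 8.0}
value_added = {"A": 2.0, "B": 1.0, "C": 5.0, "D": 3.0,
               "E": 4.0, "F": 8.0, "G": 6.0, "H": 7.0}

agreement = share_in_same_quartile(observation, value_added)
print(f"Programs in the same quartile on both measures: {agreement:.0%}")
```

Even in this toy data, where the two measures broadly move together, several programs shift quartiles depending on which measure is used. That is the kind of disagreement that motivates relying on more than one measure.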

These findings point to the need for more research on how to measure preparation program performance, research that can inform federal and state policymaking and guidance. But they also point to the importance of maintaining multiple measures in teacher evaluation in the first place: a single measure simply can’t convey the whole performance story, whether it be the story of an individual teacher or of an entire cohort of graduates from a preparation program. State policymakers who are considering discontinuing the use of student growth measures in teacher evaluation systems should take heed.