One of the most talked-about education studies in recent months is a new working paper on the effects of Louisiana’s statewide voucher program during its first year of operation. In short, the authors find that students who won a school voucher via lottery ended up having substantially lower achievement after one year in math, reading, science, and social studies compared to students who lost the lottery and received no voucher.

This study has been the topic of much conversation in part because voucher research is inherently controversial, but also because the results are somewhat anomalous. Rigorous studies (like this new study in Louisiana) often find at least some positive effects from voucher programs, and it is interesting and important to think about why the effects of voucher programs might vary across contexts.

One of the most plausible explanations of the Louisiana results, put forward most forcefully by Jason Bedrick of the Cato Institute, is what has come to be known as “Overregulation Theory.” According to Overregulation Theory, regulations imposed by Louisiana’s voucher scheme were so burdensome that only the private schools most desperate to boost enrollment opted to participate. To be eligible for voucher students, private schools had to agree not to set admission requirements or charge tuition above the (relatively meager) value of the voucher, and to administer the state’s standardized tests. These requirements, however well-intentioned, may have discouraged many private schools from accepting vouchers at all.

As Bedrick points out, this theory is consistent with the study’s finding that participating private schools were more likely than non-participating schools to have experienced significant enrollment declines prior to entering the voucher program. These schools may have had declining enrollments precisely because they were among the least effective schools, and ineffective schools will produce less student learning.

While it is plausible that regulations did reduce private school participation, it is not obvious that Overregulation Theory is entirely consistent with the available evidence. For one thing, the authors of the Louisiana study specifically check to see if learning outcomes vary significantly between schools experiencing greater or lesser prior enrollment declines, and find that they do not. (Bedrick acknowledges this, but doubts there was enough variation in the enrollment trends of participating schools to identify differences.) Joshua Cowen of Michigan State University also points out that there is previous evidence of positive effects from accountability rules on voucher program outcomes in other states (though regulations may differ in Louisiana).

It is also important to recognize, however, that Overregulation Theory—even if true—does not by itself explain substantial negative effects from vouchers. After all, one of the primary mechanisms by which vouchers are normally theorized to have benefits is that they enable families to choose superior schools. Overregulation Theory—at most—explains why those benefits would not obtain, not why students would experience harm.

That is, even if regulation prevented all but the worst private schools from participating, this would explain why students did not benefit from transferring into them, but not why students would transfer into them in the first place.

So Overregulation Theory might be part of the story in explaining negative voucher effects in Louisiana, but it is not by itself sufficient. To explain the results we see in the study, it is necessary to tell an additional story about why families would sort into these apparently inferior schools.

Such a story is certainly possible. It may be that families are selecting schools that provide unobserved benefits, including benefits that accrue over longer time horizons. In such a case, families may simply be sacrificing learning gains for other advantages that this study does not capture (e.g., school safety or character education). It may also be that Louisiana’s voucher system somehow induced families to select inferior schools. It is not immediately obvious how such an inducement might work, but if families’ ability to choose schools wisely is fragile or sensitive to program design, that may have important implications for policy. Similarly, if families simply tend to assume that any private school must be superior to their available public schools, it may be that informing parents about school quality is more difficult than school choice advocates tend to assume.

In any case, it is likely that the existing evidence will not allow us to fully adjudicate between competing hypotheses. Analyses of subsequent years of data from Louisiana may also change the picture considerably. It is nevertheless important to articulate our theories fully and clearly so that we can test them going forward, and while Overregulation Theory may be one piece of the puzzle, other pieces are still required.

As school choice expands in both the public sector (e.g., via charter schools) and the private sector (e.g., through vouchers and education savings accounts), it will become increasingly important to understand how families determine where their children will be educated. Choice programs may give parents the ability to choose schools that are better (or simply better for their child). Nevertheless, this new study out of Louisiana suggests that there may also be a risk that students will sort into new schools in sub-optimal – or even harmful – ways. By better understanding how parents are choosing schools for their children, we can maximize the benefits of school choice while mitigating the risks.

First, for some context, here’s California’s K-12 student enrollment. Between the 1997-98 and 2004-05 school years, student enrollment increased by about 10% and has been roughly flat since. There are now approximately 6.2 million students in California’s K-12 system.

California hired more teachers as enrollment grew, and eventually cut back right around the time of the recession. Last year there were roughly 295,000 teachers in California’s public schools.

As enrollment and hiring have ebbed and flowed, the ratio of students to teachers in California has oscillated between 20 and 22.
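As a rough consistency check on that range, the two headline figures above can simply be divided. This is only a back-of-the-envelope sketch: the enrollment and teacher counts are the rounded values quoted in this post (and may not refer to exactly the same school year), and the variable names are my own.

```python
# Rounded figures quoted in this post; they may not refer to exactly the same school year.
students = 6_200_000   # approximate K-12 enrollment
teachers = 295_000     # approximate number of public school teachers

# Statewide students per teacher, ignoring any differences in how each count is defined.
students_per_teacher = students / teachers
print(f"{students_per_teacher:.1f} students per teacher")
```

The result lands at about 21, inside the 20 to 22 band described above.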

At the same time, student demographics in California have changed substantially, with the proportion of white students steadily falling while Hispanic/Latino students became a majority.

Teacher demographics in California have changed somewhat as well, with white teachers becoming somewhat less common and Hispanic/Latino teachers somewhat more common.

The teaching force in California nevertheless remains much less diverse than the student population. Today there are roughly 8 white students for every white teacher in the state. That compares to 35 black students per black teacher (about the same as for Asian students & teachers) and 63 Hispanic students per Hispanic teacher.

Shifting gears to the size of the new teacher supply in California, the first and most striking fact is that the number of new credentials issued in California has fallen for each of the 10 most recent years for which data are available. The 14,810 credentials issued in 2013-14 represent a 53% drop from the high in 2003-04. This includes traditionally-certified teachers, teachers in intern programs (like Teach for America) and teachers converting their credentials from other states.

(I recently heard Linda Darling-Hammond say that enrollment in teacher preparation programs has climbed recently. She is in a position to know, but the next round of enrollment numbers probably won’t be public until October, and new credentialing numbers may not be available until April or so.)

The 14,810 credentials issued in 2013-14 are equivalent to 5.1% of the existing teaching force that year. That is the lowest level since at least 1997-98.
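The percentages quoted so far can be turned back into rough counts. The sketch below is simple arithmetic on the rounded figures in this post, so its outputs are implied approximations rather than reported data, and the variable names are hypothetical.

```python
# Figures quoted in this post (percentages are rounded, so results are approximate).
credentials_2013_14 = 14_810
share_of_teaching_force = 0.051   # credentials as a share of that year's teachers
drop_from_peak = 0.53             # decline relative to the 2003-04 high

# Teaching force implied by the 5.1% share.
implied_teaching_force = credentials_2013_14 / share_of_teaching_force

# Peak-year credentials implied by a 53% decline to 14,810.
implied_peak_credentials = credentials_2013_14 / (1 - drop_from_peak)

print(f"Implied teaching force: about {implied_teaching_force:,.0f}")
print(f"Implied 2003-04 credentials: about {implied_peak_credentials:,.0f}")
```

The implied teaching force comes out near 290,000, close to the roughly 295,000 teachers cited earlier (the two figures may refer to different years), and the implied 2003-04 peak is a little over 31,000 credentials.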

As is often the case in education, overall numbers can mask a lot of variation. If we break down credentials into three broad categories – multiple subject (elementary), single subject (secondary), and specialist (special education) – we can see that the vast majority of the drop is attributable to fewer elementary teachers becoming certified, with multiple subject credentials falling by 2/3 since 2003-04. Over the same time period new single subject credentials are down by about 40% and special education credentials are down by about 10%.

Of course, some of these categories can be broken down further still, as single subject and specialist credentials come with “authorizations” to teach specific subjects or students with specific special needs. This gets somewhat trickier with the data currently available to me, which doesn’t track intern credentials by authorization until 2007-08. If we restrict ourselves to the last seven years – when intern credentials can be included in the totals – it turns out that, at least recently, credentialing declines have been broadly similar across many common credential types. (I’ve indexed all values to 2007-08 so that the large number of elementary credentials doesn’t stretch the y axis.)

This is not to say there aren’t exceptions and variation. If we look at all four categories of science credentials, for example, we can see that, at least over the last seven years, chemistry and physics credentials have not declined as much (proportionally) as biology and geosciences credentials.

Is the decline in new teachers putting pressure on schools as they hire staff? It’s very hard to generalize, particularly in a state as large as California, but one way to see if schools are experiencing supply constraints is to see how many teaching permits they are receiving from the state. There are various kinds of teaching permits, and California has changed its permit requirements over the years, but in general permits allow schools to hire teachers who are not fully credentialed in a given subject area provided that they (the school) can demonstrate that they have made an effort to find fully credentialed candidates.

California reports the number of teaching permits issued by subject area each year, and we can see that while permits for common subject area authorizations declined and almost vanished after the turn of the century, they have ticked back up in recent years.

While still low in historical terms, various types of teaching permits and waivers (a rarer and more extreme type of credentialing exemption) increased between 29% and 51% last year. This may indicate that schools are having a harder time finding fully credentialed candidates to hire.

This perhaps should not be surprising since schools in California have been increasing their hiring recently, even as the number of newly-issued credentials continues to drop. Each year schools provide the state with estimates of how many teachers they will need to hire the following year. These are likely only very crude estimates of the actual number of teachers who will be hired and the number of new teachers who will be required, but it is illustrative to consider trends.

If California is going to need to certify more teachers, it is helpful to know where the state’s teachers come from. Historically, the vast majority of teachers have been prepared by programs in the California State University system or at private schools. (University system figures include both their traditional and intern programs.)

However, as we can see, the relative importance of each source has changed somewhat over the years, with teachers increasingly likely to be prepared out of state or in University of California schools. This is largely because enrollment at CSU and private schools has dropped dramatically.

Hopefully I’ll have a chance to update these charts as new reports come out in the future.

[M]any teacher evaluation reform efforts may be focused too heavily on the demand side of teacher evaluation. That is, many reform efforts tend to assume that principals are overly generous with their evaluations because they lack either the motivation or the information to demand better performance from their teachers. There may be something to this, but it is important not to ignore the supply side of the teacher quality problem. After all, the extent to which a principal is willing to dismiss (or give a poor evaluation to) a teacher will likely depend in part upon her beliefs about the probability of finding a superior replacement in a reasonable period of time.

The extent to which principals today are constrained in their evaluation and dismissal decisions by the quality and size of the teacher labor supply is not obvious and probably varies by grade level, content area, and geographic location. There are, however, reasons to suspect that teacher supply constraints are real and may be getting worse.

Feedback has generally been very positive, but I did hear a few critiques that are worth responding to. Some of this I’d have included in the original post but even as it was I was running a little over-long.

“It’s Much Harder to Use Policy to Influence Teacher Supply”

I heard from several people that the reason education reform has not targeted supply more directly is that the policy levers to influence demand are mostly at the K-12 level and for various reasons therefore easier to pull. That is, evaluation reform can be legislated or controlled much more easily than change in the higher education system (where teachers are trained).

There may be something to this, but I’m not sure I fully buy it. For one thing, teacher evaluation reform seems to me to have been enormously difficult politically in most places, and my point is precisely that for all of the political oxygen that’s been consumed the actual impacts have often been muted. So I’m not really sure what “harder” means when thinking about teacher supply reform.

Second, the K-12/higher ed dichotomy is largely false. While improving, say, teacher preparation would be hard, the teacher supply also depends a great deal on factors at the K-12 level. As I wrote in the Brookings piece, teacher compensation, working conditions, and even evaluation may all matter for the quantity and quality of the teacher supply, but seem to me to have been unjustifiably neglected.

Finally, while reform at the higher education level may be difficult, it could nevertheless be worthwhile. In fact, teacher preparation reforms may be one of the best ways to improve not only the size, but the quality of the teacher supply. That promise is why I signed on as an adviser to Deans for Impact.

“Reformers Really Have Targeted the Teacher Supply”

Matt Barnum thinks I’m not giving reformers enough credit for improving the teacher supply. After all, the rise of alternative certification – which has lowered barriers to entry into teaching seemingly without sacrificing quality – is arguably one of reformers’ biggest policy wins.

I’m a supporter of alternative certification for this very reason, so I basically agree with Matt. Three caveats, however.

First, it’s not entirely obvious to me exactly how big the supply effects of alternative certification programs really are. Some programs seem to be making an effort to target geographic and subject areas that are most in need, but that’s not a universal priority and I haven’t seen a good analysis of whether these programs are meeting our greatest needs in effective ways. I also don’t know how many people who enter through alternative certification wouldn’t enter the classroom otherwise. (As one piece of anecdata, I entered the classroom through a traditional route after my application to Teach for America was rejected.)

Second, to the extent that alternative certification programs do not emphasize – and sometimes deliberately deemphasize – teacher retention, they may be undermining their own supply benefits. Again, it’s an open question in my view and if there are good analyses out there I’d like to see them.

Third, I have concerns about common narratives around recruiting the “best and brightest” through some alternative certification programs. To some extent alternative certification is about lowering barriers to entry into the profession. In some cases, though, alternative certification has become about throwing up different barriers to entry, e.g., recruiting the most academically impressive candidates from the most exclusive colleges and universities. Since it’s not clear that these new barriers result in a better teaching force overall, I worry that they contribute to growing-but-as-yet-unjustified calls to “raise the bar” for teachers.

“Teacher Evaluation is not about Dismissing Teachers”

On the subject of contributing to unproductive narratives, Kaitlin Pennington suggests that I have framed the purpose of teacher evaluation too narrowly:

While there is some truth to Bruno’s argument, his framing of teacher evaluation—as a system that is built to “dismiss teachers”— furthers a flawed narrative about the purpose and use of teacher evaluation systems.

The purpose of reformed teacher evaluation systems, first and foremost, is to identify teachers’ strengths and weaknesses in order to refine educators’ instruction for improved student learning. New evaluation systems were meant to be a tool to reward excellent instruction, provide opportunities for targeted professional development, and create systems of support in schools in districts.

I’ve been pushing reformers to focus less on teacher dismissal for years, so I don’t really disagree with this and certainly did not intend to suggest that dismissal is the only – or even the primary – purpose of teacher evaluation. As I say even where Pennington quotes me, the teacher supply matters even for critical evaluations and apart from dismissal. Principals are making all of their evaluation decisions – whether for staff development or for dismissal – in the context of the status quo teacher supply. Their criteria for effectiveness and their willingness to risk alienating staff with critical evaluations will likely depend in part upon the quantity and quality of the other teachers available to them.

Also, not to put too fine a point on it, but teacher evaluation really is in part about dismissing teachers. I do not think administrators should be firing large numbers of teachers for performance, but I’m certainly not the only person who believes they should be dismissing some teachers based on poor evaluations.

I do think there is a narrative around teacher evaluations that is overly focused on their punitive uses. As Pennington herself points out, “new teacher evaluation systems in many places were sold as ways to ‘get rid of bad teachers.’” So while it’s not my narrative and while I don’t want to contribute to it, the narrative she is concerned about really does exist and it probably does change the way people read a piece like mine.

I probably could have done a better job distancing myself from that narrative, but there’s also an important lesson there about narratives in general. The existence of that narrative could probably have been avoided if, many years ago, education reformers had been more thoughtful with their policies and careful with their language. But now that we’re stuck with it, it makes talking about teacher evaluation and dismissal even in measured terms that much more difficult.

If reformers now want to distance themselves from the narrative that teacher evaluation reform is mostly about firing bad teachers, I think that’s great. At the same time, it’s worth thinking about how we can be improving other narratives today so that they aren’t tripping up worthwhile reform efforts tomorrow.

Many contemporary education reform efforts attempt to leverage teacher evaluation policy to improve teacher quality, by making the evaluation process more rigorous or by tying results more directly to student learning outcomes, for example. By increasing the demand for high-quality teaching and teachers, these reforms have had some success. However, insufficient attention to the supply of teachers may be preventing many teacher quality and evaluation reforms from realizing their full potential.

To be clear, there is preliminary evidence suggesting that contemporary evaluation reforms may in at least some cases have the desired effects. For example, raising the rigor and stakes of teacher evaluations in New York City and Washington, DC seems to have improved teacher quality in both locations, whether through teacher improvement or selective attrition of weaker teachers.

At the same time, however, many other contentious efforts to reform teacher evaluation have resulted in little change to teacher evaluation outcomes. Recent statewide efforts to make evaluations more rigorous and meaningful in New Jersey, New Mexico, New York, Florida, Indiana, Rhode Island, Maryland, and Hawaii, for example, have resulted in the vast majority of teachers (often well over 90%) continuing to receive ratings of “effective” or “highly effective.” These reforms may have been valuable, but they have disappointed reformers who are skeptical that the results accurately reflect the quality of the teaching force in schools where large numbers of students are not academically proficient.

If we are surprised by the often muted effects of teacher evaluation reform, that is perhaps because we are insufficiently sensitive to the forces that contribute to seemingly inflated teacher evaluations. And there are many reasons why managers might (and often do) tend to evaluate their employees highly, including: aversion to interpersonal conflict; genuine beliefs about the quality of the employees they’ve hired; and the maintenance of workplace morale.

Additionally, many teacher evaluation reform efforts may be focused too heavily on the demand side of teacher evaluation. That is, many reform efforts tend to assume that principals are overly generous with their evaluations because they lack either the motivation or the information to demand better performance from their teachers. There may be something to this, but it is important not to ignore the supply side of the teacher quality problem. After all, the extent to which a principal is willing to dismiss (or give a poor evaluation to) a teacher will likely depend in part upon her beliefs about the probability of finding a superior replacement in a reasonable period of time.

The extent to which principals today are constrained in their evaluation and dismissal decisions by the quality and size of the teacher labor supply is not obvious and probably varies by grade level, content area, and geographic location. There are, however, reasons to suspect that teacher supply constraints are real and may be getting worse.

As an example, consider my own state of California. A number of states have seen steep drops in enrollment in teacher preparation programs in recent years, and the declines are particularly stark in California. The number of new teaching credentials issued in the Golden State has fallen for ten consecutive years, for a total drop of 53 percent from the peak in 2004. (Total K-12 public school enrollment in California is up nine percent since 1998 and has declined one percent since 2004.)

The vast majority of this decline is attributable to a sharp drop in the number of multiple subject credentials (usually granted to elementary school teachers), which may suggest that elementary teachers have historically been overproduced in California. However, single subject credentials, generally required for middle and high school teachers, have fallen 41 percent since 2004 as well.

Ultimately, the extent to which the decline in new credentials is impacting districts’ hiring, principals’ evaluation processes, or students’ learning in California is not clear. It is nevertheless plausible that a shrinking supply of teachers increases principals’ uncertainty about their prospects for finding superior replacements for unsatisfactory members of their staffs, especially for certain harder-to-staff teaching positions. As a result, principals may be less inclined to dismiss their weak teachers or even to risk offending them with low evaluation scores.

Because the causes of our shrinking teacher supply are not entirely clear, it is difficult to know how best to respond. Still, some general, if mostly untested, principles suggest themselves.

First, stricter evaluation requirements should probably be coupled with more aggressive teacher recruitment and retention policies. Higher salaries or more pleasant working conditions could go a long way toward growing the supply of teachers and toward making sure that good teachers aren’t leaving voluntarily. There is evidence that teacher turnover is harmful to student achievement, and a principal faced with replacing several voluntarily-departing teachers may be less inclined to evaluate the teachers who remain more harshly.

Second, policymakers should consider systems of differential compensation and evaluation that allow districts and administrators to be more generous or flexible with harder-to-staff teaching positions. This may mean higher salaries for teachers in certain schools, subject areas, or grade levels. It could also mean somewhat looser evaluation requirements for teachers who, realistically, would be more difficult to replace. Allowing administrators to waive subsequent observations and evaluations for math teachers who perform well early in the year, for example, could both gratify math teachers and free principals to focus their evaluation efforts where they may be more useful.

The fact that the supply of teachers can change dramatically suggests that we should perhaps focus more on the supply side of teacher quality issues. Researchers should examine more carefully the extent to which teacher supply can be manipulated and the ways in which it impacts (or doesn’t impact) student learning and human resources for schools. Reformers and policymakers, meanwhile, should be mindful of whether their preferred reforms depend on, or have implications for, the quantity and quality of the teacher supply.

4. Reform Math Went Poorly in Quebec. This post was about a fascinating and important study that reflected very poorly on reform math and progressive education more generally. We should still be talking about this study, both because its findings are important and because it’s the sort of large-scale real-world implementation study that education needs more of.

5. For Reformers: An Important Paper on Worker Compensation and Incentives. This post nicely illustrates why, as sympathetic as I am to a lot of reformy positions in education, I don’t identify with the reform movement in general. The problem is myopia. When reformers talk about American education, they generally think only about 1) America and 2) education. This is unfortunate because other countries and other sectors deal with many very similar problems and neglecting them leads to some very confused – or at least highly simplistic – thinking about American education.

And here, as a little bonus, are my favorite of my pieces written in other venues this year:

What Do We Really Know About Eva Moskowitz’s Success? Written for the Fordham Institute. I don’t do anything especially complicated here, but I think I do lay out reasonably clearly the questions about Success Academy charter schools that we don’t have answers to and often don’t even seem interested in asking.

My Goodbye & Retrospective at This Week in Education. If you look at all my writing, it turns out that at some level I’m just saying a few things over and over again. But I really like those things!

Starting in 1999, schools in Quebec implemented an ambitious curricular & instructional program at all schools in the province. Broadly speaking, this program can be considered “constructivist” and the math program in particular seems to have been of the “reform math” variety. To get a sense for what the reformers had in mind, they described wanting students to increasingly

find answers to questions arising out of everyday experience, to develop a personal and social value system, and to adopt responsible and increasingly autonomous behaviors

and

Instead of passively listening to teachers, students will take in active, hands-on learning. They will spend more time working on projects, doing research and solving problems based on their areas of interest and their concerns. They will more often take part in workshops or team learning to develop a broad range of competencies.

A little over a decade later, a team of economists went in to see how these reforms were going. (An older, ungated version of the paper can be found here.)

Apparently it did not go well.

Catherine Johnson has a good rundown, but I wanted to highlight a few things in particular.

First and foremost, the overall results in terms of student learning appear to have been quite bad:

Our data set allows us to differentiate impacts according to the number of years of treatment and the timing of treatment. Using the changes-in-changes model, we find that the reform had negative effects on students’ scores at all points on the skills distribution and that the effects were larger the longer the exposure to the reform.

This study provides support for my pedagogy of privilege hypothesis, namely that “progressive” teaching may be acceptable for the strongest students, even if most students, and especially the weakest students, are likely to flounder:

In grade 2, only students in the 75th percentile appear to be significantly impacted by the reform. However as we move from grade 2 to grades 9–10, the effect also becomes significant for lower and average performing students. In grades 9–10, the magnitude of the coefficients is the largest for students in the 25th percentile, and slowly decreases as one moves toward the upper tail of the distribution. Looking at the top of the distribution (90th percentile), we also find negative effects across all grades, but the estimates are generally not significant. It is possible that the reform did not harm top performers. It is also possible that the reform did impact top performers, but that the number of observations at this mass point is too small to obtain precise estimates…

Lower performing students were impacted more severely, and the effects grew larger as students progressed from primary to secondary school. These large negative effects are worrying, and suggest that the reform may have harmed those most in need.

Notably – and ominously – the reforms in Quebec seem to be aligned with reforms that have been advocated in other places, including the United States:

Evidence…suggests that most OECD countries are moving away (or have long moved away) from the traditional (more academic) teaching approach. More specifically, the teaching approach promoted by the Quebec reform is comparable to the reform-oriented teaching approach in the United States. As of 2006, this approach was widely spread across the United States (although more traditional approaches remained dominant) and it was supported by leading organizations such as the National Council of Teachers of Mathematics, the National Research Council, and the American Association for the Advancement of Science.

Katharine Beals predicted – correctly, in my experience – that advocates of reform math would respond by claiming that there must have been “implementation” problems. This is not an inherently unreasonable argument to make, but it has a whiff of wishful thinking about it and is not clearly supported by the evidence.

Consider, for example, that these changes in Quebec were rolled out in an extraordinarily cautious fashion by the standards of education reform. From the paper:

The implementation timeline spanned more than a decade and was rolled out in only one or two grades per year, always preceded by a year of planning, training, and professional development:

Extensive training was provided to support the new program. The year prior to the implementation in Elementary Cycle 1, teachers, principals and government officials began the task of preparing the implementation of the reform. Sixteen pilot schools along with several other Lead schools in the English sector experimented with the key concepts of the program of study, as well as school organizational approaches that could be best suited to the strategies required to maximize the effectiveness of the learning environment.

In June 2000, principals in conjunction with teachers began developing their implementation plans for September 2000. Each school was allowed to develop its own approach to deal with the implementation since no single approach was believed to meet the needs of each school across Quebec…In 2000, all schools, both elementary and secondary, participated in some way to the development of the implementation of the reform despite the fact that it did not affect all levels of schooling at the time. Guides for teachers were produced. The implementation was staggered over many years (grades), giving time for teachers to adapt to the new programs.

It may be that the reforms could have been implemented more effectively, but this is nevertheless a lot of preparation. If a decade of gradual phase-in with elaborate supports is inadequate for effective reform math implementation, arguably the problem is not with “implementation” per se.

The strongest support for the “poor implementation” hypothesis is the researchers’ finding that, as time went on, younger students seemed to experience less of a negative effect from the instructional reform. This could suggest that implementation was getting better over time but, as the authors note, this finding is neither unambiguous nor encouraging:

We find that grade 2 students, 8 years after the implementation of the reform, no longer seem to experience a significant negative effect…The reform being ambitious, it is possible that it took a fair number of years for teachers to develop the necessary skills to fully deploy all aspects of the reform. It may also be the case that, observing the decline in students’ academic performance, teachers informally decided to reintroduce some of their pre-reform teaching approaches, and set aside in part or in totality the reform approach…[W]e are unable to identify which of these two explanations is dominant. In any case, this finding implies that at best the provincial reform had no long run effects on the development of procedural mathematics skills.

In other words, teachers may have gotten better at teaching math using constructivist techniques or they may have given up on trying. In either case, the authors were unable to find any significant positive effect of the reform for young students even after their schools were 8 years into implementation.

To see if their findings are limited by the particular math test taken by students in Quebec, the authors also look at TIMSS and PISA results. Those international tests assess a broader range of skills and allow them to compare trends in Quebec to trends in neighboring Ontario. The patterns are similar. For TIMSS:

Grade 8 students’ performance shows a similar pattern when results from 2007 and 2011 are compared with results from all previous years: Quebec’s performance in both mathematics and sciences is trending downwards, while the performance in Ontario is increasing or stable…

Estimated effects are large and negative in all cases. They are significant in mathematics in both grades, but only in grade 8 in science.

And PISA:

The ERES project has recently produced two reports comparing the math knowledge and the French proficiency of grade 11 students exposed to the reform to that of pre-reform ERES. The math test uses 25 questions from exercises administered during the 2003 and 2006 PISA assessments…In sum, students in the reform group scored slightly lower on average, with a larger difference in geometry and algebra. As for the French proficiency test, they do not find any significant differences, but almost 30% of grade 11 students in the two groups did not complete the assessment.

Overall, the evidence from TIMSS and PISA suggests a worsening of Quebec’s students performance post reform in mathematics and at best a stand still in science and French.

The authors also did not limit themselves to academic indicators, and looked at the effect of reform on (self-reported) indicators of student behavior. Again, they found some negative effects and – at best – some null effects:

We find that the vast majority of coefficients across all grades and outcomes suggest a negative impact of the reform on students’ behavior. With a few exceptions, these effects are rarely significant. In grades 5–6 and 7–8, we find a significant worsening of the situation for the following measures: hyperactivity, anxiety, physical aggression, interpersonal competencies and emotional quotient. In grades 5–6 and 7–8, the strongest evidence that the policy had an impact on behavior is for hyperactivity and anxiety (more than 50% of a std. dev.). The effects are rather strong, positive (more hyperactivity and anxiety) and robust to sample and method. In grades 9–10, the estimated effects are significant only for prosocial behavior, physical aggression and property offense.

Our results are in line with those reported by ERES, which found no effect on social adjustment, personal and emotional adjustment and intrinsic motivation. They also found that post-reform students felt less well-adapted to secondary school, male students were found to have lower self-esteem, and at risk students were less engaged in school work. We therefore conclude that the reform did not improve the behavior of students measured using the self-reported NLSCY behavioral indicators.

Needless to say – and as the authors themselves acknowledge – this study is not really “definitive” in any meaningful sense. This study was only able to measure short- and medium-term effects of this particular reform on some academic abilities and, to a lesser extent, some behavioral effects. And rigorous studies of large-scale curricular reforms are few and far between, so we don’t have a huge body of research to pull from and one study should never be relied on too heavily.

Nevertheless, there is considerable zeal in some quarters for widespread adoption of constructivist, progressive, or “student-centered” teaching approaches. Advocates of such approaches should find these results concerning.

So, it is doubly important that Carnegie commissioned McKinsey to use the reformers’ data “to test whether or not it might be possible to avoid large drops in graduation rates using human capital strategies alone.”

A year ago, Carnegie and McKinsey concluded, “The short answer is no: even coordinated, rapid, and highly effective efforts to improve high school teaching would leave millions of students achieving below the level needed for graduation and college success as defined by the Common Core.”

They determined that the six-year dropout rate would double from 15% to 30%. If, as Carnegie projects, the four-year graduation rate drops from 75% to 53%, that would be a blow that Common Core probably couldn’t survive.

If the dropout rate were to double that would indeed be horrific, which is precisely why it won’t happen.

It helps first to understand how the Carnegie Corporation arrived at its conclusions. Essentially, the authors assume that under the CCSS, roughly twice as many students (67% vs. 34% now) will be considered “below grade level” when they enter high school. If, like today, half of such students drop out, we can expect the dropout rate to double right along with the number of students “below grade level”.
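The arithmetic behind that projection can be sketched in a few lines. This is only an illustration, not the report's actual model: the function name is hypothetical, and it assumes, as a simplification, that all dropouts come from the “below grade level” group and that half of that group drops out.

```python
def projected_dropout_rate(share_below_grade_level, dropout_prob_if_below=0.5):
    """Share of all students who drop out.

    Simplifying assumption (not from the report): only students labeled
    'below grade level' drop out, each with the same fixed probability.
    """
    return share_below_grade_level * dropout_prob_if_below


# Shares cited in the text: 34% below grade level now, 67% under the CCSS.
current_rate = projected_dropout_rate(0.34)
projected_rate = projected_dropout_rate(0.67)

print(f"current:   {current_rate:.1%}")
print(f"projected: {projected_rate:.1%}")
print(f"ratio:     {projected_rate / current_rate:.2f}x")
```

The projected rate roughly doubles only because the dropout probability is held fixed at one half while the label is applied to about twice as many students.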

This is crude, to say the least.

The core of the confusion is this: Even if we assume that the Common Core standards are considerably tougher than the state standards they are replacing, it does not follow that students will be held to substantially higher requirements in practice.

For one thing, whether students are identified as “below grade level” depends at least as much on the Common Core tests – and associated cut scores – as it does on the standards themselves. While there is pressure from some quarters to raise cut scores on these tests and thereby identify fewer students as “proficient”, there will also be downward pressure on cut scores if officials are reluctant to tell many more families and communities that their kids are not as “smart” as they thought they were. The authors of the Carnegie Corporation report assume that Common Core cut scores will end up matching NAEP’s, but this is basically a guess on their part.

More importantly, whether a student is identified as “below grade level” in this way doesn’t tell us much about the extent of her academic challenges in high school. The biggest academic challenges she is likely to face are course expectations and – depending on where she goes to school – exit exam requirements, and those are largely independent of Common Core tests.

Consider course requirements. While many CCSS supporters would like teachers to make their classes harder in proportion to the new standards and tests, many teachers, especially of struggling students, will know or quickly determine that drastically raising their course expectations will cause kids to flounder. As a result, they will “soften” the expectations for students, lowering the difficulty of the work and giving passing grades for work that is, in some sense, “below grade level”.

Indeed, this is how it works now: despite the fact that all classrooms in a state are theoretically held to the same content standards, not all classes are equally rigorous, nor are students within a single class all held to the same absolute expectations. This is partly because teachers vary in their interpretations of the standards and their perceptions of students, but it is also because teachers differentiate on the basis of their students’ needs and abilities. Teachers do not, as a rule, want to see their students struggle and fail.

What this means is that in practice many students – and academically vulnerable students in particular – are not likely to see dramatic changes in the difficulty of their courses. They might engage in different sorts of activities or cover different content, but their teachers will adjust the difficulty of the course so that students are not completely overwhelmed.

A similar logic applies to exit exams. While such exams should be aligned to the Common Core, nothing requires that they be as challenging as the CCSS tests. Whether or not a “below grade level” student can “pass” an exit exam is a choice made by adults, most of whom are disinclined to subject children to unnecessary – or politically awkward – failure.

In other words, to argue that the Common Core will double the dropout rate, you have to assume – implausibly and uncharitably – that teachers and policy-makers are heartless, oblivious automata who will not respond to the effects of standards on students.

Now, it’s fair to say that Common Core supporters have sometimes tried to have it both ways here, claiming that the new standards will “raise the bar” for students and, simultaneously, that kids will not suffer as a result of greater challenges in school.

And it’s also likely that, Common Core notwithstanding, our dropout rates will increase in the coming years since they are currently at an all-time low and an improving economy will give marginal students better alternatives outside of school.

For the record, it is entirely possible that the CCSS will contribute modestly to future increases in the dropout rate. The Common Core will – by design – make some courses more difficult for many students, and for marginal students that may be enough to nudge them out of school altogether.

There is, however, no reason to think that the Common Core will “double” the dropout rate.

It was a little surprising to see the AFT take a stand against the edTPA teacher licensing test given President Randi Weingarten’s support for similar “bar exams” for teachers, and it got me thinking about “professionalizing” teaching in general.

That teaching needs to be “professionalized” is a mostly-platitudinous claim, but one you often hear from both sides in the education reform debates.

You often hear people reason about professionalization by analogy: that we need to change the way teachers are certified to make the profession more similar to law or medicine.

This is probably a bad way of thinking about teaching.

It’s easy to forget, but the United States actually needs a lot of teachers. In 2012, public and private K-12 schools employed roughly 3.7 million teachers.

In other words, we need three times as many teachers as we have lawyers and more than four times as many teachers as doctors.

Another way to think of it is this: 3.7 million teachers represents nearly 2.8% of the civilian labor force and 8% of all college graduates in the labor force in 2010.

And teachers, of course, make substantially lower salaries than doctors or lawyers, which will complicate any efforts that reduce the profession’s attractiveness or throw up additional barriers to entry.

So it’s really not obvious that it’s possible to make teaching much like medicine or law even if we wanted to.

The upshot is that, even if you operate with an extremely naive model in which only student achievement outcomes matter, it’s not obvious that tenure reform[2] will have large net benefits:

There will probably be some good effects and some bad effects of tenure reform, and much depends on exactly how the tenure rules are changed and how the state, districts, and schools respond. For example, will districts raise salaries in response to limitations on tenure? Will administrators find work-arounds to reduce energy spent on evaluations?

As a result, it’s hard to know which effects, if any, will dominate in the long-term. They may largely cancel each other out.

One of the central tensions for reformers when it comes to improving teacher quality is that on the one hand they believe teachers are fighting desperately for excessive job security but also, on the other hand, that you can substantially reduce that job security without making teaching significantly less attractive.

In theory this is not impossible. Making it work, however, requires admitting that job security is a benefit for teachers and that taking it away will – all else equal – make being a teacher less appealing.

I use historical data from the Bureau of Labor Statistics to identify the probability that a worker in the model is involuntarily separated from their job, which is about a 4 percent chance per month, average duration of unemployment, which is about 3.5 months, and the probability of continuing employment, which is about 96 percent.

For the case of the public sector, the probability of involuntary separation is just 1.3 percent, which is one-third as high as the probability in the private sector case. I then calculate the difference in compensation between the public sector (low unemployment case) and the private sector, such that a worker would be indifferent between working in either sector. I find that workers would be willing to work for about 10 percent less compensation in the public sector, given the additional benefit of much higher job security. This estimate is conservative in terms of considering today’s labor market, as average unemployment duration today is much higher than its historical average.

In other words, Ohanian thinks you could use job security as a means of attracting employees into the public sector even if you offered salaries roughly 10% lower than in the private sector because the job security itself has some value.
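To get a rough feel for where a figure of that size could come from, here is a back-of-the-envelope sketch using the probabilities quoted above. It is not Ohanian's model: it assumes a simple two-state (employed/unemployed) steady state, a risk-neutral worker, no income while unemployed, and the same 3.5-month average unemployment spell in both sectors. All function and variable names are hypothetical.

```python
def employed_fraction(monthly_separation_prob, avg_unemployment_months):
    """Long-run share of time spent employed in a two-state model.

    Assumes a constant monthly job-finding rate equal to
    1 / average unemployment duration (a simplification).
    """
    job_finding_rate = 1.0 / avg_unemployment_months
    unemployed_share = monthly_separation_prob / (
        monthly_separation_prob + job_finding_rate
    )
    return 1.0 - unemployed_share


# Figures quoted in the text: 4% vs. 1.3% monthly separation, 3.5-month spells.
private = employed_fraction(0.04, 3.5)
public = employed_fraction(0.013, 3.5)

# A risk-neutral worker is indifferent when expected pay is equal:
#   public_wage * public == private_wage * private
indifference_discount = 1.0 - private / public

print(f"time employed (private): {private:.1%}")
print(f"time employed (public):  {public:.1%}")
print(f"pay cut a worker would accept: {indifference_discount:.1%}")
```

Under these assumptions the acceptable pay cut comes out at roughly 8 percent, the same order of magnitude as the quoted figure of about 10 percent; a risk-averse worker would presumably accept a somewhat larger one.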

Now, Ohanian is a conservative economist writing for a conservative think tank, so he unsurprisingly concludes that you can do away with these job security benefits because public sector workers are so wildly overcompensated to begin with that the marginal value of the “excess” compensation is very small.

My sense, however, is that many education reformers – who are often left-leaning – don’t want to say that at all. On the contrary, they will often say that we want the “best and brightest” – i.e., significantly above-average workers – to go into teaching and that teachers should receive more compensation (contingent on performance).

The trouble is that, as Rick Hess puts it, courts are good at “access, not quality”. The Vergara lawsuit is ultimately a very crude way of enacting policy change; the decision does not require, for example, any measures to compensate teachers for reductions in job security.

Reformers may want salary increases but since they weren’t judicially mandated – and do not otherwise appear to be forthcoming – we have to consider a world without them.

I’m not opposed, in principle, to carefully paring back tenure protections for teachers in exchange for higher salaries or other benefits. I could therefore be made more sympathetic to many reformers’ projects if they seemed to be taking these trade-offs more seriously.

[1] At this StudentsFirst forum I submitted to the panelists a question about trade-offs, and was assured all questions would eventually be answered in person or online, but never got a response.

[2] You can perform a similar exercise for changes to seniority rules.

The authors used data on a large number of first grade students to see what strategies their teachers used to teach them math. They then looked to see whether teachers tended to use different strategies when students had stronger or weaker math skills, and then grouped these strategies together based on whether they would normally be considered ‘teacher-directed’ or ‘student-centered’.

Their methods – including an elaborate set of statistical controls for variables like student SES and prior achievement – also allowed them to make tentative causal inferences about which teaching strategies seem to be more effective for students who were stronger or weaker in math to begin with.

The results are mostly unflattering to student-centered approaches.

Student-centered Teaching Can be a Pedagogy of Privilege

A couple of years ago I claimed that teaching methods typically considered ‘student-centered’ together represent a ‘pedagogy of privilege’; such methods might be good – or at least good enough – for relatively strong students, but they often do not meet the needs of students with weaker skills.

The authors of this new study reach a basically similar conclusion, at least with regard to first grade math instruction:

Controlling for many potential confounds, we also found that only more frequent use of teacher-directed instructional practices was consistently and significantly associated with residualized (value added) gains in the mathematics achievement of first-grade students with prior histories of MD [i.e., mathematics difficulties]. For students without MD, more frequent use of either teacher-directed or student-centered instructional practices was associated with achievement gains. In contrast, more frequent use of manipulatives/calculator or movement/music activities was not associated with significant gains for any of the groups.

…

An important contribution of our work is that we find that teacher-directed instructional practices are associated with achievement by both students with a prior history of persistent MD, as well as those with a prior history of transitory MD. In contrast, other, more student-centered activities (i.e., manipulatives/calculators, movement/music) were not associated with achievement gains by students with MD.

In other words, the most fortunate students will manage one way or the other but the less fortunate kids are not well-served by student-centered approaches.[1]

Student-centered Teaching is Attractive in Low-Skill Settings

Despite their inappropriateness for struggling students, I’ve also hypothesized that student-centered approaches may – paradoxically – be more favored when students have fewer or weaker skills.

My guess was that student-centered approaches can obscure skill gaps, which tend to be more salient in low-skill classrooms. When students are mostly proficient or advanced, teachers, administrators, and parents tend to have plenty of independent verification that students are skilled; ambiguous, student-centered activities are not relied on for demonstrations of mastery. With lower-skilled students, adults are more likely to be worried about their students’ skills, because much of the available evidence (e.g., test scores, independent classwork) suggests those skills are absent or weak. When students engage in student-centered activities, they can easily give the illusion of proficiency – talking to one another, handling materials, and so on – especially if you don’t examine their work too closely or don’t know what you’re looking for. And it’s easy to interpret ambiguous evidence of learning favorably if you really want to see proficiency (as most educators do).

This new study finds evidence consistent with my theory, at least for some student-centered teaching strategies:

We found no significant relation between the percentage of MD students in the classroom and the frequency of teacher-directed or student-centered instructional activities. However, we did find that…classes of students with higher percentages of MD students were more likely to be taught these skills and with instructional practices emphasizing using manipulatives/calculators and movement/music. As reported below, these instructional activities…were not associated with mathematics achievement gains by students with MD.

Regardless of the reason, however, it seems that teachers are choosing to use less effective methods especially with those students who need the most help.

Student-centered Teaching is not Obviously Research-based

Of course, if you ask adults who favor student-centered methods, they will very often say that those methods are ‘research-based’. There is some sense in which this is true, at least to the extent that you can find seemingly-reputable education studies to support almost any instructional decision.

The trouble is that a great deal of education research is ideologically-motivated, and well-controlled studies of instructional effectiveness are difficult to perform in any case. So how strong, really, is the research base for student-centered teaching?

This new study suggests it is probably not as strong as it is often made out to be:

Some types of instructional practices are commonly considered “evidence-based,” and so presumably their use by teachers should result in increased mathematics achievement. For example, Baker, Gersten, and Lee’s (2002) synthesis of researcher-directed intervention studies yielded a weighted ES of .66 for the use of structured peer tutoring on low-skilled children’s mathematics achievement. Additional syntheses also support peer tutoring as an evidence-based practice (Elbaum, Vaughn, Tejero, & Watson, 2000; Mathes & Fuchs, 1994). Yet our estimate of student-centered instruction, which includes peer tutoring, was statistically non-significant when used with students with prior histories of MD (Guarino et al. [2013] also reported a statistically non-significant finding for peer tutoring).

The authors suggest this might be related to implementation fidelity problems with student-centered approaches, and I suspect that’s a factor.[2] It’s also possible, though, that much of the underlying research is just not as strong as we’d like to begin with.

To be clear, nothing here demonstrates that any particular ‘student-centered’ approach doesn’t have its place, even potentially in classrooms with large numbers of struggling students.

This study is, however, more evidence that many traditional, ‘teacher-centered’ approaches are often unfairly-maligned and under-utilized.

[1] In fact, calling those approaches ‘student-centered’ at all seems presumptuous.

[2] This is not exactly a ringing defense of student-centered approaches; if they are harder to implement, so much the worse for them.