6 Questions Hattie Didn’t Ask But Could Have

If you’re familiar with John Hattie’s meta-analyses and they haven’t given you fits, they may be worth a closer look.

If you missed it, in 2009 John Hattie released the results of a massive amount of work (work he updated in 2011). After poring over thousands of studies, Hattie sought to separate the wheat from the chaff–what works and what doesn’t–similar to Marzano’s work, but on a much (computationally) larger scale.

Hattie (numerically) figures that .4 is an “average” effect–a hinge point that marks performance: anything higher is “not bad,” and anything lower is “not good.” More recently, Grant Wiggins aggregated all of the strategies that resulted in a .7 or better–what is considered “effective.” The top 10?
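For readers who want that “effect size” number made concrete: Hattie’s figures are standardized mean differences (Cohen’s d, or close variants of it). Here is a minimal sketch of how such a number is computed, using invented post-test scores, not data from any actual study:

```python
import statistics

def cohens_d(treatment, control):
    """Standardized mean difference between two groups.

    Effect size = (mean difference) / (pooled standard deviation).
    Hattie's .40 "hinge point" is a threshold on this kind of number.
    """
    n_t, n_c = len(treatment), len(control)
    mean_t, mean_c = statistics.mean(treatment), statistics.mean(control)
    var_t = statistics.variance(treatment)  # sample variance (n - 1 denominator)
    var_c = statistics.variance(control)
    pooled_sd = (((n_t - 1) * var_t + (n_c - 1) * var_c) / (n_t + n_c - 2)) ** 0.5
    return (mean_t - mean_c) / pooled_sd

# Invented scores: a class taught with some strategy vs. a comparison class
d = cohens_d([78, 82, 85, 90, 74], [70, 75, 72, 80, 68])
```

Note what the formula does and does not capture: it compares two group means on whatever was measured, which is exactly why the questions below about *what* was measured matter so much.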

While I leave it up to Hattie and those left-brain folks way smarter than I am to make sense of the numbers, I continue to wonder how the effect of one strategy–problem-based learning, for example–can be measured independently of other factors (assessment design, teacher feedback, family structure, and so on). It can also be difficult to untangle one strategy (inquiry-based learning) from another (inductive teaching).

Hattie’s research is stunning from a research perspective, and noble from an educational one, but there are too many vague–or downright baffling–ideas for it to be used as so many schools and districts will be tempted to use it. Teacher Content Knowledge has an effect size of .09, which is actually worse than if teachers did nothing at all? Really? How does it make sense to respond, then?

As always, start with some questions–and you may be left with one troubling implication.

What Should You Be Asking?

Recently, we shared a list of these effect sizes, shown in ascending order. Included in Grant’s original post is a well-thought-out critique of Hattie’s work (which you can read here), where the author first questions Hattie’s mathematical practice of averaging, and then brings up other issues, including comparing apples and oranges (which another educator does here). Both are much more in-depth criticisms than I have any intention of offering here.
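The worry about averaging can be made concrete with a toy example (the studies and numbers below are invented, not Hattie’s): whether you take a simple average of effect sizes or weight them by sample size can change the headline number dramatically.

```python
# Two hypothetical studies of the "same" strategy
studies = [
    {"d": 0.9, "n": 30},    # small study, large effect
    {"d": 0.1, "n": 3000},  # large study, small effect
]

# Simple average of the two effect sizes: (0.9 + 0.1) / 2 = 0.5
unweighted = sum(s["d"] for s in studies) / len(studies)

# Sample-size-weighted average: dominated by the large study
weighted = sum(s["d"] * s["n"] for s in studies) / sum(s["n"] for s in studies)
```

From the same two studies, one method lands above Hattie’s .40 hinge point and the other well below it–which is the heart of the critique.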

There are multiple languages going on in Hattie’s work–statistical, pedagogical, educational, and otherwise. The point of this post is to ask some questions out loud about what the takeaways should be for an “average teacher.” How should teachers respond? What kinds of questions should they be asking to make sense of it all?

1. What’s the goal of education?

Beyond any “fringe benefits” we “hope for,” what exactly are we doing here? That, to me, is the problem with so many new ideas, trends, educational technologies, research, and more–what’s the goal of education? We can’t claim to be making or lacking progress until we know what we’re progressing toward.

The standards-based, outcomes-based, data-driven model of education has given us bravely narrow goals for student performance in a very careful-what-you-wish-for fashion.

2. How were the effect sizes measured exactly?

How are we measuring performance here so that we can establish “effect”? Tests? If so, is that ideal? We need to be clear here. If we’re saying this and this and this “work,” we should be on the same page about what that means. And what if a strategy improves test scores but stifles creativity and ambition? Is that still a “win”?

3. What do the terms mean exactly?

Some of the language is either vague or difficult to understand. I am unsure what “Piagetian programs” are (though I can imagine), nor what “Quality Teaching” (.44 ES) means. “Drugs”? “Open vs. Traditional”? This is not a small problem.

4. How were those strategies locally applied?

Also, while the “meta” function of the analysis is what makes it powerful, it also makes me wonder–how can Individualized Instruction demonstrate only a .22 ES? There must be “degrees” of individualization, so that saying “Individualized Instruction” is like saying “pizza”: what kind? With 1,185 listed effects, the sample size seems large enough that you’d think an honest picture of what Individualized Instruction looks like would emerge, but it just doesn’t.

5. How should we use these results?

Problems aside, this much data has to be useful. Right? Maybe. But it might be that so much effort is required to localize and recalibrate it to a specific context that it’s just not–especially when it keeps schools and districts from becoming “researchers” on their own terms, leaning instead on Hattie’s list. Imagine “PDs” where this book has been tossed down in the middle of every table in the library and teachers are told to “come up with lessons” that use the strategies that appear in the “top 10.” Then, on walk-throughs for the next month, teachers are constantly asked about “reciprocal teaching” (.74 ES, after all), while project-based and inquiry-based learning with diverse assessment forms and constant metacognitive support is met with silence (as said administrator flips through Hattie’s book to “check the effect size” of these strategies).

If you consider the analogy of a restaurant, Hattie’s book is like a big book of cooking practices that have been shown to be effective within certain contexts: Use of Microwave (.11 ES), Chef’s Academic Training (.23 ES), Use of Fresh Ingredients (.98 ES). The problem is, without the macro-picture of instructional design, they are simply context-less, singular items. If they are used by teachers as a starting point to consider while planning instruction, that’s great, but that’s not how I’ve typically seen them used. Instead, they often become items to check, along with the learning target, essential question, and evidence of data use.

Which brings me to the most troubling question of all…

6. Why does innovation seem unnecessary?

Scroll back up and look at the top 10. Nothing “innovative” at all. A clear, credible teacher who uses formative assessment to intervene and give learning feedback should be off the charts. But off the charts how? Really good at mastering standards? If we take these results at face value, innovation in education is unnecessary. Nothing blended, mobile, connected, self-directed, or user-generated about it. Just good, old-fashioned, solid pedagogy. Clear, attentive teaching that responds to data and provides feedback. That’s it.

Unless the research is miles off and offers flat-out incorrect data, that’s the path to proficiency in an outcomes-based learning environment. The only way we need innovation, then, is if we want something different.

1 Comment

Thank you for this analysis. Another major issue is Hattie’s misrepresentation of meta-analyses. Take the number 1 influence, Student Self-Assessment/Self-Grading: the meta-analyses Hattie used do not claim to measure this. They are mostly measuring students reporting their high school GPA at a college entry interview a year or so later. The high effect size indicates that students are reporting a higher GPA than what they achieved! Details here – http://visablelearning.blogspot.com.au/p/self-report-grades.html

Misrepresentation appears to be a common problem in Hattie’s 2009 book. Another striking example is the research used for Teacher Training, giving a low effect size of 0.11 (note that many of the same studies are used for Teacher Content Knowledge). On closer analysis, NONE of the studies looked at teacher training; they looked at a particular sort of US teacher certification. This certification is done many years after teachers have gained their university degree and many years after they have started teaching, so the low effect size indicates that certification does not improve teaching, not that training doesn’t. It is similar to what we do here in Australia, where we have to create a dossier of lesson plans and present evidence of what we do in the classroom in order to get a promotion. Details here – http://visablelearning.blogspot.com.au/p/teacher-training.html