What Tango taught me about evidence in education

2 September, 2018

by
Pauline Ho

Introduction

Next week, Evidence for Learning publishes the results of the first of four independent Learning Impact Fund evaluations. The Fund, a model new to Australia, identifies, funds and manages research trials by pairing a promising educational program with an independent evaluator, who conducts a mixed-methods randomised controlled trial (RCT) in schools. Students are randomly assigned to receive an intervention or not. The goal is to test whether a program increases learning compared with a similar comparison (or control) group.

Thinking Maths is the first program to be evaluated, and the first for us to learn from (Evidence for Learning, 2017). The South Australia Department for Education developed this professional learning program to support middle years maths (Years 6-9) teachers’ pedagogical content knowledge, with the aim of improving students’ maths achievement. They have worked with schools to implement this initiative since 2015. In 2016, they took the next step of having the program independently evaluated, in the most rigorous way possible, to find out whether it has any beneficial effects on their students.

Why reliable evidence matters in education

I went to a social tango lesson recently and learnt that there are principles for each step, but the combination of steps is improvised on the spot. You go with the flow of the music, ‘listen’ and move in step with your partner, aware of the space and the other couples around you so you don’t bump into each other. When you combine these in one experience, you have synchronisation.

As in tango, we need strong, reliable evidence to guide our decisions on the next step, and the skills to decipher the best possible ways to embed that evidence, to give our students a greater chance of success.

Practising teachers and leaders have a wealth of local evidence: identification of student or class needs, and knowledge of what happened when they tried approaches in the class or school. What we lack is easy access to an evidence base of programs that have been shown to work consistently well in many contexts, and that shows which programs work, for which students, under what conditions. Without high-quality, rigorous evaluations of these programs, there is a danger that educators choose strategies on the basis of a ‘hunch’ or only their local evidence. Indeed, without guidelines and yardsticks against which to measure effectiveness, teachers may adopt an approach that has no evidence of effectiveness and that could even be damaging to children compared with normal classroom experience.

In this blog, I wish to share what Evidence for Learning hopes to do to bridge this gap, and what we can expect from the Thinking Maths trial report and supporting resources. I also want to share why reliable evidence from RCTs such as this evaluation matters, and how we can interpret that evidence to inform our decisions and influence student learning.

Helping teachers use evidence

Teachers and school leaders are most qualified to make decisions about what interventions to use in school. Before they can do that, they need reliable evidence on which to base their judgements. But even the most reliable evidence will not benefit educational outcomes if educators can’t engage with the content.

This point reminded me of an interesting experiment that a group of researchers from Durham University tried with several primary schools to see how teachers implemented an intervention (Maclellan, 2018). The intervention was based on research on enhanced feedback published in 2007 by John Hattie and Helen Timperley (Hattie & Timperley, 2007). The researchers wanted to see whether teachers would engage with research evidence of proven benefit (previous studies have shown an effect size of around 0.6) and apply it in a way that had a positive outcome for their students. What they found was that teachers struggled to understand the academic language used, and wanted more examples of how to apply enhanced feedback.

For the Thinking Maths evaluation results, we have developed ‘practitioner-friendly’ resources which will be freely available from the Evidence for Learning website (Evidence for Learning, 2017) from 5 September. They include:

Evaluation Report: A detailed research and data report of the trial by the independent evaluators.

Executive Summary: A summary of the evaluation report and its impact and findings.

Evidence for Learning Commentary: Plain English commentary on implications based on the evaluation findings and considerations for teachers, school leaders, program developers and systems.

What can educators expect to see from these reports?

High-quality, well-conducted RCTs can show whether an educational intervention is effective (Hutchison & Styles, 2010). Each of the Learning Impact Fund trials allows us to test the effectiveness of a training program or approach by comparing the schools and classrooms that receive it with those that do not. We are therefore identifying gains that go beyond normal increases in performance through the school year. A process evaluation is embedded in the evaluation design to gather rich information about how the programs are implemented. When RCTs are conducted in this way, they provide educators and system leaders with valuable evidence of what works and why.

At the end of each trial, Evidence for Learning works with the evaluator to produce a plain English report and commentary. This includes simple-to-understand ratings of a) the months of learning gained from the program, b) our level of confidence in the results and c) the cost to implement.

Three key indicators

Months’ Impact: Effect sizes are important for determining the magnitude of an intervention’s impact. Alongside the effect size, we estimate the additional months’ progress you can expect students to make as a result of an approach being used in schools.
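For readers who like to see the arithmetic, here is a minimal sketch of one common effect size, the standardised mean difference (Cohen's d): the gap between the two group means divided by the pooled standard deviation. This is an illustration only. The scores are invented, the function name is hypothetical, and the evaluators' own analysis may well use a different estimator. The step from effect size to months of progress follows Evidence for Learning's published approach and is not reproduced here.

```python
from math import sqrt
from statistics import mean, variance


def standardised_mean_difference(intervention, control):
    """Cohen's d: difference in group means over the pooled standard deviation."""
    n1, n2 = len(intervention), len(control)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_variance = (
        (n1 - 1) * variance(intervention) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(intervention) - mean(control)) / sqrt(pooled_variance)


# Hypothetical post-test scores, invented purely for illustration.
intervention_scores = [52, 55, 58, 61, 64]
control_scores = [50, 53, 56, 59, 62]

effect_size = standardised_mean_difference(intervention_scores, control_scores)
print(round(effect_size, 2))
```

In this made-up example the intervention group scores two points higher on average, and the two groups have the same spread, so the effect size comes out at roughly 0.42 standard deviations.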

Cost: Using Evidence for Learning’s Cost Rating approach (Evidence for Learning, 2018), evaluators calculate the approximate cost per student per year of implementing the intervention over three years. This may include training and materials, replacement Temporary Relief Teaching (TRT) days, and other resources. Estimates are based on training being delivered to a group of 35 teachers with an average class size of 25 students. This amount per student is rated on a scale from very low to very high, according to Evidence for Learning’s Cost Rating guidelines (Evidence for Learning, 2018).
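The cost arithmetic can be sketched in a few lines. This is a rough illustration under stated assumptions, not the official Cost Rating calculation: it assumes each of the 35 teachers teaches one class of 25 students in each of the three years, and the total cost figure and the function name are hypothetical.

```python
def cost_per_student_per_year(total_cost, teachers=35, class_size=25, years=3):
    """Spread a total implementation cost across students and years.

    Assumes (for illustration) one class per teacher in each year.
    """
    students_per_year = teachers * class_size
    return total_cost / (students_per_year * years)


# Hypothetical total cost over three years, invented for illustration.
example_cost = cost_per_student_per_year(105_000)
print(example_cost)
```

With 35 teachers and 25 students per class, the program reaches 875 students a year, so a hypothetical three-year cost of $105,000 works out at $40 per student per year. The band that this amount falls into (very low to very high) is set by the Cost Rating guidelines, which are not reproduced here.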

Evidence Security: The number of padlocks gives us a sense of how confident we are in the results (Evidence for Learning, 2018). Each Evidence for Learning trial undergoes rigorous and transparent review throughout the course of the evaluation. At the end of the trial, the report is independently assessed by two reviewers and given a security rating of 1 to 5 padlocks (1 being the lowest and 5 the highest). The rating reflects how much confidence can be placed in the results, based on the trial design, methodology, attrition numbers, threats to internal validity, and the other statistical decisions and analyses made.

Apart from the statistical significance of the evidence presented in our reports, educators should weigh a number of factors when considering effect sizes and what they mean for their context. The padlock ratings provide another layer of judgment about the overall strength and rigour of the evaluation. For example, if an evaluation showed two months’ learning progress and four padlocks, we can say with high confidence that the result rests on high-quality research.

Context – were the schools in the trial similar to my school?

Even the best interventions need to be implementable in a specific context and in response to students’ needs. That is why the trials are run in a range of contexts.

Educators want to know ‘Are these results applicable to my context?’ To help educators make these decisions, we show how similar the schools in the trial are to their own.

Balance and strength when using evidence

There are two things to consider when reading the evidence from the evaluation.

Move from ‘what works’ to ‘how it works’

To continue with the tango analogy, the foundation of tango is always to ‘be on one axis’, that is, to have enough weight on one foot to confidently take the next step. Similarly, to enable innovation to start on the front foot, we need to make sure that schools have access to high-quality research. The results tell us what has been successful with a particular group of students. They also explore the conditions under which the implementation did or did not work, and why. For example, drawing on the Thinking Maths evaluation, which involved the experiences of over 7,000 students, we were able to tell whether the intervention improved students’ maths achievement at all levels from Years 6-9.

The findings in well-conducted trials help us answer important questions about outcomes in education that need to be asked, that is, whether an educational program or teaching approach worked. This information is important. But from here, teachers and school leaders need to get to the next level down on ‘how it works’. For example, if we know metacognition as a teaching approach works, the next step is to take a wider look at the research base on the types of strategies that could be used in their classrooms.

One size does not fit all: tailor evidence to your context

Taking the evidence from the Thinking Maths trial, or any trial, educators need to consider whether the evidence applies to their contexts and students. At the broadest level, educators should first consider the conclusion statement(s) in the evaluation report. These provide the evidence about whether:

the program works,

what was measured works (e.g. pedagogical content knowledge),

for whom it works (e.g. teachers or students), and

in which settings (e.g. Primary, Secondary).

To help educators decipher impact that relates to teaching practice, Evidence for Learning provides the translation of the effect size into months of learning progress (as calculated from a mean weighted effect size) (Evidence for Learning, 2018).

Beyond what works, educators should also consider ‘Does this program or approach work for my context?’ Consider the other findings in the report as well as the Evidence for Learning Commentary, which sets out implications and considerations for teachers, school leaders, program developers and systems as a starting point for discussing the results.

Before choosing to implement a program or approach, it is useful to know how others have found it works in their contexts, and whether doing something similar improved their results. Ask: ‘Is this innovation likely to meet the needs of my students or address a teaching and learning challenge in my school?’

Before the song ends …

Evidence can help us focus our efforts where they will make the most difference to students. Not only is it a waste of resources to spend them on something that may not work, it also does not make sense to roll out an initiative to many schools if it has not been shown to benefit students. Unless we test something, we don’t actually know that it works.

I want to see – as I am sure you do – an education system that helps every student reach their potential, no matter their background. This is where evidence-informed decision making and practice-based evidence have an important role to play. To improve the chances of raising student achievement, any given intervention will still need to be adopted in a way that is tailored to the local context of the school and students. It is this combination of elements – rigorous research evidence, skills to interpret research, and knowledge of contexts, enablers and barriers – that will help us achieve the best outcomes for our students in Australia.