In a brief stop-over between Sydney and Dubai, Tom Bennett was surprised and delighted to discover so many teachers prepared to ‘give up a Saturday’ to come and explore the role of a research lead. It shouldn’t come as such a surprise. The event, expertly organised by Helene Galdin O’Shea, involved a combination of thought-provoking speakers and the opportunity to meet exceptional colleagues – making it extraordinary CPD. Here’s my reflection on the day and a (dreadfully inadequate) summary of some of the talks I attended.

Philippa Cordingley opened with a keynote identifying some effective ways to lead research within schools. Schools are increasingly creating directed time for professional inquiry (whether it’s called coaching, lesson study or action research), and she related some interesting case studies of how schools were structuring these opportunities. She stressed the importance of having a research base which reflects ‘the things that wake teachers in the middle of the night’ – and this is an important issue: finding, accessing and disseminating the sort of educational research which can be applied by classroom teachers is one of the great challenges.

Philippa referred to the bottleneck between research and practice-based knowledge and the work that CUREE has been doing to overcome this:

Philippa emphasised the need for professional learning to involve creating new ideas and strategies with a clear focus on student outcomes. What mattered was not so much whether teachers engaged in their own research or the research of others; rather, it was the process of challenging prevailing orthodoxies and supporting teachers as learners which had the greatest impact.

One of the most interesting points she raised was the importance of school leaders actively participating in and modelling this process. To support the argument, Philippa referred to research by Robinson, Hohepa and Lloyd (2009). Their meta-analysis of school leadership and student outcomes identified that the factor with the greatest influence was promoting and participating in teacher learning and development:

The fact that this appeared to have the greatest influence on student outcomes raised a few eyebrows. It might be expected that school leaders participating in and promoting professional development would have an impact on outcomes for students, but the fact that it appeared to be the most influential factor was surprising.

The day then presented some difficult choices! I missed Jude Enright’s talk on enquiry-based practice (which she has very kindly blogged about here), Beth Grenville-Giddings’ talk on setting up a journal club and Robert Loe talking about how we might explore and measure ‘relationships for learning’. I hope someone will blog on these.

Gary Jones led the second session with a focus on how we can help teachers ask better questions about their teaching. Drawing parallels with evidence-based medicine and clinical practice, he related a number of ways that we can move from ‘background questions’, which tend to be poorly formulated and difficult to evaluate, towards ‘foreground questions’. In essence, he gave some useful ideas on how teachers might operationalise their own research in more productive ways:

P — Pupil or Problem. How would you describe the group of pupils or the problem?
I — Intervention. What are you planning to do with your pupils?
C — Comparison. What is the alternative to the intervention/action/innovation?
O — Outcomes. What are the effects of the intervention/action/innovation?
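For what it’s worth, a structured question like this can even be captured in a few lines of code. The sketch below is purely illustrative (the class and field names are my own invention, not anything from Gary’s materials) – a minimal way of assembling the four PICO elements into a single foreground question:

```python
from dataclasses import dataclass

@dataclass
class PICOQuestion:
    """Illustrative container for a PICO-style 'foreground' question.
    The field names here are hypothetical, not from any published tool."""
    pupils: str        # P - which group of pupils, or what problem?
    intervention: str  # I - what are you planning to do?
    comparison: str    # C - what is the alternative?
    outcomes: str      # O - what effects will you measure?

    def as_question(self) -> str:
        # Assemble the four elements into one answerable question.
        return (f"For {self.pupils}, does {self.intervention}, "
                f"compared with {self.comparison}, "
                f"improve {self.outcomes}?")

# Example of the kind of question a teacher-researcher might formulate:
q = PICOQuestion(
    pupils="Year 9 pupils with weak reading comprehension",
    intervention="daily reciprocal reading sessions",
    comparison="the usual guided reading programme",
    outcomes="end-of-term reading comprehension scores",
)
print(q.as_question())
```

The point of the structure is that each field forces a decision that a vague background question leaves open – in particular the comparison, which is the part teachers most often omit.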

There is a host of interesting ideas and resources for research leads on Gary’s blog. I might also recommend his discussion on some of the pitfalls and misconceptions that research leads would be wise to avoid:

The SUPER partnership (the ‘Power Rangers’ of researchED) is a school–university partnership for educational research between schools in the East of England and the Faculty of Education at the University of Cambridge. It was bringing researchers and schools together years before researchED was even a twinkle in Tom Bennett’s eye. If you haven’t seen their blog, there’s a pretty comprehensive list of research resources that research leads would find useful:

The talks by Clare Hood and Abi Thurgood offered a fascinating insight into the challenges of the research lead role. Clare contrasted the fast-paced culture of accountability and ‘evidence’ that currently exists in many schools with the slower pace of academic research. She also talked of the value of having the Cambridge team as a ‘critical friend’, especially when formulating research questions across a teaching alliance. Abi identified some of the core aspects of her role: formulating teacher and subject-department research questions, disseminating practitioner research through a ‘marketplace’ format, maintaining links to the Cambridge MEd programme, and the very interesting idea of bringing in student researchers through EPQ projects. Both talks emphasised the ‘sense of re-professionalism’ that came from teachers having opportunities to choose their own development goals rather than working to imposed targets.

One question that arose in my mind, as research leads become more commonplace in schools, is how we ensure an appropriate ethical framework for teacher-led research. University-based researchers have a range of ethical checks and guidelines to ensure research is conducted in a responsible way and to minimise risk to participants, but few schools appear to have a research ethics policy. I have some ideas about this which I’ll endeavour to blog about before the next researchED research leads conference at Brighton in April. If people think it’s an area worth exploring, I’d be happy to present something and facilitate a discussion at a future event.

The issue of access to research was an enduring theme across the day. Vincent Lien made a reasoned and impassioned argument for teachers to have free access to education research. If you haven’t signed his petition yet, it’s available here:

Jonathan Sharples and Caroline Creaby shared examples of connecting teachers and researchers. Jonathan described some of the barriers which exist in sharing and promoting the use of evidence in schools. He made the point that this process was likely to be slow going – NICE estimates it takes up to 15 years for research evidence to become embedded within medical practice. The ‘push’ of evidence-based research coming from universities will be slow to change education on its own. One of the roles for a research lead might be to help ‘pull’ evidence-based research into schools and foster the links between universities and schools. Caroline gave some excellent examples of how teachers had been able to draw upon the expertise of university researchers through mechanisms as simple as emailing questions. However, I suspect these informal channels of communication – whilst excellent – are not really scalable across a large number of schools.

Academics tend to be very generous with their time and keen to talk to teachers about research, but without a more formal framework it’s difficult to see how this can genuinely make an impact across a school system. However, plans are in motion. Caroline is about to project-manage ‘Evidence for the Frontline’ – involving the Coalition for Evidence-Based Education (CEBE) and the Institute for Effective Education (IEE) at York.

Essentially E4F appears to be a brokering service, linking up teachers with research expertise and resources. One of the things that they want to create is a map of expertise showing practitioners, researchers and other providers:

Ffion Eaton took up the role of research lead in 2013 and talked about embedding a research culture within her school and teaching school alliance through a whole-staff action research programme. Her school is part of the RISE project – EEF-funded research examining whether research leads help improve student outcomes in schools – and she described a little of the training, resources and support this had provided. One of the key challenges she related was maintaining communication about research across the school – it’s easy for research to go on in isolated pockets. One interesting idea was the development of a teaching and learning bulletin and mini ‘research conferences’ to help disseminate research findings across the alliance.

Many of the arguments raised across the day seemed to converge: a key focus on research leads playing a role in strengthening professional development and using evidence-based research as the starting point for improving student outcomes. This aligns well, I think, with the recent Sutton Trust report on improving professional development. I think the six principles of teacher feedback listed in that report might serve as an effective summary of some of the major themes arising from the network day:

Sustained professional learning is most likely to result when:
• the focus is kept clearly on improving student outcomes;
• feedback is related to clear, specific and challenging goals for the recipient;
• attention is on the learning rather than to the person or to comparisons with others;
• teachers are encouraged to be continual independent learners;
• feedback is mediated by a mentor in an environment of trust and support;
• an environment of professional learning and support is promoted by the school’s leadership.

The Janus-faced role of a research lead

The metaphor of bottlenecks and bridges between research and practice-based knowledge emerged more than once over the course of the day. The Roman god of bridges, doorways and passageways was Janus. What emerged from the day, for me, was the ‘Janus-faced’ role of a research lead in schools: outward-looking towards the extensive and sometimes difficult-to-access research evidence that might inform practice; and inward-looking towards facilitating teachers investigating their own practice.

University researchers and classroom teachers expressed frustration at the ‘closed doors’ to each other’s institutions. At the moment, many of these links appear informal and rather haphazard – working through personal connections and chance encounters. To scale these mutually profitable relationships across the school system will likely involve more formal mechanisms by which schools can network with university researchers. Teachers need access to the broader evidence base to stimulate ideas, help formulate questions, gain research tools and act as a valid foundation for their own professional inquiry into their teaching. Researchers need access to schools and sometimes encouragement to focus their research on some of the applied problems that teachers face trying to improve outcomes for their students. Janus was also associated with travelling and trading, and research leads might adopt this aspect too – coordinating closer links, and a greater trading of ideas, between school-based and university researchers.

Executive functioning is, in some ways, a tricky cognitive ability to define, as it’s implicated in so many different functions. It’s a hypothesised capacity underlying things like problem solving, reasoning, planning and organisation, inhibiting contextually inappropriate action or speech, and managing attentional control (amongst others).

These functions develop rapidly in early childhood, then slowly throughout adolescence and early adulthood – reaching a peak in our mid-twenties before gradually beginning to decline.

The prefrontal cortex – the area most associated with these functions – is much greater in size (relative to the rest of the brain) in human beings than in other primates and other species of hominid. The main reason appears to be the greater myelination of neurones (i.e. a greater volume of white matter), which provides greater connectivity between the prefrontal cortex and the other areas of the brain in humans compared to other species.

The prefrontal cortex plays a significant role in what psychologists call ‘working memory’, and the idea of ‘executive functioning’ is related to the ‘central executive’ component of that model of memory. Executive functioning is associated with a number of SEND conditions which teachers will have encountered or heard about, for example ADHD (attention deficit hyperactivity disorder). There’s some evidence to suggest that deficits in working memory, potentially related to poor executive functioning, underlie some of the difficulties children may face in school. For example, Gathercole and Alloway (2007) report:

“Approximately 70% of children with learning difficulties in reading obtain very low scores on tests of working memory that are rare in children with no special educational needs.”

There may be considerable variance in the working memory function of children in a particular classroom. For example, Gathercole and Alloway (2007) suggest that:

“Differences in working memory capacity between different children of the same age can be very large indeed. For example, in a typical class of 30 children aged 7 to 8 years, we would expect at least three of them to have the working memory capacities of the average 4-year-old child and three others to have the capacities of the average 11-year-old child, which is quite close to adult levels.”

Perhaps the most famous example of a test of executive functioning is the ‘Marshmallow Test’ by Walter Mischel. In these studies a child is offered a choice: a small immediate reward (e.g. a marshmallow) or double the reward if they can wait for 15 minutes. What Mischel found in the follow-up studies was that the children who deferred gratification (i.e. waited for the bigger reward) rather than opting for immediate gratification showed different characteristics even years later.

Children who deferred gratification were rated as better able to handle stress, engage in planning and exhibit self-control as adolescents 10 years later, and went on to obtain higher SAT scores. These differences appeared to persist even when participants were in their 40s.

Can we train executive functioning?

Given the importance of executive function in emotional regulation and higher cognitive abilities like memory and attention, there’s been considerable interest in whether such abilities can be trained in children. Certainly there have been attempts to train children’s working memory in the hope that it might help them achieve more in school, but these interventions are not straightforward.

For example, Melby-Lervåg and Hulme (2013) examined the claims of training programmes designed to boost working memory function. They report that some of these working memory training packages made fairly confident claims regarding their effectiveness: for example, that they could help children with ADHD, dyspraxia and ASD, boost IQ and improve school grades. The programmes themselves appeared to involve numerous computerised memory trials:

“However, these programs do not appear to rest on any detailed task analysis or theoretical account of the mechanisms by which such adaptive training regimes would be expected to improve working memory capacity. Rather, these programs seem to be based on what might be seen as a fairly naïve “physical– energetic” model such that repeatedly “loading” a limited cognitive resource will lead to it increasing in capacity, perhaps somewhat analogously to strengthening a muscle by repeated use.”*

The outcomes of the meta-analysis were not so supportive of these impressive claims. They suggest that although there appeared to be short-term improvements on both verbal and nonverbal working memory tasks, these gains did not last very long, nor generalise to things like the ability to do arithmetic or decode words. For attentional control, the effects were small to moderate immediately after training, but reduced to nothing at follow-up.

* Incidentally, this is one reason why I personally dislike the ‘growth mindset’ analogy of the brain being ‘like a muscle’. In many, many ways, it simply isn’t!

OK – so ‘brain training’ programmes don’t appear to have lasting or generalisable effects on working memory, but what about other interventions specifically aimed at improving executive functioning? There’s certainly been a recent surge of interest in the idea of developing executive functioning in our pupils – linked with the whole notion of ‘character education’.

“Yet, despite this enthusiasm, there is surprisingly little rigorous empirical research that explores the nature of the association between executive function and achievement and almost no research that critically examines whether the association is causal. From the existing research it is not clear whether improving executive functioning skills among students would cause their achievement to rise as a result.”

The authors of the review suggest that interventions to increase executive functioning probably have little value unless they also help children achieve greater success within school. Thus they focused their meta-analysis on whether interventions designed to improve executive functioning cause improvements in outcomes.

Interestingly, they found that there was no significant difference between attention/inhibition and working memory measures in their correlation with student achievement: both appeared to correlate with achievement at around the 0.30 level. However, this relationship did not appear to be a directly causal one:

“.. there is substantial evidence that academic achievement and measures of executive function are correlated—both at a single point in time and as predictors of future achievement, and for a variety of different constructs and age groups. Despite this, there is surprisingly little evidence that a causal relationship exists between the two. High levels of executive function may simply be a proxy for other unobserved characteristics of the child.”

So what might be the factor underlying both executive functioning and school achievement? The authors explore a range of possible factors:

“Once child background characteristics and IQ are accounted for, the association between executive function and achievement drops by more than two thirds in most of these studies and in most cases the conditional associations are close to zero.”

This suggests that school-based interventions focused on improving executive functioning will have a disappointing impact on achievement:

“The most effective school-based interventions designed to influence executive function have only had an impact on measures of executive function equal to around half a standard deviation (e.g., Raver et al., 2011). This means that under the best case scenario … interventions designed to improve executive function would only have the potential to increase future achievement by less than a tenth of a standard deviation (half of 0.15).”
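The arithmetic in that quoted passage is worth making explicit: even a best-case intervention effect on executive function (about half a standard deviation), multiplied by the conditional executive function–achievement association (roughly 0.15 once background and IQ are controlled), implies only a very small achievement gain. A two-line sketch of the calculation:

```python
# Reproducing the back-of-envelope calculation from the quoted review:
ef_effect = 0.5           # best-case intervention effect on executive
                          # function, in standard deviations
conditional_assoc = 0.15  # EF-achievement association after controls
implied_gain = ef_effect * conditional_assoc
print(implied_gain)       # 0.075 - "less than a tenth of a standard deviation"
```

Put another way: even if a school-based programme shifted executive function by as much as the best interventions on record, the expected payoff in achievement would be too small for most schools to detect, let alone justify the cost.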

As well as regression analyses, they also looked at randomised controlled trials which had attempted to assess the impact of executive function interventions. They found only five studies which specifically looked at the effects of training on achievement and had a randomised design. They describe a number of programmes which have been evaluated, for example ‘Tools of the Mind’, ‘Head Start REDI’ and the ‘Chicago Schools Readiness Programme’.

These programmes varied in content, but tended to be taught as stand-alone, ‘skills-based’ approaches. For example, the REDI programme was delivered to pre-school children through weekly lessons and extension activities covering language skills, social skills, emotional understanding, self-regulation and aggression control, led by teachers trained in the ‘Promoting Alternative THinking Strategies’ curriculum. The review finds that none of these approaches appeared to directly improve student outcomes.

“The few random assignment studies which rigorously evaluate interventions designed to impact executive function provide some evidence that executive function can be influenced by intervention (most of the studies we reviewed showed some positive impacts on measures of executive function) but provide no compelling evidence that impacts on executive function lead to increases in academic achievement.”

One of the problems with the training programmes is that they target multiple factors at the same time. For instance, the REDI intervention targeted both executive functioning and school achievement. The authors make the point that:

“… if the intervention improved children’s ability to take tests, then children would perform better on both measures of executive function and on measures of achievement. If the improved ability to take tests was not accounted for in the analyses, the improvement in executive function would be correlated with the improvement in achievement.”

The problems with applying psychological research in schools

Children vary in many ways – so it should come as no great surprise that we find psychological differences between kids who do well at school and those who struggle. However, the fact that children’s school attainment correlates with cognitive ability ‘X’ or attribution ‘Y’ doesn’t tell us whether trying to train ability ‘X’ or change attribution ‘Y’ will actually help.

That’s one of the problems when trying to apply psychological findings to education: simply identifying cognitive or affective differences between children isn’t actually all that useful. This kind of purely psychological research is a different kettle of fish to the applied psychology of designing effective ‘interventions’ to raise achievement. At the moment there’s a lot of hype around cognitive and attributional variables which correlate with school outcomes.

As usual, the cart ends up before the horse – and interventions are implemented in schools before there’s good evidence about whether they do any good. It’s important we remember that interventions based on identified psychological differences may not necessarily lead to benefits for children. For instance, an intervention may be costly and irrelevant if another factor causes both the differences detected and the improved outcomes.

Of course, when schools have invested a great deal of time, effort and training in such an intervention scheme, it becomes easy for them to convince themselves that they are seeing a genuine difference. But we can’t rely on anecdotal evidence or professional experience alone here! The evidence to date suggests that teachers should be highly sceptical of training or intervention programmes which claim to have success in raising achievement through targeting executive functioning.

“… I found things that even more people believe, such as that we have some knowledge of how to educate. There are big schools of reading methods and mathematics methods, and so forth, but if you notice, you’ll see the reading scores keep going down–or hardly going up–in spite of the fact that we continually use these same people to improve the methods. …

I think the educational and psychological studies I mentioned are examples of what I would like to call cargo cult science. In the South Seas there is a cargo cult of people. During the war they saw airplanes with lots of good materials, and they want the same thing to happen now. So they’ve arranged to make things like runways, to put fires along the sides of the runways, to make a wooden hut for a man to sit in, with two wooden pieces on his head like headphones and bars of bamboo sticking out like antennas–he’s the controller–and they wait for the airplanes to land. They’re doing everything right. The form is perfect. It looks exactly the way it looked before. But it doesn’t work. No airplanes land. So I call these things cargo cult science, because they follow all the apparent precepts and forms of scientific investigation, but they’re missing something essential, because the planes don’t land.”

When I was an undergraduate, one of the best modules I took was ‘applications of psychology’, led by the then head of psychology, Professor Ian Howarth. One of Howarth’s observations, which has stuck with me ever since, was the nonsensicality of business advice which focused on the practices of the top-performing companies. What does such analysis actually tell us? How do we know that the bottom-performing companies aren’t doing the same thing? It might be the case that the most successful and the least successful companies do many of the same things – or that these factors are actually irrelevant to building a successful company, or simply necessary but not sufficient for success.

There’s the same problem with the drive for school improvement. Frequently, ‘top performing’ schools (or school systems in other countries) are picked out for praise and their ‘best practices’ shared with others. However, for comparisons between schools to be of any value in improving the service they provide for children, any analysis requires that the judgements of school performance are valid and reliable.

Without valid and reliable judgements, school improvement advice devolves to the level of ‘cargo cult’ science. In the absence of a genuine understanding of what improves schools, we risk (ineffectively and pointlessly) merely emulating the appearance of success.

“Drawing on interviews with parents, qualitative studies provided evidence that middle-class parents are particularly anxious about securing a place for their child in a ‘good school’ and, as a result, are very active in the school choice process. In contrast working-class families are more concerned about finding a school where they feel at home, in order to avoid rejection and failure.”

“Right now, good people are being turned off becoming headteachers because the element of risk involved in the job has increased significantly. We’re in a situation where the knee-jerk reaction is that if a school has problems, the answer is to get rid of the head. It’s the football manager mentality, whereas what schools need is stability, and what heads need is constructive support, not the adversarial system we’re in now where school inspections are hit-jobs.”

“Ofsted was wielding a “Sword of Damocles” over “any senior leaders foolish enough to think that they will be sufficient to undertake the tricky work of turning round schools with seriously entrenched problems,” Mary Bousted, general secretary of the Association of Teachers and Lecturers, told her union’s annual conference in Manchester.”

‘Speaking as the report was published, Sir Michael added: “We also face a major challenge getting the best teachers into the right schools.

‘“Good and outstanding schools with the opportunity to cherry pick the best trainees may further exacerbate the stark differences in local and regional performance. The nation must avoid a polarised education system where good schools get better at the expense of weaker schools.”’

Ofsted ratings of schools are not worthless, but whilst they are treated as infallible judgements of school quality – I strongly suspect they are doing more harm than good. It won’t happen this side of an election – where talking tough on schools appears to be “the winning strategy” – but the way forward likely involves lowering the immediate stakes of school accountability in line with the validity of the judgements.

So how valid are Ofsted judgements of school performance?

Some validity issues with school performance measures

The key measure of a state school’s performance in England is that school’s Ofsted rating. However, there are serious questions about the validity of these judgements. One issue is reliability: if a measure is unreliable (i.e. a different inspection team would often reach a different verdict), then the measure is not a valid one. We know that the validity of judgements of teaching quality from observations is suspect precisely for this reason.

“a number of research studies have looked at the reliability of classroom observation ratings. For example, the recent Measures of Effective Teaching Project, funded by the Gates Foundation in the US, used five different observation protocols, all supported by a scientific development and validation process, with substantial training for observers, and a test they are required to pass. These are probably the gold standard in observation (see here and here for more details). The reported reliabilities of observation instruments used in the MET study range from 0.24 to 0.68.

“One way to understand these values is to estimate the percentage of judgements that would agree if two raters watch the same lesson. Using Ofsted’s categories, if a lesson is judged ‘Outstanding’ by one observer, the probability that a second observer would give a different judgement is between 51% and 78%.”
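A rough sense of how reliability translates into disagreement can be had from a toy simulation. The model below is my own simplification (not the MET or Ofsted methodology): the two raters’ underlying scores are treated as bivariate normal with correlation equal to the instrument’s reliability, then binned into four equal-frequency grades. The exact percentages depend on the cut-points and distributional assumptions, but the qualitative point survives: at reliabilities anywhere in the 0.24–0.68 range, a second observer frequently disagrees with an ‘Outstanding’ judgement.

```python
import numpy as np

def disagreement_rate(reliability: float, grades: int = 4,
                      n: int = 200_000, seed: int = 0) -> float:
    """Estimate P(second rater gives a different grade | first rater
    gives the top grade), under a simplified model: both raters' scores
    are bivariate normal with correlation = reliability, and grades are
    equal-frequency bins (real Ofsted grades are not equally populated)."""
    rng = np.random.default_rng(seed)
    cov = [[1.0, reliability], [reliability, 1.0]]
    scores = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    # Equal-frequency cut-points estimated from the first rater's scores.
    cuts = np.quantile(scores[:, 0], np.linspace(0, 1, grades + 1)[1:-1])
    g1 = np.digitize(scores[:, 0], cuts)   # grades 0 .. grades-1
    g2 = np.digitize(scores[:, 1], cuts)
    top = g1 == grades - 1                 # first rater says 'Outstanding'
    return float(np.mean(g2[top] != grades - 1))

# Disagreement at the two ends of the reported reliability range:
print(disagreement_rate(0.24), disagreement_rate(0.68))
```

In this toy model the lower reliability produces the markedly higher disagreement rate, which illustrates why instrument reliabilities of 0.24–0.68 leave so much room for two observers watching the same lesson to reach different verdicts.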

‘Ofsted said: “The reliability of the short inspection methodology will be tested during the pilots by two HMIs independently testing the same school on the same day and comparing judgements.”

‘Ofsted said that inspectors would continue using data as a “starting point” in all inspections.

‘“However in reaching a final judgement, inspectors consider the information and context of a school, as shown by the full range of evidence gathered during an inspection, including evidence provided by a school.”’

Regardless of the outcome, this probably won’t reassure school leaders, in my opinion, for several reasons:

Lack of blinding protocols

One major reason why successful trial results might be rejected by Ofsted’s critics may be the lack of independence of the judgements made by inspection teams. At the moment, the proposal appears to be that two inspection teams would enter a school on the same day and form separate judgements of overall effectiveness (presumably alongside separate assessments of achievement, quality of teaching, behaviour and leadership). However, for any agreement between these grades to be convincing evidence of reliability, the ratings would have to be truly independent from one another.

One threat to these trials is the absence (at time of writing) of any ‘blinding’ protocols. In research the system used to ensure that observations or measurements are genuinely independent is called ‘blinding’.

‘Blinding’ is particularly used in medical research. For instance, where a new drug is being tested against a placebo, the participants in the trial don’t know whether they are receiving the real treatment or not. Indeed, in high-quality studies this is taken further (as a ‘double-blind’ trial): the doctor giving the treatment also doesn’t know whether the patient is receiving the real or placebo drug. The purpose of these protocols is to reduce the potential bias created by the expectations of the patient and the doctor, so that if a difference is found between the effectiveness of the real drug and the placebo, it can be confidently attributed to the effect of the drug rather than the reactivity of the patient or the (unconscious) bias of the doctor.

Anchoring effects in judgements of performance

In the absence of some form of blinding, agreement between two inspection teams won’t necessarily mean that the system of forming judgements about schools is reliable. It may simply mean that unconscious bias has distorted the two judgements so they fall into closer agreement.

One obvious source of bias is the anchoring effect that a review of prior data before the inspection would produce. There is a widespread belief that inspection judgements are too heavily influenced by the data inspectors see on a school before the team even sets foot in the building.

Whilst we might hope that inspection systems somehow compensate for this bias, the evidence appears to suggest that it is a significant problem.

“So, it is easy. If you want your secondary school to get an Outstanding Ofsted grade and you want to avoid RI or Inadequate, make sure your pupils’ previous attainment on intake is as high as possible.

“Depressingly, this is our accountability system. This is why it is getting difficult to recruit good people to struggling schools. The odds are against them in more ways than the obvious ones.”

“All schools, regardless of their starting point, have a responsibility to do their utmost to secure the very best outcomes for each pupil in their school. However, to address the balance of the observations that have been made, is it not right that judgements of effectiveness (and indeed achievement) give appropriate consideration to the attainment profiles of every school.”

I contend that this strongly suggests that there is a powerful anchoring effect at play when inspection teams make judgements of schools. It appears that reviewing attainment data prior to inspection ‘anchors’ the judgement that inspection teams finally reach. Once anchored, impressions of the school’s effectiveness will likely gravitate towards that prior expectation (what is sometimes called the ‘halo or horns’ effect).

In the case of two teams inspecting on the same day to establish reliability, I suspect this anchoring effect might have even greater sway. Inspection teams concerned that their judgements might be at odds would tend to rely much more on the ‘objective’ data about the school than on their subjective judgement on the day. Expectations formed prior to inspection about whether a school is likely to be ‘good’ or ‘requires improvement’ would be like telling the patient whether they have received the real drug or a placebo. Thus, even if the pilot is successful, it would not necessarily tell us the genuine reliability of inspection judgements.
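The statistical point is easy to demonstrate with a toy simulation (all numbers here are invented for illustration): if both teams’ judgements are pulled towards the same prior-data anchor, they will agree closely even when the anchor bears no relation to the school’s true effectiveness. A minimal sketch in Python:

```python
import random
import statistics

random.seed(42)

def correlation(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def judge(quality, anchor, anchor_weight):
    """One team's judgement: a mix of what it actually observes on the
    day (true quality plus noise) and the prior-data 'anchor'."""
    observed = quality + random.gauss(0, 1.0)
    return anchor_weight * anchor + (1 - anchor_weight) * observed

n = 1000
quality = [random.gauss(0, 1) for _ in range(n)]  # true effectiveness
anchor = [random.gauss(0, 1) for _ in range(n)]   # prior attainment data;
                                                  # here unrelated to quality

results = {}
for w in (0.0, 0.8):  # independent judgements vs heavily anchored ones
    team_a = [judge(q, a, w) for q, a in zip(quality, anchor)]
    team_b = [judge(q, a, w) for q, a in zip(quality, anchor)]
    results[w] = {
        "agreement": correlation(team_a, team_b),  # inter-team reliability
        "validity": correlation(team_a, quality),  # tracks true quality?
    }
    print(f"anchor weight {w}: agreement {results[w]['agreement']:.2f}, "
          f"validity {results[w]['validity']:.2f}")
```

In this toy model, anchoring raises inter-team agreement dramatically while the validity of the judgement collapses – precisely the scenario in which a ‘successful’ reliability trial would be misleading.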

The problems of ‘progress measures’

Reliability is a necessary but not sufficient condition for validity. In other words, a judgement can be consistently wrong. It is therefore worth exploring whether the measures of progress used to make judgements of school effectiveness are themselves valid.

The fact that school performance judgements correlate strongly with the prior attainment of students suggests that attainment data (e.g. 5+ A*-C including English and maths) may be exerting an anchoring effect on inspection judgements. Will progress measures (e.g. 3 levels of progress across a key stage) provide a more valid basis for judgements about schools?

“Across England 81% of “high prior attainment” students (those on level 5 at age 11) make 3 levels of progress in Maths, but only 33% of “low prior attainment” students (those on level 3 or below at age 11) do so.”

Indeed, the 3 levels of progress measure appears to give a fairly easy ride for schools which can select or attract students with higher prior attainment scores:

“High expectations are a good thing. But this measure gives a low expectation for the level 5s while providing a very tough one for level 3s. The effect is that whether a school is deemed to be making sufficient progress with its students is much more likely to be defined by the intake than by the value it adds. Rather than balancing the absolute 40% measure, it reinforces any bias due to the low level of achievement at age 11.”
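A back-of-the-envelope calculation makes this point concrete. Suppose two schools convert each prior-attainment group at exactly the national rates quoted above (81% for level 5 pupils, 33% for level 3 pupils) – i.e. they add identical value – and differ only in intake. The intake mixes below are invented for illustration, and the two-group cohort is a simplification (real cohorts include level 4 pupils):

```python
# National 3LoP rates quoted above: 81% of level-5 pupils and 33% of
# level-3 pupils make 3 levels of progress in maths.
rate = {"level_5": 0.81, "level_3": 0.33}

def headline_3lop(intake):
    """Overall proportion of pupils making 3 levels of progress,
    given an intake mix {group: proportion of cohort}."""
    return sum(share * rate[group] for group, share in intake.items())

# Invented intake mixes: identical value added, different intakes.
high_intake = {"level_5": 0.8, "level_3": 0.2}
low_intake = {"level_5": 0.2, "level_3": 0.8}

print(f"high prior-attainment intake: {headline_3lop(high_intake):.0%} make 3LoP")
print(f"low prior-attainment intake:  {headline_3lop(low_intake):.0%} make 3LoP")
```

Two schools doing an identical job produce headline progress figures roughly thirty percentage points apart, purely because of who walks through the door in year 7.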

This isn’t just a problem for schools with low prior attaining cohorts. It’s interesting that the lack of ambition embedded in the 3LoP measure – where KS2 students attaining a 5a are expected only to gain a grade B at GCSE in order to make ‘Expected Progress’ – is now being reported as a ‘failure’ of state schools rather than a distortion created by the accountability system:

“almost two thirds of high-attaining pupils (65%) leaving primary school, securing Level 5 in both English and mathematics, did not achieve an A* or A grade (a key predictor to success at A level and progression to university) in both these GCSE subjects in 2012 in non-selective secondary schools.”

In a couple of years, of course, all this will change with the movement to ‘Progress 8’ as a replacement for levels of progress.

“So let’s get this straight… “The progress 8 score improves equally, regardless of the grades they are moving between?”

“In 2017, a student moving up a grade at the top of the grading scale, receives a point score increment three times of someone lower down the scale. Is that fair? …

“What this could mean is that a school teaching all of its students equally would be better served focusing on those more likely to attain the higher grades. i.e. by its design the progress 8 measure in 2017 could be encouraging prioritisation of the students with higher prior attainment over those with lower “ability”.”
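The arithmetic behind this claim can be checked against the 2017 interim point scores for legacy GCSE grades. The values below are my understanding of that mapping, so treat the exact numbers as an assumption, though the ratios match the quote above:

```python
# 2017 interim point scores for legacy GCSE grades (my understanding of
# the mapping -- treat the exact values as an assumption).
points_2017 = {"G": 1.0, "F": 1.5, "E": 2.0, "D": 3.0,
               "C": 4.0, "B": 5.5, "A": 7.0, "A*": 8.5}

grades = list(points_2017)  # insertion order runs from G up to A*
for lower, upper in zip(grades, grades[1:]):
    gain = points_2017[upper] - points_2017[lower]
    print(f"{lower} -> {upper}: +{gain} points")
```

On this mapping a one-grade improvement at the top of the scale (B to A, or A to A*) is worth 1.5 points – three times the 0.5 points earned for moving a pupil from G to F, exactly the distortion described above.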

He makes the point that it isn’t clear what RAISEonline is actually testing:

“If you want to test whether the test results for a given school is statistically significant when compared to a national mean and standard deviation, as RAISEonline does, you are effectively testing a ‘school effect’. Is there something about this school which makes it different to a control sample, in RAISEonline’s case all the children contributing results for a given school year?

“So what does make the school different? Is it the quality of the teaching and learning, as RAISEonline implicitly assumes? Is it a particular cohort’s teaching and learning? Is it the socio-economic background of the children? Is it their prior attainment? Is it their family income? Or is it a combination of these factors?”

He also points out some fairly fundamental issues with the assumptions underlying the RAISEonline analysis of school performance. Many of these statistical assumptions simply do not match the reality of the school system. For example:

Children are not randomly allocated to schools.

Children attending a school are likely to be similar to each other, making their data related.

The piecemeal development of key stages and sub-levels means that the data is compromised before any analysis begins.

Age differences between summer- and winter-born children are ignored.

Creating average point scores (APS) renders the analysis meaningless.

The term ‘significant’ is used in a way contrary to its definition by statisticians.

He summarises:

“You are stuffed from the outset, as RUBBISHonline is likely to show up all kinds of warnings because it’s using the wrong, and incorrectly applied, tests of significance which don’t even stand up to elementary statistical scrutiny.”
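The clustering point in particular is worth dwelling on, because its consequences are easy to simulate. In the sketch below (all parameters invented), no school does anything differently: pupils’ scores are national noise plus a shared cohort-level component (intake, a disrupted year, a local event). A naive significance test which treats pupils as independent draws from the national distribution flags far more than the nominal 5% of these identical schools as ‘significantly’ different:

```python
import random
import statistics

random.seed(1)

N_SCHOOLS, N_PUPILS = 500, 100
COHORT_SD, PUPIL_SD = 0.5, 1.0  # invented: shared cohort noise + pupil noise

def cohort_scores():
    """Scores for one school's cohort. Every school is identical:
    national mean 0, plus a cohort-wide shock, plus pupil noise."""
    cohort_effect = random.gauss(0, COHORT_SD)
    return [cohort_effect + random.gauss(0, PUPIL_SD) for _ in range(N_PUPILS)]

flagged = 0
national_sd = (COHORT_SD**2 + PUPIL_SD**2) ** 0.5
for _ in range(N_SCHOOLS):
    mean = statistics.mean(cohort_scores())
    # Naive test: treat the 100 pupils as independent draws from the
    # national distribution, ignoring the shared cohort component.
    z = mean / (national_sd / N_PUPILS**0.5)
    if abs(z) > 1.96:  # 'significant at the 5% level'
        flagged += 1

print(f"{flagged / N_SCHOOLS:.0%} of identical schools flagged as significant")
```

Because pupils in the same cohort share variance, the effective sample size is far smaller than the pupil count, so the naive test wildly overstates how surprising each school’s results are.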

Non-linear progress

Of course, most school measures presume the validity of the concept of linear progress across a key stage. Many schools now talk about ‘trajectories’ when discussing student progress – embodying the assumption that it is reasonable to expect children to make such linear progress.

“We have an accountability system that has encouraged schools to check that children are making a certain number of sub-levels of progress each year. This is the basis on which headteachers monitor (and now pay) teachers and on which Ofsted judges schools. Yet there is little hard science underpinning the linear progress system in use.”

It turns out that there may be numerous pathways by which children reach their attainment at 16, and that almost all children – at some stage in their school life – will be deemed to be underachieving.

This volatility in outcomes is especially high for children with low attainment at key stage 1 and implies that the tracking systems used (by schools and Ofsted) to identify students as ‘on target’ or ‘off track’ are based on invalid assumptions about how children progress through school. These labels have a large impact on what happens to those students, their teachers and ultimately their schools.

“The vast majority of pupils do not make linear progress between each Key Stage, let alone across all Key Stages. This means that identifying pupils as “on track” or “off target” based on assumptions of linear progress over multiple years is likely to be wrong. “This is important because the way we track pupils and set targets for them:

influences teaching and learning practice in the classroom;

affects the curriculum that pupils are exposed to;

contributes to headteacher judgements of teacher performance;

is used to judge whether schools are performing well or not.”
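The fragility of ‘on track’ labels can be illustrated with a toy simulation (every parameter below is invented): give every pupil perfectly linear ‘true’ progress, add a little assessment noise at each tracking checkpoint, and count how many are flagged ‘off track’ at least once.

```python
import random

random.seed(7)

N_PUPILS, CHECKPOINTS = 1000, 6  # e.g. termly tracking points (invented)
NOISE_SD = 0.35                  # assessment noise, in 'sub-levels' (invented)

flagged_ever = 0
for _ in range(N_PUPILS):
    off_track = False
    for t in range(1, CHECKPOINTS + 1):
        expected = t                       # the assumed linear trajectory:
                                           # one sub-level per checkpoint
        measured = t + random.gauss(0, NOISE_SD)
        if measured < expected - 0.3:      # tracking tolerance (invented)
            off_track = True
    if off_track:
        flagged_ever += 1

print(f"{flagged_ever / N_PUPILS:.0%} of perfectly linear pupils "
      f"flagged 'off track' at least once")
```

Even with genuinely linear underlying progress, measurement noise alone is enough to mislabel most pupils at some point – before any of the real non-linearity described above is even added.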

Is Ofsted inadvertently driving a ‘cargo cult’ of school improvement?

It is difficult to understand why anyone would defend the high-stakes performance measures currently applied to schools: many of the statistical assumptions are flawed; children do not make ‘linear’ progress; the progress measures are distorted to favour high prior attainment; and attainment measures likely ‘anchor’ inspector judgements. Among the damaging consequences of applying high-stakes accountability to performance measures which lack validity are the excessive workload it creates for teachers, the difficulties it compounds for teacher recruitment and retention, and the problems it produces in attracting school leaders to challenging schools.

Low validity in Ofsted judgements would imply that there are schools rated RI or even Inadequate which have received those judgements by virtue of the volatile progress of their students – but are perhaps doing some very worthwhile things to help those students get the most out of their education. On the other hand, there are likely some schools rated Outstanding which are actually coasting along relatively ineffectually, but protected by the fact that they attract or select higher attaining pupils.

Perhaps one of the most damaging consequences of a high-stakes/low-validity accountability system is that we continually fail to learn very much about what makes a school effective. Instead, it encourages a ‘cargo cult’ approach to school improvement:

Gimmicks and pet projects from ‘outstanding’ schools or teachers – which may have nothing to do with improved outcomes for students – are copied across schools as ‘best practice’.

Teachers and school leaders engage in time-consuming activities – which may have nothing to do with improved outcomes for students – in the hope that Ofsted lands them a good judgement.

“Measurement has become a proxy for learning. If we really value students’ learning we should have a clear map of what they need to know, plot multiple pathways through the curriculum, and then support their progress no matter how long or circuitous. Although this is unlikely to produce easily comprehensible data, it will at least be honest. Assigning numerical values to things we barely understand is inherently dishonest and will always be misunderstood, misapplied and mistaken.”

Whether discussed under the guise of ‘resilience’, ‘grit’ or ‘character’, there appears to be a great appetite for psychologically manipulating pupils’ personalities or their attributions about school. One concept which has particularly captured the imagination of teachers and school leaders is ‘growth mindset’: the idea that children who possess incremental theories of intellect (a growth mindset) appear to achieve better grades than those who possess an entity theory of intellect (a fixed mindset).

The claim that there are attributional differences between pupils which can affect their experience of school and their academic outcomes is well supported. You can read a bit more about some of the psychology behind the idea of a ‘growth mindset’ here: Growth Mindset: It’s not magic

However, accepting that these key attributional variables exist still leaves at least two important questions that school leaders and teachers should be asking before seeking to implement ‘growth mindset’ interventions in schools.

• Firstly, will changing a pupil’s attributions (their attitudes and beliefs) cause desirable changes in behaviour?
It’s possible that the causal arrow between ‘mindset’ and performance is not a straightforward one. It’s natural to assume that changing a person’s beliefs will alter their behaviour, but the evidence on this is much more complicated.

• Secondly, even where experimental psychological interventions are successful, will the implementation of such interventions in schools lead to the desired outcomes?
It’s possible that the elements of a psychological intervention which led it to be successful will be lost or negated when it is scaled at a school level.

Changing behaviour is hard. Just about everyone I know is trying to change their behaviour in some way; trying to eat more healthily or take more exercise, cutting down on drinking or quitting smoking, being more environmentally friendly by recycling more or using their car less. However, simply because we hold certain beliefs and attitudes (should eat more vegetables, smoking leads to early death, it’s important to protect the environment) doesn’t necessarily mean we successfully change our behaviour.

“To take a concrete example, efforts to communicate to people the benefits of not smoking, in the absence of a wider set of measures to reinforce and sustain this healthy lifestyle choice, are doomed to failure. A more comprehensive approach is required which explicitly acknowledges social and environmental influences on lifestyle choices and addresses such influences alongside efforts to communicate with people.”

Worse than having no effect, sometimes poorly implemented public health interventions can have negative effects. For example, Glasgow et al (1999):

“Interventions delivered to large populations can also have unanticipated negative effects. Labeling someone with a potential illness may have profound social and psychological consequences.”

Another example of the complexity of the relationship between attitudes and behaviour is using fear to change health behaviour. It seems that attempts to change health behaviour through fear appeals can be very effective, but can also quickly backfire where individuals have low self-efficacy in their ability to avert that threat (Witte and Allen, 2000).

“when a threat is portrayed as and believed to be serious and relevant (e.g., “I’m susceptible to contracting a terrible disease”), individuals become scared. Their fear motivates them to take some sort of action—any action—that will reduce their fear. Perceived efficacy (composed of self-efficacy and response efficacy) determines whether people will become motivated to control the danger of the threat or control their fear about the threat.”

Whilst there’s clearly a relationship – a correlation – between attitudes and behaviour, that relationship is complex and interventions can potentially backfire. The reasons are numerous, but Kollmuss and Agyeman (2002) summarise four reasons why attempts to change attitudes fail to lead to changes in behaviour.

“… quantitative research has shown that there is a discrepancy between attitude and behavior. Many researchers have tried to explain this gap. Rajecki (1982) defined four causes:

Direct versus indirect experience: Direct experiences have a stronger influence on people’s behavior than indirect experiences. In other words, indirect experiences, such as learning about an environmental problem in school as opposed to directly experiencing it (e.g. seeing the dead fish in the river) will lead to weaker correlation between attitude and behavior.

Normative influences: Social norms, cultural traditions, and family customs influence and shape people’s attitudes, e.g. if the dominant culture propagates a lifestyle that is unsustainable, pro-environmental behavior is less likely to occur and the gap between attitude and action will widen.

Temporal discrepancy: Inconsistency in results occur when data collection for attitudes and data collection for the action lie far apart (e.g. after Chernobyl, an overwhelming majority of Swiss people were opposed to nuclear energy; yet a memorandum two years later that put a 10-year halt to building any new nuclear reactors in Switzerland was approved by only a very narrow margin). Temporal discrepancy refers to the fact that people’s attitudes change over time.

Attitude-behavior measurement: Often the measured attitudes are much broader in scope (e.g. Do you care about the environment?) than the measured actions (e.g. Do you recycle?). This leads to large discrepancies in results (Newhouse, 1991).”

Mindset interventions don’t work by trying to browbeat pupils into believing in the merits of hard work or that their ‘brain can grow’. Direct appeals and information alone don’t change behaviour very effectively at all. In fact, effective psychological interventions involve a subtle, well-aimed nudge, which initiates a more complex social process.

How do successful mindset interventions work?

Experimental studies have shown that very brief psychological interventions can lead to long-lasting changes in mindset and effects on pupil performance.

For example, Yeager, Paunesku, Walton and Dweck (2013) provide a review of some of this evidence. They describe students solving maths problems more successfully months after being shown a short mindset message at the top of a computer screen, compared with randomly allocated controls. Another experiment found that a single internet-delivered lesson on growth mindset reduced the failure rate of low-achieving pupils by 7%. A web-based mindset intervention also improved freshman completion rates by 3-4% (a 10% improvement amongst African-American students).

Some of these psychological interventions, lasting only a few minutes, have been shown not only to improve school performance but also to change behaviour in other social contexts, such as encouraging voting participation. Any school leader or teacher thinking of implementing a growth mindset intervention would do well to read some of the work by Yeager and Walton.

Both Walton and Yeager identify some key components of successful mindset interventions: psychological insight and the precise targeting of a brief, ‘stealthy’ intervention; and the use of recursive processes – essentially triggering a virtuous circle which reinforces the original intervention. I contend that these two components are lacking in many school initiatives seeking to exploit ‘growth mindset’ research.

‘Theory guided precision’

“In the spirit of Kurt Lewin (“There is nothing so practical as a good theory”), creators of “wise interventions” leverage specific psychological insights.”

Rather than being a generic appeal, successful psychological interventions tend to be highly specific – crafted to the precise psychological process being manipulated:

“A wise intervention begins with a specific, well-founded psychological theory. This theoretical precision allows researchers to create a precise tool, often instantiated in a brief exercise, to change a specific psychological process in a real-world setting. This psychological precision reflects the same values psychologists cultivate in laboratory research—keen insight into basic processes and methodological precision to isolate these processes. Wise interventions export this precision in theory and methodology to field settings.”

The intervention methods come from a solid understanding of the psychology of social influence and persuasion. For example, for growth mindset interventions:

“Rather than simply presenting an appeal to a student, each intervention enlisted students to actively generate the intervention itself. For instance, one delivery mechanism involves asking students to write letters to younger students advocating for the intervention message (e.g., “Tell a younger student why the brain can grow”). As research on the “saying-is-believing” effect shows, generating and advocating a persuasive message to a receptive audience is a powerful means of persuasion (Aronson, 1999).”

By targeting a psychological process in such a specific way, these interventions use ‘stealthy’ and brief delivery mechanisms that quickly change students’ beliefs. But, they’re not ‘magic’:

“They are not worksheets or phrases that will universally or automatically raise grades. Psychological interventions will help students only when they are delivered in ways that change how students think and feel in school, and when student performance suffers in part from psychological factors rather than entirely from other problems like poverty or neighbourhood trauma.”

This isn’t a role for non-specialists, according to Yeager and Walton. They suggest that we need a new class of professional psychologist to scale the impact of social-psychological interventions in schools:

“Along similar lines, it may be useful to revisit past suggestions for creating a new class of professional—a “psychological engineer”—a person with the expertise needed to scale psychological interventions effectively. Such professionals would be trained in experimental methodology and psychological theory, although their primary work would be not to advance psychological theory but to understand and alter psychological dynamics in applied settings.”

Essentially, psychological interventions aren’t suited to generic attempts at amateur psychology. The researchers demonstrating the most successful interventions suggest that a level of expertise is involved: to be successful, those designing and delivering an intervention require a significant understanding of the psychological theories at work. In addition, such interventions should not be seen as a panacea – they cannot, on their own, overcome significant problems caused by socio-economic deprivation.

‘Recursive processes’

Successful experiments in social-psychological interventions need to be stealthy and brief, according to Yeager and Walton:

“Often psychological interventions are brief — not extensive or repeated. Excessive repetition risks sending the message that students are seen as needing help or may undermine the credibility of a reassuring message (as in “thou doth protest too much”). In this way, delivering psychological interventions differs markedly from teaching academic content. Academic content is complex and taught layer on layer: The more math students are taught, the more math they learn. Changing students’ psychology, by contrast, can call for a light touch.”

Thus, frequent repetition of ‘growth mindset’ messages through lessons or tutorials not only fails to boost the effectiveness of the intervention but may actively undermine it. This is such a counter-intuitive point that I’m not surprised it’s overlooked by teachers and school leaders. It’s also, I suspect, not in anyone’s commercial interests to make!

If successful interventions are subtle and brief, how do they have such apparently large effects on student performance? Walton explains:

“To understand them, it is essential that one consider how interventions change not a moment in time (“a snapshot”) but a process that unfolds over time (“a movie”; Kenthirarajah & Walton, 2013). In a relationship, every interaction builds on the previous interaction. By targeting psychological processes that contribute to recursive dynamics that compound with time, wise interventions can improve downstream consequence.”

To give an example, Walton and Cohen (2011) relate an experiment seeking to reinforce freshmen’s sense of belonging at college. They hypothesised that African-American students would particularly benefit from a one-hour social-psychological intervention designed to reduce the perceived threat of social adversity.

“The intervention gives students an alternative narrative for understanding negative experiences—namely that worries about belonging are normal in the transition to a new school but dissipate with time.” …

“How could this work? Imagine you are a freshman worried about whether you belong in college. Learning that such worries are common and improve with time may take the edge off negative experiences. … minority students in the intervention condition no longer saw daily slights as if they portended a global lack of belonging; this change in social construal mediated the 3-year improvement in grades. If everyday encounters feel less threatening, perhaps students can interact with others in more positive ways and build better relationships.”

“Wise interventions harness the power of self-fulfilling beliefs. … Believing that change is possible with effort — “When you learn a new kind of math problem, you grow your math brain!” — students may experience greater success, which discounts the sense they aren’t “gifted” at math and strengthens their self-efficacy.”

It is the experience of success which comes with effort that feeds into a student’s perception. The purpose of a ‘growth mindset’ intervention is a subtle ‘nudge’ which promotes the behaviour more likely to achieve that success. It is not motivational quotes, inspirational stories of ‘growth mindset’ heroes, students’ post-it notes on a growth mindset wall, growth mindset lesson objectives, roleplaying a TV show about overcoming a fixed mindset or other kinds of ‘rah-rah boosterism’:

“Bolstering a sense of belonging for poor-performing students requires establishing credible norms that worry about belonging are common and tend to fade with time — not rah-rah boosterism.” …

“Good teachers often know the importance of belonging, growth, and positive affirmation. But they may not know the best ways to bring these about. Well-intended practices can sometimes even do more harm than good.”

A successful psychological intervention involves a quick, well-targeted ‘nudge’; not repeatedly hitting students over the head with a sledgehammer!

In the absence of an army of ‘psychological engineers’ to implement interventions for us, what might work in schools to encourage a growth mindset? Here are 5 suggestions.

1) Focus on students achieving success, rather than tackling their motivation.

In ‘What makes great teaching’, Rob Coe and his team report that trying to tackle motivation alone has very little effect on student progress:

“Address issues of confidence and low aspirations before you try to teach content

Teachers who are confronted with the poor motivation and confidence of low attaining students may interpret this as the cause of their low attainment and assume that it is both necessary and possible to address their motivation before attempting to teach them new material. In fact, the evidence shows that attempts to enhance motivation in this way are unlikely to achieve that end. Even if they do, the impact on subsequent learning is close to zero (Gorard, See & Davies, 2012). In fact the poor motivation of low attainers is a logical response to repeated failure. Start getting them to succeed and their motivation and confidence should increase.”

It is vital to remember that it is the experience of success which leads to long-lasting change in attitudes to school. Even where attitudes are changed, there will be little long-term effect on behaviour unless the pupil enters that recursive, virtuous cycle of success. What makes a mindset intervention successful isn’t magic: it is a subtle nudge which encourages pupils to behave in ways more likely to achieve success. Where pupils do not then experience that success, the effort will be undermined or may even have a negative influence (e.g. apparently confirming a lack of intelligence).

2) Focus students upon the strategies they use

Whilst an absence of effort pretty much guarantees failure, ‘more effort’ on its own is no guarantee of success. One positive development is that some schools are shifting the focus away from ‘praising effort’ towards a more thoughtful approach: developing students’ metacognition. In a recent blog, John Tomsett relates the development of ‘learning tools’ – a range of strategies which pupils can select when they don’t meet with success.

There are likely to be some common, maladaptive strategies which students employ – for example, adopting avoidance strategies when they become anxious: behaving badly, truanting from a lesson or procrastinating when revising. However, my suspicion is that ‘learning tools’ will only be adaptive where they are domain-specific – e.g. a range of particular strategies which students can use when they get stuck with maths problems. The problem with a ‘generic skills’ approach to learning (e.g. developing dispositions, or ‘learning to learn’ skills) is that such skills tend not to transfer between contexts.

3) Evaluate change in behaviour rather than attitudes

The use of surveys is a common way of trying to establish whether a pupil possesses a growth or a fixed mindset. We need to be very wary of these as measurements of impact. School mindset interventions which rely upon explicit mindset messages may temporarily alter student attitudes to their learning without actually changing their behaviour in the classroom or outside of school. Worse still, reliance upon ‘inspirational’ messages or explicit teaching of mindset may simply tell pupils the socially desirable response expected in surveys – giving the appearance of changing attitudes without genuinely changing the attitudes that pupils possess. This would render any attempt to measure ‘impact’ through – for instance – student surveys potentially meaningless.

Successful interventions will look for behavioural changes rather than effects on attitudes. For example, one possible quantitative measure of impact might be to quietly measure specific students’ ‘time on task’ in lessons; qualitatively, one might gauge changes in effort through analysis of students’ work in their books.

4) Focus on the normative influences within the school culture

Interventions need to consider the broader normative influences operating within a particular school context and for a particular child within that context. Norms need to de-emphasise the negative consequences of making mistakes and discourage social comparisons. There’s probably little point in ‘preaching’ a growth mindset within a broader school context which explicitly emphasises performance-orientated structures or goals. (Of course, it will be almost impossible to communicate this effectively to pupils if teachers in a school are subject to high-stakes accountability systems which do not embody the same values!)

For example, there’s the danger that even a successful attempt to alter pupils’ theory of intellect will be significantly undermined by the pupils’ experience of being relegated to a ‘bottom set’ or being given an artificial target of an ‘A’ that they repeatedly fail to reach.

5) Consider teachers’ implicit theories of intellect.

For an incremental theory of ability to become an unspoken social norm, it would be useful to consider the attitudes and beliefs about learning held by teachers. For example, the ‘What makes great teaching’ report suggests that highly effective maths teachers tend to share some common beliefs about learning:

“How children learn:
• almost all pupils are able to become numerate
• pupils develop strategies and networks of ideas by being challenged to think, through explaining, listening and problem solving.

They used teaching approaches that:
• ensured that all pupils were being challenged and stretched, not just those who were more able
• built upon pupils’ own mental strategies for calculating, and helped them to become more efficient.”

Teachers unconsciously communicate their attitudes and beliefs about intellect and learning when they interact with pupils. To what extent do teachers hold ‘growth mindset’ beliefs about their pupils? Where teachers privately hold entity theories about ability in their subject, they are likely to communicate these to pupils (despite giving the ‘socially desirable’ response when asked in surveys).

Challenging the profession to learn more about the nature and nurture of intelligence, and about concepts like neuroplasticity, may help teachers believe that their pupils can succeed given the right strategies and some effort. The purpose of this isn’t compliance with ‘acceptable beliefs’ but genuine engagement with the psychology of how children learn.

This won’t be easy for all teachers – we come from such a wide variety of disciplines – but perhaps experiencing difficulty in learning some genuine neuro-cognitive psychology (rather than the usual dumbed-down stuff teachers get) will also help teachers model to their pupils how effective learners behave.

There’s some interesting evidence to suggest that well-applied study skills can have an important influence on student outcomes. Indeed, perhaps a key reason that girls tend to outperform boys academically is the effective use of study strategies. For example, Griffin et al (2012) conclude:

“The results of this research suggest that it is incorrect to suppose that females necessarily outperform males in intellectual tasks. In pedagogical settings it also does not make sense to perpetuate this misconception. For teaching effectiveness, academia should focus on developing and enhancing the various learning skills and strategies of students regardless of gender.”

Early in my career, a school I was working in invested a fair sum of money taking all year 11 students off-site for a day to work on revision skills. The sessions were presented by a fairly young team, led by a roguish, out-of-work actor, who guided students through a workbook mainly focused on basic mnemonic strategies. Students enjoyed the event, though I suspect mainly because they were off-site for the day. The presenters cracked jokes and indulged in “Don’t call me ‘sir’, I’m not a teacher” informality, whilst accompanying teachers handled crowd control during the ‘dull bits’ where students actually did some writing in the workbooks. Students and teachers evaluated the whole thing very positively at the end of the day.

Well, almost all teachers. Eternal sceptic that I am, I wasn’t particularly convinced that the costly exercise improved the quality or effectiveness of the revision our students undertook after the event.

Generic study skills: Mnemonics

Mnemonic strategies will be familiar to most teachers. In the study skills day I attended with Y11s, students learnt how to count to ten in Japanese using the keyword method and memorise a ‘shopping list’ of random nouns using the loci method (the basis of Sherlock Holmes’ ‘mind palace’). They learnt the difference between an acronym and an acrostic and used them to memorise the order of rainbow colours and the (then nine) planets. They created a storyboard to help them recall the key events in a rather dull short story. They learnt how to create a visual map of the revision techniques they’d been taught.

There are two issues with the sorts of activities which typically make up ‘study skills’ sessions for students. The first, and biggest, problem is lack of transfer. The efficacy of teaching ‘learning skills’ independent of domain knowledge relies upon students transferring (often fairly abstract) strategies to their own studies. We know that transfer is problematic and that the intended transfer of ‘skills’ to other contexts typically doesn’t occur. For example, Perkins and Salomon (1992) discuss the problems of transferring ‘generic’ principles:

“In areas as diverse as chess play, physics problem solving, and medical diagnosis, expert performance has been shown to depend on a large knowledge base of rather specialized knowledge (see Ericsson and Smith 1991). General cross-domain principles, it has been argued, play a rather weak role. In the same spirit, some investigators have urged that learning is highly situated, that is, finely adapted to its context (Brown et al. 1989, Lave 1988).”

Secondly, many mnemonic strategies only really help where the material students have to learn is mnemonic-‘friendly’; for instance, ordered lists of words (preferably nouns). This lack of wide applicability is one of the reasons the keyword mnemonic rates so poorly in the Dunlosky et al (2013) review of study techniques.

“On the basis of the literature reviewed above, we rate the keyword mnemonic as low utility. We cannot recommend that the keyword mnemonic be widely adopted. It does show promise for keyword-friendly materials, but it is not highly efficient (in terms of time needed for training and keyword generation), and it may not produce durable learning”

In summary, I’m not saying that mnemonic strategies don’t have a role to play in helping students learn content and revise. I’m simply suggesting that the sorts of ‘study skills’ events (which schools often outsource to external providers) are unlikely to have any positive impact on student outcomes. A better plan might be to teach teachers the various mnemonic techniques and encourage them to find examples of where the ideas might be profitably applied within their own subject domain.

For example, as a science teacher, I can think of examples where verbal and visual mnemonics might be helpful – but it would be better for me to plan these into my teaching, illustrating their use within science lessons as we progress through sequences of learning, rather than having students learn them in the abstract and hoping that they spontaneously identify opportunities and successfully apply them to material across the science curriculum.

Generic study skills: Summarising

Empirical evidence on what makes study skills effective has been slowly gaining momentum within education research, though I suspect it’s not yet really had much impact on how revision skills are taught in schools.

Back in 1999, Purdie and Hattie undertook a meta-analysis of 52 studies to examine the link between study skills and learning outcomes. They found that simply increasing study time was not correlated with outcomes. Where they found positive outcomes, they also noted that these were not due to the inherent quality of any particular study skill, but more likely a function of meta-cognition on the students’ part: the decisions students made about when and how to employ a particular strategy for a specific learning goal. In essence, students who were able to apply a wide range of available strategies appeared to have better outcomes than students with only a narrow range. However, the case for teaching students study strategies as discrete ‘skills’ appeared to be problematic.

“Of course, it is desirable that students possess a repertoire of desirable study skills, but they also must know when to use study skill x, and when to use study skill Y. … Repeatedly, research points to the importance of selecting the right set of study skills to use for a particular purpose in a clearly defined context…. Effective strategies in one domain may be weak strategies in another ….”

This is an important point; one which undermines a purely ‘skills-based’ approach to teaching revision strategies. In the absence of subject content, study skills may focus on fairly surface-level general strategies rather than deeper strategies related to specific content. Thus, it would seem advisable to embed ‘how to revise’ as a regular feature of subject teaching, rather than rely on one-off revision skill sessions.

The Purdie and Hattie analysis didn’t find particular study skills which were ‘good’ or ‘bad’, though they found the highest correlations for note-taking. However, gaining benefit from writing summary notes was not a simple exercise:

“Although notetaking was categorised as an achieving strategy … , closer inspection of those studies in which notetaking was correlated with a measure of student achievement showed that higher correlations were obtained when notetaking involved the identification and manipulation of the most important ideas … rather than the mere recording of information read in texts or heard in lectures.”

The complexity of identifying and organising relevant domain knowledge is likely also a reason why some SEND students underperform. For example, Ghani and Gathercole (2013) reported that for students with dyslexia:

“Results indicated that the dyslexic students were more worried of their school and academic performance, have weakness in managing their time and concentration to meet the learning demands for class or assignments, were less able to select important information from less important information, and using test preparation and test taking strategies less effectively.”

These difficulties are likely the reason why other reviews (e.g. Dunlosky et al, 2013) find that creating summaries has only limited utility as a study strategy.

“On the basis of the available evidence, we rate summarization as low utility. It can be an effective learning strategy for learners who are already skilled at summarizing; however, many learners (including children, high school students, and even some undergraduates) will require extensive training, which makes this strategy less feasible.”

These problems also extend to other common study skills which involve creating simplified content, for example creating visual maps and flash cards. I’ve written before about the effectiveness of training students to create visual maps: Does visual mapping help revision?

“Visual mapping of one sort or another is a commonly suggested revision technique, based on the assumption that the process of organising material in linked, hierarchical and graphical ways is superior to note-writing or simply answering practice questions. However, the evidence for its effectiveness as a process of elaboration is currently poor.”

There are few studies which specifically look at using flash cards as a revision strategy. The limited evidence appears to suggest that they aren’t always used in a very effective way. For example, Hartwig and Dunlosky (2012) found that only about 30% of students in their survey used them for self-testing, and that their use was restricted by the type of domain knowledge the students had to learn.

“Flashcards may often be used nonoptimally in vivo, such as when students mindlessly read flashcards without generating responses. Even when they are used appropriately, flashcards may be best suited to committing factual information to memory and not equally effective for studying all types of materials.”

The skill of summarising or note-taking is one that is difficult for students as it requires a firm understanding of which are the most important ideas and keywords within a specific subject. Even less formal summarising techniques like visual maps and flashcards cannot be relied upon as generic revision strategies as their application involves the same problems of identifying and organising the relevant domain knowledge.

That’s not to say that these summarising techniques aren’t worth developing – they are – but teaching ‘how to summarise’ or ‘how to draw visual maps’ as a generic skill will likely have little impact. The most effective way for students to learn these skills is likely to be within the context of a subject-based lesson – where the teacher can help students focus on the most important domain-specific information.

Generic study skills: Self-testing

In short, there appear to be few independent revision strategies which reliably improve student outcomes. However, one strategy that does seem pretty convincing is practice testing. The ‘testing effect’ has been consistently found across a range of laboratory and classroom-based studies. The act of trying to retrieve information appears to enhance future recall of that knowledge fairly reliably.

“Testing effects have been demonstrated across an impressive range of practice-test formats, kinds of material, learner ages, outcome measures, and retention intervals. Thus, practice testing has broad applicability. Practice testing is not particularly time intensive relative to other techniques, and it can be implemented with minimal training. Finally, several studies have provided evidence for the efficacy of practice testing in representative educational contexts.”
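The logic of the testing effect can be made concrete with a toy simulation. This is purely illustrative – the recall probabilities and the size of the ‘retrieval boost’ are my own assumptions, not empirical estimates – but it shows why repeated retrieval attempts can outstrip the same number of re-reading sessions:

```python
import random

def simulate(study_method, items=50, sessions=3, seed=0):
    """Toy model of the testing effect. Each item starts with a 30%
    chance of recall. Re-reading nudges that probability up slightly;
    a retrieval attempt strengthens it more (with a smaller gain when
    retrieval fails but feedback is given). All parameters are
    illustrative assumptions."""
    rng = random.Random(seed)
    p = [0.3] * items  # initial recall probability per item
    for _ in range(sessions):
        for i in range(items):
            if study_method == "restudy":
                p[i] = min(1.0, p[i] + 0.05)      # small gain from re-reading
            else:  # "practice_test"
                if rng.random() < p[i]:           # retrieval attempt succeeds
                    p[i] = min(1.0, p[i] + 0.20)  # success strengthens memory
                else:
                    p[i] = min(1.0, p[i] + 0.10)  # failure plus feedback still helps
    return sum(p) / items  # expected proportion recalled on a final test

print(simulate("restudy"), simulate("practice_test"))
```

Under these assumed parameters, the practice-testing condition ends every run with higher expected final recall than re-studying for the same number of sessions – which is the qualitative pattern the research above describes.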

There’s also evidence that engaging in retrieval practice not only helps recall but also helps to reduce student exam anxiety. This may seem paradoxical to some teachers – after all, classroom tests often appear to be a source of anxiety for some students. However, it does appear that practising the retrieval of information can reduce students’ feelings of anxiety. For example, a recent study by Agarwal et al (2014) found that middle-school and high-school students reported lower test anxiety when they had engaged in low-stakes ‘clicker quizzes’ prior to final (grade relevant) testing.

Teachers can exploit this in lessons by using quizzes and low-stakes tests as a normal part of subject teaching. Indeed, I suspect it would have far more impact on student self-efficacy than attempts to tweak their personalities through social-psychological interventions related to ‘growth mindset’ or ‘resilience’. However, research has also found that self-testing can have a positive influence on outcomes – making it a potentially effective ‘study skill’.

For example, Hartwig and Dunlosky (2012) identified that students who frequently quizzed themselves on material they were learning had higher GPAs. However, whilst the positive benefits of testing oneself appear quite robust, there are questions about how effective this strategy is across the full range of assessment contexts. For example, Hartwig and Dunlosky note:

“A major issue is the degree to which these benefits of self-testing will generalize to different kinds of tests (e.g., multiple choice, free recall, or essay), different course contents (e.g., biology, psychology, or philosophy), students with differing abilities, and so forth.”

Incidentally, that same study found that the scheduling of study also appeared to have an impact on student outcomes. Low-achievers tended to opt for late-night ‘cramming’ sessions close to the deadline of an assessment, rather than planning their study in advance.

Another issue is that students may not engage in self-testing in a way that exploits its effectiveness. For example, Einstein et al (2012) report:

“Existing research suggests that students will sometimes engage in testing during their studying but mainly for diagnosing whether or not they know certain material and not as means of improving their learning and memory… . Students seem to be unaware that retrieval itself enhances memory …”

One reason students may fail to see benefit from independent self-testing is that they mistake recognition for the ability to accurately recall information. Effective self-testing likely relies upon having an accurate ‘judgement of learning’. After all, if you believe you know the material well, you are likely to practise it less than material you believe you know less securely or not at all.

There’s evidence that ‘cue recognition’ (i.e. a feeling of familiarity with the material) may cause us to overestimate how well we know the material. This feeling of recognition may ‘trick’ students into ceasing self-testing before they genuinely have secure recall of the material.

For example, Reder and Ritter (1992) found that participants in their study tended to make a quick ‘feeling of knowing’ judgement about material based on familiarity with the question stem rather than accurate assessment of their memory for the material.

Whilst we can likely teach students to recognise the risk of overestimating the security of their recall, knowing that a cognitive bias exists doesn’t make us immune from it. Thus, it is likely that even self-testing will require support and feedback within a domain-specific context to ensure that students develop the genuine depth of subject knowledge required for assessments.

Implications for helping students with revision

At the end of the day, one-off sessions on generic revision skills may seem like a worthwhile intervention for students struggling with exams, but such activities may lend themselves to the appearance of ‘doing something’ to help students rather than actually improving outcomes.

A focus on generic ‘skills’ over-simplifies and makes abstract the strategies which can help students in their learning. Whilst useful strategies exist, a purely ‘skills-based’ approach overlooks important subject-specific differences in content and assessment, and the requirement of specialised domain knowledge to apply strategies effectively.

Generic ‘revision skills’ sessions involving mnemonics training and summarisation techniques are unlikely to do any harm, but there’s reason to believe they aren’t doing much good either. Instead, there’s a case for domain-specific study techniques becoming a feature of regular classroom teaching. Indeed, there’s arguably a case for teaching generic study skills to teachers rather than to their students, so teachers can adapt and apply these to their subject with the benefit of their greater domain knowledge. Through this, teachers could make students aware of a wider range of subject-appropriate strategies and reinforce the use of these techniques regularly within their teaching.

Post script: Courtesy of Andy Lewis (@iTeachRE) a summary of improving students’ judgements of learning and some of the more robust psychological findings on effective study.

In 1899, William James collected together a series of lectures he’d given to teachers over the years. If you’ve never read it, I’d recommend it; many of the debates rehearsed in his work still resonate within education over a century later. Not least of these echoes, perhaps, is the parallel between James’ belief in the importance of students developing virtuous habits and the current preoccupation with children’s personality traits.

“So far as we are thus mere bundles of habit, we are stereotyped creatures, imitators and copiers of our past selves. And since this, under any circumstances, is what we always tend to become, it follows first of all that the teacher’s prime concern should be to ingrain into the pupil that assortment of habits that shall be most useful to him throughout life. Education is for behavior, and habits are the stuff of which behavior consists.”

Talks to Teachers, William James. Chapter 8

Schools have been working to instil such ‘good habits’ since the time of William James. Every time a school gives a late detention or chases up an absence they are developing ‘time keeping’. I can think of no greater example of resilience than a child who suffers long-term illness, loses a parent or cares for a sibling yet continues their studies in school. Flexibility is exemplified in the secondary school timetable; where a child may go from the very different demands required for PE, maths, art and English in a single day. Problem-solving is one area where children in England appear to excel according to the OECD and the development of communication skills has been part of every subject curriculum throughout my career.

I’ve written before about some of the problems associated with programmes for developing ‘soft-skills’ within schools. Despite the bold claims, the success of previous attempts to explicitly manipulate the personality development of children through schools has been fairly lacklustre. In the time since I wrote that piece back in May of last year, interest in pupils’ personality and character has only increased.

An example of this current preoccupation with the personalities of pupils comes from a recent article in The New York Times, “Should Schools Teach Personality?” The article relates claims that certain personality characteristics are better predictors of academic success than intelligence, and that these characteristics can be taught.

Another example came from the DfE this week, which invited schools to apply for awards for leadership in character education.

“The new Character Awards schools will help give schools and organisations the tools and support they need to ensure they develop well-rounded pupils ready to go on to an apprenticeship, university or the world of work.”

After some reflection, I think I only have three problems with the direction this bandwagon is heading:

1. The concept and measurement of ‘personality’ is little better than astrology in providing useful information about individuals.

2. The correlation between personality characteristics and outcome measures like job performance is rather more modest than some claims might have you believe.

3. Personality differences – where they exist – may be a consequence rather than the cause of children from low-SES backgrounds underachieving in school.

Can we measure ‘personality’?

“Personality tests … produce descriptions of people that are nothing like human beings as they actually are: complicated, contradictory, changeable across time and place.”

We ought to start with some sort of definition: What is personality? A brief description might be that personality is the characteristic patterns of thoughts and behaviours (including emotional responses) that are consistent within an individual across time and don’t depend upon the situation.

I make my students laugh and occasionally irritate other psychologists by arguing that personality measures are little better than horoscopes (though I’m gratified to see that I’m not alone in this position).

It can be argued that the descriptions produced by personality tests are more focused on indulging our obsessive interest in our ‘selves’ than on providing any predictive or useful knowledge about that ‘self’. Human behaviour is so complicated and contingent that, in order to find invariant traits, it must be rarefied to such a level of abstraction that the few traits which do appear stable provide almost no insight into an individual.

I think it’s slightly ironic that businesses which demand ‘highly developed soft skills’ – team working, time management, resilience, flexibility, problem-solving and communication – and which rate character as the most important factor when recruiting school or college leavers, are potentially forming these priorities on the basis of imprecise concepts and questionable measures of personality.

A classic example would be the Myers-Briggs Type Indicator – the personality test most frequently used in businesses across the world:

“Take the Myers-Briggs Type Indicator (MBTI), the world’s most popular psychometric test, which is based on Jung’s theory of personality types. Over two million are administered every year.”

Firstly, perhaps because it uses forced-choice questions (e.g. ‘Do you prefer to arrange dates, parties, etc., well in advance – or – be free to do whatever looks like fun when the time comes?’), there are enormous problems with the test-retest reliability of the measure (a big problem if you’re claiming to measure stable features of personality).

“One problem is that it displays what statisticians call low “test-retest reliability.” So if you retake the test after only a five-week gap, there’s around a 50% chance that you will fall into a different personality category compared to the first time you took the test.”
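One reason type-based tests behave this way is that dichotomizing a roughly continuous trait at the midpoint makes near-average scorers flip category on retest, even when the underlying trait is perfectly stable. The toy simulation below illustrates this general point (the noise level is an illustrative assumption; it is not a model of the MBTI itself):

```python
import random

def category_flip_rate(n=100_000, noise=0.8, seed=1):
    """Toy simulation: a stable continuous trait (e.g. an
    introversion-extraversion score) is measured twice with noise,
    then dichotomized at the midpoint, as type-based tests do.
    Returns the fraction of people whose category flips between
    test and retest. The noise level is an assumed parameter."""
    rng = random.Random(seed)
    flips = 0
    for _ in range(n):
        trait = rng.gauss(0, 1)              # stable underlying score
        first = trait + rng.gauss(0, noise)  # first measurement
        retest = trait + rng.gauss(0, noise) # second measurement
        if (first >= 0) != (retest >= 0):    # crossed the type boundary
            flips += 1
    return flips / n

print(category_flip_rate())
```

With even moderate measurement noise, a substantial minority of people land on the other side of the cut-off the second time – no change in ‘personality’ required, just the arithmetic of drawing a hard line through the middle of a bell curve.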

But perhaps more importantly, it provides no valid predictive insight into an individual’s behaviour or future performance. Stromberg (2014) summarises the problem nicely:

“This isn’t a test designed to accurately categorize people, but a test designed to make them feel happy after taking it. This is one of the reasons why it’s persisted for so many years in the corporate world, despite being disregarded by psychologists.”

Those in business and politics calling for more ‘character education’ would do well to look outside those domains when forming their opinions. Whilst it seems intuitive and ‘common-sense’ for schools to seek to inculcate those personality characteristics which employers value, there are serious issues in the way these characteristics are measured – and thus inevitable flaws in the claims of educational institutions purporting to have developed these characteristics in children.

Does personality predict job success?

A construct of personality related to the MBTI is the ‘Big Five’ or ‘OCEAN’ (Openness, Conscientiousness, Extraversion, Agreeableness and Neuroticism). This construct receives more attention within psychology and is widely perceived to be more valid (despite sharing four dimensions with the MBTI – which lacks a measure of neuroticism).

Given the interest business has in schools developing pupils’ personality traits, to what extent does a person’s scores on measures of the ‘Big Five’ correlate with job performance?

We might expect all of them to influence job performance to some extent, but the one most strongly related to concepts like ‘resilience’ and ‘grit’ is probably Conscientiousness (e.g. being persistent, hard-working and responsible). Indeed, it’s hard to think of a job where the traits associated with Conscientiousness would not be desirable!

Surprisingly though, the relationship between personality and job performance seems pretty weak. In a meta-analysis, Barrick and Mount (1991) found that the OCEAN characteristics correlated only modestly with job performance. The only dimension which correlated with job performance to any consistent degree was (unsurprisingly) Conscientiousness, with average rho values from 0.20 to 0.23. Interestingly, they relate findings suggesting that Conscientiousness and cognitive ability are only weakly linked – presumably the basis for claims that personality characteristics may matter more to potential employers than academic ability.
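To see just how modest correlations of this size are, it helps to convert them into ‘variance explained’ (the square of the correlation) – simple arithmetic, but striking:

```python
# A correlation is easier to judge practically as 'variance explained'
# (r squared). For correlations around the size reported for
# Conscientiousness, the shared variance is strikingly small.
for r in (0.20, 0.23):
    print(f"rho = {r:.2f} -> variance explained = {r * r:.1%}")
# rho = 0.20 -> variance explained = 4.0%
# rho = 0.23 -> variance explained = 5.3%
```

In other words, even the best-performing personality dimension accounts for only around 4–5% of the variation in job performance, leaving roughly 95% to everything else.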

However, I wonder if this correlation hides the genuine cause behind the association between conscientious character and job performance. Perhaps it is not a ‘conscientiousness trait’ that causes people to work harder and more responsibly – but rather job satisfaction which drives the individual to work more conscientiously. There have certainly been plenty of studies examining the relationship between job satisfaction and job performance (e.g. here, here and here).

Of course, researchers have also wondered whether it’s our personalities that also influence our degree of job satisfaction.

Again, however, the causal link between personality and job satisfaction might easily be argued to be the other way around: It seems plausible that people with high job satisfaction are less anxious, more out-going and harder-working than those who find their jobs unfulfilling, unrewarding or unbearably stressful.

Of course, creating work which fosters high levels of job satisfaction is much harder and probably more expensive than simply selecting recruits on the basis of questionnaires into their personality – is this one reason employers are so interested in the personality characteristics of school leavers?

Does personality cause underachievement in kids from low SES backgrounds?

One reason for this current obsession with ‘character’ is that socio-economic deprivation is associated with poor academic outcomes. The claim takes the implicit form that pupils without socio-economic advantage lack the ‘character’ it takes to succeed academically. Thus, schools need to teach them this ‘character’ in order for them to overcome their socio-economic disadvantages.

However, is it really the case that socio-economically deprived children lack the personality traits to succeed in school?

Well, there’s some evidence to suggest that deprivation is associated with higher levels of neuroticism and psychoticism – as well as lower levels of mental wellbeing.

However, it would seem odd to suggest that the cause of an individual’s deprivation was their higher level of negative personality traits, or that by somehow ‘treating’ their personality they would overcome their deprivation. It seems at least as plausible to argue that the experience of deprivation causes people to become more anxious, or less caring about the feelings of others or their own safety.

In Sam Freedman’s recent blog article, he points to evidence that suggests ‘non-cognitive’ factors aren’t the problem in schools that politicians and business leaders seem to believe:

“And the trends are very clear. Ironically, given the recent obsession with “character building” amongst policymakers, there have been big improvements in a range of “non-cognitive” measures. Reported bullying has fallen; the percentage who claim to have tried alcohol has plummeted; aspiration has increased – higher percentages say they are likely to go to university; and relationships with parents seem to be a lot stronger too.”

“The answer returns us to the biased brain, and a mental flaw known as the fundamental attribution error. It turns out that when we evaluate the behavior of others we naturally overemphasize the role of personality – we assume people are always aggressive or always dishonest or always sarcastic – and undervalue the role of context and the pervasive influence of situations.”

Attributing a student’s success or failure to their personality characteristics, rather than seeking to understand the situational factors which may have facilitated that success or failure, appeals to this bias.

A student’s personality (or the school that ‘failed’ to instil a trait) makes an easy scapegoat compared with asking the harder questions about the socio-economic factors and societal values which undermine student success in school or work – questions which are certainly harder to solve. We live in a society where social inequity appears to have been growing over the years and shows no sign of reversal (or of the political will to reverse it).

A simple analogy for teachers is this: Imagine as a result of the surveys into teaching workload there was a suggestion that teachers ‘lacked resilience’ (a perceived deficit in character) and needed some sort of personality retraining to help them cope with job demands – rather than seeking to reduce workload (the situational cause of the problem).

I suspect teachers would be rather less enthusiastic about this suggestion than many appear to be for ‘resilience’, ‘grit’ or ‘growth mindset’ training for children coming from economically deprived backgrounds.

“It seems to me what is called for is an exquisite balance between two conflicting needs: the most skeptical scrutiny of all hypotheses that are served up to us and at the same time a great openness to new ideas … If you are only skeptical, then no new ideas make it through to you … On the other hand, if you are open to the point of gullibility, … then you cannot distinguish the useful ideas from the worthless ones.”

Carl Sagan in “The Burden of Skepticism”

Much of the debate regarding the role of research evidence in education appears to involve a series of misconceptions about what evidence can and – perhaps more importantly – cannot tell us.

The nature of empirical claims
Education has long been vulnerable to untested – or simply false – empirical claims. These have frequently been accepted and propagated (through CPD and teacher training) without question. For example:

• “Students must drink water in lessons or else their brains will shrink and they won’t be able to concentrate.”
• “Students retain 75% of what they do but only 10% of what they read.”
• “Students who are right-brain dominant learn differently to students who are left-brain dominant.”

In some cases these claims have lacked validity – e.g. they directly contradict some pretty basic biology (brain shrinkage). In other cases the claims have been demonstrated as empirically false (e.g. the pyramid of learning) but most teachers appear unaware of the evidence.

In many cases, those sceptical of education research aren’t against this use of evidence. Weeding out such spurious claims and nonsense appears a fairly uncontroversial use of ‘evidence in education’.

However, in addition to challenging spurious claims, evidence-based research can help inform teachers about strategies and techniques which they may find useful in the classroom – or which might help groups of students within a school population. For example:

“Students whose teacher has more knowledge of the misconceptions related to their subject make better progress than students whose teacher does not.”

“Pupils with below-average reading scores catch up with their peers more quickly with intensive reading intervention A rather than intervention B.”

These are the types of claim that are open to evidence-based research. That’s the honest purpose of evidence – to test empirical claims.

The nature of value statements

“Friendship is unnecessary, like philosophy, like art. . . . It has no survival value; rather it is one of those things that gives value to survival.”

C. S. Lewis in “The Four Loves”

However, the debate within education is much wider than simply empirical claims about how kids learn and what sort of strategies might help particular groups of students in schools. Where evidence-based research stops is where our values start. Value statements tend to be phrased differently: they tend to suggest something ‘should’ be the case rather than ‘is’.

Some value statements:

“Children should play more chess in schools.”

“It’s important that pupils read Shakespeare.”

“We should challenge racism, sexism and homophobia.”

No amount of science can tell you what your values should be. These values cannot be contested or compared by any amount of research. They are personal statements of social aspiration rather than empirical claims. To fight for your values you must argue and persuade other people why they are right: write a polemic article for a newspaper, lobby your MP, argue with people at conferences …

Attempts to prop up such value statements by cherry-picking or manufacturing ‘evidence’ are at best naïve and at worst dishonest. We can’t use ‘science’ to decide our values. That’s not what science is for. Science is a tool for helping us understand ‘what is’ (though imperfectly) – not ‘what should be’.

Where values and evidence interact
One reason many education debates appear to get muddled is because the value sets of individuals conflict with the outcomes measured in an empirical study. Thus an empirical question …

… pre-supposes that ‘greater retention of abstract scientific concepts’ is a ‘desirable thing’ for children to develop through school. This is a problem that science can’t help with. For people whose values hold that such an outcome is desirable – such research evidence has significance. For people whose values oppose this outcome – such research evidence is irrelevant.

Teachers often hold different value-sets when it comes to ‘what is important’ about education. Where people are suspicious of evidence-informed practice, it is quite frequently because they disagree with the values embedded in the outcome measures used in the research – and because of the feeling that ‘evidence’ is being used to shut down the discussion about ‘what’s important’ in the first place.

Some ways forward: Are we arguing about validity or values?
There are different sorts of arguments about education research – and the frequent conflation of validity with values serves to create more heat than light in discussions.

One thing that helps clarify debates about education is to be clear what it is you are disagreeing with. It helps when those values are articulated clearly and early in a discussion. An example of this is Coe et al (2014) in their recent review ‘What makes great teaching?’

“We define effective teaching as that which leads to improved student achievement using outcomes that matter to their future success. Defining effective teaching is not easy. The research keeps coming back to this critical point: student progress is the yardstick by which teacher quality should be assessed. Ultimately, for a judgement about whether teaching is effective, to be seen as trustworthy, it must be checked against the progress being made by students.”

By clearly defining ‘great teaching’ in this way, Coe et al make clear the value-set driving the research – ‘progress made by students is desirable’.

There are two fundamentally different ways to disagree with this:

You could argue that the notion of ‘progress made by students’ shouldn’t be the defining characteristic of great teaching. This is a rejection of the values of the research. Disagreeing in this way moves the discussion out of the realm of ‘science’ or ‘evidence’; into the political arena where you will have to persuade others of the virtue of your alternative values.

You could argue that the research methods used did not measure progress. This is an argument about the validity of the research, but doesn’t disagree with the values. Disagreeing in this way keeps the discussion within the realm of ‘science’ and ‘evidence’; you will need to explain why the measurements used were not valid – and (better still) suggest more valid ways to make such measurements.

Some ways forward: Value statements and empirical claims don’t mix
Where individuals are clearly arguing about values, the interjection of ‘evidence’ is unhelpful and feels like one party is trying to shut down the debate. This is likely the reason that some teachers have a problem with the idea of evidence-informed practice. There’s the feeling that the whole discussion about ‘what is important’ has been side-lined by the shift of focus to evidence claims.

However, when arguing about values it’s relatively easy to slip into empirical statements. For example:

“I don’t believe we should judge student success in school in terms of exam results. All this emphasis on exams just makes children anxious and miserable.”

The first part of this statement requires no empirical evidence – it’s simply an expression of values. You may have to argue why exam results shouldn’t be a priority, or what should be a priority in their place, but ‘evidence’ isn’t the issue. The second part of this statement is making an empirical claim – you should be prepared to support that statement with some sort of evidence about the effects of exams on student well-being.

Some ways forward: Are there common professional values?
What’s badly needed within teaching is some identification and articulation of genuinely common values. Here, arguably, is where teaching is very different from other professions – like medicine. For medical doctors the ‘purpose of medicine’ has been more or less agreed within the profession, but can we find a common ‘purpose of education’?

The reason, perhaps, that education remains such a political football isn’t the lack of empirical methods underpinning our instructional techniques – I suggest – but the lack of commonly articulated values, which allows (and even encourages) politicians to dictate those values to us. For education to move forward – and for teachers to gain greater recognition as a profession – there is a need to identify common ground in terms of values.

Perhaps there aren’t any, and like ‘British values’, attempts to articulate them will only make our divisions more sharply defined! We’re a diverse profession – individuals from a wide variety of social backgrounds and academic disciplines – but it’s hard to believe there isn’t something, even something quite small and simple, which unites us all.

‘Our doubts are traitors, and make us lose the good we oft might win, by fearing to attempt.’

Measure for Measure, Act I, Scene IV

ResearchED Research Leads Network Day, 13th December 2014

It is perhaps indicative of the character of the researchED movement that the ‘Research Leads Network Day’ keynote was a ‘reality check’ delivered by Professor Rob Coe – warning that we risk being little more than another education fad and reminding us that there was currently no robust evidence supporting the idea of research leads in schools.

That might have been the end of the whole event – we might have marched off, heads low, back to our schools in shame (which would have been a pity given the large turnout and the distance some people had come) – but fortunately having warned us against hubris, he offered some plausible suggestions for ways forward.

Lost in translation

There is a growing body of evidence which can usefully inform pedagogical practice and the interventions schools implement. Whilst the conclusions we draw must remain tentative, we can at least start by selecting strategies which have a higher probability of succeeding. However, making good use of this evidence requires some understanding of the methodologies – their strengths and limitations – involved in the studies and meta-analyses of education research. An example of how easily the nuance can be lost was recently related by Sam Freedman in ‘A tale of two classrooms’, published by Demos:

“The dangers of this approach were illustrated with the toolkit entry on teaching assistants. Initially, teaching assistants were rated as having no impact. This was picked up by various newspapers, unsurprisingly given that around £4 billion a year is spent on teaching assistants. As a result the EEF was forced to put out a clarifying statement explaining that, while research suggests that on average teaching assistants do not have a positive effect on attainment, other studies showed that if deployed in certain ways teaching assistants can have a very significant impact.”

Works which try to summarise ‘what works’, like the EEF toolkit or Hattie’s Visible Learning, are the starting point rather than the final word in discussions about school improvement or teacher development. What research evidence can tell us is that ‘something worked for the researchers’ – but it still requires interpretation and significant adaptation to fit within the context of a particular school. Indeed, the circumstances where those positive or negative effects were found may be radically different from our own ‘real life’ setting.

That’s not to say such evidence lacks value. Where we have a choice of approach we might take (even if one option is to do nothing new), where the context is similar enough to our own setting that the effect might plausibly generalise and where we agree with the outcome measures used to establish the effectiveness of that strategy – then these high impact strategies represent ‘good bets’. It’s just that they aren’t ‘plug and play’ compatible.

These ‘toolkits’ of education research require significant self-assembly – and that requires some additional tools not supplied in the box – not least, an understanding of research methods.

This was a point highlighted by Alex Quigley and Carl Hendrick – who announced themselves the ‘Cannon and Ball’ of researchED (they are a great double act, though I can’t repeat Carl’s catchphrase related to individuals who haven’t read King Lear). In a briefly serious segment of their presentation, they stressed the role of a research lead as something of a translator. Teachers enter the profession from a wide range of disciplines. Some come with an understanding of the methodologies used in education research, but most do not. There’s also a lot of research out there. Some of it relates to the sorts of questions that teachers and school leaders have – but finding it and assessing its quality is no mean feat. The task of triaging research papers and reviews, pointing towards and summarising the ones that might best provide the starting points for useful discussions within schools, is a role that a research lead might plausibly assume.

The point was raised, though, that this aspect of a research lead’s role is significantly hampered by the inaccessibility of research papers. Opening up education journal access to teachers should be a priority – especially where that research has been conducted using public money. Perhaps this is something a National College could usefully champion.

Mind the gap

The gap between theory and practice was a key issue highlighted in David Weston, Sam Freedman and Keven Bartle’s talk on the role of research in ITT provision. ‘Front-loaded’ theory, as featured in traditional ITT provision, tends to go unheeded by beginning teachers as they get swamped by the ‘survival instincts’ of proving competence in the classroom. On the other hand, in the absence of theory, teaching becomes a sterile set of routines – a set of ‘signature pedagogies’ which lack flexibility and form the basis of a false consensus about ‘what works’ in teaching.

This divorce between theory and practice is something that risks further undermining the professionalism of teaching. Somehow, we need to marry the best of theory (which Sam identified as a combination of cognitive science and behavioural psychology, and educational research such as Wiliam or the EEF) alongside the practice of basic skills and strategies through teaching practice across the early career development of teachers. The desired outcome is that teachers emerge into the profession as ‘informed consumers of research’– able to articulate questions about their practice and critically engage with future developments in education research. One possible route forward with this involves greater collaboration between HE and schools – and perhaps this is an area which can be facilitated by research leads.

An example of this was related by Daniel Harvey, who talked about the evolving CPD offer within his school, supported by Dr Phil Wood at the University of Leicester. He explained how teachers have engaged in regular sessions to support ‘deliberate enquiry’ – small-scale research projects which are presented back to the whole staff and written up as a report. He identified some of the pitfalls of engaging teachers with this kind of research model; not least how the contrived groups and short time-scales undermined the first attempt (both features have been changed in the adapted model they are currently trialling). The ambition appeared to be to produce high-quality research reports, broadly equivalent to Masters level, perhaps with a view to publishing them as a collection of work.

In truth, however, for all the virtue of taking part in such deliberate enquiry – whether that is formal lesson study or the more informal coaching model we use here at Turnford – I suspect that the utility of trying to raise these projects to publishable standard is limited.

Firstly, the informal link to HE that Daniel’s school enjoys isn’t scalable across a school system (I suspect even Dr Phil’s beneficence would be exhausted eventually). Secondly, there is the problem of the time demands of creating high-quality written research reports; especially against the backdrop of high teacher workload in England presented in Emily Knowles’ session. Lastly, if a teacher is going to invest time reading research, large numbers of context-specific, small-scale case studies or case-series studies undertaken by teachers are unlikely to have the sort of generalisable impact we want.

At best, teachers focussing on the action research of other teachers might provide ideas for research projects they might undertake themselves; at worst, I wonder if this might not simply embed the problem of ‘signature pedagogies’ identified by Keven Bartle. One doesn’t have to look very far to find examples where such self-contained professional practice merely appears to embed some of the worst misconceptions of the profession.

Many of the problems within our profession have arisen, I believe, because teachers have been talking to themselves rather too much – reflecting on and recycling current practice rather than looking outward towards the wider body of evidence emerging within psychology and education research. Indeed, if teachers are going to find the time to read, then they could do worse than read ‘Why don’t students like school?’ by Daniel Willingham (which has been the focus of our newly formed ‘Book Club’ this term).

Alternatively, ‘What makes great teaching?’ by Rob Coe, Cesare Aloisi, Steve Higgins and Lee Elliot Major, would be a worthwhile read for any teacher – and this review formed the focus of the final session of the Research Leads Network day.

The report itself deserves more space than I can give it here (Sean Allison has written a good summary of the highlights).

For me, the issue that the report underlines is the need for a shift of emphasis when it comes to making assessments of teachers: from high-stakes, summative judgements towards low-stakes, formative and developmental feedback. To that end, might a research lead’s role be developing a toolkit of ways for teachers to investigate their teaching (whether for self-evaluation, peer coaching or mentoring)?

One suggestion was using student surveys as a way of investigating teaching. We’ve been developing the use of the MET student survey as a source of insight into teaching. Within the confidential framework of coaching, a number of teachers have used an adapted version of the survey with their classes – and used the results as the basis of selecting a focus for whole-school coaching.

A question Rob Coe raised was whether we could test our (near universal) confidence in our high expectations of students. A question like ‘How often is your teacher satisfied with your work?’ might provide some interesting student feedback on how well our high expectations are communicated.

Subject knowledge is an area frequently overlooked within school CPD programmes. Rob Coe posed the challenge of how many of us would get 100% if we sat the exam ourselves (though I did point out a slight flaw with this scheme).

Another interesting suggestion was to look at ‘time on task’ for selected students in lessons. This is a feasible way, perhaps, to measure the impact of intervention programmes like mentoring. A question like: ‘What fraction of a lesson does a student appear to be thinking hard about the learning material?’ would be problematic to measure in any objective sense – but a suitably designed behavioural checklist might allow an observer to approximately gauge this and whether it improves over time.

Rob Coe also drew attention to the benefits of examining and practising specific routines. Short (say, 5-minute) video clips of a teacher engaged in questioning might provide useful feedback for development, or a good comparison to our own classroom practice.

Interestingly, one strategy that didn’t appear very high on the list was the current practice of work scrutiny. It seems possible that in convincing Ofsted that observations were an unreliable measure of teaching quality, we may have invited an even less valid method for making such judgements. Ho hum …

Developing professional scepticism

One abiding theme which emerged from the day was the need for a critical perspective within schools. We need someone, most likely a teacher outside of line management, to ask ‘where is the evidence?’ and act as a devil’s advocate. I think that teaching, as a profession, can only really move forward if we foster a greater professional scepticism – and perhaps this can provide some measure of our impact as research leads.

We’ve got some ‘good bets’, but uncertainty about ‘what works’ in any robust or generalisable sense is the honest position currently. This, not to mention the general difficulty of establishing anyone’s effectiveness in a school environment, makes it hard to assess whether research leads are improving the quality of teaching within a school.

That may change as the body of evidence regarding good teaching and good schools matures – but in the meantime the prevalence of ‘nonsense’ within education is something that could be tackled. There’s evidence of a need for this: Dekker et al (2012) and, more recently, Howard-Jones (2014) report that teaching is veritably saturated with misconceptions and myths.

According to the Coe et al (2014) review, teachers’ beliefs about how children learn, and what teaching strategies to use when, have a measurable influence on student outcomes. So an operationalised question would be: is the incidence of these pedagogical misconceptions reduced in schools which have developed a research lead role?

Building a plane while flying it

We were asked to suggest questions / areas for focus for the next research lead event. Here are two that occurred to me (so far):

Firstly, how can we facilitate communication and dissemination of good quality research between research leads? Tom Bennett once suggested creating a researchED peer-reviewed journal – a typically ambitious project for one of the ‘world’s greatest teachers’.

At the moment, I mainly hear about new pieces of research through haunting twitter – which means it is somewhat left to chance. Most teachers don’t have the time or access to trawl through journal articles looking for pearls – but collectively we might have a better chance. Would there be any merit in setting up a researchED ‘research digest’ – highlighting some of the best sources of evidence and new papers that research leads might translate into their schools? What would it look like? How would we filter what went in it? Would we want to divide it by educational sector or keep it general? Would it focus principally on pedagogy, and/or school leadership or relate also to education policy? Who would be prepared to contribute to it and how would quality control be handled?

Secondly, there’s no formal role or definition of what a research lead does – we’re building the plane while flying it – so I’d be interested to hear other experiences of teachers developing their research lead roles within schools. How closely does it interact with CPD and ITT / NQT provision? Do research leads sit outside, alongside or within school leadership? What are they looking to develop next in terms of disseminating evidence and research? How much of their role is facilitating teachers doing research? What sort of relationships exist between schools that research leads are involved with (e.g. teaching alliances, etc.)? What are they doing that their school feels ‘earns their keep’? Sharing the various models of what research leads do (or might be about to try) within their schools might offer areas we can adopt and adapt as we develop our own roles.

This is a new version of my attempt to list all the education bloggers based in, or from, the UK. There may still be mistakes, but I have added many blogs that weren’t there before and removed many that shouldn’t be there. I have attempted to leave out all blogs that haven’t been updated in the last 6 months. I have missed out ones that are probably more to do with journalism than blogging. I have missed out one because of ethical issues (don’t ask).