Schools’ reputations, results and intakes evolve together. It is almost impossible to identify which came first, and equally impossible to separate them at any point in time. ‘Good’ schools generally get ‘good’ results and have ‘good’ intakes, and the opposite is equally true. Any fundamental change is gradual, taking place over many years.

It should be obvious why this happens. Parents look for family homes once their children are born (or in anticipation of their birth), and they naturally consider local schools’ reputations, results and intakes when deciding where to buy or rent. Those with a choice of where to live, and therefore where to school their children, will exercise that choice.

Over time, as more (or fewer) parents want to live in the area around a school, the school’s reputation and results will rise (or fall). This is because the key driver of a school’s reputation and results is how much effort parents have made to get their children into the school. Of course, schools with good results are often well-run places with excellent and effective teaching. But the school’s task gets easier the more parents support the school, and vice versa. Schools’ reputations, results and intakes evolve together.

On March 26th, Alan Milburn appeared at Policy Exchange and, in a short speech, ticked pretty much every box on the School Reform Big List of Wrong Ideas. As fast as damaging myths are demolished, they reappear, phoenix-like, from the ruins. Without direct experience of schools, policy makers and those who attempt to influence them hold some clearly ridiculous opinions about what education is and what it can do. Here are some gems:

‘What happens in schools will have the greatest influence on social mobility’

Mr Milburn seems to think that having a more educated workforce will magic jobs into being. It won’t. If anything, without any underlying change in the wider economic picture, a more skilled workforce would drive down salaries and wages. Higher, better quality employment is a function of the jobs market, not labour skills. More people with qualifications will – unless the economy picks up – simply mean that people without jobs have more qualifications, and that those who do have jobs are paid less. Of course we want a highly educated workforce – who wouldn’t? But we need more good jobs to increase social mobility, and jobs come with prosperity and an expanding jobs market, which is beyond the remit of education.

‘Poor schools hold back poorer children’

How many times do we have to go through this? Ofsted judges schools on achievement, which is a function of intake. So poor children attend schools which are judged to be poor, not the other way around. The vast majority of schools do a good job of educating those who attend them, and those which struggle do so for reasons beyond the remit of the school.

‘Low expectations hold children back’

According to the Social Mobility Commission’s Cracking the Code report, teachers say that they emphatically do not have ‘low expectations’ of their children. Those surveyed did, apparently, think that other teachers have low expectations – the “Wasn’t me, it was everyone else” argument which has long been a favourite of school children everywhere. It’s extraordinary that this hearsay is reported as having any validity. It’s politically driven nonsense.

‘London can do it, so can you’

The London Effect – despite research from CMPO and the IFS demonstrating otherwise, mentioned but largely dismissed by Mr Milburn – is still being spun as a function of government policy rather than of the unusual social make-up of London, an increasingly odd global island in the south of England. Unless other parts of the country can attract ambitious parents and their children in huge numbers, there is no comparison whatsoever.

‘We can close the gap between poor children and everyone else’

The damaging Closing the Gap narrative – imported wholesale from the USA, where it has equally damaging effects – completely ignores the reason why some children achieve more in school, focusing instead on the overall progress of those who are poor and thus, in effect, holding schools responsible for social inequality. Children who achieve more in school – as shown by the Sutton Trust’s recent Subject to Background report, for example – do so because of support they receive outside of school, not because their schooling is any better. And helping one group at the expense of another – based purely on how wealthy their parents are – has not been shown to work, and it is hard to see how it could, given what we know about the nature of disadvantage.

None of this is new.
Alan Milburn has provided a clear summary of the views of a certain type of policy maker: those who see education as a function of schools and teachers rather than of children and parents. It is, as many of us never tire of pointing out, a little more complicated than that.

Step 1. Set an ambition for what you want your school to achieve with PP funding. Some of the schools aiming high express this ambition in terms of becoming one of the 17 per cent of schools in which those on free school meals (FSM) do better than the average for all pupils nationally.

How is it possible for all schools to do ‘better than average’? If a school aims for this goal, isn’t it relying on other schools doing relatively worse ‘than average for all pupils nationally’? (Update: It has been pointed out that this step refers to children on FSM doing ‘better than average’, not to all children doing ‘better than average’. I see the point – given that the national percentage of children claiming FSM is around 20%, it is theoretically possible for ‘those on free school meals (FSM) [to] do better than the average for all pupils nationally’. For many reasons, detailed elsewhere on this blog, this would be highly unlikely, and it still relies on factors outside the control of a school.)
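To make the arithmetic behind that update concrete, here is a minimal sketch in Python. All the scores are invented for illustration (only the roughly 20% FSM share comes from the text): it shows why an individual school’s FSM pupils can beat the national all-pupil average, while nationally FSM pupils could only do so by outperforming non-FSM pupils outright.

```python
# Minimal sketch with invented numbers (an arbitrary 0-100 score scale,
# not real data): the national all-pupil average is a weighted average
# of FSM and non-FSM pupils.

FSM_SHARE = 0.20   # roughly the national proportion of pupils claiming FSM

def national_average(fsm_avg, non_fsm_avg, share=FSM_SHARE):
    """All-pupil average as a weighted average of the two groups."""
    return share * fsm_avg + (1 - share) * non_fsm_avg

# Assumed illustrative national figures:
fsm_avg, non_fsm_avg = 45.0, 55.0
overall = national_average(fsm_avg, non_fsm_avg)
print(overall)                       # 53.0

# One school's FSM pupils *can* beat the national all-pupil average:
print(60.0 > overall)                # True

# But nationally, FSM pupils can only beat the all-pupil average by
# outperforming non-FSM pupils outright: whenever fsm_avg <= non_fsm_avg,
# the weighted average sits at or above fsm_avg.
print(national_average(54.0, 55.0))  # 54.8 -- still above the FSM average
```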

Step 3. Decide on the desired outcomes of your PP spending. Closing the gap between PP pupils and others in the school.

Step 4. Against each desired outcome, identify success criteria. This could be expressed as a number – ‘closing the gap between the attainment of PP-eligible pupils and that of all pupils nationally by x per cent this year and by y per cent the following year’.

See the Whole Education website (www.wholeeducation.org) to learn about how Whole Education Network schools are developing a fully rounded education for their pupils as part of their ‘closing the gap’ and raising achievement strategies.

Have you really just promoted your own business? Is there a clear conflict of interest here?

Step 9. Monitor the progress of PP-eligible pupils frequently. Collect, analyse and use your data to maximum effect in monitoring the progress of every PP-eligible pupil. This should be done frequently, so that interventions can be put in place quickly, as soon as a pupil is starting to slip.

Is learning a linear process in which one can ‘start to slip’? Is it possible to ‘monitor the progress of a pupil’ using numbers on a spreadsheet?

(Another update – for those who haven’t read the rest of my blog: hello. Glad you made it here. I think the Pupil Premium is a laudable initiative which is being derailed by the pressure being put on schools. Please read my Poleaxed by Pupil Premium posts for further details.)

I’ve documented the way in which Ofsted have taken the Pupil Premium, a policy introduced with the best of intentions but with a flawed understanding of education in England, and made an absolute mess of it. Schools are now being damned for forces well outside their control. Based on nothing more than their socio-economic status, one group of children in a school is expected to make better progress than another group in the same school. It’s so misguided, you might have thought that misuse of the Pupil Premium policy couldn’t get worse. But it can, and has. Pupil Premium isn’t just being used by Ofsted to hold schools hostage to the vagaries of their intakes. The Department for Education is in on the act too, with the ludicrous Pupil Premium Awards.

The Awards which don’t reward what they say they reward

The Pupil Premium Awards were first awarded in 2013, with Nick Clegg giving three lucky schools £3,000 each. A further 22 schools were given some money ‘in recognition of the way they have pioneered the use of the pupil premium (...) to help those children reach their potential and reduce educational inequalities.’

Mr Clegg gave away some more money in 2014, when three schools won £10,000 and three runners-up were given £3,000 each; 21 schools were given money in all. This time, there was a dedicated website for the awards, and the money was given to ‘schools that have done the most with their funding to help close the performance gap between their poorest pupils and their peers’. Except that this simply isn’t true. The Pupil Premium Awards simply line up every school in the country and then select the schools with the best test scores. Well, not quite every school in the country. Only schools which are Good or Outstanding according to Ofsted count, although since those ratings are based on results, that’s no real surprise. You have to have a few children who’ve attracted Pupil Premium money, but not many (three will do). Oh, and – as with Ofsted – your non-disadvantaged pupils must make less progress than your disadvantaged pupils. Or as the criteria put it, you must:

‘Show good improvement for disadvantaged pupils in (...) results. The judges (...) want to see evidence that clearly shows the educational attainment of disadvantaged pupils rising at a faster rate than their peers, but in the context of all pupils’ performance improving.’ (My emphasis)

So these schools must be doing amazing things with the additional cash from the Pupil Premium, no? Well… what did these schools do with the small amounts of Pupil Premium money they were given, to deserve the prize money they received? Er… nothing, as it happens. They were given the money not for their ‘pioneering use of the Pupil Premium’, but for their test results alone. Whilst some of the schools no doubt spent their PP money wisely – and I’m sure they were very happy to take the government’s cash – the award simply reflected the fact that they got lucky with their cohorts, and their test scores were better than the test scores of other schools.

So how much can we get next year?

For 2015, the boat has well and truly been pushed out, with extraordinary amounts of money being given away to random schools. The total prize pot has skyrocketed, with awards of £250,000 for the winning Secondary School and £100,000 for the luckiest Primary School. An additional £4,000,000 (that’s right, FOUR MILLION POUNDS) will be given away on top of this.
That’s right: £4,350,000 given away to schools which have to do absolutely nothing whatsoever to be in with a shout.

Eh? I don’t have to do anything? No. As the 2015 website confirms, ‘We will analyse the national test results of primary schools and the examination results of secondary schools and determine which schools have demonstrated the most sustained improvement in achievement for disadvantaged pupils. The schools with the best performance will receive at least a qualifying award of £5,000 for secondary schools and £1,000 for primary schools.’ So you are eligible for the cash based purely on your test results.

Hang on, I thought we could win more? You could. But then you have to make some stuff up.

Make stuff up? You can’t simply get the big money without jumping through a hoop or two. We want to give you the cash. All we need is some kind of justification for giving you a bigger prize.

Like what? Oh, just make something up. We have to be able to get some positive publicity, so we need a story. Anything will do, really.

Can you give me an example? Yes, no problem. Remember the 25 winners in 2014? We could only make up one story for the free giveaway. Basically, the head read something, had a chat with some people, and then did something the school would probably have done anyway, which everybody then decided had worked (and Ofsted loved it too!). In 2013, there were three case studies, all perfectly sensible post-hoc, evidence-free explanations of success. You should be able to come up with some similar Halo Effect to explain why your test results were better than everyone else’s. Easy.

I exaggerate somewhat, for comic effect. But only somewhat. And it’s not really funny. This is an example of politicians taking a serious problem – chronic educational underachievement by people in very difficult circumstances – and turning it into a game show. It demeans all involved, and it’s a sad indictment of the level of educational debate that I haven’t read any criticism of this farce anywhere else, which is why I have written this. Instead of paying for a fancy dinner and a few uncritical puff pieces in the media, the money spent on this daft publicity stunt could have rewarded something which someone somewhere has actually done to help the disadvantaged in our society. Congratulations to the schools given the money; I hope they have spent it well. But the Pupil Premium Awards are a sad example of politicians playing politics with people’s lives.

At the ResearchED 2014 conference on 6th September, Sean Harford made the following statement during his session with Mike Cladingbowl and Andrew Old:

“The charge that we are overly data driven (...) is a really odd one to me, because I can’t work that [out] if you’ve got a system that has 80% of schools being graded ‘Good or better’, how anyone with any mathematical background can say that was overly data driven because you’re going to get over half of schools above an average of whatever score you are looking at and 50% of schools below, if it normalised, so the idea that the inspectors are going there blindly following the data doesn’t seem to fit with that overall picture we have at the moment.” (7:38-8:17)

I want to show that it is entirely reasonable to suggest that Inspectors do follow data; if not blindly, then to an extent which would entirely explain why 80% of schools are judged ‘Good or better’. Take the suggestion that 80% of schools are graded ‘Good or better’ based purely on the data used to judge them. This can also be stated as ‘20% of schools are graded ‘not Good’ based purely on the data used to judge them’. The hypothesis, then: given the methodology used to produce and analyse the data, judging around 20% of schools to be ‘not Good’ is entirely reasonable.

What evidence is there which suggests this hypothesis might be reasonable?

If data is used to judge schools, where does the data come from? The primary source of data for Inspectors is RAISEonline, as outlined in paragraph 4 of the School Inspection Handbook.

How does RAISEonline indicate ‘not Good’ results?

RAISEonline doesn’t simply divide schools into ‘above average’ and ‘below average’. If it did, Sean would be correct, and 50% of schools would be ‘Good’ and 50% ‘not Good’. That isn’t what RAISEonline does. Instead, RAISEonline makes simple comparisons between the test scores of pupils in a given school and the test scores of pupils nationally, and uses a 95% confidence interval to flag those results which do not appear to be the result of chance. In plain English, the difference between a school’s scores and national scores can take a wide range of values; if a score falls outside 95% of expected scores, based on the national distribution of scores, it is held to be significant. An Inspector is directed to consider these significant differences as either ‘Sig+’ (much better than expected) or ‘Sig-‘ (much worse than expected). RAISEonline documentation states that ‘Significance is a statistical term that shows if a difference or relationship exists between populations or samples of data’. This isn’t correct, as I discussed here, but it’s what Inspectors are told. A ‘Sig-‘ indication therefore leaves the impression that a particular test score is ‘not good’, rather than that it is unusual.

So could RAISEonline suggest 20% of schools are ‘not Good’?

Given the blunt comparison of school test scores and national test scores, ‘Sig-‘ appears next to ‘a difference between populations (and) samples of data’. So a particular test score – “Key Stage 1 to Key Stage 2 fine grades value added: performance of groups within school - pupil characteristics, Children in receipt of Free School Meals”, to take a typical measure in RAISEonline – could be highly unusual and be flagged ‘Sig-‘. There could be any number of reasons for this, but the clear impression given is that the result is ‘not Good’. There are hundreds of numbers in each RAISEonline report with the potential to be marked ‘Sig-‘. This increases the number of RAISEonline reports with a noticeable number of categories marked ‘Sig-‘, and raises the number of schools which – according to a non-expert reading of RAISEonline – are ‘not Good’. My suggestion is that, given the methodology used to produce and analyse the data, it is entirely reasonable that around 20% of schools are judged to be ‘not Good’ based on data.

How could you check if this was the case?

RAISEonline reports are not available for analysis by the general public. Ofsted has access to them, however, and could analyse school inspection reports alongside RAISEonline reports to test the hypothesis outlined above. I would expect to find a significant correlation between Achievement of Pupils grades and the number of ‘Sig-‘ indicators in RAISEonline, which would be entirely consistent with Ofsted judging 20% of schools to be ‘not Good’.
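To illustrate how many ‘Sig-‘ flags can arise by chance alone, here is a minimal sketch in Python. It is a deliberately simplified model, not RAISEonline’s actual methodology: measures are treated as independent, and each is flagged ‘Sig-‘ purely by chance 2.5% of the time (the lower tail of a two-sided 95% confidence interval).

```python
import random

# Simplified, assumed model of 'Sig-' flagging -- NOT RAISEonline's
# actual methodology: independent measures, each flagged by chance alone.

random.seed(42)

N_SCHOOLS = 10_000
N_MEASURES = 50       # each report contains hundreds of numbers; 50 is conservative
P_SIG_MINUS = 0.025   # lower tail of a two-sided 95% confidence interval

def sig_minus_count(n_measures=N_MEASURES):
    """Number of measures flagged 'Sig-' in one report, by chance alone."""
    return sum(random.random() < P_SIG_MINUS for _ in range(n_measures))

flags = [sig_minus_count() for _ in range(N_SCHOOLS)]

at_least_one = sum(c >= 1 for c in flags) / N_SCHOOLS
two_or_more = sum(c >= 2 for c in flags) / N_SCHOOLS

print(f"At least one 'Sig-': {at_least_one:.0%}")  # roughly 70%
print(f"Two or more 'Sig-': {two_or_more:.0%}")    # roughly 35%
```

Even on these conservative assumptions, most reports would show at least one ‘Sig-‘ flag before any real difference in school quality enters the picture; how many flags it takes to leave the impression of a ‘not Good’ school is then a matter of non-expert reading.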

Externally assessed grades at both Key Stage 2 and GCSE are effectively guesses. The vagaries of designing and marking written tests mean that the guesses are often wrong: a test offers a snapshot of a pupil’s knowledge, skills and understanding of a given subject, not their ‘true score’. On a different day, with a different test, a pupil is highly likely to get a different mark and quite likely to be awarded a different grade entirely, making the grades pupils are awarded guesswork at best.*

Written tests are not very reliable

As Lord Bew noted in his report on KS2 testing, “It is generally accepted that any test or examination, however well constructed, will always include a degree of measurement error. We understand that, as with all tests where pupils are categorised, the level thresholds in Key Stage 2 tests mean that one mark can make the difference between one level and the next. That mark could be lost or gained through a pupil mis-reading an instruction in the test or making a fortunate choice in a multiple-choice question, or through slight variations in marking practice. These differences will be highly significant for the individual pupil.” (p55) Lord Bew noted that Dylan Wiliam had suggested that 32% of pupils could be given the wrong National Curriculum level. Wiliam noted that “we must be aware that the results of even the best tests can be wildly inaccurate for individual students, and that high-stakes decisions should never be based on the results of individual tests” (p3). Much of the edifice of data-driven education is built on precisely this fundamental flaw: it does not recognise that grades are guesswork.

Grade boundaries are very narrow

The difference between one grade and another is often simply too narrow, and children are miscategorised as a result. This has significant implications for the current system of high-stakes accountability by which teachers and schools are judged. The ‘data’ used to assess ‘performance’ is, quite simply, not up to the task. For example, at Key Stage 2 there are externally marked written assessments of Numeracy, Reading, and Spelling, Punctuation and Grammar (SPAG). I have looked closely at Numeracy, although it would be possible to look at Reading and SPAG in the same way. There are three Numeracy papers, with 100 marks available in total. The grade thresholds vary from year to year. In 2014, they were as follows:

The middle level – Level 4 – has a 32-mark spread, so any child is at most 16 marks away from either Level 3 or Level 5. The key observation is that a significant number of children will be within a handful of marks of a higher or lower level. Those with either 40+ or 70+ marks could simply have misread a question, missed an entire page (this happens with surprising regularity), transcribed the wrong answer, and so on. Some children will, of course, be lucky, and mistakes and anomalies may even themselves out. For some children, however, they will not.
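A rough simulation makes the point. This is an illustrative sketch in Python, not official data: the boundaries are assumed from the 32-mark Level 4 band described above, and the error model (normally distributed, with a standard deviation of 3 marks standing in for misread questions, missed pages and marking variation) is my assumption.

```python
import random

random.seed(1)

# Assumed boundaries, consistent with the 32-mark Level 4 band described
# above (Level 4 from 39 to 70, Level 5 from 71). Not official figures.
L4_BOUNDARY, L5_BOUNDARY = 39, 71
ERROR_SD = 3.0  # assumed spread of marks lost/gained to misreads, marking, etc.

def level(mark):
    """Assign a level to a raw mark out of 100 (assumed boundaries)."""
    if mark >= L5_BOUNDARY:
        return 5
    if mark >= L4_BOUNDARY:
        return 4
    return 3

def wrong_level(true_score):
    """Does one noisy sitting of the test report the 'wrong' level?"""
    observed = max(0, min(100, round(random.gauss(true_score, ERROR_SD))))
    return level(observed) != level(true_score)

for true_score in (38, 40, 55, 69, 72, 90):
    rate = sum(wrong_level(true_score) for _ in range(10_000)) / 10_000
    print(f"true score {true_score}: wrong level in {rate:.0%} of sittings")

# Pupils near a boundary (38, 40, 69, 72) flip levels in roughly a third
# or more of sittings; pupils far from one (55, 90) almost never do.
```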

The Reading test has a total of 50 marks, with grade boundaries at 15, 24 and 39 marks. Half of all Level 4 results are within 7.5 marks of an adjacent grade. In the SPAG test there are 70 marks, with grade boundaries at 32, 49 and 61 marks; 50% of Level 4 results are within just 6 marks of an adjacent grade.
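For anyone who wants to check the band-width arithmetic, here is a short Python snippet. The thresholds are those quoted above; the helper function is mine.

```python
# Quick check of the band-width arithmetic, using the Reading and SPAG
# boundaries quoted in the text (lower threshold of each level, in marks).

def band_widths(boundaries, total_marks):
    """Width of each level band, given the lower boundary of each level."""
    edges = list(boundaries) + [total_marks + 1]
    return [hi - lo for lo, hi in zip(edges, edges[1:])]

reading = band_widths([15, 24, 39], total_marks=50)
spag = band_widths([32, 49, 61], total_marks=70)

# The Level 4 band is 15 marks wide in Reading (24-38) and 12 in SPAG
# (49-60), so no Level 4 result is more than 7.5 (Reading) or 6 (SPAG)
# marks from an adjacent level.
print(reading)   # [9, 15, 12] -> Level 4 spans 15 marks
print(spag)      # [17, 12, 10] -> Level 4 spans 12 marks
```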

For an individual child, the result is reported as a single grade (this is what the child takes home at the end of Year 6). The raw result is also recorded as a single number (a fine grade) which is used by data-crunching analyses such as those in RAISEonline and the FFT. There is no confidence interval, no range of potential outcomes. The result is a simple snapshot which takes no account of any noise in the result.

Whilst, theoretically at least, levels have been consigned to history, children were still awarded levels at the end of Year 6 in 2014, and the current thinking on future assessments (see commentary by Warwick Mansell here) suggests that similarly flawed ‘data’ will continue to be used to assess pupils, teachers and schools.

At GCSE, with many more subjects, things are a little harder to summarise, but it is possible to look at some grade boundaries to give an idea of how narrow the gaps between adjacent grades can be.

An example set of grade boundaries can be found on the AQA website. Looking at the grade boundaries for English Language on page 7 shows the following:

This shows that around half of the students awarded a C grade are within 3.5 marks of an adjacent grade, within 14 marks of an A and within 19 marks of an E. A child awarded a C with 60 marks is within 8 marks of an A.

Some grade boundaries are incredibly narrow. In the 2011 PE01 exam, the grade boundaries were a mere 2 marks between the top grades:

Essentially, the chance of being awarded a particular grade comes down to luck.

Castles built on sand

The grades an individual pupil receives are simply guesswork. We should not have allowed anyone to build an accountability infrastructure based on this 'data'.

*The main exception to this guesswork is for children working at the very highest and very lowest levels of ability. If a pupil knows everything there is to know about a subject at KS2 or GCSE, their grade is practically guaranteed, barring some catastrophic performance on the day of the exam, since scores beyond 100% are not possible. Likewise, a child who knows none of the information required to answer the questions in a paper will achieve zero, and cannot achieve less.

Policy Makers, Politicians and Think Tanks frequently make assumptions about state education which are fundamentally flawed. The first is the assumption that schools – and by extension teachers – have sole responsibility for the academic achievement of children in their charge. The second is that Ofsted Quality of Teaching grades are a reflection of the teaching in a given school. There is clear evidence that Quality of Teaching grades are the same as the Overall Effectiveness grade given by Ofsted. Both are driven by (highly dubious analysis of) achievement data and are not altered by the observations inspectors make in the few hours they spend in schools before judging a school. Observations, which are extremely subjective, can show whatever the observer wishes to observe. As Professor Robert Coe says, ‘If your lesson is judged ‘Inadequate’ there is a 90% chance that a second observer would give a different rating.’ If an inspector wants to judge teaching to be a particular grade, they can. The grade inspectors give reflects the (dubious) data, not the quality of teaching. Most recently, Policy Exchange published Watching the Watchmen, in which authors Harriet Waldegrave and Jonathan Simons note that the ‘Achievement subgrade agrees most strongly with the overall grade, for both primaries and secondaries, followed very closely by the Quality of Teaching subgrade.’ (p26) Watching the Watchmen includes the following graphs to bring this relationship home:

Ofsted Lead Inspector Mary Myatt writes on her blog: ‘the quality of teaching judgement links closely to the judgement on achievement. If a ‘good’ or even ‘outstanding’ lesson does not lead to good or better progress over time, then it follows that the quality of teaching is likely to require improvement. And the flip side of this is that if a lesson is observed which requires improvement but the progress is good, then the judgement on the quality of teaching over time will be good’. The data rules the judgement, not the teaching itself. David Didau sums up the feeling of most of those working in schools when he says that ’the commonly held view amongst most teachers and school leaders is that a lead inspector makes a preliminary judgement based on a school’s RAISE online data and then turns up in classrooms looking for confirmation of a decision that has already been made.’

Where an education commentator assumes that Ofsted Quality of Teaching grades are a reflection of the teaching in a given school, any case, policy or commentary built on this assumption is fundamentally flawed.

Policy Makers, Politicians and Think Tanks frequently make assumptions about state education which are fundamentally flawed. The first and most important of these is the assumption that schools – and by extension teachers – have the biggest impact on the academic achievement of children in their charge.

Study after study has shown that, on average, educational achievement is linked to factors which are external to schools. Here are a few:

This graph by Christopher Cook shows the average picture of exam results in England by wealth, for example:

Clearly, there are many ‘hero’ examples of those who succeed despite historical disadvantage, and, anecdotally, many teachers and schools make huge differences to the lives of children in their care. Equally, many advantaged children do not succeed in education. All children clearly have the capacity to succeed, and there is no suggestion that poorer ‘working class’ children will always do less well than affluent ‘middle class’ children. But the overall picture cannot simply be ignored.

Many education commentators hold the following broad areas responsible for disadvantaged children achieving lower academic results on average than their more affluent counterparts:

The Curriculum
School Quality
Teacher Quality
Teaching Philosophy
Low Expectations of the Disadvantaged by the school system
Behaviour Issues in Schools
Assessment Issues in Schools

My view is that, where an education commentator assumes that the relative achievement of pupils – and by extension of schools with a significant proportion of advantaged or disadvantaged pupils – is primarily due to any or all of these broad areas, it should be remembered that ‘Social class remains the strongest predictor of educational achievement in the UK’ (Perry and Francis, 2009, p1), and that social class is clearly shaped by factors external to the school.

Where an education commentator assumes that schools – and by extension teachers – have the biggest impact on the academic achievement of children in their charge, any case, policy or commentary built on this assumption is fundamentally flawed.