Serious flaws in how PISA measured student behaviour and how Australian media reported the results

By Alan Reid

International student performance test results can spark media frenzy around the world. Results and rankings published by the Organisation for Economic Co-operation and Development (OECD) are scrutinised with forensic intensity and any ranking that is not an improvement is usually labelled a ‘problem’ by the politicians and media of the country involved. Much time, energy and media space is spent trying to find solutions to such problems.

This is pretty dramatic stuff. Not only do the test results apparently tell us the standard of Australian education is on the decline, but they also show that Australian classrooms are in chaos.

As these OECD test results inform our policy makers and contribute to the growing belief in our community that our education system is in crisis, I believe the methods used to derive the information should be scrutinised carefully. I am also very interested in how the media reports OECD findings.

Over the past few years, many researchers have raised questions about whether the PISA tests really do tell us much about education standards. In this blog I want to focus on the efficacy of some of the research connected to the PISA tests, specifically that relating to classroom discipline, and examine the way our media handled the information that was released.

To start we need to look closely at what the PISA tests measure, how the testing is done and how classroom discipline was included in the latest results.

What is PISA and how was classroom discipline included?

PISA is an OECD-administered test of the performance of students aged 15 years in Mathematical Literacy, Science Literacy and Reading Literacy. It has been conducted every three years since 2000, with the most recent tests being undertaken in 2015 and the results published in December 2016. In 2015, 72 countries participated in the tests, which are two hours in length. They are taken by a stratified sample of students in each country. In Australia in 2015 about 750 schools and 14,500 students were involved in the PISA tests.

How ‘classroom disciplinary climate’ was involved in PISA testing

During the PISA testing process, other data are gathered for the purpose of fleshing out a full picture of some of the contextual and resource factors influencing student learning. Thus in 2015, Principals were asked to respond to questions about school management, school climate, school resources, etc; and student perspectives were gleaned from a range of questions and responses relating to Science, which was the major domain in 2015. These questions focused on such matters as classroom environment, truancy, classroom disciplinary climate, motivation and interest in Science, and so on.

All these data are used to produce ‘key findings’ in relation to school learning environment, equity, and student attitudes to Science. Such findings emerge after multiple cross correlations are made between PISA scores, students’ and schools’ socio-economic status, and the data drawn from responses to questionnaires. They are written up in volumes of OECD reports, replete with charts, scatter plots and tables.

In 2015 students were asked to respond to statements related to classroom discipline. They were asked: ‘How often do these things happen in your science classes?’

Students don’t listen to what the teacher says

There is noise and disorder

The teacher has to wait a long time for the students to quieten down

Students cannot work well

Students don’t start working for a long time after the lesson begins.

Then, for each of the five statements, students had to tick one of the boxes on a four-point scale: (a) never or hardly ever; (b) in some lessons; (c) in most lessons; and (d) in all lessons.

Problems with the PISA process and interpretation of data

Even before we look at what is done with the results of the questions posed in PISA about classroom discipline, alarm bells would be ringing for many educators reading this blog.

No rationale for what is a good classroom environment

For a start, the five statements listed above are based on some unexplained pedagogical assumptions. They imply that a ‘disciplined’ classroom environment is one that is quiet and teacher directed, but there is no rationale provided for why such a view has been adopted. Nor is it explained why the five features of such an environment have been selected above other possible features. They are simply named as the arbiters of ‘disciplinary climate’ in schools.

Problem of possible interpretation

However, let’s accept for the moment that the five statements represent a contemporary view of classroom disciplinary climate. The next problem is one of interpretation. Is it not possible that students from across 72 countries might understand some of these statements differently? Might it not be that the diversity of languages and cultures of so many countries produces some varying interpretations of what is meant by the statements, for example that:

for some students, ‘don’t listen to what the teacher says’, might mean ‘I don’t listen’ or for others ‘they don’t listen’; or that students have completely different interpretations of ‘not listening’;

what constitutes ‘noise and disorder’ in one context/culture might differ from another;

for different students, a teacher ‘waiting a long time’ for quiet might vary from 10 seconds to 10 minutes;

‘students cannot work well’ might be interpreted by some as ‘I cannot work well’ and by others as ‘they cannot work well’; or that some interpret ‘work well’ to refer to the quality of work rather than the capacity to undertake that work; and so on.

These possible difficulties appear not to trouble the designers. From this point on, certainty enters the equation.

Statisticians standardise the questionable data gathered

The five questionnaire items are inverted and standardised with a mean of 0 and a standard deviation of 1, to define the index of disciplinary climate in science classes. Students’ views on how conducive classrooms are to learning are then combined to develop a composite index – a measurement of the disciplinary climate in their schools. Positive values on this index indicate more positive levels of disciplinary climate in science classes.
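To make the arithmetic concrete, here is a minimal sketch of the kind of calculation described above. It is an illustration only, not the OECD's procedure, which may well be more involved: the numeric coding of the four response options (1 to 4), the simple averaging of the five items, the function names and the sample responses are all assumptions made for this example.

```python
from statistics import mean, pstdev

# Assumed coding for illustration: 1 = never or hardly ever,
# 2 = in some lessons, 3 = in most lessons, 4 = in all lessons.
SCALE_MIN = 1
SCALE_MAX = 4


def invert(response):
    """Reverse-code a response so that a higher value means fewer reported problems."""
    return SCALE_MAX + SCALE_MIN - response


def climate_index(students):
    """Return one standardised score per student (mean 0, SD 1 across this sample)."""
    # Average each student's five inverted responses (an assumed way of combining them).
    raw = [mean([invert(r) for r in answers]) for answers in students]
    centre = mean(raw)
    spread = pstdev(raw)
    return [(score - centre) / spread for score in raw]


# Hypothetical responses from four students to the five statements.
sample = [
    [1, 1, 2, 1, 1],  # reports almost no disruption
    [2, 3, 2, 2, 3],
    [4, 4, 3, 4, 4],  # reports disruption in nearly every lesson
    [2, 2, 2, 3, 2],
]
scores = climate_index(sample)
```

Note what the resulting number does and does not carry. A positive score says only that a student ticked fewer ‘problem’ boxes than the average of the sample; it records nothing about how that student understood ‘a long time’ or ‘noise and disorder’.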

Once combined, the next step is to construct a table purporting to show the disciplinary climate in the science classes of 15 year olds in each country. The table comprises an alphabetical list of countries, with the mean index score listed alongside each country, so allowing for easy comparison. This is followed by a series of tables containing overall disciplinary climate scores broken down by each of the disciplinary ‘problems’, correlated with such factors as performance in the PISA Science test, schools’ and students’ socio-economic profiles, type of school (eg public or private), location (urban or rural) and so on.

ACER reports the results ‘from an Australian perspective’

The ACER report summarises these research findings from an Australian perspective. First, it compares Australia’s ‘mean disciplinary climate index score’ to selected comparison cities/countries such as Hong Kong, Singapore, Japan, and Finland. It reports that:

Students in Japan had the highest levels of positive disciplinary climate in science classes with a mean index score of 0.83, followed by students in Hong Kong (China) (mean index score: 0.35). Students in Australia and New Zealand reported the lowest levels of positive disciplinary climate in their science classes with mean index scores of – 0.19 and – 0.15 respectively, which were significantly lower than the OECD average of 0.00 (Thomson, Bortoli and Underwood, 2017, p. 277).

Then the ACER report compares scores within Australia by State and Territory; by ‘disciplinary problem’; and by socio-economic background. The report concludes that:

Even in the more advantaged schools, almost one third of students reported that in most or every lesson, students don’t listen to what the teacher says. One third of students in more advantaged schools and one half of the students in lower socioeconomic schools also reported that there is noise and disorder in the classroom (Thomson et al, 2017, p. 280).

What can we make of this research?

You will note from the description above that there would need to be a number of caveats placed on the research outcomes. First, the data relate to a quite specific student cohort who are 15 years of age, and are based only on science classes. That is, the research findings cannot be used to generalise about other subjects in the same year level, let alone about primary and/or secondary schooling.

Second, there are some questions about the classroom disciplinary data that call into question the certainty with which the numbers are calculated and compared. These relate to student motivation in answering the questions, and to the differing interpretations by people from many different cultures about the meaning of the same words and phrases.

Third, there are well-documented problems related to the data with which the questionnaire responses are cross-correlated, such as the validity of the PISA test scores.

In short, it may well be that discipline is a problem in Australian schools, but this research cannot provide us with that information. Surely the most one can say is that the results might point to the need for more extended research. But far from a measured response, the media fed the findings into the continuing narrative about falling standards in Australian education.

The media plays a pivotal role

When ACER released its report, the headlines and associated commentary once again damned Australian schools. Here is the daily paper from my hometown of Adelaide.

Disorder the order of the day for Aussie schools (Advertiser, 15/3/2017)

‘Australian school students are significantly rowdier and less disciplined than those overseas, research has found. An ACER report, released today, says half the students in disadvantaged schools nationally, and a third of students in advantaged schools, reported ‘noise and disorder’ in most or all of their classes…. In December, the Advertiser reported the (PISA) test results showed the academic abilities of Australian students were in ‘absolute decline’. Now the school discipline results show Australian schools performed considerably worse than the average across OECD nations…. Federal Education Minister Simon Birmingham said the testing showed that there was ‘essentially no relationship between spending per student and outcomes. This research demonstrates that more money spent within a school doesn’t automatically buy you better discipline, engagement or ambition’, he said’ (Williams, Advertiser 15/3/17).

Mainstream newspapers all over the country repeated the same messages. Once again, media commentators and politicians had fodder for a fresh round of teacher bashing.

Let’s look at what is happening here:

The mainstream press have broadened the research findings to encompass not just 15 year old students in science classrooms, but ALL students (primary and secondary) across ALL subject areas;

The research report findings have been picked up without any mention of some of the difficulties associated with conducting such research across so many cultures and countries. The numbers are treated with reverence, and the findings as the immutable ‘truth’;

The mainstream press have cherry picked negative results to get a headline, ignoring other findings in the same ACER report: for example, that Australia is well above the OECD average in terms of the interest that students have in their learning in Science, and the level of teacher support they receive;

Key politicians begin to use the research findings as a justification for not having to spend more money on education, and to blame schools and students for the ‘classroom chaos’.

These errors and omissions reinforce the narrative being promulgated in mainstream media and by politicians and current policy makers that standards in Australian education are in serious decline. If such judgments are being made on the basis of flawed data reported in a flawed way by the media, they contribute to a misdiagnosis of the causes of identified problems, and to the wrong policy directions being set.

The information that is garnered from the PISA process every three years may have the potential to contribute to policy making. But if PISA is to be used as a key arbiter of educational quality, then we need to ensure that its methodology is subjected to critical scrutiny. And politicians and policy makers alike need to look beyond the simplistic and often downright wrong media reporting of PISA results.

Alan Reid is Professor Emeritus of Education at the University of South Australia. Professor Reid’s research interests include educational policy, curriculum change, social justice and education, citizenship education and the history and politics of public education. He has published widely in these areas and gives many talks and papers to professional groups, nationally and internationally. These include a number of named Lectures and Orations, including the Radford Lecture (AARE); the Fritz Duras Memorial Lecture (ACHPER); the Selby-Smith Oration (ACE); the Hedley Beare Oration (ACE – NT); the Phillip Hughes Oration (ACE – ACT); the Garth Boomer Memorial Lecture (ACSA); and the national conference of the AEU.

Alan once again nails the argument. PISA envy has become an international obsession used by neo-cons to berate and belittle Australian teachers and schools who are doing the very best possible with limited resources in difficult and disadvantaged conditions.

NAPLAN has a similar obsession. Both these tests are contested knowledge.

Thank you for your insights into what is happening with the media coverage of Australia’s performance on PISA. An interesting and informative account of the gaps in the reporting. Now to get the politicians and policy makers to look beyond these flawed messages.