The use of technology-based games in educational assessment has implications for what is valued in education. The use of games shifts the focus away from seeking common understanding to strategic goal orientation or performativity.

Lyotard’s (1984) work on performativity provides a good basis for critiquing the use of games in educational assessment. For Lyotard, technology provides for

a game pertaining not to the true, the just, or the beautiful, etc., but to efficiency: a technical “move” is “good” when it does better and/or expends less energy than another. (p. 44)

This observation concurs with my own experience when completing a unit on corporate strategy in a university-delivered business degree. The corporate strategy unit included the game ‘Ages of Empire’ as part of the coursework and unit assessment. This game did not address truth. For example, the weapons were not true in relation to laws of physics, and the progress in civilisations was not true in relation to understood history. That the game required me to ‘kill’ members of my opponents’ army did not equate to understood justice. Further, while the game used colours in the way that poker machines do to draw in the vulnerable, the game did not represent beauty. Instead, the game focused on a strategic orientation towards vanquishing others in the shortest time. In these respects, Lyotard is spot on.

I became very good at the game ‘Ages of Empire’. At the final exam, a guy who I had earlier worked with in group presentations was randomly next to me on the game’s ‘strategic landscape’. As per the rules of the game, I swiftly annihilated him, and went on to win the game overall. I got a high mark for the course, but a potential friendship was lost. That is the nature of gaming.

Habermas (1985) considers social action (action involving other people) as either communicative in seeking common understanding, or strategic in being goal oriented. Action can also be instrumental (acting towards objects). Educational assessments are generally communicatively oriented, in that they try to find how much of a student’s knowledge is held in common with the teacher or educational system. Educational assessment can also be instrumentally oriented, for example in how well a student can master a technology. When acting strategically, as in a game, an assessment asks a student to act, openly or deceptively, against another student or computer agent. This latter scenario is the most troublesome for me.

Educational assessment experts have been playing with game-based assessment for some time. Shute and Torres (2012) argue that game-based assessment supports views that consider learning as goal oriented. Mislevy, Behrens, Dicerbo, Frezzo, and West (2012) also find principles of game design compatible with principles of learning, particularly when learning is framed around the structure of reasoning. However, for me, the question remains: what is being assessed?

While game-based learning and simulation environments make it possible to assess students’ interaction with complex systems (Frezzo, Behrens, & Mislevy, 2009), a concern remains that they will drift into performativity. That is, away from truth, justice and beauty, and into a performance maze where what is considered as performance is determined by computer programmers, and what is considered good performance is determined by input/output ratios. Performativity par excellence (Lyotard, 1984; Lyotard & Thébaud, 1985/2008). It evokes the dystopia of the films The Hunger Games and The Matrix (I haven’t seen either though).

The key philosophical thinkers in the latter half of the twentieth century were all profoundly moved by what happened in World War II. When we start to think about game-based assessment, we need to follow their lead and think about whether we want students to learn when it is just to kill, or whether we want them to learn to kill as many as they can in a short time.

I am concerned about the UK Push’s emphasis on the empirical. While a focus on empiricism may be understandable given the contemporary proliferation of data (Behrens & DiCerbo, 2014), the historical tension between the empirical and the rational is ignored by the UK Push as it drifts away from reason into ideology.

There are many ways of characterising the rational side of science. Plato uses forms, Kant uses categories, Kuhn uses paradigms, Morgan (1997) uses metaphor, others use the concept of structures. Each of these different characterisations addresses different aspects of the rational in the rational-empirical divide. In this blog, I will use the term mental model, drawn from Senge (1992).

Mental models frame how we conduct empirical activities; they frame how we collect and interpret data. This has traditionally been a chicken-or-the-egg argument: does the mental model or the data come first? Contemporary approaches consider there to be a dialectic between the two, where one informs the other. That is, mental models evolve with the awareness and availability of new data and evidence.

Mental models can frame how educational assessments are conducted. For example, TIMSS and PISA describe their mental models in respective frameworks. Each program uses a different mental model, and this results in different empirical data and claims. These claims can often be reconciled through further reasoning; as Wu (2009) does, for example, for mathematics achievement in TIMSS and PISA.

Different mental models sometimes exist for the same phenomenon. Light provides a good example. In physics, light can be considered a particle, studied using the methods and equations used for the study of billiard balls. Light can also be considered a wave, studied using the methods for studying waves in a pond. The wave and particle models are equally valid, with their use depending on purpose.

Sometimes new mental models supplant older ones that no longer fit. The movement of planets provides a classic example. Once the earth was considered stationary and at the centre of the cosmos. This led to complicated equations to describe the movement of planets. Copernicus came up with a better model that considered the sun as stationary. This led to less complicated equations and easier science. This is what Kuhn (1970) characterises as a paradigm shift.

An important feature of paradigm shifts, and better ways of knowing, is that they require the development of a better model. Paradigm shifts cannot be created simply by criticising old ideas, as is the wont of the UK Push. The criticism from the UK Push is the kind that Galileo experienced from the Roman Inquisition when he first proposed that the sun be the centre of planetary motion.

Different mental models can be used for the same phenomena in different spheres of life. Music provides an excellent example. Sound can be studied through the science of physics to make technologies for creating music, special effects and so forth. Performers of music on the other hand communicate differently using scales, chords, time signatures etc. The discipline of music can vary across cultures and is distinct from the discipline of physics. Then there is the aesthetic experience of enjoying music which is considerably subjective and can be oblivious to theories in both music and physics.

Mental models around social constructions – how people relate to each other – have been considerably challenged in the past decades. Major blind spots have been identified since Jefferson found it self-evident “that all men are created equal” in 1776. That statement itself left out women, leading to the feminist movement (for example Butler, 1990/2007; Irigaray, 1995). Former colonies felt marginalised, leading to postcolonialism (for example Said, 1978/1994; Spivak, 1985/2010). There is also Foucault (1990) who challenged traditional perceptions of sexuality and marginalisation. Crenshaw (1991) highlights how race, gender and other identity categories feed into politics and marginalise individuals. These forces challenge shared mental models in the social sciences.

Language is central to mental models and to how they work in the social sphere. For example, Butler (1990/2007, pp. 12-13) notes that social science refers to things like gender as a dimension of analysis, but gender can also be applied to a person, where language is used to create a person’s gender. Here Butler is drawing on the work of the British philosopher Austin (1962/1975) in How to Do Things With Words. Austin points out that not only can words be used to describe things, words can also be used to construct things like gender. In this way, mental models, as articulated through language, can be used to construct or reinforce the identity of individuals including students.

Empirical evidence is therefore only half of the scientific method. Scientific arguments also require examination of the rationales, mental models, and language used. London-based Driver, Newton, and Osborne (2000, p. 289), for example, argue that empirical work should not be portrayed as the basic procedural step of scientific practice. Instead, they consider empirical evidence as providing evidence for knowledge claims that can be tested using argument through models such as that proposed by Toulmin (1958/2003).

With their fixation on empirical evidence, the UK Push are oblivious to the mental models that furnish interpretations of data, evidence, and what works. Conscious or unconscious ideologies underpinning their mental models remain hidden from scrutiny, argument and debate. That the UK Push’s focus is on debasing and ridiculing existing ideas, often drawn from the margins of the literature, points to a reductive and dangerous form of reasoning, particularly in the absence of proffering new well-formed ideas. In not being able to create new ideas, all that is left is to import the ideas of others, such as the UK phonics check.

Spivak, G. C. (2010). Can the subaltern speak? In R. C. Morris (Ed.), Can the subaltern speak?: Reflections on the history of an idea (pp. 21-78). New York: Columbia University Press. (Original work published 1985)

Toulmin, S. E. (2003). The uses of argument (Updated ed.). New York: Cambridge University Press. (Original work published 1958)

One “side” approaches T&L from a scientific POV, the other from a sociological POV. You won’t agree as you are arguing past each other.

Rejoinder one

To afford the UK Push the status of a side is to elevate them beyond their competence. They are not my interlocutors. Education brings many perspectives, of which Australia has an abundance. My interlocutors include:

anybody seeking to engage for new objective, social and personal understandings

philosophers from analytic and continental (German & French) perspectives, including feminists and post-colonialists

academics from education and other perspectives (plus their literature)

educational theorists on validity and approaches to argument

curriculum developers

test developers

teachers and early childhood workers working from an ethic of care

educational leaders including leading teachers, principals, regional, state and federal managers

organisations implementing global assessments

organisations implementing national assessments

state organisations implementing secondary school certificates

schools performing evaluations

teachers assessing in the classroom

parents of infants who are required to assess their newborn

bureaucracies coordinating resources for school systems

teachers of initial teacher education

professionals providing professional development and coaching to the teaching workforce

teacher organisations and unions

authorities that develop teacher education standards

higher education providers that implement teacher education standards

human resource units that support teachers

employee health and safety organisations that seek to support teachers, particularly to address bullying

There are over 300,000 across 8 states and territories, plus those working in higher education, bureaucracies and consultancy roles, that provide for a diverse Australian educational landscape. Whenever I engage with these interlocutors I am regularly humbled by their depth of knowledge and expertise.

The UK Push are thugs seeking to pillory and vilify those working in Australia’s education sector; they are not seeking to constructively engage with it. They do not represent any side or perspective.

I am increasingly concerned by bullying around learning styles arising from a so-called globalised spontaneous movement I’ll call the UK Push. The UK Push into Australia is having a corrosive effect on Australia’s pluralistic educational culture and debates. I do not deny that Australia’s education sector requires attention. I’m arguing that the UK Push is regressive and toxic because it promotes blunt forms of argumentation that do not consider social context. I also consider much of the argument disingenuous in that it is goal oriented and not oriented towards understanding.

Central to my concern is how Australian educators are being harassed in the guise of myth busting using blunt forms of logic. I first became aware of this at my first AARE conference in Melbourne (2016), when sharing coffee with random attendees. The latest Twitter storm was being discussed, and one person said they had abandoned Twitter that morning because their AARE session had been vilified by the UK Push.

Many educators approach education from an ethic of care and are particularly prone to bullying. As Noddings (2003) explains, a person who engages others from an ethic of care “is not seeking the answer but the involvement” (p. 176). Care is of primary importance in education. It is through an ethic of care that new insights and understandings become possible. When involvement is inauthentic and hostile, those engaging can experience toxicity and distress. Of course, those who approach life from an ethic of care still need to reason, but this reasoning needs to proceed with an empathy for different perspectives. It requires moral development (Gilligan, 1977; Kohlberg, 1971; Murphy & Gilligan, 1980). This form of reasoning is lacking in the UK Push.

Deciding what is a good decision in one classroom and not in another is important in education. Context matters, and the logic of the UK Push lacks this moral and ethical sophistication. By way of example, I was recently watching a video of a UK speaker – who is associated with the UK Push – describing the nature of logic. He was asked about logic and context. One question piqued my interest. The questioner said that while he accepted that it was wrong to kill, he asked if in certain situations and certain contexts this view might be different. The speaker glossed over this question, saying it was a matter of ‘generalisation’, where ‘generalisation’ refers to saying things like ‘dogs have four legs, and know that there might be some dogs without, without worrying about being inaccurate’. While acknowledging that the context may have been rushed and informal, the speaker floundered on the question of assumptions underlying an argument, and floundered when presented with a moral and ethical question.

The UK Push lack the kind of sophisticated reasoning that is central to education. For example, the field of educational assessment has highly evolved evidentiary approaches to reasoning (Mislevy, Steinberg, & Almond, 2003). Newton and Shaw (2014), also from the UK, provide an excellent analysis of the historical vicissitudes of educational assessment validity. A recent special edition of Assessment In Education scientifically progressed this debate (Newton & Baird, 2016). The USA also has a strong tradition through the work of Messick (1989) with its concern for social consequences, and the argument-based approach of Kane (2006). Many educators in Australia are well versed in these approaches. Not so the UK Push.

Justifying the generalisations and presuppositions of an argument is key to the process of argumentation. In not providing these justifications, the UK Push are blind to the a priori assumptions behind their uses of data and evidence. Bex, Prakken, Reed, and Walton (2003, p. 142) argue that generalisations may be validly challenged on three grounds: the source of the generalisation, the circumstances in which the generalisation applies, and the nature of the generalisation itself.

The arguments of the UK Push lack moral and ethical reasoning. Their approach to argument is based on forms of logic and reasoning that are tautological. This is a form of argumentation where an argument’s conclusions are embedded in its presuppositions; for example, “Anne is one of Jack’s sisters; All Jack’s sisters have red hair; So, Anne has red hair.” (Toulmin, 1958/2003, p. 115).

The UK Push lack the sophisticated forms of argument that call for the justification of claims based on evidence, and which provide opportunity for rebuttal. This is the approach advocated by British philosopher Toulmin (1958/2003), and adopted in contemporary approaches to educational assessment validity (Kane, 2006; Mislevy, 2006).

The UK Push rely on blunt forms of reasoning that ignore claims to normative rightness and subjective truthfulness. They ignore Australia’s context and the unique legal, social, cultural, and economic constraints that Australian teachers work in. This is illustrated in the UK Push’s desire to arbitrarily translate programs such as the phonics test directly to Australia. These claims are made without justification of the appropriateness of the test for the local context. They do not reference the knowledge that Australian teachers have on phonics, nor Australia’s research on the matter, including a National Inquiry. Further, the efficacy of the UK program remains moot (Clark, 2013; Bradbury, 2014).

The pernicious nature of the UK Push’s use of argumentation is evident in learning styles. The UK Push pillory those who engage with learning styles, yet provide no original research on the matter, refusing to even consider the various constructions or models of learning styles. They engage not for progress, but for power and control. Scientific progress generally occurs when better ideas supplant problematic ones (Kuhn, 1970). Paradigm shifts require real research, not badgering through cherry-picked arguments from other researchers. For example, I suggest moving from learning styles towards Kress’ (2010) notion of semiotic resources. But the UK Push do not seek to progress the debate, illustrating a strategic orientation, and not an orientation towards new understanding.

While hounding people about learning styles, the UK Push do not explicate or describe the theories about which they seek to engage. The term “learning styles” is used as an empty signifier, without explication. This leads to a form of debating that Carnap (1928/2005), for example, considers pseudoproblems of philosophy, and illustrates that the UK Push are not interested in scientific discourse, but rather, in gaining power and control over other educators and educational research.

It is clear to me that the debate about learning styles is not a scientific debate. It’s a commercial pursuit into Australia’s educational fabric. Lewin (1947), for example, developed the notion of unfreeze, change and freeze in the study of group dynamics. It seems that a bastardised version of this model is happening with the UK Push. Where bovver boys are attempting to unfreeze so a UK saviour can emerge in its wake. That Australian academics are legitimating this push is even sadder.

Grassroots is a sought-after designation and is associated with an unfettered authenticity in communion with colleagues. Australian education has a grassroots tradition. As a junior teacher, I observed senior teachers move to work in the union movement, to the exam board, into various branches of the bureaucracy, and into academia. During the political maelstrom of Victoria’s Kennett years, teachers were being moved all over the place. In one of those moves, I was mistakenly sent to a school that was in the process of being demolished. Instead of bleating, it was at that moment that I decided to contribute to fixing the system. That’s the grassroots approach.

About a decade after being mistakenly moved to a demolished school, I worked in the role of HR Manager, in the role that authorised that mistaken move those years before. A couple of years after that, we had installed a new recruitment and payroll management system for Victorian teachers (eduPay). A few years after that, perhaps because of the transparency of the new payroll system, the anti-corruption commission cleaned out the Department. Now everything is back in a row. That’s the grassroots approach.

On my journey into the Department, I was fortunate enough to work on the PISA project. There too the project directors were once teachers in the Victorian system. Further, Andreas Schleicher had completed his Masters at Victoria’s regional Deakin University. He too thought about a grassroots movement to provide more data to schools; he was in search of an organisation. That’s the grassroots approach.

We all like to see ourselves as grassroots, we all like to think of ourselves as authentic, and more authentic than the systems in which we work. But as we form networks, connect with the powerful, engage with the corporations, the integrity of any grassroots claim diminishes. Whether working in the Department, working for PISA, or working for the VCAA, when you’re in the system, when you are in an organised network, you are the system. And yes, the system has problems, and systems need to be fixed.

Most teachers would like to consider themselves as grassroots, as authentic, and being in touch with everyday life. I’m like that too, but to enjoy that authenticity I have had to resign from jobs several times. Bureaucratic work can be boring, and involve much compromise. I prefer to move on when the job is done; I’m like Dave Grohl, I can’t sit still. Currently, I’m fortunate enough to be below grass roots, a full-time student working part-time with refugees. And enjoying it.

My PhD addresses concerns with education in Victoria and Australia, and it considers deep structural issues and assumptions. I’ll blog more on that later. There are always complaints against the establishment, there are always claims about what them and they are doing. A key part of my journey has been to find ‘them’ and ‘they’ that the many talk about. I’ve always found that ‘they’ are ‘us’, and ‘they’ are ‘you’. Wherever you go, there you are. Of course there is power in the confused Canberra, in the corridors of Sydney that likes to assert itself, and then there is AITSL which seeks to be a statutory authority but is not one. There is also AARE, and anybody who thinks they are the establishment has rocks in their head. Everybody knows that most of them are from Queensland, and Melbourne is where it is at.

Like many, I was introduced to several theories of learning styles in the 1990s. Neuro-linguistic programming (Grinder, 1991) and experiential learning cycles (Kolb & Fry, 1975) were two that I remember. I never tested my students on what style worked best for them. I didn’t even ask if they had a preference, even though Pashler, McDaniel, Rohrer, and Bjork (2008) find that there is “ample evidence that children and adults will, if asked, express preferences” (p. 105). Instead, I used theories of learning styles to cycle through different ways of approaching and presenting educational content. Cycling through various teaching approaches proved effective for the aims and objectives at the time. I accept that these approaches are now out of date.

During the nineties, I also had an interest in cultural studies (whatever happened to that?) with a focussed interest on semiotics (Barthes, 1967, 1957/1993; Chandler, 2002; Eco, 1980/1998, 1988/2001). That was fun.

When I left teaching I was lucky enough to get a job working on the PISA project (I didn’t mention that before :-). As all the important people were busy, I found myself managing PISA’s first computer-based assessment which was a side project (OECD, 2010). Luckily, I was surrounded by competent people, mainly university students whose opinions hadn’t yet calcified, so the project went well. It was in managing this project that a basic understanding of the visual (movies, animations and images), kinaesthetic (input devices) and auditory (sound) proved useful. As was a rudimentary understanding of semiotics.

Field operations for PISA’s first computer-based assessment went well. The project specified that countries cart standardised laptops to all sampled schools. This meant that the assessment conditions were highly standardised and the data of good quality. However, national centres found lugging laptops too cumbersome, and as countries started to complain they began to call it the Koomen model. Three countries persisted to a main trial, and when the data was analysed, that’s when the real problem began.

The data from the computer-based test made sense, but was quite distinct from the paper-based test. The movies, animations and other features suggested that the science assessed on the computer was different to science assessed using paper. For example, boys did better on computer-based because the reading load was less. The Koomen model didn’t work due to the heavy laptops, and because the data didn’t fit with the paper. So, I went to hide in the public service for a bit.

The mental models (Senge, 1992) I had at the time didn’t fit computer-based assessment. However, I have recently been contemplating Kress (2010, pp. 5-7) who argues that technology provides for greater ‘semiotic resources’ for meaning making. This is now becoming the key idea for me. Technology has afforded new ways of making meaning and education has not quite caught up. Education is still caught up in solid, but dated, forms of syntactic structures (Chomsky, 1957/2015), dated forms of learning styles (Grinder, 1991), and dated approaches to culture (Jameson, 1991). From my perspective, educational research could focus more on how the newly available ‘semiotic resources’ affect meaning making for students. It is these developments that are likely to underlie Schleicher’s admission that the new computer-based PISA results might not be comparable, and why ACARA is having trouble moving NAPLAN online.

Habermas (1979) argues that societies develop along cognitive-technical and moral-practical dimensions. I would also argue that societies develop along an aesthetic-expressive dimension. Habermas (1975/2005) also argues that developments along the cognitive-technical dimension can fracture normative structures and destroy barriers of participation and create dysfunction and regression. This is the focus of my research in technology-based educational assessment.

Easter signals a time of new beginnings, a celebration of the spring Goddess (antipodes). Some think of the Germanic Ēostre, the Greek Eos, or the Roman Aurora. There are other characterisations. However one regards Easter, it is a time for spending time together and reflecting.

Easter, like spring, celebrates the possibility of the new. Transition from an old to a new is progress, but regression from the new to the old is also possible. These transitions have traditions too; the Greeks had the vanquished Titans and the victorious Olympians. The Olympians had the fabulous Dionysiac forces of creative-destruction; forces appropriated for capitalism by the economist Schumpeter.

The old is associated with violent struggles exemplified in Greek mythology through the authoritarian Cronos who wielded the harvesting scythe. Christian mythology has the one vengeful god of the Old Testament. In each tradition, the new is associated with plurality and greater shared understanding.

The Easter festival of spring brings hope for a new way of being. A hope to move away from the old violent confrontation of the Titans, towards the pluralistic understanding of the Olympians. A transition talked about by Confucius when he asks never to impose on others what you would not choose for yourself. This maxim has had variation over the centuries and is a sentiment that asks us to consider a range of narrative traditions other than our own.

I have been reflecting on these things since reading Nietzsche’s The Birth of Tragedy over the weekend; it is a book that draws on the Dionysiac tradition of Greek theatre. I have been reflecting on what these myths can tell us about NAPLAN online, and about Safe Schools.

Over a decade ago, I was fortunate enough to find myself leading a team implementing the first computer-based assessment in PISA. It worked, as no one was watching. The technical innovation was mainly created by university students, who made it look easy. The project brought together item developers, film directors, animators, software developers, psychometricians and field operations experts. It worked because experts in each field recognised the needs of experts in other fields. For me, it demonstrated the plurality of Olympus, and the plurality of the New Testament. This project allowed me to glimpse what the future might look like.

Of course, the future did not arrive. Instead of progress, there was regression. There was money to be made and the Titans with singular interests returned.

While I have no details, I imagine ten years later NAPLAN online is still stuck in that destructive battle of the Titans. NAPLAN online is a task eminently doable, it could have been done several years ago. But I imagine that people are fighting for their own vested interests, with an unwillingness, or an incapacity, to consider the perspectives of others.

Regression is also evident in the battle over Safe Schools, a failure of one to see the point of view of the other. A failure to see a plurality of views, regressing to a clash of the Titans.

To save NAPLAN online, to save Safe Schools, we all need to morally develop so that we understand a plurality of perspectives. And while the mortal Titans fight, the Olympians’ unquenchable thirst for laughter will remain.

Whenever I see Mark Latham in trouble my first reaction is to question his mental health. The latest episode was no different; it simply led me to question Sky’s health and safety policy and practices, not their strategic and editorial decisions.

I cannot see how a sane man can make the kind of comments Latham makes. I know his sanity has been endorsed by greats such as Whitlam, as well as by the vetting processes of the ALP, various elite writers’ conferences, and Latham’s Sky media gig. Yet I cannot accept his commentary as being that of a rational person.

So why are people so drawn to Latham, why did Sky keep him for so long, why did Jacqueline Maley write such an ironically engaging article on not engaging with him? What is the world’s fascination with Mark Latham? What is this trope, what is this anger, and why are people drawn?

The validity of the feminist cause is beyond challenge. The battle over the principle of equality was won decades ago, but social change has been slow and is now confused. To recap, there has always been a grand emancipatory narrative between master and slave (Hegel), or proletariat and bourgeoisie (Marx). However feminist writers, for example Irigaray, have correctly challenged this traditional grand emancipatory narrative by arguing that women are treated as goods for exchange within it. That is, women were traditionally not accepted as citizens within the broader emancipatory narrative. This required women to write their own emancipatory narrative we know as feminism, or what some conservative French philosophers might call a self-legitimating little narrative.

The narrative of feminism has had some, yet not universal, success in broadening opportunities for women and providing women access to power. That progress has been uneven is leading to new toxic dynamics. There are now women in positions of power, while underlying inequality remains and festers.

Contemporary problems are more evident among the elites than among the hoi polloi. One Nation has shown that conservative men and women do not have problems with female leadership. Pauline Hanson’s stable leadership of Australia’s conservative movement for over 20 years is in stark contrast to other parties considered more progressive. That Queensland is the only state to popularly elect a woman, twice, shows that Hanson is not an isolated case. Given a choice between an authoritarian male (Newman) and a sensible female (Palaszczuk), even conservatives seem to prefer the latter. Among the general population there seems little problem with women and power.

Problems are more evident among elites and spheres of life considered progressive. That only 25 per cent of professors and one third of vice-chancellors in Australian universities are women is one example. The progressive Greens have only been led by a woman for 3 of their 20-odd years in parliament, or around 15% of the time. The progressive ALP butchered their one chance, with Gillard’s chances being cruelled more by internal machinations than harassment from outside the party.

The elite trade of the fourth estate provides another example. The ABC’s Insiders program, the elite political show on Australia’s elite broadcaster, has 74% male appearances to date for 2017 (see table below). This is comparable to the gender crisis in academia. Part of this male dominance could be explained by vestiges of a patriarchal past; in that it could be argued that Barrie Cassidy and Mike Bowers are the preeminent experts due to past male advantage. This is the sort of argument the ALP uses to kick the gender equality can down the road to 2025. However, that over 75% of guests are men as well as nearly two thirds of guest panellists, is freshly baked contemporary gender bias. There is nothing self-evident in contemporary Australian society to suggest that men are better panellists and guests than women. Particularly given the tiresome spats between Henderson and Marr that are relics from the sixties.

The ABC’s Insiders exemplifies some broader dynamics. It shows that journalism is one field where women are as competent as men, yet still lack voice in shaping the social sphere and the debates. This seems not isolated to journalism, as women are often promoted in other fields based on their functional efficiency, then ignored in conversations that shape their workplace and society, with women remaining as goods for exchange, this time as a “highly functional robot”. What is also not isolated is that such gender bias is often excused when it involves a charismatic male, which in the case of the ABC’s Insiders is Barrie Cassidy. I have no doubt that Barrie Cassidy is the good bloke as presented. But all systemic injustices breed contempt, and I often wonder if this contempt is not projected on the faces of the likes of Latham.

The toxicity of the current gender debates might be better explained by dynamics among the elites than among the hoi polloi. Women working in the elite of the fourth estate are clearly being denied equal voice within it. We also have women in the elite of the fourth estate who ply their craft in a form of journalism that projects responsibility for inequality onto males of the proletariat. They seek to antagonise certain elements of the male proletariat, then thrive on the inarticulate toxicity of their trolling. They tease inarticulate marginalised working class men for their inability to exploit their natural advantage as white men, a class of men traditionally addressed by the emancipatory politics of the labour movement. It is sometimes difficult to ascertain the purpose of such journalism other than for its shock value. Of course, I do not include Badham and Maley in this group.

The gender dynamics that Badham writes about are real and as nefarious as she describes. However, I do wonder if feminism should continue to be the dominant vehicle through which they are addressed, and whether it may not be better to pursue them through a broader emancipatory narrative. This is not to say that women cannot pursue their issues through feminism; there is always freedom of association. But with equality women get equal rights to be as stupid, strategic and nasty as some men. Equality also gives women of the elite equal rights to be nasty to both men and women of the proletariat.

The trope of the self-legitimating little narrative espoused by French philosophers of the past has perhaps run its course. It has served the disempowered and minorities well for decades with some effectiveness. But the self-legitimating little narratives of identity politics have been appropriated for harm by the likes of Geert Wilders, and with catastrophic potential by those in power like Donald Trump. It may be time to again address the disenfranchised through a broader emancipatory narrative, as one species on one planet.

I’ve happened upon two powerful and memorable pieces of television in my lifetime. One was Keating’s Redfern speech, the second was Rosie Batty’s first press conference after the death of her son. It would have been perfectly reasonable to expect a tale of misandry from Rosie Batty after such a horrific event. Instead, Rosie exemplified reason and compassion in saying that bad things can happen to all people at any time. She then successfully campaigned for better public policy and for better justice under one law. We can all continue to learn from Rosie’s approach. That Mark Latham was Rosie’s harshest critic best exemplifies the state of his mental health.

First a quick thanks to all those who responded to the blog. The response was heartening.

The point that I was trying to make with the blog was that there are random students doing NAPLAN who are getting wildly misleading results. Probably not a high percentage, but when over a million students undertake the test, 5% is 50,000 students. Most of those students might not even be psychologically affected. But some, perhaps those who tried hard on the promise of an ice-cream, or those who tried hard to please mum who is going through a rough trot, or those who tried hard one last time to be good at numbers or words, will be. Out of one million, the number of students affected may be less than 10,000, or less than 1%. But this worries the caring teacher type as it causes unnecessary grief.

Students, perhaps more than adults, are particularly vulnerable when things are unfair. Students roughly know where they are with their school work. When they receive feedback that is fair, justified and agrees with their self-perceptions, they generally accept it thoughtfully. Fairness is a big thing in testing (for example, see Camilli, 2006; Zieky, 2015). When feedback is unfair, students can have maladaptive emotions (for example see Vogl & Pekrun, 2016).

As a former maths teacher, I like the mathematics of assessment, often finding it more beautiful than useful. I like the simplicity, elegance and flexibility of the Rasch model (Adams & Wu, 2007; Rasch, 1960). I like the magic of plausible values, that random numbers can sometimes be more useful than real ones (Wu & Adams, 2002). And what is there not to like about a number called a Warm estimate (or WLE), that gives a student who doesn’t get anything right a non-zero score? But this magic does not always work for me with NAPLAN reports.

So this student’s graphic report looks something like this (on top), then annotated below.

So my claim that student scores are not reliably reported in NAPLAN remains, as does the observation that this unreliability is not clearly communicated to students and parents. Further, it is used inappropriately by the media to label improvers, coasters and strugglers.

Response 2

Sure, and most school-based paper-and-pencil tests administered by teachers are administered under classical test theory (CTT) principles. Perhaps CTT tests lack precision, but student scores relate to the content on the test which enables teachers to give meaningful feedback around that content on the test.

NAPLAN is based on the Rasch (1960) principle that the raw score is the sufficient statistic, and it reports on levels which are abstracted from the content (read the book). This makes it close to impossible to give meaningful feedback. Teachers would need to administer another test, or get evidence from elsewhere, to provide meaningful feedback.

Online may provide enhanced precision, but the feedback will remain less useful, and given that students within cohorts will do different forms, teachers will need to trawl through test forms to have any hope of providing feedback. Further, the socialising narrative of the test will be diminished, which is likely to alienate marginal students.

Conclusion

Moss (2003) argues that in “much of what I do, I have no need to draw and warrant fixed interpretations of students’ capabilities; rather, it is my job to help them make those interpretations obsolete.” The reporting regime of NAPLAN makes results, in the terms of Austin (1962), performative. That is, NAPLAN reports do not so much describe as create and define. Performative statements are not true or false, but either happy or unhappy. Some of NAPLAN student reporting is unhappy.

The release of international reports on education, as well as NAPLAN, has placed teachers under much pressure. Most of this pressure arises from innuendo, or what statisticians call correlations. Is this pressure warranted?

NAPLAN reporting of student abilities is unreliable. This is likely to have tragic effects for some students. These individual tragedies are largely silent except to the student. The dubious accuracy of NAPLAN results calls into question the fairness of recent media reports that label students as big improvers, coasters, and strugglers.

NAPLAN reports student results as dots within bands numbered from 1 to 10. That these dots are solid conveys a sense of certainty, a certainty not matched by the mathematics. It is normal practice in statistics to show a confidence interval. For example, a 90% confidence interval would show a range in which we are 90% confident a student’s ability is located. NAPLAN does not report these confidence intervals for individual students.
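For readers who like to see the arithmetic, here is a minimal Python sketch of such an interval. The scale score of 500 and the standard error of 30 are made-up numbers for illustration, not NAPLAN’s published figures; the function name is my own, and the sketch assumes measurement error is roughly normal.

```python
from statistics import NormalDist


def confidence_interval(score, standard_error, level=0.90):
    """Return (low, high) for a two-sided interval around a reported score.

    Assumes normally distributed measurement error (an illustrative
    simplification, not a description of how NAPLAN scores are estimated).
    """
    # z-value that leaves (1 - level) / 2 in each tail of the normal curve
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return score - z * standard_error, score + z * standard_error


# Hypothetical student: reported score 500, assumed standard error 30
low, high = confidence_interval(500.0, 30.0)
print(f"90% interval: {low:.0f} to {high:.0f}")
```

With these made-up inputs the interval runs from about 451 to 549, a span of nearly 100 scale points sitting behind a single solid dot.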

Margaret Wu (2016) finds that if NAPLAN included confidence intervals, it would not be possible to confidently locate a student in a particular band. That is, around one in ten students is being reported in the wrong band. This effect is random and potentially has tragic consequences.

Over one million students do the NAPLAN tests, so there are over one million stories. Once the unreliability is considered, new stories emerge for our improvers, coasters and strugglers. Improvers, for example, could simply be those students reported below their level one year, and above their level the next. Most students would be coasters. In statistics, this is regression to the mean.
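A small simulation can make this concrete. The sketch below is a toy model under loudly stated assumptions: every simulated student’s real level is identical in both years, and each reported score is that level plus random noise. The mean of 500, spread of 70, and error of 30 are invented for illustration and are not NAPLAN parameters; all names are hypothetical.

```python
import random


def simulate(n_students=20000, error_sd=30.0, seed=1):
    """Return (true_level, year1_report, year2_report) for each student.

    Toy assumption: no student really changes between the two years,
    so any reported gain or loss is pure measurement noise.
    """
    rng = random.Random(seed)
    records = []
    for _ in range(n_students):
        true_level = rng.gauss(500.0, 70.0)
        year1 = true_level + rng.gauss(0.0, error_sd)
        year2 = true_level + rng.gauss(0.0, error_sd)
        records.append((true_level, year1, year2))
    return records


def mean(values):
    return sum(values) / len(values)


records = simulate()
# Rank students by reported gain from year 1 to year 2
by_gain = sorted(records, key=lambda r: r[2] - r[1])
tenth = len(by_gain) // 10
strugglers = by_gain[:tenth]   # bottom 10% of reported gains
improvers = by_gain[-tenth:]   # top 10% of reported gains

# How far off was the year-1 report for each group, on average?
improver_year1_error = mean([y1 - true for true, y1, _ in improvers])
struggler_year1_error = mean([y1 - true for true, y1, _ in strugglers])
print(f"Improvers were reported {improver_year1_error:+.1f} from their level in year 1")
print(f"Strugglers were reported {struggler_year1_error:+.1f} from their level in year 1")
```

In this toy world nobody improves or struggles at all, yet the “improvers” turn out to be the students reported below their level in the first year, and the “strugglers” those reported above it.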

While most students would receive a NAPLAN score close to where they should be, about 10%, or more than 100,000 students, receive a misleading message. This includes students who may have tried hard to improve, only to be randomly reported below their real level. It also includes students who are coasting, but are randomly reported as excelling. Both types of misleading messages affect student motivation. That these little tragedies are occurring in large numbers is likely to be undermining Australia’s international performance.

NAPLAN doesn’t assess curriculum; its tests only “broadly reflect aspects of literacy and numeracy within the curriculum in all jurisdictions” (ACARA, 2016). If teachers were to teach only ‘aspects of curriculum’, and provide student feedback in the haphazard fashion of NAPLAN, they would be ridiculed.

Teachers are being held accountable to dubious statistics. For example, the American Educational Research Association (2015) strongly cautions against the use of value-added models. Yet Australia reports student progress without reservation or qualification on the My School website (myschool.edu.au). This is not in the interest of students, teachers, or schools. In whose interest this reporting is occurring remains opaque.

Australia’s education measurement industry is plagued with vested interests. With over 300,000 Australian teachers, everybody wants a piece of the pie. Teacher training, teacher supply, and teacher development provide commercial opportunities. This feeding frenzy is a disgrace and should stop.