The numbers are people

Educational research has two cultures, but unlike the two in C.P. Snow’s famous talk, they do overlap and they do talk to each other. One is fiercely qualitative, concerned directly and immediately with the lived human experience of learners and teachers, in all its ethnographic complexity, subtlety, and sophistication. The other is determinedly quantitative, concerned with what can be counted and known in more regular, repeatable, transferable ways.

My sympathies lie with both, but my inclination is definitely towards the empirical, quantitative, and generalisable. But that mustn’t be at the cost of losing sight of the human perspective. With the rise of learning analytics, and more and more quantification of learning – of which I’m a small part – it’s easy to be sucked into watching the numbers and forgetting the people.

Two things have come to haunt me recently. Both are about predictive analytics: using all the data we have about learners, including previous learners’ success or failure, to predict the success (or otherwise) of learners who haven’t finished yet – or even signed up. The OU is doing a lot of work in this area at the moment. The ethical issues are complex and difficult. I think both the quantitative and the qualitative perspectives are needed to guide our policy development here.

Alex

The first haunting story is the case of individual predictions of likely failure. When you do a thoughtful presentation or paper about predictive models in learning analytics, you always raise the tricky issue of what to do about it. What is the ethical thing to do when your predictive algorithm says there’s very little chance that a would-be student will pass your course? Is it right to take their time, effort and money (or that of whoever is subsidising their place), when it will almost certainly come to very little? But on the other hand, is it right to block them from study? Given that one of the biggest predictors of failure is membership of systematically disadvantaged demographic groups, such a policy would almost certainly have a discriminatory impact. And at the Open University, of all places, we can’t go systematically shutting the door against people.

This question rightly gets a lot of air time. And I don’t think we have a good answer. The simplistic answer is to say: admit them, but give them lots of extra help. This is probably the right approach, but how much help should you give? Clearly not all resources should be poured into one small group of students. (Though I think a lot more should be poured into supporting systematically disadvantaged groups at university.) And what if they’re still highly likely to fail even given extra support?

One very bad answer to the question is to pretend we don’t know this: don’t crunch the numbers, and if you do, don’t let anyone who might be in a position to act on the information do anything about it. I think we need to be extremely careful about handling that sort of data – simply saying to a student “well, the computer says you’re going to fail” isn’t going to help anybody. But surely it’s not hard to argue the case in a university that deliberate ignorance is very rarely a sound policy.

If you’re talking MOOC completion, maybe if you are a certain sort of person you can not care about completion, and maybe you can say that people who didn’t finish the course maybe got something out of it. But when you’re talking university completion, and are the sort of person who deliberately chooses to work at the Open University, you can’t not care about it.

Anyway – so far so much abstract moral philosophy.

The thing that haunted me was talking to one of our ace statisticians about their predictive model, very much in its early stages. It struck me as impressively precise – way more so than examples I’ve seen in other universities. Although our statistician is, of course, brilliant, I’m sure they’d be modest enough to acknowledge that this is probably because we have so much more data than other universities, because we have so many more students. That you could predict completion rates on a course so precisely (and hopefully accurately) – before the course had even started – was eye-popping. It was even more staggering at individual level – where, of course, the model is less precise, but where the impact is easier to grasp at a human level.

I was looking down a page of example students, with the ID numbers obscured, with several of their characteristics listed, and the model’s predicted probabilities of completing and of passing the course on the right-hand side. One leapt out at me: a student who had about a 70% probability of completing the course – pretty much that course’s completion rate – but a 1% chance of passing.

Suddenly, the abstract was concrete. This was an actual student, about to start on one of our courses, who was pretty likely to struggle their way through their course … but was almost certain to fail. They were from a systematically disadvantaged group, they were short of money and so claiming financial assistance on the course, and were starting to study with us on a 2nd-year undergraduate level course with no prior educational qualifications whatsoever.

I wanted to de-anonymise them and move heaven and earth to urge them to shift to one of our Access modules. (Though if they were in England, funding changes make this less straightforward than it used to be.) But of course I couldn’t, and probably rightly. I don’t know their name, and probably never will. But in the tradition of qualitative research, I’m going to give them a pseudonym to refer to, rather than treating them as an abstract data point, and call them Alex.

Encountering Alex gave new power to what had previously been an abstract moral question. Alex was a real, individual person, with all their hopes and dreams. They were a real human being facing a grim situation. They were emphatically not having the easy, privileged life I enjoy. Probably, Alex desperately wanted to study this particular module for some very personal reason, which is why they’d rejected previous counsel to start studying with a different and easier module. It’s not hard to imagine that Alex hoped it might transform their life. It almost certainly won’t – at least, not in a good way.

I’ve had the enormous privilege of meeting some students who were very much like Alex at OU graduation ceremonies. (Although I’m pretty sure that almost all of them started with introductory-level modules.) Creating opportunities for people like Alex to overcome the odds is so much a part of what the OU stands for. But it’s hard. When does keeping the door to success open become giving unrealistic and misleading hope?

Bobby

The second haunting example is about the work that we pour into producing OU courses.

I was talking to the same ace statistician about the same model, but this time in the context of comparing courses to see what we could learn about how to make them better, so fewer students drop out or fail. I suddenly realised that I’d heard them say that ‘module’ was just one of the factors in their model, and not a terribly important one, but I hadn’t really grasped what that meant.

Sure, which module you’re studying has an influence on whether you’ll complete and pass, but that’s actually a much smaller factor than, for instance, whether you are on financial assistance or have no previous educational qualifications.

Almost all my OU-focused work is about trying to make modules better. The same is even more true of most of the other people I work with. Producing a good course is really hard. It takes huge amounts of effort and time. It’s what most of the academics at the OU spend most of their time doing. We’re working hard to try to speed it up and make it more efficient, but even so, it’s difficult and slow. And it’s not a terribly important factor in predicting student success.

Does that mean the model says all that work is wasted? All that effort poured into getting the best explanations, the best structure, the best illustrations, the best feedback and assessment – for so little? I regularly talk about interesting aspects of course production with a colleague I’ll call Bobby. At the moment, Bobby is deep in the middle of a particularly challenging piece of course writing. When I mentioned this nugget about the model to Bobby, they were aghast: was all this intellectual angst a waste of time?

No!

What it’s saying is that all that effort on module production means that almost all of our modules reach a similar, very high standard. If, for some reason, one of our modules hasn’t got it right, it’s pretty quickly identified as a problem (usually well before it gets to students) and huge efforts are poured in to fix it. What the model is telling Bobby is that it expects all that intellectual angst to result in a pretty good module. Bobby is (rightly) focused on all that’s not right about the course at the moment, and the huge task involved in making it right; the model says they’ll probably, eventually, succeed.

Inside and outside view

Part of what these two have in common, I think, is the difference between what Kahneman and Tversky call the ‘inside’ and the ‘outside’ view. To illustrate that, think of a project being proposed, and imagine that you’re interested in predicting whether it’ll be completed on time. One way to answer is the inside view, looking in detail at this particular project: the project plan, what’s involved, who’ll do it, what the challenges are, how those might be addressed, and so on. If we’re good at it, we’ll break it down into its component parts, give an estimate for each part with a margin of error, add it up, then add on another margin of error. And we still come up with over-optimistic forecasts. The other way is to take the outside view, comparing this project to others similar to it, asking how many of them completed on time, and using that as the estimate. Kahneman and Tversky are careful not to say that the outside view is always better or right, but it can give you valuable insight. Kahneman has an amusing personal anecdote that makes this point well.
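For the quantitatively inclined, here is a rough sketch of the two approaches in Python. Everything in it is invented for illustration: the function names, the 20% and 10% margins, the three-part project and the five past projects are all assumptions of mine, not anything taken from Kahneman and Tversky or from a real project.

```python
# Illustrative only: every number and name here is made up.

def inside_view_weeks(part_estimates, part_margin=0.2, overall_margin=0.1):
    """Inside view: pad each part's estimate, add them up, then pad the total."""
    padded_parts = [weeks * (1 + part_margin) for weeks in part_estimates]
    return sum(padded_parts) * (1 + overall_margin)


def outside_view_on_time_rate(similar_projects):
    """Outside view: the share of similar past projects that finished on time."""
    on_time = [p["actual_weeks"] <= p["planned_weeks"] for p in similar_projects]
    return sum(on_time) / len(on_time)


# Hypothetical project broken into three parts (estimates in weeks).
parts = [4, 6, 2]

# Hypothetical track record of five comparable projects.
past_projects = [
    {"planned_weeks": 12, "actual_weeks": 12},
    {"planned_weeks": 10, "actual_weeks": 15},
    {"planned_weeks": 14, "actual_weeks": 20},
    {"planned_weeks": 8, "actual_weeks": 11},
    {"planned_weeks": 16, "actual_weeks": 22},
]

forecast = inside_view_weeks(parts)
base_rate = outside_view_on_time_rate(past_projects)

print(f"Inside view: about {forecast:.1f} weeks")
print(f"Outside view: {base_rate:.0%} of similar projects finished on time")
```

With these made-up figures, the inside view produces a carefully padded forecast of just under 16 weeks, while the outside view simply notes that only one in five comparable projects finished on time. Neither number is ‘the answer’; the point is that they are answers to different questions.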

The outside view is what the predictive analytics are bringing here. It’s telling us that Alex’s endeavour is almost certain to fail, and that Bobby’s is almost certain to succeed, even though neither of them experiences things that way at the moment. However, the outside view is of necessity de-personalised. It doesn’t (directly) tell them what they need to do to succeed: that’ll come from the inside view.

We need both. Ben Goldacre, the campaigner on medical evidence, writes powerfully about how doctors have to learn to block out their intense personal feelings about patients’ suffering and death, in order to get the job of reducing it done. But:

With research, we have the opposite challenge: we need to force ourselves to give the numbers the emotional content they deserve. This isn’t just true for the problems set out in this book, it’s also true for teaching evidence-based medicine. Go back to page 15 and look at the forest plot for the Cochrane logo. [Explanation here.] That abstraction speaks to a horror: parents whose babies struggled to breathe and then died, with all the suffering and horror that entails, the bereavement, the scarred relationships, and the pain. Nobody who is taught that graph in medical school experiences one hundredth of the emotional impact from it that they do from telling one mother that her child is dead: yet that graph is a representation of exactly that suffering, on a grand scale.

He uses this to argue that using emotive terms like “suffering and death” is better than “morbidity and mortality”, because it helps to reduce complacency in improving evidence in medical practice.

As I’m fond of saying, education is not as immediately a matter of life and death as medicine is. But as we get more and more data, and make more and more use of it, we must not lose sight of the human impact of what we’re talking about. It’s an easy trap to fall into: ‘non-completion’ is not only more bloodless than ‘failure’ or ‘dropout’, it’s the more technically accurate term for what the statistics are measuring, so in academic writing it is the correct term to use. But in wider conversation, ‘dropout’ doesn’t come close to conveying the impact on most individuals. Think about Alex. Is their experience of university likely to be life enhancing? Does ‘non-completion’ capture what it meant to them?

And who is in a position to do something about it? I say it’s us. I say we failed Alex. The medics have had “first, do no harm” as a guiding principle for a very long time – we educationalists should surely aspire to at least that.

Lest my quantitative chums think I’ve gone soft, or my ethnographer colleagues think I’ve abandoned my touchingly naive empiricism, I emphatically do not mean that we should make less use of hard evidence. I’m saying we need more and better hard evidence (where we can get it), and we must use it better – precisely because the human cost of not doing so is so huge.

The numbers are people!

–
This work by Doug Clow is copyright but licensed under a Creative Commons BY Licence.
No further permission needed to reuse or remix (with attribution), but it’s nice to be notified if you do use it.