
When a researcher tells me their research topic is fascinating, I get a sinking feeling, like when someone announces they are going to tell me about the dream they had last night. Here’s something I’ve learnt in my time as a scholar: finding something fascinating isn’t a good guide to what actually is fascinating.

There are plenty of questions which I used to find fascinating. Topics which, for years, I would have said fascinated me, but which now I think are of little interest. That feeling that something was a deep question, that it promised – somehow – to help reveal the secrets of the universe, wasn’t to be trusted. With a bit more thought, or experience, my fascination with a topic turned out to be a dead end. Now I think those topics are not productive to research; they don’t promise to reveal anything. What appeared to be a mysterious contradiction was just a blunt fact; a universal symbol turned out to be a boring particular.

I’m not going to give you an example, because I don’t want to focus on a specific case, but on that feeling of fascination which drives our curiosity, which must in some form be the foundation of a research programme.

Personal fascination is a poor guide to a good research topic, but it has also been the guide for the research I’ve done of which I’m most proud, and which I think makes the most important contributions.

The trick is not to blindly trust your fascination, but to draw it out. Can you explain why something is so fascinating? Can you show the connections to wider topics? Can you show that suggested explanations are inadequate?

Without action all you have is a feeling, which has as much currency with other people as when you try and explain one of your dreams. You may feel deeply involved, but there’s no compelling reason for other people to be.

But I will mention one severe but useful private test – a touchstone of strong inference – that removes the necessity for third-person criticism, because it is a test that anyone can learn to carry with him for use as needed. It is our old friend the Baconian “exclusion,” but I call it “The Question.” Obviously it should be applied as much to one’s own thinking as to others’. It consists of asking in your own mind, on hearing any scientific explanation or theory put forward, “But sir, what experiment could disprove your hypothesis?”; or, on hearing a scientific experiment described, “But sir, what hypothesis does your experiment disprove?”

Poldrack and Poline’s new paper in TICS (2015) asserts pretty clearly that the field of neuroimaging is behind on open science. Data and analysis code are rarely shared, despite the clear need: studies are often underpowered, and there are multiple possible analytic paths.

They offer some guidelines for best practice around data sharing and re-analysis:

Recognise that researcher error is not fraud

Share analysis code, as well as data

Distinguish ‘Empirical irreproducibility’ (failure to replicate a finding on the original researchers’ own terms) from ‘interpretative irreproducibility’ (failure to endorse the original researchers’ conclusions based on a difference of, e.g., analytic method)

They also offer three useful best practice guidelines for any researchers who are thinking of blogging a reanalysis based on other researchers’ data (as Russ has done himself):

Contact the original authors before publishing to give them right of reply

Share your analysis code, along with your conclusions

Allow comments

And there are some useful comments about authorship rights for research based on open data. Providing the original data alone should not entitle you to authorship on subsequent papers (unless you have also contributed significant expertise to a re-analysis). Rather, it would be better if the researchers contributing data to an open repository publish a data paper which can be cited by anyone performing additional analyses.

Ed Yong has some excellent guidelines for scientists on giving comments to journalists, but I wanted to add a single piece of advice, one which will help whether you are talking to Ed or to less scrupulous journalists:

“Don’t be afraid to tell the journalist what the story is”

By this I mean you are allowed to not answer the question. This feels weird, since it violates conversational and academic rules, but the thing the journalist should be interested in is the real story. The questions just exist to get to that (which is why Ed says he often asks pretty vague questions). If you think the journalist is asking the wrong question, don’t answer it – tell them what the right question is.

If you restrict yourself to answering the wrong questions, the risk for everyone is that the (mistaken) framing stays in place, just with a few qualifications from you. For example, if the journalist is researching a study which says “fabulous brain training method boosts IQ” your comments that the study has flaws, or is a provisional result only, will lead to the headline “fabulous brain training method boosts IQ”. Or, if you are lucky, “fabulous brain training method might boost IQ”. And down in paragraph 4 will be some quote from you warning people not to get carried away.

Far better would be to give the journalist an alternative story, rather than some doubts. Tell them “no brain training method you can pay for works any better than free methods which are available to everyone”. Or “the brain is a machine which runs on blood, the best thing for your brain is physical exercise, not brain training”. This is news people can use. If you really disagree with a study, offering an alternative narrative is your best chance of that study being put in the correct context. “You don’t beat owt with nowt”, as they say.

This is what – I think – Ed is getting at when he says he wants the context from scientists, the “something interesting that I couldn’t have predicted”.

Further reading: George Lakoff “Don’t Think of an Elephant!: Know Your Values and Frame the Debate” (an actual book, so no hyperlink!)

There’s a hoo-har in psychology right now about replication. Spurred on by some high profile fraud cases, awareness of the structural biases surrounding publication and perennial rumblings about statistical malpractice, many are asking if the effects reported in the literature are real. There are some laudable projects aimed at improving best practice in science – journals of null results, pre-registration for experiments, the Center for Open Science (see previous link), but it occurs to me that all of this ignores an important bit of context. At the risk of stating the obvious: you need to build in support for replications only to the extent that these do not happen as part of normal practice.

Cumulative science inherently supports replication. For most of science, what counts as news is based on what has been done before – not just in an abstract theoretical sense, but in the sense that it relies on those results being true to make the experiments work. Since I’m a psychologist, and my greatest expertise is in my own work, I’ll give you an example from this recent paper. It’s a study of action learning, but we use a stimulus control technique from colour psychophysics (and by ‘we’, I really mean Martin, who did all the hard stuff). As part of preparing the experiment we replicated some results using stimuli of this type. Only because this work had been done (thanks Petroc!) could we design our experiment; and if this work didn’t replicate, we would have found out in the course of preparing for our study of action learning. Previously in my career I’ve had occasion to do direct replications, and I’ve almost always found the effect reported. I haven’t agreed with the interpretation of why the effect happens, or I’ve found that my beliefs about the effect from just reading the literature were wrong, but the effect has been there.

It is important that replication is possible, but I’ve been bemused that there has been such a noise about creating space for additional formal replications. It makes me wonder what people believe about psychology. If a field was one where news was made by collecting isolated interesting phenomena, then there would be more need for structures to support formal replication. Should I take the reverse lesson from this – the extent to which people call for structures to support formal replication is evidence of the lack of cumulative science in psychology?

So, previously on this blog (here, and here) I was playing around with the bootstrap as a way of testing if two samples are drawn from different underlying distributions, by simulating samples with known differences and throwing different tests at them. The problem was that I was using the wrong bootstrap test. Tim was kind enough to look at what I’d done and point out that I should have concatenated my two sets of numbers, then pulled two samples from that combined set, calculated the mean difference, and used that statistic to construct a probability distribution function against which I could compare my measured statistic (i.e. the difference of means) to perform a hypothesis test (viz. ‘what are the chances that I could have got this difference of means if the two distributions are not different?’). For people who prefer to think in code, the corrected bootstrap is at the end of this post.
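The procedure Tim describes can be sketched as follows (in Python rather than the MATLAB used elsewhere on this blog; the function name and defaults here are my own, not from the original post):

```python
import numpy as np

def bootstrap_mean_diff_test(s1, s2, n_boot=5000, alpha=0.05, rng=None):
    """Bootstrap test of whether two samples differ, using the difference
    of means as the test statistic and the pooled data as the null."""
    rng = np.random.default_rng() if rng is None else rng
    pooled = np.concatenate([s1, s2])
    observed = np.mean(s2) - np.mean(s1)
    # Under the null hypothesis both samples come from the same
    # distribution, so resample both from the pooled (concatenated)
    # data and record the mean difference each time
    null_diffs = np.empty(n_boot)
    for i in range(n_boot):
        b1 = rng.choice(pooled, size=len(s1), replace=True)
        b2 = rng.choice(pooled, size=len(s2), replace=True)
        null_diffs[i] = b2.mean() - b1.mean()
    # Two-sided p-value: how often does the null produce a difference
    # at least as extreme as the one observed?
    p = np.mean(np.abs(null_diffs) >= abs(observed))
    return p < alpha, p
```

Calling it with the two samples returns whether to reject the null at the chosen significance level, along with the bootstrap p-value.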

Using the correct bootstrap method, this is what you get:

So what you can see is that, basically, the bootstrap is little improvement over the t-test. Perhaps a marginal amount. As Cosma pointed out, the ex-gaussian / reaction time distributions I’m using look pretty normal at lower sample sizes, so it isn’t too surprising that the t-test is robust. Using the median rather than the mean damages the sensitivity of the bootstrap (contra my previous, erroneous, results). My intuition is that the mean, as a statistic, is influenced by the whole distribution in a way the median isn’t, so it is a better summary statistic (statisticians, you can tell me if this makes sense). The mean test is far more sensitive, but, as discussed previously, this is because it has an unacceptably high false alarm rate which is insufficiently penalised by d-prime.

Update: Cosma’s notes on the bootstrap are here and recommended if you want the fundamentals and are already degree-level comfortable with statistical theory.

Update: This post used an incorrect implementation of the bootstrap, so the conclusions don’t hold. See this correction

Mike suggested that I alter the variance of the underlying distributions. This makes total sense, since it matches what we are usually trying to do in psychological research – detect a small difference in a lot of noise. So I made the underlying distributions look a lot like reaction time distributions, with a 30ms difference between them. The code is

Where m is the sample size, and d is either 0 or 30. For a very large sample, the distributions look like this:

After a discussion with Jim I looked at the hit rate and false alarm rate separately. For the simple comparison of means, the false alarm rate stays around 0.5 (as you’d predict). For the other tests it drops to about 0.05. The simple comparison of means is so sensitive to a true difference, however, that the d-prime can still be superior to that of the other tests. Which suggests to me that d-prime is not a good summary statistic, rather than that we should do testing simply by comparing the sample means.

So I reran the procedure I described before, but with higher variance on the underlying samples.

The results are very similar. The bootstrap using the mean as the test statistic is worse than the t-test. The bootstrap using the median is clearly superior. This surprises me. I had been told that the bootstrap was superior for nonparametric distributions. In this case it seems as if using the mean as a test statistic eliminates the potential superiority of bootstrapping.

This is still a work in progress, so I will investigate further and may have to update this conclusion as the story evolves.

Update: This post used an incorrect implementation of the bootstrap, so the conclusions don’t hold. See this correction

This surprised me. I decided to try out bootstrapping as a method of testing if two sets of numbers are drawn from different distributions. I did this by generating sets of numbers of size m from two ex-gaussian distributions which are identical except for a fixed difference, d

s1=randn(1,m)+exp(randn(1,m));   % sample of size m: normal plus exp(normal) component
s2=randn(1,m)+exp(randn(1,m))+d; % same distribution, shifted by the fixed difference d

All code is MATLAB. Sorry about that.

Then, for each pair of numbers I apply a series of different tests for if the distributions are different.
1. Standard t-test (0.05 significance level)
2. Simple comparison of means: is mean(s1) < mean(s2)?
3. Bootstrapping using mean as the test statistic (0.05 significance level)
4. Bootstrapping using the median as the test statistic (0.05 significance level)

I do that 5000 times for each difference, d, and each sample size, m. Then I take the average answer from each test (where 1 is 'conclude the distributions are different' and 0 is 'don't conclude the distributions are different'). For the case where d > 0 this gives you a hit rate, the likelihood that the test will tell you there is a difference when there is a difference. For d = 0.5 you get a difference that most of the tests can detect the majority of the time as long as the sample is more than 50. For the case where d = 0, you can calculate the false alarm rate for each test (at each sample size).

From these you can calculate d-prime as a standard index of sensitivity and plot the result. Sttest, Smean, Sbootstrap and Sbootstrap2 are matrices which hold the likelihood of the four tests giving a positive answer for each sample size (columns) for two differences, 0 and 0.5 (the rows):
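To make the sensitivity calculation concrete: d-prime is the z-transform of the hit rate minus the z-transform of the false alarm rate. A minimal sketch (in Python rather than MATLAB; the function name is mine):

```python
from statistics import NormalDist

def dprime(hit_rate, fa_rate):
    """Sensitivity index d': z(hit rate) - z(false alarm rate),
    where z is the inverse of the standard normal CDF.
    Both rates must be strictly between 0 and 1."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)
```

In the simulation here, each test’s hit rate is its positive-answer rate when d = 0.5 and its false alarm rate is its positive-answer rate when d = 0, so d-prime rewards tests that say “different” when there is a difference and penalises those that say it when there isn’t.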

Previously I blogged about an experiment which used the time it takes people to make decisions to try and elucidate something about the underlying mechanisms of information processing (Stafford, Ingram & Gurney, 2011). This post is about the companion paper to that experiment, reporting some computational modelling inspired by the experiment (Stafford & Gurney, 2011).

The experiment contained a surprising result, or at least a result that I claim should surprise some decision theorists. We had asked people to make a simple judgement – to name out loud the ink colour of a word stimulus, the famous Stroop Task (Stroop, 1935). We found that two factors which affected the decision time had independent effects – the size of the effect of each factor was not affected by the other factor. (The factors were the strength of the colour, in terms of how pale vs deep it was, and how the word was related to the colour, matching it, contradicting it or being irrelevant.) This type of result is known as “additive factors” (because the factors add independently of each other; on a graph of results this looks like parallel lines).

There’s a long tradition in psychology of making an inference from this pattern of experimental results to saying something about the underlying information processing that must be going on. Known as the additive factors methodology (Donders, 1868–1869/1969; Sternberg, 1998), the logic is this: if we systematically vary two things about a decision and they have independent effects on response times, then the two things are operating on separate loci in the decision making architecture – thus proving that there are separate loci in the decision making architecture. Therefore, we can use experiments which measure only outcomes – the time it takes to respond – to ask questions about cognitive architecture; i.e. questions about how information is transformed and combined as it travels between input and output.

The problem with this approach is that it commits a logical fallacy. True separate information processing modules can produce additive factors in response data (A -> B), but that doesn’t mean that additive factors in response time data imply separate information processing modules (B -> A). My work involved taking a widely used model of information processing in the Stroop task (Cohen et al, 1990) and altering it so it contained discrete processing stages, or not. This allowed me to simulate response times in a situation where I knew the architecture for certain – because I’d built the information processing system. The result was surprising. Yes, a system of discrete stages could generate the pattern of data I’d observed experimentally and reported in Stafford, Ingram & Gurney (2011), but so could a single stage system in which all information was continuously processed in parallel, with no discrete information processing modules. Even stranger, both of these kinds of systems could be made to produce either additive or non-additive factors without changing their underlying architecture.

The conclusion is straightforward. Although inferring different processing stages (or ‘modules’) from additive factors in data is a venerable tradition in psychology, and one that remains popular (Sternberg, 2011), it is a mistake. As Henson (2011) points out, there’s too much non-linearity in cognitive processing, so that you need additional constraints if you want to make inferences about cognitive modules.

Thanks to Jon Simons for spotting the Sternberg and Henson papers, and so inadvertently prompting this bit of research blogging.

Henson, R. N. (2011). How to discover modules in mind and brain: The curse of nonlinearity, and blessing of neuroimaging. A comment on Sternberg (2011). Cognitive Neuropsychology, 28(3-4), 209-223. doi:10.1080/02643294.2011.561305

I’ve had a pair of papers published recently and I thought I’d have a go at putting simply what the research reported in them shows.

The first is called ‘Pieron’s Law holds during Stroop conflict: insights into the architecture of decision making‘. It reports a variation on the famous Stroop task. The Stroop task involves naming the ink colour of various words, words which can themselves be the name of colours. So you find yourself looking at the word GREEN in red ink and your job is to say “red”. If the word matches the ink colour people respond faster and more accurately; if the word doesn’t match, they are slower and less accurate. What we did was vary the strength of the colour component of the stimulus – e.g. we used more and less intense red ‘ink’ (actually we presented the stimuli on a computer screen, so the ink was pixel values). There’s a well established relationship between stimulus strength and responding – the ‘Pieron’s Law’ of the title – showing how response times decrease with increasing stimulus strength.
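For reference, Pieron’s Law is standardly written as:

```latex
RT = R_0 + k \, I^{-\beta}
```

where $I$ is the stimulus intensity, $R_0$ is the irreducible minimum response time, and $k$ and $\beta$ are positive constants fitted to the data. Because $\beta > 0$, response times fall towards the asymptote $R_0$ as intensity increases.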

So our experiment simply took two well-known psychological findings and combined them in a single experiment. The result is interesting because it can help us arbitrate between different theories of how decisions are made. One popular theory of decision making is that all the information relevant to the decision is optimally combined to produce the swiftest and most accurate response (Bogacz, 2007). There’s lots of support for this theory, including evidence from looking at responses of humans making simple judgements, recordings from the brain cells of monkeys and deep connections to statistical theory. It’s without doubt that the brain can and does integrate information optimally in some circumstances. What is interesting to me is that this optimal information integration perspective is completely at odds with the most successful research programme in post-war psychology: the heuristics and biases approach. This body of evidence suggests that human decision making is very non-optimal, with all sorts of systematic errors creeping into the way people combine information to make a decision. The explanation for these errors is that we process information using heuristics, mental shortcuts which give a good answer most of the time and cut down on the amount of effort we have to expend in deciding (“do what you did last time” is probably the most common decision heuristic).

My experiment connects to these ideas because it asked people to make a simple judgement (the colour of the ink), like the experiments supporting an optimal information integration perspective on decision making, but the judgement requested was just marginally more complex, because we manipulated both Stroop condition (whether the word and ink matched) and colour strength. If you are a straight-down-the-line optimal information decision theorist then you must believe that evidence about the decision based on the word is combined with evidence about the decision based on the colour to make a single ‘amount of evidence’ variable which drives the decision. In the paper I call this the ‘common metric’ hypothesis. The logic is a bit involved (see the paper), but a consequence of this hypothesis is that the size of the effect of the word condition should vary across the colour strength condition, and vice versa. In other words, you should see an interaction. Visually, the lines on the graph of results would be non-parallel.

Here’s what we found:

What you’re looking at is a graph of response times (the y-axis) for different colour strengths (the x-axis). The three lines are the three Stroop conditions: when the word matches the colour (‘congruent’), when it doesn’t match (‘conflict’) and when there is no word (‘control’). The result: there is no interaction between these two factors – the lines are parallel.

The implication is that you don’t need to move very far from simple perceptual decision making before human decision making starts to look non-optimal – or at least non optimal in the sense of combining information from different sources. This is important because of the widespread celebration of decision making as informationally optimal. Reconciling this research programme with the wider heuristics and biases approach is important work, and fits more generally with an honourable tradition in science of finding “boundary conditions” where one way the world works gives way to another way.

“Knowledge is potentially infinite. What we can attend to at a given moment is severely limited. So there’s always a question as to what will count as knowledge in a given context, and another about who will decide what counts. These questions ….are almost always properly political, that is they require a judgement about what is good, a judgement which the scientist is no more competent to render than any other citizen.”

I’ve been listening to the CBC series (2009) “How to Think about Science” (listen here, download here). The first episode starts with Simon Schaffer, co-author of The Leviathan and the Air-Pump. Schaffer argues that scientists are believed because they organise trust well, rather than because they organise skepticism well (which is more in line with the conventional image of science). Far from questioning everything, as we are told science teaches, scientists are successful as experts because of the extended network of organisations, techniques and individuals that allows scientists, in short, to know who to trust.

Schaffer also highlights the social context of the birth of science, focussing on the need for expertise — for something to place trust in — at a time of military, political and ideological conflict. Obviously, our need for certainty is as great in current times.

Understanding of the processes of science, Schaffer asserts, is required for a true understanding of the products of science, and public understanding of both is required for an informed and empowered citizenry.

This last point puts the debate about open access scientific journals in a larger and more urgent perspective. In this view, open access is far more than a merely internal matter to academia, or even merely a simple ethical question (the public fund scientists, so the publications of scientists should be free to the public). Rather, open access is foundational to the institution of trusted knowledge that science (and academia more widely) purports to be. The success of early science was in establishing the credibility of mechanisms for ‘remote witnessing’ of phenomena. The closed-access publishing system threatens to undermine the credibility of scientific knowledge. Once you recognise that scientific knowledge is not the inviolable product of angelic virtue on the part of science, you concede that the truth of scientific propositions is not enough — we need to take seriously the institutions of trust that allow science to be believed. The status of expert who cannot be questioned is a flattering one, but it relies on short-term cachet. If we care about science and the value of scholarship more widely then open access publishing is an urgent priority.

David Eagleman has an article in The Atlantic The Brain on Trial, in which he ‘describes how the foundations of our criminal-justice system are beginning to crumble, and proposes a new way forward for law and order.’

The ever more successful endeavours of neuroscience to link behaviour to biology, claims Eagleman, mean that we will have to acknowledge that the ‘simplistic’ categorisation of individuals into responsible and not-responsible for their actions is untenable. Instead we should admit that culpability is graded and refocus our legal system on rehabilitation and the prevention of recidivism.

In fact, rehabilitation has long been admitted as a core purpose of the justice system, though of course that’s no reason to complain about someone reiterating its importance (and obviously the call for a refocussing on rehabilitation makes most sense in a culture addicted to incarceration). What is harmful is the implication that you need neuroscience to be able to realise that circumstances and history make some people more able to make responsible choices. Neuroscience just expands our idea of what counts as ‘circumstances’, to include aspects of the internal environment – ie our biology.

However, according to Eagleman, a brave new world of evidence-based justice awaits:

As brain science improves, we will better understand that people exist along continua of capabilities, rather than in simplistic categories. And we will be better able to tailor sentencing and rehabilitation for the individual, rather than maintain the pretence that all brains respond identically to complex challenges and that all people therefore deserve the same punishments.

This is profoundly misleading, giving the impression that the justice system gives the same punishments for the same crimes (which it doesn’t) and that it was only neuroscientific ignorance that forced legal philosophers to create the category of ‘legally responsible’.

Another view is that the simple idea of legal responsibility was adopted as a deliberate choice, a choice we make hand in hand with that of equality before the law. We do this because, just as the alternative to legal equality is odious, so the alternative to equality of responsibility is pernicious. The criminal justice system already de facto admits gradations of responsibility; how exactly does Eagleman imagine that it could be improved by formalising a graded notion of responsibility? Far from crumbling, as Eagleman claims, the criminal justice system is already a compromise between the need to view people as responsible and the recognition that not all choices are equally free. The revolution heralded by Eagleman’s barrage of rhetorical questions and attacks on strawmen is a damp squib. If the neurosciences are going to make a genuine contribution to issues like this, the onus must be on us to engage with existing thought on complicated matters like criminal justice and provide detailed evidence of how neuroscience can inform these existing systems, rather than pretending that new findings in the lab can sweep away thousands of years of cultural and philosophical endeavour.

This is a plot of the number of citations turned up by a simple “Web of Knowledge” search for papers containing the words “dopamine” and “reinforcement learning”, against year of publication. The rise, dating from approximately the time of publication of the first computational theory of phasic dopamine function, is rapid. There are, as far as I know, two computational theories of phasic dopamine function. One from Schultz, Dayan and Montague (1997) and one from our team here in Sheffield (Redgrave and Gurney, 2006)

I’ve been invited to give a talk at the York Centre for Complex Systems Analysis. I’ll be speaking on the 13th of May, a Friday, to the title “Inferring cognitive architectures from high-resolution behavioural data”. It’ll be an overview of what it is exactly that I try to do as part of my work.

Abstract: I will give an overview of some of the work done in our lab, the Adaptive Behaviour Research Group (http://www.abrg.group.shef.ac.uk/ ) in the Department of Psychology, University of Sheffield. Across human, non-human animal, simulation and robotics platforms we investigate the neural circuits that allow intelligent behaviour, bringing to bear psychological, neuroscientific and computational perspectives. We are particularly interested in the action selection problem – that of deciding what to do next (and of doing it). This talk will focus on my own work looking at 3 paradigms where we have collected high-resolution behavioural data in humans – mistakes made by expert touch typists, eye-movements during visual search and a novel paradigm for investigating the learning of new motor skills. I will make comments on how we analyse such data in order to make inferences about the underlying architecture of human decision making.

I want to make a radio documentary about how science really works. The popular imagination has been captured by a model of science which is incomplete and unhelpful. Science doesn’t produce neutral facts; it is a process whose very nature is contested within the institution of science as well as from outside. Science is a complex social process, and may not even be a single unified thing.

This documentary I’m imagining would start in a University bar on a Friday night, where we could hear some scientists talk about work in the lab in the way scientists all over the world do, not in the language of journal papers, grant applications and popular TV features, but as the work which they know intimately, with its set-backs, rivalries and esoteric rewards. We’d then visit a few important thinkers to get some vital alternative perspectives on how science works:

Steve Fuller from Warwick could tell us about the social construction of knowledge, about how science rewrites the history of discoveries to present an ideal of its process as logical and inevitable when in fact it is accidental and contingent. Someone could outline Feyerabend’s “Against Method” and we could see some scientists get irate at his deconstruction of the sacred cows of the naive, traditional model of how science works (which, in my experience, is what tends to happen when you throw Feyerabend at them).

Terence Kealey, VC of Buckingham University and author of “Sex, science and profits” will explode the myth that publicly funded research is good for the economy and outline his idea that “there’s no such thing as science, just scientists”.

Ben Goldacre will take us into the murky world of pharmaceutical research and show us the ways industry funding can distort “pure science”.

Finally, we tackle science and politics, talking to the climate researchers at the centre of the “Climate Gate” email scandal and showing how the mistaken ideal of “science as objective” gets in the way of a proper understanding of the role of science in political debate. (Basically, my argument is that an overly idealised model of science leaves open the rhetorical space for an unhelpful cultural relativism, whereby critical theorists can claim that science is just a social construction and the political fringes feel they can contest scientific consensus with GCSE biology and the will to believe.) We’ll talk to Jim Manzi who will outline his idea of causal density, showing why applying the scientific method to problems of society will not be as straightforward as the cheerleaders of scientific rationalism assume.

Now, who would like to make this documentary with me?

(NB I have not sought the involvement/permission of the people named in this post!)

Like other amateurs, Koestler finds it difficult to understand why scientists seem so often to shirk the study of really fundamental or challenging problems. With Robert Graves he regrets the absence of ‘intense research’ upon variations in the – ah – ’emotive potentials of the sense modalities’. He wonders why ‘the genetics of behaviour’ should still be ‘uncharted territory’ and asks whether this may not be because the framework of Neo-Darwinism is too rickety to support an inquiry. The real reason is so much simpler: the problem is very, very difficult. Goodness knows how it is to be got at. It may be outflanked or it may yield to attrition, but probably not to a direct assault. No scientist is admired for failing in the attempt to solve problems that lie beyond his competence. The most he can hope for is the kindly contempt earned by the Utopian politician. If politics is the art of the possible, research is surely the art of the soluble. Both are immensely practical-minded affairs.
Although much of Koestler’s book has to do with explanation, he seems to pay little attention to the narrowly scientific usages of the concept. Some of the ‘explanations’ he quotes with approval are simply analgesic pills which dull the aches of incomprehension without going into their causes. The kind of explanation the scientist spends most of his time thinking up and testing – the hypothesis which enfolds the matters to be explained among its logical consequences – gets little attention.

Peter Medawar, from a review of Arthur Koestler’s “The Act of Creation” (New Statesman, 19 June 1964) and republished in ‘The Art of the Soluble’ (1967)

Viewed as a language, theory has no substantive content; it is a set of tautologies. Its function is to serve as a filing system for organizing empirical material and facilitating our understanding of it; and the criteria by which it is to be judged are appropriate to a filing system. Are the categories clearly and precisely defined? Are they exhaustive? Do we know where to file each individual item, or is there considerable ambiguity? Is the system of headings and subheadings so designed that we can quickly find an item we want, or must we hunt from place to place? Are the items we shall want to consider jointly filed? Does the filing system avoid elaborate cross-references?

Simon Singh approached his debate with homeopathy-promoting MP David Tredinnick all wrong this morning. He dived into a critique of the studies Tredinnick presented, thus allowing him to maintain the advantage of framing the debate and losing most of the audience with discussion of statistics and control groups [1].

Instead, he should have laughed at the MP and said gently something like “It is undoubtedly true that homeopathy does work, the only question is about why it works. All the evidence suggests that the effect is due to a combination of the power of individuals’ beliefs about homeopathy and the healing benefits of a meaningful relationship with a physician. For every one study that says, like David Tredinnick’s three, that homeopathy has benefits beyond those of placebo there are fifty which suggest that homeopathy medicines are inert and all the properties ascribed to them are properties of belief and relationships. Because of this, we need to ask if we want to allow a misguided homeopathy industry to charge us for medicines which we know to be snake oil, and whether there is not some less expensive and less deceitful way we can access the powerful healing effects that placebos such as homeopathy provide.”

On that last point, I’ve had an idea. Homeopathy is fake medicine, and obviously this has lots of benefits. All the power of placebos! Minimal risk and side-effects! Safe to use in combination with conventional medicine! The only downside I can see is that only patients you allow to remain misinformed can benefit, and that the homeopathy industry has all this rigmarole involved in the preparation and delivery of the product that necessarily makes it expensive. So why not sell fake homeopathic medicine? I don’t see how homeopaths could object if the medical establishment turns their strategy back on them. We could even use their experimental methods to replicate the successful results they’ve found with homeopathic treatment for our fake-homeopathic treatment. Just as you can buy generic pharmaceuticals which have the same chemical composition as branded ones at a fraction of the price, why can’t we buy homeopathy generics which are equally inert? Doctors could be free to prescribe them, saving the NHS money and simultaneously allowing patients access to all the wonderful benefits of placebo.

[1] Not that discussion of statistics and control groups is a bad thing, or a guaranteed way to lose your audience. I just think Singh lost his because of the way he discussed statistics and control groups, and because it wasn’t essential to the wider issues.

The lesson I draw … is that a uniform ‘scientific view of the world’ may be useful for people doing science – it gives them motivation without tying them down. It is like a flag. Though presenting a single pattern it makes people do many different things. However, it is a disaster for outsiders (philosophers, fly-by-night mystics, prophets of a New Age, the “educated public”). It suggests to them the most narrowminded religious commitment and encourages a similar narrowmindedness on their part

Paul Feyerabend, in ‘Against Method’ (third edition, chapter 19). The phrase ‘the “educated public”’ is included in the list in his ‘Conquest of Abundance’, in which this section is repeated with a few changes.

The Emotional Cartography book launch was on Friday and went off a treat. Since I had, unusually for me, planned my talk by writing it out in full I have reproduced it below. This is more-or-less what I said:

There is a saying that those who want to enjoy laws and sausages should not find out how they are made. I think the same is true about facts. Or rather I think that anyone who wants to believe in simple honest facts, objective lumps of knowledge which are true and eternal, ought to stay away from the places where facts are produced. When you see how facts are made you ought to gain, I believe, a healthy scepticism about how they are used.

I am a scientist — an experimental psychologist — and I work in a University. In the University, in the Faculty of Science, we like to think we are the factory of facts. Yet it still surprises me that some of my colleagues believe in simple honest facts, even after years in the workrooms squeezing the meat of messy reality into the offal tubes of truth. Many times I’ve been faced down by these colleagues who refuse to believe that some complex social or political dilemma is really problematic. “Just find out the facts” they say. “When we know the facts, what to do — about schools, Israel-Palestine, whatever — will become obvious.”

Curious that anyone who has seen facts being made can still believe that on their own they’ll help!

I’m reminded of another quote, this one by Matt Cartmill, Professor of Biology at Duke University. He said,

“As an adolescent I aspired to lasting fame, I craved factual certainty, and I thirsted for a meaningful vision of human life – so I became a scientist. This is like becoming an archbishop so that you can meet girls.”

Now don’t get me wrong, facts do exist. We look to the stars and ask if things can be known — can things be known? And things can be known. There is right and wrong, true and untrue. Facts, in this sense, do exist. But they aren’t enough for a balanced intellectual diet.

I think facts are seductive because they take a lot of technical skill to produce. If you want to make even a basic truth which will hang together long enough to survive being passed around, you need a lot of disciplinary training. You need expensive and complex measuring equipment, you need esoteric statistical techniques and you need to make the right comparisons. All this takes time and money and a lot of discipline-specific experience. No wonder scientists are proud of their facts, and the facts themselves invoke some envy and respect.

The problem is that facts always — always! — come with a set of presumptions, they always come along with a view of the world that they promote.

If you’ve read Thomas Pynchon’s ‘Gravity’s Rainbow’, his sprawling riotous novel of wartime paranoia you might know one of his Proverbs for Paranoids: “If you can get them asking the wrong questions, you don’t have to worry about the answers”

To stop this getting too abstract, and to bring it back to the book that we’re gathered here to launch, let’s talk about maps.

Everyone interested in maps should read Denis Wood’s “The Power of Maps”. I believe this so strongly I have forced myself to only say this once, so that was it.

I read this book while I was working on a thing called the Sheffield Greenmap. Greenmaps are community mapping projects designed to mark environmental and community sites in a local area. Denis Wood talks about the selective accuracy of maps, how they show one part of the world, and can seduce you into thinking (because of the professionalism of their accuracy) that their representation is the way to view the world. But, he said, “Accuracy is not the issue. Selectivity is the issue”. Perhaps because I was involved with the Sheffield Greenmap project these words of his resonated with me very strongly. Maps are choices about how to view the world. When I looked at maps of my hometown, maps which I would happily point to and say “this is Sheffield”, I saw the one-way roads marked, the petrol stations. The base map I used for the greenmap of the area around the University was the University’s own map. Running up the hill from the University in Sheffield is a road with parks on either side of it. When I came to look at the University map, I noticed, for the first time, that only one of the parks was marked. The road, you see, was also a socio-demographic division between the leafy suburb of the university and the estates next door. For the University, the park next to the estates didn’t exist (or at least not as a place for students and University visitors to go).

The greenmap project made me look at the maps of Sheffield and ask why what was on them was on them. Why the roads and the petrol stations, why not the scenic routes through the parks, the community cafes and the places you could lock your bike up?

Accuracy wasn’t the issue. Selectivity was.

Wood discusses the ‘general purpose’ map as a particularly insidious example of interests using maps to mask their interests. The generality, the lack of explicit purpose, in a map disguises that it represents the end of a careful and directed process of selection. Like the scientific facts, the beauty of the technical process can blind you to the bias inherent in the construction.

I was invited to contribute to the Emotional Cartography because of Mind Hacks, a book I wrote with Matt Webb and a few valiant contributors in 2004. Mind Hacks is a collection of 100 do-it-yourself experiments that you can try at home and which reveal something about the moment-to-moment workings of your mind and brain. Our ambition with the book was to perform a smash and grab on the goodies of cognitive neuroscience, to open source some of the fascinating science that has been done, turning it into demonstrations which anyone could try. We wanted to make some of what was known about the mind available to be re-purposed by other people. So designers, artists, educators and whoever could notice and reuse various principles of how our minds make sense of the world.

Lately I’ve come to think — and this was inspired by writing the chapter in Emotional Cartography — that the view of the mind we took in Mind Hacks was limited. And this limitation reflects that of academic psychology as a whole. We focussed on the perceptions, thoughts and feelings of isolated individuals, rather than of people in their full social context, in interaction with others.

This is why I’m excited about Emotional Cartography. It takes the idea of open-sourcing the production and consumption of facts to the social level.

First, emotional cartography. We’re a visuo-spatial species. We love sights, spaces, exploring with our eyes. We reify this prioritisation into maps, which are themselves inherently visuo-spatial. If you believe that maps are a kind of technology for thinking with, which I do (my chapter in the book is about this), then this in turn makes it easier to think about the kinds of things which are easiest to show on a map. The maps of physical space then make this selection bias invisible, by pretending to be natural.

Emotional Cartography makes another kind of information mappable, and this opens up the space to think with and debate about what that mapping makes explicit. For example: why are people anxious or excited in this place? Is that something we should do something about?

The other reason I’m excited about emotional cartography is because, truly, it opens up a space for emotional cartographies — a refocussing on the process of mapping and remapping. By open-sourcing mapping it allows mapping to be a process rather than a product and this powerfully opens up space for people to take part in the negotiation of the representation of their own geographical spaces. Rather than one true map of a locale, there are many maps, and these maps can be a medium for the mappers to meet and discuss their feelings, the places where they live, and the interrelation of these two things.

So let me finish by congratulating Christian on his idea and all his hard work which resulted in this book, let me congratulate the other authors for their contributions and let me commend to you the practice of emotional cartography because, as should be obvious by now, in all areas of life, including map making, I believe it is far more satisfying to be a participant than a mere consumer.

The rust inside this kettle shows an emergent pattern that is typical of the self-organising dynamics of reaction-diffusion systems.

One example of self-organising dynamics is in the topographic map of ocular dominance columns in the visual cortex. These intricate maps display a fascinating combination and interplay of regularity and irregularity. Such patterns have been modelled by computational neuroscientists using the Kohonen algorithm and variants.
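For readers who want to see order emerge from purely local updates, here is a minimal sketch of a one-dimensional Kohonen map. All parameters are illustrative choices of my own, not taken from any particular model of ocular dominance: units with random initial preferences self-organise into a topographically ordered covering of the input space.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_som(n_units=20, n_steps=2000, lr=0.2, sigma0=4.0, sigma1=0.5):
    """Train a 1-D Kohonen self-organising map on inputs from [0, 1).

    Each unit holds a weight (its preferred input). On every step the
    winning unit and, to a lesser degree, its map neighbours move
    towards the input, so order emerges from random initial weights.
    """
    weights = rng.random(n_units)      # random initial preferences
    positions = np.arange(n_units)     # fixed positions on the map
    for t in range(n_steps):
        x = rng.random()               # random input sample
        winner = np.argmin(np.abs(weights - x))
        # neighbourhood width shrinks over training (coarse -> fine)
        sigma = sigma0 * (sigma1 / sigma0) ** (t / n_steps)
        h = np.exp(-((positions - winner) ** 2) / (2 * sigma**2))
        weights += lr * h * (x - weights)
    return weights

w = train_som()
# how topographically ordered did the map become? (1.0 = perfectly)
ordered = max(np.mean(np.diff(w) > 0), np.mean(np.diff(w) < 0))
print(ordered)
```

Typically almost every neighbouring pair of units ends up in order, which is the regularity; the residual kinks, which depend on the random start, are the irregularity.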

Simplified models of artificial situations can be offered for either of two purposes. One is ambitious: these are “basic models” – first approximations that can be elaborated to simulate with higher fidelity the real situations we want to examine. The second is modest: whether or not these models constitute a “starting set” on which better approximations can be built, they illustrate the kind of analysis that is needed, some of the phenomena to be anticipated, and some of the questions worth asking.

The second, more modest, accomplishment is my only aim in the preceding demonstrations. The models were selected for easy description, easy visualization, and easy algebraic treatment. But even these artificial models invite elaboration. In the closed model [of self-sorting of a fixed population across two sub-groups (‘rooms’) according to individuals’ preferences for a group mean age closest to their own], for example, we could invoke a new variable, perhaps “density”, and get a new division between the two rooms at a point where the greater attractiveness of the age level is balanced by the greater crowding. To do this requires interpreting “room” concretely rather than abstractly, with some physical dimension of some facility in short supply. (A child may prefer to be on the baseball squad which has older children, but not if he gets to play less frequently; a person may prefer to travel with an older group, but not if it reduces his chances of a window seat; a person may prefer the older discussion group, but not if it means a more crowded room, more noise, fewer turns at talking, and less chance of being elected chairman.) As we add dimensions to the model, and the model becomes more particular, we can be less confident that our model is of something we shall ever want to examine. And after a certain amount of heuristic experiments with building blocks, it becomes more productive to identify the actual characteristics of the phenomena we want to study, rather than to explore general properties of self-sorting on a continuous variable. Nursing homes, tennis clubs, bridge tournaments, social groupings, law firms, apartment buildings, undergraduate colleges, and dancing classes may display a number of similar phenomena in their membership; and there may be a number of respects in which age, I.Q., walking speed, driving speed, income, seniority, body size, and social distinction motivate similar behaviours.
But the success of analysis eventually depends as much on identifying what is peculiar to one of them as on the insight developed by studying what is common to them.
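The closed two-room model described above is simple enough to simulate directly. The sketch below is a toy implementation under assumptions of my own (uniform random ages, random initial assignment, and a pure "closest mean age" preference): each person repeatedly moves to whichever room has a mean age nearest their own, until nobody wants to move.

```python
import numpy as np

rng = np.random.default_rng(1)

def sort_two_rooms(ages, n_sweeps=100):
    """Closed self-sorting model: each person moves to whichever of two
    rooms has a mean age (excluding themselves) closest to their own,
    until nobody wants to move or the sweep limit is reached."""
    room = rng.integers(0, 2, size=len(ages))   # random initial assignment
    for _ in range(n_sweeps):
        moved = False
        for i in rng.permutation(len(ages)):
            best, best_gap = room[i], np.inf
            for r in (0, 1):
                members = (room == r)
                members[i] = False              # judge the room without yourself
                if members.any():
                    gap = abs(ages[members].mean() - ages[i])
                    if gap < best_gap:
                        best, best_gap = r, gap
            if best != room[i]:
                room[i], moved = best, True
        if not moved:                           # equilibrium reached
            break
    return room

ages = rng.uniform(10, 70, size=100)
room = sort_two_rooms(ages)
print(sorted(round(ages[room == r].mean(), 1) for r in (0, 1) if (room == r).any()))
```

With these arbitrary parameters the rooms typically settle around a boundary age, one younger and one older, which is Schelling’s division point, though the exact split depends on the random start.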

There are only two species that have language — humans and honeybees. Other animals communicate, but it’s only us two that have language. And language means grammar: some abstract structure which conveys meaning according to the arrangement of symbols in that structure.

Our language is vastly more sophisticated than the honeybees’. Their language is something called a waggle dance, which conveys information about a food source between an individual who has returned to the hive from foraging and her fellow workers. She performs her waggle dance, and the length and orientation of different parts of the dance indicate the quality of the food source and the direction in relation to the current position of the sun. The dance has structure, and that structure conveys the meaning of the components within it — it’s a language, a primitive language, but still the only thing that looks close to ours in the animal kingdom.

Why is that? Why is the only other grammar not found in a fellow primate, nor even a fellow mammal but found in an insect?

Here’s my theory — language is a system with unparalleled power to communicate information. But this means that it also has unparalleled ability to deceive, deception being one of the basic properties of communication systems. If you can use language to convey a very specific message, you can also use it to make very specific deceptions, for example tricking someone into believing the food is in one direction while you go and enjoy it in another. Because of the unprecedented capacity of language to deceive, it exists at the top of a steep evolutionary mountain. Any species which is evolving language must have some protection against the threat of deception; otherwise the only defense is to ignore language-based communications altogether (in which case you don’t get any benefit, and so language-evolution never gets off the ground).

The human defense against deception-using-language is based on our other cognitive abilities — the ability to reason about who to trust and when to trust them. Co-evolving these capacities with language is one strategy which allows the evolution of language. The honeybees have used another, circumventing the threat of deception by making deception evolutionarily pointless: honeybees in a hive are all genetically identical, so although language inherently contains the capacity to deceive, in honeybees there is no reason why deception itself would evolve to diminish the benefits of communicating through language.

Update: I was wrong about honey bees being genetically identical, but I don’t think it demolishes the argument (quoting N.): “Honey bees are not genetically identical. Worker sisters share 3/4 of their genes with each other, but would only share 1/2 their genes with any offspring they might produce, so they can better propagate their genes by helping the queen produce more sisters.” Wikipedia link here

Following on from my earlier post about the way psychologists look at the world, let me tell you a story which I think illustrates very well the tendency academic psychologists have for reductionism. It’s a story about a recent paper on the phenomenon of cognitive dissonance, and about a discussion of that paper by a group of psychologists that I was lucky enough to be part of.

Cognitive Dissonance is a term which describes an uncomfortable feeling we experience when our actions and beliefs are contradictory. For example, we might believe that we are environmentally conscious and responsible citizens, but might take the action of flying to Spain for the weekend. Our beliefs about ourselves seem to be in contradiction with our actions. Leon Festinger, who proposed dissonance theory, suggested that in situations like this we are motivated to reduce dissonance by adjusting our beliefs to be in line with our actions.

Obviously after-the-event it is a little too late to adjust our actions, so our beliefs are the only remaining point of movement. In the flying to Spain example you might be motivated by cognitive dissonance to change what you believe about flying: maybe you come to believe that flying isn’t actually that bad for the environment, or that focussing on personal choices isn’t the best way to understand environmental problems, or you could even go all the way and decide that you’re not an environmentally responsible person.

The classic experiment of dissonance theory involved recruiting male students to take part in a crushingly boring experiment. The boring part was an hour of two trivial actions — loading spools into a tray, turning pegs a quarter-turn in a peg-board. At the end of this, after the students thought the experiment was over, came the interesting part for us. The students were offered either $1 or $20 to tell the next participant in the experiment (actually the female accomplice of the experimenter) that the experiment she was about to do was really enjoyable. After telling this lie, the participants were then interviewed about how enjoyable they really found the experiment. What would you expect from this procedure? Now one view would predict that the students paid $20 would enjoy the experiment more. This is certainly what behaviourist psychology would predict — a larger reward should produce a bigger effect (with the effect being a shift from remembering the task as boring, which it was, to remembering it as enjoyable, which getting $20 presumably was). But cognitive dissonance theory suggests that the opposite would happen. Those paid $20 would have no need to change their beliefs about the task. They lied about how enjoyable the task was to the accomplice, something which presumably contradicted their beliefs about themselves as nice and trustworthy people, but they did it for a good reason, the $20. Now consider the group paid only $1. They lied about how enjoyable the task was, but looking around for a reason they cannot find one — what kind of person would lie to an innocent for only $1? So, the theory goes, they would experience dissonance between their actions and their beliefs and reduce this by adjusting their beliefs: they would come to believe that they actually did enjoy the boring task, and this is the reason that they told the accomplice that it was enjoyable. And, in fact, this is what happened.

At this point I want you to notice two things about cognitive dissonance. Firstly, it requires the existence of quite sophisticated mental machinery to operate. Not only do you need to have abstract beliefs about the world and yourself, you need to have some mechanism which detects when these beliefs are in contradiction with each other or with your actions, and which can (unconsciously) selectively adjust beliefs to reduce this contradiction. The second thing to notice is that all this sophisticated mental machinery is postulated to exist from changes in behaviour; it is never directly measured. We don’t have any evidence that the change in attitudes really does result from an uncomfortable internal state (‘dissonance’) or that any such dissonance results from an unconscious perception of the contradiction between beliefs and actions.

So, to the recent paper and to reductionism. The paper, by Louisa Egan and colleagues at Yale [ref below] is titled ‘The Origins of Cognitive Dissonance‘, and represents one kind of reductive strategy that psychologists might employ when considering a theory like cognitive dissonance. The experiments in the paper (summarised here and here) both involved demonstrating cognitive dissonance in two groups which do not have the sophisticated mental machinery normally considered necessary for cognitive dissonance — four year-old children, and monkeys. The reductionism of the paper, which the authors are quite explicit about, is to show that something like cognitive dissonance can occur in these two groups despite their lack of elaborate explicit beliefs. Unlike the students in Festinger’s classic experiment we can’t suppose that the children or the monkeys have thoughts about their thoughts in the same way that dissonance theory suggests.

To demonstrate this the authors employed an experimental method that could be used with subjects who did not have language, but would still allow them to observe the core phenomenon of dissonance theory — the adjusting of attitudes in line with previous actions. The method worked like this. For each participant — be they a child or a monkey — the experimenters identified three items (stickers for the children, coloured M&M’s for the monkeys) which the participant preferred equally. In other words, if we call the three items A, B and C then the child or monkey liked all of the items the same amount. Then the experimenter forced the participating child or monkey to choose between two of the items (let’s say A and B), so that they only got one. Next the child or monkey was offered a choice between item C and the item they did not choose before. So, if the first choice was between A and B and the participant chose A, then the next choice would be between B and C. What does dissonance theory predict for this kind of situation? Well, originally the three items are equally preferred — that’s how the items are selected. After someone is forced to make a first choice, between A and B, cognitive dissonance supposedly comes into play. The participant now has a reason to adjust their attitudes, and the way they do this is to downgrade their evaluation of the unchosen item. This works something like being happy with what you got: “I must not like B as much, because I chose A”. So on the second choice (B vs C) the participants are more likely to choose C (more likely than chance, and more likely than a control group that goes straight to the ‘second’ choice). This prediction is exactly what the experimenters found, in both children and monkeys, and the startling thing is that this occurred despite the fact that we know that neither group was explicitly talking to themselves in the way I outline the dissonance theory prediction above (“I must not prefer B as much…etc”).
Obviously something like cognitive dissonance can be produced by far simpler mental machinery than that usually invoked to explain it, conclude the experimenters. In this way, the paper is a call to reduce the level at which we try and explain cognitive dissonance.

How far should you go when trying to reduce the level of theory-complexity that is needed to explain something? Psychologists know the answer to this immediately — as far as possible! So when our happy band of psychologists got to discussing the Egan paper it wasn’t long before someone came up with a new suggestion, a further reduction.

What if, it was suggested, there was nothing like dissonance going on in the Egan et al experiments? After all, there was no direct measurement of anxiety or discomfort, so why suppose that dissonance occurred at all — perhaps, if we can come up with a plausible alternative, we can do away with dissonance altogether. Imagine this, see if you find it plausible: all of us, including monkeys and children, possess a very simple cognitive mechanism which saves us energy by remembering our choices and, when similar situations arise, applying our old choices to new situations, thus cutting down on decision time. That sounds plausible, and it would explain the Egan et al results if you accept that the result of the first, A vs B, decision is not just “choosing A” but is also “not choosing B”. So, when you get to the second choice, B vs C, you are more likely to choose C because you are simply re-applying the previous decision of “not choosing B”, rather than performing some complicated re-evaluation of your previously held attitudes à la cognitive dissonance theory.

At this point in the discussion the psychologists in the room were feeling pretty pleased with themselves — we’d started out with cognitive dissonance, reduced the level of complexity of mental processes required to explain the phenomenon (the Egan et al result) and then we’d taken things one step further and reduced the complexity of the phenomenon itself. Next, we had a discussion of how widely the ‘decisional inertia’ reinterpretation could be applied to supposed cognitive dissonance phenomena. Obviously we’d have only been really satisfied with the reinterpretation if it applied more widely than just to this one set of experiments under consideration.

But further treats were in store. What if we could reduce things again, what if we could make the processes involved even simpler? We’d already started to convince ourselves that the experimental results could be produced by simple cognitive processes rather than complex cognitive processes; perhaps we could come up with a theory about how the experimental results can be produced without any cognitive processes at all! Now that would be really reductive.

Here’s what was suggested, not as definitely what was happening, but as a possibility for what could potentially be happening — and remember, if you are sharing a table with reductionists then they will prefer the simple theory by default because it is simpler. You will need to persuade them of the reasons to accept any complex theory before they abandon the simple one. Imagine that there is no change at all going on in the preferences of the monkeys and the children. Instead, imagine — o the simplicity! — that any participant in the experiment merely has a set of existing preferences. These preferences don’t even have to be mental; by preferences all I mean are consistent behaviours towards the items in question (stickers for the children, M&Ms for the monkeys). From here, via a bit of statistical theory, we can produce the experimental result without any recourse to change in preferences, cognitive dissonance or indeed anything mental. Here’s how. Whenever you measure anything you get inaccuracies. This means that your end result reflects two things: the true value and some random ‘noise’ which either raises or lowers the result away from the true value. Now think about the Egan et al experiment. The experimenters picked three items, A, B and C, which the children or monkeys ‘preferred equally’, but what did this mean? It meant only that when the experimenters measured preference their result was the same for items A, B and C. And we know, as statistically-savvy psychologists, that those results don’t reflect the true preferences for A, B and C, but instead reflect the true preferences plus some noise. In reality, we can suppose, the children and monkeys actually do prefer each item differently from the others. Furthermore this might even be how they make their choice. So when they are presented with A vs B and choose A it may be because, on average, they preferred A all along. Now watch closely what happens next.
The experimental participants are given a second choice which depends on their first choice. If at first they chose A over B then the second choice is B vs C. But if they chose B over A then the second choice is A vs C. We know the results: they then choose C more than the unchosen option from the first choice, be it A or B. But now we have another theory as to why this might be. What could be happening is merely that, after the mistaken equivalence of A, B and C, the true preferences of the monkey or child are showing through, and the selective presentation of options on the second choice is making it look like they are changing their preferences in line with dissonance theory. Because the unchosen option from the first choice is more likely to have a lower true preference value (that, after all, may be why it was the unchosen option), it is consequently less likely to be preferred in the second choice: not because preferences have changed, but because it was less preferred all along. In the control condition, where no first choice is presented, there is no selective presentation of A and B, and so the effect of the true values for preferences for A and B will tend to average out rather than produce a preferential selection of C.
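The noise-only account can be put to work in a small simulation. This is a sketch with invented numbers and a deliberately crude choice rule (each subject simply picks whichever item has the higher fixed true preference), not a model of the actual experiment; the point is only that unchanging preferences plus noisy pre-measurement are enough to produce the pattern of results.

```python
import random

def run_trials(n_subjects=100_000, seed=1):
    """Fixed true preferences, no preference change, no cognition.

    The pretest said A, B and C were 'equally preferred', but that
    measurement was noisy; here the underlying true preferences vary
    randomly per subject and never change.
    """
    random.seed(seed)
    exp_c, ctrl_c = 0, 0
    for _ in range(n_subjects):
        a, b, c = (random.gauss(0, 1) for _ in range(3))
        # Experimental condition: first choice A vs B, second choice
        # pits C against whichever of A and B lost the first choice.
        unchosen = min(a, b)
        if c > unchosen:
            exp_c += 1
        # Control condition: no first choice, just C vs (say) A.
        if c > a:
            ctrl_c += 1
    return exp_c / n_subjects, ctrl_c / n_subjects

exp_rate, ctrl_rate = run_trials()
print(f"experimental: C chosen {exp_rate:.3f}; control: {ctrl_rate:.3f}")
```

With three independent true values, C beats the first-round loser about two-thirds of the time in the experimental condition, against roughly half the time in the control, which is exactly the signature the dissonance interpretation claimed for itself.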

Now obviously the next step with this theory would be to test if it is true, and to check some details which might suggest how likely it is. Did Egan et al assess the reliability of their initial preference evaluation? Did they test preferences and then re-test them at a later date to see if they were reliable? These and many other things could persuade us that such an explanation is more or less likely. The important thing, for now, is that we’ve come up with an explanation that seems as simple as it could possibly be and still explain the experimental results.
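To make the reliability check concrete, here is one sketch of what a test–retest analysis could look like, with hypothetical numbers: simulate two noisy measurement sessions of the same fixed preferences and correlate them. If measurement noise is as large as the true variation, the correlation comes out well below 1, which is exactly the situation the noise-only account needs.

```python
import random
import statistics

def pearson(xs, ys):
    """Plain Pearson correlation, written out for transparency."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

random.seed(0)
# Fixed true preferences that never change between sessions.
true_prefs = [random.gauss(0, 1) for _ in range(500)]
# Two sessions, each adding fresh noise the same size as the true
# variation (i.e. a deliberately unreliable measure).
session_1 = [t + random.gauss(0, 1) for t in true_prefs]
session_2 = [t + random.gauss(0, 1) for t in true_prefs]
r = pearson(session_1, session_2)
print(f"test-retest correlation: r = {r:.2f}")
```

With noise variance equal to true variance, the expected test–retest correlation is about 0.5; a low observed figure like this would tell us the ‘equally preferred’ pretest cannot be taken at face value.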

For psychologists, reductionism is a value as well as a habit. We seek to use established simple features of the mind to explain as many things as possible before we get carried away with theories which rely on novel or complex possibilities. The reductionist position isn’t the right one in every situation, but it is an essential guiding principle in our investigations of how the mind works.

Robert Park’s article ‘The Seven Warning Signs of Bogus Science’ is a result of his attempt to help judges faced with expert witnesses making scientific arguments. He has attempted to come up with heuristics of bad science: ‘indicators that a scientific claim lies well outside the bounds of rational scientific discourse’.

Here they are:

1. The discoverer pitches the claim directly to the media.

2. The discoverer says that a powerful establishment is trying to suppress his or her work.

3. The scientific effect involved is always at the very limit of detection.

4. Evidence for a discovery is anecdotal.

5. The discoverer says a belief is credible because it has endured for centuries.

6. The discoverer has worked in isolation.

7. The discoverer must propose new laws of nature to explain an observation.

It’s a good list. Sadly, however, telling the difference between sense and nonsense is never going to be easy. Even the best of us, when we get outside our field, can feel at a loss. It seems to me that your position on the classic controversial science debates (global warming, alternative medicine, creationism) isn’t really settled by the facts either way, but instead depends on the pre-theoretical commitments you have made. So, for example, a preference for conventional vs alternative medicine, or creation science vs evolution, is in fact impossible to refute from within the frame of reference of the person with that preference (this will be obvious to any creationist who has tried talking to an evolutionist, or vice versa). Rather than a choice which can be faulted on facts, it is really a choice about what kinds of information define facts. All views of the world have biases in them; the distinction between a scientist and a pseudoscientist is not about what each believes to be true, but about which set of systematic biases each has decided to place their faith in.

How to Lie with Statistics is a classic. A ‘sort of primer in ways to use statistics to deceive’, says the author Darrell Huff. Why teach the world how to lie with statistics? Because, he says, ‘crooks already know these tricks; honest men must learn them in self-defence’.

The bulk of the book is worked examples of classic statistical sleights-of-hand: graphs with missing sections on the axes, different kinds of averages, post-hoc observation of correlations. What I want to do here is just review the last chapter, ‘How to talk back to statistics’, which gives some rules of thumb on how to ‘look a phoney statistic in the eye and face it down’ whilst recognising ‘sound and usable data in the wilderness of fraud’. Huff gives the reader five simple questions with which to arm themselves, which I summarise and then provide some commentary on at the end.

Huff’s five questions with which to arm yourself against statistics:

1. Who says so?

Can we suspect deliberate or unconscious bias in the originator of the statistic? Huff recommends looking for an “O.K. Name” – e.g. a university – as some slim promise of reliability. Second to this, he recommends being careful to distinguish the originator of the ‘data’ from the originator of the conclusion or interpretation.

2. How Does He Know?

Is the sample biased? Is it representative? Is it large enough?

3. What’s Missing?

Statistics given without a measure of reliability are ‘not to be taken very seriously’. What are the relevant base rates or appropriate comparison figures? Do averages disguise important variations?

4. Did somebody change the subject?

E.g. more reported cases are not the same as more cases; what people say they do (or will do) is not the same as what they actually do (or will do); association (correlation) is not causation.

5. Does it make sense?

Is the figure spuriously accurate? Convert percentages to real numbers and convert real numbers to percentages; compare both with your intuitions.
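Huff’s advice to convert both ways can be shown with a toy example (all the numbers here are invented): a dramatic-sounding percentage can describe a tiny absolute change, and a dramatic-sounding count can describe a tiny share.

```python
# Invented numbers throughout: the same change, described two ways.
cases_before, cases_after = 2, 4                # reported cases in a small town
pct = (cases_after - cases_before) / cases_before * 100
print(f"'{pct:.0f}% increase' sounds dramatic")
print(f"but the absolute change is only {cases_after - cases_before} cases")

# And the other way round: a big-sounding count as a percentage.
affected, population = 5_000, 60_000_000
share = affected / population
print(f"{affected:,} people is {share:.3%} of the population")
```

Doing both conversions and holding each against your intuitions is the whole trick: ‘doubled’ here means two extra cases, and five thousand people is under a hundredth of one percent of sixty million.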

Commentary by Tom

Who says so?

Two things interest me about the recommendation to base judgements of credibility, even if just in part, on authority. Firstly, by doing this Huff is conceding that we are simply not able to make a thorough independent evaluation of the facts ourselves. This is in contradiction to the idea that science is Best because if we doubt something we can check it out for ourselves. The pragmatic response to this is obviously ‘well, you won’t check out everything, but you could check out any individual thing if you wanted’. Is this much consolation in a world where everyone, including the authorities, is assaulted by too much information to check out personally? Scientific authority then becomes a matter of which social structures, using which truth-heuristics, you trust, rather than a matter of direct proof (“it says so in the bible!” versus “it says so in a respectable academic paper!”?).

The second thing that interests me is that the advice to rely on authorities becomes problematic for those who either don’t know who the authorities are, or who distrust the usual authorities. What proportion of the population knows that the basic unit of scientific authority is the peer-reviewed journal paper? You can see that if you don’t know this you immediately lose a vital heuristic for evaluating the credibility of research you are told about. In a similar vein, even experts in one domain may be ignorant of the authorities in another domain, leading to similar problems with judging credibility. If you know about but simply don’t trust the established authorities, you are similarly lost at sea when trying to evaluate incoming evidence (a reason, I’ll bet, for the mixed quality of information available from, variously, conspiracy theorists and alternative medicine practitioners).

How Does He Know?

This is perhaps the most important question you can ask, in my opinion. Often all that is required to dispel the superficially-convincing fog that accompanies some statistic or factoid is to ask: how did they find out? What would actually be involved in gathering that information? Could it possibly be correct? For example, ‘if you die in your dream, you really die’. How do they know?! Dead people aren’t exactly available for comment.

What’s Missing?

Knowing what is missing is the hardest trick, in my opinion. It’s a mark both of expertise and of genuine intelligence to be able to pick up on what isn’t being said, to notice when the interpretation of what you’re being told could be fundamentally altered by something you aren’t being told (because, of course, this involves imagining a bunch of counter-factuals). Outside the realm of statistics, the idea of frame-analysis speaks to this question of how what isn’t talked about is made invisible.

Did somebody change the subject?

Does it make sense?

Both good checks to carry out when challenged by a statistic. It is unfortunate that statistics seem to have an inherent air of authority – a kind of wow factor – and these questions are good tools with which to start dismantling it. I think this wow factor arises because statistics seem to imply rigorous, unbiased, comprehensive investigation, even though they may in fact arise from nothing of the sort. Evolution will produce imitators which have the colouration of a poisonous species without actually bothering to produce the poison, and most social situations will attract free-riders who want to get the benefits without paying the costs. In the same way, rhetorical strategies evolve to include things which carry the trappings of credible information without going through the processes which are actually causal in making information from such sources credible. So we get statistics because everybody knows science uses statistics; we get figures quoted to the second decimal place when the margin of error is a hundred times larger than that level of accuracy; and we get nonsensical arguments supported by citations, even though the studies or works cited are utterly without credibility, because having citations in your argument is an established form of credible argument which is easy to reproduce for any argument, whatever its actual credibility.