Anecdotes and Simple Observations are Dangerous; Words and Narratives are Not.

In a recent blog post on stories, and following some themes from an earlier talk by Tyler Cowen, David Evans ends by suggesting: “Vivid and touching tales move us more than statistics. So let’s listen to some stories… then let’s look at some hard data and rigorous analysis before we make any big decisions.” Stories, in this sense, are potentially idiosyncratic and over-simplified and, therefore, may be misleading as well as moving. I acknowledge that this is a dangerous situation.

However, there is something frustrating about the above quote, whether intentional or not.

It suggests that the main role of stories (words) is to dress up and humanize statistics – or, at best, to generate hypotheses for future research. This seems both unfair and out-of-step with increasing calls for mixed methods to take our understanding beyond ‘what works’ (average treatment effects) to ‘why’ (causal mechanisms) – with ‘why’ probably being fairly crucial to ‘decision-making’ (Paluck’s piece is worth checking out in this regard).

In this post, I first try to make the case that there are important distinctions between anecdotes and stories/narratives that are too often overlooked in discussions of qualitative data, focusing on representativeness and the counterfactual. Second, I suggest that just because many researchers do not collect or analyse qualitative data rigorously does not mean it cannot (or should not) be done rigorously. Third, I make a few remarks about numbers.

As a small soapbox and aside: even calls for mixed methods give unnecessary priority, in my opinion, to quantitative data and statistical analysis when it comes to making causal claims. A randomized control trial – randomizing who gets a treatment and who remains in the comparison group – is a method of assigning treatment. It doesn’t *necessarily* imply what kind of data will be collected and analyzed within that framework.

Anecdotes, narratives and stories

As to the danger of stories, what Evans, Cowen, and others (partly) caution against is believing, using or being seduced by anecdotes – stories from a single point of view. Here I agree – development decisions (and legislative and policy decisions more generally) have too often been taken on the basis of a compelling anecdote. But not all stories are mere anecdotes, though this is what is implied when ‘hard data’ are equated with ‘statistics’ (an equation that becomes all the more odd when, say, the ‘rigorous’ bit of the analysis is referred to as the ‘quantitative narrative’).

Single stories from single points in time in single places – anecdotes – are indeed potentially dangerous and misleading. Anecdotes lack both representativeness and a counterfactual – both of which are important for making credible (causal) claims and both of which are feasible to accomplish with qualitative work. As the phrase ‘quantitative narrative’ reveals, humans respond well to narratives – they help us make sense of things. The trick is to tell them from as many perspectives as possible, so as not to un-mess the messiness too far.

Representativeness: It is clear from the growing buzz about external validity that we need to be cautious even with the most objective and verifiable data analysed in the most rigorous and internally valid way, because it simply may not apply elsewhere (e.g. here and here). Putting this concern aside for a moment, both qualitative and quantitative data can be collected so as to be as representative of a particular time, place and circumstance as possible. I say more about this below.

Counterfactuals: Cowen notes that many stories can be summed up as ‘a stranger came to town.’ True; to understand something causal about this (which is where anecdotes and tales following particular plot-lines can lead us astray), we would like to consider what would have happened if the stranger had not come to town and/or what happened in the town next door that the stranger by-passed. But those are still stories, and they can be collected in multiple places, at multiple time points. Instead of dismissing qualitative data or using them only as window-dressing, we can demand more of them, so that they can tell a multi-faceted, multi-perspectival, representative story.

That rigor thing

Perhaps it seems that we have a clearer idea of how to be rigorous in collecting and analysing quantitative data. I don’t think this is necessarily true — but it does seem that many quant-focused researchers trying out mixed methods for the first time don’t even consider how to make the qualitative data more rigorous by applying criteria similar to those they would apply to the quant part. This strikes me as very odd. We need to start holding qualitative data collection and analysis to higher standards, not be tempted to scrap them just because some people do the work poorly.

An excellent piece on this (though there are plenty of manuals on qualitative data collection and analysis) is by Lincoln and Guba. They suggest that ‘conventional’ rigor addresses internal validity (which they take as ‘truth value’), external validity, consistency/replicability and neutrality. (The extent to which quantitative research in the social sciences fulfils all these criteria is another debate for another time.) They highlight the concept of ‘trustworthiness’ – capturing credibility, transferability, dependability and confirmability – as a counterpart to rigor in the quantitative social sciences. It’s a paper worth reading.

Regardless of what types of data are being collected, representativeness is important to being able to accommodate messiness and heterogeneity. If a research team uses stratification along several criteria to select its sample for quantitative data collection (or intends to look at specific heterogeneities/sub-groups in the analysis), it boggles my mind that those same criteria are not used to select participants for qualitative data collection. Why does representativeness so often get reduced to four focus groups among men and four among women? Equally puzzling, qualitative data are too often collected only in the ‘treated’ groups. Why does the counterfactual go out the window when we are discussing open-ended interview or textual data?
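The point above is mechanical enough to sketch: if the quantitative sample was stratified, the qualitative participants can be drawn from the very same strata, including the control arm, so the counterfactual survives into the interviews. A minimal sketch in Python; the roster, field names and stratum values are invented for illustration, not taken from any study:

```python
import random
from collections import defaultdict

# Hypothetical respondent roster; the fields and values are illustrative.
roster = [
    {"id": i, "sex": sex, "region": region, "arm": arm}
    for i, (sex, region, arm) in enumerate(
        (s, r, a)
        for s in ("female", "male")
        for r in ("north", "south")
        for a in ("treatment", "control")
        for _ in range(10)
    )
]

def sample_by_strata(roster, keys, per_stratum, seed=0):
    """Draw interview/focus-group participants from every stratum used in
    the quantitative design -- control arm included, so the qualitative
    data keep their counterfactual."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in roster:
        strata[tuple(person[k] for k in keys)].append(person)
    return {
        stratum: rng.sample(people, min(per_stratum, len(people)))
        for stratum, people in strata.items()
    }

selected = sample_by_strata(roster, keys=("sex", "region", "arm"), per_stratum=2)
# 2 values each for sex, region and arm -> 8 strata, 2 participants per stratum
```

Nothing here is sophisticated; that is rather the point. The same stratification logic the quant team already wrote down can mechanically generate the qualitative recruitment list.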

Similarly, qualitative work has a counterpart to statistical power and sample size considerations: saturation. Generally, when the researcher starts hearing the same answers over and over, saturation is ‘reached.’ A predetermined number of interviews or focus groups does not guarantee saturation. Research budgets and timetables that take qualitative work seriously should start to accommodate that reality. In addition, Lincoln and Guba suggest that length of engagement – with observations over time also enhancing representativeness – is critical to credibility. The nature of qualitative work, with its emphasis on simultaneous and iterative data collection and analysis, means that teams can use that time to follow up on leads and insights revealed over the study period.
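Saturation can even be written down as a transparent stopping rule rather than a vibe: code each interview for themes as it comes in, and stop once some number of consecutive interviews yields no new codes. A minimal sketch, with invented theme codes purely for illustration:

```python
def saturation_point(coded_interviews, window=3):
    """Return the 1-based index of the interview at which saturation is
    reached, defined as `window` consecutive interviews yielding no new
    codes. Returns None if saturation never occurs -- keep interviewing."""
    seen = set()
    no_new_streak = 0
    for i, codes in enumerate(coded_interviews, start=1):
        new = set(codes) - seen
        if new:
            seen |= new
            no_new_streak = 0
        else:
            no_new_streak += 1
            if no_new_streak == window:
                return i
    return None

# Invented example: themes coded in each successive interview.
interviews = [
    {"price", "distance"},
    {"price", "trust"},
    {"distance", "childcare"},
    {"price"},
    {"trust", "distance"},
    {"childcare"},
]
```

A rule like this also makes the budget argument concrete: the stopping point is a property of the data, not of the contract, which is exactly why timetables need slack.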

Also bizarre to me is that quant-focused researchers tend to spend much more time discussing data analysis than data collection and coding for the quantitative stuff, but then put absolutely all the focus (of the limited attention-slice qualitative gets) on collecting qualitative data and none on how those data are analysed or will be used. Too often, the report tells me that a focus group discussion was done and, if convenient, points out that the findings corroborate or ‘explain’ the numeric findings. Huh? If I am given no idea of the range of answers given (say, the counterpart of a minimum and a maximum) or of how ‘common themes’ were determined, that thing one person said in a focus group just becomes an anecdote with no real ‘place’ in the reporting of the results except as a useful aside.

One more thing on credibility – the equivalent of internal validity. Lincoln and Guba say that credibility *requires* member-checks (stay tuned for a paper on this), which means sharing the results of the analysis back with those who provided the raw data so that interpretations can, at least in part, be co-constructed. This helps prevent off-the-mark speculation and situation analyses, but it also helps break down the need to ‘represent’ people who ‘cannot represent themselves’ – as Said quotes from Marx. I’ve said a few things about this sort of shared interpretation here, recognizing that respondents’ perceptions will reflect the stories they tell themselves. That said, as development researchers increasingly look at nudging behavior, the stories that (not-always-rational) actors tell themselves are potentially all the more important. We need to collect and present them well.

One key hurdle I see to enhancing the perceived rigor and non-anecdotal-ness of qualitative work is that it is hard to display the equivalent of descriptive statistics for textual/interview data. That doesn’t mean we shouldn’t try. In addition, it is more difficult and unwieldy to share (even ‘cleaned’) qualitative data than the quantitative equivalent, as increasingly happens to allow for replication. Still, if doing so would enhance the credibility of the multi-faceted stories revealed by these data, it is worth pushing this frontier.

Numbers aren’t always clean

In terms of stories we tell ourselves, one is that data are no longer messy (and, often by implication, are clean, hard, ‘true’) once they fit in a spreadsheet. Everything that happened in the field – the surveyors’ concerns about certain questions, the reports of outright lying – often seems to fade from view as soon as the data make it into a spreadsheet. If you ask a farmer how many chickens he has and he gives you a story about how he had 5, 2 got stolen yesterday, but his brother will give him 4 tomorrow, then regardless of what number the enumerator records, the messiness has been contained for the analyst – but not in the reality of the farmer that is meant to be represented.
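One practical response to the chicken problem is a data-structure choice: record the enumerator's note and any quality flags alongside the number, so the messiness travels into the analysis file instead of evaporating at entry. A sketch of what such a record might look like; the field names and flag values are hypothetical, not from any real survey instrument:

```python
from dataclasses import dataclass, field

@dataclass
class Response:
    """One survey answer that keeps its field context attached."""
    value: int                                  # the number the enumerator settled on
    note: str = ""                              # the respondent's fuller account, verbatim-ish
    flags: list = field(default_factory=list)   # hypothetical data-quality flags

chickens = Response(
    value=5,
    note="had 5; 2 stolen yesterday; brother will give 4 tomorrow",
    flags=["stock_in_flux"],
)
```

An analyst who later sees `stock_in_flux` can decide whether 5, 3, or 7 is the right number for their question, rather than inheriting a silent judgment call made in the field.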

In general, if we want to create credible causal narratives that can be believed and that can inform decision-making at one or more levels, we need to talk about (a) carefully collecting all types of data and (b) getting better at rigorously analysing qualitative data and incorporating them into an overall ‘narrative’ that triangulates towards the ‘hard’ truth – not equating qualitative data with anecdotes.

Heather, great post! I am a bit biased in agreeing with you since we, at UNDP, have been experimenting with micro-narratives, a method that on one hand allows us to collect thousands of snippets of stories from citizens and on the other provides a structured method of self-interpretation (citizens themselves interpret the stories, as opposed to consultants, so through disintermediation the bias is non-existent) and analysis that allows us to follow, almost in real time, trends in subtle changes in perceptions, behaviors, priorities, etc.

In my view, as well, it has never been one way or the other (quantitative vs qualitative) but rather a question of how to fill in the very obvious gaps and weaknesses of hard, quantitative data. We are still early in the exploration, but initial results are promising – I’d be more than happy to share!

This is a great, and very thorough, post. I agree that we need a mixture of the two to tell a complete story, and that an over-reliance on numbers alone isn’t going to give the full picture. I thought you might also like to read about why we tend to favour stories, however, at least from a psych perspective. Here’s my take on that, over on Whydev: http://www.whydev.org/why-personal-stories-trump-numbers-in-global-development/

Thanks much (and sorry for the belated response). Very interesting post! I agree very much with your point that "Even though we push for statistical information to demonstrate to the public the net effect of what works, and what doesn’t, or talk about the need in terms of numbers of people, we still need to keep the message centered around human beings. Without a human story, our ability to empathise and understand is severely hampered." I do hope we start to see people collecting participant and other stakeholder perceptions with more care, analyzing them carefully and being transparent about which 'cases' they choose to present.

Whether you lean qualitative or quantitative, you’ve got to have stories to make information stick. Regardless of time or place, stories tap deep human psychological processes of perception, learning and memory. The human mind has evolved a narrative sensemaking faculty: we perceive and experience the chaos of reality, and the brain then reassembles the various bits of experience into a story in the effort to understand and remember. Stories balance the logical (sequence) and the emotional (empathy) aspects of our brains.

My question is: Is our sector’s over-reliance on “killer facts” and numbers (e.g. the “data dash,” obsessive measurement disorder) just a reflection of our fear and thus our unhealthy relationship with risk? In many development programs, precise ways of measuring results in order to make consequential judgments about how to help people and affect social change remain elusive. But that’s hard for us do-gooders to admit.

Your question resonates with me; however, in my story I’d like to make a distinction between ‘risk’ and ‘uncertainty’.
Risk, with its connotations of registers, measurement and costed mitigations, is arguably part of the problem, and perhaps this explains why people have an unhealthy relationship with it. Rather than perpetuating a way of thinking that starts from the premise that everything can be measured and known, maybe it would be better to be a little more humble and accept that we do not know everything and that uncertainties sometimes play a part. [Even well-tried recipes fail sometimes when you bake them in a different oven.]

So I suggest, and appeal to people reading this narrative, that we differentiate between risk and uncertainty, and embrace uncertainty rather than attempt to reduce everything to a risk.

Thank you so much for this conversation. I agree both that much of what we do in this sector is uncertain and that there is pressure to not admit to uncertainty - to receive funding, to move up in a priority list, etc.

At least retrospectively, though, we can learn a lot about why something did or did not work because, unlike burned cookies in a new oven, we have the advantage of being able to talk to a wide (and potentially systematic) variety of stakeholders about what happened, to carefully analyze this information, and to refer to it to make future efforts slightly less uncertain. To do this well, monitoring, process evaluation, stakeholder analyses and a variety of qualitative methods need to be treated as integral to evaluation and learning.
