January 29, 2007

"The plural of anecdote is not data"

Of course it is. What you can argue over is whether or not the data / anecdotes were properly collected, properly analyzed, properly interpreted, and so forth, but lots of anecdotes do constitute a dataset. Many well-collected, well-analyzed, and well-interpreted data began their lives as anecdotes that suggested a pattern to a curious person who then dug deeper into the issue. It's easy to generate lots of data, most of which will be worthless for a particular question. You can point any instrument at any part of the world and collect lots of data. However, what machines can't at present do is become intrigued by something that seems a bit odd and in need of explanation -- it's these initial, humble anecdote-seeds that the investigator, equipped with his tools, nourishes into a conclusion (either for or against his hunch).

Why is statistics the target of so many retardedly nihilistic sound-bites? Lemme think about that one...

Sex differences in interest in politics is another good example of greater male variance, like how males are overrepresented at the gifted and special education tails of the IQ curve. I'd say males are overrepresented among those who have good sense, rational thought processes, and so on, in political discussions -- but they're also overrepresented among those who can't tell their head from their ass and yet insist on confidently shouting prescriptions at everyone else (the majority of policy wonks).

In my experience, the people who say "the plural of anecdote is not data" are using it in reference to something they've seen themselves or heard about from other people; in other words, they're referring to a dataset in which N=1 and are cautious to point out that it may be unwarranted to jump to any conclusions.

i agree on the premise that, by definition, many anecdotes constitute data.

but, the point is usually whether that data is representative enough to make complex decisions. the answer is usually no, because picking the right sample is critical and it usually doesn't happen if I just talk to my many friends in California and conclude that over 40% of people living in the US really care about scientific research and they speak pretty good Spanish.