Hey hey! Welcome back! It is week 2 of the read-along, and we’ve got some good stuff going on today. After spending last week learning what bullshit is, this week we’re going to focus on how to spot it in the wild. This is well timed because a few days ago I had a distressing discussion with a high school teacher-friend who had assigned her kids some of my Intro to Internet Science posts as a prelude to a section on fake news. She had asked them to write an essay about the topic of “fake stuff on the internet” before the discussion in class, and apparently more than a few of them said something to the effect of “that’s nice, but I’ve never heard of fake news so this is not a problem in my life”. Groooooooooooooooooooooooooooan

Of course the problem with bullshit is that no one warns you you’re going to see it, and no one slaps you afterwards and says “you just read that uncritically”. With so much of the bullshit these days being spread by social media, inattentional blindness is in high gear. If 50% of study participants can’t see a gorilla when they’re trying to watch a bouncing ball, what makes you think you’re going to correctly spot bullshit while you’re trying to post pictures/think of funny status updates/score a political point against your uncle/see how your ex is doing????

The only hope is to teach yourself some ticks and remain eternally vigilant. In other words (and with apologies to Hunter S Thompson): I hate to advocate pessimism, skeptical toolkits, the GRIM test and constant paranoia, but they’ve always worked for me.

With that intro, let’s get to the readings! First up is Chapter 12 of Carl Sagan’s Demon Haunted World: Science as a Candle in the Dark: The Fine Art of Baloney Detection. I don’t think I’d read this chapter since I first read this book maybe 15 years ago or so, so it was a lot of fun to read again. Sagan starts by making a differentiation that will be familiar to those who read last week’s piece: those who believe common misconceptions vs those who promote them professionally. The example he uses is being able to contact the dead. He admits to his own longing to talk to his deceased parents and how much appeal the belief that sometimes you can “feel” the dead has to most of us. As an atheist, he firmly believed the idea of life after death was baloney, but he gives a pass to the large number of people who believe in life after death or even those who believe they’ve had contact with the dead in their personal lives. To him, those beliefs are normal and even if you don’t think they are true or rational, they are hard to criticize. Where his wrath kicks in is those who seek to make money off of promoting this stuff and encouraging people to believe in irrational things, like psychics or mediums. He believes that undermining a society’s ability and desire to seek out independent truth and facts is one of the worst things a person can do. This isn’t just psychics doing this of course, but most of the advertising world as well, who will throw any “fact” at you if you just buy their product. In response to this constant barrage of misinformation and misdirection, he offers a “tool kit” for skeptical thinking. The whole thing is on the 4th and 5th page, but the short version is this:

Get independent confirmation of facts

Encourage debate

Don’t trust authority blindly

Come up with multiple possible explanations

Don’t stick to one explanation just because it is the one you thought of

Find something to quantify, which makes everything easier to compare

Make sure the whole chain of the argument works. Don’t let people mumble through part of it.

Prefer simple explanations (Occam’s razor)

Look for something falsifiable. If something can never be proven wrong, it is, well, never going to be proven wrong.

Keep a friendly statistician around at all times

Okay, fine, that last one’s mine, not Sagan’s, but he does come out swinging for well designed experiments. He also includes a really helpful list of the most common logical fallacies (if you want a nice online version, try this one). He concludes with a discussion of corporate advertising, sponsored research, and tobacco companies. Confusing science and skewed research helped promote tobacco for much longer than it should have stuck around.

With the stage set by Sagan, the rest of the readings include some specific tips and tricks to spot various issues with numbers and data. Some are basic plausibility checks, and some are more advanced. These are:

The “what does this number even mean” check: Last week we talked about bullshit as “unclarifiable unclarity”, and this case study is a good example of doing that with numbers. Written by West and Bergstrom, this example looks at a packet of hot cocoa that claims to be “99.9% caffeine free”. It is not so much that the claim is implausible or even inaccurate, but that it is completely meaningless. If you’re measuring by weight, even a highly caffeinated drink will be mostly “caffeine free”. While it is likely the cocoa actually is low caffeine, this statistic doesn’t give you much insight. It is the appearance of information without any actual substance.

Fermi estimations: A technique named after Enrico Fermi, its focus is to get people to focus on getting people to guess numbers based on the order of magnitude (ie 10 vs 100 vs 1000, etc), not the exact number. When doing rough calculations with large numbers, this can actually yield surprisingly accurate results. To play around with making these estimates, they provide a link to this game here. There’s a good book on this and how to solve problems like “how many piano tuners work in New York City?” called Guesstimation if you’re really in to it.

Being able to focus in on the order of magnitude is surprisingly helpful in spotting bullshit, as is shown in the case study of food stamp fraud numbers. A news report from Fox News says that food stamp fraud costs tax payers $70 million dollars a year, and asked if this level of fraud means it is time to end food stamps. If we take that number at face value, is this a big deal? Using Fermi estimations, you can figure out a ballpark number for total food stamp payouts, and determine that this loss would be around .2% of all benefits paid. That is really close to the number you get if you dig up all the real numbers: .09% of all benefits paid.

GRIM testing: Edging in to the deeper end of the pool, this is a neat little trick that mostly has applications for those reviewing studies with small sample sizes. GRIM stands for “granularity-related inconsistency of means” test, and it is a way of quickly and easily looking for data problems. The full explanation (plus the fascinating story of its development) is here, but here’s the quick version: if your sample size is small and you are counting whole numbers, your mean has to end in very predictable decimal places. If it doesn’t, something’s wrong. For example, a study says that 10 people reported having an average of 2.24 children is bogus. Why? Because 2.24= total number of kids/10, and the total number of kids would have to be 22.4. There are a lot of possible explanations for this, but most of them get down the types of sloppiness or confusion that might make you question other parts of the paper.

Newcomb-Benford Law: This law is one of my favorites because it was spotted back in 1881 for a reason that simply wouldn’t happen today: uneven wear on books. Back when slide rules were scarce and people had to actually look through a book of numbers to figure out what a logarithm for a certain value was, an astronomer named Simon Newcomb noticed that the books were really worn out in the first sections where the numbers that started with low numbers were, and rather clean in the back where the leading digits were higher. He began to wonder if “random” numbers found in nature were more likely to start with small digits than large ones, then he just decided to declare it was so and said that the probability that the leading digits was a certain value d was equal to the log((d+1)/d). Basically, a random number like the population of a country will have a 30% chance of starting with 1, and only a 5% chance of starting with a 9.

Despite having very little proof other than a worn out book, it turns out this law is actually pretty true. Machine generated data can gum up the works a bit, but natural phenomena tend to follow this rule. Benford got his name in there by pulling data from hundreds of sources: rivers, populations, physical constants, even random numbers from the pages of Reader’s Digest and categorizing them by leading digit. He got 20,000 numbers together and found that low leading digits simply WERE more common. The proposed mathematical explanations for this are not light reading no matter what they promise, but it is pretty much enough to know that it is a thing. It has been used to detect election fraud and is also used in forensic accounting, but basically all the layperson needs to know is that numbers lists that start with high digits aren’t as plausible as those that start with low ones.

And one more for the road: It is worth noting that there is actually another Benford Law that would be not-irrelevant in a course like this. Benford’s Law of Controversy states that “passion is inversely proportional to the amount of real information available”.

All of these tricks may seem like a lot to keep in mind, so if you want some practice take the advice I give to the high school students: find a cause you really care about and go read bad arguments or propaganda from the “other side”. As I’ve mentioned before, your ability to do math improves dramatically when said math helps you prove a point you feel emotionally attached to. Using this to your advantage while learning these tricks might help you get them down a little faster. Of course the problem with learning these tricks is that unless you’re entirely hypocritical, eventually you might have to turn them around on your own side, so be forewarned of that.To this day the high point of my blogging career is when my political activist brother left me a voicemail screaming “I JUST LEFT A MEETING WITH PEOPLE I LIKE MAKING A POINT I AGREE WITH BUT THEY USED BAD STATISTICS THAT I FIGURED OUT WERE WRONG AND I COULDN’T STOP STARING AT THEM AND I HAD TO CORRECT THEM IN FRONT OF EVERYONE AND THEN THEY TOLD ME IT DIDN’T MATTER AND NOW I’M MAD AT THEM AND YOU!!!!”.

So what am I taking away from this week? A few things:

Even if you’re not a “numbers person”, a good sense of how numbers work can go a long way towards checking the plausibility of a claim

Paranoia is just good sense if people really are out to get you. People who are trying to sell you something are not the most trustworthy sources

Math tricks are fun

People named Benford come up with an unusual number of bullshit related laws

I’m still checking that last one out, but it seems plausible.

And that wraps up this week! Next week we’ll be wallowing in “the natural ecology of bullshit”, so make sure you meander back next Sunday for that. Bring boots. It’ll be fun.

“Look for something falsifiable. If something can never be proven wrong, it is, well, never going to be proven wrong.”

Also it strongly tends[*] to be a roundabout way of saying “anything could happen [and then I’m right]”. With or without the bracketed puffery, “anything could happen” is not a very useful thing to know about a phenomenon: adding it to your stock of knowledge is about as useful as adding zero dollars to your bank balance.

[*] All the exceptions that I can think of are based on the difference between what is conveyed by ordinary idiomatic speech and what is conveyed by formal logic which superficially corresponds to the speech. E.g. if you declare “A is no greater than B” in ordinary idiomatic speech, it can be weaselly and misleading if you know the truth is that A is equal to B. And the ordinary idiomatic speech rules on unspoken side conditions are more complicated (and more relevant to sentences like “anything can happen”) than that.