Seven billion is not a number you can count to on your fingers. The media loves to make pointless comparisons that emphasise the size of the number. From my local rag:

“$6.2 billion: Former Google CEO Eric Schmidt’s net worth, according to a September Forbes estimate, is even exceeded by the population number 7 billion.”

It should be considered a victory for media literacy that they got the direction of the inequality right. The comparison, and the ones beside it in the infographic, all make seven billion seem like a lot. And it is, compared to the number of people in your living room. But it’s not, compared to the number of bacteria in your body. Just because you can’t count that high doesn’t mean that number of people can’t exist on Earth. I think this is a big part of the reason why the problem of overpopulation, which I recognise is a legitimate concern, is often overstated (see this report from the Radical Statistics Population Studies Group for more on this).

Those of us who juggle orders of magnitude professionally have log scales and scientific notation to help us. How do we help those without these tools? For a start, compare numbers of people to numbers of people, rather than numbers of dollars. The infographic does compare seven billion to the number of people who can fit in AT&T Park, which is better than nothing. I’d prefer something like this:

The world population is about 22 USAs.

The US population is about 42 Bay Areas.

So the world population is about 900 to 1000 Bay Areas.
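The arithmetic behind these comparisons is just division. The population figures below are rough 2011-era estimates I’m assuming for illustration (about 7.0 billion for the world, 310 million for the US, 7.4 million for the Bay Area):

```python
# Rough 2011-era population figures (assumed for illustration).
WORLD = 7.0e9
USA = 3.1e8
BAY_AREA = 7.4e6

print(WORLD / USA)       # about 22.6 USAs
print(USA / BAY_AREA)    # about 41.9 Bay Areas
print(WORLD / BAY_AREA)  # about 946 Bay Areas
```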

I think this kind of unsensationalist description is genuinely useful.

There’s a surfeit of studies showing that puny humans are inefficient when reasoning under uncertainty. Lately a lot of attention has been given to tail risks. Rare events can have huge consequences. Furthermore, it’s hard to put meaningful probabilities on some of these events — how do you assign a probability to the event that the Greenland ice sheet melts? What would such a probability even mean?

On a more basic level, we’re bad at reasoning even when probabilities are known and not extreme. The best evidence for this is the continued existence of roulette. Would it be enough to get punters to agree that on any given spin, all 38 (or 37) numbers are equally likely? If they’re just betting red or black, then I think so. Counting shows that fewer than half of the numbers are red (18 of 38 on an American wheel, the green zeroes making up the difference), so you can convince someone that an even-money bet on red loses in the long run.

But most inveterate roulettists (I’m related to a few) don’t bet this way. They have more complicated schemes that involve covering weird subsets of the numbers. To explain that these bets are a bad idea, you have to get into expectation*. I might be unemployed, yet I still don’t have the time to teach expected values to everyone in my extended family.
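If you do get into expectation, American roulette at least makes the lesson compact: under the standard payout rule, a bet covering n numbers pays 36/n − 1 to 1, so every such bet has exactly the same expected loss, and no scheme of covering weird subsets helps. A sketch:

```python
def roulette_ev(n, pockets=38):
    """Expected value per $1 staked on a bet covering n numbers.

    American wheel: 38 pockets, and a bet covering n numbers
    pays 36/n - 1 to 1 under the standard payout rule.
    """
    p_win = n / pockets
    payout = 36 / n - 1  # winnings per dollar if the bet hits
    return p_win * payout - (1 - p_win)

# Straight, split, street, corner, six-line, dozen, even-money:
for n in (1, 2, 3, 4, 6, 12, 18):
    print(n, roulette_ev(n))  # always -2/38, about -5.26%
```

The one exception on an American wheel is the five-number basket bet (0, 00, 1, 2, 3), which pays 6:1 and loses even faster.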

On one level, then, probabilistic reasoning that treats outcomes as either deterministic or coinflips improves upon purely deterministic reasoning. On another, though, we make the 50% mark too magical — at least I do sometimes, like when reading weather forecasts. When the chance of rain is at least 50%, I’ll act as if rain is certain. When it’s less than 50%, I’ll act as if it’s certain there’ll be no rain. It’s a computationally cheap heuristic, since I don’t want to do a cost-benefit analysis every time I’m deciding whether or not to go out tomorrow. Still, it’s a heuristic that’s been resistant to improvement, even as it’s become clear to rain-phobic me that a 20% or 30% chance of rain should be a strong enough deterrent.
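A basic cost-benefit view makes the point about why 50% shouldn’t be special: if staying in costs c in inconvenience and going out into rain costs s in soaking, comparing expected losses says to stay in whenever p·s exceeds c, i.e. whenever p exceeds c/s. The 50% threshold is only right when the two costs happen to be equal. A sketch with made-up numbers:

```python
def stay_in_threshold(inconvenience, soaking):
    # Going out has expected loss p * soaking; staying in costs
    # `inconvenience` for sure.  Stay in once p exceeds this ratio.
    return inconvenience / soaking

# A rain-phobe who rates a soaking at four times the nuisance of
# staying home should be deterred by anything above a 25% chance:
print(stay_in_threshold(1, 4))  # 0.25
```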

So I’m wary of overprivileging the 50% mark in attempts to explicate the probability scale (Nick Barrowman has a good one). Not only do probabilities of 50% usually occur in artificial situations (or in artificial models for situations), but payoffs are usually uneven. Put another way, a 0-50-100 scale is better than a 0-100 scale, but once you have more than one intermediate point, 50 shouldn’t be special anymore.

*I guess you could decompose bets on multiple numbers into bets on singletons, and make the point through equally likely outcomes, but good luck with that.

For some reason rough statistics don’t bother me as much as rough probabilities.

Exhibit A: Surveys of Occupy Wall Street protesters. These are obviously subject to all sorts of sampling biases, and are unlikely to be representative of the movement as a whole. Yet not only do they fail to irk me, I’m happy they’re being collected.

The following is not mathematically rigorous, since the events of yesterday evening were contingent upon one another in various ways. But just for fun, let’s put all of them together in sequence:

The Red Sox had just a 0.3 percent chance of failing to make the playoffs on Sept. 3.

The Rays had just a 0.3 percent chance of coming back after trailing 7-0 with two innings to play.

The Red Sox had only about a 2 percent chance of losing their game against Baltimore, when the Orioles were down to their last strike.

The Rays had about a 2 percent chance of winning in the bottom of the 9th, with Johnson also down to his last strike.

Multiply those four probabilities together, and you get a combined probability of about one chance in 278 million of all these events coming together in quite this way.
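Taking the quoted probabilities at face value and, as the quote itself flags, dubiously treating the four events as independent, the arithmetic does reproduce the headline figure:

```python
# Naive product of the four quoted probabilities, assuming
# independence -- which the quote itself admits is not justified.
p = 0.003 * 0.003 * 0.02 * 0.02
print(p)      # roughly 3.6e-09
print(1 / p)  # roughly 278 million to one
```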

I care about baseball less than I care about Occupy Wall Street, but this bugs me a lot. It’s disclaimered, but that doesn’t stop the number from being quoted as gospel.

I think this has something to do with “statistics” being closer to raw numbers, whereas probabilities require some level of abstraction. You can do statistics without a model in any meaningful sense, but probabilities require a reference class to at least be implied. If you’re not clear on the reference class, you might end up doing dumb things like multiplying things that have no business being multiplied. Obviously I use rough probabilities all the time, but I try to make the model explicit if it’s for public consumption, and to have it very clear in my head in any case. Otherwise, stick with frequencies.

Let’s say you want to study the relationship between economic inequality and attitude toward the self, with social hierarchy perhaps a common cause. “Attitude toward the self” isn’t well defined and is hard to measure, and there’s little multinational data. One thing you could do is directly compare two countries. You could pick the USA, a highly economically unequal society, and Japan, a relatively economically equal society (forget that Japan is highly hierarchical for the moment).

What else could result in American and Japanese attitudes differing?

The US is much more multiracial than Japan

The US is mostly Christian, Japan Buddhist and Shinto

Japan has universal health insurance

Japan lost the Second World War

Evangelion was better in the original language

The most fundamental problem is sample size: even if you had identical twin countries, one pair ain’t enough. If you want to draw a strong conclusion from one pair, you need some strong qualitative arguments. If you want strong conclusions from quantitative data, you need quite a bit of data, and strong assumptions on top of that.

If we add an exit clause like “or there’s confounding” to 3., we weaken the argument to uselessness.

Now, although we can’t eliminate the possibility of confounding, we can get interesting evidence if there’s more to the data than “both increase”. If the peaks and troughs are simultaneous, then there’s some kind of strong relationship between the variables, whether causal or not. If one variable consistently leads the other, this can suggest direction of causation, though it’s easy to kid ourselves.
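One rough way to look for a consistent lead is to correlate the two series at a range of offsets. This is a sketch only — the function name and the use of plain Pearson correlation are my own choices, and trending series can produce spurious peaks, so in practice you’d detrend first:

```python
import numpy as np

def lagged_correlations(x, y, max_lag=5):
    """Correlate x[t] with y[t + lag] for each lag in [-max_lag, max_lag].

    A peak at a positive lag suggests (only suggests) that x leads y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    corrs = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:lag]
        corrs[lag] = np.corrcoef(a, b)[0, 1]
    return corrs

# A toy series that literally repeats x two steps later peaks at lag +2:
base = np.sin(np.linspace(0, 20, 100))
x, y = base[2:], base[:-2]  # y[t + 2] == x[t]
c = lagged_correlations(x, y, 3)
# c[2] is (numerically) 1.0, higher than at any other lag
```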

Barring such a clear pattern, we need a more complicated causal model. It doesn’t have to be too complicated: something as simple as “X causes Z, which in turn causes Y” is interesting enough to have implications, if true, and is somewhat more susceptible to falsification. We can add more detail to the model as necessary. But we need to be explicit about pathways. Draw a picture if it helps.