My Bayesian Mind

by Ketil; May 2, 2013

Apparently, there’s a small civil war going on between frequentists and Bayesians. Nate Silver’s otherwise enjoyable book, The Signal and the Noise, contains a whole chapter devoted to back-talking Fisher and the frequentist viewpoint. I have generally failed to appreciate this schism; to my (admittedly limited) understanding, these are just mathematical methods, applicable or not depending on the circumstances. There is one thing that bothers me, though.

Frequentists and science

The Scientific Method, in the positivist voice of Karl Popper, starts out with the formulation of a hypothesis - a model of how some aspect of the world works. This model is then used to make predictions, and the predictions are compared to observations. The important thing for Popper is that observations that confirm a theory aren’t particularly valuable; one should always try to disprove hypotheses, and after one fails to do that in all ways imaginable, the model is accepted.

This is derived from logic, where this takes the form of an implication: H ⇒ D. Now observing that D (our expected observation) is true doesn’t really tell us anything about the value of H (our hypothesis). But on the other hand, if we observe that D is false, we know that H must be false as well.
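To make the asymmetry concrete, here is a minimal sketch that enumerates every combination of H and D allowed by the rule H ⇒ D and then checks what each observation leaves open. The function and variable names are just for illustration.

```python
from itertools import product


def implies(h: bool, d: bool) -> bool:
    """Material implication H => D: false only when H is true and D is false."""
    return (not h) or d


# Every (H, D) combination that is consistent with the rule H => D.
worlds = [(h, d) for h, d in product([True, False], repeat=2) if implies(h, d)]

# Which values of H survive once we have observed D?
h_given_d_true = {h for h, d in worlds if d}
h_given_d_false = {h for h, d in worlds if not d}

print(h_given_d_true)   # H can still be True or False: nothing learned
print(h_given_d_false)  # only False remains: H is refuted
```

Observing D leaves both values of H on the table, while observing not-D leaves only H = False, which is exactly the falsification step.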

This is closely tied to the frequentist viewpoint, which is kind of a probabilistic phrasing of the above. You formulate the conventional view, the null hypothesis (H0), and compute a P-value: the probability of making an observation at least as extreme as yours, given that the null hypothesis is true. Generally, you have an alternative hypothesis (H) in mind, and ideally, your observations would be much more likely under H - but it isn’t really necessary, and you can’t strictly claim that H is likely to be right, just because H0 is likely wrong.
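As a small illustration of a P-value, here is a sketch with made-up numbers: the null hypothesis H0 is that a coin is fair, and we observe 9 heads in 10 flips. The function name and the numbers are my own assumptions, and the P-value is taken one-sided, as the probability of seeing at least that many heads under H0.

```python
from math import comb


def p_value_at_least(k: int, n: int, p0: float = 0.5) -> float:
    """One-sided P-value: probability of k or more heads in n flips,
    assuming the null hypothesis that each flip lands heads with probability p0."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


# Hypothetical data: 9 heads out of 10 flips of a supposedly fair coin.
p = p_value_at_least(9, 10)
print(round(p, 4))  # 11/1024, about 0.0107
```

Note what the number is and isn't: it says such a result would be rare if the coin were fair, but it is not the probability that the coin is fair.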

Bayes’ formula

Bayes’ formula, on the other hand, says something about the probability of a hypothesis being true, which is usually what you are interested in. It looks like this:

P(H∣D) = P(D∣H)P(H)/P(D)

It reads as: the probability of a hypothesis, H, given some observation D, is equal to the probability of observing the data if the hypothesis is true, multiplied by the prior probability of the hypothesis (how likely we thought it was to be true before making our observation), and divided by the (also prior) probability of making our observation regardless of whether H is true or not.

One important difference is that, unlike in Popper’s case, observations confirming the theory add value: an observation D modifies the probability that H is true by the factor P(D∣H)/P(D).
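The formula is short enough to write down directly. This is only a sketch with invented numbers: a prior P(H) of 0.2, a likelihood P(D∣H) of 0.9, and an overall P(D) of 0.3.

```python
def posterior(prior: float, likelihood: float, evidence: float) -> float:
    """Bayes' formula: P(H|D) = P(D|H) * P(H) / P(D)."""
    return likelihood * prior / evidence


# Assumed numbers, purely for illustration.
p_h = 0.2          # P(H): belief in the hypothesis before the observation
p_d_given_h = 0.9  # P(D|H): chance of the observation if H is true
p_d = 0.3          # P(D): chance of the observation overall

factor = p_d_given_h / p_d  # how much a confirming observation scales belief
print(round(factor, 2))                             # 3.0
print(round(posterior(p_h, p_d_given_h, p_d), 2))   # 0.6
```

With these numbers, a single confirming observation triples the probability of H, from 0.2 to 0.6.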

The brain

What has puzzled me is that the scientific method is allegedly an effective and efficient way of gaining and improving knowledge. After all, that is the point of science, isn’t it? Yet, our minds clearly don’t work that way at all.

Even putting aside well-known cognitive biases like confirmation bias, how terrible we are at estimating risk and probabilities, and how we form opinions from feelings rather than rational analysis - it is clear that we are (or at least I am) not using anything even remotely resembling Popper’s scientific method in day-to-day life.

For instance, I do occasional orienteering, running around in the wilderness equipped with a map and a compass, and trying to work my way to a sequence of control points as fast as possible. As it happens, I’m not terribly good at it, and consequently, I get lost. How do I work out where I am on the map, and get back on track? I formulate a hypothesis: perhaps I am here. Then, I try to confirm this hypothesis: if I am here, there should be a boulder there. Check if there is a boulder. If yes, I’m usually satisfied, and chart my way using this hypothesis. If not, Popper would say that I should discard my hypothesis, but instead, I look for a different confirmation: there should also be a gully there. I’m just not letting go that easily.

Another example: I like to solve crosswords. Often, I have a candidate word that fits (i.e. a hypothetical answer), but too few letters confirm it. If I can find a perpendicular word that supports my hypothesis by confirming an agreeing letter, I accept it - almost immediately. I never bother to try to find words that would falsify my hypothetical answer.

I think it is clear that these are examples of Bayesian reasoning. In the first example, I have a hypothesis with some - well, let’s be frank: a lot of uncertainty. The prior P(H) is rather low. Now, I observe a boulder where I expected it from the hypothesis. I can then strengthen my hypothesis by a factor of P(D∣H)/P(D), that is, the probability of observing the boulder given that I am where I thought, divided by the probability of observing a boulder in some random place. Unless the terrain is generally very rocky, this improves my confidence a lot.

It is important that P(D∣H) isn’t one - even if I am right about my whereabouts, it is still possible that I don’t observe the boulder. The boulder might be hidden by bushes, or the map might be wrong, for instance.
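Here is a rough sketch of the boulder reasoning with numbers I made up: a prior of 0.2 that I am where I think I am, an 0.8 chance of spotting the boulder if I am right, and an 0.1 chance of spotting one if I am somewhere else. P(D) is computed from these by summing over both cases, and the gully is simply assumed to have the same odds as the boulder.

```python
def update(prior: float, p_d_given_h: float, p_d_given_not_h: float,
           observed: bool) -> float:
    """Posterior P(H | observation) for a yes/no observation D."""
    if not observed:
        # We did not see D, so use the complementary probabilities.
        p_d_given_h = 1 - p_d_given_h
        p_d_given_not_h = 1 - p_d_given_not_h
    # P(observation), summed over "H is true" and "H is false".
    p_obs = p_d_given_h * prior + p_d_given_not_h * (1 - prior)
    return p_d_given_h * prior / p_obs


prior = 0.2  # assumed: fairly lost, low confidence in my position

boulder_seen = update(prior, 0.8, 0.1, observed=True)
boulder_missed = update(prior, 0.8, 0.1, observed=False)
# After missing the boulder, look for the gully and find it.
gully_seen = update(boulder_missed, 0.8, 0.1, observed=True)

print(round(boulder_seen, 3))    # 0.667
print(round(boulder_missed, 3))  # 0.053
print(round(gully_seen, 3))      # 0.308
```

Because P(D∣H) is below one, a missing boulder lowers my confidence without sending it to zero, and finding the gully afterwards brings the hypothesis back above where it started. With P(D∣H) equal to one, the missing boulder would have killed the hypothesis outright, Popper-style.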

The argument is similar for crosswords; I leave that as an exercise for the reader.

The effectiveness and efficiency of science

What bothers me about this is that it looks to me as if we are wired to do our decision making using methods that resemble Bayes. And we are clearly not using falsification as a means of deriving knowledge. Now, nature isn’t perfect, but it is usually pretty good at optimizing things. If the scientific method of falsification really is an effective and efficient way of extracting knowledge from observations, why don’t we use it?

One possible reason for our innate Bayesian preference is that in the real world, we rarely deal in absolutes. As in the case with the boulder, our information is quite noisy, and it can be dangerous to change our opinion haphazardly, based on potentially unreliable information.

But even if our brains may be more Bayesian than Popperian, we’re not very good at it. We routinely make all kinds of crazy illogical errors, and it’s hard to imagine that this is somehow advantageous in evolution.

And perhaps that is the reason why Bayes works for us: we are intuitively sloppy, and in reality just have to make do with rough approximations. And the reason science sticks to positivist-frequentist methods is that science really wants to deal with absolutes.

Afterword

The astute reader may have noticed that my argumentation in this text is Bayesian. What can I say? I can’t help it, it’s in my nature. And if you found it convincing, chances are it’s in yours, too.

References

While writing this, I ran across this link. Although I can’t claim to have reasoned anywhere near as clearly or eloquently about it, it sums up my sentiment pretty well.

I couldn’t find any links off-hand, but I am aware that some people appear to think that intelligence evolved, not to improve our understanding of our environment, but as a social tool. Or: evolution made us dumb enough to get into all kinds of trouble, but compensated by making us smart enough that we could talk ourselves out of it. Sometimes. I don’t buy it.