Against parsimony

Reading some relatively boring election-related stuff from Andrew Gelman, I came across an older post of his titled Against Parsimony. That stuck out to me, since parsimony had long been emphasized as an aspect of good theories. Fewer moving parts generally means a lower probability of error (this is related to the conjunction fallacy), and in practice parsimony is necessary to avoid curve-fitting. Having a simple theory which predicts out-of-sample data is generally taken as an indicator of having found something about reality, which is why physicists looking for the most fundamental laws put so much emphasis on the “beauty” or “elegance” of theories/equations. Gelman took his title from a paper by Albert O. Hirschman. I still haven’t finished “Exit, Voice, and Loyalty”; since starting it a while back I’ve bought and gotten midway through Mark Kleiman’s “When Brute Force Fails” and just today started Mancur Olson’s “The Rise and Decline of Nations”, in the introduction of which he spends a lot of words on the importance of parsimony in explanations of the titular problem.
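
To make the curve-fitting point concrete, here is a minimal sketch in Python (my own toy example, not anything from Gelman, Hirschman, or Olson): polynomials of increasing degree are fit to noisy linear data, and the most flexible model wins in-sample while tending to lose out-of-sample.

```python
# Synthetic noisy linear data, fit by polynomials of increasing degree.
# The flexible model wins in-sample and tends to lose out-of-sample.
import numpy as np

rng = np.random.default_rng(0)

# True process: a straight line plus noise.
x_train = np.linspace(0, 1, 20)
y_train = 2 * x_train + rng.normal(0, 0.3, x_train.size)
x_test = np.linspace(0, 1, 200)
y_test = 2 * x_test + rng.normal(0, 0.3, x_test.size)

for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

The degree-9 fit hugs the training noise (lowest train MSE) but typically predicts held-out data worse than the parsimonious straight line.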

I don’t find Gelman’s argument particularly useful or compelling. OK, so over-simplification can be bad – good grief, who knew? The opposite practice, throwing in tons of fudge parameters that produce a delusion of understanding of complex problems, is much more common and much more dangerous.

Reductionism: describe a bridge down to the exact position of every particle that makes it up. The information is so overwhelming as to be unworkable.

I don’t agree. There are plenty of problems that would require such an accurate description of the bridge – we tend to ignore them because we can’t handle that amount of data even if we could generate it.

I suppose you’re right, Nanonymous, but people sometimes use the phrases interchangeably.

Thanks for the link, Stephen. I’ll try to find a more exact dingalink for readers. UPDATE: Wasn’t able to find a good subset where they explicitly discussed parsimony, but this topic sounds closest. UPDATE 2: I kept listening and found that this topic might be more relevant.

The combinations that can arise in a long series of coin flips can be divided into regular sequences, which are highly improbable, and irregular sequences, which are vastly more numerous. Wherever we see symmetry or regularity, we seek a cause. Compressibility implies causation. We can rid ourselves of the problematic nature of traditional inductive probability by redefining probability in terms of computational theory, via Kolmogorov complexity. The chance that a universal machine U produces string x from a randomly chosen program p with U(p) = x is dominated by the shortest such program and is roughly
P(x) = 2^(-K(x)),
where K(x) is the Kolmogorov complexity of x, while the chance of producing x directly by |x| fair coin flips is 2^(-|x|). Taking the ratio of the two: it is 2^(|x|-K(x)) times more likely that x arose as the result of an algorithmic process than by a purely random process.
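
For a hands-on version of the compressibility argument, here is a small sketch of my own, with zlib standing in for K(x); Kolmogorov complexity is uncomputable, and a real compressor only gives an upper bound on it.

```python
# A rough sketch of "compressibility implies causation," using zlib
# as a crude, computable stand-in for K(x). My own toy example, not
# from the original discussion. Requires Python 3.9+ for randbytes.
import random
import zlib

random.seed(0)
n = 10_000  # string length in bytes

regular = b"\x55" * n            # alternating bits 0101...: a regular sequence
irregular = random.randbytes(n)  # fair coin flips: an irregular sequence

for name, x in (("regular", regular), ("irregular", irregular)):
    k = len(zlib.compress(x, 9))  # proxy for K(x), in bytes
    # log2 of the 2^(|x| - K(x)) odds ratio, in bits: how much more
    # likely x is to have an algorithmic cause than to be raw chance.
    print(f"{name}: |x| = {8 * len(x)} bits, proxy K(x) = {8 * k} bits, "
          f"log2 odds = {8 * (len(x) - k)} bits")
```

The regular string compresses to almost nothing, so the 2^(|x|-K(x)) odds overwhelmingly favor an algorithmic cause; the random string is essentially incompressible, so the ratio stays near 1 and gives no such evidence.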