Over the last year or two I’ve become pretty convinced (though not necessarily for good reason) that inference to the best explanation (IBE) is the main (if not the only) tool of inference in non-deductive contexts. Even in mathematics I’d like to say this is the case: in arriving at standard axioms, in supporting new ones, and in developing conjectures. (Of course, this is pending an account of explanation in mathematics.)

Anyway, some discussions I’ve had during that time with Peter Gerdes have made me wonder some more about the nature of explanation. He has on several occasions argued against the use of inference to the best explanation, claiming roughly that for something to be a good explanation it must already be true – so we can’t recognize good explanations until after we’ve recognized the truth of the explainer.

Now, I don’t know the literature on this at all (I should probably look at it quite a bit before I get around to graduating), so I don’t even know, for instance, what sorts of things A and B are supposed to be if A explains B. (Theories? Propositions? Facts? Events?) At any rate, it seems clear that you can’t explain something that didn’t happen, so B should be true (or actual, or whatever the appropriate property is for the sort of entity in question). However, the same doesn’t seem so clear for A.

In ordinary usage, it does at first seem that A has to be true – I can’t explain why Mary is looking around on the ground by saying that she lost her wallet, unless she actually did lose her wallet. However, in the scientific case (and I would guess, in the more complicated ordinary cases as well), it seems that good explanations really can come from false theories. For instance, Newton’s laws of gravitation and of motion explain Kepler’s laws of planetary motion (or at least, the data leading to his postulation of them) quite well – even though we all believe Newton’s laws don’t actually obtain. In fact, for this particular set of data, it’s not at all clear that relativity (or quantum gravity, or string theory, or …) is a better explanation just because it happens to be true (or closer to the truth).

There does seem to be something to the simplicity of the Newtonian explanation that makes it preferable. In addition, Newtonian mechanics is close enough to being correct that it seems to be useful as an explanation even though it’s actually false. That is, it helps us conceptualize what’s going on, make predictions about related facts, and remember Kepler’s laws when we’re not literally memorizing them, and so on. There are very few senses in which saying “God is crying” is a good explanation for why it’s raining, and a lot more in which saying something not quite accurate about warm fronts and dew points and such is. If our notion of explanation weren’t tolerant of falsehood in the antecedent, then science would rarely (if ever) help us explain anything – after all, we have good reason to believe that every scientific theory believed more than twenty years ago was false, which itself gives us good evidence to believe that current ones are false as well. However, it seems clear that science generally provides us better and better explanations of all sorts of phenomena, suggesting that false theories can in fact provide good explanations.

If explanation really is falsity-tolerant in the antecedent, as I’ve suggested, then I think we can get IBE off the ground. Of course, we’d need to tell some story like the one Jonah Schupbach was telling about a year ago at his blog, about why IBE tends to lead us towards the truth, even if it doesn’t presuppose it. And we’d have to watch out for the worries van Fraassen raises for using IBE as a supplement to probabilistic reasoning (which I learned about from a post by Dustin Locke on his blog). I think these are compatible, if Jonah is right that IBE is just a heuristic for simplifying Bayesian computations, rather than a supplement to them as van Fraassen supposes. But of course we’d need to work things out in more detail.

In Tony Martin’s paper “Mathematical Evidence” (in Truth in Mathematics, edited by Dales and Oliveri), he argues that one should adopt axioms well up the large cardinal hierarchy (countably many Woodin cardinals, I believe) because they provide a good explanation for various facts that we can already observe from ZFC. This is because they are (approximately) equivalent to projective determinacy, which states that every “projective” set of real numbers has various nice properties. The investigation of these properties led to new unifying results in recursion theory, stating that various sets of Turing degrees contain cones, and that the Wadge degrees have a particularly nice structure up to a very high level. Results related to these properties are now important in recursion theory and wouldn’t have been discovered without the axiom, and every particular consequence of these results that has been considered has in fact been verified directly from ZFC (though often in more difficult ways). Since projective determinacy is known to be independent of ZFC (if ZFC is consistent), it seems that we need to postulate it in order to properly explain the phenomena we can already observe just knowing ZFC.

Based on arguments like this, it seems that Quine misstated the naturalist position when he said that it should tell us to adopt ZFC+V=L. His reasoning was that ZFC should be accepted because it is an indispensable part of our scientific explanations of the world. However, the only particular sets that are indispensable in these explanations are all constructible, so those are the only ones whose existence we should countenance (all the other sets seem to be in some sense idle, like the angels that make sure gravity keeps doing its job, and the elves that make quarks obey the strong nuclear force). Thus, we should believe that the constructible sets are all that exist, so V=L.

But V=L is incompatible even with fairly weak large cardinal axioms (the existence of a measurable cardinal), and therefore with the stronger axioms advocated by Martin as part of an explanation of what’s going on with sets of Turing degrees and such. So I think Martin’s argument suggests that we do in fact have evidence for sets beyond L. This evidence may not be based directly in the physical world, but if Quine is serious about his holistic picture of science, then ZFC is just as much a part of science as relativity. ZFC is justified because we need (large parts of) it for relativity (and just about every other scientific theory we’ve ever considered), and relativity is justified because it gives the best explanations of our observations; by the same token, projective determinacy is justified because it gives the best explanations of phenomena in ZFC.

So if we believe the indispensability theorist, then we really should believe most of the large cardinal axioms, and not just ZFC. I mentioned this point in passing in a recent post.

However, I think most of this will be able to go through for the fictionalist just as well as for the indispensability-argument realist. If ZFC isn’t actually indispensable for our science, but is still quite useful, then someone like Hartry Field is willing to accept it at least as a good story, even if not literally the truth. But once we’re considering the story, I think we should adopt projective determinacy within the story as well. It seems to me that what is true in a fiction is not just what the author has literally asserted – further facts may be true in it as well, if they provide good explanations for what the author has in fact asserted. For instance, in a detective novel with a stupid detective, there may be enough clues presented for the reader to find out who did it, even if the detective never does and the author never explicitly says who did. And in a movie, it may become clear that a certain scene was actually a dream and not reality, because that’s the best way to reconcile it with the rest of the characters’ actions and desires. I think the audience discovers these facts in just the same way that we use inference to the best explanation in science (and our ordinary lives). Such inferences are always defeasible (we may find a better explanation, the author may explicitly deny the truth of the inference, further evidence may count against the inference, etc.) but it seems plausible that they are always active, whether in fiction or reality.

Therefore, I think that the fictionalist is just as justified in ascending the large cardinal hierarchy as the indispensability theorist, and both of them are in fact justified. Penelope Maddy is worried that they might not be, because of Quine’s argument I’ve paraphrased above (I’ve paraphrased that argument from my memory of Maddy’s paraphrase of it, so I may have misrepresented one or both of them through an inaccurate memory). This worry is a large part of what drives her to her position in Naturalism in Mathematics, but I think it is unjustified. Both the fictionalist and the Quinean naturalist should accept large cardinal axioms, just as Maddy believes set theorists should.

Yesterday I was explaining to a chemist friend just what sorts of questions philosophers of physics, biology, and math are interested in, and we were speculating what philosophers of chemistry might work on. (I had just found Synthese’s June 1997 special issue on philosophy of chemistry, but hadn’t read any of the articles yet.) It became clear in our discussion that he saw the primary goal of science as enabling us to do useful things, while I had always seen the goal as enabling us to understand how the world works.

Of course, it’s clear that having either as a fundamental goal licenses the other as an instrumental goal – it’s generally hard to change the world without having any understanding, and hard to understand the world without using various aspects of technology to change small parts of it. There does seem to be an ordinary language distinction between science and technology, in which science focuses on understanding and technology focuses on acting. But it’s also probably true that this distinction is overstated – it’s likely that large numbers of scientists see each of “understanding the world” and “making the world a better place” as their primary goal, and an even larger group might say that it’s some combination of the two. So we can’t just ask the scientists which is more important.

Arguing in favor of the understanding side, it seems to be a very (scientifically) unsatisfactory situation when pharmacologists are able to provide medicines that treat various conditions, even though they have no understanding of the underlying mechanisms. If we compare this situation to the converse, we see that in mathematics, it’s a perfectly normal (and not distressing) situation to develop understanding of some system without thereby increasing our practical powers. But the defender of the practically-based picture of science might respond that math is a non-representative case, and point out that a large part of the string theory controversy is exactly about the fact that string theory may explain the world, but it doesn’t help us do anything. It might just be a prejudice of philosophers to say that understanding is the more fundamental goal of inquiry, and ability is only secondary – after all, in our profession, epistemology is central, while philosophy of action and even ethics are somewhat secondary.

Of course, to switch from a view of science as aimed at explanation to a view of science as aimed at practical results would mean a radical change in a more pragmatist direction. But it’s not clear just how we can argue that such a shift would be wrong.

I remember one theorem that I proved, and yet I really couldn’t see why it was true. It worried me for years and years… I kept worrying about it, and five or six years later I understood why it had to be true. Then I got an entirely different proof… Using quite different techniques, it was quite clear why it had to be true. – Michael Atiyah

On the FOM e-mail list there have recently been a few discussions of the notion of explanation in mathematics by Allen Hazen, Richard Heck, and Richard Zach, debating whether it is yet a well-enough-defined topic to work on. However, I think probably the best way to give it better definition is to gather more examples of it.

So if you, or any of your friends or colleagues, have good examples of proofs that are explanatory (or not), then send them to me at easwaran at berkeley dot edu. I suppose if it’s a very short example, or just a link to some example posted elsewhere, a comment would be good too, so that others can see it. Once I have a few of them, I’ll try to figure out some useful way to make these proofs accessible to others too, and credit the submitters.

Probably the most useful examples would be two proofs of the same result, one of which is clearly a better explanation than the other. For instance, Fürstenberg’s topological proof of the infinitude of primes, despite being remarkably clever, is clearly not as good an explanation of this fact as Euclid’s original proof. Of course, plenty of good examples will probably be like whatever Atiyah was talking about above, and exist in contemporary research, rather than in well-established results. I don’t expect to necessarily understand the relevant proofs, but it’ll still be helpful to have a collection including them, both for other people’s use, and in case I want to check some general or structural relations between the proofs.
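Part of what makes Euclid’s proof feel explanatory is that it rests on a single transparent construction: multiply any finite list of primes together and add one, and the result must have a prime factor missing from the list. Here is a minimal Python sketch of that construction (the trial-division helper is my own illustration, not part of either proof):

```python
def smallest_prime_factor(n):
    """Return the smallest prime factor of n (for n >= 2), by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime

# Euclid's construction: N = (product of the listed primes) + 1
primes = [2, 3, 5, 7, 11, 13]
N = 1
for p in primes:
    N *= p
N += 1                              # N = 30031

witness = smallest_prime_factor(N)  # 59, since 30031 = 59 * 509
assert witness not in primes        # so the original list was incomplete
print(witness)                      # 59
```

Note that the construction doesn’t claim N itself is prime – 30031 isn’t – only that none of the listed primes can divide it, since each would leave remainder 1.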

The phrase, “the unreasonable effectiveness of mathematics” goes back to the title of an essay by the physicist Eugene Wigner in 1960. He points out that mathematics is developed largely on aesthetic grounds, and yet large parts of it eventually get co-opted by physics and the other natural sciences to formalize parts of their theories. There seems to be no reason to believe that mathematics (especially the limited fragment of mathematics that humans actually get around to developing) should have anything to do with the physical world. He then goes on to point out how surprising it should be that it’s even possible to formulate laws of physics in the first place, let alone that they should be mathematical. And he spends the last little bit of the essay discussing the conceivability both of finding a unified theory to which all our scientific theories are approximations, and of the impossibility of such a theory, which would leave us with multiple contradictory theories, each good for its own domain. The fact that we’ve managed to come so far seems to cry out for explanation.

Wigner seems to miss some aspects of the development of mathematics though. He suggests that mathematicians find something beautiful and develop it, but doesn’t point out that these theories are actually very often developed just to explain things in already-established areas of mathematics. For instance, complex numbers were developed to fill in the steps in the solution of certain cubic equations over the real numbers. At least some of the theory of groups was first developed specifically by Galois and Abel to show why there was no corresponding method for solving quintic equations. If all of mathematics was developed for motivations resembling these (as I think plausible), then once we realize that the very basic parts of mathematics are applicable, it may be no surprise that the rest of it is as well. If the natural numbers apply to some phenomenon, and some other theory was developed to explain the natural numbers, then it seems plausible that this theory would be applicable to the explanation of the phenomenon the natural numbers apply to.
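The cubic case can be made vivid with Bombelli’s classic example (my illustration – a specific equation isn’t named above): x^3 = 15x + 4 has the perfectly real root x = 4, yet Cardano’s formula can only reach it by passing through the complex numbers.

```python
# Cardano's formula for the depressed cubic x^3 = p*x + q:
#   x = cbrt(q/2 + sqrt((q/2)^2 - (p/3)^3)) + cbrt(q/2 - sqrt((q/2)^2 - (p/3)^3))
# For x^3 = 15x + 4 the quantity under the square root is negative,
# so the intermediate steps force us out of the real numbers.
import cmath

p, q = 15, 4
disc = (q / 2) ** 2 - (p / 3) ** 3   # 4 - 125 = -121
root = cmath.sqrt(disc)              # 11i: already non-real
u = (q / 2 + root) ** (1 / 3)        # principal cube root of 2 + 11i, i.e. 2 + i
v = (q / 2 - root) ** (1 / 3)        # principal cube root of 2 - 11i, i.e. 2 - i
x = u + v                            # the imaginary parts cancel, leaving x = 4
assert abs(x - 4) < 1e-9
assert abs(x**3 - (15 * x + 4)) < 1e-6
```

The imaginary parts cancel only at the very end; before Bombelli’s rules for manipulating these “impossible” quantities, the formula looked like it simply failed in exactly the cases where the cubic has three real roots.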

Of course, this still leaves open the question of why so much mathematics seems to apply in contexts other than these. If group theory was developed to explain properties of real numbers and other fields, then why should it apply to the fundamental particles of physics in a context independent of any such field?

Greg Frost-Arnold has a fascinating post suggesting that in fact in pre-Galilean astronomy, the effectiveness of mathematics might not have seemed so unreasonable. After all, they believed then that the “heavenly bodies” had similar properties of permanence and perfection to the objects of mathematics. And if they were all created by the same God, then it would make sense that mathematics and astronomy had a lot of overlap. The effectiveness only started seeming really unreasonable when Newton showed that there were mathematical theories unifying earthly and astronomical motion.

At any rate, this contemporary effectiveness of mathematics, which seems so unreasonable, for some reason hasn’t been a very central question in the philosophy of mathematics. Instead, people have focused on more foundational questions about mathematics, like what the nature of mathematical truth is, and how it is that we have access to it. But I think Hartry Field’s program in Science Without Numbers gives the closest thing to an explanation for the effectiveness of mathematics. His main goal is to prove a certain claim about the ontology of mathematics (namely, that there is none), but I think it’s more successful as an extension of the methods of Krantz, Luce, Suppes, and Tversky in their Foundations of Measurement to explain how mathematics can be applied in a rigorous manner. He formulates the axioms of Newtonian mechanics in such a way that the mathematics applied to them can be straightforwardly seen to be a conservative extension. Thus, he justifies this application.

Michael Dummett, in “What is Mathematics About?” criticizes this program, saying that “Field envisages the justification of his conservative extension thesis as being accomplished only piecemeal.” Dummett suggests that this would be unsatisfying, because it would never make mathematics completely justified, but only justify particular applications of particular theories. Whether or not he’s right that this is all that Field would accomplish (Field seems to claim to have shown that all of mathematics is conservative over any non-mathematical theory), I think that this is actually almost exactly the goal we should want to achieve. It wouldn’t do to suggest that any mathematical theory can be applied to any aspect of the world – there are only certain applications that make sense, and only those should be justified. We would still face some puzzles as to why it is that so much mathematics ends up applying to so much of the physical world, but at least each particular application would no longer seem so unreasonable.

I went to a math graduate student talk yesterday about regular primes and their relations to Fermat’s Last Theorem, class numbers of fields, zeta functions, and the like. The thing that struck me most about the talk was how many “proofs” due to Euler were used that really did nothing like what a proof is supposed to do.

Here’s a simple example of the sorts of “proof” involved in the lecture – we know that if a geometric series 1+r+r^2+r^3+… converges, then a simple calculation shows that it converges to 1/(1-r). (If we just multiply through by 1-r, we see that every term cancels except for the 1 – more rigorously, if we multiply the partial sums by 1-r, we get 1-r^(n+1), and if |r| < 1, then r^(n+1) goes to 0, so the partial sums converge to 1/(1-r).) Euler, however, was happy to apply the formula 1/(1-r) even when the series diverges, concluding for instance that 1+2+4+8+16+… = -1 and that 1+5+25+125+… = -1/4. These “sums” look absurd, but there is a setting in which they come out exactly right. The real numbers are the completion of the rationals under the ordinary absolute value, but if we instead use the p-adic distance for some prime p, we get the p-adic numbers as the completion. Amazingly enough, in the 2-adics, the series 1+2+4+8+16+… really does converge to -1. And in the 5-adics, the series 1+5+25+125+… really does converge to -1/4. (The argument from above actually works completely unchanged, except that |r| < 1 now holds, since |2| = 1/2 in the 2-adic metric and |5| = 1/5 in the 5-adic metric.)

Here’s another blogospheric discussion of this phenomenon.
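To make the p-adic convergence claims concrete, here is a small Python sketch (my own illustration, not from the lecture) that computes the p-adic absolute value |x|_p = p^(-v), where p^v is the exact power of p dividing x, and measures how close the partial sums get to the claimed limits:

```python
from fractions import Fraction

def p_adic_abs(x, p):
    """p-adic absolute value |x|_p = p^(-v), where p^v exactly divides x."""
    if x == 0:
        return Fraction(0)
    x = Fraction(x)
    v = 0
    num, den = x.numerator, x.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(p) ** (-v)

# Partial sums of 1+2+4+8+... get 2-adically closer and closer to -1:
s = Fraction(0)
for n in range(10):
    s += 2 ** n
print(p_adic_abs(s - (-1), 2))             # 1/1024: distance after ten terms

# And partial sums of 1+5+25+125+... approach -1/4 in the 5-adic metric:
t = Fraction(0)
for n in range(10):
    t += 5 ** n
print(p_adic_abs(t - Fraction(-1, 4), 5))  # 1/9765625
```

Under the ordinary absolute value these distances blow up, which is exactly why the series diverge in the reals; the convergence argument carries over unchanged only because |r| < 1 holds in the p-adic metric we complete under.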