darling it's better down where it's meta

The Casuist’s Razor

“Casuistry” is today a near synonym for “sophistry”: a certain kind of intricate, deceptive reasoning; highly pejorative. The word originally referred to case-by-case moral analysis (and, as philosophical jargon, still does). But the casuist, evidently, abused the rich particularities offered by reality to justify his prior intuitions. With a torrent of excuses and exceptions he eroded the barrier between right and wrong. This was unacceptable.

If casuistry has fallen out of fashion, then principled reasoning is our new rising star.a Our most successful scientific theories—physics and evolution in particular—are seen as having succeeded on the strength of their simplicity, their ability to explain a wide range of phenomena using only deep, universal principles. The direction of causality is unclear, but today’s intellectual discourse is saturated with a similar reductionist impulse, which I contend is as much aesthetic as practical. Consider the 2012 Annual Question from Edge.org: “What is your favorite deep, elegant, or beautiful explanation?” One wonders whether that disjunction was really necessary.

There are epistemological advantages to keeping your theories small. Derive all your judgments from simple premises, and you no longer risk overfitting. An argument with fewer moving parts requires less justification, is less vulnerable. Meta-level considerations can pinpoint common patterns to achieve vast compression. Hail Occam’s Razor.

A theory of theories

But in a sense, this kind of “principled” discourse is the exact opposite of reasoning “from first principles” in physics, even as it springs from physics envy.b In the latter, one begins with the fundamental and derives the contingent, generally proceeding along the direction of causation.c In the former, one begins by identifying broad patterns in object-level phenomena, perhaps suggestive of common underlying mechanisms, which one then uses as heuristics for analyzing further phenomena. But the patterns are not themselves fundamental or causal. To rely on them is singularly emergentist, not reductionist.d

The meta level has its own characteristic vulnerabilities, and that’s just the first of them:e In reifying these patterns, you begin to see them as more fundamental than they really are, conflating the above senses of “principle.”f You begin to give your theory too much explanatory power.g You slip into the habit of thoughtlessly pattern-matching, or of letting connotations sneak in, or of taking your principle beyond the bounds of its validity; you could just be wrong, and the cost of being wrong is then multiplied by the scope of your meta-argument.h You feel that you’ve solved a problem just by naming it.i It’s often unclear how to apply meta-level considerations, or hard to notice when to do so. The difficulty of communicating life advice is a special case of the difficulty of meta discourse: some things require experience to really internalize; someone else’s crystallized experiences don’t provide a guide to action.j You lose your purpose and fail to make progress—we can trade principled arguments all day without reference to what we’re really talking about, and go to bed having learned nothing.k A crystallized pattern can be a mere distraction, if the information provided by its corresponding heuristic would be screened off by the specific details of any individual case.l

Those last couple pitfalls are the most important, and if you can avoid them, you’ll have an easier time with the rest. A theory that doesn’t account for detailed behavior is an approximation, and even in scientific domains, you can find conflicting approximations. When that happens—and it’s “when,” not “if”—if you want to keep using your approximation, you have to use the details of the situation to explain why your approximation is valid or otherwise reconcile your principles. Your best defense against reductio ad absurdum, against Proving Too Much, is casuistry. Expect things to be complex, expect details to matter. Don’t ascribe intention or agency to abstract concepts and institutions. Look for chains of cause and effect. Look at individual moving parts and the forces acting on them. Make empirical predictions, and look for unintended empirical predictions. Don’t use an outside view to screen off the inside view; use it to force a confrontation where they differ. Ask what the complementary principle explains, and look to the specifics to resolve the tension.m

An aside on analogies

The Casuist’s Razor, which I’ll postpone defining, has a close relationship with the Proves Too Much heuristic—the broader the theory, the more likely to prove too much—and this makes me pause for one last digressive warning about the use of analogy in argument. We all know that arguments from analogy are weak at best and misleading at worst. The Noncentral Fallacy is a form of analogical reasoning, and it’s a candidate for Worst Argument In The World. If you want to sustain your argument, you have to work painstakingly to find and account for any dissimilarities.n

Regrettably, using the Proves Too Much heuristic requires that you make some kind of analogy: you must claim that your interlocutor’s argument would carry through in another situation (where it evidently does not). You certainly can do all the entailed justification, as Eliezer does for The Physics Diet. But if you don’t, your cries of “Proves Too Much” will themselves prove too much, all because you failed to account for the details. Analogies are useful as far as they point you to those crucial details.o

An example

I was recently frustrated by some discussions of gender disparity in various fields. Some people claimed the ratios were evidence of gender bias; others responded that this claim proved too much, because law, medicine, and business were more equitable, so it must be a matter of choice. But once you start accounting for everything—pay, prestige, history, socialization, position within a field, the differences over time between jazz and classical musicians and MBAs and entrepreneurs and authors and genre authors and—you start to worry about whether you can sustain the analogy. That maybe there’s no Grand Unified Theory here.

You have one explanation that says people freely choose their careers according to their values, skills, and interests, and another that says systematic bias hinders or discourages people from joining and excelling in a field. On the surface, each seems to explain some of your data, and maybe principled excuses could be made—here we manage parity in the face of bias because prestige and pay incentivize women to fight for it, or there we have disparity due to historical mistakes of which we’re now somehow innocent—but these are never satisfactory to everyone, since people can come up with more excusing principles than independent fields to tease them apart.p

So maybe things really are “intrinsically messy and complex and involving many different perspectives, not all of which can be reduced to a single common framework,” as Tyler Cowen suggests.q The solution starts looking a lot more like casuistry, and you pause before you brush aside evidence of real problems or differences with a broad wave of your hands.

So what does application of the Razor look like? Recognize that you have these principled explanations in tension, admit that both are at least relevant, that your casual non-causal statistical inference from the data you’re looking at isn’t going to cut it, and look at what actually happens. In particular, look in places where the two explanations are expected to conflict.r When you have conflicting non-constructive arguments, it’s a good sign that you need to do some construction. Proof by contradiction is unsatisfactory to begin withs, but if you’re facing mutual reductio ad absurdum then it’s practically unsalvageable. I’ll leave the task of finding this narrow causal information to you. Then, in evaluating this information, don’t even bother compressing it into one explanation or the other. The question has been dissolved as well as it will be. Act as you see fit.t

Detailed judgments

Principled moral and legal judgments share many problems with the above kinds of explanation.

In a recent LW discussion, Sophronius used the following (made-up) example of a failure to write concisely and informatively: “I personally believe that, in cases X Y Z and under circumstances B and C, ceteris paribus and barring obvious exceptions, it seems safe to say that murder is wrong, though of course I could be mistaken.”u

To which I responded: “Your ‘murder is wrong’ example is a poorly-constructed sentence, sure, for reasons beyond the above. But the details and qualifications are the real content of that statement. I don’t think that’s because ‘murder is wrong’ is an uncharacteristically content-free claim. For any basic principle or sweeping generalization, there will be cases where it obviously works, cases where it obviously doesn’t, and the real information is in where you draw the line.”

This is how you bring a discussion back down to the object level: Take the argument from marginal cases, an attack on affording moral status to humans alone—no value-relevant principle selects all and only humans—and generalize it. “To make that judgment, you have to distinguish between these cases, but any principled basis for doing so fails” always struck me as a very powerful, general argument. Disputants wield it in debates on animal rights, copyright and piracy, software patents, drug use, abortion and infanticide, protected and prohibited speech. Bizarrely, people then follow this up with a claim that their principles are the right ones.

Some counterarguments and attempted substitutions aim to patch up the targeted principles: the “broken chair” response to the argument from marginal cases, or “abortion is murder once the fetus can survive (aided) outside the womb.” This is risky. If you search the space of all possible principles, you’ll probably find one that supports your conclusions.v

Instead, take up the corresponding principled counterargument to the generalized marginal-cases argument: “There are clearly acceptable and clearly unacceptable cases; we have to draw the line somewhere. We can’t afford to judge every marginal case individually, let alone do so consistently, so it’s best just to define an arbitrary boundary.”w

And then it turns out that both parties just want to draw the line in different places, principles be damned.

You can try to hone your meta-level arguments, but in the end, you’d do better to get down to the nitty-gritty of consequentialism. There’s a philosophical argument for consequentialism in here, but I don’t care about that. I care about the practical issues of resolving marginal cases whatever your ethics or meta-ethics. That is: don’t try to judge them by picking better principles. Consider the value of true positives and true negatives; consider the costs of false positives and false negatives; consider how the cross-hair of any principle apportions cases to each quadrant. Weigh consistency and simplicity against improved discrimination. Your goal is to skip to that part of the discussion as quickly as possible. Call it the Casuist’s Razor, stripping your discourse of the empty posturing and the grand unified theorizing that leads nowhere, leaving the hard core of object-level analysis and empirical predictions.x
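To make the quadrant-counting concrete, here is a toy sketch of what it means to judge a line-drawing principle by its consequences rather than its elegance. Everything here (the scores, labels, and payoff numbers) is invented purely for illustration, not a claim about any real case:

```python
def expected_value(threshold, cases, payoffs):
    """Score a line-drawing rule: each case has a score and a true label;
    the rule accepts cases at or above the threshold. The total is the sum
    of the payoffs of the quadrants the rule sorts the cases into."""
    total = 0.0
    for score, actually_acceptable in cases:
        accepted = score >= threshold
        if accepted and actually_acceptable:
            total += payoffs["true_positive"]
        elif accepted and not actually_acceptable:
            total += payoffs["false_positive"]
        elif not accepted and actually_acceptable:
            total += payoffs["false_negative"]
        else:
            total += payoffs["true_negative"]
    return total

# Marginal cases cluster near the middle; clear cases sit at the extremes.
cases = [(0.9, True), (0.8, True), (0.55, True), (0.45, False),
         (0.5, False), (0.2, False), (0.1, False), (0.6, True)]

# Asymmetric stakes: a false positive costs more than a false negative.
payoffs = {"true_positive": 1.0, "true_negative": 1.0,
           "false_positive": -3.0, "false_negative": -1.0}

# Compare candidate lines by the outcomes they produce, not by their elegance.
best = max((expected_value(t, cases, payoffs), t)
           for t in [0.3, 0.4, 0.5, 0.6, 0.7])
```

The point is not the arithmetic but where it directs attention: two competing “principles” (thresholds) can only be compared by looking at how they carve up the actual cases, and the answer shifts when the stakes or the case distribution shift.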

On self-defeating arguments

Yes, I noticed, don’t worry. I’ve explained the principle of insufficiency of principled explanation; I think you know what to do. My rant against excessive meta, against a fixation on theories and against discourse purely about discourse, against thinking on the page and talking in circles and ending with nothing useful to take away, is all of those things. Hopefully it will be my only such essay.y

I brought in examples where I could, but perhaps they’ll be insufficient or too distracting.z My main hope is that I’ll now have something to point to when these problems next come up, and a set of crystallized ideas that I can draw on more lucidly. It’s a footnote to any polemical essay I write, including itself. It’s meant to push a certain audience in a certain direction, and it’s styled to appeal to that audience. I’m not trying to make a broad philosophical point about the validity of statistical inference or the nature of explanation or moral particularism or consequentialism. I’m more trying to pull a deep unseriousness of thought out of its own blind spot. The Casuist’s Razor itself is nothing new; we all know the virtues of specificity and empiricism. But messy complexity isn’t going away, and we can all use better tools to recognize and analyze it.

In this metaphor, stars take centuries to rise.

By the way, physics envy poisons even physics. I made a related point on the learning of physics, but the best actual work, too, sweats the details, and I’d guess that the research community’s aesthetic exerts an undue pull in the tug-of-war between useful and obstructive idealization.

I sympathize with Tyler Cowen’s sentiments on the basis of his disagreements with Robin Hanson, although I can’t speak to those particular disagreements: “Robin is very fond of powerful theories which invoke a very small number of basic elements and give those elements great force. He likes to focus on one very central mechanism in seeking an explanation or developing policy advice. Modern physics and Darwin hold too strong a sway in his underlying mental models. He is also very fond of hypotheses involving the idea of a great transformation sometime in the future, and these transformations are often driven by the mechanism he has in mind. I tend to see good social science explanations or proposals as intrinsically messy and complex and involving many different perspectives, not all of which can be reduced to a single common framework. I know that many of my claims sound vague to Robin’s logical atomism, but I believe that, given our current state of knowledge, Robin is seeking a false precision and he is sometimes missing out on an important multiplicity of perspectives. Many of his views should be more cautious.” Amusingly, Robin Hanson elsewhere compares himself to Bryan Caplan almost identically: “Bryan’s picture seems to be of a long metal chain linked at only one end to a solid foundation; chains of reasoning mainly introduce errors, so we do best to find and hold close to our few most confident intuitions. My picture is more like Quine’s “fabric,” a large hammock made of string tied to hundreds of leaves of invisible trees; we can’t trust each leaf much, but even so we can stay aloft by connecting each piece of string to many others and continually checking for and repairing broken strings.” I wonder what Caplan has to say about Cowen.

Because it’s bad form to go meta without reference to an object level, I’m going to take examples from the answers to the Edge.org question. I think that should help me avoid these vulnerabilities in my own writing, and I won’t hurt anyone’s feelings. You can mouse-over footnotes if that helps maintain the flow of reading.

Mihaly Csikszentmihalyi: “I refer to the well-known lines Lord Acton wrote in a letter from Naples in 1887 to the effect that: ‘Power tends to corrupt, and absolute power corrupts absolutely.’ At least one philosopher of science has written that on this sentence an entire science of human beings could be built.”

Armand Marie Leroi skirts this steep slope; I’m well-disposed to variation-selection processes, but it’s probably not the best explanation for everything.

Satyajit Das: “Inexactness and quantum mechanics challenge faith as well as concepts of truth and order. They imply a probabilistic world of matter, where we cannot know anything with certainty but only as a possibility. It removes the Newtonian elements of space and time from any underlying reality.”

Example of both of the previous problems: P. Z. Myers tells us that his most fundamental explanation “is a mode of thinking: to understand how something works, you must first understand how it got that way.”

One of the only helpful comments I ever received on an essay, long ago: “Stop thinking on the page.” I’ll let myself be the example for this one: extra credit for pointing out where I lose sight of the ground.

I somewhat get this vibe, that of a label that doesn’t provide additional information, from Luke’s use of “contrarian” here, but I imagine he knows better than that: see his footnote 1.

That said, a hallmark of Scott Alexander’s universally praised style is the extended analogy. So how does he make it work? Scott is aware of the difficulties, I would guess, since he rarely explicitly argues by analogy, misreaders to the contrary. Scott’s extended analogies are more like framing devices, metaphors designed to engage the reader and lay out the qualitative structure of his reasoning; he then argues the metaphor’s tenor on its own merits. A pure example of this is his Empire/Forest Fire. Elsewhere his analogies play the role of a concrete example in an abstract argument—see for example In Favor of Niceness, Community, and Civilization, section III. I do have doubts about the virtue of this technique in practice. Improvements to readability aside, non-argumentative analogies still induce the subtle intuitive effects that make everyday analogical reasoning so dangerous, although I wouldn’t go so far as to call this a sinister rhetorical trick. Nope, turns out he just really, really likes analogies, and lately it hasn’t been working well for him.

See also Mary Hesse on analogies in science: useful not as arguments or evidence in themselves, but rather for generating questions whose resolutions advance models.

“Ah, but if prestige matters then what about prestigious field X?”/“That use of history would make the wrong prediction for field Y!” followed by “Well but X/Y also has A/B going on…”

I wanted to bring in Tetlock and foxes and hedgehogs for some better empirical grounding, but there’s a bit of tension in what I understand to be his conclusions: humans who take more complex and combined perspectives (foxes) make better calibrated and more discriminating predictions than those who use one big idea (hedgehogs), but both frequently do worse than simple models. I highly recommend his book Expert Political Judgment: How Good Is It? How Can We Know?, but it’s worth keeping in mind that he’s describing a modest statistical difference measured in a particular way in a particular domain.

Consider my comment and later reply on the SSC post about compound interest for a related example.

To go further aside: if I explain that a proposed heat engine that works using the expansion of freezing water does not in fact produce unlimited work for finite heat “because that violates thermodynamic laws,” true as that might be, then you’ll be left feeling something’s missing until you actually compute the engine’s maximum efficiency by walking through its cycle, see how the Clausius-Clapeyron relation limits it, and then work out some physical intuition for what that means.
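For the curious, here is a sketch of that computation, keeping track only of magnitudes (this is the standard textbook argument, not anything specific to the original proposal): per cycle, water freezes at pressure $P+\Delta P$ and temperature $T-\Delta T$, expanding by $\Delta v$ per unit mass against the load, and the ice is then melted again at pressure $P$ and temperature $T$, absorbing latent heat $L$.

```latex
\begin{align}
  W &\approx \Delta P \,\Delta v
    && \text{(work delivered per cycle)} \\
  Q &\approx L
    && \text{(latent heat absorbed on melting)} \\
  \frac{dP}{dT} &= \frac{L}{T\,\Delta v}
    && \text{(Clausius--Clapeyron, along the melting curve)} \\
  \eta = \frac{W}{Q} &\approx \frac{\Delta P\,\Delta v}{L}
    = \frac{\Delta T}{T}
    && \text{(precisely the Carnot bound)}
\end{align}
```

The very relation that governs the melting curve pins the engine’s efficiency to the Carnot limit: finite heat in, finite work out.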

What I’m calling the Casuist’s Razor here is really just a specific technique for dissolving questions about explanations: look to causal accounts at the borders. This takes work. It’s fine if you don’t feel like doing it or if the necessary information isn’t available; just admit that you’re out of your depth.

Both the post and most of its comments strike me as another example of a bad meta discussion. Sophronius made some useful clarifications and specific claims in the comments, which fact I think only reinforces my point to follow.

My real aim here is to give it a name that’s so bad you’ll be embarrassed to say it, so you don’t elevate or overuse the idea.

I think some related points remain to be made about meta-level actions. I’m not yet sure whether they’re related enough for me to work them into these comments about discourse and judgment. In particular, I worry about certain courses of action proposed in the name of leverage, which are vulnerable to many of the above considerations, among others.

I want to avoid framing object-level examples of a meta-level principle as too distracting, since a major theme here is about avoiding the object level as a failure mode of excessive meta. If you don’t have enough illustrative examples, maybe your principle isn’t that good to begin with.