Scientific Explanation

Issues concerning scientific explanation have been a focus of
philosophical attention from Pre-Socratic times through the modern
period. However, recent discussion really begins with the development
of the Deductive-Nomological (DN) model. This model has had
many advocates (including Popper 1935, 1959, Braithwaite 1953,
Gardiner, 1959, Nagel 1961) but unquestionably the most detailed and
influential statement is due to Carl Hempel (Hempel 1942, 1965, and
Hempel & Oppenheim 1948). These papers and the reaction to them
have structured subsequent discussion concerning scientific
explanation to an extraordinary degree. After some general remarks by
way of background and orientation (Section 1), this entry describes
the DN model and its extensions, and then turns to some
well-known objections (Section 2). It next describes a variety of
subsequent attempts to develop alternative models of explanation,
including Wesley Salmon's Statistical Relevance (Section 3)
and Causal Mechanical (Section 4) models and the
Unificationist models due to Michael Friedman and Philip
Kitcher (Section 5). Section 6 provides a summary and discusses
directions for future work.

As will become apparent, “scientific explanation” is a
topic that raises a number of interrelated issues. Some background
orientation will be useful before turning to the details of competing
models. A presupposition of most recent discussion has been that
science sometimes provides explanations (rather than something that
falls short of explanation—e.g., “mere
description”) and that the task of a “theory” or
“model” of scientific explanation is to characterize the
structure of such explanations. It is thus assumed that there is (at
some suitably abstract and general level of description) a single kind
or form of explanation that is “scientific”. In fact, the
notion of “scientific explanation” suggests at least two
contrasts—first, a contrast between those
“explanations” that are characteristic of
“science” and those explanations that are not, and,
second, a contrast between “explanation” and something
else. However, with respect to the first contrast, the tendency in
much of the recent philosophical literature has been to assume that
there is a substantial continuity between the sorts of explanations
found in science and at least some forms of explanation found in more
ordinary non-scientific contexts, with the latter embodying in a more
or less inchoate way features that are present in a more detailed,
precise, rigorous etc. form in the former. It is further assumed that
it is the task of a theory of explanation to capture what is common to
both scientific and at least some more ordinary forms of
explanation. These assumptions help to explain (what may otherwise
strike the reader as curious) why, as this entry will illustrate,
discussions of scientific explanation so often move back and forth
between examples drawn from bona-fide science (e.g., explanations of
the trajectories of the planets that appeal to Newtonian mechanics)
and more homey examples involving the tipping over of inkwells.

With respect to the second contrast, most models of explanation
assume that it is possible for a set of claims to be true, accurate,
supported by evidence, and so on and yet unexplanatory (at least of
anything that the typical explanation-seeker is likely to want
explained). For example, all of the accounts of scientific explanation
described below would agree that an account of the appearance of a
particular species of bird of the sort found in a bird guidebook is,
however accurate, not an explanation of anything of interest to
biologists (e.g., the development, characteristic features, or
behavior of that species). Instead, such an account is “merely
descriptive”. However, different models of explanation provide
different accounts of what the contrast between the explanatory and
merely descriptive consists in.

A related point is that while most theorists of scientific
explanation have proposed models that are intended to cover at least
some cases of explanation that we would not think of as part of
science, they have nonetheless assumed some implicit restriction on
the kinds of explanation they have sought to reconstruct. It has often
been noted that the word “explanation” is used in a wide
variety of ways in ordinary English—we speak of explaining the
meaning of a word, explaining the background to philosophical theories
of explanation, explaining how to bake a pie, explaining why one made
a certain decision (where this is to offer a justification) and so on.
Although the various models discussed below have sometimes been
criticized for their failure to capture all of these forms of
“explanation” (see, e.g., Scriven, 1959), it is clear that
they were never intended to do this. Instead, their intended
explicandum is, very roughly, explanations of why
things happen, where the “things” in question can be
either particular events or something more general—e.g.,
regularities or repeatable patterns in nature. Paradigms of this sort
of explanation include the explanation for the advance in the
perihelion of Mercury provided by General Relativity, the explanation
of the extinction of the dinosaurs in terms of the impact of a large
asteroid at the end of the Cretaceous period, the explanation provided
by the police for why a traffic accident occurred (the driver was
drinking and there was ice on the road), and the standard explanation
provided in economics textbooks for why monopolies will, in comparison
with firms in perfectly competitive markets, raise prices and reduce
output.

Finally, a few words about the broader epistemological/methodological
background to the models described below. Many philosophers think of
concepts like “explanation”, “law”,
“cause”, and “support for counterfactuals” as
part of an interrelated family or circle of concepts that are
“modal” in character. For familiar
“empiricist” reasons, Hempel and many other early
defenders of the DN model regarded these concepts as not well
understood, at least prior to analysis. It was assumed that it would
be “circular” to explain one concept from this family in
terms of others from the same family and that they must instead be
explicated in terms of other concepts from outside the modal
family—concepts that more obviously satisfied (what were taken
to be) empiricist standards of intelligibility and testability. For
example, in Hempel's version of the DN model, the notion of a
“law” plays a key role in explicating the concept of
“explanation”, and his assumption is that laws are just
regularities that meet certain further conditions that are also
acceptable to empiricists. As we shall see, these empiricist standards
(and an accompanying unwillingness to employ modal concepts as
primitives) have continued to play a central role in the models of
explanation developed subsequent to the DN model.

There are many interesting historical questions about the DN
model that remain largely unexplored. Why did “scientific
explanation” emerge when it did as a major topic for
philosophical discussion? Why were the “logical
empiricist” philosophers of science who defended the DN
model so willing to accept the idea that science provides
“explanations”, given the tendency of many earlier
writers in the positivist tradition to think of
“explanation” as a rather subjective or
“metaphysical” matter and to contrast it unfavorably with
“description”, which they regarded as a more legitimate
goal for empirical science? And why was discussion, at least
initially, organized around “explanation” rather than
“causation”, since (as we shall observe) it is often the
latter notion that seems to be of central interest in subsequent
debates and since the former notion seems (to many contemporary
sensibilities) somewhat vague and ill-defined? At least part of the
answer to this last question seems to be that (again as explained in
more detail below) Hempel and other defenders of the DN model
inherited standard empiricist or Humean scruples about the notion of
causation. They assumed that causal notions are only (scientifically
or metaphysically) acceptable to the extent that it is possible to
paraphrase or re-describe them in ways that satisfied empiricist
criteria for meaningfulness and legitimacy. One obvious way of doing
this was to take causal claims to be tantamount to claims about the
obtaining of “regularities” (that is, patterns of uniform
association in nature). It is just this idea that is captured by
the DN model (see below). Part of the initial appeal of the
topic of “scientific explanation” was thus that it
functioned as a more respectable surrogate for (or entry point into)
the problematic topic of causation.[1]
Another motivation was the interest of Hempel and other early
defenders of the DN model in forms of explanation such as
“functional explanation” (thought to be employed in such
special sciences as biology and anthropology) that were not obviously
causal. This also made it natural to frame discussion around a broad
category of explanation rather than narrower notions of
“causation” (cf. Hempel, 1965b).

Suggested Readings: Salmon (1989) is a superb
critical survey of all the models of scientific explanation discussed
in this entry. Pitt (1988) and Ruben (1993) are anthologies that
contain a number of influential articles.

According to the Deductive-Nomological Model, a scientific
explanation consists of two major “constituents”: an
explanandum, a sentence “describing the phenomenon to
be explained” and an explanans, “the class of
those sentences which are adduced to account for the phenomenon”
(Hempel and Oppenheim, 1948, reprinted in Hempel, 1965, p. 247). For
the explanans to successfully explain the explanandum several
conditions must be met. First, “the explanandum must be a
logical consequence of the explanans” and “the sentences
constituting the explanans must be true” (Hempel, 1965,
p. 248). That is, the explanation should take the form of a sound
deductive argument in which the explanandum follows as a conclusion
from the premises in the explanans. This is the
“deductive” component of the model. Second, the explanans
must contain at least one “law of nature” and this must be
an essential premise in the derivation in the sense that the
derivation of the explanandum would not be valid if this premise were
removed. This is the “nomological” component of the
model—“nomological” being a philosophical term of
art which, suppressing some niceties, means (roughly)
“lawful”. In its most general formulation, the
DN model is meant to apply both to the explanation of
“general regularities” or “laws” such as (to
use Hempel and Oppenheim's examples) why light conforms to the law of
refraction and also to the explanation of particular events, conceived
as occurring at a particular time and place, such as the bent
appearance of the partially submerged oars of a rowboat on a
particular occasion of viewing. As an additional illustration of a
DN explanation of a particular event, consider a derivation
of the position of Mars at some future time from Newton's laws of
motion, the Newtonian inverse square law governing gravity, and
information about the mass of the sun, the mass of Mars and the
present position and velocity of each. In this derivation the various
Newtonian laws figure as essential premises and they are used, in
conjunction with appropriate information about initial conditions (the
masses of Mars and the sun and so on), to derive the explanandum (the
future position of Mars) via a deductively valid argument. The
DN criteria are thus satisfied.

The notion of a sound deductive argument is (arguably) relatively
clear (or at least something that can be regarded as antecedently
understood from the point of view of characterizing scientific
explanation). But what about the other major component of the
DN model—that of a law of nature? The basic intuition
that guides the DN model goes something like this: Within the
class of true generalizations, we may distinguish between those that
are only “accidentally true” and those that are
“laws”. To use Hempel's examples, the generalization

(2.2.1)
All members of the Greensbury School Board for 1964 are bald

is, if true, only accidentally so. In contrast,

(2.2.2)
All gases expand when heated under constant pressure

is a law. Thus, according to the DN model, the latter
generalization can be used, in conjunction with information that some
particular sample of gas has been heated under constant pressure, to
explain why it has expanded. By contrast, the former generalization
(2.2.1) in conjunction with the information that a particular person
$n$ is a member of the 1964 Greensbury school board, cannot be used to
explain why $n$ is bald.

While this example may seem clear enough, what exactly is it that
distinguishes true accidental generalizations from laws? This has been
the subject of a great deal of philosophical discussion, most of which
must be beyond the scope of this
entry.[2]
For reasons explained in Section 1, Hempel assumes that an adequate
account must explain the notion of law in terms of notions that lie
outside the modal
family.[3]
In his (1965) he considers a number of familiar
proposals having this
character[4]
and finds them all wanting, remarking that the problem of
characterizing the notion of law has proved “highly
recalcitrant” (1965, p.338). It seems fair to say, however, that
his underlying assumption is that, at bottom, laws are just
exceptionless generalizations describing regularities that meet
certain additional distinguishing conditions that he is not at present
able to formulate.

In subsequent decades, there have been a number of other proposed
criteria for lawhood. Although each proposal has its adherents, none
has won general
acceptance.[5]
What implications does this have for the DN model? One
possible assessment is that all the DN model really requires
is that there be agreement in a substantial range of particular cases
about which generalizations are laws. If such agreement exists, it
matters little for the DN model if we are unable to formulate
completely general criteria that distinguish between laws and
accidentally true generalizations in all possible cases. For example,
even without an adequate account of lawhood, we can surely agree that
(2.2.2) is a law and (2.2.1) is not and this is all we need to
conclude that (2.2.2) can figure in DN explanations while
(2.2.1) cannot.
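
The two structural requirements of the DN model (deductive validity and an essential law premise) can be illustrated with a small propositional sketch. It is only a toy under stated assumptions: the generalizations are replaced by single instantiated conditionals, the function names `entails` and `meets_dn_structure` are hypothetical, the truth of the premises is not checked, and which premises count as laws is supplied by hand rather than computed.

```python
from itertools import product

def entails(premises, conclusion, atoms):
    """True if every truth assignment that satisfies all premises satisfies the conclusion."""
    for values in product([True, False], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if all(p(world) for p in premises) and not conclusion(world):
            return False
    return True

def meets_dn_structure(laws, others, explanandum, atoms):
    """Check validity and that some premise labelled a law is essential.

    Truth of the premises and the lawhood labels are NOT checked; both are
    supplied by hand, which is an assumption of this sketch.
    """
    premises = list(laws) + list(others)
    if not entails(premises, explanandum, atoms):
        return False
    # A law is essential if the derivation becomes invalid once it is removed.
    return any(
        not entails([p for p in premises if p is not law], explanandum, atoms)
        for law in laws
    )

# (2.2.2) instantiated for one sample: if it is heated, it expands.
gas_atoms = ["heated", "expands"]
gas_law = lambda w: (not w["heated"]) or w["expands"]
gas_heated = lambda w: w["heated"]
gas_expands = lambda w: w["expands"]

# (2.2.1) instantiated for one person n: if n is a board member, n is bald.
board_atoms = ["member", "bald"]
board_generalization = lambda w: (not w["member"]) or w["bald"]
is_member = lambda w: w["member"]
is_bald = lambda w: w["bald"]

gas_ok = meets_dn_structure([gas_law], [gas_heated], gas_expands, gas_atoms)
# Labelled as accidental, (2.2.1) supplies no law, so the check fails.
board_ok = meets_dn_structure([], [board_generalization, is_member], is_bald, board_atoms)
# Mislabelled as a law, the same argument passes: structure alone does not decide.
board_mislabelled = meets_dn_structure([board_generalization], [is_member], is_bald, board_atoms)
```

The two arguments have exactly the same logical form, so the different verdicts come entirely from the hand-supplied lawhood labels. This is why agreement about which generalizations are laws carries so much weight for the model.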

Unfortunately, however, matters are not always so
straightforward. One important issue raised by the DN model
concerns the explanatory status of the so-called special
sciences—biology, psychology, economics and so on. These
sciences are full of generalizations that appear to play an
explanatory role and yet fail to satisfy many of the standard criteria
for lawfulness. For example, although Mendel's law of segregation (M)
(which states that in sexually reproducing organisms each of the two
alternative forms (alleles) of a gene specifying a trait at a locus in
a given organism has 0.5 probability of ending up in a gamete) is
widely used in models in evolutionary biology, it has a number of
exceptions, such as meiotic drive. A similar point holds for the
principles of rational choice theory (such as the generalization that
preferences are transitive) which figure centrally in economics.
Other widely used generalizations in the special sciences have very
narrow scope in comparison with paradigmatic laws, hold only over
restricted spatio-temporal regions, and lack explicit theoretical
integration.

There is considerable disagreement over whether such generalizations
are laws. Some philosophers (e.g., Woodward, 2000) suggest that such
generalizations satisfy too few of the standard criteria to count as
laws but can nevertheless figure in explanations; if so, it apparently
follows that we must abandon the DN requirement that all
explanations must appeal to laws. Others (e.g., Mitchell, 1997),
emphasizing different criteria for lawfulness, conclude instead that
generalizations like (M) are laws and hence no threat to the
requirement that explanations must invoke laws. In the absence of a
more principled account of laws, it is hard to evaluate these
competing claims and hence hard to assess the implications of the
DN model for the special sciences. More generally, in the
absence of a generally accepted account of lawhood, the rationale for
the fundamental contrast between laws and non-laws which is at the
heart of what the DN model requires is unclear: it is hard to
assess the claim that all explanations must cite laws, without a clear
account of what a law is and what it contributes to successful
explanation. At the very least, providing such an account is an
important item of unfinished business for advocates of the DN
model.

The DN model is meant to capture explanation via deduction
from deterministic laws and this raises the obvious question of the
explanatory status of statistical laws. Do such laws explain at all
and if so, what do they explain, and under what conditions? In his
(1965) Hempel distinguishes two varieties of statistical
explanation. The first of these, deductive-statistical
(DS) explanation, involves the deduction of “a narrower
statistical uniformity” from a more general set of premises, at
least one of which involves a more general statistical law. Since
DS explanation involves deduction of the explanandum from a
law, it conforms to the same general pattern as the DN
explanation of regularities. However, in addition to DS
explanation, Hempel also recognizes a distinctive sort of statistical
explanation, which he calls inductive-statistical or
IS explanation, involving the subsumption of individual
events (like the recovery of a particular person from streptococcus
infection) under (what he regards as) statistical laws (such as a law
specifying the probability of recovery, given that penicillin has been
taken).

While the explanandum of a DN or DS explanation can
be deduced from the explanans, one cannot deduce that some particular
individual, John Jones, has recovered from the above statistical law
and the information that he has taken penicillin. At most what can be
deduced from this information is that recovery is more or less
probable. In IS explanation, the relation between explanans
and explanandum is, in Hempel's words, “inductive,” rather
than deductive—hence the name inductive-statistical
explanation. The details of Hempel's account are complex, but the
underlying idea is roughly this: an IS explanation will be
good or successful to the extent that its explanans confers high
probability on its explanandum outcome.

Thus if it is a statistical law that the probability of recovery from
streptococcus, given that one has taken penicillin, is high, and Jones
has taken penicillin and recovered, this information can be used to
provide an IS explanation of Jones' recovery. However, if the
probability of recovery is low (e.g., less than 0.5), given that Jones
has taken penicillin, then, even if Jones recovers, we cannot use this
information to provide an IS explanation of his recovery.
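
The shape of such an argument can be displayed schematically. The layout below follows the general style in which IS arguments are usually presented, with a double line marking inductive rather than deductive support. The letters $S$ (has a streptococcus infection), $P$ (takes penicillin), $R$ (recovers), $j$ (Jones) and the value $r$ are labels chosen here for illustration.

$$
\begin{array}{l}
p(R \mid S \wedge P) = r \quad (r \text{ high})\\
S(j) \wedge P(j)\\
\hline\hline
R(j)
\end{array}
\qquad [r]
$$

The bracketed $r$ indicates the degree of support the premises lend to the conclusion. On the account just described, the argument is a good explanation only to the extent that $r$ is high.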

Why suppose that all (or even some) explanations have a DN
or IS structure? There are two ideas which play a central
motivating role in Hempel's (1965) discussion. The first connects the
information provided by a DN argument with a certain
conception of what it is to achieve understanding of why something
happens—it appeals to an idea about the object or point of
giving an explanation. Hempel writes

… a DN explanation answers the question
“Why did the explanandum-phenomenon occur?” by
showing that the phenomenon resulted from certain particular
circumstances, specified in $C_1, C_2, \ldots, C_k$, in
accordance with the laws $L_1, L_2, \ldots, L_r$. By pointing
this out, the argument shows that, given the particular circumstances
and the laws in question, the occurrence of the phenomenon was to
be expected; and it is in this sense that the explanation enables
us to understand why the phenomenon occurred. (1965, p. 337,
italics in original)

One can think of IS explanation as involving a natural
generalization of this idea. While an IS explanation does not
show that the explanandum-phenomenon was to be expected with
certainty, it does the next best thing: it shows that the
explanandum-phenomenon is at least to be expected with high
probability and in this way provides understanding. Stated more
generally, both the DN and IS models share the
common idea that, as Salmon (1989) puts it, “the essence of
scientific explanation can be described as nomic
expectability—that is expectability on the basis of
lawful connections” (1989, p. 57).

The second main motivation for the DN/IS model has to do
with the role of causal claims in scientific explanation. There is
considerable disagreement among philosophers about whether all
explanations in science and in ordinary life are causal and also
disagreement about what the distinction (if any) between causal and
non-causal explanations consists
in.[6]
Nonetheless, virtually everyone, including Hempel, agrees that many
scientific explanations cite information about causes. However,
Hempel, along with most other early advocates of the DN
model, is unwilling to take the notion of causation as primitive in
the theory of explanation—that is, he was unwilling to simply
say that $X$ figures in an explanation of $Y$ if and
only if $X$ causes $Y$. Instead, adherents of the
DN model have generally looked for an account of causation
that satisfies the empiricist requirements described in Section 1. In
particular, advocates of the DN model have generally accepted
a broadly Humean or regularity theory of causation, according to which
(very roughly) all causal claims imply the existence of some
corresponding regularity (a “law”) linking cause to
effect. This is then taken to show that all causal explanations
“imply,” perhaps only “implicitly,” that such
a law/regularity exists and hence that laws are “involved”
in all such explanations, just as the DN model claims.

To illustrate this line of argument, consider

(2.4.1)
The impact of my knee on the desk caused the tipping over of
the inkwell.

(2.4.1) is a so-called singular causal explanation, advanced by
Michael Scriven (1962) as a counterexample to the claim that the
DN model describes necessary conditions for successful
explanation. According to Scriven, (2.4.1) explains the tipping over
of the inkwell even though no law or generalization figures explicitly
in (2.4.1) and (2.4.1) appears to consist of a single sentence, rather
than a deductive argument. Hempel's response (1965, 360ff) is that the
occurrence of “caused” in (2.4.1) should not be left
unanalyzed or taken as explanatory just as it stands. Instead (2.4.1)
should be understood as “implicitly” or
“tacitly” claiming there is a “law” or
regularity linking knee impacts to tipping over of inkwells. According
to Hempel, it is the implicit claim that some such law holds that
“distinguishes” (2.4.1) from “a mere sequential
narrative” in which the spilling is said to follow the impact
but without any claim of causal connection—a narrative that
(Hempel thinks) would clearly not be explanatory. This linking law is
the nomological premise in the DN argument that, according to
Hempel, is “implicitly” asserted by (2.4.1).

There are two related but distinct ways of understanding this
argument, both of which are suggested by portions of Hempel's
discussion. According to the first, Hempel's claim is that the real
underlying structure of (2.4.1) is something like:

(2.4.2)
$(L)$
Whenever knees impact tables on which an
inkwell sits and further conditions $K$ are met (where
$K$ specifies that the impact is sufficiently forceful, etc.),
the inkwell will tip over. (Reference to $K$ is necessary since
the impact of knees on tables with inkwells does not always result in
tipping.)

$(I)$
My knee impacted a table on which an inkwell sits and
further conditions $K$ are met.

$(E)$
The inkwell tips over.

Hence, to the extent that it is explanatory, (2.4.1)
“implicitly” satisfies the DN/IS requirements after
all—it is a DN/IS argument (namely 2.4.2) in
disguise.

There is a second interpretation of Hempel's argument that, unlike
the first interpretation, does not require that we think of the full
content of (2.4.2) as somehow already implicit in (2.4.1). Instead,
(2.4.2) plays the role of an ideal against which (2.4.1)
should be measured. (2.4.2) spells out what information a complete,
fully adequate explanation for $E$ would need to
contain—information that is present in (2.4.1) only in a partial
or incomplete way. On this view of the matter, we think of (2.4.1) as
an
explanation-sketch (cf. Hempel, 1965b, 423ff) which conveys
some of the information conveyed by (2.4.2) or points in the
direction of the more complete explanation (2.4.2). Ideally, singular
causal explanations like (2.4.1) should be replaced by explicit
DN explanations like (2.4.2).

On either interpretation, however, the basic idea is that a proper
explication of the role of causal claims in explanation leads via a
Humean or regularity theory of causation, to the conclusion that, at
least ideally, explanations should satisfy the DN/IS model.
Let us call this line of argument the “hidden structure”
argument in recognition of the role it assigns to a hidden (or at
least non-explicit) DN structure that is claimed to be
associated with (2.4.1).

This strategy will be examined in section 2.6, but let me first
comment on a feature of the discussion so far that may seem puzzling.
The boundaries of the category “scientific explanation”
are far from clear, but while (2.4.1) is arguably an explanation, it
is not what one usually thinks of as
“science”—instead it is a claim from “ordinary
life” or “common sense”. This raises the question of
why adherents of the DN/IS model don't simply respond to the
alleged counterexample (2.4.1) by denying that it is an instance of
the category “scientific explanation”—that is, by
claiming that the DN/IS model is not an attempt to
reconstruct the structure of explanations like (2.4.1) but is rather
only meant to apply to explanations that are properly regarded as
“scientific”. The fact that this response is not often
adopted by advocates of the DN model is an indication of the
extent to which, as noted in section 1, it is implicitly assumed in
most discussions of scientific explanation that there are important
similarities or continuities in structure between explanations like
(2.4.1) and explanations that are more obviously scientific and that
these similarities should be captured by some common account that
applies to both. Indeed, it is a striking feature not just of Hempel
(1965) but of many other treatments of scientific explanation that
much of the discussion in fact focuses on “ordinary life”
singular causal explanations similar to (2.4.1), the tacit assumption
being that conclusions about the structure of such explanations have
fairly direct implications for understanding explanation in
science.

As explained above, examples like (2.4.1) are potential
counterexamples to the claim that the DN model provides
necessary conditions for explanation. There are also a number
of well-known counterexamples to the claim that the DN model
provides sufficient conditions for successful scientific
explanation. Here are two illustrations.

Explanatory Asymmetries. There are many cases in
which a derivation of an explanandum $E$ from a law $L$
and initial conditions $I$ seems explanatory but a
“backward” derivation of $I$ from $E$ and
the same law $L$ does not seem explanatory, even though the
latter, like the former, appears to meet the criteria for successful
DN explanation. For example, one can derive the length
$s$ of the shadow cast by a flagpole from the height $h$
of the pole and the angle $\theta$ of the sun above the horizon and
laws about the rectilinear propagation of light. This derivation meets
the DN criteria and seems explanatory. On the other hand, a
derivation (2.5.1) of $h$ from $s$ and $\theta$ and the
same laws also meets the DN criteria but does not seem
explanatory. Examples like this suggest that at least some
explanations possess directional or asymmetric features to which the
DN model is insensitive.
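
The symmetry can be made explicit with a worked equation. Assuming level ground and a vertical pole (idealizations not stated in the example itself), the three quantities are related by

$$
\tan\theta = \frac{h}{s},
\qquad\text{so}\qquad
s = \frac{h}{\tan\theta}
\quad\text{and}\quad
h = s\,\tan\theta .
$$

For instance, with $h = 10$ m and $\theta = 45^\circ$, the first form gives $s = 10$ m; with $s = 10$ m and $\theta = 45^\circ$, the second gives $h = 10$ m. Both derivations use the same relation and are equally valid as deductions, yet only the first seems to explain anything.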

Explanatory Irrelevancies. A derivation can satisfy
the DN criteria and yet be a defective explanation because it
contains irrelevancies besides those associated with the directional
features of explanation. Consider an example due to Wesley Salmon
(Salmon, 1971, p.34):

(2.5.2)
$(L)$
All males who take birth control pills regularly fail to get pregnant

$(K)$
John Jones is a male who has been taking birth control pills regularly

$(E)$
John Jones fails to get pregnant

It is arguable that $(L)$ meets the criteria for lawfulness
imposed by Hempel and many other writers. (If one wants to deny that
$L$ is a law one needs some principled, generally accepted
basis for this judgment and, as explained above, it is unclear what
this basis is.) Moreover, (2.5.2) is certainly a sound deductive
argument in which $L$ occurs as an essential
premise. Nonetheless, most people judge that $(L)$ and
$(K)$ are no explanation of $E$. There are many other
similar illustrations. For example (Kyburg 1965), it is presumably a
law (or at least an exceptionless, counterfactual supporting
generalization) that all samples of table salt that have been hexed by
being touched with the wand of a witch dissolve when placed in
water. One may use this generalization as a premise in a DN
derivation which has as its conclusion that some particular hexed
sample of salt has dissolved in water. But again the hexing is
irrelevant to the dissolving and such a derivation is no
explanation.
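
One way to make “irrelevant” concrete is to compare how often the outcome occurs with and without the cited factor. The sketch below does this for the birth control case. The records are hypothetical toy data invented for illustration, the helper name `freq` is an assumption of this sketch, and comparing frequencies is offered only as one possible gloss on irrelevance, not as the analysis any particular model adopts.

```python
from fractions import Fraction

def freq(population, outcome, given):
    """Relative frequency of `outcome` among records satisfying `given`."""
    selected = [r for r in population if given(r)]
    if not selected:
        raise ValueError("no records satisfy the condition")
    return Fraction(sum(1 for r in selected if outcome(r)), len(selected))

# Hypothetical toy records: (sex, takes_pill, pregnant)
population = [
    ("male", True, False),
    ("male", True, False),
    ("male", False, False),
    ("male", False, False),
    ("female", True, False),
    ("female", False, True),
    ("female", False, False),
]

not_pregnant = lambda r: not r[2]

# Frequency of failing to get pregnant among males, with and without the pill.
with_pill = freq(population, not_pregnant, lambda r: r[0] == "male" and r[1])
without_pill = freq(population, not_pregnant, lambda r: r[0] == "male" and not r[1])

# The pill makes a difference for males only if the two frequencies differ.
pill_is_relevant_for_males = with_pill != without_pill
```

In this toy population both frequencies equal 1, so taking the pill makes no difference to the outcome for males, even though $(L)$ is true of every one of them.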

One obvious diagnosis of the difficulties posed by examples like
(2.5.1) and (2.5.2) focuses on the role of causation in explanation.
According to this analysis, to explain an outcome we must cite its
causes and (2.5.1) and (2.5.2) fail to do this. As Salmon (1989, p.47)
puts it, “a flagpole of a certain height causes a shadow of a
given length and thereby explains the length of the shadow”. By
contrast, “the shadow does not cause the flagpole and
consequently cannot explain its height”. Similarly, taking
birth control pills does not cause Jones' failure to get pregnant and
this is why (2.5.2) fails to be an acceptable explanation. On this
analysis, what (2.5.1) and (2.5.2) show is that a derivation can
satisfy the DN criteria and yet fail to identify the causes
of an explanandum—when this happens the derivation will fail
to be explanatory.

As explained above, advocates of the DN model would not
regard this diagnosis as very illuminating, unless accompanied by some
account of causation that does not simply take this notion as
primitive. (Salmon in fact provides such an account, which we will
consider in Section 4.) We should note, however, that an apparent
lesson of (2.5.1) and (2.5.2) is that the regularity account of
causation favored by DN theorists is at best incomplete: the
occurrence of $c$, $e$ and the existence of some
regularity or law linking them (or $x$'s having property
$P$ and $x$'s having property $Q$ and some law
linking these) is not a sufficient condition for the truth of
the claim that $c$ caused $e$ or $x$'s having
$P$ is causally or explanatorily relevant to $x$'s
having $Q$. More generally, if the counterexamples (2.5.1) and
(2.5.2) are accepted, it follows that the DN model fails to
state sufficient conditions for explanation. Explaining an outcome
isn't just a matter of showing that it is nomically
expectable.

There are two possible reactions one might have to this observation.
One is that the idea that explanation is a matter of nomic
expectability is correct as far as it goes, but that something more is
required as well. According to this assessment, the DN/IS
model does state a necessary condition for successful
explanation and, moreover, a condition that is a non-redundant part of
a set of conditions that are jointly sufficient for explanation.
However, some other, independent feature, $X$ (which will
account for the directional features of explanation and insure the
kind of explanatory relevance that is apparently missing in the birth
control example) must be added to the DN model to achieve a
successful account of explanation. The idea is thus that Nomic
Expectability $+\ X = $ Explanation. Something like this idea is
endorsed by the unificationist models of explanation developed by
Friedman (1974) and Kitcher (1989), which are discussed in Section 5
below.

A second, more radical possible conclusion is that the DN
account of the goal or rationale of explanation is mistaken in some
much more fundamental way and that the DN model does not even
state necessary conditions for successful explanation. As noted above,
unless the hidden structure argument is accepted, this conclusion is
strongly suggested by examples like (2.4.1) (“The impact of my
knee caused the tipping over of the inkwell”) which appear to
involve explanation without the explicit citing of a law or a
deductive structure. To assess whether the DN/IS model
provides necessary conditions for explanation, we thus must consider
the hidden structure strategy in more detail.

It might seem that the contention of the hidden structure strategy
that singular causal explanations like (2.4.1) are implicit
DN/IS explanations or sketches of such explanations
is at best relevant to the question of whether the DN/IS model
provides an adequate reconstruction of this particular sort of
explanation. In fact, however, Hempel's strategy of treating
explanations as devices for conveying information, but in a
“partial” or “incomplete” way, about
underlying “ideal” explanations of a prima-facie quite
different form that are at least partly epistemically hidden from
those who use the original, non-ideal explanation has continued to be
very popular in recent theorizing about scientific explanation. This
strategy forms the basis, for example, for Peter Railton's (1978,
1981) contrast between an “ideal explanatory text” which
contains all of the causal and nomological information relevant to
some outcome of interest and the “non-ideal” explanations
like (2.4.1) that we actually give. According to Railton, the latter
provide “explanatory information” in virtue of conveying
information about some limited portion or aspect of the ideal text and
are explanatory in virtue of doing so. The hidden structure strategy
also plays an important role in the unificationist account of
explanation developed by Philip Kitcher (1989) who likewise insists we
must “distinguish between what is said on an occasion in which
explanatory information is given and the ideal underlying
explanation” (Kitcher, 1989, p. 414). Indeed, any account of
explanation that, like Kitcher's unificationist model, insists that
laws (or generalizations of considerable generality) and deductive
structure are necessary conditions for successful explanation will
need to appeal to something like the hidden structure strategy since it is
generally accepted that there are many apparent explanations that do
not conform to such conditions in their overt structure.

Although the hidden structure strategy deserves more attention than
it can receive here, several points seem clear. First, the notion of
one explanation “conveying information about” another
“underlying” explanation requires considerable spelling
out. Depending on what “underlying” is understood to mean,
it is arguable that there are many explanations underlying
(2.4.1)—(i) the explanation (2.4.2), assuming that condition
$K$ can be specified in a non-trivial way, (ii) an explanation
at the level of classical physics that makes reference to laws
governing inelastic collisions, the behavior of liquids when not
confined to containers, and so on, and (iii) an explanation in which
the behavior of the whole system is characterized in terms of some
more fundamental physical theory (quantum mechanics, superstring
theory etc.). Are all of these explanations implicit in
(2.4.1) or does (2.4.1) convey partial information about all
of them? In what sense of “implicit” or “conveys
information about” could this possibly be true?

Railton (1981) suggests that an explanatory claim provides
information about an underlying ideal text if the former reduces
uncertainty about some of the properties of the text, in the sense of
ruling in or out various possibilities concerning its structure. As
Railton recognizes, this proposal has many counterintuitive
consequences. To use Railton's own example, “the relevant ideal
text contains more than $10^2$ words in English”, if
true, counts as an explanation for an episode of radioactive
decay (1981, p. 246). Similarly, the claim that $X$ and
$Y$ are correlated will count as a partial
explanation of $X$ and $Y$ on the plausible assumption
that this claim conveys the information that one of three
possibilities is likely to be true—either $X$ causes
$Y$ or $Y$ causes $X$ or they have a common
cause—and thus reduces uncertainty about the contents of the
ideal underlying text. This contrasts with the widespread judgment
that correlations in themselves are not explanatory. Indeed, on a view
like Railton's, even the claim that some outcome has no causes or is
governed by no laws counts as an “explanation” of that
outcome, supposing that claim is true. In fact, such a claim is
apparently maximally explanatory, since it conveys everything
that there is to be said about the ideal explanatory text associated
with that event. Examples like these suggest that not every claim that
reduces uncertainty about the contents of an ideal explanatory text
should be regarded as itself explanatory—such a view allows too
much to count as an explanation.

Is it plausible to regard the text that contains all of the full
details of causal and nomological information relevant to some outcome
as at least an “ideal” against which various candidate
explanations of that outcome are to be judged? Suppose we are
presented with an explanation from economics or psychology that does
not appeal to any generalization that we are prepared to count as a
law but that underlying this “non-ideal” explanation is
some incredibly complex set of facts described in terms of classical
mechanics and electromagnetism, along with the relevant laws of these
theories. If, as almost certainly will be the case, this underlying
“explanation” is computationally intractable, and full of
irrelevant detail (see section 4 below for more on what this might
mean), one might wonder in what sense it is an ideal against which the
original explanation should be measured. Will the economics
explanation really be better according to whether it conveys as
much information as possible about these underlying details?

Finally, consider the connection between explanation and
understanding. One ordinarily thinks of an explanation as something
that provides understanding. Relatedly, part of the task of a theory
of explanation is to identify those structural features of
explanations (or the information they convey) in virtue of which they
provide understanding. For example, as noted above, the DN
model connects understanding with the provision of information about
nomic expectability—the idea is that understanding why an
outcome occurs is a matter of seeing that it was to be expected on the
basis of a law. The problem this raises for the hidden structure
strategy is that the information associated with the hidden structure
alleged to underlie “non-ideal” explanations like (2.4.1)
is typically unknown or epistemically inaccessible to those who use
the explanation. It is hard to see how this structure or information
can contribute to understanding if it is epistemically hidden in this
way. For example, it seems plausible that many (if not almost all)
users of (2.4.1) (both those who might offer it as an explanation and
those recipients who take it to provide understanding) are unaware of
the DN structure that underlies it—indeed it is plausible
that many users lack the notion of a law of nature and of a
deductively valid argument and hence any notion that there is
any (unknown) DN argument underlying (2.4.1). If
this is the case, how can the mere obtaining of this DN
structure, independently of anyone's awareness of its existence,
function so as to provide understanding when (2.4.1) is used? Instead,
it seems that the features of (2.4.1) that endow it with explanatory
import—that make it an explanation—must be features
that can be known or grasped or recognized by those who use the
explanation. A similar point will hold for many other candidate
explanations that fail to conform to the DN requirements such
as explanations from sciences like economics and psychology that seem
to lack laws.

What can we conclude from this discussion of the hidden structure
strategy? If the strategy fails, there will be a large number of
apparent explanations that fail to satisfy the necessary conditions
for explanation imposed by the DN/IS model. On the other
hand, it is possible that there are ways of developing the hidden
structure strategy that respond adequately to the difficulties
described above. If so, the idea that the DN/IS
requirements are at least necessary conditions for ideal explanation
may be defensible after all, although the counterexamples to the
sufficiency of the model noted above will remain.

Suggested Readings. The most authoritative and
comprehensive statement of the DN and IS models is
probably Hempel 1965b. This is reprinted in Hempel, 1965a, along with
a number of other papers that touch on various aspects of the problem
of scientific explanation. In addition to the references cited in this
section, Salmon, 1989, pp. 46ff describes a number of well-known
counterexamples to the DN/IS models and discusses their
significance.

Much of the subsequent literature on explanation has been motivated
by attempts to capture the features of causal or explanatory relevance
that appear to be left out of examples like (2.5.1) and (2.5.2),
typically within the empiricist constraints described above. Wesley
Salmon's statistical relevance (or SR) model (Salmon, 1971)
is a very influential attempt to capture these features in terms of
the notion of statistical relevance or conditional dependence
relationships. Given some class or population $A$, an attribute
$C$ will be statistically relevant to another
attribute $B$ if and only if
$P(B \mid A.C) \ne P(B \mid A)$—that is, if and only if the probability of
$B$ conditional on $A$ and $C$ is different from
the probability of $B$ conditional on $A$ alone. The
intuition underlying the SR model is that statistically
relevant properties (or information about statistically relevant
relationships) are explanatory and statistically irrelevant properties
are not. In other words, the notion of a property making a difference
for an explanandum is unpacked in terms of statistical relevance
relationships.

To illustrate this idea, suppose that in the birth control pills
example (2.5.2) the original population $T$ includes both
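genders. Then

$P(\text{Pregnancy} \mid T.\text{Male}.\text{Takes birth control pills}) = P(\text{Pregnancy} \mid T.\text{Male})$

while

$P(\text{Pregnancy} \mid T.\text{Female}.\text{Takes birth control pills}) \ne P(\text{Pregnancy} \mid T.\text{Female})$,

assuming that not all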
women in the population take birth control pills. In other words, if
you are a male in this population, taking birth control pills is
statistically irrelevant to whether you become pregnant, while if you
are a female it is relevant. In this way we can capture the idea that
taking birth control pills is explanatorily irrelevant to pregnancy
among males but not among females.
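
The contrast can be made concrete with a small computation. The following Python sketch uses invented counts (the figures in `population` and the helper names `prob_pregnant` and `pills_relevant` are assumptions introduced only for illustration, not data from Salmon) and checks whether $P(B \mid A.C) \ne P(B \mid A)$ separately for males and for females.

```python
from fractions import Fraction

# Hypothetical counts, invented purely for illustration:
# (sex, takes_pills) -> (number of people, number who become pregnant)
population = {
    ("male", True): (50, 0),
    ("male", False): (450, 0),
    ("female", True): (200, 2),
    ("female", False): (300, 60),
}


def prob_pregnant(sex, takes_pills=None):
    """P(pregnant | sex) or, if takes_pills is given, P(pregnant | sex.takes_pills)."""
    cells = [
        counts
        for (s, pills), counts in population.items()
        if s == sex and (takes_pills is None or pills == takes_pills)
    ]
    total = sum(n for n, _ in cells)
    pregnant = sum(k for _, k in cells)
    return Fraction(pregnant, total)


def pills_relevant(sex):
    """True iff P(B | A.C) != P(B | A), with A = sex, C = takes pills, B = pregnant."""
    return prob_pregnant(sex, True) != prob_pregnant(sex)


print(pills_relevant("male"))    # False
print(pills_relevant("female"))  # True
```

If every woman in such a table took the pill, the two conditional probabilities for females would coincide, which is why the proviso that not all women take birth control pills is needed.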

To characterize the SR model more precisely we need the
notion of a homogeneous partition. A homogeneous partition of
$A$ is a set of subclasses or cells $C_i$ of
$A$ that are mutually exclusive and exhaustive, where
$P(B \mid A.C_i) \ne P(B \mid A.C_j)$
for all $C_i \ne C_j$ and
where no further statistically relevant partition of any of the cells
$A.C_i$ can be made with respect to $B$; that
is, there are no additional attributes $D_k$ in
$A$ such that $P(B \mid A.C_i) \ne
P(B \mid A.C_i.D_k)$.
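
As a rough illustration of this definition, the sketch below checks the two conditions against a finite table of counts. The data, the attribute names `C` and `D`, and the function names are hypothetical. The check simply assumes that the supplied cells are mutually exclusive and exhaustive, and it examines only the candidate attributes it is handed, whereas the definition quantifies over all attributes.

```python
from fractions import Fraction
from itertools import combinations

# Each row: (attribute assignment, number of individuals, number with outcome B).
# Hypothetical data for a reference class A described by attributes C and D.
rows = [
    ({"C": 1, "D": 0}, 100, 80),
    ({"C": 1, "D": 1}, 100, 80),
    ({"C": 0, "D": 0}, 100, 20),
    ({"C": 0, "D": 1}, 100, 20),
]


def prob(rows, condition):
    """P(B | A.condition), where condition is a dict of attribute values."""
    matching = [
        (n, k) for attrs, n, k in rows
        if all(attrs[name] == value for name, value in condition.items())
    ]
    total = sum(n for n, _ in matching)
    if total == 0:
        return None  # conditional probability undefined on an empty class
    return Fraction(sum(k for _, k in matching), total)


def is_homogeneous_partition(rows, cells, other_attributes):
    """Check the two conditions from the text for a list of cells of A.

    cells: list of condition dicts, assumed mutually exclusive and exhaustive.
    other_attributes: dict name -> possible values; the candidate D_k, which
    must be attributes not already used to define the cells.
    """
    cell_probs = [prob(rows, cell) for cell in cells]
    # (1) Distinct cells must assign different probabilities to B.
    if any(p == q for p, q in combinations(cell_probs, 2)):
        return False
    # (2) No further attribute may change the probability of B within a cell.
    for cell, p in zip(cells, cell_probs):
        for name, values in other_attributes.items():
            for value in values:
                refined = prob(rows, {**cell, name: value})
                if refined is not None and refined != p:
                    return False
    return True


print(is_homogeneous_partition(rows, [{"C": 1}, {"C": 0}], {"D": [0, 1]}))  # True
print(is_homogeneous_partition(rows, [{"D": 1}, {"D": 0}], {"C": [0, 1]}))  # False
```

In this invented table, partitioning by `C` is homogeneous because `D` makes no further difference within either cell, while partitioning by `D` fails at the first condition because both cells assign the same probability to $B$.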

On the SR model, an explanation of why some member
$x$ of the class characterized by attribute $A$ has
attribute $B$ consists of the following information:

(i) The prior probability of $B$ within $A$: $P(B \mid A) = p$.

(ii) A homogeneous partition of $A$ with respect to $B$, $(A.C_1,
\ldots, A.C_n)$, together with the probability of $B$ within each cell
of the partition: $P(B \mid A.C_i) = p_i$; and

(iii) The cell of the partition to which $x$ belongs.

To employ one of Salmon's examples, suppose we want to construct an
SR explanation of why $x$, who has a strep infection ($S$),
recovers quickly ($Q$). Let $T$ ($-T$) indicate that $x$ is (is not)
treated with penicillin, and $R$ ($-R$) indicate that the subject has
(does not have) a penicillin-resistant strain. Assume for the sake of
argument that no
other factors are relevant to quick recovery. There are four possible
combinations of these properties: $T.R$, $-T.R$, $T.-R$, $-T.-R$, but
let us assume that

$P(Q \mid S.T.R) = P(Q \mid S.-T.R) = P(Q \mid S.-T.-R) \ne P(Q \mid S.T.-R)$.

That is, the
probability of quick recovery, given that one has strep, is the same
for those who have the resistant strain regardless of whether or not
they are treated and also the same for those who have not been
treated. By contrast, the probability of recovery is different
(presumably greater) among those with strep who have been treated and
do not have the resistant strain.

In this case $[S.(T.R \vee -T.R \vee -T.-R)]$,
$[S.T.-R]$ is a homogeneous partition
of $S$ with respect to $Q$. The
SR explanation of $x$'s recovery will consist of a
statement of the probability of quick recovery among all those with
strep (this is (i) above), a statement of the probability of recovery
in each of the two cells of the above partition ((ii) above), and the
cell to which $x$ belongs, which is $S.T.-R$ ((iii)
above). Intuitively, the idea is that this information tells us about
the relevance of each of the possible combinations of the properties
$T$ and $R$ to quick recovery among those with strep and
is explanatory for just this reason.
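
The three components of such an explanation can be assembled mechanically once the relevant frequencies are given. The Python sketch below does this for the strep example, under the assumption that the only differing probability is that for treated patients with a non-resistant strain. All counts are invented, and the names `counts`, `partition`, `recovery_prob` and `sr_explanation` are hypothetical labels rather than anything found in Salmon.

```python
from fractions import Fraction

# Hypothetical counts among strep patients (S):
# (treated, resistant) -> (number of patients, number who recover quickly)
counts = {
    (True, True): (100, 10),    # T.R
    (False, True): (100, 10),   # -T.R
    (False, False): (100, 10),  # -T.-R
    (True, False): (100, 90),   # T.-R
}

# The two cells of the partition described in the text.
partition = {
    "S.(T.R v -T.R v -T.-R)": [(True, True), (False, True), (False, False)],
    "S.T.-R": [(True, False)],
}


def recovery_prob(combinations):
    """P(Q | S.cell) for a cell given as a list of (treated, resistant) pairs."""
    total = sum(counts[c][0] for c in combinations)
    recovered = sum(counts[c][1] for c in combinations)
    return Fraction(recovered, total)


def sr_explanation(treated, resistant):
    """Assemble items (i)-(iii) for a patient with the given properties."""
    prior = recovery_prob(list(counts))                      # (i)
    cell_probs = {name: recovery_prob(combos)                # (ii)
                  for name, combos in partition.items()}
    cell = next(name for name, combos in partition.items()   # (iii)
                if (treated, resistant) in combos)
    return {"prior": prior, "cell_probabilities": cell_probs, "cell": cell}


explanation = sr_explanation(treated=True, resistant=False)
print(explanation["prior"])  # 3/10
print(explanation["cell"])   # S.T.-R
```

Nothing in this assembly requires the probability in the patient's own cell to be high. The same three items would be reported for a patient in the other cell, where the invented probability of quick recovery is only 1/10.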

The SR model has a number of distinctive features that have
generated substantial discussion. First, note that according to the
SR model, and in contrast to the DN/IS model, an
explanation is not an argument—either in the sense of a
deductively valid argument in which the explanandum follows as a
conclusion from the explanans or in the sense of an inductive argument
in which the explanandum follows with high probability from the
explanans, as in the case of IS explanation. Instead, an
explanation is an assembly of information that is statistically
relevant to an explanandum. Salmon argues (and takes the birth control
example (2.5.2) to illustrate) that the criteria that a good argument
must satisfy (e.g., criteria that insure deductive soundness or some
inductive analogue) are simply different from those a good explanation
must satisfy. Among other things, as Salmon puts it,
“irrelevancies [are] harmless in arguments but fatal in
explanations” (1989, p. 102). As explained above, in associating
successful explanation with the provision of information about
statistical relevance relationships, the SR model attempts to
accommodate this observation.

A second, closely related point is that the SR model departs
from the IS model in abandoning the idea that a statistical
explanation of an outcome must provide information from which it
follows that the outcome occurred with high probability. As the reader may
check, the statement of the SR model above imposes no such
high probability requirement; instead, even very unlikely outcomes
will be explained as long as the criteria for SR explanation
are met. Suppose that, in the above example, the probability of quick
recovery from strep, given treatment and the presence of a
non-resistant strain, is rather low (e.g., 0.2). Nonetheless, if the
criteria (i)–(iii) above—a homogeneous partition with
correct probability values for each cell in the partition—are
satisfied, we may use this information to explain why $x$, who
had a non-resistant strain of strep and was treated, recovered
quickly. Indeed, according to the SR model, we may explain
why some $x$ which is $A$ is $B$, even if the
conditional probability of $B$ given $A$ and the cell
$C_i$ to which $x$ belongs
$(p_i = P(B \mid A.C_i))$ is
less than the prior probability $(p =
P(B \mid A))$ of $B$ in $A$. For example, if the
prior probability of quick recovery among all those with any form of
strep is 0.5 and the probability of quick recovery of those with a
resistant strain who are untreated is 0.1, we may nonetheless explain
why $y$, who meets these last conditions $(-T.R)$,
recovered quickly (assuming he did) by citing the cell to which he
belongs (the fact that he had the resistant strain and was
untreated), the probability of recovery given that he falls in this
cell, and the other sort of information described above. More
generally, what matters on the SR model is not whether the
value of the probability of the explanandum-outcome is high or low (or
even high or low in comparison with its prior probability) but rather
whether the putative explanans cites all and only statistically
relevant factors and whether the probabilities it invokes are
correct. One consequence of this, which Salmon endorses while
acknowledging that many will regard it as unintuitive, is that on the
SR model, the same explanans $E$ may explain both an
explanandum $M$ and explananda that are inconsistent with
$M$, such as $-M$. For example, the same explanans
will explain both why a subject with strep and certain other
properties (e.g., $T$ and $-R$) recovers quickly, if he
does, and also why he does not recover if he does not. By contrast, on
the DN or IS models, if $E$ explains $M$, $E$ cannot also
explain $-M$.

The intuition that, contrary to the IS model, the value that
a candidate explanans assigns to an explanandum-outcome should not
matter for the goodness of the explanation it provides can be
motivated in the following way. Consider a genuinely indeterministic
coin which is biased strongly $(p = 0.9)$ toward heads when
tossed. Suppose that if it is not tossed the coin has probability of
0.5 of being in either the heads or tails position and that whether or
not the coin is tossed is the only factor that is statistically
relevant to whether it is heads or tails. According to the IS
model, if the coin is tossed and comes up heads, we can explain this
outcome by appealing to the fact that the coin was tossed (since under
this condition the probability of heads is high) but if the coin is
tossed and comes up tails we cannot explain this outcome, since its
probability is low. The contrary intuition underlying the SR
model is that we understand both outcomes equally well. The bias of
the coin and the fact that the coin has been tossed are the only
factors relevant to either outcome and those factors are common to
both outcomes—once we have cited the toss (and specified the
probability values for heads and tails on tossing), we have left nothing
out that influences either outcome. Similarly, Salmon argues, if it is
really true that the partition in the example involving quick recovery
from strep is objectively homogeneous (if there are no other
factors that are statistically relevant to quick recovery besides
whether the subject has been treated and has a resistant
strain), then once we have specified the probability of quick
recovery under all combinations of these factors, and the combination
of factors possessed by the subject whose recovery (or not, as the
case may be) we want to explain, we have specified all information
relevant to recovery and in this sense fully explained the outcome for
the subject.[7]

In assessing these claims, it will be useful to take a step back and
ask just what it is that these competing models of statistical
explanation (Hempel's IS model and Salmon's SR
model) are intended to be reconstructions of. In the literature on
this topic two classes of examples or applications figure
prominently. First, there are examples drawn from quantum mechanics
(QM). Suppose, for example, a particle has a probability
$p$ that is strictly between 0 and 1 of penetrating a potential
barrier. Models of statistical explanation assume that if the particle
does penetrate the barrier, QM explains this
outcome—the IS and SR models are intended to
capture the structure of such explanations. Second, there are examples
drawn from biomedical (or epidemiological) and social scientific
applications—recovery from strep or, to cite one of Salmon's
extended illustrations (Salmon, 1971), the factors relevant to
juvenile delinquency in teen-age boys.

This is, to say the least, a heterogeneous class of examples. In the
case of QM, the usual understanding is that the various
no-hidden variable results establish that any empirically adequate
theory of quantum mechanical phenomena must be irreducibly
indeterministic. It is thus plausible that when we use the Schrödinger
equation to derive the probability that a particle with a certain
kinetic energy will tunnel through a potential barrier of a certain
shape, this representation satisfies the SR model's
“objective homogeneity” condition—there are no
additional omitted variables that would affect the probability of
barrier penetration. By contrast, it seems quite unlikely that this
homogeneity condition will be satisfied in most (indeed, in any) of
the biomedical and sociological illustrations that have figured in the
literature on statistical explanation. In the case of recovery from
strep, for example, it is very plausible that there are many other
factors besides the two mentioned above that affect the probability of
recovery—these additional factors will include the state of
the subject's immune system, various features of the subject's general
level of health, the precise character of the strain of disease to
which the subject is exposed (resistant versus non-resistant is almost
certainly too coarse-grained a dichotomy) and so on. Similarly for
episodes of juvenile delinquency. In these cases, in contrast to the
cases from quantum mechanics, we lack a theory or body of results that
delimits the factors that are potentially relevant to the probability
of the outcome that interests us. Thus, in realistic examples of
assemblages of statistically relevant factors from biomedicine and
social science, the objective homogeneity condition is unlikely to be
satisfied, or in any practical sense, satisfiable.

A related difference concerns the way in which statistical evidence
figures in these two sorts of applications. Some quantum mechanical
phenomena such as radioactive decay are irreducibly
indeterministic. By contrast, in the biomedical and social scientific
applications, while the relevant evidence is
“statistical”, there is typically no corresponding
assumption that the phenomena of interest are irreducibly
indeterministic. This is particularly clear in connection with the social
scientific examples (such as risk factors for juvenile delinquency)
that Salmon discusses. Here the relevant methodology involves
so-called causal modeling or structural equation techniques. At least
on the most straightforward way of applying such procedures, the
equations that govern whether a particular individual becomes a
juvenile delinquent are (if interpreted literally) deterministic.
According to such approaches, the phenomena being modeled
look as though they are indeterministic because some of the
variables which are relevant to their behavior, the influence of which
is summarized by a so-called error term, are unknown or
unmeasured. Statistical information about the incidence of juvenile
delinquency among individuals in various conditions plays the role of
evidence that is used to estimate parameters (the
coefficients) in the deterministic equations that are taken to
describe the processes governing the onset of delinquency. A similar
point holds for at least many biomedical
examples.[8]
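
A toy version of this point may help. The sketch below is not drawn from Salmon or from any actual causal model: the structural equation, the variable names and the assumption that the unmeasured term is uniformly distributed over ten values are all invented. It shows how an outcome that is fully determined by a measured factor `x` and an unmeasured term `u` nonetheless yields frequencies strictly between 0 and 1 when only `x` is observed.

```python
from fractions import Fraction


def outcome(x, u):
    """Hypothetical deterministic structural equation: Y = 1 iff 3x + u >= 9."""
    return 1 if 3 * x + u >= 9 else 0


# Unmeasured "error term"; assumed uniform over 0..9 purely for illustration.
U_VALUES = range(10)


def observed_frequency(x):
    """Relative frequency of Y = 1 given X = x when U is not observed."""
    return Fraction(sum(outcome(x, u) for u in U_VALUES), len(U_VALUES))


for x in (0, 1, 2):
    print(x, observed_frequency(x))
# 0 1/10
# 1 2/5
# 2 7/10
```

Given both `x` and `u` the outcome is fixed. The intermediate frequencies arise only from averaging over the unobserved term, which is the sense in which the statistics here serve as evidence about a deterministic relationship.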

Several preliminary conclusions are suggested by these observations.
First, it is far from obvious that we should try to construct a
single, unified model of statistical explanation that applies to both
quantum mechanics and macroscopic phenomena like delinquency or
recovery from infection. Second, and relatedly, while explanation in
QM satisfies the objective homogeneity condition, it is
dubious that the sorts of “statistical explanations” found
in the social and biomedical sciences do so. In other words, if an
objective homogeneity condition is imposed on statistical explanation,
it is not clear that there will be any examples of successful
statistical explanation outside of quantum mechanics.

With these observations in mind, let us revisit the question of what
is explained by statistical theories, whether quantum mechanical or
macroscopic. As we have seen, both Hempel and Salmon, as well as most
subsequent contributors to the literature on statistical explanation,
have tended to assume that statistical theories that assign a
probability to some outcome strictly between 0 and 1 should
nonetheless be interpreted as explaining that outcome. Given this
common starting point, Salmon is quite persuasive in arguing that it
is arbitrary to hold, as Hempel does, that only individual outcomes
with high probability can be explained. But why should we accept the
starting point? Why not take Salmon's argument instead to be a reason
for rejecting the idea that statistical theories explain individual
outcomes, whether of high or low probability? If we take this view, we
need not conclude that a theory like QM is unexplanatory.
Instead, we may take the explananda of QM to be facts about
the probabilities or expectation values of outcomes rather than
individual outcomes themselves. On this view, the explananda that are
explained by QM are a (proper) subset of those that can be
derived from it—at least in this respect, the explanations
provided by QM are like DS explanations in
structure. Woodward (1989) argues that this construal allows us to say
all that we might legitimately wish to say about the explanatory
virtues of QM. If this is correct, there is no obvious need
for a separate theory of statistical explanation of individual
outcomes of the sort that Hempel and Salmon sought to devise (but see
footnote 7).

In the case of juvenile delinquency and causal modeling techniques it
is, if anything, even more intuitive that what is being explained is
not, e.g. why some particular boy, Albert, became a juvenile
delinquent, but rather something more general—e.g., why the
expected incidence of delinquency is higher among certain subgroups
than others. Again such explananda are deducible from the system of
equations used to model juvenile delinquency. Taking this view of what
is explained by statistical theories allows us to avoid various
unintuitive consequences of Hempel's model (e.g., that high
probability but not low probability outcomes are explained) and of
Salmon's model (e.g., the same explanans $E$ explains both
$M$ and $-M$). At the very least, those who have
sought to construct models of statistical explanation of individual
outcomes need to provide a more detailed elucidation of why such
models are needed and of the features of scientific theorizing they
are designed to
capture.[9]

As we have just seen, the SR model raises a number of
interesting questions about the statistical explanation of individual
outcomes—questions that are important independently of the details
of the SR model itself. This section will abstract away from
such questions and focus instead on the root motivation for the
SR model. We may take this to consist of two ideas: (i)
explanations must cite causal relationships and (ii) causal
relationships are captured by statistical relevance
relationships. Even if (i) is accepted, a fundamental problem with the
SR model is that (ii) is false—as a substantial body of
work[10]
has made clear, causal relationships are greatly underdetermined by
statistical relevance relationships. Consider another example from
Salmon (1971): a system in which atmospheric pressure $A$ is a common
cause of the occurrence of a storm $S$ and the reading of a barometer
$B$ with no causal relationship between $B$ and $S$. Salmon claims
that in such a system $B$ and $S$ will be correlated but that $B$ is
statistically irrelevant to $S$ given $A$, i.e., $P(S \mid A.B) =
P(S \mid A)$. By contrast, (Salmon claims) $A$ remains relevant to $S$
given $B$—i.e., $P(S \mid A.B) \ne P(S \mid
B)$. Similarly, $S$ is irrelevant to $B$
given $A$ but $A$ remains relevant to $B$
given $S$. In this way, Salmon's SR model attempts to
capture the idea that
$A$ is explanatorily (and causally) relevant to $S$
while $B$ is not and that $A$ is explanatorily and
causally relevant to $B$ while $S$ is not.
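
These screening-off claims can be checked numerically for a particular common-cause system. In the following Python sketch the three variables are binary and the probabilities are invented. The assumption that $B$ and $S$ are independent conditional on $A$ is built into the construction of the joint distribution, so the code illustrates what Salmon's claims amount to rather than providing independent support for them.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical parameters for the common-cause structure B <- A -> S.
P_A = {1: F(3, 10), 0: F(7, 10)}           # A = 1: atmospheric pressure drops
P_B_GIVEN_A = {1: F(9, 10), 0: F(1, 10)}   # P(B = 1 | A = a): barometer falls
P_S_GIVEN_A = {1: F(8, 10), 0: F(1, 10)}   # P(S = 1 | A = a): storm occurs


def bernoulli(p_one, value):
    """Probability of `value` for a binary variable with P(value = 1) = p_one."""
    return p_one if value == 1 else 1 - p_one


# Joint distribution over (a, b, s), with B and S independent given A.
joint = {
    (a, b, s): P_A[a] * bernoulli(P_B_GIVEN_A[a], b) * bernoulli(P_S_GIVEN_A[a], s)
    for a, b, s in product((0, 1), repeat=3)
}

INDEX = {"A": 0, "B": 1, "S": 2}


def matches(outcome, condition):
    return all(outcome[INDEX[name]] == value for name, value in condition.items())


def prob(target, given=None):
    """P(target | given); both are dicts over the variable names A, B, S."""
    given = given or {}
    denominator = sum(p for o, p in joint.items() if matches(o, given))
    numerator = sum(p for o, p in joint.items()
                    if matches(o, given) and matches(o, target))
    return numerator / denominator


# B and S are correlated ...
print(prob({"S": 1}, {"B": 1}) != prob({"S": 1}))                      # True
# ... but B is irrelevant to S given A,
print(prob({"S": 1}, {"A": 1, "B": 1}) == prob({"S": 1}, {"A": 1}))    # True
# while A remains relevant to S given B.
print(prob({"S": 1}, {"A": 1, "B": 1}) != prob({"S": 1}, {"B": 1}))    # True
```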

These contentions about the connection between causal claims and
statistical relevance relations are consequences of a more general
principle called the Causal Markov condition which has been
extensively discussed in the recent literature on
causation.[11]
A set of variables standing in a causal relationship and an
associated probability distribution over those variables satisfy the
Causal Markov condition if and only if conditional on its direct
causes every variable is independent of every other variable except
possibly for its effects. Two relevant points have emerged from
discussion of this condition. The first, which was in effect noted by
Salmon himself in work subsequent to his (1971), is that there are
circumstances in which the Causal Markov condition fails and hence in
which causal claims do not imply the screening off relationships
described above. This can happen, for example, if the variables to
which the condition is applied are characterized in an insufficiently
fine-grained
way.[12]
The second and more fundamental observation is that, depending on the
details of the case, many different sets of causal relationships may
be compatible with the same statistical relevance relationships, even
assuming that the Causal Markov condition is satisfied. For example, a
structure in which $B$ causes $A$ which in turn
causes $S$ will, if we assume the Causal Markov condition (that
is, make assumptions like Salmon's connecting causation and
statistical relevance relationships), lead to exactly the same
statistical relevance relationships as in the example in which
$A$ is a common cause of $B$ and $S$. Similarly
if $S$ causes $A$ which in turn causes $B$. In
structures with more variables, this underdetermination of causal
relationships by statistical relevance relationships may be far more
extreme. Thus a list of statistical relevance relationships, which is
what the SR model provides, need not tell us which causal
relationships are operative. To the extent that explanation has to do
with the identification of the causal relationships on which an
explanandum-outcome depends, the SR model fails to fully
capture these.
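
The underdetermination can be exhibited directly in a simple case. The sketch below, again with invented numbers and hypothetical names, builds a joint distribution from the common-cause structure and then re-expresses exactly the same distribution in the factorized form appropriate to the chain in which $B$ causes $A$ which in turn causes $S$. Since the two joint distributions coincide, every statistical relevance relationship that holds in one holds in the other.

```python
from fractions import Fraction as F
from itertools import product

VALUES = (0, 1)


def pick(p_one, value):
    """Probability that a binary variable takes `value`, given P(value = 1)."""
    return p_one if value == 1 else 1 - p_one


# Fork (common cause): B <- A -> S, with hypothetical parameters.
p_a = F(3, 10)
p_b_given_a = {1: F(9, 10), 0: F(1, 10)}
p_s_given_a = {1: F(8, 10), 0: F(1, 10)}
fork = {
    (a, b, s): pick(p_a, a) * pick(p_b_given_a[a], b) * pick(p_s_given_a[a], s)
    for a, b, s in product(VALUES, repeat=3)
}


def marginal(joint, keep):
    """Marginal distribution over the positions in `keep` (0 = A, 1 = B, 2 = S)."""
    result = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        result[key] = result.get(key, 0) + p
    return result


# Re-describe the same numbers as a chain B -> A -> S:
# P(a, b, s) = P(b) * P(a | b) * P(s | a), with every factor read off the fork.
p_b = marginal(fork, [1])
p_ab = marginal(fork, [0, 1])
p_as = marginal(fork, [0, 2])
p_a_marg = marginal(fork, [0])
chain = {
    (a, b, s): p_b[(b,)] * (p_ab[(a, b)] / p_b[(b,)]) * (p_as[(a, s)] / p_a_marg[(a,)])
    for a, b, s in product(VALUES, repeat=3)
}

print(chain == fork)  # True
```

The design choice here is to derive the chain's parameters from the fork rather than to pick them independently, since the point at issue is that one and the same probability distribution admits both causal readings.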

Selected Readings. Salmon, 1971a provides a detailed
statement and defense of the SR model. This essay, as well as
papers by Jeffrey (1969) and Greeno (1970) which defend views broadly
similar to the SR model, are collected in Salmon, 1971b.
Additional discussion of the model as well as a more recent
characterization of “objective homogeneity” can be found
in Salmon, 1984. Cartwright, 1979 contains some influential criticisms
of the SR model. Theorems specifying the precise extent of
the underdetermination of causal claims by evidence about statistical
relevance relationships can be found in Spirtes, Glymour and Scheines,
1993, 2000, chapter 4.

In more recent work (especially, Salmon, 1984) Salmon abandoned the
attempt to characterize explanation or causal relationships in purely
statistical terms. Instead, he developed a new account which he called
the Causal Mechanical (CM) model of explanation—an
account which is similar in both content and spirit to so-called
process theories of causation of the sort defended by
philosophers like Philip Dowe (2000). We may think of the
CM model as an attempt to capture the “something
more” involved in causal and explanatory relationships over and
above facts about statistical relevance, again while attempting to
remain within a broadly Humean framework.

The CM model employs several central ideas. A causal
process is a physical process, like the movement of a baseball
through space, that is characterized by the ability to transmit a
mark in a continuous way. (“Continuous”
generally, although perhaps not always, means “spatio-temporally
continuous”.) Intuitively, a mark is some local modification to
the structure of a process—for example, a scuff on the
surface of a baseball or a dent in an automobile fender. A process is
capable of transmitting a mark if, once the mark is introduced at one
spatio-temporal location, it will persist to other spatio-temporal
locations even in the absence of any further interaction. In this
sense the baseball will transmit the scuff mark from one location to
another. Similarly, a moving automobile is a causal process because a
mark in the form of a dent in a fender will be transmitted by this
process from one spatio-temporal location to another. Causal processes
contrast with pseudo-processes which lack the ability to
transmit marks. An example is the shadow of a moving physical
object. The intuitive idea is that, if we try to mark the shadow by
modifying its shape at one point (for example, by altering a light
source or introducing a second occluding object), this modification
will not persist unless we continually intervene to maintain it as the
shadow occupies successive spatio-temporal positions. In other words,
the modification will not be transmitted by the structure of the
shadow itself, as it would in the case of a genuine causal
process.

We should note for future reference that, as characterized by Salmon,
the ability to transmit a mark is clearly a counterfactual notion, in
several senses. To begin with, a process may be a causal process even
if it does not in fact transmit any mark, as long as it is true that
if it were appropriately marked, it would transmit the mark.
Moreover, the notion of marking itself involves a counterfactual
contrast—a contrast between how a process behaves when marked
and how it would behave if left unmarked. Although Salmon, like
Hempel, has always been suspicious of counterfactuals, his view at the
time that he first introduced the CM model was that the
counterfactuals involved in the characterization of mark transmission
were relatively unproblematic, in part because they seemed
experimentally testable in a fairly direct way. Nonetheless the
reliance of the CM model, as originally formulated, on
counterfactuals shows that it does not completely satisfy the Humean
strictures described above. In subsequent work, described in Section
4.4 below, Salmon attempted to construct a version of the CM
model that completely avoids reliance on counterfactuals.

The other major element in Salmon's model is the notion of a
causal interaction. A causal interaction involves a
spatio-temporal intersection between two causal processes which
modifies the structure of both—each process comes to have
features it would not have had in the absence of the interaction. A
collision between two cars that dents both is a paradigmatic causal
interaction.

According to the CM model, an explanation of some event
$E$ will trace the causal processes and interactions leading up
to $E$ (Salmon calls this the etiological aspect of
the explanation), or at least some portion of these, as well as
describing the processes and interactions that make up the event
itself (the constitutive aspect of explanation). In this way,
the explanation shows how $E$ “fit[s] into a causal
nexus” (1984, p.9).

The suggestion that explanation involves “fitting” an
explanandum into a causal nexus does not give us any very precise
characterization of what the relationship between $E$ and other
causal processes and interactions must be if information about the
latter is to explain $E$. Nonetheless, it seems clear enough
how the intuitive idea is meant to apply to specific examples. Suppose
that a cue ball, set in motion by the impact of a cue stick, strikes a
stationary eight ball with the result that the eight ball is put in
motion and the cue ball changes direction. The impact of the stick
also transmits some blue chalk to the cue ball which is then
transferred to the eight ball on impact. The cue stick, the cue ball,
and the eight ball are causal processes, as is shown by the
transmission of the chalk mark, and the collision of the cue stick
with the cue ball and the collision of the cue and eight balls are
causal interactions. Salmon's idea is that citing such facts about
processes and interactions explains the motion of the balls after the
collision; by contrast, if one of these balls casts a shadow that
moves across the other, this will be causally and explanatorily
irrelevant to its subsequent motion since the shadow is a
pseudo-process.

As the cue ball example illustrates, the CM model takes as
its paradigms of causal interaction examples such as collisions in
which there is “action by contact” and no spatio-temporal
gaps in the transmission of causal influence. There is little doubt
that explanations in which there are no such gaps (no “action at
a distance”) often strike us as particularly
satisfying.[13]
However, as Christopher Hitchcock shows in an illuminating paper
(Hitchcock, 1995), even here the CM model leaves out
something important. Consider the usual elementary textbook
“scientific explanation” of the motion of the balls in the
above example following their collision. This explanation proceeds by
deriving that motion from information about their masses and velocities
before the collision, the assumption that the collision is perfectly
elastic, and the law of the conservation of linear momentum. We
usually think of the information conveyed by this derivation as
showing that it is the mass and velocity of the balls, rather than,
say, their color or the presence of the blue chalk mark, that is
explanatorily relevant to their subsequent motion. However, it is hard
to see what in the CM model allows us to pick out the linear
momentum of the balls, as opposed to these other features, as
explanatorily relevant. Part of the difficulty is that to express such
relatively fine-grained judgments of explanatory relevance (that it is
linear momentum rather than chalk marks that matters) we need to talk
about relationships between properties or magnitudes and it is not
clear how to express such judgments in terms of facts about causal
processes and interactions. Both the linear momentum and the chalk
mark communicated to the cue ball by the cue stick are marks
transmitted by the spatio-temporally continuous causal process
consisting of the motion of the cue ball. Both marks are then
transmitted via an interaction to the eight ball. There appears to be
nothing in Salmon's notion of mark transmission or the notion of a
causal process that allows one to distinguish between the
explanatorily relevant momentum and the explanatorily irrelevant blue
chalk mark.
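
The textbook derivation mentioned above can be sketched for the simplest case. The code below is an illustration under stated assumptions: it treats a head-on (one-dimensional) perfectly elastic collision, so the cue ball's change of direction in the two-dimensional example is not modeled, and the function name and numerical values are hypothetical. Only masses and velocities appear as inputs; nothing about color or chalk enters the calculation.

```python
def elastic_collision_1d(m1, u1, m2, u2):
    """Final velocities (v1, v2) after a head-on, perfectly elastic collision.

    Obtained from conservation of linear momentum,
        m1*u1 + m2*u2 = m1*v1 + m2*v2,
    together with conservation of kinetic energy (the elasticity assumption).
    """
    total = m1 + m2
    v1 = ((m1 - m2) * u1 + 2.0 * m2 * u2) / total
    v2 = ((m2 - m1) * u2 + 2.0 * m1 * u1) / total
    return v1, v2


# Hypothetical values: equal-mass balls (kg), cue ball at 2 m/s, eight ball at rest.
m_cue, m_eight = 0.17, 0.17
u_cue, u_eight = 2.0, 0.0
v_cue, v_eight = elastic_collision_1d(m_cue, u_cue, m_eight, u_eight)

# Total linear momentum is the same before and after the collision.
momentum_before = m_cue * u_cue + m_eight * u_eight
momentum_after = m_cue * v_cue + m_eight * v_eight
```

With equal masses and a stationary target, the formulas reduce to the cue ball stopping and the eight ball moving off with the cue ball's initial velocity.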

Ironically, as Hitchcock goes on to note, a similar observation may
be made about the birth control pills example (2.5.2) originally
devised by Salmon to illustrate the failure of the DN model
to capture the notion of explanatory relevance. Spatio-temporally
continuous causal processes that transmit marks as well as causal
interactions are at work when male Mr. Jones ingests birth control
pills—the pills dissolve, components enter his bloodstream,
are metabolized or processed in some way, and so on. Similarly,
spatio-temporally continuous causal processes (albeit different
processes) are at work when female Ms. Jones takes birth control
pills. However, the pills are irrelevant to Mr. Jones' non-pregnancy,
and relevant to Ms. Jones' non-pregnancy. Again, it looks as though
the relevance or irrelevance of the birth control pills to Mr. or
Ms. Jones' failure to become pregnant cannot be captured just by
asking whether the processes leading up to these outcomes are causal
processes in Salmon's sense. A similar point holds for the hexed salt
example (2.6.3): there are spatio-temporally continuous
causal processes running from the witch's wand that touches the salt
sample to the individual $\mathrm{Na}^+$ and $\mathrm{Cl}^-$ ions formed when
the salt dissolves but this is not sufficient for the hexing to be
causally (or explanatorily) relevant to the dissolving.

A more general way of putting the problem revealed by these examples
is that those features of a process $P$ in virtue of which it
qualifies as a causal process (ability to transmit mark $M$)
may not be the features of $P$ that are causally or
explanatorily relevant to the outcome $E$ that we want to
explain ($M$ may be irrelevant to $E$ with some other
property $R$ of $P$ being the property which is causally
relevant to $E$). So while mark transmission may well be a
criterion that correctly distinguishes between causal
processes and pseudo-processes, it does not, as it
stands, provide the resources for distinguishing those
features or properties of a causal process that are
causally or explanatorily relevant to an outcome and those features
that are irrelevant.

A second set of worries has to do with the application of the
CM model to systems which depart in various respects from
simple physical paradigms such as the collision described above. There
are a number of examples of such systems. First, there are theories
like Newtonian gravitational theory which involve “action at a
distance” in a physically interesting sense. Second, there are a
number of examples from the literature on causation that do not
involve physically interesting forms of action at a distance but which
arguably involve causal interactions without intervening
spatio-temporally continuous processes or transfer of energy and
momentum from cause to effect. These include cases of causation by
omission and causation by “double prevention” or
“disconnection.”[14]
In all these cases, a literal application of the CM model
seems to yield the judgment that no explanation has been
provided—that Newtonian gravitational theory is unexplanatory
and so on. Many philosophers have been reluctant to accept this
assessment.

Yet another class of examples that raise problems for the CM
model involves putative explanations of the behavior of complex or
“higher level” systems—explanations that do not
explicitly cite spatio-temporally continuous causal processes
involving transfer of energy and momentum, even though we may think
that such processes are at work at a more “underlying”
level. Most explanations in disciplines like biology, psychology and
economics fall under this description, as do a number of
straightforwardly physical explanations.

As an illustration, suppose that a mole of gas is confined to a
container of volume $V_1$, at pressure
$P_1$, and temperature $T_1$. The
gas is then allowed to expand isothermally into a larger container of
volume $V_2$. One standard way of explaining the
behavior of the gas—its rate of diffusion and its subsequent
equilibrium pressure $P_2$—appeals to the
generalizations of phenomenological thermodynamics—e.g., the
ideal gas law, Graham's law of diffusion, and so on. Salmon appears to
regard putative explanations based on at least the first of these
generalizations as not explanatory because they do not trace
continuous causal processes—he thinks of the individual
molecules as causal processes but not the gas as a
whole.[15]
However, it is plainly impossible to trace the causal processes and
interactions represented by each of the $6 \times 10^{23}$ molecules making up the gas and the successive
interactions (collisions) it undergoes with every other molecule. The
usual statistical mechanical treatment, which Salmon presumably would
regard as explanatory, does not attempt to do this. Instead, it makes
certain general assumptions about the distribution of molecular
velocities and the forces involved in molecular collisions and then
uses these, in conjunction with the laws of mechanics, to derive and
solve a differential equation (the Boltzmann transport equation)
describing the overall behavior of the gas. This treatment abstracts
radically from the details of the causal processes involving
particular individual molecules and instead focuses on identifying
higher level variables that aggregate over many individual causal
processes and that figure in general patterns that govern the behavior
of the gas.
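
For comparison, the phenomenological account of the final pressure is a one-line calculation. The sketch below assumes an ideal gas and an isothermal expansion, as in the example; the function name and the numerical values are illustrative and are not taken from Salmon or from the text.

```python
R = 8.314  # molar gas constant in J/(mol*K), rounded


def ideal_gas_pressure(n_moles, temperature, volume):
    """Pressure in pascals from the ideal gas law P = nRT / V."""
    return n_moles * R * temperature / volume


# Hypothetical values: one mole at 300 K expanding from 0.010 m^3 to 0.025 m^3.
n, T1 = 1.0, 300.0
V1, V2 = 0.010, 0.025

P1 = ideal_gas_pressure(n, T1, V1)
P2 = ideal_gas_pressure(n, T1, V2)   # temperature unchanged: isothermal
```

Because the temperature is held fixed, the two states satisfy $P_1 V_1 = P_2 V_2$, so the final pressure is fixed by the macroscopic variables alone, with no reference to individual molecules.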

This example raises a number of questions. Just what does the
CM model require in the case of complex systems in which we
cannot trace individual causal processes, at least at a fine-grained
level? How exactly does the causal mechanical model avoid the
(disastrous) conclusion that any successful explanation of the
behavior of the gas must trace the trajectories of individual
molecules? Does the statistical mechanical explanation described above
successfully trace causal processes and interactions or specify a
causal mechanism in the sense demanded by the CM model, and
if so, what exactly does tracing causal processes and interactions
involve or amount to in connection with such a system? As matters now
stand both the CM model and the process theories of causation
that are its more recent descendants are incomplete.

There is another aspect of this example that is worthy of comment.
Even if, per impossibile, an account that traced individual
molecular trajectories were to be produced, there are important
respects in which it would not provide the sort of explanation of the
macroscopic behavior of the gas that we are likely to be looking
for—and not just because such an account would be far too
complex to be followed by a human mind. There are a very large number
of different possible trajectories of the individual molecules in
addition to the trajectories actually taken that would produce the
macroscopic outcome—the final
pressure $P_2$—that we want to explain. This
information is certainly explanatorily relevant to the macroscopic
behavior of the gas and we would like our account of explanation to
accommodate this fact. Very roughly, given the laws governing molecular
collisions, one can show that almost all (i.e., all except a set of
measure zero) of the possible initial positions and momenta consistent
with the initial macroscopic state of the gas, as characterized by
$P_1$, $T_1$, and
$V_1$, will lead to molecular trajectories such that
the gas will evolve to the macroscopic outcome in which the gas
diffuses to an equilibrium state of uniform density through the
chamber at pressure $P_2$. Similarly, there is a
large range of different microstates of the gas compatible with each
of the various other possible values for the temperature of the gas
and each of these states will lead to a different final pressure
$P_2^*$. If we just trace the causal processes (in
the form of actual molecular trajectories) that lead to
$P_2$, as the CM model requires, we will
fail to represent or capture this information about the full range of
conditions under which $P_2$ and alternatives to it
will occur.
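
The point about the range of microstates can be illustrated with a toy calculation. This is only a sketch under strong simplifying assumptions: an ideal monatomic gas, velocity components drawn independently from a Gaussian with variance $kT/m$ (the Maxwell-Boltzmann distribution), a sample of twenty thousand molecules standing in for a mole, and the kinetic-theory relation $P = N m \langle v^2 \rangle / 3V$. It does not simulate the dynamics of the expansion. The molecular mass, the seeds, and the helper names are hypothetical choices.

```python
import random

K_B = 1.380649e-23      # Boltzmann constant, J/K
N_A = 6.02214076e23     # Avogadro constant, 1/mol
MASS = 6.6e-27          # assumed molecular mass in kg (roughly helium)


def sample_mean_square_speed(temperature, n_sample, seed):
    """Draw one hypothetical microstate and return its mean squared speed."""
    rng = random.Random(seed)
    sigma = (K_B * temperature / MASS) ** 0.5   # std. dev. of each velocity component
    total = 0.0
    for _ in range(n_sample):
        vx, vy, vz = (rng.gauss(0.0, sigma) for _ in range(3))
        total += vx * vx + vy * vy + vz * vz
    return total / n_sample


def pressure_from_microstate(temperature, volume, seed, n_sample=20000):
    """Kinetic-theory pressure P = N m <v^2> / (3 V) for one mole of gas."""
    mean_sq = sample_mean_square_speed(temperature, n_sample, seed)
    return N_A * MASS * mean_sq / (3.0 * volume)


V2 = 0.025  # final volume in m^3 (hypothetical)
# Different microstates (seeds) at the same temperature...
same_T = [pressure_from_microstate(300.0, V2, seed) for seed in range(5)]
# ...versus microstates drawn at a different temperature.
other_T = [pressure_from_microstate(400.0, V2, seed) for seed in range(5)]
```

Each seed yields a different set of molecular velocities, yet the pressures in `same_T` cluster closely around one value, while those in `other_T` cluster around another. A record of one actual set of trajectories would show only a single entry of one list, not the pattern across them.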

A similar point holds for explanations of the behavior of other sorts
of complex systems, such as those studied in biology and economics.
Consider the standard explanation, in terms of an upward shift of the
supply curve, with an unchanged demand curve, for the increase in the
price of oranges following a freeze. Underlying the behavior of this
market are individual spatio-temporally continuous causal processes
and interactions in Salmon's sense—there are a myriad of
individual transactions in which money in some form is exchanged for
physical goods, all of which involve transfers of matter or energy,
there is exchange of information about intentions or commitments to
buy or sell at various prices, all of which must take place in some
physical medium and involve transfers of energy, and so on. However,
it also seems plain that producing a full description of these
processes (supposing for the sake of argument that it were possible to
do this) will produce little or no insight into why these systems
behave as they do. Again, this is not just because any such
“explanation” will overwhelm our information processing
abilities. It is also the case that a great deal of the information
contained in such a description will be irrelevant to the behavior we
are trying to explain, for the same reason that a detailed description
of the individual molecular trajectories will contain information that
is irrelevant to the behavior of the gas. For example, while the
detailed description of the individual causal processes involved in
the operation of the market for oranges presumably will describe
whether individual consumers purchase oranges by cash, check, or
credit card, whether information about the freeze is communicated by
telephone or email, and so on, all of this is to a first approximation
irrelevant to the equilibrium price—given the supply and
demand curves, the equilibrium price will be the same as long as there
is a market in which consumers are able to purchase oranges by some
means, information about the freeze and about prices is available to
buyers and sellers in some form, and so
on.[16]
Moreover, those factors that are explanatorily relevant to
the equilibrium price, such as the shape of the demand and supply
curves, are not in any obvious sense themselves connected by
spatio-temporally continuous processes to the price (it is unclear
what this claim even means), although as emphasized above, the unknown
processes underlying the attainment of equilibrium are presumably
spatio-temporally continuous.
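
The standard explanation can be made concrete with a toy model. The sketch below assumes linear demand and supply curves and hypothetical coefficients, and it represents the upward shift of the supply curve after the freeze as a lower supply intercept (less is supplied at every price). The function names and the two unused keyword arguments are illustrative devices, not part of any economic library.

```python
def equilibrium_price(demand_intercept, demand_slope, supply_intercept, supply_slope):
    """Price at which linear demand Qd = a - b*p equals linear supply Qs = c + d*p.

    Setting Qd = Qs gives p = (a - c) / (b + d).
    """
    return (demand_intercept - supply_intercept) / (demand_slope + supply_slope)


def price_after_freeze(payment_method="cash", news_channel="telephone"):
    """Equilibrium price after the freeze (hypothetical coefficients).

    The two keyword arguments stand for details of individual transactions.
    They are accepted but deliberately unused: the equilibrium depends only
    on the demand and supply curves.
    """
    return equilibrium_price(100.0, 2.0, 5.0, 3.0)


# Demand is unchanged; only the supply intercept falls (from 20 to 5).
price_before = equilibrium_price(100.0, 2.0, 20.0, 3.0)
price_after = price_after_freeze()
```

In this sketch the price rises after the freeze, and calling `price_after_freeze("credit card", "email")` returns the same value as the default call, mirroring the claim that such details are, to a first approximation, irrelevant to the equilibrium price.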

Again the issue is how an account like Salmon's can capture this
feature of successful explanation of the behavior of complex
systems—how the account guides us to find the
“right” level of description of the phenomena we are
trying to explain. In fact, as the above examples illustrate, the
requirements that Salmon imposes on causal processes, and in particular
the requirement of spatio-temporal continuity, often seem to lead
us away from the right level of description. The level at which the
spatio-temporal continuity constraint is most obviously respected (the
level at which, e.g., we describe a particular consumer as exchanging
cash for oranges or a grower as making an agreement via telephone with
a retailer to sell at a certain price) seems to be the wrong level for
achieving understanding.

In more recent work (e.g., Salmon, 1994), prompted in part by a
desire to avoid certain counterexamples advanced by Philip Kitcher
(Kitcher, 1989) to his characterization of mark transmission, Salmon
attempted to fashion a theory of causal explanation that completely
avoids any appeal to counterfactuals. In this new theory which is
influenced by the conserved process theory of causation of Dowe (Dowe,
2000), Salmon defined a causal process as a process that transmits a
non-zero amount of a conserved quantity at each moment in its
history. Conserved quantities are quantities so characterized in
physics—linear momentum, angular momentum, charge, and so
on. A causal interaction is an intersection of world lines associated
with causal processes involving exchange of a conserved
quantity. Finally, a process transmits a conserved quantity from
$A$ to $B$ if it possesses that quantity at every stage
without any interactions that involve an exchange of that quantity in
the half-open interval $(A, B]$.

One may doubt that this new theory really avoids reliance on
counterfactuals, but an even more fundamental difficulty is that it
still does not adequately deal with the problem of causal or
explanatory relevance described above. That is, we still face the
problem that the feature that makes a process causal (transmission of
some conserved quantity or other) may tell us little about which
features of the process are causally or explanatorily relevant to the
outcome we want to explain. For example, a moving billiard ball will
transmit many conserved quantities (linear momentum, angular momentum,
charge, etc.) and many of these may be exchanged during a collision
with another ball. What is it that entitles us to single out the
linear momentum of the balls, rather than these other conserved
quantities as the property that is causally relevant to their
subsequent motion? In cases in which there appear to be no
conservation laws governing the explanatorily relevant property (i.e.,
cases in which the explanatorily relevant variables are not conserved
quantities) this difficulty seems even more acute. Properties like
“having ingested birth control pills,” “being
pregnant”, or “being a sample of hexed salt” do not
themselves figure in conservation laws. While one may say that both
birth control pills and hexed salt are causal processes because both
consist, at some underlying level, of processes that unambiguously
involve the transmission of conserved quantities like mass and charge,
this observation does not by itself tell us what, if anything, about
these underlying processes is relevant to pregnancy or dissolution in
water.

In a still more recent paper (Salmon, 1997), Salmon conceded this
point. He agreed that the notion of a causal process cannot by itself
capture the notion of causal and explanatory relevance. He suggested,
however, that this notion can be adequately captured by appealing to
the notion of a causal process and information about
statistical relevance relationships (that is, information about
conditional and unconditional (in)dependence relationships), with
the latter capturing the element of causal or explanatory dependence
that was missing from his previous account:

I would now say that (1) statistical relevance relations, in the
absence of information about connecting causal processes, lack
explanatory import and that (2) connecting causal processes, in the
absence of statistical relevance relations, also lack explanatory
import. (1997, p.476)

This suggestion is not developed in any detail in Salmon's paper, and
it is not easy to see how it can be made to work. We noted above that
statistical relevance relationships often greatly underdetermine the
causal relationships among a set of variables. What reason is there to
suppose that appealing to the notion of a causal process, in Salmon's
sense, will always or even usually remove this indeterminacy? We also
noted that the notion of a causal process cannot capture fine-grained
notions of relevance between properties, that there can be causal
relevance between properties instances of which (at least at the level
of description at which they are characterized) are not linked by
spatio-temporally continuous processes or transference of conserved quantities,
and that properties can be so linked without being causally relevant
(recall the chalk mark that is transmitted from one billiard ball to
another). As long as it is possible (and why should it not be?) for
different causal claims to imply the same facts about statistical
relevance relationships and for these claims to differ in ways that
cannot be fully cashed out in terms of Salmon's notions of causal
processes and interactions, this new proposal will fail as well.

Selected Readings: Salmon, 1984 provides a detailed
statement of the Causal Mechanical model, as originally formulated.
Salmon, 1994 and 1997 provide a restatement of the model and respond
to criticisms. For discussion and criticism of the CM model,
see Kitcher, 1989, especially pp. 461ff, Woodward, 1989 and Hitchcock,
1995.

The basic idea of the unificationist account is that
scientific explanation is a matter of providing a unified account of a
range of different phenomena. This idea is unquestionably intuitively
appealing. Successful unification may exhibit connections or
relationships between phenomena previously thought to be unrelated and
this seems to be something that we expect good explanations to
do. Moreover, theory unification has clearly played an important role
in science. Paradigmatic examples include Newton's unification of
terrestrial and celestial theories of motion and Maxwell's unification
of electricity and magnetism. The key question, however, is whether
our intuitive notion (or notions) of unification can be made more
precise in a way that allows us to recover the features that we think
that good explanations should possess.

Michael Friedman (1974) made an important early attempt to do this.
Friedman's formulation of the unificationist idea was subsequently
shown to suffer from various technical problems (Kitcher, 1976) and
subsequent development of the unificationist treatment of explanation
has been most closely associated with Philip Kitcher (especially
Kitcher, 1989).

Let us begin by introducing some of Kitcher's technical vocabulary. A
schematic sentence is a sentence in which some of the
nonlogical vocabulary has been replaced by dummy letters. To use
Kitcher's examples, the sentence “Organisms homozygous for the
sickling allele develop sickle cell anemia” is associated with a
number of schematic sentences including “Organisms homozygous
for $A$ develop $P$” and “For all $X$
if $X$ is $O$ and $A$ then $X$
is $P$”. Filling instructions are directions that
specify how to fill in the dummy letters in schematic sentences. For
example, filling instructions might tell us to replace $A$ with
the name of an allele and $P$ with the name of a phenotypic
trait in the first of the above schematic sentences. Schematic
arguments are sequences of schematic sentences.
Classifications describe which sentences in schematic
arguments are premises and conclusions and what rules of inference are
used. An argument pattern is an ordered triple consisting of
a schematic argument, a set of sets of filling instructions, one for
each term of the schematic argument, and a classification of the
schematic argument. The more restrictions an argument pattern imposes
on the arguments that instantiate it, the more stringent it
is said to be.
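
Kitcher's vocabulary can be pictured as a small data structure. The following is a minimal sketch, not Kitcher's own formalism: the class and field names are hypothetical, dummy letters are written in braces, and the second and third schematic sentences are invented here to make a complete toy argument around his sickle cell example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SchematicSentence:
    template: str   # dummy letters written as {A}, {P}, ...

    def fill(self, substitution: Dict[str, str]) -> str:
        """Replace each dummy letter with the expression assigned to it."""
        return self.template.format(**substitution)


@dataclass
class ArgumentPattern:
    """Ordered triple: schematic argument, filling instructions, classification."""
    schematic_argument: Tuple[SchematicSentence, ...]
    filling_instructions: Tuple[Dict[str, str], ...]   # one set per sentence
    classification: Tuple[str, ...]                    # role of each sentence

    def instantiate(self, substitution: Dict[str, str]) -> List[str]:
        """Produce one concrete argument that instantiates the pattern."""
        return [s.fill(substitution) for s in self.schematic_argument]


pattern = ArgumentPattern(
    schematic_argument=(
        SchematicSentence("Organisms homozygous for {A} develop {P}"),
        SchematicSentence("{X} is homozygous for {A}"),   # invented for illustration
        SchematicSentence("{X} develops {P}"),            # invented for illustration
    ),
    filling_instructions=(
        {"A": "the name of an allele", "P": "the name of a phenotypic trait"},
        {"X": "the name of an organism", "A": "the name of an allele"},
        {"X": "the name of an organism", "P": "the name of a phenotypic trait"},
    ),
    classification=("premise", "premise", "conclusion from 1 and 2"),
)

instance = pattern.instantiate(
    {"A": "the sickling allele", "P": "sickle cell anemia", "X": "This organism"}
)
```

On this picture, a more stringent pattern is one whose filling instructions and classification admit fewer instantiations. The sketch records the instructions but does not enforce them.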

Roughly speaking, Kitcher's guiding idea is that explanation is a
matter of deriving descriptions of many different phenomena by using
as few and as stringent argument patterns as possible over and over
again: the fewer the patterns used, the more stringent they are, and
the greater the range of different conclusions derived, the more
unified our explanations. Kitcher summarizes this view as follows:

Science advances our understanding of nature by showing us how to
derive descriptions of many phenomena, using the same pattern of
derivation again and again, and in demonstrating this, it teaches us
how to reduce the number of facts we have to accept as ultimate.
(p.423).

Kitcher does not propose a completely general theory of how the
various considerations he describes—number of conclusions,
number of patterns and stringency of patterns—are to be traded
off against one another, but does suggest that it often will be clear
enough what these considerations imply about the evaluation of
particular candidate explanations. His basic strategy is to attempt to
show that the derivations we regard as good or acceptable explanations
are instances of patterns that taken together score better according
to the criteria just described than the patterns instantiated by the
derivations we regard as defective explanations. Following Kitcher,
let us define the explanatory store $E(K)$ as
the set of argument patterns that maximally unifies $K$, the
set of beliefs accepted at a particular time in science. Showing that
a particular derivation is a good or acceptable explanation is then a
matter of showing that it belongs to the explanatory store.

As an illustration, consider Kitcher's treatment of the problem of
explanatory asymmetries (recall Section 2.5). Our present explanatory
practices—call these $P$—are committed to the
idea that derivations of a flagpole's height from the length of its
shadow are not explanatory. Kitcher compares $P$ with an
alternative systemization in which such derivations are regarded as
explanatory. According to Kitcher, $P$ includes the use of a
single “origin and development” (OD) pattern of
explanation, according to which the dimensions of objects (artifacts,
mountains, stars, organisms, etc.) are traced to “the conditions
under which the object originated and the modifications it has
subsequently undergone” (1989, p. 485). Now consider the
consequences of adding to $P$ an additional pattern $S$
(the shadow pattern) which permits the derivation of the dimensions of
objects from facts about their shadows. Since the OD pattern
already permits the derivation of all facts about the dimensions of
objects, the addition of the shadow pattern $S$ to $P$
will increase the number of argument patterns in $P$ and will
not allow us to derive any new conclusions. On the other hand, if we
were to drop OD from $P$ and replace it with the
shadow pattern, we would have no net change in the number of patterns
in $P$, but would be able to derive far fewer conclusions than
we would with OD, since many objects do not have shadows (or
enough shadows) from which to derive all of their dimensions. Thus
OD belongs to the explanatory store, and the shadow pattern
does not.

Kitcher's treatment of other familiar problem cases is similar. For
example, he notes that we believe that an explanation of why some
sample of salt dissolves in water that appeals to the fact that the
salt is hexed and the generalization $(H)$ that all hexed salt
dissolves in water is defective, at least in comparison with the
standard explanation that appeals just to the generalization that
$(D)$ all salt dissolves in water. He suggests that the
“basis for this belief” is that the derivation that
appeals to $(H)$ instantiates an argument pattern that belongs
to a totality of patterns that is less unifying than the totality
containing the derivation that appeals to $(D)$. In particular,
an explanatory store containing $(H)$ but not $(D)$ will
have a more restricted consequence set than a store containing
$(D)$ but not $(H)$, since the latter but not the former
allows for the derivation of facts about the dissolving of unhexed
salt in water. And the addition of $(H)$ to an explanatory
store containing $(D)$ will increase the number of patterns
without any compensating gain in what can be derived.

Kitcher acknowledges that there is nothing in the unificationist
account per se that requires that all explanation be
deductive: “there is no bar in principle to the use of
non-deductive arguments in the systemization of our
beliefs”. Nonetheless, “the task of comparing the unifying
power of different systemizations looks even more formidable if
nondeductive arguments are considered” and in part for this
reason Kitcher endorses the view that “in a certain sense,
all explanation is deductive” (p.448).

What is the role of causation on this account? Kitcher claims that
“the ‘because’ of causation is always derivative
from the ‘because’ of explanation.” (1989,
p.477). That is, our causal judgments simply reflect the explanatory
relationships that fall out of our (or our intellectual ancestors')
attempts to construct unified theories of nature. There is no
independent causal order over and above this which our explanations
must capture. Like many other philosophers, Kitcher takes very
seriously, even if in the end he perhaps does not fully endorse,
standard empiricist or Humean worries about the epistemic
accessibility and intelligibility of causal claims. Taking causal,
counterfactual or other notions belonging to the same family as
primitive in the theory of explanation is problematic. Kitcher
believes that it is a virtue of his theory that it does not do
this. Instead, Kitcher proposes to begin with the notion of
explanatory unification, characterized in terms of constraints on
deductive systemizations, where these constraints can be specified in
a quite general way that is independent of causal or counterfactual
notions, and then show how the causal claims we accept derive from our
efforts at unification.

As remarked at the beginning of this section, the idea that
explanation is connected in some way to unification is intuitively
appealing. Nonetheless Kitcher's particular way of cashing out this
connection seems problematic. Consider Kitcher's treatment of the
flagpole example. This depends heavily on the contingent truth that
some objects do not cast enough shadows to recover all of their
dimensions. But it seems to be part not just of common sense, but of
currently accepted physical theory that it would be inappropriate to
appeal to facts about the shadows cast by objects to explain their
dimensions even in a world in which all objects cast enough shadows
that all their dimensions could be recovered. It is unclear how
Kitcher's account can recover this judgment.

The matter becomes clearer if we turn our attention to a variant
example in which, unlike the shadow example, there are clearly just as
many backwards derivations from effects to causes as there are
derivations from causes to effects. Consider, following Barnes (1992),
a time-symmetric theory like Newtonian mechanics, applied to a closed
system like the solar system. Call derivations of the state of motion
of planets at some future time $t$ from information about their
present positions (at time $t_0$), masses, and
velocities, the forces incident on them at $t_0$, and
the laws of mechanics predictive. Now contrast such
derivations with retrodictive derivations in which the
present motions of the planets are derived from information about
their future velocities and positions at $t$, the forces
operative at $t$, and so on. It looks as though there will be
just as many retrodictive derivations as predictive derivations, and
each will require premises of exactly the same general sort—information
about positions, velocities, masses etc. and the same
laws. Thus the pattern or patterns instantiated by the retrodictive
derivations look(s) exactly as unified as the pattern or patterns
associated with the predictive derivations. However, we ordinarily
think of the predictive derivations and not the retrodictive
derivations as explanatory and the present state of the planets as the
cause of their future state and not vice-versa. It is again far from
obvious how considerations having to do with unification could
generate such an explanatory asymmetry.
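
The symmetry between the two kinds of derivation can be illustrated
numerically. The following is a minimal sketch, assuming a single
planet orbiting a fixed sun in two dimensions, units in which
$GM = 1$, and a velocity Verlet integrator (chosen here because it is
itself time-reversible). None of these choices comes from Barnes's
discussion, and all function names are hypothetical.

```python
def acceleration(pos, gm=1.0):
    """Inverse-square attraction toward a sun fixed at the origin."""
    x, y = pos
    r_cubed = (x * x + y * y) ** 1.5
    return (-gm * x / r_cubed, -gm * y / r_cubed)


def step(pos, vel, dt):
    """One velocity Verlet step under the same law of motion."""
    ax, ay = acceleration(pos)
    half_vx = vel[0] + 0.5 * dt * ax
    half_vy = vel[1] + 0.5 * dt * ay
    new_pos = (pos[0] + dt * half_vx, pos[1] + dt * half_vy)
    bx, by = acceleration(new_pos)
    new_vel = (half_vx + 0.5 * dt * bx, half_vy + 0.5 * dt * by)
    return new_pos, new_vel


def evolve(pos, vel, dt, n_steps):
    for _ in range(n_steps):
        pos, vel = step(pos, vel, dt)
    return pos, vel


def predict(pos_now, vel_now, dt, n_steps):
    """Derive the state at time t from the state at t_0."""
    return evolve(pos_now, vel_now, dt, n_steps)


def retrodict(pos_later, vel_later, dt, n_steps):
    """Derive the state at t_0 from the state at time t.

    Reverses the velocities, applies exactly the same law for the
    same number of steps, then reverses the velocities again.
    """
    reversed_vel = (-vel_later[0], -vel_later[1])
    pos, vel = evolve(pos_later, reversed_vel, dt, n_steps)
    return pos, (-vel[0], -vel[1])


# Hypothetical present state: unit circular orbit.
present = ((1.0, 0.0), (0.0, 1.0))
future = predict(present[0], present[1], 0.01, 500)
recovered = retrodict(future[0], future[1], 0.01, 500)
```

In this sketch `predict` and `retrodict` call the same `evolve`
routine with premises of the same kind (a position, a velocity, and
the force law), and `recovered` agrees with `present` up to rounding
error. Nothing in the computation marks one direction as the
explanatory one.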

One possible response to this second example is to bite the bullet
and to argue that from the point of view of fundamental physics, there
really is no difference in the explanatory import of the retrodictive
and predictive derivations, and that it is a virtue, not a defect, of
the unificationist approach that it reproduces this judgment. Whatever
might be said in favor of this response, it is not Kitcher's. His
claim is that our ordinary judgments about causal asymmetries can be
derived from the unificationist account. The example just described
casts doubt on this claim. More generally, it casts doubt on Kitcher's
contention that one can begin with the notion of explanatory
unification, understood in a way that does not presuppose causal
notions, and use it to derive the content of causal judgments.

This conclusion is reinforced by a more general consideration:
unification, as it figures in science, is a quite heterogeneous notion,
covering many different sorts of
achievements.[17]
Some kinds of unification consist in the creation of a common
classificatory scheme or descriptive vocabulary where no satisfactory
scheme previously existed, as when early investigators like Linnaeus
constructed comprehensive and principled systems of biological
classification. Another kind of unification involves the creation of a
common mathematical framework or formalism which can be applied to
many different sorts of phenomena, as when the systems of equations
devised by Lagrange and Hamilton were first developed in connection
with mechanics and then applied to domains like electromagnetism and
thermodynamics. Still other cases involve what might be described as
genuine physical unification, where phenomena previously regarded as
having quite different causes or explanations are shown to be the
result of a common set of mechanisms or causal relationships. Newton's
demonstration that the orbits of the planets and the behavior of
terrestrial objects falling freely near the surface of the earth are
due to the same force of gravity and conform to the same laws of
motion was a physical unification in this sense.

Of these three kinds of activities only the third—physical
unification—seems to have much intuitively to do with
explanation, at least if we think of explanation as involving the
citing of causal relationships. In particular, depending on the
details of the case, the kind of unification associated with adoption
of a classificatory scheme may tell us little about causal
relationships. Moreover, as historical studies have made clear, a
similar point holds for formal or mathematical unification: the fact
that we can construct a common mathematical framework for dealing with
a range of different phenomena does not by any means automatically
ensure that we have identified some set of common causal factors
responsible for those phenomena—i.e., that we have produced a
unified physical explanation of them. For example, the mere fact that
we can describe both the behavior of a system of gravitating masses
and the operation of an electric circuit by means of Lagrange's
equations does not mean that we have achieved a common explanation of
the behavior of both or that we have “unified” gravitation
and electricity in any physically interesting sense.

These considerations raise the following question: Is Kitcher's
account of unification sufficiently discriminating or nuanced to
distinguish those unifications having to do with explanation from
other sorts of unification? The worry is that it is not. The
conception of unification underlying Kitcher's account seems to be at
bottom one of descriptive economy or information compression—deriving
as much from as few patterns of inference as possible. Many
cases of classificatory and purely formal unification involving a
common mathematical framework seem to fit this
characterization. Consider schemes for biological classification and
schemes for the classification of geological and astronomical objects
like rocks and stars. If I know that individuals belong to a certain
classificatory category (e.g., $X$s are mammals or polar
bears), I can use this information to derive a great many of their
other properties ($X$s have backbones, hearts, their young are
born alive etc.) and this is a pattern of inference that can be used
repeatedly for many different sorts of $X$s. But despite the
willingness of some philosophers to regard such derivations as
explanatory, it is common scientific practice to regard such schemes
as “merely descriptive” and as telling us little or
nothing about the causes or mechanisms that explain why $X$s
have backbones or
hearts.[18]

Another illustration of the same general point is provided by the
numerous statistical procedures (factor analysis, cluster analysis,
multidimensional scaling techniques) that allow one to summarize or
represent large bodies of statistical information in an economical,
unified way and to derive more specific statistical facts from a much
smaller set of assumptions by repeated use of the same pattern of
argument. For example, knowing the “loading” of each of
$n$ intelligence tests on a single common factor $g$,
one can derive a much larger number $(n(n-1)/2)$ of
conclusions about pairwise correlations among these tests. Again,
however, it is doubtful that by itself this “unification”
tells us anything about the causes of performance on these tests.
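
The arithmetic behind this example can be made concrete with a
minimal sketch. It assumes the standard one-factor model with
standardized test scores and uncorrelated residuals, under which the
implied correlation between two distinct tests is the product of their
loadings on $g$. The function name and the loading values are
hypothetical illustrations, not data from any study.

```python
from itertools import combinations


def implied_correlations(loadings):
    """Pairwise correlations implied by a one-factor model.

    Assumes standardized scores and uncorrelated residuals, so that
    corr(test_i, test_j) = loading_i * loading_j for i != j.
    """
    return {
        (i, j): loadings[i] * loadings[j]
        for i, j in combinations(range(len(loadings)), 2)
    }


# Hypothetical loadings of n = 5 tests on the common factor g.
loadings = [0.9, 0.8, 0.7, 0.6, 0.5]
correlations = implied_correlations(loadings)

n = len(loadings)
print(n, "loadings yield", len(correlations), "pairwise correlations")
```

With $n = 5$ the five loadings generate $5 \cdot 4 / 2 = 10$
correlations. The compression is real, yet nothing in the computation
says whether $g$ is a cause of test performance, which is the point of
the example.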

Another fundamental difficulty with the unificationist account
derives from its reliance on what might be called a “winner take
all” conception of unification. On the one hand, it seems that
any plausible version of that account must yield the conclusion that
generalizations and theories can sometimes be explanatory with respect
to some set of phenomena even though more unifying explanations of
those phenomena are
known[19].
For example, Galileo's law can be used to explain facts about the
behavior of falling bodies even though it furnishes a less unifying
explanation than the laws of Newtonian mechanics and gravitational
theory; the latter are in turn explanatory even though the
explanations they provide are less unified than those provided by
General Relativity; the theories of Coulomb and Ampere are explanatory
even though the explanations they provide are less unified than the
explanations provided by Maxwell's theory; and so on. If we reject
this idea, we must adopt the conclusion that in any domain only the
most unified theory that is known is explanatory at all; everything
else is non-explanatory. Call this the winner-take-all conception of
explanatory unification.

The winner-take-all conception gives up on the apparently very
natural idea, which one would think that the unificationist would wish
to endorse, that an explanation can provide less unification than some
alternative, and hence be less deep or less good, but still qualify as
somewhat explanatory. However, Kitcher's treatment of the problems of
explanatory irrelevance and explanatory asymmetry seems to require
just this conception. Why is it that we cannot appeal to the fact that
this particular sample of salt has been hexed to explain why it
dissolves? According to Kitcher, any explanatory store containing a
generalization about the dissolving of hexed salt will be “less
unified” than a competing explanatory store according to which
the dissolving of the salt is explained by appeal to the
generalization that all salt dissolves in water. Similarly, the reason
why we cannot explain the height of a flagpole in terms of the length
of its shadow is that explanations of lengths of objects in terms of
facts about shadows do not belong to the “set of
explanations” which “collectively provides the best
systemization of our beliefs” (1989, p. 430). This analysis
clearly requires the winner-take-all idea that an explanation
$T_1$ that is less satisfactory from the point of
view of unification than some competing alternative
$T_2$ is unexplanatory, rather than merely
less explanatory than $T_2$. If Kitcher were
to reject the winner-take-all idea and hold instead that even if
$T_2$ is more unified than $T_1$, it
does not automatically follow that $T_1$ is
unexplanatory, then his solution to the problems of explanatory
irrelevance and asymmetry would no longer be available: his conclusion
should be that an “explanation” of Mr. Jones' failure to
get pregnant in terms of his ingestion of birth control pills is
genuinely explanatory, although less so than the alternative
explanation that invokes his gender, and similarly for a derivation of
the height of a flagpole from the length of its shadow.

Intuitively, the problem is that we need a theory of explanation that
captures several different possibilities. On the one hand, there are
generalizations and associated putative explanations (like the
generalization relating barometric pressure to the occurrence of
storms and the generalization relating the hexing of salt to its
dissolution in water) that are not explanatory at all; they fall below
the threshold of explanatoriness. On the other hand, above this
threshold there is something more like a continuum: a generalization
can be explanatory but provide less deep or good explanations than
some alternative. What we have just seen is that the unificationist
account has difficulty simultaneously capturing both of these
possibilities. Either there is no threshold (every derivation is
explanatory to some extent and it is just that some derivations belong
to systemizations that are less unifying and hence less explanatory
than others) or else there is no continuum (only the most unifying
systemizations are explanatory).

Recall that, according to Kitcher, causal knowledge derives from our
efforts at unification. However, as Kitcher also recognizes, it is
highly implausible that most individuals deliberately and
self-consciously go through the process of comparing competing
deductive systemizations with respect to number and stringency of
patterns and number of conclusions in order to determine which is most
unifying. His response to this observation is to hold that most people
acquire causal knowledge by absorbing the “lore” of their
communities, where this lore does reflect previous systematic efforts
at unification. He writes that “our everyday causal knowledge is
based on our early absorption of the theoretical picture of the world
bequeathed to us by our scientific tradition” (1989, p. 469).

How exactly is this suggestion supposed to work? While it is surely
true that individual human beings acquire a substantial amount of
causal knowledge by cultural transmission, it is also obvious that not
all causal knowledge is acquired in this way. Some causal knowledge
that individuals acquire involves learning from experience. Moreover,
unless we are willing to make extremely implausible assumptions about
the innateness of a large number of specific causal beliefs, the stock
of socially transmitted causal knowledge must itself have been
initially acquired in a way in which learning from experience played
an important role. The question that then arises is how this process
of learning from experience is supposed to work on a view like
Kitcher's about the source of our causal knowledge. If, as Kitcher
claims, “the idea that any one individual justifies the causal
judgments that he/she makes by recognizing the patterns of argument
that best unify his/her beliefs is clearly absurd” (1989,
p. 436), just what is it that is going on at the individual level when
people learn from experience? One possibility is that although
individuals do not knowingly go through the process of comparing the
degree of unification achieved by alternative systemizations when they
acquire new causal knowledge by learning from experience, they go
through this process tacitly or unconsciously, perhaps because of some
general disposition of the mind to seek unification. However, Kitcher
does not seem to endorse this idea and it does not fit very well with
his emphasis on the social transmission of causal
information. Moreover, it looks as though even unconscious unification
requires very sophisticated cognitive abilities (construction and
comparison of different deductive systemizations etc.) that it is
implausible to attribute to many causal learners, such as small
children.

One natural interpretation of the passages quoted above and others in
Kitcher (1989) is this: a social process of comparing alternative
systemizations of beliefs and drawing out their deductive consequences
occurs at the community level, with groups of people making arguments
to one another about which overall deductive systemizations best unify
the beliefs of the community as a whole. Particular causal beliefs are
justified at the community level by being shown to be part of the best
overall systemization of the beliefs of the community, and are then
passed on from the common community stock to individuals via a process
of social transmission.

An obvious problem with this picture is that the community-wide
process of justification must still be carried out in some fashion by
individual actors. If, as appears to be the case, there are many
societies which possess a substantial amount of causal and explanatory
knowledge but in which no one possesses an explicit or clearly
articulated concept of a deductively valid argument or is very skilled
at drawing out the deductive consequences of beliefs or possesses
explicit versions of Kitcher's concepts of number and stringency of
argument patterns, how exactly are community beliefs that reflect the
operation of these notions supposed to form? If, as Kitcher concedes,
it is psychologically unrealistic to assume that individual human
beings deliberately and self-consciously go through the process of
comparing alternative systemizations when they acquire causal beliefs
through experience, why is it any more realistic to suppose that this
process somehow occurs through the interactions of individual actors
at the community
level[20]?

There is a second, related difficulty. Assume, for the sake of
argument, that it is desirable to have a unified belief system in
Kitcher's sense—whether because unification is connected to
explanation and the latter is intrinsically valuable or because
unification is connected to other goals (e.g., confirmation) that are
desirable. It is still not obvious why it would be valuable to have a
set of beliefs that are a smallish proper subset of the beliefs that
comprise such a unified system, which is what most people seem to
have, given Kitcher's views about the transmission of causal
knowledge. Recall Kitcher's basic picture: when I acquire the belief
that, say, whether salt is hexed is causally irrelevant to whether it
dissolves and that whether it is placed in water is causally relevant,
I acquire a fragment of the community's overall systemization
$S$. But adding a fragment of $S$ or even a number of
fragments of $S$ to my belief store may not result in
my having a belief system that is unified, or that
facilitates whatever epistemic goals are associated with
unification. Of course if I end up adding all or most of $S$ to
my belief store, I will have at that point a set of beliefs that is
unified and that brings with it all of the benefits of
unification. But, as Kitcher agrees, it is unrealistic to suppose that
most people possess anything like the full systemization $S$
that best unifies all of the beliefs in their community. This seems to
be true, for example, of our own epistemic community, in which
knowledge—especially scientific knowledge—is highly
dispersed among a small group of experts and in which no single
person's mind (and still less the typical member's mind) contains or
operates in accordance with the systemization that best unifies the
beliefs of the entire community. More generally, it seems unlikely
that the different portions $B_i$ of the community
systemization $S$ that various individuals $i$ acquire
by means of cultural transmission will be in each case highly unified
systemizations. In short, it is a major problem with the cultural
transmission story that it is hard to see how unification could be
cognitively or practically valuable unless it characterizes the belief
systems of individuals and not just the community. However, taking the
sort of unification that Kitcher associates with causal and
explanatory knowledge to characterize individual belief systems seems
prima-facie psychologically unrealistic. This is not to say that
there is no way of making sense of the acquisition of causal knowledge
on the unificationist picture, but a great deal more needs to be said
about how this works.

Selected Readings: The most detailed statement of
Kitcher's position can be found in Kitcher, 1989. Salmon, 1989,
pp. 94ff. contains a critical discussion of Friedman's version of the
unificationist account of explanation but ends by advocating a
“rapprochement” between unificationist approaches and
Salmon's own causal mechanical model. Woodward, 2003, contains
additional criticisms of Kitcher's version of unificationism.

Despite their many differences, the accounts of Hempel (focusing now
on just the DN rather than the IS model), Salmon, Kitcher and others
discussed above largely share a common overall conception of what the
project of constructing a theory of explanation should involve and (to
a considerable extent) what criteria such a theory should satisfy if
it is to be successful. Let us say that a theory of explanation
contains “pragmatic” elements if (i) those elements
require irreducible reference to facts about the interests, beliefs or
other features of the psychology of those providing or receiving the
explanation and/or (ii) irreducible reference to the
“context” in which the explanation occurs. (For what this
means, see below.) Although the writers discussed above agree that
pragmatic elements play some role in the activity of giving and
receiving explanations, they assume that there is a non-pragmatic core
to the notion of explanation which it is the central task of a theory
of explanation to capture. That is, it is assumed that this core
notion can be specified in a way that does not require reference to
features of the psychology of explainers or their audiences and that
it can be characterized in terms of features that are non-contextual
in the sense that they are sufficiently general, abstract
and “structural” that we can view them as holding across a
range of explanations with different contents and across a range of
different contexts. Often, but not always, it is claimed that many
aspects of these features can be captured formally, via relationships
like deductive entailment or statistical relevance. In addition, these
writers see the goal of a theory of explanation as capturing the
notion of a correct explanation, as in “the (or an)
explanation of the photoelectric effect is such and such” as
opposed to the notion of an explanation's being considered explanatory
by a particular audience or not, a matter which presumably depends on
such considerations as whether the audience understands the terms in
which the explanation is framed. Finally, as noted in the Introduction
to this entry, writers in this tradition have not had as
their goal capturing all of the various ways in which the word
“explanation” is used in ordinary English. They have
instead focused on a much more restricted class of examples in which
what is of interest is (something like) explaining “why”
some outcome or general phenomenon occurred, as opposed to explaining,
e.g., the meaning of a word or how to solve a differential
equation. The motivation for this restriction is simply the judgment
that an interesting and non-trivial theory is more likely to emerge if
it is restricted in scope in this way. For ease of reference, let us
call this the “traditional” conception of the task of a
theory of explanation.

Some or all of these assumptions and goals are rejected in pragmatic
or, as they are sometimes also called, “contextual” accounts of
explanation. Early contributors to this approach include Michael
Scriven (e.g., 1962) and Sylvan Bromberger (e.g., 1966), with more
systematic statements, due to van Fraassen (1980) and Achinstein
(1983), appearing in the 1980s. Since it is not always clear just what
the points of disagreement are between pragmatic and traditional
accounts, some orienting remarks about this will be useful before
turning to details.

Defenders of pragmatic approaches to explanation typically stress the
point that whether provision of a certain body of information to some
audience produces understanding or a sense of intelligibility or is
appropriate or illuminating for that audience depends on the background
knowledge and interests of the audience members and on other factors
having to do with the local context. For example, an explanation
of the deflection of starlight by the sun that appeals to the field
equations of General Relativity may be highly illuminating to a trained
physicist but unintelligible to a layperson who lacks the relevant
background. Factors of this sort are grouped together as
“pragmatic” and their influence is taken to illustrate at least one
way in which pragmatic considerations enter into the notion of
explanation.

Taken in itself, the observation just described seems completely
uncontroversial and not in conflict with approaches to explanation
that are usually viewed as paradigmatically traditional. Indeed, as
remarked above, writers like Hempel and Salmon explicitly agree that
explanation has a pragmatic dimension in the sense just
described—in fact, Hempel invokes the role of pragmatic factors
at a number of points to address prima-facie counterexamples to the DN
model[21]. This
suggests that, often at
least[22],
what is distinctive about pragmatic
approaches to explanation is not just the bare idea that explanation
has a “pragmatic dimension” but rather the further and
much stronger claim that the traditional project of constructing
a model of explanation pursued by Hempel and others has so far been
unsuccessful (and perhaps is bound to be unsuccessful) and
that this is so because pragmatic or contextual factors play
a central and ineliminable role in explanation in a way that resists
incorporation into models of the traditional sort. On this view, much
of what is distinctive about pragmatic accounts (including the
accounts of van Fraassen and Achinstein discussed below) is their
opposition to traditional accounts and their diagnosis of why such
accounts fail—they fail because they omit pragmatic or
contextual elements. It will be important to keep this point in mind
in what follows because there is a certain tendency among advocates of
pragmatic theories to argue as though the superiority of their
approach is established simply by the observation that explanation has
a pragmatic dimension; instead it seems more appropriate to think that
the real issue is whether traditional approaches are inadequate in
principle because of their neglect of the pragmatic dimension of
explanation.

A second issue concerns an important ambiguity in the notion of
“pragmatic”. On one natural understanding of this notion, a
pragmatic consideration is one that has to do with utility or
usefulness in the service of some goal connected to human interests,
where these interests are in some relevant sense
“practical”. Call this notion
“pragmatic1”. On this construal,
Hempel's DN model might be correctly characterized as
a pragmatic1 theory (or as
containing pragmatic1 elements) since it links
explanatory information closely to the provision of information that is
useful for purposes of prediction, and prediction certainly
qualifies as a pragmatic goal. For similar reasons,
Woodward's (2003) theory of explanation might also be counted as
a pragmatic1 theory since it connects explanation with the
provision of information that is useful for manipulation and
control—unquestionably useful goals. As these examples
suggest, models of explanation that aspire to traditional goals can be pragmatic1 theories.

In the context of theories of explanation, however, the label
“pragmatic” is usually intended to suggest a somewhat
different set of associations. In particular, “pragmatic”
is typically used to characterize considerations having to do with
facts about the psychology (interests, beliefs etc.) of those involved
in providing or receiving explanations and/or to characterize
considerations involving the local context, often with the suggestion
that both sets of considerations may vary in complex and idiosyncratic
ways that resist incorporation into the sort of general theory
sought by traditional
models.[23]
Call this set of
associations “pragmatic2”. Neither Hempel's nor
Woodward's theory is pragmatic2. In particular, as the
example of the DN model illustrates, the fact that a theory
is pragmatic1 in the sense that it appeals to facts about
goals generally shared by human beings (such as prediction) to help to
motivate a model of explanation does not preclude attempting
to construct models of explanation satisfying traditional goals and
does not require commitment to the idea that explanation must be
understood as a pragmatic2 notion. We need to be careful to
distinguish these two different ways of thinking about the
“pragmatic” dimension of explanation.

Finally, as emphasized above, a concern with the pragmatics of
explanation naturally connects with an interest in the
“psychology” of explanation, and this in turn suggests the
relevance of empirical studies of the sorts of information that
various subjects (ordinary folks, scientists) find explanatory or treat
as providing “understanding”, the distinctions subjects make among
explanations, and so on. Although there is a growing
literature in this area, the most prominent philosophical advocates of
pragmatic approaches to explanation have so far tended not to make use
of it. In this connection, it is worth pointing out that this
psychological literature goes well beyond the truisms found in
philosophical discussion about different people finding different sorts
of information explanatory depending on their interests. In particular,
psychologists have been very interested in exploring general features
or structural patterns present in information that various subjects
find explanatory. For example, Lombrozo (2010) finds evidence that
subjects prefer explanations that appeal to relationships that are
relatively stable (in the sense of continuing to hold across
changing
circumstances[24])
and Lien and Cheng (2000)
present evidence that in cases in which the explanandum $E$ has
a single candidate cause $C$, subjects prefer levels of
explanation/causal description that maximize $\Delta p =
\text{Pr}(E \mid C) - \text{Pr}(E \mid \text{not-}C)$.
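
The quantity $\Delta p$ can be estimated from frequency counts. The
sketch below is only an illustration: the function name, the two
candidate levels of description, and all of the counts are
hypothetical and are not taken from Lien and Cheng's experiments.

```python
def delta_p(e_with_c, n_with_c, e_without_c, n_without_c):
    """Estimate Pr(E | C) - Pr(E | not-C) from frequency counts."""
    return e_with_c / n_with_c - e_without_c / n_without_c


# Hypothetical counts for one effect E, with the candidate cause C
# described at two levels of generality. Each tuple is
# (cases of E with C, cases of C, cases of E without C, cases of not-C).
candidate_levels = {
    "narrow category": (18, 20, 30, 80),
    "broad category": (45, 50, 3, 50),
}

scores = {
    name: delta_p(*counts) for name, counts in candidate_levels.items()
}

# The level of description with the largest contrast.
preferred = max(scores, key=scores.get)
```

On these made-up counts the narrow description gives
$\Delta p = 0.9 - 0.375 = 0.525$ and the broad description gives
$\Delta p = 0.9 - 0.06 = 0.84$, so the broad description is the one
that maximizes the contrast.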

Notice that in both cases these are relationships or patterns of the sort that traditional accounts of explanation attempt to capture. As these examples bring out, there is no necessary incompatibility between the project of trying to formulate
an account of explanation that satisfies traditional goals and an interest in the
psychology of explanation. It may be that subjects find certain sorts of information explanatory or understanding-producing because certain structural features of the sort that traditional accounts attempt to characterize are present in that
information—indeed this is what the Lombrozo and Lien and Cheng
papers suggest.

In the same vein, we also should distinguish the general
project of investigating the empirical psychology of explanation
(which can be pursued with a variety of different commitments
about how best to theorize about explanation) from the more specific
claim that the characterization of what it is for an explanatory
relationship to hold between explanans and explanandum must be
given in “psychologistic” terms in the
sense that this requires irreducible reference to
psychological facts about particular audiences such as the vagaries of
what they happen to be interested in. In general, whether there
are robust regularities connecting structural or objective features in
bodies of information with whether that information is judged as
explanatory by various subjects ought to be regarded as an empirical
question and not as something that can be settled from the armchair. It
might be true that there are no such regularities and that
what people find explanatory or productive of understanding varies
enormously, depending on their interests and on other psychological
factors, but this is something that needs to be shown, not assumed at
the outset of investigation.

One of the most influential recent pragmatic accounts of explanation
is associated with constructive empiricism. This is the thesis,
defended by Bas van Fraassen in his 1980 book, The Scientific
Image, that the aim of science (or at least “pure”
science) is the construction of theories that are “empirically
adequate” (that is, that yield a true or correct description of
observables) and not, as scientific realists suppose, theories that
aim to tell literally true stories about unobservables. Relatedly,
“acceptance” of a theory involves only the belief that it
is empirically adequate (van Fraassen, 1980, p. 12). Van Fraassen's
account of explanation, which is laid out in several articles and,
most fully, in Chapter Six of his (1980), is meant to fit with this
overall conception of science: it is a conception according to which
explanation per se is not an epistemic aim of “pure”
science (empirical adequacy is the only such aim), but rather a
“pragmatic” virtue, having to do with the
“application” of science. (Note that to the extent that
the application of science is taken to be a “pragmatic”
(i.e., pragmatic$_1$) matter and the idea that explanation is
pragmatic in this respect is used to motivate the adoption of a
pragmatic (i.e., pragmatic$_2$) theory of explanation, we have
a transition between the two notions of “pragmatic”
distinguished above.) Because explanation is a merely pragmatic
virtue, a concern with explanation is not something that can require
scientists to move beyond belief in the empirical adequacy of their
theories to belief in the literal truth of claims about unobservable
entities.

According to van Fraassen, explanations are answers to questions and
getting clear about the logic of questions is central to constructing a
theory of explanation. Questions can take many different forms, but
when the question of interest is a “why” question,
explanatory queries will typically take the following form: a
query about why some explanandum $P_k$ rather than any
one of the members of a contrast $X$ (a set of
possible alternatives to $P_k$) obtained. In addition,
some “relevance relation” $R$ is assumed by the
question. An answer $A$ to this question will take the
form “$P_k$ in contrast to (the rest of)
$X$ because $A$, where $A$ bears the relevance
relation $R$ to $[P_k, X]$”. To use van
Fraassen's example, consider “Why is this conductor warped?” Depending on the context, the intended contrast
might have to do with, e.g., why this particular conductor
is warped in contrast to some other conductor that is
unwarped or alternatively, it might have to do with why this particular
conductor is warped now when it was previously unwarped. The
relevance relation $R$ similarly depends on the context and the
information which the questioner is interested in obtaining. For
example, $R$ might involve causal information (the question
might be a request for what caused the warping) but it also might have
to do with information about function, if the context was one in which
it is assumed that the shape of the conductor plays some functional
role in a power station which the questioner wants to know about.
Thus “context” enters into the explanation by playing a role in specifying both the contrast class
$X$ and the relevance relation $R$. Van Fraassen
describes various rules for the “evaluation” of
answers. For example, $P_k$ and $A$
must be true, the other members of the contrast class must not be true,
$A$ must “favor” (raise the conditional probability
of) $P_k$ against alternatives, and $A$
must compare favorably with other answers to the same question, a
condition which itself has several aspects including, for
example, whether $A$ favors the topic more than
these other answers and whether $A$ is screened off by other
answers. However, he also makes it clear (as the example above
suggests) that a variety of different relevance relations
may be appropriate depending on context and that the evaluation of
answers also depends on context. Moreover, he explicitly denies
that there is anything distinctive about the category of scientific
explanation that has to do with its structure or form—instead, a scientific explanation is simply an explanation
that makes use of information that is (or at least that is treated as)
grounded in a “scientific” theory.
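
The structure just described can be set out schematically. What follows is only a sketch: the symbol $K$ for the background information in play and the simple probabilistic gloss on “favoring” are simplifying assumptions introduced here for illustration, and van Fraassen's own treatment of favoring is more nuanced. A why-question $Q$ is fixed by a topic, a contrast class, and a relevance relation,

\[
Q = \langle P_k, X, R \rangle, \qquad P_k \in X,
\]

and an answer $A$ that bears $R$ to $[P_k, X]$ favors the topic, on this rough gloss, when

\[
\Pr(P_k \mid A \wedge K) > \Pr(P_k \mid K).
\]

Varying the context changes $X$ and $R$, and so changes which question is being asked even when the words of the question stay the same.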

Van Fraassen sums up his view of explanation (and gestures at
his grounds for rejecting objectivist approaches) as follows:

The discussion of explanation went wrong at the very beginning when
explanation was conceived of as a relation like description: a relation
between a theory and a fact. Really, it is a three-term relation
between theory, fact, and context. No wonder that no single relation
between theory and fact ever managed to fit more than a few examples!
Being an explanation is essentially relative for an explanation is an
answer… it is evaluated vis-à-vis a question,
which is a request for information. But exactly… what is
requested differs from context to context. (1980, p. 156)

Van Fraassen begins his chapter on explanation with a brief story that
provides a good point of entry into how he intends his account to
work. Recall from section 2.5 that a well-known counterexample to the
DN model involves the claim that one can explain the length
$S$ of the shadow cast by a flagpole in terms of the height $H$ of the
flagpole but that (supposedly) one cannot explain $H$ in terms of $S$,
despite the fact that one can construct a DN derivation from $S$ to
$H$. This is commonly taken to show that the DN model has left out
some factor having to do with the directional or asymmetric features
of explanation—e.g., perhaps an asymmetry in the relation
between cause and effect that ought to be incorporated into one's model
of explanation. In van Fraassen's story, a straightforward causal
explanation of the usual sort of $S$ in terms of $H$ (although the
object in question is a tower rather than a flagpole) is first
offered. Then a second explanation, according to which the height of
the tower is “explained” by the fact that it was designed
to cast a shadow of a certain length is advanced. Presumably the moral
we are to draw is that as the context and perhaps the relevance
relation $R$ are varied, both

\[
H \text{ explains } S
\]

and

\[
S \text{ explains } H
\]

are acceptable (legitimate, appropriate etc.) explanations.
Moreover, since these variations in context and relevance relation turn
on variations in what is of interest to the explainer and his audience,
we are further encouraged to conclude that explanatory asymmetries have
their source in psychological facts about people's interests and background beliefs, rather than in, say, some asymmetry that exists in nature independently of these. Pragmatists about
explanation think that a similar conclusion holds for other features of
the explanatory relevance relation that philosophers have tried to
characterize in terms of traditional models of explanation.
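
The symmetry that generates the original counterexample can be made explicit. As a sketch, let $\theta$ be the angle of elevation of the sun (a symbol introduced here, not used above) and assume level ground and the rectilinear propagation of light. Then

\[
S = \frac{H}{\tan\theta} \qquad \text{and, equivalently,} \qquad H = S \tan\theta ,
\]

so that, with $\theta$ given as an initial condition, the same law supports a deduction of $S$ from $H$ and a deduction of $H$ from $S$. The two derivations are formally on a par; the dispute concerns whether what distinguishes them as explanations is contextual or not.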

One obvious response to this claim, made by several critics (e.g.
Kitcher and Salmon, 1987, p. 317), is that the example does not really
involve a case in which, depending on context, $H$
causally explains $S$ and $S$ causally
explains $H$. Instead, although $H$
does causally explain $S$, it is (something like)
the desire for a shadow of length $S$ (rather than
$S$ itself) that explains (or at least causally explains) the
height (or the choice of height) for the tower. Or, if one prefers, in
the latter case we are given something like a functional explanation
(but not a causal explanation) for the height of the tower, in the
sense that we are told what the intended function of that choice of
height is. On either of these diagnoses, this will not be a case
in which whether $H$ provides a causal explanation of $S$
or whether instead $S$ provides a causal explanation of
$H$ shifts depending on factors having to do with the interests
of the speaker or audience or other contextual factors. If so, the
story about the tower does not show that the asymmetry present in the flagpole example must be accounted for in terms of pragmatic factors. It may be accounted for in some other way. In fact, although a full discussion is beyond the scope of this essay, a number of possible candidates for such a non-pragmatic account of causal asymmetries have been proposed, both in philosophy and outside of it (for
example, in the machine learning literature). These candidates include
asymmetries in causal connectability of the sort described in Hausman
(1998), statistical asymmetries of various sorts (e.g., Spirtes,
Glymour, and Scheines, 2000) and asymmetries in informational
dependence (e.g., Janzing, 2012). All of these proposals may be
wrong but it is hard to see how they are shown to be wrong just by the
sorts of observations advanced by van Fraassen in the tower and shadow story. Instead,
showing they are wrong would require detailed critiques of the
proposals themselves[25].

A much more general criticism has been advanced against van
Fraassen's version of a pragmatic theory by Salmon and Kitcher
(1987). Basically, their complaint is that the relevance relation
$R$ in van Fraassen's account is completely unconstrained,
with what they regard as the obviously unacceptable consequence
that for any pair of true propositions $P$ and $A$,
answer $A$ is relevant to $P$ via some
relevance relation and thus “explains” $P$. For
example, according to Salmon and Kitcher, we might define a
relationship of “astral influence” $R^*$, meeting van
Fraassen's criteria for being a relevance relation, such that the
time $t$ of a person's death is explained in terms of
$R^*$ and the position of various heavenly bodies at $t$.
Here it may seem that van Fraassen has a ready response. As noted
above, on van Fraassen's view, background knowledge and, in the
case of scientific explanation, current scientific knowledge, helps to
determine which are the acceptable relevance relations and acceptable
answers to the questions posed in requests for explanation—such
knowledge and the expectations that go along with it are part of the
relevant context when one asks for an explanation of time of death.
Obviously, astral influence is not an acceptable or legitimate
relevance relation according to modern science—hence appeal to
such a relation is not countenanced as explanatory by van
Fraassen's theory. More generally it might be argued that
available scientific knowledge will provide constraints on the
relevance relations and answers that exclude the “anything
goes” worry raised by Salmon and Kitcher—at least insofar
as the context is one in which a “scientific explanation”
is sought.

While this response may seem plausible enough as far as it goes, it
does bring out the extent to which much of the work of distinguishing
the explanatory from the non-explanatory in van Fraassen's account
comes from a very general appeal to what is accepted as legitimate
background information in current science. Put differently, this
raises the worry that once one moves beyond van Fraassen's formal
machinery concerning questions and answers (which van Fraassen himself
acknowledges is relatively unconstraining), one is left with an
account according to which a scientific explanation is simply any
explanation employing claims from current science and a currently
scientifically approved relevance relation. Even if otherwise
unexceptionable, this proposal is, if not exactly trivial, at least
rather deflationary—it provides much less than many have hoped
for from a theory of explanation. In particular, in cases (of which
there are many examples) in which there is an ongoing argument or
dispute in some area of science not about whether some proposed theory
or model is true but rather about whether it explains some phenomenon,
it is not easy to see how the proposal even purports to provide
guidance. On the other hand, the obvious rejoinder that might be made
on van Fraassen's behalf is that no more ambitious treatment that
would satisfy the expectations associated with more traditional
accounts of explanation (including a demarcation of candidate
explanations into those that are “correct” and
“incorrect”) is possible—a theory like van
Fraassen's is as good as it gets. If there is no defensible theory of
explanation embodying a non-trivially constraining relevance relation,
it cannot be a good criticism of van Fraassen's theory that he fails
to provide
this[26].

So at least from van Fraassen's
perspective, traditional models are in no better position than his own
in providing such guidance.

A final point that is suggested by van Fraassen's theory
is this. In considering pragmatic theories, it matters a great deal
exactly where the “pragmatic” elements are claimed to enter
into the account of explanation. One point at which such considerations
seem clearly to enter is in the selection or characterization of
what an audience wants explained. This is reflected
in van Fraassen's theory in the choice of a
$P_k$ and an associated contrast class $X$.
Obviously, whether we are looking for an explanation of why, say,
this particular conductor is now bent when it was previously straight
or whether instead we want to know why this conductor is bent
while some other conductor is straight is a matter that depends on our
interests. However, this particular sort of “interest
relativity” (and associated phenomena having to do with the
role of contrastive focus in the characterization of explananda, which
really just serve to specify more exactly which particular explananda
we want explained) seems something that can be readily acknowledged by
traditional theories[27].
After all, it is not a threat to the DN or other models with similar
traditional aspirations that one audience may be interested in an
explanation of the photoelectric effect but not the deflection of
starlight by the sun and another audience may have the opposite
interests. What would be a threat to the DN and similar models would
be an argument that once some explanandum $E$ is fully specified,
whether explanans $M$ explains $E$ (that is, whether there is an
explanatory relation between $M$ and $E$) is itself
“interest-relative”. It is natural to interpret van
Fraassen as making this latter claim, both in connection with
explanatory asymmetries and more generally.

Another very influential pragmatic account of explanation focuses on
the act of explaining and treats this as an illocutionary act, in the
sense in which that notion is used in speech act theory. The most
systematic statement of this approach is due to Peter Achinstein (see
especially, Achinstein, 1983). Like many other pragmatic theorists,
Achinstein is interested in capturing a very broad notion of
explanation, which includes not just causal explanations (and not just
answers to why-questions), but such notions as explaining the meaning
of a word, the rules of chess, the function of some biological structure
and so on. His account is much too complex to describe in full detail;
all that can be attempted here is a very rough sketch.

Achinstein's point of departure is what is involved in someone's
explaining something to another. According to Achinstein, in such
episodes the intention of the person doing the explaining is crucial:
in particular, in explaining, the explainer must have the intention to
render something (roughly, a certain type of indirect question $q$
corresponding to the something explained)
“understandable”. An explanation (understood as the
product of an explaining act) is then defined as an ordered pair,
“one of whose members is an act type explaining $q$ [as
above]… and whose other member is a proposition that provides
an answer to the question $q$.” (2010, p. xi). For example,
Newton's explanation of why the tides occur is represented by the
ordered pair: (The tides occur because of the gravitational pull of
the moon: explaining why the tides occur). Achinstein distinguishes
between “correct” and “good” explanations. “A correct
explanation is one in which the propositional member of the ordered
pair is true” (2010, p. xi). A correct explanation may nonetheless
not be a good one because, e.g., it is inappropriate in various ways
to the abilities and interests of the audience to which it is
directed. The notion of a good explanation is further characterized in
terms of a set of instructions for explanation construction, where
these instructions are sensitive to the interests, beliefs and so on
of the audience. Such instructions might specify, e.g., that a causal
explanation rather than some other kind of explanation is sought or
that the explanation sought must make reference to micro-entities. A
very important feature of Achinstein's “pragmatic” position
is that there is no single universal set of instructions that is
appropriate for all audiences and contexts, either in science or
elsewhere. Thus traditional accounts that purport to provide such
instructions are (in this respect) mistaken. Achinstein writes:

Now let me offer a conjecture. Suppose, following in the footsteps
of Hempel and Salmon, you formulate a set of objective, nonpragmatic
criteria that you think all scientific explanations must satisfy to be
evaluated highly. These criteria will be universal in the sense that
they are not to vary from one explanation to the next, but are to be
ones applicable to all scientific explanations. They are also universal
in the sense that they are not to incorporate specific empirical
assumptions or presuppositions that might be made by scientists in one
field or context but not another. So they might include the use of
laws, causal factors, and quantitative hypotheses, the satisfaction of
some criterion of unification or simplicity, and so forth. My conjecture is
that whatever set of objective, nonpragmatic, universal
criteria you propose you will be able to find or construct
counterexamples to it, both as a set of necessary conditions and as a
set of sufficient conditions. (2010, p. 137)

Achinstein illustrates this claim with reference to
Rutherford's 1911 explanation of alpha particle scattering.
Rutherford's explanation appealed to assumptions about atomic
structure—in particular, that the positive charge of an atom is
concentrated in its nucleus whose volume is small in comparison with
the total volume of the atom—to derive a quantitative expression
for the magnitude of scattering at various angles. According to
Achinstein, other competing explanations (e.g., an explanation which
just gave the quantitative expression governing scattering but did not
connect this to claims about atomic structure) can satisfy the various traditional
criteria for explanatory goodness found in the philosophy of
science literature (such explanations may have a DN structure, describe
causes, be unifying etc.) but will nonetheless be less good than
Rutherford's. Rutherford's explanation is good (or as
good as it is) because it provides an explanation “at the
subatomic level of matter in a way that physicists at the time
were interested in explaining scattering” (2010, p. 136,
italics in original). In other words, to explain the respects in which
Rutherford's explanation is good (or better than competitors) we must
make irreducible reference to the interests of physicists at
the time. In this sense, Achinstein's account of “good
explanation” is, as he says, “strongly pragmatic”.
His “conjecture” nicely captures much (although perhaps
not all) of what is at issue between pragmatic and traditional, not
purely non-pragmatic accounts of explanation. The central issue is
whether one can capture the respects in which Rutherford's explanation
is better than alternatives in terms of an explanatory relation that
can be specified independently of the interests of (and perhaps other
psychological facts concerning) particular audiences and also independently
of irreducibly “contextual” facts (such as the claim that in this
case a good explanation requires reference to the subatomic level, but
there is nothing more general to be said, independently of facts about
people's interests, about why explanations at this level are
preferable). A convincing argument for the second alternative would
presumably need (at least) to examine the existing traditional
accounts and show they are unsuccessful. This is a project Achinstein
undertakes in his (1983)—unsurprisingly, judgments about whether
he makes the case for the failure of objectivist accounts differ.

So far we have been treating “pragmatic” and
“traditional” accounts as diametrically opposed
possibilities. This corresponds to how these accounts are usually
presented, both by their defenders and detractors. However, it is
worthwhile (and provides additional insight into both approaches) to
consider the possibility that some of the ideas associated with each
might be combined, thus enlarging the space of possible approaches to
explanation. First, we might distinguish the claim that explanation
has irreducibly “contextual” elements from the claim that
these contextual elements must be understood in terms of facts about
the psychology (interests etc.) of the parties to the explanation. An
alternative possibility is that explanation is indeed irreducibly
contextual, but that these contextual elements should be understood
non-psychologically—roughly in terms of the role of particular
empirical facts in explanation, where the relevance of these facts
resists capture by means of the resources employed in the DN and other
traditional models (that is, it resists capture in terms of
relationships, like deductive entailment, statistical relevance and so
on). A possible illustration is provided by Achinstein's own
example of the explanation of the scattering of alpha particles in
terms of facts about nuclear structure. As noted above, Achinstein
himself thinks that in this case the goodness of the explanation is
context-dependent because it depends on psychological facts: in
particular, the goodness of the explanation reflects the fact that
physicists are particularly “interested” in explanations
appealing to nuclear structure. An alternative possibility is that the
explanation has irreducibly “contextual” elements in the
sense that there is something about the empirical details of this
particular case that makes facts about nuclear structure explanatorily
relevant to scattering but where this relevance cannot be fully
captured in terms of the abstract, structural features
(Achinstein's “objective”, nonpragmatic, universal
criteria) on which traditional models of explanation focus. Thus this possibility involves a
non-psychological notion of “contextual” that
contrasts with the idea that the explanatory relation can be specified
in a “content-independent” way. To spell out this notion
of content-independence consider the DN model. This model is
content-independent in the sense that it claims that as long as a
certain abstract structural relationship holds between explanans
$(C_i, L_i)$ and explanandum $E$, it
does not matter what specifically one fills in
for $C_i$, $L_i$, and
$E$—the resulting structure is an explanation. A
contextualist about explanation in the non-psychological sense would
claim instead that for whatever content-independent candidate for the
explanatory relation that we specify (whether that specification is in
terms of deductive or probabilistic relationships or anything else
similarly formal and abstract or with similar aspirations to
universality) there will be examples instantiating this structure that
are explanatory and examples that are not explanatory—in this
sense, the particular content that we fill in for the candidate
explanans and explanandum matters to whether we have an explanation.
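
For concreteness, the abstract structural relationship at issue in the DN model can be displayed in its usual schematic form (the index bounds $m$ and $n$ are notation added here):

\[
\begin{array}{ll}
C_1, C_2, \ldots, C_m & \text{(statements of particular conditions)} \\
L_1, L_2, \ldots, L_n & \text{(general laws)} \\
\hline
E & \text{(explanandum)}
\end{array}
\]

Content-independence amounts to the claim that any true premises of this form from which $E$ follows deductively, with at least one law essential to the derivation, yield an explanation, whatever the $C_i$, $L_i$, and $E$ happen to be about.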

An additional analogy may help to flesh out this idea. John Norton (e.g., forthcoming) has advocated in a series of papers what he calls a “material theory of induction”. His view is that the reliability of various
inductive inferences is dependent on associated
specific empirical (“material”) assumptions in a way that precludes the formulation of any universal logic of
induction—there is no universal form of an inductive
argument that ensures reliability, regardless of the particular
content that goes into that argument. However, it is not part
of Norton's view that inductive support is somehow a subjective
matter or relative to the interests etc. of particular
audiences—it is empirical facts of a non-psychological sort
(except of course when the evidential relations of interest concern
psychological hypotheses) that undergird evidential relationships. We might say that on his view inductive inference is “contextual” in the sense that it is not content-independent but that it also does not require a psychologistic characterization.

The possibility under consideration is that a similar claim might be
true for explanation. Perhaps it is true, for example, that in order
to capture the respects in which Rutherford's explanation is a
good one, one needs to invoke, in addition to general relatively
content-independent requirements about unification,
derivability from laws and so on, constraints having to do with more
specific “local” material facts about atomic structure, in
the sense that nothing will count as an explanation of alpha
scattering that does not invoke such facts and that other explanations
with the same form not invoking atomic structure will not count as
explanatory. But perhaps these additional constraints have to do with
facts about what the world is like rather than, as Achinstein
suggests, facts about what physicists of the time were most interested
in. A view of this sort might capture (or concede) some of the claims
made by pragmatic approaches about the role of contextual elements in
explanation but would avoid some of the subjective or psychologistic
tendencies in such approaches. It would be “contextual” in
the sense that Norton's material theory of induction is
contextual.

A closely related thought is that if one is inclined to incorporate
contextual elements into the theory of explanation, there remains a
range of possibilities about how they might be combined with more
universalistic elements. As suggested above, rather than thinking of
these two sets of elements as simply standing in opposition to each
other, it may be better to think in terms of the two working together
in a synergistic way. As an illustration consider the notion of
unification. It may be that we cannot provide an adequate
characterization of this notion and its role in explanation in purely
formal, completely content-independent terms—e.g. in such terms
as deriving many conclusions from a few basic assumptions or replacing
theories with many free parameters with theories that have only a few
such parameters. Nonetheless it may be true that once local material
or empirical constraints are used to restrict the class of candidate
theories to be compared with respect to the unification they achieve,
something like counting basic assumptions or number of free parameters
(or more plausibly something in the same spirit but more
sophisticated) furnishes useful information about degree of
unification achieved. Again the analogy with theories of inductive
reasoning is suggestive. It is certainly not the case that all
attempts to provide formal or general theories in this area are
misguided or doomed to failure—the various treatments of
statistical inference and machine learning are obvious counterexamples
to this suggestion. On the other hand, the successful theories in this
area are not completely universal or content-independent; instead, in
many cases they yield results that seem sensible or normatively
correct in a certain range of applications or when certain empirical
background conditions are satisfied but not in other situations. In
other words, such theories are both sensitive to context and contain
elements that look objective and structural. Perhaps something like
this will turn out to be true of “explanation”.

Selected Readings. van Fraassen (1980),
especially Chapter Six, and Achinstein (1983) are classic statements of
pragmatic approaches to explanation. These pragmatic accounts are
discussed and criticized in Salmon (1989). van Fraassen's
account is also discussed in Kitcher and Salmon (1987). De
Regt and Dieks (2005) is a recent defense of what the authors
describe as a “contextual” account of scientific
understanding and which engages with some of the themes in the
“pragmatics of explanation” literature.

What can we conclude from this recounting of some of the more
prominent recent attempts to construct models of scientific
explanation? What important issues remain open and what are the most
promising directions for future work? Of course, any effort at
stock-taking will reflect a particular point of view, but with this
caveat in mind, several observations seem plausible, even if not
completely uncontroversial.

The first concerns the role of causal information in scientific
explanation. It is a plausible, although by no means inevitable,
judgment[28]
that many of the difficulties faced by the models described above
derive from their reliance on what appear to be inadequate treatments
of causation and causal relevance. The problems of explanatory
asymmetries and explanatory irrelevance described in section 2.5 seem
to show that the holding of a law (understood as a regularity) between
$C$ and $E$ is not sufficient for $C$ to cause
$E$; hence not a sufficient condition for $C$ to figure
in an explanation of $E$. If the argument of section 3.3 is
correct, the fundamental problem with the SR model is that
statistical relevance information is insufficient to fully capture
causal information in the sense that different causal structures can
be consistent with the same information about statistical relevance
relationships. Similarly, the CM model faces the difficulty
that information about causal processes and interactions is also
insufficient to fully capture causal relevance relations and that
there is a range of cases in which causal relationships hold between
$C$ and $E$ (and hence in which $C$ figures in an
explanation of $E$) although there is no connecting causal
process between $C$ and $E$. Finally, a fundamental
problem with unificationist models is that the content of our causal
judgments does not seem to fall out of our efforts at unification, at
least when unification is understood along the lines advocated by
Kitcher. For example, as discussed above, considerations having to do
with unification do not by themselves explain why it is appropriate to
explain effects in terms of their causes rather than vice-versa.
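
The point about the SR model can be illustrated numerically. The following is a minimal sketch, not drawn from the literature discussed here: the probability values and function names are hypothetical and chosen only for illustration. It builds a joint distribution over two binary variables from the factorization suggested by the structure $C \to E$, rebuilds it from the factorization suggested by $E \to C$, and checks that the two coincide, so that the statistical relevance relations alone do not fix the causal direction.

```python
from itertools import product

# Hypothetical numbers chosen only for illustration.
P_C = {1: 0.3, 0: 0.7}                       # P(C = c)
P_E_GIVEN_C = {1: {1: 0.9, 0: 0.1},           # P(E = e | C = 1)
               0: {1: 0.2, 0: 0.8}}           # P(E = e | C = 0)


def joint_from_c_to_e():
    """Joint P(c, e) built as P(c) * P(e | c), the factorization suggested by C -> E."""
    return {(c, e): P_C[c] * P_E_GIVEN_C[c][e]
            for c, e in product((0, 1), repeat=2)}


def joint_from_e_to_c(joint):
    """Rebuild the joint as P(e) * P(c | e), the factorization suggested by E -> C."""
    p_e = {e: sum(joint[(c, e)] for c in (0, 1)) for e in (0, 1)}
    p_c_given_e = {e: {c: joint[(c, e)] / p_e[e] for c in (0, 1)}
                   for e in (0, 1)}
    return {(c, e): p_e[e] * p_c_given_e[e][c]
            for c, e in product((0, 1), repeat=2)}


def is_statistically_relevant(joint):
    """C is statistically relevant to E when P(E=1 | C=1) differs from P(E=1)."""
    p_e1 = joint[(0, 1)] + joint[(1, 1)]
    p_c1 = joint[(1, 0)] + joint[(1, 1)]
    return abs(joint[(1, 1)] / p_c1 - p_e1) > 1e-12


forward = joint_from_c_to_e()
backward = joint_from_e_to_c(forward)
same = all(abs(forward[k] - backward[k]) < 1e-12 for k in forward)
print("same joint distribution:", same)
print("C statistically relevant to E:", is_statistically_relevant(forward))
```

Both factorizations describe one and the same distribution, so any account that appeals only to statistical relevance relations treats the two structures alike; distinguishing them requires further information of the kind the candidate accounts of causal asymmetry attempt to supply.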

At the very least these observations suggest that progress in
connection with “scientific explanation” may require more
attention to the notion of causation and a more thorough-going
integration of discussions of explanation with the burgeoning
literature on causation, both within and outside of
philosophy.[29]
Counterfactual accounts of causation may be promising in this
connection (cf. Woodward, 2003).

Does this mean that a focus on causation should entirely replace the
project of developing models of explanation or that philosophers
should stop talking about explanation and instead talk just about
causation? Despite the centrality of causation in explanation, it is
arguable that completely subsuming the latter into the former loses
connections with some important issues. For one thing, causal claims
themselves seem to vary greatly in the extent to which they are
explanatorily deep or illuminating. Causal claims found in Newtonian
mechanics seem deeper or more satisfying from the point of view of
explanation than causal claims of “the rock broke the
window” variety. It is usually supposed that such differences
are connected to other features—for example to how general,
stable, and coherent with background knowledge a causal claim is.
However, as we have noted, not all kinds of generality, stability
etc. seem explanatorily relevant (or connected to explanatory
goodness). So even if one focuses only on causal explanation, there
remains the important project of trying to understand better what
sorts of distinctions among causal claims matter for goodness in
explanation. To the extent this is so, the kinds of concerns that
have animated traditional treatments of explanation don't seem to be
entirely subsumable into standard accounts of causation.

There is also the important question of whether all legitimate forms
of why-explanation are causal. For example, some writers
(e.g. Nerlich, 1979) contend that there is a variety of physical
explanation which is “geometrical” rather than causal, in
the sense that it consists in explaining phenomena by appealing to the
structure of spacetime rather than to facts about forces or
energy/momentum transfer. (Nerlich takes causal explanations in
physics to have to do with the latter.) According to Nerlich,
explaining the trajectory followed by a free particle by noting that
it is following a geodesic in spacetime is an illustration of a
geometrical rather than a causal explanation. A really satisfying
theory of explanation should provide some principled answer to the
question of whether all why-explanation must be causal (and according
to what notion of causal this is so or not so), rather than just
assuming an affirmative (or negative) answer to this question. Again,
to the extent that there are non-causal forms of explanation,
explanation will remain a topic that is at least somewhat independent
of causation.

Noretta Koertge (1992) noted that although the literature on
explanation is immense, comparatively little attention has been paid,
in the construction of the various competing models of explanation, to
the question of what they are to be used for or what their
larger point or purpose is (other than capturing “our”
notion of explanation). Relatedly, writers on explanation have not
always paid adequate attention to how explanation itself is connected
to or interacts with (or is distinct from) other goals of
inquiry—for example, what the connection is between explanatory
goodness and other frequently proposed goals for inquiry such as
evidential support, prediction, control of nature, simplicity, and so
on. One result is that it is sometimes unclear how to assess the
significance of our intuitive judgments about the goodness of various
explanations or to determine what turns on our giving one judgment
rather than another. For example, as we have noted, most people judge
intuitively that one cannot explain the height of a pole by appealing
to the length of its shadow.

However, a determined defender of the DN model (e.g. Hempel,
1965, pp. 353–4) may well ask why we should be so impressed by such
intuitive judgments. Perhaps our pre-analytic assessment is confused
or mistaken in some way or perhaps it reflects merely pragmatic
considerations that should have no place in the theory of
explanation. One way to respond to this skepticism would be to provide
a non-question-begging account of what of importance would be lost or
left out if we failed to distinguish between explanations of shadow
lengths in terms of pole heights and “explanations”
running in the opposite direction. (Note that to the extent that we
are interested merely in prediction, the two inferences appear to be
on a par. “Non-question-begging” means that we don't just
say that the height causes the shadow and not vice-versa, but that we
provide some further explication of what this difference consists in
and why the difference matters.) One possible answer would appeal to
the epistemic goal of having information relevant to manipulation and
control; one may manipulate the length of the shadow by, among other
things, manipulating the height of the pole but not conversely. This
difference is real regardless of one's intuitions about explanation in
the two cases[30].

Regardless of what one thinks about this particular answer, the more
general point is that one way forward in assessing competing models of
explanation is to focus less (or not just) on whether they capture our
intuitive judgments and more on the issue of whether and why the kinds
of information they require are valuable (and attainable), and how this
information relates to other goals we value in inquiry.

As another illustration, consider the CM model. Underlying
this model is presumably some judgment to the effect that tracing
causal processes and their interactions is a worthy goal of inquiry.
Now of course one might try to defend this judgment simply by claiming
that the identification of causes is an important goal and that causal
process theories yield the correct account of cause. But a more
illuminating and less question-begging way of proceeding would be to
ask how this goal relates to other epistemic values. For example, what
is the connection between the goal of identifying causal processes and
constructing unified theories? Or between identifying causal processes
and the discovery of information that is relevant to prediction or to
manipulation and control? Are these the same goals? Independent but
complementary goals? Competing goals in the sense that satisfaction of
one may make it harder to satisfy the other? Obviously, one may ask
similar questions about the goal of unification.

The need for treatments of explanation that relate this notion more
adequately to other concepts and goals is particularly salient in
connection with the role of laws in explanation, which is another item
on the agenda for future work in this area. The account of laws that
is currently regarded as the most promising by many philosophers is
the Mill-Ramsey-Lewis (MRL) theory. According to this
theory, laws are those generalizations which figure as axioms or
theorems in the deductive systemization of our empirical knowledge
that achieves the best combination of simplicity and strength (where
strength has to do with the range of empirical truths that are
deducible)[31].
It is natural to connect this conception of laws with
unificationist approaches to explanation: if laws are generalizations
that play a central role in the achievement of simple (and presumably
unified) deductive systemizations, then by appealing to laws in
explanation, we achieve explanatory unification—this makes it
intelligible why it is desirable that explanations invoke
laws[32]. If an account along these lines could be made to work, we
would have a sort
of integrated story about laws and explanation that is largely lacking
in the DN account—a story about what laws are that is
directly connected to an idea about the point of explanation. Of
course there remain real problems (some of which are discussed above)
with the unificationist account of explanation and, for that matter,
with the MRL theory of laws[33], but the integrated account that
would result from putting the two together nonetheless might be taken
to illustrate the sort of thing we should be aiming at.

Yet another general issue concerns the extent to which it is possible
to construct a single model of explanation that fits all areas of
science. It is uncontroversial that explanatory practice—what is
accepted as an explanation, how explanatory goals interact with
others, what sort of explanatory information is thought to be
achievable, discoverable, testable etc.—varies in significant
ways across different disciplines. Nonetheless, all of the models of
explanation surveyed above are “universalist” in
aspiration—they claim that a single, “one size”
model of explanation fits all areas of inquiry in so far as these have
a legitimate claim to explain. Although the extreme position that
explanation in biology or history has nothing interesting in common
with explanation in physics seems unappealing (and in any case has
attracted little support), it seems reasonable to expect that more
effort will be devoted in the future to developing models of
explanation that are more sensitive to disciplinary
differences. Ideally, such models would reveal commonalities across
disciplines but they should also enable us to see why explanatory
practice varies as it does across different disciplines and the
significance of such variation. For example, as noted above,
biologists, in contrast to physicists, often describe their
explanatory goals as the discovery of mechanisms rather than the
discovery of laws. Although it is conceivable that this difference is
purely terminological, it is also worth exploring the possibility that
there is a distinctive story to be told about what a mechanism is, as
this notion is understood by biologists, and how information about
mechanisms contributes to explanation.

A closely related point is that at least some of the models described
above impose requirements on explanation that may be satisfiable in
some domains of inquiry but are either unachievable (in any
practically interesting sense) in other domains or, to the extent that
they may be achievable, bear no discernible relationship to generally
accepted goals of inquiry in those domains. For example, we noted
above that many scientists and philosophers hold that there are few if
any laws to be discovered in biology and the social and behavioral
sciences. If so, models of explanation that assign a central role to
laws may not be very illuminating regarding how explanation works in
these disciplines. As another example, even if we suppose that the
partition into objectively homogeneous reference classes recommended
by the SR model is an achievable goal in connection with
certain quantum mechanical phenomena, it may be that (as suggested
above) it is simply not a goal that can be achieved in a non-trivial
way in economics and sociology, disciplines in which causal inference
from statistics also figures prominently. In such disciplines, it may
be that additional statistically relevant partitions of any population
or subpopulation of interest will virtually always be possible, so
that the activity of finding such partitions is limited only by the
costs of gathering additional information. A similar assessment may
hold for most applications of the CM model to the social
sciences.
