Philosophy of Statistical Mechanics

Statistical mechanics was the first foundational physical theory in
which probabilistic concepts and probabilistic explanation played a
fundamental role. For the philosopher it provides a crucial test case
in which to compare the philosophers' ideas about the meaning of
probabilistic assertions and the role of probability in explanation
with what actually goes on when probability enters a foundational
physical theory. The account offered by statistical mechanics of the
asymmetry in time of physical processes also plays an important role in
the philosopher's attempt to understand the alleged asymmetries of
causation and of time itself.

From the seventeenth century onward it was realized that material
systems could often be described by a small number of descriptive
parameters that were related to one another in simple lawlike ways.
These parameters referred to geometric, dynamical and thermal
properties of matter. Typical of these laws was the ideal gas law,
which related the product of the pressure and volume of a gas to its
temperature.
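In modern notation the ideal gas law reads:

```latex
PV = nRT
```

where P is the pressure, V the volume, T the absolute temperature, n the amount of gas, and R the gas constant.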

It was soon realized that a fundamental concept was that of
equilibrium. Left to themselves systems would change the values of
their parameters until they reached a state where no further changes
were observed, the equilibrium state. Further, it became apparent that
this spontaneous approach to equilibrium was a time-asymmetric process.
Uneven temperatures, for example, changed until temperatures were
uniform. This same “uniformization” process held for densities.

Profound studies by S. Carnot of the ability to extract mechanical
work out of engines that ran by virtue of the temperature difference
between boiler and condenser led to the introduction by R. Clausius of
one more important parameter describing a material system, its entropy.
How was the existence of this simple set of parameters for describing
matter and the lawlike regularities connecting them to be explained?
What accounted for the approach to equilibrium and its time asymmetry?
That the heat content of a body was a form of energy, convertible to
and from mechanical work, formed one fundamental principle. The
inability of an isolated system to spontaneously move to a more orderly
state, to lower its entropy, constituted another. But why were these
laws true?

One approach, that of P. Duhem and E. Mach and the “energeticists,”
was to insist that these principles were autonomous phenomenological
laws that needed no further grounding in some other physical
principles. An alternative approach was to claim that the energy in a
body stored as heat content was an energy of motion of some kind of
hidden, microscopic constituents of the body, and to insist that the
laws noted, the thermodynamic principles, needed to be accounted for
out of the constitution of the macroscopic object out of its parts and
the fundamental dynamical laws governing the motion of those parts.
This is the kinetic theory of heat.

Early work on kinetic theory by J. Herapath and J. Waterston was
virtually ignored, but the work of A. Krönig made kinetic theory a
lively topic in physics. J. C. Maxwell made a major advance by deriving
from some simple postulates a law for the distribution of velocities of
the molecules of a gas when it was in equilibrium. Both Maxwell and L.
Boltzmann went further, and in different, but related, ways derived an
equation for the approach to equilibrium of a gas. The equilibrium
distribution earlier found by Maxwell could then be shown to be a
stationary solution of this equation.
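Maxwell's equilibrium result is today written as the Maxwell–Boltzmann distribution of molecular speeds; in modern notation (not Maxwell's own),

```latex
f(v) = 4\pi \left( \frac{m}{2\pi k_B T} \right)^{3/2} v^{2} \exp\!\left( -\frac{m v^{2}}{2 k_B T} \right)
```

where m is the molecular mass, T the absolute temperature, and k_B Boltzmann's constant.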

This early work met with vigorous objections. H. Poincaré had
proven a recurrence theorem for bounded dynamical systems that seemed
to contradict the monotonic approach to equilibrium demanded by
thermodynamics. Poincaré's theorem showed that any appropriately
bounded system in which energy was conserved would of necessity, over
an infinite time, return an infinite number of times to states
arbitrarily close to the initial dynamical state in which the system
was started. J. Loschmidt argued that the time irreversibility of
thermodynamics was incompatible with the symmetry under time reversal
of the classical dynamics assumed to govern the motion of the molecular
constituents of the object.
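In modern measure-theoretic terms the recurrence theorem can be stated as follows: if T is a measure-preserving transformation of a finite measure space and A is any measurable set of positive measure, then

```latex
\mu\{\, x \in A : T^{n}x \in A \ \text{for infinitely many } n \,\} = \mu(A)
```

that is, almost every state in A returns to A infinitely often.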

Partly driven by the need to deal with these objections, explicitly
probabilistic notions began to be introduced into the theory by Maxwell
and Boltzmann. Both realized that equilibrium values for quantities
could be calculated by imposing a probability distribution over the
microscopic dynamical states compatible with the constraints placed on
the system, and identifying the observed macroscopic values with
averages over quantities definable from the microscopic states using
that probability distribution. But what was the physical justification
for this procedure?

Both also argued that the evolution toward equilibrium demanded in
the non-equilibrium theory could also be understood probabilistically.
Maxwell, introducing the notion of a “demon” who could manipulate the
microscopic states of a system, argued that the law of entropic
increase was only probabilistically valid. Boltzmann offered a
probabilistic version of his equation describing the approach to
equilibrium. Without considerable care, however, the Boltzmannian
picture can still appear vulnerable to the recurrence and reversibility
objections, even when those objections are recast in probabilistic form.

Late in his life Boltzmann responded to the objections to the
probabilistic theory by offering a time-symmetric interpretation of the
theory. Systems were probabilistically almost always close to
equilibrium. But transient fluctuations to non-equilibrium states could
be expected. Once in a non-equilibrium state it was highly likely that
both after and before that state the system was closer to equilibrium.
Why then did we live in a universe that was not close to equilibrium?
Perhaps the universe was vast in space and time and we lived in a
“small” non-equilibrium fluctuational part of it. We could only find
ourselves in such an “improbable” part, for only in such a region could
sentient beings exist. Why did we find entropy increasing toward the
future and not toward the past? Here the answer was that just as the
local direction of gravity defined what we meant by the downward
direction of space, the local direction in time in which entropy was
increasing fixed what we took to be the future direction of time.

In an important work (listed in the bibliography), P. and T.
Ehrenfest also offered a reading of the Boltzmann equation of approach
to equilibrium that avoided recurrence objections. Here the solution of
the equation was taken to describe not “the overwhelmingly probable
evolution” of a system, but, instead, the sequence of states that would
be found overwhelmingly dominant at different times in a collection of
systems all started in the same non-equilibrium condition. Even if each
individual system approximately recurred to its initial conditions,
this “concentration curve” could still show monotonic change toward
equilibrium from an initial non-equilibrium condition.
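The Ehrenfests' own urn model gives a simple illustration of this "concentration curve" reading. The sketch below is a minimal simulation (all parameter values chosen arbitrarily): an ensemble of systems is prepared in the same extreme non-equilibrium state, and the ensemble-averaged imbalance relaxes smoothly even though each individual system is certain, eventually, to recur near its initial state.

```python
import random

def ehrenfest_step(x, n_total, rng):
    """One step of the Ehrenfest urn model: a ball is chosen uniformly
    at random and moved to the other urn. x = number of balls in urn A."""
    if rng.random() < x / n_total:
        return x - 1
    return x + 1

def concentration_curve(n_total=50, n_systems=2000, n_steps=200, seed=0):
    """Track the ensemble-averaged imbalance |mean - n/2| for a collection
    of systems all started in the same extreme state (all balls in urn A)."""
    rng = random.Random(seed)
    states = [n_total] * n_systems
    curve = []
    for _ in range(n_steps):
        states = [ehrenfest_step(x, n_total, rng) for x in states]
        mean = sum(states) / n_systems
        curve.append(abs(mean - n_total / 2))
    return curve

curve = concentration_curve()
# The ensemble average relaxes toward equipartition, even though each
# individual urn system recurs near its initial state after a time that
# grows astronomically with n_total.
print(f"imbalance after 1 step: {curve[0]:.1f}, after 200 steps: {curve[-1]:.2f}")
```

The monotone decay of `curve` is the "concentration curve"; no individual trajectory need behave monotonically.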

Many of the philosophical issues in statistical mechanics center
around the notion of probability as it appears in the theory. How are
these probabilities to be understood? What justified choosing one
probability distribution rather than another? How are the probabilities
to be used in making predictions within the theory? How are they to be
used to provide explanations of the observed phenomena? And how are the
probability distributions themselves to receive an explanatory account?
That is, what is the nature of the physical world that is responsible
for the correct probabilities playing the successful role that they do
play in the theory?

Philosophers concerned with the interpretation of probability are
usually dealing with the following problem: Probability is
characterized by a number of formal rules, the additivity of
probabilities for disjoint sets of possibilities being the most central
of these. But what ought we to take the formal theory to be a theory
of? Some interpretations are “objectivist,” taking probabilities to be,
possibly, frequencies of outcomes, or idealized limits of such
frequencies or perhaps measures of “dispositions” or “propensities” of
outcomes in specified test situations.

Other interpretations are “subjectivist,” taking probabilities to be
measures of “degrees of belief,” perhaps evidenced in behavior in
situations of risk by choices of available lotteries over outcomes.
Still another interpretation reads probabilities as measures of a kind
of “partial logical entailment” among propositions.

Although subjectivist (or, rather, logical) interpretations of
probability in statistical mechanics have been proffered (by E. Jaynes,
for example), most interpreters of the theory opt for an objectivist
interpretation of probability. This still leaves open, however,
important questions about just what “objective” feature the posited
probabilities of the theory are and how nature contrives to have such
probabilities evinced in its behavior.

Philosophers dealing with statistical explanation have generally
focussed on everyday uses of probability in explanation, or the use of
probabilistic explanations in such disciplines as the social sciences.
Sometimes it has been suggested that to probabilistically explain an
outcome is to show it likely to have occurred given the background
facts of the world. In other cases it is suggested that to explain an
outcome probabilistically is to produce facts which raise the
probability of that outcome over what it would have been had those
facts been ignored. Still others suggest that probabilistic explanation is
showing an event to have been the causal outcome of some feature of the
world characterized by a probabilistic causal disposition.

The explanatory patterns of non-equilibrium statistical mechanics
place the evolution of the macroscopic features of matter in a pattern
of probabilities over possible microscopic evolutions. Here the types
of explanation offered do fit the traditional philosophical models. The
main open questions concern the explanatory grounds behind the posited
probabilities. In equilibrium theory, as we shall see, the statistical
explanatory pattern has a rather different nature.

The standard method for calculating the properties of an
energetically isolated system in equilibrium was initiated by Maxwell
and Boltzmann and developed by J. Gibbs as the microcanonical ensemble.
Here a probability distribution is imposed over the set of microscopic
states compatible with the external constraints imposed on the system.
Using this probability distribution, average values of specified
functions of the microscopic conditions of the gas (phase averages) are
calculated. These are identified with the macroscopic conditions. But a
number of questions arise: Why this probability distribution? Why
average values for macroscopic conditions? How do phase averages
relate to measured features of the macroscopic system?
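Schematically, the microcanonical recipe computes, for a phase function f, the average

```latex
\langle f \rangle = \int_{\Gamma_E} f(x)\, d\mu_E(x)
```

where Γ_E is the set of microscopic states compatible with the constraints (the energy hypersurface) and μ_E is the standard normalized measure on it; ⟨f⟩ is then identified with the measured macroscopic value.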

Boltzmann thought of the proper average values to identify with
macroscopic features as being averages over time of quantities
calculable from microscopic states. He wished to identify the phase
averages with such time averages. He realized that this could be done
if a system started in any microscopic state eventually went through
all the possible microscopic states. That this was so became known as
the ergodic hypothesis. But it is provably false on topological and
measure theoretic grounds. A weaker claim, that a system started in any
state would go arbitrarily close to each other microscopic state is
also false, and even if true would not do the job needed.

The mathematical discipline of ergodic theory developed out of these
early ideas. When can a phase average be identified with a time average
over infinite time? G. Birkhoff (with earlier results by J. von
Neumann) showed that this would be so for all but perhaps a set of
measure zero of the trajectories (in the standard measure used to
define the probability function) if the set of phase points was
metrically indecomposable, that is if it could not be divided into more
than one piece such that each piece had measure greater than zero and
such that a system started in one piece always evolved to a state in
that piece.
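In modern form, Birkhoff's theorem says that if the flow φ_t is metrically indecomposable (ergodic) with respect to the invariant measure μ, then for every integrable phase function f

```latex
\lim_{T \to \infty} \frac{1}{T} \int_{0}^{T} f(\phi_t x)\, dt \;=\; \int f \, d\mu
```

for all initial states x outside a set of μ-measure zero.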

But did a realistic model of a system ever meet the condition of
metric indecomposability? What is needed to derive metric
indecomposability is sufficient instability of the trajectories so that
the trajectories do not form groups of non-zero measure which fail to
wander sufficiently over the entire phase region. The existence of a
hidden constant of motion would violate metric indecomposability. After
much arduous work, culminating in that of Ya. Sinai, it was shown that
some “realistic” models of systems, such as the model of a gas as “hard
spheres in a box,” conformed to metric indecomposability. On the other
hand another result of dynamical theory, the Kolmogorov-Arnold-Moser
(KAM) theorem shows that more realistic models (say of molecules
interacting by means of “soft” potentials) are likely not to obey
ergodicity in a strict sense. In these cases more subtle reasoning
(relying on the many degrees of freedom in a system composed of a vast
number of constituents) is also needed.

If ergodicity holds, what can be shown? It can be shown that for all
but a set of measure zero of initial points, the time average of a
phase quantity over infinite time will equal its phase average. It can
be shown that for any measurable region the average time the system
spends in that region will be proportional to the region's size (as
measured by the probability measure used in the microcanonical
ensemble). A solution to a further problem is also advanced. Boltzmann
knew that the standard probability distribution was invariant under
time evolution given the dynamics of the systems. But how could we know
that it was the only such invariant measure? With ergodicity we can
show that the standard probability distribution is the only one that is
so invariant, at least if we confine ourselves to probability measures
that assign probability zero to every set assigned zero by the standard
measure.

We have, then, a kind of “transcendental deduction” of the standard
probability assigned over microscopic states in the case of
equilibrium. Equilibrium is a time-unchanging state. So we demand that
the probability measure by which equilibrium quantities are to be
calculated be stationary in time as well. If we assume that probability
measures assigning non-zero probability to sets of states assigned zero
by the usual measure can be ignored, then we can show that the standard
probability is the only such time invariant probability under the
dynamics that drives the individual systems from one microscopic state
to another.

As a full “rationale” for standard equilibrium statistical
mechanics, however, much remains questionable. There is the problem
that strict ergodicity is not true of realistic systems. There are many
problems encountered if one tries to use the rationale as Boltzmann
hoped to identify phase averages with measured quantities relying on
the fact that macroscopic measurements take “long times” on a molecular
scale. There are the problems introduced by the fact that all of the
mathematically legitimate ergodic results are qualified by exceptions
for “sets of measure zero.” What is it physically that makes it
legitimate to ignore a set of trajectories just because it has measure
zero in the standard measure? After all, such neglect leads to
catastrophically wrong predictions when there really are hidden, global
constants of motion. In proving the standard measure uniquely
invariant, why are we entitled to ignore probability measures that
assign non-zero probabilities to sets of conditions assigned
probability zero in the standard measure? After all, it was just the
use of that standard measure that we were trying to justify in the
first place.

In any case, equilibrium theory as an autonomous discipline is
misleading. What we want, after all, is a treatment of equilibrium in
the non-equilibrium context. We would like to understand how and why
systems evolve from any initially fixed macroscopic state, taking
equilibrium to be just the “end point” of such dynamic evolution. So it
is to the general account of non-equilibrium we must turn if we want a
fuller understanding of how this probabilistic theory is functioning in
physics.

Boltzmann provided an equation for the evolution of the distribution
of the velocities of particles from a non-equilibrium initial state for
dilute gases, the Boltzmann equation. A number of subsequent equations
have been found for other types of systems, although generalizing to,
say, dense gases has proven intractable. All of these equations are
called kinetic equations.
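For a dilute gas with distribution f over molecular positions and velocities, the Boltzmann equation (in the absence of external forces) takes the form

```latex
\frac{\partial f}{\partial t} + \vec{v} \cdot \nabla_{x} f = \left( \frac{\partial f}{\partial t} \right)_{\mathrm{coll}}
```

where the right-hand side is the collision integral, the term into which Boltzmann's statistical hypothesis about collisions enters.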

How may they be justified and explained? In the discussions
concerning the problem of irreversibility that ensued after Boltzmann's
work, attention was focussed on a fundamental assumption he made: the
hypothesis with regard to collision numbers. This time-asymmetrical
assumption posited that the motions of the molecules in a gas were
statistically uncorrelated prior to the molecules colliding. In
deriving any of the other kinetic equations a similar such posit must
be made. Some general methods for deriving such equations are the
master equation approach and an approach that relies upon
coarse-graining the phase space of points representing the micro-states
of the system into finite cells and assuming fixed transition
probabilities from cell to cell (Markov assumption). But such an
assumption was not derived from the underlying dynamics of the system,
and, for all that was then known, might even have been inconsistent
with that dynamics.
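A toy version of the coarse-graining approach, with a hypothetical three-cell partition and made-up transition probabilities, shows how a fixed Markov transition matrix drives any initial distribution toward the uniform one, with the coarse-grained entropy rising monotonically:

```python
import math

# A hypothetical coarse-graining of phase space into three cells, with
# made-up fixed transition probabilities forming a doubly stochastic
# matrix (each row and each column sums to 1).
P = [
    [0.6, 0.3, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.3, 0.6],
]

def step(p):
    """One step of the master equation: p'_j = sum_i p_i * P[i][j]."""
    return [sum(p[i] * P[i][j] for i in range(3)) for j in range(3)]

def entropy(p):
    """Coarse-grained (Gibbs) entropy -sum p ln p of a cell distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

p = [1.0, 0.0, 0.0]   # all probability in one cell: far from equilibrium
entropies = [entropy(p)]
for _ in range(20):
    p = step(p)
    entropies.append(entropy(p))

# For a doubly stochastic matrix the uniform distribution is stationary,
# and the cell entropy rises monotonically toward its maximum, ln 3.
print(all(a <= b for a, b in zip(entropies, entropies[1:])))   # True
print(abs(entropies[-1] - math.log(3)) < 1e-6)                 # True
```

The monotone rise is a theorem for doubly stochastic matrices; the philosophical question in the text is whether any such Markov assumption can be squared with the underlying time-reversible dynamics.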

A number of attempts have been made to do without such an assumption
and to derive the approach to equilibrium out of the underlying
dynamics of the system. Since that dynamics is invariant under time
reversal and the kinetic equations are time asymmetric, time asymmetry
must be put into the explanatory theory somewhere.

One approach to deriving the kinetic equations relies upon work
which generalizes ergodic theory. Relying upon the instability of
trajectories, one tries to show that a region of phase points
representing the possible micro-states for a system prepared in a
non-equilibrium condition will, if the constraints are changed,
eventually evolve into a set of phase points that is “coarsely” spread
over the entire region of phase space allowed by the changed
constraints. The old region cannot “finely” cover the new region by a
fundamental theorem of dynamics (Liouville's theorem). But, in a manner
first described by Gibbs, it can cover the region in a coarse-grained
sense. To show that a collection of points will spread in such a way
(in the infinite time limit at least) one tries to show the system
possessed of an appropriate “randomization” property. In order of
increasing strength such properties include weak-mixing, mixing, being
a K system or being a Bernoulli system. Other, topological as opposed
to measure-theoretic, approaches to this problem exist as well.
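Mixing, for example, requires that for all measurable sets A and B

```latex
\lim_{t \to \infty} \mu\big( \phi_{t} A \cap B \big) = \mu(A)\, \mu(B)
```

so that, in the infinite time limit, the portion of any region B occupied by the evolved set A is proportional to A's overall measure.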

As usual, many caveats apply. Can the system really be shown to have
such a randomizing feature (in the light of the KAM theorem, for
example)? Are infinite time limit results relevant to our physical
explanations? If the results are finite-time ones, are they relativized
in the sense of holding only for some coarse partitionings of the
system rather than for those of experimental interest?

Most importantly, mixing and its ilk cannot be the whole story. All
the results of this theory are time symmetric. To get time asymmetric
results, and to get results that hold in finite times and which show
evolution in the manner described by the kinetic equation over those
finite times, requires an assumption as well about how the probability
is to be distributed over the region of points allowed as representing
the system at the initial moment.

What must that probability assumption look like and how may it be
justified? These questions were asked, and partly explored, by N.
Krylov. Attempts at rationalizing this initial probability assumption
have ranged from Krylov's own suggestion that it is the result of a
non-quantum “uncertainty” principle founded physically on the modes by
which we prepare systems, to the suggestion that it is the result of an
underlying stochastic nature of the world described as in the
Ghirardi-Rimini-Weber approach to understanding measurement in quantum
mechanics. The status and explanation of the initial probability
assumption remains the central puzzle of non-equilibrium statistical
mechanics.

There are other approaches to understanding the approach to
equilibrium at variance with the approaches that rely on mixing
phenomena. O. Lanford, for example, has shown that for an idealized
infinitely dilute gas one can show, for very small time intervals, an
overwhelmingly likely behavior of the gas according to the Boltzmann
equation. Here the interpretation of that equation by the Ehrenfests,
the interpretation suitable to the mixing approach, is being dropped in
favor of the older idea of the equation describing the overwhelmingly
probable evolution of a system. This derivation has the virtue of
rigorously generating the Boltzmann equation, but at the cost of
applying only to one severely idealized system and then only for a very
short time (although the result may be true, if unproven, for longer
time scales). Once again an initial probability distribution is still
necessary for time asymmetry.

The thermodynamic principles demand a world in which physical
processes are asymmetric in time. Entropy of an isolated system may
increase spontaneously into the future but not into the past. But the
dynamical laws governing the motion of the micro-constituents are, at
least on the standard views of those laws as being the usual laws of
classical or quantum dynamics, time reversal invariant. Introducing
probabilistic elements into the underlying theory still does not by
itself explain where time asymmetry gets into the explanatory account.
Even if, following Maxwell, we take the Second Law of thermodynamics to
be merely probabilistic in its assertions, it remains time
asymmetric.

Throughout the history of the discipline suggestions have often been
made to the effect that some deep, underlying dynamical law itself
introduces time asymmetry into the motion of the
micro-constituents.

One approach is to deny the time asymmetry of the dynamics governing
the micro-constituents and to look for a replacement law that is
itself time asymmetric. A modern version of this looks to an
interpretation of quantum mechanics that seeks to explain the
notorious “collapse of the wave packet” upon measurement. Ghirardi,
Rimini and Weber (GRW) have posited the existence of a purely
stochastic process deeper than that of the usual quantum evolution.
This pure chance process will quickly drive macroscopic systems into
near eigenfunctions of position while leaving isolated micro-systems
in superposition states. The stochastic process is asymmetric in time
(as is the collapse of the wave function upon measurement). D. Albert
has suggested that such a GRW process, if real, might also be invoked
to account for the time asymmetry of the dynamics of systems that
needs to be accounted for in thermodynamics. The time asymmetry of
the GRW collapse might work by directly affecting the dynamics of the
system, or it might do its job by appropriately randomizing the
initial states of isolated systems. Little has yet been done to fill
in the details to see if the posited GRW processes could, if real,
account for the known thermodynamic asymmetries. And, of course, there
is much skepticism that the GRW processes are even real.

Other proposals take the entropic change of a system to be mediated
by an actually uneliminable “interference” into the system of random
causal influences from outside the system. It is impossible, for
example, to genuinely screen the system from subtle gravitational
influences from the outside. The issue of the role of external
interference in the apparently spontaneous behavior of what is
idealized as an isolated system has been much discussed. Here the
existence of special systems (such as spin echo systems encountered in
nuclear magnetic resonance) plays a role in the arguments. For these
systems seem to display spontaneous approach to equilibrium when
isolated, yet can have their apparent entropic behavior made to “go
backward” with an appropriate impulse from outside the system. This
seems to show entropic increase without the kind of interference from
the outside that genuinely destroys the initial order implicit in the
system. In any case, it is hard to see how outside interference would
do the job of introducing time asymmetry unless such asymmetry is put
in “by hand” in characterizing that interference.

It was Boltzmann who first proposed a kind of “cosmological”
solution to the problem. As noted above he suggested a universe overall
close to equilibrium with “small” sub-regions in fluctuations away from
that state. In such a sub-region we would find a world far from
equilibrium. Introducing the familiar time-symmetric probabilistic
assumptions, it becomes likely that in such a region one finds states
of lower entropy in one time direction and states of higher entropy in
the other. Then finish the solution by introducing the other Boltzmann
suggestion that what we mean by the future direction of time is fixed
as that direction of time in which entropy is increasing.

Current cosmology sees quite a different universe than that posited
by Boltzmann. As far as we can tell the universe as a whole is in a
highly non-equilibrium state with parallel entropic increase into the
future everywhere. But the structure of the cosmos as we know it allows
for an alternative solution to the problem of the origin of time
asymmetry in thermodynamics. The universe seems to be spatially
expanding, with an origin some tens of billions of years ago in an
initial singularity, the Big Bang. Expansion, however, by itself does
not provide the time asymmetry needed for thermodynamics, for an
expanding universe with static or decreasing entropy is allowed by
physics. Indeed, in some cosmological models in which the universe
contracts after expanding, it is usually, though not always, assumed
that even in contraction entropy continues to increase.

The source of entropic asymmetry is sought, rather, in the physical
state of the world at the Big Bang. Matter “just after” the Big Bang is
usually posited to be in a state of maximum entropy – to be in
thermal equilibrium. But this does not take account of the structure of
“space itself,” or, if you wish, of the way in which the matter is
distributed in space and subject to the universal gravitational
attraction of all matter for all other matter. A world in which matter
is distributed with uniformity is one of low entropy. A high entropy
state is one in which we find a clustering of matter into dense regions
with lots of empty space separating these regions. This deviation from
the usual expectation – spatial uniformity as the state of
highest entropy – is due to the fact that gravity, unlike the
forces governing the interaction of molecules in a gas for example, is
a purely attractive force.

One can then posit an initial “very low entropy” state for the Big
Bang, with the spatial uniformity of matter providing an “entropic
reservoir.” As the universe expands, matter goes from a uniformly
distributed state with temperature also uniform to one in which matter
is highly clumped into hot stars in an environment of cold empty space.
One then has the universe as we know it, with its thermally highly
non-equilibrium condition. “Initial low entropy,” then, will be a state
in the past not (as far as we know) matched by any singularity of any
kind, much less one of low entropy, in the future. If one
conditionalizes on that initial low entropy state one then gets, using
the time symmetric probabilities of statistical mechanics, a prediction
of a universe whose entropy increased in time.

But it is not, of course, the entropy of the whole universe with which
the Second Law is concerned, but, rather, that of “small” systems
temporarily energetically isolated from their environments. One can
argue, in a manner tracing back to H. Reichenbach, that the entropic
increase of the universe as a whole will lead, again using the usual
time symmetric probabilistic posits, to a high probability that a
random “branch system” will show entropic increase parallel to that of
the universe and parallel to that of other branch systems. Most of the
arguments in the literature that this will be so are flawed, but the
inference is reasonable nonetheless. It has also been suggested that
if one invokes some underlying statistical dynamic law (such as the
GRW law noted above), one need not posit a branch system hypothesis in
addition to initial low entropy to derive the thermodynamic
results.

Positing initial low entropy for the Big Bang gives rise to its own
set of “philosophical” questions: Given the standard probabilities in
which high entropy is overwhelmingly probable, how could we explain the
radically “unexpected” low entropy of the initial state? Indeed, can we
apply probabilistic reasoning appropriate for systems in the universe
as we know it to an initial state for the universe as a whole? The
issues here are reminiscent of the old debates over the teleological
argument for the existence of God.

It comes as no surprise that the relationship of the older
thermodynamic theory to the new statistical mechanics on which it is
“grounded” is one of some complexity.

The older theory had no probabilistic qualifications to its laws.
But as Maxwell was clearly aware, it could not then be “exactly” true
if the new probabilistic theory correctly described the world. One can
either keep the thermodynamic theory in its traditional form and
carefully explicate the relationship its principles bear to the newer
probabilistic conclusions, or one can, as has been done in deeply
interesting ways, generate a new “statistical thermodynamics” that
imports into the older theory probabilistic structure.

Conceptually the relationship of older to newer theory is quite
complex. Concepts of the older theory (volume, pressure, temperature,
entropy) must be related to the concepts of the newer theory (molecular
constitution, dynamical concepts governing the motion of the molecular
constituents, probabilistic notions characterizing either the states of
an individual system or distributions of states over an imagined
ensemble of systems subject to some common constraints).

A single term of the thermodynamic theory such as
‘entropy’ will be associated with a wide variety of
concepts defined in the newer account. There is, for example, Boltzmann
entropy which is the property of a single system defined in terms of
the spatial and momentum distribution of its molecules. On the other
hand there are the Gibbs entropies, definable out of the probability
distribution over some Gibbsian ensemble of systems. Adding even more
complications, there is Gibbs' fine-grained entropy, which is defined
by the ensemble probability alone and is very useful in characterizing
equilibrium states, and Gibbs' coarse-grained entropy, whose definition
requires some partitioning of the phase space into finite cells as well
as the original probability distribution, and which is a useful concept
in characterizing the approach to equilibrium from the ensemble
perspective. In addition to these notions, which are measure
theoretic in nature, there are topological notions which can play the
role of a kind of entropy as well.
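In symbols (with k_B Boltzmann's constant): the Boltzmann entropy of a single system whose microstate lies in the macrostate region Γ_M, and the Gibbs fine-grained entropy of an ensemble with density ρ, are

```latex
S_B = k_B \ln |\Gamma_M|, \qquad S_G = -k_B \int \rho \ln \rho \, d\Gamma
```

where |Γ_M| is the phase-space volume of the macrostate; the coarse-grained entropy is obtained from S_G by replacing ρ with its average over each cell of the chosen partition.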

Nothing in this complexity stands in the way of claiming that
statistical mechanics describes the world in a way that explains why
thermodynamics works and works as well as it does. But the complexity
of the inter-relationship between the theories should make the
philosopher cautious in using this relationship as a well understood
and simple paradigm of inter-theoretic reduction.

It is of some philosophical interest that the relationship of
thermodynamics to statistical mechanics shows some similarity to
aspects uncovered in functionalist theories of the mind-body
relationship. Consider, for example, the fact that systems of very
different physical constitutions (say a gas made up of molecules
interacting by means of forces on the one hand and on the other hand
radiation whose components are energetically coupled wavelengths of
light) can share thermodynamic features. They can, for example, be at
the same temperature. Physically this means that the two systems, if
initially in equilibrium and then energetically coupled, will retain
their original equilibrium conditions. The parallel with the claim that
a functionally defined mental state (a belief, say) can be instantiated
in a wide variety of physical devices is clear.

We have noted that it was Boltzmann who first suggested that our
very concept of the future direction of time was fixed by the direction
in time in which entropy was increasing in our part of the universe.
Numerous authors have followed up this suggestion and the “entropic”
theory of time asymmetry remains a much debated topic in the philosophy
of time.

We must first ask what the theory is really claiming. In a sensible
version of the theory there is no claim being made to the effect that
we find out the time order of events by checking the entropy of systems
and taking the later event as the one in which some system has its
higher entropy. The claim is, rather, that it is the facts about the
entropic asymmetry of systems in time that “ground” the phenomena that
we usually think of as marking out the asymmetrical nature of time
itself.

What are some features whose intuitive temporal asymmetry we think
of as, perhaps, “constituting” the asymmetrical nature of time? There
are asymmetries of knowledge: We have memories and records of the past,
but not of the future. There are asymmetries of determination: We think
of causation as going from past through present to future, and not of
going the other way round. There are asymmetries of concern: We may
regret the past, but we anxiously anticipate the future. There are
alleged asymmetries of “determinateness” of reality: It is sometimes
claimed that past and present have determinate reality, but that the
future, being a realm of mere possibilities, has no such determinate
being at all.

The entropic theory in its most plausible formulation is a claim to
the effect that we can explain the origin of all of these intuitive
asymmetries by referring to facts about the entropic asymmetry of the
world.

This can be best understood by looking at the very analogy used by
Boltzmann: the gravitational account of up and down. What do we mean by
the downward direction at a spatial location? All of the phenomena by
which we intuitively identify the downward direction (as the direction
in which rocks fall, for example) receive an explanation in terms of
the spatial direction of the local gravitational force. Even our
immediate awareness of which direction is down is explainable in terms
of the effect of gravity on the fluid in our semicircular canals. It
comes as no shock to us at all that “down” for Australia is in the
opposite direction from “down” for Chicago. Nor are we dismayed to be
told that in outer space, far from a large gravitating object such as
the Earth, there is no such thing as the up-down distinction and no
direction of space which is the downward direction.

Similarly the entropic theorist claims that it is the entropic
features that explain the intuitive asymmetries noted above, that in
regions of the universe in which the entropic asymmetry was
counter-directed in time the past-future directions of time would be
opposite, and that in a region of the universe without an entropic
asymmetry neither direction of time would count as past or as
future.

The great problem remains in trying to show that the entropic
asymmetry is explanatorily adequate to account for all the other
asymmetries in the way that the gravitational asymmetry can account for
the distinction of up and down. Despite many interesting contributions
to the literature on this, the problem remains unresolved.

Most foundational inquiry into statistical mechanics presumes a
classical dynamical description of the constituent components of
macroscopic systems. But this cannot be
correct, of course, since that underlying dynamics must be quantum
mechanical. Gibbs was cautious in claiming a simple explanatory role
for his ensemble version of statistical mechanics, for example, since
it led to notoriously false predictions for such macroscopic features
of systems as their specific heat. It was later realized that the
fault here lay not in Gibbs' statistical mechanics but in assuming
classical dynamics at the constituent level. Once systems were
re-described on the correct quantum mechanical basis the predictive
errors disappeared.

Naturally, changing to a quantum mechanical basis leads to wholesale
changes within statistical mechanics. A new notion of phase space
with probabilities over it is needed, for example. Does this mean,
though, that the foundational explorations presupposing classical
mechanics are now irrelevant?

We have already noted that some proposals have been made that seek
the grounding of the very probabilistic nature of statistical
mechanics in the fundamentally probabilistic nature of quantum
mechanics at the dynamical level, or, rather, in some interpretation
of how probability functions in the roots of quantum mechanics.

Even without going that far, though, the shift to a quantum dynamical
basis still requires some rethinking of subtle issues in the
foundational debates. From the earliest days Poincare's Recurrence
Theorem was a problem for statistical mechanics. With a classical
dynamical basis a reply could be made that while the Theorem held for
individual systems of concern to the theory, it would not
necessarily hold of an ensemble of such systems. Once one moves to a
quantum mechanical basis this “way out” is no longer available. In
both dynamical frameworks, however, a move to the thermodynamic limit
of an infinite number of constituents for a system can eliminate the
theorem's applicability, and with it the objection that monotonic
thermodynamic change is unobtainable in statistical mechanics.
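The force of the recurrence worry, and why it fades for macroscopic systems, can be illustrated with a standard caricature not drawn from the text itself: the Ehrenfest urn (“dog-flea”) model. The simulation below is my own sketch; its function names and parameters are illustrative.

```python
import random

# Toy illustration (my sketch): the Ehrenfest urn model. N "fleas" sit
# on two "dogs"; at each step a randomly chosen flea jumps to the other
# dog. Started from the extreme state (all fleas on dog A), the chain
# provably returns to that state -- a recurrence in the spirit of
# Poincare's theorem -- but the mean return time is 2**N steps, which
# explodes with system size. For macroscopic N, recurrence is thus no
# practical objection to an apparently monotonic approach to equilibrium.

def mean_return_time(n_fleas, trials=200, seed=0):
    """Average number of steps to return to the all-on-A state."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        on_a = n_fleas            # start: every flea on dog A
        steps = 0
        while True:
            steps += 1
            if rng.random() < on_a / n_fleas:
                on_a -= 1         # a flea jumps A -> B
            else:
                on_a += 1         # a flea jumps B -> A
            if on_a == n_fleas:
                break             # recurrence: initial state restored
        total += steps
    return total / trials

for n in (4, 8, 12):
    print(n, mean_return_time(n))  # grows like 2**n: 16, 256, 4096
```

The exponential growth of the return time is the quantitative core of the classic Boltzmann-style reply to the recurrence objection; the thermodynamic-limit move mentioned above pushes the same idea to the extreme of infinitely many constituents, where the theorem no longer applies at all.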

One of the most striking macroscopic features of systems is the
existence of multiple phases (gas, liquid and solid, for example, or
diamagnetic and ferromagnetic for another) and the transitions between
those phases as thermodynamic features such as temperature and
pressure or imposed magnetization are varied. Early work on phase
transitions focused on the way in which quantities changed in a
non-analytic manner from phase to phase, even though statistical
mechanics seemed to show that such non-analytic behavior was
impossible, at least for systems with a finite number of constituents.
Here resort was often to going to the “thermodynamic limit” of an
idealized infinite system.

More recently methods have been developed for dealing with some phase
transitions that not only supplement the standard explanatory schemes
of traditional statistical mechanics, but that also provide insights
into the variety of forms scientific explanations can take. This
explanatory program, the use of the so-called “renormalization
group,” gives insight into why it is that systems of quite different
physical
natures can show thermodynamic similarities in their transitions from
phase to phase. In some cases the nature of the transition is seen to
depend on a few abstract parameters and not on the physical details of
the system. What matters are such things as the dimension of the
system, the degrees of freedom of the dynamics of the constituents and
general limits on the interactions of the constituents with one
another such as short and very long range behavior of the relevant
forces of interaction.

The trick is to look first at the interactions of nearest
constituents. Then one moves to a block of constituents as it relates
to nearest similar blocks. The new block-to-block interaction can
sometimes be obtained by a “scaling” of the original individual
constituent interaction. One continues this process to the limit of
an infinite system and looks for a limit point to the continually
rescaled interaction. From this limiting behavior the surprising
“universal” features of the phase change can sometimes be found,
explaining the generality of similar phase transitions over diverse
physical systems.
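The block-scaling procedure can be shown in its simplest textbook instance, the real-space decimation of the zero-field one-dimensional Ising chain. This is a minimal sketch of my own choosing, not the article's example: summing out every other spin maps a coupling K onto an effective block coupling K′ with tanh(K′) = tanh(K)².

```python
import math

# Minimal RG sketch (my illustration): real-space decimation for the
# zero-field 1d Ising chain. Tracing out alternate spins replaces the
# nearest-neighbor coupling K by a rescaled block coupling K', via
#     tanh(K') = tanh(K)**2.
# Iterating this map and looking for its fixed points is precisely the
# "continually rescaled interaction" procedure described in the text.

def decimate(K):
    """One RG step: effective coupling after summing out every other spin."""
    return math.atanh(math.tanh(K) ** 2)

K = 1.5
flow = [K]
for _ in range(10):
    K = decimate(K)
    flow.append(K)
print(flow)  # the coupling flows toward the trivial fixed point K* = 0

# Every finite starting coupling flows to K* = 0 (disorder); the only
# other fixed point is K = infinity. That the flow has no nontrivial
# fixed point is the RG way of seeing that the 1d chain exhibits no
# finite-temperature phase transition.
```

In two or more dimensions the analogous map does acquire a nontrivial fixed point, and it is the behavior of the flow near that fixed point, rather than the physical details of any particular system, that fixes the “universal” features of the transition, which is the source of the generality the text emphasizes.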

The explanatory strategy here is quite unlike the usual one
encountered in statistical mechanics and shows incisively how the
specifics of the physical systems crying out for explanation in
science can require the introduction of new methodological ploys if
full understanding is to be obtained. In this the introduction of
these renormalization group methods resembles the way in which the
need for an atomistic account of macroscopic thermodynamic behavior
itself required the new methods of statistical mechanics to be added
to the older repertoire of typical dynamical explanations.

A comprehensive treatment of the issues from a philosophical
perspective is Sklar 1993. Of important historical interest is
Reichenbach 1956. An accessible and up-to-date discussion of the
fundamental issues is Albert 2000. The possible invocation of a time
asymmetric law through the GRW approach is discussed in that book. A
spirited defense of the initial low entropy approach to time asymmetry
is Price 1996. Frigg 2008 reviews additional philosophers' work on the
foundational issues. English translations of many original fundamental
papers are in Brush 1965. Brush 1976 provides an historical treatment
of the development of the theory. Two foundational works that are
essential are Gibbs 1960 and Ehrenfest and Ehrenfest 1959. Two works
explaining clearly and in detail many of the technical aspects of
foundational statistical mechanics are Emch and Liu 2002 and Toda,
Kubo and Saito 1983. These two works provide a thorough grounding on
quantum statistical mechanics and how it differs from statistical
mechanics grounded on classical theory. An excellent introduction to
phase change and renormalization group theory is Batterman 2002.

Frigg, R., 2008, “A Field Guide to Recent Work on the Foundations
of Statistical Mechanics,” in D. Rickles (ed.), The Ashgate
Companion to Contemporary Philosophy of Physics, London: Ashgate,
pp. 99–196.