Boltzmann's first attempt to derive the second law of thermodynamics assumed that gas particles followed strict dynamical laws, that is, Newton's classical mechanics. In 1872 Boltzmann derived a mathematical quantity (his H-Theorem) that had the same property of increase for a gas approaching equilibrium as Rudolf Clausius' entropy law. Clausius enunciated the two laws of thermodynamics. First, the energy in the world is a constant. Second, the entropy of the world increases to a maximum.

James Clerk Maxwell was critical of Boltzmann's result. He argued (correctly as it turns out) that the kinetic theory of gases must be purely statistical, not deterministic and dynamical.

Boltzmann had in 1866 derived Maxwell's velocity distribution for the molecules of a gas in equilibrium dynamically, putting it on a firmer ground than Maxwell.

Boltzmann's mentor and colleague Josef Loschmidt criticized Boltzmann's 1872 demonstration of entropy increase, on the grounds that Newton's dynamical laws are reversible. If all the individual particles could be turned around exactly (or if time could be reversed), Boltzmann's H-Theorem should show that the entropy would decrease, violating the second law. Some, including Boltzmann, suggested that time might be simply the direction in which entropy increases. Arthur Stanley Eddington later called this the Arrow of Time.

The basic problem is this: how can macroscopic irreversibility result from microscopic processes that are fundamentally reversible?

We shall see that the answer is found in the quantum nature of the atoms and molecules, especially when they interact with radiation.

In 1877, five years after his H-theorem and in response to Loschmidt's criticism, Boltzmann reformulated the theorem on purely statistical and probabilistic grounds. Maxwell, who died in 1879, did not remark on this obvious improvement. He found Boltzmann's papers much too long and too dense to read. (Boltzmann found Maxwell's papers too brief to contain a full explanation.)

It is not clear that Boltzmann would agree with Maxwell about the implicit loss of determinism in physics. Boltzmann maintained (as his student Franz Exner, and Exner's student Erwin Schrödinger would later briefly insist) that observational evidence can never justify our assumptions of strict determinism.

Boltzmann was under severe attacks from colleagues for espousing the reality of atoms. He may have been wary of emphasizing that atomic motions are chaotic and random. Real ontological chance was anathema to deterministic nineteenth-century thinkers and even considered atheistic by many, since it implies denial of the omniscience of God.

Boltzmann was a great believer in theories, but he knew that they could "go beyond experience," a phrase he used more than once and the key phrase in Franz Exner's denial of strict causal determinism decades before quantum mechanics. As Albert Einstein would later explain, theories are "free inventions of the human mind." Theories are guesses, new ideas, fictions, and pure information that goes beyond Ernst Mach's positivist belief that science includes only "economic summaries" of the results of experiments.

The confirmation of theories always rests on the statistics of experiments. So theories are always probabilistic predictions about what will happen in the experiments. And experiments can only provide statistical evidence, although sometimes the evidence is so good that we can regard it as "adequately" or statistically determined.

Whence comes the ancient view, that the body does not fill space continuously in the mathematical sense, but rather it consists of discrete molecules, unobservable because of their small size. For this view there are philosophical reasons. An actual continuum must consist of an infinite number of parts; but an infinite number is undefinable. Furthermore, in assuming a continuum one must take the partial differential equations for the properties themselves as initially given. However, it is desirable to distinguish the partial differential equations, which can be subjected to empirical tests, from their mechanical foundations (as Hertz emphasized in particular for the theory of electricity). Thus the mechanical foundations of the partial differential equations, when based on the coming and going of smaller particles, with restricted average values, gain greatly in plausibility; and up to now no other mechanical explanation of natural phenomena except atomism has been successful.

A real discontinuity of bodies is moreover established by numerous, and moreover quantitatively agreeing, facts. Atomism is especially indispensable for the clarification of the facts of chemistry and crystallography. The mechanical analogy between the facts of any science and the symmetry relations of discrete particles pertains to those most essential features which will outlast all our changing ideas about them, even though the latter may themselves be regarded as established facts. Thus already today the hypothesis that the stars are huge bodies millions of miles away is similarly viewed only as a mechanical analogy for the representation of the action of the sun and the faint visual perceptions arising from the other heavenly bodies, which could also be criticized on the grounds that it replaces the world of our sense perceptions by a world of imaginary objects, and that anyone could just as well replace this imaginary world by another one without changing the observable facts.

I hope to prove in the following that the mechanical analogy between the facts on which the second law of thermodynamics is based, and the statistical laws of motion of gas molecules, is also more than a mere superficial resemblance.

The question of the utility of atomistic representations is of course completely unaffected by the fact, emphasized by Kirchhoff, that our theories have the same relation to nature as signs to significates, for example as letters to sounds, or notes to tones. It is likewise unaffected by the question of whether it is not more useful to call theories simply descriptions, in order to remind ourselves of their relation to nature. The question is really whether bare differential equations or atomistic ideas will eventually be established as complete descriptions of phenomena.

Once one concedes that the appearance of a continuum is more clearly understood by assuming the presence of a large number of adjacent discrete particles, assumed to obey the laws of mechanics, then he is led to the further assumption that heat is a permanent motion of molecules. Then these must be held in their relative positions by forces, whose origin one can imagine if he wishes. But all forces that act on the visible body but not equally on all the molecules must produce motion of the molecules relative to each other, and because of the indestructibility of kinetic energy these motions cannot stop but must continue indefinitely.

In fact, experience teaches that as soon as the force acts equally on all parts of a body — as for example in so-called free fall — all the kinetic energy becomes visible. In all other cases, we have a loss of visible kinetic energy, and hence creation of heat. The view offers itself that there is a resulting motion of molecules among themselves, which we cannot see because we do not see individual molecules, but which however is transmitted to our nerves by contact, and thus creates the sensation of heat. It always moves from bodies whose molecules move rapidly to those whose molecules move more slowly, and because of the indestructibility of kinetic energy it behaves like a substance, as long as it is not transformed into visible kinetic energy or work.

We do not know the nature of the force that holds the molecules of a solid body in their relative positions, whether it is action at a distance or is transmitted through a medium, and we do not know how it is affected by thermal motion. Since it resists compression as much as it resists dilatation, we can obviously get a rather rough picture by assuming that in a solid body each molecule has a rest position. If it approaches a neighboring molecule it is repelled by it, but if it moves farther away there is an attraction.
Consequently, thermal motion first sets a molecule into pendulum-like oscillations in straight or elliptical paths around its rest position A (in the symbolic Fig. 1, the centers of gravity of the molecules are indicated). If it moves to A', the neighboring molecules B and C repel it, while D and E attract it and hence bring it back to its original rest position. If each molecule vibrates around a fixed rest position, the body will have a fixed form; it is in the solid state of aggregation. The only consequence of the thermal motion is that the rest positions of the molecules will be somewhat pushed apart, and the body will expand somewhat. However, when the thermal motion becomes more rapid, one gets to the point where a molecule can squeeze between its two neighbors and move from A to A" (Fig. 1). It will no longer then be pulled back to its old rest position, but it can instead remain where it is. When this happens to many molecules, they will crawl among each other like earthworms, and the body is molten.

Although one may find this description rather crude and childish, it may be modified later and the apparent repulsive force may turn out to be a direct consequence of the motion. In any case, one will allow that when the motions of the molecules increase beyond a definite limit, individual molecules on the surface of the body can be torn off and must fly out freely into space; the body evaporates. If it is in an enclosed vessel, then this will be filled with freely moving molecules, and these can occasionally penetrate into the body again; as soon as the number of recondensing molecules is, on the average, equal to the number of evaporating ones, one says that the vessel is saturated with the vapor of the body in question.

A sufficiently large enclosed space, in which only such freely moving molecules are found, provides a picture of a gas. If no external forces act on the molecules, these move most of the time like bullets shot from guns in straight lines with constant velocity. Only when a molecule passes very near to another one, or to the wall of the vessel, does it deviate from its rectilinear path. The pressure of the gas is interpreted as the action of these molecules against the wall of the container.

We shall suppose for a moment that in the container there is a
single gas composed of completely identical molecules. The molecules
will also from now on—unless we specify otherwise—be assumed
to behave like completely elastic spheres when they collide
with each other. Even if all the molecules initially had the same
velocity, there would soon occur collisions in which the velocity
of one colliding molecule is nearly in the direction of the line of
centers, but that of the other is nearly perpendicular to it. The
first molecule would thereby end up with nearly zero velocity,
while the velocity of the second would become √2 times as large.
In the course of further collisions it would soon happen, if the
number of molecules were large enough, that all possible velocities
would occur, from zero up to a velocity much larger than the
original common velocity of all the molecules; it is then a question
of calculating the law of distribution of velocities among the molecules
in the final state thus reached, or, as one says more briefly,
to find the velocity distribution law. In order to find it, we shall
consider a more general case. We assume that we have two kinds
of molecules in the container. Each molecule of the first kind has
mass m, and each of the second has mass m1. The velocity distribution
which prevails at any arbitrary time t will be represented
by drawing as many straight lines (starting from the origin of
coordinates) as there are m-molecules in unit volume. Each line
will be the same in length and direction as the velocity of the corresponding
molecule. Its endpoint will be called the velocity point
of the corresponding molecule. Now at time t let

(9) f(ξ, η, ζ, t)dξdηdζ = fdω

be the number of m-molecules whose velocity components in the
three coordinate directions lie between the limits

(10) ξ and ξ + dξ, η and η + dη, ζ and ζ + dζ

and for which the velocity points thus lie in the parallelepiped
having at one corner the coordinates ξ, η, ζ and having edges of
length dξ, dη, dζ parallel to the coordinate axes. We shall always
call this the parallelepiped dω. We shall also use the abbreviations
dω for the product dξdηdζ, and f for f(ξ, η, ζ, t). If dω were a volume
element of any other shape, though still infinitesimal, the number
of m-molecules whose velocity points lie inside dω would still be
equal to

(11) f(ξ, η, ζ, t)dω,

as one can see by dividing the volume element dω into even
smaller parallelepipeds. If the function f is known for one value
of t, then the velocity distribution for the m-molecules is determined
at time t. Similarly we can represent the velocity of each
m1-molecule by a velocity point, and denote by

(12) F(ξ1, η1, ζ1, t)dξ1dη1dζ1 = F1dω1

the number of molecules whose velocity components lie between
any other limits

(13) ξ1 and ξ1 + dξ1, η1 and η1 + dη1, ζ1 and ζ1 + dζ1

and for which therefore the velocity points lie in a similar parallelepiped
dω1. Likewise we write dω1 for dξ1dη1dζ1 and F1 for
F(ξ1, η1, ζ1, t). Moreover, we shall completely exclude any external forces, and assume that the walls are completely smooth
and elastic. Then the molecules reflected from the wall will move
as if they came from a gas which is the mirror image of our gas,
and which is thus completely equivalent to it; the container wall
is thought of as a reflecting surface. According to these assumptions,
the same conditions will prevail everywhere inside the container,
and if the number of molecules in a volume element whose
velocity components lie between the limits (10) is initially the
same everywhere in the gas, then this will also be true for all subsequent
times. If we assume this, then it follows that the number
of m-molecules inside any volume Φ that satisfy the conditions
(10) is proportional to the volume Φ and is therefore equal to

(14) Φfdω;

likewise the number of m1-molecules in the volume Φ that satisfy
the conditions (13) is:

(14a) ΦF1dω1

From these assumptions it follows that the molecules that leave
any space as a result of their progressive motion will on the average
be replaced by an equal number of molecules from the neighboring
space or by reflection at the walls of the container, so that
the velocity distribution is changed only by collisions, and not by
the progressive motion of the molecules. We shall make ourselves
independent of these restrictive conditions (made now to simplify
the calculation) in §§15-18, where we shall take account of the
effect of gravity and other external forces.
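
The earlier remark that one colliding molecule can end up nearly at rest while the other leaves √2 times faster can be checked with a small sketch. It assumes equal masses and smooth, perfectly elastic spheres, so that only the velocity components along the line of centers are exchanged. The function names and the chosen vectors are illustrative and do not come from the text.

```python
import math

def collide_equal_spheres(v1, v2, n):
    """Elastic collision of two equal-mass smooth spheres.

    v1, v2: velocity vectors before the collision (tuples).
    n: unit vector along the line of centers.
    Equal masses exchange their velocity components along n;
    the components perpendicular to n are unchanged.
    """
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    rel = dot(v1, n) - dot(v2, n)          # relative velocity along n
    w1 = tuple(a - rel * c for a, c in zip(v1, n))
    w2 = tuple(b + rel * c for b, c in zip(v2, n))
    return w1, w2

def speed(v):
    return math.sqrt(sum(x * x for x in v))

# Both molecules start with speed 1. The first moves along the line of
# centers (taken as the x-axis), the second perpendicular to it.
n = (1.0, 0.0, 0.0)
v1 = (1.0, 0.0, 0.0)
v2 = (0.0, 1.0, 0.0)
w1, w2 = collide_equal_spheres(v1, v2, n)
```

In this idealized limiting case the first molecule is left exactly at rest and the second carries the whole kinetic energy, so its speed is √2 times the common initial speed.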

We next consider only the collisions of an m-molecule with an
m1-molecule and indeed we shall single out, from all those collisions
that can occur during the interval dt, only those for which
the following three conditions are satisfied:

1. The velocity components of the m-molecule lie between
the limits (10) before the collision, hence its velocity point lies in
the parallelepiped dω.

2. The velocity components of the m1-molecule lie between
the limits (13) before the collision, hence its velocity point lies in
the parallelepiped dω1. All m-molecules for which the first condition
is fulfilled will be called "m-molecules of the specified kind,"
and similarly we speak of "m1-molecules of the specified kind."

3. We construct a sphere of unit radius, whose center is at
the origin of coordinates, and on it a surface element dλ. The line
of centers of the colliding molecules drawn from m to m1 must,
at the moment of collision, be parallel to a line drawn from the
origin to some point of the surface element dλ. The aggregate of
these lines constitutes the cone dλ.

(15) Direction mm1 in the cone dλ.

All collisions that take place in such a way that these three
conditions are fulfilled will be called "collisions of the specified
kind" and we have the problem of determining the number dν of
collisions of the specified kind that take place during a time interval
dt in unit volume. We shall represent these collisions in
Figure 2. Let O be the origin of coordinates, C and C1 the velocity
points of the two molecules before the collision, so that the lines
OC and OC1 represent these velocities in magnitude and direction,
before the collision. The point C must lie inside the parallelepiped
dω, and the point C1 inside the parallelepiped dω1. (The two parallelepipeds
are not shown in the figure.) Let OK be a line of unit
length which has the same direction as the line of centers of the
two molecules at the instant of the collision, drawn from m to m1.

[FIG. 2.]

The point K must therefore lie inside the surface element dλ, which
is also not shown in the figure. The line C1C = g represents in
magnitude and direction the relative velocity of the m-molecule
with respect to the m1-molecule before the collision, since its
projections on the coordinate axes are equal to ξ - ξ1, η - η1,
and ζ - ζ1, respectively. The frequency of collisions obviously
depends only on the relative velocity. Hence if we wish to find
the number of collisions of the specified kind, we can imagine
that the specified m1-molecule is at rest, while the m-molecule
moves with velocity
g. We imagine further that a sphere of radius σ (the sphere σ) is
rigidly attached to each of the latter molecules, so that the center
of the sphere always coincides with the center of the molecule.
σ should be equal to the sum of the radii of the two molecules.
Each time that the surface of such a sphere touches the center of
an m1-molecule, a collision takes place. We now draw from the
center of each sphere a cone, similar and similarly situated to
the cone dλ. A surface element of area σ²dλ is thereby cut out from
the surface of each of these spheres. Since all the spheres are
rigidly attached to the corresponding molecules, all these surface
elements move a distance gdt relative to the specified m1-molecule.
A collision of the specified kind occurs whenever one of these surface
elements touches the center of a specified m1-molecule, which
is of course possible only if the angle θ between the directions of
the lines C1C and OK is acute. Each of these surface elements
traverses by its relative motion toward the m1-molecule an oblique
cylinder of base σ²dλ and height g cosθ dt. Since there are fdω
m-molecules of the specified kind in unit volume, all the oblique
cylinders traversed in this manner by all the surface elements
have total volume

(16) Φ = fdω σ²g cosθ dλ dt

All centers of m1-molecules of the specified kind lying inside the
volume Φ will touch during the interval dt one surface element
σ²dλ, and hence the number dν of collisions of the specified kind
which occur in the volume element during time dt is equal to the
number ZΦ of centers of m1-molecules of the specified kind that
are in the volume Φ at the beginning of dt. But according to
Equation (14a) this is

(17) ZΦ = ΦF1dω1
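
Substituting the volume Φ from Equation (16) into Equation (17) gives the collision count explicitly. The following is a sketch of that single substitution, written in LaTeX with the subscripts restored; it uses only the symbols defined above.

```latex
\begin{align*}
d\nu = Z_\Phi &= \Phi \, F_1 \, d\omega_1 \\
              &= f \, d\omega \; F_1 \, d\omega_1 \; \sigma^2 g \cos\theta \; d\lambda \, dt .
\end{align*}
```

Written this way, the count is proportional to the product of f and F1, that is, to the two distribution functions taken independently of each other.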

S. H. Burbury and his British colleague E. P. Culverwell suggested in the 1890s that Boltzmann's idea of "molecular disorder" might be the result of radiation interacting with the molecules. Boltzmann appears never to have taken this seriously.

In this formula there is contained a special assumption, as
Burbury has clearly emphasized. From the standpoint of mechanics,
any arrangement of molecules in the container is possible;
in such an arrangement, the variables determining the motion of
the molecules may have different average values in one part of
the space filled by the gas than in another, where for example the
density or mean velocity of a molecule may be larger in one half
of the container than in the other, or more generally some finite
part of the gas has different properties than another. Such a distribution will be called molar-ordered [molar-geordnete]. Equations (14) and (14a) pertain to the case of a molar-disordered distribution. If the arrangement of the molecules also exhibits no
regularities that vary from one finite region to another—if it is
thus molar-disordered—then nevertheless groups of two or a small
number of molecules can exhibit definite regularities. A distribution that
exhibits regularities of this kind will be called molecular-ordered. We have a molecular-ordered distribution if—to select
only two examples from the infinite manifold of possible cases—
each molecule is moving toward its nearest neighbor, or again if
each molecule whose velocity lies between certain limits has ten
much slower molecules as nearest neighbors. When these special
groupings are not limited to particular places in the container but
rather are found on the average equally often throughout the entire
container, then the distribution would be called molar-disordered.
Equations (14) and (14a) would then always be valid
for individual molecules, but Equation (17) would not be valid,
since the nearness of the m-molecule would be influenced by the
probability that the m1-molecule lies in the space Φ. The presence
of the m1-molecule in the space Φ cannot therefore be considered
in the probability calculation as an event independent of the
nearness of the m-molecule. The validity of Equation (17) and
the two similar equations for collisions of m- or m1-molecules with
each other can therefore be considered as defining the meaning
of the expression: the distribution of states is molecular-disordered.
If the mean free path in a gas is large compared to the mean
distance of two neighboring molecules, then in a short time, completely
different molecules than before will be nearest neighbors
to each other. A molecular-ordered but molar-disordered distribution
will most probably be transformed into a molecular-disordered
one in a short time. Each molecule flies from one collision
to another one so far away that one can consider the occurrence
of another molecule, at the place where it collides the second
time, with a definite state of motion, as being an event completely
independent (for statistical calculations) of the place from which
the first molecule came (and similarly for the state of motion of
the first molecule). However, if we choose the initial configuration
on the basis of a previous calculation of the path of each molecule,
so as to violate intentionally the laws of probability, then of
course we can construct a persistent regularity or an almost
molecular-disordered distribution which will become molecular-ordered at a particular time. Kirchhoff also makes the assumption
that the state is molecular-disordered in his definition of the probability concept.

That it is necessary to the rigor of the proof to specify this
assumption in advance was first noticed in the discussion of my
so-called H-theorem or minimum theorem. However, it would be
a great error to believe that this assumption is necessary only for
the proof of this theorem. Because of the impossibility of calculating
the positions of all the molecules at each time, as the
astronomer calculates the positions of all the planets, it would be
impossible without this assumption to prove the theorems of gas
theory. The assumption is made in the calculation of the viscosity,
heat conductivity, etc. Also, the proof that the Maxwell velocity
distribution law is a possible one—i.e., that once established it
persists for an infinite time—is not possible without this assumption. For one cannot prove that the distribution always remains
molecular-disordered.

In fact, when Maxwell's state has arisen
from some other state, the exact recurrence of that other state
will take place after a sufficiently long time (cf. the second half
of §6). Thus one can have a state arbitrarily close to the Maxwellian
state which finally is transformed into a completely different
one. It is not a defect that the minimum theorem is tied to
the assumption of disorder, rather it is a merit that this theorem
has clarified our ideas so that one recognizes the necessity of this
assumption.

When a gas is enclosed in a rigid container, and initially one part of it has a visible motion with respect to the rest, then it soon comes to rest as a consequence of viscosity. When two kinds of gas are initially unmixed, but in contact with each other, then they mix, even if the lighter one was originally on top. In general, when a gas or a system of several kinds of gas has initially some improbable state, then it passes to the most probable state under the given external conditions, and remains there during all observable later times. In order to prove that this is a necessary consequence of the kinetic theory of gases, we used the quantity H defined and discussed in this chapter. We proved that it continually decreases as a result of the motion of the gas molecules among each other. The one-sidedness of this process is clearly not based on the equations of motion of the molecules. For these do not change when the time changes its sign. This one-sidedness rather lies uniquely and solely in the initial conditions.
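
The quantity H is not written out in this excerpt. Assuming the usual form, the integral of f ln f over velocity space, the following one-dimensional sketch compares H for a Gaussian (Maxwell-like) distribution with H for a uniform distribution of the same mean-square velocity. The integration grid, the function names, and the choice of a uniform comparison distribution are illustrative assumptions, not part of Boltzmann's proof.

```python
import math

def h_functional(f, lo, hi, steps=20000):
    """Midpoint-rule estimate of H = integral of f*ln(f) over [lo, hi]."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        fx = f(lo + (i + 0.5) * dx)
        if fx > 0.0:                 # f*ln(f) tends to 0 as f tends to 0
            total += fx * math.log(fx) * dx
    return total

SIGMA = 1.0                          # root-mean-square velocity (assumed units)
HALF_WIDTH = SIGMA * math.sqrt(3.0)  # uniform on [-a, a] has variance a^2/3

def gaussian(v):
    return math.exp(-v * v / (2 * SIGMA**2)) / (SIGMA * math.sqrt(2 * math.pi))

def uniform(v):
    return 1.0 / (2 * HALF_WIDTH) if abs(v) <= HALF_WIDTH else 0.0

h_gauss = h_functional(gaussian, -10.0, 10.0)
h_unif = h_functional(uniform, -10.0, 10.0)
```

In this comparison the Gaussian gives the smaller value of H at equal mean-square velocity, which is consistent with H decreasing as a gas approaches the Maxwell distribution.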

This is not to be understood in the sense that for each experiment one must specially assume just certain initial conditions and not the opposite ones which are likewise possible; rather it is sufficient to have a uniform basic assumption about the initial properties of the mechanical picture of the world, from which it then follows with logical necessity that, when bodies are always interacting, they must always be found in the correct initial conditions. In particular, our theory does not require that each time when bodies are interacting, the initial state of the system they form must be distinguished by a special property (ordered or improbable) which relatively few states of the same mechanical system would have under the external mechanical conditions in question. Hereby the fact is clarified that this system takes in the course of time states which do not have these properties, and which one calls disordered. Since by far most of the states of the system are disordered, one calls the latter the probable states.

The ordered initial states are not related to the disordered ones in the way that a definite state is to the opposite state (arising from the mere reversal of the directions of all velocities), but rather the state opposite to each ordered state is again an ordered state.

The self-regulating most probable state — which we call the Maxwell velocity distribution since Maxwell first found its mathematical expression in a special case — is not some kind of special singular state which is contrasted to infinitely many more non-Maxwellian distributions. Rather it is, on the contrary, characterized by the fact that by far the largest number of possible states have the characteristic properties of the Maxwell distribution, and compared to this number, the number of possible velocity distributions which significantly deviate from the Maxwellian is vanishingly small. The criterion of equal possibility or equal probability is provided by Liouville's theorem.
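
A toy count makes the phrase "by far the largest number of possible states" concrete. The sketch below is not Boltzmann's calculation. It assumes N distinguishable molecules, each of which may sit in the left or the right half of a container, treats all 2^N arrangements as equally possible, and asks what fraction of them have a nearly even split. The function name and the 10 percent tolerance are illustrative.

```python
from math import comb

def fraction_near_even(n_molecules, tolerance_percent=10):
    """Fraction of the 2**N left/right arrangements whose left-half count k
    satisfies |k - N/2| <= tolerance_percent% of N (integer arithmetic)."""
    favorable = sum(
        comb(n_molecules, k)
        for k in range(n_molecules + 1)
        if abs(2 * k - n_molecules) * 100 <= 2 * tolerance_percent * n_molecules
    )
    return favorable / 2**n_molecules

# The nearly even arrangements take over rapidly as N grows.
fractions = {n: fraction_near_even(n) for n in (10, 100, 1000)}
```

Even at N = 1000, far below the number of molecules in any real gas sample, almost every arrangement is nearly even, and the strongly uneven ones form a vanishing minority.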

In order to explain the fact that the calculations based on this assumption correspond to actually observable processes, one must assume that an enormously complicated mechanical system represents a good picture of the world, and that all or at least most of the parts of it surrounding us are initially in a very ordered — therefore very improbable — state. When this is the case, then whenever two or more small parts of it come into interaction with each other, the system formed by these parts is also initially in an ordered state, and when left to itself it rapidly proceeds to the disordered most probable state.

§88. On the return of a system to a former state.

We make the following remarks:

1. It is by no means the sign of the time which constitutes the characteristic difference between an ordered and a disordered state. If, in the "initial states" of the mechanical picture of the world, one reverses the directions of all velocities, without changing their magnitudes or the positions of the parts of the system; if, as it were, one follows the states of the system backwards in time, then he would likewise first have an improbable state, and then reach ever more probable states. Only in those periods of time during which the system passes from a very improbable initial state to a more probable later state do the states change in the positive time direction differently than in the negative.

2. The transition from an ordered to a disordered state is only extremely improbable. Also, the reverse transition has a definite calculable (though inconceivably small) probability, which approaches zero only in the limiting case when the number of molecules is infinite. The fact that a closed system of a finite number of molecules, when it is initially in an ordered state and then goes over to a disordered state, finally after an inconceivably long time must again return to the ordered state,* is therefore not a refutation but rather indeed a confirmation of our theory.

One should not however imagine that two gases in a 1/10 liter container, initially unmixed, will mix, then again after a few days, separate, then mix again, and so forth. On the contrary, one finds by the same principles which I used* for a similar calculation that, not until after a time enormously long compared to 10^(10^10) years will there be any noticeable unmixing of the gases. One may recognize that this is practically equivalent to never, if one recalls that in this length of time, according to the laws of probability, there will have been many years in which every inhabitant of a large country committed suicide, purely by accident, on the same day, or every building burned down at the same time — yet the insurance companies get along quite well by ignoring the possibility of such events. If a much smaller probability than this is not practically equivalent to impossibility, then no one can be sure that today will be followed by a night and then a day.

* Ann. Phys. [3] 57, 783 (1896).
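
How strongly the reverse transition depends on the number of molecules can be illustrated with a deliberately crude model. This is an assumption made for illustration and is not the calculation cited in the footnote: each of N molecules is taken to be independently in the left half of a container with probability 1/2, so the chance of finding all of them on the left at a given instant is 2^(-N).

```python
import math

def log10_prob_all_left(n_molecules):
    """log10 of the probability 2**(-N) that all N molecules are
    found in the left half at one instant (independent molecules)."""
    return -n_molecules * math.log10(2.0)

for n in (10, 100, 10**6, 10**20):
    print(f"N = {n:.0e}: probability ~ 10^({log10_prob_all_left(n):.3g})")
```

The probability is finite for every finite N and tends to zero only as N grows without bound, which is the sense of the second remark above.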

We have looked mainly at processes in gases and have calculated the function H for this case. Yet the laws of probability that govern atomic motion in the solid and liquid states are clearly not qualitatively different in this respect from those for gases, so that the calculation of the function H corresponding to the entropy would not be more difficult in principle, although to be sure it would involve greater mathematical difficulties.

§89. Relation to the second law of thermodynamics.

If therefore we conceive of the world as an enormously large mechanical system composed of an enormously large number of atoms, which starts from a completely ordered initial state, and even at present is still in a substantially ordered state, then we obtain consequences which actually agree with the observed facts; although this conception involves, from a purely theoretical — I might say philosophical — standpoint, certain new aspects which contradict general thermodynamics based on a purely phenomenological viewpoint. General thermodynamics proceeds from the fact that, as far as we can tell from our experience up to now, all natural processes are irreversible. Hence according to the principles of phenomenology, the general thermodynamics of the second law is formulated in such a way that the unconditional irreversibility of all natural processes is asserted as a so-called axiom, just as general physics based on a purely phenomenological standpoint asserts the unconditional divisibility of matter without limit as an axiom.

Just as the differential equations of elasticity theory and hydrodynamics based on this latter axiom will always remain the basis of the phenomenological description of a large group of natural phenomena, since they provide the simplest approximate expression of the facts, so likewise will the formulas of general thermodynamics. No one who has fallen in love with the molecular theory will approve of its being given up completely. But the opposite extreme, the dogma of a self-sufficient phenomenology, is also to be avoided.

Just as the differential equations represent simply a mathematical method for calculation, whose clear meaning can only be understood by the use of models which employ a large finite number of elements,1 so likewise general thermodynamics (without prejudice to its unshakable importance) also requires the cultivation of mechanical models representing it, in order to deepen our knowledge of nature — not in spite of, but rather precisely because these models do not always cover the same ground as general thermodynamics, but instead offer a glimpse of a new viewpoint. Thus general thermodynamics holds fast to the invariable irreversibility of all natural processes. It assumes a function (the entropy) whose value can only change in one direction — for example, can only increase — through any occurrence in nature. Thus it distinguishes any later state of the world from any earlier state by its larger value of the entropy. The difference of the entropy from its maximum value — which is the goal [Treibende] of all natural processes — will always decrease. In spite of the invariance of the total energy, its transformability will therefore become ever smaller, natural events will become ever more dull and uninteresting, and any return to a previous value of the entropy is excluded.2

1 Boltzmann, Die Unentbehrlichkeit der Atomistik i.d. Naturwissenschaft. Wien. Ber. 105 (2) 907 (1896); Ann. Phys. [3] 60, 231 (1897). Ueber die Frage nach der Existenz der Vorgänge in der unbelebten Natur, Wien. Ber. 106 (2) 83 (1897).
2 W. Thomson (Lord Kelvin), On a universal tendency in nature to the dissipation of mechanical energy, Proc. R. S. Edinburgh 3, 139 (1852), and later papers by many authors, too numerous to cite. For the generalized "dissipation of energy" version of the second law, see F. O. Koenig, On the history of science and of the second law of thermodynamics, p. 57 in Men and Moments in the History of Science, ed. H. M. Evans (Seattle: University of Washington Press, 1959), and other articles cited therein.

One cannot assert that this consequence contradicts our experience, for indeed it seems to be a plausible extrapolation of our present knowledge of the world. Yet, with all due recognition to the caution which must be observed in going beyond the direct consequences of experience, it must be granted that these consequences are hardly satisfactory, and the discovery of a satisfactory way of avoiding them would be very desirable, whether one may imagine time as infinite or as a closed cycle. In any case, we would rather consider the unique directionality of time given to us by experience as a mere illusion arising from our specially restricted viewpoint.

§90. Application to the universe.

Is the apparent irreversibility of all known natural processes consistent with the idea that all natural events are possible without restriction? Is the apparent unidirectionality of time consistent with the infinite extent or cyclic nature of time? He who tries to answer these questions in the affirmative sense must use as a model of the world a system whose temporal variation is determined by equations in which the positive and negative directions of time are equivalent, and by means of which the appearance of irreversibility over long periods of time is explicable by some special assumption. But this is precisely what happens in the atomic view of the world.

One can think of the world as a mechanical system of an enormously large number of constituents, and of an immensely long period of time, so that the dimensions of that part containing our own "fixed stars" are minute compared to the extension of the universe; and times that we call eons are likewise minute compared to such a period. Then in the universe, which is in thermal equilibrium throughout and therefore dead, there will occur here and there relatively small regions of the same size as our galaxy (we call them single worlds) which, during the relatively short time of eons, fluctuate noticeably from thermal equilibrium, and indeed the state probability in such cases will be equally likely to increase or decrease. For the universe, the two directions of time are indistinguishable, just as in space there is no up or down. However, just as at a particular place on the earth's surface we call "down" the direction toward the center of the earth, so will a living being in a particular time interval of such a single world distinguish the direction of time toward the less probable state from the opposite direction (the former toward the past, the latter toward the future). By virtue of this terminology, such small isolated regions of the universe will always find themselves "initially" in an improbable state. This method seems to me to be the only way in which one can understand the second law — the heat death of each single world — without a unidirectional change of the entire universe from a definite initial state to a final state.

Obviously no one would consider such speculations as important discoveries or even — as did the ancient philosophers — as the highest purpose of science. However it is doubtful that one should despise them as completely idle. Who knows whether they may not broaden the horizon of our circle of ideas, and by stimulating thought, advance the understanding of the facts of experience?

That in nature the transition from a probable to an improbable state does not take place as often as the converse, can be explained by assuming a very improbable initial state of the entire universe surrounding us, in consequence of which an arbitrary system of interacting bodies will in general find itself initially in an improbable state. However, one may object that here and there a transition from a probable to an improbable state must occur and occasionally be observed. To this the cosmological considerations just presented give an answer. From the numerical data on the inconceivably great rareness of transition from a probable to a less probable state in observable dimensions during an observable time, we see that such a process within what we have called an individual world — in particular, our individual world — is so unlikely that its observability is excluded.
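The rareness invoked here can be made concrete with a rough sketch. It is a toy illustration, not Boltzmann's own calculation: it assumes N molecules, each independently and equally likely to be found in either half of a container, and asks for the probability that all of them sit in one chosen half at a given instant, which is 2^(-N). The function name and the sample values of N are hypothetical choices for the example.

```python
import math


def log10_prob_all_in_one_half(n_molecules: float) -> float:
    """Return log10 of the probability that all molecules sit in one chosen half.

    Toy assumption: each molecule is independently and equally likely to be
    in either half, so the probability is 2**(-n_molecules). The logarithm
    is returned because the probability itself underflows for large n.
    """
    return -n_molecules * math.log10(2.0)


# A handful of molecules, a hundred, and roughly one mole of gas.
for n in (10, 100, 6.022e23):
    print(f"N = {n:.3g}: log10(probability) = {log10_prob_all_in_one_half(n):.3g}")
```

Under these assumptions, ten molecules gather in one half about once in a thousand looks, while for a mole of gas the exponent itself is of order 10^23, which is the sense in which the observability of such a process is excluded.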

In the entire universe, the aggregate of all individual worlds, there will however in fact occur processes going in the opposite direction. But the beings who observe such processes will simply reckon time as proceeding from the less probable to the more probable states, and it will never be discovered whether they reckon time differently from us, since they are separated from us by eons of time and spatial distances 10^(10^10) times the distance of Sirius — and moreover their language has no relation to ours.*

* For further discussion see H. Reichenbach, The Direction of Time (Berkeley and Los Angeles: University of California Press, 1956).

Very well, you may smile at this; but you must admit that the model of the world developed here is at least a possible one, free of inner contradiction, and also a useful one, since it provides us with many new viewpoints. It also gives an incentive, not only to speculation, but also to experiments (for example on the limit of divisibility, the size of the sphere of action, and the resulting deviations from the equations of hydrodynamics, diffusion, and heat conduction) which are not stimulated by any other theory.

§91. Application of the probability calculus in molecular physics.

Doubts have been expressed as to the permissibility of the applications made of the probability calculus to this subject. Since however the probability calculus has been verified in so many special cases, I see no reason why it should not also be applied to natural processes of a more general kind. The applicability of the probability calculus to the molecular motion in gases cannot of course be rigorously deduced from the differential equations for the motion of the molecules. It follows rather from the great number of the gas molecules and the length of their paths, by virtue of which the properties of the position in the gas where a molecule undergoes a collision are completely independent of the place where it collided the previous time. This independence is of course attained only for an infinite number of gas molecules during an arbitrarily long time. For a finite number of molecules in a rigid container with completely smooth walls it is never completely exact, so that the Maxwell velocity distribution cannot hold throughout all time.

In practice, however, the walls are continually undergoing perturbations, which will destroy the periodicity resulting from the finite number of molecules. In any case, the applicability of probability theory to gas theory is not refuted but rather is confirmed by the periodicity of motion of a finite closed system in the course of eons of time, and since it leads to a model of the world which not only agrees with experience but also stimulates speculations and experiments, it should be retained in gas theory.

Moreover we see that the probability calculus plays yet another role in physics. The calculation of errors by Gauss's famous method is confirmed in purely physical processes, like the calculation of insurance premiums for statistical ones. We have to thank the laws of probability for the fact that in an orchestra the sounds regularly reinforce each other in unison rather than cancelling out by interference; and the same laws also clarify the nature of unpolarized light.
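The orchestra remark can be illustrated with a small numerical sketch. It is only a toy model under stated assumptions: each instrument is represented as a wave of unit amplitude with an independent, uniformly random phase, and the intensity is taken as the squared magnitude of the summed complex amplitudes. The function name, the number of sources, and the number of trials are hypothetical choices. Under these assumptions the average intensity comes out close to the number of sources, so the sounds add up on average instead of cancelling.

```python
import cmath
import math
import random


def mean_intensity(n_sources: int, n_trials: int, seed: int = 0) -> float:
    """Average intensity of n_sources unit-amplitude waves with random phases.

    Each trial draws an independent uniform phase for every source, sums the
    complex amplitudes, and records the squared magnitude of the sum. The
    mean over trials is returned; its expected value is n_sources.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    total = 0.0
    for _ in range(n_trials):
        amplitude = sum(
            cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
            for _ in range(n_sources)
        )
        total += abs(amplitude) ** 2
    return total / n_trials


# Fifty incoherent sources: the mean intensity should be near 50,
# far from both total cancellation (0) and full coherence (2500).
print(mean_intensity(n_sources=50, n_trials=4000))
```

Any single trial can fluctuate widely around this mean; it is only the average over many phase configurations that settles near the sum of the individual intensities.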

Boltzmann could only imagine that a quantum mechanics would appear two decades later in which all atomic motions are statistical.

Since today it is popular to look forward to the time when our view of nature will have been completely changed, I will mention the possibility that the fundamental equations for the motion of individual molecules will turn out to be only approximate formulas which give average values, resulting according to the probability calculus from the interactions of many independent moving entities forming the surrounding medium — as for example in meteorology the laws are valid only for average values obtained by long series of observations using the probability calculus. These entities must of course be so numerous and must act so rapidly that the correct average values are attained in millionths of a second.