Introduction

A brief historical sketch.

After the molecular theory of the structure of matter attained a predominant role in physics, the appearance of new statistical (or probabilistic) methods of investigation in physical theories became unavoidable. From this new point of view each portion of matter (solid, liquid, or gaseous) was considered as a collection of a large number of very small particles. Very little was known about the nature of these particles except that their number was extremely large, that in a homogeneous material these particles had the same properties, and that these particles were in a certain kind of interaction. The dimensions and structure of the particles, as well as the laws of the interaction could be determined only hypothetically.

Under such conditions the usual mathematical methods of investigation of physical theories naturally remained completely powerless. For instance, it was impossible to expect to master such problems by means of the apparatus of differential equations. Even if the structure of the particles and the laws of their interaction were known, their exceedingly large number would have presented an insurmountable obstacle to the study of their motions by such methods of differential equations as are used in mechanics. Other methods had to be introduced, for which the large number of interacting particles, instead of being an obstacle, would become a stimulus for a systematic study of physical bodies consisting of these particles. On the other hand, the new methods should be such that a lack of information concerning the nature of the particles, their structure, and the character of their interaction, would not restrict the efficiency of these methods.

All these requirements are satisfied best by the methods of the theory of probability. This science has for its main task the study of group phenomena, that is, such phenomena as occur in collections of a large number of objects of essentially the same kind. The main purpose of this investigation is the discovery of such general laws as are implied by the gross character of the phenomena and depend comparatively little on the nature of the individual objects. It is clear that the well-known trends of the theory of probability fit in the best possible way the aforementioned special demands of the molecular-physical theories. Thus, as a matter of principle, there was no doubt that statistical methods should become the most important mathematical tool in the construction of new physical theories; if there existed any disagreement at all, it concerned only the form and the domain of application of these methods.

In the first investigations (Maxwell, Boltzmann) these applications of statistical methods were not of a systematic character. Fairly vague and somewhat timid probabilistic arguments do not pretend here to be the fundamental basis, and play approximately the same role as purely mechanical considerations. Two features are characteristic of this early period. First, far-reaching hypotheses are made concerning the structure and the laws of interaction between the particles; usually the particles are represented as elastic spheres, the laws of collision of which are used in an essential way for the construction of the theory. Secondly, the notions of the theory of probability do not appear in a precise form and are not free from a certain amount of confusion which often discredits the mathematical arguments by making them either void of any content or even definitely incorrect. The limit theorems of the theory of probability do not find any application as yet. The mathematical level of all these investigations is quite low, and the most important mathematical problems which are encountered in this new domain of application do not yet appear in a precise form.

It should be observed, however, that the tendency to restrict the role of statistical methods by introducing purely mechanical considerations (based on various hypotheses concerning the laws of interaction of particles) is not restricted to the past. This tendency is clearly present in many modern investigations. According to a historically accepted terminology, such investigations are considered to belong to the kinetic theory of matter, as distinct from statistical mechanics, which tries to reduce such hypotheses to a minimum by using statistical methods as much as possible. Each of these two tendencies has its own advantages. For instance, the kinetic theory is indispensable when dealing with problems concerning the motion of separate particles (number of collisions, problems concerning the study of systems of special kinds, mono-atomic ideal gas); the methods of the kinetic theory are also often preferable, because they give a treatment of the phenomena which is simpler mathematically and more detailed. But in questions concerning the theoretical foundation of general laws valid for a great variety of systems, the kinetic theory naturally becomes sometimes powerless and has to be replaced by a theory which makes as few special hypotheses as possible concerning the nature of the particles. In particular, it was precisely the necessity of a statistical foundation for the general laws of thermodynamics that produced the trends which found their expression in the construction of statistical mechanics. To avoid making any special hypotheses about the nature of the particles, it became necessary, in establishing a statistical foundation, to develop laws which had to be valid no matter what the nature of these particles (within quite wide limits).

The first systematic exposition of the foundations of statistical mechanics, with fairly far developed applications to thermodynamics and some other physical theories, was given in Gibbs' well-known book [J W Gibbs, Elementary principles of statistical mechanics, Yale University Press, 1902]. Besides the above-mentioned tendency not to make any hypotheses about the nature of particles, the following are characteristic of Gibbs' exposition.

(1) A precise introduction of the notion of probability (which is given here a purely mechanical definition) is lacking, with the result that all arguments of a statistical character are of questionable logical precision.

(2) The limit theorems of the theory of probability do not find any application (at that time they were not yet fully developed within the theory of probability itself).

(3) The author considers his task not as one of establishing physical theories directly, but as one of constructing statistical-mechanical models which have some analogies in thermodynamics and some other parts of physics; hence he does not hesitate to introduce some very special hypotheses of a statistical character without attempting to prove them or even to interpret their meaning and significance.

(4) The mathematical level of the book is not high; although the arguments are clear from the logical standpoint, they do not pretend to any analytical rigour.

At the time of publication of Gibbs' book, the fundamental problems raised in mathematical science in connection with the foundation of statistical mechanics became more or less clear. If we disregard some isolated small problems, we have here two fundamental groups of problems representing a broad, deep, interesting and difficult field of research in mathematics which is far from being exhausted even at present. The first of these groups is centred around the so-called ergodic problem, that is, the problem of the logical foundation for the interpretation of physical quantities by averages of their corresponding functions, averages taken over the phase-space or a suitably selected part of it. This problem, originated by Boltzmann, apparently is far from its complete solution even at the present time. This group of problems was neglected by the investigators for a long time after some unsuccessful attempts, based either on some inappropriate hypotheses introduced ad hoc, or on erroneous logical and mathematical arguments (which, unfortunately, have been repeated without any criticism in later handbooks). In the book of Gibbs these problems naturally are not considered because of the tendency to construct models. Only recently (1931), the remarkable work of G D Birkhoff again attracted the attention of many investigators to these problems, and since then this group of problems has never ceased to interest mathematicians, who devote more and more effort to it every year. We will discuss this group of problems in more detail [later].
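The ergodic problem just described admits a compact statement in modern notation. The following sketch uses symbols not in the original text ($f$ for a phase function, $T^t$ for the flow on the phase space $\Gamma$, $\mu$ for the invariant measure), and summarizes the content of Birkhoff's 1931 result mentioned above:

```latex
% Time average of a phase function f along the trajectory starting at x:
\bar{f}(x) = \lim_{T \to \infty} \frac{1}{T} \int_0^T f\bigl(T^t x\bigr)\, dt
% Phase average of f over the invariant measure \mu on the phase space \Gamma:
\langle f \rangle = \int_\Gamma f \, d\mu
% Birkhoff (1931): \bar{f}(x) exists for \mu-almost every x; if the flow is
% ergodic, the time average coincides with the phase average:
\bar{f}(x) = \langle f \rangle \quad \text{for almost every } x .
```

The "logical foundation" at stake is precisely the identification of observed physical quantities (time averages) with the phase averages that the theory can actually compute.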

The second group of problems is connected with the methods of computation of the phase-averages. In the majority of cases, these averages cannot be calculated precisely. The formulas which are derived for them in the general theory (that is, without specification of the mechanical system under discussion) are complicated, not easy to survey, and as a rule, not suited for mathematical treatment. It is quite natural, therefore, to try to find simpler and more convenient approximations for these averages. This problem is always formulated as a problem of deriving asymptotic formulas which approach the precise formulas when the number of particles constituting the given system increases beyond any limit. These asymptotic formulas had been found long ago by a semi-heuristic method (by means of an unproved extrapolation, starting from some of the simplest examples) and remained without rigorous mathematical justification until fairly recent years. A decided change in this direction was brought about by the papers of Darwin and Fowler about twenty years ago. Strictly speaking, these authors were the first to give a systematic computation of the average values; up to that time, such a computation was in most cases replaced by a more or less convincing determination of "most probable" values which (without rigorous justification) were assumed to be approximately equal to the corresponding average values. Darwin and Fowler also created a simple, convenient, and mathematically rigorous apparatus for the computation of asymptotic formulas. The only defect of their theory lies in the extreme abstruseness of the justification of their mathematical method. To a considerable extent this abstruseness was due to the fact that the authors did not use the limit theorems of the theory of probability (sufficiently developed by that time), but created anew the necessary analytical apparatus.
In any case, the course in statistical mechanics published by Fowler on the basis of this method represents up to now the only book on the subject which is on a satisfactory mathematical level.

In closing this brief sketch we should mention that the development of atomic mechanics during the last decades has changed the face of physical statistics to such a degree that, naturally, statistical mechanics had to extend its mathematical apparatus in order to include also quantum phenomena. Moreover, from the modern point of view, we should consider quantized systems as the general type of which the classical systems are a limiting case. Fowler's course is arranged according to precisely this point of view: the new method of constructing asymptotic formulas for phase-averages is established and developed for the quantized systems, and the formulas which correspond to the classical systems are obtained from these by a limiting process.

Quantum statistics also presents some new mathematical problems. Thus, the justification of the peculiar principles of statistical calculation which form the basis of the Bose-Einstein and Fermi-Dirac statistics required mathematical arguments distinct as a matter of principle (not only in their mathematical apparatus) from all those dealt with in classical statistical mechanics. Nevertheless, it can be stated that the transition from classical systems to quantum systems did not introduce any essentially new mathematical difficulties. Any method of justification of the statistical mechanics of classical systems would require for quantized systems only an extension of the analytical apparatus, in some cases introducing small difficulties of a technical character but not presenting new mathematical problems. In places where for classical systems we operate with integrals, for quantized systems we might have to use finite sums or series; continuous probability distributions might be replaced by discrete ones, for which completely analogous limit theorems hold true.
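The remark about sums and integrals can be illustrated schematically. The notation below is a standard modern convention rather than anything in the original text ($Z$ for the statistical sum, $\beta$ for the inverse temperature, $E_n$ for the discrete energy levels, $H(p,q)$ for the Hamiltonian):

```latex
% Quantized system: the statistical sum runs over discrete energy levels E_n
Z_{\text{quantum}} = \sum_{n} e^{-\beta E_n}
% Classical limit: the sum passes into an integral over the phase space,
% with H(p,q) the Hamiltonian of a system of N particles
Z_{\text{classical}} = \frac{1}{N!\, h^{3N}} \int e^{-\beta H(p,q)}\, d^{3N}p \, d^{3N}q
```

The limit theorems invoked for the continuous (classical) distributions have discrete analogues, which is why the passage from one setting to the other is a technical rather than a conceptual matter.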

Precisely for these reasons in the present book we have restricted ourselves to the discussion of the classical systems, leaving completely out of consideration everything concerning quantum physics, although all the methods which we develop after suitable modifications could be applied without any difficulties to the quantum systems. We have chosen the classical systems mainly because our book is designed, in the first place, for a mathematical reader, who cannot always be assumed to have a sufficient knowledge of the foundations of quantum mechanics. On the other hand, we did not consider as expedient the inclusion in the book of a brief exposition of these foundations. Such an inclusion would have considerably increased the size of the book, and would not attain the desired purpose since quantum mechanics with its novel ideas, often contradicting the classical representations, could not be substantially assimilated by studying such a brief exposition.

JOC/EFR August 2006

The URL of this page is:
http://www-history.mcs.st-andrews.ac.uk/Khinchin_introduction.html