with people placed according to
date of their first impact. Do not take the placings too seriously and remember
that a career may last more than 50 years! At
each marker there are notes on developments in the following period. There is
more about Britain
and about economics than there should be but I know more about them.

For further on-line information there are links to

·Earliest Uses (Words and Symbols) for details
(particularly detailed references) on the topics to which the individuals
contributed. (The Words site is organised by letter of the alphabet. See
here
for a list of entries.)

·MacTutor for
fuller biographical information on the ‘principals’ (all but three) and
on a very large ‘supporting’ cast. The MacTutor biographies also cover
the work the individuals did outside probability and statistics. The MacTutor
and References links are to these pages. There is an index
to the Statistics articles on the site.

·The Mathematics
Genealogy Project, abbreviated MGP, which
is useful for tracking modern scholars. The PhD degree is a relatively recent
development and in the UK
a very recent one. See my The Mathematics PhD in the UK.

·Wikipedia for additional
biographies. This is an uneven site but it has some useful articles.

The entries
contain references to the following histories
and books of lives. See below for more literature.

·Leading Personalities in Statistical Sciences from the Seventeenth
Century to the Present, (ed. N. L. Johnson and S. Kotz) 1997. New York: Wiley.
Contains around 110 biographies and is based on entries in the Encyclopedia of Statistical Sciences (ed. N. L. Johnson and S.
Kotz). Abbreviated LP.

·Statisticians of the
Centuries (ed. C. C. Heyde and E. Seneta) 2001. New York: Springer. Contains 105
biographies. The coverage is restricted to individuals born before 1900. Abbreviated SC.

·Actuarial
History. A very
comprehensive collection of links, created by Henk Wolthuis, not only to actuarial science and
demography, but to statistics as well.

To help place individuals I have
used modern terms for occupation (e.g. physicist or statistician). For the
earlier figures these terms are anachronistic but, I hope, not too misleading.
I have not given nationality as people move and states come and go. MacTutor
has plenty of geographical information.

1650-1700 The origins of probability and statistics are usually found
in this period, in the mathematical treatment of games of chance and in the
systematic study of mortality data. This was the age of the Scientific
Revolution and the biggest names, Galileo
(Materials
and Todhunter
ch.I (4-6).) and Newton
(LP) gave some thought to probability without apparently
influencing its development. For an introduction to the Scientific
Revolution, see Westfall’s Scientific
Revolution (1986).

·There
were earlier contributions to probability, e.g. Cardano
(1501-76) gave some ‘probabilities’ associated with dice throwing, but a
critical mass of researchers (and results) was only achieved following
discussions between Pascal
and Fermat and the publication of the first book by Huygens. Hacking, Chapters
1-5, discusses thinking before Pascal. James Franklin’s The Science of Conjecture:
Evidence and Probability Before Pascal (2001) examines this earlier
work in depth. A recent issue of the JEHPS is devoted to Medieval
probabilities.

·Statistics, in the form of population statistics, was created by Graunt. Graunt’s friend William Petty
gave the name Political Arithmetic
to the quantitative study of demography and economics. Gregory King
was an important figure in the next generation. However, the economic line
fizzled out. Adam Smith, the most influential C18 British economist,
wrote, “I have no great faith in
political arithmetic...” Wealth of Nations (1776) B.IV,
Ch.5, Of Bounties.

Blaise Pascal (1623-1662) Mathematician and philosopher. MacTutor References SC, LP. Pascal
was educated at home by his father, himself a considerable mathematician. The
origins of probability are usually found in the correspondence between Pascal and Fermat where they treated several problems associated
with games of chance. The letters were not published but their contents were
known in Parisian scientific circles. Pascal’s only probability publication
was the posthumously published Traité du
triangle arithmétique (1654, published in 1665 and so after Huygens’s work); this treated Pascal’s
triangle with probability
applications. Pascal introduced the concept of expectation and discussed the problem of gambler’s
ruin. Pascal’s
wager is now often read as a pioneering analysis of
decision-making under uncertainty although it appeared, not in his
mathematical writings, but in the Pensées, his reflections on
religion. The last chapter of the Port-Royal
Logic (pp.
365ff) by Pascal’s friends Arnauld
and Nicole has a brief treatment of the use of probability in decision making,
with an allusion to the wager. See Ben Rogers Pascal's
Life & Times, Life
& Work, A.W.F.
Edwards on the triangle and Todhunter
ch.II (pp. 7-21). See also Hald 1990, chapter 5, The
Foundations of Probability Theory by Pascal and Fermat in 1654 and Hacking
1975, chapter
7, The Roannez
Circle (1654) and chapter
8, The Great Decision (1658?).

Christiaan
Huygens (1629-95) Mathematician and physicist. MacTutor References SC, LP. As a youth Huygens was expected
to become a diplomat but instead he became a gentleman scientist, making
important contributions to mathematics, physics and astronomy. He was educated at the University of Leiden
and at the College of Orange at Breda.
He spent 14 years in Paris
at the Académie
des Sciences. Huygens wrote the first book on probability, a pamphlet
really, Van Rekeningh in Spelen van Geluck, translated into Latin by
his teacher van
Schooten as De Ratiociniis in Ludo Aleae (1657) and then into
English as The
Value of all Chances in Games of Fortune etc. Huygens drew on the
ideas of Pascal and Fermat, which he had encountered when
he visited Paris.
Much of the book is devoted to calculating the value or, as it would be
called now, the expectation of a
game of chance. The problems in the book include the gambler’s
ruin, and Huygens also treated the hypergeometric distribution.
His book was widely read and the first part of Jakob Bernoulli’s Ars Conjectandi is a
commentary on it. See Life
& Work and Todhunter
ch.III (pp. 22-5). See also Hald (1990): Chapter 6, Huygens
and De Ratiociniis in Ludo Aleae, 1657 and Hacking (1975), Chapter
11, Expectation. C.J. (Kees) Verduin is constructing an
impressive Christiaan
Huygens site. See Peter Doyle Hedging
Huygens.

John Graunt (1620-74) Merchant. Wikipedia. SC, LP, ESM. Graunt is unique among the figures described here in
not having had a university education. He published only one work, Observations
Made upon the Bills of Mortality (1662). However, through
this work and his friendship with William Petty,
he became a fellow of the Royal Society of London and his work became known
to savants like Halley.
The weekly bills of mortality, which had been collected since 1603, were
designed to detect the outbreak of plague. Graunt
put the data into tables and produced a commentary on them, doing basic
calculations. He discussed the reliability of the data and compared the
number of male and female births and deaths. In the course of Chapter XI, on
estimating the population of London, Graunt produced a primitive life table; see the JIA
articles by Glass, Renn, Benjamin and Seal. The life table became one of the
main tools of demography and insurance mathematics. Halley produced a life
table using data from Caspar
Neumann (SC) in Silesia. See Life &
Work for writings by Graunt, Petty and Halley. See also Todhunter
ch. V (pp. 37-43), Hald (1990): Chapter 7, John Graunt and the
Observations upon the Bills of Mortality, 1662 and
Hacking (1975): Chapter
12, Political Arithmetic 1662.

1700-50 The great leap forward
is Hald’s (1990) name for the decade 1708-1718: there were so many important
contributions to such a greatly expanded subject. The roots of probability and
statistics were quite distinct but by the early C18 it was understood that the
subjects were closely related.

Jakob (James) Bernoulli (1654-1705)
Mathematician. MacTutor References SC, LP, ESM. Eight members of the Bernoulli family have
biographies in MacTutor
(family
tree) and several wrote on probability. The most important
contributors were Jakob, Daniel and Niklaus.
Jakob and his younger brother Johann
were the first of the mathematicians. Jakob studied philosophy at the University of Basel but learnt mathematics on his
own. Eventually he became professor of mathematics at Basel. The posthumously published Ars
Conjectandi (title
page) (1713) was his only probability publication but it
was extremely influential. The first part is a commentary on Huygens’s De Ratiociniis.
The work was an important contribution to combinatorics: the
term permutation originated here.
(The combinatorial side of probability remained important and in late C19
Britain probability and combinatorial analysis were taught together in
algebra courses.) Bernoulli used the terms a priori and a posteriori
to distinguish two ways of deriving probabilities (see posterior
probability): deduction a priori (without
experience) is possible when there are specially constructed devices, like
dice, but otherwise it is possible to make a deduction from many observed
outcomes of similar events. Bernoulli’s theorem, or the law
of large numbers, was the work’s most spectacular contribution but see
also the entries for morally certain and binomial distribution. The eponymous Bernoulli trials, numbers and random variable all refer to this
Bernoulli and his Ars Conjectandi. See
Sheynin ch. 3, Life
& Work (which also has
links for the contributions of Niklaus and Jean III) and Todhunter
ch.VII (pp. 56-77). See Stigler (1986): Chapter
2, Probabilists and the Measurement of Uncertainty. See also Hald (1990): Chapter 15, James Bernoulli and Ars Conjectandi, 1713; Chapter 16, Bernoulli’s
Theorem. Sheynin has recently translated Part IV of
the Ars Conjectandi. A new
translation of the whole work has appeared: The Art of Conjecturing translated by Edith Dudley Sylla (Amazon).
The proceedings of a conference on the Ars Conjectandi are online at the JEHPS: part 1 and part 2.
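
Bernoulli's theorem, in its modern reading, says that the relative frequency of successes in n independent trials becomes increasingly likely to fall close to the underlying probability as n grows. The sketch below uses modern notation rather than Bernoulli's own argument and computes that probability exactly from the binomial distribution. The function name prob_within and the values p = 0.5 and eps = 0.05 are illustrative assumptions, not taken from the Ars Conjectandi.

```python
import math


def prob_within(n: int, p: float, eps: float) -> float:
    """Exact probability that the success frequency k/n lies within eps of p.

    Sums binomial probabilities over every count k with |k - n*p| <= n*eps.
    Intended for moderate n (up to about 1000); much larger n would overflow
    the float conversion of the binomial coefficient.
    """
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
        if abs(k - n * p) <= n * eps
    )


if __name__ == "__main__":
    # The probability of landing within 0.05 of p = 0.5 grows with n.
    for n in (10, 100, 1000):
        print(n, prob_within(n, 0.5, 0.05))
```

With p = 0.5 and eps = 0.05 the probability rises from roughly 0.25 at n = 10 to above 0.99 at n = 1000.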

Abraham de Moivre (1667-1754) Mathematician. MacTutor References SC, LP. De Moivre came to England
from France
as a refugee aged about 20 and, although he gained recognition as a
mathematician and became a fellow of the Royal Society, he never obtained an
academic appointment. De Moivre had read Huygens’s book
before leaving France
but his first paper on probability was published in 1711. In 1718 he
published The Doctrine of Chances: or, a Method of Calculating the
Probability of Events in Play (title
page). He published other pieces on probability, putting the
results into new editions of the Doctrine; the third appeared in 1756.
The book began with an influential definition of probability.
De Moivre obtained the normal approximation to
the binomial distribution (a forerunner of the central limit theorem)
and almost found the Poisson distribution. His technical innovations included the use of probability
generating functions, which
he used to find the distribution of the sum of uniform variables. De Moivre also wrote about life
insurance mathematics when analysing annuities.
See survival function.
Todhunter ranked de Moivre’s contributions very highly, “it will not be
doubted that the Theory of Probability owes more to him than to any other
mathematician, with the sole exception of Laplace.”
(p. 139) Bellhouse & Genest have translated and augmented Maty’s
biography of De Moivre: Statistical
Science 2007 Project
Euclid. See Sheynin ch. 4, Life
& Work and Todhunter
ch.IX (pp. 135-93). See Stigler (1986): Chapter
2, Probabilists and the Measurement of Uncertainty. See also Hald (1990): Chapter 22,
De Moivre and the Doctrine of Chances 1718, 1738 and 1756; Chapter
25, The Life Insurance Mathematics of
de Moivre and Simpson.
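
The normal approximation to the binomial can be checked numerically. The following is a minimal sketch in modern notation, not de Moivre's own derivation: it compares exact binomial probabilities with the normal density that has the same mean and standard deviation. The function names and the choice of n and p are illustrative assumptions.

```python
import math


def binomial_pmf(n: int, k: int, p: float) -> float:
    """Exact binomial probability of k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_density(x: float, mean: float, sd: float) -> float:
    """Normal density with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def max_abs_error(n: int, p: float) -> float:
    """Largest gap, over all k, between the binomial pmf and the normal density."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return max(
        abs(binomial_pmf(n, k, p) - normal_density(k, mean, sd))
        for k in range(n + 1)
    )


if __name__ == "__main__":
    # The gap shrinks as the number of trials grows.
    for n in (100, 400):
        print(n, max_abs_error(n, 0.5))
```

For p = 0.5 the largest discrepancy is already below 0.001 at n = 100 and is smaller still at n = 400.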

Daniel Bernoulli (1700-1782) Mathematician and physicist. MacTutor References SC, LP. Daniel
Bernoulli, a nephew of Jakob Bernoulli, was educated at the
University of Basel where his father Johann
was a professor. Daniel studied medicine—his father insisted—although
subsequently his father agreed to teach him mathematics. Daniel worked in St.
Petersburg and at the University
of Basel. He wrote nine
papers on probability, statistics and demography but is best remembered for
his "Exposition of a New Theory on the Measurement of Risk" (1738):
his theory of choice was based on moral expectation (or expected utility). The theory had a solution for the St.
Petersburg paradox, which
had exposed the difference between the mathematical expectation of a prospect and its value to ‘me’: its
expectation is infinite but its value to me is not. In a prize-winning paper
of 1735 Bernoulli tested for the random distribution of planetary orbits.
Bernoulli devised an urn model for treating the diffusion of liquids but,
although it was discussed by Laplace, such probabilistic
models only became common in the late C19. Another contribution, the Essai
(pp.1-45) of 1766, was an investigation of the consequences of inoculation
against smallpox; see Blower
2004. This paper contained a model of epidemics of the kind that
Ross and McKendrick developed in the C20 (below). Bernoulli also described the method known since Fisher as maximum
likelihood in a 1777 paper “The most probable choice
between several discrepant observations and the formation therefrom of the
most likely induction.” See Life
& Work and Todhunter
ch. XI (pp. 213-38). See Hald
(1998, passim).
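
The gap between mathematical expectation and moral expectation in the St. Petersburg game can be illustrated with a short calculation. The sketch below rests on stated assumptions: the game pays 2^k units if the first head appears on toss k (one common convention), utility is logarithmic, and the player's initial wealth is ignored, which simplifies Bernoulli's own treatment. The function names are hypothetical.

```python
import math


def truncated_expectation(max_tosses: int) -> float:
    """Expected payout if the game is stopped after max_tosses tosses.

    Each term is (1/2)^k * 2^k = 1, so the sum grows without bound.
    """
    return sum((0.5**k) * (2**k) for k in range(1, max_tosses + 1))


def log_certainty_equivalent(max_tosses: int) -> float:
    """Sure payout with the same expected log utility as the truncated game."""
    expected_log = sum(
        (0.5**k) * k * math.log(2) for k in range(1, max_tosses + 1)
    )
    return math.exp(expected_log)


if __name__ == "__main__":
    print(truncated_expectation(50))      # grows linearly with the cutoff
    print(log_certainty_equivalent(200))  # settles near 4 units
```

Under these assumptions the truncated expectation equals the number of tosses allowed, while the certainty equivalent converges to 4 units, since the expected log payout tends to 2 ln 2.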

1750-1800 Probability
established itself in physical science, notably in astronomy, its most developed branch. The most enduring
of these applications to astronomy treated the combination of observations. The resulting theory of errors was the most important ancestor of modern
statistical inference, particularly of estimation theory.

·More
tests of significance
were developed, mainly for use in astronomy, see Daniel Bernoulli
and also John
Michell (1767) Crossley, who calculated the odds that the
Pleiades is a system of stars and not a random agglomeration. See Hald
(1998): Part I, Direct Probability, 1750-1805.

·In
the 1770s Condorcet
started publishing on social mathematics, largely the application of
probability to the decisions of juries and other assemblies. His work had a
strong influence on Laplace and Poisson. Other French authors from this period included D’Alembert
and Buffon;
the former is best remembered for his critical remarks on probability and the
latter for his needle experiment.

Thomas Bayes (1702-1761) Clergyman and mathematician. MacTutor References SC, LP, ESM. Bayes
attended the University
of Edinburgh to prepare
for the ministry but he studied mathematics at the same time. In 1742 Bayes
became a fellow of the Royal Society: the certificate
of election read “We propose and recommend him as a Gentleman of
known Merit, well Skilled in Geometry and all parts of Mathematical and
Philosophical Learning and every way qualified to be a valuable Member of the
Same.” Bayes wrote only one paper on probability, the posthumously published An Essay
towards solving a Problem in the Doctrine of Chances (1763). (For
a statement of the “problem”, see Bayes.) The paper was submitted to the Royal Society by Richard
Price who added a post-script of his own in which he discussed a
version of the rule of
succession. In the paper Bayes refers only to de Moivre and there has been much speculation as to where the
problem came from. Bayesian methods were widely used in the C19, through
the influence of Laplace
and Gauss, although both had second thoughts. Their
Bayesian arguments continued to be taught until they came under heavy attack
in the C20 from Fisher and Neyman. In
the 1930s and 40s Jeffreys was an isolated figure in
trying to develop Bayesian methods. From the 50s onwards the situation
changed when Savage and others made Bayesianism
intellectually respectable, and recent computational advances have made Bayesian methods technically feasible. From the
early C20 there has been a revival of interest in Bayes
himself and he has been much more discussed than ever before. See Bellhouse
biography, Sheynin ch. 5, Life
& Work and Todhunter
ch.XIV (pp. 294-300). See Stigler (1986): Chapter
3, Inverse Probability and Hald (1998): Chapter 8, Bayes, Price
and the Essay, 1764-1765. There is a major new biography, A. I. Dale Most
Honorable Remembrance: The Life and Work of Thomas Bayes.

Pierre-Simon Laplace (1749-1827) Mathematician and physicist. MacTutor References SC, LP, ESM. Laplace wrote on probability over a period of more than 50 years. His 1774 Mémoire sur la probabilité des
causes par les évènemens gave a Bayesian
analysis of errors of measurement. His Théorie
Analytique des Probabilités (title
page) (1812 and further editions in -14, -20 and -25) was
by far the biggest thing in probability yet. Laplace
made many contributions, producing results like the central limit theorem
(see also the normal
distribution) and developing tools including the probability generating function
and the characteristic
function. His system was based on classical probability
but the superstructure outgrew the foundations. In
Britain Laplace was admired by his C19 readers including De
Morgan (review
of Théorie Analytique) and Edgeworth but, in the C20, his thought tended to be reduced
to the popular Philosophical Essay on Probability and
discussion often focussed on debatable items like the rule of succession. Fisher, for instance, had a very
contracted view of Laplace’s work. Although Laplace gets far more space in Todhunter
than any other author, the coverage of Laplace’s
estimation theory ends in 1814. In early C20 France (see Lévy)
Laplace seems also to have been largely
forgotten. Hald (1998) does justice to Laplace by giving him about 400
pages and by presenting subsequent work as a series of footnotes to Laplace. See
Sheynin ch. 7, Life
& Work and Todhunter
ch. XX (pp. 464-614). See Stigler (1986): Chapter
3, Inverse Probability; Chapter 4, The Gauss-Laplace Synthesis.
Laplace’s works are available on Gallica.

·Gauss found a second important
application of least squares in geodesy. Geodesists
made important contributions to least squares, particularly on the
computational side—not surprisingly as the calculations could be on an
industrial scale. The eponyms, Gauss-Jordan
and Cholesky (MacTutor),
honour later geodesists. Helmert
(Helmert's transformation)
and Paolo Pizzetti
were geodesists who contributed to the theory of errors. At least one
important C20 statistician started as a surveyor, Frank
Yates, Fisher’s colleague and successor at
Rothamsted.

·In Britain the
first census of the population was taken in 1801. It ended a controversy about
the size of the population that began when Bayes’s friend Price argued that the population had been falling in the
C18. Numerous writers, including Eden,
came up with their own estimates.

·Around this time William Playfair
was finding new ways of representing data graphically but nobody was paying
attention. Techniques slowly accumulated over the next 150 years without the
idea of graphical statistics as a study in its own right gaining ground. That
idea is quite recent and mainly associated with Tukey. See
Milestones.

·The age of the academies
was over and from now on the main advances took place in universities. The
French education system was transformed in the course of the Revolution and the
C19 saw the rise of the German university.

See Stigler (1986): Part
I, The Development of Mathematical Statistics in Astronomy and Geodesy before
1827 and Hald (1998): Part III The Normal Distribution, the
Method of Least Squares and the Central Limit Theorem. See
Life
& Work. See also L. Daston (1988) Classical
Probability in the Enlightenment.

Carl Friedrich Gauss (1777-1855)
Mathematician, physicist and geodesist. MacTutor References SC, LP. Gauss is generally regarded
as one of the greatest mathematicians of all time and his contributions to
the theory of errors
were only a small part of his total output. Gauss spent most of his working
life at the University of Göttingen, which became the main centre for
mathematics in Germany.
Initially Gauss was interested in treating astronomical observations but
later he became involved in geodesy. Gauss used the method of least squares
for which he gave two rationalisations. The first in The Theory of the
Motion of Heavenly Bodies moving around the Sun in Conic Sections (1809)
assumed normal
errors and used a Bayesian argument with a uniform prior on the
coefficients; the normal or Gaussian
curve was derived from the principle of the arithmetic mean. Gauss
departed from this Bayesian position in 1816 when he investigated the ‘efficiency’ of
different estimators of precision. In the Theory of the combination of
observations least subject to error (1821/3) he presented the Gauss-Markov theorem.
Gauss’s way of writing the Gaussian distribution is described on Symbols associated with normal
distribution; the associated terminology is described in mean error and modulus. Gauss’s
influence on the combination of observations in astronomy and geodesy was
very strong. One of his followers was the astronomer Bessel
(see probable error).
In the early C20 Cambridge astronomers taught Gauss Mark I to Fisher
and Jeffreys amongst others. See Sheynin ch. 9, Life &
Work. See Stigler (1986): chapter 4, The Gauss-Laplace
Synthesis and Hald (1998): Chapter 21, Gauss’s Theory of Linear
Unbiased Minimum Variance Estimation, 1823-1828.

1830-1860 This
period saw the emergence of the statistical society, which has been on
the stage ever since (although the meaning of “statistics” has changed), and the
beginning of a philosophical literature on probability. It also saw the
beginning of the most glamorous branch of empirical time series analysis, the
study of the sunspot cycle.

·Since 1840, or so, there
has been a philosophical literature on probability. The English
literature begins with the extensive discussion of probability in John Stuart
Mill’s System of Logic (1843). This was followed by The
Logic of Chance (1866) of John
Venn, the Principles of Science (1874) of W. Stanley Jevons
and the Grammar of Science (1892) of Karl Pearson.
The American scientist/philosopher C.S.
Peirce (Stanford)
wrote extensively on probability, although he was not much read. There was an
overlapping literature on logic and probability.
De
Morgan can be placed here as well as Boole
(LP) whose An Investigation of the
Laws of Thought, on which are founded the Mathematical Theories of Logic and
Probabilities (1854) contained a long discussion of probability. Later
figures are mentioned below. There were German and French
literatures as well but philosophical probability was less international than
mathematical probability. See Porter’s Rise of Statistical
Thinking.

·In 1843 Schwabe observed that sunspot activity is periodic. There followed
decades of research, not only in solar physics but in terrestrial magnetism,
meteorology and even economics examining series to see if their periodicity
matched that of the sunspots. Even before the sunspot craze there was intense
interest in periodicity in meteorology, tidology, and other branches of observational physics and, by
the end of the century, seismology was becoming important. Both Laplace and Quetelet had analysed meteorological
data and Herschel had written a book on the
subject. The techniques in use varied from the simple, such as the Buys
Ballot table, to the sophisticated, such as forms of harmonic
analysis.
At the end of the century the physicist Arthur Schuster
introduced the periodogram.
However, by then a rival form of time series analysis, based on correlation
and promoted by Pearson, Yule, Hooker and others, was
taking shape. For an account see J. L. Klein (1997) Statistical
Visions in Time, Cambridge:
Cambridge University Press.

Lambert Adolphe Jacques Quetelet (1796-1874)
Astronomer and statistician. MacTutor References SC, LP, ESM. Adolphe Quetelet studied at the University of Ghent
but his career was centred on Brussels.
His great energies were first concentrated on establishing an observatory
there. In 1824 he went to Paris
for three months to learn about such things and met the great French
scientists; Quetelet seems to have learnt about probability from Fourier. He returned an enthusiast for probability and
proposed improvements in census taking. His book introducing “social
physics”, Sur l'homme et le
développement de ses facultés, essai d'une physique sociale (1835),
introduced the “average man” (l'homme moyen). His very widely read Letters
on Probability (1846) publicised the use of the normal distribution, not
as an error law, but for describing the distribution of measurements.
Quetelet was a tireless scientific entrepreneur both at home and
internationally, e.g. he played an important part in establishing the London
Statistical Society (above). His work was not well
received by the social scientists of the C19, although his initiative was
admired by the economist Stanley Jevons
and Galton continued his work. In the early C20 J. M.
Keynes (Treatise on Probability) wrote of Quetelet, “There is scarcely
any permanent, accurate contribution to knowledge, which can be associated
with his name. But suggestions, projects, far-reaching ideas he could both
conceive and express, and he has a very fair claim, I think, to be regarded
as the parent of modern statistical method.” See Life &
Work. Coven’s A History of
Statistics in the Social Sciences is a study of Quetelet. See Stigler
(1986): Chapter 5, Quetelet’s Two Attempts. Hald (1998) 26.3. Quetelet
on the Average Man, 1835, and on the Variation around the Average, 1846.

1860-1880 Two
important applied fields opened up in this period. Probability found a
major new application in physical science, to the theory of gases, which
developed into statistical mechanics. Problems in statistical mechanics were behind many
of the probability advances of the early C20. The statistical study of heredity developed into
biometry and many of the advances in statistical theory
were associated with this subject. There were important geographical
changes, as important work in probability
started to come from Russia
and important statistical work from England.

·Galton
inaugurated the statistical study of heredity,
work continued way into the C20 by Pearson and Fisher.
Correlation was the most distinctive contribution of this “English” school. See
Stigler (1986): Part III, A Breakthrough in Studies of Heredity.

·By contrast, the so-called
“continental direction” investigated the appropriateness of simple urn models
for treating birth and death rates by considering the stability of the series
of rates over time. See Wilhelm Lexis, Theorie der Massenerscheinungen
in der menschlichen Gesellschaft (1877). Bortkiewicz, Markov, Chuprov and Anderson all worked in this tradition. See Stigler
(1986): Chapter 6, Attempts to revive the Binomial, C. C. Heyde & E.
Seneta I. J. Bienaymé: Statistical Theory Anticipated, 1977 and Sheynin ch. 15.1.

·‘Higher’ statistics entered
psychology and economics.
For psychology see Fechner. In economics W. Stanley Jevons
(Wikipedia, MacTutor, New School)
(SC) saw himself as continuing the work of the political
arithmeticians of 1650+. In the intervening two centuries
much had been done and Jevons’s work on index numbers was
inspired by the theory of errors, while his research on economic time series
was inspired by the work of meteorologists on seasonal variation and of
physicists on the solar cycle and its terrestrial correlates (see above). Jevons also tried to link his mathematical economic
theory (see utility)
to statistical analysis—a project revived in the econometrics of the C20.

Ludwig Boltzmann (1844-1906)
Physicist. MacTutor References.
MGP. LP. Boltzmann, with Gibbs,
was responsible for transforming Maxwell’s
probabilistic theory of gases into statistical mechanics. Boltzmann was
awarded a doctorate from the University
of Vienna in 1866 for a
thesis on the kinetic theory of gases. He had appointments at the
universities of Graz and Leipzig
as well as Vienna.
Statistical mechanics required solutions to problems in distribution theory
and also generated conceptual problems. Boltzmann gave the χ²
distribution for 2 and 3 degrees of freedom (1878) and for n (1881). Ernst
Abbe (LP), a physicist working in the theory of errors,
had already obtained the distribution in 1863 and Pearson
was to obtain it again in 1900. One of the conceptual innovations was entropy. Zermelo
argued that the irreversibility in Boltzmann’s thermodynamics
contradicted the recurrence properties of dynamic systems. Boltzmann’s
student Paul
Ehrenfest devised the Ehrenfest urn model to
show that the contradiction was only apparent. One of Ehrenfest’s students
was Uhlenbeck
whose 1930 paper on Brownian motion became part of the probability
literature. Boltzmann thought of
the proper average values to identify with macroscopic features as being
averages over time of quantities calculable from microscopic states. He
wished to identify the phase averages with such time averages. In the 1930s Birkhoff
and von
Neumann produced ergodic
theorems that addressed the problem. The Stanford Encyclopedia has two
relevant articles: J. Uffink “Boltzmann's
Work in Statistical Physics” and L. Sklar “Philosophy
of Statistical Mechanics.” See also von Plato passim.
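
The Ehrenfest urn model is easy to simulate, which makes the point about irreversibility concrete. In the usual textbook form, assumed here, N numbered balls are split between two urns and at each step one ball is chosen uniformly at random and moved to the other urn. The sketch below is an illustration only; the function name, the seed and the parameter values are assumptions.

```python
import random


def ehrenfest_path(n_balls: int, n_steps: int, seed: int = 0) -> list[int]:
    """Simulate the number of balls in urn A, starting with all balls in A."""
    rng = random.Random(seed)
    in_a = n_balls
    path = [in_a]
    for _ in range(n_steps):
        # The chosen ball lies in urn A with probability in_a / n_balls.
        if rng.randrange(n_balls) < in_a:
            in_a -= 1  # ball moves from A to B
        else:
            in_a += 1  # ball moves from B to A
        path.append(in_a)
    return path


if __name__ == "__main__":
    path = ehrenfest_path(n_balls=100, n_steps=20000)
    late = path[len(path) // 2:]
    print(sum(late) / len(late))  # hovers around n_balls / 2
```

Each step is reversible, yet a start far from equilibrium drifts towards an even split and then fluctuates around it; a return to the initial state is possible but, for large N, takes an extremely long time.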

Gustav Theodor
Fechner (1801-1887) Physicist and psychologist. Wikipedia. SC. Fechner went to the University of Leipzig
to study medicine but did not qualify as a doctor. He developed a
strong interest in the relationship between mind and matter and he switched
to physics; his impressive experimental work earned him a chair in physics in
1834. In the mid 1850s he started the experiments that formed the basis of his Elemente
der Psychophysik (Elements of Psychophysics) (1860). In his psychophysics Fechner
emphasised the Weber-Fechner
law relating sensation to stimulus. Stigler emphasises the
book’s contribution to experimental method and assesses the treatment of
experimental design as the “most comprehensive” before Fisher’s Design of Experiments (1935). One of the techniques Fechner introduced
was an ancestor of probit
analysis. In all this work Fechner used probability ideas that he
had known from his work in physics. His second major work was the
posthumously published Kollektivmasslehre (1897). The latter
anticipates the ideas of von Mises on collectives. Hermann Ebbinghaus
was inspired by Fechner’s Psychophysics to start his own experimental
work on memory; these researches were published in his Über das Gedächtnis (1885). See Life
& Work (for both Fechner and Ebbinghaus) and Stigler
(1986): Chapter 7, Psychophysics as a Counterpoint. See also O.
Sheynin (2004) Fechner as a Statistician, British
Journal of Mathematical and Statistical Psychology, 57, 53-72.

Francis Galton (1822-1911) Man of science. MacTutor References SC, LP, ESM. After studying mathematics at Cambridge University
and medicine in Birmingham and London, Galton spent
some years exploring Africa; his eminence as
an African explorer and geographer led to his election to the Royal
Society in 1860. Galton became interested in the phenomena
of heredity in the 1860s and most of his contributions to statistics arose
out of that study. Apart from his books Hereditary
Genius (1869) and
Natural
Inheritance (1889)
he wrote many articles. His cousin Charles Darwin,
whom he advised on statistical matters, was an important influence, as was Quetelet, although he disagreed with him on many points. (Like
Darwin,
Galton was rich enough to be a gentleman scholar.) The normal distribution
played an important part in Galton’s work. He is most remembered for
introducing the methods of correlation
and regression.
He often involved other, better, mathematicians, including George
Darwin, Glaisher, Watson, MacAlister, Pearson, Edgeworth and Sheppard
(of Sheppard's corrections),
in his problems. His work on the lognormal
distribution and branching
processes came about from his posing problems to mathematician
friends. Galton contributed a large number of terms to statistics, including
many of those used in elementary statistics, e.g. ogive, percentile and inter-quartile range. Apart from his influence on biometry, Galton had a
strong influence on the development of psychology,
especially in Britain,
both because he wrote about psychology
and because the statistical tools he developed were taken up by psychologists
like Spearman.
See Life
& Work. Much of Galton’s vast output is available on Gavan
Tredoux’s Francis Galton.
There are two recent biographies: N. W. Gillham A Life of Sir Francis
Galton: From African Exploration to the Birth of Eugenics (2002) and M.
G. Bulmer Francis Galton: Pioneer of Heredity and Biometry (2003). See
Stigler (1986): Chapter 8, The English Breakthrough: Galton.

1880-1900 In this period the English statistical school
took shape. Pearson was the
dominant force until Fisher displaced
him in the 1920s. The school dominated statistics until the Second World War.
T. Schweder’s Early Statistics in the Nordic Countries considers
why this did not happen in Scandinavia.

·Galton introduced correlation and a
theory was rapidly developed by Pearson, Edgeworth, Sheppard and Yule. Correlation was a major departure from the statistical
work of Laplace and Gauss, both as a technique and because of the applications
it made possible. It became widely used in biology, psychology and social
science.

·In economics Edgeworth developed
some of Jevons’s ideas, most
notably on index numbers. However, economic statistics in Britain was more closely tied to
official statistics or financial journalism and Newmarch (1820-82) and Giffen (1837-1910) were more representative than Jevons or
Edgeworth. In Italy Vilfredo Pareto discovered a statistical
regularity in the distribution of income; see Pareto
distribution.

F. Y. Edgeworth (1845-1926) Economist and
statistician. MacTutor References.
SC, LP, ESM. Edgeworth studied classics at
Trinity College Dublin and Balliol College Oxford. From around 1880 he
followed dual careers in economics and in statistics. Edgeworth seems to have
been self-taught in mathematics and he made a thorough study of the subject
and remained very well read. He began in statistics by subjecting the casual
statistical methods of Jevons to rigorous examination and started what turned
out to be a long involvement with index numbers (see
“Money” in his Papers
relating to Political Economy, vol. 2). However, most of his
extensive publications in statistical theory were not motivated by
economic applications, or direct applications of any kind. In 1892
Edgeworth, prompted by Galton, examined correlation and
methods of estimating correlation coefficients. Another concern, which led to
a stream of papers, was with generalisations of the normal distribution, as
in e.g. his 1905 paper “The law of error”. The Edgeworth expansions
that came from this research are now associated with distributions of
estimators and test statistics but Edgeworth originally envisaged these
distributions used for data distributions, as an alternative to the Pearson curves.
Edgeworth’s starting point was Laplace. Much of his work
was not followed up, like his 1908/9 papers “On the probable errors of
frequency-constants” which anticipated some of Fisher’s large sample theory for maximum likelihood.
Unlike his contemporaries, Pearson in statistics and Alfred Marshall in
economics, Edgeworth founded no school. From 1891 he was professor of
political economy at Oxford.
He had no students in Statistics and his only follower was Arthur Bowley,
whose reputation rests on work in economic statistics and social surveys. See
Life
& Work and Francis
Ysidro Edgeworth. See Stigler (1986):
Chapter 9, The Next Generation: Edgeworth. His statistics papers are
collected in the 3-volume set F. Y. Edgeworth, Writings in Probability,
Statistics, and Economics edited by Charles Robert McCann, Jr.

Karl Pearson (1857-1936)
Biometrician, statistician & applied mathematician. MacTutor References. SC, LP, ESM. Karl Pearson read mathematics at Cambridge but made his career at University
College London. Pearson was an established applied mathematician when
he joined the zoologist W. F. R.
Weldon and launched what became known as biometry; this found
institutional expression in 1901 with the journal Biometrika.
Weldon had come to the view that “the problem of animal evolution is
essentially a statistical problem” and was applying Galton’s statistical methods. Pearson’s contribution consisted
of new techniques and eventually a new theory of statistics based on the Pearson curves, correlation, the method of
moments and the chi square test. Pearson
was eager that his statistical approach be adopted in other fields and
amongst his followers was the medical statistician Major Greenwood.
Pearson created a very powerful school and for decades his department was the
only place to learn statistics. Yule, Irwin, Wishart
and F.
N. David were among the distinguished statisticians who started their
careers working for Pearson. Among those who attended his lectures were the
biologist Raymond
Pearl, the economist H. L. Moore,
the medical statistician Austin Bradford
Hill and Jerzy Neyman;
in the 1930s Wilks
was a visitor to the department. In France Lucien March
was a follower. Pearson’s influence extended to Russia where Slutsky
(see minimum chi-squared method)
and Chuprov
were interested in his work. Pearson had a great influence on the language
and notation of statistics and his name often appears on the Words
pages and Symbols
pages—see e.g. population,
histogram and standard deviation. When
Pearson retired, his son E.
S. Pearson inherited the statistics part of his father’s empire—the
eugenics part went to R. A. Fisher. Under ESP (who retired
in 1961) and his successors the department continued to be a major centre for
statistics in Britain.
M. S. Bartlett went
there as a lecturer after graduating from Cambridge in 1933 (his teacher was Wishart)
and again as a professor when ESP retired. For more on KP see Karl Pearson: A Reader’s Guide. See Stigler
(1986): Chapter 10, Pearson and Yule.

1900-1920 In
the years before the Great War of 1914-18 probability and statistics were
expanding in all directions. During the war research in statistics and
probability almost stopped as people went into the armed services or did other
kinds of war work. Pearson, Lévy and Wiener worked in ballistics, Jeffreys in meteorology and Yule in
administration. For the mathematicians’ traditional role in war, see The Geometry of War.

·In
1900 David
Hilbert proposed a set of problems
for the C20. The 6th was, “to treat … by means of axioms,
those physical sciences in which mathematics plays an important part; in the
first rank are the theory of probabilities and mechanics.” Measure
theory which would have a key
role in the axiomatisation of probability was being created by Borel, Lebesgue
and others; see below.

·Mendel did not use probability in his work on genetics
(published 1866) but his ideas were
probabilised as Pearson, Yule and Fisher
investigated how far his principles could rationalise the findings of the
biometricians.

·Industrial
applications of probability begin
with Erlang’s
work on congestion in telephone systems, the ancestor of modern queuing theory.

·Institutional
developments include the creation in 1911 of the Department of Applied Statistics at UCL
headed by Pearson. Also in London,
at the London School of Economics, Bowley
became the first (full-time) Professor of Statistics in Britain. At Cambridge a University Lectureship in Statistics was
created in 1912. Yule, who got the job, might be
called the first modern statistician—his expertise was in statistics
(rather than in mathematics more broadly or in science like astronomy or
biology) and he applied this expertise to anything that interested him.

See Hald (1998, Part IV)
and von Plato (ch. 3) “Probabilities in Statistical Physics.” For
developments in economics see M. S. Morgan A History of Econometric Ideas,
Cambridge 1990.

G. Udny Yule (1871-1951) Statistician. MacTutor References. Wikipedia. SC, LP, ESM. After
training as an engineer at University College London and studying in Germany,
Yule returned to work for Karl Pearson in 1893. He was
soon contributing to the theory of correlation and regression and after
1900 he developed a parallel theory of association for
attributes. Yule applied Pearson’s statistical techniques to social problems—see e.g. the entry on Pearson curves—and,
with Edgeworth and Bowley,
he was one of the few members of the Royal Statistical Society
interested in mathematical statistics. Yule had broad interests and
his collaborators included the agricultural meteorologist R. H. Hooker,
the medical statistician Major Greenwood
(see negative binomial
distribution for their study of the incidence of disease and
accidents) and the agriculturalist (Sir) Frank Engledow. Yule’s sympathy
towards the newly rediscovered Mendelian theory of genetics led to several
papers; one involved developing the minimum chi-squared method
of estimation. In the 1920s he wrote
important papers on time
series analysis: “On the time-correlation problem” (1921) was a
critique of the variate
difference method; “Why Do We Sometimes Get Nonsense Correlations
between Time-series?” (1926) investigated a form of spurious correlation;
“On a Method of Investigating Periodicities in Disturbed Series, with Special
Reference to Wolfer's Sunspot Numbers” (1927) used an autoregressive model
in place of the usual periodic trend of harmonic analysis; see
periodogram
analysis and Yule-Walker
equations. (above) Yule taught at Cambridge
for nearly 20 years yet he seems to have had little influence on the students
there. His successor, John
Wishart, had more impact. Although Cambridge
was the outstanding centre in Britain
for training mathematicians, statistics was slow in becoming established
there: see Cambridge
history. Yule’s Introduction to the Theory of Statistics
(1910) was influential and widely-used, especially after it was updated by Maurice
Kendall in 1937. See Stigler
(1986): Chapter 10, Pearson and Yule.

A. A. Markov (1856-1922) Mathematician. MacTutor References. SC, LP. Markov spent his working life at
the University of St. Petersburg.
Markov was, with Lyapunov,
the most distinguished of Chebyshev’s students in probability.
Markov contributed to established topics such as the central limit theorem
and the law of large numbers.
It was the extension of the latter to dependent variables that led him to
introduce the Markov chain.
He showed how Chebyshev’s
inequality could be applied to the case of dependent random
variables. In statistics he analysed the alternation of vowels and consonants
as a two-state Markov chain and did work in dispersion theory. Markov had a
low opinion of the contemporary work of Pearson,
an opinion not shared by his younger compatriots Chuprov
and Slutsky.
Markov’s Theory of Probability was an influential textbook. Markov
influenced later figures in the Russian tradition including Bernstein
and Neyman. The latter indirectly paid tribute to Markov’s
textbook when he coined the term Markoff theorem for the result Gauss had obtained in 1821; it is now known as the Gauss-Markov theorem.
J. V. Uspensky’s Introduction to Mathematical Probability (1937) put
Markov’s ideas to an American audience. See Life
& Work. There is an interesting volume of letters, The
Correspondence between A.A. Markov and A.A. Chuprov on the Theory of
Probability and Mathematical Statistics, ed. Kh.O. Ondar (1981, Springer). See also Sheynin ch. 14 and G. P. Basharin
et al. The
Life and Work of A. A. Markov.

‘Student’ = William Sealy Gosset (1876-1937)
Chemist, brewer and statistician. MacTutor References. Wikipedia. SC, LP. Gosset
was an Oxford-educated chemist who worked not in a university but for
Guinness, the Dublin
brewer. Gosset taught himself the theory of errors from the textbooks by Mansfield Merriman
and Airy. His career as a publishing statistician began after he studied for
a year with Karl Pearson. In his first published paper
‘Student’ (as he called himself) rediscovered the Poisson distribution.
In 1908 he published two papers on small sample distributions, one on the
normal mean (see Student's t
distribution and Studentization)
and one on normal correlation (see Fisher’s z-transformation).
Although Gosset’s fame rests on the normal mean work, he wrote on other
topics, e.g. he proposed the variate
difference method to deal with spurious correlation.
His work for Guinness and the farms that supplied it led to work on
agricultural experiments. When his friend Fisher made randomization central
to the design of experiments Gosset disagreed—see his review
of Fisher’s Statistical Methods. Gosset was not very interested in
Pearson’s biometry
and the biometricians were not very interested in what he did; the normal
mean problem belonged to the theory of errors and was more closely related to
Gauss and to Helmert
than to Pearson. Gosset was a marginal figure until Fisher
built on his small-sample work and transformed him into a major figure in C20
statistics; for his relations with Fisher, see Fisher
Guide. Another admirer
was E.
S. Pearson. For a sample of Gosset’s humour see the entry kurtosis. See Life
& Work.

1920-1930 Many of the people who would dominate probability and
statistics over the following decades first made an impact. Of them, the
individual who had the greatest hold over his subject was Fisher
in statistics. The ascendancy of Fisher was also the ascendancy of the
English language. While German was the international scientific language of the
time—and the language of probability—Fisher and his followers rarely referred
to literature in German, believing that that literature ended with Gauss, although Helmert
was later added to the canon. Thus standard works, like those by Czuber,
did not cross the Channel.

·The
foundations of probability received much
attention and certain positions found classic expression: the logical
interpretation of probability (degree of reasonable belief) was propounded by
the Cambridge philosophers, W.
E. Johnson, J.
M. Keynes and C.
D. Broad, and presented to a scientific audience by Jeffreys; Keynes was influenced by the German
physiologist/philosopher J. von Kries.
The frequentist
view was developed by von Mises.

Richard von Mises (1883-1953) Applied
mathematician. MacTutor.
References. MGP. SC, LP. Mises was educated at the
Technische Hochschule in Vienna.
From 1919 he was director of the Institute
of Applied Mathematics at the University of Berlin. He used probability in his
work in physics, e.g. von
Mises distribution, and wrote on mainstream probability topics,
e.g. central limit theorem,
but he is most famous for his work on the foundations of probability. In 1919
he published his “Grundlagen
der Wahrscheinlichkeitsrechnung,” (p. 52) which expounded his frequentist
interpretation of probability, based on the notion of a collective. The paper contained
other innovations, including the “label space” (sample space) and the distribution function.
At the time the Mathematische
Zeitschrift, which published these early papers, was the most
important German outlet for work in probability. Von Mises published two books on
probability, the widely read Probability, Statistics and Truth (1928)
and a comprehensive textbook (1931). His position on statistical inference
was—surprisingly—Bayesian. In 1933 he left Germany
for Turkey and, in 1938,
moved again to the United
States, where he became Professor of
Aerodynamics and Applied Mathematics at Harvard. Mises influenced many
writers on probability in the 20s and 30s, including Kolmogorov.
Among those who worked to make the collective rigorous in the 1930s were Wald and the logician Alonzo
Church. Hilda
Geiringer has written a history of probability from a
Misean standpoint, see Probability: Objective Theory. See also
von Plato (ch. 6) Von Mises’ frequentist probabilities.

Harold Jeffreys (1891-1989) Applied mathematician and physicist. MacTutor References. SC, LP. Jeffreys has a good claim to be
considered the first Bayesian statistician in that he used only Bayesian methods. Jeffreys
arrived to study mathematics at Cambridge University
a year after Fisher and he spent his life there
working on astronomy and geophysics. Unlike von Mises,
Jeffreys was not primarily interested in probability as a means of modelling
physical processes but in probability in relation to scientific inference.
With Dorothy
Wrinch, he produced a series of papers between 1919 and 1923. (See the
entries Bayes
and posterior probability.)
They were influenced by the approach to probability taken by the Cambridge philosophers W.
E. Johnson, J.
M. Keynes and C.
D. Broad; all would now be described as Bayesians. From his
earliest work Jeffreys had used least squares (which he had learnt from the
astronomer Eddington)
in his empirical work but around 1930 he started to devise new methods and to
reconstruct the old in accordance with his theory of probability. He did
extensive empirical work, the best known being in collaboration with K. E.
Bullen on earthquake travel times. Jeffreys also studied Fisher’s statistical work and
adopted some of his concepts and terminology, e.g. likelihood. Jeffreys’s big
book Theory of Probability (1939) combined a philosophy of probability
with a reworking of the “modern statistics” of Fisher and Pearson—all
founded on the principles of inverse
probability. In 1946 Jeffreys completed his system by providing a
rule for choosing priors—see Jeffreys
prior. Statisticians (including those who attended his lectures as
students!) showed little interest in Jeffreys’s work until the Bayesian revival of the
1960s. The physicist E. T.
Jaynes (1922-98) was strongly influenced by him. See Symbols in Probability
for Jeffreys’s contribution to notation. See Life &
Work. For more information see
Harold
Jeffreys as Statistician.

Norbert Wiener (1894-1964) Mathematician.
MacTutor References. MGP. SC, LP. Wiener’s working life was spent
at MIT. He was well-travelled, having studied at Harvard, Cambridge University
and Göttingen. He studied mathematical logic at Cambridge
with Russell
but he was to become closer to the Cambridge
analysts especially Hardy
and Paley
with whom he collaborated. Among contemporary probabilists his strongest
links were with Lévy. Wiener’s earliest work on probability
treated Brownian motion
(see also Wiener process)
where he used the new Daniell
integral; see here
for a detailed account of his relations with Daniell. In 1930 Wiener
presented a generalized harmonic
analysis, which had a mathematical model of the spectrum, and
developed the periodogram analysis introduced by the physicist Arthur Schuster at the end of the
C19. (above) Much of Wiener’s work ran parallel to that of
the Russian probabilists Khinchin and Kolmogorov
but the relationship between their work and his did not emerge until later.
Wiener worked with engineers, in particular with Y. W. Lee; see The
Lee-Wiener Legacy. In the Second World War Wiener
developed a theory of prediction to be used in fire control systems. His
wartime report was published as Extrapolation, Interpolation and Smoothing
of Stationary Time Series (1949) (see filter and autocorrelation).
Wiener devised the subject of cybernetics
as an umbrella to cover his various interests. P. R.
Masani’s biography Norbert Wiener
(1990) is reviewed in Mathematical
Reviews. Wiener’s papers are at MIT.

Aleksandr
Yakovlevich Khinchin (1894-1959)
Mathematician. MacTutor References. MGP. Publications. Khinchin was a student at Moscow State University
and spent almost all his working life there. Khinchin, like Lévy
and Doob, started in analysis. The university had a very
strong analysis group and Khinchin’s supervisor was Luzin.
There was no tradition of work in probability until, that is, Khinchin and Kolmogorov created one. There do not seem to have been any
personal links with the Chebyshev/Markov
tradition at St. Petersburg.
Khinchin was drawn into probability through an interest in the theory of
numbers. The law of the
iterated logarithm (1924) had an existence in number theory, as
did the strong law of large
numbers. While in the 20s Khinchin worked on sequences of
independent random variables, in the 30s he developed the theory of stochastic processes
and, in particular, that of stationary
processes. In the 1940s he applied his probabilistic techniques
to the theory of statistical mechanics in a book Mathematical Principles
of Statistical Mechanics (1943). In the 50s Shannon's
information theory
(1948) was generating much more interest in probability and statistics
circles than had earlier work on communication. Khinchin contributed to the absorption of
these concepts with his Mathematical Foundations of Information Theory.

1930-1940 Against
a calamitous economic and political background there
were important developments in probability, statistical theory and applications.
In the Soviet Union mathematicians fared better than economists or geneticists
and in the early years they could travel abroad and publish in foreign
journals; thus Kolmogorov and Khinchin
published in the main German periodical, Mathematische
Annalen (GDZ).
In Germany Jews were barred from academic jobs from 1933.

·In probability
the main developments were Kolmogorov’s axiomatisation of probability and the
development of a general theory of stochastic
processes by him and Khinchin. This work is
usually seen as marking the beginning of modern probability. See von Plato (ch. 7) “Kolmogorov’s measure
theoretic probabilities.” Most of the activity was in the Soviet Union and France but the United States began to play a
bigger role in the course of the decade.

·In the foundations of probability Bruno de
Finetti and Frank Ramsey’s (1903-1930) (St.
Andrews, N.-E.
Sahlin) work on subjective probability appeared. Ramsey started by
criticising the Cambridge
logical school (see Jeffreys), in particular Keynes. A
statistical superstructure came only later. Jeffreys gave a complete treatment of statistics founded on his
logical notion of probability but otherwise the prevailing approach was classical.

·In Britain and the United States statistics was redefined. The Royal Statistical
Society (above) broadened its political arithmetic and
‘state-istics’ agenda and welcomed work on agriculture and industry and on mathematical statistics. There were similar
changes in the American Statistical Association (above). Biometrika
stopped publishing biological research and focussed on theoretical statistics.
The Institute of Mathematical
Statistics was founded in 1930 and its journal The Annals of Mathematical
Statistics appeared in 1933. This became a major journal for
both mathematical statistics and probability. The first statistics laboratory
in the US was created at Iowa State
by Snedecor
in 1933. Snedecor was strongly influenced by Fisher. In France Georges Darmois
and Daniel Dugué came under Fisher’s influence. Georg Rasch took
Fisher’s ideas to Denmark.

·Applications of mathematics
and statistics to economics came together in the
econometric
movement. This could look back to the C17 political arithmetic and the C19 work
on index numbers
and on Pareto's law
but econometric modelling, which involved the application of regression methods to
economic data, was a C20 development. Among the leaders in the 1930s were Jan
Tinbergen and Ragnar
Frisch. Econometricians who have followed them as Nobel laureates
in economics include Engle, Granger, Haavelmo, Heckman, Klein and McFadden. Equally important were developments in the
collection of economic information. In the United States the outstanding
economic statistician was Simon
Kuznets. See M. S. Morgan A History of Econometric Ideas, Cambridge 1990.

Harald Cramér (1893-1985) Mathematician,
statistician & actuary. MacTutor References. MGP. Photos. SC, LP. Personal
recollections Statistical
Science 1986. Cramér
studied at the University
of Stockholm and spent
his working life there. His career spanned the applied mathematics of
insurance and the pure mathematics of number theory. In Sweden
probability, statistics and actuarial science were more closely related than
elsewhere; see Cramér’s talk Actuaries
and Actuarial Science. Filip
Lundberg was a symbol of
the link between insurance and probability and the Skandinavisk Aktuarietidskrift was the main statistics journal. From the mid-20s probability became increasingly prominent in
Cramér’s research. In 1929 a chair in “Actuarial Mathematics and Mathematical
Statistics” was created for him. Cramér’s Random Variables and
Probability Distributions (1937) has been called “the first modern book
on probability in English”; he was encouraged to write it by G.
H. Hardy the British number theorist and analyst. Cramér’s early work was on the central limit theorem,
and treated the expansions associated with Edgeworth (Edgeworth series), Gram
and the astronomer Charlier.
Cramér was an important synthesiser of
subjects and of national traditions. His student Herman Wold brought together the individual processes, studied by Yule
in the English statistical literature, and the theory of stationary stochastic processes,
studied by Khinchin in the Russian mathematical
literature. Cramér’s own Mathematical
Methods of Statistics (1945) brought together English statistical
theory and Continental probability. Amongst the new results it contained was
the Cramér-Rao
inequality. Cramér’s main work from 1940 onwards was on stochastic
processes, where he extended the
theories of Kolmogorov and Khinchin. Cramér made an important contribution
to the anglicising of probability language and so his name often
appears on the Words pages. See also Symbols in Probability.

Bruno de
Finetti (1906-85) Mathematician, actuary & statistician. MacTutor References. Wikipedia. LP. De Finetti studied mathematics in Milan. He was very precocious and
very prolific, publishing the first of his 300 works while still a student.
Although de Finetti became the best known of the Italian probabilists, there
was already an Italian presence on the international scene. Castelnuovo’s
textbook, Calcolo della probabilità (1919), was comparable to Markov’s and Cantelli’s
(LP) work on the strong law of large numbers made
him a pioneer of modern probability. The journal which Cantelli founded, Giornale
dell'Istituto Italiano degli Attuari, published important probability
contributions in the 1930s; for Cantelli see Regazzini. Corrado Gini (LP) (SC)
founded the Italian Central Statistical Institute, and also the journal Metron. De Finetti worked for a time at the
Statistical Institute and, as well as being associated with the universities
of Trieste and Rome, he did actuarial work. While de
Finetti made important contributions to probability theory, he is best known
for his subjective theory of probability, based on the Dutch book argument for coherence. In the English
speaking world de Finetti’s work only became known in the 1950s when Savage drew attention to it. De Finetti’s work has attracted
a lot of attention from philosophers. See for example Richard Jeffrey and his book Subjective Probability.
See von Plato ch. 8: “De Finetti’s subjective probabilities.”
See the website Bruno de
Finetti.

William
Feller (1906-70) Mathematician. MacTutor References. MGP. About half of Feller’s papers were in probability; the rest were in
calculus, functional analysis and geometry. After a first degree at the University of Zagreb
Feller went to the University of Göttingen,
where the world’s leading mathematics department was presided over by David
Hilbert, Feller’s ideal mathematician. Feller’s supervisor was Richard
Courant. Feller was awarded his Ph.D. in 1926, aged 20. In 1933 he
left Germany first for Denmark and then for Sweden, joining Cramér at the University of Stockholm.
He moved to the USA in
1939, first to Brown and then to Cornell and Princeton.
Feller’s first contribution to probability was a 1935 paper on the central limit theorem; he
obtained similar results to Lévy. Feller was the main
architect of renewal theory.
In the 50s he worked on a theory of diffusion, which brought together
functional analysis, differential equations and probability. Feller had
numerous PhD students who became influential probabilists; one unofficial
student was Frank
Spitzer. The publication in 1950 of volume 1 of Feller’s Introduction
to Probability Theory and its Applications was a major event. Gian-Carlo
Rota wrote, “Together with Weber’s Algebra and Artin’s Geometric
Algebra this is the finest text book in mathematics in this century.”
Besides giving new results and new forms to old results, the book drew
attention to a vast body of applied probability work that had not been
noticed in the theoretical literature. Feller’s frequent appearance on the Symbols in Probability
and Words pages (e.g. sample
space and experiment)
testifies to the influence of the book. See J. L. Doob “William Feller and
Twentieth Century Probability” in AMS History
of Mathematics, Volume 3 and von Plato (ch. 7)
“Kolmogorov’s measure theoretic probabilities.” There is an excellent
biography William
Feller (1906-1970) by Darko Zubrinic.

J. L. Doob
(1910-2004) Probabilist. MacTutor References.
MGP. SS
Interview. Because Wiener was not part
of the probability community, Doob was the first “modern probabilist” from
the United States
or even from the English-speaking world. When Doob came on the scene the only
American probability textbook was the 1925 book by J.
L. Coolidge, which could almost have been written in 1885.
Harvard, where Doob studied, had a strong group of mathematicians but nobody
worked on probability. Statistics was much livelier and Doob came into
contact with probability and its European literature when he got a job with
the statistician Harold
Hotelling. Some of Doob’s early work (1934-6) was devoted to
making statistical theory more rigorous. His first idea for a topic for his
PhD student Paul
Halmos was that he should make some of Fisher’s
ideas rigorous. Halmos switched to a more realistic topic. Doob’s career was
almost entirely spent at the University
of Illinois. Although
Doob did not begin in probability, almost all of his work was in this field.
His main work was on stochastic
processes; he was responsible for making martingales so
prominent. His book Stochastic Processes (1954) was an important work of
synthesis. His Classical Potential Theory and its Probabilistic
Counterpart (1984) brought together Doob’s probabilistic and
non-probabilistic interests. Doob made an important contribution to
anglicising the language of probability theory and he appears often in this
capacity on the Words pages—see e.g. probability measure
and Markov process. Apart from Halmos, his best-known student
was David
Blackwell who became part of Neyman’s group at Berkeley. For a
retrospective on Stochastic Processes see N. H. Bingham (2005) Doob: a
half-century on, Journal of Applied
Probability, 42, 257–266. D. Burkholder and P. Protter have
some personal reminiscences here. An issue of the JEHPS is devoted to “the splendours
and miseries of martingales.”

1940-1950 Among the millions who died in the Second World War
were mathematicians and statisticians. Doeblin
is only the best known of those killed; one of Neyman’s books
is dedicated to 10 lost colleagues and friends. Yet
this war, unlike the First World War, promoted the study of statistics and
probability. At the end of the war there were many more people working in
statistics, there were new applications and the importance of the subject to
society was more widely recognised.

·The war brought many people
into statistics and probability. Savage and Tukey
are examples from the US while in Britain the recruits included Barnard
(MGP),
Box (MacTutor)
(MGP), Cox (MGP),
D.
G. Kendall (MGP),
Lindley
(MGP).
The recruits were often better trained in abstract mathematics than earlier
statisticians. This contributed to closing the gap between the English
statistical and the Continental probability traditions.

Abraham Wald (1902-1950)
Statistician. MacTutor References.
MGP. LP. Wald studied at the University of Vienna with Karl
Menger writing a thesis and several articles on geometry. To make
a living Wald worked in economics, publishing on both mathematical economics
and on economic statistics. His first work in probability was on the
collective of von Mises.
When Wald moved to the USA
in 1938 he went to Columbia University where Harold
Hotelling was the senior statistician.
Wald started working on statistics in 1939 and soon developed his first ideas
on decision theory,
which he saw as an extension of Neyman-Pearson testing
theory. During the war Wald worked on statistical problems for the
military—one result was the development of sequential analysis.
The Statistical Research Group at Columbia University was one of
the most important wartime groups. After the war Wald developed his decision
theory ideas further in the book Statistical Decision Functions
(1950). In his short career Wald made contributions to most branches of statistical theory; among them the eponymous Wald test. He played
an important part in the statistical developments in econometrics, which led
to a Nobel prize for Trygve
Haavelmo.
Wald’s co-authors include Jacob
Wolfowitz and H. B. Mann.
Among his PhD students were Herman
Chernoff, Charles
Stein and Milton
Sobel. Jack
Kiefer had started his PhD when Wald was killed in an air crash.

C. R. Rao (b. 1920) Statistician. MacTutor References ASA MGP. Recent
photo. Rao is the most distinguished member of the Indian statistical
school founded by P.
C. Mahalanobis and centred on the Indian Statistical Institute
and the journal Sankhya.
Rao’s first statistics teacher at the University of Calcutta
was R. C. Bose.
In 1941 Rao went to the ISI on a one-year training programme, beginning an
association that would last for over 30 years. (Other ISI notables were S. N. Roy in the
generation before Rao and D. Basu, one
of Rao’s students.) Mahalanobis was a friend of Fisher and much of the early research at ISI was
closely related to Fisher’s work. Rao was sent to Cambridge to work as a PhD student with
Fisher, although his main task seems to have been to look after Fisher’s
laboratory animals! In a remarkable paper written before he went to Cambridge, “Information and the accuracy attainable in
the estimation of statistical parameters,” Bull. Calcutta Math. Soc. (1945) 37,
81-91, Rao published the results now known as the Cramér-Rao inequality
and the Rao-Blackwell
theorem. A very influential contribution from his Cambridge period was the
score test (or Lagrange multiplier test),
which he proposed in 1948. Rao was influenced by Fisher but he was perhaps
influenced as much by others, including Neyman. Rao has
been a prolific contributor to many branches of statistics as well as to the
branches of mathematics associated with statistics. He has written 14 books
and around 350 papers. Rao has been a very international statistician. He
worked with the Soviet mathematicians A. M. Kagan and Yu.
V. Linnik (LP) and since 1979 he has worked in the United States, first at the University of Pittsburgh
and then at Pennsylvania State University.
He was elected to the UK Royal
Society in 1967; in 2001 he received the Indian Padma Vibhushan
award (biography);
he received the US National Medal of Science in 2002. See ET Interview: C.
R. Rao, ISI
interview and profile.
For a general account of Statistics in India, see B. L. S. Prakasha
Rao’s About
Statistics as a Discipline in India.

·The scope of probability theory increased with the emergence of new
sub-fields such as queueing
theory and renewal
theory. Feller’s Introduction to
Probability Theory made a very strong impact in the English-speaking
world; it promoted the study of the subject and made advanced topics, like Markov chains,
accessible.

·In statistics
there was a Bayesian revival. In Britain I.
J. Good—Probability and the
Weighing of Evidence (1950)—was influenced (positively and negatively) by
the Cambridge
logicians (see Jeffreys). The American version, Bayesian decision
theory, reflected more the influence of Wald’s
classical decision theory.
The most important early contributions were Savage’s Foundations
of Statistics (1954) and Howard Raiffa and Robert Schlaifer’s Applied Statistical Decision Theory (1961).

·There was a great expansion
in medical statistics and epidemiology. Austin Bradford
Hill was an important contributor to both fields: he pioneered randomised clinical trials and, in work
with Richard Doll,
demonstrated the connection between cigarette smoking and lung cancer.

·Laplace
and Quetelet saw the work of the census as a possible
application of probability but the use of statistical theory by official data
gatherers only became institutionalised through the activities of Morris Hansen
(interview)
at the US Census Bureau.

·Since
the 50s finance
has been an important area for applied decision theory: the 1990 Nobel Prize in
economics was awarded to Markowitz
for work that was influenced by Savage although the idea of
expected utility goes back to Daniel Bernoulli. Since the 70s
finance has been an important area for applied stochastic processes. Ito had developed his stochastic calculus in
the 40s but it was applied in an unexpected way in the Black-Scholes model for
pricing derivatives. Scholes
and Merton
received the 1997 Nobel Prize in economics for their contribution (see Black-Scholes formula).
The intellectual ancestor of stochastic finance was Bachelier (above).

·In 1973 the Annals of
Mathematical Statistics (see above) split into the
Annals of Probability and the Annals of Statistics. This represented
increasing specialisation (there weren’t many new Cramérs) as well as the need to expand journal pages.

Leonard
Jimmie Savage (1917-71) Statistician. MacTutor References. MGP LP. After training
as a pure mathematician and obtaining a PhD from the University
of Michigan Savage worked briefly
with von
Neumann in Princeton. Savage
became a statistician in the war, joining the Statistical Research Group at Columbia University, which included Wald. Savage’s earliest
publications were in the style of Wald’s
classical decision
theory. The change came with the Foundations
of Statistics (1954). This work, written while Savage was at the University of Chicago, continued the decision theme
but provided the basis for a Bayesian
approach to probability in the spirit of Ramsey and de Finetti.
Savage also drew on the utility
theory von Neumann and Morgenstern developed for use in the theory of games. The
maxim of maximising expected utility went back to Daniel
Bernoulli. However, the book’s main take on statistics was still classical and Savage criticised
the foundational work of Jeffreys
who had provided the only set of Bayesian methods that reflected C20
statistics. Schlaifer was quicker to develop practical Bayesian methods.
However, Savage became a strong advocate of Bayesian methods in
statistical research and a champion of such non-classical principles as the likelihood principle.
This principle was treated axiomatically by Allan Birnbaum
but it had been discussed earlier by Fisher and Barnard.
Jimmie’s younger brother, I. Richard Savage (1926-2004), was also a
distinguished statistician. (Obit. p. 8)
MGP

John W. Tukey (1915-2000) Statistician. MacTutor References. ASA
and Bell Labs. MGP. Tukey
originally trained as a topologist (see finite character and Zorn's lemma) but
became a statistician in the Second World War. He remained in Princeton but worked on statistics with Wilks
and Cochran.
In parallel with his university career Tukey had a long association with Bell Labs;
another notable associated with this institution was Claude
Shannon. Tukey did work on the foundations of statistics, notably
on Fisher’s fiducial
argument but his main contributions were in applied areas—experiments, time
series analysis (see fast
Fourier transform, window
and pre-whitening),
robust methods
(see trimming
and Winsorizing)
and exploratory data
analysis (see stem-and-leaf-displays).
His enthusiasm for data
analysis was in part an expression of frustration with certain
tendencies in mathematical statistics. Tukey is a major presence on the Early
Uses pages because he loved inventing names. His new names for old terms
(e.g. hinge for quartile)
did not always catch on but his names for his own creations, e.g. jackknife always did
and he was ready to apply his facility outside statistics, see the entries on
bit, linear programming and software. David
Brillinger wrote a long tribute
for the Annals of Statistics and
the August 2003 issue of Statistical
Science is dedicated to Tukey’s life and work. (Euclid).
The American Philosophical Society has care of the Tukey
Papers.

1980+ Instead of describing
people from the very recent past, I describe the effect the computer has had on statistics since its advent around 1950, and the changes in the writing of the history
of probability and statistics in recent decades.

The effects
of the computer. The changes following the introduction of the computer have been much
more radical than those following the increased use of mechanical calculating
machines at the end of the C19. Such machines provided the material
basis for Pearson and Fisher’s research
and for the construction of their statistical
tables in the period 1900-50. The machines were not in general use
and Fisher assumed that most of the users of the tables and the
“research workers” who read his book would use logarithm tables or a slide
rule. For general background see A Brief
History of Computing.

With the availability of
computers old activities took less time and new activities became possible.

·Statistical tables and
tables of random numbers first became much easier to produce and then they
disappeared as their function was subsumed into statistical packages.

·Much more complex models
and methods could be used. Methods have been designed with computer
implementation in mind—a good example is the family of Generalized
linear models linked to the program GLIM; see John Nelder FRS.

·In the early C20 when Student (1908) wrote about the normal mean and Yule
(1926) about nonsense correlations they used sampling experiments and in the
1920s it became worthwhile to produce tables of random numbers. With the
introduction of computer-based methods for generating pseudo random numbers
much more ambitious Monte-Carlo
investigations (introduced by von
Neumann and Ulam)
became possible. The Monte-Carlo experiment became a standard way of
investigating the finite sample behaviour of statistical procedures.

·Since around 1980 Monte Carlo methods have been used directly in data
analysis. In classical statistical inference the bootstrap has been very
prominent; Statistical Science’s silver
anniversary issue (Euclid)
includes an interview with Efron, its creator. In Bayesian analysis Markov Chain Monte-Carlo methods
have been used extensively; previously conjugate priors and noninformative priors
had been used because of computational limitations.

Writing history. In recent
decades there has been a flood of works—books and articles—on the history of
probability and statistics from statisticians, philosophers and historians. I
will mention a few titles in each category to indicate the range of activity.

·50 years ago the standard
general works were Todhunter for the history of probability,
Walker
for statistics with an emphasis on psychology and education and Westergaard
for statistics with an emphasis on economic and vital statistics.

Harald
Westergaard (1932) Contributions to the History of Statistics, London: King.

·E S Pearson got history moving in Britain. In 1955 Biometrika published the first of its Studies in the History of Statistics and Probability; as editor, ESP wrote “It is hoped to
publish articles by a number of different authors under this general heading.”
So far, around 50 articles have appeared. ESP wrote about the people he
knew—his father, Fisher and Student—but
the other pioneers, F
N David and M
G Kendall, chose more remote topics. Two collections have appeared, Studies in the History of Statistics and Probability (1970) and Studies
in the History of Statistics and Probability, Volume II, (1977). See here
for a list of the articles. In addition, David wrote a book on the early
history of statistics and Pearson edited and published his father, Karl’s,
lectures; both works are very different from Todhunter.

F. N. David (1962) Games, Gods and Gambling:
the Origins and History of Probability and Statistical Ideas from the Earliest Times
to the Newtonian Era. London: Griffin.

E. S. Pearson
(ed) (1978) The History of Statistics in the 17th and 18th Centuries against
the Changing Background of Intellectual, Scientific and Religious Thought:
Lectures by Karl Pearson given at University College, 1921-1933. London: Griffin.

Stephen M
Stigler (1999) Statistics on the Table: The History of Statistical Concepts
and Methods, Cambridge, MA:
Harvard University Press.

·Much
neglected work has been rediscovered. Bienaymé’s
(LP SC) name is now linked to branching processes and Chebyshev's inequality
and his total contribution is recognised. There
are cases of individuals, famous for other work, e.g. Abbe
(LP) or Einstein (Brillinger
Time Series), and of individuals who may be known for one
topic but who produced a large body of work, e.g. Thiele
(LP SC) who had been identified with semi-invariants. Hald’s book
is a major addition to the literature of the
overlooked for, as well as revealing a largely forgotten Laplace,
it describes many continental European developments that were overlooked in the
Anglo-centric statistical literature of the C20. (see above)

·In recent decades the history and sociology of science have flourished. T. S.
Kuhn’s Structure of Scientific Revolutions (1962) had a
strong influence on both fields. Among the works on probability and statistics
by historians and sociologists are

H. A. David & A. W. F.
Edwards (eds.) (2001) Annotated Readings in
the History of Statistics, New
York: Springer. Amazon
(Its Appendix A lists English translations of works of interest to historians
of statistics.)

·Post-1940
developments have not attracted the attention of historians yet. Some classic modern
contributions are reprinted (with commentary) in S. Kotz & N. L. Johnson
(Editors) (1993/7) Breakthroughs in
Statistics: Volumes I-III, New York: Springer. Amazon.
Statistical Science has been publishing interviews for the past 20 years and
these are a form of living history—there must be 100 by now; the post-1995
issues are available through Euclid.
Econometric Theory
also publishes interviews and articles on history; among the statisticians
interviewed are T. W. Anderson
and J.
Durbin.

·Probability
and statistics now appear as topics in textbooks on the history of mathematics
and the history of disciplines that use probability and statistics. See for
example