Ajit Balakrishnan: Good things from bad motives?

If you ever need evidence that a group of talented scientists can pursue completely misguided and inhumane goals for years and yet, in the process, invent tools and techniques of fundamental benefit to humanity, you need only turn to the science of statistics. Today we use statistical methods for purposes as diverse as computing gross domestic product and planning healthcare and education, but these methods originated with much darker motives.

The story of modern statistics starts in the late 19th century with Francis Galton and his preoccupation with establishing that human ability was inherited. He envisioned a world in which intellectually gifted children would be awarded scholarships and given the best possible education and, hold your breath, the not-so-talented “could find a welcome and a refuge in celibate monasteries or sisterhoods”. The remarkable thing is that while pursuing these crackpot notions, Galton also helped lay the groundwork for the statistical concept now called “standard deviation” (a term coined slightly later by Karl Pearson): the notion that data such as the heights of men cluster around a mean, or central tendency, and vary from this mean by a measurable amount. Standard deviation is a cornerstone of the modern science of statistics. It is used in sports (comparing the batting performance of cricketers), weather (pronouncing whether a day is hotter or colder than normal) and finance (measuring the risk of a portfolio of stocks or bonds).

Galton was a member of a distinguished family of bankers and thinkers; Charles Darwin was a cousin. The publication of The Origin of Species, in fact, was an inspiration for Galton. It is probably fair to say that the Europe of Galton’s time, the late 19th and early 20th centuries, was as a whole preoccupied with inherited and racial differences, and that Galton was merely a product of his time. Galton invented the term “eugenics” and was the president of the “Eugenics Education Society”, which, among other things, advocated financial incentives for marriage between eminent families as a way of improving the stock of citizens. By all accounts, this was a fashionable place to be: Winston Churchill was a frequent attendee at its meetings.

Galton’s protégé, Karl Pearson, was no less talented; the Pearson correlation coefficient, a statistical concept as foundational as standard deviation, is named after him. He was also no less bigoted, and his bigotry extended beyond individuals to whole races. “History shows me,” he wrote, “one way, and one way only, in which a high state of civilisation has been produced, namely, the struggle of race with race, and the survival of the physically and mentally fitter race.” He was adamantly opposed to Jewish immigration into Britain, and wrote in the first issue of the journal Annals of Eugenics that “this alien Jewish population is somewhat inferior physically and mentally to the native population”.

P C Mahalanobis, the father of Indian statistics, started out with similar interests. His first research project was a statistical analysis of Anglo-Indians in Calcutta to see whether their skull sizes differed on average from those of Hindus there. The larger question that he and his sponsors were trying to settle was whether Calcutta’s Anglo-Indians came from the upper castes of Hindus in Bengal. In the process of doing this bigoted research, Mahalanobis, in keeping with the grand tradition of innovation in statistics, invented an original way to measure how far apart two groups of data are, one that takes into account how the measurements vary and correlate with one another. This measure, called the “Mahalanobis distance”, is a key tool in the armoury of today’s statisticians, who also use it to detect outliers. Mahalanobis, as is well known, went on to found the Indian Statistical Institute in Calcutta, was a key player in India’s Five-Year Plans and was the architect of our National Sample Survey, the basis for practically all economic policy making in India and the single most important tool for economic planners to this day.

The standard deviation of Galton, the correlation coefficient of Pearson and the Mahalanobis distance, among other statistical measures, are among the foundational tools for today’s Web search engines, social networks and e-commerce recommendation systems. But, as we have seen, the original motivation for these inventions was far from benign.
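For readers curious about what these three measures actually compute, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than anything drawn from the original research: the heights and weights are made-up numbers, the function names are hypothetical, and the Mahalanobis calculation is restricted to two measurements per person to keep the arithmetic visible. It is certainly not how any search engine or recommendation system implements them.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    # Population covariance: how two measurements move together.
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def std_dev(xs):
    # Population standard deviation: typical distance of values from their mean.
    return math.sqrt(covariance(xs, xs))

def pearson_r(xs, ys):
    # Pearson correlation: covariance rescaled to lie between -1 and +1.
    return covariance(xs, ys) / (std_dev(xs) * std_dev(ys))

def mahalanobis_2d(point, xs, ys):
    # Distance of `point` from the centre of the (xs, ys) cloud, measured
    # relative to the cloud's spread and correlation (two variables only).
    dx, dy = point[0] - mean(xs), point[1] - mean(ys)
    sxx, syy, sxy = covariance(xs, xs), covariance(ys, ys), covariance(xs, ys)
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 covariance matrix
    d_squared = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d_squared)

# Made-up heights (cm) and weights (kg) for five people.
heights = [160, 165, 170, 175, 180]
weights = [55, 63, 64, 75, 78]

print("standard deviation of heights:", std_dev(heights))
print("correlation of height and weight:", pearson_r(heights, weights))
# A tall but very light person is unusual given how height and weight co-vary.
print("Mahalanobis distance of (180, 55):", mahalanobis_2d((180, 55), heights, weights))
```

The last line is the point of the Mahalanobis distance: a person 180 cm tall weighing 55 kg is not extreme on either measurement alone, but is far from the group once the link between height and weight is taken into account.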

On reflection, the pattern of bad motives leading to good things that we saw in statistics is mirrored in many other branches of science. Nuclear power, which was nurtured as a weapons programme, may yet save the world from an energy crisis; and steel, first made for sword making, found its main market there for centuries before any peaceful uses were found for it.