Cluster-struck

CHARLES DARWIN is more usually cited for his scientific discoveries than for his moral insights. In the closing pages of his travelogue The Voyage of the Beagle, however, he condemns the practice of slavery, which he observed firsthand in the colonized New World, in blistering, heartfelt terms worthy of an Old Testament prophet:

Those who look tenderly at the slave owner, and with a cold heart at the slave, never seem to put themselves into the position of the latter; […] picture to yourself the chance, ever hanging over you, of your wife and your little children—those objects which nature urges even the slave to call his own—being torn from you and sold like beasts to the first bidder! And these deeds are done and palliated by men, who profess to love their neighbours as themselves, who believe in God, and pray that his Will be done on earth!

In this testimony against the great social sin of his age, Darwin makes an observation that should unsettle us even here and now: “if the misery of our poor be caused not by the laws of nature, but by our institutions, great is our sin.”

Whether disparities in wealth, health, and power within and between nations are due to the political and social structures we have created ourselves, or are a reflection of fundamental, potentially immutable differences between peoples, was a subject of debate long before Darwin’s voyage, and it remains one to this day. Almost two centuries after the Beagle sailed, we can point to individual pieces of genetic code that shape our physical stature, the color of our skin, and our risk for numerous diseases. The science writer Nicholas Wade would have us believe that this genetic revolution, set in motion by Darwin’s scientific discoveries, has also answered Darwin’s moral challenge.

In his new book A Troublesome Inheritance, Wade claims that modern biology proves that race is a real, genetic phenomenon with the power to shape the fate of human societies. Citing a wealth of modern genetic data, he argues that racial groups emerge naturally from human genetic variation as a result of “independent evolutionary paths” followed by different human populations, and speculates that genetic differences between these groups are responsible for social, cultural, and economic disparities between nations. However, a close examination of the research cited in A Troublesome Inheritance shows that Wade’s conclusions are very different from those drawn by geneticists.

To understand how Wade goes wrong, it may help to review the grand themes in human genetics. In migrating out of Africa to settle the globe, our population has grown explosively. Yet modern humans have less genetic diversity than our closest living relatives, the great apes. A pair of humans drawn at random from different continents will be more similar, in their genetic sequences, than two chimpanzees from different river basins. It is nevertheless possible to find points in the genetic code, called loci, where different people carry different variations of the genetic code, or alleles. Most of these alleles do not have strong effects, or any effect, on the function of our genes, but they can be useful for understanding the history of human populations. Wade explains these basic details with admirable clarity, avoiding many niggling errors of terminology that can crop up in popular discussion of genetics.

The geographic distribution of different alleles reveals another major pattern in our genetics, “isolation by distance”: when a population is arrayed across a landscape, individuals from geographically distant sites can become genetically different even if migrants move from site to site and interbreed freely. Even before modern, global transportation, human wanderlust could eventually carry an allele that first appeared in China all the way to France. We would expect that allele to be more common in China than France, not because it is better suited to life in the one region than in the other, but because it takes time for people to travel between them — the two locations are isolated by distance.
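
The China-to-France example can be sketched as a toy calculation. The Python below is a minimal illustration, not a model taken from any of the studies discussed here: it assumes a chain of ten hypothetical sites, a neutral allele that starts at one end, and a fixed fraction of each site's population exchanged with its neighbors every generation. The function name, the number of sites, the migration rate, and the starting frequency are all assumptions chosen for illustration. No site favors the allele, yet its frequency falls off with distance from the origin.

```python
def spread_allele(n_sites=10, migration=0.1, generations=50, start_freq=0.5):
    """Toy stepping-stone model: a neutral allele spreads along a chain of sites.

    All parameter values are hypothetical. There is no selection anywhere:
    the only process is migration between neighboring sites.
    """
    freqs = [0.0] * n_sites
    freqs[0] = start_freq  # the allele first appears at one end of the chain
    for _ in range(generations):
        updated = []
        for i in range(n_sites):
            # End sites have only one neighbor; treat the missing neighbor
            # as the site itself so that no alleles leave the chain.
            left = freqs[i - 1] if i > 0 else freqs[i]
            right = freqs[i + 1] if i < n_sites - 1 else freqs[i]
            updated.append((1 - migration) * freqs[i]
                           + (migration / 2) * (left + right))
        freqs = updated
    return freqs


# Print the allele frequency at each site, from the origin outward.
for site, freq in enumerate(spread_allele()):
    print(f"site {site}: frequency {freq:.3f}")
```

In this deliberately simple version, migration alone would eventually even out the frequencies if it ran long enough; real populations also experience chance fluctuations and new mutations, which the sketch leaves out.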

Isolation by distance has profound implications for those of us who study genes that adapt living things to different environments. The fact that alleles may differ in different locations simply because they are in different locations means that even when we observe differences between two populations of living things, and even when those populations differ in their genetic variation, we cannot assume that these differences are specifically suited to conditions in those different locations. Much of modern population genetics is devoted to teasing apart the Gordian knot of isolation by distance and the process of evolution to fit different environments.

For all his care with the basics of genetics, Wade ignores or elides these complexities. He founds his case for the biological reality of race on a class of genetic analyses called “clustering algorithms.” Geneticists use clustering algorithms to identify individuals within a larger sample who are more closely related to each other — because their genetic codes are more similar to each other than to the rest of the sample, they group together into a cluster. Wade focuses on two studies led by Noah Rosenberg, who applied a clustering method to worldwide genetic datasets. Rosenberg and his coauthors found that the individuals in their samples could be sorted into five groups corresponding to sub-Saharan Africa; Europe, the Middle East, and South Asia; East Asia; Australia and the Pacific; and the Americas — and Wade calls these clusters “continental races.”

What Wade does not make particularly clear, though, is that clustering algorithms do not independently select the number of clusters in a genetic dataset; rather, they sort individuals into a predetermined number of clusters. Geneticists typically run clustering analyses for a range of possible numbers of clusters, then compare the results to find a clustering scheme that best describes their data. This is what Rosenberg and his colleagues did, and in the more recent of the two studies Wade cites, they found that their data best supported six clusters, which split their Native American samples into two groups.
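
To see what it means for an algorithm to sort individuals into a predetermined number of clusters, consider a small sketch. It uses k-means, a simple general-purpose clustering method swapped in here as a stand-in; it is not the method used in the studies Wade cites. The data are 100 hypothetical individuals spread evenly along a single gradient with no gaps at all, and the function name and starting rule are assumptions made for illustration. Whatever number of clusters is requested, the algorithm delivers exactly that many.

```python
def kmeans_1d(values, k, max_iterations=100):
    """Minimal one-dimensional k-means; returns a cluster label for each
    value, in sorted order. The number of clusters k is supplied by the
    caller; the algorithm never chooses it."""
    points = sorted(values)
    n = len(points)
    # Start the k centers at evenly spaced positions in the sorted data.
    centers = [points[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    labels = []
    for _ in range(max_iterations):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda j: abs(p - centers[j]))
                  for p in points]
        # Move each center to the mean of the points assigned to it.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centers.append(sum(members) / len(members)
                               if members else centers[j])
        if new_centers == centers:
            break  # assignments have stopped changing
        centers = new_centers
    return labels


# A perfectly smooth gradient: 100 evenly spaced hypothetical individuals.
cline = [i / 99 for i in range(100)]
for k in range(2, 7):
    labels = kmeans_1d(cline, k)
    sizes = [labels.count(j) for j in range(k)]
    print(f"k={k}: cluster sizes {sizes}")
```

The boundaries between clusters in this output reflect the number requested, not any break in the data, which is why comparing results across several values is a necessary second step.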

Moreover, clustering algorithms cannot find clusters that are not sampled, and the data examined by Rosenberg’s studies missed a big portion of human diversity: Africa. As the ancestral home of humanity, Africa contains more of our genetic diversity than any other continent, but many of the first global human genetic datasets included more European than African samples. In 2009, Sarah Tishkoff led a team in collecting and analyzing more than 2,000 individuals with African ancestry. Combining this new sample with the data used by the Rosenberg studies, Tishkoff and her coauthors ran a clustering analysis and found that their new, more representative dataset was best described not by five clusters but by 14, more than half of them within Africa.

The ambiguity in the genetic structure of human populations is an expected result of isolation by distance. Clustering algorithms, faced with an essentially continuous distribution of genetic variation, are quite capable of slicing it into chunks — much the way we squint at the spectrum of a rainbow and parse red from orange, or blue from green. Rosenberg and his coauthors were well aware of this, noting in the first paper that Wade cites,

In several populations, individuals had partial membership in multiple clusters […] These populations might reflect continuous gradations across regions or admixture of neighboring groups.

That is, there were individuals in the sample whose genetic codes did not sort neatly into a single one of Wade’s “continental races,” exactly as expected when a clustering algorithm runs up against isolation by distance. In their more recent study, Rosenberg and his colleagues tested for the definitive pattern of isolation by distance, in which populations located closer together are also more genetically similar, and this is precisely what they found. In the end, the team concluded that the clustering in their global dataset arose from small discontinuities in the isolation-by-distance relationship created by geographic barriers like the Atlantic Ocean, the Sahara Desert, or the Himalayas.

Wade skims past these points. In his hands, Rosenberg’s work becomes proof positive that humanity assorts into five races, and he ignores the ultimate implications of Tishkoff’s study. Discussing the results of other studies that have found still other patterns of clustering, Wade is reduced to declaring that “the five-race, continent-based scheme seems the most practical for most purposes,” which is not so much an empirical conclusion as an aesthetic judgment. Indeed, throughout the rest of A Troublesome Inheritance, the number and scale of racial divisions turn out to be whatever Wade wants them to be: in the chapter on genetics, the Middle East is part of the same “continental race” as Europe, but when Wade turns to describing societal differences, the Middle East becomes an example of “tribalism” contrasted with the “modern economies” of Europe.

From the start, Wade correctly notes that the genetic differences between human populations are quite small. The results of clustering analyses are driven by subtle differences in the frequency of different alleles within populations: across 1,000 loci or more, sites where one allele is carried by, say, 50 percent of Native Americans but by 55 percent of Asians are together enough to differentiate the two groups. Wade nevertheless thinks that these tiny differences are suggestive of natural selection acting to create genetic differences in temperament and behavior that translate into broad societal disparities.
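
How a difference of 50 versus 55 percent can separate two groups is easier to see with a small simulation. The sketch below is a simplified illustration under stated assumptions, not an analysis from any cited study: two hypothetical groups, every locus independent of the others, the same 50 versus 55 percent frequencies at each locus, and two allele copies per person per locus. Each simulated person is assigned to a group simply by counting how many copies of the allele they carry. The function names, group sizes, and random seed are all invented for the example.

```python
import random


def allele_count(freq, n_loci, rng):
    """Copies of the allele carried by one simulated person
    (two copies drawn independently at each locus)."""
    return sum((rng.random() < freq) + (rng.random() < freq)
               for _ in range(n_loci))


def assignment_accuracy(n_loci, freq_a=0.50, freq_b=0.55,
                        n_per_group=200, seed=1):
    """Fraction of simulated people assigned to the correct group when the
    only information used is their total allele count.
    Assumes freq_b is larger than freq_a."""
    rng = random.Random(seed)
    # Midpoint between the two groups' expected counts (2 * n_loci * freq).
    threshold = n_loci * (freq_a + freq_b)
    correct = 0
    for _ in range(n_per_group):
        if allele_count(freq_a, n_loci, rng) <= threshold:
            correct += 1  # person from group A assigned to A
        if allele_count(freq_b, n_loci, rng) > threshold:
            correct += 1  # person from group B assigned to B
    return correct / (2 * n_per_group)


for n_loci in (1, 10, 100, 1000):
    print(f"{n_loci} loci: accuracy {assignment_accuracy(n_loci):.2f}")
```

With a single locus the assignment is barely better than a coin flip; with a thousand it is right most of the time, even though no individual locus differs by more than five percentage points.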

He bases this on scientific discussion of a phenomenon called “soft selective sweeps.” Soft sweeps are defined in contrast to hard sweeps — cases in which a single allele confers a benefit to people who carry it, so that they have more surviving children than those without that beneficial allele. If the advantage is great enough, in just a few generations the entire population will be descendants of the one happy mutant who first carried the beneficial allele. Such rapid evolutionary change leaves obvious marks on the genetic diversity of a population.

These genetic marks of hard sweeps are exceptionally rare in modern human populations, but soft sweeps may have been more common. A soft sweep can occur when many different loci work together to create an advantageous trait — for example, we know of almost 200 different loci that help to determine how tall a person grows. In such “polygenic” traits, small changes in the frequency of alleles at each contributing locus add up to big changes in the average trait value of the population. Taken together, these small changes are a soft sweep. Wade reasons that if the traits that shape social institutions are created by many genes, the signature of evolutionary change in those traits will be difficult to isolate from even the subtle genetic differences between human populations.
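
The arithmetic of a polygenic shift can be made concrete with a short calculation. This sketch assumes the simplest possible model, which real traits such as height do not strictly follow: 200 independent loci, each allele copy adding the same hypothetical amount (0.1 units) to the trait, and every locus shifting from a frequency of 50 percent to 55 percent. The function names and all of the numbers are assumptions for illustration only.

```python
def mean_trait(freqs, effects):
    """Average trait value under a purely additive model: each person carries
    two copies per locus, and each copy adds that locus's effect."""
    return sum(2 * p * a for p, a in zip(freqs, effects))


def trait_variance(freqs, effects):
    """Variance among individuals, assuming loci are independent."""
    return sum(2 * p * (1 - p) * a * a for p, a in zip(freqs, effects))


n_loci = 200                # roughly the number of height loci mentioned above
effects = [0.1] * n_loci    # hypothetical effect per allele copy
before = [0.50] * n_loci    # allele frequencies before the shift
after = [0.55] * n_loci     # a five-point shift at every locus

shift = mean_trait(after, effects) - mean_trait(before, effects)
spread = trait_variance(before, effects) ** 0.5

print(f"shift in population average: {shift:.2f} units")
print(f"spread among individuals:    {spread:.2f} units")
```

Under these assumptions the population average moves by about two units while the spread among individuals is about one unit, so frequency changes that are small at every locus add up to a large change in the average.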

This amounts to an argument that, in the absence of evidence for specific genetic differences that create social and economic disparities, we should believe that they exist because Wade thinks it is reasonable to do so. In fact, evolutionary geneticists have begun to develop methods to find the footprints of soft selective sweeps. In a recent, cutting-edge study, Jeremy Berg and Graham Coop were able to identify signals of what may be soft selective sweeps in loci with well-documented effects on human traits and diseases including height, skin pigmentation, and type 2 diabetes — but they have noted that even the trait with the strongest signal, height, is an ambiguous case at best. In comparison, we have yet to find loci that strongly and consistently shape the kinds of social and cognitive traits that interest Wade, much less evaluate them for signs of change in response to natural selection.

In the opening chapter of A Troublesome Inheritance, Wade differentiates between the “hard science” of human population genetic structure and his “much more speculative” discussion of social and economic differences between societies. By the time the reader reaches the “speculative” chapters, however, this caution has evaporated. The latter half of the book is mostly devoted to describing differences between social, racial, and ethnic groups, then asserting that because the groups of people involved show some degree of genetic differentiation, their societal differences may very well have a genetic basis. In no case, however, does Wade have positive proof to support his racial theorizing. The best argument A Troublesome Inheritance musters in defense of its thesis is that it would not violate the laws of population genetics, if it were true.

What Wade seems unable to recognize is that the available evidence is not consistent with his fundamental argument, but in contradiction to it. However we might choose to divide up the diversity of the human race, the story written in our genetic code describes deep and ancient kinship, not our present differences. Our inheritance should trouble us not because it shows that some human societies are genetically doomed to second-class status, but because this status has arisen in spite of our shared humanity.