Wednesday, August 29, 2012

Human races, networks and fuzzy clusters

Evolutionary networks have recently become a hot topic of discussion. However, although networks have rather a long history in some parts of biology (see this previous post), historically it is phylogenetic trees that have dominated in biology, rather than phylogenetic networks. Interestingly, one research area where networks were found somewhat more commonly during the first half of the 20th century was anthropology.

Humans have long been considered to have a reticulate evolutionary history, both genetically and culturally (Moore 1994), and anthropologists have therefore, on occasion, used networks as one of their representations of that intra-species history (Brace 1981). Trees have nevertheless dominated in anthropology (Caspari 2003), as elsewhere; and the consequences of reticulation for anthropological studies form an ongoing debate (Holliday 2003; Arnold 2009). Modern anthropologists are still coming to terms with the genetics of reticulation (see Jolly 2009), having previously been distracted by the Evolutionary Synthesis as well as by fossils (Hawks and Wolpoff 2003).

Some Anthropological Trees and Networks

We can start this brief survey with a tree from Arthur Keith (1915). There is no indication of reticulation at this early stage of the century, and thus the genealogy seems to owe much in principle to Ernst Haeckel's (1868) tree from the previous century.

Keith (1915) Figure 187. Genealogical tree of man's ancestry.

Carleton Stevens Coon (1939) had a polyphyletic view of human origins but also believed in a degree of reticulation, as shown in the next diagram. Note, however, that the most common European race, Mediterranean, does not take part in the reticulation events.

Earnest Albert Hooton (1931) was an even stronger believer in the reticulate nature of human microevolution. He commented that the following figure "represents my idea of the various ways in which human blood streams have intermingled to form the principal races. It is not a family tree, but a sort of arterial trunk with offshoots and connecting vessels."

Hooton (1931) Figure 58. The blood streams of human races.

He modified this figure for the revised edition of his book (Hooton 1946), making it even more complex.

Hooton (1946) Figure 68. Blood streams of human races.

Elsewhere in the same book, Hooton (1946) produced this next diagram, which expresses a more phylogenetic idea. Indeed, it comes very close to the modern idea of a tree obscured by vines.

Hooton (1946) page 413.

Finally, we can consider a modern anthropological network, based on polymorphic genetic markers. This one is from Campbell and Tishkoff (2010), in which they note: "decreasing intensity of color represents the concomitant loss of genetic diversity as populations migrated in an eastward direction from Africa. Solid horizontal lines indicate gene-flow between ancestral human populations and the dashed horizontal line indicates recent gene-flow between Asian and Australian/Melanesian populations."

Campbell and Tishkoff (2010) Figure 2. The Recent African Origin model of modern humans and population substructure in Africa.

Discussion

This whole approach to the analysis of human history presupposes that races exist as more-or-less distinct lineages, which is an idea that not all anthropologists support. Genomically, humans seem to form what might be called fuzzy clusters, rather than discrete groups with sharp boundaries (Novembre et al. 2008; Lao et al. 2008). Inter-breeding is predominantly within the clusters, due to geographical and social isolation, with relatively little inter-breeding between the clusters. This creates a situation where gene-based distinctions between "races" seem to be obvious to casual observers but where more detailed analysis reveals considerable complexity. This results from the evolutionary history being a network, not a tree.
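One way to make the idea of fuzzy clusters concrete is to give each individual a degree of membership in every cluster, rather than a single label. The sketch below uses the membership formula from fuzzy c-means clustering purely as an illustration; it is not the method used in the studies cited above, and the two cluster centres, the coordinates and the fuzzifier m = 2 are all hypothetical.

```python
import math

def fuzzy_memberships(point, centers, m=2.0):
    """Return fuzzy c-means style membership degrees of `point` in each cluster.

    `m` is the fuzzifier (must be > 1); larger values give softer memberships.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point sitting exactly on a centre belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 2.0 / (m - 1.0)
    # Membership in cluster i shrinks as the point gets relatively
    # closer to the other clusters; the values sum to 1.
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]

# Hypothetical two-dimensional summary of genetic variation.
centers = [(0.0, 0.0), (10.0, 0.0)]
core = fuzzy_memberships((1.0, 0.0), centers)   # close to the first centre
edge = fuzzy_memberships((5.0, 0.0), centers)   # midway between the centres
print(core, edge)
```

Under these assumptions, the point near a centre gets a membership close to 1 in that cluster, while the point midway between the two centres is split evenly, which is the sense in which the boundary is fuzzy rather than sharp.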

So, this raises a point that anthropologists have been struggling with for some time, and which all network biologists need to address at some stage: Are distinct evolutionary lineages worth recognizing when there is extensive reticulation in a network? From the analysis point of view, the recognition of races is a model, and all models are wrong (because they are simplifications of the real world). However, some models are more useful than others. So, the question can be re-phrased as: Is the recognition of distinct evolutionary lineages a worthwhile model for interpreting a reticulated network? After all, the lineages may not form nested phylogenetic clusters, which is historically the basic criterion for recognizing them.
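The nestedness criterion mentioned above can be stated simply: a collection of groups fits a tree only if every pair of groups is either disjoint or has one group contained in the other. A minimal sketch of that check follows, assuming groups are given as plain sets of individuals; the function name and the example groups are hypothetical.

```python
from itertools import combinations

def is_nested(groups):
    """True if every pair of groups is either disjoint or one contains the other."""
    sets = [frozenset(g) for g in groups]
    for a, b in combinations(sets, 2):
        overlapping = bool(a & b)
        nested = a <= b or b <= a
        if overlapping and not nested:
            return False
    return True

# Hypothetical groupings of four individuals.
tree_like = [{"a", "b"}, {"c", "d"}, {"a", "b", "c", "d"}]
reticulate = [{"a", "b"}, {"b", "c"}, {"a", "b", "c", "d"}]

print(is_nested(tree_like))
print(is_nested(reticulate))
```

Here the first collection is nested and could be drawn as a tree, whereas in the second the groups {a, b} and {b, c} overlap without either containing the other, as happens when an individual has ancestry from two lineages.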

Domesticated organisms provide other classic examples of genealogical reticulation. We recognize dog breeds, for instance, and we even have an official register of breeds at the Fédération Cynologique Internationale. However, dog breeds form fuzzy clusters rather than discrete groups, with many individual dogs being cross-breeds. In spite of this, a model of fuzzy clusters formed by a reticulate evolutionary history is still considered to be useful by dog breeders and owners. A similar thing can be said about the breeds of horses, cats and cows; and, indeed, also for almost all human-associated species (see Arnold 2009).

In the non-domesticated part of biodiversity, systematists recognize subspecies, which often refer to morphologically distinguishable populations occupying geographically separated areas, but which are not otherwise genetically isolated. These subspecies can also form fuzzy clusters as a result of a reticulate evolutionary history, especially for plants. Once again, this is apparently a useful model, although there is no universal criterion for how much morphological difference it takes to delimit a subspecies.

I have noted before (see this blog post) that using a tree model for the evolutionary history of dog breeds is inappropriate, because of the reticulate inter-breeding. However, the question here goes further than this, and asks about what should be the units of analysis in the first place. If it is the dog breeds, then we are effectively excluding cross-bred dogs from the evolutionary history, unless they themselves form a new breed that is subsequently recognized.

This issue has profound consequences for our view of possible human races. Most of the networks shown above use races as the units of analysis. Modern evolutionary diagrams of human ancestry, on the other hand, are more likely to be based on genetic data from individual people (as shown in the last figure), which does not presuppose the existence of races. Races (if they exist) are then an outcome of the analysis, rather than an input. This distinction has been of particular importance for anthropology, where for most of the past century it has been assumed that discrete races exist and can be fitted into a non-reticulating phylogenetic tree (Caspari 2003; Arnold 2009). Even the very language of naming races creates a supposition that those races are "real", and so care is needed.

Historically, studies of race and human evolution have been inextricably linked. One problem with the current discussions about race is the confusion over whether races are sociological constructions or biological ones (Tattersall and DeSalle 2011; Krimsky and Sloan 2011). My point here is that, either way, they are at best a model of fuzzy clusters formed by a reticulate evolutionary history, rather than being discrete groups. They have clearly been misused in sociology (racism), but are they a useful model in biology (racialism)?