Listened to an interesting interview this morning with the author of a new book, The Latinos of Asia: How Filipino Americans Break the Rules of Race. There was a lot to agree with and disagree with, but it rang true in many ways for me because I have had a fair number of students with roots in the Philippines. An early portion of the interview illustrates an important dynamic. The author himself has parents from the Philippines, and at one point his university was running a study on alcohol consumption among those of “Asian” ancestry. When he approached to be a participant, though, the researchers said that what they were looking for were people of Japanese, Korean and Chinese ancestry, because they had the right “population structure.”

Naturally this was somewhat offensive. The author pointed out that as a sociologist he believes race is a “social construct.” It is also the case that people from the Philippines occupy a somewhat liminal position between “Asian” and “Latino” identities. As a South Asian I can relate, as I am “Asian”, but not “typically” Asian.

From what I can gather the research group was rather artless in the way they communicated the necessary conditions for their project, but the researchers probably were correct in excluding the author. Alcohol flush reaction segregates in only a finite set of East Asians. With limited resources it is rational for them to exclude individuals from populations where the variants of interest are not present, or are present only at very low frequencies.

The problem is that the author confuses the terminology, “Asian”, with reality. A common tendency in the “post-modern” style of thought is putting primacy in the power of language to shape our perception of reality. The fact is that people from the Philippines have very distinct genetic structure in relation to Northeast Asians. Whether they are categorized as Asian or Latino does not truly impact that fact, unless one is of Chinese background and from the Philippines. To me it is ironic that so many scholars place into language so much power when language is only an imperfect mapping onto reality.

So I don’t really have strong opinions on the whole controversy over women’s sports at the elite level…mostly because I have a really hard time following all the logic. For me the biggest problem seems to be that we have two categories, men’s and women’s, and there are those who are arguing that they’re actually nearly plastic catchalls…which then suggests to me we shouldn’t have two categories in the first place in competition at the highest levels.

I would also like to relate a two-part epiphany that I had after my transition. In 2005, nine months after starting HRT, I was running 12% slower than I had run with male T levels; women run 10-12% slower than men over a wide range of distances. In 2006 I met another trans woman runner and she had the same experience. I later discovered that, if aging is factored in, this 10-12% loss of speed is standard among trans women endurance athletes. The realization that one can take a male distance runner, make that runner hormonally female, and wind up with a female distance runner of the same relative capability was life changing for me.

As they say, “read the whole thing.” It’s long, and detailed, and doesn’t offer easy answers. Ultimately the reality is that no “solution” is going to be fair to world-class athletes. But, it’s probably important to remind ourselves that it is also unfair to those of us without the genetics of world-class athletes, and we seem to be OK with that.

Sociologists have long tried and failed to draw a line between science and pseudoscience. In physics, though, that ‘demarcation problem’ is a non-problem, solved by the pragmatic observation that we can reliably tell an outsider when we see one. During a decade of education, we physicists learn more than the tools of the trade; we also learn the walk and talk of the community, shared through countless seminars and conferences, meetings, lectures and papers. After exchanging a few sentences, we can tell if you’re one of us. You can’t fake our community slang any more than you can fake a local accent in a foreign country.

…

I haven’t learned any new physics in these conversations, but I have learned a great deal about science communication. My clients almost exclusively get their information from the popular science media. Often, they get something utterly wrong in the process. Once I hear their reading of an article about, say, space-time foam or black hole firewalls, I can see where their misunderstanding stems from. But they come up with interpretations that never would have crossed my mind when writing an article.

I’ve been blogging since 2002. Like Sabine I can often tell if someone has a scientific background after a few sentences, especially if they are biologists of some sort. As for the rest, the chasm is between the intelligent vs. not so intelligent, and it is usually pretty clear too. Mostly the intelligent have liberal arts or social science backgrounds, but have the basic analytic tools to decompose problems at the most general levels. The less intelligent tend to speak in simple formulas when coherent, and devolve into total incomprehensibility when they try and attempt originality.*

The second issue is a somewhat different one from physics. Usually at a given moment there is a topic of particular interest to the media. Evo-devo and epigenetics come to mind. These are real scientific fields of inquiry. But because of disproportionate media attention to these sorts of topics, usually those who rely on their science knowledge from popularizations will assume that evo-devo and epigenetics have “revolutionized” our understanding of evolution and genetics, when in reality these are still developing areas, whose ultimate impact is to be determined.

In fact, I’d take this further: the area of evolutionary genetics has arguably not been “revolutionized” since the 1970s, with the theoretical and empirical debates triggered by allozyme work and the neutralist-selectionist debates. All the rest, including genomics, is just commentary.

* Here is a good example: the stupid reader who was explaining to me patiently how splicing and gene regulation “disprove” heritability estimates. I dismissed them, but the reality is that I’m 99% sure that that reader thinks I’m an idiot as well.

The Greenland shark (Somniosus microcephalus), an iconic species of the Arctic Seas, grows slowly and reaches >500 centimeters (cm) in total length, suggesting a life span well beyond those of other vertebrates. Radiocarbon dating of eye lens nuclei from 28 female Greenland sharks (81 to 502 cm in total length) revealed a life span of at least 272 years. Only the smallest sharks (220 cm or less) showed signs of the radiocarbon bomb pulse, a time marker of the early 1960s. The age ranges of prebomb sharks (reported as midpoint and extent of the 95.4% probability range) revealed the age at sexual maturity to be at least 156 ± 22 years, and the largest animal (502 cm) to be 392 ± 120 years old. Our results show that the Greenland shark is the longest-lived vertebrate known, and they raise concerns about species conservation.

…Using this technique, the researchers concluded that two of their sharks—both less than 2.2 meters long—were born after the 1960s. One other small shark was born right around 1963.

The team used these well-dated sharks as starting points for a growth curve that could estimate the ages of the other sharks based on their sizes. To do this, they started with the fact that newborn Greenland sharks are 42 centimeters long. They also relied on a technique researchers have long used to calculate the ages of sediments—say in an archaeological dig—based on both their radiocarbon dates and how far below the surface they happen to be. In this case, researchers correlated radiocarbon dates with shark length to calculate the age of their sharks. The oldest was 392 plus or minus 120 years, they report today in Science. That makes Greenland sharks the longest lived vertebrates on record by a huge margin; the next oldest is the bowhead whale, at 211 years old. And given the size of most pregnant females—close to 4 meters—they are at least 150 years old before they have young, the group estimates.
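The “at least 272 years” figure in the abstract is simply the lower end of the range reported for the largest shark. Here is a minimal sketch of that arithmetic, assuming the ± values are symmetric half-widths around the midpoint, which is how I read “midpoint and extent of the 95.4% probability range.” The function name is my own, not anything from the paper.

```python
def probability_range(midpoint, extent):
    """Return (lower, upper) for an estimate reported as midpoint ± extent.

    Assumes the extent is a symmetric half-width around the midpoint.
    """
    return midpoint - extent, midpoint + extent


# Figures quoted in the abstract above.
oldest_shark = probability_range(392, 120)
sexual_maturity = probability_range(156, 22)

print(oldest_shark)     # (272, 512)
print(sexual_maturity)  # (134, 178)
```

So the headline number is the conservative end of a very wide interval, not the central estimate.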

The above visualization is from a Reddit thread, Almost all men are stronger than almost all women. It’s based on grip strength, and basically reiterates my post from last year, Men Are Stronger Than Women (On Average). The same metric, grip strength, is highlighted. The plot above shows that the “great divergence” occurs on the cusp of puberty, exactly when the secondary sexual characteristics of males and females become much more pronounced. In my post I pointed out that the Olympic caliber female German fencers were on the lower end of the male distribution.

It’s not unusual for men and women swimmers to train together, but being in the pool with Ledecky is something that many men can’t handle. In April, Conor Dwyer, a 6-foot-5, 27-year-old American swimmer who won a gold medal in the 4-by-200 freestyle relay in London, gave a revealing interview posted online by USA Swimming. In it, he talked about male swimmers being “broken” by Ledecky when they practiced together at the Olympic Training Center in Colorado Springs.

…

Ledecky’s ability to crush men in practice does not necessarily mean she would defeat them in competition. There’s a difference between imposing her will, and perhaps superior conditioning, over the course of a two-hour practice and doing it in a shorter race in which men’s generally greater strength provides an advantage. Her best chance would probably be in the 1,500 freestyle, which women race at the FINA World Championships but not at the Olympics. (The men don’t swim the 800 in the Olympics, so there are the same number of events for male and female swimmers.) Ledecky’s best time in the event would put her among the dozen or so top American men and is 25 seconds faster than their qualifying time at the United States Olympic trials — but it is much too slow to earn a medal at the Games. On the other hand, because no other woman offers a real challenge to her, she is never pushed in that event. I asked Andrew Gemmell, who specializes in the 1,500 free, a hypothetical question: What if, in some dystopian swim universe, Ledecky was told that there would be no women’s events and that she would have to try to make the American team by competing with the men in the 1,500?

His father, who trains her, had told me that he did not think she could qualify, a feat that under current rules would require her to finish first or second at the trials. Andrew, who trains side by side with her, had a different answer. “It would be really difficult, but I would never bet against her,” he said. “I don’t think anybody knows yet what she’s capable of.”

I’m a little surprised honestly that the term “dystopian” got in there, because there are now people with academic appointments arguing for the ending of sex segregation in sports. Often they are sociologists, who believe all things are socially constructed, and take some non-binary aspect of gender to mean that the distribution of possibilities is entirely flat and arbitrary.

Katie Ledecky has preternatural gifts, as well as opportunities afforded to her by her class status. The whole piece highlights Ledecky’s exceptional physical abilities and mental attributes. But even it acknowledges she would likely not beat the top men in her events.

First, she argued that sex segregation in sport denoted women’s inferiority, and that was a problem. The fact is that when it comes to strength, especially upper body strength, all the data do suggest that women, on average, are markedly inferior to men. This is a fact. This fact causes problems. But the fact that this fact causes problems does not entail that we literally deny the fact. At least that’s my opinion.

Second, she analogizes sex and gender as social constructs to race as a social construct. I knew she was going to go there, because this is a rhetorical nuclear option which is going to quickly defenestrate interlocutors. She observes that:

“We look at race as a social construction. It is not genetic, it is not biological, and we believe the same is [true] for sex … The male-female dichotomy doesn’t cover everyone, right? We have trans people, intersex people.”

As I said above, the reporter was incredulous, but he had a hard time responding after Dr. Milner explicitly connected race and sex, because it is the mainstream position now that race is a social construct and lacks any biological basis. The facts may not be on Milner’s side, but she has the theory and the “moral arc of history” backing her. It would take great courage to still dig in and defend reality as it is, as opposed to her preferences.

The reality is that race and sex/gender are social constructs. The atom is a social construct. Matter and energy are social constructs. Cities are social constructs. Everything is a social construct, as we look through the glass darkly. But social constructs operate on various levels of clarity and distinctiveness and exhibit different levels of pliability and utility. Dalton’s atomic model is profoundly wrong. It has long been superseded by quantum physical models, which have the utility of making correct predictions, whatever their correspondence to reality on a metaphysical level might be. But the Daltonian model is still often implicitly the one introduced to children to allow them to gain some intuition as to the nature of how matter is constituted. In contrast, the metaphysical ideas of the ancients as to the material nature of the universe are both wrong, and, lacking in utility.

All models are wrong, but there are still superior and inferior models. Their measure is in how they correspond to, and predict, reality. Not how they correspond to our ethical judgements of how the universe should be.

Many sociologists dissent from this position. They’ve marched into the academy and taken it over. Because of their ideology that all things are social, they believe they can reshape the fabric of the universe through their own normative preferences. To me this is a problem. I struggle against it. Our deep human intuitions often reject, and recoil, against fragments of reality. But to successfully grapple with reality we need to attempt to understand reality on its own terms, not our own.

I may struggle in vain. Could it be the liberal Whiggish scientific moment in history is over? History is written by the winners, but perhaps in the future science will also be written by the winners. I’m not sure that the truth will win out. Perhaps the glass will become darker, rather than clearer. There are genuine difficult empirical questions about the nature of human variation and our dispositions, and how it relates to the values that we hold to be true. The fact that we’re still discussing sex segregation in sports and how it is unjust illustrates how far we’ve come in the solipsistic and socially constructionist direction.

Imagine that in the end of days all the mandarins will be sociologists, who come not to bring illumination of the truth, but to determine the nature of the truth for us to agree upon. Perhaps this is the true end of history, as humanity returns to an equilibrium where the bracing aspects of reality are shielded from the masses, which lay indolent in their delusions, while the technocrats and artificial intelligences confront the outside.

I’ve joked on Twitter that one aim of conservatives should be to defund disciplines whose avowed goals are to espouse a particular ideological viewpoint. Of course “scholars” in those disciplines might dispute the characterization of their chosen fields in such a manner, but the reality is that that’s how they roll. Conservative or moderate viewpoints are considered illegitimate and not worthy of consideration in many of these departments and disciplines. The political spectrum goes from mainstream liberals on the Right to Marxists on the Left. There is no reason that the “master” should be paying for someone to burn down his house.

Of course these viewpoints are concentrated in the “studies,” which is ironic as many of the scholars in these fields don’t study much, as opposed to being activists and ideologues espousing their views at length. Traditional humanities and philosophy are relatively sane compared to Women’s or Ethnic Studies, but I see where Rod Dreher’s reader, a professor in STEM, is coming from when he asks “Why Not Close Humanities Departments?”

The findings have proved divisive. Some researchers hope that the work will aid studies of biology, medicine and social policy, but others say that the emphasis on genetics obscures factors that have a much larger impact on individual attainment, such as health, parenting and quality of schooling.

“Policymakers and funders should pull the plug on this sort of work,” said anthropologist Anne Buchanan and genetic anthropologist Kenneth Weiss at Pennsylvania State University in University Park in a statement to Nature. “We gain little that is useful in our understanding of this sort of trait by a massively large genetic approach in normal individuals.”

Buchanan and Weiss are smart. Money is what fuels research, and without that oxygen further studies may not be possible. At least in the short term. Whole genome sequencing will become ubiquitous soon, so understanding these patterns is going to be a matter of joining a few tables somewhere. Imagine a future where Facebook has your genome as part of your profile; they could glean a lot about human behavior genomics simply by combining genetic states with online browsing and engagement patterns.

In my last post I drilled down on just a few of the results in the paper The genetic history of Ice Age Europe (ungated). There are many results which I didn’t really explore, in particular, the finding that there seems to be a gradual decline in Neanderthal ancestry within European populations over time. That’s for a follow up.

In any case, it’s an interesting time to be alive and be interested in these topics.

The epoch in which we are situated lies between an age of ignorance and one in which we will be overwhelmed by interpretations based on a surfeit of data. The whole genome of the Neanderthal was published in 2010. Today we have many more whole genomes, and probably on the order of 1,000 ancient genomes of varying quality in the pipeline (i.e., some of it is unpublished). Reconstructing the history of humanity from genetic data has transformed from inference from the tips of the phylogenetic tree to the examination of points deep in the nodes.

This reminds me of an argument that was highlighted in The Monkey’s Voyage between cladists with a background in systematics and paleontologists. The way paleontologists understood evolutionary relationships was to examine the fossil record, and reconstruct trees with putative ancestral forms and their descendants. The cladists asserted that this method relied upon the incomplete and unreliable fossil record, and so was not nearly as powerful as simply looking at extant variation in a more rigorous manner. Though the added rigor of the cladists arguably transformed the field of phylogenetics, as I have suggested before the extremism of the cladists in dismissing whole domains of knowledge and alternative methods has not swept the field.

Personally, I think that is a good thing. But, some of the warnings of the cladists probably need to be considered when taking into account the new results from paleogenetics. The reality is that in many ways there is little difference in terms of the raw data which paleogenetics and paleontology based on fossils provides. For various technical reasons phylogenetic inference from whole genomes of DNA sequence can be much more powerful than analyzing, for example, the teeth of an ancient hominin. Those teeth would give you phylogenetic and functional information. But reconstruction is more robust when you have tens of thousands (or perhaps millions) of variations, which is what DNA gives you. Second, those markers can tell you a great deal more about a variety of functions than simply teeth (I am not denigrating teeth here, as they are very informative!).

Extended data table 5

One thing which is more and more clear as more data comes in is that the genetic architecture of pigmentation in modern Europeans is a product of the Holocene, and perhaps even the last 4,000 years. A more sensational way to state this is that the Nordic phenotype may not have been present in appreciable amounts in any population when the pyramids of Giza were being constructed! Of course, there is a major caveat here in that we know that light skin emerges with different genetic architectures, so ancient Pleistocene Europeans may simply have had a different arrangement of functional SNPs. The main caution on this caveat is that pigmentation is a trait that is very well characterized across mammals as a whole, so prediction is much less dodgy than in other traits. If we eventually get enough high quality genomes from Gravettian period Europeans, and they lack derived SNPs across ten major pigmentation genes, then we can be pretty certain that they were in the ancestral state.

Researchers are then literally putting flesh on ancient bones. And yet we still see what we see. Paleogenetics suffers from the same issue as paleontology: skewed sampling. Especially when sample sizes for certain periods and regions are small, our illumination can’t give us a sense of what we don’t know. The genetic history of Ice Age Europe gives us a picture of genetic turnover, but one in which the Goyet sample representative of early Aurignacians turns out to be the ancestor of a population which pops back up into prehistory (the Magdalenians) after a long 10,000 year Gravettian interlude. Unless they traveled through a wormhole, it seems clear that this lacuna is a consequence of patchy and biased sampling. As the number of DNA samples increases we’ll get a better sense of how patchy and biased the methods turn out to be, but we’ll never totally abolish this problem. I believe that the same fact explains why many papers see a resurgence of Mesolithic hunter-gatherer ancestry among Neolithic farmers; the former were always there, but they are not being sampled across much of the temporal transect due to spatial patchiness.

As we traverse this period between ignorance and the potential for knowledge, we can start forming conjectures as to the shape of the future. In the comments below Andrew Oh-Wilke suggests that ancient people traveled much further than we might have guessed. I think this is right. In The Monkey’s Voyage the author suggests that in biogeography there has been a move away from vicariance to a somewhat stochastic long distance dispersal model to explain variation. The vicariance model emphasized the emergence of geographical barriers due to geological processes, and the subsequent divergence between two populations due to reduced gene flow. The idea behind stochastic long distance dispersal is basically that a lot of the patterns are due to random freak events, such as a small group of Old World monkeys somehow making it across the Atlantic from Africa, and becoming the ancestors of the whole family of New World monkeys.

The vicariance model has less relevance for human prehistory, because in most cases we’re not talking about geological time scales. There are exceptions, after a fashion. Beringia and Sahul both harbored populations, and after the sea levels rose groups were isolated on opposite sides of the water barrier. But there was always some contact even after this, because humans can traverse water barriers. There is an analog to the vicariance model in historical population genetics, and that is the isolation-by-distance model of human genetic variation and diversity. The major example is in the 2005 paper Support from the relationship of genetic and geographic distance in human populations for a serial founder effect originating in Africa. This model’s logic is sound. One imagines that humans start at a point in space, and expand outward through a demographic wave of advance, as groups disperse into territory inhabited by archaic hominins, or not inhabited by hominins at all (e.g., the Americas and Oceania). Because this results in a serial founder effect, you see a pattern where populations further away from Africa exhibit less genetic diversity. Additionally, genetic structure in humans can be conceived as dominated by geographic distance and exhibiting clinal variation.
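To make the serial founder logic concrete, here is a minimal sketch of how expected heterozygosity decays along a chain of founder events. It is not the 2005 paper’s method. It assumes each founder event acts like a single generation of drift in a group of N diploid individuals, so expected heterozygosity shrinks by a factor of 1 - 1/(2N) per event, and it ignores mutation, migration and population recovery. The function name and all the numbers are hypothetical, chosen only for illustration.

```python
def heterozygosity_after_founders(h0, founder_size, n_events):
    """Expected heterozygosity after 0, 1, ..., n_events serial founder events.

    Applies the standard one-generation drift result H' = H * (1 - 1/(2N))
    once per founder event, with N = founder_size diploid individuals.
    """
    decay = 1.0 - 1.0 / (2.0 * founder_size)
    return [h0 * decay ** k for k in range(n_events + 1)]


# Hypothetical values for illustration only (not estimates from any paper).
trajectory = heterozygosity_after_founders(h0=0.75, founder_size=50, n_events=100)

# Each step along the chain stands in for a colony further from the origin.
for step in (0, 25, 50, 100):
    print(step, round(trajectory[step], 3))
```

Diversity falls smoothly with the number of founder events, which is the clinal pattern the model predicts. The point of the following paragraphs is that a smooth decline like this can also arise from histories that look nothing like a simple wave of advance.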

What the paper Toward a new history and geography of human genes informed by ancient DNA did was show that the genetic data used to support the isolation-by-distance model of decay of genetic diversity did not have the power to truly show this was the correct model. What David Reich and Eske Willerslev (and others) have shown with ancient DNA (as well as novel methods in Reich’s case) is that 1) population turnover has been relatively common 2) most (all?) modern populations are best thought of as admixtures between ancient lineages, in many cases pulse admixtures that occurred rapidly. Like the vicariance model the isolation-by-distance model was boring and general. It was easy to model, and didn’t engage in special pleadings to historical contingency. In other words, it’s a perfect model to use as null hypothesis. But that doesn’t mean that it’s correct, or, more accurately, captures most of the dynamics.

An example may suffice. Europe is the most well elucidated of the major regions of the world in terms of prehistory. The “standard model”, utilizing simple and generic population genetic demographic processes produces a nice and simple model to fit the data. ~50,000 years ago humans leave Africa, they settle the Middle East/Central Asia. ~40,000 years ago they arrive in Europe. ~10,000 years ago farmers arrive from the Middle East, and expand into Europe from the southeast, with their genetic signal diluting over time to the northwest.

Here is the model, sketchily, informed by ancient DNA. ~50,000 years ago humans leave Africa, and mix with a number of Neanderthals. ~40,000 years ago, they arrive in Europe. ~35-40,000 years ago the first modern Europeans are replaced by another population. This second population is culturally similar to the first, and contributes some (though small proportionally) ancestry to modern Europeans. It is replaced by another population, which does not contribute much to modern Europeans (Gravettians), though populations related to it do. It is replaced by a population related to the first Europeans with descendants (Magdalenians, who are descended in part from Aurignacians, and do not share much drift with Gravettians). Then, the Magdalenians are replaced by Villabruna populations, the very late Paleolithic populations at the tail end of the Ice Age. The Villabruna have mixture from both the Near East, and to a lesser extent East Asia. Or, Villabruna populations were intrusive to the Near East, and possibly East Asia, or there were mediating populations between. It is all somewhat unclear. Then the Villabruna populations, which become Mesolithic hunter-gatherers, are overwhelmed by Near Eastern groups, which have very exotic ancestry unrelated to all other non-Africans (Basal Eurasian). Finally, the Neolithic groups are overwhelmed by populations from the steppe, who are themselves compounds of very distinct elements.

This is a difficult and historically contingent story. It is not neat, tidy, and is a dog of a model. It is not easy to generalize. But, it is probably a model which captures many more of the salient dynamics than the earlier one.

Going forward what generalizations can we take from this? Europe has been well elucidated for historically contingent and biogeographic reasons. But the rest of the world will come into the light of understanding in a similar fashion over the next ten years. One prediction I will make is that inter-group barriers were more powerful earlier in the human past than today, at least in terms of how they were relevant genetically. The emergence of meta-ethnic religions and fictive kinship may have paved the way for gene flow on a massive scale over the past 4,000 years. Additionally, human population density is such that the landscape of habitation is less patchy, and conventional continuous gene flow between adjacent populations is just more feasible. In prehistory human groups thin on the ground may have had to organize proactively to exchange mates, perhaps during gatherings which were culturally focused. This might imply that mate exchange was less a function of proximity than cultural affinity.

A pattern of turnovers that we see in Pleistocene Europeans aligns with the idea that socio-cultural boundaries were major fault-lines which were inimical to gene flow. Admixture between two groups in the recent past can occur when one collapses culturally, as occurred in the New World. But it also occurs as a matter of course through proximity, as is the case with the Hui in China. The balance of forces in the hunter-gatherer world may have been toward the former. Patchy sampling means we don’t know where the pre-Magdalenian and post-Aurignacian peoples were persisting over the 10,000 years of Gravettian domination, but they were there, biding their time. Any modern understanding of 10,000 years would lead us to expect massive mixing and gene flow, but that did not seem to occur (some did, but look at the admixture graphs and the Magdalenians are >50% Aurignacian, while the Gravettians are ~0%).

Second, the turnovers probably were partly due to ecological forces. At this stage in history humans were animals whose existence was strongly conditioned on natural vicissitudes. Small numbers of people may easily have gone extinct because of diminished opportunities, and drifted below sustainable levels. Particularly if they weren’t part of a broader network of redundant support, which seems unlikely to have been the case. Agricultural populations still retain a reservoir numerically even after famine. Hunter-gatherers may not have.

Finally, Europe may be a special case because it is on the frontier of habitation during a phase of glaciation, but it is unlikely to be totally sui generis. The branches of the human phylogenetic tree seem to be pruned rather regularly. The genetic history of other parts of the world is likely to exhibit the same pattern of turnover, and relatively recent roots for the demographically dominant group.

Though we often think of evolutionary processes as matters of either bones (i.e., paleontology) or genes (i.e., evolutionary genetics), that is not strictly true. There are other domains of study where evolutionary thinking and frameworks have been applied. In particular I’m thinking of evolutionary thought in the context of culture. This has a long history, and evolutionary models as metaphors are commonly bandied about, from Herbert Spencer to Richard Dawkins. But the reality is that there is little systematic and formal investigation of the topic. In the late 1970s to the middle 1980s six scholars attempted to change this. First, E. O. Wilson and Charles Lumsden in Genes, Mind, And Culture: The Coevolutionary Process. Theirs was arguably the most ambitious of the projects, but Wilson and Lumsden have moved on to other things. Next you have L. L. Cavalli-Sforza and Marcus Feldman with Cultural Transmission and Evolution. By and large both authors have moved on to other things, though Feldman at least still produces some research in the area of cultural evolution. I asked Cavalli-Sforza about cultural anthropology’s reaction to this book in 2006. He responded:

I entirely agree that the average quality of anthropological research, especially of the cultural type, is kept extremely low by lack of statistical knowledge and of hypothetical deductive methodology. At the moment there is no indication that the majority of cultural anthropologists accept science – the most vocal of them still choose to deny that anthropology is science. They are certainly correct for what regards most of their work.

His pessimism about cultural anthropology was warranted in my opinion.

If you are interested in the above topic, you should get a hold of at least one of the above books. For those with some background in evolutionary genetics modeling, you’ll feel very comfortable (I recommend Mathematical Models of Social Evolution: A Guide for the Perplexed for an up-to-date take). But today I bring this all up because Peter Turchin has just announced the birth of a new organization, the Cultural Evolution Society. In describing the backstory of how this society came about Peter references a visit to Davis in 2014. I happen to have been there, and had good fun with both Peters (Turchin and Richerson) dining on Korean barbecue and downing red wine. The precis for Ultrasociety was already present in Peter’s mind at that point, but I don’t recall talk about a society for the study of cultural evolution. That may be due to the fact I wasn’t privy to all the conversations, or that I was rather inebriated soon enough, as there was no way I could keep up with Peter Turchin!

I sincerely hope more students interested in evolution will begin to look to cultural processes as well. If you are a human evolutionary geneticist it strikes me as not just something that would be a bonus in terms of insight, but a necessary aspect of the field. For the past generation there has been an emphasis on culture alone, as the co-evolutionary ambitions of Wilson and Lumsden in their original groundbreaking work have been somewhat set to the side. I think that will change in the near future, as many of the thinkers who are pushing the field forward know that at some point cultural evolution and evolutionary genetics will fuse again….

First, let’s put this in context. Canids are a big deal. They’re big social mammals whose distribution and speciose character have undergone big changes across the Pleistocene. Sound familiar? Is it any surprise that one of their kind is our “best friend”? And, according to the anthropologist Pat Shipman, the symbiotic relationship between dog and man is responsible for the victory of our lineage of hominins in the evolutionary war of all against all. About six months ago that thesis would have seemed a stretch, as the origin of dogs does not date until almost the Holocene according to most genetic scholarship (though the paleontologists have found rather old suggestive skulls). That is tens of thousands of years after modern humans replaced other lineages. But ancient DNA suggests problems with the calibration of earlier work, which may have dated their divergence from wolves too recently. That and the fact that the emergence of dogs as a distinct group of canids might be concurrent with the arrival of modern humans to Eurasia make Shipman’s thesis at least feasible, if not probable. And note that I stated divergence from wolves, not derivation. It turns out that dogs are a sister lineage to Palearctic wolves, not derived from them. As observed in this paper extant lineages of wolves are genetically rather homogeneous, and seem to have diversified relatively recently, within the last 20,000 years, on the order of 10 to 20 thousand years after the last common ancestor of extant wolves and dogs.

Where do jackals play into this? The golden jackal has a distribution which covers both Eurasia and Africa. The species was defined morphologically. In other words, they look similar across their range. But sometimes you can’t judge a book by its cover. As an obvious example, most people would think on superficial inspection that a hyrax was a rodent. But a close examination of anatomical details indicated a relationship to elephants to classical taxonomists, which has been validated by DNA. Here, though, as the paper above states plainly in the title, the DNA contradicts inferences made from morphology. Wolves and dogs, and African golden jackals, form a monophyletic lineage, to which Eurasian golden jackals are an outgroup! This determination was achieved through mtDNA analyses, as well as phylogenetic reconstruction from specific genetic regions, and genome-wide comparisons on millions of polymorphisms.

But wait there’s more! One major difference between the example above of the hyrax vs. elephant and jackal vs. wolf is that the phylogenetic distance in the latter case is far smaller across the tips of the branch. That probably explains why morphological characters were not sufficient to discern the shared ancestry and derived characteristics of the wolf and the African jackal, as opposed to the Eurasian jackal. And, a corollary to this is that hybridization between these lineages is possible. In other words, this isn’t a phylogenetic tree, it’s a phylogenetic graph! Using D-statistics the authors show that there has been a fair amount of gene flow between Eurasian wolves and Eurasian jackals. And, in particular a lot of admixture from the Eurasian jackal to the dingo and basenji breeds.
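For readers who have not met D-statistics before, here is a minimal sketch of the ABBA-BABA calculation behind them. It is not the paper's pipeline: the function name, the population labels P1, P2, P3 and outgroup, and the toy allele frequencies are all hypothetical, and a real analysis would also need a block jackknife to judge whether D differs significantly from zero.

```python
import numpy as np

def d_statistic(p1, p2, p3, p_out):
    """Patterson's D from per-site derived-allele frequencies.

    Assumed tree: ((P1, P2), P3), outgroup. With no gene flow the ABBA and
    BABA site patterns are expected to be equally common, so D is near 0.
    D > 0 suggests excess allele sharing between P2 and P3, D < 0 between
    P1 and P3.
    """
    p1, p2, p3, p_out = (np.asarray(x, dtype=float) for x in (p1, p2, p3, p_out))
    # Expected probability of each site pattern, given the frequencies.
    abba = (1 - p1) * p2 * p3 * (1 - p_out)
    baba = p1 * (1 - p2) * p3 * (1 - p_out)
    denom = abba.sum() + baba.sum()
    if denom == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba.sum() - baba.sum()) / denom

# Toy, made-up frequencies at four sites (hypothetical data).
toy_d = d_statistic(
    p1=[0.0, 0.1, 0.0, 0.2],
    p2=[0.9, 0.8, 1.0, 0.7],
    p3=[1.0, 0.9, 0.8, 1.0],
    p_out=[0.0, 0.0, 0.0, 0.0],
)
print(f"D = {toy_d:.3f}")
```

In this toy case P2 shares far more derived alleles with P3 than P1 does, so D comes out strongly positive, which is the kind of signal that gets read as gene flow between lineages.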

Is this starting to sound a bit familiar? As population genomics has increased coverage of human populations, modern and ancient, as well as increasing marker density and accuracy, first approximation coarse phylogenetic trees have given way to threads of gene flow edges tracing their way across the thick branches. The trees have given way to myriad graphs which force us to make more subtle our understanding of the genetic background of our own lineage. I see no reason why the same will not be true for large mammals, or, frankly, an innumerable number of clades.

In the near future sequencing will be ubiquitous in ecological and systematic studies. At the coarsest big picture scale we’ll still see a confirmation of the tree of life as it’s classically envisioned, exploding outward from node to node, in subdivisions of clean monophyletic lineages, pruned by extinction, diversified by drift and selection. But as you focus in closely the bifurcations will turn in on themselves or thread together in a tangle, as the branches begin to be stitched together by gene flow. Look even closer and you’ll see that even within a young species, like humans, our local geographic pedigrees also collapse in on themselves, and tangle and coalesce down to a finite set of individuals, rather than the infinite space of genealogical possibilities.

If science is hard, history is harder. Harder in that the goal is to understand what happened in ages which are fading away like evanescent ghosts of our imagination. But we must be cautious. We are a great storytelling species, seduced by narrative. The sort of empirically informed and rigorous analysis which is the hallmark of modern scholarship is a special and distinctive thing, even if it is usually packaged in turgid and impenetrable prose. It is too pat to state that history was born fully formed with the work of Thucydides (or Sima Qian). In fact Thucydides’ pretensions at historical objectivity despite obvious perspective and bias lend credence to the assertions of those who make the case that the past is fiction (in this way Herodotus may actually have been more honest). The temptation is always great to paint an edifying myth which gives succor to national pride or flatters our contemporary self-image. The fact that modern nation-states in the technological age have vigorous debates about details as to the nature of periods of history in the recent past, when the people who lived during those times are still here to bear witness, is telling in terms of the magnitude of the task before us. Fraught questions must be answered with far fewer resources.

Much of history we see only vaguely through chance and contingency, known through happenstance and the whims of our ancestors. In the West the documents which shed light upon antiquity come to us through tunnels of finite transmissions, a furious period of textual transcription in the last few centuries before 1000 A.D. The Carolingians, the Byzantines, and the Abbasids all engaged in sponsoring the capital intensive project of taking ancient texts and making copies for posterity. The vast majority of the works of antiquity we have today can be traced back to this period[1]. Biases and concerns of the elites who sponsored these projects were critical in determining the nature of the source material which serves as the foundation for our understanding of the deeper past which we take for granted today. We know how little was copied because the extant material makes copious reference to a vast body of work which was circulating in the ancient world on assorted topics (and even many of the works we do have are only portions of multi-volume endeavours, such as that of Livy).

But what about pushing beyond what the text can tell us, and transitioning from history to prehistory? Here is where matters become opaque and conditional upon the nature of the texts (or lack thereof). This is clear when you observe that there are very early periods of human history when our knowledge of individual actors and daily life is actually greater than later epochs due to regress of civilization, or changes in technology which militated against preservation of texts[2]. The “Dark Ages” of Greece between the Mycenaeans and the Classical Greeks are the purview purely of archaeology (and even during the Mycenaean period most Linear B texts were of a bureaucratic nature; I do not know of narrative literature such as we have for Egypt or Babylon). For the Classical Greeks the rupture was traumatic enough that their Mycenaean past became the subject of legends. The citadels of the Bronze Age warlords were viewed as “cyclopean” works, as if only giants could have created them. Similarly, the period in Britain between the end of central Roman rule and the Christianization of the Anglo-Saxons, about two centuries, is perceived only faintly because of the paucity of written records (this also explains why this period is often utilized as the setting for historical fantasy).

Yet when text is silent one still has material remains. Their collection and analysis are the domain of archaeology, a historical science. The fact that history as we understand it deals in the written word, and so limits its focus to the period when we have texts, is itself a historical coincidence. Ideally traditional history and archaeology should work in concert, and critically, words have a way of deceiving and misleading. Most obviously we have a major ascertainment bias in our understanding of the past when we listen only to the perspectives of those who can speak through words, because those who were literate or had access to literate professionals were a very small subset of the broader human experience. Archaeology has less of this bias, because all classes leave behind their material evidence (though if one wants textual representations of a broader cross section of the Roman populace, the novel The Golden Ass is a good place to start). An excellent illustration of this for me, as readers know, is the extended argument in the book The Fall of Rome, which brings material evidence to buttress the position that the decline and fall of the unitary Roman state in the 5th century coincided with a genuine degradation of what we might term civilization. Revisionists looking purely at textual materials have long argued that the classical view was misleading, and, to reduce their argument down to its essence, suggest that classical civilization evolved and transformed, channeling its energies into different activities (e.g., the rise of Christian theology as a successor to the classical liberal arts, see Peter Brown’s The Rise of Western Christendom). But what material remains tell us is that there was indeed an economic and demographic collapse, despite apologia that one can make as to the reshaping of high culture in texts. One may choose to weight these facts, or not, but the facts nevertheless remain, no matter how many glosses one wishes to put upon them. The Rome of 600 may have had many more Christian theologians than the Rome of 400 (which was then a mainly non-Christian city), but the Rome of 400 probably had a population on the order of 10-20 times greater.

In a world without text, which is almost all of human history, the material remains are all that we have to grasp upon. Though we can attempt to glean the minds of people long gone from paintings and scratches in stone, the reality is that what they hunted with, what they ate with, and the dwellings in which they lived, are going to give us concrete information where leaps of imagination are unnecessary. Moving beyond the text can allow us to truly illuminate the vast dark oceans of human history with more than our dreams, from the dawn of our species, down to even recent periods when literacy was the privilege of the few, and the experiences of the many were dead to us. Despite this, the paintings have only a few colors on the palette, because archaeology is filled with enormous gaps in perception. Pots not cloth. Caves not tents.

Which brings us to biology, and specifically genetics, as it turns out that DNA is actually one of the material remains that one can extract from archaeological field sites. It’s a robust macromolecule, and today researchers believe that it is feasible that some information can be drawn from remains as old as 1 to 2 million years, though that’s a best case scenario. When it comes to questions of demographic change genetic insights are key, and present data in a way that allows for more rigorous analysis. As has been the case in previous posts I must now give a nod here to L. L. Cavalli-Sforza and The History and Geography of Human Genes. Cavalli-Sforza’s magnum opus reopened the book in attempting to understand history through demographics. It was the first page, and the first chapter. Prior to this, before World War II, there was a cottage industry which attempted to do what Cavalli-Sforza achieved in the late 20th century. But these endeavors were hobbled by two problems. First, they were not scientific, often relying upon intuition derived from their authors’ erudition (they were not hypothetico-deductive, though that’s overrated if you have lots of data). Second, the reliance upon intuition meant that many of the conclusions dovetailed rather neatly with the ideological preferences of the day, National Socialism most horrifically, but spread much more widely than that was a shoddy, nationalism-inflected prehistory. Scientific romance without the genocide (see Pat Shipman’s The Evolution of Racism). After World War II archaeologists reversed course and decoupled cultural evolution and change from demographic variation. Works such as The Races of Europe became anachronistic when decades before they’d have been mainstream, and there was a strong bias toward a null hypothesis that pots, that is cultural traditions, migrate, but people do not.

Into this intellectual climate stepped Cavalli-Sforza and his students, triggering a minefield of academic explosions (see The Human Genome Diversity Project: An Ethnography of Scientific Practice). Molecular anthropology in its earliest incarnations focused on deep time. In particular, there was a recalibration of time depth of the origin of apes and humans, where the molecular biologists clashed with paleontologists, and came out the victors (see The Monkey Puzzle for a history of these controversies). Then, there was the “Out of Africa” debate (see The African Exodus). Though these were somewhat fractious and personalized arguments, the emotions around the implications of these contests of ideas were often limited to scholars (though the scholars themselves may not have felt the fallout was limited; apparently at Stanford in the late 1990s a cultural anthropologist gave a presentation where he juxtaposed a photo of Cavalli-Sforza with Josef Mengele). What Cavalli-Sforza did was bring genetic science toward addressing more contemporary phenomena, to answer questions which come to the cusp of the present, tackling issues of relevance to living human people on the scale of nations and peoples. Over many decades his lab collected enough information from hundreds of genetic loci to arrive at the sum totality of inferences which were eventually presented in The History and Geography of Human Genes.

Let’s take a step back here. Cavalli-Sforza and his colleagues had access to hundreds of markers at best. Note that ~2% of the human genome codes for proteins, but there are 3 billion positions in terms of bases. Today anyone who wants to pay can get millions of positions through SNP-chip services. My son has billions of positions, because he’s been whole-genome sequenced. For phylogenetic purposes you don’t need billions, millions, or even thousands, depending on the nature of the questions you have in mind. But, it puts in perspective how far we’ve come in literally 20 years. Even 5 years.

As is the nature of science there was much that Cavalli-Sforza got wrong in The History and Geography of Human Genes. But there was much that he got right, because the results were so clear and strong on particular points of contention. In short, very broad patterns on the continental level jumped out when analyzing even hundreds of neutral (that is, not subject to natural selection) markers. For example, the data confirm a gradient of genetic diversity which implies human origins from an African locus, as well as the relative homogeneity of Europe (aside from Finns, European populations have a surprisingly low between-population pairwise genetic distance in most cases). But, more subtle counterintuitive relationships were often not robust (e.g., North and South Chinese do not bifurcate in the manner that he reported in the 1990s). And, most critically for the purposes of this post inferring past demography from current phylogeographic patterns had serious limitations.

*The present as a window into the past*

The basic idea behind historical population genetics (archaeogenetics) which was pioneered by Cavalli-Sforza at the HPGL at Stanford was to look at patterns of diversity and relatedness among modern populations, and intersect that with what was and is known about history, as well as geography, and then allow those intersections to peel back the palimpsests of human history (see his The Great Human Diasporas). Though Cavalli-Sforza focused initially on autosomal markers scattered through the genome, in the period between 1995 and 2005 there was a great deal of work using uniparental data, the markers on the Y chromosome and mtDNA. The mtDNA is passed through women only, is copious in terms of quantity on a cellular level, and has a highly mutable region of utility for molecular phylogenetics. The Y chromosome exhibited some technical difficulties in comparison to mtDNA, but with the emergence of better extraction techniques as well as a focus on highly mutable microsatellite regions, it came to be set next to mtDNA as a critical tool in the forensic reconstruction of human population history. In addition, both had the virtue of being nonrecombining, so that the generation of a phylogenetic tree was not an artificiality, but a reflection of the nature of the transmission of these two regions of the genome (congenial to a coalescent framework as well).

In the end this line of research often resulted in a transposition of a phylogenetic tree upon a world map, outlining patterns of human migration. It also aligned well with another line of research which explicitly modeled the expansions of humans out of Africa as a “serial founder bottleneck” process. That is, each population which left Africa progressively branched out in a unidirectional manner, resulting in reduced genetic diversity as one progressed out of Africa.

Ramachandran, Sohini, et al. “Support from the relationship of genetic and geographic distance in human populations for a serial founder effect originating in Africa.” Proceedings of the National Academy of Sciences of the United States of America 102.44 (2005): 15942-15947.
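To make the “serial founder bottleneck” idea concrete, here is a toy calculation. It is only a sketch under simplifying assumptions of my own, not the method of the paper cited above: each step is a single-generation bottleneck of a fixed number of diploid founders, and mutation, later drift and migration are ignored. Under those assumptions expected heterozygosity shrinks by a factor of 1 - 1/(2N) per founding event. The function name and the numbers are hypothetical.

```python
def serial_founder_heterozygosity(h0, founders_per_step, n_steps):
    """Expected heterozygosity after 0..n_steps successive founder events.

    Assumes each founder event is one generation of random sampling of
    `founders_per_step` diploid individuals, which retains a fraction
    1 - 1/(2N) of the expected heterozygosity.
    """
    if founders_per_step < 1:
        raise ValueError("need at least one founder per step")
    retain = 1.0 - 1.0 / (2.0 * founders_per_step)
    return [h0 * retain ** k for k in range(n_steps + 1)]

# Hypothetical starting diversity and founder size, ten colonization steps.
trajectory = serial_founder_heterozygosity(h0=0.30, founders_per_step=50, n_steps=10)
for step, h in enumerate(trajectory):
    print(f"step {step:2d}: expected heterozygosity = {h:.4f}")
```

The output declines monotonically with each step away from the source, which is the qualitative pattern of diversity falling with distance from Africa that the model is meant to capture.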

In its broadest strokes this model is not without validity. It does seem that most of the ancestry of modern humans can be traced to a population which flourished around or in Africa ~50-100 thousand years ago. Much of the inter-continental racial variation that we see in extant populations does nicely fit onto a bifurcating tree-like model (e.g., Non-Africans branch off from Africans, West Eurasians and East Eurasians diverge, Amerindians branch off from East Eurasians). The problem though is that the branches themselves turn out to be brambles which turn back in on themselves, and in some cases twist with other branches, creating lineages with very diverged ancestral roots. The yield of the earliest efforts by Cavalli-Sforza and his heirs was on a very coarse continental grain, where the effects of the dynamics were so striking that they would exhibit themselves across most neutral markers without much difficulty. But, when the questions were narrower, and the temporal and spatial scope more constrained, the earlier methods were not perceptive enough to smoke out the real dynamics.

By the middle years of the 2000s researchers had gone back to a focus on recombining autosomal markers. But now they had a whole human genome to compare it to, as well as SNP-chips which quickly yielded large troves of data with little effort. In 2008 a paper was published which took the original HGDP data set collected by Cavalli-Sforza and his colleagues, and utilized the new technologies to make deeper inferences. First, instead of hundreds of markers you had 650,000 SNPs. Second, the emergence of powerful new analytic and computational resources allowed tree-based and PCA visualizations of genetic relationship to be complemented with model-based understandings of genetic variation and population structure. By “model-based,” I mean that the algorithm posits particular parameters (e.g., “3 ancestral populations”) and operates upon the data (e.g., “650,000 SNPs in 1000 individuals”), to generate results which are the best representation of the fit of the data to the model. This is different from PCA, which has fewer assumptions, and represents genetic variation geometrically (each axis represents an independent dimension of variation within the data). Model-based clustering is very clear and aesthetically appealing. It gives precise results. But, the model itself is not necessarily right.
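To give a feel for what “fit of the data to the model” means, here is a minimal sketch of the kind of likelihood that programs in the STRUCTURE and ADMIXTURE family try to maximize. I am assuming that family of models is what is meant here. The function name, the matrices `q` (individual ancestry fractions) and `f` (allele frequencies in each of K ancestral populations), and the toy data are all hypothetical. Real software also has to estimate `q` and `f`, which this sketch does not attempt.

```python
import numpy as np

def admixture_log_likelihood(genotypes, q, f):
    """Log-likelihood of 0/1/2 genotypes under a K-population admixture model.

    genotypes: (n_individuals, n_snps) counts of one allele per SNP
    q:         (n_individuals, K) ancestry fractions, each row sums to 1
    f:         (K, n_snps) allele frequencies in the K ancestral populations
    Each genotype is modeled as Binomial(2, p) with p = q @ f; the constant
    binomial coefficient is dropped because it does not depend on q or f.
    """
    g = np.asarray(genotypes, dtype=float)
    p = np.asarray(q, dtype=float) @ np.asarray(f, dtype=float)
    p = np.clip(p, 1e-9, 1 - 1e-9)  # avoid log(0)
    return float(np.sum(g * np.log(p) + (2 - g) * np.log(1 - p)))

# Toy example: K = 2 ancestral populations, 3 SNPs, 2 individuals.
f = [[0.9, 0.9, 0.1],
     [0.1, 0.1, 0.9]]
genotypes = [[2, 2, 0],
             [0, 0, 2]]
ll_good = admixture_log_likelihood(genotypes, [[1, 0], [0, 1]], f)
ll_bad = admixture_log_likelihood(genotypes, [[0, 1], [1, 0]], f)
print(ll_good > ll_bad)
```

The assignment that matches each individual to the population whose frequencies resemble its genotypes scores a much higher likelihood than the swapped one. Note that K is fixed by the user before any fitting happens, so the numbers that come out are only as sensible as that choice.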

Anyone who uses these methods understands their limitations. If you use PCA to project variation of the data set, then the composition of the data you input is going to influence the largest principal components. Therefore, if you are asking questions on a broader spatial scale you should be careful about the possibility that you are overloading the sample set of interest with particular populations. More data in this case might result in less insight. Similar issues crop up with model-based clustering if you don’t appropriately weight the populations. Another major problem is that the models are imposing limitations which might produce false inferences (false in that they do not accurately reflect demographic history). Most simply you might ask for many more population divisions than is realistic for the demographic and genetic history of the data. Consider a data set of Irish from Cork and Nigerians from a small village. PCA would no doubt show you two very tight and distinct clusters. With a model-based framework you could look for divisions and structure beyond K = 2 (two ancestral populations). The method is devised in such a way that you would get results. But, they wouldn’t be very informative, and they’d be forced. They wouldn’t be robust. The model would be a poor fit to reality.
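The two-population thought experiment can be mimicked with simulated data. This is a rough sketch, not real genotypes: the two sets of allele frequencies are drawn independently at random (a hypothetical stand-in for two strongly differentiated groups), genotypes are sampled as Binomial(2, frequency), and PCA is done by plain SVD of the column-centered matrix with no further normalization. The function names and sample sizes are my own choices.

```python
import numpy as np

def simulate_genotypes(freqs, n_individuals, rng):
    """Draw diploid 0/1/2 genotypes, one Binomial(2, freq) per SNP."""
    return rng.binomial(2, freqs, size=(n_individuals, len(freqs)))

def pca(genotypes, n_components=2):
    """PCA via SVD of the column-centered genotype matrix."""
    x = np.asarray(genotypes, dtype=float)
    x = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)  # fraction of variance per component
    return u[:, :n_components] * s[:n_components], explained[:n_components]

rng = np.random.default_rng(0)
n_snps, n_per_pop = 500, 20
freq_a = rng.uniform(0.05, 0.95, n_snps)  # hypothetical population A
freq_b = rng.uniform(0.05, 0.95, n_snps)  # hypothetical population B
geno = np.vstack([
    simulate_genotypes(freq_a, n_per_pop, rng),
    simulate_genotypes(freq_b, n_per_pop, rng),
])

coords, explained = pca(geno)
print("variance explained by PC1, PC2:", explained)
print("PC1, population A:", np.round(coords[:n_per_pop, 0], 1))
print("PC1, population B:", np.round(coords[n_per_pop:, 0], 1))
```

PC1 splits the two groups cleanly and carries most of the structured variance, while PC2 is close to sampling noise. Asking a clustering model for K = 3 or more on data like this would still return an answer, but it would be carving up that noise.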

*From model to reality*

Obviously no model captures all elements of reality. But when the model deviates so much from reality that you get a false sense of what is true then that model is not nearly as useful. Being wrong is a definite bug. Aside from model-based admixture analysis, which posits a finite number of ancestral populations which come together to produce the genetic variation in the data set, you notice that the 2008 paper also had a tree representation of genetic variation. These two together give real and substantive results that can be useful. But, they mislead to the point of falsity in many specific cases.

This can be illustrated by the instance of South Asians, who are about 20% of the world’s population. A 2009 paper, Reconstructing Indian Population History, utilized both the higher autosomal marker density sets and new analytic frameworks to come to some specific conclusions which resolve many confusions about the nature of the genetic history of the peoples of the Indian subcontinent. So what did we know before? If you go back to the ideas of the old physical anthropologists they observed that many South Asian groups had an affinity to the peoples of West Eurasia (Europeans and West Asians). This varied as a function of geography and caste. In other words, there was a cline to the northwest, as well as up and down the caste system. You can see it in a PCA, where Indian groups vary in distance from Europeans, while Europeans form a very tight cluster. It also shows up in admixture based analyses. There is usually a K value where a South Asian modal cluster emerges, and it is near fixation in South Indian non-Brahmins, declining in frequency as one moves toward Pakistan, or, in North India up the caste hierarchy (the residual are West Asian and European clusters, except Bengalis, who have East Asian admixture). In The History and Geography of Human Genes South Asians form an outgroup to Europeans and Middle Eastern populations using older distance measures.

So far all good. One can imagine then a cline of genetic variation, with South Asians at one end, and West Eurasians at the other. On a PCA between East Asians and Europeans South Asians usually fall in the middle, but closer to Europeans. But there have long been major problems with this model when you drill down into the details. First, the mtDNA and Y chromosomes of South Asians give very different results. The former classes them as distinct from West Eurasians, with distant affinities to East Eurasians. The latter on the other hand are quite a bit more like West Eurasians. Second, South Asians exhibit a lot of variation as a function of both geography and class in terms of their relatedness to world populations. If South Asians were deeply rooted in the subcontinent, as the migration maps above would imply, then we’re talking about massive barriers to gene flow which have persisted for tens of thousands of years. An alternative explanation is that South Asians are the product of recent admixture between two very different groups, which is what is often the norm when there is a lot of inter-individual variation in ancestral components and PCA position within a putative population group (e.g., African Americans). Finally, tests of natural selection geared toward detecting very recent sweeps have indicated a commonality between South Asians and Europeans and Middle Easterners on the haplotype of SLC24A5, which implies either extreme connectedness, or recent admixture and migration (on the margin these two models are going to be hard to distinguish, since connections are mediated through migration).

I will sidestep the technical issues at this point, and just offer up that the work on South Asians has presaged much of what we’ve learned over the past decade when it comes to the genesis of modern population structure. The puzzles about South Asian genetic variation are resolved when you admit a model where a West Eurasian population mixed with a local indigenous group with distant affinities with other East Eurasians (see Genetic Evidence for Recent Population Mixture in India). The high level of between population variance within South Asia is due to the recent nature of the admixture event and the high genetic distance between the source populations. This may actually be the story of much of the world over the last 10,000 years. Instead of a regular branching process, imagine branches that periodically fuse back together, in a reticulated pattern. Another way to conceive of it is that the last 10,000 years have been a story of the destruction of population structure accrued over the past 100,000 years. A survey of this field can be found in the review Toward a new history and geography of human genes informed by ancient DNA.

*Inference made concrete, ancient DNA*

Up until now we have been talking about increasing the power of analysis of genetic variation in extant populations. Processes like bottlenecks and positive selection leave footprints in the genomes of modern peoples. But these methods of inference have limits. And, to a great extent they necessitate a simplicity of population dynamics to allow for them to have utility in painting a portrait of the past. Researchers had to assume that the past was simple, or the methods that they had wouldn’t be able to tell them as much as they claimed. The complexity of the demographic palimpsest could never race beyond the ability of the genetic methods to peel it back, so there was a ceiling on the number of layers imposed upon the model.

Ancient DNA was a game changer, because it did not come with these limitations. Instead of just inferring the past from the present, the past could now be inferred from the past! That is, a temporal transect in time could be generated which explicitly explored the trajectory of genetic variation across time and space. As if to recapitulate history the earliest work was with mtDNA, just as it had been with “mtDNA Eve” in the 1980s. The sequence target here is small and mtDNA is copious. The immediate upshot though is that massive discontinuities were detected. Populations replaced each other repeatedly in many regions. Pulse admixture events being inferred with novel methodologies on extant populations now could be understood to have been the natural result of migration and population change over the past ~50,000 years. Thanks to the work of researchers such as Svante Paabo and Eske Willerslev the number of samples we have from ancient DNA for humans has grown to such an extent over the past 5 years that a bright line is shining into what had been a dark cavern of prehistory.

*European man, made and unveiled*

Because of both the concentration of researchers in Europe, as well as suitable preservation conditions in Northern Eurasia, ancient DNA has totally changed how we understand the genetic history of this continent most especially. Two new papers have expanded the sample set to 170 individuals, and many major questions have now been answered, and other new questions have been triggered by perplexing results. A few years ago I was talking to Spencer Wells about the age that we are privileged to live in. Spencer is a history and genetics buff (he was one of Richard Lewontin’s last grad students). So naturally as genetic science has emerged to shed light on history we’ve tracked its developments very closely. Spencer does so professionally, as he’s a genetic anthropologist. Many questions which in the past would have been unanswerable are now answerable. Truth is coming at us so fast that it is hard to even respond to all of it (if you wait too long to publish, everything might have changed).

Carl Zimmer’s piece in The New York Times, DNA Deciphers the Roots of Modern Europeans, is accurate as to the current state of the accelerating research in this area. This is the equivalent of having a Rosetta Stone. The ancients are now coming back to life. They speak! Everything has changed. In Nature Ewen Callaway quotes a scientist stating in plain language, “Christ, what does this mean?” I’ll try and flesh out further what it means, but the papers themselves do a good job. These are first steps, but they’re very big steps. There’s only so much more to go, and truth will be at hand.

The old debate over whether Europeans are descended from farmers or hunter-gatherers was always somewhat incoherent. All humans are descended from hunter-gatherers. Rather, the issue was whether modern Europeans descend primarily from people who were resident within the continent of Europe at the end of the Pleistocene, or whether they descend from peoples who developed agriculture in the Middle East ~10,000 years ago. That is, did farming spread through cultural diffusion or migration? Plants or people? The answer is actually not straightforward, but the results are not controversial today.

First, migration seems to have been the dominant dynamic which defined the spread of farming, especially early on. These first farmers who arrived in Europe were genetically very different from the hunter-gatherers of Europe’s north and west. Some of their ancestry had been isolated by long distances for tens of thousands of years before contact. The people of the Iberian peninsula today have less genetically in common with the hunter-gatherers who were present in the region when the farmers arrived than do modern Northern Europeans, who harbor a greater fraction of ancestry which derives from the Pleistocene people. The main qualifier I’d put on this though is that the farmers themselves seem to have picked up European hunter-gatherer admixture on their way out of the Middle East. The fraction is on the order of ~50%. The other component has been termed “Basal Eurasian,” because this element is an outgroup to all other Eurasians, including the European hunter-gatherers. That is, the Basal Eurasians are an outgroup to a clade that includes such diverse populations as Andaman Islanders, Australian Aborigines, Japanese, and European hunter-gatherers.

The figure to the left is from the paper Ancient human genomes suggest three ancestral populations for present-day Europeans. WHG = “Western (European) Hunter-Gatherers.” EEF = “Early European Farmers.” You can see that EEF is a compound. I don’t think there’s too much clarity right now as to where the EEF got its WHG-like ancestry. It could have been structure in the Middle East. Or it could have been in Southeast Europe. In the supplements of Haak et al. they test a Hungarian sample, and it does seem that the EEF individuals are closer to it than to the Western European hunter-gatherer samples. So there might have been structure in the ancestral European population, but the confidence here is low. And from what I can tell Basal Eurasian is still something of a mystery, almost occupying the role of “Planet X” before the discovery of Neptune. To make the patterns make sense they have to exist, but much isn’t known about them in detail. And of course there seems to be a huge lacuna right now in terms of exploring the population genetics of the Middle East in a similar fashion as has occurred in Northern Eurasia (my understanding is that Carlos Bustamante was an important person in getting Latin American populations in the 1000 Genomes; unfortunate that there wasn’t someone else to advocate for including a Middle Eastern group, since this is such an important part of the world for human history).

With all that said, if one assumes that the West Eurasian admixture in EEF was from European hunter-gatherers, then it is clear that most of the ancestry of modern Europeans can date to the Pleistocene (i.e., EEF + Yamnaya likely means more than half the ancestry is WHG-like if you look back 10,000 years). But this proportion obscures the fact that massive migrations and population turnovers have occurred, so that a simple model of expansion out of Ice Age refuges no longer holds. Cavalli-Sforza has long argued that pure proportions of ancestry are less important than the dynamic, as population growth driven “waves of advance” will over time dilute the initial genetic signal anyway (though the final proportion of non-WHG-like ancestry is actually higher in much of Europe than Cavalli-Sforza conceded in the early 2000s). Whether or not the ancestry of modern Europeans derives predominantly from European hunter-gatherers, the idea of dominant local continuity in a given region has been thoroughly refuted. The hunter-gatherer ancestry in the British Isles, for example, may be mostly from admixture into agricultural groups far to the south and east during the initial waves of advance, not from the people who initially recolonized Northern Europe in the early Holocene.

The second demographic turnover event which has been highlighted by the papers cited so far is from the east: the migration from the steppes. This event had disproportionate, even dominant, impact across much of Northern Europe. Culturally it is often rooted in the Yamnaya complex, which gave rise to various disparate and wide ranging “daughter” societies. David Anthony’s The Horse, the Wheel, and Language surveys the archaeological terrain thoroughly. If you are interested in this topic, and haven’t read it, do read it. In this work Anthony outlines the spread of Indo-European languages via expansion of a mobile pastoralist elite. He was involved in the retrieval of some of the samples in these studies, and from what I understand he was personally surprised that the genetic data imply not just elite migration, but a folk wandering. Not just a band of brothers, but whole peoples on the move.

Haak, Wolfgang, et al. “Massive migration from the steppe was a source for Indo-European languages in Europe.” Nature (2015).

Focusing on the genetics, these people seem to themselves be a compound of disparate elements. First, roughly half of their ancestry derives from a population which Haak et al. term “Eastern Hunter-Gatherers” (EHG). And the other half derives from a population with affinities to those of the Near East, but different from that of the EEF. There is some disagreement between the two papers in Nature as to the details, but Allentoft et al. admit that they did not have EHG samples, which may have impacted their ability to detect admixture. Allentoft et al. also diverge from Haak et al. in the emphasis they place on the ancestral component among the Yamnaya which some term “Ancient North Eurasian” (ANE) based on the location of the most ancient individual of this line (see Upper Paleolithic Siberian genome reveals dual ancestry of Native Americans). What does seem clear is that this element is deeply diverged from other West Eurasian populations, on the order of ~20 to 30 thousand years. And it contributes about half the ancestry of the EHG (the rest is WHG-like). The descendants of the Yamnaya people brought this component all throughout Europe, with the exception of the Sardinians and Sicilians, likely isolated because of their position on the Mediterranean littoral (Sicilians have later Near Eastern admixture as well). But this is not limited to Europeans, as a substantial proportion of the ancestry of Native Americans and of West and South Asians (at least the Kalash) is also connected to this component. Allentoft et al., like Haak et al., point out that there was likely structure in this broader group. That is, the ANE themselves were diversified, with the ancestors of the element in Native Americans and Europeans different from that which contributed to the Siberian component. In fact I have talked to researchers who believe that the term “Ancient North Eurasian” is misleading, as there is little clarity on the distribution of this group (the highest inferred fractions in Eurasia are in the North Caucasus). It is feasible that the Kalash have a different ANE source than Europeans.

A key issue to note, and one that confuses some people, is that the ancestry of groups such as Yamnaya exhibited commonalities with other groups across Eurasia. Therefore, if you replaced similar groups then the change in admixture components utilizing model-based programs may not be as extreme as you would think. To make what I’m getting at concrete, the population transfer between Greece and Turkey during the 1920s was far more impactful as a dynamic than simple before and after admixture estimates would suggest to you (since genetically the two groups were very similar). The figure from Haak et al. does not use admixture components that break out naturally, but their inferred demographic mixes taking into account the genetic character of the putative ancestral populations. The blue component refers to WHG, but WHG-like ancestry is also in both the green (Yamnaya) and orange (EEF) elements (this is why I’m saying it is likely that modern Europeans are mostly >50% WHG-like).
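The bookkeeping behind that parenthetical is just a weighted sum, and a minimal sketch makes it explicit. The function name, the mixture weights, and the Yamnaya share below are illustrative assumptions rather than estimates from Haak et al.; only the ~50% WHG-like share in EEF comes from the discussion above.

```python
def whg_like_fraction(weights, whg_share):
    """Total WHG-like ancestry as a weighted sum over source populations.

    weights:   mixture proportion contributed by each source (must sum to 1)
    whg_share: fraction of each source's own ancestry that is WHG-like
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(weights[src] * whg_share[src] for src in weights)


# WHG is entirely WHG-like by definition; EEF ~50% is the rough figure
# quoted earlier; the Yamnaya value is a PLACEHOLDER, not an estimate.
whg_share = {"WHG": 1.0, "EEF": 0.5, "Yamnaya": 0.4}

# HYPOTHETICAL mixture weights for one modern population
# (blue, orange, and green components; not taken from the paper).
weights = {"WHG": 0.2, "EEF": 0.4, "Yamnaya": 0.4}

total = whg_like_fraction(weights, whg_share)
print(f"blue component alone: {weights['WHG']:.2f}, WHG-like in total: {total:.2f}")
```

The only point is that the WHG-like total can sit well above the blue component on its own; whether it clears 50% for a given population depends on the real weights and on how much WHG-like ancestry the Yamnaya actually carried.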

One temporal dimension that Haak et al. emphasize in particular, but which seems clear in Allentoft et al. as well, is that non-Yamnaya ancestry slowly begins to rise again by the Bronze Age. Why? I will address that below. But Allentoft et al. have broader Eurasian samples, including likely Indo-European populations in the trans-Ural and trans-Altai regions. In both of these areas the successor cultures had EEF-like ancestry. That is, like the Corded Ware population, and unlike the parent Yamnaya group. This strongly implies back-migration by this complex from Eastern Europe, as far east as western China, during the Bronze Age.

In The New York Times piece David Anthony states two things which puzzle me as an interested lay person without his expertise. First, he seems to think that the amalgamation of the Yamnaya and EEF-descended populations was not a warlike process. Specifically he says, “It wasn’t Attila the Hun coming in and killing everybody.” This is a useful image, but let’s be honest and note that the Huns were not primary producers, and did not aim just to increase pasturage by killing settled peoples as Genghis Khan later wanted to do (see The End of Empire: Attila the Hun & The Fall of Rome). Rather, they conquered and subordinated other barbarian groups, as well as extorted tribute from the East Roman Empire. The demographic impact of the Huns came not directly from them, but from the fact that they and their successors (in particular the Avars) facilitated the migration of other groups, first the Goths, and later the expansion of the Slavs. By the time of Attila barbarian leaders were well aware that the conquered were vital as economic producers whose capture and subjugation would allow them to engage in status competitions of conspicuous consumption. I do not believe that this was quite the case in the Copper and Bronze Ages beyond the limes of the civilized world, which was then a small archipelago of literacy in a sea of barbarism. Both the above papers indicate massive demographic disruption across Europe. Though war as we understand it is not necessarily inevitable for our species, between the rise of agriculture and the modern period it seems to have been very common. It is not a coincidence that the Scandinavian Corded Ware culture is also called the Battle-Axe culture. Yes, many archaeologists believe that the axes were primarily status symbols. I’m willing to bet many archaeologists are wrong. It’s been known to happen.

The second issue which Anthony brings up is the connectedness of the various post-Yamnaya cultures, in particular that of the earliest Indo-Europeans on the fringes of western China, 4,000 miles from their likely point of origin. The genetic characteristics of these eastern groups are also such that it is likely that there was gene flow from Europe, mediated by a common steppe culture. Anthony states that “I myself have a hard time wrapping my head around explanations for that”. This totally confuses me, because he’s a professional archaeologist, so he must know that widespread gene flow and cultural ties across the vast swath of the Eurasian heartland are not surprising at all! To Carl Zimmer I pointed out the example of the Göktürk Empire of the mid 6th century A.D., which expanded rapidly from the core Altai zone, and prefigured the later distribution of the Turkic people, from the Nile to the fringes of the Arctic sea. Language and lifestyle mediate relationships and demographic contact. The peripatetic character of steppe peoples is well known and attested from the historical and semi-historical record. Groups such as the Huns, Avars, and Alans had inchoate origins in the heart of Eurasia, and moved back and forth along lines of cultural affinity as needed. Alans were serving under the Mongols in China in the 13th century, but 800 years earlier they had accompanied the Vandal tribe to North Africa, and maintained a separate identity there until the conquest of Justinian. It seems entirely plausible that this pattern of hyper-mobility arose with agro-pastoralism along the whole range of continuous ecological appropriateness, only ending with the rise of gunpowder empires and the crushing of the Oirat by the Manchus (with the tacit approval of Russia).

*Northern European archetypical physical characteristics are younger than the pyramids*

Spencer Wells, a new look in the world

Phylogenomics is tangled and complicated still, even with all these new results. I’ve only scratched the surface above. You really need to read the papers, and their supplements, to even get a sense of what’s going on (yes, ideally you’ll know what an f3 statistic is!). But, the population genomics which give us a sense of the character of natural selection and phenotype over time is much clearer. The suite of traits which we associate with white Europeans is quite possibly very recent, as late as post-Bronze Age. White supremacist scholars of the early 20th century who posited that ancient Egypt (in fact, all civilizations) were founded by blonde Nordic people turn out to likely be wrong because these civilizations probably predate the existence of blonde Nordic people, both in their genetic structure, and in their physical type (at least in any number).

The genetic architecture of pigmentation is something geneticists know a fair amount about, because genome-wide association has been very fruitful in this area. Unlike traits such as height there is a large amount of between population variation in pigmentation. And that variation is due in large part to a few genes of large effect. At SLC24A5 there is a SNP which accounts for around 1/3 of the melanin index difference between Europeans and Africans, using an admixed African American population to test the effect. As I have observed before SLC24A5 in its derived form is as close to fixed as you can get in Europeans. In the 1000 Genomes data set of thousands of individuals I found only a few samples that were heterozygotes carrying an ancestral copy. In the Middle East this allele is also near fixation, though not quite. As you can see from the figure I adapted from Allentoft et al., among South Asians the derived allele is also at high frequency. Everyone in my family is a homozygote for the “European” variant. There is some suggestive evidence that this haplotype derives from the Middle East. It was only at low frequency among European hunter-gatherers[3]. But by the Bronze Age it had gone to fixation in Europe, as well as on the Eurasian steppe.

Of more interest to me is the trajectory of SLC45A2. The derived allele is nearly fixed in modern European populations, though not nearly to the same extent as at SLC24A5. In Iberian and Sardinian populations the ancestral type is in the range of ~10%. During the Bronze Age in Europe the derived allele was only at ~50% frequencies, which is in the range of modern Middle Eastern populations. It was at even lower frequency on the steppe, from which the putative Indo-Europeans migrated.

Finally, in this panel for pigmentation they included a major SNP in the OCA2-HERC2 region. This locus is famous for being involved in blue-brown eye color variation, explaining 75% of the variance, and also exhibiting the third longest haplotype in the European genome. Naively projecting from these SNPs one could credibly argue that the ancient hunter-gatherers of Europe at the beginning of the Holocene were dark-skinned and blue-eyed! The Bronze Age European samples, which in this case are biased toward Northern Europeans, had a range of genetic variation equivalent to modern Southern Europeans. The people of the steppe did not seem to have blue eyes at all.

These results align perfectly with those in Mathieson et al. One thing to observe is that the Paleolithic samples, which have a much deeper time depth, are “ancestral” at all these positions. Even if the sample size is small (N = 4), they’re from diverse times and places. Does that mean that they were much darker than even the Holocene hunter-gatherers of Europe? As some have pointed out we can’t just straight-line extrapolate from the genetic architecture of today to the past. Remember that Neanderthals exhibited pigmentation polymorphism, but of a different sort. A deeper functional analysis may yield the possibility that Paleolithic Europeans had alleles which also resulted in lighter skin, but they were different ones from the ones segregating as polymorphisms today. I have already stated that I doubt much of modern European ancestry derives from before the Last Glacial Maximum. The reason that modern genetic variation in terms of predicting phenotype gives these sorts of results is that they may have arrived at the same trait value via a different set of polymorphisms. Genotype-phenotype maps derived from modern populations may be a poor predictor of the relationship 30,000 years ago. Why would one think that selection upon variation in pigmentation began at the cusp of the Holocene?

But, I do think we can predict with more confidence the nature of phenotypes for populations which are genetically much closer to modern ones. Bronze Age Europeans fit that bill. And, I know something personally about what the appearance of individuals during this period might have been based on genetic architecture: both my children exhibit a genotype profile on pigmentation loci similar to many Bronze Age Europeans. That is, they’re fixed for the derived variant of SLC24A5, and are heterozygotes at SLC45A2 and OCA2-HERC2 (my son, but not my daughter, is a heterozygote at KITLG; it does seem to make a difference in hair color). In terms of just their complexion they could pass as indigenous Southern Europeans, but definitely not Northern European.

*Culture leads genes by the leash*

Another major finding of Mathieson et al. and Allentoft et al. is that the derived allele found across West Eurasians that allows them to digest lactose sugar as adults has been sweeping up in frequency over the last 4,000 years. This allele spans a diverse array of populations, from Basques to South Asians. With pigmentation it seems that we need to consider jointly the impact of ancestry and selection (in South Asia derived SLC24A5 frequencies are definitely a function of both selection and descent). But with LCT it seems likely that selection is paramount. The predominant genetic character of Eurasia was established by the Bronze Age, but the frequency of the lactase persistence allele was still far lower. Tests of natural selection which focus on patterns of haplotype variation long ago detected a huge hit from LCT, so this is not surprising.
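To get a feel for what a sweep on this timescale involves, here is a minimal sketch of the textbook deterministic recursion for a favored allele in a diploid population. Nothing in it is taken from Mathieson et al. or Allentoft et al.: the starting frequency `p0`, the selection coefficient `s`, the dominance `h`, and the 25-year generation time are all illustrative assumptions, and drift is ignored entirely.

```python
def next_freq(p, s, h=1.0):
    """One generation of selection on allele A at frequency p.

    Genotype fitnesses: AA = 1 + s, Aa = 1 + h*s, aa = 1.
    h = 1 treats the favored allele as fully dominant.
    """
    q = 1.0 - p
    mean_fitness = p * p * (1 + s) + 2 * p * q * (1 + h * s) + q * q
    return p * (p * (1 + s) + q * (1 + h * s)) / mean_fitness


def trajectory(p0, s, generations, h=1.0):
    """Allele frequencies from generation 0 through `generations`."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_freq(freqs[-1], s, h))
    return freqs


# ASSUMED values for illustration only.
YEARS = 4000
GENERATION_TIME = 25            # years per generation (assumption)
generations = YEARS // GENERATION_TIME

freqs = trajectory(p0=0.05, s=0.05, generations=generations)
for gen in range(0, generations + 1, 40):
    print(f"{gen * GENERATION_TIME:>5} years: {freqs[gen]:.3f}")
```

Varying `s` in this toy model shows how strong selection has to be for an allele that starts out rare to become common within roughly 160 generations, which is the intuition behind treating selection rather than ancestry as the main driver at LCT.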

Intriguingly Allentoft et al. indicate that though the Bronze Age steppe populations had low frequencies of the derived allele, it seems that they did have a higher frequency than contemporaneous populations. This suggests that the origin of this haplotype, which spans the whole range of Indo-European speaking populations, and also extends into Finnic groups and the Basque, may still be attributed to the Yamnaya complex. In The 10,000 Year Explosion Greg Cochran proposed the hypothesis that the favored mutation for LCT enabled the spread of Indo-European pastoralists. These results are not strong support for that direct causal relationship; rather, it strikes me that the ascendancy of the pastoralists drove the selection pressures for the allele in question. Biology did not drive culture, culture drove biology. The milk-drinking Celts and Germans encountered by Julius Caesar 2,000 years ago may still have been in the middle stages of adaptation to the agro-pastoralist lifestyle slowly being perfected by their ancestors.

*As the white man is, so shall we all be*

A new look as well

It is a running joke of mine on Twitter that the genetics of white people is one of those fertile areas of research that seems to never end. Is it a surprise that the ancient DNA field has first elucidated the nature of this obscure foggy continent, before rich histories of the untold billions of others? It’s funny, and yet these stories, true tales, do I think tell us a great deal about how modern human populations came to be in the last 10,000 years. The lessons of Europe can be generalized. We don’t have the rich stock of ancient DNA from China, the Middle East, or India. At least not enough to do population genomics, which requires larger sample sizes than a few. But, climate permitting, we may. And when that happens I am confident that very similar stories will be told. Using extant genetics we can already infer that modern populations in South Asia are a novel configuration of genotypes and phenotypes. The same in Southeast Asia, the Americas, and probably Africa. Probably the same in East Asia. Perhaps in Oceania. Even without admixture humans evolve in situ and change, but with admixture the variation increases, and the parameter space of adaptation becomes richer and more flexible.

In Isaac Asimov’s later Foundation books he touched upon the existence of racial diversity in the future (from what I recall his earlier works from the pulp era were whites-only galaxies). At one point Hari Seldon encounters someone whose physical appearance seems to be East Asian, and they discuss the strangeness of people with East Asian ancestry being termed “Easterners” and those with European appearance being “Westerners.” With a loss of memory of the ancient distribution of these populations on the home planet only the shadow of a semantic recollection exists as a ghost in the galaxy-spanning Empire based out of Trantor. But of course tens of thousands of years in the future, even barring genetic and mechanical modification, it is unlikely that modern racial types will persist in any way we would recognize them.

But these results coming out of ancient DNA are telling us that what is likely to be true for the far future was also true for the recent past. White Europeans are a new type. But so are brown South Asians. Ethiopians have a recent ethnogenesis, as do most North African groups. The Bantu expansion has reshaped the face of Africa on the edge of the historical horizon. And so forth. In the big picture Young Earth Creationists are wrong, but in the specifics the idea that the sons of Noah populated the world ~5,000 years ago is not looking as crazy as it once did! Human genetic variation across Eurasia today may be mostly clinal, but in the recent past it was not. Rather, it was characterized by sharp discontinuities and isolated local populations with diverged ancestry from their neighbors.

*And culture made man in its image*

About ten years ago it was common in paleoanthropology to assume that human beings emerged almost fully formed ~50,000 years ago, and wiped out all the others in a genocidal wave of advance. Richard Klein advanced this model in The Dawn of Human Culture. Klein’s thesis was that some stochastic event, a mutation, resulted in the punctuation of a new species, our own. This singular genetic process allowed for the emergence of fully formed linguistic faculties in our lineage, which allowed for the development of the cultural flexibility which made the rest of the human lineages evolutionary dead ends. It was a simple and elegant story. It appealed to the principle of parsimony. The reality of “archaic” admixture was a difficulty for Klein’s model, evidenced by the fact that he voiced his skepticism of genetic claims of admixture in The New York Times after most others had moved on. For Klein a biological change explained the rise and success of our species, not a cultural one.

At the time I found the thesis compelling. We were after all a very special species. Modern Homo made it to Oceania and the New World. Something must have happened. Something big. What else could explain our rapid expansion and marginalization of other lineages? I’m a biologist, and so biology is an appealing causal mechanism.

*The luck of the English facing the ocean*

At about the same time the evidence for Neanderthal admixture came out, Luke Jostins posted results which showed that other human lineages were also undergoing encephalization, before their trajectory was cut short. That is, their brains were getting bigger before they went extinct. To me this suggested that the broader Homo lineage was undergoing a process of nearly inevitable change due to a series of evolutionary events very deep in our history, perhaps ancestral on the order of millions of years. Along with the evidence for admixture it made me reconsider my priors. Perhaps some Homo lineage was going to expand outward and do what we did, and perhaps it wasn’t inevitable that it was going to be us. Perhaps the Neanderthal Parallax scenario is not as fantastical as we might think?

Consider the case of Europe around 1600. In England and northern Germany (or what was to become northern Germany) you have two Protestant and genetically similar populations. But by 1850 it looked as if England was going to demographically overtake Germany in a broader genetic sense. James Belich’s Replenishing the Earth reviews the history of this period, when England spearheaded a demographic revolution far out of proportion to what one might have predicted in the year 1000. But by 2000 Germany, or Germans, had caught up somewhat. How? Millions of Germans migrated to the United States, starting in very large numbers in the mid-19th century, and were “picked up” by the demographic revolution which was the United States. The point is that contingencies of history, cultural and social, rather than biology, explain the trajectory of the gene pool over time. Much of the human past, and the sharp fluctuations in gene frequencies, might be driven by the long and forceful arm of culture.

In the treatment above I note that the EEF farmers who by and large replaced the indigenous hunter-gatherer groups in modern Southern Europe were themselves a compound. The hunter-gatherer ancestry within the EEF was far more successful than that of those they replaced, but the only reason that this was so was geographic coincidence. The WHG-like groups absorbed into the EEF were positioned further east, and so closer to the initial locus of expansion of Neolithic farmers. Similarly, the Neanderthal admixture into modern populations was almost certainly localized to particular groups. This is not to say that there are no biological differences between human populations which may explain a wide range of phenomena. Anyone looking at the skull of a Neanderthal and a modern human knows there are. There are also likely bio-behavioral differences between extant populations. Gene-culture coevolution is a real process, even if the details need to be worked out. But the interplay between biology and culture is complex, and in many cases cultural changes are driving the biological change, and then fixing differences which are advantageous to the “winners” (lactase persistence seems rather to be a perfect case of this). But just as in the individual case we must also remember that winning is often in part a function of being lucky. Natural selection, generally thought of as a deterministic process, is also to some extent stochastic[4].

*From genetic islands to a roiling sea of humans*

One of the most shocking things for many of the geneticists working in the area of ancient DNA, and encountering the variation of the past, is the high level of population structure. That is, you have groups co-resident for many generations who nevertheless exhibit genetic distances of intercontinental scale. But as I stated above David Reich himself found the same results for India. And, in Africa you have long symbiotic populations, such as the pygmy groups of the Congo, and their agricultural neighbors, who are genetically very different, and have been for tens of thousands of years. Allentoft et al. dryly observe that “These results are indicative of significant temporal shifts in the gene pools and also reveal that the ancient groups of Eurasia were genetically more structured than contemporary populations.”

About 10 years ago I read Nicholas Dirks’ Castes of Mind. Dirks is an eminent scholar who is now the chancellor of UC Berkeley. He emphasizes the power of European categories and systematization in creating the modern caste system. I don’t want to reduce his argument to a caricature. Obviously caste predates European colonialism. Dirks would admit this. But in Castes of Mind it is hard to shake the feeling that he believes that the British imposition of formalization made it what we truly understand it to be today. That caste has to be understood as a contemporary and early modern phenomenon, rather than an ancient one that was a structural feature of South Asian society.

The genetic evidence is clear now, and it paints a very different landscape. Many of the caste, even jati, boundaries we see today are thousands of years old. Endogamy long predates the British. It may predate the Aryans! Rather than the British, or Aryans, inventing caste, this form of ethnic segregation may date to the initial admixture event, to be reinvented and modified with each new population which arrives and imposes its hegemony on the subcontinent. In The New York Times David Reich states “You have groups which are as genetically distinct as Europeans and East Asians. And they’re living side by side for thousands of years.” He then goes on to say “There’s a breakdown of these cultural barriers, and they mix,” alluding to the rise in WHG ancestry in farmer samples over time. Of course it is interesting to remember Reich’s work on India has highlighted exactly how persistent caste has been, and how it maintains genetic variation in a localized region that is often nearly inter-continental in magnitude.

We can never know if 6,000 years ago the LBK people, the first farming culture of Northern Europe, imposed a caste-like system of segregation when encountering the indigenous hunter-gatherers. Nor can we say with total confidence whether their relationship exhibited a symbiosis analogous to that between the Bantu agriculturalists and pygmies of the Congo (though do note that in these scenarios the Bantu communities are higher status, and the individual pygmies often have a semi-slave status). But we need to look to what cultural evolutionary models and empirical results can tell us to make sense of these patterns. Ancient DNA can tell us very concretely the details of changes in allele frequencies. We can somewhat confidently reconstruct the faces and complexions of our ancestors. The questions population genomicists ask and answer in relation to animal models are relatively cleanly addressed by these data sets, assuming the sample sizes are large enough. But humans are the cultural animal par excellence, and that is the critical new variable which will require a new set of scholars to come together and create a truly multi-disciplinary understanding of the human past, present, and perhaps future. Powerful genomic techniques which produce results with implications for the study of human history need to leverage the full array of scholars who study human historical science.

1 – The three-fold copying is an important matter, because the different cultures had different preferences and goals. The Arab effort for example focused mostly on the philosophical production of the ancients. Without the Byzantines we would have far less of the humanistic production of Classical Greece, in particular the theatrical tradition.

2 – Much of what is known about the diplomatic history of the Bronze Age Near East has been preserved in cuneiform tablets. Though unwieldy, this form of writing on clay tablets is obviously more robust and less dependent upon copying than parchment and papyrus which came later.

3 – I would be curious to know if it is the same haplotype as is currently common in Eurasia.

4 – New mutations will usually go extinct, even if they are favored, in the initial generations. It is only when the frequency becomes high enough due to chance that selection will inevitably drive its frequency up, perhaps to fixation.
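To put a rough number on footnote 4, here is a minimal Python sketch. It assumes (my assumption, not something derived in the text above) a simple Poisson branching process in which each copy of a new mutation leaves on average 1 + s copies in the next generation. The function name and the iteration count are purely illustrative. Under that model the chance of escaping early random loss comes out close to the classic 2s approximation, so even a mutation with a 1% advantage is lost about 98% of the time.

```python
import math

def escape_probability(s, iterations=100_000):
    """Chance that one new mutant copy avoids early random loss.

    Assumes a Poisson branching process in which each copy leaves
    on average 1 + s copies per generation (an illustrative model).
    """
    q = 0.0  # extinction probability, found by iterating q = exp((1 + s)(q - 1))
    for _ in range(iterations):
        q = math.exp((1.0 + s) * (q - 1.0))
    return 1.0 - q

# Compare the computed value with the 2s rule of thumb.
for s in (0.01, 0.05):
    print(f"s = {s}: escape probability ~ {escape_probability(s):.4f}, 2s = {2 * s:.2f}")
```

The sketch only covers the early phase, when the mutation is rare; once it is common enough, chance matters far less and selection dominates.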

There is a season for everything. Last year my friend David Mittelman and I teamed up with GigaScience editor Laurie Goodman to write up a commentary in Genome Biology, Dragging scientific publishing into the 21st century. We’re obviously in the 21st century, but for science publishing we’re in the “long twentieth.” But wait, it’s worse! As we noted in the piece, to a great extent the internet is used as a PDF delivery device by many publishers, and the PDF is an electronic form of the classic paper journal article, whose basic outlines were established in the 17th and 18th centuries. In other words, in a qualitative sense we’re not that much beyond the Age of Newton and the heyday of the Royal Society. Scientific publishing today is analogous to “steampunk.” An anachronistic mix of elements somehow persisting deep into the 21st century. Its real purpose is to turn the norms of the past into cold hard cash for large corporations.

Obviously I’m not the only one with this thought. To a great extent PLOS and the open access revolution arose to overturn the procrustean status quo. More recently preprint culture, and the transparency of “personal communication” via Twitter, have changed the terms of discussion. The metabolic pace has increased, and the transparency which breathes life into scientific discourse is on the march. It seems likely that the old order will die a death of a thousand cuts, as one practice after another fades into obscurity.

One of the main weak points of the current framework is that it does not serve the needs of the end user. For many, the goal of getting published is to add a line to the c.v., at least outside of the top-tier journals. This explains the emergence of vanity and fraudulent publishing houses. Many researchers of genuine eminence exist, but for some workaday scientists publishing somewhere will do well enough to keep the salary and perks coming. But science should be more than just a job. Science feeds the spirit of our society, it allows us to see with our mind’s eye how the world truly is. Scientific discussion has to flourish in a manner which is not simply a means to careerist ends.

So back to a specific weakness of the current system: how to engage with the end user? David has assembled a small team to begin actualizing the “wish list” that we outlined in Dragging scientific publishing into the 21st century. That actualization takes the form of a new startup, N of Everyone, which exists to roll out technology to help folks better engage with and discuss science. Their first project is a reader which leverages the way people today actually read “papers.” That is, not simply a pixelization of paper, but a form of engagement with science which actually brings to the table the interactivity which is invited by the nature of electronic media. At many journal clubs people now read “papers” on tablets, notebook computers, and even phones. Why retrofit the print format of yore for the cutting-edge technology of today?

This probably sounds a bit vague and nebulous. To make this concrete, N of Everyone is looking to the crowd for support and to raise initial funds: funding to transform an idea into a reality. There’s already a prototype (I’ve seen it). Imagine leaving comments on specific sentences of papers. Basically, the sort of annotation you already do emailing files or sharing docs when it comes to collaborating to get a publication polished.

Share or comment on any sentence in the paper without having to leave the paper.

Get in-line context for references as well as a map of where those references are discussed throughout the paper.

Get in-line context and discussion of figures in a paper as well as the entire discussion for a figure, even if it is distributed throughout the paper.

Get more information and lots of great screenshots at their Kickstarter page, and consider contributing to the project (obviously). There are a lot of ways one can imagine the communication of science going, and it will change. I’m confident of that. David and his partners are attempting to grab the bull by the horns and drive in a fruitful direction. I know they’re passionate about science, and for me that’s key. You can make money in a variety of ways. The reason they’re tackling this project is because this is an issue that’s close to their hearts.

Note: comments are closed to this post. Since David & co. would appreciate feedback I’m sure, I’ll just point you to their Twitter, http://twitter.com/nofevery1.

In the culture of science you occasionally run into the sort of person who believes as an apodictic fact that if one is religious one cannot, by the very fact of that belief, be a good scientist. You encounter this sort of person at all levels of science, and they exhibit a range of variation in terms of the volume of their belief about beliefs of others. I don’t want to exaggerate how much it permeates the culture of science, or at least what I know of it. But, it is a tacit and real thread that runs through the world-views of some individuals. It’s a definite cultural subtext, and one which I don’t encounter often because I’m a rather vanilla atheist. A friend who is now a tenure-track faculty member in evolutionary biology, and who happens to be a Christian, once told me that his religion came up nearly every day during graduate school! (some of it was hostile, but mostly it was curiosity and incomprehension)

This is on my mind because a very prominent person on genomics Twitter stated yesterday that Francis Collins by the very fact of his evangelical Christianity should not hold the scientific position of authority that he holds (the individual in question was wondering if they could sign a petition to remove him!). The logic was very straightforward: science by its nature conflicts with religion, and those who engage in the sort of cognitive processes which result in religion will be suboptimal in terms of scientific reasoning. As I indicated above the people who promote this viewpoint treat it as a deterministic scientific law. And, importantly there is little reference to cognitive science or survey data to support their propositions. Ten seconds on Google will yield the figure you see above. A substantial proportion of American scientists aver a religious affiliation.

Mind you, there are patterns. The data when examined in a more granular fashion suggests that academic scientists are more secular than those in industry, as are the more eminent ones. But it doesn’t take much time to think of great scientists who avowed some sort of religious affiliation. In evolutionary biology R. A. Fisher and Theodosius Dobzhansky affiliated as Christians. The mid-20th century evolutionary biologist David Lack was an Anglican convert. In Reconciling Science and Religion the historian of science Peter J. Bowler outlines a movement in early 20th century Britain to accommodate and assimilate the findings of evolutionary biology to that of mainstream Christianity, so it is entirely unsurprising that Anglicans such as Fisher and Lack were active researchers within evolutionary science.

Outside of evolutionary biology there are two examples which stand out in my mind. Larry Wall, the originator of the Perl language which has had a long history in bioinformatics is an evangelical Protestant Christian. And Donald Knuth, the author of the magisterial series The Art of Computer Programming is a Lutheran.

My point in reviewing this data, which should be widely known, is to bring some empiricism to this discussion. What do the data say? Not one’s prejudices and intuitions. One response on Twitter was that empiricism precludes faith. That’s the theory about empiricism. The reality is that there are many great empirical scientists who have a religious faith. Any scientist worth their salt who wishes to air hypotheses about the incompatibility of religion and science on an individual level needs to engage with these facts.

To be fair, I don’t think it’s a coincidence that there’s a correlation in the aggregate between secularism and science. But this issue is complex, emerging at the intersection of cognitive science, sociology, and history. These subtleties can’t be waved away airily with a reference to facts that everyone knows which happen to reflect one’s own personal prejudices. That reminds me of things besides science.

Finally, this truth that in the aggregate scientists are a diverse lot even if there tend to be particular patterns of social concentration is a general one. E.g., most scientists are more liberal than not. But a substantial minority are not, with a fraction of those being rather closeted about this. The average scientist, in particular in the academy, is a secular liberal. But the minority is not trivial. We’re in your lab meetings, at your conferences, collecting data for you, and on your committees, reviewing your grant applications.* Because of the nature of the academy outside of religious colleges there is often silence from this minority lest they be pigeon-holed as out of step with the social culture of science. That’s human nature. And scientists can’t escape that, whether they are in the majority, or the minority. For all the talk of logic and empiricism, scientists are all too human in their basic wiring.

* Much of what I say applies to natural science. From the survey data in the academy, those who are not liberal-to-Leftist are almost entirely absent in sociology, and to a lesser extent in areas of psychology.

Science is a pretty big deal. Science is the foundation for our civilization. Science is the best method we’ve found to map reality, and take us into the unknown on more than whim and prayer. I don’t agree with those who believe that science drained romance from our understanding of the world around us. I don’t agree with those who assert science is just another superstition. I don’t agree with those who assert that science is a tool of oppression by its nature.

With all that stipulated, science has problems. And that’s because it is a human enterprise. Humans are both the root of science’s problems, and, the source of its solutions. Philosophers can think deeply about how science is done, from Karl Popper to Thomas Kuhn, but high level abstraction has little impact on the day to day practice of science. Science today is social. Individuals work in the context of research groups, and then publish and disseminate their findings across the broader community of peers. The social aspect is why genuine scientific productivity on an international scale is so concentrated in a few nations, above and beyond what you might expect from economic development. The per capita gross domestic product difference between Germany and Italy is significant, but it is dwarfed by the yawning scientific productivity chasm between Germany and Italy in any area of science I am personally familiar with. Science exhibits returns to scale. Who you are around makes you smarter in science.

This is why Twitter has become such a big deal. It’s a way to enable disintermediation: cutting the middlemen and gatekeepers out of the equation, and ratcheting up the metabolism of discourse so that it is nearly frictionless. About ten years ago some friends of mine disagreed with a scientific paper in PNAS. They were going to write a response, but didn’t think anyone would pay attention, even if PNAS accepted and published it. So they put up a blog post. Today they would probably start responding on Twitter.

Michael Snyder, a geneticist at Stanford University in California and co-author of the original paper, stands by his team’s study and its conclusions and says that Gilad broke the “social norms” of science by initially posting the critique on Twitter. Gilad says that he took to social media to highlight his work, which might otherwise have been overlooked.

Obviously there isn’t a book which outlines the social norms of science. These norms have developed and coalesced implicitly, tacitly, over time. And, they change. It’s no surprise that a lot of people on Twitter are taking Gilad’s side in this. Also, many are giving credit to Snyder’s group for releasing the raw data to Gilad for the reanalysis. If Gilad and company are correct then this is another victory for open(ish) data. Derek Lowe has some reasonable thoughts on the details of how this has been playing out in public. I don’t have much to add.*

But, I do wonder how ephemeral the role of Twitter is going to be in the scientific community. After all, Twitter is not a public utility. It’s a public firm which is traded on the stock market and exists to make a profit and return value to its shareholders. There was a time when AOL, or Myspace, were ubiquitous corners of the internet. Though Twitter allows for a level of disintermediation, to some extent it is a stealth intermediary in and of itself.

The social norms of science are evolving, and the rate of change is increasing. I doubt that this generation shall pass into emeritus before the entire edifice of scholarship as we know it, from publishing status quo to the tenure system, is overturned. Snyder put his finger on the fact that Gilad is likely violating the social norms of science, but those were past norms. Scientists are making it up as they go along right now. Genomics in particular, which is a heavily computational field, with many researchers amenable to data sharing, distribution, and reanalysis, is to some extent going to be a guinea pig for other domains. We’re in a time of change, so we likely don’t have the clarity we will in a decade or so, when the current maelstrom will have passed and a new equilibrium will have been attained.

* I didn’t pay much attention to the original paper, so I’m having a hard time understanding how the authors didn’t bother to check for batch effects as some are claiming. Finally, I’ve met Mike Snyder, and he’s a very nice person from what I can tell for how big of a deal he is. I hope this resolves without too many hurt feelings and reputations intact on all sides.

A few months ago the anthropologist Pat Shipman published a book, The Invaders: How Humans and Their Dogs Drove Neanderthals to Extinction. I’ve read Shipman before, and because of my interest in domestication it’s been on my radar, but I haven’t gotten around to purchasing it. The major reason is that as I understand it the title is somewhat misleading, in that there’s a lot less in the text on human-dog cooperation than one might think. Which is reasonable, it’s a speculative hypothesis at best.

Perhaps the biggest problem is that there’s no strong evidence that dogs were domesticated or distinct as early as ~35 thousand years ago, when modern humans replaced Neandertals in Europe. This comes up in a very highly rated comment on Amazon in fact. The best genetic work, Genome Sequencing Highlights the Dynamic Early History of Dogs, implies a date of ~15,000 years before the present, at the earliest.

The origin of domestic dogs is poorly understood…with suggested evidence of dog-like features in fossils that predate the Last Glacial Maximum…conflicting with genetic estimates of a more recent divergence between dogs and worldwide wolf populations…Here, we present a draft genome sequence from a 35,000-year-old wolf from the Taimyr Peninsula in northern Siberia. We find that this individual belonged to a population that diverged from the common ancestor of present-day wolves and dogs very close in time to the appearance of the domestic dog lineage. We use the directly dated ancient wolf genome to recalibrate the molecular timescale of wolves and dogs and find that the mutation rate is substantially slower than assumed by most previous studies, suggesting that the ancestors of dogs were separated from present-day wolves before the Last Glacial Maximum. We also find evidence of introgression from the archaic Taimyr wolf lineage into present-day dog breeds from northeast Siberia and Greenland, contributing between 1.4% and 27.3% of their ancestry. This demonstrates that the ancestry of present-day dogs is derived from multiple regional wolf populations.

As you can see from the figure to the left the Taimyr sample diverges at about the same time as the common ancestor of wolves and modern dogs. In other words, you have a polytomy. Not only that, but there has been introgression from the Taimyr lineage into particular northern dog populations.

Let’s let this sink in: if the results above hold, then the arrival of modern humans to northern Eurasia may have been coincident with the emergence of a distinct dog lineage. The term “man’s best friend” takes on a whole new meaning. The relationship between man and dog may be nearly as ancient as modern humans as we understand them, that is, populations capable of copious and protean symbolic cultural production which explode out in the archaeological record over the past ~40,000 years. In addition, I also believe we now need to totally reconceptualize how we view the relationship of wolves and dogs. Rather than an ancestral and derived set of populations, whose “species” status is only semantic convenience, they are actually sister clades. The results in this paper confirm other findings that the wolves of North America and Eurasia seem to share a post Last Glacial Maximum origin. Wolves as we understand them today may have emerged simultaneously with dogs, both descending from the melange of canid lineages which flourished during the Pleistocene. There’s a reason that feral dogs, such as dingoes, do not “revert” to wolves. The ancestor may not have even been a wolf!

The authors also note that the features of the dog which are hallmarks of domestication may themselves be derived within the dog lineage. That is, the separation of the ancestors of dogs and wolves predates the Last Glacial Maximum, ~20,000 years ago. But the evolution of dogs so that they exhibit particular derived traits may have occurred far later in time. In fact, I would hold that perhaps the true story is one of co-evolution between dogs and humans.

The ultimate moral of this true story to me is that many Pleistocene mega-fauna with wide ranges in Eurasia were subject to similar evolutionary dynamics. Extinction of distinct local lineages was the rule, not the exception. Recolonization from populations which dodged extinction was also inevitable. The phylogenetic tree was pruned repeatedly, but tempered somewhat in the ferocity of clipping by admixture and introgression, as branches fused together.

A few years ago when I reviewed The Invisible Gorilla: And Other Ways Our Intuitions Deceive Us, I joked that it was the anti-Malcolm Gladwell manifesto. The joke was only half serious. Chris Chabris and Daniel Simons presented in their book serious arguments which weren’t sexy and offered no easy shortcuts. As such it is no surprise that Gladwell is still rolling in the money, while Chabris and Simons are respected academics, though not public intellectuals on the same magnitude (the irony is that arguably they are intellectuals in a more substantive way than their famous bête noire). A more egregious individual when it comes to science popularizing than Gladwell was Jonah Lehrer (not surprising that Jonah was somewhat of a protege of Gladwell). Aside from the admitted fabrications, Chabris has been long pointing out that Lehrer seems to purposely misrepresent or misunderstand the process of science, taking isolated studies and stitching them together to support novel and counter-intuitive theses which might sell copies of books (it was ironic that he wrote a long piece for The New Yorker on problems with replication).

The fact that you shouldn’t hinge your perception about the validity of a hypothesis on one study isn’t an issue for most scientists. They know how science works. It’s a noisy process, with lots of fits and starts, and consensus emerges slowly, and is periodically overturned or extended. There’s a reason that John Ioannidis’ Why Most Published Research Findings Are False is highly cited. There are thousands and thousands of studies published every year. If you want, you can search through the stack and find “peer reviewed research” to support nearly any proposition. The issue isn’t whether there are scholars willing to support your position, but what the scholarly consensus is, if there is one.

All this came to mind when I saw this blog post, A Trick For Higher SAT scores? Unfortunately no. The short of it is that a few years ago the author read Thinking, Fast and Slow, by Daniel Kahneman, a Nobel Prize winner. Kahneman reported with excitement results from a study which primed individuals to focus more by using less clear fonts, and thereby increased their cognitive performance substantially. The reason why this study’s results are important is obvious to anyone: increasing median cognitive performance is a social good (this is why we put iodine in salt to combat cretinism).

Though Kahneman is a great scholar, most people are not going to know about this study from him. Rather, Malcolm Gladwell used the study in David and Goliath: Underdogs, Misfits, and the Art of Battling Giants to illustrate one of his points. Unfortunately Gladwell is a big deal for many people. Though I quite liked The Tipping Point when it came out, over the years I’ve come to see that Gladwell is less a communicator of scholarship than a storyteller who sells intellectually-themed yarns. Gladwell hasn’t seen a sample size that dissuades him from reporting enthusiastically on a result with a marginally significant p-value, so long as it supports one of his story arcs.

Three years on the author of the blog post, and one of the original authors of the paper, have a follow up publication where they report that there is no effect at all from the priming with less clear fonts. The sample size of the original study was 40. The follow up, 7,000 total (they pooled multiple studies). The author of the blog post ends on a down note:

I expect that the false story as presented by Professor Kahneman and Malcolm Gladwell will persist for decades. Millions of people have read these false accounts. The message is simple, powerful, and important. Thus, even though the message is wrong, I expect it will have considerable momentum (or meme-mentum to paraphrase Richard Dawkins).

Probably descriptively correct. But you can do something about it. Be the asshole at the party to point out that the “latest research” your friend has read in the current issue of The New Yorker is most likely to be crap, especially if it is both counter-intuitive and supports your group’s normative priors. (yes, I am usually that asshole in real life too)
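To make the gap between a sample of 40 and a pooled 7,000 concrete, here is a minimal Python sketch of the smallest effect each sample could reliably pick up. The assumptions are mine and not from the papers: a simple two-group comparison with equal group sizes, a two-sided 5% significance threshold, 80% power, and a normal approximation. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def minimum_detectable_effect(total_n, alpha=0.05, power=0.80):
    """Smallest standardized difference (in standard deviation units)
    that a two-group comparison with total_n subjects would detect
    with the given power. Assumes equal group sizes and a normal
    approximation; an illustration, not the studies' actual design."""
    per_group = total_n / 2
    z = NormalDist()  # standard normal
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * sqrt(2 / per_group)

for n in (40, 7000):
    print(f"n = {n}: minimum detectable effect ~ {minimum_detectable_effect(n):.2f} SD")
```

Under these assumptions the small study can only reliably detect very large effects, and small studies that do cross the significance threshold tend to overstate the effect. The pooled sample can detect effects more than ten times smaller, so finding nothing there is informative.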

Note: the reason I say irrelevant, rather than false or wrong, is that a lot of research is a trivial improvement on an already established consensus, even when the results are robust.

A lot of this collapse of the old orthodoxy can probably be traced back to Gary Taubes, at least in the public consciousness (see his The New York Times Magazine piece from 2002). Taubes and company now put sugar into the same category that fat and cholesterol were once in, though for somewhat different reasons (ergo, the focus on types of calories ingested). But health is not the only concern. Hundreds of millions of people have made their food less savory over a generation because of these false recommendations.

In The Washington Post the article concludes:

“These reversals in the field do make us wonder and scratch our heads,” said David Allison, a public health professor at the University of Alabama at Birmingham. “But in science, change is normal and expected.”

When our view of the cosmos shifted from Ptolemy to Copernicus to Newton and Einstein, Allison said, “the reaction was not to say, ‘Oh my gosh, something is wrong with physics!’ We say, ‘Oh my gosh, isn’t this cool?’ ”

Allison said the problem in nutrition stems from the arrogance that sometimes accompanies dietary advice. A little humility could go a long way.

“Where nutrition has some trouble,” he said, “is all the confidence and vitriol and moralism that goes along with our recommendations.”

A lot about nutrition is tied up with morality, and our ancient psychological fixations on the “purity” of food. That’s why no matter what people say about veganism, or paleo, or high/low fat/sugar/carb, in terms of its functional health consequences, it’s really about the values that you are projecting in terms of the psychology. And that’s why we tend to get into dietary moral panics so often. Because nutrition has a lot of variables producing an output (weight or overall health) it’s difficult to assess which ones are effecting change. If gunnery specialists were using non-Newtonian physics to land hits on the enemy we’d know pretty quickly that non-Newtonian physics (or at least pre-Newtonian mechanical intuition) just didn’t work. Though heart disease rates have gone down, Americans have become more obese. The signals are mixed. Meanwhile many of us are turning our lives upside down, eliminating or adding food elements based on the latest research, which is often overturned or found to be statistically not robust. No wonder many people have started to tune out any health advice from the “authorities.”

Of course there is science, and there is science. Germ theory and epidemiology in relation to viral infections and vaccinations are a robust area of science. Planetary mechanics as well. In contrast many areas of nutritional, medical, and social science remain highly uncertain and low in confidence as to the nature of the results. You wouldn’t get that from the “experts” though. Science is science when they hold forth from on high. Except it’s not.

Addendum: I forgot to mention this, but one clear issue in regards to nutrition is sensitivity to particular individuals and populations. One of the absurdities of modern nutrition is how lowest-common-denominator and one-size-fits-all it seems to be. Yes, I’m generalizing here, but I know people with heritable high cholesterol who were told by nutritionists to exercise and avoid eating a wide array of foods even when this condition was already known to run in the family, and the individuals in question were relatively fit. It was obvious that lifestyle and diet were marginal variables here, but the nutritionists simply could not imagine going off script.

If you have a family history of hypertension and stroke, by all means avoid salt. But it seems that for most of the population the downside risk to flavor is small to non-existent. Apparently the same might apply to cholesterol. And sorry, I think the same also applies to sugar! There are people who are more resistant to metabolic disease. If you enjoin the whole population to avoid every food that 10-25% of people might benefit from avoiding, pretty soon everyone will eat literally nothing, because we all have different Achilles’ heels.

I’m having a discussion on Twitter about the value of journals, etc., in this age. You’ll hear more from me on that topic in the near future. But right now I want to tell a quick story about how novel distribution and communication channels speed up everything. A few years back I had some discussions with Peter Ralph while he and Graham Coop were putting together their manuscript for The Geography of Recent Genetic Ancestry across Europe. Peter told me that once the manuscript was put on a preprint server he’d email me so I could check it out. What happened is that 1) the preprint went up, 2) within one hour people were talking about it on Twitter, and 3) within two hours I had put up a blog post about it. Peter emailed me to laugh about the fact that he was about to tell me that the preprint was up when he saw that I had already written a blog post about it.

Obviously not all aspects of the academic production process can be accelerated in this manner. But there are now steps in the reaction where there is very little friction, and the latency can be pretty much abolished. The internet introduced us to “Netscape time”, but it doesn’t seem that many aspects of science have changed much since the universal penetration of the internet….

Since I’ve moved to Unz Review I’ve attracted a set of readers who are used to the level of discourse on topics evolutionary which is the norm on “HBD blogs.” Let me be clear that I don’t tolerate uninformed speculation because I don’t care to listen to it as I don’t gain any value from it. This is in response to a long and bizarre hectoring rant about my lack of credentials, the nature of heredity, etc. It reminded me of the moron who accused me of not understanding Lewontin’s Fallacy at Inductivist a few years ago (a further idiot also decided to “explain” epistasis to me). A buzz word or two does not sagacity make. Naturally this person was banned. But in any case this is as good a place as any to suggest that someone who wants to engage with me in a manner where I will take them seriously should be at least somewhat familiar with population genetics, and hopefully genomics. This naturally curtails communication with most of the human race, and that’s the point. I will at some point die in the future unless the Singularity arrives, so I do not wish to waste my time talking to most of the human race about things they know nothing of.

With the pleasantries out of the way I am here to offer a way to meet the threshold of knowledge which will make you fluent in leaving comments here.

All of these are pretty easy, and three of them are free. You don’t need to derive all the formalisms. God knows I haven’t. But you need a basic algebraic framework to think about the process quantitatively. Additionally, it is probably useful to get at least some genomics background since that’s the empirical data that is really relevant for much of the commentary on this weblog.

I hope I’m clear that any rude, annoying, and hectoring comments are going to result in immediate banning.

By 6 October 2014, many laboratories in the United States must begin honoring new individual data access rights created by recent changes to federal privacy and laboratory regulations. These access rights are more expansive than has been widely understood and pose complex challenges for genomic testing laboratories. This article analyzes regulatory texts and guidances to explore which laboratories are affected. It offers the first published analysis of which parts of the vast trove of data generated during next-generation sequencing will be accessible to patients and research subjects. Persons tested at affected laboratories seemingly will have access, upon request, to uninterpreted gene variant information contained in their stored variant call format, binary alignment/map, and FASTQ files. A defect in the regulations will subject some non-CLIA-regulated research laboratories to these new access requirements unless the Department of Health and Human Services takes swift action to avert this apparently unintended consequence. More broadly, all affected laboratories face a long list of daunting operational, business, compliance, and bioethical issues as they adapt to this change and to the Food and Drug Administration’s recently announced plan to publish draft guidance outlining a new oversight framework for lab-developed tests.