About two months ago I posted an entry where I sketched out an extremely simple model for skin color assuming there were 6 loci and two alleles (on and off). There was a reference in the comments to “5 loci” for skin color as a quantitative trait. From what I can gather that assumption derives from a paper published in 1981 by Russ Lande, which is online. In reality that paper simply draws upon older work from 1964, and its primary focus is on estimating the number of loci in crosses between heterogenous populations (using inbred lines was the way pioneered by Sewall Wright). But, it turns out that Cavalli-Sforza and Bodmer discuss that older work in Genetics of Human Populations, which I have a copy of. Today genomics is exploring the details of the loci which control for skin color, but we have a long way to go, so I’m going to reproduce some of the data and conclusions from Bodmer & Cavalli-Sforza’s work so that it will be online….

I’m laughing at the “we have a long way to go” part. Long way in this case probably meant a few years, as I don’t think there’s been that much substantive change since about 2008 in human pigmentation genetics. All the low-hanging fruit has been picked. It looks like across any two distinct inter-continental populations you’ll be able to apportion most of the variance to fewer than half a dozen loci. Geneticists were able to infer this decades ago based on pedigree analysis, which was only possible because these were large effect quantitative trait loci in the first place (i.e., most of the variation was due to only a few genes).* If the trait had been extremely polygenic they’d only have been able to say with any plausibility or precision that the number of genes responsible was very large.

But it’s one thing to ascertain the genetic architecture of the trait, and another to make reasonable characterizations about its natural history. To make a long story short, haplotype-based tests, which look for correlations of markers across regions of the genome, tend to suggest that many of the pigmentation loci have been subjected to recent bouts of natural selection. More interestingly, the candidate genes which seem likely to account for light skin in East and West Eurasians seem to be somewhat different, implying that the change in allele frequencies postdates the separation of these two populations. A few years ago there were waves made when there was a report that the gene which seems to be responsible for a great deal of the de-pigmentation in West Eurasians, SLC24A5, only began to sweep up to higher frequencies within the last ~6,000 years. But I heard through the grapevine that this may be too recent an estimate, and you might be looking at a sweep which began more than ~10,000 years ago.**

The results in the paper above throw some cold water on positive results for natural selection at the pigmentation loci. Why does this matter? Because a priori there are obvious reasons why there might be natural selection at these genes. In contrast, many results have to be accompanied by after the fact suppositions as to the functional rationale for adaptation. The question becomes: if you can’t trust the results to be consistent on a trait where the adaptive rationale and genetic architecture are clear, when can you trust these tests? I think the qualifying kicker in the paper above comes in the discussion:

The fifth, and perhaps most likely, reason for discrepancies between LRH [long range haplotype] and sequence-based tests we observed here may be the different underlying assumptions of the evolutionary models used (that is, instantaneous selective sweep versus incomplete selective sweeps) in the definition of each statistic, and the evolutionary timescale over which each type of test can recover departures from neutrality…In that case, our results might indicate an extremely recent selection in the pigmentation genes, which would be recovered by haplotype-based but not sequence-based tests.

In other words, the authors themselves believe it is entirely possible that you don’t see a concordance between the results in these sets of tests because they exhibit differing sensitivities to different adaptive dynamics. This is one reason haplotype-based tests became popular in the first place, as they could pick up on processes which something like Tajima’s D might miss. So at this point I think we can still say with some certainty that natural selection seems highly likely at these genes, even if they don’t jump out on all the tests.

COMMENTS NOTE: Any comment which misrepresents the material in this post will result in banning without warning. So you should probably stick to direct quotes in lieu of reformulations of what you perceive to be my intent in your own words. For example, if you start a sentence with “so what you’re trying to say….”, you’re probably going to get banned. I said what I tried or wanted to say in the post. Period.

* There are few enough SNPs that I can, and have, constructed a distribution of phenotypic outcomes of my soon-to-arrive child based on the variation present in the parents, who have both been genotyped.

** I am homozygous for the “European” allele at this locus, as are my parents. I suspect that this variant arrived in the Indian subcontinent via the “Ancestral North Indians.”

That paper is totally underpowered to find evidence of recent selection on these loci. They sequenced on the order of 6 kb, in around 25 people per sample. For computational efficiency, the statistical tests in the paper considered only 10 sequences. (!)

If we consider the last 20,000 years as the time when these loci were most plausibly under positive selection, that gives a max of 1000 generations of time. Assuming a mutation rate of 1.4e-8 per site per generation, we expect 6e3*1e3*2e-8 = 0.08 mutations in a given lineage across 6 kb of sequence in 20,000 years. If selection unfolded within 10,000 years, it’s roughly 0.04 mutations per lineage.

A test of neutrality may be violated in the case where a large fraction of the sampled sequences go back into a star phylogeny from one ancestor in the last 20,000 years. To detect this pattern reliably, we need either a pretty high mutation density or a really big sample.

http://johnhawks.net/weblog John Hawks

Sorry forgot to edit the equation, 6e3*1e3*1.4e-8=0.08.
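For anyone who wants to check the arithmetic, here is a minimal Python sketch of the corrected calculation. The 20-year generation time is an assumption implied by "20,000 years = 1000 generations". The Poisson step and the idealized star phylogeny of 10 independent lineages are simplifications added here for illustration, not something taken from the paper.

```python
import math

def expected_mutations(seq_len_bp, generations, mu_per_site_per_gen):
    """Expected number of new mutations on one lineage over the given span."""
    return seq_len_bp * generations * mu_per_site_per_gen

SEQ_LEN_BP = 6_000        # ~6 kb sequenced, per the comment above
MU = 1.4e-8               # mutations per site per generation, per the comment above
GENERATION_YEARS = 20     # assumption: 20,000 years / 1000 generations
N_SEQUENCES = 10          # sequences considered by the tests, per the comment above

for years in (20_000, 10_000):
    generations = years / GENERATION_YEARS
    per_lineage = expected_mutations(SEQ_LEN_BP, generations, MU)
    # Assumption: mutations arrive as a Poisson process, so the chance that a
    # single lineage carries no new mutation at all is exp(-per_lineage).
    p_no_mutation = math.exp(-per_lineage)
    # Assumption: a perfect star phylogeny, so the lineages are independent
    # and their expected mutation counts simply add up.
    total_in_sample = N_SEQUENCES * per_lineage
    print(f"{years} years: {per_lineage:.3f} per lineage, "
          f"P(no mutation) = {p_no_mutation:.2f}, "
          f"{total_in_sample:.2f} expected across {N_SEQUENCES} sequences")
```

Under these assumptions each lineage is far more likely than not to carry no new mutation, and even the whole 10-sequence sample is expected to hold fewer than one, which is the sense in which the mutation density is too low for the star pattern to show up.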

http://entitledtoanopinion.wordpress.com TGGP

“my soon-to-arrive child”
I don’t know if you’ve already revealed that on the blog before, but in any case congrats.

http://www.isteve.blogspot Steve Sailer

Yes, congratulations.

Here’s an article idea: why we will or we won’t have our newborn child’s genome sequenced. I’d, personally, be especially likely to publish such an article if my answer was no because I don’t want to go the growing up in public path that Amy Chua and Bryan Caplan have taken their children down.

miko

Finally got my sample collection vial from the PGP1000! If I actually get sequenced, I’ll have an online project of going through it.

http://blogs.discovermagazine.com/gnxp Razib Khan

why we will or we won’t have our newborn child’s genome sequenced. I’d, personally, be especially likely to publish such an article if my answer was no because I don’t want to go the growing up in public path that Amy Chua and Bryan Caplan have taken their children down.

Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!

About Razib Khan

I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. In relation to nationality I'm an American Northwesterner, in politics I'm a reactionary, and as for religion I have none (I'm an atheist). If you want to know more, see the links at http://www.razib.com