November 07, 2010

Multidimensional scaling and ADMIXTURE across Northern Eurasia correspond to geography and language

Here is a multi-dimensional scaling plot of a number of North Eurasian populations. In comparison to my previous post, I have excluded Americans and Greenlanders, and added several other populations from Central Asia and West Eurasia.

Population labels have been printed in the co-ordinates of the population averages; these largely correspond with identifiable blobs of colored points, but note that some populations have several outliers, so labels appear in white space. Most notable in that respect are the Koryak, Chukchi, and the Nganasan, all of whom have some apparently European-admixed individuals.
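As a rough illustration of why a label can land in white space, here is a minimal sketch of placing each label at the average of its population's MDS coordinates. The function name, the population names and the coordinates are all hypothetical; this is not the code used to produce the plot.

```python
from collections import defaultdict

def population_centroids(points):
    """Average the (x, y) coordinates of individuals within each population.

    points: iterable of (population, x, y) tuples.
    Returns a dict mapping population -> (mean_x, mean_y).
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # running x sum, y sum, count
    for pop, x, y in points:
        acc = sums[pop]
        acc[0] += x
        acc[1] += y
        acc[2] += 1
    return {pop: (sx / n, sy / n) for pop, (sx, sy, n) in sums.items()}

# Hypothetical coordinates: three clustered individuals plus one outlier.
toy = [
    ("PopA", 0.0, 1.0),
    ("PopA", 0.0, 1.0),
    ("PopA", 0.0, 1.0),
    ("PopA", -4.0, 1.0),  # outlier shifted toward another cluster
    ("PopB", 2.0, -1.0),
    ("PopB", 2.0, -1.0),
]
centroids = population_centroids(toy)
```

In this toy case the single outlier drags the PopA label away from where most PopA individuals sit, which is the effect described above for populations with admixed individuals.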

"Mongol" corresponds to Rasmussen et al. (2010) Mongol sample, while "Mongola" to the HGDP-CEPH one. The population codes on the left may not be clearly visible as they overlap with each other and are CEU, LT, HU (relatively unadmixed Caucasoids), FI/RU (Uralian-admixed northern Caucasoids), IR/TR (Altaic-admixed southern Caucasoids). The West Eurasian part of the plot can be seen blown up on the right.

The correspondence with geography and language is striking. Siberian isolates from the extreme north and east, the Koryak and Chukchi, are on top; HapMap Chinese are at the bottom. Between them are Uralians (Selkup, Yukagir, Nganasan) and Altaics (Mongol-Tungus-Turkic people).

Below is ADMIXTURE analysis for the same set of populations, for K=7:

Finns and Russians seem to have an excess of the "Nganasan" component over the Altaic, while Turks have the opposite. Below is a table of Fst distances between components:

The close relationship between the two Caucasoid components is apparent (Fst=0.033), but note the fairly large Fst divergences between the morphologically Mongoloid groups. I attribute this mostly to the very low population sizes of these groups, which have probably made them especially subject to drift. For the less demographically constrained Altaic and East Asian components, Fst=0.044.
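For readers who want a feel for how an Fst value between two components arises from allele frequencies, here is a minimal sketch. It uses a Hudson-style ratio-of-averages form without sample-size correction, which is an assumption on my part and not necessarily the estimator ADMIXTURE itself applies; the function name and the example frequencies are hypothetical.

```python
def fst_from_frequencies(p1, p2):
    """Hudson-style Fst from two lists of per-SNP allele frequencies.

    Computes sum((p1 - p2)^2) / sum(p1*(1 - p2) + p2*(1 - p1)) over SNPs
    (ratio of averages). No sample-size correction is applied, so the
    inputs are treated as true population frequencies.
    """
    if len(p1) != len(p2):
        raise ValueError("frequency lists must have the same length")
    num = 0.0
    den = 0.0
    for a, b in zip(p1, p2):
        num += (a - b) ** 2
        den += a * (1 - b) + b * (1 - a)
    # If every SNP is fixed for the same allele in both groups, den is 0.
    return num / den if den > 0 else 0.0

# Hypothetical frequencies for two components at four SNPs.
comp_a = [0.10, 0.50, 0.80, 0.30]
comp_b = [0.20, 0.45, 0.70, 0.35]
fst_ab = fst_from_frequencies(comp_a, comp_b)
```

Identical frequency vectors give 0, and fixed differences at every SNP give 1. Small, isolated groups tend to drift toward more extreme frequencies, which inflates the numerator relative to the denominator.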

23 comments:

"But as for certain truth, no man has know it,Nor will he know it; neither of the godsNor yet of all things of which I speak,And even if by chance he were to utterThe perfect truth, he would himself not know it;For all is but a woven web of guesses."

It would have been very interesting to see where the Japanese and the Koreans fit within this (if there are any samples available). I also hope that one day reliable Ainu and Nivkh samples can be obtained for this kind of thing.

I have one Korean, and Japanese are included in HapMap, so I might make a go at it when I find some time. I don't use Japanese primarily because I think it unlikely that they influenced the gene pool of most of Eurasia, which is my region of interest.

Unfortunately, you've omitted the Ket from this run. In the last one, three Siberian components appeared: Southwestern (Nganasan - Uralic), Central (Ket - Yeniseian) and Northeastern (Koryak). As a result, the Selkups "lost" their genetic identity.

Probably because there are only 2 Kets in the data. I'll keep it in mind to include them in future analyses. Unfortunately I just started one with 67 populations that I don't want to interrupt, but in the next iteration I will put them in.

Where'd you get these samples? I can only find the data for the ancient Eskimo.

Lithuanians are a fairly good proxy for unadmixed NE European Caucasoids, but Iranians aren't a good proxy for unadmixed Caucasoids of Asia Minor. You should add Armenians to the analysis to see how unadmixed Anatolian Caucasoids would appear on MDS.

I am sure those samples will be very interesting to NW and NE Europeans who seem to be gifted with admixture from NE Eurasia.

I am wondering about the small but unexplained NE, North Eurasian admixture seen in some Southern Europeans. Is it all Altaic, a consequence of the Anatolian Turks, or something quite different? There is very little said about the movements of Altaic speakers in Southern Europe like the Huns or Avars, and what effect did the Ottomans have in the Balkans?

The geographical and language correspondence is expected, so I don't find those striking.

What I find more striking is the great level of diversity in a very sparsely populated area, in which many populations were historically more compact geographically than they are now (Mongol and Altaic language speakers made their way to the West only in historic times).

While there are clear clusters, they are not tight ones, and for language families, the clusters take up an immense space on the MDS map compared to Europeans or the Chinese.

On the East-West axis some of that may simply be tracking levels of admixture between very different West Eurasian and East Eurasian populations, or may be an artifact of the scale units. It would be interesting to see what the scatterplot would look like with dimension two twice as fine, and dimension one a quarter as fine as it is in the plot. That would make the North-South gradient from East Asia, to Altaic, to Uralic to proto-Siberian more visually obvious, while understating the significance of the European v. East Asian gradient across Siberia.

Dieneke, one very important point, remember that in Auton et al. 2009, which used the POPRES samples, the only non-West Eurasian component in Turks was the South Asian component with no East Eurasian component. Is the difference because the Auton et al. paper used the Affymetrix microarray platform (500K) instead of Illumina? Which microarray platform is more reliable according to you?

Lol, are you referring to the 0.1% East Asian found in the CEU again? It's really only the Finns and Russians who have a small but significant enough amount of the East/Northeast Asian component.

If HU stands for Hungary, the "Huns" and "Avars" did not contribute much non-European blood at all. However, if you look at Dienekes' recent analysis, which included the Romanians, you'll see that they have a sizable contribution from the east.

Do you have data on the Swiss, such as from the 2008 study "Genes mirror geography within Europe", or any other study, especially one including non-western Swiss? I would really like to see the Swiss in Dodecad. I don't really know what they are (central European/German, French-like, or Alpine/northern Italian).

Lol, do you see any East Asian reference populations in what you are linking? How do you expect to see the East Eurasian admixture in Turks in a study that doesn't include East Eurasians?

onur, the barplot you sent me has a vertical height of about 33 pixels. This means that ~5-6% East Asian admixture in Turks will occupy 1-2 pixels, which, moreover, may be averaged out in the lossy JPG format.
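The arithmetic behind that pixel estimate can be checked with a few lines; the function name is hypothetical, and the 33-pixel height and 5-6% range are the approximate figures quoted above.

```python
def component_height_px(bar_height_px, fraction):
    """Height in pixels taken up by an ancestry fraction in a stacked bar."""
    return bar_height_px * fraction

# Approximate figures from the comment: a 33-pixel bar, 5-6% admixture.
low = component_height_px(33, 0.05)   # about 1.65 pixels
high = component_height_px(33, 0.06)  # about 1.98 pixels
```

Both values fall between one and two pixels, so a component of this size could easily be lost to rounding or compression in a small image.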

Dieneke, it is obvious even with this level of resolution that the results of Auton et al.'s Turks are very different from those of your and Behar et al.'s Turks. There is no visible East Eurasian component in Auton et al.'s Turks and a South Asian component much bigger than in your and Behar et al.'s Turks. Is it because Auton et al. include only 4 Turks? Maybe. But Auton et al. use a different microarray platform from that of Behar et al. and your project, and this may be the reason for the difference too. I think you can easily test this possibility using the Affymetrix microarray platform on Turks and other populations. POPRES samples would be invaluable, but they aren't open source AFAIK.

I don't know if the Tibetan-Qiangic group's data from the paper are available to you; if so, it would be better to add them to compare with Siberians and Mongolians. Mongolians share a relatively large grey cluster with Tibetans and Qiangs, while the red cluster shared with south Chinese, Koreans, and Japanese is relatively smaller according to the older paper. Thanks!

You linked (http://2.bp.blogspot.com/_Ish7688voT0/Sfttv4ydl5I/AAAAAAAABVE/ycfoDOsujnQ/s1600-h/auton_structure.jpg) showing the results of Auton et al. 2009. I'm so glad that it has results for the German Swiss. Are the results displayed in any other way, such as Fst values or composition graphs?

Nuadha, the Auton et al. paper doesn't have intra-European Fst estimates; and unfortunately its intra-European haplotype analyses are between groups of countries and ethnic groups, not between single countries and/or ethnic groups, so they don't provide any specific information about the German Swiss. As to its other graphs, again they don't have any intra-European results, and they have no European country/ethnicity labels. In short, the Auton et al. paper has nothing of value about the German Swiss, so I can't make a sub-racial classification for them based on this paper.

Dienekes' Anthropology blog is dedicated to human population genetics, physical anthropology, archaeology, and history.

You are free to reuse any of the materials of this blog for non-commercial purposes, as long as you attribute them to Dienekes Pontikos and provide a link to either the individual blog entry or to Dienekes Anthropology Blog.
