Monday, December 19, 2011

'world9' calculator

I have consistently received requests for an assessment of Amerindian ancestry. While the focus of the Project is, and will remain, the region of Eurasia, I thought it was a good idea to release a tool that could be used by persons of partial Amerindian ancestry.

I have also included the two Australasian populations currently available, namely Bougainville Melanesians (NAN_Melanesian) and Papuans from the HGDP.

The inferred components at K=9 are quite similar to those of 'eurasia7', with the addition of the Australasian and Amerindian components. I have also included the Kalash in this experiment, which caused the 'West_Asian' component to be modal in them. The Kalash do not differ from other populations in this component so greatly as to render it strongly population-specific. I have called this component 'Caucasus_Gedrosia'; like the 'eurasia7' West Asian component, it ought to be quite similar to the k5 component inferred by Metspalu et al. (2011).

It is unfortunate that there are only two Australasian populations currently available as public data. There are many more Amerindian and Mestizo ones, but it should be noted that the Amazonian populations on which the 'Amerindian' component is modal are some of the most lacking in genetic diversity in my entire database. As a result, Eurasians who lack any Amerindian or Australasian ancestry can expect to see a little of it in their results as noise.

This is a very important caveat for Americans who suspect that they may have an Amerindian ancestor. Small levels of this component may be noise. The component is also found in Siberia, where it may represent either backflow from the Americas or the common ancestry of Siberian and Amerindian populations. If you are interested in the detection of Amerindian ancestry, I recommend that you use DIYDodecad's 'byseg', 'bychr', and 'target' modes to drill down deeper into your genome.

Download Files

The spreadsheet contains admixture proportions, the table of Fst distances, and individual results in the Individual Results tab.

The RAR file contains files for use with DIYDodecad. Extract its contents to the working directory of DIYDodecad. To run the calculator, follow the instructions in the README file, but type 'world9' instead of 'dv3'.

Terms of use:

'world9', including all files in the downloaded RAR file, is free for non-commercial personal use. Commercial uses are forbidden. Contact me for non-personal uses of the calculator.

Information

Admixture proportions barplot:

The nine ancestral components are:

Amerindian

East_Asian

African

Atlantic_Baltic

Australasian

Siberian

Caucasus_Gedrosia

Southern

South_Asian

Table of Fst divergences:

Neighbor-joining tree of Fst distances; the long Australasian branch (and, to a lesser degree, the Amerindian one) is due to the high level of inbreeding in the populations for which these components are modal.
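For readers unfamiliar with neighbor-joining, a minimal sketch of its first joining step may help show why a population that is uniformly more distant from all others ends up on a longer branch. The labels and distance values below are toy numbers, not entries from the world9 Fst table, and `nj_first_join` is a hypothetical helper written for this illustration.

```python
def nj_first_join(names, dist):
    """Return the first pair joined by neighbor-joining and their branch lengths.

    names: list of labels; dist: symmetric matrix (list of lists), n >= 3.
    """
    n = len(names)
    totals = [sum(row) for row in dist]  # summed distance of each taxon to all others
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            # Q criterion: the pair with the lowest Q is joined first.
            q = (n - 2) * dist[i][j] - totals[i] - totals[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    _, i, j = best
    # The taxon with the larger total distance receives the longer branch.
    branch_i = 0.5 * dist[i][j] + (totals[i] - totals[j]) / (2 * (n - 2))
    branch_j = dist[i][j] - branch_i
    return names[i], names[j], branch_i, branch_j


# Toy distances (hypothetical): "b" is one unit farther than "a" from c, d and e.
toy_names = ["a", "b", "c", "d", "e"]
toy_dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
first_join = nj_first_join(toy_names, toy_dist)
print(first_join)  # ('a', 'b', 2.0, 3.0)
```

In this toy case "a" and "b" are joined first, and "b" gets the longer branch simply because its distances to everything else are larger, which is the pattern expected for a strongly drifted or inbred population.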

First 8 dimensions of multi-dimensional scaling (MDS):

Technical Details

A dataset of 3,548 individuals/265,519 SNPs/284 populations was assembled. Related individuals were pruned iteratively by removing a single individual from each pair showing an IBD RATIO greater than the mean plus 2 standard deviations, or greater than 2.5. 3,026 individuals remained. An additional 14 individuals were removed because they had a genotype rate below 97%. The marker set was thinned to remove SNPs with a genotype rate below 97% or a minor allele frequency below 1%. Linkage-disequilibrium based pruning with a window of 200 SNPs, advanced by 25 SNPs, and an R-squared of 0.4 was performed. A total of 3,012 individuals and 170,822 SNPs survived these filtering steps. PLINK 1.07 and ADMIXTURE 1.21 were used in the analyses.
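The relatedness pruning described above can be sketched roughly as follows. This is only an illustration under several assumptions, not the code actually used: the function name `prune_related`, the toy IDs and ratio values, the choice to drop the second member of each flagged pair, and the recomputation of the mean and standard deviation after every round are all hypothetical. It also assumes the per-pair ratio values have already been computed elsewhere.

```python
from statistics import mean, stdev


def prune_related(pair_ratios, hard_cutoff=2.5, n_sd=2.0):
    """Iteratively drop one individual from each pair with an excessive ratio.

    pair_ratios: dict mapping (id_1, id_2) -> ratio for that pair.
    Returns the set of removed individual IDs.
    """
    remaining = dict(pair_ratios)
    removed = set()
    while remaining:
        values = list(remaining.values())
        cutoff = hard_cutoff
        if len(values) >= 2:
            # A pair is flagged if it exceeds either threshold,
            # i.e. the smaller of the two cutoffs.
            cutoff = min(hard_cutoff, mean(values) + n_sd * stdev(values))
        flagged = [pair for pair, ratio in remaining.items() if ratio > cutoff]
        if not flagged:
            break
        dropped = set()
        for first, second in flagged:
            # Assumption: drop the second member, unless the pair is already
            # broken up by an individual dropped earlier in this round.
            if first not in dropped and second not in dropped:
                dropped.add(second)
        removed |= dropped
        remaining = {
            pair: ratio
            for pair, ratio in remaining.items()
            if pair[0] not in dropped and pair[1] not in dropped
        }
    return removed


# Toy input (hypothetical values): only the pair ("a", "b") looks related.
toy_pairs = {
    ("a", "b"): 3.1,
    ("a", "c"): 1.9,
    ("a", "d"): 2.0,
    ("b", "c"): 2.0,
    ("b", "d"): 2.0,
    ("c", "d"): 2.1,
}
removed = prune_related(toy_pairs)
print(sorted(removed))  # ['b']
```

Which member of a pair to drop is a design choice the text does not specify; dropping the individual with the lower genotype rate would be another reasonable option.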

35 comments:

In this run, my Amerindian and Siberian components, combined, were significantly larger than the Asian component(s) I used to get on other calculators (v3, k12a, euro7, eurasia7, and some of Eurogenes' calculators). Is it possible that the Amerindian component has now been overestimated (say, because some of the Amerindian control samples were admixed), or is it more likely that it was previously underestimated due to a lack of Amerindian control samples in earlier calculators?

Can anyone help me understand how to load my Population Finder results from FTDNA into this calculator? My results are from FTDNA, and I will gladly share them, since my mother is from Oceania and my father from Pakistan.

Umm, I have read your article about being cautious with admixture estimates. Are pure Amerindian reference samples used in the world9 calculator? I think it is very important to choose pure, 100% Amerindian samples in order to get an accurate percentage.

If you have Malagasy ancestry, that could be a reason, since Native Americans have been found to have Polynesian markers in their DNA and come from an East Asian background, like Polynesian/Austronesian people.

Hello, I have the 2011 world9 calculator. Is there a newer or more up-to-date version than the one I have? If so, please let me know where to find it, or tell me what I may be doing wrong. I ask because I have non-African-American results, and I wonder whether that is due to the earlier version I downloaded some years ago. Thank you.


You may cite, quote, or reproduce articles on this site for non-commercial purposes, provided that you attribute them to Dienekes Pontikos and provide a link either to the main page of this blog or to the individual blog entry you are referring to.