Discover the fascinating world of genetic genealogy! Written for the non-scientist, YGG is the best source for unbiased news on the major genealogy DNA testing companies.
Written by CeCe Moore, an independent professional genetic genealogist and television consultant.

Wednesday, July 25, 2012

National Geographic and Family Tree DNA Announce Geno 2.0

Today we have some news that is incredibly exciting
for our citizen scientists and for all those who are interested in determining
their ancestral origins through DNA testing.

National Geographic is entering the next phase of their
Genographic Project in partnership with Family Tree DNA and the genetic genealogy community.
Continuing to move toward their goal of mapping the pattern of human genetics,
they are introducing the new GenoChip 2.0. This chip is specifically designed
for ancestry testing and includes SNPs from autosomal DNA, X-DNA, Y-DNA
and mtDNA. The design of the new chip was a collaborative effort between Eran Elhaik of
Johns Hopkins, Spencer Wells of National Geographic, Family Tree DNA and
Illumina. The testing will be done at FTDNA in Houston.

Dr. Wells explained that "off-the-shelf chips are not
good for studying ancestry" for the simple reason that they are skewed in
favor of medically relevant SNPs and are not focused on detailed inclusion of
the sex chromosomes and the mtDNA. As a result, this team started from scratch
choosing SNPs for the Illumina iSelect HD chip platform one and a half years
ago. The resulting chip includes approximately 146,000 SNPs, avoiding all known
medically relevant markers and exclusively concentrating on ancestry
informative ones. This new chip will be used for both the research and the
public participation component of the project.

The new funding structure for the project will be announced in September.

[Caution ahead: Some of the following is quite advanced, so if you are new
to genetic genealogy, please skip over the unfamiliar portions. I am including
as much as I can from my notes for the more advanced in our community who may
want specific details.]

BASICS

The Geno 2.0 test will be offered for $199.95 with free
shipping within the US on the National Geographic site and will only
require a cheek swab. All resulting data will be downloadable. They will begin accepting pre-orders today for a fall shipping date (10/30/12). In the future, orders will also be accepted through the Family Tree DNA website (no date is set for this option). Although this is not a traditional relative finder matching tool and is not meant to replace Family Finder, it will cluster you to your closest genetic matches and you will be able to send an anonymous email to correspond with them (not functional at launch). These circle clusters will demonstrate how you connect to people one thousand years ago.

Y-DNA SNPs

The chip includes just over 12,000 Y-DNA SNPs. Ten thousand
of these are completely unique and have “never been published before”. First, the team created probes for all of the 862
Y-SNPs from the current YCC 2010 Tree. Next, they contacted research centers
all over the world and asked them to provide a list of all the Y-SNPs that they
had data mined or discovered, including the L SNPs, the Z SNPs and “private
Hammer” SNPs, and created probes for those. Y-SNPs discovered by citizen
scientists were also included.

More details:

- Many new terminal branches will be gained and, according to
Bennett Greenspan, this will completely replace the deep clade test currently
offered by Family Tree DNA.

- Y-SNPs were vetted against Family Tree DNA’s “Walk Through the Y”
samples.

- 862 SNPs from YCC 2010 Tree vs. 6,153 SNPs on the New Tree

- About 200 SNPs from 2010 failed with ~160 SNPs from 2010
unconfirmed

- Most failures were at roots
such as R, P, A2 and F. Many have synonymous SNPs.

- 115 SNPs from YCC 2010 Current R-Tree vs 550 SNPs on the New
R-Tree with ~200 more potential

- 31 SNPs from 2010 failed with 25 more unconfirmed, but in
progress

- Rebekah Canada wrote and/or performed comprehensive rewrites
of 182 different Y-DNA stories based on approximately 1000 peer reviewed publications and information
from the genetic genealogy community.

- New, updated Y haplogroup maps

mtDNA SNPs

The chip also includes over 3200 unique mtDNA SNPs. They
started by creating probes for the 3352 highest frequency mtDNA SNPs from
Family Tree DNA and GenBank. According to Elliott Greenspan, the level of difficulty
was greatly increased due to variability in mtDNA. It was necessary to create about
31,000 probes to cover all of the variation that can be found in the
surrounding flanking regions. Ultimately, they were able to detect about 3200
of those and, as a result, they can determine about 90% of the known
haplogroups at this point.

More details:

- All SNPs were vetted by running known samples.

- Rebekah Canada wrote new and/or performed comprehensive
rewrites of 248 different mtDNA stories based on ~1000 peer reviewed
publications and information from the genetic genealogy community.

- New, updated mtDNA haplogroup maps

Autosomal and X-chromosomal SNPs

Over 130,000 autosomal SNPs and X-DNA SNPs were chosen:

- AIMs harvested from literature

- AIMs identified using two methods

- Contributed by Family Tree DNA

- Identified at Random

Ancestry Informative Markers (AIMs) are SNPs that show
substantial differences in allele frequency across population groups. Approximately
75,000 AIMs were chosen from approximately 450 populations around the world. About
half of these AIMs were collected from about two dozen published papers and the
rest were calculated from private and public datasets. Many of these population
datasets had not previously been studied for this purpose, so they used two algorithms to develop
new and never before used AIMs: infocalc by Rosenberg and a private algorithm
developed by Dr. Elhaik called “AIMsFinder” (PCA approach). Dr. Elhaik personally
collected over 300 population datasets from which they had genotype data from thirty
thousand to over one million base pairs and did very exhaustive pairwise
comparisons between difficult-to-distinguish populations to build a unique
database of AIMs.
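
For readers who like to see the idea in code, here is a minimal sketch of what "substantial differences in allele frequency" can mean in practice. It is not infocalc and it is not AIMsFinder; it simply ranks SNPs by the absolute difference in allele frequency between two populations. The SNP names, population names, frequencies and the 0.5 cutoff are all made up for illustration.

```python
# Hypothetical allele frequencies (fraction of chromosomes carrying one allele)
# for three made-up SNPs in two made-up populations; this is not real data.
freqs = {
    "snp_a": {"pop1": 0.90, "pop2": 0.10},
    "snp_b": {"pop1": 0.52, "pop2": 0.48},
    "snp_c": {"pop1": 0.05, "pop2": 0.70},
}


def frequency_delta(snp_freqs, pop_x, pop_y):
    """Absolute difference in allele frequency between two populations."""
    return abs(snp_freqs[pop_x] - snp_freqs[pop_y])


def pick_aims(freqs, pop_x, pop_y, min_delta=0.5):
    """Return SNP ids whose frequency difference meets the cutoff, largest first."""
    scored = [(frequency_delta(f, pop_x, pop_y), snp) for snp, f in freqs.items()]
    return [snp for delta, snp in sorted(scored, reverse=True) if delta >= min_delta]


print(pick_aims(freqs, "pop1", "pop2"))  # ['snp_a', 'snp_c']
```

A SNP like snp_b, which is about equally common in both populations, says almost nothing about where a person's ancestors came from, so it is dropped.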

They also wanted to address the question of how much
interbreeding occurred between modern humans and ancient hominins. Once again,
they collected all relevant SNPs from existing literature on the subject and
included those on the chip. However, they wanted to go further so they used a
novel approach. They identified regions in which modern humans and Neanderthals share
the derived allele while Denisovan and chimpanzee share the ancestral allele, and then
repeated the exercise for alleles that are derived in Denisovan but not in Neanderthal or
chimpanzee. Ultimately, they collected about 30,000 such SNPs that they feel can
help identify interbreeding between ancient hominins and modern humans.
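
The allele-sharing pattern described above can be sketched in a few lines. This is only my reading of the approach from my notes, not the team's actual pipeline. The function name and labels are hypothetical, and treating the chimpanzee base as the ancestral state is an assumption of the sketch.

```python
def classify_site(modern, neanderthal, denisovan, chimp):
    """Label one SNP by which archaic genome shares the modern human allele.

    Each argument is a single base such as "A" or "G". The chimpanzee base is
    treated as the ancestral state (an assumption of this sketch).
    """
    ancestral = chimp
    if modern == ancestral:
        return None  # the modern human allele is not derived at this site
    if neanderthal == modern and denisovan == ancestral:
        return "neanderthal-shared"
    if denisovan == modern and neanderthal == ancestral:
        return "denisovan-shared"
    return None  # derived allele shared with both archaics, or with neither


print(classify_site("G", "G", "A", "A"))  # neanderthal-shared
print(classify_site("T", "C", "T", "C"))  # denisovan-shared
```

Sites where both archaic genomes carry the derived allele return nothing here, since they cannot point to one archaic group over the other.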

The team also included SNPs from underrepresented
populations such as Paleo-Eskimos and Aboriginal Australians. What they call “control
SNPs” came from 7,500 random SNPs that have high frequency in the HapMap
and 1000 Genomes Project. They were included to facilitate future studies on
these SNPs and how they distribute in different populations. They excluded a
large number of SNPs that had high linkage disequilibrium (LD) in all
populations, except for those found in the Hunter Gatherer and Papuan
populations because these are of special interest for future studies. (An
interesting side note, when these high LD SNPs are removed from the
commercial platform chips, only about half of the total remains.) The team only
included SNPs that were confirmed by both HapMap and 1000 Genomes to reduce the
number of erroneous SNPs.
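
To give a feel for what removing high LD SNPs involves, here is a rough sketch. The team did not say how they measured LD or what cutoff they used, so the genotype-based r-squared, the greedy keep-or-drop rule and the 0.8 threshold below are all assumptions, and the genotypes are invented. It also looks at a single sample set, whereas the team only dropped SNPs that were in high LD in all populations.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length genotype vectors.

    Genotypes are coded as 0, 1 or 2 copies of one allele per person.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a SNP with no variation cannot be correlated with anything
    return cov * cov / (var_x * var_y)


def prune_high_ld(genotypes, max_r2=0.8):
    """Greedy pruning: keep a SNP only if its r^2 with every kept SNP is below max_r2."""
    kept = []
    for snp, calls in genotypes.items():
        if all(r_squared(calls, genotypes[k]) < max_r2 for k in kept):
            kept.append(snp)
    return kept


# Invented genotypes for six people at three SNPs.
genotypes = {
    "snp_1": [0, 1, 2, 0, 1, 2],
    "snp_2": [0, 1, 2, 0, 1, 2],  # identical to snp_1, so r^2 is 1
    "snp_3": [2, 0, 1, 1, 2, 0],
}

print(prune_high_ld(genotypes))  # ['snp_1', 'snp_3']
```

Because snp_2 carries exactly the same information as snp_1, it is dropped, which is the same reasoning behind the side note that commercial chips shrink to about half their size once high LD SNPs are removed.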

To ensure that the genetic results will not be used for
unethical purposes such as political ends, pharmaceutical ventures, etc… all
samples are anonymous, no medical or trait data is collected, and all SNPs are
non-coding and have no known function. In order to facilitate this process, the
team built a huge database that included all SNPs that were known, suspected or
implied to have associations with disease or traits. To avoid imputation, they
also removed high LD SNPs. They are confident that phenotype cannot be inferred.

More details:

- 23,962 Neanderthal SNPs

- 1,357 Denisovan SNPs

- 12,027 Aboriginal SNPs

- 10,159 Eskimo Saqqaq SNPs

- 998 Chimpanzee SNPs

- 975 X chromosome SNPs (the team is looking for more X chromosome
AIMs from citizen scientists)

- 76% of SNPs overlap with Illumina 660k array

- 55% of SNPs overlap with Illumina HumanOmni1-Quad and Express and Affy 6.0

- 40% of SNPs overlap with Affy 5.0 and Human Origins Chip

- GenoChip is enriched for Common Alleles

- Heat maps

Summary

All of this adds up to an unprecedented effort by National
Geographic and Family Tree DNA to move genetic genealogy in an innovative new
direction. This is a very exciting time for all of us citizen scientists since
it appears that there is increasing opportunity to contribute to this advancing
field and recognition for those who do.

This blog post is really just a
start. There will be much more to report in the coming weeks, including a product review. So, be sure and check back!

[Update 12/13/12 - You can see my results here and others here and here.]

For a genealogist who has questions about his/her ancestral origins, this test should be more accurate than any of the others currently on the market. It is also a way to possibly discover unknown origins of your direct paternal and/or direct maternal lines through their very detailed haplogroups.

I personally see many uses, including ethnicity prediction (which interests both genealogists and non-genealogists); deep clade haplogroup analysis (something I've been meaning to do for years but never got around to); SNP discovery to refine Y-DNA and mtDNA trees, etc. I think there may also be numerous third-party uses, some of which we haven't thought of yet. It's not useful for identifying autosomal cousins, but of course there's much more to genetic genealogy than identifying cousins.

Hi CeCe, Great info! Do you think this would be a good investment for us adoptees? I have done so much already with 23andMe, FTDNA and upgrades to 67 and Deep Clade… plus I am waiting for my Ancestry.com results.

Hi Bob! I am always happy to give advice to a "rooky" and adoptees. I think that this test will likely be the most accurate test for determining unknown ancestral origins. For instance, if you have a SNP that is only found in specific populations, that will tell you that at least one of your ancestors originated there. If you "cluster" with others exclusively from a small village somewhere, then you can safely assume that you have some connection to that place too. It probably isn't the first test I would recommend since a person likely wouldn't discover close biological family members there, but if there are questions about an adoptee's ancestral origins, then this would be a worthwhile test. I will have to see how it works to really be able to give a definitive answer though, so if I were you, I would probably wait to order it until I review it - unless you just can't stand the anticipation - then go ahead! :-)

Thank you for the info regarding a discount. Apparently, customer service at NatGeo are not yet aware of this, so they are advising customers to wait to order if they are going to use the discount. Since the data will be downloadable, I am very confident that third party tools will be developed for it.

Janice, It remains to be seen what impact this test will have on the necessity of single SNP orders for each individual haplogroup. Personally, I will not suspend my SNP testing in search of my I-M223 Moore origins. The test will not be out for at least a couple months and I don't want to wait!

@Celala, Family Finder is a test that allows you to find people who are related to you in a genealogical time frame (within the past 400 years or so). The primary purpose of Geno 2.0 is for studying deep ancestry and for deep Y chromosome subclade testing. The mtDNA test results will also be helpful from Geno 2.0, but this test will not be a substitute for a complete mtDNA sequence, which is still the gold standard for mtDNA.

@Mike, The Y STR test that National Geographic offers is entirely different from Geno 2.0. The results are complementary to each other. Sincerely, Tim Janzen

@sb10 - There are many never before researched SNPs included on the new test that may (or may not) be informative to you. I would suggest waiting until it comes out and the SNP researchers get a chance to review it before purchasing it. CeCe

@Prasenjit - There is MUCH more to be learned from the new Geno 2.0 test as compared to the old one. It sounds as if it may be very interesting for those of Indian ancestry, but we won't know for sure until it comes out and the "citizen scientists" get a chance to review it. I would wait until it has been out for, at least, several weeks before deciding whether to purchase it. Thanks for commenting, CeCe

Hi CeCe, Just 2 questions: 1) The results of 67 STR values on my Y are already known. Will Geno 2.0 increase that number or tell me anything I still don't know about my STRs? 2) I hear that SNPs L140 and L141 will not be included in the tests. And what about SNP L177?

Hi Bee, Yes, NatGeo tests over 130,000 autosomal SNPs with this Geno 2.0. You inherit 50% of your autosomal DNA from each of your parents, so your results are based on the ancestry that you received from both sides of your family. CeCe

Hi Jose, It depends which test you have taken, but this test is unique from FTDNA's other tests in that it tests Y-DNA SNPs, mtDNA mutations AND autosomal SNPs, so you receive a Y-DNA haplogroup, an mtDNA haplogroup and an ancestral origin breakdown from your autosomal DNA. Take a look at my results here: http://www.yourgeneticgenealogist.com/2012/12/my-geno-20-results-step-by-step.html

I have tested through ancestry.com and transferred my data to FTDNA. I have purchased an upgrade to my kit for 67 markers and have tested a couple SNPs. I wanted to get my Dad involved and we talked about the benefits of testing with Geno 2.0. I have since ordered the kit only to find out that the new Geno 2.0 does not test STRs as the Geno 1.0 did. So, I will not be able to link his information to mine and share data. I guess he will not appear on the results pages, just the SNP pages. I'm starting to feel like the popular new guy in prison.

I was excited by the Genographic 2.0 offer. I ordered my kit in early December and mailed back my swabs for testing before Christmas. I eagerly awaited the 6-8 weeks to get my results...and waited...and waited...and waited. It has now been 5 MONTHS and the online results page says my analysis is only 60% completed. I contacted National Geographic for a partial refund because of the exceedingly long delay, but they refuse to issue one. This is a new and apparently popular test. I would urge people to seek out another DNA testing company than National Geographic, their Genographic Project, and their contractor Family Tree DNA unless you are willing to wait half a year or more for your results.

Where specifically is the list of what SNPs are on the Geno 2 chip, and also what ones were on it before 2013, if that has changed?

M284 is specifically not on the Geno 2 test, which means there was no attempt to include every important SNP, nor the entire haplotree as of 2010, on the chip.

I need to determine specifically what someone who tested positive for U106 on Geno 2 WAS tested for, so he can be tested specifically for what his Y DNA match was NOT already tested for. There is no one in existence whose endpoint SNP is U106.