As DF Reynolds has pointed out, Geno 2.0 testing is likely to change the tree so much that R1b will soon be archaic....

ISOGG's tree is probably as current as Geno 2.0 already, if not more so, in terms of new SNPs. Am I missing something? I recognize there will be some SNPs identified in more than one haplogroup that we don't need to know about, but why are we expecting the known Y DNA tree to change because of Geno 2.0, at least the R1b parts of it?

No, you're not missing something. Geno 2.0 is not an SNP discovery trip like WTY is. The Geno 2.0 chip is "loaded up" with known SNPs. The following quote is from Thomas Krahn, a member of the Geno 2.0 design team:

"Also you shouldn't have a too high expectation to find some ground-breaking new SNP with the Geno 2.0 test. The phylogeny of the contained markers is pretty much established and only in a few occasions we will find unexpected parallel or reverse mutations. Geno 2.0 cannot find absolutely new SNPs, however it will bring high resolution SNP testing to a very broad user-base."

Further to the above, it strikes me that there may be a point of confusion with regard to what we might call "new" SNPs versus "known" SNPs.

As Thomas tells it (above), all of the SNPs on the chip are "known" by definition because they have to be -- the chip is programmed to recognize (and report instances of) known SNPs.

However, of those thousands of "known" SNPs on the chip, a subset could be called "new" as far as FTDNA customers are concerned because they haven't been available for testing via FTDNA until they were programmed into this chip because they were only "known" to sources outside of FTDNA and its customer community.

If that's the case, then those of us who have currently tested "to the death" as it were with FTDNA may have the opportunity to be found positive for SNPs that were hitherto unavailable to us via FTDNA, so we could consider those to be new to us (if not to science).
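To make the "known" versus "new to us" distinction concrete, here is a minimal Python sketch. The SNP names in it are placeholders invented for illustration; they are not the actual contents of the Geno 2.0 chip or of FTDNA's catalogue.

```python
def new_to_customers(chip_snps, previously_offered):
    """Return chip SNPs that are already known but were not yet orderable."""
    return set(chip_snps) - set(previously_offered)


# Hypothetical data: every SNP on the chip is "known" by definition,
# but only some of them could previously be ordered by customers.
chip = {"SNP_A", "SNP_B", "SNP_C", "SNP_D"}
offered = {"SNP_A", "SNP_B"}

print(sorted(new_to_customers(chip, offered)))  # ['SNP_C', 'SNP_D']
```

Nothing in the result is new to science; it is simply the part of the chip that customers could not test for before.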

Exactly. SNP rs11799226 was added to the NCBI database in Feb 2004, so it had been "known" for over four years by the time 23andMe added it to their v2 chip. Then in Oct 2008, a whole bunch of R-P312* folks were pleasantly surprised to find out they were rs11799226+, which led to Thomas Krahn offering the SNP as L21 and Jim Wilson offering it as S145.
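The same mutation ending up with three labels is easy to trip over when comparing results from different labs. A rough Python sketch of an alias lookup follows; the only mapping in it is the one stated above (rs11799226 = L21 = S145), and the function names are my own invention, not any lab's API.

```python
# One mutation, three labels, as described in the post above.
ALIASES = {
    "rs11799226": {"L21", "S145"},
}


def canonical_id(name, aliases=ALIASES):
    """Map a lab-specific SNP name to its rs-number, if one is listed."""
    for rs_id, names in aliases.items():
        if name == rs_id or name in names:
            return rs_id
    return None


def same_snp(a, b, aliases=ALIASES):
    """True when both names resolve to the same listed rs-number."""
    ca, cb = canonical_id(a, aliases), canonical_id(b, aliases)
    return ca is not None and ca == cb


print(same_snp("L21", "S145"))  # True
```

A name missing from the table resolves to None, so two unknown names are never treated as the same SNP.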

I guess I'm nitpicking, but I think it is important to make the distinction: Geno 2.0 isn't discovering new SNPs; it is really just a new and much, much more comprehensive "deep clade package". However, that is a big thing. Add to that the fact that the National Genographic Project is offering a deep clade kind of package for the first time, which will mean a very large influx of additional people tested fairly deeply on SNPs. This is all bound to cause refinements to the Y DNA tree.

My perception is that the citizen-scientist team has done a great job of finding differentiating SNPs within R1b based on Y chromosome scanning projects. I don't know how difficult this was, but I know folks combed through large amounts of "raw" data. Any opinions? Maybe Richard R or Greg have one on this: do they feel a lot of Y locations in the raw data have not yet been analyzed and cross-referenced against known Y DNA SNPs?

I guess I'm asking whether there is much of a chance of a latent L21 lying around in the data, per the 23andMe example. There was no WTY back then and less publicly available human genome data in general, plus the citizen-scientist team wasn't as fully engaged.

I can see something akin to L459 and/or Z245 finding their own unique places on the Y DNA tree, splitting off fairly small paragroups or what have you, but I'd be surprised by a major division within L21 or P312 or something. Am I wrong?

Chris Tyler-Smith trawled through the 1K Genomes Project data and reportedly found ~3500 SNPs. I don't know what overlap, if any, there is between his findings and those of the citizen scientists, but it seems there's potential for "new" SNPs in that component of Geno 2, if not others.

Tyler-Smith used 525 diverse males from the 1000 Genomes dataset and 36 from the Complete Genomics dataset. Here are some of the findings.

Some facts (I don't know if this is the final status, i.e. the publication tree):

- 1K Genomes: 523 individuals and 15,953 sites (SNPs)
- Complete Genomics: 36 individuals and 6,662 sites (SNPs)
- Expansion of DE/GR calculated at ca. 66,000 years
- Contemporary expansion of GR confirmed
- Late extreme expansion of R1b calculated at ca. 11,000 years
- R1b, O and E with very good coverage. Not much I, J, N diversity, some D, Q and R1a samples and very few A, G and T samples. L individuals completely missing?
- I understood that new SNPs will simply have an rs-number
- Y haplogroups: C. Tyler-Smith asks if a nomenclature or abbreviated names for major clusters (R1b-M269) are the best solution

In the 1KG data, there were many positions on the Y that either did not sequence or did not sequence to a quality worth reporting. This was due to the low number of passes (2x-4x coverage). So, while unlikely, it is possible that some major branch is waiting to be discovered. Will that branch be revealed in Geno 2.0? I think Thomas has said 'no'. To be honest, I am so well tested that I did not order Geno 2.0 until I found out that Sardinian full genomes were sequenced, even though U152 in Sardinia seems not to be L2+. Since Sardinia is prone to insular founder SNPs, I will probably remain L2*.
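On the low-coverage point above, a small Python sketch shows why positions drop out of the reportable set. The positions, depths and threshold are made up for illustration; real pipelines apply more quality criteria than raw depth.

```python
def callable_positions(depth_by_position, min_depth):
    """Keep positions whose read depth meets the chosen threshold."""
    return {pos for pos, depth in depth_by_position.items() if depth >= min_depth}


# Hypothetical Y positions with the number of reads covering each one.
depths = {1001: 0, 1002: 2, 1003: 4, 1004: 9}

# With a (hypothetical) minimum of 5 reads, most 2x-4x positions are dropped.
print(sorted(callable_positions(depths, min_depth=5)))  # [1004]
```

Any SNP sitting at one of the dropped positions would simply go unreported, which is how a branch could stay hidden in data that has already been sequenced.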

On the other hand, there were many 1KG same-level and singleton SNPs located in positions for which it was impossible to create primers. Those would not be an issue for sequencing and may produce some sub-branching. Also interesting are the SNPs that are a little less stable, which could act as proxies for unknown SNPs.

But Sicily is also an island, and there were exchanges between the two: see the R-M269* haplotype of that Elymian I cannot mention, which has clear links with Sardinia. Certainly the Sardinian U152s are derived from the continent, but for that very reason they may carry many SNPs from the continent. Judging by the haplotype of my friend (Grassi) from Liguria, I think that L20 (and therefore L2) is older than is thought and may have arisen there and not elsewhere.

I did not mean to imply that I thought an SNP of the magnitude of L21 would be found via Geno 2.0. It is certainly possible, but I wouldn't think it very probable.

My expectation is that many of the changes we will see will be in the mid to upper reaches of the haplogroup trees. Certainly consumer testing is focused on the terminal branches and there are hundreds of SNPs higher up in the trees that have been only very loosely characterized.

I am also interested in the 130,000 AIMs which are used for deep ancestry. This is not something we have had available before. The first results and the detailed report should shed more light on this. There will be many new papers presented on the 7th November, including Ancestry Painting 2.0 and POBI; I am wondering if Geno 2.0 will be scheduled for that time frame.

I believe a commencement time of October was mentioned in the initial announcement of Geno 2.

However, it seems the "go live" is also dependent upon publication of Spencer Wells's paper.