Terminal SNPs

FTDNA has released their new yTree and is transitioning to reporting Terminal SNPs. These are two major changes at one time - which is creating a good bit of confusion. In my opinion, Terminal SNPs are fine - as long as we continue to use some version of a companion long form so that we can recognise where a SNP falls on the tree. (Even with the regular changes that have to be made to a long form to keep it current, having it is much better - and less work - than not having any easy reference.)

Issues that I see:

1. Terminal SNPs are a very different format, which uses the first letter of the Haplogroup, followed by a dash and then the Terminal SNP. As an example, my previously reported haplogroup was R1b1a2a1a1b3c2. (I call this the "long form".) My Terminal SNP version is R-L196. In the old system, when someone extended their level of detail, the extra info was added to the end of the long form and it was easy to see the progression. With the new system, position and relationship to other kits won't be understandable without a reference tree. (As an example, if you remove the last digit from my long form, you get the branch now called R-L2 - but nothing in the names R-L196 and R-L2 shows that connection.)

2. Errors: There have been errors in the Tree and errors in the reported Terminal SNP during the roll-out. Right now, things seem to be changing daily - so it's a good idea to lean back and wait for the dust to settle.

3. Transition: The conversion is apparently being made in stages - possibly each night - creating conflicts due to combining the old long form and new Terminal SNP structures. (When I try to update a project, some of the men have a long form and some have a Terminal SNP. And some no longer have any information. I assume that this is temporary.) This is another reason to wait for the dust to settle.

4. Matching Check - based on matching or conflicting Haplogroups: The old logic was that you couldn't have a "Match" with a man with a different Haplogroup. (Yes - we had to compare the shorter haplogroup with the longer one, character by character, until the shorter one ran out of digits. If they agreed that far, they weren't different.)

Example 1 - R1b, R1b1a2, and R1b1a2a1a1b3c2 aren't "Different" - they are simply shorter or longer versions of the same Haplogroup family long form. These men can be in the same Genetic Family (i.e. "Match").

Example 2 - R1b1a1 and R1b1a2 are Different. These men cannot be a "Match" and will be in Different Genetic Families.

5. Mapping: There is no mapping structure built into the Terminal SNP names. The new logic is that each man has a Terminal SNP based on his FTDNA estimate or from his own formal SNP testing.

This will mean that you have to memorize the Terminal SNP structure or have a reference tree on hand as you seek to understand whether two men can be related or not - based on their position on the Tree.

Example 1 - The three men above (R1b, R1b1a2, and R1b1a2a1a1b3c2) are probably reported in the Barton project as R-P25, R-L2 and R-L196. These are all at different points on the same Haplogroup Family long form and are nodes on the same branch of the new yTree. They can be in the same Genetic Family. (note: the L2 estimate in this example is based on matching men who are known to be L2 in the Barton project. Another project with R1b1a2 men may get a totally different Terminal SNP estimate)

Example 2 - I'm not perfectly sure of the translation of R1b1a1 and R1b1a2 - as the FTDNA tree has a different arrangement than before - but these should be R-M73 and R-M269. Whatever is the correct translation from long form to Terminal SNP - they can't be a Match.

6. Color of Haplogroup or Terminal SNP: Note that I am showing Haplogroups in Green or Red. This has always been important, but if anything, it becomes even more important with the new system.

A red designation is used when the Haplogroup is estimated by FTDNA. (the conversion from long form to Terminal SNP will be different from man to man - you can't predict the Terminal SNP for an individual by only knowing his old long form)

A green designation is used when the Haplogroup is based on a formal test of the Terminal SNP. (This translation should be predictable and be the same from man to man when converting from the old long form to the new Terminal SNP.)

A black designation has been used by WorldFamilies (and others) to denote an estimate not provided by FTDNA (such as a result transferred from another testing company). These are going to be difficult to manage.

7. FTDNA and ISOGG Trees: ISOGG has worked in "real time" on their yTree, while FTDNA has released their new info in batches - typically less often than yearly. This has meant that ISOGG typically has the more advanced (up-to-date) and more complete tree. In recent years, ISOGG has established credibility and is regularly cited in scientific journals. It appears that the ISOGG work was not a reference for the new FTDNA tree that uses Terminal SNPs.

8. Differences in FTDNA & ISOGG Trees: There are many differences between the two trees at this point - naming and arrangement are sometimes different and the two trees don't include the same SNPs. I haven't checked - but I hope the two trees don't disagree on the structure when considering the same SNPs. The FTDNA tree is reported to be focused on adding the Geno 2.0 SNPs to their prior tree, while the ISOGG Tree had already included much of the Geno 2.0 learning and moved on to the learning from the Big Y and other data sources. Hopefully, the significant differences will soon be resolved. For now, it is one more place - and reason - to wait for the dust to settle.

9. For WorldFamilies, there is another issue - as we have relied on the Haplogroup to sort men in a project before detecting the matches that define the genetic families that we call "Lineages". This still works for much of the tree, but no longer works in "R" - as R, R1a, R1b & R2 are all being reported as "R". Additionally, our Results Tool was not built to handle a random combination of long form and Terminal SNP reporting (which is what we are receiving from FTDNA today). Both are creating problems for the WorldFamilies results tables. We are working to resolve these - but until we do - you will see all sorts of oddities in the Results Tables that we prepare. (Yes - the user complaints have already started.)
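The matching logic in issues 4 and 5 can be sketched in a few lines of Python. This is only an illustration, not how FTDNA or the WorldFamilies Results Tool works: the function names are made up, and the small lookup table holds nothing but the translations used as examples in this post (some of which I am not fully sure of). The point is that the long form carries its own map, while a Terminal SNP needs a reference table before it can be compared.

```python
import re

# Hypothetical lookup built only from the examples in this post.
# A real reference tree is far larger and keeps changing.
SNP_TO_LONG_FORM = {
    "P25": "R1b",
    "M73": "R1b1a1",
    "M269": "R1b1a2",
    "L2": "R1b1a2a1a1b3c",
    "L196": "R1b1a2a1a1b3c2",
}


def levels(long_form):
    """Split a long form into its levels: 'R1b1a2' -> ['R','1','b','1','a','2']."""
    return re.findall(r"[A-Za-z]|\d+", long_form)


def same_branch(long_a, long_b):
    """True when one long form is just a shorter version of the other."""
    a, b = levels(long_a), levels(long_b)
    n = min(len(a), len(b))
    return n > 0 and a[:n] == b[:n]


def terminal_to_long_form(terminal):
    """Translate 'R-L196' to a long form, or None if the SNP is not in the table."""
    snp = terminal.partition("-")[2]
    return SNP_TO_LONG_FORM.get(snp)


def can_match(terminal_a, terminal_b):
    """True/False when both SNPs are known; None means 'go look it up'."""
    a = terminal_to_long_form(terminal_a)
    b = terminal_to_long_form(terminal_b)
    if a is None or b is None:
        return None
    return same_branch(a, b)
```

With the old long forms, same_branch("R1b", "R1b1a2a1a1b3c2") needs no outside information at all. With Terminal SNPs, can_match("R-P25", "R-L196") only works because the table is there, and any SNP missing from the table comes back as None - which is the "have a reference tree on hand" problem in miniature.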

Operating with Terminal SNPs without a companion long form name is going to make for a long and painful transition. Making the transition with a much different yTree makes the challenge greater. My opinion is that a companion long form would greatly simplify and ease this transition.

I suggest that you give things time to settle and then that you get advice from your Haplogroup or Surname project administrator before you order any SNP.

Has anyone else tested up to SNP R Z2123+? By looking extensively on Google I seem to be from the eastern side of India very long ago. That would put me into the Bashkirs or Indians but at the very least on the Eastern European or Central Asia area around 2500 BCE. I can't find it now but some spot in World Families had someone indicating Iran. Any help out there?

Terry, thank you for this post. I saw the note regarding the change to Terminal SNPs on the results page and was wondering where to find more information. Your explanation is good... but do you know why FTDNA is making this change?

I checked my FTDNA project page and was startled to see some drastic SNP reassignments. One man went from M269 to P-P295, another from M269 to I-M253, and a third from M269 to G-P15. So I made those changes on my Worldfamilies site and notified the three of them. This morning I belatedly thought to verify by checking their Personal Pages - and found that there the three are R-P311, R-M269, and R-M222 respectively. (There is no way the third man can be M222.) George Washington sure had the right idea about what to do with trees.

Paul, I have also found completely wrong assignments. (One was particularly easy to verify because the man was matching two men who still have his original E haplogroup.) I have notified FTDNA with the specifics that I found in case the examples can help them troubleshoot their problems. At this stage, I have made no systematic spot check to see if the Terminal SNP estimates are reasonable - but am reacting to the ones that are clearly wrong.

The good news is that I am seeing some improvement in the situation - with many of the R1b men now having a Terminal SNP estimate after going a number of days with nothing in the Haplogroup/Terminal SNP field.

I am still seeing:

Men with no Terminal SNP estimate

Some still having an old "longform"

Some having nothing

Men with incorrect Terminal SNP estimates

Men with a different Haplogroup than they had only days ago

Men with a Terminal SNP that is not on the Haplotree for my own myFTDNA kit page

Our WorldFamilies Results Tool is having particular trouble with R1a and R1b assignments, as FTDNA now reports both as "R" and is reporting an assortment of Terminal SNPs for any particular longform branch of their tree. My programmer partially fixed this before going on vacation and we have already discussed an approach which we think will fully address the issue when he gets back to work.

I have no guess whether it will be hours, days, weeks or months before this ugly transition is completed. I hope it's soon. It would have been helpful if administrators were kept in the loop on what is happening!

Well, the good news is that most kits now have a Terminal SNP. There are a few men without an estimate - a situation that I normally report to FTDNA so that the men can go into the Haplogroup Assurance program. However, I am going to wait a while before I start using my time to report these - as I am guessing that many of these will be dealt with. (Sadly, I guess we'll always have the nuisance of men who transfer in with a Sorenson test and didn't retest at FTDNA - as those men don't get a Terminal SNP estimate)

I am still seeing some substantial issues:

1. Terminal SNP reporting inconsistency

a. FTDNA has chosen one of multiple SNPs associated with a branch to report as the reference Terminal SNP for that location - but then (appropriately) reports the actual SNP that was tested if there was a formal SNP test. That can create more difficulty in making the correlations. As an example, at R1a1a (ISOGG long form), M198 was heavily tested at FTDNA and is often reported as the man's defining Terminal SNP - R-M198 - but FTDNA calls that branch R-M512 (except on his Haplotree). The good news is that M198 is visible and a search on the page will find it and make for a relatively easy correlation between M512 and M198.

b. My cousin is Z8+, reported as R-Z8. (I'll guess most readers won't know where that is without doing a lookup - or even be sure whether it's on the R1a or the R1b side.) And - good luck with your lookup - and don't do it in your Haplotree, because you won't find it! When you look at my cousin's Haplotree, it shows up very nicely - with Z8 being the name of the branch. However, the 3 men he matches who are tested one branch further to Z11+ can look at their Haplotree and find themselves, but they can't find my cousin, as their branch which includes his Z8 is listed as R-Z2 - with the Z8 alternate hidden in a "more" field. They have to either "know" where he fits or have to spend some time hunting for him at ISOGG or elsewhere - as a "Find" search on their FTDNA Haplotree page will fail to find "Z8" anywhere in the tree. (Yes - the ISOGG tree is now always open in my browser.)

2. Terminal SNP estimate errors. So far, I have reported an even dozen of these to FTDNA. (I find them when I run a Results Update)

3. The same SNP reported as a Terminal SNP in more than one place in the FTDNA tree. I reported many of these when I started this blog posting. Many of those are now gone - but I found another one last night.

4. And for me and anyone who uses the WorldFamilies Results Tool - there is a continuing problem with R1a and R1b both being reported simply as "R". Our tool colorizes R1a and R1b to different references - which means that the Tool needs to know if the kit is R1a or R1b. We applied a quick fix which is more than 90% accurate, but it didn't anticipate the issue I mention in part 1. So - I have to fix each of those manually. I hope we'll get this addressed over the next week or two.

5. No reference Long Form. It takes a lot of extra time to deal with this issue. Where I used to be able to instantly know the relationship between Haplogroups, I now have to do a lookup before I can answer a question. For instance, a man wants to know which SNP he should test. I used to go to his match page, quickly identify the most detailed Haplogroups, and almost instantly decide whether it was wise to go for the most detailed Clade or to back off a little - one or two branches from the end. Now - I have to figure out where all those Terminal SNPs fit on the tree before I can even see which are the furthest on the tree. Then, I have to visualize how they are related while I assess the complexities.
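To make issues 1 and 4 above concrete, here is a rough Python sketch of the kind of lookup that is needed when a branch can be reported under more than one SNP name and both R1a and R1b arrive simply as "R". It is not the actual WorldFamilies fix: the table and function names are hypothetical, and the entries are only the ones mentioned in this post (M198 as an alternate for M512 at R1a1a, Z8 as an alternate on the branch listed as R-Z2, plus the R1b examples from earlier).

```python
# Hypothetical tables, filled in only from examples mentioned in this post.
# Alternate SNP name -> the name the branch is listed under.
ALIASES = {
    "M198": "M512",  # R1a1a: often tested as M198, branch listed as R-M512
    "Z8": "Z2",      # Z8 hidden as an alternate on the branch listed as R-Z2
}

# Listed branch name -> which side of "R" it belongs to.
# Z2 is deliberately left out: this post does not say which side it is on.
R_SIDE = {
    "M512": "R1a",
    "P25": "R1b",
    "M73": "R1b",
    "M269": "R1b",
    "L2": "R1b",
    "L196": "R1b",
}


def canonical_snp(terminal):
    """'R-M198' -> 'M512'; names without a known alias are returned unchanged."""
    snp = terminal.partition("-")[2]
    return ALIASES.get(snp, snp)


def r_side(terminal):
    """Return 'R1a' or 'R1b', or None when the kit needs a manual lookup."""
    return R_SIDE.get(canonical_snp(terminal))
```

Here r_side("R-M198") gives "R1a" because the alias is resolved first, while r_side("R-Z8") gives None - the kit falls through to the manual pile, which is exactly the situation described in item 4.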

Eventually, this too will pass. Remember how disruptive it was when FTDNA decided to report multi-markers in a single field instead of in multiple fields? We each found our way through that - and we will find our way through this, too.