Category Archives: RSRS (Reconstructed Sapiens Reference Sequence)

Recently, I’ve received a number of questions about comparing people and haplogroups between 23andMe and Family Tree DNA. I can tell by the questions that a significant amount of confusion exists about the two, so I’d like to talk about both. If you need a review of “What is a Haplogroup?”, click here.

Comparing haplogroup information from Family Tree DNA with that at 23andMe is not comparing apples to apples. In essence, the haplogroups are not calculated in the same way, and the data at Family Tree DNA is much more extensive. Understanding the differences is key to comparing and understanding results. Unfortunately, I think a lot of misinterpretation is happening due to misunderstanding of the essential elements of what each company offers, and what it means.

There are two basic kinds of tests to establish haplogroups, and a third way to estimate.

Let’s talk about mitochondrial DNA first.

Mitochondrial DNA

You have a very large jar of jellybeans. This jar is your mitochondrial DNA.

In your jar, there are 16,569 mitochondrial DNA locations, or jellybeans, more or less. Sometimes the jelly bean counter slips up and adds an extra jellybean when filling the jar, called an insertion, and sometimes they omit one, called a deletion.

Your jellybeans come in 4 colors/flavors, coincidentally, the same colors as the 4 DNA nucleotides that make up our double helix segments. T for tangerine, A for apricot, C for chocolate and G for grape.

Each of the 16,569 jellybeans has its own location in the jar. So, in the position of address 1, an apricot jellybean is always found there. If the jellybean jar filler makes a mistake, and puts a grape jellybean there instead, that is called a mutation. Mistakes do happen – and so do mutations. In fact, we count on them. Without mutations, genetic genealogy would be impossible because we would all be exactly the same.

When you purchase a mitochondrial DNA test from Family Tree DNA, you have in the past been able to purchase one of three mitochondrial testing levels. Today, on the website, I see only the full sequence test for $199, which is a great value.

However, regardless of whether you purchase the full mitochondrial sequence test today, which tests all of your 16,569 locations, or the earlier HVR1 or HVR1+HVR2 tests, which tested a subset of about 10% of those locations called the HyperVariable Region, Family Tree DNA looks at each individual location and sees what kind of a jellybean is lodged there. In position 1, if they find the normal apricot jellybean, they move on to position 2. If they find any other kind of jellybean in position 1, other than apricot, which is supposed to be there, they record it as a mutation and record whether the mutation is a T, C or G. So, Family Tree DNA reads every one of your mitochondrial DNA addresses individually.

Because they do read them individually, they can also discover insertions, where extra DNA is inserted, deletions, where some DNA dropped out of line, and an unusual condition called a heteroplasmy, which is a mutation in process where you carry some of two kinds of jellybean in that location – kind of a half and half 2 flavor jellybean. We’ll talk about heteroplasmic mutations another time.

So, at Family Tree DNA, the results you see are actually what you carry at each of your individual 16,569 mitochondrial addresses. Your results, an example shown below, are the mutations that were found. “Normal” is not shown. The letter following the location number, 16069T, for example, is the mutation found in that location. In this case, normal is C. In the RSRS model of showing mitochondrial DNA mutations, this location/mutation combination would be written as C16069T so that you can immediately see what is normal and then the mutated state.

Family Tree DNA gives you the option to see your results either in the traditional CRS (Cambridge Reference Sequence) model, above, or the more current Reconstructed Sapiens Reference Sequence (RSRS) model. I am showing the CRS version because that is the version utilized by 23andMe and I want to compare apples and apples. You can read about the difference between the two versions here.
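For readers who like to see the idea spelled out, here is a minimal sketch in Python of reading every location and recording only the ones that differ from "normal." The 8-location jar and the function name list_mutations are made up for illustration. This is not Family Tree DNA's actual software, and it ignores insertions, deletions and heteroplasmies. Keep in mind that the real CRS and RSRS also use different reference sequences, not just different notation.

```python
def list_mutations(reference, sample, show_ancestral=False):
    """Compare two equal-length sequences location by location (numbered from 1).

    Returns only the locations where the sample differs from the reference.
    With show_ancestral=True the normal value is written first, as in C16069T.
    """
    mutations = []
    for position, (normal, found) in enumerate(zip(reference, sample), start=1):
        if found != normal:
            prefix = normal if show_ancestral else ""
            mutations.append(f"{prefix}{position}{found}")
    return mutations


# Toy 8-location jar (hypothetical); real mitochondrial DNA has about 16,569.
reference = "ACGTCCAT"
sample = "ACGTTCAG"

print(list_mutations(reference, sample))                       # ['5T', '8G']
print(list_mutations(reference, sample, show_ancestral=True))  # ['C5T', 'T8G']
```

Locations that match the reference never appear in the output, which is why "normal" is not shown on your results page.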

Defining Haplogroups

Haplogroups are defined by specific mutations at certain addresses.

For example, the following mutations, cumulatively, define haplogroup J1c2f. Each branch is defined by its own mutation(s).

Haplogroup | Required Mutations
J | C295T, T489C, A10398G!, A12612G, G13708A, C16069T
J1 | C462T, G3010A
J1c | G185A, G228A, T14798C
J1c2 | A188G
J1c2f | G9055A

You can see that the results shown above do carry these mutations, which is how this individual was assigned to haplogroup J1c2f. You can read about how haplogroups are defined here.

At 23andMe, they use chip based technology that scans only specifically programmed locations for specific values. So, they would look at only the locations that would be haplogroup producing, and only those locations. Better yet if there is one location that is utilized in haplogroup J1c2f that is predictive of ONLY J1c2f, they would select and use that location.

This same individual at 23andMe is classified as haplogroup J1c2, not J1c2f. This could be a function of two things. First, the probes might not cover that final location, 9055, and second, 23andMe may not be utilizing the same version of the mitochondrial haplotree as Family Tree DNA.
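To make the first possibility concrete, here is a rough sketch in Python. The branch table is the one from this article, with the "!" after A10398G dropped for simplicity. Everything else is an assumption for illustration: the function names, the made-up tester who carries every listed mutation, and the pretend chip that probes every location except 9055. Neither company's real haplogroup calling works exactly like this.

```python
# Defining mutations from the table in this article, ordered from J outward.
BRANCHES = [
    ("J", ["C295T", "T489C", "A10398G", "A12612G", "G13708A", "C16069T"]),
    ("J1", ["C462T", "G3010A"]),
    ("J1c", ["G185A", "G228A", "T14798C"]),
    ("J1c2", ["A188G"]),
    ("J1c2f", ["G9055A"]),
]


def position_of(mutation):
    """Pull the location number out of a mutation such as 'G9055A'."""
    return int("".join(ch for ch in mutation if ch.isdigit()))


def assign_haplogroup(found, tested_positions=None):
    """Walk down the branches and stop at the first one that cannot be confirmed.

    found: set of mutations the person carries.
    tested_positions: locations the test can see; None means every location.
    """
    assigned = None
    for name, required in BRANCHES:
        for mutation in required:
            unseen = (tested_positions is not None
                      and position_of(mutation) not in tested_positions)
            if unseen or mutation not in found:
                return assigned
        assigned = name
    return assigned


# Hypothetical tester who carries every mutation in the table.
tester = {m for _, required in BRANCHES for m in required}

# Hypothetical chip: probes every location in the table except 9055.
chip_positions = {position_of(m) for m in tester} - {9055}

print(assign_haplogroup(tester))                                   # J1c2f
print(assign_haplogroup(tester, tested_positions=chip_positions))  # J1c2
```

Same person, same DNA, but the test that cannot see location 9055 stops one branch short.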

By clicking on the 23andMe option for “Ancestry Tools,” then “Haplogroup Tree Mutation Mapper,” you can see which mutations were tested with the probes to determine a haplogroup assignment. 23andMe information for this haplogroup is shown below. This is not personal information, meaning it is not specific to you, except that you know you have mutations at these locations based on the fact that they have assigned you to the specific haplogroup defined by these mutations. What 23andMe is showing in their chart is the ancestral value, which is the value you DON’T have. So your jelly bean is not chocolate at location 295, it’s tangerine, apricot or grape.

Notice that 23andMe does not test for J1c2f. In addition, 23andMe cannot pick up on insertions, deletions or heteroplasmies. Normally, since they aren’t reading each one of your locations and providing you with that report, missing insertions and deletions doesn’t affect anything, BUT, if a deletion or insertion is haplogroup defining, they will miss this call. Haplogroup K comes to mind.

23andMe never looks at any locations in the jelly bean jar other than the ones to assign a haplogroup, in this case, 17 locations. Family Tree DNA reads every jelly bean in the jelly bean jar, all 16,569. Different technology, different results. You also receive your haplogroup at 23andMe as part of a $99 package, but of course the individual reading of your mitochondrial DNA at Family Tree DNA is more accurate. Which is best for you depends on your personal testing goals, so long as you accurately understand the differences and therefore how to interpret results. A haplogroup match does not mean you’re a genealogy match. More than one person has told me that they are haplogroup J1c, for example, at Family Tree DNA and they match someone at 23andMe on the same haplogroup, so they KNOW they have a common ancestor in the past few generations. That’s an incorrect interpretation. Let’s take a look at why.

Matches Between the Two

23andMe provides the tester with a list of the people who match them at the haplogroup level. Most people don’t actually find this information, because it is buried on the “My Results,” then “Maternal Line” page, then scrolling down until your haplogroup is displayed on the right hand side with a box around it.

Those who do find this are confused because they interpret this to mean they are a match, as in a genealogical match, like at Family Tree DNA, or like when you match someone at either company autosomally. This is NOT the case.

For example, other than known family members, this individual matches two other people classified as haplogroup J1c2. How close of a match is this really? How long ago do they share a common ancestor?

Taking a look at Doron Behar’s paper, “A “Copernican” Reassessment of the Human Mitochondrial DNA Tree from its Root,” in the supplemental material we find that haplogroup J1c2 was born about 9762 years ago with a variance of plus or minus about 2010 years, so sometime between 7,752 and 11,772 years ago. This means that these people are related sometime in the past, roughly, 10,000 years – maybe as little as 7000 years ago. This is absolutely NOT the same as matching your individual 16,569 markers at Family Tree DNA. Haplogroup matching only means you share a common ancestor many thousands of years ago.

For people who match each other on their individual mitochondrial DNA location markers, their haplotype, Family Tree DNA provides the following information in their FAQ:

Matching on HVR1 means that you have a 50% chance of sharing a common maternal ancestor within the last fifty-two generations. That is about 1,300 years.

Matching on HVR1 and HVR2 means that you have a 50% chance of sharing a common maternal ancestor within the last twenty-eight generations. That is about 700 years.

Matching exactly on the Mitochondrial DNA Full Sequence test brings your matches into more recent times. It means that you have a 50% chance of sharing a common maternal ancestor within the last 5 generations. That is about 125 years.

I actually think these numbers are a bit generous, especially on the full sequence. We all know that obtaining mitochondrial DNA matches that we can trace is more difficult than with Y chromosome matches. Of course, the surname changing in mitochondrial lines every generation doesn’t help one bit and often causes us to “lose” maternal lines before we “lose” paternal lines.
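To put the two kinds of matching side by side, here is a small back-of-the-envelope sketch in Python. The 25 years per generation is my assumption, inferred from the FAQ figures (52 generations is about 1,300 years). The haplogroup age and variance are the numbers quoted above from the Behar paper.

```python
YEARS_PER_GENERATION = 25  # assumption implied by the FAQ: 52 generations is about 1,300 years


def generations_to_years(generations):
    """Convert a generation count to approximate years."""
    return generations * YEARS_PER_GENERATION


# Generations to a 50% chance of a common maternal ancestor, per the FAQ.
faq_generations = {"HVR1": 52, "HVR1+HVR2": 28, "Full Sequence": 5}
for level, generations in faq_generations.items():
    years = generations_to_years(generations)
    print(f"{level}: {generations} generations, about {years:,} years")

# Age of haplogroup J1c2, plus or minus the variance.
age, variance = 9762, 2010
youngest, oldest = age - variance, age + variance
print(f"J1c2: between {youngest:,} and {oldest:,} years ago")
print(f"That is roughly {youngest // YEARS_PER_GENERATION} generations at the very least")
```

Even the loosest haplotype match, HVR1 only, points to a time frame several times more recent than the youngest estimate for the haplogroup itself.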

Autosomal and Haplogroups, Together

As long as we’re mythbusting here – I want to make one other point. I have heard people say, more than once, that an autosomal match isn’t valid “because the haplogroups don’t match.” Of course, this tells me immediately that someone doesn’t understand either autosomal matching, which covers all of your ancestral lines, or haplogroups, which cover ONLY either your matrilineal, meaning mitochondrial, or patrilineal, meaning Y DNA, line. Now, if you match autosomally AND share a common haplogroup as well, at 23andMe, that might be a hint of where to look for a common ancestor. But it’s only a hint.

At Family Tree DNA, it’s more than a hint. You can tell for sure by selecting the “Advanced Matching” option under Y-DNA, mtDNA or Family Finder and selecting the options for both Family Finder (autosomal) and the other type of DNA you are inquiring about. The results of this query tell you if your markers for both of these tests (or whatever tests are selected) match with any individuals on your match list.

Hint – for mitochondrial DNA, I never select “full sequence” or “all mtDNA” because I don’t want to miss someone who has only tested at the HVR1 level and also matches me autosomally. I tend to try several combinations to make sure I cover every possibility, especially given that you may match someone at the full sequence level, which allows for mutations, that you don’t match at the HVR1 level. Same situation for Y DNA as well. Also note that you need to answer “yes” to “Show only people I match on all selected tests.”
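Conceptually, this kind of combined query is just the overlap between two match lists. Here is a minimal sketch in Python with made-up names and made-up match lists. It is only an illustration of the idea, not how the Advanced Matching page is actually implemented.

```python
# Made-up match lists; the names are purely illustrative.
family_finder_matches = {"Ann", "Bob", "Cara", "Dee"}
mtdna_matches = {
    "HVR1": {"Bob", "Eve"},
    "HVR1+HVR2": {"Bob"},
    "Full Sequence": {"Cara"},  # matched here, but not at the HVR1 level
}


def matches_on_both(autosomal, other):
    """People who appear on both match lists."""
    return autosomal & other


# Check each mtDNA level separately.
for level, names in mtdna_matches.items():
    print(level, sorted(matches_on_both(family_finder_matches, names)))

# Pooling every level before comparing avoids missing anyone.
pooled = set().union(*mtdna_matches.values())
print("Any level", sorted(matches_on_both(family_finder_matches, pooled)))
```

In this toy data, selecting only one level would miss either Bob or Cara, which is why I try several combinations.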

Y-DNA at 23andMe

Y-DNA works pretty much the same at 23andMe as mitochondrial, meaning they probe certain haplogroup-defining locations. They do utilize a different Y tree than Family Tree DNA, so the haplogroup names may be somewhat different, but will still be in the same base haplogroup. Like mitochondrial DNA, by utilizing the haplogroup mapper, you can see which probes are utilized to determine the haplogroup. The normal SNP name is given directly after the rs number. The rs number is the address of the DNA on the chromosome. The display of Y mutations is a bit different from the display for mitochondrial DNA. While mitochondrial DNA at 23andMe shows you only the normal value, for Y DNA, they show you both the normal, or ancestral, value and the derived, or current, value as well. So at SNP P44, grape is normal and you have apricot if you’ve been assigned to haplogroup C3.

As we are all aware, many new haplogroups have been defined in the past several months, and continue to be discovered via the results of the Big Y and Full Y test results which are being returned on a daily basis. Because 23andMe does not have the ability to change their probes without burning an entirely new chip, updates will not happen often. In fact, their new V4 chip just introduced in December actually reduced the number of probes from 967,000 to 602,000, although CeCe Moore reported that the number of mtDNA and Y probes increased.

By way of comparison, the ISOGG tree is shown below. Very recently C3 was renamed to C2, which isn’t really the point here. You can see just how many haplogroups really exist below C3/C2 defined by SNP M217. And if you think this is a lot, you should see haplogroup R – it goes on for days and days!

How long ago do you share a common ancestor with that other person at 23andMe who is also assigned to haplogroup C3? Well, we don’t have a handy dandy reference chart for Y DNA like we do for mitochondrial – partly because it’s a constantly moving target, but haplogroup C3 is about 12,000 years old, plus or minus about 5,000 years, and is found on both sides of the Bering Strait. It is found in indigenous Native American populations along with Siberians and in some frequency, throughout all of Asia and in low frequencies, into Europe.

How do you find out more about your haplogroup, or if you really do match that other person who is C3? Test at Family Tree DNA. 23andMe is not in the business of testing individual markers. Their business focus is autosomal DNA and its various applications, medical and genealogical, and that’s it.

Y-DNA at Family Tree DNA

At Family Tree DNA, you can test STR markers at 12, 25, 37, 67 and 111 marker levels. Most people, today, begin with either 37 or 67 markers.

Of course, you receive your results in several ways at Family Tree DNA, Haplogroup Origins, Ancestral Origins, Matches Maps and Migration Maps, but what most people are most interested in are the individual matches to other people. These STR markers are great for genealogical matching. You can read about the difference between STR and SNP markers here.

When you take the Y test, Family Tree DNA also provides you with an estimated haplogroup. That estimate has proven to be very accurate over the years. They only estimate your haplogroup if you have a proven match to someone who has been SNP tested. Of course it’s not a deep haplogroup – in haplogroup R1b it will be something like R1b1a2. So, while it’s not deep, it’s free and it’s accurate. If they can’t predict your haplogroup using those criteria, they will test you for free. It’s called their SNP assurance program and it has been in place for many years. This is normally only necessary for unusual DNA, but, as a project administrator, I still see backbone tests being performed from time to time.

If you want to purchase SNP tests, in various formats, you can confirm your haplogroup and order deeper testing.

You can order individual SNP markers for about $39 each and do selective testing. On the screen below you can see the SNPs available to purchase for haplogroup C3 a la carte.

You can order the Geno 2.0 test for $199 and obtain a large number of SNPs tested, over 12,000, for the all-inclusive price. New SNPs discovered since the release of their chip in July of 2012 won’t be included either, but you can then order those a la carte if you wish.

Or you can go all out and order the new Big Y for $695 where all of your Y jellybeans, all 13.5 million of them in your Y DNA jar are individually looked at and evaluated. People who choose this new test are compared against a data base of more than 36,000 known SNPs and each person receives a list of “novel variants” which means individual SNPs never before discovered and not documented in the SNP data base of 36,000.
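The "novel variants" idea boils down to a simple comparison: anything found in your results that is not already in the known SNP database. Here is a tiny sketch in Python. M217 and P44 are SNP names mentioned in this article, while the SNP-A, SNP-B, SNP-X and SNP-Y names and both lists are placeholders I made up for illustration.

```python
# Tiny stand-in for the database of more than 36,000 known SNPs.
known_snps = {"M217", "P44", "SNP-A", "SNP-B"}

# Hypothetical variants found when reading one tester's Y DNA.
tester_variants = {"M217", "P44", "SNP-X", "SNP-Y"}

# Already documented: found in the tester and in the database.
documented = sorted(tester_variants & known_snps)

# Novel variants: found in the tester but not in the database.
novel = sorted(tester_variants - known_snps)

print("Known SNPs carried:", documented)  # ['M217', 'P44']
print("Novel variants:", novel)           # ['SNP-X', 'SNP-Y']
```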

Don’t know which path to take? I would suggest that you talk to the haplogroup project administrator for the haplogroup you fall into. Need to know how to determine which project to join, and how to join? Click here. Haplogroup project administrators are generally very knowledgeable and helpful. Many of them are spearheading research into their haplogroup of interest and their knowledge of that haplogroup exceeds that of anyone else. Of course you can also contact Family Tree DNA and ask for assistance, you can purchase a Quick Consult from me, and you can read this article about comparing your options.

2012 has been a very busy year for genetic genealogists. There have been lots of discoveries and announcements that affect everyone, now and in the future. The watchwords for 2012 would be “churn” and “explosive growth.” Let’s take a look at the 10 most important events, why they are important and what they mean for the future of genetic genealogy.

These items are in what I think are relatively good order, ranked by their importance, although I had a very difficult time deciding between number 1 and 2.

1. The New Root – Haplogroup A00

At the Family Tree DNA conference in November, Michael Hammer, Bonnie Schrack and Thomas Krahn announced that they had made a monumental discovery in the age of modern man known as Y-line Adam. The discovery of Haplogroup A00 pushes the “birth” of mankind back from about 140,000 years ago to an amazing 338,000 years ago. Utterly amazing. The DNA came from an American family from South Carolina. This discovery highlights the importance of citizen science. Bonnie is a haplogroup administrator who recognized the potential importance of one of her participants’ DNA. Thomas Krahn of course is with Family Tree DNA and ran the WTY test, and Michael Hammer is at the University of Arizona. So you have the perfect blend here of participant, citizen scientist, commercial lab and academia. What was never thought possible a decade or so ago is not only working, it’s working well and changing the face of both science and humanity.

Geno 2.0 is the nickname for the National Geographic Society’s Genographic Project version 2.0. That mouthful is why it has a nickname.

This amazing project has leveraged the results of the past 7 years of research from the original Genographic project into a new groundbreaking product. Geno 2.0, utilizing the GenoChip, a sequencing chip created specifically for Nat Geo, offers the most complete Y tree in the world today, expanding the SNP tree from just over 800 SNPs to over 12,000. They are in essence redrawing the Y chromosome tree as I write this. In addition, the person who purchases Geno 2.0 will receive a mitochondrial DNA haplogroup assignment. Over 3300 new mitochondrial mutations were discovered. A brand new anthropological “percentages of ethnicity” report is featured based on over 75,000 Ancestry Informative Markers, many only recently discovered by the Genographic project. Additionally, participants will receive their percentage of both Neanderthal and Denisovan ancestry based on 30,000 SNPs identified that signal interbreeding between the hominids. A new website will also facilitate social networking and uploading information to Family Tree DNA.

The wonderful news is that there is a massive amount of new information here that will change the landscape of genetic genealogy. The difficulty is that we are struggling a bit under the load of that massive amount of information that is just beginning to descend upon us. It’s a great problem to have!

4. Full Genome and Exome Sequence Offered Commercially by Gene by Gene

It was announced at the November DNA conference that Gene by Gene, the parent company of Family Tree DNA, through their division titled DNA DTC is offering full genomic sequencing for the amazing price of $5495 for the full genome and $695 for the exome. This is a first in the consumer marketspace. Today, this doesn’t have a lot of application for genetic genealogy, but as the price continues to drop, and utilities are built to process the full genomic data, certainly a market and applications will emerge. This is an important step forward in the industry with a product that still cost 3 million dollars in 2007.

It’s official – they did it. Yep, they interbred and well, they are not them anymore, they are us. Given that everyone in Asia and Europe carries a part of them, but not people from Africa, it would appear that two populations admixed rather thoroughly in Eurasia and/or the populations were small. The amount of Neanderthal and Denisovan DNA will continue at approximately the proportions seen today in Europe (2% Neanderthal) and Asia unless a significant amount of admixture from a population (Africa) that does not carry this admixture is introduced. So if you’re European, you carry both Neanderthal and Denisovan DNA. They are your ancestors. The good news is that you can find how much of each through the Geno 2.0 test. 23andMe results give you the percentage of Neanderthal, but not Denisovan.

6. Ancestral Genome Reconstruction Begins, Led by Falling Autosomal Prices and the Ability to Fish in Multiple Ponds

2012 has been the year of autosomal testing price reductions and a great deal of churn in this marketspace. Companies are playing leap-frog with one another. However, sometimes things are not all that they seem.

Initially, 23andMe opted for an initial payment plus monthly subscription model, which they abandoned for a one time payment price of $299 in early 2012. Family Tree DNA was slightly less, at $289.

Ancestry led the price war by giving away kits, then selling them for $99, then $129 plus a subscription as an entrance into this market. However, looking at the Ancestry consent form hints at possible reasons why they were selling below the cost of the tests. You are in essence giving them permission to sell your DNA and associated information. In addition, to gain full access to your results and matches, you must maintain some level of subscription to Ancestry.com, increasing the total effective price.

Next came Family Tree DNA’s sale where they dropped their autosomal price to $199, but they were shortly upstaged by 23andMe whose price has now dropped to $99 permanently, apparently, a result of a 50 million dollar investment in order to reach 1 million customers. They currently have about 180,000. 23andMe has always been in the medical/health business, so their clients have always understood what they were consenting to and for.

Not to be outdone, Family Tree DNA introduced the ability earlier in 2012 to upload your data files from 23andMe to Family Tree DNA for $89, far less than a second test, which allows you to fish in a second pond where genealogists live for matches. The challenge at 23andMe is that most of their clients test for the health traits and either don’t answer inquiries or match requests, or know little about their genealogy if they do. At Family Tree DNA, matches don’t have to answer and allow a match; testers are automatically matched with all participants who take the Family Finder test (or upload their 23andMe results) and testers are provided with their matches’ e-mail address.

Of course, Geno 2.0 was also introduced in the midst of this, in July, for $199 with the additional lollipop of new SNPs, lots of them, that others simply don’t have access to yet.

The good news is that consumers have benefitted from this leapfrogging, I think. Let’s hope that the subsidized tests at Ancestry and 23andMe don’t serve long term to water down the demand to the point where unsubsidized companies (who don’t sell participants’ genetic results to others) have problems remaining viable.

Personally, I’ve tested at all of these companies. I’ll be evaluating the results shortly in detail on my blog at www.dna-explained.com.

The tools provided by most testing companies, plus GedMatch, and multiple ponds to fish in are allowing the serious genetic genealogist to “reconstruct” their genome, attributing segments to specific ancestors. Conversely, we will also be able to “reconstruct” specific ancestral family lines as well by identifying autosomal segments in multiple descendants. This new vision of autosomal genetic genealogy will allow much more accurate ancestral line matching, and ancestor identification in the not-so-distant future.

The good news is that the various ethnicity tests (known as BGA or biogeographical ancestry tests) that provide participants with their percentages of various world populations are improving. The bad news is that there is currently one bad apple in the cart with very misleading percentages – and that is Ancestry.com.

23andMe introduced a new version of their ethnicity product in December, expanding from only 3 geographic categories to several. The Geno 2.0 test results are just beginning to be returned which include ethnicity predictions and references to several base populations.

Family Tree DNA finally has some competition in this arena where for years they have been the only serious player, although opinions differ widely about which of these three organizations’ results are the most accurate. All four are Illumina chip based, using hundreds of thousands of locations, as compared with the previous CODIS type tests which used between 15 and 300 markers and are now outdated. All companies use different reference populations which, of course, provide somewhat different results to participants. All companies, except Ancestry, have documented and shared their reference population information.

Outside of these companies, Doug McDonald offers a private analysis and Gedmatch offers a series of BGA comparisons written by third parties.

While this industry continues to grow and mature, I’m thinking about just averaging the autosomal ethnic results and calling it good:)

PBS sponsored a wonderful series in the spring of 2012 hosted by Henry Louis “Skip” Gates, the chair of African American Studies at Harvard. This series followed a lesser known 2010 series. The 2012 inspirational series reached tens of thousands of people and increased awareness of genetic genealogy as well as sparked an interest in genealogy itself, especially for mixed race and African American people. I was disappointed that the series did not pursue the Native American results unexpectedly obtained for one participant. It seemed like a missed opportunity. Series like this bring DNA testing for genealogy into the mainstream, making it less “strange” and frightening and more desirable for the average person. These stories were both inspirational and heartwarming. I hope we can look forward to similar programs in the future.

GeneTree, a for-profit company, and Sorenson, a non-profit company, were both purchased by Ancestry.com. This was about the same time as Ancestry introduced their autosomal AncestryDNA product. Speculation was that the autosomal results at Sorenson might be the foundation for the new autosomal test comparisons, although there has been no subsequent evidence of this.

Ancestry initially gave away several thousand kits in order to build their data base, then sold thousands more for $99 before raising the price to what appears to be a normalized price of $129 plus an annual ancestry subscription.

While GeneTree was never a major player in the DNA testing marketspace, Sorenson Molecular Genealogical Foundation played an important role for many years as a nonprofit research institute. There was significant distress in the genetic genealogy community related to the DNA contributed to Sorenson for research being absorbed by Ancestry as a “for profit” company. Ancestry is maintaining the www.smgf.org website, but no additional results will be added. Sorenson has been entirely shuttered. Many of the Sorenson/GeneTree employees appear to have moved over to Ancestry.

The initial AncestryDNA autosomal product offering is poor, lacks tools and the ethnicity portion has significant issues. Its strength is that many people who test are already Ancestry subscribers and have attached their trees. So you can’t see how you connect genetically to your matches (lack of tools), but you can see the trees, if they are attached and not marked as private, of those with whom you match. Ancestry provides “hints” relative to matching individuals or surnames.

Eventually, if Ancestry improves its products, provides tools and releases the raw data to consumers, this may be a good thing. It’s an important event in 2012 because of the massive size of Ancestry, but the product is mediocre at best. Ancestry seems unwilling to acknowledge issues unless their feet are held to the fire publicly as illustrated with a “lab error” erroneous match for an adoptee caught by the consuming public and ignored by Ancestry until CeCe Moore exposed them in her blog. Whether Ancestry ultimately helps or hurts the genetic genealogy industry is a story yet to be told. There is very little positive press in the genetic genealogy community surrounding the Ancestry product, but with their captive audience, they are clearly going to be a player.

GedMatch, www.gedmatch.com, created by John Olson and Curtis Rogers, isn’t new in 2012, but it’s maturing into a tool that is becoming the de facto workhorse of the serious autosomal community. People who test at either 23andMe or Family Tree DNA download their raw results and other match information and then use a variety of tools at GedMatch to look at results in different ways and using different thresholds. GedMatch is currently working to accept the newly arriving Geno 2.0 data files. Ancestry does not at this time allow their customers access to their raw data files, so there is nothing to upload. The bad news is that not everyone downloads/uploads their information, only the most savvy users, and the download/upload is not always a smooth process, often necessitating several attempts, a magic wand and some fairy dust for luck.

GedMatch is a volunteer effort funded by donations on the GedMatch site. The magnitude of this project came to light when they needed new servers this year because the amount of traffic disabled their internet service provider. It may be a volunteer effort, but it has mainstream requirements. Therefore, while occasionally frustrating, it’s easy to understand why it’s light on documentation and one has to poke around a bit to figure things out. I would actually prefer that they make it a subscription site, clean up the bugs, add the documentation and take it to the next level. It would also be very nice if they could arrange something with the major players in terms of a seamless data transfer for clients. All told, it’s an amazing contribution as a volunteer site. Hats off to Curtis and John for their ongoing contribution to genetic genealogists!!!

During my webinars this week for APG, someone asked a question about mitochondrial DNA and I told them I would follow up on my blog. I thought I knew the answer, but I needed to be sure.

When I displayed the slide of my full sequence in the RSRS format, they noticed some of the letters were lower case. Truthfully, since client comparisons are still in the CRS format, I hadn’t paid a lot of attention to my RSRS values except for an initial look-see when the corresponding paper came out (“A ‘Copernican’ Reassessment of the Human Mitochondrial DNA Tree from its Root”) and the RSRS results were added to our personal page information. I know, my bad.

In my blogs titled Citizen Science, the CRS and the RSRS and What Happened to My Mitochondrial DNA?, I explained about the CRS and the RSRS. In a nutshell, the RSRS, the Reconstructed Sapiens Reference Sequence is the new way of interpreting mitochondrial results, comparing them to a “reconstructed” Eve instead of someone who tested in Cambridge in 1981. That 1981 person set the standard for the CRS, or Cambridge Reference Sequence.

But soon, we will be using the RSRS. My understanding is that the Geno 2.0 results, although only providing the haplogroup defining mutations, will be given in RSRS format.

So let’s take a look at what this person saw that caused a question.

In the last mutation in the coding region, all the way at the end, you see that a mutation is noted as C15452a.

Now let’s take a look at the CRS version.

You see the same mutation, but it’s noted differently, as 15452A.

What is the difference, or maybe better asked, why the difference?

On the CRS page, the mutations are shown, as above, but there is also a second part of that page, shown below.

On this second part of the results, the normal value in the CRS, and the value carried by the person with the mutation in 1981, is shown. So this is a translation table for your results. You can see that it shows that the CRS value for location 15452 is normally C and my value is an A.

What are those Cs and As? Or for that matter the other two letters, T and G? Well, referring to Tuesday’s introduction class, these are the 4 base nucleotides that make up the “rungs” in the DNA double helix ladder.

T, A, C and G are short for Thymine, Adenine, Cytosine and Guanine. You can see these nucleotides as they each make up half of the connection between opposite sides of the double helix as it uncoils. Within the helix itself, an A is always paired with a T and a C with a G. Mutations are a different matter, because a mutation swaps one letter for another at a given location. Normally, a T is swapped with a C or an A is swapped with a G. However, not always. Sometimes a C gets swapped with an A or a T gets swapped with a G.

When a typical mutation happens, meaning T/C and A/G, it’s called a transition. When a more unusual mutation happens, such as C/A, A/C, G/T and T/G, it’s called a transversion. I think this is what I said the other night, but given how often I use these terms, which is almost never, it would have been easy to get them switched.

I know, by now you’re VERY sorry you asked aren’t you:)

But we’re not quite to the answer yet, so please, bear with me and read on. Remember, this could qualify you to win the new Genetic Genealogy Trivial Pursuit game whenever that version emerges. We are almost to the punch line….

In order to make life easier and to eliminate the need for a translation table, the new RSRS refers to mutations a little differently. You’ve guessed by now, haven’t you? Yep, you’re right, my mutation shown as C15452a has its own translation table built right in. The mutation location is 15452. The normal value, meaning the one Eve had (RSRS), which is also the CRS value, was a C. However, my value is an A, and since it’s a little a, we know that this is a transversion, not a transition. You can see another transversion at my location 825.

Why is this important in genetic genealogy? It’s not, really, because it’s already taken care of for you. If someone else has a value there of C15452T, they simply won’t be shown as a match to me with my value of C15452a. So you don’t have to figure this out, it’s taken care of for you in the matching routine. But hey, you wanted to know, and now you do. Good eye for the catch!
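For the technically inclined, here is a minimal Python sketch of how the built-in translation table in a value like C15452a can be read. This is only my illustration, not Family Tree DNA's actual matching routine. The names `classify` and `same_result` and the pattern are made up for this example, and it only handles simple substitutions, not insertions or deletions.

```python
import re

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Assumed layout: reference base, location, observed base (for example "C15452a").
MUTATION = re.compile(r"^([ACGT])(\d+)([ACGTacgt])$")


def classify(mutation):
    """Split an RSRS-style substitution into its parts and classify it."""
    match = MUTATION.match(mutation)
    if match is None:
        raise ValueError(f"not a simple substitution: {mutation!r}")
    reference = match.group(1)
    location = int(match.group(2))
    observed = match.group(3).upper()
    if observed == reference:
        raise ValueError(f"no change at location {location}")
    # A swap within the same chemical family (A/G or C/T) is a transition;
    # a swap between the two families is a transversion.
    same_family = {reference, observed} <= PURINES or {reference, observed} <= PYRIMIDINES
    kind = "transition" if same_family else "transversion"
    return location, reference, observed, kind


def same_result(first, second):
    """Two values agree only if both the location and the observed base agree."""
    loc_a, _, obs_a, _ = classify(first)
    loc_b, _, obs_b, _ = classify(second)
    return loc_a == loc_b and obs_a == obs_b


print(classify("C15452a"))
print(same_result("C15452a", "C15452T"))
```

The point of the sketch is that the location, the reference value and your own value can all be recovered from the one short string, which is exactly why no separate translation table is needed.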

First things first! I want to thank Max and Bennett for graciously hosting the 8th Annual Genetic Genealogy Conference in Houston, Texas! This is actually the 9th year, but a pesky hurricane interfered one year. Max and Bennett are very generous with their time and resources and heavily subsidize this conference for us. We’re registering in the photo above.

Georgia Kinney Bopp said it best. At some point during this amazing conference, someone tweeted an earlier quote from a conversation between Ann Turner and Georgia:

“it’s hard to realize you’re living history while it happens…”

This was ever so true this weekend. Even my husband (who is not genetic genealogy crazy) realized this. I’m not sure everyone at the conference did, or realized the magnitude of what they were hearing, as we did have a lot of newbies. Newbies are a good thing. It means our obsessive hobby and this industry have staying power and there will be people to pass the torch to someday.

For those of you who want the nitty gritty play by play as it happened at the conference, go to www.twitter.com and search for hashtag #ftdna2012. If you want some help with Twitter, I blogged about that too. Twitter is far from perfect, but it is near-realtime as things are happening.

As always, Family Tree DNA hosts a reception on Friday evening. This helps break the ice and allows people to put faces with names. So many of us “know” each other by our e-mail name and online presence alone.

We had a special guest this year too, Nina, a little puppy who was rescued by Rebekah Canada just a few days before the conference. Nina behaved amazingly well and many of us enjoyed her company.

Bennett opened the conference this year, and in the Clint Eastwood political tradition, spoke to his companion, the chair named Max. The real Max, it turns out, was losing his voice, but that didn’t prevent him from chatting with us and answering questions from time to time.

While Bennett was very low key with this announcement, it was monumental. He indicated that the parent company of Family Tree DNA has reorganized a bit. It has changed its name to Gene by Gene and now has 4 divisions. You can check this out at www.genebygene.com. This isn’t the monumental part.

The new products from the new division, DNADTC, are the amazing part. Through this new division, they are the first commercial company to offer a full genome sequence test. The price is only $5495. For somewhat less, $695, they are offering the exome, which is your 20,000 genes. Who would have thought it would be a genetic genealogy company that would bring this to the public? Keep in mind that the human genome was only fully sequenced in 2003 at a cost of 3 billion dollars.

The amazing part is that a full genome sequence cost about 3 million dollars in 2007 and the price will continue to fall. While consumers will be able to order this, if they want, it comes with no tools, as it is aimed at the research community, who would be expected to have their own analytical tools. However, genetic genealogists being who and what they are, I don’t expect the research market will outweigh the consumer market for long, especially when the price threshold reaches about $1000.

Bennett also said that he expects that National Geographic will, in 2013 sometime, decide to allow upgrades from Family Tree DNA clients for the Geno 2.0 product. This will allow those people who cannot obtain a new sample to participate as well. However, an unopened vial will be required. No promises as to when, and the decision is not his to make.

The first session was Spencer Wells via Skype from Italy. Spencer has just presented at two conferences within the week, one in San Francisco and one in Florence, Italy. Fortunately, he was able to work us into his schedule and he didn’t even sound tired.

His topic, of course, was the Geno 2.0 test, which is run on the new GenoChip. The first results are in the final stages of testing, so we should see them shortly, sometime between the 19th and the end of the month.

This product comes with all new migration maps. He showed one briefly, and I noticed that one of the two Native Y-lines is now showing a different route than before. One route runs across Siberia, which hasn’t changed, and one runs up the Pacific Rim. Hmmm, can’t wait for that paper.

The new maps all include heat maps which show frequency by color. The map below is a haplogroup Q heat map, but it is NOT from the Geno project. I’m only using it as an example.

Spencer indicated that the sales of the 2.0 product rival those of the 1.0 product and that they have sold substantially more than 10K and substantially less than 100K kits so far. In total, they have sold more than 470,000 kits in over 130 countries. And that’s just the public participation part, not the indigenous samples. They have collected over 75,000 indigenous samples from more than 100 populations resulting in 36 publications to date with another half dozen submitted but not yet accepted. Academic publication is a very long process.

Nat Geo has given 62 legacy grants to indigenous communities that have participated totaling more than 1.7 million dollars. That money comes in part from the public participation kits, meaning Geno 1.0 and now 2.0.

Geno 2.0 continues to be a partnership between National Geographic and Family Tree DNA. Family Tree DNA is running all of their samples in the expanded Houston lab. Also added to the team is Dr. Eran Elhaik at Johns Hopkins University, who has developed a new tool, AIMSFINDER, that locates never before identified Ancestry Informative Markers, meaning markers that are specific to a population. This is extremely important because it allows us to read our DNA and determine if we carry the markers reflective of any specific population. Well, we don’t do the reading, they do with their sophisticated software. But we are the recipients with the new deep ancestral ethnicity results which are more focused on anthropology than genealogy. Spencer says that if you have 2% or more Native American, they can see it. They have used results from both public and private repositories in developing these tools.

This type of processing power combined with a new protocol that tests all SNPs in a sequence, not just selected ones, promises to expand the tree exponentially and soon. It has already been expanded 7 fold from 863 branches of the Y tree to 6153 and more have already been discovered that are not on the GenoChip, but will be in the next version.

The National Geographic project will also be reaching out to administrators and groups who may have access to populations of interest. For example, an ex-pat group in an American city. Keep this in mind as you think of projects.

Another piece of this pie is a new educational initiative in schools called Threads.

This isn’t all, by any means, on this topic. I really do encourage you to go and use Twitter hashtag #ftdna2012. Several of us were tweeting and the info was coming so fast and furious that no one could possibly get it all.

The future with Nat Geo looks exceedingly bright. We have gone from the Barney Rubble age to the modern era and now there is promise for a rosy and as yet undiscovered future.

Judy Russell was next. I have to tell you, when I saw where they positioned her, I was NOT envious. I mean, who wants to follow Spencer Wells, even if he’s not there in person. Well, if anyone was up to this, it certainly was Judy. For those who don’t know, she blogs as The Legal Genealogist.

Judy is one of us. That means she actually understands our industry, what drives genealogists and why. In addition to being a lawyer, she is a certified genealogist and a genetic genealogy crazy too. Maybe I shouldn’t call a lawyer crazy….well…it was meant as a compliment:)

Judy has the perspective to help us, not just criticize us remotely. She reviewed several areas where we might make mistakes. After all, we’re all volunteers coming from quite varied backgrounds. She suggests that we all put some form of disclosure on our projects explaining what participants can expect in terms of use. She used the Core Melungeon project as a good example, along with the Fox project.

“The goal of this project is to use DNA to better understand the origins of the Melungeon people, and this will be done by comparing the DNA with other project members, those outside of projects, and will incorporate relevant genealogical and historical research. All participants will be included in the ongoing studies and by joining the project, you are giving consent for your information to be anonymously included in ongoing genetic genealogy research. Your personal identity will not be revealed, but your results will be used to better understand the Melungeons as a people and their ancestors.”

From the Fox project:

“The exact function of these STR markers is not yet known and they have no known medical function but recent research shows they have some sort of regulatory function on the genes. While there is no medical information in these numbers, the absence of a certain few markers near a fertility gene could indicate sterility – something that would certainly already be known.

The results do provide a partial means of personal identification and, for this reason, our haplotype tables list only the FTDNA kit number and the most distant known male line ancestor. Within the project, however, the administrators feel free to disclose identities, particularly when a close match occurs.”

Judy stressed that we not tell people that there is no medical information revealed. Partially, because we’ve discovered in rare cases that’s not true, and partially because we can’t see into the future.

Judy talked about regulation and that while we fear what it might intentionally or inadvertently do to genetic genealogy, it’s important to have regulations to get rid of the snake oil salesmen, and yes, there are a couple in genetic genealogy. They give us all a black eye and a bad name when people discover they’ve been hoodwinked. However, without regulation of some sort, we have no legal tools to deal with them.

Regulation certainly seems to be a double-edged sword.

I hope that Judy writes in her blog about what she covered in her session, because I think her message is important to all administrators and participants alike. And just to be clear, the sky is not falling and Judy is not Chicken Little. In fact, Judy is the most interesting attorney I have ever heard speak, and amazingly reasonable too. She actually makes you WANT to listen, so if you ever get the chance to see one of her webcasts or attend one of her sessions, take the opportunity.

Following the break, breakout sessions began. CeCe Moore ran one about “Family Finder,” Elise Friedman about “Group Administration” and Thomas Krahn provided the “Walk the Y Update.” Bennett called this the propeller head session. Harumph Bennett. Guess you know which one I attended. All sessions were offered a second time on Sunday.

Thomas said that they have once again upgraded their equipment, doubling their capacity again. This gives 4 times the coverage of the original Walk the Y, covering more than 5 million bases. To date, they have run 494 pre-qualified participants and of those, 198 did not find a new SNP.

There are changes coming in how the palindromic region is scored which will change the matches shown. Palindromic mismatches will now be scored as one mutation event, not multiples. Microalleles will also be reported in the next rollout version, expected probably in January. The problem with microalleles is not the display, but the matching routine.

Of importance, there has not been an individual WTY tested from haplogroups B, M, D or S, and we need one. So if you know of anyone, please contact Thomas.

The next session by Dr. Tyrone Bowes was “Pinpointing a Geographical Location Using Reoccurring Surnames Matches.” For those of us without a genetic homeland, this is powerful medicine. Dr. Bowes has done us the huge favor of creating a website to tell us exactly how to do this. http://www.irishorigenes.com/

He uses surnames, clan maps, matches, history and census records to reveal surname clusters. One tidbit he mentioned is that if you don’t know the family ethnicity, look at the 1911 census records and their religion will often tell you. Hmm, never thought of that, especially since our American ancestors left the homeland long ago. But those remaining in the homeland are very unlikely to change, at least not en masse. I’m glad he gave this presentation, or I would never have found his webpage and I can’t wait to apply these tools to some of my sticky-wickets.

This ended Saturday’s sessions, but at the end of every day, written questions are submitted for that day’s presenters or for Family Tree DNA.

Bennett indicated that another 3000 or 4000 SNPs will be added to the Family Finder calculations and a new version based on reference samples from multiple sources will be released in January.

Bennett also said that if and when Ancestry does provide the raw downloadable data to their clients, they will provide a tool to upload so that you can compare 23andMe and Ancestry both with your Family Finder matches.

Saturday evening is the ISOGG reception, also called the ISOGG party. Everyone contributes for the room and food, and a jolly good time is had by all. There is just nothing to compare with face to face communications.

For me, and for a newly found cousin, this was an amazing event. A person named Z. B. Stroud left me a message that she was looking for me. When I found her, along with her friend and cousin Revis, she tells me that she matches me autosomally, at 23andMe, and that she had sent me a sharing request that I had ignored. I am very bad about that, because unless someone says they are related, I presume they aren’t and I don’t like to clutter up my list with non-related people. It makes comparisons difficult. My bad. In fact, I’m going right now to approve that sharing request!!!

I will blog about this in the future, but without spilling too many beans….we had a wonderful impromptu family reunion. We think our common ancestor is from the Halifax and Pittsylvania County region of Virginia, but of course, it will take some work to figure this out.

I’m also cousins with Revis Leonard (second from left). We’ve known that for a long time, but Z.B. whose first name is Brisjon (second from right) is new to genealogy, DNA and cousin matching. I’m on the right above. The Stroud project administrator, Susan Milligan, also related to Brisjon is on the left end. In the center are Brisjon’s two cousins who came to pick her up for dinner and whom she was meeting for the first time.

But that’s not all, cousin Brisjon also matches Catherine Borges. Let me tell you, I know who got the tall genes in this family, and I’m not normally considered short. Brisjon’s genealogical journey is incredibly amazing and she will be sharing it with us in an upcoming book. Suffice it to say, things are not always what you think they are and Brisjon is living proof. She also met her biological father for the first time this weekend! I’m sure Houston and her 2012 visit where she met so many family members is a watershed event in her lifetime! She is very much a lovely lady and I am so happy to have met her. Cousins Rule!

ISOGG traditionally has its meeting on Sunday morning before the first session. Lots of sleepy people because everyone has so much fun at the ISOGG party and stays up way too late.

Alice Fairhurst, who has done a remarkable job with the ISOGG Y SNP tree (Thank you Alice!) knows an avalanche is about to descend on her with the new Geno 2.0 chip. They are also going to discontinue the haplogroup names, because they pretty much have to, but will maintain an indented tree so you can at least see where you are. The names are becoming obsolete because every time there is an insertion upstream, everything downstream gets renamed and it makes us crazy. It was bad enough before, but going from 860+ branches to 6150+ in one fell swoop and knowing it’s probably just the beginning confirms the logic in abandoning the names. However, we have to develop or implement some sort of map so you can find your relative location (no pun intended) and understand what it means.
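To see why an insertion upstream renames everything downstream, here is a small Python sketch. It assumes a simplified naming rule that alternates digits and letters by position in the tree, and the SNP labels (snpA, snpB and so on) are invented for the example, not real markers. It is a toy, not the ISOGG procedure.

```python
import string


def hierarchical_names(tree, prefix, use_digit=True, names=None):
    """Name each branch by its position, alternating digits and letters."""
    if names is None:
        names = {}
    for index, (snp, children) in enumerate(tree.items()):
        suffix = str(index + 1) if use_digit else string.ascii_lowercase[index]
        name = prefix + suffix
        names[snp] = name
        # Each level down switches between a digit and a letter.
        hierarchical_names(children, name, not use_digit, names)
    return names


# Invented SNP labels; each key is a branch, each value holds its sub-branches.
tree_before = {"snpA": {"snpB": {"snpC": {}}}}
# A newly discovered branch point, snpX, turns up between snpA and snpB.
tree_after = {"snpA": {"snpX": {"snpB": {"snpC": {}}}}}

names_before = hierarchical_names(tree_before, "R")
names_after = hierarchical_names(tree_after, "R")

for snp in ("snpB", "snpC"):
    print(snp, names_before[snp], "became", names_after[snp])
```

Notice that the SNP labels themselves never change. Only the position-based names shift, which is the argument for referring to a branch by its defining SNP and keeping the indented tree just for orientation.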

Alice also mentioned that they need people to be responsible for specific haplogroups or subhaplogroups and they have lost people that have not been replaced, so if anyone is willing or knows of anyone….please contact Alice.

Alice also makes wonderful beaded double helix necklaces.

Brian Swann (sorry, no picture) is visiting from England this year and he spoke just a bit about British records. He said it’s imperative to learn how they work and to use some of the British sites where they have been indexed. He also reminded us to check GOONS (Guild of One Name Studies) for our surnames and that can help us localize family groups for recruiting. He said that you may have to do family reconstructions because to get a Brit to test you have to offer them something. That’s not terribly different from over here. He also mentioned that today about half of the British people having children don’t marry, so in the next generation, family reconstruction will be much more difficult. That too isn’t so terribly different than here, although I’m not sure about the percentages. It’s certainly a trend, as are varying surname practices even within marriage.

Dr. Doron Behar began the official Sunday agenda with a presentation about the mtCommunity and a discussion of his recently published paper “A ‘Copernican’ Reassessment of the Human Mitochondrial DNA Tree from its Root.” This paper has absolutely revolutionized the mitochondrial DNA community. I blogged about this when the paper was first released and our home pages were updated. One point he made is that it is important to remember that your mutations don’t change. The only thing that changes between the CRS (Cambridge Reference Sequence) and the RSRS (Reconstructed Sapiens Reference Sequence) model is what your mutations are being compared to. Instead of being compared to someone from Europe who lived in 1981 (the CRS), we are now comparing to the root of the tree, Mitochondrial Eve (RSRS), as best we can reconstruct what her mitochondrial DNA looked like.
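Dr. Behar's point that your DNA stays the same and only the yardstick changes can be shown with a tiny Python sketch. The three 10-letter sequences below are completely made up and stand in for the RSRS, the CRS and a tester. Real mitochondrial sequences are about 16,569 letters long and also contain insertions and deletions, which this sketch ignores. The function name `differences` is my own.

```python
def differences(sample, reference):
    """List substitutions as reference base, 1-based location, sample base."""
    if len(sample) != len(reference):
        raise ValueError("this sketch only handles equal-length sequences")
    return [
        f"{ref_base}{location}{sample_base}"
        for location, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


# Made-up sequences for illustration only.
root_reference = "ACGTACGTAC"    # stands in for the RSRS (the reconstructed root)
modern_reference = "ACGTATGTAC"  # stands in for the CRS (one modern person)
tester = "ACGAATGTAC"            # the person being tested

print("Compared to the root:", differences(tester, root_reference))
print("Compared to the modern reference:", differences(tester, modern_reference))
```

The tester's sequence is identical in both comparisons, but the list of reported mutations is not. A difference that the tester shares with the modern reference person disappears from the CRS-style list and only shows up when the comparison is made against the root.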

He also said that when people join the mtCommunity, their results are not automatically being added to GenBank at NCBI. That is a separate authorization check box.

A survey was distributed to question participants as to whether they want results, when they select the GenBank option, to be submitted with their kit number. Now, they are not, and they are under Bennett’s name, so any researcher with a question asks Bennett who has no “track back” to the person involved. About 6000 of the 16,000 submissions today at GenBank are from Family Tree DNA customers. Dr. Behar said that by this time next year, he would expect it to be over half. Once again, genetic genealogy pioneers are leading the way!

At these conferences, there is always one session that would be considered the keynote. Normally, it’s Spencer Wells when he is on the agenda, and indeed, his session was wonderful. But at the 2012 conference, this next session absolutely stole the show. Less public by far, and much less flashy, but at the core root of all humanity.

You can’t really tell from the title of this session what is coming. Michael Hammer with Thomas Krahn and Bonnie Schrack, one of our own citizen scientists, presented something called “A Highly Divergent Y Chromosome Lineage.” Yawn. But the content was anything but yawn-material. We literally watched scientific discovery unfold in front of our eyes.

Bonnie Schrack is the haplogroup A project administrator. Haplogroup A is African and is at the root of the entire haplotree. One of Bonnie’s participants, an African American man from South Carolina agreed to participate in WTY testing. In a nutshell, when Thomas and Astrid began scoring his results, they continued and continued and continued, and wound up literally taking all night. At dawn’s first light, Thomas told Astrid that he thought they had found an entirely new haplogroup that preceded any known today. But he was too sleep deprived to be sure. Astrid, equally as sleep deprived, replied with “Huh?” in disbelief. It’s certainly not a statement you expect to hear, even once in your lifetime. This is a once in the history of mankind event.

Dr. Michael Hammer confirmed that indeed, they had discovered the new root of the human Y tree. And not by a little either, but by a lot. For those who want to take a look for yourself, Ysearch ID 6M5JA. Hammer’s lab did the age projection on this sample, and it pushed the age of hominid men back by about 100,000 years, from 140,000 years ago to 237,000 years ago. They then reevaluated the aging on all of the tree and have moved the prior date to about 200,000 years ago and the new one to about 338,000 years ago with a 98% confidence level. This is before the oldest fossils that have been found, and also before the earliest mitochondrial DNA estimate, which previously had been twice as old as the Yline ancestor.

The previous root, A1b has been renamed A0 and the new root, just discovered is now A00. Any other new roots discovered will simply get another zero appended.

How is it that we’ve never seen this before? Well, it turns out that this line nearly went extinct. Cruciani published a paper in 2012 that included some STR values that matched this sample, but fortunately, Michael Hammer’s lab held the actual samples. A search of academic databases reveals only a very few close matches, all in western Cameroon near the Gulf of Guinea. Interestingly, next door, in Nigeria, fossils have been found younger than this with archaic features. This is going to cause us to have to reevaluate the source of this lineage and with it the lineage of all mankind. We must now ask the question about whether perhaps we really have stumbled upon a Neanderthal or other archaic lineage that of course “became” human. Like many scientific discoveries, this answer only raises more questions. My husband says this is like Russian tea dolls where ever smaller ones are nested in larger ones.

This discovery changes the textbooks, upsets the proverbial apple cart in a good way, and will keep scientists’ thinking caps on for years. And to think, this was a result of one of our projects, an astute project administrator (Bonnie) and a single project member. I wonder what the man who tested thinks of all of this. He is making science and all he thought he was doing was testing for genealogy. You just never know where the next scientific breakthrough will come from. Congrats to all involved, Bonnie, Thomas, Michael and to Bennett and Max for having this evolution revolution happen right in their lab!

If I felt sorry for Judy following Spencer, I really felt sorry for the breakout sessions following Thomas, Michael and Bonnie’s session. Thankfully at least we had a break in-between, but most people were wandering around with some degree of stunned disbelief on their faces. We all found it hard to fathom that we had been among the first to know of this momentous breakthrough.

I had a hard time deciding which session to attend, CeCe’s “Family Finder” session or Elise’s. I decided to attend Elise’s “Advanced Admin Techniques” because I work with autosomal DNA with my clients and I tend to keep more current there. Elise’s session was great for newer admins and held tips and hints for us old-timers too. I realized I really need to just sit down and play with all of the options.

There are some great new features built in that I’ve never noticed. For example, did you know that you can group people directly from the Y results chart without going to the subgrouping page? It’s much easier too because it’s one step. However, the bad news is that you still can’t invite someone who has already tested to join your project. Hopefully that feature will be added soon.

The next session was “A Tale of Two Families” given by Rory Van Tuyl detailing how he used various techniques to discern whether individuals who did not show up as matches, meaning they were beyond the match threshold, were actually from the same ancient family or not. Rory is a retired engineer and it shows in his attention to detail and affinity for math.

We always tell people that mutations can and do happen at any time, but Rory proved this. He ran a Monte Carlo simulation and showed that in one case, it was 50 generations between mutations, but in others, there was a mutation in each of three generations in a row. Mutations by no means happen at a constant rate. Of course, this means that our TIP calculator which has no choice but to use means and averages is by definition “not calibrated” for any particular family.

He also mentioned that his simulation shows that by about 150 generations, there are a couple of back mutations taking place.
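I don't have Rory's actual model, but a bare-bones simulation of this kind can be sketched in Python as follows. Everything specific here is my assumption for illustration: the 37 markers, the mutation rate of 0.004 per marker per generation, the rule that each mutation moves a marker up or down by exactly one repeat, and the function names.

```python
import random


def simulate_line(generations, markers, rate, seed):
    """Simulate one paternal line, returning mutation generations and marker offsets."""
    rng = random.Random(seed)      # seeded so a run can be repeated exactly
    offsets = [0] * markers        # change in each marker relative to the founder
    mutation_generations = []
    for generation in range(1, generations + 1):
        for marker in range(markers):
            if rng.random() < rate:
                # Assumed single-step model: one repeat gained or lost.
                offsets[marker] += rng.choice((-1, 1))
                mutation_generations.append(generation)
    return mutation_generations, offsets


def gaps(mutation_generations):
    """Number of generations between consecutive mutations."""
    return [later - earlier
            for earlier, later in zip(mutation_generations, mutation_generations[1:])]


events, offsets = simulate_line(generations=150, markers=37, rate=0.004, seed=1)
visible = sum(abs(value) for value in offsets)

print("Mutations that occurred:", len(events))
print("Differences still visible after 150 generations:", visible)
print("Generations between mutations:", gaps(events))
```

Try a few different seeds and the gaps come out wildly uneven, which is the whole point. Also, whenever a marker steps one way and later steps back, the visible difference count falls below the number of mutations that really happened. Those are the back mutations, and they are invisible to anyone comparing only the two ends of the line.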

The final session before the ending Q&A was Elliott speaking about IT, which really translates into new features and functions. Let’s face it, today everything involves IT.

Again, I was having trouble typing fast enough, so you might want to check the Twitter feed.

They added the SNP maps (admins, please turn them on) and the interactive tour this year. The tour isn’t used as much as it should be, so everyone, encourage your newbies to do this.

They have also added advanced matching, which I use a lot for clients, but many people didn’t realize it. So maybe a quick tour through the website options might be in order for most of us.

They are handling 50 times more data now than a year ago. Just think what next year will bring. Wow.

They are going to update the landing page again with more color and more visible options for people to do things. I hope they prompt people through things, like oldest ancestor mapping, for example. Otherwise, if it isn’t easy, most don’t.

They are upgrading Population Finder and the Gedcom viewer. They are adding a search feature. Thank you!! Older Gedcoms will still be there but not searchable.

But the best news is that they are adding phasing (parent child) and an advanced capability to “reconstruct” an ancestor using more distant relatives, then the ability to search using that ancestral profile against Family Finder. Glory be! We are finally getting there. Maybe my dreaming big wasn’t as far away as I thought.

They will also remove the 5 person autosomal download restriction and the “in common with” requirement to see additional information. All good news. They are also upgrading the Chromosome browser to add more filtering options.

They are also going to offer a developer “sandbox” area for applications.

The final Q&A session began with Bennett saying that their other priorities preclude upgrading Ysearch to 111 markers.

They are not planning to drop the entry level tests, 12 or 25 markers or the HVR1. If they do, lots of people will never take that plunge. I was very glad to hear this.

And by way of trivia, Family Tree DNA has run more than 5 million individual tests. Wow, not bad for a company that didn’t exist, in an industry that didn’t exist, 12 years ago!

It’s an incredible time to be alive and to be a genetic genealogist! Thank you Family Tree DNA for making all of this possible.

My husband, Jim, who is kind of a geeky guy in the best of ways and really is interested in genetic genealogy from a technologist’s perspective, asked me a question about the new mitochondrial comparative sequence, the RSRS (Reconstructed Sapiens Reference Sequence). We’ve been talking about it on the blog and on the various DNA lists for days now. So it stands to reason we’re talking about it at the dinner table too.

He asked, “Why now? Why not before when the transition would have been easier?” That’s a great question! The answer isn’t nearly as short as the question. I hate it when he does this to me!

The answer is Citizen Science – that means you and me – lots of us actually. How is that possible? Let’s take a look at some history. It’s actually quite interesting!

In 1981 when the Cambridge Reference Sequence was published as a comparative model, the science of genetics was functionally brand new. This anonymous person at Cambridge University was the first person to have all 16569 bases of their mitochondria sequenced, something anyone can have today for a couple of hundred dollars. But back then in the not so distant past, it was groundbreaking. The Y DNA hadn’t even been mapped yet, so this was the very beginning. At that point in time, there was no concept of mitochondrial Eve or Y-line Adam. So the CRS became the norm because we had no other basis for comparison.

In 1999, the CRS was resequenced, and surprisingly, 11 errors were found in the original sequence. Today that is called the Revised Cambridge Reference Sequence, or rCRS, technically, and that is the sequence that is used for both academia and genetic genealogy. Most people just refer to it as the Cambridge Reference Sequence because no one would use the older sequence today.

1999 was also the first year that genetic genealogy tests were commercially available to the public. They were sold by Oxford Ancestors and were prohibitively expensive, but that didn’t stop many of us from ordering one. If you bought the book “Seven Daughters of Eve,” you could send in the form in the back of the book, with a hefty check, and you too could discover which of the 7 daughters you descended from.

What you received was one piece of paper in the mail, months later, with a gold attendance star (like from Sunday School when you were a kid) placed on your haplogroup name. So for several hundred dollars, significantly more than a full sequence test today, I got a gold star on a J. I still have that certificate and I was unbelievably excited to know I was a member of Jasmine’s clan. Of course, in order to justify my DNA test, I had to test my husband’s too, so it cost me twice as much!

In the year 2000, Family Tree DNA opened their doors and began selling genetic genealogy testing kits. They also began surname projects. I don’t know if that was a stroke of genius or a stroke of luck. Soon thereafter, they added both haplogroup projects and geographic projects. These various project types allowed people with specific interests to focus on those areas of genetic genealogy. Little did we know that projects would eventually provide a huge pool of DNA-tested people for research, such as determining new haplogroups. In the past, all sequencing had been done at academic institutions, which often did not use full sequences because of the prohibitive cost. Many of the early academic papers were written with far fewer samples than today’s projects have members. Full sequence commercial testing has fostered exponential change in this industry.

By 2006, Family Tree DNA was offering the full mitochondrial sequence for genealogists, something still not offered today by any of the other major commercial testing companies. This not only enabled genealogists to determine who was actually a close match, but it also enabled the haplogroup projects to collect many samples of full sequence data. Coding region results (meaning those outside the HVR1, HVR2 and HVR3 regions) are not shown in the public projects because of the possibility that they may carry medical information, but they are available for project administrators to see if the individual participant authorizes administrator view access.

Haplogroups aren’t just determined by the hypervariable (HVR) regions, but by mutations found in the entire mitochondrial sequence, including the coding region. Never before had groupings of participants this size been available outside of academia, and often, not even within academia.

Many of the project administrators began discovering new haplogroups in a flurry of activity. Two that come immediately to mind are Jim Logan and Bill Hurst. Bill began publishing about haplogroup K in the Fall 2007 issue of the Journal of Genetic Genealogy (JoGG), as did Ian Logan with a discussion of what the mitochondrial DNA of “mitochondrial Eve” might look like. In Spring of 2008, Jim Logan published a groundbreaking paper for haplogroup J, still in use today. Indeed, citizen science came into its own in the spring of 2005, when JoGG was launched to facilitate exactly this type of academic publishing effort. The more traditional publications weren’t quite ready to deal with citizen scientists making discoveries. Clearly, citizen scientists didn’t fit well into the academic publishing “box.”

Bill Hurst has been collaborating with Dr. Doron Behar for several years now and is recognized in Dr. Behar’s most recent paper. They presented a joint session at the 5th International Conference on Genetic Genealogy for DNA Administrators in Houston, Texas, in March of 2009.

During this time, Family Tree DNA implemented an authorization system that let people who wished to do so make their full sequence DNA results available to Dr. Behar for research.

Dr. Behar’s paper, “A ‘Copernican’ Reassessment of the Human Mitochondrial DNA Tree from its Root,” written with several other authors, was published earlier this year. It defines the RSRS (Reconstructed Sapiens Reference Sequence), revealing the genetic fingerprint of Mitochondrial Eve, the original mother of us all. He was able to do this, in part, as a result of the many full sequence test results made available by Family Tree DNA customers, you and me, and by the hard work of haplogroup administrators like Bill Hurst and Jim Logan. Of course, there are many other hard-working administrators too, and I don’t mean to slight anyone.

So, this is a long-winded way to answer Jim’s question, which, in case you’ve forgotten, was “Why now for the RSRS and why not before?” The answer is, quite simply, that Citizen Scientists were needed. People like you and me. Until the stars aligned so that haplogroup projects existed, full sequence mitochondrial data became affordable and widely available, and there was a way for genealogists to contribute their results for scientific research, it couldn’t have been done – at least not yet. It’s been a long way from the gold star on haplogroup J to the beautifully elegant RSRS, the mitochondrial map of Eve, the common ancestor of everyone living today – the entire trip made in just a dozen years. Congratulations and thank you to everyone involved. Indeed, it’s really quite a remarkable story!