

Elizabeth Warren has released DNA testing results after being publicly challenged and derided as “Pocahontas” as a result of her claims of a family story indicating that her ancestors were Native American. If you’d like to read the specifics of the brouhaha, this Washington Post article provides a good summary, along with additional links.

I personally find name-calling of any type unacceptable behavior, especially in a public forum, and while Elizabeth’s DNA test was taken, I presume, in an effort to settle the question and end the name-calling, what it has done is to put the science of genetic testing smack dab in the middle of the headlines.

This article is NOT about politics, it’s about science and DNA testing. I will tell you right up front that any comments that are political or hateful in nature will not be allowed to post, regardless of whether I agree with them or not. Unfortunately, these results are being interpreted in a variety of ways by different individuals, in some cases to support a particular political position. I’m presenting the science, without the politics.

This is the first of a series of two articles.

I’m dividing this first article into four sections, and I’d ask you to read all four, especially before commenting. A second article, Possibilities – Wringing the Most Out of Your DNA Ethnicity Test will follow shortly about how to get the most out of an ethnicity test when hunting for Native American (or other minority, for you) ethnicity.

Understanding how the science evolved and works is an important part of comprehending the results and what they actually mean, especially since Elizabeth’s are presented in a different format than we are used to seeing. What a wonderful teaching opportunity.

Family History and DNA Science – How this works.

Elizabeth Warren’s Genealogy

Elizabeth Warren’s DNA Results

Questions and Answers – These are the questions I’m seeing, and my science-based answers.

My second article, Possibilities – Wringing the Most Out of Your DNA Ethnicity Test will include:

Potential – This isn’t all that can be done with ethnicity results. What more can you do to identify that Native ancestor?

Resources with Step by Step Instructions

Now, let’s look at Elizabeth’s results and how we got to this point.

Family Stories and DNA

Every person that grows up in their biological family hears family stories. We have no reason NOT to believe them until we learn something that potentially conflicts with the facts as represented in the story.

In terms of stories handed down for generations, all we have to go on, initially, are the stories themselves and our confidence in the person relating the story to us. The day that we begin to suspect that something might be amiss, we start digging, and for some people, that digging begins with a DNA test for ethnicity.

My family had that same Cherokee story. My great-grandmother on my father’s side who died in 1918 was reportedly “full blooded Cherokee” 60 years later when I discovered she had existed. Her brothers reportedly went to Oklahoma to claim headrights land. There were surely nuggets of truth in that narrative. Family members did indeed go to Oklahoma. One did own Cherokee land, BUT, he purchased that land from a tribal member who received an allotment. I discovered that tidbit later.

What wasn’t true? My great-grandmother was not 100% Cherokee. To the best of my knowledge now, a century after her death, she wasn’t Cherokee at all. She probably wasn’t Native at all. Why, then, did that story trickle down to my generation?

I surely don’t know. I can speculate that it might have been because various people were claiming Native ancestry in order to claim land when the government paid tribal members for land as reservations were dissolved between 1893 and 1914. You can read more about that in this article at the National Archives about the Dawes Rolls, compiled for the Cherokee, Creek, Choctaw, Chickasaw and Seminole for that purpose.

I can also speculate that someone in the family was confused about the brother’s land ownership, especially since it was Cherokee land.

I could also speculate that the confusion might have resulted because her husband’s father actually did move to Oklahoma and lived on Choctaw land.

But here is what I do know. I believed that story because there wasn’t any reason NOT to believe it, and the entire family shared the same story. We all believed it…until we discovered evidence through DNA testing that contradicted the story.

Before we discuss Elizabeth Warren’s actual results, let’s take a brief look at the underlying science.

Enter DNA Testing

DNA testing for ethnicity was first introduced in a very rudimentary form in 2002 (not a typo) and has progressed exponentially since. The major vendors who offer tests that provide their customers with ethnicity estimates (please note the word estimates) have all refined their customers’ results several times. The reference populations improve, the vendors’ internal software algorithms improve and population genetics as a science moves forward with new discoveries.

Note that major vendors in this context means Family Tree DNA, 23andMe, the Genographic Project and Ancestry. Two newer vendors include MyHeritage and LivingDNA, although LivingDNA is focused on England and MyHeritage, which utilizes imputation, is not yet quite up to snuff on its ethnicity estimates. Another entity, GedMatch, isn’t a testing vendor, but does provide multiple ethnicity tools if you upload your results from the other vendors. To get an idea of how widely the results vary, you can see the results of my tests at the different vendors here and here.

My initial DNA ethnicity test, in 2002, reported that I was 25% Native American, but I’m clearly not. It’s evident to me now, but it wasn’t then. That early ethnicity test was the dinosaur ages in genetic genealogy, but it did send me on a quest through genealogical records to prove that my family member was indeed Native. My father clearly believed this, as did the rest of the family. One of my early memories when I was about four years old was attending a (then illegal) powwow with my Dad.

In order to prove that Elizabeth Vannoy, that great-grandmother, was Native I asked a cousin who descends from her matrilineally to take a mitochondrial DNA test that would unquestionably provide the ethnicity of her matrilineal line – that of her mother’s mother’s mother’s direct line. If she was Native, her haplogroup would be a derivative of either A, B, C, D or X. Her mitochondrial DNA was European, haplogroup J, clearly not Native, so Elizabeth Vannoy was not Native on that line of her family. Ok, maybe through her dad’s line then. I was able to find a Vannoy male descendant of her father, Joel Vannoy, to test his Y DNA and he was not Native either. Rats!

Tracking Elizabeth Vannoy’s genealogy back in time provided no paper-trail link to any Native ancestors, but there were and are still females whose surnames and heritage we don’t know. Were they Native or part Native? Possibly. Nothing precludes it, but nothing (yet) confirms it either.

Ethnicity is often surprising and sometimes disappointing. People who expect Native American heritage in their DNA sometimes don’t find it. Why?

There is no Native ancestor

The Native DNA has “washed out” over the generations, but they did have a Native ancestor

We haven’t yet learned to recognize all of the segments that are Native

The testing company did not test the area that is Native

Not all vendors test the same areas of our DNA. Each major company tests about 700,000 locations, roughly, but not the same 700,000. If you’re interested in specifics, you can read more about that here.

50-50 Chance

Everyone receives half of their autosomal DNA from each parent.

That means that each parent contributes only HALF OF THEIR DNA to a child. The other half of their DNA is never passed on, at least not to that child.

Therefore, ancestral DNA passed on is literally cut in half in each generation. If your parent has a Native American DNA segment, there is a 50-50 chance you’ll inherit it too. You could inherit the entire segment, a portion of the segment, or none of the segment at all.

These calculations are estimates and use averages. Why? Because they tell us what to expect, on average. Every person’s results will vary. It’s entirely possible to carry a Native (or other ethnic) segment from 7 or 8 or 9 generations ago, or to have none in 5 generations. Of course, these calculations also presume that the “Native” ancestor we find in our tree was fully Native. If the Native ancestor was already admixed, then the percentages of Native DNA that you could inherit drop further.
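The averages described above can be sketched in a few lines of Python. This is only an illustration of the "cut in half each generation" rule of thumb, assuming one fully Native ancestor and counting your parents as generation 1; the function name `expected_share` is my own invention for this example, not something any vendor uses.

```python
def expected_share(generations_back: int) -> float:
    """Average fraction of your autosomal DNA that comes from ONE ancestor
    that many generations back (parents = 1, grandparents = 2, ...).

    This is an average only. Any individual can inherit more than this,
    less than this, or nothing at all from a given distant ancestor.
    """
    if generations_back < 1:
        raise ValueError("generations_back must be 1 or greater")
    return 0.5 ** generations_back


if __name__ == "__main__":
    # Print the expected average for each of the first ten generations.
    for g in range(1, 11):
        print(f"{g:>2} generations back: {expected_share(g) * 100:.4f}% on average")
```

Remember that these are expectations, not promises. Real inheritance is lumpy, which is why a segment from 8 or 9 generations ago can survive while one from 5 generations ago disappears entirely.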

Why Call Ethnicity an Estimate?

You’ve probably figured out by now that due to the way that DNA is inherited, your ethnicity as reported by the major testing companies isn’t an exact science. I discussed the methodology behind ethnicity results in the article, Ethnicity Testing – A Conundrum.

It is, however, a specialized science known as Population Genetics. The quality of the results that are returned to you varies based on several factors:

World Region – Ethnicity estimates are quite accurate at the continental level, plus Jewish – meaning African, Indo-European, Asian, Native American and Jewish. These regions are more different than alike and better able to be separated.

Reference Population – The size of the population your results are being compared to is important. The larger the reference population, the more likely your results are to be accurate.

Vendor Algorithm – None of the vendors provide the exact nature of their internal algorithms that they use to determine your ethnicity percentages. Suffice it to say that each vendor’s staff includes population geneticists and they all have years of experience. These internal differences are why the estimates vary when compared to each other.

Size of the Segment – As with all genetic genealogy, bigger is better because larger segments stand a better chance of being accurate.

Academic Phasing – A methodology academics and vendors use in which segments of DNA that are known to travel together during inheritance are grouped together in your results. This methodology is not infallible, but in general, it helps to group your mother’s DNA together and your father’s DNA together, especially when parents are not available for testing.

Parental Phasing – If your parents test and they too have the same segment identified as Native, you know that the identification of that segment as Native is NOT a factor of chance, where the DNA of each of your parents just happens to fall together in a manner as to mimic a Native segment. Parental phasing is the ability to divide your DNA into two parts based on your parent’s DNA test(s).

Two Chromosomes – You have two copies of each chromosome, one from your mother and one from your father. DNA testing can’t easily separate those copies, so the exact same “address” on the chromosomes you inherited from your mother and from your father may carry two different ethnicities, unless your parents are both from the same ethnic population, of course.

All of these factors, together, create a confidence score. Consumers never see these scores as such, but the vendors return the highest confidence results to their customers. Some vendors include the capability, one way or another, to view or omit lower confidence results.

Parental Phasing – Identical by Descent

If you’re lucky enough to have your parents, or even one parent available to test, you can determine whether that segment thought to be Native came from one of your parents, or if the combination of both of your parents’ DNA just happened to combine to “look” Native.

Here’s an example where the “letters” (nucleotides) of Native DNA for an example segment are shown at left. If you received the As from one of your parents, your DNA is said to be phased to that parent’s DNA. That means that you in fact inherited that piece of your DNA from your mother, in the case shown below.

That’s known as Identical by Descent (IBD). The other possibility is that your DNA from both of your parents intermixed to mimic a Native segment, shown below.

This is known as Identical by Chance (IBC).

You don’t need to understand the underpinnings of this phenomenon, just remember that it can happen, and the smaller the segment, the more likely that a chance combination can randomly happen.
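For readers who do want to peek under the hood, here is a toy sketch of the difference. It assumes a made-up ten-letter "Native" reference segment and assumes we already know which strand came from Mom and which from Dad, which is exactly what parental phasing gives you. The function names and the letters are hypothetical, chosen only for illustration. Real tools work with hundreds or thousands of locations and statistical confidence, not exact letter matching.

```python
def matches_reference(mom_strand: str, dad_strand: str, reference: str) -> bool:
    """Unphased comparison: at every position, does AT LEAST ONE of your two
    letters equal the reference letter? This is all a test can see when it
    cannot tell the two strands apart."""
    if not (len(mom_strand) == len(dad_strand) == len(reference)):
        raise ValueError("all three sequences must be the same length")
    return all(ref in (m, d) for m, d, ref in zip(mom_strand, dad_strand, reference))


def classify_segment(mom_strand: str, dad_strand: str, reference: str) -> str:
    """Return 'IBD' if one whole parental strand equals the reference,
    'IBC' if the match only appears by zig-zagging between the two strands,
    or 'no match' if the segment does not look like the reference at all."""
    if not matches_reference(mom_strand, dad_strand, reference):
        return "no match"
    if mom_strand == reference or dad_strand == reference:
        return "IBD"
    return "IBC"


if __name__ == "__main__":
    native = "AAAAAAAAAA"  # hypothetical reference segment, not real data
    # Mom's strand carries every A, so the segment phases to Mom.
    print(classify_segment("AAAAAAAAAA", "TCGTCGTCGT", native))
    # Neither strand is all As, but between them an A is present everywhere.
    print(classify_segment("ATATATATAT", "TATATATATA", native))
```

Notice that the second case passes the unphased check even though neither parent contributed a Native-looking strand. With only ten positions that kind of accident is easy, and with a long segment it becomes very unlikely, which is why larger segments are more trustworthy.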

Elizabeth Warren’s Genealogy

Elizabeth Warren’s genealogy is reported to the 5th generation by WikiTree.

Dr. Carlos D. Bustamante is an internationally recognized leader in the application of data science and genomics technology to problems in medicine, agriculture, and biology. He received his Ph.D. in Biology and MS in Statistics from Harvard University (2001), was on the faculty at Cornell University (2002-9), and was named a MacArthur Fellow in 2010. He is currently Professor of Biomedical Data Science, Genetics, and (by courtesy) Biology at Stanford University. Dr. Bustamante has a passion for building new academic units, non-profits, and companies to solve pressing scientific challenges. He is Founding Director of the Stanford Center for Computational, Evolutionary, and Human Genomics (CEHG) and Inaugural Chair of the Department of Biomedical Data Science. He is the Owner and President of CDB Consulting, LTD. and also a Director at Eden Roc Biotech, founder of Arc-Bio (formerly IdentifyGenomics and BigData Bio), and an SAB member of Imprimed, Etalon DX, and Digitalis Ventures among others.

Ancestry Informative Markers (AIMs) are commonly used to estimate overall admixture proportions efficiently and inexpensively. AIMs are polymorphisms that exhibit large allele frequency differences between populations and can be used to infer individuals’ geographic origins.

And:

Using a panel of AIMs distributed throughout the genome, it is possible to estimate the relative ancestral proportions in admixed individuals such as African Americans and Latin Americans, as well as to infer the time since the admixture process.

The methodology produced results of the type that we are used to seeing in terms of continental admixture, shown in the graphic below from the paper.

Matching test takers against the genetic locations that can be identified as either Native or African or European informs us that our own ancestors carried the DNA associated with that ethnicity.

Of course, the Native samples from this paper were focused south of the United States, but the process is the same regardless. The original Native American population of a few individuals arrived thousands of years ago in one or more groups from Asia and their descendants spread throughout both North and South America.

Elizabeth’s request, from the report:

To analyze genetic data from an individual of European descent and determine if there is reliable evidence of Native American and/or African ancestry. The identity of the sample donor, Elizabeth Warren, was not known to the analyst during the time the work was performed.

Elizabeth’s test included 764,958 genetic locations, of which 660,173 overlapped with locations used in ancestry analysis.

The Results section says after stating that Elizabeth’s DNA is primarily (95% or greater) European:

The analysis also identified 5 genetic segments as Native American in origin at high confidence, defined at the 99% posterior probability value. We performed several additional analyses to confirm the presence of Native American ancestry and to estimate the position of the ancestor in the individual’s pedigree.

The largest segment identified as having Native American ancestry is on chromosome 10. This segment is 13.4 centiMorgans in genetic length, and spans approximately 4,700,000 DNA bases. Based on a principal components analysis (Novembre et al., 2008), this segment is clearly distinct from segments of European ancestry (nominal p-value 7.4 x 10^-7, corrected p-value of 2.6 x 10^-4) and is strongly associated with Native American ancestry.

The total length of the 5 genetic segments identified as having Native American ancestry is 25.6 centiMorgans, and they span approximately 12,300,000 DNA bases. The average segment length is 5.8 centiMorgans. The total and average segment size suggest (via the method of moments) an unadmixed Native American ancestor in the pedigree at approximately 8 generations before the sample, although the actual number could be somewhat lower or higher (Gravel, 2012 and Huff et al., 2011).

Dr. Bustamante’s Conclusion:

While the vast majority of the individual’s ancestry is European, the results strongly support the existence of an unadmixed Native American ancestor in the individual’s pedigree, likely in the range of 6-10 generations ago.

I was very pleased to see that Dr. Bustamante had included the PCA (Principal Component Analysis) for Elizabeth’s sample as well.

PCA is the scientific methodology utilized to group individuals into and within populations.

Figure one shows the section of chromosome 10 that showed the largest Native American haplotype, meaning DNA block, as compared to other populations.

Remember that since Elizabeth received a chromosome from BOTH parents, she has two strands of DNA in that location.

Here’s our example again.

Given that Mom’s DNA is Native, and Dad’s is European in this example, the expected results when comparing this segment of DNA to other populations is that it would look half Native (Mom’s strand) and half European (Dad’s strand.)

The second graphic shows Elizabeth’s sample and where it falls in the comparison of First Nations (Canada) and Indigenous Mexican individuals. Given that Elizabeth’s Native ancestor would have been from the United States, her sample falls where expected, in between.

Let’s take a look at some of the questions being asked.

Questions and Answers

I’ve seen a lot of misconceptions and questions regarding these results. Let’s take them one by one:

Question – Can these results prove that Elizabeth is Cherokee?

Answer – No, there is no test, anyplace, from any lab or vendor, that can prove what tribe your ancestors were from. I wrote an article titled Finding Your American Indian Tribe Using DNA, but that process involves working with your matches, Y and mitochondrial DNA testing, and genealogy.

Q – Are these results absolutely positive?

A – The words “absolutely positive” are a difficult quantifier. Given the size of the largest segment, 13.4 cM, and that there are 5 Native segments totaling 25.6 cM, and that Dr. Bustamante’s lab performed the analysis – I’d say this is as close to “absolutely positive” as you can get without genealogical confirmation.

A 13.4 cM segment is a valid segment that phases to parents 98% of the time, according to Philip Gammon’s work, here, and 99% of the time in my own analysis here. That indicates that a 13.4 cM segment is very likely a legitimately ancestral segment, not a match by chance. The additional 4 segments simply increase the likelihood of a Native ancestor. In other words, for there NOT to be a Native ancestor, all 5 segments, including the large 13.4 cM segment would have to be misidentified by one of the premier scientists in the field.

Q – What did Dr. Bustamante mean by “evidence of an unadmixed Native American ancestor?”

A – Unadmixed means that the Native person was fully Native, meaning not admixed with European, Asian or African DNA. Admixture, in this context, means that the individual is a mixture of multiple ethnic groups. This is an important concept, because if you discover that your ancestor 4 generations ago was a Cherokee tribal member, but the reality was that they were only 25% Native, that means that the DNA was already in the process of being divided. If your 4th generation ancestor was fully Native, you would receive about 6.25% of their DNA, which would be all Native. If they were only 25% Native, you would still receive about 6.25% of their DNA, but only one fourth of that 6.25% is possibly Native – so 1.56%. You could also receive NONE of their Native DNA.
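The arithmetic in that answer can be sketched as a small Python function. This is a rough illustration using averages only; the name `expected_native_share` and its parameters are mine, invented for this example, and the result says nothing about what any one person actually inherits.

```python
def expected_native_share(generations_back: int,
                          ancestor_native_fraction: float = 1.0) -> float:
    """Average fraction of YOUR DNA that is Native, coming from one ancestor.

    generations_back: 1 = parent, 2 = grandparent, and so on.
    ancestor_native_fraction: 1.0 for an unadmixed ancestor, 0.25 for an
        ancestor who was themselves only one quarter Native.
    """
    if generations_back < 1:
        raise ValueError("generations_back must be 1 or greater")
    if not 0.0 <= ancestor_native_fraction <= 1.0:
        raise ValueError("ancestor_native_fraction must be between 0 and 1")
    # Halve the ancestor's contribution once per generation, then keep only
    # the portion of that contribution that was Native to begin with.
    return (0.5 ** generations_back) * ancestor_native_fraction


if __name__ == "__main__":
    unadmixed = expected_native_share(4, 1.0)
    quarter = expected_native_share(4, 0.25)
    print(f"Fully Native ancestor, 4 generations back: {unadmixed * 100:.2f}%")
    print(f"25% Native ancestor, 4 generations back:   {quarter * 100:.2f}%")
```

The two cases match the 6.25% and 1.56% figures above. Plugging in 6 through 10 generations shows how quickly the expected amount shrinks, which is why a handful of small segments is all that may remain.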

Q – Is this the same test that the major companies use?

A – Yes and no. The test itself was probably performed on the same Illumina chip platform, because the chips available cover the markers that Bustamante needed for analysis.

The major companies use the same reference databases, plus their own internal or private databases in addition. They do not create PCA models for each tester. They do use the same methodology described by Dr. Bustamante in terms of AIMs, along with proprietary algorithms to further define the results. Vendors may also use additional internal tools.

Q – Did Dr. Bustamante use more than one methodology in his analysis? What if one was wrong?

A – Yes, he utilized two different methodologies whose results agreed. The global ancestry method evaluates each location independently of any surrounding genetic locations, ignoring any correlation or relationship to neighboring DNA. The second methodology, known as the local ancestry method looks at each location in combination with its neighbors, given that DNA pieces are known to travel together. This second methodology allows comparisons to entire segments in reference populations and is what allows the identification of complete ancestral segments that are identified as Native or any other population.

Q – If Elizabeth’s DNA results hadn’t shown Native heritage, would that have proven that she didn’t have Native ancestry?

A – No, not definitively, although that is a possible reason for ethnicity results not showing Native admixture. It would have meant that either she didn’t have a Native ancestor, the DNA washed out, or we cannot yet detect those segments.

Q – Does this qualify Elizabeth to join a tribe?

A – No. Every tribe defines its own criteria for membership. Some tribes embrace DNA testing for paternity issues, but none, to the best of my knowledge, accept or rely entirely on DNA results for membership. DNA results alone cannot identify a specific tribe. Tribes are societal constructs and Native people genetically are more alike than different, especially in areas where tribes lived nearby, fought and captured other tribes’ members.

Q – Why does Dr. Bustamante use words like “strong probability” instead of absolutes, such as the percentages shown by commercial DNA testing companies?

A – Dr. Bustamante’s comments accurately reflect the state of our knowledge today. The vendors attempt to make the results understandable and attractive for the general population. Most vendors, if you read their statements closely and look at your various options, indicate that ethnicity is only an estimate, and some provide the ability to view your ethnicity estimate results at high, medium and low confidence levels.

Q – Can we tell, precisely, when Elizabeth had a Native ancestor?

A – No, that’s why Dr. Bustamante states that Elizabeth’s ancestor was approximately 8 generations ago, and in the range of 6-10 generations ago. This analysis is a result of combined factors, including the total centiMorgans of Native DNA, the number of separate reasonably large segments, the size of the longest segment, and the confidence score for each segment. Those factors together predict most likely when a fully Native ancestor was present in the tree. Keep in mind that if Elizabeth had more than one Native ancestor, that too could affect the time prediction.

Q – Does Dr. Bustamante provide this type of analysis or tools for the general public?

A – Unfortunately, no. Dr. Bustamante’s lab is a research facility only.

Roberta’s Summary of the Analysis

I find no omissions or questionable methods and I agree with Dr. Bustamante’s analysis. In other words, yes, I believe, based on these results, that Elizabeth had a Native ancestor further back in her tree.

I would love for every tester to be able to receive PCA results like this.

However, an ethnicity confirmation isn’t all that can be done with Elizabeth’s results. Additional tools and opportunities are available outside of an academic setting, at the vendors where we test, using matching and other tools we have access to as the consuming public.

We will look at those possibilities in a second article, because Elizabeth’s results are really just a beginning and scratch the surface. There’s more available, much more. It won’t change Elizabeth’s ethnicity results, but it could lead to positively identifying the Native ancestor, or at least the ancestral Native line.

Join me in my next article for Possibilities, Wringing the Most Out of Your DNA Ethnicity Test.

This week while working with German records, I came across something very interesting, and as I thought more about this particular document, I realized that there is a deeper message here than is initially evident.

The document is a list of individuals who had obtained permission to emigrate from Wurttemberg, Germany between 1816 and 1822. At that time, one had to file for permission to emigrate, obtain permission, and the list of those departing was a legal document published to forewarn any creditors. This list happens to include, in some cases, the destination of the departing German citizen. It’s obvious that this information was not essential, because at least half of the entries don’t have any destination. They really didn’t care where you were going.

Some destinations are very specific, particularly if they were moving to another German town outside of Wurttemberg.

Several destinations gave locations like “to America or Russia” and sometimes “to America and Russia” and others “some to America and some to Russia.” Either the emigrants hadn’t yet made up their mind, or the German authorities really didn’t care which of the two destinations.

My ancestors were in the “America” group, but I never thought about Germans migrating to Russia. In general, my assumption has been that migration was generally westward, and Russia is significantly east of Germany.

Even more interesting are the entries that say Kaukasus, which is dramatically distant. The Caucasus is just north of the Middle East, in the area considered Eurasia, the dividing line between Europe and Asia, between the Black and Caspian Seas. In 8 cases, they gave the name of the town, Odessa, which is in the Ukraine on the Black Sea. So, Russia may not mean the closest portion of Russia – although no part of Russia was close to Germany. Russia as a location may indeed mean traveling thousands of miles east and south. Not exactly the direction in which we think of relatively contemporary population migration.

There were 3605 records total, many without additional information. But those that do provide additional information are quite interesting:

327 America (including North America)

501 Russia (some say Georgian, one says Crimea)

112 Kaukasus (one says Russia – Kaukusas)

11 Asia (1 says Russian Asia)

16 Poland

17 Austria

8 say Odessa, which is in the Ukraine on the Black Sea.

Some name other German towns.

A couple of people are noted as Separatist, one is divorced, two are single females with illegitimate children. Several are noted as widows or widowers. One says “with wife without permission.”

Perhaps the most remarkable aspect of this list is the locations not listed. No other countries are listed, other than what is shown above. South America is not listed. No place in southern or western or northern Europe is listed. Neither is Scandinavia.

I would never have thought about “backward migration.” In genetic genealogy, unless you are one of the Vikings who basically invaded pretty much anyplace in Europe and the Mediterranean that could be invaded, we think of settlement and migration as moving northward and eastward into Europe out of the Middle East, Asia and the Caucasus. I have never, not once, thought about people from central Europe migrating back into Eurasia, back into the Caucasus from southwestern Germany – over 2000 km or about 1300 miles. They did, however, and became known as the Black Sea Germans.

Georgia, on the other hand, is even further – about 3680 km or 2300 miles.

At 10 miles a day in a wagon, it would be 230 days to Georgia or 130 days to Odessa. You had to really, really want to go there.
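The back-of-the-envelope math behind those numbers is simple enough to sketch in Python. The 10 miles per day pace is my own rough assumption from above, and the distances are the approximate figures already quoted, not surveyed routes; the function names are made up for this illustration.

```python
KM_PER_MILE = 1.609344  # exact definition of the international mile


def km_to_miles(km: float) -> float:
    """Convert kilometers to miles."""
    return km / KM_PER_MILE


def travel_days(distance_miles: float, miles_per_day: float = 10.0) -> float:
    """Days on the road at a steady pace, ignoring rest days, weather,
    river crossings and everything else that slowed a real wagon."""
    if miles_per_day <= 0:
        raise ValueError("miles_per_day must be positive")
    return distance_miles / miles_per_day


if __name__ == "__main__":
    # Rounded mileages used in the text above.
    for place, miles in [("Odessa", 1300), ("Georgia", 2300)]:
        print(f"{place}: about {travel_days(miles):.0f} days at 10 miles a day")
    # Sanity check on the Georgia distance conversion.
    print(f"3680 km is roughly {km_to_miles(3680):.0f} miles")
```

Changing `miles_per_day` shows how sensitive the estimate is. Even at a brisk 15 miles a day, Georgia is still a journey of roughly five months.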

On the other hand, the trip to America was “just” 600 km (370 miles) or so to Rotterdam where you boarded a ship, sailed and waited, probably seasick, for between 2 and 3 months to arrive. You then climbed aboard a wagon again to your final American destination which was probably relatively close to your port of arrival – at least compared to the Caucasus.

We’re not surprised to find “German” DNA in America of course, but finding “German” DNA in the Middle East or the Caucasus could well lead to interpreting the data incorrectly if we adhere to the model of only forward (meaning northward and westward) migration. In these records, we find documentation that significant backwards migration did occur, and relatively recently. We can’t assume that where DNA is found today is where it originated nor that the expansion area follows the generally accepted direction of population migration.

Of course, we’ve always known that about destination locations, like the British Isles for example, but we don’t often think of places in Russia and the Caucasus, which was at that time under Russian rule, as immigration locations for European emigrants. That small stream of emigrants to Russia, over time, added up to a significant population. The first Russian census was taken in 1897 and it showed 1.8 million Germans living in Russia.

If you’re interested in further information, there is a very interesting website that includes a history and map of German Russian settlements from the 1700s and 1800s.


For the past three years I’ve written a year-in-review article. You can see just how much the landscape has changed in the 2012, 2013 and 2014 versions.

This year, I’ve added a few specific “award” categories for people or firms that I feel need to be specially recognized as outstanding in one direction or the other.

In past years, some news items, announcements and innovations turned out to be very important like the Genographic Project and GedMatch, and others, well, not so much. Who among us has tested their full genome today, for example, or even their exome? And what would you do with that information if you did?

And then there are the deaths, like the Sorenson database and Ancestry’s own Y and mitochondrial database. I still shudder to think how much we’ve lost at the corporate hands of Ancestry.

In past years, there have often been big new announcements facilitated by new technology. In many ways, the big fish have been caught in a technology sense. Those big fish are autosomal DNA and the Big Y types of tests. Both of these have created an avalanche of data and we, personally and as a community, are still trying to sort through what all of this means genealogically and how to best utilize the information. Now we need tools.

This is probably illustrated most aptly by the expansion of the Y tree.

The SNP Tsunami Growing Pains Continue

Going from 800+ SNPs in 2012 to more than 35,000 SNPs today has introduced its own set of problems. First, there are multiple trees in existence, completely or partially maintained by different organizations for different purposes. Needless to say, these trees are not in sync with each other. The criteria for adding a SNP to the tree are decided by the owner or steward of that tree, and there is no agreement as to the definition of a valid SNP or how many instances of that SNP need to be in existence to be added to the tree.

This angst has been taking place for the most part outside of the public view, but it exists just the same.

For example, 23andMe still uses the old haplogroup names like R1b which have not been used in years elsewhere. Family Tree DNA is catching up with updating their tree, working with haplogroup administrators to be sure only high quality, proven SNPs are added to branches. ISOGG maintains another tree (one branch shown above) that’s publicly available, utilizing volunteers per haplogroup and sometimes per subgroup. Other individuals and organizations maintain other trees, or branches of trees, some very accurate and some adding a new “branch” with as little as one result.

The good news is that this will shake itself out. Personally, I’m voting for the more conservative approach for public reference trees to avoid “pollution” and a lot of shifting and changing downstream when it’s discovered that the single instance of a SNP is either invalid or in a different branch location. However, you have to start with an experimental or speculative tree before you can prove that a SNP is where it belongs or needs to be moved, so each of the trees has its own purpose.

The full trees I utilize are the Family Tree DNA tree, available for customers, the ISOGG tree and Ray Banks’ tree which includes locations where the SNPs are found when the geographic location is localized. Within haplogroup projects, I tend to use a speculative tree assembled by the administrators, if one is available. The haplogroup admins generally know more about their haplogroup or branch than anyone else.

The bad news is that this situation hasn’t shaken itself out yet, and due to the magnitude of the elephant at hand, I don’t think it will anytime soon. As this shuffling and shaking occurs, we learn more about where the SNPs are found today in the world, where they aren’t found, which SNPs are “family” or “clan” SNPs and the timeframes in which they were born.

In other words, this is a learning process for all involved – albeit a slow and frustrating one. However, we are making progress and the tree becomes more robust and accurate every year.

We may be having growing pains, but growing pains aren’t necessarily a bad thing and are necessary for growth.

Thank you to the hundreds of volunteers who work on these trees, and in particular, to Alice Fairhurst who has spearheaded the ISOGG tree for the past nine years. Alice retired from that volunteer position this year and is shown below after receiving two much-deserved awards for her service at the Family Tree DNA Conference in November.

Best Innovative Use of Integrated Data

Dr. Maurice Gleeson receives an award this year for the best genealogical use of integrated types of data. He has utilized just about every tool he can find to wring as much information as possible out of Y DNA results. Not only that, but he has taken great pains to share that information with us in presentations in the US and overseas, and by creating a video, noted in the article below. Thanks so much Maurice.

Making Sense of Y Data

The advent of massive amounts of Y DNA data has been both wonderful and perplexing. We as genetic genealogists want to know as much about our family as possible, including what the combination of STR and SNP markers means to us. In other words, we don’t want two separate “test results” but a genealogical marriage of the two.

I took a look at this from the perspective of the Estes DNA project. Of course, everyone else will view those results through the lens of their own surname or haplogroup project.

At the Family Tree DNA Conference in November, James Irvine and Maurice Gleeson both presented sessions on utilizing a combination of STR and SNP data and various tools in analyzing their individual projects.

Peter’s session at the genealogy conference in Sweden this year was packed. This photo, compliments of Katherine Borges, shows the room and the level of interest in Y-DNA and the messages it holds for genetic genealogists.

This type of work is the wave of the future, although hopefully it won’t be so manually intensive. However, the process of discovery is by definition laborious. From this early work will one day emerge reproducible methodologies, the fruits of which we will all enjoy.

Haplogroup Definitions and Discoveries Continue

Often, haplogroup work flies under the radar today and gets dwarfed by some of the larger citizen science projects, but this work is fundamentally important. In 2015, we made discoveries about haplogroups A4 and C, for example.

These aren’t the only discoveries, by any stretch of the imagination. For example, Mike Wadna, administrator for the Haplogroup R1b Project reports that there are now over 1500 SNPs on the R1b tree at Family Tree DNA – which is just about twice as many as were known in total for the entire Y tree in 2012 before the Genographic project was introduced.

The new Y DNA SNP Packs being introduced by Family Tree DNA which test more than 100 SNPs for about $100 will go a very long way in helping participants obtain haplogroup assignments further down the tree without doing the significantly more expensive Big Y test. For example, the R1b-DF49XM222 SNP Pack tests 157 SNPs for $109. Of course, if you want to discover your own private line of SNPs, you’ll have to take the Big Y. SNP Packs can only test what is already known and the Big Y is a test of discovery.
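To put the SNP Pack pricing in perspective, here is a back-of-the-envelope sketch in Python that works out the average cost per SNP. The function name is made up for illustration, and the only inputs are the 157 SNPs and $109 price quoted above.

```python
def cost_per_snp(price_usd: float, snp_count: int) -> float:
    """Return the average price paid per SNP tested."""
    if snp_count <= 0:
        raise ValueError("snp_count must be positive")
    return price_usd / snp_count


# Figures quoted above for the R1b-DF49XM222 SNP Pack.
pack_cost = cost_per_snp(109.0, 157)
print(f"About ${pack_cost:.2f} per SNP")
```

That works out to well under a dollar per SNP, but only for SNPs that are already known.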

Best Blog

Jim Bartlett, hands down, receives this award for his new and wonderful blog, Segmentology.

Making Sense of Autosomal DNA

Our autosomal DNA results provide us with matches at each of the vendors and at GedMatch, but what do we DO with all those matches and how do we utilize the genetic match information? How do we translate those matches into ancestral information? And once we’ve assigned a common ancestor to a match with an individual, how does that match affect other matches on that same segment?

2015 has been the year of sorting through the pieces and defining terms like IBS (identical by state, which covers both identical by population and identical by chance) and IBD (identical by descent). There has been a lot written this year.

Jim Bartlett, a long-time autosomal researcher, has introduced his new blog, Segmentology, to discuss his journey through mapping ancestors to his DNA segments. To the best of my knowledge, Jim has mapped more of his chromosomes than any other researcher, more than 80% to specific ancestors – and all of us can leverage Jim’s lessons learned.

Earlier in the year, there was a lot of discussion and dissension about the definition of and use of small segments. I utilize them, carefully, generally in conjunction with larger segments. Others don’t. Here’s my advice. Don’t get yourself hung up on this. You probably won’t need or use small segments until you get done with the larger segments, meaning low-hanging fruit, or unless you are doing a very specific research project. By the time you get to that point, you’ll understand this topic and you’ll realize that the various researchers agree about far more than they disagree, and you can make your own decision based on your individual circumstances. If you’re entirely endogamous, small segments may just make you crazy. However, if you’re chasing a colonial American ancestor, then you may need those small segments to identify or confirm that ancestor.

It is unfortunate, however, that all of the relevant articles are not represented in the ISOGG wiki, allowing people to fully educate themselves. Hopefully this can be updated shortly with the additional articles, listed above and from Jim Bartlett’s blog, published during this past year.

As we learn more about how to use autosomal DNA, we have begun to reconstruct our ancestors from the DNA of their descendants. Not as in cloning, but as in attributing DNA found in multiple descendants to the common ancestor, or ancestral couple, from whom it originates. The first foray into this arena was GedMatch with their Lazarus tool.

Some of you may remember J.R. Ewing on the television show called Dallas that ran from 1978 through 1991. J.R. Ewing, a greedy and unethical oil tycoon was one of the main characters. The series was utterly mesmerizing, and literally everyone tuned in. We all, and I mean universally, hated J.R. Ewing for what he unfeelingly and selfishly did to his family and others. Finally, in a cliffhanger end of the season episode, someone shot J.R. Ewing. OMG!!! We didn’t know who. We didn’t know if J.R. lived or died. Speculation was rampant. “Who shot JR?” was the theme on t-shirts everyplace that summer. J.R. Ewing, over time, became the man all of America loved to hate.

Ancestry has become the J.R. Ewing of the genealogy world for the same reasons.

In essence, in the genetic genealogy world, Ancestry introduced a substandard DNA product, which remains substandard years later with no chromosome browser or comparison tools that we need….and they have the unmitigated audacity to try to convince us we really don’t need those tools anyway. Kind of like trying to convince someone with a car that they don’t need tires.

Worse, yet, they’ve introduced “better” tools (New Ancestor Discoveries), as in tools that were going to be better than a chromosome browser. New Ancestor Discoveries “gives us” ancestors that aren’t ours. Sadly, there are many genealogists being led down the wrong path with no compass available.

Ancestry’s history of corporate stewardship is abysmal and continues with the obsolescence of various products and services including the Sorenson DNA database, their own Y and mtDNA database, MyFamily and most recently, Family Tree Maker. While the Family Tree Maker announcement has been met with great gnashing of teeth and angst among their customers, there are other software programs available. Ancestry’s choice to obsolete the DNA data bases is irrecoverable and a huge loss to the genetic genealogy community. That information is lost forever and not available elsewhere – a priceless, irreplaceable international treasure intentionally trashed.

If Ancestry had not bought up nearly all of the competing resources, people would be cancelling their subscriptions in droves to use another company – any other company. But there really is no one else anymore. Ancestry knows this, so they have become the J.R. Ewing of the genealogy world – uncaring about the effects of their decisions on their customers or the community as a whole. It’s hard for me to believe they have knowingly created such wholesale animosity within their own customer base. I think having a job as a customer service rep at Ancestry would be an extremely undesirable job right now. Many customers are furious and Ancestry has managed to upset pretty much everyone one way or another in 2015.

In October, 23andMe announced that it has reached an agreement with the FDA about reporting some health information such as carrier status and traits to their clients. As a part of or perhaps as a result of that agreement, 23andMe is dramatically changing the user experience.

In some aspects, the process will be simplified for genealogists with a universal opt-in. However, other functions are being removed and the price has doubled. New advertising says little or nothing about genealogy and is entirely medically focused. That combined with the move of the trees offsite to MyHeritage seems to signal that 23andMe has lost any commitment they had to the genetic genealogy community, effectively abandoning the group entirely that pulled their collective bacon out of the fire. This is somehow greatly ironic in light of the fact that it was the genetic genealogy community through their testing recommendations that kept 23andMe in business for the two years, from November of 2013 through October of 2015 when the FDA had the health portion of their testing shut down. This is a mighty fine thank you.

As a result of the changes at 23andMe relative to genealogy, the genetic genealogy community has largely withdrawn their support and recommendations to test at 23andMe in favor of Ancestry and Family Tree DNA.

My account at 23andMe has not yet been converted to the new format, so I cannot personally comment on the format changes yet, but I will write about the experience in 2016 after my account is converted.

Furthermore, I will also be writing a new autosomal vendor testing comparison article after their new platform is released.

Another award this year is the Cone of Shame award which is also awarded to both Ancestry and 23andMe for their methodology of obtaining “consent” to sell their customers’, meaning our, DNA and associated information.

Now, both Ancestry and 23andMe have made research arrangements and state in their release and privacy verbiage that all customers must electronically sign (or click through) when purchasing their DNA tests that they can sell, at minimum, your anonymized DNA data, without any further consent. And there is no opt-out at that level.

They can also use our DNA and data internally, meaning that 23andMe’s dream of creating and patenting new drugs can come true based on your DNA that you submitted for genealogical purposes, even if they never sell it to anyone else.

23andMe is now looking at expanding beyond the development of DNA testing and exploring the possibility of developing its own medications. In July, the company raised $79 million to partly fund that effort. Additionally, the funding will likely help the company continue with the development of its new therapeutics division. In March, 23andMe began to delve into the therapeutics market, to create a third pillar behind the company’s personal genetics tests and sales of genetic data to pharmaceutical companies.

Given that the future of genetic genealogy at these two companies seems to be tied to the sale of their customers’ genetic and other information, which, based on the above, is very clearly worth big bucks, I feel that the fact that these companies are selling and utilizing their customers’ information in this manner should be fully disclosed. Even more appropriate, the DNA information should not be sold or utilized for research without the kind of informed consent that would traditionally be used for research subjects.

Within the past few days, I wrote an article, providing specifics and calling on both companies to do the following.

1. To minimally create transparent, understandable verbiage that informs their customers before the end of the purchase process that their DNA will be sold or utilized for unspecified research with the intention of financial gain and that there is no opt-out. However, a preferred plan of action would be a combination of 2 and 3, below.

2. Implement a plan where customer DNA can never be utilized for anything other than to deliver the services to the consumers that they purchased unless a separate, fully informed consent authorization is signed for each research project, without coercion, meaning that the client does not have to sign the consent to obtain any of the DNA testing or services.

3. To immediately stop utilizing the DNA information and results from customers who have already tested until they have signed an appropriate informed consent form for each research project in which their DNA or other information will be utilized.

And Now Ancestry Health – http://dna-explained.com/2015/06/06/and-now-ancestry-health/

The Citizen Science Leadership Award this year goes to Blaine Bettinger for initiating the Shared cM Project, a crowdsourced project which benefits everyone.

Citizen Scientists Continue to Push the Edges of the Envelope with the Shared cM Project

Citizen scientists, in the words of Dr. Doron Behar, “are not amateurs.” In fact, citizen scientists have been contributing mightily and pushing the edge of the genetic genealogy frontier consistently now for 15 years. This trend continues, with new discoveries and new ways of viewing and utilizing information we already have.

For example, Blaine Bettinger’s Shared cM Project was begun in March and continues today. This important project has provided real life information as to the real matching amounts and ranges between people of different relationships, such as first cousins, for example, as compared to theoretical match amounts. This wonderful project produced results such as this:

I don’t think Blaine initially expected this project to continue, but it has and you can read about it, see the rest of the results, and contribute your own data here. Blaine has written several other articles on this topic as well, available at the same link.
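For readers wondering where the "theoretical match amounts" come from, here is a minimal Python sketch of the textbook expectation for full cousins. It is not Blaine's method or data. The function names are hypothetical, and the 6,800 cM total is only a placeholder assumption, since each vendor reports a different total.

```python
# ASSUMPTION: placeholder genome total in centimorgans; vendors differ.
ASSUMED_TOTAL_CM = 6800.0


def expected_shared_fraction(cousin_degree: int) -> float:
    """Theoretical average fraction of autosomal DNA shared by full cousins.

    cousin_degree=1 means first cousins, 2 means second cousins, and so on.
    Full cousins descend from an ancestral couple, giving (1/2) ** (2n + 1).
    """
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or greater")
    return 0.5 ** (2 * cousin_degree + 1)


def expected_shared_cm(cousin_degree: int, total_cm: float = ASSUMED_TOTAL_CM) -> float:
    """Convert the theoretical fraction into centimorgans."""
    return expected_shared_fraction(cousin_degree) * total_cm


for degree in (1, 2, 3):
    fraction = expected_shared_fraction(degree)
    print(f"Cousin degree {degree}: {fraction:.4%} of DNA, "
          f"about {expected_shared_cm(degree):.1f} cM")
```

These are averages only. The whole point of the Shared cM Project is that real matches fall in a range around figures like these.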

Am I Weird or What? – http://dna-explained.com/2015/03/07/am-i-weird-or-what/

I hope more genetic genealogists will compile and contribute this type of real world data as we move forward. If you have compiled something like this, the Surname DNA Journal is peer reviewed and always looking for quality articles for publication.

Privacy, Law Enforcement and DNA

Unfortunately, in May, a situation in which Y DNA was utilized in a murder investigation was reported in a sensationalist “scare” type fashion. This action provided cause, ammunition or an excuse for Ancestry to remove the Sorenson data base from public view.

I find this exceedingly, exceedingly unfortunate. Given Ancestry’s history with obsoleting older data bases instead of updating them, I’m suspecting this was an opportune moment for Ancestry to be able to withdraw this database, removing a support or upgrade problem from their plate and blame the problem on either law enforcement or the associated reporting.

I haven’t said much about this situation, in part because I’m not a lawyer and in part because the topic is so controversial and there is no possible benefit since the damage has already been done. Unfortunately, nothing anyone can say or has said will bring back the Sorenson (or Ancestry) data bases and arguments would be for naught. We already beat this dead horse a year ago when Ancestry obsoleted their own data base. On this topic, be sure to read Judy Russell’s articles and her sources as well for the “rest of the story.”

One of the best ongoing sources for this information is Dienekes’ Anthropology Blog. He covered most of the new articles and there have been several. That’s the good news and the bad news, all rolled into one. http://dienekes.blogspot.com/

I have covered several that were of particular interest to the evolution of Europeans and Native Americans.

I know of several projects involving ancient DNA that are in process now, so 2016 promises to be a wonderful ancient DNA year!

Education

Many, many new people discover genetic genealogy every day and education continues to be an ongoing and increasing need. It’s a wonderful sign that all major conferences now include genetic genealogy, many with a specific track.

The European conferences have done a great deal to bring genetic genealogy testing to Europeans. European testing benefits those of us whose ancestors were European before immigrating to North America. This year, ISOGG volunteers staffed booths and gave presentations at genealogy conferences in Birmingham, England, Dublin, Ireland and in Nyköping, Sweden, shown below, photo compliments of Katherine Borges.

Several great new online educational opportunities arose this year, outside of conferences, for which I’m very grateful.

Family Tree DNA receives the Education Award this year along with a huge vote of gratitude for their 11 years of genetic genealogy conferences. They are the only testing or genealogy company to hold a conference of this type and they do a fantastic job. Furthermore, they sponsor additional educational events by providing the “theater” for DNA presentations at international events such as the Who Do You Think You Are conference in England. Thank you Family Tree DNA.

Family Tree DNA Conference

The Family Tree DNA Conference, held in November, was a hit once again. I’m not a typical genealogy conference person. My focus is on genetic genealogy, so I want to attend a conference where I can learn something new, something leading edge about the science of genetic genealogy – and that conference is definitely the Family Tree DNA conference.

Furthermore, Family Tree DNA offers tours of their lab on the Monday following the conference for attendees, and actively solicits input on their products and features from conference attendees and project administrators.

Family Tree DNA 11th International Conference – The Best Yet – http://dna-explained.com/2015/11/18/2015-family-tree-dna-11th-international-conference-the-best-yet/

In 2014, I presented a wish list for 2015 and it didn’t do very well. Will my 2015 list for 2016 fare any better?

Ancestry restores Sorenson and their own Y and mtDNA data bases in some format or contributes to an independent organization like ISOGG.

Ancestry provides chromosome browser.

Ancestry removes or revamps Timber in order to restore legitimate matches removed by Timber algorithm.

Fully informed consent (per research project) implemented by 23andMe and Ancestry, and any other vendor who might aspire to sell consumer DNA or related information, without coercion, and not as a prerequisite for purchasing a DNA testing product. DNA and information will not be shared or utilized internally or externally without informed consent and current DNA information will cease being used in this fashion until informed consent is granted by customers who have already tested.

Improved ethnicity reporting at all vendors including ancient samples and additional reference samples for Native Americans.

Autosomal Triangulation tools at all vendors.

Big Y and STR integration and analysis enhancement at Family Tree DNA.

Ancestor Reconstruction

Mitochondrial and Y DNA search tools by ancestor and ancestral line at Family Tree DNA.

Improved tree at Family Tree DNA – along with new search capabilities.

23andMe restores lost capabilities, drops price, makes changes and adds features previously submitted as suggestions by community ambassadors.

More tools (This is equivalent to “bring me some surprises” on my Santa list as a kid.)

My own goals haven’t changed much over the years. I still just want to be able to confirm my genealogy, to learn as much as I can about each ancestor, and to break down brick walls and fill in gaps.

I’m very hopeful each year as more tools and methodologies emerge. More people test, each one providing a unique opportunity to match and to understand our past, individually and collectively. Every year genetic genealogy gets better! I can’t wait to see what 2016 has in store.

This week has seen a flurry of new scientific and news articles. What has been causing such a stir? It appears that Australian, or more accurately, Australo-Melanesian DNA has been found in South America’s Native American population. In addition, it has also been found in Aleutian Islanders off the coast of Alaska. In case you aren’t aware, that’s about 8,500 miles as the crow flies. That’s one tired crow. As the person paddles or walks along the shoreline, it’s even further, probably about 12,000 miles.

Whatever the story, it was quite a journey and it certainly wasn’t all over flat land.

This isn’t the first inkling we’ve had. Just a couple weeks ago, it was revealed that the Botocudo remains from Brazil were Polynesian and not admixed with either Native, European or African. This Polynesian ancestry was first discovered via mitochondrial DNA, but full genome sequencing confirmed it and added the twist that they were not admixed – an extremely unexpected finding. This is admittedly a bit confusing, because it implies that there were new Polynesian arrivals in the 1600s or 1700s.

Unlikely as it seems, it obviously happened, so we set that aside as relatively contemporary.

The findings in the papers just released are anything but contemporary.

How and when the Americas were populated remains contentious. Using ancient and modern genome-wide data, we find that the ancestors of all present-day Native Americans, including Athabascans and Amerindians, entered the Americas as a single migration wave from Siberia no earlier than 23 thousand years ago (KYA), and after no more than 8,000-year isolation period in Beringia. Following their arrival to the Americas, ancestral Native Americans diversified into two basal genetic branches around 13 KYA, one that is now dispersed across North and South America and the other is restricted to North America. Subsequent gene flow resulted in some Native Americans sharing ancestry with present-day East Asians (including Siberians) and, more distantly, Australo-Melanesians. Putative ‘Paleoamerican’ relict populations, including the historical Mexican Pericúes and South American Fuego-Patagonians, are not directly related to modern Australo-Melanesians as suggested by the Paleoamerican Model.

The paper included the gene flow and population migration map, above, along with dates.

The scientists sequenced the DNA of 31 living individuals from the Americas, Siberia and Oceania as follows:

Siberian:

Altai – 2

Buryat – 2

Ket – 2

Koryak – 2

Sakha – 2

Siberian Yupik – 2

North American Native:

Tsimshian (number not stated, but by subtraction, it’s 1)

Southern North American, Central and South American Native:

Pima – 1

Huichol – 1

Aymara – 1

Yukpa – 1

Oceania:

Papuan – 14
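The "by subtraction" figure for the Tsimshian can be checked with a quick tally. This is just a sketch of the arithmetic in Python, using only the counts listed above. The variable and function names are my own.

```python
TOTAL_SEQUENCED = 31  # living individuals, as stated above

# Counts as listed above, grouped by region.
stated_counts = {
    "Siberian": 6 * 2,  # six populations, two individuals each
    "Southern North American, Central and South American Native": 4 * 1,
    "Papuan": 14,
}


def remaining(total: int, counts: dict) -> int:
    """Return how many individuals are left after the stated groups."""
    return total - sum(counts.values())


tsimshian_count = remaining(TOTAL_SEQUENCED, stated_counts)
print(f"Tsimshian individuals by subtraction: {tsimshian_count}")
```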

The researchers also state that they utilized 17 specimens from relict groups such as the Pericues from Mexico and Fuego-Patagonians from the southernmost tip of South America. They also sequenced two pre-Columbian mummies from the Sierra Tarahumara in northern Mexico. In total, 23 ancient samples from the Americas were utilized.

They then compared these results with a reference panel of 3053 individuals from 169 populations which included the ancient Saqqaq Greenland individual at 4,000 years of age as well as the Anzick child from Montana from about 12,500 years ago and the Mal’ta child from Siberia at 24,000 years of age.

Not surprisingly, all of the contemporary samples with the exception of the Tsimshian genome showed recent western Eurasian admixture.

As expected, the results confirm that the Yupik and Koryak are the closest Eurasian populations to the Americas. They indicate that there is a “clean split” between the Native American population and the Koryak about 20,000 years ago.

They found that “Athabascans and Anzick-1, but not the Greenlandic Inuit and Saqqaq belong to the same initial migration wave that gave rise to present-day Amerindians from southern North America and Central and South America, and that this migration likely followed a coastal route, given our current understanding of the glacial geological and paleoenvironmental parameters of the Late Pleistocene.”

Evidence of gene flow between the two groups was also found, meaning between the Athabascans and the Inuit. Additionally, they found evidence of post-split gene flow between Siberians and Native Americans which seems to have stopped about 12,000 years ago, which meshes with the time that the Beringia land bridge was flooded by rising seas, cutting off land access between the two land masses.

They state that the results support all Native migration from Siberia, contradicting claims of an early migration from Europe.

The researchers then studied the Karitiana people of South America and determined that the two groups, Athabascans and Karitiana diverged about 13,000 years ago, probably not in current day Alaska, but in lower North America. This makes sense, because the Clovis Anzick child, found in Montana, most closely matches people in South America.

By the Clovis period of about 12,500 years ago, the Native American population had already split into two branches, the northern and southern, with the northern including Athabascan and other groups such as the Chippewa, Cree and Ojibwa. The Southern group included people from southern North America and Central and South America.

Interestingly, while admixture with the Inuit was found with the Athabascan, Inuit admixture was not found among the Cree, Ojibwa and Chippewa. The researchers suggest that this may be why the southern branch, such as the Karitiana are genetically closer to the northern Amerindians located further east than to northwest coast Amerindians and Athabascans.

Finally, we get to the Australian part. The researchers, when trying to sort through the “who is closer to whom” puzzle, found unexpected results. They found that some Native American populations including Aleutian Islanders, Surui (Brazil) and Athabascans are closer to Australo-Melanesians compared to other Native Americans, such as Ojibwa, Cree and Algonquian and Central and South American Purepecha (Mexico), Arhuaco (Colombia) and Wayuu (Colombia, Venezuela). In fact, the Surui are one of the closest populations to East Asians and Australo-Melanesians, the latter including Papuans, non-Papuan Melanesians, Solomon Islanders and hunter-gatherers such as Aeta. The researchers acknowledge these are weak trends, but they are nonetheless consistently present.

Dr. David Reich, from Harvard, a co-author of another paper, also published this past week, says that 2% of the DNA of Amazonians is from Oceania. If that is consistent, it speaks to a founder population in isolation, such that the 2% just keeps getting passed around in the isolated population, never being diluted by outside DNA. I would suggest that is not a weak signal.
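To illustrate why isolation matters for a signal like that 2%, here is a deliberately simple Python sketch. It is my own toy model, not anything from either paper. It assumes that in each generation a fixed share of the gene pool is replaced by outside DNA carrying none of the signal, and it ignores genetic drift and selection entirely. The replacement rate and generation count are hypothetical values chosen only for illustration.

```python
def signal_after(initial: float, outside_rate: float, generations: int) -> float:
    """Expected share of the signal after repeated dilution by outside DNA.

    initial      -- starting share of the gene pool (0.02 for 2%)
    outside_rate -- HYPOTHETICAL share of the gene pool replaced by outside
                    DNA each generation (0.0 means complete isolation)
    generations  -- HYPOTHETICAL number of generations elapsed
    """
    if not 0.0 <= outside_rate <= 1.0:
        raise ValueError("outside_rate must be between 0 and 1")
    if generations < 0:
        raise ValueError("generations must not be negative")
    return initial * (1.0 - outside_rate) ** generations


isolated = signal_after(0.02, outside_rate=0.0, generations=400)
mixing = signal_after(0.02, outside_rate=0.05, generations=400)
print(f"Isolated population: {isolated:.2%}")
print(f"With 5% outside DNA per generation: {mixing:.10%}")
```

Under these assumptions, even a modest inflow of outside DNA erases the signal, while complete isolation leaves the expected share unchanged.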

The researchers propose that the variance in the strength of this Oceanic signal suggests that the introduction of the Australo-Melanesian DNA occurred after the initial peopling of the Americas. The ancient samples cluster with the Native American groups, do not show the Oceanic markers and show no evidence of gene flow from Oceania.

The researchers also included cranial morphology analysis, which I am omitting since cranial morphology seems to have led researchers astray in the past, specifically in the case of Kennewick man.

One of the reasons cranial morphology is such a hotly debated topic is because of the very high degree of cranial variance found in early skeletal remains. One of the theories evolving from the cranial differences involving the populating of the Americas has been that the Australo-Melanesians were part of a separate and earlier migration that gave rise to the earliest Americans who were then later replaced by the Asian ancestors of current day Native Americans. If this were the case, then the now-extinct Fuego-Patagonian samples from the location furthest south on the South American land mass should have included DNA from Oceania, but they didn’t.

The Second Article

A second article published this week, titled “’Ghost population’ hints at long lost migration to the Americas” by Ewen Callaway discusses similar findings, presented in a draft letter to Nature titled “Genetic evidence for two founding populations of the Americas” by Skoglund et al. This second group discovers the same artifact Australo-Melanesian DNA in Native American populations but suggests that it may be from the original migration and settlement event or that there may have been two distinct founding populations that settled at the same time or that there were two founding events.

It’s good to have confirmation and agreement between the two labs who happened across these results independently that the Australo-Melanesian DNA is present in some Native populations today.

Their interpretations and theories about how this Oceanic DNA arrived in some of the Native populations vary a bit, but if you read the details, it’s really not quite as different as it first appears from the headlines. Neither group claims to know for sure, and both discuss possibilities.

Questions remain. For example, if the founding group was small, why, then, don’t all of the Native people and populations have at least some Oceanic markers? The Anzick Child from 12,500 years ago does not. He is most closely related to the tribes in South America, where the Oceanic markers appear with the highest frequencies.

In the Harvard study, the scientists fully genome sequenced 63 individuals without discernable evidence of European or African ancestors in 21 Native American populations, restricting their study to individuals from Central and South America that have the strongest evidence of being entirely derived from a homogenous First American ancestral population.

Their results show that the two Amazonian groups, Surui and Karitiana, are closest to the “Australasian populations, the Onge from the Andaman Islands in the Bay of Bengal (a so-called ‘Negrito’ group), New Guineans, Papuans and indigenous Australians.” Within those groups, the Australasian populations are the only outliers – meaning no African, European or East Asian DNA was found in the Native American people.

When repeating these tests, utilizing blood instead of saliva, a third group was shown to also carry these Oceanic markers – the Xavante, a population from the Brazilian plateau that speaks a language of the Ge group that is different from the Tupi language group spoke by the Karitians and Surui.

The closest populations that these Native people matched in Oceania, shown above on the map from the draft Skoglund letter, were, in order, New Guineans, Papuans and Andamanese. The researchers further state that populations from west of the Andes or north of the Panama isthmus show no significant evidence of an affinity to the Onge from the Andaman Islands with the exception of the Cabecar (Costa Rica).

That’s a very surprising finding, given that one would expect more admixture on the west, which is the side of the continent where the migration occurred.

The researchers then compared the results with other individuals, such as Mal’ta child who is known to have contributed DNA to the Native people today, and found no correlation with Oceanic DNA. Therefore, they surmised that the Oceanic admixture cannot be explained by a previously known admixture event.

They propose that a mystery population they have labeled as “Population Y” (after Ypykuera which means ancestor in the Tupi language family) contributed the Australasian lineage to the First Americans and that it was already mixed into the lineage by the time it arrived in Brazil.

According to their work, Population Y may itself have been admixed, and the 2% of Oceanic DNA found in the Brazilian Natives may be an artifact of between 2 and 85% of the DNA of the Surui, Karitiana and Xavante that may have come from Population Y. They mention that this result is striking in that the majority of the craniums that are more Oceanic in nature than Asiatic, as would be expected from people who migrated from Siberia, are found in Brazil.

They conclude that the variance in the presence or absence of DNA in Native people and remains, and the differing percentages argue for more than one migration event and that “the genetic ancestry of Native Americans from Central and South America cannot be due to a single pulse of migration south of the Late Pleistocene ice sheets from a homogenous source population, and instead must reflect at least two streams of migration or alternatively a long drawn out period of gene flow from a structured Beringian or Northeast Asian source.”

Perhaps even more interesting is the following statement:

“The arrival of population Y ancestry in the Americas must in any scenario have been ancient: while Population Y shows a distant genetic affinity to Andamanese, Australian and New Guinean populations, it is not particularly closely related to any of them, suggesting that the source of population Y in Eurasia no longer exists.”

They further state they find no admixture indication that would suggest that Population Y arrived in the last few thousand years.

So, it appears that perhaps the Neanderthals and Denisovans were not the only people who were our ancestors, but no longer exist as a separate people, only as an admixed part of us today. We are their legacy.

The Take Away

When I did the Anzick extractions, we had hints that something of this sort might have been occurring. For example, I found surprising instances of haplogroup M, which is neither European, African nor Native American, so far as we know today. This may have been a foreshadowing of this Oceanic admixture. It may also be a mitochondrial artifact. Time will tell. Perhaps haplogroup M will turn out to be Native by virtue of being Oceanic and admixed thousands of years ago. There is still a great deal to learn. Regardless of how these haplogroups and Oceanic DNA arrived in Brazil in South America and in the Aleutian Islands off of Alaska, one thing is for sure, it did.

We know that the Oceanic DNA found in the Brazilian people studied for these articles is not contemporary and is ancient. This means that it is not related to the Oceanic DNA found in the Botocudo people, who, by the way, also sport mitochondrial haplogroups that are within the range of Native people, meaning haplogroup B, but have not been found in other Native people. Specifically, haplogroups B4a1a1 and B4a1a1a. Additionally, there are other B4a1a, B4a1b and B4a1b1 results found in the Anzick extract which could also be Oceanic. You can see all of the potential and confirmed Native American mitochondrial DNA results in my article “Native American Mitochondrial Haplogroups” that I update regularly.

We don’t know how or when the Botocudo arrived, but the when has been narrowed to the 1600s or 1700s. We don’t know how or when the Oceanic DNA in the Brazilian people arrived either, but the when was ancient. This means that Oceanic DNA has arrived in South America at least twice and is found among the Native peoples both times.

We know that some Native groups have some Oceanic admixture, and others seem to have none, in particular the Northern split group that became the Cree, Ojibwa, Algonquian, and Chippewa.

We know that the Brazilian Native groups are most closely related to Oceanic groups, but that the first paper also found Oceanic admixture in the Aleutian Islands. The second paper focused on the Central and South American tribes.

We know that the eastern American tribes, specifically the Algonquian tribes are closely related to the South Americans, but they don’t share the Oceanic DNA and neither do the mid-continent tribes like the Cree, Ojibwa and Chippewa. The only Paleolithic skeleton that has been sequenced, Anzick, from 12,500 years ago in Montana also does not carry the Oceanic signature.

In my opinion, the disparity between who does and does not carry the Oceanic signature suggests that the source of the Oceanic DNA in the Native population could not have been a member of the first party to exit out of Beringia and settle in what is now the Americas. Given that this had to be a small party, all of the individuals would have been thoroughly admixed with each other’s ancestral DNA within just a couple of generations. It would have been impossible for one ancestor’s DNA to only be found in some people. To me, this argues for one of two scenarios.

First, a second immigration wave that joined the first wave but did not admix with some groups that might have already split off from the original group such as the Anzick/Montana group.

Second, multiple Oceanic immigration events. We still have to consider the possibility that there were multiple events that introduced Oceanic DNA into the Native population. In other words, perhaps the Aleutian Islands Oceanic DNA is not from the same migration event as the Brazilian DNA which we know is not from the same event as the Botocudo. I would very much like to see the Oceanic DNA appear in a migration path of people, not just in one place and then the other. We need to connect the dots.

What this new information does is to rule out the possibility that there truly was only one wave of migration – one group of people who settled the Americas at one time. More likely, at least until the land bridge submerged, is that there were multiple small groups that exited Beringia over the 8,000 or so years it was inhabitable. Maybe one of those groups included people from Oceania. Someplace, sometime, as unlikely as it seems, it happened.

The amazing thing is that it’s more than 10,000 miles from Australia to the Aleutian Islands, directly across the Pacific. Early adventurers would have likely followed a coastal route to be sustainable, which would have been significantly longer. The fact that they survived and sent their DNA on a long adventure from Australia to Alaska to South America – and it’s still present today is absolutely amazing.

We know we still have a lot to learn and this is the tip of a very exciting iceberg. As more contemporary and ancient Native people have their full genomes sequenced, we’ll learn more answers. The answer is in the DNA. We just have to sequence enough of it and learn how to understand the message being delivered.

This week, published in Science, we find another ancient DNA full genome sequence from Siberia in an article titled “Genomic structure in Europeans dating back at least 36,200 years” by Seguin-Orlando et al. This sample, partially shown above, is quite old and closely related to the Mal’ta child, also found in Siberia from about 24,000 years ago. Interestingly enough, K14 carries more Neanderthal DNA than current Europeans. This skeleton was actually excavated in 1954, but was only recently genetically analyzed.

From the paper, this map above shows the locations of recently analyzed ancient DNA samples. Note that even though K14 and Mal’ta child are similar, they are not located in close geographic proximity.

Also from the paper, this chart of population clusters is quite interesting, because we can see which of these ancient samples share some heritage with today’s indigenous American populations, shown in grey. UPGH=Upper Paleolithic Hunter-Gatherer, MHG=Mesolithic Hunter Gatherer, which is later in time than the Paleolithic, and NEOL=Neolithic, indicating the farming population that arrived in Europe approximately 7,000-10,000 years ago from the Middle East.

You can see that the Neolithic samples show no trace of ancestry with today’s Native people, but both pre-Neolithic Hunter-Gatherer cultures show some amount of shared ancestry with Native people. However, to date, MA1, the Malta child is the most closely related and carries the most DNA in common with today’s Native people.

Also from the K14 paper, you can see on the map below where K14 matches current worldwide and European populations, where the warmer colors, i.e. red, indicate a closer match.

Of interest to genealogists and population geneticists, K14’s mitochondrial haplogroup is U2 and his Y haplogroup is C-M130, the same as LaBrana, a late Mesolithic hunter-gatherer found in northern Spain. Haplogroup C is, of course, one of the base haplogroups for the Native people of the Americas.

The K14 paper further fleshes out the new peopling of Europe diagram discussed in my Peopling of Europe article.

The origin of contemporary Europeans remains contentious. We obtain a genome sequence from Kostenki 14 in European Russia dating to 38,700 to 36,200 years ago, one of the oldest fossils of Anatomically Modern Humans from Europe. We find that K14 shares a close ancestry with the 24,000-year-old Mal’ta boy from central Siberia, European Mesolithic hunter-gatherers, some contemporary western Siberians, and many Europeans, but not eastern Asians. Additionally, the Kostenki 14 genome shows evidence of shared ancestry with a population basal to all Eurasians that also relates to later European Neolithic farmers. We find that Kostenki 14 contains more Neandertal DNA that is contained in longer tracts than present Europeans. Our findings reveal the timing of divergence of western Eurasians and East Asians to be more than 36,200 years ago and that European genomic structure today dates back to the Upper Paleolithic and derives from a meta-population that at times stretched from Europe to central Asia.

Neanderthal man, reconstructed at the National Museum of Nature and Science in Tokyo

The photo below shows a step in the process of extracting DNA from ancient bones at Max Planck.

Our Y and mitochondrial DNA haplogroups take us back thousands of years in time, but at some point, where and how people were settling and intermixing becomes fuzzy. Ancient DNA can put the people of that time and place in context. We have discovered that current populations do not necessarily represent the ancient populations of a particular locale.

Recent information discovered from ancient burials tells us that the people of Europe descend from a 3 pronged model. Until recently, it was believed that Europeans descended from Paleolithic hunter-gatherers and Neolithic farmers, a two-pronged model.

Previously, it was believed that Europe was peopled by the ancient hunter-gatherers, the Paleolithic, who originally settled in Europe beginning about 45,000 years ago. At this time, the Neanderthal were already settled in Europe but weren’t considered to be anatomically modern humans, and it was believed, incorrectly, that the two groups did not interbreed. These hunter-gatherers were the people who settled in Europe before the last major ice age, the Younger Dryas, taking refuge in the southern portions of Europe and Eurasia, and repeopling the continent after the ice receded, about 12,000 years ago. By that time, the Neanderthals were gone, or as we now know, at least partially assimilated.

This graphic shows Europe during the last ice age.

The second settlement wave, the agriculturalist farmers from the Near East either overran or integrated with the hunter-gatherers in the Neolithic period, depending on which theory you subscribe to, about 8000-10,000 years ago.

2012 – Ancient North Eurasian (ANE) Hints

Beginning in 2012, we began to see hints of a third lineage that contributed to the peopling of Europe as well, from the north. Buried in the 2012 paper, Estimating admixture proportions and dates with ADMIXTOOLS by Patterson et al, was a very interesting tidbit. This new technique showed a third population, referred to by many as a “ghost population”, because no one knew who they were, that contributed to the European population.

This revelation caused quite a stir, because it was reported that the ancestor of Native Americans in Asia was 30% Western Eurasian. Unfortunately, in some cases, this was immediately interpreted to mean that Native Americans had come directly from Europe, which is not what this paper said, nor implied. It was also inferred that the haplogroups of this child, R* (Y) and U (mtDNA) were Native American, which is also incorrect. To date, there is no evidence for migration to the New World from Europe in ancient times, but that doesn’t mean we aren’t still looking for that evidence in early burials.

What this paper did show was that Europeans and Native Americans shared a common ancestor, and that the Siberian population had contributed to the European population as well as the Native American population. In other words, descendants settled in both directions, east and west.

The most fascinating aspect of this paper was the match distribution map, below, showing which populations Malta child matched most closely.

As you can see, MA-1, Malta Child, matches the Native American population most closely, followed by the northern European and Greenland populations. The further south in Europe and Asia, the more distant the matches and the darker the blue.

Based on the various theories and questions, ancient burials were enlightening.

In 2013, there were a total of 32 burials from the Neolithic period, after farmers arrived from the Near East, and haplogroup R did not appear. Instead, haplogroups G, I and E were found.

What this tells us is that haplogroup R, as well as other haplogroups, weren’t present in Europe at this time. Having said this, these burials were in only 4 locations and, although unlikely, R could be found in other locations.

Last year, Dr. Hammer concluded that haplogroup R was not found in the Paleolithic and likely arrived with the Neolithic farmers. That shook the community, as it had been widely believed that haplogroup R was one of the founding European haplogroups.

While this provided tantalizing information, we still needed additional evidence. No paper has yet been published that addresses these findings. The mass full sequencing of the Y chromosome over this past year with the introduction of the Big Y will provide extremely valuable information about the Y chromosome and eventually, the migration path into and across Europe.

In discussing the paper, David Reich from Harvard, one of the co-authors, said, “Prior to this paper, the models we had for European ancestry were two-way mixtures. We show that there are three groups. This also explains the recently discovered genetic connection between Europeans and Native Americans. The same Ancient North Eurasian group contributed to both of them.”

We sequenced the genomes of a ~7,000-year-old farmer from Germany and eight ~8,000-year-old hunter-gatherers from Luxembourg and Sweden. We analysed these and other ancient genomes1, 2, 3, 4 with 2,345 contemporary humans to show that most present-day Europeans derive from at least three highly differentiated populations: west European hunter-gatherers, who contributed ancestry to all Europeans but not to Near Easterners; ancient north Eurasians related to Upper Palaeolithic Siberians3, who contributed to both Europeans and Near Easterners; and early European farmers, who were mainly of Near Eastern origin but also harboured west European hunter-gatherer related ancestry. We model these populations’ deep relationships and show that early European farmers had ~44% ancestry from a ‘basal Eurasian’ population that split before the diversification of other non-African lineages.

This paper utilized ancient DNA from several sites and composed the following genetic contribution diagram that models the relationship of European to non-European populations.

Present day samples are colored purple, ancient in red and reconstructed ancestral populations in green. Solid lines represent descent without admixture and dashed lines represent admixture. WHG=western European hunter-gatherer, EEF=early European farmer and ANE=ancient north Eurasian

2014 – Michael Hammer on Europe’s Ancestral Population

For anyone interested in ancient DNA, 2014 has been a banner year. At the Family Tree DNA conference in Houston, Texas, Dr. Michael Hammer brought the audience up to date on Europe’s ancestral population, including the newly sequenced ancient burials and the information they are providing.

Dr. Hammer said that ancient DNA is the key to understanding the historical processes that led up to the modern. He stressed that we need to be careful inferring that the current DNA pattern is reflective of the past because so many layers of culture have occurred between then and now.

Until recently, it was assumed that the genes of the Neolithic farmers replaced those of the Paleolithic hunter-gatherers. Ancient DNA is suggesting that this is not true, at least not on a wholesale level.

The theory, of course, is that we should be able to see them today if they still exist. The migration and settlement pattern in the slide below was from the theory set forth in the 1990s.

In 2013, Dr. Hammer discussed the theory that haplogroup R1b spread into Europe with the farmers from the Near East in the Neolithic. This year, he expanded upon that topic based on the new findings from ancient burials.

Last year, Dr. Hammer discussed 32 burials from 4 sites. Today, we have information from 15 ancient DNA sites and many of those remains have been full genome sequenced.

Information from papers and recent research suggests that Europeans also have genes from a third source lineage, nicknamed the “ghost population of North Eurasia.”

Scientists are finding a signal of northeast Asian related admixture in northern Europeans, first suggested in 2012. This was confirmed with the sequencing of Malta child and then in a second sequencing of Afontova Gora2 in south central Siberia.

We have complete genomes from nine ancient Europeans – Mesolithic hunter gatherers and Neolithic farmers. Hammer refers to the Mesolithic here, which is a time period between the Paleolithic (hunter gatherers with stone tools) and the Neolithic (farmers).

In the PCA charts, shown above, you can see that Europeans and people from the Near East cluster separately, except for a bridge formed by a few Mediterranean and Jewish populations. On the slide below, the hunter-gatherers (WHG) and early farmers (EEF) have been overlaid onto the contemporary populations along with the MA-1 (Malta Child) and AG2 (Afontova Gora2) representing the ANE.

When sequenced, separate groups formed, including western hunter-gatherers and early European farmers, a group that includes Otzi, the iceman. A third group is the north-south clinal variation, with ANE contributing to northern European ancestry. The groups are represented by the circles, above.

Dr. Hammer said that the team who wrote the “Ancient Human Genomes” paper just recently published used an F3 test, results shown above, which shows whether populations are an admixture of a reference population based on their entire genome. He mentioned that this technique goes well beyond PCA.
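For readers who like to see the arithmetic, here is a minimal sketch of the idea behind an F3 test. It is not the code used in the paper, and the allele frequencies below are made up purely for illustration. The statistic averages (c - a) * (c - b) across SNPs, where c is the allele frequency in the population being tested and a and b are the frequencies in two reference populations. A clearly negative value suggests the tested population is a mixture of groups related to the two references. Real analyses, such as those run with the ADMIXTOOLS software mentioned earlier, add sample-size corrections and significance testing that this sketch leaves out, and the function name `f3` here is simply my own label.

```python
from statistics import mean


def f3(target, ref_a, ref_b):
    """Naive f3(target; ref_a, ref_b): mean over SNPs of (c - a) * (c - b).

    Each argument is a list of allele frequencies, one entry per SNP.
    No sample-size correction is applied (an assumption of this sketch).
    """
    if not (len(target) == len(ref_a) == len(ref_b)):
        raise ValueError("all populations need the same number of SNPs")
    return mean((c - a) * (c - b) for c, a, b in zip(target, ref_a, ref_b))


# Made-up allele frequencies at five SNPs (illustrative only, not real data).
pop_a = [0.10, 0.80, 0.30, 0.60, 0.95]
pop_b = [0.70, 0.20, 0.90, 0.10, 0.40]

# A hypothetical population built as a 50/50 blend of pop_a and pop_b.
mixed = [0.5 * a + 0.5 * b for a, b in zip(pop_a, pop_b)]

# The blended population gives a negative value, the signal of admixture.
print("f3(mixed; A, B) is negative:", f3(mixed, pop_a, pop_b) < 0)

# An unmixed reference tested against the blend gives a positive value.
print("f3(A; mixed, B) is positive:", f3(pop_a, mixed, pop_b) > 0)
```

The blend comes out negative because, at every SNP, its frequency sits between the two references, so the two differences in the product always have opposite signs.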

Mapped onto populations today, most European populations are a combination of the three early groups. However, the ANE is not found in the ancient Paleolithic or Neolithic burials. It doesn’t arrive until later.

This tells us that there was a migration event 45,000 years ago from the Levant, followed about 7000 years ago by farmers from the Near East, and that ANE entered the population some time after that. All Europeans today carry some amount of ANE, but ancient burials do not.

These burials also show that southern Europe has more Neolithic farmer genes and northern Europe has more Paleolithic/Mesolithic hunter-gatherer genes.

Pigmentation for light skin came with farmers – blue eyes existed in hunter gatherers even though their skin was dark.

Dr. Hammer created these pie charts of the Y and mitochondrial haplogroups found in the ancient burials as compared to contemporary European haplogroups.

The pie chart on the left shows the haplogroups of the Mesolithic burials, all haplogroup I2 and subclades. Note that in the current German population today, no I2a1b and no I1 was found. The chart on the right shows current Germans where haplogroup I is a minority.

Therefore, we can conclude that haplogroup I is a good candidate to be identified as a Paleolithic/Mesolithic haplogroup.

This information shows that the past is very different from today.

In 2014 we have many more burials that have been sequenced than last year, as shown on the map above.

Green represents Neolithic farmers, red are Mesolithic hunter-gatherers, brown at bottom right represents more recent samples from the Metal Age.

There are a total of 48 Neolithic burials where haplogroup G dominates. In the Mesolithic, there are a total of six haplogroup I.

This suggests that haplogroup I is a good candidate to be the father of the Paleolithic/Mesolithic and haplogroup G, the founding father of the Neolithic.

In addition to haplogroup G in the Neolithic, one sample each of E1b1b1 (M35) and C was also found in Spain. E1b1b1 isn’t surprising given its north African genesis, but C was quite interesting.

The Metal ages, which according to wiki begin about 3300BC in Europe, are where haplogroup R, along with I1, first appears.

Please note that the diffusion of metallurgy map above is not part of Dr. Hammer’s presentation. I have added it for clarification.

Nothing is constant in Europe. The Y DNA was very upheaved, as indicated on the graphic above. Mitochondrial DNA shifted from pre-Neolithic to Neolithic which isn’t terribly different from the present day.

Dr. Hammer did not say this, but looking at the Y versus the mtDNA haplogroups, I wonder if this suggests that indeed there was more of a replacement of the males in the population, but that the females were more widely assimilated. This would certainly make sense, especially if the invaders were warriors and didn’t have females with them. They would have taken partners from the invaded population.

Haplogroup G represents the spread of farming into Europe.

The most surprising revelation is that haplogroup R1b appears to have emerged after the Neolithic agriculture transition. Given that just three years ago we thought that haplogroup R1b was one of the original European settlers thousands of years ago, based on the prevalence of haplogroup R in Europe today, at about 50%, this is a surprising turn of events. Last year’s revelation that R was maybe only 7000-8000 years old in Europe was a bit of a whammy, but the age of R in Europe in essence just got halved again and the source of R1b changed from the Near East to the Asian steppes.

Obviously, something conferred an advantage to these R1b men. Given that they arrived in the early Metal age, was it weapons and chariots that enabled the R1b men who arrived to quickly become more than half of the population?

The Bronze Age saw the first use of metal to create weapons. Warrior identity became a standard part of daily life. Celts ranged over Europe and were the most dominant iron age warriors. Indo-European languages and chariots arrived from Asia about this time.

The map above shows the Hallstatt and La Tène Celtic cultures in Europe, about 600BC. This was not a slide presented by Dr. Hammer.

Haplogroup R1b was not found in an ancient European context prior to a Bell Beaker period burial in Germany 4.8-4.0 kya (thousand years ago, i.e. 4,800-4,000 years ago). R1b arrives about 4.6 kya and is also found in a Corded Ware culture burial in Germany. A late introduction of these lineages which now predominate in Europe corresponds to the autosomal signal of the entry of Asian and Eastern European steppe invaders into western Europe.

Local expansion occurred in Europe of R1b subgroups U106, L21 and U152.

A current haplogroup R distribution map that reflects the findings of this past year is shown above.

Haplogroup I is interesting for another reason. It looks like haplogroup I2a1b (M423) may have been replaced by I1 which expanded after the Mesolithic.

One of the benefits of ancient DNA genome processing is that we will be able to map current trees into maps of old SNPs and be able to tell who we match most closely.

Autosomal DNA can also be mapped to see how much of our DNA is from which ancient population.

Dr. Hammer mapped the percentages of European Mesolithic/Paleolithic hunter-gatherers in blue, Neolithic Farmers from the Near East in magenta and Asian Steppe Invaders representing ANE in yellow, over current populations. Note the ancient DNA samples at the top of the list. None of the burials except for Malta Child carry any yellow, indicating that the ANE entered the European population with the steppe invaders; the same group that brought us haplogroup R and possibly I1.

Dr. Hammer says that ANE was introduced to and assimilated into the European population by one or more incursions. We don’t know today if ANE in Europeans is a result of a single blast event or multiple events. He would like to do some model simulations and see if it is related to timing and arrival of swords and chariots.

We know too that there are more recent incursions, because we’re still missing major haplogroups like J.

The further east you go, meaning the closer to the steppes and Volga region, the less well this fits the known models. In other words, we still don’t have the whole story.

At the end of the presentation, Michael was asked if the whole genomes sequenced are also obtaining Y STR data, which would allow us to compare our results on an individual versus a haplogroup level. He said he didn’t know, but he would check.

Family Tree DNA was asked if they could show a personal ancient DNA map in myOrigins, perhaps as an alternate view. Bennett took a vote and that seemed pretty popular, which he interpreted as a yes, we’d like to see that.

In Summary

The advent of and subsequent drop in the price of whole genome sequencing combined with the ability to extract ancient DNA and piece it back together have provided us with wonderful opportunities. I think this is just the proverbial tip of the iceberg, and I can’t wait to learn more.

If you are interested in other articles I’ve written about ancient DNA, check out these links:

This slide, by Robert Baber, pretty well sums up our group obsession and what we focus on every year at the Family Tree DNA administrator’s conference in Houston, Texas.

Getting to Houston, this year, was a whole lot easier than getting out of Houston. They had storms yesterday and many of us spent the entire day becoming intimately familiar with the airport. Jennifer Zinck, of Ancestor Central, is still there today and doesn’t have a flight until late.

And this is how my day ended, after I finally got out of Houston and into my home airport. This isn’t at the airport, by the way. Everything was fine there, but I made the apparent error of stopping at a Starbucks on the way home. This is the parking lot outside an hour or so later. What can I say? At least I had my coffee, and AAA rocks, as did the tow truck driver and my daughter for getting out of bed to come and rescue me!!! Hmmm, I think maybe things have gone full circle. I remember when I used to go and rescue her:)

So far, today hasn’t improved any, so let’s talk about something much more pleasant…the conference itself.

Resources

One of the reasons I mentioned Jennifer Zinck, aside from the fact that she’s still stuck in the airport, is because she did a great job actually covering the conference as it happened. Since I had some time yesterday to visit with her since our gates weren’t terribly far apart, I asked her how she got that done. I took notes too, and photos, but she turned out a prodigious amount of work in a very short time. While I took a lightweight MacBook Air, she took her regular PC that she is used to typing on, and she literally transcribed as the sessions were occurring. She just added her photos later, and since she was working on a platform that she was familiar with, she could crop and make the other adjustments you never see but we perform behind the scenes before publishing a photo.

On the other hand, I struggled with a keyboard that works differently and is a different size than I’m used to as well as not being familiar with the photo tools to reduce the size of pictures, so I just took rough notes and wrote the balance later. Having familiar tools makes such a difference. I think I’ll carry my laptop from now on, even though it is much heavier. Kudos to Jennifer!

I was initially going to summarize each session, but since Jen did such a good job, I’m posting her links. No need to recreate a wheel that doesn’t need to be recreated.

ISOGG, the International Society of Genetic Genealogy is not affiliated with Family Tree DNA or any testing company, but Family Tree DNA is generous enough to allow an ISOGG meeting on Sunday before the first conference session.

Those of you who are members of the ISOGG Yahoo group for project administrators can view photos posted by Katherine Borges in that group and there are also some postings on the Facebook ISOGG group as well.

Now that you have the links for the summaries, what I’d like to do is to discuss some of the aspects I found the most interesting.

The Mix

When I attended my first conference 10 years ago, I somehow thought that for the most part, the same group of people would be at the conferences every year. Some were, and in fact, a handful of the 160+ people attending this conference have attended all 10 conferences. I know of two others for certain, but there were maybe another 3 or so who stood up when Bennett asked for everyone who had been present at all 10 conferences to stand.

Doug Mumma, the very first project administrator was with us this weekend, and still going strong. Now, if Doug and I could just figure out how we’re related…

Some of the original conference group has passed on to the other side where I’m firmly convinced that one of your rewards is that you get to see all of those dead ends of your tree. If we’re lucky, we get to meet them as well and ask all of those questions we have on this side. We remember our friends fondly, and their departure sadly, but they enriched us while they were here and their memories make us smile. I’m thinking specifically of Kenny Hedgepath and Leon Little as I write this, but there have been others as well.

Part of the definition of a community is that people come and go, through births, deaths and moves.

This year, about half of the attendees had never attended a conference before. I was very pleased to see this turn of events – because in order to survive, we do need new people who are as crazy as we are…er….I mean as dedicated as we are.

I asked people about their favorite part of the conference or their favorite session. I was surprised at the number of people who said lunches and dinners. Trust me, the food wasn’t that wonderful, so I asked them to elaborate. In essence, the most valuable aspect of the conference was working with and talking to other administrators.

It’s not like we don’t talk online, but there is somehow a difference between online communications and having a group discussion, or a one-on-one discussion. Laptops were out and in use everyplace, along with iPads and other tools. It was so much fun to walk by tables and hear snippets of conversations like “the mutation at location 309.1….” and “null marker at 425” and “I ordered a kit for my great uncle…..”

I agree, as well. I had pre-arranged two dinners before arriving in order to talk with people with whom I share specific interests. At lunches, I either tried to sit with someone I specifically needed to talk to, or I tried to meet someone new.

I also asked people about their specific goals for the next year. Some people had a particular goal in mind, such as a specific brick wall that needs focus. Some, given that we are administrators, had wider-ranging project based goals, like Big Y testing certain family groups, and a surprising number had the goal of better utilizing the autosomal results.

Perhaps that’s why there were two autosomal sessions, an introduction by Jim Bartlett and then Tim Janzen’s more advanced session.

Autosomal DNA Results

Note the cool double helix light fixture behind the speakers.

Tim specifically mentioned two misconceptions which I run across constantly.

Misconception 1 – A common surname means that’s how you match. Just because you find a common surname doesn’t mean that’s your DNA match. This belief is particularly prevalent in the group of people who test at Ancestry.com.

Misconception 2 – Your common ancestor has to be within the past 6 generations. Not true, many matches can be 6-10th cousins because there are so many descendants of those early ancestors, even as many as 15 generations back.

Tim also mentioned that endogamous relationships are a tough problem with no easy answer. Examples include Polynesians, Ashkenazi Jews, Low German Mennonites, Acadians, Amish and island populations. Do I ever agree with him! I have Brethren, Mennonite and Acadian in the same parent’s line.

Tim has graciously made his entire presentation available for download.

There are probably a dozen or so of us that are actively mapping our ancestors, and a huge backlog of people who would like to. As Tim pointed out with one of his slides, this is not an easy task nor is it for the people who simply want to receive “an answer.”

I will also add that we “mappers” are working with and actively encouraging Family Tree DNA to develop tools so that the mapping is less spreadsheet manual work and more automated, because it certainly can be.

Upload GEDCOM Files

If you haven’t already, upload your GEDCOM to Family Tree DNA. This is becoming an essential part of autosomal matching. Furthermore, Family Tree DNA will utilize this file to construct your surname list, and that will help immensely in determining common surnames and your common ancestor with your Family Finder matches. If you have sponsored tests for cousins, then upload a GEDCOM file for them or at least construct a basic tree on their Family Tree DNA page.

Ethics

Family Tree DNA always tries to provide a speaker about ethics, and the only speakers I’ve ever felt understood anything about what we want to do are Judy Russell and Blaine Bettinger. I was glad to see Blaine presenting this year.

The essence of Blaine’s speech is that ethics isn’t about law. Law is cut and dried. Ethics isn’t, and there are no ethics police.

Sometimes our decisions are colored necessarily by right and wrong. Sometimes those decisions are more about the difference between a better and a worse way.

As a community, we want to reduce negative press coverage and increase positive coverage. We want to be proactive, not reactive.

Blaine stresses that while informed consent is crucial, DNA doesn’t reveal secrets that aren’t also revealed by other forms of genealogical research. However, DNA often reveals more recent secrets, such as adoptions and NPEs, so it’s possibly more sensitive.

Two things need to govern our behavior. First, we need to do only things that we would be comfortable seeing above the fold in the New York Times. Second, understand that we can’t make promises about topics like anonymity or about the absence of medical information, because we don’t know what we don’t know.

The SNP Tsunami

One of my concerns has been and remains the huge number of new SNPs that have been discovered over the past year or so with the Big Y by Family Tree DNA and corresponding tests from other vendors.

When I say concern, I’m thrilled about this new technology and the advances it is allowing us to make as a community to discover and define the evolution of haplogroups. My concern is that the amount of data is overwhelming. However, we are working through that, thanks to the hours and hours of volunteer work by haplogroup administrators and others.

Alice Fairhurst, who volunteers to maintain the ISOGG haplotree, mentioned that she has added over 10,000 SNPs to the Y tree this year alone, bringing the total to over 14,000. Those SNPs are fully vetted and placed. There are many more in process and yet more still being discovered. On the first page of the Y SNP tree, the list of SNP sources and other critical information, such as the criteria for a SNP to be listed, is provided.

So, if you’re waiting for that next haplotree poster, give it up because there isn’t a printing press that big, unless you want wallpaper.

These slides are from Alice’s presentation. The ISOGG tree provides an invaluable resource for not only the genetic genealogy community, but also researchers world-wide.

As one example of how the SNP tsunami has affected the Y tree, Alice provided the following summary of R-U106, one of the two major branches of haplogroup R.

From the ISOGG 2006 Y tree, this was the entire haplogroup R Y tree. You can see U106 near the bottom with 3 sub-branches. While this probably makes you chuckle today, remember that 2006 was only 8 years ago and that this tree didn’t change much for several years.

2007 was the same.

2008 shows 5 subclades and one of the subclades had 2 subclades.

2009 showed a total of 12 sub-branches and 2010 added one more.

2011 however, showed a large change. U106 in 2011 had 44 subgroups total and became too large to show on one screen shot. 2012 shows 99 subclades, if I counted accurately. The 2014 U106 tree is shown below.

There’s another slide too, but I didn’t manage to get the picture. You get the idea though…

As you can imagine, for Family Tree DNA, trying to keep up with all of the haplogroups, not just one subgroup like U106, is a gargantuan task that is constantly changing, like hourly. Their Y tree is currently the National Geographic tree, and while they would like to update it, I’m sure, the definition of “current tree” is in a constant state of flux. Literally, Mike Walsh, one of the admins in the R-L21 group, uploads a new tree spreadsheet several times every day.

In order to attempt to deal with this, and to encourage people who don’t want to do a Big Y discovery type test, but do want to ferret out their location on their assigned portion of the tree, Family Tree DNA is reintroducing the Backbone tests.

They are starting with M222, also known as the Niall of the 9 Hostages haplogroup which is their beta for the new product and new process. You can see the provisional tree and results in the two slides they provided, below. I apologize for the quality, but it was the best I could do.

Haplogroup administrators are going to be heavily involved in this process. Family Tree DNA is putting SNP panels together that will help further define the tree and where various SNPs that have been recently discovered, and continue to be discovered, will fall on the tree.

As Big Y tests arrive, haplogroup project administrators typically assemble a spreadsheet of the SNPs and provisionally where they fall on the tree, based on the Big Y results.

What Bennett asked is for the admins to work with Family Tree DNA to assemble a testing panel based on those results. The goal is for the cost to be between $1.50 and $2 (US) for each SNP in the panel, which will reduce the one-off SNP testing and provide a much more complete and productive result at a far reduced price as compared to the current $29 or $39 per individual SNP.
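To put those prices in perspective, here is a rough comparison sketch. The per-SNP prices are the figures quoted above. The 50-SNP panel size is purely a hypothetical number I chose for illustration, since actual panel sizes will vary by haplogroup.

```python
def cost_range(snp_count, low_price, high_price):
    """Return (low, high) total cost in US dollars for testing snp_count SNPs."""
    return snp_count * low_price, snp_count * high_price


PANEL_SIZE = 50  # hypothetical panel size, for illustration only

# Target panel pricing: $1.50 to $2 per SNP
panel = cost_range(PANEL_SIZE, 1.50, 2.00)

# Current one-off pricing: $29 or $39 per individual SNP
one_off = cost_range(PANEL_SIZE, 29, 39)

print(f"Panel of {PANEL_SIZE} SNPs: ${panel[0]:.2f} to ${panel[1]:.2f}")
print(f"Same SNPs ordered one at a time: ${one_off[0]:.2f} to ${one_off[1]:.2f}")
```

Under that assumed panel size, the panel would run $75 to $100, versus $1,450 to $1,950 for the same SNPs ordered individually.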

If you are a haplogroup administrator, get in touch with Family Tree DNA to discuss your desired backbone panels. New panels, when it’s your turn, will take about 2 weeks to develop.

Keep in mind that the following SNPs, according to Bennett, are not optimal for panels:

Palindromic regions

Often mutating regions designated as .1, .2, etc.

SNPs in STRs

Nir Leibovich, the Chief Business Officer, also addressed the future and the Big Y to some extent in his presentation.

Utilizing the Big Y for Genealogy

In my case, during the last sale, I ordered several Big Y tests for my Estes family line because I have several genealogically documented lines from the original Estes family in Kent, England through our common ancestor, Robert Estes born in 1555 and his wife Anne Woodward. The participants also agreed to extend their markers to 111 markers. When the results are back, we’ll be able to compare them on a full STR marker set, and also their SNPs. Hopefully, they will match on their known SNPs and there will be some novel variants that can serve as line marker mutations.

We need more Big Y tests of these types of genealogically confirmed trees that have different sons’ lines from a distant common ancestor to test descendant lines. This will help immensely to determine the actual, not imputed, SNP mutation rate and allow us to extrapolate the ages of haplogroups more accurately. Of course, it also goes without saying that it helps to flesh out the trees.

I personally expect the next couple of years will be major years of discovery. Yes, the SNP tsunami has hit land, but it’s far from over.

Research and Development

David Mittleman, Chief Scientific Officer, mentioned that Family Tree DNA now has their own R&D division where they are focused on how to best analyze data. They have been collaborating with other scientists. A haplogroup G1 paper will be published shortly which states that SNP mutation rates equate to Sanger data.

FTDNA wants to get Big Y data into the public domain. They have set up consent for this to be done by uploading into NCBI. Initially they sent a survey to a few people that sampled the interest level. Those who were interested received a release document. If you are interested in allowing FTDNA to utilize your DNA for research, be it mitochondrial, Y or autosomal, please send them an e-mail stating such.

Don’t Forget About Y Genealogy Research

It’s very easy for us to get excited about the research and discovery aspect of DNA – and the new SNPs and extending haplotrees back in time as far as possible, but sometimes I get concerned that we are forgetting about the reason we began doing genetic genealogy in the first place.

Robert Baber’s presentation discussed the process of how to reconstruct a tree utilizing both genealogy and DNA results. It’s important to remember that the reason most of our participants test is to find their ancestors, not, primarily, to participate in the scientific process.

Robert has succeeded in reconstructing 110 or 111 markers of the oldest known Baber ancestor, shown above. I wrote about how to do this in my article titled, Triangulation for Y DNA.

Not only does this allow us to compare everyone with the ancestor’s DNA, it also provides us with a tool to fit individuals who don’t know their specific genealogical line into the tree relatively accurately. When I say relatively, the accuracy is based on line marker mutations that have, or haven’t, happened within that particular family.

Jim illustrated how to do this as well, and his methodology is available at the link on his slide, below.

I had to laugh. I’ve often wondered what our ancestors would think of us today. Robert said that 11 generations after Edward Baber died, he flew over the church where Edward was buried and wondered what Edward would have thought about what we know and do today – cars, airplanes, DNA, radio, TV, etc. If someone looked in a crystal ball and told Edward what the future held 11 generations later, he would have thought that they were stark raving mad.

Eleven generations from my birth is roughly the year 2280. I’m betting we won’t be trying to figure out who our ancestors were through this type of DNA analysis then. This is only a tiny stepping stone to an unknown world, as different to us as our world is to Edward Baber and all of our ancestors who lived in a time where we know their names but their lives and culture are entirely foreign to ours.

Publications

When the Journal of Genetic Genealogy was active, I, along with other citizen scientists, published regularly. The benefit of the journal was that it was peer reviewed, which assured some level of accuracy and, because of that, credibility, and it was viewed by the scientific community as such. My co-authored works published in JOGG as well as others have been cited by experts in the academic community. In other words, it was a very valuable journal. Sadly, it has fallen by the wayside and nothing has been published since 2011. A new editor was recruited, but given their academic load, they have not stepped up to the plate. For the record, I am still hopeful for a resurrection, but in the meantime, another opportunity has become available for genetic genealogists.

Brad Larkin has founded the Surname DNA Journal, which, like JOGG, is free to both authors and subscribers. In case you weren’t aware, most academic journals aren’t. While this isn’t a large burden for a university, fees ranging from just over $1000 to $5000 are beyond the budget of genetic genealogists. Just think of how many DNA tests one could purchase with that money.

Brad has issued a call for papers. These papers will be peer reviewed, similarly to how they were reviewed for JOGG.

Take a look at the articles published in this past year, since the founding of Surname DNA Journal.

The citizen science community needs an avenue to publish and share. Peer reviewed journals provide us with another level of credibility for our work. Sharing is clearly the lynchpin of genetic genealogy, as it is with traditional genealogy. Give some thought about what you might be able to contribute.

Brad Larkin solicited nominations prior to the conference and awarded a Genetic Genealogist of the Year award. This year’s award was dually presented to Ian Kennedy in Australia, who, unfortunately, was not present, and to CeCe Moore, who just happened to follow Brad’s presentation with her own.

Don’t Forget about Mitochondrial DNA Either

I believe that mitochondrial DNA is the most underutilized DNA tool that we have, often because how to use mitochondrial DNA, and what it can tell you, is poorly understood. I wrote about this in an article titled, Mitochondrial, The Maligned DNA.

Given that I work with mitochondrial DNA daily when I’m preparing clients’ Personalized DNA Reports (orderable from your personal page at Family Tree DNA or directly from my website), I know just how useful mitochondrial can be and see those examples regularly. Unfortunately, because these are client reports, I can’t write about them publicly.

CeCe Moore, however, isn’t constrained by this problem, because one of the ways she contributes to genetic genealogy is by working with the television community, in particular Genealogy Roadshow and the PBS series, Finding Your Roots. Now, I must admit, I was very surprised to see CeCe scheduled to speak about mitochondrial DNA, because the area of expertise where she is best known is autosomal DNA, especially in conjunction with adoptee research.

During the research for the production of these shows, CeCe has utilized mitochondrial DNA with multiple celebrities to provide information such as the ethnic identification of the ancestor who provided the mitochondrial DNA as Native American.

Autosomal DNA testing has a broad but shallow reach, across all of your lines, but just back a few generations. Both Y and mitochondrial DNA have a very deep reach, but only on one specific line, which makes them excellent for identifying a common ancestor on that line, as well as the ethnicity of that individual.

I have seen other cases, where researchers connected the dots between people where no paper trail existed, but a relationship between women was suspected.

CeCe mentioned that currently there are only 44,000 full sequence results in the Family Tree DNA data base and 185K total HVR1, HVR2 and full sequence tests. Y has half a million. We need to increase the data base, which, of course, increases matches and makes everyone happier. If you haven’t tested your mitochondrial DNA to the full sequence level, this would be a great time!

There are several lessons on how to utilize mitochondrial DNA at this ISOGG link.

I’m very hopeful that CeCe’s presentation will be made available as I think her examples are quite powerful and will serve to inspire people. Actually, since CeCe is in the “movie business,” perhaps a short video clip could be made available on the FTDNA website for anyone who hasn’t tested their mitochondrial DNA so they can see an example of why they should!

myOrigins

I would be fibbing to you if I told you I am happy with myOrigins. I don’t feel that it is as sensitive as other methods for picking up minority admixture, in particular, Native American, especially in small amounts. Unfortunately, those small amounts are exactly what many people are looking for.

If someone has a great-great-great-great grandparent that is Native, they carry about 1%, more or less, of the Native ancestor’s DNA today. A 4X great grandparent puts their birth year in the range of 1800-1825 – or just before the Trail of Tears. People whose colonial American families intermarried with Native families did so, generally, before the Trail of Tears. By that time, many tribes were already culturally extinct and those east of the Mississippi that weren’t extinct were fighting for their lives, both literally and figuratively.
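For anyone who wants to check the arithmetic behind that “about 1%” figure, here is a minimal sketch. It assumes the simple halving model, where each generation contributes on average half of the previous generation’s share. Real inheritance varies from person to person because of recombination, so treat the result as an average, not a prediction. The function names are my own, for illustration.

```python
def generations_back(n_greats):
    """Generations between you and an ancestor with n_greats 'greats'.

    A parent is 1 generation back, a grandparent 2, a great-grandparent 3,
    so an ancestor with n 'greats' is n + 2 generations back.
    """
    return n_greats + 2


def expected_fraction(generations):
    """Average share of your autosomal DNA from one ancestor that far back."""
    if generations < 1:
        raise ValueError("generations must be at least 1")
    return 0.5 ** generations


gens = generations_back(4)  # 4X great grandparent
print(gens, f"{expected_fraction(gens):.2%}")
```

A 4X great grandparent is 6 generations back, so the average expected share is 1/64, or roughly 1.56%, which is in line with “about 1%, more or less.”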

We really need the ability to develop the most sensitive testing to report even the smallest amounts of Native DNA and map those segments to our chromosomes so that we can determine who, and what line in our family, was Native.

I know that Family Tree DNA is looking to improve their products, and I provided this feedback to them. Many people test autosomally only for their ethnicity results and I surely would love to have those people’s results available as matches in the FTDNA data base.

Razib Khan has been working with Family Tree DNA on their myOrigins product and spoke about how the myOrigins data is obtained.

Given that all humans are related, one way or another, far enough back in time, myOrigins has to be able to differentiate between groups that may not be terribly different. Furthermore, even groups that appear different today may not have been historically. His own family, from India, has no oral history of coming from the East, but the genetic data clearly indicates that they did, along with a larger group, about 1000 years ago. This may well be a result of the adage that history is written by the victors, or maybe whatever happened was simply too long ago or unremarkable to be recorded.

Razib mentioned that, depending on the cluster and the reference samples, the clusters and groups that we see on our myOrigins maps can range from 1000-10,000 years in age.

The good news is that genetics is blind to any preconceived notions. The bad news is that the software has to fit your results to the best population, even though it may not be directly a fit. Hopefully, as we have more and better reference populations, the results will improve as well.

Razib showed a PCA (principal components analysis) graph, above. These graphs chart reference populations in different quadrants. Where the different populations overlap is where they share common historic ancestors. As you can see, on this graph with these reference populations, there is a lot of overlap in some cases, and none in others.

Your personal results would then be plotted on top of the reference populations. The graph below shows me, as the white “target” on a PCA graph created by Doug McDonald.

The Changing Landscape

A topic discussed privately among the group, and primarily among the bloggers, is the changing landscape of genetic genealogy over the past year or so. In many ways I think the bloggers are the canaries in the mine.

One thing that clearly happened is that the proverbial tipping point occurred, and we’re past it. DNA someplace along the line became mainstream. Today, DNA is a household word. At gatherings, at least someone has tested, and most people have heard about DNA testing for genealogy or at least consumer based DNA testing.

The good news in all of this is that more and more people are testing. The bad news is that they are typically less informed and are often impulse purchasers. This gives us the opportunity for many more matches and to work with new people. It also means there is a steep learning curve and those new testers often know little about their genealogy. Those of us in the “public eye,” so to speak, have seen an exponential spike in questions and communications in the past several months. Unfortunately, many of the new people don’t even attempt to help themselves before asking questions.

Sometimes opportunity comes with work clothes – for them and us both.

I was talking with Spencer about this at the reception and he told me I was stealing his presentation. He didn’t seem too upset by this:)

I had to laugh, because this falls clearly into the “be careful what you wish for, you may get it” category. The Genographic project through National Geographic is clearly, very clearly, a critical component of the tipping point, and this was reflected in Spencer’s presentation. Although I covered quite a bit of Spencer’s presentation in my day 2 summary, I want to close with Spencer here. I also want to say that if you ever have the opportunity to hear Spencer speak, please do yourself the favor and be sure to take that opportunity. Not only is he brilliant, he’s interesting, likeable and very approachable. Of course, it probably doesn’t hurt that I’ve known him now for 9 years! I’ve never thought to have my picture taken with Spencer before, but this time, one of my friends did me the favor.

I have to admit, I love talking to Spencer, and listening to him. He is the adventurer through whom we all live vicariously. In the photo below, Spencer, along with his crew, was driving from London to Mongolia. I’m not sure why he is standing on the top of the Land Rover, but I’m sure he will tell us in his upcoming book about that journey.

I’m warning you all now, if I win the lottery, I’m going on the world tour that he hosts with National Geographic, and of course, you’ll all be coming with me via the blog!

Spencer talked about the consumer genomics market and where we are today.

Spencer mentioned that genetic genealogy was a cottage industry originally. It was, and it was even smaller than that, if possible. It actually was started by Bennett and his cell phone. I managed to snap a picture of Bennett this weekend on the stage looking at his cell, and I thought to myself, “this is how it all started 14 years ago.” Just look where we are today. Thank you Michael Hammer for telling Bennett that you received “lots of phone calls from crazy genealogists like you.”

So, where exactly are we today? In 2013, the industry crossed the millionth kit line. The second millionth kit was sold in early summer 2014 and the third million will be sold in 2015. No wonder we feel like a tidal wave has hit. It has.

Why now?

DNA has become part of the national consciousness. Businesses advertise that “it’s in our DNA.” People are now comfortable sharing via social media like Facebook and Twitter. Word of what DNA can do and show you, and the secrets it can unlock, is spreading by word of mouth. Spencer termed this the “viral spread threshold” and we’ve crossed that invisible line in the sand. He terms 2013 as the year of infection and based on my blog postings, subscriptions, hits, reach and the number of e-mails I receive, I would completely agree. Hold on tight for the ride!

Spencer talked about predictions for the near term future and said a 5 year plan is impossible and that an 18 month plan is more realistic. He predicts that we will continue to see exponential growth over the next several years. He feels that genetic genealogy testing will be the primary driver of growth because medical or health testing is subject to the clinical utility trap being experienced currently by 23andMe. The Big 4 testing companies (Ancestry, 23andMe, Family Tree DNA and National Geographic) control 99% of the consumer market in the US.

Spencer sees a huge international market potential that is not currently being tapped. I do agree with him, but many in European countries are hesitant, and in some places, like France, DNA testing that might expose paternity is illegal. When Europeans see DNA testing as a genealogical tool, he feels they will become more interested. Most Europeans know where their ancestral village is, or they think they do, so it doesn’t have the draw for them that it does for some of us.

Ancestry testing (aka genetic genealogy as opposed to health testing) is now a mature industry with a 100% growth rate.

Spencer also mentioned that while the Genographic data base is not open access, affiliate researchers can send Nat Geo a proposal and thereby gain research access to the data base if their proposal is approved. This extends to citizen scientists as well.

Michael Hammer

You’ll notice that Michael Hammer’s presentation, “Ancient and Modern DNA Update, How Many Ancestral Populations for Europe,” is missing from this wrapup. It was absolutely outstanding, and fascinating, which is why I’m writing a separate article about his presentation in conjunction with some additional information. So, stay tuned.

Testing, More Testing

It’s becoming quite obvious that the people who are doing the best with genetic genealogy are the ones who are testing the most family members, both close and distant. That provides them with a solid foundation for comparison and better ways to “drop matches” into the right ancestor box. For example, if someone matches you and your mother’s sister, Aunt Margaret, especially if your mother is not available to test, that’s a very important hint that your match is likely from your mother’s line.
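The “drop matches into the right ancestor box” idea can be sketched in a few lines. This is only an illustration of the logic, not how any testing company implements it. The kit IDs, the relative labels and the data layout are all made up for the example, and a shared match is a hint about which side of the tree to look at, not proof.

```python
def likely_sides(match_id, tested_relatives):
    """Return the sides of the tree suggested by relatives who share this match.

    tested_relatives maps a relative's name to a tuple of
    (side_of_tree, set_of_that_relative's_match_ids).
    """
    return {
        side
        for side, match_ids in tested_relatives.values()
        if match_id in match_ids
    }


# Hypothetical data: which of my matches also match each tested relative
relatives = {
    "Aunt Margaret": ("maternal", {"kit_A", "kit_B"}),
    "Paternal cousin": ("paternal", {"kit_C"}),
}

print(likely_sides("kit_A", relatives))  # matches Aunt Margaret only
print(likely_sides("kit_Z", relatives))  # matches no tested relative
```

The more relatives you test on each side, the fewer matches fall into the empty “no hint” bucket, which is exactly why testing widely pays off.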

So, in essence, while initially we would advise people to test the oldest person in a generational line, now we’ve moved to the “test everyone” mentality. Instead of a survey, now we need a census. The exception might be that the “child” does not necessarily need to be tested because both parents have tested. However, having said that, I would perhaps not make that child’s test a priority, but I would eventually test that child anyway. Why? Because that’s how we learn. Let me give you an example.

I was sitting at lunch with David Pike. We were discussing autosomal DNA generational transmission and inheritance. He pulled out his iPad, passed it to me, and showed me a chromosome (not the X) that has been passed entirely intact from one generation to the next. Had the child not been tested, we would never have known that. Now, of course, if you’ll remember the 50% rule, by statistical prediction, the child should get half of the mother’s chromosome and half of the father’s, but that’s not how it worked. So, because we don’t know what we don’t know, I’m now testing everyone I can find and convince in my family. Unfortunately, my family is small.

Full genome testing is in the future, but we’re not ready yet. Several presenters mentioned full genome testing in some context. Here’s the bottom line. It’s not truly full genome testing today, only 95-96%. The technology isn’t there yet, and we’re still learning. In a couple of years, we will have the entire genome available for testing, and over time, the prices will fall. Keep in mind that most of our genome is identical to that of all humans, and the autosomal tests today have been developed in order to measure what is different and therefore useful genealogically. I don’t expect big breakthroughs due to full genome testing for genetic genealogy, although I could be wrong. You can, however, count me in, because I’m a DNA junkie. When the full genome test is below $1000, when we have comparison tools and when the coverage won’t necessitate doing a second or upgrade test a few years later, I’ll be there.

Thank you

I want to offer a heartfelt thank you to Max Blankfeld and Bennett Greenspan, founders of Family Tree DNA, shown with me in the photo below, for hosting and subsidizing the administrators’ conference – now for a decade. I look forward to seeing them, and all of the other attendees, next year.

I anticipate that this next decade will see many new discoveries resulting in tools that make our genealogy walls fall. I can’t help but wonder what the article I’ll be writing on the 20th anniversary looking back at nearly a quarter century of genetic genealogy will say!

I’m often asked about the significance of small percentages of autosomal DNA in results. Specifically, the small percentages are often of Native American or results that would suggest Native admixture. One of the first questions I always ask is whether or not the individual has Germanic or eastern European admixture.

Why?

Take a look at this map of the Invasion of the Roman Empire. See the Huns and their path?

It’s no wonder we’re so admixed.

Here’s a map of the Hunnic empire at its peak under Attila between the years 420-469.

But that wasn’t the end of the Asian invasions. The Magyars, who settled in Hungary arrived from Asia as well, in the 800s and 900s, as shown on this map from LaSalle University.

Since both the Hungarians and some Germanic people descend from Asian populations, as do Native Americans, albeit thousands of years apart, it’s not unrealistic to expect that, as populations, they share a genetic connection.

Therefore, when people who carry heritage from this region of the world show small amounts of Native or Asian origin, I’m not surprised. However, for Americans, trying to sort out their Native ethnic heritage, this is most unhelpful.

Let’s take a look at the perfect example candidate. This man is exactly half Hungarian and half German. Let’s see what his DNA results say, relative to any Asian or Native heritage, utilizing the testing companies and the free admixture tools at www.gedmatch.com.

He has not tested at Ancestry, but at Family Tree DNA, his myOrigins reports 96% European and 4% Middle Eastern. At 23andMe in speculative view, he shows 99.7% European and .2% Sub-Saharan African.

Moving to the admixture tools at GedMatch, MDLP is not recommended for Asian or Native ancestry, so I have excluded that tool.

Eurogenes K13 is the most recently updated admixture tool, so let’s take a look at that one first.

Eurogenes K13

Eurogenes K13 showed 7% West Asian, which makes perfect sense considering his heritage, but it might be counted as “Native” in other circumstances, although I would certainly be very skeptical about counting it as such.

However, East Asian, Siberian and Amerindian would all be amalgamated into the Native American category, for a combined percentage of 1.31.

However, selecting the “admixture proportions by chromosome” view shows something a bit different. The cumulative percentages, by chromosome, add up to 10.10%. Some researchers mistakenly use this sum as their percentage of Native ancestry. That is incorrect, because each value is a portion of 100% of an individual chromosome, and the total would need to be divided by 22 to obtain the average value across all chromosomes. The total is irrelevant, and the average may not reflect how the developer determines the amount of admixture, because chromosomes are not the same size, nor do they carry the same number of SNPs. Questions relative to the functional underpinnings of each tool should be addressed to the developers.
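The arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the 10.10% cumulative figure comes from the text, but the three per-chromosome values and SNP counts in the second half are made-up placeholders, not this tester's data, and the SNP-weighted average is just one plausible way chromosomes might be combined, not a description of how Eurogenes K13 actually works.

```python
# Simple average: the per-chromosome percentages sum to 10.10 across
# the 22 autosomes (figure taken from the text above).
cumulative_percent = 10.10
autosome_count = 22
simple_average = cumulative_percent / autosome_count
print(f"Simple average per chromosome: {simple_average:.2f}%")


def weighted_average(percents, snp_counts):
    """Average per-chromosome percentages, weighting each by its SNP count."""
    total_snps = sum(snp_counts)
    return sum(p * n for p, n in zip(percents, snp_counts)) / total_snps


# Hypothetical values for three chromosomes, for illustration only.
example_percents = [1.0, 0.0, 0.5]
example_snp_counts = [60000, 40000, 20000]
example_weighted = weighted_average(example_percents, example_snp_counts)
example_unweighted = sum(example_percents) / len(example_percents)
print(f"Weighted: {example_weighted:.3f}%  Unweighted: {example_unweighted:.3f}%")
```

The simple average works out to about 0.46%, far below the 10.10% sum. The placeholder example shows why even the simple average can differ from a size-weighted figure when chromosomes carry different numbers of SNPs.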

Dodecad

I understand that there is a newer version of Dodecad, but that it has not been submitted to GedMatch for inclusion, per a discussion with GedMatch. I can’t tell which of the Dodecad versions on GedMatch is the most current, so I ran the results utilizing both v3 and 12b.

I hope v3 is not the most current, because it does not include any Native American category or pseudocategory – although there is a smattering of Northeast Asian at .27% and Southwest Asian at 1%.

Dodecad 12b below

The 12b version does show .52% Siberian and 2.6% Southwest Asian, although I’m not at all sure the Southwest Asian should be included.

HarappaWorld

Harappaworld shows .09% Siberian, .27% American (Native American), .23% Beringian and 1.8% Southwest Asian, although I would not include Southwest Asian in the Native calculation.

In Summary

Neither Family Tree DNA nor 23andMe finds Native ancestry in our German/Hungarian tester, but all 3 of the admixture tools at Gedmatch find either small amounts of Native or Asian ancestry that could certainly be interpreted as Native, such as Siberian or Beringian.

Does this mean this German/Hungarian man has Native American ancestry? Of course not, but it does probably mean that the Native population and his ancestral populations did share some genes from the same gene pool thousands of years ago.

While you might think this is improbable, or impossible, consider for a minute that every person outside of Africa today carries some percentage of Neanderthal DNA, and many also carry Denisovan DNA. Our DNA does indeed have staying power over the millennia, especially once an entire population or group of people is involved. We’ve recently seen this same type of scenario in the full genome sequencing of a 24,000 year old Siberian male skeleton.

The net-net of this is that minority admixture is not always what it seems to be, especially when utilizing autosomal DNA to detect small amounts of Native American admixture. The big picture needs to be taken into consideration. Caution is advised.

When searching for Native admixture, when possible, both Y DNA and mitochondrial DNA give specific answers for specific pedigree lines relative to ancestry. Of course, to utilize Y or mtDNA, the tester must descend from the Native ancestor either directly paternally to test the male Y chromosome, or directly matrilineally to test the mitochondrial line. You can read about this type of testing, and how it works, in my article, Proving Native American Ancestry Using DNA. You can also read about other ways to prove Native ancestry using autosomal DNA, including how to unravel which pedigree line the Native ancestry descends from, utilizing admixture tools, in the article, “The Autosomal Me.”

Recently, Family Tree DNA introduced their new ethnicity tool, myOrigins, as part of their autosomal Family Finder product. This means that all of the major players in this arena using chip based technology (except for the Genographic project) have now updated their tools. Both 23andMe and Ancestry introduced updated versions of their tools in the fall of 2013. In essence, this is the second generation of these biogeographical or ethnicity products. So let’s take a look and see how the vendors are doing.

In a recent article, I discussed the process for determining ethnicity percentages using biogeographical ancestry, or BGA, tools. The process is pretty much the same, regardless of which vendor’s results you are looking at. The variant is, of course, the underlying population data base, its quality and quantity, and the way the vendors choose to construct and name their regions.

I’ve been comparing my own known and proven genealogy pedigree breakdown to the vendors’ results for some time now. Let’s see how the new versions stack up to a known pedigree.

The pedigree analysis portion of this document begins about page 8. My ancestral breakdown is as follows:

| Geography | Pedigree Percent |
|---|---|
| Germany | 23.8041 |
| British Isles | 22.6104 |
| Holland | 14.5511 |
| European by DNA | 6.8362 |
| France | 6.6113 |
| Switzerland | 0.7813 |
| Native American | 0.2933 |
| Turkish | 0.0031 |

This leaves about 25% unknown.
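For anyone who wants to check the “about 25% unknown” figure, here is a minimal sketch of the arithmetic in Python. The percentages are copied from my pedigree breakdown above; the variable names are just for illustration.

```python
# Pedigree percentages as listed in the breakdown above.
pedigree = {
    "Germany": 23.8041,
    "British Isles": 22.6104,
    "Holland": 14.5511,
    "European by DNA": 6.8362,
    "France": 6.6113,
    "Switzerland": 0.7813,
    "Native American": 0.2933,
    "Turkish": 0.0031,
}

# Everything not accounted for by a known category is "unknown".
known_total = sum(pedigree.values())
unknown = 100.0 - known_total
print(f"Known: {known_total:.4f}%  Unknown: {unknown:.4f}%")
```

The known categories total roughly 75.5%, which leaves roughly 24.5%, rounded to 25% in the text.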

Let’s look at each vendor’s results one by one.

23andMe

My results using the speculative comparison mode at 23andMe are shown in a chart, below.

| 23andMe Category | 23andMe Percentage |
|---|---|
| British and Irish | 39.2 |
| French/German | 15.6 |
| Scandinavian | 7.9 |
| Nonspecific North European | 27.9 |
| Italian | 0.5 |
| Nonspecific South European | 1.6 |
| Eastern European | 1.8 |
| Nonspecific European | 4.9 |
| Native American | 0.3 |
| Nonspecific East Asian/Native American | 0.1 |
| Middle East/North Africa | 0.1 |

At 23andMe, if you have questions about what exact population makes up each category, just click on the arrow beside the category when you hover over it.

For example, I wasn’t sure exactly what comprises Eastern European, so I clicked.

The first thing I see is sample size and where the samples come from, public data bases or the 23andMe data base. Their samples, across all categories, are most prevalently from their own data base. A rough add shows about 14,000 samples in total.

Clicking on “show details” provides me with the following information about the specific locations of included populations.

Using this information, and reorganizing my results a bit, the chart below shows the comparison between my pedigree chart and the 23andMe results. In cases where the vendor’s categories spanned several of mine, I have added mine together to match the vendor category. A perfect example is shown in row 1, below, where I added France, Holland, Germany and Switzerland together to equal the 23andMe French and German category. Checking their reference populations shows that all 4 of these countries are included in their French and German group.

| Geography | Pedigree Percent | 23andMe % |
|---|---|---|
| Germany, Holland, Switzerland & France | 45.7451 | 15.6 |
| France | 6.6113 (above) | Combined |
| Germany | 23.8014 (above) | Combined |
| Holland | 14.5511 (above) | Combined |
| Switzerland | 0.7813 (above) | Combined |
| British Isles | 22.6104 | 39.2 |
| Native American | 0.2933 | 0.4 (Native/East Asian) |
| Turkish | 0.0031 | 0.1 (Middle East/North Africa) |
| Scandinavian | | 7.9 |
| Italian | | 0.5 |
| South European | | 1.6 |
| East European | | 1.8 |
| European by DNA | 6.8362 | 4.9 (nonspecific European) |
| Unknown | 25 | 27.9 (North European) |
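To see at a glance where the vendor and the pedigree disagree most, the rows that have a value in both columns can be compared directly. The following is a small Python sketch of that comparison, using the numbers from the comparison above; the sorting by size of gap is my own choice for illustration, not anything 23andMe provides.

```python
# (category, pedigree %, 23andMe %) for rows with a value in both columns,
# copied from the comparison above.
rows = [
    ("Germany, Holland, Switzerland & France", 45.7451, 15.6),
    ("British Isles", 22.6104, 39.2),
    ("Native American", 0.2933, 0.4),
    ("Turkish", 0.0031, 0.1),
    ("European by DNA", 6.8362, 4.9),
    ("Unknown", 25.0, 27.9),
]

# Positive means the vendor reports more than the pedigree predicts.
differences = {name: vendor - pedigree for name, pedigree, vendor in rows}

# List the largest disagreements first.
for name, diff in sorted(differences.items(), key=lambda item: abs(item[1]), reverse=True):
    print(f"{name:40s} {diff:+8.4f}")
```

The two largest gaps are the combined mainland European category, where 23andMe reports about 30 points less than my pedigree, and the British Isles, where it reports about 17 points more.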

I can also change to the Chromosome view to see the results mapped onto my chromosomes.

The 23andMe Reference Population

According to the 23andMe customer care pages, “Ancestry Composition uses 31 reference populations, based on public reference datasets as well as a significant number of 23andMe members with known ancestry. The public reference datasets we’ve drawn from include the Human Genome Diversity Project, HapMap, and the 1000 Genomes project. For these datasets as well as the data from 23andMe, we perform filtering to ensure accuracy.

Populations are selected for Ancestry Composition by studying the cluster plots of the reference individuals, choosing candidate populations that appear to cluster together, and then evaluating whether we can distinguish the groups in practice. The population labels refer to genetically similar groups, rather than nationalities.”

Additional detailed information about Ancestry Composition is available here.

Ancestry.com

Ancestry is a bit more difficult to categorize, because their map regions are vastly overlapping. For example, the west Europe category is shown above, and the Scandinavian is shown below.

Ancestry’s European populations and regions are so broadly overlapping that almost any interpretation is possible. For example, the Netherlands could be included in several categories – and based upon the history of the country, that’s probably legitimate.

At Ancestry, clicking on a region, then scrolling down will provide additional information about that region of the world, both their population and history.

The Ancestry Reference Population

Just below your ethnicity map is a section titled “Get the Most Out of Your Ethnicity Estimate.” It’s worth clicking, reading and watching the video. Ancestry states that they utilized about 3000 reference samples, pared from 4245 samples taken from people whose ethnicity seems to be entirely from that specific location in the world.

I wrote about the release of myOrigins recently, so I won’t repeat the information about reference populations and such found in that article.

Family Tree DNA shows matches by region. Clicking on the major regions, European and Middle Eastern, shown above, displays the clusters within regions. In addition, your Family Finder matches that match your ethnicity are shown in highest match order in the bottom left corner of your match page.

Clicking on a particular cluster, such as Trans-Ural Peneplain, highlights that cluster on the map and then shows a description in the lower left hand corner of the page.

www.GedMatch.com is kind enough to include 4 different admixture utilities, contributed by different developers, in their toolbox. Remember, GedMatch is a free site supported by contributions – so if you utilize and enjoy their tools – please contribute.

On their main page, after signing in and transferring your raw data files from either 23andMe, Family Tree DNA or Ancestry, you will see your list of options. Among them is “admixture.” Click there.

Of the 4 tools shown, MDLP is not recommended for populations outside of Europe, such as Asian, African or Native American, so I’ve skipped that one entirely.

I selected Admixture Proportions for the part of this exercise that includes the pie chart.

The next option is Eurogenes K13 Admixture Proportions. My results are shown below.

Eurogenes K13

Of course, there is no guide in terms of label definition, so we’re guessing a bit.

According to John at GedMatch, there is a more current version of Dodecad, but the developer has opted not to contribute the current or future versions.

By the way, in case you’re wondering, Gedrosia is an area along the Indian Ocean – I had to look it up!

| Geography | Pedigree Percent | Dodecad K12b |
|---|---|---|
| North European | 75.19 | 43.50 |
| Germany | 23.8041 | Combined above |
| British Isles | 22.6104 | Combined above |
| Holland | 14.5511 | Combined above |
| European by DNA | 6.8362 | Combined above |
| France | 6.6113 | Combined above |
| Switzerland | 0.7813 | Combined above |
| Native American | 0.2933 | 3.02 Siberian, South Asia, SW Asia, East Asia |
| Turkish | 0.0031 | 10.93 Caucasus |
| Gedrosia | | 7.75 |
| Northwest African | | 1.22 |
| Atlantic Med | | 33.56 |
| Unknown | 25 | |

Third is Harappaworld.

Harappaworld

Baloch is an area in the Iranian plateau.

| Geography | Pedigree Percent | Harappaworld % |
|---|---|---|
| Northeast Euro | 75.19 | 46.58 |
| Germany | 23.8041 | Combined above |
| British Isles | 22.6104 | Combined above |
| Holland | 14.5511 | Combined above |
| European by DNA | 6.8362 | Combined above |
| France | 6.6113 | Combined above |
| Switzerland | 0.7813 | Combined above |
| Native American | 0.2933 | 2.81 SE Asia, Siberia, NE Asian, American, Beringian |
| Turkish | 0.0031 | 10.27 |
| Unknown | 25 | |
| S Indian | | 0.21 |
| Baloch | | 9.05 |
| Papuan | | 0.38 |
| Mediterranean | | 28.71 |

The wide variety found in these results makes me curious about how my European results would be categorized using the MDLP tool, understanding that it will not pick up Native, Asian or African.

MDLP K12

The Celto-Germanic category is very close to my mainland European total – but of course, many Germanic people settled in the British Isles.

Second Generation Report Card

Many of these tools picked up my Native American heritage, along with the African. Yes, these are very small amounts, but I do have several proven lines. By proven, I mean both by paper trail (Acadian church and other records) and genetics, meaning Y-line and mtDNA. There is no arguing with that combination. I also have other Native lines that are less well proven. So I’m very glad to see the improvements in that area.

Recent developments in historical research and my mitochondrial DNA matches show that my most distant maternal ancestral line in Germany has some type of a Scandinavian connection. How did this happen, and when? I just don’t know yet – but looking at the map below, which shows my mtDNA full sequence matches, the pattern is clear.

Could the gene flow have potentially gone the other direction – from Germany to Scandinavia? Yes, it’s possible. But my relatively consistent Scandinavian ethnicity of around 10% would be unlikely if that were the case.

Actually, there is a second possibility for additional Scandinavian heritage and that’s my heavy Frisian heritage. In fact, most of my Dutch ancestors in Frisia were either on or very near the coast on the northernmost part of Holland and many were merchants.

I also have additional autosomal matches with people from Scandinavia – not huge matches – but matches just the same – all unexplained. The most notable of which, and the first I might add, is with my friend, Marja.

It’s extremely difficult to determine how distant the ancestry is that these tests are picking up. It could be anyplace from a generation ago to hundreds of generations ago. It all depends on how the DNA was passed, how isolated the population was, who tested today and which data bases are being utilized for comparison purposes along with their size and accuracy. In most cases, even though the vendors are being quite transparent, we still don’t know exactly who the population is that we match, or how representative it is of the entire population of that region. In some cases, when contributed data is being used, like testers at 23andMe, we don’t know if they understood or answered the questions about their ancestry correctly – and 23andMe is basing ethnicity results on their cumulative answers. In other words, we can’t see beneath the blanket – and even if we could – I don’t know that we’d understand how to interpret the components.

So Where Am I With This?

I knew already, through confirmed paper sources that most of my ancestry is in the European heartland – Germany, Holland, France as well as in the British Isles. Most of the companies and tools confirm this one way or another. That’s not a surprise. My 35 years of genealogical research has given me an extremely strong pedigree baseline that is invaluable for comparing vendor ethnicity results.

The Scandinavian results were somewhat of a surprise – especially at the level in which they are found. If this is accurate, and I tend to believe it is present at some level, then it must be a combined effect of many ancestors, because I have no missing or unknown ancestors in the first 5 generations and only 11 of 64 missing or without a surname in generation 6. Those missing ancestors in generation 6 only contribute about 1.5% of my DNA each, assuming they contribute an average of 50% of their DNA to offspring in each subsequent generation.

Clearly, to reach 10%, nearly all of my missing ancestors, in the US and Germany, England and the Netherlands would have to be 100% Scandinavian – or, alternately, I have quite a bit scattered around in many ancestors, which is a more likely scenario. Still, I’m having a difficult time with that 10% number in any scenario, but I will accept that there is some Scandinavian heritage one way or another. Finding it, however, genealogically is quite another matter.
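The generation math here is easy to check. Below is a minimal Python sketch, assuming (as I do above) that each ancestor passes on exactly half of their DNA in every generation, which is an average and not a guarantee for any particular ancestor. The function name is just for illustration.

```python
def share_per_ancestor(generations_back):
    """Expected percent of DNA from a single ancestor, assuming an
    average of 50% passed on in each generation."""
    return 0.5 ** generations_back * 100


generation = 6                       # 2**6 = 64 ancestors in this generation
ancestors_in_generation = 2 ** generation
per_ancestor = share_per_ancestor(generation)

missing_ancestors = 11               # missing or without a surname, from the text
missing_total = missing_ancestors * per_ancestor

print(f"Ancestors in generation {generation}: {ancestors_in_generation}")
print(f"Expected share per ancestor: {per_ancestor:.4f}%")
print(f"Combined share of {missing_ancestors} missing ancestors: {missing_total:.4f}%")
```

Each generation 6 ancestor contributes 1.5625% on average, which I’ve rounded to about 1.5%, and the 11 missing ancestors together account for about 17.19% of my DNA.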

However, I’m at a total loss as to the genesis of the South European and Mediterranean. This must be quite ancient. There are only two known possible ancestors from these regions and they are many generations back in time – and both are only inferred with clearly enough room to be disproven. One is a possible Jewish family who went to France from Spain in 1492 and the other is possibly a Roman soldier whose descendants are found within a few miles of a Roman fort site today in Lancashire. Neither of these ancestors could have contributed enough DNA to influence the outcome to the levels shown, so the South European/Mediterranean is either incorrect, or very deep ancestry.

The Eastern European makes more sense, given my amount of German heritage. The Germans are well known to be admixed with the Magyars and Huns, so while I can’t track it or prove it, it also doesn’t surprise me one bit given the history of the people and regions where my ancestors are found.

What’s the Net-Net of This?

This is interesting, very interesting. There are tips and clues buried here, especially when all of the various tools, including autosomal matching, Y and mtDNA, are utilized together for a larger picture. Alone, none of these tools are as powerful as they are combined.

I look forward to the day when the reference populations are in the tens of thousands, not hundreds. All of the tools will be far more accurate as the data base is built, refined and utilized.

Until then, I’ll continue to follow each release and watch for more tips and clues – and will compare the various tools. For example, I’m very pleased to see Family Tree DNA’s new ethnicity matching tool incorporated into myOrigins.

I’ve taken the basic approach that my proven pedigree chart is the most accurate, by far, followed by the general consensus of the combined results of all of the vendors. It’s particularly relevant when vendors who don’t use the same reference populations arrive at the same or similar results. For example, 23andMe uses primarily their own clients, and Nat Geo uses their own population sample results. I did not include Nat Geo above because they haven’t released a new tool recently.

National Geographic’s Geno2

Nat Geo took a bit of a different approach and it’s more difficult to compare to the others. They showed my ethnicity as 43% North European, 36% Mediterranean and 18% Southwest Asian.

While this initially looks very skewed, they then compared me to my two closest populations, genetically, which were the British and the Germans, which is absolutely correct, according to my pedigree chart. Both of these populations are within a few percent of my exact same ethnicity profile, shown below.

The description makes a lot of sense too. “The dominant 49% European component likely reflects the earliest settlers in Europe, hunter-gatherers who arrived there more than 35,000 years ago. The 44% Mediterranean and the 17% Southwest Asian percentages arrived later, with the spread of agriculture from the Fertile Crescent in the middle East, over the past 10,000 years. As these early farmers moved into Europe, they spread their genetic patterns as well.”

So while individually, and compared to my pedigree chart, these results appear questionable, especially the Mediterranean and Southwest Asian portions, in the context of the populations I know I descend from and most resemble, the results make perfect sense when compared to my closest matching populations. Those populations themselves include a significant amount of both Mediterranean and Southwest Asian. Looking at this, I feel a lot better about the accuracy of my results. Sometimes, perspective makes a world of difference.

It’s A Wrap

Just because we can’t exactly map the ethnicity results to our pedigree charts today doesn’t mean the results are entirely incorrect. It doesn’t mean they are entirely correct, either. The results may, in some cases, be showing where population groups descend from, not where our specific ancestors are found more recently. The more ancestors we have from a particular region, the more that region’s profile will show up in our own personal results. This explains why Mediterranean shows up, for example, from long ago but our one Native ancestor from 7 or 8 generations ago doesn’t. In my case, it would be because I have many British/German/Dutch lines that combine to show the ancient Mediterranean ancestry of these groups – where I have many fewer Native ancestors.

Vendors may be picking up deep ancestry that we can’t possibly know about today – population migration. It’s not like our ancestors left a guidebook of their travels for us – at least – not outside of our DNA – and we, as a community, are still learning exactly how to read that! We are, after all, participants on the pioneering, leading edge of science.

Having said that, I’ll personally feel a lot better about these kinds of results when the underlying technology, data bases and different vendors’ tools mature to the point where the differences between their results are minor.

For today, these are extremely interesting tools, just don’t try to overanalyze the results, especially if you’re looking for minority admixture. And if you don’t like your results, try a different vendor or tool, you’ll get an entirely new set to ponder!

On May 6th, Family Tree DNA released myOrigins as a free feature of their Family Finder autosomal DNA test. This autosomal biogeographic feature was previously called Population Finder. It has not just been renamed, but entirely reworked.

Currently, 22 population clusters in 7 major geographic groups are utilized to evaluate your biogeographic ethnicity or ancestry as compared to these groups, many of which are quite ancient.

Prior to release, Family Tree DNA sent out a notification about new matching options. One of the new features is that you will be able to see the matching regions of the people you match – meaning your populations in common. This powerful feature lets you see matches who are similar which can be extremely useful when searching for minority admixture, for example. However, some participants don’t want their matches to be able to see their ethnicity, so everyone was given an ‘opt out’ option. Fortunately, few people have opted out, less than 1%.

Be aware that only your primary matches are shown. This means that your 4-5th cousins or more distant are not shown as ethnicity matches.

Here’s what the FTDNA notification said:

With myOrigins, you’ll be able compare your ethnicity with your Family Finder matches. If you want to share your ethnic origins with your matches, you don’t need to take any action. You’ll automatically be able to compare your ethnicity with your matches when myOrigins becomes available. This is the recommended option. However, we do understand that sharing your ethnicity with your matches is your choice so we’re sending you this reminder in case you want to not take part (opt-out). To opt-out, please follow the instructions below. *

Select the “Do not share my ethnic breakdown with my matches. This will not let me compare my ethnicity with my matches.” radio button.

Click the Save button.

You can get more details about what will be shared here. You may also join our forums for discussion. * You can change your privacy settings at any time. Thus, you may opt-out of or opt back into ethnic sharing at a later date if you change your mind.

What’s New?

Let’s take a look at the My Origins results. You can see your results by clicking on “My Origins” on the Family Finder tab on your personal page at Family Tree DNA.

Ethnicity and Matches

Your population ethnicity is shown on the main page, as well as up to three shared regions that you share with your matches. This means that if you share more than 3 regions with these people, the 4th one (or 5th or 6th, etc.) won’t show. This also means that if your match has an ethnicity you don’t have, that won’t show either.

Above, you see my main results page. Please note that this map is what is known as a heat map. This means that the darkest, or hottest, areas are where my highest percentages are found.

Each region has a breakdown that can be seen by clicking on the region bar. My European region bar population cluster breakdown is shown below along with my ethnicity match to my mother.

And my Middle Eastern breakdown is shown below.

Ethnicity Mapping

A great new feature is the mapping of the maternal and paternal ethnicity of your Family Finder matches, when known. How does Family Tree DNA know? The location data entered in the “Matches Map” location field. Can’t remember if you completed these fields? It’s easy to take a look and see. On either the Y DNA or the mtDNA tabs, click on Matches Map and you’ll see your white balloon. If the white balloon is in the location of your most distant ancestor in your paternal line (for Y) or your matrilineal line for mtDNA (your mother’s mother’s mother’s line on up the tree until you run out of mothers), then you’ve entered the location data and you’re good to go. If your white balloon is on the equator, click on the tab at the bottom of the map that says “update ancestor’s location” and step through the questions.

If you haven’t completed this information, please do. It makes the experience much more robust for everyone.

How Does This Tool Work?

The buttons to the far right of the page show the mapped locations of the oldest paternal lines and the oldest matrilineal (mtDNA) lines of your matches. Direct paternal matches would of course be surname matches, but only to their direct paternal lines. This does not take into account all of their “most distant ancestors,” just the direct paternal ones. This is the yellow button.

The green button provides the direct maternal matches.

Do not confuse this with your Matches Map for your own paternal (if you’re a male) or mitochondrial matches. Just to illustrate the difference, here is my own direct maternal full sequence matches map, available on my mtDNA tab. As you can see, they are very different and convey very different information for you.

Comparisons

By way of comparison, here are my mother’s myOrigins results.

Let’s say I want to see who else matches her from Germany where our most distant mitochondrial DNA ancestor is located.

I can expand the map by scrolling or using the + and – keys, and click on any of the balloons.

Indeed, here is my balloon, right where it should be, and the 97% European match to my mother pops up right beside my balloon. The matches are not broken down beyond region.

This is full screen, so just hit the back button or the link in the upper right hand corner that says “back to FTDNA” to return to your personal page.

How did Family Tree DNA come up with these new regional and population cluster matches?

As we know, all of humanity came originally from Africa, and all of humanity that settled outside of Africa came through the Middle East. People left the Middle East in groups, it would appear, and lived as isolated populations for some time in different parts of the world. As they did, they developed mutations that are found only in that region, or are found much more frequently in that region as opposed to elsewhere. Patterns of mutations like this are established, and when one of us matches those patterns, it’s determined that we have ancestry, either recent or perhaps ancient, from that region of the world.

The key to this puzzle is to find enough differentiation to be able to isolate or identify one group from another. Of course, the groups eventually interbred, at least most of them did, which makes this even more challenging.

Family Tree DNA says in their paper describing the population clusters:

MyOrigins attempts to reduce the wild complexity of your genealogy to the major historical-genetic themes which arc through the life of our species since its emergence 100,000 years ago on the plains of Africa. Each of our 22 clusters describe a vivid and critical color on the palette from which history has drawn the brushstrokes which form the complexity that is your own genome. Though we are all different and distinct, we are also drawn from the same fundamental elements.

The explanatory narratives in myOrigins attempt to shed some detailed light upon each of the threads which we have highlighted in your genetic code. Though the discrete elements are common to all humans, the weight you give to each element is unique to you. Each individual therefore receives a narrative fabric tailored to their own personal history, a story stitched together from bits of DNA.

They have also provided a white paper about their methodology that provides more information.

After reading both of these documents, I much prefer the explanations provided for each cluster in the white paper over the shorter population cluster paper. The longer version breaks the history down into relevant pieces and describes the earliest history and migrations of the various groups.

I was pleased to see the methodology that they used and that four different reference data bases were utilized.

GeneByGene DNA customer database

Human Genome Diversity Project

International HapMap Project

Estonian Biocentre

Given this wealth of resources, I was very surprised to see how few members of some reference populations were utilized.

| Population | N | Population | N |
|---|---|---|---|
| Armenian | 46 | Lithuanian | 6 |
| Ashkenazi | 60 | Masai | 140 |
| British | 39 | Mbuti | 15 |
| Burmese | 8 | Moroccan | 7 |
| Cambodian | 26 | Mozabite | 24 |
| Danish | 13 | Norwegian | 17 |
| Filipino | 20 | Pashtun | 33 |
| Finnish | 49 | Polish | 35 |
| French | 17 | Portuguese | 25 |
| German | 17 | Russian | 41 |
| Gujarati | 31 | Saudi | 19 |
| Iraqi | 12 | Scottish | 43 |
| Irish | 45 | Slovakian | 12 |
| Italian | 30 | Spanish | 124 |
| Japanese | 147 | Surui | 21 |
| Karitiana | 23 | Swedish | 33 |
| Korean | 15 | Ukrainian | 10 |
| Kuwaiti | 14 | Yoruba | 136 |

In particular, France, Germany, Norway, Slovakia, Denmark and the Ukraine appear to be very under-represented, especially given Family Tree DNA’s very heavy European-origin customer base. I would hope that one of the priorities would be to expand this reference data base substantially. Furthermore, the only New World references I see here are the Karitiana and Surui, both from Brazil, which calls into question how well Native American ancestry from elsewhere can be detected.
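One way to spot the thin reference populations is to filter the sample counts against a cutoff. The Python sketch below uses the N values from the list of reference populations above; the cutoff of 20 is purely my own arbitrary choice for illustration, not a threshold Family Tree DNA has published.

```python
# Reference sample counts (N) as listed above.
reference_counts = {
    "Armenian": 46, "Ashkenazi": 60, "British": 39, "Burmese": 8,
    "Cambodian": 26, "Danish": 13, "Filipino": 20, "Finnish": 49,
    "French": 17, "German": 17, "Gujarati": 31, "Iraqi": 12,
    "Irish": 45, "Italian": 30, "Japanese": 147, "Karitiana": 23,
    "Korean": 15, "Kuwaiti": 14, "Lithuanian": 6, "Masai": 140,
    "Mbuti": 15, "Moroccan": 7, "Mozabite": 24, "Norwegian": 17,
    "Pashtun": 33, "Polish": 35, "Portuguese": 25, "Russian": 41,
    "Saudi": 19, "Scottish": 43, "Slovakian": 12, "Spanish": 124,
    "Surui": 21, "Swedish": 33, "Ukrainian": 10, "Yoruba": 136,
}

threshold = 20  # hypothetical cutoff, chosen only for this example

# Keep the populations with fewer samples than the cutoff.
small_populations = {
    name: n for name, n in reference_counts.items() if n < threshold
}

# Print the smallest reference populations first.
for name, n in sorted(small_populations.items(), key=lambda item: item[1]):
    print(f"{name}: {n}")
```

With this cutoff, the French, German, Norwegian, Slovakian, Danish and Ukrainian samples all fall below the line, along with several non-European populations.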

How did they do? Certainly, Family Tree DNA has a great new interface with wonderful new maps and comparison features. Let’s take a look at accuracy and see if everything makes sense.

I am fortunate to have the DNA of one of my parents, my mother. In the chart below, I’m comparing her results to mine and inferring my father’s results by subtracting my mother’s percentage from mine. This may not be entirely accurate, because it presumes I received the full amount of that ethnicity from my mother, and that is probably not the case – but – it’s the best I can do under the circumstances. It’s safe to say that my father has a minimum of this amount of that particular population category and may have more.

| Region | Me | Mom | Dad Inferred Minimum |
|---|---|---|---|
| European Coastal Plain | 68 | 17 | 51 |
| European Northlands | 12 | 7 | 5 |
| Trans Ural Peneplain | 11 | 10 | 1 |
| European Coastal Islands | 7 | 34 | 0 |
| Anatolia and Caucasus | 3 | 0 | 3 |
| North Mediterranean | 0 | 34 | 0 |
| Circumpolar | 0 | 1 | 0 |
| Undetermined* | 0 | 0 | 40 |

*The Undetermined category is not from Family Tree DNA, but is the percentage of my father not accounted for by inference. This 40% is DNA that I did not inherit if it falls into a different category.
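The inference in the chart above is simple enough to express in a few lines of Python. This sketch follows the same rough approach I used: my percentage minus my mother’s, floored at zero, with whatever is left over out of 100 labeled undetermined. It is not a method from Family Tree DNA, and the variable names are only illustrative.

```python
# (my %, my mother's %) for each region, from the chart above.
regions = {
    "European Coastal Plain": (68, 17),
    "European Northlands": (12, 7),
    "Trans Ural Peneplain": (11, 10),
    "European Coastal Islands": (7, 34),
    "Anatolia and Caucasus": (3, 0),
    "North Mediterranean": (0, 34),
    "Circumpolar": (0, 1),
}

# Dad's inferred minimum: what I carry beyond my mother's amount,
# never less than zero.
dad_minimum = {
    region: max(me - mom, 0) for region, (me, mom) in regions.items()
}

# Whatever is not accounted for by inference is undetermined.
undetermined = 100 - sum(dad_minimum.values())

for region, percent in dad_minimum.items():
    print(f"{region}: {percent}")
print(f"Undetermined: {undetermined}")
```

The inferred minimums add up to 60, which is where the 40% undetermined figure comes from.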

Based on these results alone, I have the following observations.

I find it odd that my mother has 34% North Mediterranean and I have none. We have no known ancestry from this region.

My mother does have one distant line of Turkish DNA via France. I have presumed that my Middle Eastern (now Anatolia and Caucasus) percentage came through that line, but these results suggest otherwise.

My mother’s Circumpolar may be Native American. She does have proven Native lines (Micmac) through the Acadian families.

These results have missed both my Native lines (through both parents) and my African admixture although both are small percentages.

The European Coastal Plain is one of the groups that covers nearly all of Europe. Given that my mother is 3/4th Dutch/German, with the balance being Acadian, Native and English, one would expect her to have significantly more, especially given my high percentage.

The European Coastal Islands percentages are very different for me and my mother, with me carrying much less than she does. This is curious, because she is 3/4th German/Dutch with between 1/8th and 3/16th English, while my father’s lines are heavily UK. My father’s ancestry may well be reflected in European Coastal Plain, which covers a great deal of territory.

What We Need to Remember

All of the biogeographic tools, from Family Tree DNA, 23andMe and Ancestry, are “estimates,” and the tools from the three major vendors each render different results. Each one is using different combinations of reference populations, so this really isn’t surprising. Hopefully, as the various companies increase their population references and the size of their reference data bases, the results will increasingly mesh from company to company. These results are only as good as the back end tools and the DNA that you randomly inherited from your ancestors.

Furthermore, we all carry far more similar DNA than different DNA, so it’s extremely difficult to make judgment calls based on ranges. Europe, for example, is extremely admixed and the US is more so. The British Isles were a destination location for many groups over thousands of years. Some of the DNA being picked up by these tests may indeed be very ancient and may cause us to wonder where it came from. In future test versions, this may be further refined.

There is no way to distinguish “ancient” DNA, like that from the Middle East Diaspora, from more contemporary DNA, only a thousand years or so old, once it’s in very small segments. In other words, it’s all very individual and personal and pretty much cast in warm jello. We’ve come a long way, but we aren’t “there” yet. However, without these tools and the vendors working to make them better, we’ll never get “there,” so keep that in mind.

While this makes great conversation today, and there is no question about accuracy in terms of majority ancestry/ethnicity, no one should draw any sweeping conclusions based on this information. This is not “cast in concrete” in the same way as Y DNA and mitochondrial haplogroups and STR markers. Those are irrefutable, while biogeographical ethnicity remains a bit ethereal.

In summary, I would simply say that this tool can provide great hints and tips, especially the matching, which is unique, but it can’t disprove anything. The absence of minority admixture, which is what so many people are hunting for, may be the result of the various data bases and the infancy of the science itself, and not the absence of admixture.

My recommendation would be to utilize all three biogeographic admixture products as well as the free tools in the Admixture category at GedMatch. Look for consistency in results between the tools. I discussed this methodology in “The Autosomal Me” series.

What Next?

I asked Dr. David Mittelman, Chief Scientific Officer at Family Tree DNA, about the reference populations. He agreed that some of their reference populations are small and said they are actively working to increase them. He also stated that it is important to note that Family Tree DNA prioritized accuracy and avoiding false positives, so they definitely took a conservative approach.