Category Archives: Minority Admixture Mapping

Elizabeth Warren has released DNA testing results after being publicly challenged and derided as “Pocahontas” as a result of her claims of a family story indicating that her ancestors were Native American. If you’d like to read the specifics of the brouhaha, this Washington Post article provides a good summary, along with additional links.

I personally find name-calling of any type unacceptable behavior, especially in a public forum, and while Elizabeth’s DNA test was taken, I presume, in an effort to settle the question and end the name-calling, what it has done is to put the science of genetic testing smack dab in the middle of the headlines.

This article is NOT about politics, it’s about science and DNA testing. I will tell you right up front that any comments that are political or hateful in nature will not be allowed to post, regardless of whether I agree with them or not. Unfortunately, these results are being interpreted in a variety of ways by different individuals, in some cases to support a particular political position. I’m presenting the science, without the politics.

This is the first of a series of two articles.

I’m dividing this first article into four sections, and I’d ask you to read all four, especially before commenting. A second article, Possibilities – Wringing the Most Out of Your DNA Ethnicity Test, will follow shortly about how to get the most out of an ethnicity test when hunting for Native American (or other minority, for you) ethnicity.

Understanding how the science evolved and works is an important part of comprehending the results and what they actually mean, especially since Elizabeth’s are presented in a different format than we are used to seeing. What a wonderful teaching opportunity.

Family History and DNA Science – How this works.

Elizabeth Warren’s Genealogy

Elizabeth Warren’s DNA Results

Questions and Answers – These are the questions I’m seeing, and my science-based answers.

My second article, Possibilities – Wringing the Most Out of Your DNA Ethnicity Test, will include:

Potential – This isn’t all that can be done with ethnicity results. What more can you do to identify that Native ancestor?

Resources with Step by Step Instructions

Now, let’s look at Elizabeth’s results and how we got to this point.

Family Stories and DNA

Every person that grows up in their biological family hears family stories. We have no reason NOT to believe them until we learn something that potentially conflicts with the facts as represented in the story.

In terms of stories handed down for generations, all we have to go on, initially, are the stories themselves and our confidence in the person relating the story to us. The day that we begin to suspect that something might be amiss, we start digging, and for some people, that digging begins with a DNA test for ethnicity.

My family had that same Cherokee story. My great-grandmother on my father’s side who died in 1918 was reportedly “full blooded Cherokee” 60 years later when I discovered she had existed. Her brothers reportedly went to Oklahoma to claim headrights land. There were surely nuggets of truth in that narrative. Family members did indeed go to Oklahoma. One did own Cherokee land, BUT, he purchased that land from a tribal member who received an allotment. I discovered that tidbit later.

What wasn’t true? My great-grandmother was not 100% Cherokee. To the best of my knowledge now, a century after her death, she wasn’t Cherokee at all. She probably wasn’t Native at all. Why, then, did that story trickle down to my generation?

I surely don’t know. I can speculate that it might have been because various people were claiming Native ancestry in order to claim land when the government paid tribal members for land as reservations were dissolved between 1893 and 1914. You can read more about that in this article at the National Archives about the Dawes Rolls, compiled for the Cherokee, Creek, Choctaw, Chickasaw and Seminole for that purpose.

I can also speculate that someone in the family was confused about the brother’s land ownership, especially since it was Cherokee land.

I could also speculate that the confusion might have resulted because her husband’s father actually did move to Oklahoma and lived on Choctaw land.

But here is what I do know. I believed that story because there wasn’t any reason NOT to believe it, and the entire family shared the same story. We all believed it…until we discovered evidence through DNA testing that contradicted the story.

Before we discuss Elizabeth Warren’s actual results, let’s take a brief look at the underlying science.

Enter DNA Testing

DNA testing for ethnicity was first introduced in a very rudimentary form in 2002 (not a typo) and has progressed dramatically since. The major vendors who offer tests that provide their customers with ethnicity estimates (please note the word estimates) have all refined their customers’ results several times. The reference populations improve, the vendors’ internal software algorithms improve and population genetics as a science moves forward with new discoveries.

Note that major vendors in this context means Family Tree DNA, 23andMe, the Genographic Project and Ancestry. Two newer vendors are MyHeritage and LivingDNA, although LivingDNA is focused on England and MyHeritage, which utilizes imputation, is not yet quite up to snuff on its ethnicity estimates. Another entity, GedMatch, isn’t a testing vendor, but does provide multiple ethnicity tools if you upload your results from the other vendors. To get an idea of how widely the results vary, you can see the results of my tests at the different vendors here and here.

My initial DNA ethnicity test, in 2002, reported that I was 25% Native American, but I’m clearly not. It’s evident to me now, but it wasn’t then. That early ethnicity test was the dinosaur ages in genetic genealogy, but it did send me on a quest through genealogical records to prove that my family member was indeed Native. My father clearly believed this, as did the rest of the family. One of my early memories when I was about four years old was attending a (then illegal) powwow with my Dad.

In order to prove that Elizabeth Vannoy, that great-grandmother, was Native, I asked a cousin who descends from her matrilineally to take a mitochondrial DNA test that would unquestionably provide the ethnicity of her matrilineal line – that of her mother’s mother’s mother’s direct line. If she was Native, her haplogroup would be a derivative of either A, B, C, D or X. Her mitochondrial DNA was European, haplogroup J, clearly not Native, so Elizabeth Vannoy was not Native on that line of her family. Ok, maybe through her dad’s line then. I was able to find a Vannoy male descendant of her father, Joel Vannoy, to test his Y DNA and he was not Native either. Rats!

Tracking Elizabeth Vannoy’s genealogy back in time provided no paper-trail link to any Native ancestors, but there were and are still females whose surnames and heritage we don’t know. Were they Native or part Native? Possibly. Nothing precludes it, but nothing (yet) confirms it either.

Ethnicity is often surprising and sometimes disappointing. People who expect Native American heritage in their DNA sometimes don’t find it. Why?

There is no Native ancestor

The Native DNA has “washed out” over the generations, but they did have a Native ancestor

We haven’t yet learned to recognize all of the segments that are Native

The testing company did not test the area that is Native

Not all vendors test the same areas of our DNA. Each major company tests about 700,000 locations, roughly, but not the same 700,000. If you’re interested in specifics, you can read more about that here.

50-50 Chance

Everyone receives half of their autosomal DNA from each parent.

That means that each parent contributes only HALF OF THEIR DNA to a child. The other half of their DNA is never passed on, at least not to that child.

Therefore, ancestral DNA passed on is literally cut in half in each generation. If your parent has a Native American DNA segment, there is a 50-50 chance you’ll inherit it too. You could inherit the entire segment, a portion of the segment, or none of the segment at all.

These calculations are estimates and use averages. Why? Because they tell us what to expect, on average. Every person’s results will vary. It’s entirely possible to carry a Native (or other ethnic) segment from 7 or 8 or 9 generations ago, or to have none in 5 generations. Of course, these calculations also presume that the “Native” ancestor we find in our tree was fully Native. If the Native ancestor was already admixed, then the percentages of Native DNA that you could inherit drop further.
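The averages described above can be sketched in a few lines of Python. This is only an illustration, assuming that the share of an ancestor’s DNA halves exactly in each generation and that your parents count as generation 1. The function name expected_share is my own for this example and is not part of any vendor’s tool.

```python
def expected_share(generations):
    """Average fraction of one ancestor's DNA that a descendant carries.

    Assumes an exact halving per generation, with parents as generation 1.
    Real inheritance is random, so actual amounts vary around this average.
    """
    return 0.5 ** generations


# Print the average expected percentage for ancestors 1 to 10 generations back.
for generation in range(1, 11):
    percent = expected_share(generation) * 100
    print(f"{generation:>2} generations back: {percent:.4f}% on average")
```

Under these assumptions, a fully Native ancestor 4 generations back contributes 6.25% on average, and one 8 generations back contributes about 0.39%. Any individual tester could carry more than that, or nothing at all.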

Why Call Ethnicity an Estimate?

You’ve probably figured out by now that due to the way that DNA is inherited, your ethnicity as reported by the major testing companies isn’t an exact science. I discussed the methodology behind ethnicity results in the article, Ethnicity Testing – A Conundrum.

It is, however, a specialized science known as Population Genetics. The quality of the results that are returned to you varies based on several factors:

World Region – Ethnicity estimates are quite accurate at the continental level, plus Jewish – meaning African, Indo-European, Asian, Native American and Jewish. These regions are more different than alike and better able to be separated.

Reference Population – The size of the population your results are being compared to is important. The larger the reference population, the more likely your results are to be accurate.

Vendor Algorithm – None of the vendors disclose the exact nature of the internal algorithms they use to determine your ethnicity percentages. Suffice it to say that each vendor’s staff includes population geneticists and they all have years of experience. These internal differences are why the estimates vary when compared to each other.

Size of the Segment – As with all genetic genealogy, bigger is better because larger segments stand a better chance of being accurate.

Academic Phasing – A methodology academics and vendors use in which segments of DNA that are known to travel together during inheritance are grouped together in your results. This methodology is not infallible, but in general, it helps to group your mother’s DNA together and your father’s DNA together, especially when parents are not available for testing.

Parental Phasing – If your parents test and they too have the same segment identified as Native, you know that the identification of that segment as Native is NOT a factor of chance, where the DNA of each of your parents just happens to fall together in a manner as to mimic a Native segment. Parental phasing is the ability to divide your DNA into two parts based on your parents’ DNA test(s).

Two Chromosomes – You have two copies of each chromosome, one from your mother and one from your father. DNA testing can’t easily separate those copies, so the exact same “address” on the chromosomes you inherited from your mother and your father may carry two different ethnicities, unless your parents are both from the same ethnic population, of course.

All of these factors, together, create a confidence score. Consumers never see these scores as such, but the vendors return the highest confidence results to their customers. Some vendors include the capability, one way or another, to view or omit lower confidence results.

Parental Phasing – Identical by Descent

If you’re lucky enough to have your parents, or even one parent, available to test, you can determine whether that segment thought to be Native came from one of your parents, or if the combination of both of your parents’ DNA just happened to combine to “look” Native.

Here’s an example where the “letters” (nucleotides) of Native DNA for an example segment are shown at left. If you received the As from one of your parents, your DNA is said to be phased to that parent’s DNA. That means that you in fact inherited that piece of your DNA from your mother, in the case shown below.

That’s known as Identical by Descent (IBD). The other possibility is that your DNA from both of your parents intermixed to mimic a Native segment, shown below.

This is known as Identical by Chance (IBC).

You don’t need to understand the underpinnings of this phenomenon, just remember that it can happen, and the smaller the segment, the more likely that a chance combination can randomly happen.
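For readers who do want to peek under the hood, here is a toy sketch in Python of the difference between the two cases. Everything in it is made up for illustration: the six-letter “Native” reference, the genotypes, and the function names half_identical and classify_segment are my own assumptions, and real phasing tools work on thousands of locations with far more sophisticated logic.

```python
def half_identical(genotypes, reference):
    """True if, at every position, one of the person's two letters matches the reference."""
    return all(ref in pair for pair, ref in zip(genotypes, reference))


def classify_segment(child, mother, father, reference):
    """Classify a child's apparent match to a reference segment using both parents."""
    if not half_identical(child, reference):
        return "no match"
    # If either parent also matches across the whole segment, the child
    # plausibly inherited the segment intact from that parent.
    if half_identical(mother, reference) or half_identical(father, reference):
        return "identical by descent (IBD)"
    # Otherwise the match only exists because letters from both parents
    # happened to line up in the child.
    return "identical by chance (IBC)"


reference = "AAAAAA"  # hypothetical Native segment, one letter per location

# The child has two letters at each location, one from each parent.
child = ["AT", "AC", "AG", "AT", "AC", "AG"]

# Case 1: the mother carries an A at every location.
mother_ibd = ["AG", "AC", "AA", "AT", "AC", "AG"]
father_ibd = ["TT", "CC", "GG", "TT", "CC", "GG"]

# Case 2: the As alternate between the mother and the father.
mother_ibc = ["AT", "CC", "AG", "TT", "AC", "GG"]
father_ibc = ["TT", "AC", "GG", "AT", "CC", "AG"]

print(classify_segment(child, mother_ibd, father_ibd, reference))
print(classify_segment(child, mother_ibc, father_ibc, reference))
```

The child’s own results are identical in both cases, which is exactly why a parent’s test is needed to tell the two situations apart.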

Elizabeth Warren’s Genealogy

Elizabeth Warren’s genealogy is reported to the 5th generation by WikiTree.

Dr. Carlos D. Bustamante is an internationally recognized leader in the application of data science and genomics technology to problems in medicine, agriculture, and biology. He received his Ph.D. in Biology and MS in Statistics from Harvard University (2001), was on the faculty at Cornell University (2002-9), and was named a MacArthur Fellow in 2010. He is currently Professor of Biomedical Data Science, Genetics, and (by courtesy) Biology at Stanford University. Dr. Bustamante has a passion for building new academic units, non-profits, and companies to solve pressing scientific challenges. He is Founding Director of the Stanford Center for Computational, Evolutionary, and Human Genomics (CEHG) and Inaugural Chair of the Department of Biomedical Data Science. He is the Owner and President of CDB Consulting, LTD. and also a Director at Eden Roc Biotech, founder of Arc-Bio (formerly IdentifyGenomics and BigData Bio), and an SAB member of Imprimed, Etalon DX, and Digitalis Ventures among others.

Ancestry Informative Markers (AIMs) are commonly used to estimate overall admixture proportions efficiently and inexpensively. AIMs are polymorphisms that exhibit large allele frequency differences between populations and can be used to infer individuals’ geographic origins.

And:

Using a panel of AIMs distributed throughout the genome, it is possible to estimate the relative ancestral proportions in admixed individuals such as African Americans and Latin Americans, as well as to infer the time since the admixture process.

The methodology produced results of the type that we are used to seeing in terms of continental admixture, shown in the graphic below from the paper.

Matching test takers against the genetic locations that can be identified as either Native or African or European informs us that our own ancestors carried the DNA associated with that ethnicity.

Of course, the Native samples from this paper were focused south of the United States, but the process is the same regardless. The original Native American population of a few individuals arrived thousands of years ago in one or more groups from Asia and their descendants spread throughout both North and South America.

Elizabeth’s request, from the report:

To analyze genetic data from an individual of European descent and determine if there is reliable evidence of Native American and/or African ancestry. The identity of the sample donor, Elizabeth Warren, was not known to the analyst during the time the work was performed.

Elizabeth’s test included 764,958 genetic locations, of which 660,173 overlapped with locations used in ancestry analysis.

The Results section says after stating that Elizabeth’s DNA is primarily (95% or greater) European:

The analysis also identified 5 genetic segments as Native American in origin at high confidence, defined at the 99% posterior probability value. We performed several additional analyses to confirm the presence of Native American ancestry and to estimate the position of the ancestor in the individual’s pedigree.

The largest segment identified as having Native American ancestry is on chromosome 10. This segment is 13.4 centiMorgans in genetic length, and spans approximately 4,700,000 DNA bases. Based on a principal components analysis (Novembre et al., 2008), this segment is clearly distinct from segments of European ancestry (nominal p-value 7.4 x 10^-7, corrected p-value of 2.6 x 10^-4) and is strongly associated with Native American ancestry.

The total length of the 5 genetic segments identified as having Native American ancestry is 25.6 centiMorgans, and they span approximately 12,300,000 DNA bases. The average segment length is 5.8 centiMorgans. The total and average segment size suggest (via the method of moments) an unadmixed Native American ancestor in the pedigree at approximately 8 generations before the sample, although the actual number could be somewhat lower or higher (Gravel, 2012 and Huff et al., 2011).

Dr. Bustamante’s Conclusion:

While the vast majority of the individual’s ancestry is European, the results strongly support the existence of an unadmixed Native American ancestor in the individual’s pedigree, likely in the range of 6-10 generations ago.

I was very pleased to see that Dr. Bustamante had included the PCA (Principal Component Analysis) for Elizabeth’s sample as well.

PCA is the statistical methodology used to group individuals into and within populations.

Figure one shows the section of chromosome 10 that showed the largest Native American haplotype, meaning DNA block, as compared to other populations.

Remember that since Elizabeth received a chromosome from BOTH parents, she has two strands of DNA in that location.

Here’s our example again.

Given that Mom’s DNA is Native, and Dad’s is European in this example, the expected results when comparing this segment of DNA to other populations is that it would look half Native (Mom’s strand) and half European (Dad’s strand.)

The second graphic shows Elizabeth’s sample and where it falls in the comparison of First Nations (Canada) and Indigenous Mexican individuals. Given that Elizabeth’s Native ancestor would have been from the United States, her sample falls where expected, in between.

Let’s take a look at some of the questions being asked.

Questions and Answers

I’ve seen a lot of misconceptions and questions regarding these results. Let’s take them one by one:

Question – Can these results prove that Elizabeth is Cherokee?

Answer – No, there is no test, anyplace, from any lab or vendor, that can prove what tribe your ancestors were from. I wrote an article titled Finding Your American Indian Tribe Using DNA, but that process involves working with your matches, Y and mitochondrial DNA testing, and genealogy.

Q – Are these results absolutely positive?

A – The words “absolutely positive” are a difficult quantifier. Given the size of the largest segment, 13.4 cM, and that there are 5 Native segments totaling 25.6 cM, and that Dr. Bustamante’s lab performed the analysis – I’d say this is as close to “absolutely positive” as you can get without genealogical confirmation.

A 13.4 cM segment is a valid segment that phases to parents 98% of the time, according to Philip Gammon’s work, here, and 99% of the time in my own analysis here. That indicates that a 13.4 cM segment is very likely a legitimately ancestral segment, not a match by chance. The additional 4 segments simply increase the likelihood of a Native ancestor. In other words, for there NOT to be a Native ancestor, all 5 segments, including the large 13.4 cM segment would have to be misidentified by one of the premier scientists in the field.

Q – What did Dr. Bustamante mean by “evidence of an unadmixed Native American ancestor?”

A – Unadmixed means that the Native person was fully Native, meaning not admixed with European, Asian or African DNA. Admixture, in this context, means that the individual is a mixture of multiple ethnic groups. This is an important concept, because if you discover that your ancestor 4 generations ago was a Cherokee tribal member, but the reality was that they were only 25% Native, that means that the DNA was already in the process of being divided. If your 4th generation ancestor was fully Native, you would receive about 6.25% of their DNA, which would be all Native. If they were only 25% Native, you would still receive about 6.25% of their DNA, but only one fourth of that 6.25% could possibly be Native, so about 1.56%. You could also receive NONE of their Native DNA.
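The arithmetic in this answer can be checked with a short Python sketch. It assumes an exact halving of DNA per generation, with parents as generation 1, and the function name expected_native_share is simply my own label for this example. These are averages only, so a real tester could carry more or less.

```python
def expected_native_share(generations, ancestor_native_fraction):
    """Average fraction of a descendant's DNA that is Native via one ancestor.

    generations: how many generations back the ancestor is (parents = 1).
    ancestor_native_fraction: how Native the ancestor was (1.0 = unadmixed).
    """
    share_of_ancestor_dna = 0.5 ** generations  # halves each generation
    return share_of_ancestor_dna * ancestor_native_fraction


# An unadmixed Native ancestor 4 generations back.
fully_native = expected_native_share(4, 1.0)

# An ancestor 4 generations back who was only 25% Native.
quarter_native = expected_native_share(4, 0.25)

print(f"Fully Native ancestor: {fully_native * 100:.4f}% on average")
print(f"25% Native ancestor:   {quarter_native * 100:.4f}% on average")
```

The two results correspond to the 6.25% and roughly 1.56% figures quoted above.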

Q – Is this the same test that the major companies use?

A – Yes and no. The test itself was probably performed on the same Illumina chip platform, because the chips available cover the markers that Bustamante needed for analysis.

The major companies use the same reference databases, plus their own internal or private databases. They do not create PCA models for each tester. They do use the same methodology described by Dr. Bustamante in terms of AIMs, along with proprietary algorithms to further define the results. Vendors may also use additional internal tools.

Q – Did Dr. Bustamante use more than one methodology in his analysis? What if one was wrong?

A – Yes, he utilized two different methodologies whose results agreed. The global ancestry method evaluates each location independently of any surrounding genetic locations, ignoring any correlation or relationship to neighboring DNA. The second methodology, known as the local ancestry method, looks at each location in combination with its neighbors, given that DNA pieces are known to travel together. This second methodology allows comparisons to entire segments in reference populations and is what allows the identification of complete ancestral segments that are identified as Native or any other population.

Q – If Elizabeth’s DNA results hadn’t shown Native heritage, would that have proven that she didn’t have Native ancestry?

A – No, not definitively, although that is a possible reason for ethnicity results not showing Native admixture. It would have meant that either she didn’t have a Native ancestor, the DNA washed out, or we cannot yet detect those segments.

Q – Does this qualify Elizabeth to join a tribe?

A – No. Every tribe defines its own criteria for membership. Some tribes embrace DNA testing for paternity issues, but none, to the best of my knowledge, accept or rely entirely on DNA results for membership. DNA results alone cannot identify a specific tribe. Tribes are societal constructs and Native people genetically are more alike than different, especially in areas where tribes lived nearby, fought and captured other tribes’ members.

Q – Why does Dr. Bustamante use words like “strong probability” instead of absolutes, such as the percentages shown by commercial DNA testing companies?

A – Dr. Bustamante’s comments accurately reflect the state of our knowledge today. The vendors attempt to make the results understandable and attractive for the general population. Most vendors, if you read their statements closely and look at your various options, indicate that ethnicity is only an estimate, and some provide the ability to view your ethnicity estimate results at high, medium and low confidence levels.

Q – Can we tell, precisely, when Elizabeth had a Native ancestor?

A – No, that’s why Dr. Bustamante states that Elizabeth’s ancestor was approximately 8 generations ago, and in the range of 6-10 generations ago. This analysis is a result of combined factors, including the total centiMorgans of Native DNA, the number of separate reasonably large segments, the size of the longest segment, and the confidence score for each segment. Those factors together predict most likely when a fully Native ancestor was present in the tree. Keep in mind that if Elizabeth had more than one Native ancestor, that too could affect the time prediction.

Q – Does Dr. Bustamante provide this type of analysis or tools for the general public?

A – Unfortunately, no. Dr. Bustamante’s lab is a research facility only.

Roberta’s Summary of the Analysis

I find no omissions or questionable methods and I agree with Dr. Bustamante’s analysis. In other words, yes, I believe, based on these results, that Elizabeth had a Native ancestor further back in her tree.

I would love for every tester to be able to receive PCA results like this.

However, an ethnicity confirmation isn’t all that can be done with Elizabeth’s results. Additional tools and opportunities are available outside of an academic setting, at the vendors where we test, using matching and other tools we have access to as the consuming public.

We will look at those possibilities in a second article, because Elizabeth’s results are really just a beginning and scratch the surface. There’s more available, much more. It won’t change Elizabeth’s ethnicity results, but it could lead to positively identifying the Native ancestor, or at least the ancestral Native line.

Join me in my next article for Possibilities, Wringing the Most Out of Your DNA Ethnicity Test.

Recently Shawn and Lois Potter used the Minority Admixture Mapping technique that I developed and described in the series “The Autosomal Me” to establish that the mother of John Red Bank Payne was Native American. Shawn and Lois were so encouraged after that positive experience that they set forth to document another Native ancestor.

They produced this report as a beautiful and fully sourced booklet which they have very graciously given permission to reproduce in part here.

Daughters of Princess Mary Kittamaquund

Every student of American history has heard about Pocahontas—the young Indian princess who struggled to establish peace between the Powhatan Indians and Virginia colonists, married Englishman John Rolfe, and left descendants through her son Thomas Rolfe. But, few have heard about Mary Kittamaquund—another young Indian princess who likewise promoted peace between the Piscataway Indians and Maryland colonists, married Englishman Giles Brent, and, as revealed by archival research combined with DNA analysis, left descendants through her daughters. Both women lived heroic yet brief lives; and both should be remembered for their devotion to their people in an age of momentous danger and change. The following sketch introduces Princess Mary Kittamaquund and her daughters.

Mary Kittamaquund, daughter of the Tayac (Paramount Chief) of the Piscataway Indians, was born in Maryland probably about 1631.[i] Her father ruled over as many as 7,000 people between the Potomac and Patuxent Rivers.[ii] Following about six months of dialogue and study with Jesuit Missionary Father Andrew White, Mary’s father converted to Christianity and was baptized on July 5, 1640.[iii] Soon after February 15, 1640/1, Mary too was baptized, and her father sent her to the English settlement called St Mary’s City, near the mouth of the Potomac River, to be educated by Governor Leonard Calvert and his sister-in-law, Margaret Brent.[iv]

Mary married Giles Brent, brother of Margaret Brent, before January 9, 1644/5.[v] A band of Parliamentarians led by Richard Ingle and William Claiborne attacked St Mary’s City on February 14, 1644/5, and carried Giles Brent, Father Andrew White, and others in chains to England. Upon his arrival in London, Giles brought suit against his captors and returned to Maryland before June 19, 1647.[vi] Mary and Giles moved to present day Aquia, Stafford County, Virginia, after November 8, 1648, and before August 20, 1651.[vii] Mary died probably after April 18, 1654, and before September 4, 1655.[viii] Giles Brent died in Middlesex County, Virginia, on September 2, 1679.[ix]

Scholars disagree about the number of children born to Mary Kittamaquund and Giles Brent. Some list only three children named in the 1663 and 1671 wills of sister and brother Margaret and Giles Brent.[x] Margaret appointed her brother Giles “and his children Giles Brent, Mary Brent, and Richard Brent” executors of her estate.[xi] Giles left bequests to his son Giles Brent and daughter Mary Fitzherbert.[xii] Other historians, such as Dr. Lois Green Carr, Maryland Historian at the Maryland State Archives, on the basis of information gleaned from provincial court records, probate records, and quitrent rolls, identify six children of Mary and Giles, including Katherine Brent (who married Richard Marsham), Giles Brent (who married his cousin Mary Brent), Mary Brent (who married John Fitzherbert), Richard Brent (who died after December 26, 1663), Henry Brent (who died young), and Margaret Brent (who also died young).[xiii]

Some researchers further believe daughter Mary Brent divorced John Fitzherbert before April 26, 1672, and married second Charles Beaven. This belief is supported by (1) a reference to the divorce of Mary and John in a letter of this date from Charles Calvert to his father, (2) a statement regarding “my brother in law Richard Marsham” in the June 20, 1698 will of Charles Beaven, (3) the appointment of “my well beloved Richard Marsham” by Mary Beaven to be executor of her 1712 will, and (4) other circumstances demonstrating kinship ties between these families.[xiv] Still others refuse to accept this relationship without further evidence, lamenting the loss of contemporary records which has “confused researchers for a hundred years.”[xv]

Recent DNA analysis, however, reveals six descendants of Katherine and Richard Marsham and three descendants of Mary and Charles Beaven, representing six separate lineages, inherited at least sixteen matching segments of Native American DNA on chromosomes 2, 3, 4, 5, 6, 7, 8, 13, 15, 16, 20, and 22. Figure 1 shows the relationships between these descendants; and Figures 2-17 illustrate the sixteen matching Native American chromosomal segments (see Figures 18-33 for additional images of these segments produced by four independent admixture tools; and also see http://dna-explained.com/2013/06/02/the-autosomal-me-summary-and-pdf-file/ for information about Minority Admixture Mapping). These matching chromosomal segments point to a common Native American ancestor, who, because other possibilities can be eliminated, must have been the mother of Katherine and Mary.[xvi] Considering this DNA evidence in light of contemporary records, it now seems certain Mary Kittamaquund and Giles Brent were the parents of Katherine, wife of Richard Marsham, and Mary, wife first of John Fitzherbert and second of Charles Beaven.

Genealogical Summary

Katherine Brent was born probably in Aquia, Stafford County, Virginia, say about 1650. She may have served an unknown period of indentured service to Thomas Brooke, perhaps following the death of her mother, before she married Richard Marsham perhaps before December 26, 1663, and certainly before March 11, 1664/5.[xvii] Richard immigrated to Maryland in 1658, where he served three years of indentured service to John Horne for his transatlantic voyage.[xviii] Katherine died in Calvert County, Maryland, before October 26, 1670.[xix] Richard married second Anne Calvert, widow first of Baker Brooke Sr., and second of Henry Brent, after April 30, 1695, and before February, 1696.[xx] Richard died in Prince George’s County, Maryland, between April 14 and 22, 1713.[xxi] Katherine and Richard were the parents of the following children:

1. Sarah Marsham was born in Calvert County, Maryland, say about 1667, married first Basil Waring say about 1685, married second William Barton after December 29, 1688, married third James Haddock after April 19, 1703, and died in Charles County, Maryland, after January 8, 1733.[xxii]

2. Katherine Marsham was born in Calvert County, Maryland, say about 1669, married first her future step-brother Baker Brooke Jr. say about 1689, married second Samuel Queen after May 27, 1698, and died in St Mary’s County after March 18, 1712, and before April 14, 1713.[xxiii]

Mary Brent was born probably in Aquia, Stafford County, Virginia, say about 1654.[xxiv] She married first John Fitzherbert before 1671.[xxv] Mary and John divorced before April 26, 1672.[xxvi] Mary married second Charles Beaven say about 1674. Charles died in Prince George’s County, Maryland, between June 20, 1698, and June 21, 1699.[xxvii] Mary died in Prince George’s County between April 28, 1712, and June 13, 1713.[xxviii] Mary and Charles were the parents of the following children:

1. Richard Beaven was born in Calvert County, Maryland, say about 1676, married Jane Blanford before June 11, 1703, and died in Prince George’s County, Maryland, before August 9, 1744.[xxix]

2. Sarah Beaven was born in Calvert County, Maryland, say about 1678, married Thomas Blanford on June 20, 1698, and died in Prince George’s County, Maryland, after August 7, 1749.[xxx]

3. Margaret Beaven was born in Calvert County, Maryland, say about 1680, and died in Prince George’s County, Maryland, between April 28, 1712, and June 13, 1713.

4. Elizabeth Beaven was born in Calvert County, Maryland, say about 1682, married John Boone about 1708, and died in Prince George’s County, Maryland, before October 30, 1725.

5. Katherine Beaven was born in Calvert County, Maryland, say about 1684, married Henry Culver about 1711, and died in Prince George’s County, Maryland, before December 20, 1762.[xxxi]

6. Charles Beaven was born in Calvert County, Maryland, say about 1686, married Mary Finch about 1712, and died in Prince George’s County, Maryland, on December 16, 1761.[xxxii]

Following this lineage information, Shawn and Lois included a chromosome by chromosome analysis of the various individuals who tested. I am including only one example, below.

Following the many pages of genetic comparison information, Shawn and Lois included quite a bit for their readers about the Piscataway History and Culture. After all, DNA without genealogy and history is impersonal science. Included were early drawings and paintings of Native people and villages, an account of the people by Father Andrew White in 1635 as well as anonymous documents from 1639 and 1640. Their food, language and vocabulary were discussed as well with historical events being presented in timeline format.

1644 Wahocasso succeeded as Tayac, who was succeeded by Uttapoingassenem in 1658, who was succeeded by Wannasapapin in 1662, who was succeeded by Nattowasso (son of Wahocasso—breaking the tradition of matrilineal succession) in 1663

1666 Facing increasing encroachments by European settlers, the Piscataway petitioned the Maryland council, saying: “We can flee no further. Let us know where to live, and how to be secured for the future from the hogs and cattle.”

1695 Maryland Governor Francis Nicholson “advised the council to find a way of depriving Indians beyond Mattawoman Creek of their lands, in order to ‘occasion a greater quantity of Tobacco to be made.'”

1697 Piscataway Tayac Ochotomaquath and about 400 others fled to northern Virginia; then they allied with the Iroquois in 1701 and moved to Pennsylvania.

Although many Piscataway left Maryland by the end of the 17th century in the face of encroaching European settlements, others remained on their homeland, intermarrying with Europeans and Africans, while preserving their cultural traditions. In 1996, an advisory committee appointed by the Maryland Historical Trust voted unanimously to recommend state recognition of the Piscataway Indian Nation, citing genealogical, linguistic, cultural, and political continuity between the earliest Piscataway people and their modern descendants. On January 9, 2012, Maryland Governor Martin O’Malley issued two executive orders, granting official state recognition to the Piscataway Indian Nation (about 100 members), and the Piscataway Conoy Tribe—consisting of the Piscataway Conoy Confederacy and Subtribes (about 3,500 members), and the Cedarville Band of Piscataway (about 500 members).

This drawing of St Mary’s City in 1634 by Cary Carson from the Maryland State Archives Map Collection shows the Native people living outside the city fortifications.

This 262-page book is a wonderful combination of genealogy, genetics and history, and does exactly what genetic genealogy is supposed to do. It enables us to document and better understand our ancestors and, in this case, to prove they were, indeed, Native American. Shawn and Lois would welcome inquiries about the book or the family lines included, and you can contact them at shpxlcp@comcast.net.

[i] Most scholars estimate her year of birth as 1634, because an unidentified Catholic missionary made the following statement about her. “On the 15th of February we came to Pascatoe, not without the great gratulation and joy of the inhabitants, who indeed seem well inclined to receive the christian faith. So that not long after, the king brought his daughter, seven years old, (whom he loves with great affection,) to be educated among the English at St. Mary’s; and when she shall well understand the christian mysteries, to be washed in the sacred font of baptism.” See “Extracts from Different Letters of Missionaries, from the Year 1635, to the Year 1638,” in E.A. Dalrymple, ed., Relatio Itineris in Marylandiam. Declaratio Coloniae Domini Baronis de Baltimoro. Excerpta ex Diversis Litteris Missionariorum ab Anno 1635, ad Annum 1638, Narrative of a Voyage to Maryland, by Father Andrew White, S.J. An Account of the Colony of the Lord Baron Baltimore. Extracts from Different Letters of Missionaries, from the Year 1635 to the Year 1677 (Baltimore: Maryland Historical Society, 1874), 76. But, the circumstances of Mary’s life suggest she was born a few years earlier. So, we suspect the author of this letter underestimated her age.

[v] John Lewger to Governor Leonard Calvert, January 9, 1644/5, in Proceedings of the Council of Maryland, 1636-1667, Vol 3, pp. 162-163 (original pages 186-187), Archives of Maryland Online. “To the horle Governor. Sir I doe signify unto you that Mr Giles Brent hath delivered unto me 2. petitions nerewth sent unto you; and I desire you by vertue of the Law in that behalfe, that you wilbe pleased to give him a competent security for his indemnification in the possession of the lands at Kent, mentioned in one of the said petitions, & for iustification of his title in them, according to the said petition, dated 7. January instant: & likewise to satisfy unto him 5700l tob & cask, demanded in the other petition for damage of non pformance of a covenant to his wife Mary touching certaine cattell; or els to shew cause why you refuse to doe either; and to appoint some time when the Counsell shall attend you for it, betweene this & Monday next. So humbly take leave to rest Yor servant S. Johns. 9th Jan: 1644 John Lewger.” See also Margaret Brent, “Account of the Estate of Governor Leonard Calvert,” June 6, 1648, in Judicial and Testamentary Business of the Provincial Court, 1637-1650, Vol. 4, pp. 388-389 (original pages 159-160). “By payd to Mrs. Mary Brent Kittamagund 0748.”

[vi] For information about the arrest and transport of Giles Brent to London during Richard Ingle’s Rebellion, see “Richard Ingle in Maryland” in Maryland Historical Magazine, Vol. 1(1906), 125-140. For the terminus ad quem (limit to which—latest possible date) Giles Brent returned to Maryland, see Maryland State Archives, Judicial and Testamentary Business of the Provincial Court, 1637-1650, Vol. 4:312-313. “June 19th This day came Margaret Brent Gent, & desyred the testimony of the prnt Gouernor Mr Tho: Greene concerning the last will & Testamt of the late Gouernor Leonard Calvert Esqr And the sd Gouernor did authorize Giles Brent Esqr one of his Lops Counsell to administer an oath unto him the sd Gouernr concerning the foresd busines. The sd Gouernor Tho: Greene Esqr answered uppon oath concerning the last will & Testamt of Leo: Calvert Esqr aforesd That the sd Leo: Calvert, lying uppon his death bed, some 6 howres before his death, being in prfect memory, directing his speech to Mrs Margarett Brent sayd in pnce of him the sd Mr Greene & some others I make you my sole Exequutrix, Take all, & pay all. After wch words hee the sd Leon: Calvert desyred every one to depart the roome & was some space in priuate conference wth Mrs Marg: Brent aforesd Afterwards the Mr Greene comeing into the roome againe, he heard the sd Mr L: Calvert appoint certaine Legacies in manner following. Viz I doe giue my warring cloaths to James Linsay, & Richard William my servants, specifying his coath suite to Rich. Willan & his black suite to James Linsey. & his waring Linnen to be diuided betweene them. Aliso I giue a mare Colt to my God sonne Leon: Greene. Allso hee did desyre tht his exequutrix should giue the first mare Colt tht should fall this yeare, (& if non fall in this yeare, then the first tht shall hereafter fall) unto Mrs Temperance Pippett of Virginea. And further he deposeth not. Recognit Teste mc Willm Bretton Clk.”

[vii] The terminus a quo (limit from which—earliest possible date) for the relocation of Giles Brent from Maryland to Virginia is the date Giles Brent appeared in court at St. Mary’s on November 8, 1648, requesting compensation for destruction of his property on the Isle of Kent by anti-Papists. See Archives of Maryland, November 8, 1648, Liber A, Folio 205. The terminus ad quem (limit to which—latest possible date) Giles Brent removed from Maryland to Virginia is the date Giles Brent patented Marlborough in Potomac Neck, Virginia, on August 20, 1651. See entry from Mercer Land Book cited by W.B. Chilton, ed., “The Brent Family,” The Virginia Magazine of History and Biography, Vol. 16, No. 1 (Jul., 1908), 96-97.

[viii] Virginia Magazine XVI, 211. On April 17, 1654, Giles conveyed his personal estate in Virginia and Maryland to his sister Mary, in trust to educate his children and allow maintenance to his wife Mary. See also Lurene Rose Bivin in “Brent-Marsham-Beaven-Blandford Article: A Closer Look,” Maryland Genealogical Society Bulletin, Vol. 37, No. 3, 328-334. “In the grant to John Harrison (dated 4 September 1655), he refers to his “sister” as Mrs. Frances Harrison (Nugent, p. 319).” Giles may have been engaged to marry his second wife, Frances Whitgreaves, widow of Jeremiah Harrison, on this date, because John Harrison made a provision for Giles.

[ix] W.B. Chilton, ed., “The Brent Family,” The Virginia Magazine of History and Biography, Vol. 16, No. 2 (Oct., 1908), 212. “‘Register of Christ Church, Middlesex County, Virginia. Collo Giles Brent of Potomac departed this life 2d of September 1679 and was buried in the Great Church Yard ye next day following.'”

[xi] W.B. Chilton, Vol. 16, No. 1 (Jul., 1908), 98-99. “The Will of Margaret Brent. In the name of God Amen. I Margaret Brent of Peace in the County of Westmoreland in Virginia considering the casualtys of human life do therefore make this my last Will and Testament as followeth my soul I do bequeath to the mercies of my Savior Jesus Christ and my worldly estate to be disposed of by my Executors as followeth to my nephew George Brent I give all my rights to take up land in Maryland except those already assigned to my cousin James Clifton to my niece Clifton I give a cow and to my neece Elizabeth Brent I give a heifer; to Ann Vandan I give a cow calf; to my neece Mary Brent daughter of my Brother Giles Brent I give all my silver spoons which are six; to my nephew Richard Brent son of my brother Giles Brent I give my patent of lands at the Falls of Rappahanock River also my lease of Kent Fort Mannor in Maryland saving yet power to his Father my brother Giles Brent that if he shall like to do so he may sell said lease and satisfye to his son other where as he shall think fitt in lands good or money and in case of my said nephew Richard Brents death under age and without heirs of his body lawfully begotten his legacy thereto to go to his brother Giles Brent or his sister Mary Brent or to the heirs of my brother Giles Brent or otherwise as my said brother shall dispose it by his Deed or last Will to my brother Giles Brent and to his heirs forever I give all my lands goods and chattles and all my estate real and personal and all that is or may be due to me in England Virginia Maryland or elsewhere still excepting the before disposed of in this my last will and Testament and I do appoint him my said Brother Giles Brent and his children Giles Brent Mary Brent and Richard Brent or such of them as are living at the time of my death the Executors of this my last Will and Testament. 
In witness whereof I have hereunto set my hand and seal this 26th day of December, Anno Domini, 1663.”

[xii] W.B. Chilton, Vol. 16, No. 1 (Jul., 1908), 98. “The Will of Giles Brent. In the Name of God Amen. I Giles Brent of the Retirement in Stafford County in Virginia Esquire contemplating the uncertainty of my time of death do ordain this my last Will and Testament in manner and form following my body to the earth and my Soul I bequeath to the mercy of my Savior Christ all my worldly estate I appoint to my Exectors to be disposed of as followeth to my daughter Mary Fitzherbert I give five ewes and a ram to my son and heir Giles Brent and to the heirs of his body lawfully begotten I give for ever all my lands rights unto lands and reversions of lands any ways due to me in either England Virginia or Maryland and for want of such heirs then unto mine own right heirs and for want of such then to the right heirs of my Honored Father Richard Brent, Esquire, deceased Antiently Lord of the mannors of Admington and Lark Stoke in the County of Gloucestershire in England after my debts paid I give all my goods moveable or immoveable whatsoever to be disposed of as followeth three thousand pounds of good tobacco with cask to be given by them my Executors unto pious use where and to whom they shall see fitt for which doing and how and to whom given I Will that to none else but God they shall be accountable. I also Will that to Mr. Edward Sanders they give four ewes and a ram and to John Howard four ewes and a ram. Executors of this my last Will and Testament I appoint my son Giles Brent and my Brother Richard Brent and my Brother William Brent both in England and as Attorneys in their Executorship untill my said Brothers shall otherwise order and I do appoint Mr. Edward Sanders and John Howard above mentioned both of Stafford County to be and to act and it is my Will that after my debts and my Legacies paid my said Executors stand possessed of all my goods and personal estate to the sole use of my son Giles Brent then to be delivered into his sole dispose when it shall please God that he hath arrived to the age of one and twenty years. In witness unto this my within written last Will and Testament I have hereunto set my hand and seal this last day of August, Anno Domini, 1671.”

[xiv] See excerpt from Charles Calvert to Cecilius Calvert, April 26, 1672, in William Hand Browne, ed., Proceedings of the Council of Maryland: 1671-1682 (Baltimore: Maryland Historical Society, 1896), xiv. “Major Fitzherbert’s brother who maryed the Indian Brent, has civilly parted with her, and (as I suppose) will never care to bed with her more; soe that your Lordship needs not to feare any ill consequence from that match, butt what has already happened to the poore man, who unadvisedly threw himself away upon her in hopes of a great portion which now is come to little.” See also Will of Charles Beaven, signed June 20, 1698, proved June 21, 1699, Prerogative Court (Wills) Vol. 2, pp. 182-183, Liber 6, Folios 285-286. See also Will of Mary Beaven, signed April 18, 1712, proved June 13, 1713, Prerogative Court (Wills) Vol. 3, p. 240, Liber 13, Folio 513. See also Maryland Land Patents, BB#37:374. On March 15, 1696/7, Richard Marsham transferred a 600-acre grant called The Hickory Thickett to Charles Beaven by assignment.

[xvi] Four potential scenarios explain this matching DNA considered together with Charles Beaven’s reference to Richard Marsham as “my brother iñ Richard Marsham.” The first scenario is Richard Marsham and Charles Beaven were brothers. This scenario almost certainly is not true because Richard Marsham and Charles Beaven had different last names and the written reference by Charles Beaven to Richard Marsham as “my brother iñ” appears to have been a standard contraction of “my brother-in-law.” The second scenario is Richard Marsham and Mary, wife of Charles Beaven, were brother and sister. This scenario almost certainly is not true because Mary referred to Richard Marsham as “my well beloved Richard Marsham.” If Richard Marsham and Mary had been brother and sister, Mary surely would have referred to Richard as her brother. The third scenario is Charles Beaven and Katherine, wife of Richard Marsham, were brother and sister. This scenario almost certainly is not true because their descendants inherited matching segments of Native American DNA. Charles Beaven immigrated from England to Maryland in 1666 (Skordas, Liber 9, folio 455), so he surely did not inherit Native American DNA from his parents. The fourth and most compelling scenario is Katherine, wife of Richard Marsham, and Mary, wife of Charles Beaven, were sisters, and they also were daughters of a parent with Native American ancestry. This scenario is consistent with other indications that Katherine and Mary were daughters of Mary Kittamaquund and Giles Brent.

[xvii] Maryland Colonial Land Records, Liber 7, Folio 582, 583, Maryland State Archives. “March xith 1664. Came David Bowens and demands land for these rights following John Barnes, Clement Barnes, Margaret Whitthe, Martha Garbett, Catherine Marsham by Assign and Francis Street by Assign as follows–Know all to whom these presents may concern, that I Katherine Marsham doe assigne all my Right and Title of a Right due to mee the said Katherine for fifty acres of land unto David Bowing as witness my hand this Eleventh of March One Thousand six hundred sixty foure. Katherine Marsham (her K mark). Witness Richard Marsham, Robert Turner. Know all men by these presents to whom this may concern that I Francis Streete doe assigne all my Right and Title of a right due to mee the said Francis Streete for fifty acres of Land unto David Bowing as witness my hand this Eleventh of March One Thousand six hundred sixty four. Francis Streete. Witness Richard Marsham, Robert Turner.” See also Maryland Colonial Land Records, Liber 12, Folio 512, Maryland State Archives. “May 11th 1670. Came Richard Marsham of Calvert County and proved right to fifty acres of land it being due to him for the time of service of Katherine his wife performed to Major Thomas Brooke, Warrant then issued in the name of the said Richard Marsham for fifty acres of land it being due to him for the causio oraem above. Certified the 11th of August next.” Note: Even though these two documents indicate Katherine was due a total of 100 acres, the first 50 acres for an unstated cause and the second 50 acres for service to Thomas Brooke, neither record says Katherine was transported to Maryland, and both records may result from fraudulent claims. If these records reflect legitimate claims, they do not say or prove Katherine was transported to Maryland, since some claims were granted for people who were born in Maryland. 
For example, a patent for 1,644 acres was granted to Mary Brent on November 17, 1652, for the transportation of 33 persons, including “Mrs. Mary Brent, wife to Capt. Brent.” See Nugent, pp. 266-267. This Mrs. Mary Brent was Mary Kittamaquund, wife of Giles Brent, who certainly was born in Maryland. Furthermore, according to Abbott Emerson Smith (“The Indentured Servant and Land Speculation in Seventeenth Century Maryland,” in The American Historical Review, Vol. 40, p. 467), “A great many of the warrants which were granted were for rights proved by the wife of a freedman. It is not unlikely that some persons managed to get freedom dues in land, although they had never been in indentured service.” Finally, if Katherine did serve a term of indenture, her service may have resulted from the death of her mother at a time when she was old enough to begin providing for her own maintenance. It was not unusual during this era for children of deceased well-to-do colonists to serve a term of indenture.

[xviii] See Maryland Colonial Land Records, Liber 4, Folio 4, Maryland State Archives. “May the 7th 1659. John Home demands Land for the transportation of himself and his Servants, Richard Marsham & John Edmondson, in 1658.” See also Maryland Colonial Land Records, Liber 5, Folio 295, Maryland State Archives. “Know all men that I Richard Marsham do give and make over to Thomas Pagett my right as is due to me as being a Servant, and now being free in Roberto McJohn Hearen as witness my hand the 16th of September 1661. Richard Marsham. Wit: Robert Coberthwail, Michael Coreuly.”

[xix] See Maryland Colonial Land Records, Liber 12, Folio 512, Maryland State Archives, as cited above. “May 11th 1670. Came Richard Marsham of Calvert County and proved right to fifty acres of land it being due to him for the time of service of Katherine his wife performed to Major Thomas Brooke, Warrant then issued in the name of the said Richard Marsham for fifty acres of land it being due to him for the causio oraem above. Certified the 11th of August next.” See also Maryland Colonial Land Records, October 26, 1670, Liber 14, Folio 228. “Patent for 50 acres in St. Mary’s County, originally Calvert County, to Richard Marsham, tract called St. Katherine’s.” Note: This patent establishes the terminus ad quem (limit to which—latest possible date) for Katherine’s death, because Richard would be unlikely to name this property Saint Katherine’s unless Katherine had died.

[xx] The terminus a quo (limit from which—earliest possible date) for Richard’s marriage to Anne Calvert is established by the date of a Prerogative Court record concerning the estate of Henry Brent naming Anne Brent executrix. See Prerogative Court Records, April 30, 1695, Liber 13A, folio 291, Maryland State Archives. The terminus ad quem (limit to which—latest possible date) for Richard’s marriage to Anne Calvert is the date they were named as husband and wife on a probate record. See Provincial Court Judgments, February Court 1696, Liber P. L. #3, Folios 556-557, Maryland State Archives. Richard Marsham with Ann Marsham, administrator of Henry Brent, against Thomas Collier.

[xxii] The approximate year of Sarah’s marriage to Basil Waring is estimated from the year of Basil’s death preceded by four years to account for the births of two children. See Will of Basil Waring, signed December 8, 1688, probated December 29, 1688, Maryland Calendar of Wills, Vol. 2, p. 50, and Liber 6, Folio 66. Basil named his wife Sarah and sons Marsham and Basil. The terminus a quo (limit from which—earliest possible date) for Sarah’s marriage to William Barton is determined by the probate date of the will of her first husband Basil Waring. See Will of Basil Waring, signed December 8, 1688, probated December 29, 1688, Maryland Calendar of Wills, Vol. 2, p. 50, and Liber 6, Folio 66. The terminus a quo (limit from which—earliest possible date) for Sarah’s death is determined by her deed to Robert Mackhorn. See Deed from Sarah Haddock to Robert Mackhorn, signed January 8, 1733, recorded March 18, 1733/4, Charles County Land Records: 1733-1743, Book O #2, page 28. “Sarah Haddock, widow, of Prince George’s County, formerly wife of William Barton, late of Charles County, Gent., deceased, to Robert Mackhorn of Charles County, planter. William Barton by his will, divised to his son-in-law, Basil Waring, 300 acres, being part of this tract of land called Hadlow, lying in Charles County, and the rest of Hadlow to his wife, being now the aforementioned Sarah Haddock. Now this deed witnesses that sd. Sarah Haddock, for 4500 lbs tobacco, has sold to said Robert the rest of Hadlow, lying in Charles County, bounded by Thos. Gerard, the division line made by sd. Sarah Haddock and Basil Waring. Signed Sarah Haddock. Wit. Jas. Haddock Waring, Henry Keen.”

[xxiii] The approximate year of Katherine’s marriage to Baker Brooke is estimated from the year of Baker’s death preceded by eight years to account for the births of four children. See Will of Baker Brooke, signed February 5, 1698, probated May 27, 1698, Maryland Calendar of Wills, Vol. 2, p. 142, and Liber 6, Folio 83. Baker named his wife Katherine and four children Baker, Leonard, Richard, and Ann. The terminus a quo (limit from which—earliest possible date) for Katherine’s marriage to Samuel Queen is determined by the probate date of the will of her first husband Baker Brooke. See Will of Baker Brooke, signed February 5, 1698, probated May 27, 1698, Maryland Calendar of Wills, Vol. 2, p. 142, Liber 6, Folio 83. The terminus a quo (limit from which—earliest possible date) for Katherine’s death is determined by the date her husband’s will was probated. See Will of Samuel Queen, signed January 10, 1711, probated March 18, 1712, Maryland Prerogative Court (Wills), Vol. 3, p. 222, Liber 13, Folio 389, Maryland State Archives. The terminus ad quem (limit to which—latest possible date) for Katherine’s death is determined by the date of the will of her father, Richard Marsham, which provides for her children but does not mention her. See Will of Richard Marsham, signed April 14, 1713, probated April 22, 1713, Maryland Prerogative Court (Wills), Liber 13, Folios 514-520, Maryland State Archives.

[xxiv] On April 5, 1673, Giles Brent Jr., son of Col. Giles Brent and Mary Kittamaquund, deeded 500 acres, which he had inherited from his father, to his uncle George Brent of Woodstock, Stafford County, Virginia, stating he had reached the age of 21—a condition set in his father’s will for his ability to take possession of the land. This suggests Giles Brent Jr. was born about 1652. See W.B. Chilton, Vol. 16, No. 1 (Jul., 1908), 99-100.

[xxvi] See excerpt from Charles Calvert to Cecilius Calvert, April 26, 1672, in William Hand Browne, ed., Proceedings of the Council of Maryland: 1671-1682 (Baltimore: Maryland Historical Society, 1896), xiv. “Major Fitzherbert’s brother who maryed the Indian Brent, has civilly parted with her, and (as I suppose) will never care to bed with her more; soe that your Lordship needs not to feare any ill consequence from that match, butt what has already happened to the poore man, who unadvisedly threw himself away upon her in hopes of a great portion which now is come to little.”
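For readers who like to see the logic spelled out, the terminus a quo / terminus ad quem reasoning used throughout these notes can be sketched in a few lines of Python. This is only an illustration, not anything taken from the book: the `DateBracket` class and its method names are invented here, and the dates are used exactly as written in note [xxiii], ignoring the Old Style/New Style dual dating that affects January through March dates in this period.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DateBracket:
    """An event known only by its earliest and/or latest possible date."""
    earliest: Optional[date] = None  # terminus a quo
    latest: Optional[date] = None    # terminus ad quem

    def narrow(self, other: "DateBracket") -> "DateBracket":
        """Combine two brackets for the same event, keeping the tighter bounds."""
        lows = [d for d in (self.earliest, other.earliest) if d is not None]
        highs = [d for d in (self.latest, other.latest) if d is not None]
        earliest = max(lows) if lows else None
        latest = min(highs) if highs else None
        if earliest is not None and latest is not None and earliest > latest:
            raise ValueError("records conflict: earliest bound falls after latest bound")
        return DateBracket(earliest, latest)

    def describe(self) -> str:
        """Render the bracket in the 'after ... and before ...' style of the summary."""
        parts = []
        if self.earliest is not None:
            parts.append(f"after {self.earliest.isoformat()}")
        if self.latest is not None:
            parts.append(f"before {self.latest.isoformat()}")
        return " and ".join(parts) if parts else "date unknown"


# Death of Katherine (Marsham) Queen, using the two records cited in note [xxiii]:
# her husband's will was probated March 18, 1712 (earliest possible date), and her
# father's will, signed April 14, 1713, does not mention her (latest possible date).
from_husbands_probate = DateBracket(earliest=date(1712, 3, 18))
from_fathers_will = DateBracket(latest=date(1713, 4, 14))

death = from_husbands_probate.narrow(from_fathers_will)
print(death.describe())
```

Each record contributes only one bound, and combining them gives the "after March 18, 1712, and before April 14, 1713" range stated in the genealogical summary. If two records ever produced an earliest date later than the latest date, the sketch raises an error, which in practice would signal a misread record or a misidentified person.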

Last year I wrote a column at the end of the year titled “2012 Top 10 Genetic Genealogy Happenings.” It’s amazing the changes in this industry in just one year. It certainly makes me wonder what the landscape a year from now will look like.

I’ve done the same thing this year, except we have a dozen. I couldn’t whittle it down to 10, partly because there has been so much more going on and so much change – and, in the case of Ancestry, because they are noteworthy for having had so little positive movement.

If I were to characterize this year of genetic genealogy, I would call it The Year of the SNP, because that applies to both Y DNA and autosomal. Maybe I’d call it The Legal SNP, because it is also the year of law, court decisions, lawsuits and FDA intervention. To say it has been interesting is like calling the Eiffel Tower an oversized coat hanger.

I’ll say one thing…it has kept those of us who work and play in this industry hopping busy! I guarantee you, the words “I’m bored” have come out of the mouth of no one in this industry this past year.

I’ve put these events in what I consider to be relatively accurate order. We could debate all day about whether the SNP Tsunami or the 23andMe mess is more important or relevant – and there would be lots of arguing points and counterpoints…see…I told you lawyers were involved….but in reality, we don’t know yet, and in the end….it doesn’t matter what order they are in on the list:)

Y Chromosome SNP Tsunami Begins

The SNP tsunami began as a ripple a few years ago with the introduction at Family Tree DNA of the Walk the Y program in 2007. This was an intensively manual process of SNP discovery, but it was effective.

When the Geno 2.0 chip was introduced in 2012, it included 12,000+ SNPs, among them many that had always been presumed to be equivalent and were not regularly tested. However, the Nat Geo chip tested them and indeed, the Y tree became massively shuffled. The resolution to this tree shuffling hasn’t yet come out in the wash. Family Tree DNA can’t really update their Y tree until a publication comes out with the new tree defined. That publication has been discussed and anticipated for some time now, but it has yet to materialize. In the meantime, the volunteers who maintain the ISOGG tree are swamped, to say the least.

Another similar test is the Chromo2 introduced this year by Britain’s DNA which scans 15,000 SNPs, many of them S SNPs not on the tree nor academically published, adding to the difficulty of figuring out where they fit on the Y tree. While there are some very happy campers with their Chromo2 results, there is also a great deal of sloppy science, reporting and interpretation of “facts” through this company. Kind of like Jekyll and Hyde. See the Sloppy Science section.

But Walk the Y, Chromo2 and Geno 2.0 are only the tip of the iceberg. The new “full Y” sequencing tests, brought into the marketspace quietly in early 2013 by Full Genomes and then with a bang by Family Tree DNA with their Big Y in November, promise to revolutionize what we know about the Y chromosome by discovering thousands of previously unknown SNPs. This will in effect swamp the Y tree, whose branches we thought were already pretty robust, with thousands and thousands of leaves.

In essence, the promise of the “fully” sequenced Y is that what we might term personal or family SNPs will make SNP testing as useful as STR testing and give us yet another genealogy tool with which to separate various lines of one genetic family and to ratchet down on the time that the most recent common ancestor lived.

The story of 23andMe began as the consummate American dotcom fairy tale, but sadly, has deteriorated into a saga with all of the components of a soap opera. A wealthy wife starts what could be viewed as an upscale hobby business, followed by a messy divorce and a mystery run-in with the powerful overlording evil-step-mother FDA. One of the founders of 23andMe is/was married to a co-founder of Google, so funding, at least initially, wasn’t an issue, giving 23andMe the opportunity to make an unprecedented contribution in the genetic, health care and genetic genealogy world.

Another way of looking at this is that 23andMe is the epitome of the American Dream business, a startup, with altruism and good health, both thrown in for good measure, well intentioned, but poorly managed. And as customers, be it for health or genealogy or both, we all bought into the altruistic “feel good” culture of helping find cures for dread diseases, like Parkinson’s, Alzheimer’s and cancer by contributing our DNA and responding to surveys.

The genetic genealogy community’s love affair with 23andMe began in 2009 when 23andMe started focusing on genealogy reporting for their tests, meaning cousin matches. We, as a community, suddenly woke up and started ordering these tests in droves. A few months later, Family Tree DNA began offering this type of testing as well. The defining difference is that 23andMe’s primary focus has always been on health and medical information, while Family Tree DNA is focused on genetic genealogy. To 23andMe, the genetic genealogy community was an afterthought and genetic genealogy was just another marketing avenue to obtain more people for their health research database. For us, that wasn’t necessarily a bad thing.

For a while, this love affair went along swimmingly, but then, in 2012, 23andMe obtained a patent related to Parkinson’s Disease. That act caused a lot of people to begin to question the corporate focus of 23andMe in the larger quagmire of the ethics of patenting genes as a whole. Judy Russell, the Legal Genealogist, discussed this here. It’s difficult to defend 23andMe’s Parkinson’s patent while flaying alive Myriad for their BRCA patent. Was 23andMe really as altruistic as they would have us believe?

Personally, this event made me very nervous, but I withheld judgment. But clearly, that was not the purpose for which I thought my DNA, and others’, was being used.

But then came the Designer Baby patent in 2013. This made me decidedly uncomfortable. Yes, I know, some people said this really can’t be done, today, while others said that it’s being done anyway in some aspects…but the fact that this has been the corporate focus of 23andMe with their research, using our data, bothered me a great deal. I have absolutely no issue with using this information to assure or select for healthy offspring – but I have a personal issue with technology to enable parents who would select a “beauty child,” one with blonde hair and blue eyes and who has the correct muscles to be a star athlete, or cheerleader, or whatever their vision of their as-yet-unconceived “perfect” child would be. And clearly, based on 23andMe’s own patent submission, that is the focus of their patent.

Upon the issuance of the patent, 23andMe then said they have no intention of using it. They did not say they won’t sell it. This also makes absolutely no business sense, to focus valuable corporate resources on something you have no intention of using? So either they weren’t being truthful, they lack effective management or they’ve changed their mind, but didn’t state such.

What came next, in late 2013 certainly points towards a lack of responsible management.

23andMe had been working with the FDA for approval of the health and medical aspect of their product (which they were already providing to consumers prior to the November 22nd cease and desist order) for several years. The FDA wants assurances that what 23andMe is telling consumers is accurate. Based on the letter issued to 23andMe on November 22nd, and subsequent commentary, it appears that both entities were jointly working towards that common goal…until earlier this year when 23andMe mysteriously “somehow forgot” about the FDA, the information they owed them, their submissions, etc. Apparently they forgot their phone number and their e-mail addresses as well, because the FDA said they had heard nothing from them in 6 months, which backdates to May of 2013.

It may be relevant that 23andMe added the executive position of President and filled it in June of 2013, and there was a lot of corporate housecleaning that went on at that time. However, regardless of who got housecleaned, the responsibility for working with the FDA falls squarely on the shoulders of the founders, owners and executives of the company. Period. No excuses. Something that critically important should be on the agenda of every executive management meeting. Why? In terms of corporate risk, this was obviously a very high risk item, perhaps the highest risk item, because the FDA can literally shut their doors and destroy them. There is little they can do to control or affect the FDA situation, except to work with the FDA, meet deadlines and engender goodwill and a spirit of cooperation. The risk of not doing that is exactly what happened.

It’s unknown at this time if 23andMe is really that corporately arrogant to think they could simply ignore the FDA, or blatantly corporately negligent or maybe simply corporately stupid, but they surely betrayed the trust and confidence of their customers by failing to meet their commitments with and to the FDA, or even communicate with them. I mean, really, what were they thinking?

There has been an outpouring of sympathy for 23andMe and negative backlash towards the FDA for their letter forcing 23andMe to stop selling their offending medical product, meaning the health portion of their testing. However, in reality, the FDA was only meting out the consequences that 23andMe asked for. My teenage kids knew this would happen. If you do what you’re not supposed to….X, Y and Z will, or won’t, happen. It’s called accountability. Just ask my son about his prom….he remembers vividly. Now why my kids, or 23andMe, would push an authority figure to that point, knowing full well the consequences, utterly mystifies me. It did when my son was a teenager and it does with 23andMe as well.

Some people think that the FDA is trying to stand between consumers and their health information. I don’t think so, at least not in this case. I think that because the FDA left the raw data files alone and they left the genetic genealogy aspect alone. The FDA knows full well you can download your raw data and for $5 process it at a third party site, Promethease, obtaining health related genetic information. The difference is that Promethease is not interpreting any data for you, only providing information.

There is some good news in this and that is that from a genetic genealogy perspective, we seem to be safe, at least for now, from government interference with the testing that has been so productive for genetic genealogy. The FDA had the perfect opportunity to squish us like a bug (thanks to the opening provided by 23andMe,) and they didn’t.

The really frustrating aspect of this is that 23andMe was a company who, with their deep pockets in Silicon Valley and other investors, could actually afford to wage a fight with the FDA, if need be. The other companies who received the original 2010 FDA letter all went elsewhere and focused on something else. But 23andMe didn’t, they decided to fight the fight, and we all supported their decision. But they let us all down. The fight they are fighting now is not the battle we anticipated, but one brought upon themselves by their own negligence. This battle didn’t have to happen, and it may impair them financially to such a degree that if they need to fight the big fight, they won’t be able to.

Right now, 23andMe is selling their kits, but only as an ancestry product as they work through whatever process they are working through with the FDA. Unfortunately, 23andMe is currently having some difficulties where the majority of matches are disappearing from some testers’ records. In other cases, segments that previously matched are disappearing. One would think, with their only revenue stream for now being the genetic genealogy marketspace, that they would be wearing kid gloves and being extremely careful, but apparently not. They might even consider making some of the changes and enhancements we’ve requested for so long that have fallen on deaf ears.

One thing is for sure, it will be extremely interesting to see where 23andMe is this time next year. The soap opera continues.

I hope for the sake of all of the health consumers, both current and (potentially) future, that this dotcom fairy tale has a happy ending.

In a landmark decision, the Supreme Court determined that genes cannot be patented. Myriad Genetics held patents on two BRCA genes that predisposed people to cancer. The cost for the tests through Myriad was about $3000. Six hours after the Supreme Court decision, Gene By Gene announced that same test for $995. Other firms followed suit, and all were subsequently sued by Myriad for patent infringement. I was shocked by this, but as one of my lawyer friends clearly pointed out, you can sue anyone for anything. Making it stick is yet another matter. Many firms settle to avoid long and very expensive legal battles. Clearly, this issue is not yet resolved, although one would think a Supreme Court decision would be pretty definitive. It potentially won’t be settled for a long time.

As 23andMe comes unraveled and Ancestry languishes in its mediocrity, Gene By Gene, the parent company of Family Tree DNA, has stepped up to the plate, committed to do “whatever it takes,” ramped up the staff both through hiring and acquisitions, and is producing results. This is, indeed, a breath of fresh air for genetic genealogists, as well as a welcome relief.

Autosomal DNA testing and analysis has simply exploded this past year. More and more people are testing, in part, because Ancestry.com has a captive audience in their subscription data base and more than a quarter million of those subscribers have purchased autosomal DNA tests. That’s a good thing, in general, but there are some negative aspects relative to Ancestry, which are in the Ancestry section.

Another boon to autosomal testing was the 23andMe push to obtain a million records. Of course, the operative word here is “was” but that may revive when the FDA issue is resolved. One of the down sides to the 23andMe data base, aside from the fact that it’s not genealogist friendly, is that so many people, about 90%, don’t communicate. They aren’t interested in genealogy.

A third factor is that Family Tree DNA has provided transfer ability for files from both 23andMe and Ancestry into their data base.

Fourth is the site, GedMatch, at www.gedmatch.com which provides additional matching and admixture tools and the ability to match below thresholds set by the testing companies. This is sometimes critically important, especially when comparing to known cousins who just don’t happen to match at the higher thresholds, for example. Unfortunately, not enough people know about GedMatch, or are willing to download their files. Also unfortunate is that GedMatch has struggled for the past few months to keep up with the demand placed on their site and resources.

A great deal of time this year has been spent by those of us in the education aspect of genetic genealogy, in whatever our capacity, teaching about how to utilize autosomal results. It’s not necessarily straightforward. For example, I wrote a 9 part series titled “The Autosomal Me” which detailed how to utilize chromosome mapping for finding minority ethnic admixture, which was, in my case, both Native and African American.

As the year ends, we have Family Tree DNA, 23andMe and Ancestry who offer the autosomal test which includes the relative-matching aspect. Fortunately, we also have third party tools like www.GedMatch.com and www.DNAGedcom.com, without which we would be significantly hamstrung. In the case of DNAGedcom, we would be unable to perform chromosome segment matching and triangulation with 23andMe data without Rob Warthen’s invaluable tool.

While this tool, www.dnagedcom.com, falls into the Autosomal grouping, I have separated it out for individual mention because without this tool, the progress made this year in autosomal DNA ancestor and chromosomal mapping would have been impossible. Family Tree DNA has always provided segment matching boundaries through their chromosome browser tool, but until recently, you could only download 5 matches at a time. This is no longer the case, but for most of the year, Rob’s tool saved us massive amounts of time.

23andMe does not provide those chromosome boundaries, but utilizing Rob’s tool, you can obtain each of your matches in one download, and then you can obtain the list of who your matches match that is also on your match list by requesting each of those files separately. Multiple steps? Yes, but it’s the only way to obtain this information, and chromosome mapping without the segment data is impossible.
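For readers who like to see the logic spelled out, here is a minimal Python sketch of what is done with those two downloads: find where two matches overlap on your chromosome, then check whether they are also on each other's match lists. Everything in it is an assumption for illustration (the `Segment` fields, the cousin names, the positions); it does not reflect the actual file layout from DNAGedcom or 23andMe.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Segment:
    match: str        # hypothetical name of the person you match
    chromosome: int
    start: int        # start position in base pairs
    end: int          # end position in base pairs


def overlap(a, b):
    """Return the (start, end) region shared by two segments, or None."""
    if a.chromosome != b.chromosome:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return (start, end) if start < end else None


def candidate_groups(my_segments, matches_of_matches):
    """Pairs of matches who overlap on my chromosome AND match each other.

    matches_of_matches maps a match's name to the set of names found on
    that person's own match list (the second, per-match download).
    """
    found = []
    for a, b in combinations(my_segments, 2):
        if a.match == b.match:
            continue
        shared = overlap(a, b)
        if shared and b.match in matches_of_matches.get(a.match, set()):
            found.append((a.match, b.match, a.chromosome, shared))
    return found


# Hypothetical example data.
segments = [
    Segment("Cousin A", 2, 10_000_000, 25_000_000),
    Segment("Cousin B", 2, 18_000_000, 30_000_000),
    Segment("Cousin C", 2, 19_000_000, 24_000_000),
]
in_common = {"Cousin A": {"Cousin B"}, "Cousin B": {"Cousin A"}}

groups = candidate_groups(segments, in_common)
print(groups)
```

In this made-up data, Cousin C overlaps the same region but is not on either cousin's match list, so Cousin C may well match on the other parent's copy of chromosome 2. Being on each other's lists is only a first filter; full triangulation still means confirming that the two matches share that same segment with each other.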

A special hats off to Rob. Please remember that Rob’s site is free, meaning it’s donation based. So, please donate if you use the tool.

I covered www.Gedmatch.com in the “Best of 2012” list, but they have struggled this year, beginning when Ancestry announced that raw data file downloads were available. GedMatch consists of two individuals, volunteers, who are still struggling to keep up with the required processing and the tools. They too are donation based, so don’t forget about them if you utilize their tools.

Ancestry – How Great Thou Aren’t

Ancestry is only on this list because of what they haven’t done. When they initially introduced their autosomal product, they didn’t have any search capability, they didn’t have a chromosome browser and they didn’t have raw data file download capability, all of which their competitors had upon first release. All they did have was a list of your matches, with their trees listed, with shakey leaves if you shared a common ancestor on your tree. The implication was, and is, of course, that if you have a DNA match and a shakey leaf, that IS your link, your genetic link, to each other. Unfortunately, that is NOT the case, as CeCe Moore documented in her blog from Rootstech (starting just below the pictures) as an illustration of WHY we so desperately need a chromosome browser tool.

In a nutshell, Ancestry showed the wrong shakey leaf as the DNA connection – as proven by the fact that both of CeCe’s parents have tested at Ancestry and the shakey leaf person doesn’t match the requisite parent. And there wasn’t just one, not two, but three instances of this. What this means is, of course, that the DNA match and the shakey leaf match are entirely independent of each other. In fact, you could have several common ancestors, but the DNA at any particular location comes only from one on either Mom or Dad’s side – and maybe not even the shakey leaf person.
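The parent check described above is simple enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the match names, the parents' match lists and the "side" of each tree hint are invented, and Ancestry provides no such function.

```python
def side_of_match(match_name, mother_matches, father_matches):
    """Report which tested parent(s) also have this person as a DNA match."""
    in_mom = match_name in mother_matches
    in_dad = match_name in father_matches
    if in_mom and in_dad:
        return "both"
    if in_mom:
        return "maternal"
    if in_dad:
        return "paternal"
    return "neither"


def leaf_is_consistent(match_name, tree_hint_side, mother_matches, father_matches):
    """True when the tree hint's side agrees with the parent who shares the match."""
    side = side_of_match(match_name, mother_matches, father_matches)
    return side in (tree_hint_side, "both")


# Hypothetical data: who each parent matches, and which side of the
# tree each leaf hint points to.
mother_matches = {"Match 1", "Match 2"}
father_matches = {"Match 3"}
tree_hints = {"Match 1": "maternal", "Match 2": "paternal", "Match 3": "paternal"}

report = {
    name: leaf_is_consistent(name, side, mother_matches, father_matches)
    for name, side in tree_hints.items()
}
print(report)
```

Here "Match 2" has a leaf pointing to Dad's side of the tree but only matches Mom, so that leaf cannot be the genetic connection. Note that a "consistent" result still doesn't prove the leaf ancestor is the source of the DNA; it only fails to rule it out.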

So what Ancestry customers are receiving is a list of people they match and possible links, but most of them have no idea that this is the case, and blissfully believe they have found their genetic connection. They have found a genealogical cousin, and it MIGHT be the genetic connection. But then again, they could have found that cousin simply by searching for the same ancestor in Ancestry’s data base. No DNA needed.

Ancestry has added a search feature, allowed raw data file downloads (thank you) and they have updated their ethnicity predictions. The ethnicity predictions are certainly different, dramatically different, but equally as unrealistic. See the Ethnicity Makeovers section for more on this. The search function helps, but what we really need is the chromosome browser, which they have steadfastly avoided promising. Instead, they have said that they will give us “something better,” but nothing has materialized.

I want to take this opportunity to say, as loudly as possible, that TRUST ME IS NOT ACCEPTABLE in any way, shape or form when it comes to genetic matching. I’m not sure what Ancestry has in mind by the way of “better,” but if it’s anything like the mediocrity with which their existing DNA products have been rolled out, neither I nor any other serious genetic genealogist will be interested, satisfied or placated.

Regardless, it’s been nearly 2 years now. Ancestry has the funds to do development. They are not a small company. This is obviously not a priority because they don’t need to develop this feature. Why is this? Because they can continue to sell tests and to give shakey leaves to customers, most of whom don’t understand the subtle “untruth” inherent in that leaf match – so are quite blissfully happy.

In years past, I worked in the computer industry when IBM was the Big Dog against whom everyone else competed. I’m reminded of an old joke. The IBM sales rep got married, and on his wedding night, he sat on the edge of the bed all night long regaling his bride in glorious detail with stories about just how good it was going to be….

You can sign a petition asking Ancestry to provide a chromosome browser here, and you can submit your request directly to Ancestry as well, although to date, this has not been effective.

The most frustrating aspect of this situation is that Ancestry, with their plethora of trees, savvy marketing and captive audience testers really was positioned to “do it right,” and hasn’t, at least not yet. They seem to be more interested in selling kits and providing shakey leaves that are misleading in terms of what they mean than providing true tools. One wonders if they are afraid that their customers will be “less happy” when they discover the truth and not developing a chromosome browser is a way to keep their customers blissfully in the dark.

This has been a huge year for advances in sequencing ancient DNA, something once thought unachievable. We have learned a great deal, and there are many more skeletal remains just begging to be sequenced. One absolutely fascinating find is that all people not African (and some who are African through backmigration) carry Neanderthal and Denisovan DNA. Just this week, evidence of yet another archaic hominid line has been found in Neanderthal DNA and on Christmas Day, yet another article was published stating that type 2 Diabetes found in Native Americans has roots in their Neanderthal ancestors. Wow!

Closer to home, by several thousand years, is the suggestion that haplogroup R did not exist in Europe after the ice age, and only later replaced most of the population which, for males, appears to have been primarily haplogroup G. It will be very interesting as the data bases of fully sequenced skeletons are built and compared. The history of our ancestors is held in those precious bones.

Unfortunately, as DNA becomes more mainstream, it becomes a target for either sloppy science or intentional misinterpretation, and possibly both. Without academic publication, we can’t see results or have the sense of security that comes from the peer review process, so we don’t know if the science and conclusions pass muster.

The race to the buck in some instances is the catalyst for this. In other cases, and not in the links below, some people intentionally skew interpretations and results in order to either fulfill their own belief agenda or to sell “products and services” that invariably report specific findings.

It’s equally unfortunate that many of these misconstrued and sensationalized results are coming from a testing company that goes by the names of BritainsDNA, ScotlandsDNA, IrelandsDNA and YorkshiresDNA. It certainly does nothing for their credibility in the eyes of people who are familiar with the topics at hand, but it does garner a lot of press and probably sells a lot of kits to the unwary.

I hope they publish their findings so we can remove the “sloppy science” aspect of this. Sensationalist reporting, while irritating, can be dealt with if the science is sound. However, until the results are published in a peer-reviewed academic journal, we have no way of knowing.

Thankfully, Debbie Kennett has been keeping her thumb on this situation, occurring primarily in the British Isles.

Citizen science has been slowly coming of age over the past few years. By this, I mean when citizen scientists work as part of a team on a significant discovery or paper. Bill Hurst comes to mind with his work with Dr. Doron Behar on his paper, A “Copernican” Reassessment of the Human Mitochondrial DNA Tree from its Root, or what we know as the RSRS model. As the years have progressed, more and more discoveries have been made or assisted by citizen scientists, sometimes through our projects and other times through individual research. JOGG, the Journal of Genetic Genealogy, which is currently on hiatus waiting for Dr. Turi King, the new editor, to become available, was a great avenue for peer reviewed publication. Recently, research projects have been set up by citizen scientists, sometimes crowd-funded, for specific areas of research. This is a very new aspect to scientific research, and one not before utilized.

The first paper below includes the Family Tree DNA Lab, Thomas and Astrid Krahn, then with Family Tree DNA and Bonnie Schrack, genetic genealogist and citizen scientist, along with Dr. Michael Hammer from the University of Arizona and others.

Unfortunately, ethnicity percentages, as provided by the major testing companies still disappoint more than thrill, at least for those who have either tested at more than one lab or who pretty well know their ethnicity via an extensive pedigree chart.

Ancestry.com is by far the worst example, swinging like a pendulum from one extreme to the other. But I have to hand it to them, their marketing is amazing. When I signed in, about to discover that my results had literally almost reversed, I was greeted with the banner “a new you.” Yea, a new me, based on Ancestry’s erroneous interpretation. And by reversed, I’m serious. I went from 80% British Isles to 6% and then from 0% Western Europe to 79%. So now, I have an old wrong one and a new wrong one – and indeed they are very different. Of course, neither one is correct…..but those are just pesky details…

23andMe updated their ethnicity product this year as well, and fine tuned it yet another time. My results at 23andMe are relatively accurate. I saw very little change, but others saw more. Some were pleased, some not.

The bottom line is that ethnicity tools are not well understood by consumers in terms of the timeframe that is being revealed, and it’s not consistent between vendors, nor are the results. In some cases, they are flat out wrong, as with Ancestry, and can be proven. This does not engender a great deal of confidence. I only view these results as “interesting” or utilize them in very specific situations and then only using the individual admixture tools at www.Gedmatch.com on individual chromosome segments.

As Judy Russell says, “it’s not soup yet.” That doesn’t mean it’s not interesting though, so long as you understand the difference between interesting and gospel.

With the explosion of genetic genealogy testing, as one might expect, the demand for education, and in particular, basic education has exploded as well.

I’ve written a 101 series, Kelly Wheaton wrote a series of lessons and CeCe Moore did as well. Recently Family Tree DNA has also sponsored a series of free Webinars. I know that at least one book is in process and very near publication, hopefully right after the first of the year. We saw several conferences this year that provided a focus on Genetic Genealogy and I know several are planned for 2014. Genetic genealogy is going mainstream!!! Let’s hope that 2014 is equally as successful and that all these folks asking for training and education become avid genetic genealogists.

I want to close by taking a minute to thank the thousands of volunteers who make such a difference. All of the project administrators at Family Tree DNA are volunteers, and according to their website, there are 7829 projects, all of which have at least one administrator, and many have multiple administrators. In addition, everyone who answers questions on a list or board or on Facebook is a volunteer. Many donate their time to coordinate events, groups, or moderate online facilities. Many speak at events or for groups. Many more write articles for publications from blogs to family newsletters. Additionally, there are countless websites today that include DNA results…all created and run by volunteers, not the least of which is the ISOGG site with the invaluable ISOGG wiki. Without our volunteer army, there would be no genetic genealogy community. Thank you, one and all.

2013 has been a banner year, and 2014 holds a great deal of promise, even without any surprises. And if there is one thing this industry is well known for….it’s surprises. I can’t wait to see what 2014 has in store for us!!! All I can say is hold on tight….

There is nothing I love more than a happy ending. Second to that perhaps is to know that my blog or work helped someone, and in particular, helped someone document their Native heritage. In doing so, this confirms and unveils one more of our elusive Native people in early records.

I recently received a lovely thank you note from Shawn Potter. We had exchanged notes earlier, after I wrote “The Autosomal Me” series, about how to utilize small segments of Native American (and Asian) DNA to identify Native American lines and/or ancestors. This technique is called Minority Admixture Mapping (MAP) and was set forth in detail in various articles in the series.

Shawn’s note said: “I’ve been doing more work on this segment and others following your method since we exchanged notes. I’m pretty sure I’ve found the source of this Native American DNA — an ancestor named John Red Bank Payne who lived in North Georgia in the late 18th and 19th centuries. Many of his descendants believe on the basis of circumstantial evidence that his mother was Cherokee. I’ve found 10 descendants from four separate lines that inherited matching Native American DNA, pointing to one of his parents as the source.”

Along with this note, Shawn attached a beautiful 65 page book he had written for his family members which did document the Native DNA, but in the context of his family history. He included their family story, the tales, the genealogical research, the DNA evidence and finally, a chapter of relevant Cherokee history complete with maps of the area where his ancestors lived. It’s a beautiful example of how to present something like this for non-DNA people to understand. In addition, it’s also a wonderful roadmap, a “how to” book for how to approach this subject from a DNA/historical/genealogical perspective. As hard as it is for me to sometimes remember, DNA is just a tool to utilize in the bigger genealogy picture.

Shawn has been gracious enough to allow me to reprint some of his work here, so from this point on, I’ll be extracting from his document. Furthermore, Elizabeth Shown Mills would be ecstatic, because Shawn has fully documented and sourced his document. I am not including that information here, but I’m sure he would gladly share the document itself with any interested parties. You can contact Shawn at shpxlcp@comcast.net.

From the book, “Cherokee Mother of John Red Bank Payne” by Shawn Potter and Lois Carol Potter:

Descendants of John Red Bank Payne describe his mother as Cherokee. Yet, until now, some have questioned the truth of this claim because genealogists have been unable to identify John’s mother in contemporary records. A recent discovery, however, reveals both John Red Bank Payne and his sister Nancy Payne inherited Native American DNA.

Considering information from contemporary records, clues from local tradition, John’s name itself, and now the revelation that John and his sister inherited Native American DNA, there seems to be sufficient evidence to say John Red Bank Payne’s mother truly was Cherokee. The following summary describes what we know about John, his family, and his Native American DNA.

John Red Bank Payne was born perhaps near present-day Canton, Cherokee County, Georgia, on January 24, 1754, married Ann Henslee in Caswell County, North Carolina, on March 5, 1779, and died in Carnesville, Franklin County, Georgia, on December 14, 1831.

John’s father, Thomas Payne, was born in Westmoreland County, Virginia, about 1725, and owned property in Halifax and Pittsylvania counties, Virginia, as well as Wilkes County, North Carolina, and Franklin County, Georgia. Several factors suggest Thomas travelled with his older brother, William, to North Georgia and beyond, engaging in the deerskin trade with the Cherokee Nation during the mid 1700s. Thomas Payne died probably in Franklin County, Georgia, after February 23, 1811.

Contemporary records reveal Thomas had four children (William, John, Nancy, and Abigail) by his first wife, and nine children (Thomas, Nathaniel, Moses, Champness, Shrewsbury, Zebediah, Poindexter, Ruth, and Cleveland) by his second wife Yanaka Ayers. Thomas married Yanaka probably in Halifax County, Virginia, before September 20, 1760.

Local North Georgia tradition identifies the first wife of Thomas Payne as a Cherokee woman. Anna Belle Little Tabor, in History of Franklin County, Georgia, wrote that “Trader Payne” managed a trading post on Payne’s Creek, and “one of his descendants, an offspring of his Cherokee marriage, later married Moses Ayers whose descendants still live in the county.”

Descendants of John Red Bank Payne also cite his name Red Bank, recorded in his son’s family Bible, as evidence of his Cherokee heritage. Before the American Revolution, British Americans rarely defied English legal prohibitions against giving a child more than one Christian name. So, the very existence of John’s name Red Bank suggests non-English ethnicity. On the other hand, many people of mixed English-Cherokee heritage were known by their Cherokee name as well as their English first and last names during this period.

Furthermore, while the form of John’s middle name is unlike normal English names, Red Bank conforms perfectly to standard Cherokee names. It also is interesting to note, Red Bank was the name of a Cherokee village located on the south side of Etowah River to the southwest of present-day Canton, Cherokee County, Georgia.

While some believe the above information from contemporary records and clues from local tradition, as well as John’s name Red Bank, constitute sufficient proof of John’s Cherokee heritage, recently discovered DNA evidence confirms at least one of John’s parents had Native American ancestry. Ten descendants of John Red Bank Payne and his sister Nancy Payne, representing four separate lineages, inherited six segments of Native American DNA on chromosomes 2, 3, 5, 8, 13, and 18 (see Figure 1 for the relationship between these descendants; Figures 2-7 for images of their shared Native American DNA; and http://dna-explained.com/2013/06/02/the-autosomal-me-summary-and-pdf-file/ for an explanation of this method of identifying Native American chromosomal segments).

In this segment, Bert P, Rosa P, Nataan S, Cynthia S, and Kendall S inherited matching Native American DNA described as Amerindian, Siberian, Southeast Asian, and Oceanian by the Eurogenes V2 K15 admixture tool, and as North Amerind, Mesoamerican, South America Amerind, Arctic Amerind, East Siberian, Paleo Siberian, Samoedic, and East South Asian by the Magnus Ducatus Lituaniae Project World22 admixture tool. Since their common ancestors were Thomas Payne and his wife, the source of this Native American DNA must be either Thomas Payne or his wife. See Figures 2a-2g.

Note: Since Native Americans and East Asians share common ancestors in the pre-historic past, their DNA is similar to each other in many respects. This similarity often causes admixture tools to interpret Native American DNA as various types of East Asian DNA. Therefore, the presence of multiple types of East Asian DNA together with Native American DNA tends to validate the presence of Native American DNA.
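Roberta’s aside: the reasoning in that note can be sketched in a few lines of Python. The idea is to add up the Native American and related East Asian type components that an admixture tool reports for a stretch of a chromosome, and to flag the stretches where the combined total stands out. The four component names are the ones listed above for the Eurogenes V2 K15 tool; the percentages, the "Other" bucket and the 5% cutoff are made-up assumptions for illustration only, not values from Shawn’s report or from GedMatch.

```python
# Components that, taken together, suggest Native American DNA
# (names as listed above for the Eurogenes V2 K15 tool).
NATIVE_LIKE = {"Amerindian", "Siberian", "Southeast Asian", "Oceanian"}


def native_like_total(window):
    """Sum the percentages of the Native-like components in one window."""
    return sum(pct for name, pct in window.items() if name in NATIVE_LIKE)


def flag_windows(windows, threshold=5.0):
    """Return the indexes of windows whose combined total meets the cutoff.

    The threshold is a hypothetical value chosen for this example.
    """
    return [i for i, w in enumerate(windows) if native_like_total(w) >= threshold]


# Hypothetical admixture percentages for three consecutive windows
# along one chromosome.
windows = [
    {"Other": 100.0},
    {"Other": 90.0, "Amerindian": 6.0, "Siberian": 4.0},
    {"Other": 97.0, "Oceanian": 3.0},
]

flagged = flag_windows(windows)
print(flagged)
```

Only the middle window is flagged, because its Amerindian and Siberian components combine to 10%. A single small component on its own, like the 3% Oceanian in the last window, stays below the cutoff.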

Roberta’s Summary: Shawn continues to document the other chromosome matches in the same manner. In total, he has 10 descendants of Thomas Payne and his wife, who it turns out, indeed was Cherokee, as proven by this exercise in combination with historical records. These people descend through 2 different children. Cynthia and Kendall descend through daughter Nancy Payne, and the rest of the descendants descend through different children of John Red Bank Payne. All of the DNA segments that Shawn utilized in his report share Native/Asian segments in both of these family groups, the descendants of both Nancy and John Red Bank Payne.

Shawn’s success in this project hinged on two things. First, being able to test descendants of multiple (in this case, two) children of the original couple. Second, he tested several people and had the tenacity to pursue the existence of Native DNA segments utilizing the Minority Admixture Mapping (MAP) technique set forth in “The Autosomal Me” series. It certainly paid off. Shawn confirmed that the wife of Thomas Payne was, indeed, Native, most likely Cherokee since he was a Cherokee trader, and that today’s descendants do indeed carry her heritage in their DNA.
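For those who want to try this with their own family, here is a minimal Python sketch of the "two children" test: given the Native-flagged segments for each descendant, and which child of the original couple each person descends from, list the regions where descendants of different children overlap. The labels (J1, J2, N1), the positions and the data layout are all hypothetical; this is not Shawn’s actual data or workflow.

```python
def shared_across_lines(flagged, line_of):
    """Find flagged regions shared by descendants of different children.

    flagged: {person: [(chromosome, start, end), ...]} of Native-flagged segments
    line_of: {person: name of the child of the couple they descend from}
    """
    results = []
    people = sorted(flagged)
    for i, p in enumerate(people):
        for q in people[i + 1:]:
            if line_of[p] == line_of[q]:
                continue  # same child's line: can't point back to the couple
            for chrom_p, start_p, end_p in flagged[p]:
                for chrom_q, start_q, end_q in flagged[q]:
                    if chrom_p != chrom_q:
                        continue
                    start, end = max(start_p, start_q), min(end_p, end_q)
                    if start < end:
                        results.append((p, q, chrom_p, start, end))
    return results


# Hypothetical descendants and positions (arbitrary units).
flagged = {
    "J1": [(2, 100, 200)],
    "J2": [(2, 150, 260)],
    "N1": [(2, 180, 300), (8, 10, 20)],
}
line_of = {"J1": "John", "J2": "John", "N1": "Nancy"}

shared = shared_across_lines(flagged, line_of)
print(shared)
```

Segments shared only by descendants of one child could have come from that child’s spouse. A flagged region found in both children’s lines is what points back to the couple themselves.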

Great job Shawn!! Wouldn’t you love to be his family member and one of the recipients of these lovely books about your ancestor! Someone’s going to have a wonderful Christmas!

This article is probably less polished than my normal articles. I’d like to get this information out and to you sooner rather than later, and I’m still on the road the rest of this week with little time to write. So you’re getting a spruced up version of my notes. There are some articles here I’d like to write about more in depth later, after I’m back at home and have recovered a bit.

Max Blankfield and Bennett Greenspan, founders, opened the conference on the first day as they always do. Max began with a bit of a story.

13 years ago Bennett started on a quest….

Indeed he did, and later, Bennett will be relating his own story of that journey.

Someone mentioned to Max that this must be a tough time in this industry. Max thought about this and said, really, not. Competition validates what you are doing.

For competition it’s just a business opportunity – it was not and is not approached with the passion and commitment that Family Tree DNA has and has always had.

He said this has been their best year ever and that great things are in the pipeline.

One of the big moves is that Arpeggi merged into Family Tree DNA.

10th Anniversary Pioneer Awards

Quite unexpectedly, Max noted and thanked the early adopters and pioneers, some of whom are gone now but remain with us in spirit.

Max and Bennett recognized the administrators who have been with Family Tree DNA for more than 10 years. The list included about 20 or so early adopters. They provided plaques for us and many of us took a photo with Max as the plaques were handed out.

I am always impressed by the personal humility and gratitude of Max and Bennett, both, to their administrators. A good part of their success is attributed, I’m sure, to their personal commitment not only to this industry, but to the individual people involved. When Max noted the admins who were leaders and are no longer with us, he could barely speak. There were a lot of teary eyes in the room, because they were friends to all of us and we all have good memories.

Thank you, Max and Bennett.

The second day, we took a group photo of all of the recipients along with Max and Bennett.

With that, it was Bennett’s turn for a few remarks.

Bennett says that having their own lab provides a wonderful environment and allows them to benchmark and respond to an ever changing business environment.

Today, they are a College of American Pathologists certified lab and tomorrow, we will find out more about what is coming. Tomorrow, David Mittleman will speak about next generation sequencing.

The handout booklet includes the information that Family Tree DNA now includes 656,898 records in more than 8,700 group projects. These projects are all managed by volunteer administrators, which, in and of itself, is a rather daunting number and amount of volunteer crowd-sourcing.

Session 1 – Amy McGuire, PhD, JD – Am I My Brother’s Keeper?

Dr. McGuire went to college for a very long time. Her list of degrees would take a page or so. She is the Director of the Center for Medical Ethics and Health Policy at Baylor College of Medicine.

Thirteen years ago, Amy’s husband was sitting next to Bennett’s wife on an airplane and she gave him a business card. Then two months ago, Amy wound up sitting next to Max on another airplane. It’s a very small world.

I will tell you that Amy said that her job is asking the difficult questions, not providing the answers. You’ll see from what follows that she is quite good at that.

How is genetic genealogy different from clinical genetics in terms of ethics and privacy? How responsible are we to other family members who share our DNA?

What obligations do we have to relatives in all areas of genetics – clinical testing, direct to consumer testing that relates to medical information, and genetic genealogy?

She referenced the article below, which I blogged about here. There was unfortunately, a lot of fallout in the media.

In 2004, a paper was published that stated that it took only 30 to 80 specifically selected SNPs to identify a person.

2008 – Can you identify an individual from pooled or aggregated DNA? This is relevant to situations like 9/11 where the DNA of multiple individuals has been mixed together. Can you identify individuals from that brew?

2005 – 15 year old boy identifies his biological father who was a sperm donor. Is this a good thing or a bad thing? Some feel that it’s unethical and an invasion of the privacy of the father. But others feel that if the donor is concerned about that, they shouldn’t be selling their sperm.

Today, for children conceived from sperm donors, there are now websites available to identify half-siblings.

The movement today is towards making sure that people are informed that their anonymity may not be able to be preserved. DNA is the ultimate identifier.

Genetic Privacy – individual perspectives vary widely. Some individuals are quite concerned and some are not the least bit concerned.

Some of the concern is based in the eugenics movement stemming from the forced sterilization (against their will) of more than 60,000 Americans beginning in 1907. These people were considered to be of no value or injurious to the general population – meaning those institutionalized for mental illness or in prison.

1927 – Buck vs Bell – The Supreme Court upheld forced sterilization of a woman who was the third generation institutionalized female for retardation. “Three generations of imbeciles are enough.” I must say, the question this leaves me with is how institutionalized retarded women got pregnant in what was supposed to be a “protected” environment.

I will also note here that in my experience, concern is not rooted in Eugenics, but she deals more with medical testing and I deal with genetic genealogy.

The issues of privacy and informed consent have become more important because the technology has improved dramatically and the prices have fallen exponentially.

In 2012, the Nanopore USB Sequencer was introduced that can sequence an entire genome for about $1000.

Originally, DNA data was provided in open access data bases and was anonymized by removing names. The data base from which individuals were identified in 2013 removed names, but included other identifying information including ages and where the individuals lived. Therefore, using Y-STRs, you could identify these families just like an adoptee utilizes data bases like Y-Search to find their biological father.

Today, research data bases have moved to controlled access, meaning other researchers must apply to have access so that their motivations and purposes can be evaluated.

In a recent medical study, a group of people in a research study were informed and educated about the utility of public data bases and why they are needed versus the tradeoffs, and then they were given a release form providing various options. 53% wanted their info in the public domain, 33% in restricted access data bases and 13% wanted no data release. She notes that these were highly motivated people enrolled in a clinical study. Other groups such as Native Americans are much more skeptical.

People who did not release their data were concerned with the uncertainty of what might occur in the future.

People want to be respected as a research participant. Most people said they would participate if they were simply asked. So often it’s less about the data and more about how they are treated.

I would concur with Dr. McGuire on this. I know several people who refused to participate in a research study because their results would not be returned to them personally. All they wanted was information and to be treated respectfully.

What the new genetic privacy issues are really all about is whether or not you are releasing data not just about yourself, but about your family as well. What rights or issues do the other family members have relative to your DNA?

Jim Watson, one of the discoverers of the structure of DNA, wanted to release his data publicly…except for his inherited Alzheimer’s status. It was redacted, but you can infer the “answer” from the surrounding DNA (flanking regions). He has two children. How does this affect his children? Should his children sign a consent and release before their father’s genome is published, since part of it is their sequence as well? The academic community was concerned and did not publish this information. Jim Watson published his own.

There is no concrete policy about this within the academic community.

Dr. McGuire then referenced the book, “The Immortal Life of Henrietta Lacks”. Henrietta Lacks was a poor African-American woman with cervical cancer. At that time, in the 1950s, her cancer tissue was considered “waste” and no release was needed as waste could be utilized for research. She was never informed and never signed a release, but then they were following the protocols of the time. From her cells, the first immortal cell line, the HeLa cell line, was created, which ultimately generated a great deal of revenue for research institutes. The family, however, remained impoverished. The genome was eventually fully sequenced and published. Henrietta Lacks’ granddaughter said that this was private family information and should never have been published without permission, even though all of the institutions followed all of the protocols in place.

So, aside from the original ethics issues stemming from the 1950s – who is relevant family? And how does or should this affect policy?

How does this affect genetic genealogy? Should the rules be different for genetic genealogy, assuming there are (will be) standard policies in place for medical genetics? Should you have to talk to family members before anyone DNA tests? Is genetic information different than other types of information?

Should biological relatives be consulted before someone participates in a medical research study as opposed to genetic genealogy? How about when the original tester dies? Who has what rights and interests? What about the unborn? What about when people need DNA sequencing due to cancer or another immediate and severe health condition that has hereditary components? Whose rights trump whose?

Dr. McGuire feels the way to protect people is through laws like GINA (Genetic Information Nondiscrimination Act) which protects people from discrimination, but does not reach to all industries like life insurance.

Is this different than people posting photos of family members or other private information without permission on public sites?

While much of Dr. McGuire’s focus is on medical testing and ethics, the topic surely is applicable to genetic genealogy as well and will eventually spill over. However, I shudder to think that someone would have to get permission from their relatives before they can have a Y-line DNA test. Yes, there is information that becomes available from these tests, including haplogroup information which has the potential to make people uncomfortable if they expected a different ethnicity than what they receive or an undocumented adoption is involved. However, doesn’t the DNA carrier have the right to know, and does their right to know what is in their body override the concerns about relatives who should (but might not) share the same haplogroup and paternal line information?

And as one person submitted as a question at the end of the session, isn’t that cat already out of the bag?

Session 2 – Dr. Miguel Vilar – Geno 2.0 Update and 2014 Tree

Dr. Vilar is the Science manager for the National Geographic’s Genographic Project.

“The greatest book written is inside of us.”

Miguel is a molecular anthropologist and science writer at the University of Pennsylvania. He has a special interest in Puerto Rico which has 60% Native mitochondrial DNA – the highest percentage of Native American DNA of any Caribbean Island.

The Genographic project has 3 parts, the indigenous population testing, the Legacy project which provides grants back to the indigenous community and the public participation portion which is the part where we purchase kits and test.

Below, Dr. Vilar discussed the Legacy portion of the project.

The indigenous population aspect focuses both on modern indigenous and ancient DNA as well. This information, cumulatively, is used to reconstruct human population migratory routes.

These include 72,000 samples collected 2005-2012 in 12 research centers on 6 continents. Many of these are working with indigenous samples, including Africa and Australia.

42 academic manuscripts and >80 conference presentations have come forth from the project. More are in the pipeline.

Most recently, a Science paper was published about the spread of mtDNA throughout Europe across the past 5000 years. More than 360 ancient samples were collected across several different time periods. There seems to be a divide in the record about 7000 years ago when several disappear and some of the more well known haplogroups today appear on the scene.

Nat Geo has funded 7 new scientific grants since the Geno 2.0 portion began for autosomal including locations in Australia, Puerto Rico and others.

Public participants – Geno 1.0 went over 500,000 participants, Geno 2.0 has over 80,000 participants to date.

Dr. Vilar mentioned that between 2008 and today, the Y tree has grown exponentially. That’s for sure. “We are reshaping the tree in an enormous way.” What was once believed to be very homogeneous is in reality, as it drills down to the tips, very heterogeneous – a great deal of diversity.

As anyone who works with this information on a daily basis knows, that is probably the understatement of the year. The Geno 2.0 project, the Walk the Y along with various other private labs are discovering new SNPs more rapidly than they can be placed on the Y tree. Unfortunately, this has led to multiple trees, none of which are either “official” or “up to date.” This isn’t meant as a criticism, but more a testimony of just how fast this part of the field is emerging. I’m hopeful that we will see a tree in 2014, even if it is an interim tree. In fact, Dr. Vilar referred to the 2014 tree.

Next week, the Nat Geo team goes to Ireland and will be looking for the first migrants and settlers in Ireland – both for Y DNA and mitochondrial DNA. Dr. Vilar says “something happened” about 4000 years ago that changed the frequency of the various haplogroups found in the population. This “something” is not well understood today but he feels it may be a cultural movement of some sort and is still being studied.

Nat Geo is also focused on haplogroup Q in regions from the Arctic to South America. Q-M3 has also been found in the Caribbean for the first time, marking a migration up the chain of islands from Mexico and South America within the past 5,000 years. Papers are coming within the next year about this.

They anticipate that interest will double within the next year. They expect that based on recent discoveries, the 2015 Y tree will be much larger yet. Dr. Michael Hammer will speak tomorrow on the Y tree.

Nat Geo will introduce a “new chip by next year.” The new Ireland data should be available on the National Geographic website within a couple of weeks.

They are also in the process of updating the website with new heat maps and stories.

Session 3 – Matt Dexter – Autosomal Analyses

Matt is a surname administrator, an adoptee and has a BS in Computer Science. Matt is a relatively new admin, as these things go, beginning his adoptive search in 2008.

Matt found out as a child that he was adopted through a family arrangement. He contacted his birth mother as an adult. She told him who his father was, and that man subsequently took a paternity test, which disclosed that he was not Matt’s biological father after all. Unfortunately, his ‘father’ had been very excited to be contacted by Matt, and then, of course, was very disappointed to discover that Matt was not his biological child.

Matt asked his mother about this, and she indicated that yes, “there was another guy, but I told him that the other guy was your father.” With that, Matt began the search for his biological father.

In order to narrow the candidates, his mother agreed to test, so by process of elimination, Matt now knows which side of his family his autosomal results are from.

Matt covers how autosomal DNA works.

This search has led Matt to an interest in how DNA is passed in general, and specifically from grandparents to grandchildren.

One advantage he has is that he has five children whose DNA he can then compare to his wife and three of their grandparents, inferring of course, the 4th grandparent by process of elimination. While his children’s DNA doesn’t help him identify his father, it did give him a lot of data to work with to learn about how to use and interpret autosomal DNA. Here, Matt is discussing his children’s inheritance.

Dr. Jeffrey Paul, who has a doctorate in Public Health from Johns Hopkins, noticed that his and his wife’s Family Finder results were quite different, and he wanted to know why. Why did he, Jewish, have so many more matches?

There are 84 participants in the Jewish project that he used for the autosomal comparison.

Arranged marriages based on family backgrounds. Rabbinical lineages are highly esteemed and they became very inbred with cousins marrying cousins for generations.

Cultural and legal restrictions limited Jewish movement and whom they could marry.

Overprediction, meaning people being listed as being cousins more closely than they are, is one of the problems resulting from the endogamous population issue. Some labs “correct” for this issue, but the actual accuracy of the correction is unknown.

Jeffrey compared his FTDNA Family Finder test with the expected results for known relatives and he finds the results linear – meaning that the results line up with the expected match percentages for unrelated relatives. This means that FTDNA’s Jewish “correction” seems to be working quite well. Of course, they do have a great family group with which to calibrate their product. Bennett’s family is Jewish.

Jeffrey has downloaded the results of group participants into MSAccess and generates queries to test the hypothesis that Jewish participants have more matches than a non-Jewish control group.

The Jewish group had approximately 7% total non-Ashkenazi Jewish in their Population Finder results, meaning European and Middle Eastern Jewish. The non-Jewish group had almost exactly the opposite results.

Jewish people have from 1500-2100 matches.

Interfaith 700-1100 (Jewish and non)

NonJewish 60-616

Jewish people match almost 33% of the other Jewish people in the project. Jewish people match both Jewish and Interfaith families. NonJewish families match NonJewish and interfaith matches.

Jeffrey mentioned that many people have Jewish ancestry that they are unaware of.

This session was quite interesting. This study while conducted on the Jewish population, still applies to other endogamous populations that are heavily intermarried. One of the differences between Jewish populations and other groups, such as Amish, Brethren, Mennonite and Native American groups is that there are many Jewish populations that are still unmixed, where most of these other groups are currently intermixed, although of course there are some exceptions. Furthermore, the Jewish community has been endogamous longer than some of the other groups. Between both of those factors, length of endogamy and current mixture level, the Jewish population is probably much more highly admixed than any other group that could be readily studied.

Due to this constant redistribution of Jewish DNA within the same population, many Jewish people have a very high percentage of distant cousin relationships.

For non-Jewish people, if you find that your match number is in the endogamous range, with a proportionally very high number of distant cousins, you might want to consider the possibility that some of your ancestors descend from an endogamous population.
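As a rough illustration of that comparison, here is a minimal Python sketch that checks a match count against the three ranges quoted above. The function name is my own invention, and the ranges come from this one project at one point in time, so treat the result as a hint worth investigating and not as a diagnosis.

```python
# Match-count ranges as reported in the session notes above.
# These reflect one project's Family Finder results and will drift
# as the testing company's data base grows.
REPORTED_RANGES = {
    "Jewish": (1500, 2100),
    "Interfaith": (700, 1100),
    "NonJewish": (60, 616),
}


def groups_for_match_count(match_count):
    """Return the reported groups whose range contains match_count.

    Hypothetical helper for illustration only. An empty list means the
    count falls in a gap between the reported ranges.
    """
    return [
        group
        for group, (low, high) in REPORTED_RANGES.items()
        if low <= match_count <= high
    ]


print(groups_for_match_count(1800))  # ['Jewish']
print(groups_for_match_count(650))   # [] (falls between the reported ranges)
```

A count that lands in the Jewish or Interfaith range says nothing about which endogamous population might be involved. It only suggests that endogamy somewhere in the tree is worth a look.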

Unfortunately, the photo of Dr. Paul was unusable. I knew I should have taken my “real camera.”

Session 5 – Finding Your Indian Prince(ss) Without Having to Kiss Too Many Frogs

This was my session, and I’ll write about it later.

Someone did get a photo, which I’ve lifted from Jennifer Zinck’s great blog (thank you Jennifer), Ancestor Central. In fact, you can see her writeup for Day 1 here and she is probably writing Day 2’s article as I type this, so watch for it too.

At the end of the day, after the breakout sessions, roundtable discussions were held. There were several topics. Rebekah Canada, Marie Rundquist and I together “hostessed” the Y DNA and SNP discussion group, which was quite well attended. We had a wide range of expertise in the group and answered many questions. One really good aspect of these types of arrangements is that they are really set up for the participants to interact as well. In our group, for example, we got the question about what is a public versus a private SNP, and Terry Barton who was attending the session answered the question by telling about his “private” Barton SNPs which are no longer considered private because they have now been found in three other surname individuals/groups. This means they are listed on the “tree.” So sometimes public and private can simply be a matter of timing and discovery.

Here’s Bennett leading another roundtable discussion.

Session 7 – Dr. David Mittleman

Dr. Mittleman has a PhD in genetics, is a professor as well as an entrepreneur. He was one of the partners in Arpeggi and came along to Gene by Gene with the acquisition. He seems to be the perfect mixture of techie geek, scientist and businessman.

He began his session by talking a bit about the history of DNA sequencing and next generation sequencing, followed by a discussion about the expectation of privacy and how that has changed in the past few years with Google, which was launched in 1998, and Facebook, which was launched in 2004.

David also discussed how the prices have dropped exponentially in the past few years based on the increase in the sophistication of technology. Today, Y SNPs individually cost $39 to test, but for $199 at Nat Geo you can test 12,000 Y SNPs.

The WTY test, now discontinued, tested about 300,000 SNPs on the Y. It cost between $950 (if you were willing to make your results public) and $1500 (if the results were private).

Today, the Y chromosome can be sequenced on the Illumina chip which is the same chip that Nat Geo used and that the autosomal testing uses as well. Family Tree DNA announced their new Big Y product that will sequence 10 million positions and 25,000 known SNPs for an introductory sale price of $495 for existing customers. This is not a test that a new customer would ever order. The test will normally cost $695.

Candid Shots

Tech row in the back of the room – Elliott Greenspan at left seated at the table.

ISOGG Reception

The ISOGG reception is one of my favorite parts of the conference because everyone comes together, can sit in groups and chat, and the “arrival” adrenaline has worn off a bit. We tend to strategize, share success stories, help each other with sticky problems and otherwise have a great time. We all bring food or drink and sometimes pitch in to rent the room. We also spill out into the hallways where our impromptu “meetings” generally happen. And we do terribly, terribly geeky things like passing our iPhones around with our chromosome painting for everyone to see. Do we know how to party or what???

Here’s Linda Magellan working hard during the reception. I think she’s ordering the Big Y actually. We had several orders placed by admins during the conference.

We stayed up way too late visiting and the ISOGG meeting starts at 8 AM tomorrow!

Recently, as a comment to one of my blog postings, someone asked how the testing companies can reach so far back in time and tell you about your ancestors. Great question.

The tests that reliably reach the furthest back, of course, are the direct line Y-Line and mitochondrial DNA tests, but the commenter was really asking about the ethnicity predictions. Those tests are known as BGA, or biogeographical ancestry tests, but most people just think of them or refer to them as the ethnicity tests.

Currently, Family Tree DNA, 23andMe and Ancestry.com all provide this function as a part of their autosomal product along with the Genographic 2.0 test. In addition, third party tools available at www.gedmatch.com don’t provide testing, but allow you to expand what you can learn with their admixture tools if you upload your raw data files to their site. I wrote about how to use these ethnicity tools in “The Autosomal Me” series. I’ve also written about how accurate ethnicity predictions from testing companies are, or aren’t, here, here and here.

But today, I’d like to just briefly review the 3 steps in ethnicity prediction, and how those steps are accomplished. It’s simple, really, in concept, but like everything else, the devil is in the details.

There are three fundamental steps.

Creation of the underlying population data base.

Individual DNA extraction.

Comparison to the underlying population data base.

Step 1: Creation of the underlying population data base.

Don’t we wish this was as simple as it sounds. It isn’t. In fact, this step is the underpinnings of the accuracy of the ethnicity predictions. The old GIGO (garbage in, garbage out) concept applies here.

How do researchers today obtain samples of what ancestral populations looked like, genetically? Of course, the evident answer is through burials, but burials are not only few and far between, the DNA often does not amplify, or isn’t obtainable at all, and when it is, we really don’t have any way to know if we have a representative sample of the indigenous population (at that point in time) or a group of travelers passing through. So, by and large, with few exceptions, ancient DNA isn’t a readily available option.

The second way to obtain this type of information is to sample current populations, preferably ones in isolated regions, not prone to in-movement, like small villages in mountain valleys, for example, that have been stable “forever.” This is the approach the National Geographic Society takes and a good part of what the Genographic Geno 2.0 project funding does. Indigenous populations are in most cases our most reliable link to the past. These resources, combined with what we know about population movement and history are very telling. In fact, National Geographic included over 75,000 AIMs (Ancestry Informative Markers) on the Geno 2.0 chip when it was released.

The third way to obtain this type of information is by inference. Both Ancestry.com and 23andMe do some of this. Ancestry released its V2 ethnicity updates this week, and as a part of that update, they included a white paper available to DNA participants. In that paper, Ancestry discusses their process for utilizing contributed pedigree charts and states that, aside from immigrant locations, such as the United States and Canada, a common location for 4 grandparents is sufficient information to include that individual’s DNA as “native” to that location. Ancestry used 3000 samples in their new ethnicity predictions to cover 26 geographic locations. That’s only 115 samples, on average, per location to represent all of that population. That’s pretty slim pickins. Their most highly represented area is Eastern Europe with 432 samples and the least represented is Mali with 16. The regions they cover are shown below.

Survey Monkey, a widely utilized web survey company, in their FAQ about Survey Size For Accuracy provides guidelines for obtaining a representative sample. Take a look. No matter which calculations you use relative to acceptable Margin of Error and Confidence Level, Ancestry’s sample size is extremely light.
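To put some numbers on that, here is a small Python sketch using the standard survey-sampling formulas for margin of error and required sample size at a 95% confidence level. Please note the assumptions: the function names are mine, the worst-case proportion of 0.5 is a convention, and treating a genetic reference panel like a simple random survey sample is only an analogy, not how Ancestry describes their own method.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level


def margin_of_error(sample_size, proportion=0.5, z=Z_95):
    """Margin of error for a simple random sample from a large population."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


def required_sample_size(margin, proportion=0.5, z=Z_95):
    """Smallest sample size that achieves the requested margin of error."""
    return math.ceil(z ** 2 * proportion * (1 - proportion) / margin ** 2)


# Average reference samples per region, from the figures quoted above.
average_per_region = 3000 / 26
print(round(average_per_region))  # 115

# Margin of error for the smallest, average and largest reference groups.
for label, n in [("Mali", 16), ("average region", 115), ("Eastern Europe", 432)]:
    print(f"{label}: n={n}, margin of error about {margin_of_error(n):.1%}")

# Samples needed for a 5% margin of error at 95% confidence.
print(required_sample_size(0.05))  # 385
```

Under these assumptions, only the Eastern Europe group clears the roughly 385 samples that a 5% margin of error calls for, and the 16 Mali samples work out to a margin of error of about 24.5%.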

23andMe states in their FAQ that their ethnicity prediction, called Ancestry Composition covers 22 reference populations and that they utilize public reference datasets in addition to their clients’ with known ancestry.

23andMe asks geographic ancestry questions of their customers in the “where are you from” survey, then incorporates the results of individuals with all 4 grandparents from a particular country. One of the ways they utilize this data is to show you where on your chromosomes you match people whose 4 grandparents are from the same country. In their tutorial, they do caution that just because a grandparent was born in a particular location doesn’t necessarily mean that they were originally from that location. This is particularly true in the past few generations, since the industrial revolution. However, it may still be a useful tool, when taken with the requisite grain of salt.

The fourth way of creating the underlying population data base is to utilize academically published information or information otherwise available. For example, the Human Genome Diversity Project (HGDP) information which represents 1050 individuals from 52 world populations is available for scrutiny. Ancestry, in their paper, states that they utilized the HGDP data in addition to their own customer database as well as the Sorenson data, which they recently purchased.

Academically published articles are available as well. Family Tree DNA utilizes 52 different populations in their reference data base. They utilize published academic papers and the specific list is provided in their FAQ.

As you can see, there are different approaches and tools. Depending on which of these tools are utilized, the underlying data base may look dramatically different, and the information held in the underlying data base will assuredly affect the results.

Step 2: Your Individual DNA Extraction

This is actually the easy part – where you send your swab or spit off to the lab and have it processed. All three of the main players utilize chip technology today. For example, 23andMe focuses on and therefore utilizes medical SNPs, where Family Tree DNA actively avoids anything that reports medical information, and does not utilize those SNPs.

In Ancestry’s white paper, they provide an excellent graphic of how, at the molecular level, your DNA begins to provide information about the geographic location of your ancestors. At each DNA location, or address, you have two alleles, one from each parent. These alleles can have one of 4 values, or nucleotides, at each location, represented by the abbreviations T, A, C and G, short for Thymine, Adenine, Cytosine and Guanine. Based on their values, and how frequently those values are found in comparison populations, we begin to find correlations in geography, which takes us to the next step.

Step 3: Comparison to Underlying Population Data Base

Now that we have the two individual components in our recipe for ethnicity, a population reference set and your DNA results, we need to combine them.

After DNA extraction, your individual results are compared to the underlying data base. Of course, the accuracy will depend on the quality, diversity, coverage and quantity of the underlying data base, and it will also depend on how many markers are being utilized or compared.

For example, Family Tree DNA utilizes about 295,000 out of 710,000 autosomal SNPs tested for ethnicity prediction. Ancestry’s V1 product utilized about 30,000, but that has increased now to about 300,000 in the 2.0 version.

When comparing your alleles to the underlying data set one by one, patterns emerge, and it’s the patterns that are important. To begin with, T, A, C and G are not absent entirely in any population, so looking at the results, it then becomes a statistics game. This means that, as Ancestry’s graphic, above, shows, it becomes a matter of relativity (pardon the pun), and a matter of percentages.

For example, if the A allele above is shown in high frequencies in Eastern Europe, but in lower frequencies elsewhere, that’s good data, but may not by itself be relevant. However if an entire segment of locations, like a street of DNA addresses, are found in high percentages in Eastern Europe, then that begins to be a pattern. If you have several streets in the city of You that are from Eastern Europe, then that suggests strongly that some of your ancestors were from that region.
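The “street of addresses” idea can be sketched in a few lines of Python. This is a toy illustration and not any company’s actual algorithm: the allele frequencies are made up, the two population labels are placeholders, and I’m assuming each of your two alleles at a location is drawn independently from the population’s frequency.

```python
import math

# Hypothetical frequencies of the "A" allele at four neighboring
# locations (one "street"). None are 0 or 1, since no allele is
# entirely absent from any population.
ALLELE_A_FREQUENCY = {
    "Eastern Europe": [0.80, 0.75, 0.90, 0.70],
    "Elsewhere": [0.30, 0.40, 0.35, 0.45],
}


def genotype_probability(a_count, freq):
    """Probability of carrying a_count copies (0, 1 or 2) of allele A,
    assuming the two alleles are drawn independently with frequency freq."""
    return math.comb(2, a_count) * freq ** a_count * (1 - freq) ** (2 - a_count)


def segment_log_likelihood(a_counts, freqs):
    """Sum of log probabilities across every location in the segment."""
    return sum(
        math.log(genotype_probability(count, freq))
        for count, freq in zip(a_counts, freqs)
    )


def best_population(a_counts, table=ALLELE_A_FREQUENCY):
    """Return the best-fitting population and the score for each one."""
    scores = {
        population: segment_log_likelihood(a_counts, freqs)
        for population, freqs in table.items()
    }
    return max(scores, key=scores.get), scores


# Copies of allele A carried at each of the four locations.
my_segment = [2, 2, 1, 2]
best, scores = best_population(my_segment)
print(best)  # Eastern Europe
```

Notice that the third location, taken by itself, actually fits “Elsewhere” slightly better. It is only when the whole street is scored together that the Eastern Europe pattern wins, which is exactly why a single address isn’t enough.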

To show this in more detailed format, I’m shifting to the third party tool, GedMatch and one of their admixture tools. I utilized this when writing the series, “The Autosomal Me” and in Part 2, “The Ancestor’s Speak,” I showed this example segment of DNA.

On the graph below, which is my chromosome painting of a small part of one of my chromosomes on the top, and my mother’s showing the exact same segment on the bottom, the various types of ethnicity are colored, or painted.

The grid shows location, or address, 120 on the chromosome and each tick mark is another number, so 121, 122, etc. It’s numbered so we can keep track of where we are on the chromosome.

You can readily see that both of us have a primary ethnicity of North European, shown by the teal. This means that for this entire segment, the results are that our alleles are found in the highest frequencies in that region.

However, notice the South Asian, East Asian, Caucasus, and North Amerindian. The important part to notice here, other than I didn’t inherit much of that segment at 123-127 from her, except for a small part of East Asian, is that these minority ethnicities tend to nest together. Of course, this makes sense if you think about it. Native Americans would carry Asian DNA, because that is where their ancestors lived. By the same token, so would Germans and Polish people, given the history of invasion by the Mongols. Well, now, that’s kind of a monkey-wrench isn’t it???

This illustrates why the results may sometimes be confusing as well as how difficult it is to “identify” an ethnicity. Furthermore, small segments such as this are often “not reported” by the testing companies because they fall under the “noise” threshold of between about 5 and 7cM, depending on the company, unless there are a lot of them and together they add up to be substantial.

In Summary

In an ideal world, we would have one resource that combines all of these tools. Of course, these companies are “for profit,” except for National Geographic, and they are not going to be sharing their resources anytime soon.

I think it’s clear that the underlying data bases need to be expanded substantially. The reliability of utilizing contributed pedigrees as representative of a population indigenous to an area is also questionable, especially pedigrees that only reach back two generations.

All of these tools are still in their infancy. Both Ancestry and Family Tree DNA’s ethnicity tools are labeled as Beta. There is useful information to be gleaned, but don’t take the results too seriously. Look at them more as establishing a pattern. If you want to take a deeper dive by utilizing your raw data and downloading it to GedMatch, you can certainly do so. The Autosomal Me series shows you how.

Just keep in mind that with ethnicity predictions, with all of the vendors, as is particularly evident when comparing results from multiple vendors, “your mileage may vary.” Now you know why!

I can’t even begin to tell you how many questions I receive that go something like this:

“I received my ethnicity results from XYZ. I’m confused. The results don’t seem to align with my research and I don’t know what to make of them?”

In the above question, the vendors who are currently offering these types of results among their autosomal tests are Family Tree DNA, 23andMe and Ancestry along with National Geographic who is a nonprofit. Of those four, by far, Ancestry is the worst at results matching reality and who I receive the most complaints and comments about. I wrote an article about Ancestry’s results and Judy Russell recently wrote an article about their new updated results as did Debbie Kennett. My Ancestry results have not been updated yet, so I can’t comment personally.

Let’s take a look at the results from the four players and my own analysis.

Some years back, I did a pedigree analysis of my genealogy in an attempt to make sense of autosomal results from other companies.

The pedigree analysis portion of this document begins about page 8. My ancestral breakdown is as follows:

Geography | Percent
Germany | 23.8041
British Isles | 22.6104
Holland | 14.5511
European by DNA | 6.8362
France | 6.6113
Switzerland | 0.7813
Native American | 0.2933
Turkish | 0.0031

This leaves about 25% unknown. However, this looks nothing like the 80% British Isles and the 12% Scandinavian at Ancestry.
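The "about 25% unknown" figure is just the pedigree percentages above subtracted from 100. A short Python check, using only the numbers from my ancestral breakdown:

```python
# Pedigree percentages from the ancestral breakdown above.
pedigree = {
    "Germany": 23.8041,
    "British Isles": 22.6104,
    "Holland": 14.5511,
    "European by DNA": 6.8362,
    "France": 6.6113,
    "Switzerland": 0.7813,
    "Native American": 0.2933,
    "Turkish": 0.0031,
}

known = sum(pedigree.values())
unknown = 100.0 - known

print(f"Known:   {known:.4f}%")
print(f"Unknown: {unknown:.4f}%")
```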

Here are my current ethnicity results from the three major testing companies plus Genographic.

Ancestry

80% British Isles

12% Scandinavian

8% Uncertain

Family Tree DNA

75% Western Europe

25% Europe – Romanian, Russian, Tuscan, Finnish

23andMe (Standard Estimate)

99.2% European

0.5% East Asian and Native American

0.3% Unassigned

Genographic 2.0

Northern European – 43%

Mediterranean – 36%

Southwest Asian – 18%

Why Don’t The Results Match?

Why don’t the results match either my work or each other?

1. The first answer I always think of when asked this question is that perhaps some of the genealogy is incorrect. That is certainly a possibility via either poor genealogy research or undocumented adoptions. However, as time has marched forward, I’ve proven that I’m descended from most of these lines through either Y-line, mitochondrial DNA or autosomal matches. This confirms my genealogy research. For example, Acadians were originally French and I definitely descend from Acadian lines.

2. The second answer is time. The vendors may well be using different measures of time, meaning more recent versus deep ancestry. Geno 2.0 looks back the furthest. Their information says that “your percentages reflect both recent influences and ancient genetic patterns in your DNA due to migrations as groups from different regions mixed over thousands of years. Your ancestors also mixed with ancient, now extinct hominid cousins like Neanderthals in Europe and the Middle East or the Denisovans in Asia.”

It’s difficult to determine which of the matching populations are more recent and which are less recent. By way of example, many Germans and others in eastern Europe are descendants of Genghis Khan’s Mongols who invaded portions of Europe in the 13th century. So, do we recognize and count their DNA when found as “German,” “Polish,” “Russian,” or “Asian?” The map below shows the invasions of Genghis Khan. Based on this, Germans who descend from Genghis’s Mongols could match Koreans on those segments of DNA. Both of those people would probably find that confusing.

3. The third answer is the reference populations. Here is what National Geographic has to say: “Modern day indigenous populations around the world carry particular blends of these regions. We compared your DNA results to the reference populations we currently have in our database and estimated which of these were most similar to you in terms of the genetic markers you carry. This doesn’t necessarily mean that you belong to these groups or are directly from these regions, but that these groups were a similar genetic match and can be used as a guide to help determine why you have a certain result. Remember, this is a mixture of both recent (past six generations) and ancient patterns established over thousands of years, so you may see surprising regional percentages.”

Each of the vendors has compiled their own list of reference populations from published material, and in the case of National Geographic, as yet unpublished material as well.

If you read the fine print, some of these results that at first glance appear to not match actually do, or could. For example, Southwest Asia (Geno 2.0) could be Russia (Family Tree DNA) or at least pointing to the same genetic base.

This video map of Europe through the ages from 1000AD to present will show the ever changing country boundaries and will quickly explain why coming up with labels for ethnicity is so difficult. I mean, what exactly does “France” or “Germany” mean, and when?

4. The fourth answer is focus. Each of these organizations comes to us as a consumer with a particular focus. Of them, one and only one must make their way on their own merits alone. That one is Family Tree DNA. Unlike the Genographic Project, Family Tree DNA doesn’t have a large nonprofit behind them. Unlike 23andMe, they are not subsidized by the medical community and venture capital. And unlike Ancestry.com, Family Tree DNA is not interested in selling you a subscription. In fact, the DNA market could dry up and go away for any of those three, meaning 23andMe, National Geographic and Ancestry, and their business would simply continue with their other products. To them, DNA testing is only a blip on a spreadsheet. Not true for Family Tree DNA. Their business IS genetic genealogy and DNA testing. So of all these vendors, they can least afford to have upset clients and are therefore the most likely to be vigilant about the accuracy of their testing and the quality of the tools and results provided to customers.

I think that as more academic papers are published and we learn more about these reference populations and where their genes are found in various populations, all of these organizations will have an opportunity to “tighten up” their results. If you’ll notice, both Ancestry and Family Tree DNA still include the word “beta.” The vendors know that these results are not the end all and be all in the ethnicity world.

Am I upset with these vendors? Aside from Ancestry who has to know they have a significant problem and has yet to admit to or fix it, no, I’m not. Frustrated, as a consumer, yes, because like all genealogists, I want it NOW please and thank you!!!

Without these kinds of baby steps, we will never as a community crawl, walk, or run. I dream of the day when we will be able to be tested, obtain our results, and along with that, maybe a list of ancestors we descend from and where their ancestors originated as well. So, in essence, current genealogy (today Y-line and mtdna), older genealogy (autosomal lines) and population genetics (ethnicity of each line).

So what should we as consumers do today? Personally, I think we should file this information away in the “that’s interesting” folder and use it when and where it benefits us. I think we should look at it as a display of possibilities. We should not over-interpret these results.

There is perhaps one area of exception, and that is when dealing with majority ethnic groups. By this, I mean African, Asian, Native American and European. For those groups, the presence or absence of a particular group in this type of ethnicity breakdown is generally more correct than incorrect. Very small amounts of any admixture are difficult for any vendor to discern. For an example of that, look at my Native percentages, and some of those are proven lines. For the individual who wants more information, and more detail into the possibilities, I wrote about how to use the raw autosomal data outside of the vendors’ tools, at GedMatch, to sort out minority admixture in The Autosomal Me series.

Perhaps the Genographic Project page sums it up best with their statement that, “If you have a very mixed background, the pattern can get complicated quickly!” Not only is that true, it can be complicated by any and probably all of the factors above. When you think about it, it’s rather amazing that we can tell as much as we can.

_____________________________________________________________________

Standard Disclosure

This standard disclosure appears at the bottom of every article in compliance with the FTC Guidelines.

Hot links are provided to Family Tree DNA, where appropriate. If you wish to purchase one of their products, and you click through one of the links in an article to Family Tree DNA, or on the sidebar of this blog, I receive a small contribution if you make a purchase. Clicking through the link does not affect the price you pay. This affiliate relationship helps to keep this publication, with more than 900 articles about all aspects of genetic genealogy, free for everyone.

I do not accept sponsorship for this blog, nor do I write paid articles, nor do I accept contributions of any type from any vendor in order to review any product, etc. In fact, I pay a premium price to prevent ads from appearing on this blog.

When reviewing products, in most cases, I pay the same price and order in the same way as any other consumer. If not, I state very clearly in the article any special consideration received. In other words, you are reading my opinions as a long-time consumer and consultant in the genetic genealogy field.

I will never link to a product about which I have reservations or qualms, either about the product or about the company offering the product. I only recommend products that I use myself and bring value to the genetic genealogy community. If you wonder why there aren’t more links, that’s why and that’s my commitment to you.

Thank you for your readership, your ongoing support and for purchasing through the affiliate link if you are interested in making a purchase at Family Tree DNA, or one of the affiliate links below:

“The Autosomal Me” is a 9 part series published between February 6, 2013 and May 31, 2013 on the blog, www.dna-explained.com. I’ve been asked to do a couple of things. First, to put together a document that has all of the links in one place, and second, to create a pdf file of the compiled articles for download.

The second part turned out to be easier than I anticipated. One of my readers also saw that request and put together the pdf file, which I’ve uploaded to my website at www.dnaexplain.com and is now available (free) with lots of other good things under the Publications tab. Very big hat tip to John for doing this. Thank you so much!

In Part 8, “The Autosomal Me – Extracting Data Segments and Clustering,” we extracted all of the Native and Blended Asian segments in all 22 chromosomes, but only used chromosomes 1 and 2 for illustration purposes. We then clustered the resulting data to look for trends, grouping clusters by either the Strong Native criteria or the Blended Asian criteria.

The final segment, Part 9, “The Autosomal Me – The Holy Grail – Identifying Native Genealogy Lines,” utilized all of the chromosomal information we’ve gathered in the earlier steps. We apply that information to our matches and determine which of our lines are the most likely to have Native Ancestry. This, of course, fulfills the goal of using DNA information to identify small amounts of minority admixture.

In summary, this series has been quite interesting and indeed, it did achieve the goals initially set forth. However, it was very manually intensive and took far longer than anticipated, partly due to circumstances beyond my control, like software updates and vendor changes. A second reason that it took longer than expected was the sheer amount of work involved in the various steps, particularly steps 8 and 9. In addition, because Minority Admixture Mapping (MAP) is developmental, I had to try several different approaches to determine which one, or ones, worked best. Despite the immense amount of work, I would certainly describe this approach as useful and successful. In fact, I don’t know how else I would have ever eliminated some genealogical lines as candidates for Native heritage and focused on others without MAP’s new techniques combined with both old and new tools provided by others.

Having said that, I would suggest that this technique, because of the intensive manual effort required, is only for the very committed genetic genealogist – the warrior, so to speak. It also will not work well with only a few matches. I would suggest that you would need at least 200 or 300 matches, preferably more, which is typical of someone with colonial American heritage. If that is you, and you are desperate to find your minority admixed lines….then this type of project may be for you. Please thoroughly read all 9 articles before beginning.

Many of the techniques in the various steps can be utilized individually, without completing the entire MAP process. For example, comparing vendor and third party results, using the GedMatch admixture tools and the chromosome comparisons for percentages of ethnicity all provide useful information in their own right, outside of the full MAP process.

Bon voyage on your journey of discovery to find “The Autosomal You”! Your ancestors are the pot of gold at the end of that rainbow.

Sangreal – the Holy Grail. We are finally here, Part 9 and the final article in our series. The entire purpose of The Autosomal Me series has been to use our DNA and the clues it holds to identify minority admixture, in this case, Native American, and by identifying those Native segments, and building chromosomal clusters, to identify the family lines that contributed that Native admixture. Articles 1-8 in the series set the stage, explained the process and walked us through the preparatory steps. In this last article, we apply all of the ingredients, fasten the lid, shake and see what we come up with. Let’s take a minute and look at the steps that got us to this point.

In Part 8, “The Autosomal Me – Extracting Data Segments and Clustering,” we extracted all of the Native and Blended Asian segments in all 22 chromosomes, but only used chromosomes 1 and 2 for illustration purposes. We then clustered the resulting data to look for trends, grouping clusters by either the Strong Native criteria or the Blended Asian criteria.

In this final segment, Part 9, we will be applying the chromosomal information we’ve gathered to our matches and determine which of our lines are the most likely to have Native Ancestry. This, of course, has been the goal all along. So, drum roll…..here we go.

In Part 8, we ended by entering the start and stop locations of both Strong Native and Blended Asian clusters into a table to facilitate easy data entry into the chromosome match spreadsheet downloaded from either 23andMe or Family Tree DNA. If you downloaded it previously, you might want to download it again if you haven’t modified it, or download new matches since you last downloaded the spreadsheet and add them to the master copy.

My goal is to determine which matches and clusters indicate Native ancestry, and how to correlate those matches to lineage. In other words, which family lines in my family were Native or carry Native heritage someplace.

The good news is that my mother’s line has proven Native heritage, so we can use her line as proof of concept. My father’s family has so many unidentified wives, marginalized families and family secrets that the Native line could be almost any of them, or all of them! Let’s see how that tree shakes out.

Finding Matches

So let’s look at a quick example of how this would work. Let’s say I have a match, John, on chromosome 4 in an area where my mother has no Native admixture, but I do. Since John does not match my mother, the match came from my father. If we can identify other people who also match both John and me in that same region on that chromosome, they too have Native ancestry. Let’s say that we all also share a common ancestor. It stands to reason at that point that the common ancestor between us indicates the Native line, because we all match on the Native segment and have the same ancestor. Obviously, this would help immensely in identifying Native families and at least giving pointers in which direction to look. This is a “best case” example. Some situations, especially where both parents contribute Native heritage to the same chromosome, won’t be this straightforward.
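The "John" reasoning boils down to a simple rule: if someone matches me and also matches my mother, the match is maternal; if they match me but not my mother, I infer it is paternal. Here is a minimal sketch in Python of that best-case rule. The function name and the match names other than John are made up for illustration, and the sketch ignores the messier cases where both parents contribute to the same region.

```python
def infer_side(my_matches, mothers_matches):
    """Label each of my matches by whether Mother also matches that person.

    A match shared with Mother is maternal. A match Mother lacks is only
    *inferred* to be paternal, since Dad's DNA is not available.
    """
    mothers = set(mothers_matches)
    return {
        name: "maternal" if name in mothers else "paternal (inferred)"
        for name in my_matches
    }


# Hypothetical match lists for one region of chromosome 4.
my_matches = ["John", "Cousin A", "Cousin B"]
mothers_matches = ["Cousin B"]

sides = infer_side(my_matches, mothers_matches)
for name, side in sides.items():
    print(f"{name}: {side}")
```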

Based on our findings, the maximum range and the minimum (least common denominator, or “In Common”) range are as follows for the strongest Native segments on chromosomes 1 and 2.

Range | Chromosome 1 | Chromosome 2
Largest Range | 162,500,000 – 180,000,000 | 79,000,000 – 105,000,000
Smallest Range | 165,658,091 – 171,000,000 | 90,000,000 – 103,145,425
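One way to think about these two ranges is that the largest range runs from the earliest start to the latest stop of the overlapping segments, and the smallest ("In Common") range is the stretch that every segment covers. The Python sketch below works on that assumption. The two input segments are hypothetical, chosen only so that the result reproduces the chromosome 1 figures; the actual segments came from the admixture tools in Part 8.

```python
def largest_and_smallest_range(segments):
    """Return (largest, smallest) ranges for a list of (start, stop) segments.

    largest  = earliest start to latest stop (everything any segment covers)
    smallest = latest start to earliest stop (what all segments share),
               or None if the segments have no region in common
    """
    starts = [start for start, _ in segments]
    stops = [stop for _, stop in segments]
    largest = (min(starts), max(stops))
    smallest = (max(starts), min(stops))
    if smallest[0] > smallest[1]:
        smallest = None
    return largest, smallest


# Hypothetical segments, picked to reproduce the chromosome 1 ranges.
chr1_segments = [
    (162_500_000, 171_000_000),
    (165_658_091, 180_000_000),
]

largest, smallest = largest_and_smallest_range(chr1_segments)
print("Largest range: ", largest)
print("Smallest range:", smallest)
```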

At GedMatch

At GedMatch, I used a comparison tool to see who matched me on chromosome 1. Only 2 people outside of immediate family members matched, and both were from Family Tree DNA. Both matched me on the critical Native segments between about 165-180 Mb. I was excited. I went to Family Tree DNA and checked to see if these two people also matched my mother, which would confirm the Native connection, but neither did, indicating of course that these two people matched me on my father’s side. That didn’t help identify any common Native heritage with my mother on chromosome 1. It did, however, eliminate them as possibilities, which is valuable information as well.

DNAGedcom

I used a new tool, DNAGedcom, compliments of Rob Warthen who has created a website, DNA Tools, at www.dnagedcom.com. This wonderful tool allows you to download all of your autosomal matches at Family Tree DNA and 23andMe along with their chromosomal segment matches. Since my mother’s DNA has only been tested at Family Tree DNA, I’m limiting the download to those results for now, because what I need is to find the people who match both her and me on the critical segments of chromosome 1 or 2.

Working with the Download Spreadsheet

It was disappointing to discover that my mother and I had no common matches that fell into this range on chromosome 1, but chromosome 2 was another matter. Please note that I have redacted match surnames for privacy.

The spreadsheet above shows the comparison of my matches (pink) and Mother’s (white). The Native segment of chromosome 2 where I match Mother is shaded mustard. I shaded the chromosome segments that fell into the “common match” range in green. Of those matches, there is only one person who matches both Mother and me, Emma. The next step, of course, is to contact Emma and see if we can discover our common ancestor, because whoever it is, that is the Native line. As you might imagine, I am chomping at the bit.
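If you would rather not shade the spreadsheet by hand, the same filter can be sketched in a few lines of Python. The idea is to keep only the segments that overlap the Native range on chromosome 2, then see which names show up for both Mother and me. The row layout (name, chr, start, end) and every coordinate below are invented for illustration; a real download will have its own column names, and only Emma’s name comes from my actual results.

```python
# Smallest ("In Common") Native range on chromosome 2, from the table earlier.
NATIVE_CHR2 = (90_000_000, 103_145_425)


def overlaps(seg_start, seg_end, region):
    """True if the segment shares at least one position with the region."""
    return seg_start <= region[1] and seg_end >= region[0]


def names_in_region(rows, chromosome, region):
    """Names of matches with a segment on this chromosome overlapping the region."""
    return {
        row["name"]
        for row in rows
        if row["chr"] == chromosome and overlaps(row["start"], row["end"], region)
    }


def common_matches(my_rows, mother_rows, chromosome, region):
    """People who match both of us inside the region."""
    return names_in_region(my_rows, chromosome, region) & names_in_region(
        mother_rows, chromosome, region
    )


# Hypothetical match rows; the coordinates are made up.
my_rows = [
    {"name": "Emma", "chr": 2, "start": 95_000_000, "end": 101_000_000},
    {"name": "Match B", "chr": 2, "start": 91_000_000, "end": 94_000_000},
    {"name": "Match C", "chr": 2, "start": 10_000_000, "end": 20_000_000},
]
mother_rows = [
    {"name": "Emma", "chr": 2, "start": 96_000_000, "end": 102_000_000},
    {"name": "Match C", "chr": 2, "start": 10_000_000, "end": 20_000_000},
]

shared = common_matches(my_rows, mother_rows, 2, NATIVE_CHR2)
print(shared)
```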

There are no segments of chromosome 2 that are unquestionably isolated to my father’s line.

Kicking it up a Notch

Are you wondering about now how something that started out looking so simple got so complex? Well, I am too, you’re not alone. But we’ve come this far, so let’s go that final leg in this journey. My mom always used to say there was no point in doing something at all if you weren’t going to do it right. Sigh….OK Mom.

The easiest way to facilitate a chromosome by chromosome comparison with all of your matches and your Strong Native and Blended Asian segments is to enter all of these segment groups into the match spreadsheet. If you’re groaning and your eyes glaze over right after you do one big ole eye roll, I understand.

But let’s take a look at how this helps us.

On the excerpt from my spreadsheet below, for a segment of chromosome 5, I have labeled the people and how they match to me. The ones labeled “Mom” in the last column are labeled that way because these people match both Mom and me. The ones labeled “Dad” are labeled that way because I know that person is related on my father’s side.

Using the information from the tables created in Step 8, I entered the beginning and end of all matching segment clusters into my spreadsheet. You can see these entries on lines 7, 8, 22, 23 and 24. You then proceed to colorize your matches based on the entry for either Mom or Dad – in other words the blue row or the purple row, line 7, 22 or 24. In this example, actually, line 5 Rex, based on the coloration, should have been half blue and half purple, but we’ll discuss his case in a minute.

You can then sort either by match name or by chromosome to view data in both ways. Let’s look at an example of how this works.

Legend:

White Rows: Mother’s matches. When Mother and I both match an individual, you’ll see the same matches for me in pink. This double match indicates that the match is to Mother’s side and not Father’s side.

Pink Rows: My matches.

Purple “Mom” labels in last column: The individual matches both me and Mom. This is a genetic match.

Teal “Dad” labels in last column: Genealogically proven to be from my father’s side. This is a genealogical, not a genetic label, since I don’t have Dad’s DNA and can only infer these genetically when they don’t also match Mother.

Dark Pink Rows labeled “Me Amerind Only” are Strong Native or Blended Asian segments from Chromosome Table that I have entered. My segments must come from one of my parents, so I’ve either colored them purple, if the match is someone who matches Mother and I both, or teal, if they don’t match both Mom and I, so by inference they come from my father’s line.

Dark Teal Rows labeled “Dad Amerind Only” are inferred segments belonging to my father based on the fact that Mother and I don’t share them.

Inferred Relationships

This is a good place to talk for just a minute about inferred relationships in this context. Inference gets somewhat tenuous or weak. The inferred matches on my father’s side began with the Native segments in the admix tools. Some inferences are very strong, where Mother has no Native at all in that region. For example, Mom has European and I have Native American. No question, this had to come from my father. But other cases are much less straightforward.

In many cases, categorization may be the issue. Mom has West Asian for example and I have Siberian or Beringian. Is this a categorization issue or is this a real genetic difference, meaning that my Siberian/Beringian is actually Native and came from my father’s side?

Other cases of confusion arise from segment misreads, etc. I’ve actually intentionally included a situation like this below, so we can discuss it. Like all things, some amount of common sense has to enter the picture, and known relationships will also weigh heavily in the equation. How known family members match on other chromosome segments is important too. Do you see a pattern or is this match a one-time occurrence? Patterns are important.

Keep in mind that these entries only reflect STRONG Asian or Native signals, not all signals. So even if Mother doesn’t have a strong signal, it doesn’t mean that she doesn’t have ANY signal in that region. In some cases, start and stop segments for Mom and Dad overlapped due to very long segments on some matches. In this case, we have to rely on the fact that we do have Mother’s actual DNA and assume that if they aren’t also a match to Mother, that what we are seeing is actually Dad’s lines, although this may not in actuality always be true. Why? Because we are dealing with segments below the matching threshold limit at both Family Tree DNA and 23andMe, and both of my parents carry Native heritage. We can also have crossed a transitional boundary where the DNA that is being matched switches from Mom’s side to Dad’s side.

Ugh, you say, now that’s getting messy. Yes, it is, and it has complicated this process immensely.

The Nitty-Gritty Data Itself

Taking a look at this portion of chromosome 5, we have lots going on in this cluster. Most segments will just be boring pink and white (meaning no Native), but this segment is very busy. Mom and I match on a small segment from 52,000,000 to 53,000,000. Indeed, this is a very short segment when compared to the entire chromosome, but it is strongly Native. We both also match Rex, our known cousin. I’ve noted him with yellow in the table. Please note that Mom’s white matches are never shaded. I am focused on determining where my own segments originate, so coloring Mother’s too was only confusing. Yes, I did try it.

You can see that Mother actually shares all or any part of her segment with only me and Rex. This simplifies matters, actually. However, also note that I carry a larger segment in this region than does Mother, so either we have a categorization issue, a misread, or my father also contributed. So, a conundrum. This very probably implies that my father also carried Native DNA in this region.

Let’s see what Rex’s DNA looks like on this same segment of chromosome 5, from 52-53 using Eurogenes. In the graph below, my chromosome is the top bar, Rex’s the middle and the bottom bar shows common DNA with the black nonmatching. Yellow is Native American, red is South Asian, putty is Siberian, lime green is Mediterranean, teal is North Europe, orange is Caucasus.

This same comparison is shown to Mother’s DNA (top row) below.

It’s interesting that while Mother doesn’t have a lot of yellow (Native), she does have it throughout the same segment where Rex’s occurs, from about 52 through 53.5.

Does this actually point to a Native ancestor in the common line between Rex, Mom and I, which is the Swiss/German Johann Michael Miller line which does include an unidentified wife stateside, or does this simply indicate a common ancient population long ago in Asia? It’s hard to say and is deserving of more research. I feel that it is most likely Native because of the actual yellow, Native segment. If this was an Asian/European artifact, it would be much less likely to carry the actual yellow segment.

Is Rex also genealogically related to my father? As I’ve worked through this process with all of my chromosomes and matches, I’ve really come to question if one of my father’s dead ends is also an ancestral line of my mother’s.

The key to making sense of these results is clusters.

Clusters vs Singleton Outliers

The work we’ve already done, especially in Step 8, clusters the actual DNA matching segments. We’ve now entered that information into the spreadsheet and colored the segments of those who match. What’s next?

The key is to look for people with clusters. Many matches will have one segment, of say, 10 that match, colored. Unless this is part of a large chromosome cluster, it’s probably simply an outlier. Part of a large chromosome cluster would be like the large Strong Native segments on chromosome 1 or 2, for example. How do we tell if this is a valid match or just an outlier?

Sort the spreadsheet by match name. Take a look at all of the segments.

The example we’ll use is that of my cousin, Rex. If you recall, he matches both me and Mother, is a known first cousin twice removed to me, (genetically equal to a second cousin), and is descended from the Miller line.
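The "genetically equal to a second cousin" remark is just arithmetic on expected shared DNA. Full first cousins share about 1/8 of their DNA on average, each additional cousin degree divides that by 4, and each "removed" divides it by 2. A small Python sketch of that rule of thumb follows; these are averages for full cousins only, and real amounts vary from person to person.

```python
from fractions import Fraction


def expected_shared_fraction(cousin_degree, times_removed=0):
    """Average fraction of autosomal DNA shared by full cousins.

    First cousins: 1/8. Each extra cousin degree divides by 4,
    each generation of removal divides by 2.
    """
    return Fraction(1, 2 ** (2 * cousin_degree + 1 + times_removed))


first_cousin_twice_removed = expected_shared_fraction(1, times_removed=2)
second_cousin = expected_shared_fraction(2)

print(f"1C2R:          {float(first_cousin_twice_removed):.3%}")
print(f"Second cousin: {float(second_cousin):.3%}")
```

Both come out to 1/32, or about 3.125%, which is why a first cousin twice removed and a second cousin look alike in terms of expected matching.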

In this example, I also colored Mother’s segments because I wanted to see which segments that I did not receive from her were also Native. You can see that there are many segments where we all match and several of those are Native. These also match to other Miller descendants as well, so are strongly indicative of a Native connection someplace in our common line.

If we were only to see one Native segment, we would simply disregard this as an outlier situation. But that’s not the case. We see a cluster of matches on various segments, we match other cousins from the same line on these segments, and reverting back to the original comparison admixture tools verifies these matches are Native for Rex, Mom and me.
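Sorting by match name and eyeballing the colored rows can also be expressed as a simple count. The Python sketch below tallies how many of each person’s matching segments fall on Native segments and flags anyone with only one as a probable outlier. The cutoff of two segments is my own working assumption for illustration, not a hard rule, and the rows are invented; the sketch also does not handle the exception for a single segment that sits inside one of the large chromosome clusters.

```python
from collections import Counter


def native_segment_counts(rows):
    """Count, per match name, the matching segments flagged as Native."""
    return Counter(row["name"] for row in rows if row["native"])


def classify(rows, min_segments=2):
    """Label each match with any Native segment as 'cluster' or 'outlier'.

    min_segments is an assumed cutoff: one Native segment alone is
    treated as an outlier, two or more as a cluster worth a closer look.
    """
    counts = native_segment_counts(rows)
    return {
        name: "cluster" if count >= min_segments else "outlier"
        for name, count in counts.items()
    }


# Hypothetical rows: one per matching segment, flagged Native or not.
rows = [
    {"name": "Rex", "chr": 5, "native": True},
    {"name": "Rex", "chr": 9, "native": True},
    {"name": "Rex", "chr": 12, "native": True},
    {"name": "Rex", "chr": 14, "native": False},
    {"name": "Match A", "chr": 3, "native": True},
    {"name": "Match A", "chr": 7, "native": False},
]

labels = classify(rows)
print(labels)
```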

Hmmmm…..what is Dad’s blue segment color doing in there? Remember I said that we are only dealing with strong match segments? Well, Mom didn’t have a strong segment at that location and so we inferred that Dad did. But we know positively that this match does come from Mother’s side. I also mentioned that I’ve come to wonder if my Mom and Dad share a common line. It’s the Miller line that’s in question. One of Johann Michael Miller’s children, Lodowick, moved from Pennsylvania to Augusta County, Virginia in the 1700s and his line became Appalachian, winding up in many of the same counties as my father’s family. I’m going to treat this as simply an anomaly for now, but it actually could be, in this case, a small indication that these lines might be related. It also might be a weak “Mom” match, or irrelevant. I see other “double entries” like this in other Miller cousins as well.

What is the pink row on chromosome 12? When I grouped the Strong Native and Asian Clusters, sometimes I had a strong grouping, and Mom had some. The way I determined Dad’s inferred share was to subtract what Mom had in those segments from mine. In a few cases, Mom didn’t have enough segments to be considered a cluster but she had enough to prevent Dad from being considered a cluster either, so those are simply pink, me with no segment coloring for Mom or Dad.

Let’s say I carry Strong Native/Mixed Asian at the following 8 locations:

10, 12, 14, 16, 18, 20, 22, 24

This meets the criteria for 8 of 15 ethno-geographic locations (in the admix tools) within a 2.5 cM distance of each other, so this cluster would be included in the Mixed Asian for me. It could also be a Strong Native cluster if it was found in 3 of 4 individual tools. Regardless of how, it has been included.

Let’s now say that Mom carries Native/Mixed Asian at 10, 12 and 14, but not elsewhere in this cluster.

Mom’s 3 does not qualify her for the 8/15 and it only leaves Dad with 5 inferred segments, which disqualifies him too. So in this case, my cluster would be listed, but not attributable directly to either parent.

What this really says is that both of my parents carry some Native/Blended Asian on this segment and we have to use other tools to extrapolate anything further. The logic steps are the same as for Dad’s blue segment. We’re going to treat that as an outlier. If I really need to know, I can go back to the actual admixture tools and see whether Mom or Dad really match me strongly on which segments and how we compare to Rex as well. In this case, it’s obvious that this is a match to my Mother’s side, so I’m leaving well enough alone.
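For readers who would rather script this than count by hand, the subtraction in the example above can be sketched in a few lines of Python. This is only an illustration of the logic, not the spreadsheet I actually used: the function name attribute_cluster is made up, and the threshold of 8 is the “8 of 15” cutoff described above.

```python
CLUSTER_THRESHOLD = 8  # the "8 of 15" cutoff described in the text


def attribute_cluster(my_hits, mom_hits, threshold=CLUSTER_THRESHOLD):
    """Split my hits into Mom's share and Dad's inferred share.

    Dad's share is inferred by subtraction: whatever I carry that Mom
    does not. Each share is then tested against the cluster threshold.
    (attribute_cluster is a hypothetical helper, for illustration only.)
    """
    mine = set(my_hits)
    mom_share = mine & set(mom_hits)
    dad_inferred = mine - mom_share
    return {
        "me": len(mine) >= threshold,
        "mom": len(mom_share) >= threshold,
        "dad_inferred": len(dad_inferred) >= threshold,
        "dad_locations": sorted(dad_inferred),
    }


# The worked example: I carry 8 locations, Mom carries 3 of them.
result = attribute_cluster([10, 12, 14, 16, 18, 20, 22, 24], [10, 12, 14])
print(result)
```

With these numbers my cluster qualifies, but neither Mom’s 3 nor Dad’s inferred 5 reaches the cutoff, which is exactly the “pink row” situation.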

Let’s see what the matches reveal.

Matches

Referring back to the Nitty Gritty Data spreadsheet, Mom’s match to Phyllis on row 15 confirms an Acadian line. This is the known line of Mother’s Native ancestry. This makes sense and they match on Native segments on several other chromosomes as well. In fact, many of my and Mother’s matches have Acadian ancestry.

My match to row 19, Joy, is a known cousin on my father’s side with common Campbell ancestry. This line is short, however, because our common ancestor, believed to be Charles Campbell, died before 1825 in Hawkins County, TN. He was probably born before 1750, given that his sons were born about 1770 and 1772. Joy and I descend from those 2 sons. Charles’ wife and parents are unknown.

My match to row 20, inferred through my father’s side, is to a Sizemore, a line with genetically proven Native ancestry. Of course, this needs more research, but it may be a large hint. I also match with several other people who carry Sizemore ancestors. This line appears to have originated near the NC/VA border.

I wanted to mention rows 4 and 17. Using our rules for the spreadsheet, if I match someone and they don’t also match Mother on this segment, I have inferred them to be through my father. These are two instances that this is probably incorrect. I do match these people through Mother, but Mother didn’t carry a strong signal on this segment, so it automatically became inferred to Dad. Remember, I’m only recording the Strong Native or the Blended Asian segments, not all segments. However, I left the inferred teal so that you can see what kinds of judgment calls you’ll have to make. This also illustrates that while Mom’s genetic matches are solid, Dad’s inferred matches are less so and sometimes require interpretation. The proper thing to do in this instance would be to refer back to the original admixture tools themselves for clarification.
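The spreadsheet rule itself is simple enough to write down as code, which also makes its weakness obvious. The sketch below is an assumption about how one might encode it; the names assign_side and the “Cousin” labels are hypothetical and do not come from my data.

```python
def assign_side(match_name, mom_matches_on_segment):
    """Apply the spreadsheet rule for one segment.

    If the person also matches Mom on this segment, the match is Mom's.
    Otherwise it is inferred to Dad. The inference can be wrong when Mom
    carries only a weak signal that was never recorded as a strong segment.
    """
    if match_name in mom_matches_on_segment:
        return "Mom"
    return "Dad (inferred)"


# Hypothetical names, for illustration only.
mom_matches = {"Cousin A", "Cousin B"}
sides = {name: assign_side(name, mom_matches) for name in ["Cousin A", "Cousin C"]}
print(sides)
```

Anything that comes back as “Dad (inferred)” is a candidate for a second look in the original admixture tools.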

Let’s see what that shows.

Using HarappaWorld, the most pronounced segment is at about 52. Teal is American. You can see that Mother has only a very small trace between 53 and 54, almost negligible. Mother’s admixture at location 52 is two segments of purple, brown and cinnamon, which translate to Southwest Asian (lt purple), Mediterranean (dk purple), Caucasian (brown) and Baloch (cinnamon), from Pakistan.

Checking Dodecad shows pretty much the same thing, except Mother’s background there is South Asian, which could be the same thing as Caucasian and Pakistan, just different categorizations.

In this case, it looks like the admixture is not a categorization issue, but likely did come from my father. Each segment will really be a case by case call, with only the strongest segments across all tools being the most reliable.

It’s times like this that we have to remember that we have two halves of each chromosome and they carry vastly different information from each of our parents. Determining which is which is not always easy. If in doubt, disregard that segment.

Raw Numbers

So, what, really did I figure out after all of this?

First, let’s look at some numbers.

I was working with a total of 292 people who had at least one chromosomal segment that matched me with a Strong Native or Blended Asian segment. Of those, 59 also matched Mom’s DNA. Of those 59, 18 had segments that matched only Mom. This means that the remaining 41 had segments that also matched my father. Keep in mind, again, that we are only using “strong matches” which involves inferring Dad’s segments and that referring back to the original tools can always clarify the situation. There seem to be some specific areas that are hotspots for Native ancestry where it appears that both of my parents passed Native ancestry to me.

Many of my and my mother’s 59 matches have Acadian ancestry which is not surprising as the Acadians intermarried heavily with the Native population as well as within their own ethnic group.

Several also have Miller Ancestry. My Miller ancestor is Johann Michael Miller (1692-1771) who immigrated in the colonial period and settled on the Pennsylvania frontier. His son, Philip Jacob Miller’s (1726-1799) wife was a woman named Magdalena whose last name has been rumored for years to be Rochette, but no trace of a Rochette family has ever been found in the county where they lived, region or Brethren church history…and it’s not for lack of looking. Several matches point to Native Ancestry in this line. This also begs the question of whether this is really Native or whether it is really the Asian heritage of the German people. Further analysis, referring back to the admixture tools, suggests that this is actually Native. It’s also interesting that absolutely none of Mother’s other German or Dutch lines show this type of ancestry.

There is no suggestion of Native ancestry in any of her other lines. Mother’s results are relatively clean. Dad’s are anything but.

Dad’s Messy Matches

My father’s side of the family, however, is another story.

I have 233 matches that don’t also match my mother. There can be some technical issues related to no-calls and such, but by and large, those would not represent many. So we need to accept that most of my matches are from my Father’s side originating in colonial America. This line is much “messier” than my mother’s, genealogically speaking.

Of those 233 matches, only 25 can be definitely assigned to my father. By definitely assigned, I mean the people are my cousins or there is an absolutely solid genealogical match, not a distant match. Why am I not counting distant matches in this total? We all know by virtue of the AncestryDNA saga that just because we match family lines and DNA does NOT mean that the DNA match is the genealogical line we think it is. If you would like to read all about this, please refer to the details in CeCe Moore’s blog where she discussed this phenomenon. The relevant discussion begins just after the third photo in this article where she shows that 3 of 10 matches at Ancestry where they “identify” the common DNA ancestor are incorrect. Of course, they never SAY that the common ancestor is the DNA match, but it’s surely inferred by the DNA match and the “leaf” connecting these 2 people to a common ancestor. It’s only evident to someone who has tested at least one parent and is savvy enough to realize that the individual whose ancestor on Mom’s side that they have highlighted, isn’t a match to Mom too. Oops. Mega-oops!!!

However, because we are dealing in our project, on Dad’s side, with inferences, we’re treading on some of the same ground. The footing is less certain still because we are dealing with only “strong clustered” segments, not all Native or Asian segments, and because it appears that my parents both have Native ancestry. To make matters worse, they may both have Algonquian, Iroquoian or both.

I have also discovered during this process that several of my matches are actually related to both of my parents. I told you this got complex.

Of the people who don’t match Mother, 32 of them have chromosomal matches only to my father, so those would be considered reliable matches, as would the closest ones of the 25 that can be identified genealogically as matching Dad. Many of these 25 are cousins I specifically asked to test, and those people’s results have been indispensable in this process.

In fact, it’s through my close circle of cousins that we have been able to eliminate several lines as having Native ancestry, because it doesn’t show as strong and they don’t have it either.

Many of these lines group together when looking at a specific chromosome. There is line after line and cousin after cousin with highlighted data.

Dad’s Native Ancestors

So what has this told me? This information strongly suggests that the following lines on my father’s side carry Native heritage. Note the word “carry.” All we can say at this point is that it’s in the soup – and we can utilize current matches at our testing company and at GedMatch, genealogy research and future matches to further narrow the branches of the tree. Many of these families are intermarried and I have tried to group them by marriage group. Obviously, eventually, their descendants all intermarried because they are all my ancestors on my father’s side. But multiple matches to other people who carry the Native markers but aren’t related to my other lines are what define these as lines carrying Native heritage someplace.

Campbell – Hawkins County, Tn around 1800, missing wife and parents, married into the Dodson family

Dodson – Hawkins County, Tn, Virginia – written record of Lazarus Dodson camping with the Cherokee – missing wife, married into the Campbell and Estes family

While this looks like a long list, the list of families that don’t have any Native ancestry represented is much longer and effectively serves to eliminate all of those lines. While I don’t have “THE” answer, I certainly know where to focus my research. Maybe there isn’t the one answer. Maybe there are multiple answers, in multiple lines.

The Take Away

Is this complex? Yes! Is it a lot of work? You bet it is! Is everything cast in concrete? Never! You can see that by the differences we’ve found in data interpretation, not to mention issues like no-calls (areas that for some reason in the test don’t read) and cross overs where your inheritance switches from your mom’s side to your dad’s side. Is there any other way to do this? No, not if your minority admixture is down in that weedy area around 1%.

Is it worth it? You’ll have to decide. I guess it depends on how desperately you want to know.

Part of the reason this is difficult is because we are missing tools in critical locations. It’s an intensively laborious manual process. In essence, using various tools, one has to figure out the locations of the Native and Asian chromosome segments and then use that information to infer Native matches by a double match (genetic match at DNA company plus match with Strong Native/Blended Asian segment) with the right parent. It becomes even more complex if neither parent is available for testing, but it is doable although I would think the reliability could drop dramatically.
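The “double match” check is the part most ripe for automation. Here is a rough sketch of what such a check might look like, assuming you have your matches and your Native/Asian segments as simple lists of chromosome, start and stop values. The function names and every number in the example are invented for illustration; no testing company or GedMatch export is being described here.

```python
def overlaps(start_a, end_a, start_b, end_b):
    """True when two ranges on the same chromosome share any stretch."""
    return start_a < end_b and start_b < end_a


def double_matches(match_segments, native_segments):
    """Keep the matches whose segment overlaps a Native/Asian segment.

    match_segments: list of (name, chromosome, start, end)
    native_segments: list of (chromosome, start, end)
    """
    hits = []
    for name, chrom, start, end in match_segments:
        for n_chrom, n_start, n_end in native_segments:
            if chrom == n_chrom and overlaps(start, end, n_start, n_end):
                hits.append((name, chrom, start, end))
                break  # one overlapping segment is enough to flag this row
    return hits


# Made-up positions, for illustration only.
matches = [
    ("Match A", 1, 10.0, 20.0),
    ("Match B", 1, 30.0, 40.0),
    ("Match C", 2, 8.0, 12.0),
]
native = [(1, 7.5, 14.5)]
flagged = double_matches(matches, native)
print(flagged)
```

Deciding which parent a flagged match belongs to is still a separate step, and that step is where the judgment calls come in.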

Tidbits and Trivia

I’ve picked up a number of little interesting tidbits during this process. These may or may not be helpful to you. Just kind of file them away until needed:)

Matches at testing companies come and go….and sometimes just go. At Family Tree DNA, I have some matches that must be trembling on the threshold that come and go periodically. Now you see them, now you don’t. I lost matches moving from the Affy chip to the Illumina chip and lost additional matches between Build 36 and 37. Some reappeared, some haven’t.

The start and stop boundaries changed for some matches between build 36 and build 37. I did not go back and readjust, as most of these, in the larger scheme of things, were minor. Just understand that you are looking for patterns here that indicate Native heritage, not exact measurements. This process is a tool, and unfortunately, not a magic wand:)

The centromere locations change between builds. If you have matches near or crossing the middle of the chromosome, called the centromere, there may be breaks in that region. I enter the centromere start and stop locations in my spreadsheet so that if I notice something odd going on in that region, the centromere addresses are right there to alert me that I’m dealing with that “odd” region. You can find the centromere addresses in the FAQ at Family Tree DNA for their current build.
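If you keep the centromere addresses alongside your matches, flagging the “odd” region can be a one-line test. This is a minimal sketch under stated assumptions: the boundaries below are placeholders, not real coordinates, and you would substitute the values published for your build.

```python
# Placeholder centromere boundaries, NOT real coordinates.
# Replace with the addresses published for your build.
CENTROMERES = {1: (100.0, 105.0)}


def crosses_centromere(chrom, start, end, centromeres=CENTROMERES):
    """True when a match segment touches the centromere region."""
    if chrom not in centromeres:
        return False
    c_start, c_end = centromeres[chrom]
    return start < c_end and c_start < end


print(crosses_centromere(1, 98.0, 101.0))
print(crosses_centromere(1, 10.0, 20.0))
```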

At 23andMe, when you reach the magic 1000 matches threshold, you start losing matches and the matching criteria are elevated so that you can stay under 1000 matches. For people with colonial American or Jewish heritage, in other words those with high numbers of matches, this is a problem.

Watch for matches that are related to both sides of your family. If your family lived in colonial America, you’re going to have a lot of matches and many are probably related to each other in ways you aren’t aware of.

If your parents are related to each other, this process might simply be too complex and intertwined to provide enough granular data to be useful.

Endogamous groups are impossible to sort through as to where, meaning which ancestor, the DNA came from. This is because the original group founders’ DNA is just getting passed around and around, with little or no new DNA being introduced. The effect of this on downstream generations relative to genetic genealogy is that matches appear to be more closely related than they are because of the amount of matching DNA they carry. For my Brethren and my Acadian groups of people, I just list them by the group name, since, as the saying goes, “if you’re related to one Acadian, you’re related to all Acadians.”

If you’re going to follow this procedure, save one spreadsheet copy with the Strong Native only and then a second one with both the Strong Native and Blended Asian. I’m undecided truthfully whether the Mixed Asian adds enough resolution for the extra work it generates.

When in question, refer back to the original tools. The answer will always be found there.

Unfortunately, tools change. You may want to take screen shots. During this process, FTDNA went from build 36 to 37, match thresholds changed, 23andMe introduced a new user interface (which I find much less intuitive) and GedMatch has made significant changes. The net-net of this is when you decide to undertake this project, commit to it and do it, start to finish. Doing this little by little makes you vulnerable to changes that may make your data incompatible midstream – and you may not even realize it.

This entire process is intensively manual. My spreadsheet is over 5500 rows long. I won’t be doing it again…although I will update my spreadsheet with new matches from time to time. The hard work is already done.

This same technique applies to any minority ancestry, not just Native, although that’s what I’ve been hunting for and one of the most common inquiries I receive.

I am hopeful that in the not too distant future many of these steps and processes will be automated by the group of bright developers that contribute to GedMatch or via other tools like DNAGedcom. HINT – HINT!!!

I would like to follow this same process to identify the source of my African heritage, but I’m thinking I’ll wait for the tools to become automated. The great irony is that it’s very likely in the same lines as my Native ancestors.

If You Want to Test

What does it take to do this for yourself using the tools we have today, as discussed?

If your parents are living, the best gift you can give yourself is to test them, now, while you still can. My mother has been gone for several years, but her DNA archived at Family Tree DNA was still viable. This is not always the case. I was fortunate. Her DNA is one of the best gifts she gave me. Not just by inheritance, but by having hers tested. I thank her every single day, for both! I could not have written this article without her DNA results. The gift that keeps on giving.

If you don’t have a parent to test, you can test several other family members who will provide some information, but clearly won’t carry the same amounts of common DNA with you as your parents. These would include your aunts and uncles (your parents’ siblings) and what I’ve referred to as your close cousin circle. Attempt to test at least someone from each line. Yes, it gets expensive, but as one of my cousins said as she took her third or fourth DNA test, “It’s only money. This is about family.”

You can also test your own siblings to obtain more information that you can use to match up to your family lines. Remember, you only receive half of each parent’s DNA, and your siblings will have received some DNA from your parents that you didn’t.

I don’t have any other siblings to test, but I have tested cousins from several lines which have proven invaluable when trying to discern the sources of certain segments. For example, one of these Native segments fell on a common segment with my cousin Joy. Therefore, I know it’s from the Campbell line, and because I have the Campbell paternal Y-DNA which is European, I know immediately the Native admixture would have had to be from a wife.

Much of this puzzle is deductive, but we now have the tools, albeit manual, to do this type of work that was previously impossible. I am somewhat disappointed that I can’t pinpoint the exact family lines, yet, but hopefully as more people test and more matches provide genealogical information, this will improve.

If you want to play in this arena, you need to test at either Family Tree DNA, 23andMe, or both. Right now, the most cost effective way to achieve this is to purchase a $99 kit from 23andMe, test there, then download your results from 23andMe and upload them to Family Tree DNA for $99. That way, you are fishing in both pools. Be aware that less than half of the people who test at either company download results to GedMatch, so your primary match locations are with the testing companies. GedMatch is auxiliary, but critical for this analysis. And the newest tool, DNAGedcom is a Godsend.

Also note that transferring your results to Family Tree DNA is NOT the same thing as actually testing there. Why does this matter? Family Tree DNA, the premier genetic genealogy testing company, offering the most variety and “deepest” commercial tests, archives your DNA for 25 years when you test there. If you transfer results, they don’t have your DNA to archive, so no future products can be ordered. All I can say is thank Heavens Mom’s DNA was there.

Ancestry.com doesn’t provide any tools such as the chromosome browser or even the basic information of matching segments. All you get is a little leaf that says you’re related, but the questions of which segment or how are not answerable today at Ancestry and as CeCe’s experience proved, it’s unreliable. It’s possible that you share the same surnames and ancestor, but your genetic connection is not through that family line. Without tools, there is no way to tell. Ancestry released raw data files a few weeks ago and very recently, GedMatch has implemented the ability to upload them so that Ancestry participants can now utilize the additional tools at GedMatch.

Although this has been an extraordinarily long and detailed process, I can’t tell you how happy I am to have developed this new technique to add to my toolbox. My Native and African ancestors have been most elusive. There are no records, they didn’t write and probably didn’t even speak English, certainly not initially. The only clues to their existence, prior to DNA, were scant references and family lore. The only prayer of actually identifying them is through these small segments of our DNA – yep – down in the weeds. Are there false starts perhaps, and challenges and maybe a few snakes down there? Yes, for sure, but so is the DNA of your ancestors.

Happy gardening and rooting around in the weeds. Just think of it as searching for the very best buried treasure! It’s down there, just waiting to be found. Keep digging!

I hope you’ve enjoyed this series and that it leads you to your own personal genealogical treasure trove!

In this segment, Part 8, we’ll be extracting all of the Native and Blended Asian segments on all 22 chromosomes, but I’ll only be using chromosomes 1 and 2 for illustration purposes. We will then be clustering the resulting data to look for trends. If you’re following along and using this methodology, you’ll be extracting the Native segment start and stop locations from all 22 chromosomes.

I apologize in advance for the length of this article, but there was just no good place to break it into pieces.

So, let’s get started. As a reminder, we are using the admixture tools at www.gedmatch.com.

I experimented with several types of extractions to see which ones best reflected the results found by both 23andMe and Dr. McDonald and confirmed by the start and stop segments in the highly Native segments of chromosomes 1 and 2 in Part 7 of this series. We verified that all 4 tools accurately reflected and corroborated the segments listed as Native, so now we’re going to apply that same methodology to the rest of our chromosomal data.

Initially, I tried to use the information from chromosomes 1 and 2 to extract the Native chromosomes using only the “best” tool, but when I looked at all 4 tools, I quickly realized that there was no single “best” choice. A couple of crucial points came to light.

Some of the geographic colors are almost impossible to tell apart.

None of the tools are universally best.

When looking at all 4 tools, generally a “best 3 out of 4” approach allowed for one of the tools to be wrong, to perhaps reference a slightly different database that called the segment differently or for the colors to be indistinguishable. In other words, if three called a segment Native and one did not, it’s Native and conversely, if fewer than 3 call it Native, in this comparison, it’s not.
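Written as code, the “best 3 out of 4” rule is just a vote count. The sketch below assumes you have already recorded, for one segment, whether each tool painted it Native; the function name is_strong_native is hypothetical.

```python
def is_strong_native(calls, required=3):
    """calls maps a tool name to True if that tool called the segment Native."""
    votes = sum(1 for called in calls.values() if called)
    return votes >= required


# One segment, as read off the four chromosome paintings.
calls = {"MDLP": True, "Eurogenes": True, "Dodecad": False, "HarappaWorld": True}
print(is_strong_native(calls))
```

Three votes out of four passes here. Flip one more tool to False and the segment drops out of the Strong Native extract.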

Unfortunately, this created an awful lot of work. This is probably the best example of where automation tools could and would make a huge difference in this process.

I did two separate extracts. The first one is what I refer to as the “Strong Native” extract and the second is the “Blended Asian.” In part, I did these separately as a check and balance to be sure that my first extraction was accurate.

In the first extract, I selected only one category, the one best fitted to “Native American” for each tool. I used the following categories for each admixture tool:

MDLP – Amerind

Eurogenes – North Amerindian

Dodecad – NE Asian

HarappaWorld – American

I completed this process for every chromosome, but I’m only showing the first two chromosomes in this article.

By way of example, using the first tool, MDLP, North Amerind looks black, but is actually very dark grey. It is, fortunately, distinctive.

On the chromosome painting below, my results for the first part of chromosome 1 are shown in the first band, and mother’s for the same segment are shown as the second band. The bottom band represents common segments and the black is non-matching segments, meaning those I obtained from my father. Sometimes this third band can help you determine what you are really seeing in terms of colors and blending, but it’s not always useful. In this case, trying to spot a small amount of dark gray against black is almost impossible, so not terribly helpful. But if you were looking for red, that would be another story. As you move through this process, remember, it’s not exact and utilizing best 3 of 4 will help you recover from any major errors.

You can see that my grey segments show up from about 12-13 and then again at about 14.5. Sometimes it’s difficult to know how to count something. For example, my Native at 14.5 – it’s actually more like 14.25 -14.5, but I chose not to divide further than half mb segments. As long as you are consistent in whatever methodology you select, it will work out.

Please note that when reading these charts, that the small hash mark is the indicator for the measure. In other words, the small hash mark above 10M means that is the 10M location. It’s obvious here, but on some charts, the hash mark and the location legend look to be 1-off. Again, as long as you’re consistent, it really doesn’t matter.

Mother’s Native segments are more pronounced and obvious. They range from about 8-14. Using the actual tools, you would record this and then continue scrolling to the right until you reach the end of the chromosome. On chromosomes 1 and 2, I found the strong Native segments for the four admixture tools, as shown below.

The boxed numbers show the areas that were found “in common” between 23andMe, Dr. McDonald and the admixture tools, as determined in Part 7 of this series. Highlighted segments show segments where at least 3 of 4 admixture tools reported Native heritage. As you can see, there were clearly additional Native segments not reported by 23andMe and Dr. McDonald.

Strong Native Chromosomal Detail Table

Because we have both my and mother’s results, we can infer my father’s contribution. Clearly, some of his will wind up being some amount of “noise” and some IBS segments, but not all, by any means, and this is the only way to get a “read” on Dad. This is one form of phasing data. Phasing refers to various methodologies of figuring out which DNA comes from what source, meaning which parental line.
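Inferring Dad’s contribution amounts to a subtraction: whatever Native I carry that Mother does not carry at the same location. A minimal sketch of that idea follows, using the half mb resolution I chose and the chromosome 1 readings described above (mine at about 12-13 and 14.5, Mother’s at about 8-14). Treating my 14.5 reading as a single half mb bin starting at 14.5 is an assumption made for the example, and the function names are hypothetical.

```python
def to_bins(segments, bin_size=0.5):
    """Expand (start, end) ranges in Mb into a set of whole-number bin indexes."""
    bins = set()
    for start, end in segments:
        first = round(start / bin_size)
        last = round(end / bin_size)
        bins.update(range(first, last))
    return bins


def infer_father(my_segments, mom_segments, bin_size=0.5):
    """Return the start positions (Mb) of my bins that Mom does not share."""
    dad_bins = to_bins(my_segments, bin_size) - to_bins(mom_segments, bin_size)
    return sorted(index * bin_size for index in dad_bins)


mine = [(12.0, 13.0), (14.5, 15.0)]
mom = [(8.0, 14.0)]
print(infer_father(mine, mom))  # -> [14.5]
```

Only the 14.5 bin falls to Dad, because Mother covers 12-13. Keep in mind that where Mother and I are both Native, a plain subtraction cannot rule Dad out, which is part of the “noise” mentioned above.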

While the strongest Native segments are the ones individually most likely to indicate Native American ancestry, that really isn’t the whole story. I discovered that many of these Native segments are actually embedded in other segments that are indicative of Native heritage too. In other words, it’s not a line in the sand, yes or no, but more of a sliding scale.

On the chromosome painting below, this one using Eurogenes, with my results shown above and mother’s below, you can see two excellent examples. Regions relevant to Native ancestry include:

Red – South Asian

Brown – Southwest Asian

Yellow – North Amerindian and Arctic

Putty – Siberian

Emerald – East Asian

You can see that while mine is almost universally yellow, or Native, with a little Siberian (putty) mixed in for good measure between 169-170, a hint of East Asian (emerald) plus a little Asian (red), mother’s isn’t. In fact, hers is a mixture of Native American and South Asian (red), with more red than yellow, Siberian (putty) and a large segment of East Asian (emerald green).

While her yellow Native segments alone would be staggered across this entire segment in 7 different pieces, when taken together as a whole, the “blended Asian” segment reaches entirely across the screen with the exception of 1 mb between 161.5-162.5, roughly.

The following Blended Asian Chromosomal Detail Table shows all of the blended Asian segments using all four of the admixture tools for chromosomes 1 and 2.

It’s clear that these regions are not solely “Native American” but reach back in time genetically into Asia, particularly Northeast Asia.

Again, the boxed numbers show the “in common” segments between all tools and the yellow highlighted segments are common between at least three of the four admixture tools.

Please note that there were some issues distinguishing colors, as follows:

For the MDLP comparison, Mesoamerican and Paleo Siberian are both putty colored and indistinguishable on the chart. Also, the apple green for Arctic Amerind is very similar to the Austronesian.

When using Dodecad, Southeast Asian (light green) and South Asian (apple green) are nearly impossible to distinguish from each other on the graphs.

When using HarappaWorld, the apple green for Siberian was very similar to the light forest green for Papua New Guinea and was very difficult to distinguish. The South Asian putty appears often with the other Native markers, and I considered including this group, but it too was difficult to distinguish from other regions so in the end, I opted not to include this category.

If you are colorblind – get help as this is impossible otherwise.

Blended Asian Chromosomal Detail Table

On the blended Asian Chromosome Detail Table, I added yellow highlighting where the same segments show in other Asian geographies that showed in the Strong Native table. In each column, the Strong Native category is the last one at the bottom of the list.

The blue highlighting shows other common segments found that were not included in the Strong Native segments. For a Strong Native yellow segment to be highlighted, it had to be present in 3 of 4 tools, or 75%. In the Blended Asian group, there are a total of 15 categories between the 4 admixture tools, so for a segment to be shaded blue, it must be found in at least 8 of the categories, so just over half. There are many segments that are found in several categories across the tools. For example, segment 192-193 on chromosome 1 is found five times. This isn’t to say you should discount this segment, only that it isn’t one of the strongest, most universal. Surprisingly, there really weren’t too many that were close to the cutoff. Several, but not a majority, were in the 4 or 5 range, only one was at 7.
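The two highlighting rules can be stated compactly. This is a sketch of the thresholds only, with a hypothetical function name. It assumes you have counted, for one segment, how many of the 4 tools called it Strong Native and how many of the 15 categories reported it.

```python
def highlight(strong_tool_count, blended_category_count):
    """Return the highlight color for a segment, or None if it gets none.

    Yellow: Strong Native in at least 3 of 4 tools.
    Blue: not yellow, but present in at least 8 of the 15 categories.
    """
    if strong_tool_count >= 3:
        return "yellow"
    if blended_category_count >= 8:
        return "blue"
    return None


print(highlight(3, 5))
print(highlight(1, 8))
print(highlight(0, 5))
```

The last call mirrors segment 192-193 on chromosome 1, which is found five times and so stays unhighlighted.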

Clustering

The third step in data extraction is to look at all of the data together. In this step, we are removing the geographic boundaries of Siberian, N. Amerindian, etc. and combining all of our data. I have only combined the data within columns, not between columns, so we can get a feel for which tool or tools performed best or maybe not so well. Each chromosome in each column has its data ordered numerically, and yes, this is a manual cut and paste process. Sorry. I warned you, this is a very manually intensive process.

After I put each column in numerical order, I arranged them so that the numbers were approximately in a line, or a row, with each other. For example, in the first group below, you can clearly see that the first cluster of results is found using all 4 tools. When looked at individually, only the blue results were noted as common (at least 8 of 15 for blue), but when viewed as a cluster, you can see between the tools that the cluster itself runs from about 7.5, with a small break from 8-9, and then to about 14.5. As you would expect, the beginning and end points of the cluster trail off and are not uniform between tools, but the main part of the cluster is found in all the tools. This introduces the question of how to measure a cluster. In this case, there is a clean break using all tools between 8 and 9, but that is only 1 mb, rather difficult to measure accurately. You could record this as two distinct clusters but since it’s very closely adjacent to the rest of the cluster, I’m inclined to include this as one large cluster and use the starting and ending segments for the cluster as a whole, in other words, the cluster runs from 7.5 through 14.5. The alternate, or more conservative methodology would be to use the “in common” numbers, but in this case, that would be only 10-11.5 and I think you would miss a great deal of useful data. So, for clusters, I’m recording the full extent of the cluster. In some cases, you may need to exercise a judgment call.
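For anyone wanting to automate the cut and paste, recording “the full extent of the cluster” is an interval merge that tolerates small breaks. The sketch below is one way to do it, not the way I did it by hand. The 1 mb gap tolerance is an assumption taken from the 8-9 break just described, the function name is hypothetical, and the individual segments are illustrative.

```python
def merge_into_clusters(segments, max_gap=1.0):
    """Merge (start, end) ranges in Mb, bridging gaps no wider than max_gap."""
    clusters = []
    for start, end in sorted(segments):
        if clusters and start - clusters[-1][1] <= max_gap:
            # Close enough to the previous cluster: extend it.
            clusters[-1][1] = max(clusters[-1][1], end)
        else:
            clusters.append([start, end])
    return [tuple(cluster) for cluster in clusters]


# Illustrative segments pooled from all tools on one chromosome.
pooled = [(9.0, 14.5), (7.5, 8.0), (10.0, 11.5), (18.5, 28.0)]
print(merge_into_clusters(pooled))  # -> [(7.5, 14.5), (18.5, 28.0)]
```

Merging only groups positions. Whether a group such as 18.5-28 counts as a real cluster still depends on how many tools found it.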

Let’s look at the second group of numbers, beginning with 18.5 in HarappaWorld. This grouping runs through about 28. Eurogenes found some blended Asian between 27-28.5 as well in two of the geographies, but overall, of the 15 categories, we don’t see much. This could be a result of a number of things. I could have had problems with the colors, or there may be only a very small amount and it may be categorized as something else with the other tools. I would not consider this a cluster, and using our best 3 of 4 methodology eliminates this cluster from consideration. This also holds true for 43-43.5.

However, the next cluster, from 55.5 to 58 is found in the Strong Native comparison, indicated by the yellow highlighting and is found using all 4 tools. This is definitely a cluster.

I’ve synthesized the cluster information into a list. From the clusters above, I’ve created a list that I will be using in the next segment for data input into my spreadsheet of matches. The blended segments below that include Strong Native segments are shown with yellow.

Using the GedMatch admixture applications, we’ve isolated the strongest Native and the Blended Asian segments and clusters in preparation for identifying specific Native family lines within our group of matches.

This process shows that, for the most part, the Strong Native segments picked up the strongest signals, about half of the segments that will be useful in determining Native admixture, although it does miss some.

When we use the clustering technique to view our results across all the admixture tools, we see a somewhat different picture emerge, adding several Blended Asian clusters.

In Part 9 of this series, we will use the highlighted Strong Native segments and the Blended Asian clusters, both of which suggest Native chromosomal “hotspots” to begin our comparison to our genetic matches for genealogical relevance. In other words, using this information, we will determine which genealogical lines carry Native ancestry.

Part 9 may be somewhat delayed. The good news is that Family Tree DNA is finishing work on their Build 36 to Build 37 conversion. The bad news is that it fell right in the middle of writing this series. When they finish Build 37, I’ll finish Part 9 of this series. In the meantime, you can be extracting your minority segments using the tools and techniques that we have covered in Parts 1-8.