incorporating DNA in genealogy research

Using shared cM counts to help find the common ancestor: part 2 of 3

The previous post in this short series talked about the genealogy basics one should have covered before dabbling in DNA. Now we’re ready to explore how to use some of the DNA information in conjunction with that genealogy.

I suspect one of the most popular references in genetic genealogy is Blaine Bettinger’s Shared cM Project. Blaine used crowd-sourcing to collect data on over 10,000 pairs of confirmed relatives. The result is a chart showing the empirical size ranges of shared DNA for each degree of kinship. See information on Blaine’s blog or the chart on the ISOGG Wiki. For example, simple math tells us that two 4th cousins share an average of 13.28 cM, but Blaine’s study found the actual range to be anywhere from 0 to 90 cM.
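For readers who like to see the arithmetic, here is a minimal sketch of that "simple math." It assumes the halving-per-generation model, under which full nth cousins share on average 1/2^(2n+1) of their DNA, and a genome total of 6800 cM. The 6800 figure is my assumption, chosen because it reproduces the 13.28 cM average; testing companies use slightly different totals. The function name is only illustrative.

```python
TOTAL_CM = 6800.0  # assumed genome total in cM; companies differ slightly


def expected_cousin_cm(degree: int, total_cm: float = TOTAL_CM) -> float:
    """Average shared cM for full nth cousins under the halving model."""
    if degree < 1:
        raise ValueError("degree must be 1 or greater")
    # Each extra cousin degree adds two generations, so the share drops 4x.
    fraction = 1 / 2 ** (2 * degree + 1)
    return total_cm * fraction


for n in range(1, 6):
    print(f"cousin degree {n}: {expected_cousin_cm(n):.2f} cM on average")
```

Remember that this is only a theoretical average over all cousins of that degree, including those who share nothing at all. The empirical ranges in the Shared cM Project are what you should actually compare your matches against.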

Note: here’s an important observation. You may be 4th cousins with someone and share zero matching DNA. But if you are looking at someone you do share matching DNA with, you’ll notice the chart offers plenty of overlap in predicting relationships. If you and your match share 16 cM, you could be anywhere from 2nd cousins once removed to 8th cousins. But the more DNA you share, the fewer the kinship possibilities. If you and your match share more than 60 cM, you should be no more distant than 4th cousins. And if you and your match each did the recommended genealogy in the last post, and identified all 32 of your 3x-great-grandparents, you should be able to identify the common ancestor.

But how do you know how much matching DNA you and your match share? Good question!

When you add up your matching segments to come up with a total, you should exclude any segments smaller than about 6-7 cM. Those may be either false positives (for example, due to genotyping errors) or they may be what some call ‘population genetics’ from centuries ago, not useful for genealogy. Most testing companies do exclude these small pieces, but if you’re looking at a FamilyFinder match list from FamilyTreeDNA (FTDNA), you need to do a little more work.
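If you have your segment sizes in a spreadsheet or a downloaded table, the filtering step might look like the sketch below. The 7 cM cutoff comes from the guideline above; the segment values are made up for illustration and are not from any real match.

```python
MIN_SEGMENT_CM = 7.0  # cutoff from the guideline above (about 6-7 cM)


def usable_total_cm(segments_cm, min_cm=MIN_SEGMENT_CM):
    """Sum only the segments at or above the cutoff."""
    return sum(cm for cm in segments_cm if cm >= min_cm)


# Hypothetical match: one solid segment plus a scattering of tiny ones.
example_segments = [27.5, 4.1, 3.8, 3.2, 2.9, 2.6, 2.4, 2.1, 1.9, 1.7]

raw_total = sum(example_segments)
usable = usable_total_cm(example_segments)
largest = max(example_segments)

print(f"total with small segments: {raw_total:.1f} cM")
print(f"total at or above {MIN_SEGMENT_CM} cM: {usable:.1f} cM")
print(f"largest segment: {largest:.1f} cM")
```

The point of the exercise is that a match can look much closer than it really is when a long tail of tiny segments is counted in the total.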

FamilyTreeDNA: At least as of this writing, FTDNA includes very small segments, which may be false matches, in the total. For example, see Lavon in Image 1, showing a total of 63 cM.

Image 1.

Select the box beside her name and then above the list of names, select Chromosome Browser to see the exact segments. You’ll be taken to a graphic image showing which segments exceed their default 5 cM. You may select “View this data in a table” to see the numbers. As seen in Image 2, there is really only one segment of a practical size. We should consider Lavon’s total 27 or 28 cM, not 63 cM. She may be more distantly related than a 4th cousin.

Image 2.

Note: some of us like to sort our Family Finder match list by the size of the largest segment instead of the total cM shared.

AncestryDNA: If you’re using AncestryDNA, they already exclude anything under 6 cM, so we’re on solid ground. Any match with a confidence level of Extremely High is over 60 cM total, so we should start there.

When you’ve worked through those, you can drop down to the Very High matches, which are 45-60 cM; hypothetically no further back than 4th cousins once removed. If you want to see the total shared cM, you can just click on a person from your match list, and on the screen specific to that individual, click on the little [i] icon next to the ‘Confidence’ rank. A box will pop up with the total cM we care about. See Image 3.

Image 3.

GEDmatch: At GEDmatch, you want to leave the threshold at the default of 7 cM when you run the One-to-Many search. The resulting match list sorts by the total size of matching segments over that 7 cM threshold. See Image 4. (If desired, you can also click on “largest cM” to sort by largest segment.)

Image 4.

Note: AncestryDNA massages its data somewhat. So if you copy raw AncestryDNA data to GEDmatch, you may see slight differences in the numbers. That’s expected.

23andMe: At 23andMe, the match list shows percents of matching DNA, sorted by predicted relationship. Similar to AncestryDNA, 23andMe disregards most segments under 5-7 cM. (There must be at least one matching segment of 7 cM; additional segments are counted if they are at least 5 cM.) You can start by contacting each match sharing at least 0.70%. (Thanks Debbie Kennett for the correction!) It is possible to get the exact figures between you and a match by going through the DNA page to “Compare your DNA to see what segments you share with close and distant relatives.” Then select your own kit. Then search for and select by name the matches you are sharing genomes with and want to compare against. I confess to finding the current 23andMe website difficult to navigate, and not all participants have the same online experience, so my focus is on the other autosomal DNA options.
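To compare a 23andMe percentage with the cM thresholds used elsewhere in this post, a rough conversion helps. This sketch assumes a genome total of 6800 cM, which is my assumption and not a figure published here by 23andMe; their own calculation may use a somewhat different total, so treat the results as approximate.

```python
TOTAL_CM = 6800.0  # assumed genome total in cM; a company's own figure may differ


def percent_to_cm(percent, total_cm=TOTAL_CM):
    """Convert a shared-DNA percentage to approximate centimorgans."""
    return total_cm * percent / 100.0


def cm_to_percent(cm, total_cm=TOTAL_CM):
    """Convert centimorgans to an approximate shared-DNA percentage."""
    return 100.0 * cm / total_cm


for pct in (0.20, 0.70):
    print(f"{pct:.2f}% is roughly {percent_to_cm(pct):.1f} cM")

for cm in (45, 60):
    print(f"{cm} cM is roughly {cm_to_percent(cm):.2f}%")
```

Under that assumption, 0.70% lands in the same 45-60 cM neighborhood as the thresholds suggested for the other companies.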

Now, we’re ready. Decide on a threshold to start with (I recommend matches above 55 or 60 cM) and contact them to exchange genealogy. Don’t assume that your matches will have online trees and you’ll be able to solve your genealogy mysteries without ever having to contact anyone. Many won’t have trees available. Even those who do have trees may have more valuable information they haven’t posted online. So message them! Some of your matches won’t reply. But some will!

If you don’t have any matches that share 60 cM or more with you, go down to 45. The lower you go, the farther back you and your match may need to flesh out all lines in your trees to find the common ancestor. But hundreds of thousands of new DNA kits are processed annually, so if success doesn’t come right away, be patient! Maybe in a few months you’ll find new matches that are closer. Good luck!
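If you keep your match list in a spreadsheet, the start-high-then-step-down approach can be sketched as follows. The match names and totals are invented for illustration, and the totals are assumed to already exclude small segments.

```python
def triage(matches, tiers=(60.0, 45.0)):
    """Return (threshold, names) for the highest tier that has any matches.

    matches maps a match name to total shared cM (small segments excluded).
    Names are returned largest match first.
    """
    for threshold in tiers:
        names = [name for name, cm in matches.items() if cm >= threshold]
        if names:
            return threshold, sorted(names, key=lambda n: -matches[n])
    return None, []


# Hypothetical match list.
my_matches = {"Match A": 72.0, "Match B": 51.0, "Match C": 18.0}

threshold, to_contact = triage(my_matches)
if threshold is None:
    print("No matches at these thresholds yet; check back in a few months.")
else:
    print(f"Start with matches at or above {threshold:.0f} cM: {to_contact}")
```

The tiers are just the 60 and 45 cM starting points suggested above; adjust them to suit your own match list.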

3 thoughts on “Using shared cM counts to help find the common ancestor: part 2 of 3”

I’m curious as to why you set a much lower threshold for 23andMe compared to the other companies. Did you mean to write 0.20% for 23andMe? That equates to about 13-14 cMs. Matches at this level at 23andMe are generally third to sixth cousins.

Do you have many matches at the thresholds you’re recommending? I only have 6 matches at Ancestry, one match at GEDmatch and a couple of matches at FTDNA that share 45 cMs or more. The experience is very different for those of us with English ancestry, and I normally look at all matches that share more than 15 cMs even though it’s rarely possible to find a genealogical connection.

Yes, I definitely meant 0.20% for 23andMe, when I wrote .20%. Honestly, I would hold 23andMe match lists to a higher standard, except it’s so time-consuming to get to the information where you can get cM totals that I just don’t even bother. You’re right — I should up that percentage. (I probably didn’t because I have the same number of matches at 23andMe over 0.20% as I do at GEDmatch over 60 cM–just a handful.)

I have 7 matches on GEDmatch over 60 cM (that I didn’t recruit – LOL!) and just 4 more between 45-60. (I’m 50% Famine-era Irish roots, 12.5% German, and the rest were in the U.S. since before 1800. I’ve never found any matches on my German lines.) My intention was probably to guide American newbies who aren’t dealing with adoption issues on how to get started. I participate in monthly meetings at the Central Indiana DNA Interest Group and I have a couple presentations coming up on this topic, so I thought I’d work some kinks out in a blog post first.

Certainly, I think people can look at segments under 45-60 cM, but my experience has been that if you don’t have other closer relatives identified to include in the network of shared matches, it can be very hard to find that common ancestor. (Even if you find someone in your tree that matches someone in their tree, if that makes you 5th-6th cousins or more distant, it’s hard to be confident the identified common ancestor was the source of that shared DNA. Most people’s trees don’t go that far back on all lines.)

So for beginners, I suggest starting with matches over 60 cM, then move down to 45 cM, and then … (but not covered in these blog posts) enlist some 2nd and 3rd cousins and then with triangulation and genetic networks and lots more effort expanding our trees with documentary evidence… then we can work with smaller segments. Just my two cents though, based on my own experiences.

Thanks very much for reading and commenting. I always look forward to your posts on social media! ~ Ann

Ann, This is what I don’t understand. You’re not holding 23andMe to a higher standard but a lower standard. It’s almost impossible to find the connection with matches who only share 13-14 cMs (which is what 0.20% equates to). If you wanted to use the same standard for 23andMe as you do for AncestryDNA and FTDNA then you should set the level at about 0.70%. The percentages on the ISOGG Wiki page on autosomal DNA statistics are very misleading and are calculated by halving the amount of DNA shared each generation. Although the figure given by this method for fourth cousins is 0.20% that is not the expected amount for fourth cousins. We only share DNA with a subset of our fourth cousins and therefore we tend to share more DNA with those we match than the mathematical averages suggest. AncestryDNA give the expected range for fourth cousins as 20-85 cMs (that’s assuming of course that the cousins share DNA in the first place).

I’ve not had much luck at 23andMe and haven’t been able to make a single connection there but that’s probably because I only have low-level matches.

The difficulty I find with UK testers is that high-level matches like you’re getting are few and far between so we have to work with what we’ve got. However, in practice it’s generally not possible to make the connection with these more distant relatives unless, as you say, you’ve got other family members tested as well who are part of a genetic network.

Keep up the good work. It was good to see an illustrated explanation of how to remove all the small segments which seem to lead a lot of people astray.