How to determine one’s ancestry using DNA tests?

One of the key early results of population genetics in the 1960s was the neutral theory of molecular evolution. This theory states that the vast majority of genetic mutations do not cause any actual difference in an organism’s biology. This means that although two humans differ at, on average, one in every thousand bases (bases are the letters that make up the code of your genome), most changes are not subject to natural selection. We call these (mostly) harmless changes to the genome genetic markers, and these are what geneticists use when comparing genomes.

We know where to find these markers in the human genome, and over the last ten years it’s become easy and cheap to look at thousands of these markers across hundreds of people, and therefore understand the differences between different people and populations. It is straightforward to change this type of genetic data into strings of zeroes and ones — based on the presence or absence of particular markers — and therefore represent an individual’s genome as a binary sequence, which makes sophisticated mathematical analyses possible.
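As a rough illustration of that encoding, here is a minimal Python sketch. The marker names, the six-marker panel and the helper `encode_genome` are all hypothetical, invented for this example; real panels contain many thousands of markers.

```python
from typing import Iterable, List

# Hypothetical marker panel, listed in order of position along a chromosome.
MARKER_PANEL = ["m1", "m2", "m3", "m4", "m5", "m6"]


def encode_genome(present: Iterable[str], panel: List[str] = MARKER_PANEL) -> str:
    """Return a binary string: '1' where a marker is present, '0' where it is absent."""
    present_set = set(present)
    return "".join("1" if marker in present_set else "0" for marker in panel)


# Two made-up individuals, described by the markers each one carries.
person_a = encode_genome(["m1", "m4", "m5"])
person_b = encode_genome(["m1", "m2", "m5", "m6"])

print(person_a)  # 100110
print(person_b)  # 110011
```

Because every genome is encoded against the same ordered panel, position i in one string always refers to the same marker as position i in another, which is what makes the strings directly comparable.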

Exploring genetic similarities between human populations can help to fill gaps in history books

One easy thing to do with two genomes is to count the number of differences between these binary strings of digits: the more differences, the less related two people are. But we can do better than that: because we know where in the genome these genetic markers are, we can see how these differences are distributed along chromosomes. For example, two people might differ at 10% of the genetic markers tested. These differences may be evenly distributed across the genome, or they may be clumped together in pockets along chromosomes, leaving long stretches where there are no differences. The clumped pattern suggests that the two individuals are more closely related than the evenly distributed one does, with the clumped differences perhaps entering one person’s ancestral lineage recently. So whilst the absolute number of differences between two genomes helps, knowing where specific markers differ is also important. These concepts are used to identify the ancestry of different parts of the genome, and are the basis of most commercial ancestry tests.
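The two ideas in this paragraph, counting differences and looking at where they fall, can be sketched in a few lines of Python. This is a toy illustration only: the 40-marker genomes, the function names and the use of marker index as a stand-in for chromosome position are all assumptions, and real ancestry tools rely on statistical models rather than simple run lengths.

```python
from typing import List, Set


def count_differences(a: str, b: str) -> int:
    """Number of markers at which two equal-length binary genomes differ."""
    if len(a) != len(b):
        raise ValueError("genomes must be encoded against the same marker panel")
    return sum(x != y for x, y in zip(a, b))


def matching_runs(a: str, b: str) -> List[int]:
    """Lengths of the consecutive stretches of markers where the genomes agree."""
    runs, current = [], 0
    for x, y in zip(a, b):
        if x == y:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def with_differences(positions: Set[int], length: int = 40) -> str:
    """Build a toy genome that differs from an all-zero reference at the given positions."""
    return "".join("1" if i in positions else "0" for i in range(length))


reference = "0" * 40
spread = with_differences({4, 14, 24, 34})    # differences evenly distributed
clumped = with_differences({18, 19, 20, 21})  # differences in one pocket

# Both comparisons differ at 4 of 40 markers (10%)...
print(count_differences(reference, spread), count_differences(reference, clumped))  # 4 4
# ...but the clumped case leaves much longer identical stretches.
print(max(matching_runs(reference, spread)))   # 9
print(max(matching_runs(reference, clumped)))  # 18
```

The total count cannot tell the two cases apart, whereas the longest matching stretch can, which is the point the paragraph makes about position mattering as well as number.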

The accuracy of ancestry tests depends a lot on the available reference datasets and the quality of the genetic data. In general, we can be more confident about more recent ancestry (within the last 500 years) than about anything deeper. This is because recombination acts to break down the relationships between markers that allow us to assign ancestry to specific parts of the genome, and more time means more recombination.

The number of ancestors we have as we go back in time increases exponentially. After 20 generations (~500 years) we have over one million ancestors, and after 30 generations (~750 years) it’s over one billion. Of course, these ancestors are not independent and many will be shared with other people; indeed, you will share more of these ancestors with people who have closer ancestral ties to you. However, there’s some very clever maths that shows that all Europeans share an ancestor who was alive about 600 years ago. We don’t know who this person was, or what their DNA looked like, but what it does tell us is that all Europeans have ancestors who were ancient Greek, Roman, or Viking, no matter where in Europe they happen to be living today.
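The ancestor counts quoted above follow from simple doubling: two parents, four grandparents, and so on. A short Python check, assuming the 25-year generation time implied by the figures in the text (the function name `ancestor_slots` is just a label chosen for this sketch):

```python
YEARS_PER_GENERATION = 25  # assumption implied by "20 generations (~500 years)"


def ancestor_slots(generations: int) -> int:
    """Number of positions in the family tree a given number of generations back."""
    return 2 ** generations


for g in (10, 20, 30):
    years = g * YEARS_PER_GENERATION
    print(f"{g} generations (~{years} years): {ancestor_slots(g):,} ancestors")
# 10 generations (~250 years): 1,024 ancestors
# 20 generations (~500 years): 1,048,576 ancestors
# 30 generations (~750 years): 1,073,741,824 ancestors
```

These are positions in the family tree rather than distinct people: the same individual can fill many positions at once, which is why the ancestors are not independent.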

Having said that, genomes from populations that have been separated for many generations, such as Africans and Europeans, will ‘look’ quite different to geneticists, and it is possible to identify African and European ancestral segments in these genomes. As we develop more sophisticated methodology, our ability to differentiate more closely related populations improves.

We can now differentiate, with a reasonable amount of confidence, northern from southern European genomes (which are in fact pretty similar) as well as sub-continental groupings from other parts of the world. One should be more cautious, however, about interpreting where and why this ancestry might be present in a specific person. There have been several detailed studies of ancestry in different parts of the world, and this research has been able to link genetic patterns to historical events. But the power of these studies comes from using populations of genomes. Applying these sorts of tests to individuals remains a challenge, and therefore any attempt to relate a single person’s ancestry to a particular historic event or people should be treated with a large degree of scepticism and caution.