The research findings demonstrate that on the basis of DNA information it is possible to determine with an accuracy of more than 90 percent whether a person has red hair, with a similarly high accuracy whether a person has black hair, and with an accuracy of more than 80 percent whether a person’s hair color is blond or brown. This new DNA approach even allows differentiating hair colors that are similar, for example, between red and reddish blond, or between blond and dark blond hair. The necessary DNA can be taken from blood, sperm, saliva or other biological materials relevant in forensic case work.

A few years ago Wired published a mildly alarmist piece titled ‘The Inconvenient Science of Racial DNA Profiling’, which focused on eye color identification. It turns out that most of the eye color variation in European populations can be predicted by variants across two genes, HERC2 and OCA2. Because markers in this region can explain ~75% of the variation in the trait, it is ‘quasi-Mendelian.’ As it happens, markers in this region also seem to affect skin color and hair color. Naturally these loci loom large in the paper which the above press release is based on. It’s in Human Genetics. Model-based prediction of human hair color using DNA variants:

Predicting complex human phenotypes from genotypes is the central concept of widely advocated personalized medicine, but so far has rarely led to high accuracies limiting practical applications. One notable exception, although less relevant for medical but important for forensic purposes, is human eye color, for which it has been recently demonstrated that highly accurate prediction is feasible from a small number of DNA variants. Here, we demonstrate that human hair color is predictable from DNA variants with similarly high accuracies. We analyzed in Polish Europeans with single-observer hair color grading 45 single nucleotide polymorphisms (SNPs) from 12 genes previously associated with human hair color variation. We found that a model based on a subset of 13 single or compound genetic markers from 11 genes predicted red hair color with over 0.9, black hair color with almost 0.9, as well as blond, and brown hair color with over 0.8 prevalence-adjusted accuracy expressed by the area under the receiver characteristic operating curves (AUC). The identified genetic predictors also differentiate reasonably well between similar hair colors, such as between red and blond-red, as well as between blond and dark-blond, highlighting the value of the identified DNA variants for accurate hair color prediction.
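For readers who don’t work with the AUC figures quoted in the abstract, here is a minimal sketch of what the number means: the probability that a randomly chosen person with the trait gets a higher predicted score than a randomly chosen person without it. The scores below are made up for illustration and are not from the paper, and the function name `auc` is my own.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve, computed by direct pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted probabilities of "red hair" (illustrative only).
red = [0.9, 0.8, 0.35]            # people who actually have red hair
not_red = [0.4, 0.3, 0.2, 0.1]    # people who do not

print(auc(red, not_red))  # 11 of the 12 pairs are ranked correctly
```

An AUC of 0.5 is a coin flip and 1.0 is perfect ranking, so the values above 0.9 reported for red hair mean the model almost always ranks a redhead above a non-redhead. Note that this is a statement about ranking, not about the share of individuals classified correctly.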

The authors used a logistic regression model where the SNPs are the variant inputs which predict the odds of the hair color categories in a sample of Polish individuals. In this paper they used categorical classes as the dependent variables. They don’t seem to make a persuasive case that this is more accurate than using quantitative measures of hair color as dependent variables, though they do indicate that there isn’t any gain in accuracy to using a quantitative model of hair color. Since this is for the purposes of forensic analysis perhaps there is a strong cost vs. benefit angle which I’m not foreseeing. To the left you have a figure which shows the rapidly diminishing marginal returns on SNPs when it comes to predicting hair color in Northern Europeans. The reason that red hair is so easy to detect is that it’s a rare trait which has a very distinctive genetic signature. In fact forensic identification of individuals with red hair via DNA has been practiced in the past, though due to the rarity of this trait its utility hasn’t been too great.

The alleles here are the usual suspects which show up in pigmentation genetics. As noted in the paper there have been some difficulties in the intersection between genomics and medicine in terms of substantive results, but in forensics the localization of salient phenotypic variation to only a few markers has been a relative success.

Click on the image to the left and you’ll see the genes and associated markers. There are two sets of phenotypes predicted (depending on how they categorized them). All of them are ratios like so: (non-black hair color)/(black hair color). The numerical values are betas, which show the relationship between each predictor (the genetic marker) and the outcome (the hair color category). The sign gives the direction of the effect and the magnitude gives its size, and the genes are sorted by utility in prediction. So the first row has MC1R, and a set of highly penetrant markers termed “R.” The presence of R is highly correlated with red hair, as can be seen in the high values in the columns which denote red hair. Aside from MC1R you have conventional SNPs (some of which you can look up online pretty easily).
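To make the betas concrete, here is a rough sketch of how logistic regression coefficients turn a genotype into a predicted probability. It is simplified to a single yes/no outcome, whereas the paper predicts several hair color categories at once. The marker names, intercept and coefficient values are hypothetical placeholders, not the values in the paper’s table.

```python
import math

# Hypothetical values for illustration only; NOT taken from the paper.
INTERCEPT = -2.0                          # baseline log-odds of the trait
BETAS = {"snp_a": 1.2, "snp_b": -0.4}     # log-odds change per allele copy


def odds_ratio(beta):
    """Multiplicative change in the odds per extra copy of the allele."""
    return math.exp(beta)


def predicted_probability(genotype):
    """Probability of the trait given allele counts (0, 1 or 2) per marker."""
    log_odds = INTERCEPT + sum(BETAS[snp] * genotype[snp] for snp in BETAS)
    return 1.0 / (1.0 + math.exp(-log_odds))


# A hypothetical person carrying two copies at snp_a and none at snp_b.
person = {"snp_a": 2, "snp_b": 0}

print(odds_ratio(BETAS["snp_a"]))     # greater than 1: raises the odds
print(predicted_probability(person))  # log-odds of 0.4, so just under 0.6
```

A positive beta pushes the odds toward the category in the numerator of the ratio, a negative one pushes them away, and a beta near zero means the marker adds little, which is why a handful of markers with large betas carry most of the predictive weight.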

What does this mean, and why is it important? The law enforcement application is rather straightforward, though the existence of hair dyes and bleaching agents means that it isn’t quite that useful. Rather, I think what we’re seeing here is a step-wise improvement in forensic genetics to the point where in the near future the perpetrator sketch artist may be as antiquated as the VCR. We’ll be getting somewhere when markers for nose size and shape, as well as other facial characteristics, are smoked out. I don’t believe that a fine-grained reconstruction of someone’s countenance will be possible, but that’s not usually what’s needed in any case for forensics. A coarse reconstruction will probably be superior to the sketches derived from the memories of witnesses; they will be less precise, but more accurate.

I remember some reading (long ago) concerning forensics and hair, which said that (in central Europe at least) hair on the same skull varies a lot in diameter and colour (even when growing adjacent). This was said to be important for excluding single hairs as a source of evidence. So, on typical Europeans the “colour” of the hair is some mixture. Is this something one can correlate to these genes?
Georg

Stephen

Accuracy rates of 80-90 percent don’t sound all that impressive. Clearly better than eye-witness memory, but as opposed to subjective evidence, these tests provide a defense attorney with a *quantified* reasonable doubt. When used for investigation, not prosecution, these rates could lead to exclusion of an otherwise likely suspect. Importantly, the lack of perfection here seems less technical than it is genetic, so I’m guessing such tests need more markers to go to prime time. And nose shape’s going to be much more difficult than hair color.

http://washparkprophet.blogspot.com ohwilleke

As a way to prove the hair or eye color of an offender, Stephen is right that 80-90% isn’t very impressive but is probably better than eye-witness memory, particularly in a dark place when a stranger is involved. But, if you represent a suspect, and can get access to a DNA report in a police file that shows that the defendant probably has a different hair or eye color than your client, the DNA evidence could be useful in deciding what risks are involved in cooperating with police and in defending yourself in a criminal trial.

A key point, however, is that this methodology is only validated in Northern Europeans. But a very large share of criminal defendants have some combination of substantial African, indigenous American or Southern European admixture, and that percentage is higher in certain parts of the U.S. and in cases where the suspected perpetrator is a stranger, making DNA identification necessary. This tool may be considerably more useful in a country like Germany, with a considerable mix of hair and eye colors within an overall predominantly Northern European population, than it is in the United States.

In the United States, identifying offender race, which DNA can do more accurately than it can hair and eye color phenotypes, is probably the more useful trick, and if one wanted more useful hair and eye color identifiers, one would want identifiers that differentiate between African-Americans and between Hispanics, as misidentification of non-white strangers by Anglo eye witnesses is the most common cause of wrongful convictions in the United States. For example, a test that would distinguish between light skinned and dark skinned African-Americans with a considerable degree of accuracy (e.g. 95%) would have considerable forensic value.

Stephen

I appreciate the added nuances from ohwilleke. Another point here comes up in the first paragraph, “But, if you represent a suspect – – – .” The usual gee-whiz benefit given for forensic developments is the societal one of protecting the innocent and discovering the guilty by more accurately discovering the objective facts. This good is enhanced to the extent that the test is accurate. And of course it’s a societal good to the extent that the innocent can better navigate the courts. But if the test is less accurate, an exonerating false negative is substantially possible. This benefit is extended to the guilty, and it conveniences the defense apart from the actual facts. It’s still a good, but an individual one for attorney and client, and it can act against the societal good. And the less accurate the test, the more likely it benefits the guilty.

Divalent

Ultimately if anyone is actually charged with the crime they will do a normal genetic comparison of the suspect with the DNA collected at the scene, so it doesn’t seem to me like this ability is going to open up another class of evidence whose reliability is going to be a factor in any trial. To the extent it’s useful, it will be as an investigative tool to narrow down the list of possible suspects. The fact that it is not 100% reliable is not unique to this type of evidence at this point.

http://blogs.discovermagazine.com/gnxp Razib Khan

my initial thought is with divalent.

http://ecophysio.fieldofscience.com/ EcoPhysioMichelle

I see this as being much more advantageous for identifying human remains (where DNA extraction is possible) than for trying to prosecute suspects.

Gav

No hair colour category for grey?

Sigh.

Miley Cyrax

@ ohwilleke
“But, a very large share of criminal defendants have some combination of substantial African… admixture”

In terms of arrests, black people are more or less represented in proportions that reflect the US population. In terms of how many of these cases go to trial, well, I dunno. It isn’t a secret that there are disproportionate numbers of black men in US prisons, though.

http://blogs.discovermagazine.com/gnxp Razib Khan

hm. 2.3 X over-representation is reflective in your book?

http://ecophysio.fieldofscience.com/ EcoPhysioMichelle

Err. I’m dumb. I thought black people were ~20%. Although how the different studies handle multiracial people is also a factor.

http://blogs.discovermagazine.com/gnxp Razib Khan

“Err. I’m dumb. I thought black people were ~20%. Although how the different studies handle multiracial people is also a factor.”

1) a 20% number is a normal estimate. people regularly seem to overestimate the proportion of minorities. this is a robust social science finding.

2) very few blacks identify as multiracial, though obviously most blacks have white ancestry. but in that particular statistic the main issue is that hispanics are classified as white.

Gene Expression

This blog is about evolution, genetics, genomics and their interstices. Please beware that comments are aggressively moderated. Uncivil or churlish comments will likely get you banned immediately, so make any contribution count!

About Razib Khan

I have degrees in biology and biochemistry, a passion for genetics, history, and philosophy, and shrimp is my favorite food. In relation to nationality I'm an American Northwesterner, in politics I'm a reactionary, and as for religion I have none (I'm an atheist). If you want to know more, see the links at http://www.razib.com