Thursday, 15 September 2016

Polygenic scores

Stuart Ritchie (as in Intelligence: All that Matters) has done a guest post on the British Psychological Society Research Digest. The Digest has a wide readership among psychologists, so it is very good news that they will be getting an update on contemporary research from an active researcher. I hope that they will consider the inheritance of characteristics in all their research.

This is intended to be a very brief post, just directing you to Stuart’s article, and adding a few links.

As Stuart says, only 1 or 2% of the variance in these behaviours is explained by the polygenic score. This sounds like little, and it is, but the miracle is that any link can be shown between gene sequences and complex human outcomes.

The next paper by Selzam boosts the variance-accounted-for to 9.1%. Stuart says: The polygenic scores are already pretty good predictors: in Selzam’s study, they have just about half of the predictive value of asking about the parent’s socio-economic status, or testing the child’s IQ at age 7 (and the scores are based on DNA variants that are unchanged since birth and can be measured with a simple saliva or blood test).

Of course, parents’ socio-economic status is not random. Higher status is achieved by brighter persons. IQ at age 7 is usually a better predictor of adult success than class of origin, though the two are confounded, and quite properly so.

Stuart adds: Using an even newer polygenic education estimate from a more recent gene-finding study (published in Nature this year), Saskia Selzam and colleagues found that their polygenic score explained a remarkable 9.1 per cent of the variance in age-16 GCSE results in a sample of 4,300 British teenagers.
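For readers who think more easily in correlations than in variance explained, the two are linked by a square root when there is a single predictor. The short sketch below does that arithmetic for the figures quoted in this post; the function name is my own, and it assumes a simple one-predictor linear relationship.

```python
import math

def correlation_from_variance_explained(r_squared):
    """Size of the correlation implied by a given share of variance explained.

    Assumes a single predictor in a simple linear regression, where
    R^2 is just the squared correlation.
    """
    return math.sqrt(r_squared)

# Figures quoted in the post: 1%, 2% and 9.1% of variance explained.
for share in (0.01, 0.02, 0.091):
    r = correlation_from_variance_explained(share)
    print(f"variance explained {share:.3f} -> correlation about {r:.2f}")
```

So 9.1 per cent of the variance corresponds to a correlation of roughly 0.30.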

It is worth noticing that the most easily available and most often used educational achievement measure is very crude: years of schooling. Once proper scholastic and intellectual assessment measures are used on much larger genetic samples, the power of the predictive polygenic scores can very probably be considerably refined.

I thought young Ritchie wrote that rather well, until "ignorance and denial are no longer an option". As he probably knows perfectly well, for many people ignorance and denial are no longer an option only in the sense that they are always embraced.

whenever one of these things makes it to the popular press it makes me sad. the whole thing is meaningless.

for instance, since when is 9% R^2 “pretty good”? polygenic scores can be slightly interesting from the point of view of animal/plant breeding, where they might actually be usable/measurable. but human GWAS is too much of a hot mess for these things to be even slightly interpretable. what conceivable use does a measure have when it maybe sometimes can explain 1% of the variance? recall that polygenic scores are, basically, linear combinations of dozens of features–when there is no plausible reason to believe that these effects are statistically independent–so you’ve basically overfit the crap out of what little signal there is, and gee whiz, turns out the prediction is crap also.
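The "linear combination" description can be made concrete. What follows is a minimal sketch on simulated data, not the method of any study mentioned here: the number of variants, the allele frequency, the weights and the noise level are all invented for illustration, and real scores take their weights from a separate discovery sample.

```python
import random

def polygenic_score(allele_counts, weights):
    """Weighted sum of allele counts (0, 1 or 2 copies per variant)."""
    return sum(w * g for w, g in zip(weights, allele_counts))

def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

rng = random.Random(0)
N_PEOPLE, N_VARIANTS = 2000, 50  # hypothetical sizes

# Hypothetical per-variant weights (small effects, as the post describes).
weights = [rng.gauss(0, 0.05) for _ in range(N_VARIANTS)]

scores, outcomes = [], []
for _ in range(N_PEOPLE):
    # Two allele draws per variant at an assumed frequency of 0.5.
    genotype = [(rng.random() < 0.5) + (rng.random() < 0.5)
                for _ in range(N_VARIANTS)]
    score = polygenic_score(genotype, weights)
    scores.append(score)
    # Outcome = genetic score plus a much larger non-genetic component.
    outcomes.append(score + rng.gauss(0, 1))

print(f"variance explained by the score: {r_squared(scores, outcomes):.3f}")
```

The point of the sketch is only that many tiny weights, summed, can account for a modest slice of variance while leaving most of it unexplained.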

if this is as good as anyone can do, maybe it’s time to think about something other than the genetic basis of these traits, and more importantly something that can actually be fixed with known large effects (nutrition, poverty, teacher quality etc.). the worst part is that we’ve known this for >40 years: http://europepmc.org/backend/ptpmcrender.fcgi?accid=PMC1762622&blobtype=pdf

until we solve the basic problem that heritability measures are near-hopelessly unstable, this kind of study is just more noise harvesting; genetics of behavior is a much worse reproducibility fiasco than other issues in e.g. psychology, because in genomics there might as well not be a statistical significance filter. if you test a gazillion hypotheses, a bunch will pass whatever threshold you might happen to set. some might even be real signal! but if your odds ratio is 1.01, there is no perceptible reason to care.
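The multiple-testing point is simple arithmetic and easy to check. The sketch below assumes every tested hypothesis is truly null, so each p-value is uniform between 0 and 1; the function names and the numbers of tests are illustrative. For reference, genome-wide association studies conventionally use a threshold of 5e-8, which is roughly a Bonferroni correction of 0.05 for a million independent tests.

```python
import random

def expected_false_positives(n_tests, alpha):
    """Expected number of null tests with p < alpha."""
    return n_tests * alpha

def simulated_false_positives(n_tests, alpha, seed=0):
    """Count null tests passing the threshold, drawing uniform p-values."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n_tests) if rng.random() < alpha)

# With an ordinary 0.05 threshold, many pure-noise results get through.
print(expected_false_positives(1_000_000, 0.05))
print(simulated_false_positives(100_000, 0.05))

# With the conventional genome-wide threshold, far fewer are expected.
print(expected_false_positives(1_000_000, 5e-8))
```

Whether the stricter threshold settles the question is a separate argument; the sketch only shows how the count of chance hits scales with the number of tests and the threshold chosen.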

There is a definitional problem here. When Ritchie says 'prediction' he doesn't actually mean prediction in the dictionary sense of the word (whether he does this knowingly or unknowingly is another question). At best, these methods 'explain' variance; they do not predict it. Remember that regression methods are simply correlational.
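One way to see the difference between explaining and predicting is to compare fit in the sample used to choose a predictor with fit in fresh data. The sketch below is a toy illustration under invented assumptions: every candidate predictor is pure noise, the best one is picked in a training half, and then it is checked in a holdout half. The sample sizes and names are hypothetical.

```python
import random

def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

def best_in_sample_vs_holdout(n_people=100, n_predictors=200, seed=1):
    """Pick the best of many noise predictors in-sample, then test it afresh.

    Returns (in-sample squared correlation, holdout squared correlation).
    """
    rng = random.Random(seed)
    total = 2 * n_people
    outcome = [rng.gauss(0, 1) for _ in range(total)]
    predictors = [[rng.gauss(0, 1) for _ in range(total)]
                  for _ in range(n_predictors)]
    train, holdout = slice(0, n_people), slice(n_people, total)
    # Selection uses only the training half.
    best = max(predictors, key=lambda p: r_squared(p[train], outcome[train]))
    return (r_squared(best[train], outcome[train]),
            r_squared(best[holdout], outcome[holdout]))

in_sample, out_of_sample = best_in_sample_vs_holdout()
print(f"in-sample: {in_sample:.3f}, holdout: {out_of_sample:.3f}")
```

Because the winner is selected for its in-sample fit, that figure is flattering by construction; the holdout figure is the one that speaks to prediction. A score built in one sample and tested in an independent one is evaluated in the second sense.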

For more reading: http://jakewestfall.org/publications/Yarkoni_Westfall_choosing_prediction.pdf