Thursday, 3 September 2015

GWAS hits and country IQ

Country IQ (Lynn and Vanhanen 2012) has become a much-researched variable, and a paper out today links alleles associated with individuals' IQ to overall country IQ. This will begin to provide estimates of the extent to which country intelligence levels have a genetic cause. A key methodological aspect of the paper is the extraction of a common factor from among the SNPs using unweighted least squares factor analysis. The result is a metagene, a term used in genetics to describe patterns of covariance among genes.

A review of intelligence GWAS hits: Their relationship to country IQ and the issue of spatial autocorrelation

Highlights

Published GWAS, reporting the alleles exhibiting significant and replicable associations with IQ, are reviewed.

The average frequency and the factor score of nine GWAS hits are strongly correlated to country IQ (r = 0.91).

Allele frequencies varied in a way that matched group-level phenotypic intelligence, albeit not significantly so.

GWAS hits were stronger predictors of IQ than random SNPs or Fst distances in correlational and regression analyses.

The GWAS hits are independent predictors of aggregate IQ differences, after accounting for genetic drift and migrations.

Abstract

Published Genome Wide Association Studies (GWAS), reporting the presence of alleles exhibiting significant and replicable associations with IQ, are reviewed. The average between-population frequency (polygenic score) of nine alleles positively and significantly associated with intelligence is strongly correlated to country-level IQ (r = .91). Factor analysis of allele frequencies furthermore identified a metagene with a similar correlation to country IQ (r = .86). The majority of the alleles (seven out of nine) loaded positively on this metagene. Allele frequencies varied by continent in a way that corresponds with observed population differences in average phenotypic intelligence. Average allele frequencies for intelligence GWAS hits exhibited higher inter-population variability than random SNPs matched to the GWAS hits or GWAS hits for height. This indicates stronger directional polygenic selection for intelligence relative to height. Random sets of SNPs and Fst distances were employed to deal with the issue of autocorrelation due to population structure. GWAS hits were much stronger predictors of IQ than random SNPs. Regressing IQ on Fst distances did not significantly alter the results nonetheless it demonstrated that, whilst population structure due to genetic drift and migrations is indeed related to IQ differences between populations, the GWAS hit frequencies are independent predictors of aggregate IQ differences.
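To make the abstract's "polygenic score" concrete, here is a minimal sketch of the calculation as described: the score for a population is the mean frequency of the trait-associated alleles, and the scores are then correlated with a population-level phenotype. All population names, frequencies, and phenotype values are invented for illustration and are not taken from the paper.

```python
from math import sqrt

def polygenic_score(freqs):
    """Mean frequency of the trait-associated alleles in one population."""
    return sum(freqs) / len(freqs)

def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: one list of allele frequencies per population
# (three SNPs here; the paper uses nine).
allele_freqs = {
    "pop_a": [0.60, 0.55, 0.70],
    "pop_b": [0.50, 0.45, 0.60],
    "pop_c": [0.40, 0.50, 0.45],
    "pop_d": [0.30, 0.35, 0.40],
}
# Hypothetical population-level phenotype values.
phenotype = {"pop_a": 104.0, "pop_b": 99.0, "pop_c": 95.0, "pop_d": 88.0}

pops = sorted(allele_freqs)
scores = [polygenic_score(allele_freqs[p]) for p in pops]
r = pearson_r(scores, [phenotype[p] for p in pops])
print(f"correlation between score and phenotype: {r:.3f}")
```

With so few populations and SNPs a high correlation is easy to produce, which is why the paper goes on to compare against random SNPs.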

The author says:

The average frequency (polygenic score) of nine alleles positively associated with IQ and proxy phenotypes at the individual differences level in published GWAS is strongly and significantly correlated to population, or country IQ (r = .91). Factor analysis of allele frequencies yielded a metagene factor with a similar correlation to IQ (.86). The majority of alleles (seven out of nine) loaded positively on this factor. 40 unrelated SNPs were drawn at random and their frequencies factor analyzed for use as a control. The pattern of very high-magnitude positive or negative correlations suggests that spatial autocorrelation might be inflating the relationships between variables. That is to say, factors extracted utilizing random SNPs exhibited very high correlations to the GWAS hits factors (r = .6 to .98) and similarly high correlations with country IQ distances. Unexpectedly, the method of correlated vectors produced very high values also when run using the random SNPs, rendering the extremely high magnitude and significant correlation (.99, p < .05) found for the GWAS hits somewhat less impressive. However, the correlation of IQ with the GWAS hits metagene (.89) was somewhat higher than the IQ correlation with the random SNP factors (.74).
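The logic of the random-SNP control can be sketched as an empirical null distribution: draw many random SNP sets of the same size as the hit set, score each set the same way, and see how often a random set correlates with the phenotype as strongly as the hits do. The sketch below uses simulated frequencies with a built-in shared population structure, so random SNPs also correlate with the phenotype. Every number is synthetic, and the sampling is simpler than the paper's (no matching of random SNPs to the hits).

```python
import random
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mean_freq(freq_rows, snp_idx):
    """Per-population mean frequency over the chosen SNP columns."""
    return [sum(row[j] for j in snp_idx) / len(snp_idx) for row in freq_rows]

def empirical_p(freq_rows, phenotype, hit_idx, pool_idx, n_sets=500, seed=0):
    """Compare the hit-set correlation with correlations from random SNP sets."""
    rng = random.Random(seed)
    r_hits = pearson_r(mean_freq(freq_rows, hit_idx), phenotype)
    null = []
    for _ in range(n_sets):
        sample = rng.sample(pool_idx, len(hit_idx))
        null.append(pearson_r(mean_freq(freq_rows, sample), phenotype))
    exceed = sum(1 for r in null if abs(r) >= abs(r_hits))
    # Add-one correction so the estimate is never exactly zero.
    return r_hits, null, (exceed + 1) / (n_sets + 1)

# Synthetic data: 12 populations, 200 SNPs, one shared structure gradient.
rng = random.Random(42)
n_pops, n_snps = 12, 200
structure = [rng.random() for _ in range(n_pops)]
loadings = [rng.uniform(-1, 1) for _ in range(n_snps)]
freq_rows = [
    [min(0.99, max(0.01, 0.5 + 0.3 * loadings[j] * (structure[i] - 0.5)
                   + rng.gauss(0, 0.05)))
     for j in range(n_snps)]
    for i in range(n_pops)
]
phenotype = [s + rng.gauss(0, 0.1) for s in structure]

hit_idx = list(range(9))            # pretend the first nine SNPs are the hits
pool_idx = list(range(9, n_snps))   # the rest form the random pool
r_hits, null, p = empirical_p(freq_rows, phenotype, hit_idx, pool_idx)
print(f"r (hits) = {r_hits:.2f}, mean |r| (random sets) = "
      f"{sum(abs(r) for r in null) / len(null):.2f}, empirical p = {p:.3f}")
```

In this simulation the "hits" are no different from the pool, so the random sets often do just as well. That is the autocorrelation problem the passage describes.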

Comparison of allele frequency means for the five continental groups from the 1000 Genomes database revealed frequency differences that closely correspond to observed continent-level aggregate IQ, yielding the following pattern: East Asian > European > South Asian > American (Hispanic) > African. However, ANOVA did not yield p values that meet the conventional significance threshold (p < .05), furthermore Tukey's test produced confidence intervals that bisected zero. The lack of statistical significance is clearly due to the very small sample size (N = 9). Increasing numbers of GWA studies will undoubtedly provide more hits in the future permitting the generation of an increasingly accurate picture of cognitive-related genetic variation, both within and between populations.
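The point about sample size can be illustrated with a one-way ANOVA F statistic over five groups of nine SNP frequencies each. The sketch below computes F by hand and, instead of looking up the F distribution, estimates a p value by permuting group labels (a substitute technique, not the paper's procedure). The frequencies are simulated, with small assumed mean differences between groups and a large spread across SNPs.

```python
import random

def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def permutation_p(groups, n_perm=2000, seed=0):
    """Share of label shufflings whose F is at least the observed F."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for s in sizes:
            shuffled.append(pooled[start:start + s])
            start += s
        if f_statistic(shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Simulated frequencies: five groups, nine SNPs each, small mean shifts.
rng = random.Random(1)
shifts = [0.00, 0.02, 0.04, 0.06, 0.08]
groups = [
    [min(1.0, max(0.0, rng.gauss(0.4 + s, 0.25))) for _ in range(9)]
    for s in shifts
]
observed, p = permutation_p(groups)
print(f"F = {observed:.2f}, permutation p = {p:.3f}")
```

Because the same nine SNPs are measured in every group, a design that pairs observations by SNP would arguably be more appropriate. The plain one-way layout is used here only to keep the sketch short.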

This is a very promising start, after a long review process. Once we have larger genetic samples of people on different continents, it should be possible to move from country-level aggregate intelligence scores to individual scores.

8 comments:

Or download it at http://emilkirkegaard.dk/en/wp-content/uploads/PifferIntelligence2015.pdf

I wonder if this would have been more convincing if he'd waited for the SSGAC hits supposedly coming out sometime this year? But I suppose he can always do a followup paper and this must have taken long enough to get finished as it is.

"...Two reviews found substantial cross-validity for the Eurasian population (Europeans and East Asians), and less for Africans (usually African Americans) (23,24). The first review only relied on SNPs with p<α and found weaker results. This is expected because using only these is a threshold effect, as discussed earlier.

The second review (from 2013; 299 included GWAS) found much stronger results, probably because it included more SNPs and because they also adjusted for statistical power. Doing so, they found that: ~100% of SNPs replicate in other European samples when accounting for statistical power, ~80% in East Asian samples but only ~10% in the African American sample (not adjusted for statistical power, which was ~60% on average). There were fairly few GWAS for AAs however, so some caution is needed in interpreting the number. Still, this throws some doubt on the usefulness of GWAS results from Europeans or Asians used on African samples (or reversely)."

"7. POOR AFRICAN-EURASIAN CROSS-VALIDITY AND THE PIFFER METHOD

The findings related to the relatively poor, but non-zero cross-validity of GWAS betas between European and African samples throw some doubt on the SNP evidence found by Piffer in his studies of the population/country IQ and cognitive ability SNP factors (29). If the betas for the SNPs identified in European sample GWAS do not work well as predictors for Africans, they would be equally unsuitable for estimating mean genotypic cognitive ability from SNP frequencies. Thus, further research is needed to more precisely estimate the cross-racial validity of GWAS betas, especially with regards to African vs. Eurasian samples."

I am afraid the author of the commentary got it wrong this time. Different LD patterns should simply reduce the reliability of the cross-population comparison, but they should not bias the results towards lower polygenic scores for Africans. Apparently the author never heard of a thing called correction for reliability, which if done would actually increase the PS difference between Africans and Europeans. But there is no reason why different LD patterns would increase allele frequency differences between populations. The effect would actually be the opposite, as the real frequency differences are expected to be larger than those observed.
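For readers unfamiliar with "correction for reliability", the classical version is Spearman's correction for attenuation: an observed correlation or standardized difference is divided by the square root of the measurement reliability, so the corrected value is never smaller than the observed one. A minimal sketch follows. The reliability values are hypothetical, and whether this correction applies to cross-population polygenic scores in the way the comment suggests is the commenter's claim, not something the code establishes.

```python
from math import sqrt

def disattenuate_r(r_observed, rel_x, rel_y=1.0):
    """Spearman's correction: r_true = r_obs / sqrt(rel_x * rel_y)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / sqrt(rel_x * rel_y)

def disattenuate_d(d_observed, reliability):
    """Same idea for a standardized mean difference: d_true = d_obs / sqrt(rel)."""
    if not (0 < reliability <= 1):
        raise ValueError("reliability must lie in (0, 1]")
    return d_observed / sqrt(reliability)

# Hypothetical numbers: an observed difference of 0.5 SD measured
# with reliability 0.25 corresponds to a corrected difference of 1.0 SD.
print(disattenuate_d(0.5, 0.25))
print(disattenuate_r(0.4, 0.64))
```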

Piffer seems to have reanalyzed his data and calculated somewhat higher scores for Africans in both height and IQ (if my reading is right). How significant the difference is between corrected and uncorrected IQ projections, I cannot tell.

"I computed two polygenic scores (mean population frequencies): ancestral and derived. Then I created a composite score by averaging them. This gives equal weight to ancestral and derived alleles (Piffer, 2015b).The end result is that populations with higher baseline frequencies of derived alleles (such as Africans) obtain a higher score after this correction, because more weight is given to ancestral alleles."

"We can see that the ranking of corrected polygenic scores for height and IQ gives higher scores to Africans compared to the uncorrected scores..."