Abstract

ADHD and general intelligence
are negatively correlated (within populations) and this correlation is driven
by common genetic variants shared between the two phenotypes. This paper
analyzes the population frequency patterns of alleles associated with ADHD and
intelligence in two samples of 26 and 50 populations (1000 Genomes and ALFRED, respectively).
Factor analysis of allele frequencies was used to estimate the strength of
natural selection on the two traits. The two factors, indicating selection for
general intelligence and ADHD, show strong negative correlations in both 1000
Genomes and ALFRED samples (r= -0.93 and -0.90, respectively).

Alleles with lower p-values would be less likely to be false
positives, so the more significant ADHD GWAS hits are expected to be more
strongly negatively correlated with the general intelligence SNP and the ADHD
SNP factors, which were also found (r=-0.26 and 0.37, respectively).

The ADHD factor predicted national IQs also after accounting
for a measure of population structure (Fst).

Results are interpreted in a framework based on evolutionary
convergent selection pressure for higher general intelligence and lower ADHD.

Introduction

Attention
deficit hyperactivity disorder (ADHD) is a psychiatric disorder characterized
by having trouble concentrating and impulsive behavior. Prevalence depends on
specific criteria used (e.g. self-report vs. clinical diagnosis) but is around
5-7% (Willcutt, 2012). Genetically informative designs find high heritabilities
of ADHD. One review of 20 twin studies found a mean heritability of 76%
(Faraone et al, 2005). ADHD is related to many other psychiatric problems such
as bipolar disorder and drug dependency (Kessler et al, 2006). More
importantly, ADHD is negatively related to general intelligence (GI; Kuntsi et
al, 2003) and learning ability (Mayes et al, 2000). The correlation with GI is
about -.3. This may seem small, but in practice it means that the ADHD group
has an IQ about 9 points below the non-ADHD group (see Frazier 2004 for a
meta-analysis of ADHD x IQ correlations). Moreover, Kuntsi et al. (2003) found
that the genetic overlap between ADHD and GI was 100%; the same genes have an
influence on both traits.
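
The step from a correlation of about -.3 to a gap of about 9 IQ points can be checked with the standard conversion from a point-biserial correlation to a standardized mean difference. The sketch below assumes two equal-sized groups and an IQ standard deviation of 15; both are assumptions made for illustration rather than figures taken from the cited studies, and the function name is hypothetical.

```python
import math

def r_to_iq_gap(r, sd=15.0):
    """Convert a point-biserial correlation into a mean difference in IQ points.

    Assumes two equal-sized groups, for which d = 2r / sqrt(1 - r^2).
    The IQ standard deviation of 15 is likewise an assumption.
    """
    d = 2.0 * r / math.sqrt(1.0 - r ** 2)
    return d * sd

gap = r_to_iq_gap(-0.3)
print(round(gap, 1))  # -9.4
```

With unequal group sizes the same correlation implies a larger standardized difference, so the 9-point figure is best read as the equal-groups case.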

Martin
et al. (2014) found that a genetic composite risk score, based on a
case-control genome-wide association study (GWAS) for clinical ADHD
(Stergiakouli et al., 2012), was independently associated with lower IQ.
However, the association was rather weak (beta = -.05). On the other hand, the
discovery sample was quite small for a GWAS (700 with ADHD, 5100 controls), so
the association will likely be larger in a better powered GWAS.

Molecular genetic studies of height, general intelligence and educational attainment

Over the last
few years, researchers have started moving away from the study of genetic
evolution using a single-gene, Mendelian approach towards models that examine
many genes together (polygenic). The more genes are involved in a given
phenotype, the more the signal of natural selection will be “diluted” across
different genomic regions (because each gene accounts for a tiny effect) making
it difficult to detect it using approaches focused on a single gene (Pritchard
et al., 2010; Piffer, 2014; Davies et al, 2011; Groen-Blokhuis et al, 2014). A
first attempt at empirically identifying polygenic selection was made by
Turchin et al. (2012) on two populations (Northern and Southern Europeans),
providing evidence for higher frequency of height-increasing alleles (obtained
from GWAS studies) among Northern Europeans. A drawback of that study was the
reliance on populations from a single continent and that crude pairwise
comparisons (e.g. French vs. Italian) were used without correlating frequency
differences to average population height. Moreover, the strength of selection
was not determined.

Rietveld
et al. (2013)’s meta-analysis found ten SNPs that increased educational
attainment, comprising three with nominal genome-wide significance and seven
with suggestive significance. A recent study has replicated the positive effect
of these top three SNPs, rs9320913, rs11584700 and rs4851266, on mathematics
and reading performance in an independent sample of school children (Ward et
al., 2014).

Two
different approaches to identify selection based on the correlation of allele
frequencies across different populations have been recently developed by Piffer
(2013) and Berg & Coop (2014).

Piffer
(2013) obtained two samples comprising 14 and 50 populations (1000 Genomes and
ALFRED databases, respectively) and applied principal components analysis to
the frequencies of the ten alleles reported in Rietveld et al. (2013). The
alleles loaded highly and in the expected direction (positively) on a single
factor accounting for most of the variance. The factor scores were correlated
to indexes of country educational achievement (PISA) and IQ, producing high correlations
(r’s around 0.9). This factor was interpreted as indicating the strength of
polygenic selection. This was the first time that genetic frequencies had been
used from a cross-racial sample and an estimate of selection strength was
provided, thus correlating it with measured average phenotypic scores.

The
genetic correlation between ADHD and GI within a population makes it plausible
that the same pattern will be found across populations (Jensen, 1998, “default
hypothesis”). Based on the
above, we formed the following hypotheses:

If the ADHD SNPs are true positives
and have been selected for or against, then there should be a general ADHD
SNP selection factor across populations.

There should be a negative correlation
between the ADHD SNP general factor and measures of national GI, as well
as with the previously reported GI SNP selection factor.

There should be a negative correlation between
the p-values of ADHD SNPs and their relationships with the GI/educational
attainment SNP genetic factor and the ADHD SNP general factor because SNPs
with higher p-values are more likely to be false positives.

Methods

IQ/educational
attainment-increasing alleles: Piffer’s factor, extracted from 4 SNPs affecting general
intelligence, was used (Piffer, 2015a). The top three SNPs (not in LD) were
obtained from the most recent GWAS of cognitive function (Davies et al.,
2015). This study focused on fluid intelligence.

ADHD
risk alleles were
obtained from the largest GWAS to this date, including 5,621 clinical patients
and 13,589 controls (Groen-Blokhuis et al., 2014).

National/Ethnic
IQs were obtained from Lynn and Vanhanen (2012). The IQ for Tuscany was
obtained from Piffer & Lynn (2014), as the IQ of Central Italy. There were 3
missing cases (Chinese Dai, Gujarati Indian, Indian Telugu). See the supplementary material (“gadhdfactors”
spreadsheet).

Since
the probability that an SNP is a true positive depends on its p-value, only
alleles with a p-value < 5 × 10^-5 were included. This is already a
very liberal threshold, because the conventional threshold for genome-wide
significance is 5 × 10^-8 after correction for multiple testing
(Clarke et al., 2011). However,
the SNPs with the lowest p-value in this study were only < 2 × 10^-6,
so we chose as an a priori convention to include all SNPs with a p-value of an
order of magnitude lower. This resulted in a set of 42 SNPs. For allele frequencies,
see the supplementary material.

The set of SNPs (N=42) included many
hits within the same genomic region (500Kb). To avoid redundancy, linked loci
were not counted more than once and, in order to reduce noise, if two or more
SNPs were in linkage, only the one with the best p-value was included. Since
having a case-to-variable ratio of at least 2:1 is recommended for factor
analysis (Zhao, 2009) and there were a total of 17 SNPs, we created two sets of
9 and 8 SNPs, ordered according to their p-value (i.e. the first set comprising
the half with the lower p-values).
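
The pruning and splitting steps described above can be sketched as follows. This is a minimal illustration, not the procedure actually run for the paper: it treats two SNPs as linked when they lie on the same chromosome within 500 kb of each other (a positional proxy, not measured LD), the function names are hypothetical, and the example SNPs are invented.

```python
from math import ceil

def prune_linked(snps, window=500_000):
    """Keep one SNP per linked region: the one with the lowest p-value.

    snps: list of (rsid, chrom, position, p_value) tuples.
    Two SNPs count as linked here if they share a chromosome and lie
    within `window` base pairs of each other (an assumption).
    """
    kept = []
    for snp in sorted(snps, key=lambda s: s[3]):  # best p-value first
        _, chrom, pos, _ = snp
        clash = any(c == chrom and abs(pos - p) <= window
                    for _, c, p, _ in kept)
        if not clash:
            kept.append(snp)
    return kept

def split_by_significance(kept):
    """Split p-value-ordered SNPs into a more and a less significant half."""
    half = ceil(len(kept) / 2)
    return kept[:half], kept[half:]

# Hypothetical SNPs for illustration only (not the study's data).
example = [
    ("rsA", 1, 1_000_000, 2e-6),
    ("rsB", 1, 1_200_000, 9e-6),  # within 500 kb of rsA, worse p-value
    ("rsC", 1, 5_000_000, 4e-6),
    ("rsD", 2, 1_000_000, 3e-5),
    ("rsE", 7, 8_000_000, 1e-5),
]
kept = prune_linked(example)
set1, set2 = split_by_significance(kept)
print([s[0] for s in set1], [s[0] for s in set2])
```

Applied to 17 unlinked SNPs, the split yields sets of 9 and 8, matching the set sizes reported above.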

A
factor analysis using minimum residuals was carried out and factor scores saved
with the Thurstone method. Other extraction methods were not used as they
produced nearly identical solutions. The same factor analytic procedure was
used throughout this paper.
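
A single-factor extraction with regression-based (Thurstone) scores can be sketched as below. Note the substitution: the sketch uses iterated principal-axis factoring rather than minimum residuals, on the strength of the statement above that other extraction methods gave nearly identical solutions. The data are simulated, and the function and variable names are hypothetical.

```python
import numpy as np

def one_factor_paf(X, n_iter=200):
    """Single-factor iterated principal-axis factoring with regression scores.

    X: populations x SNPs matrix of allele frequencies.
    Returns (loadings, scores). Principal-axis factoring stands in for
    the minimum-residuals extraction used in the paper.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)
    # Start communalities at the squared multiple correlations.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(n_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)
        vals, vecs = np.linalg.eigh(reduced)  # eigenvalues in ascending order
        loadings = vecs[:, -1] * np.sqrt(max(vals[-1], 0.0))
        h2 = np.clip(loadings ** 2, 0.0, 1.0)
    if loadings.sum() < 0:  # the sign of an eigenvector is arbitrary
        loadings = -loadings
    # Thurstone (regression) factor scores: Z R^-1 loadings
    scores = Z @ np.linalg.solve(R, loadings)
    return loadings, scores

# Simulated data: 26 populations, 9 SNPs driven by one latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=26)
true_loadings = np.linspace(0.5, 0.9, 9)
noise = rng.normal(size=(26, 9)) * np.sqrt(1.0 - true_loadings ** 2)
X = np.outer(latent, true_loadings) + noise

loadings, scores = one_factor_paf(X)
print(np.round(loadings, 2))
```

The regression scores require an invertible correlation matrix, which is one practical reason for keeping the number of SNPs well below the number of populations.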

Most
of the alleles’ loadings (11/17) were in the right direction (positive),
p=0.166. These are reported
in Tables 5 and 6.
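
The reported p-value is consistent with a one-sided binomial (sign) test of 11 positive loadings out of 17 against a chance rate of 0.5. The text does not name the test, so this reading is an assumption; the function name is hypothetical.

```python
from math import comb

def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))

p_value = binomial_upper_tail(11, 17)
print(round(p_value, 3))  # 0.166
```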

The
two factors (obtained from the two separate and unlinked SNP sets) were highly
correlated with each other (r=0.96) and with the genetic population GI SNP
factor (from the 4 alleles) (r=-0.90 and -0.89, respectively).

Given the high correlation between the
factors extracted from the two sets of unlinked alleles, a composite factor was
obtained as the mean of the two vectors, to make the analysis more parsimonious
and less error-prone.

Figures 1a and 1b show scatterplots of
the factor scores for the ADHD (composite) and GI SNP factors. The correlations
between the two variables were r= -0.91 and -0.93 for the 4 and 6 SNPs factors,
respectively.

Figure 1a. Regression of ADHD SNP factor
on the 4 SNPs GI SNP factor.

Figure 1b. Regression of ADHD SNP factor
on the 6 SNPs GI SNP factor.

Visual inspection of the plot revealed
that the African groups were outliers. After removal of the African groups, the
correlations between the ADHD SNP composite factor and the 4 or 6 SNPs GI
factors were still negative and significant (respectively, r=-0.64 and r=-0.61,
N=19, p<0.05). This indicates that the relationship also persists across the
other continents.

Method of correlated vectors

The method of
correlated vectors consists of correlating the factor loadings of indicator
variables with the correlations between each indicator variable and the
criterion variable (see Kirkegaard, 2014). The method was employed to test the
hypothesis that SNPs with higher p-values would be more likely to be false
positives, hence if there has been natural selection, less selection signal
would be detected. The composite ADHD factor (Table 10, 5th col.) was a
criterion variable and the unlinked SNPs were correlated with it. Spearman
rank-order correlation was slightly negative (in line with predictions) but
nonsignificant: r= -0.26 (N=17, p=0.31). Using all the 42 SNPs, the Spearman
rank-order correlation was r= 0.37 and significant (p= .015). This relationship
is plotted in Figure 2.
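
The rank-order step of this check can be sketched as below: each SNP's GWAS p-value is paired with its relationship to the criterion factor, and the two vectors are correlated by rank. The values are invented for illustration and the function names are hypothetical; ties receive their mean rank.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1 to n, giving tied values their mean rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])

# Hypothetical values for illustration only (not the study's data).
p_values = [2e-6, 5e-6, 9e-6, 1e-5, 2e-5, 4e-5]
loadings = [0.85, 0.70, 0.75, 0.40, 0.10, -0.20]
rho = spearman(p_values, loadings)
print(round(rho, 3))  # -0.943
```

In this toy example the SNPs with the smallest p-values relate most strongly to the factor, which is the pattern the hypothesis predicts.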

Correlation of population genetic factors with aggregate phenotypic measures

A systematic
review and meta-analysis concluded that “the large variability of ADHD/HD
prevalence rates worldwide resulted mainly from methodological differences
across studies” (Polanczyk et al., 2007). Since prevalence rates do not reflect
an underlying phenotype, correlating them with genetic scores would not be
useful. However, since there is a genetic correlation between GI and ADHD, and
natural selection has stronger effects upon higher-order constructs with
pervasive effects on life outcomes and survival, such as GI (Jensen, 1998; Gottfredson,
1997; Gordon, 1997; Gottfredson, 2004) we expect to find a correlation between
population-level ADHD genetic scores and aggregate measures of cognitive
capacity.

STUDY 2 (ALFRED dataset)

The 4 SNPs influencing general intelligence used by Piffer (2015a) were
searched for in ALFRED. When a SNP was not found in ALFRED, a SNP in close
linkage disequilibrium (r>0.8) was used instead. Linkage was computed with the SNAP calculator (http://www.broadinstitute.org/mpg/snap/), based on 1000 Genomes phase
1 CEU data. Corresponding SNPs are in brackets in the spreadsheet file.

None of the 3 new SNPs was found in ALFRED
with this method, apart from the one loading in the wrong direction
(rs17522122) in the 1000 Genomes analysis. Thus, analysis of ALFRED was limited
to the 4 SNPs.

Table 8 shows the factor loadings of the 4
GI-related SNPs in ALFRED.

Table 8. Factor loadings of 4
GI-related SNPs in ALFRED.

SNP             Factor loading
rs9320913 A     0.43
rs11584700 G    0.70
rs4851266 T     0.84
rs236330 C      0.72

Factor analysis of ADHD risk alleles

In total, 13
(out of the 42) SNPs were found in ALFRED (including those in close LD). Results
of factor analysis are reported in Table 9.

Table 9. Factor loadings of 13
ADHD SNPs in ALFRED.

SNP             Factor loading
rs507533.G      -0.22
rs6453417.T      0.89
rs7160641.T      0.81
rs4936536.G     -0.31
rs10026084.A     0.63
rs4673145.A     -0.24
rs2206922.T      0.72
rs1982863.T     -0.48
rs2841633.T      0.72
rs12668989.T     0.32
rs4635724.G     -0.79
rs4580847.G      0
rs3026685.T     -0.40

The factor was negatively correlated to
the GI SNP factor (r=-0.81, N=50).

Since the method of correlated vectors
showed an abundance of noise in the SNPs with lower significance, another
factor analysis was carried out including only the SNPs with the p-value in the
top half of the total 13 (N=6). Results are reported in Table 10.

Table 10. Factor loadings of the top 6 ADHD SNPs.

SNP             Factor loading
rs507533.G      -0.28
rs6453417.T      0.90
rs7160641.T      0.75
rs4936536.G     -0.30
rs10026084.A     0.65
rs4673145.A     -0.28

In accord with the hypothesis of stronger
selection signal among the genuine hits, the correlation between the extracted
factor and the GI SNP factor slightly improved over the all-inclusive factor
analysis (r= -0.9 vs -0.81). Factor scores are reported in Table 11.

Visual inspection of the regression plot
suggests that the distribution of scores is not driven by outliers as in the
1000 Genomes database but is fairly uniform across continental clusters and
populations (Figure 3).

Accordingly, Fst distances published by
Piffer (2015c) were used in a multiple linear regression with ADHD factor
distances to predict IQ distances between populations. After list-wise deletion
there were 253 cases (NA=72). Standardized Betas were 0.168 and 0.483 for Fst
and ADHD factor, respectively.
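
The case count is consistent with pairwise distances among the 1000 Genomes populations: 26 populations give 26 × 25 / 2 = 325 pairs, and dropping the 3 populations without IQ data leaves 23 × 22 / 2 = 253 pairs, hence 72 missing. The sketch below shows one way such a distance regression could be set up. It assumes distances are absolute differences between population scores, uses simulated values throughout, and its function names are hypothetical.

```python
import numpy as np

def pairwise_abs_diff(values):
    """Absolute differences for all unordered pairs of populations."""
    v = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(v), k=1)
    return np.abs(v[i] - v[j])

def zscore(a):
    """Standardize to mean 0 and unit (sample) standard deviation."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean()) / a.std(ddof=1)

def standardized_betas(y, *predictors):
    """OLS coefficients after z-scoring the outcome and every predictor."""
    X = np.column_stack([zscore(p) for p in predictors])
    coef, *_ = np.linalg.lstsq(X, zscore(y), rcond=None)
    return coef

# Simulated inputs for illustration only (not the study's data).
rng = np.random.default_rng(1)
n_pop = 23                                    # 26 populations minus 3 without IQ
iq = rng.normal(90, 10, n_pop)
adhd_factor = rng.normal(size=n_pop)
iq_dist = pairwise_abs_diff(iq)
adhd_dist = pairwise_abs_diff(adhd_factor)
fst = rng.uniform(0, 0.2, size=iq_dist.size)  # hypothetical pairwise Fst

betas = standardized_betas(iq_dist, fst, adhd_dist)
print(iq_dist.size, np.round(betas, 3))
```

Because the outcome and the predictors are all z-scored, no intercept is needed and the coefficients are directly comparable as standardized Betas. Pairwise distances are not independent observations, so significance tests on such a regression should be read with caution.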

The 4 and 6 SNPs g factor distances were also employed
as dependent variables to assess the predictive power of ADHD distances net of
population structure. Fst and ADHD had similar Betas: 0.409 and 0.443 for the
4 SNPs factor, and 0.5 and 0.429 for the 6 SNPs factor.

Discussion

Factor analysis of allele frequencies
obtained from GWAS hits by independent studies on different but partly
overlapping phenotypes (ADHD and GI) shows that they follow an inverse spatial
distribution.

Robustness (reliability) of the findings
was provided by the internal consistency of the measures: factor analysis of
two sets of unlinked ADHD risk alleles (using 1000 Genomes) located on different
chromosomes yielded two very similar factors (r=0.96). Moreover, most of the
factor loadings (11/17) were positive (in the right direction; p=0.17).
Rietveld’s and Davies’ three top hits were related to two distinct phenotypes
(educational attainment and fluid intelligence, respectively), yet the two sets
of alleles produced similar frequency patterns (r=0.79). Moreover, the effect
on educational attainment of a genomic region on chromosome 6 was replicated by
Davies for fluid intelligence.

Validity was confirmed by the strong
negative correlation of the factor of ADHD SNPs with a factor extracted from an
independent set of 6 GI SNPs produced by several studies, with frequencies
obtained from the ALFRED and 1000 Genomes datasets, comprising 50 and 26
populations respectively. GWAS hits with a lower p-value had a stronger
relationship with the ADHD SNP factor, confirming the hypothesis that a
stronger signal of selection would be found in alleles that are likely to be
true positives. Moreover, the factor was negatively related to average
phenotypic country/ethnic GI.

The 6 GI-related SNPs were not in linkage
disequilibrium (LD) with the 17 unlinked ADHD alleles, so the correlation
between the two factors is not confounded by LD. However, two of them
(GI-related rs11584700 and ADHD risk rs2802837) were in nearby genomic regions
(around 1.2 Mbp apart), on chromosome 1.

Another noteworthy finding is that groups
which are the most distant genetically (Africans and Oceanians) appear to have
the most similar genetic scores for cognitive traits, whereas genetically
similar groups (East Asians and Native Americans) differ by almost a standard
deviation (Tables 15 and 16). This fact is hard to square with an explanation
of the results in terms of genetic drift and instead favours a model of
differential selection pressures faced by isolated populations living in
different environments.

We controlled for the effect of population
structure using the method outlined by Piffer (2015c). This showed that although
the ADHD factor was a stronger predictor of IQ distances between populations
than Fst, it lost part of its predictive power when Fst was included (Beta=
0.483). Moreover, it did not have more predictive power of GI factor distances
than Fst. Together, these results are weaker than those found by Piffer (2015c) for the 4
SNPs g factor, which retained all its predictive power on IQ distances
when regressed with Fst.

We see at least two possible explanations
for the general results: 1) The SNPs found by the ADHD GWAS are in reality
GI-increasing alleles and have no specific effect on ADHD. 2) ADHD and GI have
undergone opposite selection pressure, that is ancestral environments that
selected for higher GI also selected against behavioral predispositions to
ADHD, such as inattention and impulsivity. In such a hypothetical scenario,
harsher environments could have placed more demands on focused attention and
problem solving skills that resulted in survival and fitness differentials
among carriers of different alleles.

Obviously, a third possible explanation is
that the ADHD hits were not under selection and represent population structure just
as any other random set of SNPs. The not entirely satisfying results of
regressing these with Fst distances do not allow us to rule out this
possibility.

Another limitation of this study is that
ADHD prevalence rates are too affected by different diagnostic criteria to
provide a universal measure that can be used for a cross-cultural comparison.

Future research

Future studies
should replicate the present analysis when new GI and ADHD SNPs are reported.

Piffer, D. & Lynn, R. (2014). New
evidence for differences in fluid intelligence between North and South Italy
and against school resources as an explanation for the north-south IQ
differential. Intelligence, 46: 246-249.

Piffer, D. (2015a). Estimating the
genotypic intelligence of populations and assessing the impact of socioeconomic
factors and migrations. The Winnower, 2:e142299.93508. DOI:10.15200/winn.142299.93508