CESR Conference on Polygenic Prediction and its Application in the Social Sciences

The Center for Economic and Social Research at USC hosted the largest conference ever convened on the use of genetic data in the social sciences. More than 90 researchers from a wide range of disciplines discussed how polygenic scores (measures constructed from genetic variants across the genome) help predict behavioral traits.

On April 13 and 14, 2017, the Center for Economic and Social Research (CESR) at the University of Southern California hosted a conference on the use of genetic data in the social sciences. The conference brought together more than 90 researchers: economists, psychologists, sociologists, political scientists, and geneticists. A central focus of the meeting was polygenic scores (PGS), measures constructed from genetic variants across the genome that can be used to predict behavioral traits such as educational attainment, happiness, depression, and fertility. It was the largest conference ever convened on this topic and featured cutting-edge research from many of the leading researchers in social-science genomics. This blog post summarizes the conference presentations. The presentations themselves can be viewed here: https://www.thessgac.org/polygenic-conference.

The conference was co-organized by Daniel Benjamin, Director of the Behavioral Health Genetics Center at CESR; Titus Galama, Director of the Center for Study of Inequality at CESR; David Cesarini, Associate Professor of Economics at NYU; and Ciji Davis, CESR Center Assistant.

The conference began with a crash course on polygenic scores, presented by Daniel Benjamin. The human genome is a sequence of 3 billion nucleotide pairs. Some of these pairs, called single nucleotide polymorphisms (SNPs), differ across individuals. Genome-wide association studies (GWAS) have identified certain SNPs that are associated with specific behavioral traits. Researchers aggregate the effects of SNPs to create a “polygenic score” (PGS), which represents a composite genetic propensity for the trait. A potential problem arises when adding up the effects of SNPs since many SNPs are correlated with each other. This can result in double counting, which reduces the predictive power of the PGS. Aysu Okbay presented various methods to address this issue. One important lesson from this session is that while current PGSs are starting to reach substantively meaningful levels of predictive power in European-descent populations, they have much lower predictive power in non-European-descent populations.
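To make the idea of aggregating SNP effects concrete, here is a minimal sketch of a PGS as a weighted sum of allele counts. The function name, the three SNPs, and the weights are hypothetical illustrations, not values from any GWAS, and the sketch deliberately ignores the adjustment for correlated SNPs discussed above.

```python
def polygenic_score(allele_counts, weights):
    """Return the weighted sum of allele counts across SNPs.

    allele_counts: copies of the effect allele a person carries at each SNP (0, 1 or 2).
    weights: estimated effect of each SNP on the trait (hypothetical values here).
    """
    if len(allele_counts) != len(weights):
        raise ValueError("need exactly one weight per SNP")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


# Hypothetical effect sizes for three SNPs (assumption, for illustration only).
weights = [0.02, -0.01, 0.03]

# Two hypothetical people, described by their allele counts at those SNPs.
person_a = [2, 0, 1]
person_b = [0, 2, 0]

score_a = polygenic_score(person_a, weights)
score_b = polygenic_score(person_b, weights)
print(score_a, score_b)
```

In practice a score sums over a very large number of SNPs, and the weights are adjusted so that correlated SNPs are not double counted.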

After the introduction of these central analytical concepts, the next session brought attention to new GWAS results and the PGSs that can be constructed from them. Robbee Wedow presented the predictive power of the latest PGS for educational attainment. This PGS is currently one of the strongest genetic predictors of any trait, behavioral or otherwise. The predictive power of the best available PGS for educational attainment is rapidly improving: as the discovery sample grew from 101k individuals in 2013 to 294k in 2016 and to ~800k in ongoing research, the share of variation in educational attainment explained by the PGS rose from ~2 percent to ~6 percent and now stands at ~11 percent. Still, much of the heritability remains unexplained. (Heritability can be thought of as the extent to which genetics contributes to variation across people in a trait.) This suggests that even larger samples will lead to PGSs with even greater predictive power, highlighting a promise of continued substantial advancement in social science genomics.
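"Predictive power" here means the share of variation in a trait that the PGS accounts for. One common way to measure it, assumed in this sketch, is the squared correlation (R²) between the score and the trait. The data below are invented toy values and are not meant to reproduce the percentages reported at the conference.

```python
def r_squared(scores, trait):
    """Squared Pearson correlation between a score and a trait."""
    n = len(scores)
    mean_s = sum(scores) / n
    mean_t = sum(trait) / n
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(scores, trait))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_t = sum((t - mean_t) ** 2 for t in trait)
    return cov * cov / (var_s * var_t)


# Toy data (assumption): standardized PGS values and years of schooling
# for five hypothetical people.
pgs = [-1.2, -0.5, 0.0, 0.4, 1.3]
years_of_schooling = [12, 14, 12, 16, 16]

share_explained = r_squared(pgs, years_of_schooling)
print(f"Share of variation explained: {share_explained:.2f}")
```

A value of 0.11 on this scale would correspond to the ~11 percent figure quoted above.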

The predictive power of the PGS was sometimes found to differ across samples, or cohorts. Meghan Zacher delved into these differences in her presentation. She reported that, for educational attainment specifically, average educational attainment in each country, the age of the sample, and the number of response options help to account for observed differences in the predictive power of the PGS. Some of these differences are suggestive of interesting gene-by-environment (GxE) interactions, i.e., environmental influences on how genes matter.

The third session concentrated on the role of interactions between genetic propensities and the environment. Nature and nurture complement each other and are not separable, with genetic influence depending on environment. Two presentations examined how the effects of an additional year of schooling differ across individuals who differ in their PGS for educational attainment, where a change in the compulsory schooling age is used to isolate the causal effect of an additional year of schooling. Sven Oskarsson presented findings using data from Sweden before and after a major schooling reform. The results show that the positive effects of completing schooling, including increased years of schooling and a higher income, were highest for high-PGS individuals, while the positive effects of completing junior high were highest for low-PGS individuals. Patrick Turley presented findings from the UK's 1972 compulsory schooling law, showing that individuals with a high PGS for BMI were more likely to drop out at age 16, while individuals with a high PGS for educational attainment were less likely to drop out at age 16.

In the fourth session, Melinda Mills presented a new GWAS on two measures of fertility: age at first birth and number of children ever born. Her team discovered twelve genetic variants that are statistically significantly associated with fertility. A PGS constructed from the GWAS results explains about one percent of the variability in age at first birth. A one-standard-deviation decrease in the PGS is associated with approximately a half-year delay in age at first birth for women and a third of a year delay for men. She also found that a one-standard-deviation increase in the PGS for number of children ever born is associated with an approximately 9% lower likelihood of never having children for women (there is no association for men). Her team used a variety of analytical methods in evaluating this relationship, including Multi-Trait Analysis of GWAS, or MTAG. MTAG is a new method for combining GWAS results across multiple related traits to increase power for finding genetic associations with each of the traits without having to increase the sample size for any of the individual traits. Raymond Walters presented the theory and an application of MTAG to three traits: subjective well-being, depressive symptoms, and neuroticism.

In the fifth session, Dan Belsky examined how genetics influence the development of socioeconomic status and mobility. In his analysis of socioeconomic status, Dan found that the educational attainment PGS is on average slightly lower for children born into a lower SES family, compared with those born into higher SES families. Dan found similar trends for neighborhood: those with a higher educational attainment PGS tend to move to less disadvantaged neighborhoods at a modestly higher rate than those with a lower educational attainment PGS.

In political science, educational attainment has consistently been shown to predict political participation. In his presentation, Chris Dawes extended this literature by presenting evidence that the PGS for educational attainment is a predictor of political participation. One theory for this relationship suggests that education provides individuals with resources that allow them to become engaged in politics more easily than those with lower levels of educational attainment. Chris’s finding that education mediates approximately half of the PGS’s effect on political participation supports this theory.

In the sixth session, Riccardo Marioni presented research showing a link between the PGS for educational attainment and longevity. Specifically, he found that those with a higher PGS have longer-lived parents (where parental longevity is used as a proxy for the individual’s longevity). There could be a number of pathways through which this relationship operates. For example, perhaps the same behaviors that lead to greater education also lead to improved quality of life, which in turn increases longevity. Another potential pathway for this relationship is that educational-attainment-related genes are directly linked to longevity (in addition to educational attainment).

Jonathan Beauchamp presented evidence in favor of natural selection (here, simply meaning changes in gene frequencies associated with an increased number of offspring) in recent history. Using data from the Health and Retirement Study, he found that a higher PGS for educational attainment is associated with having fewer children among US respondents born in the 1950s and 1960s, which implies that the next generation will on average have lower values of the PGS. However, this effect is very small and is swamped by environmental and institutional changes that have led to increases in educational attainment over time.

Typical epidemiological studies provide correlational evidence between behaviors and health outcomes. Mendelian Randomization is a method for inferring the causal effect of a behavior on health by using genes as instrumental variables. Using this method, Taavi Tillmann presented findings that educational attainment may have a causal effect on coronary heart disease. One of the major assumptions in Mendelian Randomization is that each gene affects coronary heart disease only through its effect on educational attainment, but a common concern is that single genes may affect multiple traits. George Davey Smith discussed new methods for conducting Mendelian Randomization that relax this assumption.
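The instrumental-variable logic can be sketched with the Wald ratio, one standard Mendelian Randomization estimator (not necessarily the one used in the work presented): divide the gene-outcome association by the gene-exposure association. Everything below is a constructed toy example. The variable names, the "true" effect of -0.3, and the unobserved confounder are assumptions chosen so that the naive regression is visibly biased while the ratio recovers the built-in effect.

```python
def slope(x, y):
    """Ordinary least-squares slope from regressing y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var


# Toy data (assumption): allele counts and an unobserved confounder that is
# uncorrelated with the gene in this sample.
gene = [0, 0, 1, 1, 2, 2]
confounder = [-1, 1, -1, 1, -1, 1]

# Exposure (e.g. years of schooling) depends on the gene and the confounder.
schooling = [12 + 0.5 * g + u for g, u in zip(gene, confounder)]

# Outcome (a hypothetical disease-risk score) depends on schooling with a
# built-in causal effect of -0.3, and on the confounder directly.
risk = [5 - 0.3 * s - 1.0 * u for s, u in zip(schooling, confounder)]

naive_estimate = slope(schooling, risk)  # distorted by the confounder
wald_ratio = slope(gene, risk) / slope(gene, schooling)

print(f"naive: {naive_estimate:.2f}, Wald ratio: {wald_ratio:.2f}")
```

The ratio is only trustworthy if the gene affects the outcome solely through the exposure, which is exactly the assumption questioned above.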

The final session further explored the interaction between genetic propensities and the environment. Laura Bierut presented on GxE interplay in smoking, an example of a trait heavily influenced by genetics. She presented evidence that individuals who grew up in a low SES household and had a high genetic risk for smoking (as measured by a PGS constructed from a GWAS on smoking) smoked about 20 percent more cigarettes per day in later life than those from a low SES household with a low PGS. However, for high-risk individuals from a high SES household, this genetic gradient vanishes: they smoke as many cigarettes per day as low-risk individuals from a high SES household. This shows that GxE interplay can be substantial, fully eliminating the genetic propensity for individuals from high SES backgrounds in this example.
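The GxE pattern described above amounts to comparing the genetic gradient (high-PGS minus low-PGS) across environments. The sketch below does that arithmetic on a hypothetical 2x2 table. Only the roughly 20 percent gap in low SES households and the absence of a gap in high SES households come from the presentation summary; the cigarette counts themselves are invented for illustration.

```python
# Hypothetical mean cigarettes per day by childhood SES and smoking PGS.
# The levels (10, 12, 8) are assumptions; only the ~20% gap and the
# vanishing gradient reflect the findings summarized in the text.
mean_cigarettes = {
    ("low_ses", "low_pgs"): 10.0,
    ("low_ses", "high_pgs"): 12.0,
    ("high_ses", "low_pgs"): 8.0,
    ("high_ses", "high_pgs"): 8.0,
}


def genetic_gradient(ses):
    """Difference in mean smoking between high- and low-PGS groups."""
    return mean_cigarettes[(ses, "high_pgs")] - mean_cigarettes[(ses, "low_pgs")]


gradient_low_ses = genetic_gradient("low_ses")
gradient_high_ses = genetic_gradient("high_ses")

# The interaction is how much the genetic gradient changes with environment.
interaction = gradient_low_ses - gradient_high_ses

print(gradient_low_ses, gradient_high_ses, interaction)
```

A nonzero interaction means the PGS matters differently depending on the environment, which is what is meant by GxE interplay.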

Kevin Thom then presented findings on the link between wealth and the PGS for education. He found that even after controlling for one’s own and parental education, a one-standard-deviation increase in the PGS is associated with a 10% increase in wealth. He argued that this is likely due to differences in stock market participation.

Lauren Schmitz continued this theme of GxE interplay with an example where the PGS for educational attainment moderated the post-war educational attainment of Vietnam conscripts. Specifically, she found that the impact of the Vietnam draft on schooling depended on the PGS for education. Vietnam vets (those who got unlucky in the randomized draft lottery) with a PGS 1-2 standard deviations below the mean were approximately 70-90% less likely to get a postsecondary degree than those with a similar PGS who were not drafted. No significant differences were found among those with an average or higher-than-average PGS.

Social-science genomics is a young, exciting, and fast-developing field. The conference highlighted how, by leveraging recent advances in the collection and analysis of genetic data, researchers have made significant progress in understanding the role of genetics in various social-science outcomes. We are only at the very beginning of these developments, and much is still in store. We invite you to take a moment to view some of the presentations.