Abstract

Mendelian randomization (MR) is widely used to identify causal relationships among heritable traits, but it can be confounded by genetic correlations reflecting shared etiology. We propose a model in which a latent causal variable (LCV) mediates the genetic correlation between two traits. Under this model, trait 1 is fully genetically causal for trait 2 if it is perfectly genetically correlated with the latent variable, and partially genetically causal for trait 2 if the latent variable has a larger effect on trait 1 than on trait 2. By comparing the sizes of these effects, we define the genetic causality proportion (gcp), which is equal to 1 when trait 1 is fully genetically causal for trait 2. We fit this model using the mixed fourth moments $E(\alpha_1^2 \alpha_1 \alpha_2)$ and $E(\alpha_2^2 \alpha_1 \alpha_2)$ of the marginal effect sizes for each trait, exploiting the fact that if trait 1 is causal for trait 2, then SNPs with large effects on trait 1 will have correlated effects on trait 2, but not vice versa. We performed simulations under a wide range of genetic architectures and determined that LCV, unlike state-of-the-art MR methods, produced well-calibrated false positive rates and reliable gcp estimates in the presence of genome-wide genetic correlations and asymmetric genetic architectures. We applied LCV to GWAS summary statistics for 52 traits (average N=326k), identifying statistically significant genetically causal effects (1% FDR) for 63 pairs of traits. Results consistent with the published literature included causal effects on myocardial infarction (MI) for LDL, triglycerides and BMI. Novel findings included an effect of LDL on bone mineral density, consistent with clinical trials of statins in osteoporosis. Our results demonstrate that it is possible to distinguish between correlation and causation using genetic data.
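
The asymmetry that the mixed fourth moments exploit can be illustrated with a toy simulation. The sketch below is not the LCV estimator itself: it assumes a simplified generative model in which each trait's per-SNP effect is a scaled copy of a sparse latent effect plus independent trait-specific noise, and it ignores linkage disequilibrium and sampling error. The function names, the parameters `q1`, `q2` and `p_causal`, and the sparse-normal effect distribution are all assumptions made for illustration.

```python
import numpy as np


def simulate_effects(n_snps, q1, q2, p_causal=0.1, seed=0):
    """Toy per-SNP effects for two traits sharing one latent variable.

    q1 and q2 (hypothetical names) are the effects of the latent variable
    on trait 1 and trait 2; both traits have unit effect-size variance.
    """
    rng = np.random.default_rng(seed)
    # Sparse, heavy-tailed latent effects with unit variance (assumed shape).
    nonzero = rng.random(n_snps) < p_causal
    pi = np.where(nonzero, rng.standard_normal(n_snps) / np.sqrt(p_causal), 0.0)
    # Independent trait-specific effects fill the remaining variance.
    gamma1 = np.sqrt(1.0 - q1**2) * rng.standard_normal(n_snps)
    gamma2 = np.sqrt(1.0 - q2**2) * rng.standard_normal(n_snps)
    return q1 * pi + gamma1, q2 * pi + gamma2


def mixed_fourth_moments(alpha1, alpha2):
    """Sample versions of E(a1^2 * a1 * a2) and E(a2^2 * a1 * a2)."""
    m1 = np.mean(alpha1**2 * alpha1 * alpha2)
    m2 = np.mean(alpha2**2 * alpha1 * alpha2)
    return m1, m2


# Trait 1 fully genetically causal for trait 2: q1 = 1, q2 < 1.
alpha1, alpha2 = simulate_effects(200_000, q1=1.0, q2=0.5)
m1, m2 = mixed_fourth_moments(alpha1, alpha2)
print(f"E(a1^3 a2) ~ {m1:.2f}, E(a2^3 a1) ~ {m2:.2f}")
```

In this toy setting the first moment exceeds the second when `q1 > q2`, because SNPs with large effects on trait 1 are weighted most heavily and also carry proportional effects on trait 2. Swapping `q1` and `q2` reverses the ordering.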