Principal Investigator: Professor Shi Huang

1a: Genetic variation among humans exists mostly in the form of SNPs, which may in part underlie the quantitative variation in most complex human traits. Our hypothesis is that most common SNPs may be functional and that the minor alleles of SNPs may be deleterious to most complex traits and hence negatively selected. We have developed novel methods to examine the role of minor alleles in complex traits and diseases. In this project, we propose to examine the role of minor alleles of common SNPs in various human traits using data from the UK Biobank. As a start, we will examine associations of all minor alleles of common SNPs in the database with several complex human diseases and traits, including Parkinson’s disease, educational attainment, and learning and memory. The objective is to identify specific sets of minor alleles that underlie variability in quantitative traits related to common chronic diseases and social network traits. Results of the proposed analyses will be reported in peer-reviewed journals.

1b: Our project is consistent with the UK Biobank’s goal of understanding the genetic basis of complex human traits and diseases.

1c: Briefly, our research will involve several steps. First, we will download the SNP genotyping data for all relevant samples, together with the phenotype and trait data for these samples. Second, we will perform a principal component analysis to identify a homogeneous population and exclude outliers from our study. Third, we will identify the minor alleles of all SNPs for a specific population group and calculate the minor allele content (MAC) of each individual in the population. Finally, we will determine whether any trait or disease is associated with the MAC value. We will further determine whether a subset of minor alleles can be identified for a specific trait, with the goal of obtaining different subsets of minor alleles for different traits and diseases.
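
The third and fourth steps above can be illustrated with a minimal sketch. It is not the project's actual pipeline: it assumes genotypes are coded as counts (0, 1, 2) of an arbitrary alternate allele, that MAC is normalised by the number of allele copies called for each individual, and that a simple Pearson correlation stands in for the association test. The function name, the toy genotype matrix, and the trait values are all hypothetical.

```python
import numpy as np


def minor_allele_content(genotypes):
    """Return the minor allele content (MAC) of each individual.

    genotypes: array of shape (n_individuals, n_snps) holding counts
    (0, 1 or 2) of the alternate allele, with np.nan for missing calls.
    Assumption: MAC = minor alleles carried / (2 * number of called SNPs).
    """
    g = np.asarray(genotypes, dtype=float)

    # Alternate-allele frequency of each SNP within this population group.
    alt_freq = np.nanmean(g, axis=0) / 2.0

    # Where the alternate allele is the major allele, flip the coding so
    # that every count refers to the minor allele of that SNP.
    flip = alt_freq > 0.5
    minor_counts = np.where(flip, 2.0 - g, g)

    # Sum minor alleles per individual, ignoring missing calls.
    called = ~np.isnan(minor_counts)
    total_minor = np.nansum(minor_counts, axis=1)
    return total_minor / (2.0 * called.sum(axis=1))


# Toy data (hypothetical): 4 individuals, 3 SNPs.
genotypes = np.array([
    [0, 2, 1],
    [0, 2, 0],
    [1, 1, 0],
    [0, 2, 2],
])
mac = minor_allele_content(genotypes)

# Hypothetical quantitative trait; Pearson correlation as a stand-in
# for the association analysis between MAC and the trait.
trait = np.array([10.0, 12.0, 7.0, 8.0])
r = np.corrcoef(mac, trait)[0, 1]
print(mac, r)
```

In this sketch the minor allele is defined within the population group being analysed, so the same SNP could be flipped in one group and not in another; a real analysis would also need covariates and a proper significance test.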

1d: Our research aims to study complex traits and diseases. We are therefore interested in all the datasets collected by the UK Biobank, and we expect the full cohort to be most useful for our purpose.