Genetics and Genomics Study of Osteoporosis

Personnel

Research focuses

Our research focuses on the genetic dissection of complex human diseases using state-of-the-art multi- and interdisciplinary approaches that combine genomic technologies with statistical and bioinformatics methods. The complex disease we focus on is osteoporosis, a prevalent, debilitating disorder characterized by bone fragility and an increased risk of low-trauma fractures. Our approaches include genome-wide association analyses, genome-wide transcriptome analyses, proteome-wide protein expression profiling, and in vivo and in vitro functional analyses of specific genes of interest.

We are also extending our research from the genome to the epigenome; for example, we are performing epigenome-wide profiling of osteoporosis at the micro-RNA, DNA methylation, and histone modification levels. In particular, we are interested in developing novel statistical methods and bioinformatics tools for analyzing and managing large, complex datasets in genomic and epigenomic research.

Through the Louisiana Osteoporosis Study (LOS), we are building a large research cohort and database for studies of complex human diseases. The LOS will enroll >20,000 subjects of different ethnicities in New Orleans and its surrounding areas. Each subject will be phenotyped for body composition (including bone mineral density, lean mass, and fat mass), muscle function, blood pressure, and other traits, and surveyed for important health-related and lifestyle information. Blood samples will be collected for extraction of DNA, RNA, and proteins, and for cell isolation and biobanking. The LOS will become a sample pool for selecting subjects with extreme phenotypes (e.g., extremely high vs. low bone mass) for our ongoing funded and future genomic and epigenomic studies.
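As a rough illustration of extreme-phenotype selection, the sketch below ranks subjects by a single trait and returns the lowest and highest tails. It is not the LOS selection protocol: the function name `select_extremes`, the `tail_fraction` parameter, the subject IDs, and the bone mineral density values are all hypothetical, and a real selection would typically adjust for covariates such as age, sex, and ethnicity first.

```python
def select_extremes(trait_by_subject, tail_fraction=0.1):
    """Return (low_ids, high_ids): subjects in the bottom and top tails of a trait.

    trait_by_subject maps a subject ID to a numeric trait value
    (here, a made-up bone mineral density). Hypothetical helper.
    """
    if not 0 < tail_fraction <= 0.5:
        raise ValueError("tail_fraction must be in (0, 0.5]")
    # Rank subject IDs from lowest to highest trait value.
    ranked = sorted(trait_by_subject, key=trait_by_subject.get)
    # Number of subjects per tail; at least one when the cohort is non-empty.
    k = max(1, int(len(ranked) * tail_fraction))
    return ranked[:k], ranked[-k:]


# Toy cohort with invented values, for illustration only.
cohort = {
    "S01": 0.62, "S02": 0.95, "S03": 1.31, "S04": 0.88, "S05": 1.02,
    "S06": 0.71, "S07": 1.18, "S08": 0.99, "S09": 1.07, "S10": 0.84,
}
low_bmd, high_bmd = select_extremes(cohort, tail_fraction=0.2)
```

Ranking rather than fixed thresholds keeps the two groups the same size, which is often convenient for case-control style comparisons between the tails.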

Whole genome sequencing studies are essential to obtain a comprehensive understanding of the vast pattern of human genomic variations. Here we report the results of a high-coverage whole genome sequencing study of 44 unrelated healthy Caucasian adults, each sequenced to over 50-fold coverage (averaging 65.8x). We identified approximately 11 million single nucleotide polymorphisms (SNPs), 2.8 million short insertions and deletions (indels), and over 500,000 block substitutions. We showed that, although previous studies, including the 1000 Genomes Project Phase 1 study, have catalogued the vast majority of common SNPs, many low-frequency and rare variants remain undiscovered. For instance, approximately 1.4 million SNPs and 1.3 million short indels were novel to both the dbSNP and the 1000 Genomes Project Phase 1 data sets. On average, each individual genome carried approximately 330 loss-of-function variants that resulted in protein truncation, frameshift change, or the loss of a stop codon. Of these, an average of 103 occurred in a homozygous state in each individual genome, which would completely "knock out" the annotated genes. Interestingly, while the majority of these genes were "knocked out" in just one or two individual genomes, a number of genes, mainly related to antigen processing and immune response, were frequently "knocked out" in the general population. Our results contribute towards a comprehensive characterization of human genomic variation, especially for less-common and rare variants, and provide an invaluable resource for future genetic studies of human variation and diseases.
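The distinction between genes "knocked out" in one or two genomes and genes "knocked out" in many can be made concrete with a small tally. The sketch below is not the study's pipeline: it assumes loss-of-function calls have already been reduced to `(sample, gene, genotype)` tuples, and the function name `knockout_summary`, the genotype labels `"hom"`/`"het"`, and the sample and gene names are all hypothetical.

```python
from collections import Counter


def knockout_summary(lof_calls):
    """Summarize homozygous loss-of-function (LoF) calls.

    lof_calls: iterable of (sample_id, gene, genotype) tuples, where genotype
    is "hom" or "het" (assumed labels). Returns a dict mapping each sample to
    the set of genes it carries a homozygous LoF variant in, and a Counter of
    how many samples have each gene knocked out.
    """
    genes_per_sample = {}
    for sample, gene, genotype in lof_calls:
        if genotype == "hom":  # only homozygous LoF counts as a knock-out here
            genes_per_sample.setdefault(sample, set()).add(gene)
    # Count each gene once per sample, even if several variants hit it.
    samples_per_gene = Counter(
        gene for genes in genes_per_sample.values() for gene in genes
    )
    return genes_per_sample, samples_per_gene


# Invented calls, for illustration only.
calls = [
    ("ind1", "GENE_A", "hom"), ("ind1", "GENE_A", "hom"), ("ind1", "GENE_B", "het"),
    ("ind2", "GENE_A", "hom"), ("ind2", "GENE_C", "hom"),
    ("ind3", "GENE_A", "hom"), ("ind3", "GENE_B", "het"),
]
per_sample, per_gene = knockout_summary(calls)
# Genes knocked out in every sample that has at least one knock-out.
common = sorted(g for g, n in per_gene.items() if n == len(per_sample))
```

Using a set per sample means two homozygous variants in the same gene count as a single knocked-out gene for that individual.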