We have generated a next-generation whole-exome sequencing data set of 2628 participants of the population-based Rotterdam Study cohort, comprising 669 737 single-nucleotide variants and 24 019 short insertions and deletions. Because of the broad and deep longitudinal phenotyping of the Rotterdam Study, this data set permits extensive interpretation of the effects of genetic variants on a range of clinically relevant outcomes, and is accessible as a control data set. We show that next-generation sequencing data sets yield a large number of population-specific variants that are not captured by other available large sequencing efforts, namely ExAC, ESP, 1000G, UK10K, GoNL and DECODE...

Serum biomarker levels are associated with the risk of complex diseases. Here, we aimed to gain insights into the genetic architecture of biomarker traits which can reflect health status. We performed genome-wide association analyses for twenty serum biomarkers involved in organ function and reproductive health. A total of 9,961 individuals from the UK Household Longitudinal Study were genotyped using the Illumina HumanCoreExome array, and variants were imputed to the 1000 Genomes Project and UK10K haplotypes. We establish a polygenic heritability for all biomarkers, confirm associations of fifty-four established loci, and identify five novel, replicating associations at genome-wide significance...

Vitamin D insufficiency is common, correctable, and influenced by genetic factors, and it has been associated with risk of several diseases. We sought to identify low-frequency genetic variants that strongly increase the risk of vitamin D insufficiency and tested their effect on risk of multiple sclerosis, a disease influenced by low vitamin D concentrations. We used whole-genome sequencing data from 2,619 individuals through the UK10K program and deep-imputation data from 39,655 individuals genotyped genome-wide...

Massively parallel sequencing technologies provide great opportunities for discovering rare susceptibility variants involved in complex disease etiology via large-scale imputation and exome and whole-genome sequence-based association studies. Due to modest effect sizes, large sample sizes of tens to hundreds of thousands of individuals are required for adequately powered studies. Current analytical tools are obsolete when it comes to handling these large datasets. To facilitate the analysis of large-scale sequence-based studies, we developed SEQSpark, which implements parallel processing based on Spark to increase the speed and efficiency of performing data quality control, annotation, and association analysis...

Obesity is a genetically heterogeneous disorder. Using targeted and whole-exome sequencing, we studied 32 human and 87 rodent obesity genes in 2,548 severely obese children and 1,117 controls. We identified 52 variants contributing to obesity in 2% of cases including multiple novel variants in GNAS, which were sometimes found with accelerated growth rather than short stature as described previously. Nominally significant associations were found for rare functional variants in BBS1, BBS9, GNAS, MKKS, CLOCK and ANGPTL6...

By performing a meta-analysis of rare coding variants in whole-exome sequences from 4,133 schizophrenia cases and 9,274 controls, de novo mutations in 1,077 family trios, and copy number variants from 6,882 cases and 11,255 controls, we show that individuals with schizophrenia carry a significant burden of rare, damaging variants in 3,488 genes previously identified as having a near-complete depletion of loss-of-function variants. In patients with schizophrenia who also have intellectual disability, this burden is concentrated in risk genes associated with neurodevelopmental disorders...

The genetic features of isolated populations can boost power in complex-trait association studies, and an in-depth understanding of how their genetic variation has been shaped by their demographic history can help leverage these advantageous characteristics. Here, we perform a comprehensive investigation using 3,059 newly generated low-depth whole-genome sequences from eight European isolates and two matched general populations, together with published data from the 1000 Genomes Project and UK10K. Sequencing data give deeper and richer insights into population demography and genetic characteristics than genotype-chip data, distinguishing related populations more effectively and allowing their functional variants to be studied more fully...

A fundamental challenge in analyzing next-generation sequencing (NGS) data is to determine an individual's genotype accurately, as the accuracy of the inferred genotype is essential to downstream analyses. Correctly estimating the base-calling error rate is critical to accurate genotype calls. Phred scores that accompany each call can be used to decide which calls are reliable. Some genotype callers, such as GATK and SAMtools, directly calculate the base-calling error rates from phred scores or recalibrated base quality scores...
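The relationship between a Phred score and a base-calling error rate mentioned above can be sketched as follows. This is a minimal illustration of the standard Phred definition, Q = -10 log10(P), and not the method of this paper or the internals of GATK or SAMtools; the function names and the quality cutoff of 20 are assumptions chosen for the example.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability P (0 < P <= 1) to a Phred score Q = -10*log10(P)."""
    return -10 * math.log10(p)


def reliable_calls(bases, quals, min_q=20):
    """Keep only base calls whose Phred score is at least min_q.

    min_q=20 (a 1% error rate) is an illustrative cutoff, not one taken
    from the source text.
    """
    return [b for b, q in zip(bases, quals) if q >= min_q]


if __name__ == "__main__":
    # Hypothetical read fragment with per-base Phred scores.
    bases = "ACGT"
    quals = [30, 10, 20, 5]
    for b, q in zip(bases, quals):
        print(b, q, phred_to_error_prob(q))
    print(reliable_calls(bases, quals))
```

Under this definition, a score of 20 corresponds to a 1% chance that the base call is wrong and a score of 30 to 0.1%, which is why thresholds on the score can serve as a simple reliability filter.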

Deep sequence-based imputation can enhance the discovery power of genome-wide association studies by assessing previously unexplored variation across the common- and low-frequency spectra. We applied a hybrid whole-genome sequencing (WGS) and deep imputation approach to examine the broader allelic architecture of 12 anthropometric traits associated with height, body mass, and fat distribution in up to 267,616 individuals. We report 106 genome-wide significant signals that have not been previously identified, including 9 low-frequency variants pointing to functional candidates...

Ocular coloboma (OC) is a defect in optic fissure closure and is a common cause of severe congenital visual impairment. Bilateral OC is primarily genetically determined and shows marked locus heterogeneity. Whole-exome sequencing (WES) was used to analyze 12 trios (child affected with OC and both unaffected parents). This identified de novo mutations in 10 different genes in eight probands. Three of these genes encoded proteins associated with actin cytoskeleton dynamics: ACTG1, TWF1, and LCP1. Proband-only WES identified a second unrelated individual with isolated OC carrying the same ACTG1 allele, encoding p...

Osteoporosis is a common and debilitating bone disease that is characterised by low bone mineral density, typically assessed using dual-energy X-ray absorptiometry. Quantitative ultrasound (QUS), commonly utilising the two parameters velocity of sound (VOS) and broadband ultrasound attenuation (BUA), is an alternative technology used to assess bone properties at peripheral skeletal sites. The genetic influence on the bone qualities assessed by QUS remains an under-studied area. We performed a comprehensive genome-wide association study (GWAS) including low-frequency variants (minor allele frequency ≥0...

BACKGROUND AND AIMS: Familial hypercholesterolaemia (FH) is an autosomal-dominant disease with frequency of 1/500 to 1/250 that leads to premature coronary heart disease. New approaches to identify FH mutation-carriers early are needed to prevent premature cardiac deaths. In a cross-sectional study of the Avon Longitudinal Study of Parents and Children (ALSPAC), we evaluated the biochemical thresholds for FH screening in childhood, and modelled a two-stage biochemical and sequencing screening strategy for FH detection...

By moving essential body fluids and molecules, motile cilia and flagella govern respiratory mucociliary clearance, laterality determination and the transport of gametes and cerebrospinal fluid. Primary ciliary dyskinesia (PCD) is an autosomal recessive disorder frequently caused by non-assembly of dynein arm motors into cilia and flagella axonemes. Before their import into cilia and flagella, multi-subunit axonemal dynein arms are thought to be stabilized and pre-assembled in the cytoplasm through a DNAAF2-DNAAF4-HSP90 complex akin to the HSP90 co-chaperone R2TP complex...

PURPOSE: To characterize features associated with de novo mutations affecting SATB2 function in individuals ascertained on the basis of intellectual disability. METHODS: Twenty previously unreported individuals with 19 different SATB2 mutations (11 loss-of-function and 8 missense variants) were studied. Fibroblasts were used to measure mutant protein production. Subcellular localization and mobility of wild-type and mutant SATB2 were assessed using fluorescently tagged protein...

Isolated populations with enrichment of variants due to recent population bottlenecks provide a powerful resource for identifying disease-associated genetic variants and genes. As a model of an isolate population, we sequenced the genomes of 1463 Finnish individuals as part of the Sequencing Initiative Suomi (SISu) Project. We compared the genomic profiles of the 1463 Finns to a sample of 1463 British individuals that were sequenced in parallel as part of the UK10K Project. Whereas there were no major differences in the allele frequency of common variants, a significant depletion of variants in the rare frequency spectrum was observed in Finns when comparing the two populations...

Parkinson's disease (PD) is the most common neurodegenerative movement disorder, affecting 1% of the population over 65 years, and is characterized clinically by both motor and non-motor symptoms accompanied by the preferential loss of dopamine neurons in the substantia nigra pars compacta. Here, we sequenced the exomes of 244 Parkinson's patients selected from the Oxford Parkinson's Disease Centre Discovery Cohort and, after quality control, 228 exomes were available for analyses. The PD patient exomes were compared to 884 control exomes selected from the UK10K datasets...

Imputation using the 1000 Genomes haplotype reference panel has been widely adopted to estimate genotypes in genome-wide association studies. To evaluate imputation quality with a larger reference panel and a reference panel composed of different ethnic populations, we conducted imputations in the Framingham Heart Study and the North Chinese Study using a combined reference panel from the 1000 Genomes (N = 1,092) and UK10K (N = 3,781) projects. For rare variants with 0.01% < MAF ≤ 0...

Histone lysine methylation, mediated by mixed-lineage leukemia (MLL) proteins, is now known to be critical in the regulation of gene expression, genomic stability, cell cycle and nuclear architecture. Despite MLL proteins being postulated as essential for normal development, little is known about the specific functions of the different MLL lysine methyltransferases. Here we report heterozygous variants in the gene KMT2B (also known as MLL4) in 27 unrelated individuals with a complex progressive childhood-onset dystonia, often associated with a typical facial appearance and characteristic brain magnetic resonance imaging findings...