Migraine without aura is the most common form of migraine, characterized by recurrent disabling headache and associated autonomic symptoms. To identify common genetic variants associated with this migraine type, we analyzed genome-wide association data of 2,326 clinic-based German and Dutch individuals with migraine without aura and 4,580 population-matched controls. We selected SNPs from 12 loci with 2 or more SNPs associated with P values of <1 x 10(-5) for replication testing in 2,508 individuals with migraine without aura and 2,652 controls. SNPs at two of these loci showed convincing replication: at 1q22 (in MEF2D; replication P = 4.9 x 10(-4); combined P = 7.06 x 10(-11)) and at 3p24 (near TGFBR2; replication P = 1.0 x 10(-4); combined P = 1.17 x 10(-9)). In addition, SNPs at the PHACTR1 and ASTN2 loci showed suggestive evidence of replication (P = 0.01; combined P = 3.20 x 10(-8) and P = 0.02; combined P = 3.86 x 10(-8), respectively). We also replicated associations at two previously reported migraine loci in or near TRPM8 and LRP1. This study identifies the first susceptibility loci for migraine without aura, thereby expanding our knowledge of this debilitating neurological disorder.

Several risk factors for Crohn's disease have been identified in recent genome-wide association studies. To advance gene discovery further, we combined data from three studies on Crohn's disease (a total of 3,230 cases and 4,829 controls) and carried out replication in 3,664 independent cases with a mixture of population-based and family-based controls. The results strongly confirm 11 previously reported loci and provide genome-wide significant evidence for 21 additional loci, including the regions containing STAT3, JAK2, ICOSLG, CDKAL1 and ITLN1. The expanded molecular understanding of the basis of this disease offers promise for informed therapeutic development.

We propose a minimal protocol for exhaustive genome-wide association interaction analysis that involves screening for epistasis over large-scale genomic data, combining the strengths of different methods and statistical tools. The different steps of this protocol are illustrated on a real-life data application for Alzheimer's disease (AD) (2259 patients and 6017 controls from France). In particular, in the exhaustive genome-wide epistasis screening, we identified an AD-associated interacting SNP pair from chromosome 6q11.1 (rs6455128, the KHDRBS2 gene) and 13q12.11 (rs7989332, the CRYL1 gene) (p = 0.006, corrected for multiple testing). A replication analysis in the independent AD cohort from Germany (555 patients and 824 controls) confirmed the discovered epistasis signal (p = 0.036). This signal was also supported by a meta-analysis approach in 5 independent AD cohorts that was applied in the context of epistasis for the first time. Transcriptome analysis revealed negative correlation between expression levels of KHDRBS2 and CRYL1 in both the temporal cortex (β = -0.19, p = 0.0006) and cerebellum (β = -0.23, p < 0.0001) brain regions. This is the first time a replicable epistasis associated with AD has been identified using a hypothesis-free screening approach.

Identification of epistasis is a challenging task that, when successful, gives new clues to systems-level genetics where the complexity of the underlying biology of human disease can be better understood. Though many novel methods for detecting epistasis have been proposed and many studies for epistasis detection have been conducted, so far few studies can demonstrate replicable epistasis. In the present work, we propose a minimal protocol for exhaustive genome-wide association interaction (GWAI) analysis that involves screening for epistasis over large-scale genomic data, combining the strengths of different methods and statistical tools. The different steps of this protocol are illustrated on a real-life data application for Alzheimer’s disease (a large cohort of 2259 patients and 6017 controls from France). Using this protocol, we identified an AD-associated interacting SNP pair from chromosome 6q11.1 (rs6455128, the KHDRBS2 gene) and 13q12.11 (rs7989332, the CRYL1 gene) and male-specific epistasis between SNPs from chromosome 5q34 (rs729149 and rs3733980, the WWC1 gene) and 15q22.2 (rs9806612, rs9302230 and rs7175766, the TLN2 gene). The transcriptome analysis revealed negative correlation between expression levels of KHDRBS2 and CRYL1 in both the temporal cortex and cerebellum brain regions and positive correlation between the expression levels of CRYL1 and WWC1 in the temporal cortex brain region. A replication analysis strategy and a meta-analysis approach in independent data confirmed effects of some of the discovered interactions.

Objectives: Common genetic mutations that can be detected via a genome-wide association (GWA) study and at the same time have a strong contribution to disease risk are fairly limited. Some of the genetic variants in humans are either rare, thus more difficult to be identified, or they are common, but exert relatively small or even no individual effects that are masked or enhanced by one or several genes. The discovery of interacting genetic variants, possibly explaining part of the hidden genetic heritability, requires the development of sophisticated strategies and bioinformatics tools. Methods: In the present study, we propose a minimal protocol for genome-wide association interaction (GWAI) analysis that involves screening over large-scale genomic data in the search for epistatic or synergetic effects. The different steps of this minimal protocol are illustrated on a real-life data application for Alzheimer disease (AlzD) (large human cohort of 2,259 cases and 6,017 controls from France) and the pros and cons of the approaches are discussed. Results: Using the protocol, we identified two pairs of AlzD-associated interacting SNPs: from chromosome 6q11.1 and 13q12.11 and male-specific epistasis between SNPs from chromosome 5q34 and 15q22.2. Conclusion: In the present work we developed and applied an epistasis detection protocol to perform a comprehensive genome-wide search for AlzD-associated epistatic effects, hereby combining the strengths of different strategies, methods and statistical tools. It is the first time an epistasis study of this magnitude has been conducted in the context of AlzD. We show the advantages of viewing and analyzing data from different angles. A replication analysis strategy adapted to the epistasis detection context, as well as a meta-analytic approach confirmed effects of the discovered interactions. 
Apart from the biological and clinical importance, the present work offers a roadmap for future investigations in the field of epistasis detection and interpretation.

The regulation of proviral latency is a central problem in retrovirology. We postulate that the genomic integration site of human T lymphotropic virus type 1 (HTLV-1) determines the pattern of expression of the provirus, which in turn determines the abundance and pathogenic potential of infected T cell clones in vivo. We recently developed a high-throughput method for the genome-wide amplification, identification and quantification of proviral integration sites. Here, we used this protocol to test two hypotheses. First, that binding sites for transcription factors and chromatin remodelling factors in the genome flanking the proviral integration site of HTLV-1 are associated with integration targeting, spontaneous proviral expression, and in vivo clonal abundance. Second, that the transcriptional orientation of the HTLV-1 provirus relative to that of the nearest host gene determines spontaneous proviral expression and in vivo clonal abundance. Integration targeting was strongly associated with the presence of a binding site for specific host transcription factors, especially STAT1 and p53. The presence of the chromatin remodelling factors BRG1 and INI1 and certain host transcription factors either upstream or downstream of the provirus was associated respectively with silencing or spontaneous expression of the provirus. Cells expressing HTLV-1 Tax protein were significantly more frequent in clones of low abundance in vivo. We conclude that transcriptional interference and chromatin remodelling are critical determinants of proviral latency in natural HTLV-1 infection.

Genome-wide gene-environment (GxE) and gene-gene (GxG) interaction studies share many challenges through the common genetic component they involve. GWEI studies may therefore benefit from the abundance of methodologies that are available in the context of genome-wide epistasis detection. One of these is Model-Based Multifactor Dimensionality Reduction (MB-MDR), which does not make any assumption about the genetic inheritance model. MB-MDR involves reducing a high-dimensional GxE space to GxE factor levels that exhibit high, low, or no evidence for their association with disease outcome. In contrast to logistic regression and random forests, MB-MDR can be used to detect GxE interactions in the absence of any main effects or when sample sizes are too small to be able to model all main and GxE interaction effects. In this ongoing study, we demonstrate the opportunities and challenges of MB-MDR for genome-wide GxE interaction analysis and analyze the difference in prebronchodilator FEV1 following 8 weeks of inhaled corticosteroid therapy, for 565 pediatric Caucasian CAMP participants (ages 5-12) from the SHARE project.
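
The reduction of a genotype-by-exposure table to high (H), low (L), or no-evidence (O) levels can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binary outcome, a Pearson chi-square test of each cell against all other cells, and a liberal per-cell threshold `alpha = 0.1`; the function names and the toy record layout are hypothetical.

```python
import math
from collections import defaultdict


def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def label_cells(records, alpha=0.1):
    """Label each (genotype, exposure) cell as 'H', 'L' or 'O'.

    records: iterable of (genotype, exposure, is_case) tuples (hypothetical layout).
    alpha:   per-cell significance threshold (an assumption of this sketch).
    """
    cases = defaultdict(int)
    totals = defaultdict(int)
    for genotype, exposure, is_case in records:
        cell = (genotype, exposure)
        totals[cell] += 1
        cases[cell] += int(is_case)

    n = sum(totals.values())
    n_cases = sum(cases.values())
    labels = {}
    for cell, n_cell in totals.items():
        a = cases[cell]            # cases inside the cell
        b = n_cell - a             # controls inside the cell
        c = n_cases - a            # cases outside the cell
        d = (n - n_cell) - c       # controls outside the cell
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        if denom == 0:
            labels[cell] = "O"     # test undefined, treat as no evidence
            continue
        stat = n * (a * d - b * c) ** 2 / denom  # Pearson chi-square, 2x2 table
        if chi2_1df_pvalue(stat) >= alpha:
            labels[cell] = "O"
        else:
            # More cases than expected inside the cell means high risk.
            labels[cell] = "H" if a * d > b * c else "L"
    return labels
```

In a full analysis the H and L cells would then be pooled and tested against the outcome, with significance assessed by permutation; that step is omitted here.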

Objectives: Alzheimer disease (AlzD) is a complex, progressive neurodegenerative disease where dementia symptoms (loss of memory and other intellectual abilities) gradually worsen over a number of years. The disease is characterized by the neuropathologic findings of neurofibrillary tangles and amyloid plaques that accumulate in vulnerable brain regions. AlzD is inherited as a complex trait and appears to be highly heritable, with 58-79 percent attributable to genetic factors. So far, although a number of main-effect genes have been identified, only a fraction of AlzD cases can be explained by specific gene mutations. In our study we performed an exhaustive and selective genome-wide screening for SNP-SNP interactions associated with AlzD in a large case/control cohort to reveal hidden heritability that can be accounted for by epistasis. Methods: We developed a minimal protocol for genome-wide association interaction (GWAI) analysis that involves screening over large-scale genomic data in the search for epistatic or synergetic effects. The protocol was applied on a large human cohort of 2,259 AlzD cases and 6,017 healthy controls from France to search for AlzD-associated epistatic effects. Results: In the exhaustive genome-wide screening, we identified two pairs of AlzD-associated interacting SNPs from chromosomes 6q11.1 and 13q12.11, and male-specific epistasis between SNPs from chromosomes 5q34 and 15q22.2. In the selective epistasis search, screening over the candidate genes for AlzD previously reported to be in interaction, we replicated seven out of twelve AlzD-associated gene pairs (INS / PPARA, IL1A / PPARA, IL10 / PPARA, TF / HFE, MTHFR / IL6, ABCA1 / NPC1, LRP1 / MAPT). Conclusion: It is the first time an epistasis study of this magnitude has been conducted in the context of AlzD. We show the advantages of viewing and analyzing data from different angles. 
A replication analysis strategy adapted to the epistasis detection context, as well as a meta-analytic approach confirmed effects of the discovered interactions. Apart from the biological and clinical importance, the present work offers a roadmap for future investigations in the field of epistasis detection and interpretation.

Genome-wide association (GWA) studies of asthma and associated traits have identified numerous genes. A substantial portion of the heritability of these traits remains unexplained. Some variants, not detectable via main effects GWA study, may manifest themselves only in interaction with other variants. To search for interacting genes involved in regulation of asthma-associated traits (total IgE, eosinophils, FEV1, FVC, FEV1/FVC), we performed GWA epistasis screening in two family groups of asthma patients: CAMP (Childhood Asthma Management Program: 814 cases and 467 trios) and CARE (Childhood Asthma Research and Education: 796 cases and 338 trios) [dbGaP accession number phs000166.v1.p1.c1]. Individuals were genotyped with the Affymetrix 6.0 array. After quality control, 574922 and 575010 SNPs in CAMP and CARE, respectively, were tested with FBAT. No genome-wide significant main-effect associations were found. We prioritized candidate pairs of SNPs for MB-MDR epistasis screening using Biofilter, leading to 7632 SNPs for CAMP and 7603 SNPs for CARE. The most significant pair-wise interaction was identified between SNPs from loci 7p21.1 and 12q23.3 influencing eosinophil level in asthmatics.

Genome-wide association (GWA) studies of Crohn's disease have identified numerous genes. However, a substantial portion of the heritability of this disease remains unexplained. Some gene variants, not detectable via main effects GWA study, may manifest themselves only in interaction with other variants. To search for interacting genes involved in the regulation of Crohn's disease, we performed GWA epistasis screening in a large human cohort (1851 cases/2938 controls) belonging to the Wellcome Trust Case Control Consortium (WTCCC). All subjects were genotyped with the GeneChip 500K Mapping Array Set (Affymetrix chip). SNPs that passed our quality control (359,479 SNPs) were processed in Biofilter (a software package that looks for candidate epistatic genes contributing to disease risk) giving rise to 14,185 SNPs. Subsequent MB-MDR epistasis screening discovered four pairs of interacting SNPs on chromosome 4q35.1 and eight pairs on chromosome 11q23.2. The identified pairs of SNPs were confirmed with synergy-based measures. Notably, despite their mapping to the same genomic regions, the interacting SNPs were not in LD (r^2 < 0.5). Our findings support the idea of close chromosomal localization of two pairs of interacting genes that are involved in development of Crohn's disease.
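
For readers unfamiliar with the r^2 measure of linkage disequilibrium used in this abstract, the following minimal sketch computes it from haplotype and allele frequencies. The function name and the example frequencies are hypothetical; the abstract does not state how the authors estimated r^2 from genotype data.

```python
def ld_r2(p_ab, p_a, p_b):
    """r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b                        # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


# Hypothetical frequencies: a weakly correlated pair of SNPs.
weak = ld_r2(0.30, 0.5, 0.5)
print(f"r^2 = {weak:.2f}, below 0.5: {weak < 0.5}")
```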

The NEO-Five-Factor Inventory divides human personality traits into five dimensions: neuroticism, extraversion, openness, conscientiousness and agreeableness. In this study, we sought to identify regions harboring genes with large effects on the five NEO personality traits by performing genome-wide linkage analysis of individuals scoring in the extremes of these traits (>90th percentile). Affected-only linkage analysis was performed using an Illumina 6K linkage array in a family-based study, the Erasmus Rucphen Family study. We subsequently determined whether distinct, segregating haplotypes found with linkage analysis were associated with the trait of interest in the population. Finally, a dense single-nucleotide polymorphism genotyping array (Illumina 318K) was used to search for copy number variations (CNVs) in the associated regions. In the families with extreme phenotype scores, we found significant evidence of linkage for conscientiousness to 20p13 (rs1434789, log of odds (LOD) = 5.86) and suggestive evidence of linkage (LOD > 2.8) for neuroticism to 19q, 21q and 22q, extraversion to 1p, 1q, 9p and 12q, openness to 12q and 19q, and agreeableness to 2p, 6q, 17q and 21q. Further analysis determined haplotypes in 21q22 for neuroticism (P-values = 0.009, 0.007), in 17q24 for agreeableness (marginal P-value = 0.018) and in 20p13 for conscientiousness (marginal P-values = 0.058, 0.038) segregating in families with large contributions to the LOD scores. No evidence for CNVs in any of the associated regions was found. Our findings imply that there may be genes with relatively large effects involved in personality traits, which may be identified with next-generation sequencing techniques.

We undertook a meta-analysis of six Crohn's disease genome-wide association studies (GWAS) comprising 6,333 affected individuals (cases) and 15,056 controls and followed up the top association signals in ...

We developed a data-mining method, Model-Based Multifactor Dimensionality Reduction (MB-MDR), to detect epistatic interactions for different types of traits. MB-MDR enables the fast identification of gene-gene interactions among thousands of SNPs, without the need to make restrictive assumptions about the genetic modes of inheritance. This thesis primarily focused on applying Model-Based Multifactor Dimensionality Reduction for quantitative traits, its performance and application to a variety of data problems. We carried out several simulation studies to evaluate quantitative MB-MDR in terms of power and type I error, when data are noisy, non-normal or skewed and when important main effects are present. Firstly, we assessed the performance of MB-MDR in the presence of noisy data. The error sources considered were missing genotypes, genotyping error, phenotypic mixtures and genetic heterogeneity. Results from this study showed that MB-MDR is least affected by presence of small percentages of missing data and genotyping errors but much affected in the presence of phenotypic mixtures and genetic heterogeneity. This is in line with a similar study performed for binary traits. Although both Multifactor Dimensionality Reduction (MDR) and MB-MDR are data reduction techniques with a common basis, their ways of deriving significant interactions are substantially different. Nevertheless, effects on power of introducing error sources were quite similar. Irrespective of the trait under consideration, epistasis screening methodologies such as MB-MDR and MDR mainly suffer from the presence of phenotypic mixtures and genetic heterogeneity. Secondly, we extensively addressed the issue of adjusting for lower-order genetic effects during epistasis screening, using different adjustment strategies for SNPs in the functional SNP-SNP interaction pair, and/or for additional important SNPs. 
Since, in this thesis, we restrict attention to 2-locus interactions only, adjustment for lower-order effects always (and only) implies adjustment for main genetic effects. Unfortunately most data dimensionality reduction techniques based on MDR do not explicitly require that lower-order effects are included in the ‘model’ when investigating higher-order effects (a prerequisite for most traditional, especially regression-based, methods). However, epistasis results may be hampered by the presence of significant lower-order effects. Results from this study showed hugely increased type I errors when main effects were not taken into account or were not properly accounted for. We observed that additive coding (the most commonly used coding in practice) in main effects adjustment does not remove all of the potential main effects that deviate from additive genetic variance. In addition, also adjusting for main effects prior to MB-MDR (via a regression framework), whatever coding is adopted, does not control type I error in all scenarios. From this study, we concluded that correction for lower-order effects should preferentially be done via codominant coding, to reduce the chance of false positive epistasis findings. The recommended way of performing an MB-MDR epistasis screening is to always adjust the analysis for lower-order effects of the SNPs under investigation, “on-the-fly”. This correction avoids overcorrection for other SNPs, which are not part of the interacting SNP pair under study. Thirdly, we assessed the cumulative effect of trait deviations from normality and homoscedasticity on the overall performance of quantitative MB-MDR to detect 2-locus epistasis signals in the absence of main effects. Although MB-MDR itself is a non-parametric method, in the sense that no assumptions are made regarding genetic modes of inheritance, the data reduction part in MB-MDR relies on association tests. 
In particular, for quantitative traits, the default MB-MDR way is to use the Student’s t-test (steps 1 and 2 of MB-MDR). Also, when correcting for lower-order effects during quantitative MB-MDR analysis, we intrinsically maneuver within a regression framework. Since the square of the Student’s t-statistic equals the ANOVA F-statistic for a two-group comparison, ANOVA assumptions have to be met for MB-MDR to give valid results. Therefore, we simulated data from normal and non-normal distributions, with constant and non-constant variances, and performed association tests via the Student’s t-test as well as the unequal variance t-test, commonly known as Welch’s t-test. Somewhat surprisingly at first, the results of this study showed that MB-MDR maintains adequate type I errors, irrespective of data distribution or association test used. On the other hand, MB-MDR gives rise to lower power for non-normal data compared to normal data. With respect to the association tests used within MB-MDR, in most cases, Welch’s t-test led to lower power compared to the Student’s t-test. To maintain the balance between power and type I error, we concluded that when performing MB-MDR analysis with quantitative traits, one ideally first rank-transforms traits to normality and then applies MB-MDR modeling with the Student’s t-test as choice of association test. Clearly, before embarking on using a method in practice, there is a need to extensively check the applicability of the method to the data at hand. This is a common practice in biostatistics, but often a forgotten standard operating procedure in genetic epidemiology, in particular in GWAI studies. In addition to the presentation of extensive simulation studies, we also presented some MB-MDR applications to real-life data problems. These analyses involved MB-MDR analyses on quantitative as well as binary complex disease traits, primarily in the context of asthma/allergy and Crohn’s disease. 
In two of the presented analyses, MB-MDR confirmed logistic regression and transmission disequilibrium test (TDT) results. Part of the aforementioned methodological developments was initiated on the basis of observations of MB-MDR behavior on real-life data. Both the practical and theoretical components of this thesis confirm our belief in the potential of MB-MDR as a promising and versatile tool for the identification of epistatic effects, irrespective of the design (family-based or unrelated individuals) and irrespective of the targeted disease trait (binary, continuous, censored, categorical, multivariate). A thorough characterization of the different faces of MB-MDR that this versatility gives rise to is work in progress.
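
Two statistical points in this abstract can be made concrete with a short sketch: the relation between the pooled-variance Student's t-statistic and the two-group ANOVA F-statistic, and a rank transformation of a trait to normality. The code is an illustration only, not the thesis software. The function names are hypothetical, and the `(rank - 0.5) / n` offset in the rank transform is one common convention chosen here as an assumption; the abstract does not specify which variant was used.

```python
from statistics import NormalDist, mean


def pooled_t(x, y):
    """Student's t-statistic for two groups with a pooled variance estimate."""
    nx, ny = len(x), len(y)
    mx, my = mean(x), mean(y)
    ss_within = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    pooled_var = ss_within / (nx + ny - 2)
    return (mx - my) / (pooled_var * (1 / nx + 1 / ny)) ** 0.5


def anova_f(x, y):
    """One-way ANOVA F-statistic for the same two groups."""
    nx, ny = len(x), len(y)
    mx, my = mean(x), mean(y)
    grand = mean(list(x) + list(y))
    ss_between = nx * (mx - grand) ** 2 + ny * (my - grand) ** 2  # 1 df
    ss_within = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    return ss_between / (ss_within / (nx + ny - 2))


def rank_to_normal(values):
    """Rank-based inverse normal transform; tied values share their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1          # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    normal = NormalDist()
    return [normal.inv_cdf((r - 0.5) / n) for r in ranks]


# Toy trait values for two genotype groups (made up for illustration).
group_a = [1.0, 2.0, 3.0, 4.0]
group_b = [2.5, 3.5, 5.0, 6.0, 7.5]
print(pooled_t(group_a, group_b) ** 2, anova_f(group_a, group_b))
```

The two printed values agree up to floating-point error, which is why the validity of the t-test inside MB-MDR rests on the usual ANOVA assumptions.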

Oral-facial-digital type I syndrome (OFDI) is characterised by an X-linked dominant mode of inheritance with lethality in males. Clinical features include facial dysmorphism with oral, dental and distal abnormalities, polycystic kidney disease and central nervous system malformations. Considerable allelic heterogeneity has been reported within the OFD1 gene, but DNA bi-directional sequencing of the exons and intron-exon boundaries of the OFD1 gene remains negative in more than 20% of cases. We hypothesized that genomic rearrangements could account for the majority of the remaining undiagnosed cases. Thus, we took advantage of two independent available series of patients with OFDI syndrome and negative DNA bi-directional sequencing of the exons and intron-exon boundaries of the OFD1 gene from two different European labs: 13/36 cases from the French lab; 13/95 from the Italian lab. All patients were screened by a semiquantitative fluorescent multiplex method (QFMPSF) and relative quantification by real-time PCR (qPCR). Six OFD1 genomic deletions (exon 5, exons 1-8, exons 1-14, exons 10-11, exons 13-23 and exon 17) were identified, accounting for 5% of OFDI patients and for 23% of patients with negative mutation screening by DNA sequencing. The association of DNA direct sequencing, QFMPSF and qPCR detects OFD1 alteration in up to 85% of patients with a phenotype suggestive of OFDI syndrome. Given the average percentage of large genomic rearrangements (5%), we suggest that dosage methods should be performed in addition to DNA direct sequencing analysis to exclude the involvement of the OFD1 transcript when there are genetic counselling issues.
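
The percentages quoted in this abstract can be retraced with simple arithmetic. The sketch below assumes that "13/36" and "13/95" mean sequencing-negative patients over all patients in each series; that reading is an interpretation, although it reproduces the reported 5%, 23% and 85%.

```python
# (sequencing-negative patients, all patients) per lab, as read from the abstract
series = {"French lab": (13, 36), "Italian lab": (13, 95)}
deletions = 6  # OFD1 genomic deletions found by QFMPSF/qPCR

negative = sum(neg for neg, _ in series.values())  # 26
total = sum(tot for _, tot in series.values())     # 131

share_of_all = deletions / total                   # deletions among all patients
share_of_negative = deletions / negative           # deletions among sequencing-negative patients
combined_yield = (total - negative + deletions) / total  # sequencing plus dosage methods

print(f"{share_of_all:.1%} of all patients")
print(f"{share_of_negative:.1%} of sequencing-negative patients")
print(f"{combined_yield:.1%} combined detection rate")
```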