Division of Gastroenterology, Hepatology and Nutrition, Department of Medicine, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA; Department of Human Genetics, University of Pittsburgh Graduate School of Public Health, Pittsburgh, PA, USA.


School of Medical Sciences, Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand.

Norwegian PSC Research Center, Research Institute of Internal Medicine and Department of Transplantation Medicine, Oslo University Hospital and University of Oslo, Oslo, Norway.


Gastrointestinal Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, UK.


Department of Gastroenterology, Lithuanian University of Health Sciences, Kaunas, Lithuania.


Centre for Inflammatory Bowel Diseases, Saint John of God Hospital, Subiaco WA and School of Medicine and Pharmacology, University of Western Australia, Harry Perkins Institute for Medical Research, Murdoch, WA, Australia.

Inflammatory Bowel Diseases, Genetics and Computational Biology, Queensland Institute of Medical Research, Brisbane, Australia; Department of Gastroenterology, Royal Brisbane and Women's Hospital, and School of Medicine, University of Queensland, Brisbane, Australia.

Abstract

BACKGROUND:

Crohn's disease and ulcerative colitis are the two major forms of inflammatory bowel disease; treatment strategies have historically been determined by this binary categorisation. Genetic studies have identified 163 susceptibility loci for inflammatory bowel disease, mostly shared between Crohn's disease and ulcerative colitis. We undertook the largest genotype association study, to date, in widely used clinical subphenotypes of inflammatory bowel disease with the goal of further understanding the biological relations between diseases.

METHODS:

This study included patients from 49 centres in 16 countries in Europe, North America, and Australasia. We applied the Montreal classification system of inflammatory bowel disease subphenotypes to 34,819 patients (19,713 with Crohn's disease, 14,683 with ulcerative colitis) genotyped on the Immunochip array. We tested for genotype-phenotype associations across 156,154 genetic variants. We generated genetic risk scores by combining information from all known inflammatory bowel disease associations to summarise the total load of genetic risk for a particular phenotype. We used these risk scores to test the hypothesis that colonic Crohn's disease, ileal Crohn's disease, and ulcerative colitis are all genetically distinct from each other, and to attempt to identify patients with a mismatch between clinical diagnosis and genetic risk profile.
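As a rough illustration of the risk-score idea described above, the sketch below sums risk-allele counts weighted by the log of each per-allele odds ratio. This additive log-odds construction is a common convention and is an assumption here, not necessarily the exact scoring used in this study; the function name, variant identifiers, and odds ratios are all hypothetical.

```python
import math

def genetic_risk_score(allele_counts, odds_ratios):
    """Additive log-odds risk score (illustrative assumption, not the study's exact method).

    allele_counts: dict mapping variant id -> number of risk alleles carried (0, 1, or 2)
    odds_ratios:   dict mapping variant id -> per-allele odds ratio for that variant
    """
    score = 0.0
    for variant, count in allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"invalid allele count for {variant}: {count}")
        # Each risk allele contributes log(OR) on the log-odds scale.
        score += count * math.log(odds_ratios[variant])
    return score

# Hypothetical variants and effect sizes, for illustration only.
example_odds_ratios = {"variant_a": 1.5, "variant_b": 1.2, "variant_c": 0.9}
example_patient = {"variant_a": 2, "variant_b": 1, "variant_c": 0}
example_score = genetic_risk_score(example_patient, example_odds_ratios)
```

Working on the log-odds scale makes the per-variant contributions additive, so a single number can summarise the total load of genetic risk across many loci.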

FINDINGS:

After quality control, the primary analysis included 29,838 patients (16,902 with Crohn's disease, 12,597 with ulcerative colitis). Three loci (NOD2, MHC, and MST1 3p21) were associated with subphenotypes of inflammatory bowel disease, mainly disease location (essentially fixed over time; median follow-up of 10·5 years). Little or no genetic association with disease behaviour (which changed dramatically over time) remained after conditioning on disease location and age at onset. The genetic risk score representing all known risk alleles for inflammatory bowel disease showed strong association with disease subphenotype (p=1·65 × 10⁻⁷⁸), even after exclusion of NOD2, MHC, and 3p21 (p=9·23 × 10⁻¹⁸). Predictive models based on the genetic risk score strongly distinguished colonic from ileal Crohn's disease. Our genetic risk score could also identify a small number of patients with discrepant genetic risk profiles who were significantly more likely to have a revised diagnosis after follow-up (p=6·8 × 10⁻⁴).

INTERPRETATION:

Our data support a continuum of disorders within inflammatory bowel disease, much better explained by three groups (ileal Crohn's disease, colonic Crohn's disease, and ulcerative colitis) than by Crohn's disease and ulcerative colitis as currently defined. Disease location is an intrinsic aspect of a patient's disease, in part genetically determined, and the major driver of changes in disease behaviour over time.

Evolution of clinical subphenotypes. (A) Proportion of patients with Crohn's disease who have inflammatory (Montreal classification B1), stricturing (B2), or penetrating (B3) disease over time from diagnosis to most recent follow-up. (B) Proportion of patients with Crohn's disease who have ileal (L1), colonic (L2), or ileocolonic (L3) disease over time from diagnosis to most recent follow-up. (C) Survival plot of time from diagnosis of Crohn's disease to resectional surgery stratified by disease location. (D) Survival plot of time from diagnosis of ulcerative colitis to colectomy stratified by disease extent (extensive disease, E3; non-extensive disease, E1 and E2).

Effect of single nucleotide polymorphisms, HLA alleles, and polygenic risk scores on phenotypes of inflammatory bowel disease. (A) Effect sizes for genotype–phenotype associations for risk of Crohn's disease and ulcerative colitis (odds ratio relative to controls), Crohn's disease location (odds ratio of ileal vs colonic disease), Crohn's disease behaviour (proportional odds ratio), disease extent of ulcerative colitis (odds ratio of extensive vs non-extensive disease), and age at diagnosis (linear coefficients) for MST1, MHC, and NOD2 variants. All effect sizes are per allele, and are adjusted for associations with correlated phenotypes by including them as additional predictors in the regression model, along with principal components to control for stratification. See for more details on these regression models. Genome-wide significant associations are depicted by filled circles, and error bars depict 95% CIs. (B) Effect sizes of genetic risk scores for disease location, disease behaviour, and age at diagnosis including all 163 susceptibility loci. Effect sizes are calculated by linear regression of the risk score against the phenotype, adjusted for the effect of the other phenotypes and for principal components, and error bars depict 95% CIs. Filled circles represent effects that are significant after correcting for 15 phenotype-score combinations (p<0·003). Effect sizes are measured on scales standardised to unit variance (and thus represent the number of standard deviations that the mean phenotype increases by per standard deviation increase in the risk score).
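To make the standardised effect sizes and the multiple-testing threshold in this caption concrete, here is a minimal sketch. It assumes a single-predictor least-squares fit with no covariate adjustment (the analysis described above additionally adjusts for other phenotypes and principal components), and it assumes a Bonferroni correction at a nominal level of 0.05, which is consistent with the stated p<0·003 for 15 combinations but is not confirmed by the text. All function names are hypothetical.

```python
import statistics

def standardise(values):
    """Rescale values to mean 0 and unit (population) variance."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardise a constant series")
    return [(v - mean) / sd for v in values]

def standardised_effect(risk_scores, phenotype):
    """Least-squares slope of standardised phenotype on standardised risk score.

    The result is the number of phenotype standard deviations gained per
    one standard deviation increase in the risk score (no covariates here).
    """
    if len(risk_scores) != len(phenotype):
        raise ValueError("inputs must have the same length")
    x = standardise(risk_scores)
    y = standardise(phenotype)
    # Both series have mean 0, so the OLS slope is sum(x*y) / sum(x*x).
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

# Assumed nominal alpha of 0.05 across 15 phenotype-score combinations.
threshold = bonferroni_threshold(0.05, 15)
```

With both variables standardised and no covariates, the slope equals the Pearson correlation, which is why such effect sizes are directly comparable across phenotypes measured on different scales.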

Violin plot showing the genetic substructure of inflammatory bowel disease location. The violin represents the range of the log CD versus UC score for the indicated subphenotype (calculated with the R package “vioplot”), with dots representing the mean of that group and error bars the 95% CIs. Although the effects are small compared with the variation within groups, the mean effects can still be measured accurately (right side of the figure). The Crohn's disease versus ulcerative colitis (CD vs UC) risk score placed colonic Crohn's disease between ileal Crohn's disease and ulcerative colitis. The plot also shows the positioning of the intermediate phenotypes: ileocolonic Crohn's disease lies between ileal and colonic Crohn's disease, and inflammatory bowel disease unclassified (IBD-U) lies between ulcerative colitis and colonic Crohn's disease.