This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Rheumatoid arthritis (RA) is an archetypal, common, complex autoimmune disease with both genetic and environmental contributions to disease aetiology. Two novel RA susceptibility loci have been reported from recent genome-wide and candidate gene association studies. We therefore investigated the evidence for association of the STAT4 and TRAF1/C5 loci with RA using imputed data from the Wellcome Trust Case Control Consortium (WTCCC). No evidence for association of variants mapping to the TRAF1/C5 gene was detected in the 1860 RA cases and 2930 control samples tested in that study. Variants mapping to the STAT4 gene did show evidence for association (rs7574865, P = 0.04). Given the association of the TRAF1/C5 locus in two previous large case–control series from populations of European descent and the evidence for association of the STAT4 locus in the WTCCC study, single nucleotide polymorphisms mapping to these loci were tested for association with RA in an independent UK series comprising DNA from >3000 cases with disease and >3000 controls, and a combined analysis including the WTCCC data was undertaken. We confirm association of the STAT4 and the TRAF1/C5 loci with RA, bringing to five the number of confirmed susceptibility loci. The effect sizes are less than those reported previously but are likely to be a more accurate reflection of the true effect size given the larger size of the cohort investigated in the current study.

Rheumatoid arthritis [RA (MIM 180300)] is a chronic inflammatory arthritis occurring in 0.8% of the UK population and characterized by progressive destruction of synovial joints (1). A strong genetic component to disease aetiology has been determined, with heritability estimates of 50–60% (2). The major susceptibility loci are: (i) the HLA-DRB1 gene, in which a group of alleles, each sharing a common amino acid sequence in the peptide-binding groove and collectively referred to as the shared epitope, is associated with both susceptibility to and severity of RA (3) and (ii) the protein tyrosine phosphatase 22 (PTPN22) gene, which is associated with susceptibility to RA (4). Together, they account for 50% of the genetic susceptibility to disease in populations of European descent, although, interestingly, variation across the PTPN22 gene does not appear to be associated with RA in Japanese or Korean populations (5).

Recently, tremendous progress has been made in identifying further RA susceptibility variants, and this has been achieved using both genome-wide association (GWA) and candidate gene approaches. First, the Wellcome Trust Case Control Consortium (WTCCC) study was a GWA study that included 1860 RA cases and 2930 controls and confirmed association to the two known susceptibility variants, HLA-DRB1 and PTPN22 (P < 10⁻⁷) (6). In addition, nine other loci showed modest evidence for association, and association to one of these, a locus lying between the OLIG3 and TNFAIP3 genes on chromosome 6q, has been unequivocally replicated in UK and US populations (7,8). Secondly, a GWA study in US and Swedish populations reported association to the TRAF1/C5 locus, and this has subsequently been confirmed by a separate study (9,10). Finally, a fine-mapping strategy investigating candidate genes mapping under a peak of linkage in US RA families identified another RA susceptibility locus mapping to the STAT4 gene in US subjects, which has subsequently been confirmed in Swedish and Korean populations (11,12).

Many of the variants putatively associated with RA susceptibility were not genotyped directly in the WTCCC study. However, genotypes have been imputed using the data from directly genotyped single nucleotide polymorphisms (SNPs) nearby and from patterns of linkage disequilibrium inferred from HapMap data. Hence, imputed genotypes are available from the WTCCC study for all common HapMap SNPs, and these data provide an opportunity to explore the two candidate RA susceptibility loci, STAT4 and TRAF1/C5, in this data set.

The aim of the current study was to look for evidence for association of previously reported RA susceptibility loci in the WTCCC data set and to validate the findings by direct genotyping of SNPs mapping to these regions in an independent cohort of 3418 cases and 3337 controls from the UK.

RESULTS

Imputed data

Imputed genotype frequencies conformed to Hardy–Weinberg expectations in the control population and allele frequencies were similar to those reported previously in European populations where data were available.
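A check of Hardy–Weinberg expectations of the kind described above can be sketched as a one-degree-of-freedom chi-square test on genotype counts. The following is a minimal illustration only: the function name `hwe_chi_square` and the genotype counts are hypothetical and are not taken from the study, and the software actually used for this check is not stated here.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Takes the three genotype counts for a biallelic SNP and assumes
    that both alleles are observed (no monomorphic markers).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical control genotype counts, for illustration only
chi2, p_value = hwe_chi_square(1800, 1000, 130)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A large P value indicates no detectable departure from Hardy–Weinberg proportions in the sample tested.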

No difference in genotype frequencies was observed for the SNPs mapping to the TRAF1/C5 locus, previously reported to be associated with RA in other populations, in the UK WTCCC series (Table 1). Nominal evidence for association was observed for SNPs mapping to the STAT4 gene (Table 1).

Validation

The same SNPs were genotyped directly in an independent series of 3418 RA cases and 3337 controls. The concordance rate for duplicate samples was 99.5%. A Breslow–Day test was undertaken to investigate heterogeneity between the samples recruited by the different centres, but none was observed for any of the SNPs (P > 0.1). Hence, genotype counts were combined across the centres and compared between validation cases and validation controls. SNPs mapping to both the STAT4 and TRAF1/C5 loci were significantly associated with RA susceptibility in the validation cohort (Table 2), although effect sizes were lower than had been reported in the previous, smaller studies.
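The idea behind testing for heterogeneity between centres before pooling can be illustrated with Woolf's Q statistic, which is a simpler stand-in for the Breslow–Day test used in the study and is not the same test. The sketch below compares centre-specific log odds ratios with their inverse-variance weighted mean. The function name and the allele counts are hypothetical.

```python
import math

def woolf_heterogeneity(tables):
    """Woolf's Q statistic for homogeneity of odds ratios across strata.

    Each table is (a, b, c, d): risk-allele and other-allele counts in
    cases (a, b) and in controls (c, d). All cells must be non-zero.
    Returns Q and its degrees of freedom (number of strata minus one).
    """
    log_ors, weights = [], []
    for a, b, c, d in tables:
        log_ors.append(math.log((a * d) / (b * c)))
        # Inverse of the variance of the log odds ratio
        weights.append(1 / (1 / a + 1 / b + 1 / c + 1 / d))
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    return q, len(tables) - 1

# Hypothetical allele counts for three centres, for illustration only
centres = [(620, 2124, 380, 1468), (450, 1508, 410, 1580), (510, 1742, 230, 834)]
q, df = woolf_heterogeneity(centres)
print(f"Q = {q:.3f} on {df} df")
```

Under homogeneity, Q follows approximately a chi-square distribution with the returned degrees of freedom, so a P value can be obtained from any statistics package.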

Table 2. Validation of SNPs in independent data set and combined analysis (with WTCCC samples)

Combined analysis

The imputed data from the WTCCC study were first combined with those from the validation study to create a combined sample of >5000 RA cases and ~6000 controls in order to allow robust estimates of the strengths of the effect sizes for these two loci to be calculated. The four SNPs mapping to each locus show strong correlation with each other (r² between STAT4 SNPs > 0.97; r² between TRAF1/C5 SNPs = 0.97). The largest effect size of the SNPs mapping to the STAT4 locus arises from rs7574865 (OR 1.15, 95% CI 1.08, 1.22; P = 1.9 × 10⁻⁵) and this is in line with previous studies. Of the four TRAF1/C5 SNPs tested, rs10760130 showed the greatest statistical evidence for association but the effect size was less than that of the STAT4 locus (OR 1.09, 95% CI 1.03, 1.15; P = 0.001). Secondly, a meta-analysis including both the WTCCC and validation studies was undertaken, which yielded very similar odds ratios (OR rs7574865 1.14, 95% CI 1.07, 1.22; OR rs10760130 1.08, 95% CI 1.03, 1.14).
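The two ways of combining the cohorts described above (pooling counts and meta-analysis) can be sketched as follows. This is an illustration under stated assumptions: the odds ratios are allelic with Wald confidence intervals, the meta-analysis is a fixed-effect inverse-variance model (the model actually used is not specified in the text), and all function names and allele counts are hypothetical.

```python
import math

def allelic_odds_ratio(a, b, c, d):
    """Log odds ratio and its standard error from allele counts.

    a, b: risk-allele and other-allele counts in cases
    c, d: risk-allele and other-allele counts in controls
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se

def summarise(log_or, se, z=1.96):
    """Odds ratio, Wald 95% CI and two-sided normal P value."""
    p_value = math.erfc(abs(log_or / se) / math.sqrt(2))
    return (math.exp(log_or), math.exp(log_or - z * se),
            math.exp(log_or + z * se), p_value)

def fixed_effect_meta(estimates):
    """Inverse-variance pooling of (log_or, se) pairs from each study."""
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * y for w, (y, _) in zip(weights, estimates)) / sum(weights)
    return pooled, math.sqrt(1 / sum(weights))

# Hypothetical allele counts for two cohorts, for illustration only
cohort_1 = allelic_odds_ratio(900, 2820, 1300, 4560)
cohort_2 = allelic_odds_ratio(1700, 5136, 1480, 5194)
or_, lower, upper, p = summarise(*fixed_effect_meta([cohort_1, cohort_2]))
print(f"OR {or_:.2f}, 95% CI {lower:.2f}, {upper:.2f}; P = {p:.2g}")
```

When the cohort-specific odds ratios are similar, pooling raw counts and inverse-variance meta-analysis give close estimates, which is consistent with the near-identical odds ratios reported for the two approaches.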

Stratification by anti-cyclic-citrullinated peptide antibody status

RA is a clinically heterogeneous disease and there has been some speculation recently that it may comprise two distinct subgroups characterized by the presence or absence of antibodies to citrullinated peptides recognized by anti-cyclic-citrullinated peptide (CCP) antibodies. For example, the association of carriage of shared epitope alleles of the HLA-DRB1 gene, the major RA susceptibility gene, appears confined to anti-CCP-positive RA cases (reviewed in 13). Hence, subgroup analysis was undertaken to explore whether the associations of the STAT4 and TRAF1/C5 loci differed in anti-CCP-positive and -negative individuals (Table 3). For the STAT4 locus, the strength of the association was similar in both subgroups. Although the TRAF1/C5 associations appeared stronger in the anti-CCP-positive subgroup, an effect, albeit smaller, was observed in the negative subgroup and a direct comparison between anti-CCP-positive and -negative subjects showed no statistically significant evidence for a difference between the subgroups (Table 3).

Table 3. Stratification by CCP-positive/negative subgroups in validation data set (controls are the same as shown in Table 2)

DISCUSSION

We have performed the best-powered study to date investigating putative RA susceptibility variants and have confirmed that both the STAT4 and the TRAF1/C5 loci are associated with RA susceptibility. This brings to five the number of confirmed RA susceptibility genes (HLA-DRB1, PTPN22 and OLIG3/TNFAIP3 in addition to the above two). The large sample size allows robust estimates of the effect sizes to be inferred and suggests that the remaining RA susceptibility genes may have smaller effect sizes.

It is interesting to note that the TRAF1/C5 locus showed no evidence for association in the WTCCC series but significant evidence for association in the larger validation cohort, despite the similarity in allele and genotype frequencies between the two UK control groups analysed. The linkage disequilibrium across the region is high and confidence scores for the imputed genotypes exceeded 97%, suggesting that the imputed genotypes from the WTCCC study should be accurate. Indeed, although none of the previously associated markers was genotyped directly in the WTCCC study, a SNP in near-perfect correlation (rs10118357, r² = 0.97 with rs3761847 and r² = 1 with rs10818488) was genotyped and also failed to show evidence for association with RA in that series (rs10118357, P = 0.89).

The different effect sizes observed at the TRAF1/C5 locus in different populations could be explained by population heterogeneity, but we feel that this is unlikely. First, clinical and demographic characteristics and control allele frequencies were similar in the two UK cohorts investigated, only one of which showed evidence for association to the locus. Secondly, clinical and demographic characteristics of the UK RA cases were similar to those of the US, Swedish and Dutch series. Finally, the frequency of other RA-associated variants is similar across all the populations studied. For example, published data on the rs2476601 SNP in the PTPN22 gene suggest that effect sizes are similar across the two UK cohorts and the US and Dutch series (6,7,14,15).

A more likely explanation of the differences in effect seen across the different cohorts is that the effect size at this locus is very modest, and it is, therefore, not surprising that the SNPs will be associated in some, but not all, studies. Surprisingly, this locus was first identified from two independent studies in which the initial sample sizes were much smaller than those tested in the WTCCC study (9,10). This aptly demonstrates the phenomenon of ‘winner's curse’, in which effect sizes are often over-estimated in the original studies. It also demonstrates that failure to detect evidence for association, even in a large case–control series like the WTCCC sample, does not exclude a true but modest effect being present. Indeed, if the true odds ratio is closer to 1.09, then, given the risk allele frequency, the WTCCC study would only have had 54% power to detect it at the 5% significance level.
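The dependence of power on effect size and sample size can be illustrated with a normal-approximation power calculation for a two-sided allelic test. This is a sketch under stated assumptions: the control risk-allele frequency of 0.40 is a hypothetical value chosen for illustration (the frequency used for the 54% figure is not restated here), the function name is hypothetical, and the method used for the original calculation is not specified, so the result need not match the figure quoted above.

```python
from statistics import NormalDist

def allelic_power(odds_ratio, control_freq, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided allelic test of two proportions."""
    norm = NormalDist()
    # Risk-allele frequency in cases implied by the allelic odds ratio
    odds_cases = odds_ratio * control_freq / (1 - control_freq)
    case_freq = odds_cases / (1 + odds_cases)
    # Each subject contributes two alleles
    se = (case_freq * (1 - case_freq) / (2 * n_cases)
          + control_freq * (1 - control_freq) / (2 * n_controls)) ** 0.5
    z_effect = abs(case_freq - control_freq) / se
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Probability of rejecting in either tail
    return norm.cdf(z_effect - z_crit) + norm.cdf(-z_effect - z_crit)

ASSUMED_CONTROL_FREQ = 0.40  # hypothetical, for illustration only

wtccc_power = allelic_power(1.09, ASSUMED_CONTROL_FREQ, 1860, 2930)
validation_power = allelic_power(1.09, ASSUMED_CONTROL_FREQ, 3418, 3337)
print(f"WTCCC-sized sample: {wtccc_power:.2f}")
print(f"Validation-sized sample: {validation_power:.2f}")
```

Comparing the two calls shows how the larger validation cohort gains power for the same assumed odds ratio and allele frequency.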

The evidence for association of SNPs mapping to the STAT4 gene is more consistent, in keeping with the larger effect size conferred by this locus. The locus has previously been associated with both RA and systemic lupus erythematosus (SLE) (11), suggesting that it may predispose to a number of autoimmune diseases in the same way as the PTPN22 functional variant has been shown to. In fact, the strength of the association may be more pronounced in SLE. Further exploration in other autoimmune diseases is now warranted.

Sequencing and fine-mapping studies will be required to identify the most likely causal polymorphisms in these loci. However, the strong genetic correlation of the SNPs in both the STAT4 and TRAF1/C5 regions means that genetic studies alone will be unlikely to identify the causal variant. Functional studies will be necessary to determine which variant is causal and how it contributes to the development of RA. It is already possible, however, to hypothesize a potential pathway. Previous evidence suggests that carriage of the PTPN22 susceptibility variant leads to a failure to delete autoreactive T-cells, thereby predisposing to autoimmunity in general (16). The presence of cigarette smoke, a recognized environmental risk factor for RA, has been shown to result in the citrullination of proteins (17), and the shared epitope alleles of the HLA-DRB1 gene are known to bind citrullinated epitopes more efficiently, resulting in an exaggerated helper T-cell response (18). Variation in either gene could predispose to RA, therefore, by creating a permissive environment for an up-regulated immune response. In turn, IL-12-induced activation of STAT4 has been shown to drive the production of Th1 and Th17 rather than Th2 cells, and a Th1/Th17-mediated inflammatory response is a hallmark of RA (19). The exaggerated Th1/Th17 response may mean that, in the presence of an inflammatory trigger, the inflammatory response is up-regulated, leading to excess tumour necrosis factor (TNF) production with resultant binding of TNF to its cell-surface receptors TNFR1 and 2. Signalling via these receptors is mediated via a number of pathways, but the TRAF family of adaptor proteins has been implicated, including TRAF1. In turn, TRAF1 signalling has been reported to activate TNFAIP3 (A20)-induced apoptosis and activation of NF-κB (20). Variation in the genes or regulatory sequences at any stage of this pathway could, therefore, predispose to RA by producing the prolonged and sustained inflammatory response characteristic of the disease. Functional studies will be required to confirm or refine this hypothesis and determine where environmental susceptibility factors interact.

The studies confirm the utility of GWA approaches and well-powered studies in unravelling the intricacies of common, complex diseases such as RA. It is likely that a number of other susceptibility loci of more moderate effect sizes exist and these may only be revealed through combining data or meta-analysis to generate sufficiently large sample sizes to be confident of identifying their effects. However, already key pathways are emerging and provide the possibility of the development of further targeted therapies for this disabling condition in future.

METHODS

Study design

Imputed genotypes for SNP markers previously reported to be associated with RA were compared between 1860 RA cases and 2930 controls from the WTCCC study. Previous work suggests that, where linkage disequilibrium is high and confidence scores for imputed genotypes exceed 95%, the accuracy of imputation in predicting actual genotype counts exceeds 98.4% (21). Variants mapping to the STAT4 and TRAF1/C5 genes that had previously been reported to be associated with RA in other populations were genotyped in an independent UK sample of 3418 cases with RA and 3337 controls. A combined analysis was then undertaken including all the data from the WTCCC and the validation cohort to provide a robust estimate of effect sizes for these variants.

Cohorts

All subjects were white Caucasians and all cases satisfied 1987 ACR classification criteria for RA, modified for genetic studies. Both cohorts have been described previously but briefly:

WTCCC cohort

Cases (n = 1860) with RA were recruited as part of the arc National Repository of Family Material, from a primary care-based inception cohort or from NHS Rheumatology Clinics throughout the UK. Control data (n = 2930) were available for the 1958 birth cohort and from blood donors as described and reported previously (6).

Validation data set

DNA was available from RA patients recruited from six centres in the UK, and five of these centres also provided DNA samples from healthy controls: Manchester, 1372 cases and 924 controls (including 357 controls from the 1958 birth cohort, not overlapping with those samples tested in the WTCCC study); Sheffield, 979 cases and 995 controls; Leeds, 1126 cases and 532 controls; Aberdeen, 523 cases and 862 controls; Oxford, 736 cases and 536 controls; London, 327 cases, as reported previously. The cases were from NHS Rheumatology Clinics throughout the UK. Presence of autoantibodies [rheumatoid factor (RF) and/or anti-CCP] was documented for a proportion of the patients.

SNP selection

Imputed genotype data for the following SNPs were retrieved from the WTCCC Web site:

STAT4: rs7574865, rs10181656, rs8179673 and rs11889341 show high correlation (r²) with each other, map to the STAT4 gene and have been associated following a fine mapping study of a peak of linkage in US samples and replicated in Swedish and Korean cohorts (11,12);

TRAF1/C5: rs3761847, rs10818488, rs10760130, rs10760129 and rs2900180 map to the TRAF1/C5 locus and these SNPs (or highly correlated proxies) have been associated with RA in populations from the Netherlands, US and Sweden (9,10).
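The pairwise correlation (r²) used above to describe these SNPs can be computed from a haplotype frequency and the two allele frequencies. The sketch below shows the standard definition. The function name and the frequencies are hypothetical and are not taken from HapMap or from the study.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """r-squared between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at the first SNP
          and allele B at the second SNP
    p_a, p_b: frequencies of allele A and allele B (each strictly
              between 0 and 1)
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Hypothetical frequencies, for illustration only
print(round(ld_r_squared(0.22, 0.23, 0.23), 3))
```

An r² close to 1 means that the genotype at one SNP almost fully predicts the genotype at the other, which is why association signals at such SNPs cannot be separated by genetic data alone.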

Genotyping and analysis

Genotyping of the same variants in an independent cohort of RA cases and controls was undertaken using the Sequenom MassArray platform (Sequenom, Inc., San Diego, CA, USA). Negative controls and duplicate samples were included to ensure the accuracy of genotyping. Only those samples with a genotyping success rate of >90% and only those SNPs with a genotyping success rate of >95% were included in the analysis. Allele and genotype frequencies were compared between RA patients and population controls using the trend test implemented in STATA. Combined analysis of SNPs using imputed data from the WTCCC study and the validation cohort was subsequently performed and, in this combined group, stratification analysis based on the presence of anti-CCP antibodies was undertaken.
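A genotype-based trend test of the kind referred to above can be sketched as a Cochran–Armitage test with additive scores (0, 1, 2 copies of the risk allele). This is an illustrative re-implementation and not the STATA routine used in the study. The function name and genotype counts are hypothetical.

```python
import math

def armitage_trend_test(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts.

    cases, controls: counts of subjects carrying 0, 1 and 2 copies of
    the risk allele. Returns the chi-square statistic (1 df) and P value.
    """
    r_total, s_total = sum(cases), sum(controls)
    n_total = r_total + s_total
    col_totals = [r + s for r, s in zip(cases, controls)]
    # Score-weighted difference between cases and controls
    t = sum(w * (r * s_total - s * r_total)
            for w, r, s in zip(scores, cases, controls))
    # Variance of t under the null hypothesis of no trend
    sum_w2n = sum(w * w * n for w, n in zip(scores, col_totals))
    sum_wn = sum(w * n for w, n in zip(scores, col_totals))
    variance = (r_total * s_total / n_total) * (n_total * sum_w2n - sum_wn ** 2)
    chi2 = t * t / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df
    return chi2, p_value

# Hypothetical genotype counts, for illustration only
chi2, p_value = armitage_trend_test(cases=(1900, 1300, 218),
                                    controls=(1980, 1180, 177))
print(f"chi2 = {chi2:.2f}, P = {p_value:.3g}")
```

The additive scoring assumes that each extra copy of the risk allele shifts risk by the same amount on the chosen scale; other score sets would test dominant or recessive models.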

This study was approved by the North West Multicentre Research Ethics Committee (MREC 99/8/84) and all subjects provided informed consent.