Forest plot for the 56 human observational studies assessing the relationship between 5-HTTLPR, stress, and depression. The boxes indicate the 1-tailed P value for each study, with lower values corresponding to greater stress sensitivity of s allele carriers and higher values, greater stress sensitivity of l allele carriers. The size of the box indicates the relative sample size. The triangle indicates the overall result of our meta-analysis. Purple indicates that the study was included only in the Munafò et al meta-analysis3; cyan, only in the Risch et al meta-analysis4; blue, both in the Munafò et al and the Risch et al meta-analyses; black, both in the Munafò et al and the Risch et al meta-analyses but the validity is questionable. *Participants were selected according to extremes of high and low neuroticism scores.74 Because there is a strong positive correlation between depression and neuroticism, a large portion of variance in depression in such a sample will be explained by neuroticism alone.86 The absence of individuals with intermediate neuroticism scores, for which the gene × environment effect may be strongest, might obscure any gene × environment interaction effect.

The principal function of the serotonin transporter is to remove serotonin from the synapse, returning it to the presynaptic neuron where the neurotransmitter can be degraded or rereleased at a later time. A polymorphism in the promoter region of the serotonin transporter gene (5-HTTLPR) has been found to affect the transcription rate of the gene, with the short (s) allele transcriptionally less efficient than the alternate long (l) allele. In 2003, Caspi and colleagues1 used a prospective, longitudinal design to examine the relationship between 5-HTTLPR, stress, and depression in a large birth cohort and found a significant interaction between 5-HTTLPR and both stressful life events (SLEs) and childhood maltreatment in the development of depression. In this cohort, subjects carrying the less functional 5-HTTLPR s allele reported greater sensitivity to stress.

The Caspi et al study has been cited more than 2000 times in the scientific literature and generated a great deal of excitement around the potential of gene × environment interaction studies.2 To date, there have been 55 follow-up studies exploring whether 5-HTTLPR moderates the relationship between stress and depression, with some studies supporting the association between the 5-HTTLPR s allele and greater stress sensitivity and others not. Two recent meta-analyses have assessed a subset of these studies and concluded that there is no evidence supporting the presence of the interaction.3,4

Since their publication, these meta-analyses have generated substantial debate and intense criticism. Some of the discussion has revived the long-standing debate about whether exploring epidemiological interaction effects, in general, will produce worthwhile results.5,6 The criticism specific to this genetic association, however, has largely revolved around the fact that only a subset of the studies investigating the relationship between 5-HTTLPR, stress, and depression were included in the meta-analyses.7-12 In fact, while 56 primary data studies have assessed whether 5-HTTLPR moderates the relationship between stress and depression, the Munafò et al3 and Risch et al4 meta-analyses included only 5 and 14 of those studies, respectively.13-51 Further, Uher and McGuffin11 have demonstrated that the larger Risch et al meta-analysis included a significantly greater proportion of negative replication studies than positive replication studies.

There are multiple reasons why the set of studies included in the meta-analyses was limited. First, the primary study data needed for traditional meta-analysis were often not available, either in the original publications or in follow-up e-mail inquiries to study authors. For instance, Munafò and colleagues reported that 15 studies met criteria for inclusion in their meta-analysis. However, they were able to obtain the primary study data needed for inclusion for only 5 of those studies. There is no evidence that the studies that could be included in the meta-analyses were of higher “quality” than those not included.

A second reason why many studies were not included in the Risch et al and Munafò et al meta-analyses is that both meta-analyses focused exclusively on studies that explored an interaction between 5-HTTLPR and SLEs in the development of depression. The original Caspi et al article, however, not only reported an interaction between 5-HTTLPR and SLEs, but also an interaction between 5-HTTLPR and childhood maltreatment stress. Nine studies have attempted to replicate this interaction with childhood maltreatment, but these studies were not included in the meta-analyses.

Some observers have noted that the SLE study design may have limited power to detect genetic moderation effects because it is susceptible to a set of potential biases: (1) impaired recall of stressors by subjects, (2) highly variable stressors between subjects, and (3) the reduced statistical power inherent to tests of statistical interaction.10,48 A newer class of studies has attempted to bypass these potential problems by focusing on specific populations that have experienced a substantial, specific stressor. Eighteen studies have used such a specific stressor design to assess whether 5-HTTLPR moderates the relationship between stress and depression, but like the childhood maltreatment studies, these studies were excluded from the previous meta-analyses.

In this investigation, rather than focus on a specific class of studies, we sought to perform a meta-analysis on the entire body of work assessing the relationship between 5-HTTLPR, stress, and depression. Unfortunately, the different classes of studies generally used different study designs to explore this question, rendering it very difficult to combine the studies into a single traditional meta-analysis. An approach useful in situations where equivalent raw data are not available across all studies is to combine the studies at the level of significance tests.52 The Liptak-Stouffer z score method is a well-validated method for combining P values across studies and has been used widely across genomics and biostatistics.53-59 Herein, we use the Liptak-Stouffer z score method to combine the results from studies investigating whether the 5-HTTLPR variant moderates the relationship between stress and depression.

Methods

Studies

Potential studies were identified from previous meta-analyses and review articles and through PubMed at the National Library of Medicine, using the search terms depression or depressed and “serotonin transporter” or 5-HTTLPR and stress or maltreatment.3,4,10 We subsequently checked the reference sections of the identified publications and contacted authors through e-mail to identify additional studies in press or under review. We considered all English-language studies published by November 2009 assessing whether 5-HTTLPR moderates the relationship between stress and depression. Two studies were excluded because their data were part of another larger study included in the analysis.12,60 In total, data from 54 publications met inclusion criteria and were included in the analysis.

To identify study design characteristics that might influence the ability to detect the interaction effect between 5-HTTLPR and life stress, we used 2 different grouping methods set out in a recent review article10 and assessed the presence of the association within each group. First, we stratified studies by the type of stressor studied (childhood maltreatment, specific medical conditions, and SLEs). When publications reported results for multiple types of stressors that matched different groups, we included the study in each relevant group.1,49,61-64 Second, we stratified studies by the method of stress assessment (objective, interview, and self-report questionnaire).

Consistent with current guidelines, we did not weight studies by quality scores or exclude studies with low quality scores. Instead, we report the quality data extracted, so that they are available for readers to evaluate (eTable).67,68 Further, to assess whether our results were influenced by studies rated as lower quality through this measure, we repeated our overall meta-analysis with only studies with a quality score higher than the median.69

P value extraction

Two investigators (K.K. and S.S.) independently extracted the relevant P value from each study. There were no cases of disagreement between the 2 investigators. When several P values were provided (because of the use of several depression scales or separate P values for different subsets of samples), we used a weighted mean P value for our analyses. For studies with nonsignificant results that did not provide exact probabilities, a P value of 1 (no association in either direction) was assumed. When an article reported analyses that matched different groups of our study, we incorporated the mean of the P value of each group into the overall analysis.

Statistical analysis

The Liptak-Stouffer z score method was used to combine studies at the level of significance tests, weighted by study sample size. First, all extracted P values were converted to 1-tailed P values, with P values less than .50 corresponding to greater s allele stress sensitivity and P values more than .50 corresponding to greater l allele stress sensitivity.
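As an illustration of this conversion, the following Python sketch assumes that a study reports a 2-tailed P value together with the direction of the effect. The function name `to_one_tailed` and the halving rule (which presumes a symmetric test statistic) are assumptions made for illustration and are not taken from the primary studies.

```python
def to_one_tailed(p_two_tailed: float, favors_s_allele: bool) -> float:
    """Map a 2-tailed P value onto the 1-tailed scale described above.

    Assumption: the test statistic is symmetric, so half of the 2-tailed
    probability lies in each tail.
    """
    if not 0.0 <= p_two_tailed <= 1.0:
        raise ValueError("P value must lie in [0, 1]")
    half = p_two_tailed / 2.0
    # Values below .50 indicate greater s allele stress sensitivity;
    # values above .50 indicate greater l allele stress sensitivity.
    return half if favors_s_allele else 1.0 - half
```

Under this assumption, a 2-tailed P value of 1 maps to a 1-tailed value of .50 regardless of direction, that is, no association in either direction.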

Next, these P values were converted to z scores using a standard normal curve such that P values less than .50 were assigned positive z scores and P values more than .50 were assigned negative z scores. Subsequently, these z scores were combined by calculating

$$z_w = \frac{\sum_{i=1}^{k} w_i z_i}{\sqrt{\sum_{i=1}^{k} w_i^2}}$$

where the weighting factor $w_i$ corresponds to the individual study sample sizes, $k$ corresponds to the total number of studies, and $z_i$ corresponds to the individual study z scores. The outcome of this test, $z_w$, follows a standard normal distribution and the corresponding probability can be obtained from a standard normal distribution table. We used this procedure on the overall sample as well as on each of the individual study subgroups. To assess whether our results were substantially influenced by the presence of any individual study, we conducted a sensitivity analysis by systematically removing each study and recalculating the significance of the result. Further, to compare our method of combining studies at the significance test level with the method of combining studies at the raw data level used in the previous meta-analyses, we performed an analysis with only the studies included in the previous meta-analyses.3,4
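The combination step and the leave-one-out sensitivity analysis described above can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the code used for the analysis; the function names are hypothetical, and the inputs are assumed to be 1-tailed P values strictly between 0 and 1.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def p_to_z(p_one_tailed):
    """Convert a 1-tailed P value to a z score.

    P < .50 yields a positive z score and P > .50 a negative one.
    inv_cdf requires 0 < p < 1, so exact 0 or 1 must be avoided.
    """
    return _STD_NORMAL.inv_cdf(1.0 - p_one_tailed)


def liptak_stouffer(p_values, sample_sizes):
    """Return (z_w, combined 1-tailed P) using sample sizes as weights."""
    zs = [p_to_z(p) for p in p_values]
    numerator = sum(w * z for w, z in zip(sample_sizes, zs))
    denominator = sqrt(sum(w * w for w in sample_sizes))
    z_w = numerator / denominator
    return z_w, 1.0 - _STD_NORMAL.cdf(z_w)


def leave_one_out(p_values, sample_sizes):
    """Combined P value obtained after removing each study in turn."""
    results = []
    for skip in range(len(p_values)):
        ps = [p for i, p in enumerate(p_values) if i != skip]
        ws = [w for i, w in enumerate(sample_sizes) if i != skip]
        results.append(liptak_stouffer(ps, ws)[1])
    return results
```

For example, `liptak_stouffer([0.01, 0.20, 0.60], [100, 200, 300])` returns the weighted z score and combined P value for three hypothetical studies, and `leave_one_out` with the same arguments returns one combined P value per omitted study.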

To assess the possibility that results of the meta-analysis were affected by publication bias, we calculated the fail-safe N for our overall analysis, the number of unpublished studies that would have to exist to change the outcome of the Liptak-Stouffer test from significant to nonsignificant. Because the commonly used analytic approximation70 method of calculating the fail-safe N has been criticized for inadequate reliability and accuracy, we used a more rigorous, direct computational approach.71 Specifically, we calculated the number of studies with a P value equal to .50 and a sample size of 755 (the average sample size in the studies we analyzed) that we would need to incorporate into the weighted Liptak-Stouffer analysis to obtain a nonsignificant outcome. The ratio between the fail-safe N and the number of studies actually published estimates the potential for publication bias to influence our results. We also considered the effect of false-positive findings due to small sample size by calculating the number of smallest studies that could be deleted before the analysis would reveal a nonsignificant result.
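A direct computation of this kind can be sketched as follows. The sketch is one possible reading of the procedure, under stated assumptions: each added null study has a 1-tailed P value of .50 (z score of 0) and a fixed sample size, the significance threshold is .05, and the fail-safe N is taken as the smallest number of added studies at which the combined result is no longer significant. The function names and the `max_added` cap are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def combined_p(p_values, sample_sizes):
    """Weighted Liptak-Stouffer combined 1-tailed P value."""
    num = sum(w * _STD_NORMAL.inv_cdf(1.0 - p)
              for p, w in zip(p_values, sample_sizes))
    den = sqrt(sum(w * w for w in sample_sizes))
    return 1.0 - _STD_NORMAL.cdf(num / den)


def fail_safe_n(p_values, sample_sizes, null_size, alpha=0.05,
                max_added=1_000_000):
    """Smallest number of added null studies giving a nonsignificant result."""
    # Null studies have z = 0, so only the denominator grows as they are added.
    num = sum(w * _STD_NORMAL.inv_cdf(1.0 - p)
              for p, w in zip(p_values, sample_sizes))
    sum_sq = sum(w * w for w in sample_sizes)
    for added in range(max_added + 1):
        z_w = num / sqrt(sum_sq + added * null_size ** 2)
        if 1.0 - _STD_NORMAL.cdf(z_w) >= alpha:
            return added
    raise RuntimeError("result still significant after max_added studies")


def removable_smallest(p_values, sample_sizes, alpha=0.05):
    """Number of smallest studies that can be deleted, one at a time,
    while the combined result stays significant."""
    studies = sorted(zip(sample_sizes, p_values))  # smallest sample first
    removed = 0
    while len(studies) - removed > 1:
        remaining = studies[removed + 1:]
        ps = [p for _, p in remaining]
        ws = [w for w, _ in remaining]
        if combined_p(ps, ws) >= alpha:
            break
        removed += 1
    return removed
```

Dividing the value returned by `fail_safe_n` by the number of included studies gives the fail-safe ratio described above.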

Results

Overall meta-analysis

Our initial search identified 148 publications. Of these, 54 studies that included 40 749 subjects met criteria for inclusion (Table 1). We found strong evidence that 5-HTTLPR moderates the relationship between stress and depression, with the s allele associated with an increased risk of developing depression under stress (P = .00002) (Figure). The significance of the result was robust to sensitivity analysis, with the overall P values remaining significant when each study was individually removed from the analysis (1.0 × 10⁻⁶ < P < .00016). When we restricted our analysis to those studies with a study “quality” score higher than the median, the P value remained significant (3.2 × 10⁻¹⁰). Further, there was evidence for genetic moderation among both the group of studies that used categorical measures of depression (P = .03; n = 17) and the group of studies that used continuous measures of depression (P = .001; n = 23).

Subgroup stratification

When stratifying our analysis by the type of stressor studied, we found strong evidence for an association between the s allele and increased stress sensitivity in the childhood maltreatment group (P = .00007) and the specific medical condition group (P = .0004) and marginal evidence for an association in the SLEs group (P = .03) (Tables 2, 3, and 4, respectively). The removal of individual studies did not lead to changes in the significance of the outcome in studies of childhood maltreatment (7.4 × 10⁻⁶ < P < .00014) or specific medical conditions (.00017 < P < .0068). However, the result among studies of SLEs became nonsignificant after the exclusion of any 1 of several studies1,35,38,40,76 (.013 < P < .062).

When stratifying our analysis by the stress assessment method, we found strong evidence for an association between the s allele and increased stress sensitivity among the objective measure group (P = .000003) and the interview assessment group (P = .0002) and marginal evidence in the self-report questionnaire group (P = .042). The removal of individual studies did not lead to changes in the significance of the outcome in studies assessing stress with objective measures (8.7 × 10⁻⁷ < P < .000029) or with interview assessments (4.0 × 10⁻⁶ < P < .0014). However, the result among studies assessing stress with self-report questionnaires became nonsignificant after the exclusion of any 1 of several studies23,25,35,38,40,73,76 (.018 < P < .093).

Studies from previous meta-analyses

When we restricted our analysis to the studies included in the 2 previous meta-analyses, we found no evidence of an association between 5-HTTLPR and stress sensitivity (Munafò et al studies, P = .16; Risch et al studies, P = .11).

Publication bias

To make the result of our overall analysis nonsignificant (P = .05), more than 729 unpublished or undiscovered studies with an average sample size (N = 755) and a nonsignificant result (P = .50) would need to exist. This corresponds to a fail-safe ratio of 14 studies not included in this meta-analysis for every included study. Additionally, we found that the 45 studies with the smallest sample sizes (of the 54 included) could be deleted before the outcome of the analysis would change to nonsignificant.
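As a quick check of the fail-safe ratio, assuming it is computed as the fail-safe N divided by the 54 included studies and rounded to the nearest whole study:

$$\frac{729}{54} = 13.5 \approx 14$$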

Comment

We found strong evidence that a serotonin transporter promoter polymorphism (5-HTTLPR) moderates the relationship between stress and depression, with the less functional s allele associated with increased stress sensitivity. This quantitative meta-analytic result is consistent with recent qualitative reviews on the same set of studies.10,11 In addition, our results are consistent with a wide range of experimental neuroscience studies that have found increased stress reactivity among 5-HTTLPR s allele carriers.87-89 Evidence from animal studies also supports that functional variation in the serotonin transporter (SERT) gene affects behavioral response to stress. Serotonin transporter knockout mice show increased hypothalamic-pituitary-adrenal axis activation in response to both physical and psychological stressors.90,91 Developmentally, SERT knockout mice show impaired cortex-layer 4-barrel pattern formation and altered levels of a broad range of serotonin receptor subtypes, providing potential mechanisms through which SERT function may be affecting behavior.92-94 Further, naturally occurring, low-functioning SERT gene variants in mice and nonhuman primates are associated with changes in central nervous system biochemistry as well as with behaviors linked to stress sensitivity.95,96

While our findings are consistent with this broad set of experimental neuroscience and animal studies, our findings are inconsistent with 2 other meta-analyses that have explored this association. The 2 most likely causes of the conflicting results between our meta-analysis and the previous meta-analyses are (1) the difference in meta-analytic technique used and (2) the different sets of included studies. To distinguish between these 2 possible causes, we applied our meta-analytic technique to the sets of studies used in the previous meta-analyses.4 With these limited sets of studies, our meta-analytic technique produced the same nonsignificant result as the previous meta-analyses, suggesting that the difference in results between meta-analyses was due to the different set of included studies rather than the different meta-analytic technique.

The results of our secondary meta-analysis, where we stratified studies by stressor type, also support the hypothesis that the difference in results between our meta-analysis and the previous meta-analyses is due to the difference in the primary studies included. Both previous meta-analyses focused exclusively on SLEs and reported no evidence that 5-HTTLPR moderates the relationship between SLEs and depression. Herein, we were able to include 11 additional SLE studies not included in previous meta-analyses but still found only marginal evidence that 5-HTTLPR moderates the relationship between SLEs and depression.6 In contrast, we found robust evidence that 5-HTTLPR moderates the relationship between both childhood maltreatment and specific stressors and depression.

One important variable that may help to account for the different results in the different stressor groups is the variability between studies within each group.86 Within the childhood maltreatment and specific stressors groups, the designs of the primary studies were generally similar. In contrast, there is marked variation in study design between SLE studies. Some studies asked subjects about SLEs and depressive episodes that occurred decades earlier while others assessed SLEs and depressive episodes soon after they occurred.33,75 Further, SLE studies vary substantially in what they consider a life event. Another potential reason for the difference in stressor subgroup results is the nature of the stressors studied. Most of the specific stressor studies focused on chronic stressors while the SLE studies focused on acute SLEs. Interestingly, 3 studies have explicitly looked at both acute and chronic stressors in their cohorts and all 3 have found that the evidence for moderating effects was stronger for chronic stressors.13,19,48

In addition to the type of stressor studied, the stress assessment method used by investigators emerged as an important variable in our analysis. In particular, we found that the evidence of genetic moderation was stronger among studies that used objective measures or in-person interviews to assess stress than among studies that used self-report questionnaires.

There are limitations to our study, most of which follow from the meta-analytic technique that we used. Because we combined studies at the level of P values, errors or bias present in the statistical tests performed in the primary studies could potentially affect the results of this meta-analysis. We guarded against finding false-positive results due to this potential bias by using an average of reported P values when authors performed separate tests on different sample subgroups or multiple depression measures. Further, when authors performed multiple tests but only reported the significance results for a subset of these tests, we assumed that P = 1 for the unreported tests. The fact that we found a nonsignificant result when we applied our meta-analytic technique to the set of studies included in a previous nonsignificant meta-analysis suggests that statistical bias from primary studies did not unduly affect our results. Another drawback of using this meta-analytic method is that we were unable to estimate the magnitude of the genetic effect and, in particular, how the interaction effect size compares with any genetic main effect.17

Against this background, the present study suggests that there is cumulative and replicable evidence that 5-HTTLPR moderates the relationship between stress and depression. Our findings, particularly the identification of important study characteristics that influence study outcome (stressor type and stress assessment method), can provide guidance for the design of future gene × environment interaction studies. While there is certainly variation between study results, there is hope that a new generation of studies purpose-built for testing this specific hypothesis will improve replicability and shed light on sources of inconsistency. In the meantime, the results of our inclusive analysis of studies in this controversial area underscore the importance of including all relevant studies in meta-analyses and highlight the utility of incorporating environmental exposures in genetic association studies.

Submitted for Publication: April 7, 2010; final revision received October 22, 2010; accepted November 1, 2010.

Published Online: January 3, 2011. doi:10.1001/archgenpsychiatry.2010.189

Author Contributions: Dr Sen and Ms Karg had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

Financial Disclosure: None reported.

Funding/Support: This work was supported by National Institutes of Health KL2 grant number UL1RR024986 and the University of Michigan Depression Center.

Role of the Sponsor: The funding agencies played no role in the design and conduct of the study; collection, management, analysis, or interpretation of the data; and preparation, review, or approval of the manuscript.