Copyright in this material is held by the American College of Physicians (unless otherwise noted).

From Danube University, Krems, Austria; University of North Carolina at Chapel Hill, Chapel Hill, North Carolina; Indiana University School of Medicine, Roudebush Veterans Affairs Medical Center, and Regenstrief Institute, Indianapolis, Indiana; University of Pittsburgh, Pittsburgh, Pennsylvania; and RTI International, Research Triangle Park, North Carolina.

Disclaimer: The authors of this report are responsible for its content. Statements in the report should not be construed as endorsement by the Agency for Healthcare Research and Quality or the U.S. Department of Health and Human Services of a particular drug, device, test, treatment, or other clinical service.

Acknowledgment: The authors thank Timothy S. Carey, MD, MPH, and Stacey Williams, MA, from the University of North Carolina at Chapel Hill, and also Linda Lux, MPA, and Loraine Monroe of RTI International.

Grant Support: By a contract from the Agency for Healthcare Research and Quality to the RTI International–University of North Carolina Evidence-based Practice Center (contract no. 290-02-0016).

Abstract

Background: Second-generation antidepressants dominate the management of major depressive disorder, dysthymia, and subsyndromal depression. Evidence on the comparative benefits and harms is still accruing.

Data Sources: MEDLINE, EMBASE, PsychLit, Cochrane Central Register of Controlled Trials, and International Pharmaceutical Abstracts from 1980 to April 2007, limited to English-language articles. Reference lists of pertinent review articles were manually searched and the Center for Drug Evaluation and Research database was explored to identify unpublished research.

Study Selection: Abstracts and full-text articles were independently reviewed by 2 persons. Six previous good- or fair-quality systematic reviews or meta-analyses were included, as were 155 good- or fair-quality double-blind, placebo-controlled, or head-to-head randomized, controlled trials of at least 6 weeks' duration. For harms, 35 observational studies with at least 100 participants and follow-up of at least 12 weeks were also included.

Data Synthesis: If data were sufficient, meta-analyses of head-to-head trials were conducted to determine the relative benefit of response to treatment and the weighted mean differences on specific depression rating scales. If sufficient evidence was not available, adjusted indirect comparisons were conducted by using meta-regressions and network meta-analyses. Second-generation antidepressants did not substantially differ in efficacy or effectiveness for the treatment of major depressive disorder on the basis of 203 studies; however, the incidence of specific adverse events and the onset of action differed. The evidence is insufficient to draw conclusions about the comparative efficacy, effectiveness, or harms of these agents for the treatment of dysthymia and subsyndromal depression.

Conclusion: Current evidence does not warrant the choice of one second-generation antidepressant over another on the basis of differences in efficacy and effectiveness. Other differences with respect to onset of action and adverse events may be relevant for the choice of a medication.

Major depressive disorder (MDD) is the most prevalent axis I disorder, affecting more than 16% of U.S. adults during their lifetime (1). In 2000, the economic burden of depressive disorders was an estimated $83.1 billion (2), more than 30% of which was attributable to direct medical expenses.

Pharmacotherapy dominates the medical management of MDD. Since the mid-1980s, second-generation antidepressants have gradually replaced tricyclic antidepressants and monoamine oxidase inhibitors as first-line medications, primarily because of their lower toxicity in overdose and similar general efficacy (3). These newer treatments include selective serotonin reuptake inhibitors, serotonin and norepinephrine reuptake inhibitors, selective serotonin and norepinephrine reuptake inhibitors, and other second-generation drugs (Table 1).

To date, only 2 systematic reviews have assessed the comparative efficacy and harms of second-generation antidepressants (3–4). These studies reported no substantial differences in efficacy or harms among agents. However, because of a lack of direct head-to-head comparisons, assessments in both studies were primarily qualitative. Consequently, uncertainties persist about the differences among the drugs for which sufficient head-to-head evidence is lacking.

We systematically assessed evidence on the comparative benefits and harms of second-generation antidepressants for the acute, continuation, and maintenance phases of treatment of MDD; subsyndromal depression; and dysthymia and the comparative efficacy and effectiveness for such accompanying symptoms as anxiety, insomnia, or neurovegetative symptoms. We also sought to determine whether efficacy, effectiveness, and harms differed among subgroups of patients on the basis of age, sex, race or ethnicity, or comorbid conditions.

To our knowledge, this is the first meta-analysis of second-generation antidepressants to assess quantitatively all possible comparisons among drugs in this class. We update findings of an earlier report on these pharmaceuticals (5) for the Agency for Healthcare Research and Quality.

Methods

An open process (described at http://www.effectivehealthcare.ahrq.gov) involving the public, the Agency for Healthcare Research and Quality's Scientific Resource Center for Effective Health Care program, and various stakeholder groups produced key questions. We followed a standardized protocol for all review steps (5).

Data Sources

We searched MEDLINE, EMBASE, PsychLit, Cochrane Central Register of Controlled Trials, and International Pharmaceutical Abstracts from 1980 to April 2007. We used Medical Subject Heading terms when available and keywords when appropriate. We combined terms for depressive disorders with a list of 12 specific second-generation antidepressants (bupropion, citalopram, duloxetine, escitalopram, fluoxetine, fluvoxamine, mirtazapine, nefazodone, paroxetine, sertraline, trazodone, and venlafaxine) and their specific trade names. We limited electronic searches to “adult 19+ years,” “human,” and “English language.”

We manually searched reference lists of pertinent review articles and letters to the editor and used the Center for Drug Evaluation and Research database (up to April 2007) to identify unpublished research submitted to the U.S. Food and Drug Administration. The Scientific Resource Center invited pharmaceutical manufacturers to submit dossiers on completed research for each drug. We received dossiers from 3 pharmaceutical companies (Eli Lilly and Company, Indianapolis, Indiana; GlaxoSmithKline, Philadelphia, Pennsylvania; and Wyeth, Madison, New Jersey).

Study Selection

Two persons independently reviewed abstracts and relevant full-text articles. To assess efficacy or effectiveness regarding response, speed of onset, remission, maintenance of remission, and quality of life, we included head-to-head controlled trials of at least 6 weeks' duration that compared 1 drug with another. Because head-to-head evidence was lacking for many comparisons, we included placebo-controlled trials for indirect comparison models. To assess harms (specific adverse events, rates of adverse events, and discontinuations attributable to adverse events), we also examined data from observational studies with at least 100 participants and follow-up of at least 12 weeks. To assess differences of benefits and harms in subgroups and patients with accompanying symptoms, we reviewed both head-to-head and placebo-controlled trials. We included meta-analyses if we found them to be relevant for a key question and of good or fair methodological quality (6).

If both reviewers agreed that a study did not meet eligibility criteria, we excluded it. We also excluded studies that met eligibility criteria but were reported only as an abstract. Investigators resolved disagreements about inclusion or exclusion by consensus or by involving a third reviewer.

Data Extraction and Quality Assessment

We used a structured, Web-based data abstraction form (SRS 4.0, TrialStat, Ottawa, Ontario, Canada) onto which trained reviewers abstracted data from each study and assigned an initial quality rating. A senior reviewer read each abstracted article, evaluated completeness of data abstraction, and confirmed the quality rating. Investigators resolved disagreements by discussion and consensus or by consulting an independent party.

We assessed the internal validity (quality) of trials on the basis of predefined criteria and applied ratings of good, fair, or poor (5, 7–8). Primary elements of quality assessment included randomization and allocation concealment, similarity of compared groups at baseline, blinding, use of intention-to-treat analysis, and overall and differential loss to follow-up. To assess observational studies, we used criteria involving selection of case patients or cohorts and control participants, adjustment for confounders, methods of outcomes assessment, length of follow-up, and statistical analysis (9). We rated studies with a fatal flaw in 1 or more categories as poor quality (Appendix Table 1) and did not include them in our analyses for this review unless no other head-to-head evidence was available. To identify effectiveness studies, we used a tool that distinguishes efficacy trials from effectiveness studies on the basis of certain elements of study design (10). Such studies have greater generalizability of results than efficacy trials because they enroll less selected study populations, use treatment modalities that mimic clinical practice, and assess health outcomes along with adverse events.

Lacking clear definitions about the equivalence of dosages among second-generation antidepressants in the published literature, we developed a roster of low, medium, and high dosages for each drug based on the interquartile dosing range (5). We used this roster, which does not indicate dosing equivalence, to detect gross inequalities in dosing that could affect comparative efficacy and effectiveness.

Data Synthesis

If data were sufficient, we conducted meta-analyses of head-to-head comparisons. Efficacy outcomes included the relative benefit of achieving response (more than 50% improvement from baseline), which reflects the ratio of benefits in one treatment group to benefits in another, and the weighted mean difference of changes on the Hamilton Depression Rating Scale or the Montgomery-Asberg Depression Rating Scale.
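The two efficacy measures described above can be illustrated with a minimal sketch. The function names and all trial numbers below are hypothetical and are not taken from the included studies; the confidence interval uses a standard log-scale Wald approximation, and the pooling shown is a simple fixed-effect inverse-variance average, which may differ in detail from the software routines used for this review.

```python
import math


def relative_benefit(resp_a, n_a, resp_b, n_b, z=1.96):
    """Ratio of response proportions (drug A vs. drug B) with a Wald CI on the log scale."""
    rb = (resp_a / n_a) / (resp_b / n_b)
    # Standard error of log(rb) for two independent binomial proportions.
    se_log = math.sqrt(1 / resp_a - 1 / n_a + 1 / resp_b - 1 / n_b)
    lower = math.exp(math.log(rb) - z * se_log)
    upper = math.exp(math.log(rb) + z * se_log)
    return rb, lower, upper


def weighted_mean_difference(diffs, std_errors):
    """Fixed-effect inverse-variance pooled mean difference and its standard error."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * d for w, d in zip(weights, diffs)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se


# Hypothetical trial: 60 of 100 responders on drug A, 50 of 100 on drug B.
rb, lower, upper = relative_benefit(60, 100, 50, 100)

# Hypothetical per-trial mean differences (rating-scale points) and standard errors.
wmd, wmd_se = weighted_mean_difference([1.0, 2.0], [1.0, 1.0])
```

A relative benefit above 1 favors drug A; if the interval spans 1, the difference in response is not statistically significant at the chosen level.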

For each meta-analysis, we conducted a test of heterogeneity (I² index) and applied both random- and fixed-effects models. We report the random-effects results because the results from both models were very similar in all meta-analyses. We assessed publication bias by using funnel plots and the Begg adjusted rank correlation test (11) based on the Kendall τ coefficient.
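As a rough illustration of the heterogeneity index and the two pooling models, the sketch below computes Cochran's Q, the I² index, and a random-effects estimate. It assumes the DerSimonian-Laird estimator for the between-study variance, which is a common choice but is not named in the text; the function name and the input values are hypothetical.

```python
import math


def random_effects_summary(effects, variances):
    """Fixed- and random-effects pooling with Cochran's Q and I^2 (DerSimonian-Laird assumed)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    # I^2: percentage of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance to each study variance.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    pooled_se = math.sqrt(1 / sum(w_re))
    return {"fixed": fixed, "random": pooled, "se": pooled_se,
            "q": q, "i2": i2, "tau2": tau2}


# Two hypothetical studies with strongly conflicting effects (e.g., log relative benefits).
summary = random_effects_summary([0.0, 1.0], [0.01, 0.01])
```

When the between-study variance is estimated as zero, the random-effects weights reduce to the fixed-effect weights, which is one reason the two models can give very similar results.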

Because no head-to-head evidence was available for the majority of drug comparisons, we conducted adjusted indirect comparisons (5). We employed meta-regressions of placebo-controlled trials by using individual drugs as covariates. When the number of trials was insufficient for meta-regressions, we used modified network meta-analysis (12). Evidence suggests that indirect comparisons agree with head-to-head trials if component studies are similar and treatment effects are expected to be consistent in patients included in different trials (13), although these assumptions are usually not verifiable.
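The idea behind an adjusted indirect comparison can be sketched with the simplest version of the technique, in which two drugs are compared through their shared placebo comparator. This is a deliberately simplified stand-in (often attributed to Bucher and colleagues), not the meta-regression or modified network meta-analysis used in this review; the function name and the input values are hypothetical.

```python
import math


def indirect_comparison(log_rb_a, se_a, log_rb_b, se_b, z=1.96):
    """Indirect A-vs-B relative benefit from A-vs-placebo and B-vs-placebo estimates.

    Inputs are log relative benefits and their standard errors. The placebo
    effect cancels on the log scale, and the variances of the two estimates add.
    """
    diff = log_rb_a - log_rb_b
    se = math.sqrt(se_a ** 2 + se_b ** 2)
    return math.exp(diff), math.exp(diff - z * se), math.exp(diff + z * se)


# Hypothetical pooled estimates against placebo (not values from this review).
ratio, lower, upper = indirect_comparison(math.log(1.5), 0.1, math.log(1.25), 0.1)
```

Because the variances add, an indirect estimate is less precise than either placebo comparison it is built from, which is consistent with wide confidence intervals for comparisons that lack head-to-head trials.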

All statistical analyses used StatsDirect Statistical Software program, version 2.3.8 (StatsDirect, Sale, United Kingdom); Stata, version 9.1 (StataCorp, College Station, Texas); and SAS, version 9.1 (SAS Institute, Cary, North Carolina).

Rating the Strength of Evidence

We rated the strength of the available evidence for specific key questions and outcomes in a 3-part hierarchy (high, moderate, and low) (5) by using a modified GRADE (Grading of Recommendations, Assessment, Development, and Evaluation) approach (14–15) that incorporates 4 key elements: study design, study quality, consistency of results, and directness (availability of data on outcomes or populations of interest).

Role of Funding Source

The Agency for Healthcare Research and Quality participated in formulating the key questions and reviewed and commented on planned methods and data analysis. The Agency had no role in study selection, quality ratings, or interpretation and synthesis of the evidence, although staff reviewed interim and final evidence reports and distributed them for external peer review by outside experts.

Results

We identified 2318 citations from searches and reviews of reference lists (Figure 1). Of the 203 included studies (Appendix Tables 2 to 11), 140 (69.0%) were financially supported by pharmaceutical companies and 19 (9.3%) by governmental agencies or independent funds. For 44 (21.7%) studies, we could not determine the funding source.

Major Depressive Disorder

Overall, we found no substantial differences in comparative efficacy and effectiveness of second-generation antidepressants for treatment of MDD (Tables 2, 3, and 4 and Figures 2, 3, and 4). This finding pertains to the acute, continuation, and maintenance phases of treatment; to patients with accompanying symptom clusters; and to subgroups defined by age, race or ethnicity, sex, or comorbid conditions (we found only sparse evidence for subgroups). Nevertheless, second-generation antidepressants are not identical drugs. They differ somewhat with respect to onset of action and frequency of some adverse events. Generally, effectiveness studies with less stringent eligibility criteria provided results similar to those of efficacy trials, indicating good generalizability of our findings to primary care populations.

Comparative Efficacy for Acute-Phase Treatment of MDD

Eighty good- or fair-quality head-to-head, randomized, controlled trials (RCTs), comprising more than 17 000 patients, compared efficacy or effectiveness for acute-phase MDD treatment. These studies provided direct evidence for 36 of 66 possible comparisons among these drugs. Only 5 trials directly compared one second-generation antidepressant that is not a selective serotonin reuptake inhibitor with another such drug; of these, only 1 comparison was evaluated in more than 1 trial.
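The figure of 66 possible comparisons follows from pairing each of the 12 drugs named in the search strategy with every other drug once. A short sketch (variable names are illustrative only):

```python
from itertools import combinations

# The 12 second-generation antidepressants named in the search strategy.
DRUGS = [
    "bupropion", "citalopram", "duloxetine", "escitalopram",
    "fluoxetine", "fluvoxamine", "mirtazapine", "nefazodone",
    "paroxetine", "sertraline", "trazodone", "venlafaxine",
]

# Every unordered pair of distinct drugs is one possible comparison.
pairs = list(combinations(DRUGS, 2))
n_possible = len(pairs)

# Share of possible comparisons with direct head-to-head evidence (36 reported).
share_direct = 36 / n_possible * 100
```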

For the 62 comparisons of 1 drug with another for which data were available, we conducted indirect evaluations of response rates, incorporating an additional 34 placebo-controlled trials of good or fair quality comprising 26 349 patients (Appendix Table 11).

For almost all comparisons, no statistically significant differences in response rates were apparent (Figures 2, 3, and 4). For some indirect comparisons, however, the precision of estimates was low and confidence intervals encompassed differences that would be clinically significant.

Findings from some meta-analyses yielded statistically significant differences among treatments, but the modest effect sizes of the differences are probably not clinically significant (5). For example, the meta-analytic comparison of response rates to citalopram versus escitalopram (16–20) yielded a statistically significant additional treatment effect for escitalopram (relative benefit favoring escitalopram, 1.14 [95% CI, 1.04 to 1.26]) (5). Pooled differences of points on the Montgomery-Asberg Depression Rating Scale presented a mean additional treatment effect (weighted mean difference) of a 1.13-point reduction (CI, 0.18 to 2.09) for escitalopram (5). A 1.13-point change on the Montgomery-Asberg Depression Rating Scale represents about one fifth to one quarter of a standard deviation, so the clinical significance of this finding may be questionable. Methods research suggests that half a standard deviation constitutes a minimally important difference for health-related quality-of-life outcomes (21).
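The comparison with a minimally important difference can be made explicit with simple arithmetic. The sketch below takes the statement that 1.13 points equals one fifth to one quarter of a standard deviation at face value and derives the implied standard deviation and the corresponding half-standard-deviation threshold; the function name is hypothetical, and the derived values are implications of those stated fractions rather than figures reported in the review.

```python
def implied_sd_range(difference, frac_small=1 / 5, frac_large=1 / 4):
    """Range of standard deviations implied by a difference stated as a fraction of an SD."""
    # A larger fraction implies a smaller SD, and vice versa.
    return difference / frac_large, difference / frac_small


wmd = 1.13        # pooled additional reduction for escitalopram (points)
ci_upper = 2.09   # upper limit of the reported confidence interval

sd_min, sd_max = implied_sd_range(wmd)

# Half a standard deviation, the suggested minimally important difference (21).
mid_min, mid_max = 0.5 * sd_min, 0.5 * sd_max

# Under these assumptions, even the upper confidence limit stays below the threshold.
below_threshold = ci_upper < mid_min
```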

Meta-analyses yielded significantly lower response rates for fluoxetine than for sertraline (22–25) or venlafaxine (26–33). The small effect sizes of the differences are probably not clinically relevant.

Eighteen trials (18, 23, 33–48), mostly of fair quality, included health-related quality of life or functional capacity as secondary outcome measures. We found no differences among second-generation antidepressants for these outcomes.

Comparative Effectiveness for Acute-Phase Treatment of MDD

Three studies (23, 49–50) can be considered effectiveness rather than efficacy trials. Their findings were consistent with those of the efficacy trials. Two fair-quality effectiveness trials indicated that improvement of health-related quality of life (work, social and physical functioning, concentration and memory, and sexual functioning) was similar for fluoxetine, paroxetine, and sertraline (23, 50).

Speed of Response

Seven fair-quality studies (39–40, 45, 51–55) reported that mirtazapine had a significantly faster onset of action than citalopram, fluoxetine, paroxetine, or sertraline after 1 or 2 weeks of treatment. All studies were supported by the manufacturer of mirtazapine. After 4 weeks of treatment, most response rates were similar. The extent to which the faster onset of mirtazapine can be extrapolated to other second-generation antidepressants is unclear. Mirtazapine and venlafaxine did not differ in speed of action (42).

Response to a Second Agent after Initial Treatment Failure

Overall, 38% of patients did not achieve a treatment response during 6 to 12 weeks of treatment with second-generation antidepressants; 54% did not achieve remission. The STAR*D (Sequenced Treatment Alternatives to Relieve Depression) trial (56) provides the best evidence for assessing alternative medications among those for whom initial therapy failed. About 1 in 4 of the 727 people who participated in the switch of medications became symptom-free; this did not differ significantly among those who received sustained-release bupropion, sertraline, or extended-release venlafaxine. One open-label study (57) and a smaller efficacy study (58) reported significantly greater response rates for venlafaxine than for other second-generation drugs. Given the STAR*D findings, the clinical significance of this difference is questionable.

Maintaining Response or Remission after Treatment Success

Findings from 4 fair-quality head-to-head RCTs assessing relapse or recurrence prevention (59–63) were similar for the comparisons of fluoxetine and sertraline, fluvoxamine and sertraline, duloxetine and paroxetine, and trazodone and venlafaxine. In 1 trial (59), among 105 patients who demonstrated a response at 8 weeks, 5 (10%) of 49 sertraline-treated patients and 7 (13%) of 56 fluoxetine-treated patients had relapse over 24 weeks of continuation-phase treatment.

Efficacy or Effectiveness for Depression or Accompanying Symptoms

Clinicians may use symptom clusters that accompany depression (such as anxiety or insomnia) to guide antidepressant selection. This might improve outcomes for the depressive episode, the symptom cluster, or both. We reviewed available evidence for clinically relevant symptom clusters to address each possibility.

Treatment of Depression in Patients with Accompanying Symptom Clusters

Anxiety

Six fair-quality head-to-head trials (31, 35, 64–68) suggest that antidepressants have similar antidepressive efficacy for patients with MDD and anxiety symptoms. These studies compared either fluoxetine or paroxetine with sertraline (259 patients with accompanying anxiety) (64–65); sertraline with bupropion (972 patients; number with anxiety not provided) (66–68); and sertraline with venlafaxine (20 patients with anxiety) (35). One fair-quality, 12-week trial (31) of 146 patients reported significantly greater response (75.0% vs. 49.3%) and remission rates (59.4% vs. 40.3%) with venlafaxine than with fluoxetine.

Insomnia

Two fair-quality head-to-head trials (441 patients with insomnia) (24, 69) provide limited evidence for similar efficacy of fluoxetine, nefazodone, paroxetine, or sertraline for treating depression in patients with accompanying insomnia. A pooled analysis of 3 RCTs (447 patients) (70) reported that the reduction on the Montgomery-Asberg Depression Rating Scale total score was significantly greater for patients receiving escitalopram than for those receiving citalopram (16.5 vs. 14.0); however, the clinical significance of this difference remains uncertain.

Melancholia

Two fair-quality head-to-head trials (286 patients) (28, 65) and 1 poor-quality head-to-head trial (68 patients) (71) assessed the effects of medications for treating depression in patients with melancholia. Although 2 studies reported greater response rates for sertraline than for fluoxetine (59% vs. 44%) (65) and for venlafaxine than for fluoxetine (70% vs. 50%) (71), the small sample sizes (87 and 68 patients) and high attrition rate (71) limit confidence in these findings.

Pain

We found no head-to-head evidence. Two placebo-controlled trials reported similar response rates for patients with MDD and pain who received duloxetine (72) or paroxetine (73) compared with those who received placebo.

Psychomotor Changes

The evidence is limited to subgroup analyses from 1 fair-quality head-to-head trial (65). Fluoxetine and sertraline had similar antidepressive efficacy among 47 patients with psychomotor retardation, but sertraline had higher efficacy among 78 patients with psychomotor agitation (65). Results should be interpreted cautiously because small sample sizes and multiple testing can lead to erroneous results in such subgroup analyses.

Treatment of Symptom Clusters in Patients with Accompanying Depression

Insomnia

Five fair-quality head-to-head trials (24, 37, 45, 62, 69) and a pooled analysis of 3 RCTs (70) involving 1540 patients provide limited evidence about the comparative effects of antidepressants on insomnia in patients with depression. Individual trials favored escitalopram over citalopram (70), nefazodone over fluoxetine (69), and trazodone over fluoxetine (37) and venlafaxine (62) in improving sleep scores. The comparisons were limited to single studies, and it is difficult to assess the clinical significance of these findings.

Pain

Three fair-quality head-to-head trials (63, 78–79) and 1 poor-quality trial (80) compared duloxetine with paroxetine. These trials (1466 patients) found no substantial difference in pain relief between duloxetine and paroxetine.

Sexual Dysfunction

A fair-quality prospective observational study (1022 patients) from Spain reported that 59% of patients treated with second-generation antidepressants experienced sexual dysfunction (81). On the basis of 5 RCTs (1489 patients), bupropion led to a significantly lower rate of sexual adverse events than fluoxetine and sertraline (82–86). Paroxetine frequently led to higher rates of sexual dysfunction than did fluoxetine, fluvoxamine, nefazodone, or sertraline (16% vs. 6%) (24, 76, 87–88). Underreporting of absolute rates of sexual dysfunction is likely in these studies.

Suicidality

Eleven studies (89–99) assessed the risk for suicidality (suicidal thinking or behavior) in patients treated with second-generation antidepressants; comparative data are sparse. No particular drug has an excess risk compared with any other drug in this class (94, 98). These findings are based primarily on retrospective cohort studies (91, 93–94, 98). Confounding by indication (patients at higher risk for suicide being prescribed certain medications rather than others) may have led to erroneous conclusions.

In 2004, the United Kingdom's Committee on Safety of Medicines conducted the largest attempt to determine whether second-generation antidepressants increase the risk for suicidality (89). A good meta-analysis of placebo-controlled trials of selective serotonin reuptake inhibitors, comprising more than 40 000 adults, yielded no evidence that these agents increase the risk for suicide (odds ratio, 0.85 [CI, 0.20 to 3.40]) but did suggest an increased risk for nonfatal suicide attempts (odds ratio, 1.57 [CI, 0.99 to 2.55]), although the confidence interval for that estimate includes 1 (92).

Another good meta-analysis of published trial data (90), comprising more than 87 000 patients, reported a significantly higher risk for suicide attempts among patients receiving selective serotonin reuptake inhibitors than among those receiving placebo (odds ratio, 2.25 [CI, 1.14 to 4.55]). This study estimated the overall rate of suicide attempts as 3.9 (CI, 3.3 to 4.6) per 1000 patients treated with these drugs, with an incidence of 18.2 suicide attempts per 1000 patient-years.

Other Severe Adverse Events

Evidence on the comparative risk for rare but severe adverse events, such as seizures, cardiovascular events (events relating to systolic and diastolic blood pressure and pulse or heart rate), hyponatremia, hepatotoxicity, and the serotonin syndrome, is insufficient to draw firm conclusions. Clinicians should keep in mind the risk for such harms when treating patients with a second-generation antidepressant.

Treatment of MDD in Subgroups

No study directly compared efficacy, effectiveness, and harms of second-generation antidepressants between subgroups and the general population for treatment of depression syndromes. Numerous studies, however, conducted subgroup analyses or used subgroups as the study population.

Age

Multiple head-to-head trials (22, 44, 48, 50, 54, 100–107) and 2 fair-quality meta-analyses (108–109) indicated that the efficacy of second-generation antidepressants does not differ in elderly patients (65 to 80 years of age) or very elderly patients (>80 years of age) compared with younger patients. These findings are consistent with placebo-controlled trials (110–116) conducted in elderly or very elderly patients, which reported effect sizes similar to those from trials in younger patients.