Comparative Effectiveness of Second Generation Antidepressants in the Pharmacologic Treatment of Adult Depression – An Update to a 2007 Report

Background and Objectives for the Systematic Review

Depressive disorders such as major depressive disorder (MDD), dysthymia, and subsyndromal depression (including minor depression) may be serious disabling illnesses. MDD is the most prevalent, affecting more than 16 percent (lifetime) of U.S. adults. It was recently reported that, in the U.S. workforce, monthly depression-related losses in worker productivity attributable to MDD resulted in human capital costs of almost $2 billion (Birnbaum 2009). In 2000, the U.S. economic burden of depressive disorders was estimated to be $83.1 billion. More than 30 percent of these costs are attributable to direct medical expenses.

Pharmacotherapy dominates the medical management of depressive disorders and may include first-generation antidepressants (tricyclic antidepressants and monoamine oxidase inhibitors) and more recently developed second-generation antidepressants. These second-generation treatments include selective serotonin reuptake inhibitors (SSRIs) and serotonin and norepinephrine reuptake inhibitors (SNRIs). The mechanism of action of most of these agents is poorly understood. These drugs work, at least in part, through their effects on neurotransmitters such as serotonin, norepinephrine, or dopamine in the central nervous system.

In general, the efficacy of first- and second-generation antidepressant medications is similar (Mulrow et al 1999). However, first-generation antidepressants often produce multiple side effects that many patients find intolerable, and the risk for harm when taken in overdose or in combination with certain medications is high. Because of their relatively favorable side effect profile, the second-generation antidepressants play a prominent role in the management of patients with major depressive disorder and are the focus of this review.

This comparative effectiveness review is an update of the original review conducted in 2006-2007 for the Agency for Healthcare Research and Quality, with the same key questions (Gartlehner et al. 2007). The first review did not include desvenlafaxine (listed below) because it was approved after the literature searches for the first review were conducted. The report will summarize the available evidence on the comparative efficacy, effectiveness, and harms of 13 second-generation antidepressants: bupropion, citalopram, duloxetine, escitalopram, fluoxetine, fluvoxamine, mirtazapine, nefazodone, paroxetine, sertraline, trazodone, venlafaxine, and desvenlafaxine in treating patients with MDD, dysthymia, and subsyndromal depression. It also will evaluate the comparative efficacy and effectiveness for maintaining remission and for treating accompanying symptoms such as anxiety, insomnia, or neurovegetative symptoms.

The Key Questions

Our key questions remain the same as in the previous review, with the exception of a few additional sub-questions suggested by our technical expert panel. The rationale for these additional sub-questions is described below.

KQ1:

For adults with major depressive disorder (MDD), dysthymia, or subsyndromal depressive disorders, do second-generation antidepressants differ in efficacy or effectiveness in treating depressive symptoms?

If a patient has responded to one agent in the past, is that agent better than current pharmacologic alternatives at treating depressive symptoms?

Are there any differences in efficacy or effectiveness between immediate release and extended release formulations of second-generation antidepressants?

KQ2:

For adults with a depressive syndrome that has responded to antidepressant treatment, do second generation antidepressants differ in their efficacy or effectiveness for preventing relapse (i.e., continuation phase) or recurrence (i.e., maintenance phase) when a patient

continues the drug they initially responded to, or

switches to a different antidepressant?

Comparators include (1) continued treatment of different antidepressants in all study arms and (2) continued treatment versus switched treatment.

For adults with a depressive syndrome that has not responded to acute antidepressant treatment or has relapsed during continuation phase treatment, do alternative second-generation antidepressants differ in their efficacy or effectiveness?

KQ3: In depressed patients with accompanying symptoms such as anxiety, insomnia, and neurovegetative symptoms, do medications or combinations of medications (including a tricyclic in combination) differ in their efficacy or effectiveness for treating the depressive episode or for treating the accompanying symptom?

KQ4:

For adults with a depressive syndrome, do second-generation antidepressants differ in safety, adverse events, or adherence? Adverse effects of interest include but are not limited to nausea, diarrhea, headache, tremor, daytime sedation, decreased libido, failure to achieve orgasm, nervousness, insomnia, and more severe events including suicide.

Are there any differences in safety, adverse events, or adherence between immediate release and extended release formulations of second-generation antidepressants?

KQ5: How do the efficacy, effectiveness, or harms of treatment with antidepressants for major depressive disorder (MDD), dysthymia, or subsyndromal depressive disorders, differ for the following subpopulations:

Elderly or very elderly patients; Other demographic groups (defined by age, ethnic or racial groups, and sex);

Subquestion KQ1c was suggested by our technical expert panel as an addition to key question 1. Many of the second generation antidepressants included in this review are available in both immediate and extended release formulations. Medications with new formulations (such as an extended release formulation) are approved by the Food and Drug Administration (FDA) through an application called a supplemental new drug application (sNDA). While the original NDA has specific requirements about the types of trials that need to be conducted (Phase I-III trials with 2 pivotal placebo-controlled RCTs, etc.), the considerations for an sNDA depend more upon circumstances. Sometimes it is sufficient for the manufacturer to demonstrate bioequivalence. The generalizability of results of such studies to patient-relevant outcomes, however, is often unclear. The new sub-question would explore the comparative efficacy of immediate and extended release formulations of second-generation antidepressants.

KQ2a:

The technical expert panel suggested this question based on the fact that patients are sometimes successfully started on a second-generation antidepressant and later switched to another medication. The outcome of interest is whether response or remission can be maintained if patients who have responded to one drug have to be switched to another second-generation antidepressant.

KQ4b:

Key question 4b extends the scope of KQ1c to explore differences between immediate and extended release formulations with respect to adverse events. For example, in a sensitivity analysis of a meta-analysis on the risk of nausea and vomiting during treatment with second-generation antidepressants, we noticed that patients on extended release venlafaxine experienced nausea and vomiting less commonly than patients treated with immediate release venlafaxine.

PICOTS for all Key Questions

Population(s):

Adults with major depressive disorder (MDD), dysthymia, or subsyndromal depressive disorders in different phases/stages of treatment (i.e., acute, continuation, or maintenance)

Interventions:

Second generation antidepressants

| Generic Name | US Trade Name | Usual Daily Dosing Range | Frequency |
|---|---|---|---|
| Bupropion | Wellbutrin® | 100–450 mg | Three times daily |
| | Wellbutrin SR® | 150–400 mg | Twice daily |
| | Wellbutrin XL® | 150–450 mg | Once daily |
| Citalopram | Celexa® | 20–60 mg | Once daily |
| Duloxetine | Cymbalta® | 40–60 mg | Once or twice daily |
| Desvenlafaxine | Pristiq® | 50 mg | Once daily |
| Escitalopram | Lexapro® | 10–20 mg | Once daily |
| Fluoxetine | Prozac® | 10–80 mg | Once or twice daily |
| | Prozac Weekly® | 90 mg (weekly) | Once weekly |
| | Sarafem® | 20 mg | Once daily† |
| Fluvoxamine | Luvox®§ | 50–300 mg | Once or twice daily |
| Mirtazapine | Remeron® | 15–45 mg | Once daily |
| Nefazodone‡ | Serzone®§ | 200–600 mg | Twice daily |
| Paroxetine | Paxil® | 10–60 mg | Once daily |
| | Paxil CR® | 12.5–75 mg | Once daily |
| Sertraline | Zoloft® | 25–200 mg | Once daily |
| Trazodone | Desyrel® | 150–400 mg | Three times daily |
| Venlafaxine | Effexor® | 75–375 mg | Two to three times daily |
| | Effexor XR® | 75–225 mg | Once daily |

Comparators:

Other second generation antidepressants, placebo (for KQ4)

Outcomes for each question:

Response (KQ1, KQ5)

Remission (KQ1, KQ5)

Maintenance of response (KQ2, KQ5)

Maintenance of remission (KQ2, KQ5)

Associated symptoms (KQ3, KQ5)

Adverse events

Harms (KQ4, KQ5)

Adverse effects (KQ4, KQ5)

Adherence (KQ4, KQ5)

Timing:

Acute, continuation and maintenance phase treatments

Settings:

Inpatient and outpatient

Analytic Framework

Based on the key questions, we developed an analytic framework to guide the comparative effectiveness review (Figure 3-1). This figure depicts the key questions within the context of the PICOTS described in the previous section. In general, the figure illustrates how second generation antidepressants versus other second generation antidepressants may result in intermediate outcomes such as response or remission and/or long-term outcomes such as quality of life or functional capacity. Also, adverse events may occur at any point after the treatment is received.

Specifically, KQ1 and KQ2 both pertain to the efficacy and effectiveness of these second generation antidepressants in obtaining response or remission and in preventing relapse or recurrence: KQ1 addresses the acute phase, and KQ2 the continuation or maintenance phase. KQ3 addresses response for associated symptoms (anxiety, insomnia, melancholia, pain, psychomotor changes, and somatization). KQ4 focuses on safety issues (adverse effects, adherence) with each of the different study drugs or combinations of study drugs. KQ5 focuses on the specific subgroups for each of the other key questions.

Many factors have been shown in the literature to influence the use of second generation antidepressants. While the patient is ultimately the one to make the decision about whether to receive the pharmacologic treatment, this decision is directly impacted by a discussion with the health care provider about benefits and harms of each treatment option for one’s particular case. Our analytic framework addresses the second generation antidepressant pharmacologic options available to clinicians for treating their patients with MDD, dysthymia or subsyndromal depressive disorders.

Figure 3-1. Analytic Framework for the Comparative Effectiveness of Second Generation Antidepressants in the Pharmacologic Treatment of Adult Depression - Update

Methods

A. Criteria for Inclusion/Exclusion of Studies in the Review

We will limit our searches to human studies published in English from January 1, 1980 to September 4, 2009. All races, ethnicities, and cultural groups will be included. Geographically, we will include all developed countries: United States, Canada, United Kingdom, Europe, Australia, Japan, and China. Because our review focuses on adults, we will exclude studies that include children or adolescents. Settings are both inpatient and outpatient. Eligible studies are original research studies that provide sufficient detail regarding methods and results to enable use and adjustment of the data and results; relevant outcomes must be abstractable from the data presented in the papers (including systematic reviews and meta-analyses). Eligible study designs include randomized controlled trials, non-randomized controlled trials, and observational studies such as prospective and retrospective cohort studies, case-control studies, and cross-sectional studies. Editorials, letters, and non-systematic literature reviews are excluded. Studies deemed to be “poor quality” will be appraised formally and included in the evidence tables but not in the analysis, unless no other information is available.

Biochemical outcomes and costs are not outcomes of interest for this review.

Exhibit 4-1 below presents the inclusion/exclusion criteria for each key question that we used in the first review and will use during abstract and full text review in the update.

Adult inpatients and outpatients with major depressive disorder, dysthymia, or subsyndromal depression

Sample size

For RCTs:

For quantitative analysis: no limit

For qualitative analysis: n ≥ 40

For observational studies: n ≥ 100

For uncontrolled trials: n ≥ 1,000
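
The sample size thresholds above can be expressed as a simple screening check. The following is a minimal sketch only: the function name `meets_sample_size` and the design and analysis labels are illustrative assumptions and are not defined by this protocol.

```python
from typing import Optional

# Minimum sample sizes taken from the criteria listed above.
# The dictionary keys are hypothetical labels chosen for this sketch.
MIN_SAMPLE_SIZE = {
    ("rct", "quantitative"): 0,    # no limit for quantitative analysis
    ("rct", "qualitative"): 40,
    ("observational", None): 100,
    ("uncontrolled", None): 1000,
}


def meets_sample_size(design: str, n: int, analysis: Optional[str] = None) -> bool:
    """Return True if a study with n participants meets the threshold."""
    # Only RCTs distinguish between quantitative and qualitative analysis.
    key = (design, analysis if design == "rct" else None)
    if key not in MIN_SAMPLE_SIZE:
        raise ValueError(f"unknown design/analysis combination: {key}")
    return n >= MIN_SAMPLE_SIZE[key]
```

For example, an RCT with 25 participants would pass this check for quantitative analysis but not for qualitative analysis.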

B. Searching for the Evidence: Literature Search Strategies for Identification of Relevant Studies to Answer the Key Questions.

We will systematically search, review, and analyze the scientific evidence for each key question and any subsidiary questions. The steps that we will take to accomplish the literature review are described below.

To identify articles relevant to each key question, we began with a focused PubMed search on second-generation antidepressive agents and depressive disorder using MeSH terms. Supplemental searches were conducted using PsycINFO, The Cochrane Library, IPA, and EMBASE databases using analogous terms. We combined terms for selected indications (major depressive disorder, dysthymia, minor depression, subsyndromal depressive disorder), drug interactions, and adverse events with a list of 13 specific second-generation antidepressants (bupropion, citalopram, desvenlafaxine, duloxetine, escitalopram, fluoxetine, fluvoxamine, mirtazapine, nefazodone, paroxetine, sertraline, trazodone, and venlafaxine). We limited the searches to “adults”, “humans”, and “English language” and included a time limit between January 1, 2005 and September 4, 2009 (no time limit was used when searching for desvenlafaxine).
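
The date limit described above, including the exception for desvenlafaxine, could be checked as in the sketch below. The function name `within_search_window` and the idea of applying the limit to individual citation records are assumptions made for illustration; in practice the limits were applied within the database search interfaces.

```python
from datetime import date

# Time limits of the update search as stated in the text.
SEARCH_START = date(2005, 1, 1)
SEARCH_END = date(2009, 9, 4)


def within_search_window(published: date, drug: str) -> bool:
    """Return True if a citation falls inside the update search window.

    No time limit applies to desvenlafaxine, which was not covered
    by the first review.
    """
    if drug.strip().lower() == "desvenlafaxine":
        return True
    return SEARCH_START <= published <= SEARCH_END
```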

Our initial searches yielded 887 citations across databases. We ran a supplemental query in PubMed on March 3, 2010 specific to immediate release and extended release antidepressive agents, which yielded 87 additional citations. We will review our search strategy with the TEP and supplement it as needed according to their recommendations. In addition, to attempt to avoid retrieval bias, we will search for landmark studies and background articles in the SCOPUS database to look for any relevant citations that might have been missed by electronic searches. We will also conduct an updated literature search (in PubMed, the Cochrane Library, PsycINFO, IPA, and EMBASE) before completing the final draft of the report. Any literature suggested by peer reviewers or from the public will be investigated and if appropriate incorporated into the final review.

A search of the gray literature will be conducted at the Scientific Resource Center and reviewed by the EPC project team.

C. Data Abstraction and Data Management

We will review all titles and abstracts identified through searches against our inclusion/exclusion criteria, using the SRS 4.0 Mobius Analytics software. Each abstract will be independently reviewed by two members of the team. When differences between the reviewers arise, we will include studies for full-text review. For studies without adequate information to make the determination, we will again review the full text. All results will be tracked in an EndNote database.

We will retrieve the full text of all titles included during abstract review. Each full-text article will be independently reviewed by two members of the team for inclusion or exclusion based on the eligibility criteria described above. If both reviewers agree that a study does not meet the eligibility criteria, the study will be excluded. If the reviewers disagree, conflicts will be resolved by discussion and consensus or by consulting a third, independent party. As above, all results will be tracked in an EndNote database including, where applicable, the reason a study did not satisfy eligibility criteria so that we can later compile a listing of excluded articles and reasons for such exclusions.
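
The two-stage dual review rules described above can be summarized in a short sketch. The vote labels and function names are hypothetical and chosen only for illustration; the sketch assumes that an "unclear" vote at the abstract stage stands for an abstract without adequate information.

```python
ABSTRACT_VOTES = {"include", "exclude", "unclear"}
FULL_TEXT_VOTES = {"include", "exclude"}


def abstract_decision(vote_a: str, vote_b: str) -> str:
    """Abstract stage: any disagreement or uncertainty goes to full text."""
    for vote in (vote_a, vote_b):
        if vote not in ABSTRACT_VOTES:
            raise ValueError(f"unknown abstract vote: {vote}")
    # Only an abstract that both reviewers exclude is dropped at this stage.
    if vote_a == "exclude" and vote_b == "exclude":
        return "exclude"
    return "full_text_review"


def full_text_decision(vote_a: str, vote_b: str) -> str:
    """Full-text stage: agreement decides; disagreement is escalated."""
    for vote in (vote_a, vote_b):
        if vote not in FULL_TEXT_VOTES:
            raise ValueError(f"unknown full-text vote: {vote}")
    if vote_a == vote_b:
        return vote_a
    # Resolved by discussion and consensus or by a third, independent party.
    return "consensus_or_third_party"
```

The asymmetry is deliberate: the abstract stage errs toward inclusion, while the full-text stage requires agreement before a study is excluded.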

We will design data collection forms that include questions on identifying information for the article, study design, methods, and results. Trained abstractors will extract the relevant data from each included article into preformatted tables. Data abstractions will be reviewed for accuracy by a second member of the team.

D. Assessment of Methodological Quality of Individual Studies

To assess the quality (internal validity) of studies, we will use predefined criteria based on those developed by the US Preventive Services Task Force (ratings: good, fair, poor) and the National Health Service Centre for Reviews and Dissemination. In general terms, a “good” study has the least bias and results are considered to be valid. A “fair” study is susceptible to some bias but probably not sufficient to invalidate its results. A “poor” rating indicates significant bias (e.g., stemming from serious errors in design or analysis) that may invalidate the study’s results. To assess the quality of observational studies, we will use criteria outlined by Deeks, et al. (2003).

Two independent reviewers will assign quality ratings to each study. Disagreements will be resolved by discussion and consensus or by consulting a third, independent party.

E. Data Synthesis

For this update, we will begin with the evidence presented in the original CER and add all of the new eligible studies identified since the prior literature search was conducted. Data from the combined literature review periods will be synthesized qualitatively. However, if we find a sufficient number (three or more) of similar studies of factors influencing the use of the study pharmacological treatments, we will consider quantitative analysis (meta-analysis) of data from those studies. If fewer than three head-to-head trials are available for any drug comparison, we will compute indirect comparisons employing mixed treatment comparisons. Evidence suggests that indirect comparisons agree with head-to-head trials if component studies are similar and treatment effects are expected to be consistent in patients in different trials. Our first report included such indirect comparisons, and we anticipate that we will be able to incorporate new studies into the analysis for the update report. Indirect estimates of comparative treatment effects will be calculated using Bayesian inference computed with WinBUGS.
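
To illustrate the idea behind an indirect comparison, the sketch below uses the simple adjusted indirect comparison through a common comparator (often attributed to Bucher and colleagues). This is a frequentist simplification shown for intuition only; it is not the Bayesian mixed treatment comparison model that will be fitted in WinBUGS, and the function names and input values are hypothetical.

```python
import math
from typing import Tuple


def indirect_log_odds_ratio(
    log_or_ac: float, se_ac: float, log_or_bc: float, se_bc: float
) -> Tuple[float, float]:
    """Estimate A vs. B from A vs. C and B vs. C trials.

    Returns the indirect log odds ratio and its standard error.
    """
    log_or_ab = log_or_ac - log_or_bc
    # Variances of independent estimates add, so the indirect
    # estimate is less precise than either direct estimate.
    se_ab = math.sqrt(se_ac**2 + se_bc**2)
    return log_or_ab, se_ab


def odds_ratio_with_ci(log_or: float, se: float) -> Tuple[float, float, float]:
    """Convert a log odds ratio to an odds ratio with a 95% interval."""
    z = 1.96
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


# Hypothetical inputs: drug A vs. placebo and drug B vs. placebo.
log_or_ab, se_ab = indirect_log_odds_ratio(math.log(1.5), 0.3, math.log(1.5), 0.4)
estimate, lower, upper = odds_ratio_with_ci(log_or_ab, se_ab)
```

Because the standard errors combine, the indirect interval is wider than either direct interval, which is one reason the similarity of the component studies matters.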

Because we do not have access to individual patient data, we will not conduct our own subgroup analyses. Instead, we will examine the evidence reported for the subgroups stated in the key question:

Elderly or very elderly patients;

Other demographic groups (defined by age, ethnic or racial groups, and sex);

F. Grading the Evidence for Each Key Question

We will prioritize outcomes of interest for each key question. Our primary focus is on health outcomes such as quality of life, functional capacity, remission, or adverse events. If we cannot find sufficient evidence for these outcomes, we will include proxies for health outcomes such as response to treatment on a depression rating scale. We will rate the strength of evidence for this update based on the principles outlined for use by the AHRQ Evidence-based Practice Centers in AHRQ’s Methods Guide for Effectiveness and Comparative Effectiveness Reviews (Owens, 2010). We will present the strength of evidence for those outcomes that were rated as the most important ones for each key question. Overall, we will rate the strength of evidence on the comparative efficacy or comparative effectiveness for key questions 1, 2, 3, and 5. In addition, for key question 1 we will rate the strength of evidence for quality of life and onset of action; for key question 4, general efficacy/effectiveness as well as comparative risk for specific harms; and for key question 5, comparative harms.

Owens DK, Lohr KN, Atkins D, et al. Grading the strength of a body of evidence when comparing medical interventions--Agency for Healthcare Research and Quality and the Effective Health Care Program. Journal of Clinical Epidemiology. 2010;63(5):513-23.

Definition of Terms

Acute Phase: first phase of depression management, usually 6 to 12 weeks in length

Continuation Phase: second phase of depression management during which the treatment goal is continued absence of depressive symptoms for an additional 4 to 9 months, such that the patient’s episode can be considered completely resolved

Maintenance Phase: third phase of depression management, frequently a multi-year period during which the treatment goal is preventing the recurrence of a new, distinct episode.

Relapse: return of depressive symptoms during the acute or continuation phases; it is therefore considered part of the same depressive episode

Summary of Protocol Amendments

In the event of protocol amendments, the date of each amendment will be accompanied by a description of the change and the rationale.

NOTE: The following protocol elements are standard procedures for all protocols.

Review of Key Questions

For Comparative Effectiveness reviews (CERs) the key questions were posted for public comment and finalized after review of the comments. For other systematic reviews, key questions submitted by partners are reviewed and refined as needed by the EPC and the Technical Expert Panel (TEP) to assure that the questions are specific and explicit about what information is being reviewed.

Technical Expert Panel (TEP)

A TEP is selected to provide broad expertise and perspectives specific to the topic under development. Divergent and conflicted opinions are common and perceived as healthy scientific discourse that results in a thoughtful, relevant systematic review. Therefore study questions, design, and/or methodological approaches do not necessarily represent the views of individual technical and content experts. The TEP provides information to the EPC to identify literature search strategies, reviews the draft report, and recommends approaches to specific issues as requested by the EPC. The TEP does not do analysis of any kind nor contribute to the writing of the report.

Peer Review

Approximately five experts in the field will be asked to peer review the draft report and provide comments. Peer reviewers may represent stakeholder groups such as professional or advocacy organizations with knowledge of the topic. On some specific reports, such as reports requested by the Office of Medical Applications of Research, National Institutes of Health, other rules may apply regarding participation in the peer review process. Peer review comments on the preliminary draft of the report are considered by the EPC in preparation of the final draft of the report. The synthesis of the scientific literature presented in the final report does not necessarily represent the views of individual reviewers. The dispositions of the peer review comments are documented and will, for CERs and Technical Briefs, be published three months after the publication of the Evidence report.

It is our policy not to release the names of the peer reviewers or TEP members until the report is published so that they can maintain their objectivity during the review process.