Effects of mental health interventions for students in higher education are sustainable over time: a systematic review and meta-analysis of randomized controlled trials

This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.

Abstract

Background

Symptoms of depression, anxiety, and distress are more common in undergraduates than in age-matched peers. Mental ill health among students is associated with impaired academic achievement, worse occupational preparedness, and lower future occupational performance. Research on mental health-promoting and mental ill health-preventing interventions has shown promising short-term effects, though the sustainability of intervention benefits deserves closer attention. We aimed to identify, appraise, and summarize existing data from randomized controlled trials (RCTs) reporting on whether the effects of mental health-promoting and mental ill health-preventing interventions were sustained at least three months post-intervention, and to analyze how the effects vary for different outcomes in relation to follow-up length. Further, we aimed to assess whether effect sustainability varied by intervention type, study-level determinants, and participant characteristics.

Material and Methods

A systematic search in MEDLINE, PsycInfo, ERIC, and Scopus was performed for RCTs published in 1995–2015 reporting an assessment of mental ill health and positive mental health outcomes over at least three months of post-intervention follow-up. Random-effects modeling was used for quantitative synthesis of the existing evidence, with the standardized mean difference (Hedges’ g) used to estimate an aggregated effect size. Sustainability of intervention effects was analyzed separately for 3–6 months, 7–12 months, and 13–18 months of post-intervention follow-up.

Discussion

The evidence suggests long-term effect sustainability for mental ill health preventive interventions, especially for interventions to reduce the symptoms of depression and symptoms of anxiety. Interventions to promote positive mental health offer promising, but shorter-lasting effects. Future research should focus on mental health organizational interventions to examine their potential for students in tertiary education.

Previous research on mental health promotion and mental ill health prevention has shown promising short-term effects of stress reduction techniques and meditation, self-hypnosis, cognitive behavioral, and mindfulness interventions (Conley, Durlak & Kirsch, 2015; Conley et al., 2017; Regehr, Glancy & Pitts, 2013; Shiralkar et al., 2013), as well as of technology-based interventions (Conley et al., 2016; Davies, Morriss & Glazebrook, 2014; Farrer et al., 2013). As mental health problems persist during the study period (Bewick et al., 2010; Christensson et al., 2010) and negatively affect academic performance and future working capacity (Rudman & Gustavsson, 2012; Vaez & Laflamme, 2008), the sustainability of intervention benefits, as well as their determinants and moderators, deserves closer attention. Several reviews have approached the issue of intervention effect sustainability by averaging the effects reported for the longest follow-up periods, although substantial variability in the ranges and means of the follow-up lengths made comparisons difficult (Conley, Durlak & Kirsch, 2015; Conley et al., 2016, 2017). The authors highlighted the need for in-depth investigation of intervention benefit sustainability over variable post-intervention follow-up periods, since the effects may change direction and strength over time (Conley, Durlak & Kirsch, 2015). Therefore, to further address the nature of sustainability of intervention effects, in this review we aimed to systematically identify, appraise, and summarize the existing data from randomized controlled trials (RCTs) reporting on whether the effects of mental health-promoting and mental ill health-preventing interventions are sustained for at least three months of post-intervention follow-up.
Further, we aimed to analyze how the direction and magnitude of the effects vary for different outcomes in relation to the lengths of follow-up and to assess whether effect sustainability varied by the types and major features of interventions, study-level determinants, and characteristics of participants.

The PICO components (population, intervention, comparator and outcome) were developed after discussing eligibility criteria with stakeholders from the student health services: P = students in university settings; I = any types of mental health-promoting and mental ill health-preventing interventions; C = any types of active or inactive controls; O = (i) positive mental health, including well-being, coping, locus of control, resilience, self-esteem/self-compassion, stress management, academic achievement or academic performance, and (ii) mental ill health, including symptoms of anxiety, symptoms of depression, psychological distress, worry, fatigue, sleeping problems, and perceived stress. The study design was restricted to RCTs with at least three months of post-intervention follow-up. No language restrictions were initially applied. Studies focused on students with diagnosed psychiatric disorders and studies conducted in primary care settings were excluded.

Search strategy

In collaboration with librarians (CG, AW: see Acknowledgements), a sensitive search strategy was developed and adapted to the following databases: MEDLINE (Ovid), PsycInfo (Ovid), ERIC (Ovid), and Scopus. Gray literature was searched in Dissertations & Theses (ProQuest), Dart Europe, OpenGrey, and Base Bielefeld. The searches were limited to studies published from January 1, 1995 to December 31, 2015. Key words and MeSH terms are reported in the Supplemental Information (Data S2 and Data S3, respectively). Reference lists of the relevant reviews and studies selected for inclusion were manually scrutinized and the following journals were hand-searched from January 2012 to March 2016: College Student Journal, Journal of American College Health and Journal of College Counseling.

Study selection, data extraction, and quality assessment

Screening was conducted independently by two authors (RW, KG) and two colleagues (AF, AM: see Acknowledgements). The eligibility of each article was initially evaluated by the title and abstract and, if found appropriate, followed by full-text examination. At this stage only English-language publications were assessed. This resulted in a loss of 20 publications in Chinese (k = 15), Japanese (k = 3), Korean (k = 1), and Spanish (k = 1). Gray literature was taken into consideration when accessible free of charge. Any disagreements were resolved through panel discussions. Studies selected for inclusion were examined for potential overlap in study populations, which was not found.

Data extracted from the articles included first author, country of origin, setting, funding, inclusion and exclusion criteria, characteristics of the intervention and comparison groups (age, gender, ethnicity), characteristics of the intervention (type, format, delivery level, length of session, duration), type of comparison, outcome definition and measurement scale, sample size, post-intervention length of follow-up, percent of withdrawals at each measurement point, and study quality (described below). If a study reported multiple outcomes and/or if outcomes were assessed at multiple follow-up time points, quantitative data (means and standard deviations (SDs)) were extracted separately for each outcome at each follow-up period. The same approach was utilized for multi-armed RCTs, from which separate extraction was performed for each intervention–comparison pair. When data were missing in the original reports, we contacted authors for further clarification.

As suggested by Conley, Durlak & Kirsch (2015), original interventions were grouped into: (i) cognitive behavior therapy (CBT)-related if focusing on identifying and changing unhelpful cognitions, behaviors and emotional regulation; (ii) mind–body-related, i.e., interventions that facilitate the mind’s capacity to affect bodily function and symptoms; and (iii) psycho-educational-related if focusing on information, discussion and didactic communication on, e.g., stress-reduction and coping. Categorization was based on the original definitions, if provided, and otherwise by us. Level of delivery was considered universal if the intervention targeted students without reported mental ill health symptoms and selective if provided for those with adverse mental health symptoms. Interventions were further divided into group or individual format. Comparators were sub-divided into active controls (i.e., another type of intervention) and inactive controls (waitlist controls, placebo-controls, “living as usual,” and no intervention). Study outcomes were classified into two major categories: mental ill health outcomes consisting of anxiety symptoms, depressive symptoms, psychological distress, stress, self-reported worry, passive coping, and deteriorated quality of sleep; and positive mental health and academic performance outcomes including self-esteem, self-compassion, self-efficacy, mental or subjective well-being, resilience, active coping, happiness, stress management and academic performance.

The quality of selected trials was assessed independently by three authors (RW, KG, LL) and colleagues (AF, AM, SB: see Acknowledgements) using the Effective Public Health Practice Project Quality Assessment Tool (EPHPP), as recommended by the Cochrane Collaboration for public health reviews (Higgins, Green & Cochrane Collaboration, 2008). The EPHPP assesses selection bias, study design, confounders, blinding, data collection methods, and withdrawals and dropouts to rate study quality as strong, moderate, or weak. Discrepancies in quality assessment were resolved by discussion with one of the three reviewers not involved in the review process.

Statistical analysis

Because of the variety of instruments used for measuring outcomes, the standardized mean difference using Hedges’ g was chosen as a common effect size (ES) for quantitative synthesis. The ES was calculated separately at each post-intervention follow-up time point as the difference in means between the intervention and control groups, divided by the pooled within-group SD, with a correction factor for small sample sizes (Borenstein et al., 2009). One trial reported Hedges’ g as the study ES (Braithwaite & Fincham, 2009); for the other studies it was calculated from the available raw data. Throughout the recalculations, we kept the original direction of each scale with respect to improvement of the outcome measure. Thus, for mental ill health outcomes, ESs below zero pointed to superiority of the intervention group over the controls, while for positive mental health and academic performance outcomes, ESs above zero indicated that the results favored the intervention. For one study where follow-up means, but not SDs, were provided and the intervention effect was reported as “non-significant” (Chiauzzi et al., 2008), we set the ES to zero. To ease interpretation of the magnitude of Hedges’ g, we applied Cohen’s convention (Cohen, 1992) and defined ESs as small (0.2), medium (0.5), and large (0.8).
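For illustration, the effect-size computation described above can be sketched as follows. This is a minimal sketch following the standard formulas in Borenstein et al. (2009); the function name and interface are ours, not the review’s actual Stata code:

```python
import math

def hedges_g(mean_i, mean_c, sd_i, sd_c, n_i, n_c):
    """Standardized mean difference with Hedges' small-sample correction."""
    # Pooled within-group standard deviation
    sd_pooled = math.sqrt(((n_i - 1) * sd_i ** 2 + (n_c - 1) * sd_c ** 2)
                          / (n_i + n_c - 2))
    d = (mean_i - mean_c) / sd_pooled        # Cohen's d
    df = n_i + n_c - 2
    j = 1 - 3 / (4 * df - 1)                 # small-sample correction factor J
    return j * d

# Hypothetical data: intervention mean 10 vs. control mean 12 on a
# mental ill health scale (lower is better), so g < 0 favors the intervention.
g = hedges_g(10, 12, 4, 4, 30, 30)
```

For these hypothetical numbers, g is about −0.49, i.e., close to a medium effect favoring the intervention under Cohen’s convention.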

Meta-analysis was conducted for all specific outcomes originally reported and for outcomes combined within mental ill health and positive mental health and academic performance categories. To analyze the combined outcomes, we applied a hierarchical approach for selecting outcomes from the studies reporting more than one from the same category. The hierarchy was based on descending order of outcome reporting, i.e., from the most often reported to the least often reported. For mental ill health outcomes the hierarchical selection was ordered as: depressive symptoms, anxiety symptoms, stress, psychological distress, self-reported worry, quality of sleep, and passive coping. For positive mental health and academic performance outcomes the order was: self-esteem, academic performance, self-efficacy, self-compassion, mental or subjective well-being, resilience, stress management, active coping, and happiness.
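The hierarchical selection of a single outcome per study can be sketched as an ordered lookup. This is a hypothetical helper for illustration only, with outcome labels following the order given above:

```python
# Descending order of reporting frequency, as used for the combined
# mental ill health category in this review.
MENTAL_ILL_HEALTH_HIERARCHY = [
    "depressive symptoms", "anxiety symptoms", "stress",
    "psychological distress", "self-reported worry",
    "quality of sleep", "passive coping",
]

def select_outcome(reported, hierarchy=MENTAL_ILL_HEALTH_HIERARCHY):
    """Return the highest-ranked outcome a study reports, or None."""
    for outcome in hierarchy:
        if outcome in reported:
            return outcome
    return None
```

Under this rule, for example, a study reporting both stress and anxiety symptoms would contribute its anxiety-symptom ES to the combined analysis.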

Because of the initial assumption of between-study heterogeneity, a random-effects model incorporating both within- and between-study variability was used for quantitative synthesis. To assess the sustainability of intervention effects over time and address the variety of follow-up lengths reported in the original studies, we categorized the post-intervention follow-ups as 3–6 months, 7–12 months, and 13–18 months. Each of the included studies reported outcome measures for at least one of these categories, and quantitative synthesis was conducted separately for each category. If a given study provided several outcome measures falling in the same length category (e.g., for both three- and six-month follow-ups), the ES for the follow-up closest to the upper boundary (i.e., six months) was chosen (Kanji, White & Ernst, 2006; Seligman, Schulman & Tryon, 2007; Vazquez et al., 2012). Only one study (Seligman et al., 1999) assessed outcomes at follow-up periods longer than 18 months, and the measurements for those periods (i.e., 24, 30, and 36 months) were not included in the meta-analysis. Finally, to obtain comparability between trials, if a study provided results based on both imputed data (i.e., intention-to-treat analysis) and non-imputed data (i.e., follow-up completers) (Reavley et al., 2014), or if both crude and adjusted ESs were available (Chase et al., 2013), we favored the imputed and crude measures for the main analysis, leaving the non-imputed and adjusted measures for sensitivity analysis.
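Random-effects pooling of this kind is commonly implemented with the DerSimonian–Laird estimator of the between-study variance. A self-contained sketch of that standard estimator follows; it is our own illustration (the review itself used Stata, and we do not know the exact routine employed):

```python
def dersimonian_laird(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator."""
    w = [1 / v for v in variances]                        # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)         # between-study variance
    w_re = [1 / (v + tau2) for v in variances]            # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = (1 / sum(w_re)) ** 0.5                           # SE of pooled effect
    return pooled, se, tau2
```

When the estimated between-study variance is zero, the result coincides with the fixed-effect (inverse-variance) pooled estimate.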

We evaluated statistical heterogeneity among the studies using the Q and I2 statistics. For Q, a p-value < 0.1 was considered indicative of statistically significant heterogeneity, and I2 values of 25%, 50%, and 75% indicated low, moderate, and high heterogeneity, respectively (Higgins et al., 2003).
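The I2 statistic follows directly from Q; a sketch of the Higgins et al. (2003) definition, truncated at zero:

```python
def i_squared(effects, variances):
    """I^2: percentage of total variability attributable to heterogeneity."""
    w = [1 / v for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    return 100 * max(0.0, (q - df) / q) if q > 0 else 0.0
```

The returned percentage is then read against the 25%/50%/75% thresholds for low, moderate, and high heterogeneity.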

The subgroup analyses were performed by stratifying the main analysis by a priori identified moderators related to interventions (category of intervention, delivery level, type of format, type of controls), study-level moderators (initial study size, study quality), and moderators related to participant characteristics (gender, country). The analyses were performed only if at least two studies were available in each subgroup. A mixed approach was applied: random-effects modeling was used for within-group pooling, while between-group differences were assessed with a fixed-effect model. Leave-one-out influence analysis was conducted to assess the potential impact of individual studies on the overall pooled ES by omitting one study at a time (Tobias, 1999). Following the approach suggested by Hart et al. (2012), sensitivity analysis was conducted to assess whether the overall pooled ES differed if the lowest or the highest original ES was selected from studies with multiple outcome assessments or multiple interventions or comparisons. In meta-analyses with three or more studies, we assessed publication bias with funnel plots, Egger’s regression asymmetry test, and the Begg–Mazumdar adjusted rank correlation test (Begg & Mazumdar, 1994; Egger et al., 1997).
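Egger’s asymmetry test regresses the standardized effects on their precisions and examines whether the intercept departs from zero. A minimal ordinary-least-squares sketch of the intercept computation (illustrative only; it omits the accompanying t-test on the intercept that yields the p-values reported in the Results):

```python
def egger_intercept(effects, ses):
    """Intercept of Egger's regression: (effect/SE) on (1/SE).

    An intercept far from zero suggests funnel-plot asymmetry,
    i.e., possible publication bias.
    """
    y = [e / s for e, s in zip(effects, ses)]   # standardized effects
    x = [1 / s for s in ses]                    # precisions
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return my - slope * mx                      # OLS intercept
```

For a perfectly symmetric set of studies (identical effects regardless of precision), the intercept is zero.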

All statistical analyses were performed using STATA version 13.1 (StataCorp, College Station, TX, USA), p-values < 0.05 were considered statistically significant, and all statistical tests were two-sided.

Results

After removing duplicates, 6,571 records were available for title and abstract screening. Among these, 6,519 records were excluded as not meeting the PICO criteria, leaving 52 articles for full-text examination. Further evaluation excluded another 26 studies: post-intervention follow-up shorter than three months (k = 11), insufficient data to calculate an ES (k = 6), not an RCT (k = 5), population not relevant (k = 3), and outcome not relevant (k = 1). The selection process yielded a final set of 26 RCTs for inclusion in the meta-analysis (Fig. 1).

Effects and sustainability over time

Interventions preventing mental ill health

As presented in Table 2, for the combined mental ill health outcomes, the aggregated ES across all preventive interventions showed superiority of the interventions over the comparisons at 3–6 months and 7–12 months of post-intervention follow-up, although the effects were small. The pooled ES did not reach statistical significance for follow-up periods of 13–18 months. High-to-moderate heterogeneity was detected at all three follow-up periods, with I2 of 79.5%, 74.2%, and 58.2%, respectively. Publication bias was evident for studies with 3–6 months of follow-up (Egger’s test p-value = 0.013), though not for studies with longer follow-up periods (7–12 months: p-value = 0.151; 13–18 months: p-value = 0.141) (Fig. S1). Influence analysis revealed no indication that individual RCTs, if omitted, would significantly influence the observed overall ESs. As previously noted, sensitivity analyses were performed for studies with multiple outcome measures reported for the same follow-up and for studies with multiple interventions or comparisons. Pooling together the highest ESs originally reported for these studies did not alter the results of the main analysis (3–6 months: ES = −0.28 (95% CI [−0.44, −0.12]); 7–12 months: ES = −0.35 (95% CI [−0.60, −0.10]); 13–18 months: ES = −0.23 (95% CI [−0.54, 0.08])). Neither were the results influenced when the lowest originally reported ESs were used (3–6 months: ES = −0.24 (95% CI [−0.39, −0.09]); 7–12 months: ES = −0.23 (95% CI [−0.43, −0.03]); 13–18 months: ES = −0.06 (95% CI [−0.18, 0.06])). Only one study reported results based on both imputed and non-imputed data (Reavley et al., 2014), with the former included in the main meta-analysis. Alternative inclusion of the latter did not change the overall ES.

In sub-group analyses for the combined mental ill health outcomes, studies employing CBT-related interventions revealed significant pooled ESs for 3–6 month and 13–18 month follow-ups (Table 2; Fig. 2). Less consistent results were observed for mind–body-related interventions. No superiority of the intervention group appeared among studies with psycho-educational interventions. Pooled ESs for universal preventive interventions yielded significant results for follow-ups of up to 7–12 months. Less consistency appeared in the aggregated results for selective interventions and for interventions conducted face-to-face in groups. Trials with small sample sizes and trials comprising more than 60% females yielded significant effects for up to 7–12 months of follow-up. The small number of studies might explain the lack of consistency in the results of the other sub-group comparisons. The high heterogeneity seen for studies with 3–6 months of follow-up might reflect differences in delivery level (p-value for Q between sub-groups = 0.006) and study size (p < 0.001), while for studies with follow-up periods of 7–12 months, heterogeneity could be explained by differences in type of comparison (p < 0.001), study size (p = 0.01), and country where the RCT was conducted (p = 0.003). We were unable to detect between-group differences for trials with follow-up of 13–18 months because of the small number of studies in the sub-groups.

Assessment of the specific mental ill health outcomes revealed a sustainable effect of all interventions combined lasting up to 13–18 months for symptoms of depression (ES = −0.30 (95% CI [−0.51, −0.08])) (Table S3). For symptoms of anxiety, sustainability was observed for up to 7–12 months (ES = −0.27 (95% CI [−0.54, −0.01])). Only one study assessed the effect of interventions targeting anxiety during 13–18 months of post-intervention follow-up and, hence, we were unable to perform a meta-analysis. For symptoms of stress, reductions lasted up to 3–6 months post-intervention (ES = −0.30 (95% CI [−0.58, −0.03])). Other comparisons were either inconclusive or quantitative synthesis was not performed because of the small number of studies.

The effect sizes of all interventions and combined subtotals for positive mental health and academic performance outcomes by the length of follow-up. Lines represent standardized difference in means (Hedges’ g) and 95% confidence intervals (CI); the size of the box represents the weight of each study.

Because of lack of data on the specific positive mental health and academic performance outcomes, only studies on active coping, self-esteem, and self-efficacy with 3–6 months follow-up were quantitatively assessed (Table S4). Sustainability of the intervention effect was observed for active coping (ES = 0.75 (95% CI [0.19, 1.30])) with no significant effects shown for other outcomes.

Discussion

Our systematic review and meta-analysis showed sustainability of the benefits of mental health interventions targeting students in higher education, though in most of the analyses, the pooled ESs yielded significant, but small overall effects. For the combined mental ill health outcomes, the observed effects across all preventive interventions were sustained for up to 7–12 months post-intervention. Sustainability of effects was most pronounced for interventions designed to reduce the symptoms of depression, for which the superiority of intervention groups over the comparisons remained significant for up to 13–18 months post-intervention. For the combined positive mental health and academic performance outcomes, aggregated results across all promotion interventions revealed slightly shorter, but still evident sustained effects, which remained significant at post-intervention follow-up of 3–6 months.

To our knowledge, this is the first systematic review and meta-analysis focusing primarily on the sustainability of the effects of mental health-promoting and mental ill health-preventing interventions among students in higher education and analyzing different categories of follow-up duration. A direct comparison to the existing literature was therefore difficult, as other reviews mostly assessed the effects measured at the completion of interventions. The closest comparisons are three reviews by Conley et al. on universal and indicated mental health prevention programs (Conley, Durlak & Kirsch, 2015; Conley et al., 2017) and technology-delivered preventive interventions (Conley et al., 2016). The reviews by Conley et al. assessed the effects of interventions across all types of adjustment outcomes in university students at the longest follow-up period reported, which varied from two to 52 weeks (Conley, Durlak & Kirsch, 2015), 13 to 52 weeks (Conley et al., 2016), and four to 157 weeks (Conley et al., 2017). The first review (Conley, Durlak & Kirsch, 2015) showed the duration of follow-up to be negatively correlated with the aggregated ES across mental ill health and positive mental health outcomes combined, as well as no effect for psycho-educational interventions. A similar tendency for the effects of interventions to become non-significant as the duration of follow-up increases was observed in our study, though the sustainability of effects differed between ill health and positive mental health outcomes. As in Conley’s review (Conley, Durlak & Kirsch, 2015), no effects of psycho-educational interventions on any outcomes were evident in our data, regardless of the duration of follow-up. The second review (Conley et al., 2016) reported a significant effect of universal interventions at follow-up periods ranging from 13 to 52 weeks, as well as a positive effect of selective interventions during follow-up periods of 2–26 weeks.
Similarly, in our study mental ill health outcomes were reduced by universal interventions for up to 7–12 months of follow-up and by selective interventions at follow-ups of up to 3–6 months, although our results on positive mental health and academic performance outcomes were less conclusive. Similar to the third review (Conley et al., 2017), our results indicated that the most sustainable effects were observed for interventions designed to reduce the symptoms of depression and symptoms of anxiety.

Although our literature search for intervention studies was not limited to psychological interventions, only this type was retrieved. The scarcity of organizational mental health promoting interventions was verified by a scoping review (Enns et al., 2016). However, an exception may be a recent systematic review on learning environment interventions for medical student well-being, suggesting changes to curriculum (Wasson et al., 2016). Their results support previous findings suggesting that to maximize the effectiveness of mental health promotion, all levels of delivery must contribute, i.e., not just individual and group levels, but also structural, and societal levels (Hamilton & Bhatti, 1996). To further improve the sustainability of student mental health promotion, psychological interventions may be combined with a whole-setting approach, as endorsed by the WHO initiative health promoting universities (HPU) (World Health Organization, 1995).

Limitations

Systematic reviews on student mental health have indicated a lack of follow-up data on outcome assessment as a major obstacle for determining the long-term effects of interventions (Conley, Durlak & Kirsch, 2015; Conley et al., 2016; Davies, Morriss & Glazebrook, 2014; Farrer et al., 2013). Likewise, the scarcity of studies assessing the effects of interventions at post-intervention follow-ups longer than three months, along with substantial variability in the lengths of follow-up reported in the original studies, should be considered major limitations of our review. In particular, the lack of original evidence affected our analysis of positive mental health outcomes, as it restricted us mainly to aggregating the effects of interventions with 3–6 months of follow-up. Other limitations must also be considered. First, in most cases low numbers of studies in sub-groups prevented us from exploring the moderating effects of types of interventions, study-level determinants, and participant characteristics during follow-up periods longer than six months, making the results of the sub-group analyses tentative. This also precluded us from conducting an in-depth investigation of the sources of heterogeneity, which was found to be mostly high. Second, our intention to analyze two dimensions of mental health, which resulted in combining the original outcomes into the “mental ill health” and “positive mental health and academic performance” outcome categories with a hierarchical approach applied, could have boosted heterogeneity. In subsequent analyses, we attempted to reduce heterogeneity by pooling together studies reporting the same specific outcomes, though for several outcomes this was not possible due to data scarcity.
Third, the evidence was insufficient to obtain aggregated ESs for several specific outcomes, in particular self-reported worry, passive coping, academic performance, self-compassion, mental and subjective well-being, resilience, and happiness rating. Fourth, substantial variability exists in measurement instruments and, in several cases, the same outcome was measured by different scales. We tried to address this limitation by choosing Hedges’ g as the ES and by investigating how sensitive the aggregated results were to our initial approach of combining the original ESs in cases of multiple outcome measures or in multi-armed RCTs. The sensitivity analyses confirmed the robustness of our findings for mental ill health, though for positive mental health outcomes the use of the lowest ESs from the original studies altered the results for studies with 7–12 and 13–18 months of follow-up. Fifth, more than 30% of the original studies were assessed as being of weak quality. To address this issue, we conducted sub-group analyses stratifying the trials by study quality. For both categories of outcomes, these analyses revealed inconclusive results when trials of insufficient quality were pooled, which should be taken into account when interpreting our results. Furthermore, selection bias was the most commonly identified weakness. This bias, whether induced by the investigators or caused by self-selection, may have resulted in either underestimation or overestimation of the original ESs and therefore could affect the aggregated results. Finally, the results should be seen in the context of the presence of publication bias among studies with 3–6 months of follow-up and of our inability to assess publication bias for positive mental health outcomes at follow-ups longer than six months, which may have resulted from our restriction to English-language publications at the final stage of selection.

Conclusion

Despite the limitations, the evidence suggests long-term effect sustainability for mental ill health preventive interventions, in particular, for interventions to reduce the symptoms of depression and symptoms of anxiety. Interventions designed to promote positive mental health offer promising, but shorter-lasting effects. As the research field of health promoting interventions for students expands, future studies may improve our attempts to establish the effectiveness and sustainability of those interventions, e.g., ascertaining the effects for specific positive mental health outcomes. In addition, future research should also focus on mental health organizational interventions to investigate their potential for students in tertiary education.

Supplemental Information

Characteristics of included studies–comprehensive table.

(A) Participants’ characteristics are reported as in the original trials. Several studies are missing the educational level or gender frequencies. (B) A comparator group defined in the original study as a waitlist with delayed intervention was not included if the intervention was provided before the end of follow-up (Chase JA (34)); comparison was made between the remaining trial arms. The table reports the comparators used in the meta-analysis. (C) In trials with multiple interventions of a similar nature, i.e., if interventions shared the same features and components and therefore belonged to the same category, effect sizes (Hedges’ g) calculated for each intervention were combined (35, 41). A similar approach was applied to studies with multiple control groups (35, 42, 43). An exception was the study by Kanji N et al. (38), where two control groups (an attention control and a time control) were included separately, as they were considered different in approach and content and thus represented active and inactive comparisons, respectively. (D) As suggested by Conley et al. (15), original interventions were grouped into: (i) cognitive behavior therapy-related if focusing on identifying and changing unhelpful cognitions, behaviors and emotional regulation; (ii) mind–body-related, i.e., interventions that facilitate the mind’s capacity to affect bodily function and symptoms; and (iii) psycho-educational-related if focusing on information, discussion and didactic communication on, e.g., stress-reduction and coping. Categorization was based on the original definitions, if provided, and otherwise by us.
(E) Adapted Schedule for Affective Disorders and Schizophrenia for School-Age Children (K-SADS); Anxiety Sensitivity Index (ASI); Beck Anxiety Inventory (BAI); Beck Depression Inventory (BDI); Beck & Srivastava Stress Inventory (BSSI); Center for Epidemiologic Studies Depression Scale (CES-D); Chinese Perceived Stress Scale (CPSS); Depression Anxiety Stress Scales (DASS-21); Fordyce Emotions Questionnaire (Fordyce Happy); Generalized Anxiety Disorder Questionnaire (GADQ-IV); General Health Questionnaire (GHQ); General Self-efficacy Scale (GSES); Grade Point Average (GPA); Hamilton Anxiety Rating Scale (HARS); Hamilton Depression Rating Scale (HDRS); Health Action Process Approach (HAPA); Health Promoting Lifestyle Profile (HPLP); Kessler Psychological Distress Scale, 6 items (K6); Mindfulness-based stress reduction (MBSR); Overall Subjective Wellbeing score (SWB score); Penn State Worry Questionnaire (PSWQ); Perceived Stress Scale (PSS); Pittsburgh Sleep Quality Index (PSQI); Preparation, Action, Adaptive Learning (PAAL); Resilience Scale (RS); Rosenberg Self Esteem Scale (RSES); Self-Compassion Scale (SCS); Self Esteem Scale (SES), Chinese version; Self-Symptom Intensity Scale (SCL-90); State and Trait Anxiety Instrument (STAI); State-Trait Anxiety Inventory–Trait version (STAI-T); The Ways of Coping Questionnaire (WCQ); World Health Organization Well-Being Index (WHO-5). (F) Outcomes were classified into negative (i.e., mental ill health): anxiety symptoms, depressive symptoms, psychological distress, (perceived) stress, self-reported worry, avoidance coping (i.e., passive coping, including suppression), and deteriorated quality of sleep; and positive (mental health): self-esteem, self-compassion, self-efficacy, mental or subjective well-being, resilience, approach coping (i.e., active coping, including general and direct coping), happiness, stress management and academic performance.
(G) If a trial reported several measures of the same outcome at a given follow-up time point (e.g., depressive symptoms assessed by both the Hamilton Depression Rating Scale and the Beck Depression Inventory), the effect sizes (Hedges’ g) were combined to obtain a single outcome measure per intervention at each measurement point (36–40). The same applies to state and trait anxiety (or anxiety symptoms) measured separately within the same study (Higgins DM (36), Jones MC (37), Kanji N (38)). Likewise, measures of general and direct coping were combined within a study to obtain a single measure of approach coping (Jones MC (37)).
(H) If an original study reported post-baseline follow-up, it was recalculated into post-intervention follow-up by subtracting the length of the intervention from the reported months.
(I) For the meta-analysis, post-intervention follow-ups were categorized as 3–6 months, 7–12 months, and 13–18 months. If a given study provided several outcome measures falling into the same length category (e.g., at both the 3- and 6-month follow-ups), the follow-up closest to the upper boundary (i.e., 6 months) was chosen (38, 44, 45).
(J) Depression and anxiety, both measured with the GHQ-20, were not retrieved from Cheng M et al. (53) for the meta-analysis because the interpretation of the original measurements was unclear.
(K) Measures of anxiety from Chiauzzi E et al. (35) were not retrieved due to missing quantitative data.
(L) Outcome measures from follow-up periods longer than 18 months were not retrieved from Seligman et al. (1999) (47).
(M) The Overall Subjective Wellbeing score is composed of scores measured by the Positive and Negative Affect Schedule and the Satisfaction with Life Scale.
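The within-study pooling described in notes (C) and (G) follows the standard Hedges’ g formula (bias-corrected standardized mean difference). A minimal sketch of that computation is shown below; this is an illustration only, not the authors’ actual Stata code, the function names are hypothetical, and the simple-mean combination of several effect sizes is an assumption about the combining step:

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Bias-corrected standardized mean difference (Hedges' g)
    between an intervention arm (m1, sd1, n1) and a comparator
    arm (m2, sd2, n2)."""
    df = n1 + n2 - 2
    # Pooled standard deviation across the two trial arms
    s_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (m1 - m2) / s_pooled        # Cohen's d
    j = 1 - 3 / (4 * df - 1)        # small-sample correction factor
    return j * d

def combine_within_study(effect_sizes):
    """Combine several effect sizes for the same outcome within one
    study into a single measure (simple mean, assumed here)."""
    return sum(effect_sizes) / len(effect_sizes)
```

For example, two arms of 20 participants each with means 10 vs. 8 and a common SD of 2 give g ≈ 0.98 (Cohen’s d of 1.0 shrunk slightly by the small-sample correction).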

Publication Bias. Funnel plot.

Funnel plots of standard error by Hedges’ g effect sizes for mental ill health and for positive mental health and academic performance outcomes (hierarchically selected) from trials with different lengths of post-intervention follow-up: (A) Mental ill health outcomes with 3–6 months of follow-up; (B) Mental ill health outcomes with 7–12 months of follow-up; (C) Mental ill health outcomes with 13–18 months of follow-up; (D) Positive mental health and academic performance outcomes with 3–6 months of follow-up. No funnel plots were built for positive mental health and academic performance outcomes with 7–12 months and 13–18 months of follow-up because only two studies were included in each corresponding category.

Coding for meta-analysis and recalculation of original data for Stata 13.1 (StataCorp, College Station, TX).

Raw data. Variable legends and datasets used in the meta-analysis.

Legends of all variables (applicable for all included datasets). Excel sheet 1.

Hierarchical negative outcome (used in main analysis). Excel sheet 2.

Hierarchical positive outcome (used in main analysis). Excel sheet 3.

Individual outcomes (used in main analysis). Excel sheet 4.

Sensitivity analysis-1: Only low SMDs for negative outcomes (i.e., the lowest original effect sizes from studies with several effect sizes reported for the same outcome at the same follow-up point). Excel sheet 5.

Sensitivity analysis-2: Only high SMDs for negative outcomes (i.e., the highest original effect sizes from studies with several effect sizes reported for the same outcome at the same follow-up point). Excel sheet 6.

Sensitivity analysis-3: Only low SMDs for positive outcomes (i.e., the lowest original effect sizes from studies with several effect sizes reported for the same outcome at the same follow-up point). Excel sheet 7.

Sensitivity analysis-4: Only high SMDs for positive outcomes (i.e., the highest original effect sizes from studies with several effect sizes reported for the same outcome at the same follow-up point). Excel sheet 8.

PRISMA flow diagram.

Acknowledgements

We gratefully acknowledge Anders Wändahl and Carl Gornitzki, University Library, Karolinska Institutet, for advice and support and for providing us with electronic searches; our colleagues Annika Frykholm, Anna Månsdotter, and Sven Bremberg, who contributed to the screening and extraction of database searches; and stakeholders from Student Health Services for valuable input for practice.

Additional Information and Declarations

Competing Interests

The authors declare that they have no competing interests.

Author Contributions

Regina Winzer conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, authored or reviewed drafts of the paper, approved the final draft.

Lene Lindberg conceived and designed the experiments, performed the experiments, analyzed the data, authored or reviewed drafts of the paper, approved the final draft.

Karin Guldbrandsson performed the experiments, analyzed the data, authored or reviewed drafts of the paper, approved the final draft.

Anna Sidorchuk performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft.
