Abstract

Introduction
High-quality epidemiologic research is essential in reducing chronic diseases. We analyzed the quality of systematic reviews of observational nontherapeutic studies.

Methods
We searched several databases for systematic reviews of observational nontherapeutic studies that examined the prevalence of or risk factors for chronic diseases and were published in core clinical journals from 1966 through June 2008. We analyzed the quality of such reviews by using prespecified criteria and internal quality evaluation of the included studies.

Results
Of the 145 systematic reviews we found, fewer than half met each quality criterion; 49% reported study flow, 27% assessed gray literature, 2% abstracted sponsorship of individual studies, and none abstracted the disclosure of conflict of interest by the authors of individual studies. Planned, formal internal quality evaluation of included studies was reported in 37% of systematic reviews. The journal of publication, topic of review, sponsorship, and conflict of interest were not associated with better quality. Odds of formal internal quality evaluation (odds ratio [OR], 1.10 per year; 95% confidence interval [CI], 1.02-1.19) and either planned, formal internal quality evaluation or abstraction of quality criteria of included studies (OR, 1.17 per year; 95% CI, 1.08-1.26) increased over time, without positive trends in other quality criteria from 1990 through June 2008. Systematic reviews with internal quality evaluation did not meet other quality criteria more often than those that ignored the quality of included studies.

Conclusion
Collaborative efforts from investigators and journal editors are needed to improve the quality of systematic reviews.

Introduction

Valid epidemiologic research is essential in preventing chronic diseases (1-3). Assessing the quality of observational studies is an important part of evidence synthesis (4). Systematic reviews have become key tools in evidence synthesis from a growing number of epidemiologic studies (5). Producing high-quality systematic reviews is essential to developing generalizable and actionable conclusions (6,7). Quality criteria for systematic reviews have been proposed by working groups that
developed the Meta-analysis of Observational Studies in Epidemiology (MOOSE), Strengthening the Reporting of Observational Studies in Epidemiology (STROBE), and a measurement tool for assessment of multiple systematic reviews (AMSTAR) (8-12). The working groups and the Cochrane handbook (13) addressed those criteria for systematic reviews that more likely result in biased results, including bias in selection of the studies or the information within studies by the reviewers (14-18) or
bias in the publication of positive significant results (6,15,19,20).

Previous research and guidelines (13,21-23) focus on systematic reviews of interventional therapeutic studies. Validity of observational nontherapeutic studies of prevalence of chronic diseases or risk factors for diseases is essential for effective preventive public health actions (24,25).
Our aim was to evaluate the quality of systematic reviews of observational nontherapeutic studies that examined the incidence and prevalence of chronic conditions and risk factors for diseases. The criteria we used to determine the reporting and methodologic quality in systematic reviews were from published standards (8-12). We hypothesized that the quality of systematic reviews differs by the time when the study was published, the country in which the study was conducted, the journal of publication, the sponsorship of the study, and whether a conflict of interest was disclosed. We hypothesized also that systematic reviews with internal quality evaluation of the included studies would have better quality, demonstrating commitment to quality of evidence.

Methods

Data sources

We searched MEDLINE via PubMed and via Ovid MEDLINE, the Cochrane Library (26) and working groups, WorldCat (27), and Scirus (28) to find systematic reviews of observational nontherapeutic studies published in English from 1966 through June 2008 in core clinical journals (the exact search string is listed in Appendix Table 1). We used the definition of core clinical journals from the Abridged Index Medicus (119 indexed titles). We defined observational nontherapeutic studies as observations of patient outcomes that did not examine procedures concerned with the remedial treatment or prevention of diseases (29).

Study selection

Three investigators independently decided on the eligibility of the studies according to recommendations from the
Cochrane Handbook for Systematic Reviews of Interventions (13). We reviewed abstracts to exclude comments, expert opinions, letters, case reports, systematic reviews of interventional studies, and systematic reviews of studies of diagnostic accuracy of tests.

Data extraction

Evaluations of the studies and data extraction were performed independently by 2 researchers. Predefined categorical responses to the checklist items were abstracted into our spreadsheet. Errors in data extraction were assessed by a comparison of the data charts with the original articles (13,30). Any discrepancies were discussed and resolved. The quality criteria that we abstracted were based on guidelines for determining the reporting and methodologic quality of systematic reviews
(8-12).

To evaluate selection bias, we abstracted whether the authors of systematic reviews described the search strategy (yes, no, or partially); yes indicated that the authors reported time periods of searches, searched databases, and exact search string. We abstracted whether the authors of systematic reviews described study flow (yes, no, or partially); yes indicated that the authors reported the list of retrieved citations, the list of excluded studies, and justification for
exclusion.

We abstracted as dichotomous variables whether the authors of systematic reviews did any of the following:

Stated the aim of the review and the primary and secondary hypotheses of the review.

Included or justified exclusion of articles published in languages other than English.

Analyzed sponsorship of and conflict of interest in the included studies.

We abstracted how the authors of systematic reviews described their statistical methods, including whether the justification and the models used for pooling (fixed or random effects) were reported in sufficient detail to be replicated (no pooling, random, or fixed). We abstracted whether the authors of pooling analyses reported statistical tests for heterogeneity and whether heterogeneity was statistically significant (not reported, not significant, or significant).

We used 3 categories to classify whether the authors of systematic reviews had evaluated the quality of included studies by using developed or previously published checklists or scales (31): 1) the authors stated planned, formal internal quality evaluations; 2) the authors abstracted selected criteria of external or internal validity without using a planned, formal, and comprehensive internal quality evaluation; and 3) the authors did not conduct internal quality evaluations. We further compared reviews that evaluated any quality criteria with reviews that did not mention internal quality evaluation of the included studies. We also compared reviews with and without planned, formal internal quality evaluation. We abstracted blinding and reliability testing of internal quality evaluations as dichotomous responses (reported or not reported).

We abstracted several explanatory variables that could be related to the quality of systematic reviews:

The year of publication, defined as a continuous variable. We created categories of 4- or 5-year periods: 1990 to 1994, 1995 to 1999, 2000 to 2004, and 2005 through
June 2008.

The journals of publication.

The country where the systematic reviews were performed.

The sponsorship of the reviews. Those that had governmental or foundation support or fellowship support were defined as having nonprofit support.

The disclosure of conflict of interest by authors of reviews (either not disclosed, disclosed as no conflict of interest, or disclosed conflict
of interest).

The number of disclosed relationships with industry, defined as a continuous variable.

The sponsor’s participation in data collection, analysis, and interpretation of the results of the review.

The review outcomes, classified as risk factors for, or prevalence or incidence of, chronic conditions or diseases.

Data synthesis

We summarized the results in evidence tables. We used prespecified categories of dependent and independent variables and did not force the data into binary categories for definitive tests of significance. We used univariate logistic regression to examine the association between internal quality evaluation and the year of publication, and we tested the association with the Wald test. Odds ratios (ORs) were calculated with binary logit models fitted by Fisher scoring. We computed the fractions of systematic reviews meeting various quality criteria in each of the 4 time periods considered. The proportions of systematic reviews that met different levels of each quality criterion were compared by using χ2 tests or, in cases of small numbers, Fisher’s exact tests. We report 95% confidence intervals (CIs) and 2-sided P values; all calculations were performed with SAS version 9.1.3 (SAS Institute Inc, Cary, North Carolina).
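To make the regression step concrete, the following is a minimal Python sketch (not the SAS code used in the analysis) of a univariate binary logit model fitted by Fisher scoring, with a Wald 95% CI for the OR per year. The function names and the example data are hypothetical and are not taken from the reviews we analyzed; for the logit link, Fisher scoring coincides with Newton-Raphson.

```python
import math


def _score_and_information(b0, b1, x, y):
    """Score vector and Fisher information for logit P(y=1) = b0 + b1*x."""
    g0 = g1 = i00 = i01 = i11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        w = p * (1.0 - p)          # variance of a Bernoulli(p) outcome
        g0 += yi - p               # score for the intercept
        g1 += (yi - p) * xi        # score for the slope
        i00 += w
        i01 += w * xi
        i11 += w * xi * xi
    return (g0, g1), (i00, i01, i11)


def fit_logistic(x, y, iterations=50):
    """Fit the model by Fisher scoring; return (b0, b1, SE of b1)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        (g0, g1), (i00, i01, i11) = _score_and_information(b0, b1, x, y)
        det = i00 * i11 - i01 * i01
        # beta <- beta + I^{-1} * score (2x2 inverse written out)
        b0 += (i11 * g0 - i01 * g1) / det
        b1 += (i00 * g1 - i01 * g0) / det
    # Wald standard error of the slope from the inverse information
    _, (i00, i01, i11) = _score_and_information(b0, b1, x, y)
    det = i00 * i11 - i01 * i01
    return b0, b1, math.sqrt(i00 / det)


def odds_ratio_with_wald_ci(b1, se_b1, z=1.96):
    """OR per unit of x with a Wald CI computed on the log-odds scale."""
    return math.exp(b1), math.exp(b1 - z * se_b1), math.exp(b1 + z * se_b1)


# Hypothetical example: publication year (years since 1990) and whether a
# review reported formal internal quality evaluation (1 = yes, 0 = no).
years_since_1990 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
                    10, 11, 12, 13, 14, 15, 16, 17, 18, 18]
evaluated = [0, 0, 1, 0, 0, 0, 1, 0, 1, 0,
             1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

_, slope, se = fit_logistic(years_since_1990, evaluated)
or_per_year, ci_low, ci_high = odds_ratio_with_wald_ci(slope, se)
print(f"OR per year: {or_per_year:.2f} (95% CI, {ci_low:.2f}-{ci_high:.2f})")
```

Year is measured from 1990 in this sketch only to keep the iterations numerically well behaved; shifting the origin changes the intercept but not the OR per year.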

Results

We found 145 eligible systematic reviews of observational nontherapeutic studies (study flow in the Appendix Figure) (32-176). The number of published systematic reviews increased from 17
during 1990-1994 to 56 during 2005-2008. Most of the reviews were conducted in the United States (55 publications) or in the United Kingdom (28 publications) (Appendix Table 2). Half of the systematic reviews (73 publications) were funded by nonprofit organizations; 56 (39%) reviews did not report their funding sources, 4 reviews received industry support, and 10 were sponsored jointly by industry and nonprofit organizations. The authors of almost three-fourths (106) of the systematic reviews did not disclose conflict of interest; 35 publications stated that the authors did not have any conflict of interest; and 4 reviews were conducted by authors who reported conflict of interest. The reviews were published in 49 journals. Most systematic reviews (122 studies) assessed risk factors for chronic diseases,
19 summarized estimates of prevalence or incidence, 2 studies reported prevalence and associations with risk factors, and 2 studies examined levels of risk factors. Most studies reported incidence and risk factors for cardiovascular diseases (46 studies) or cancer (26 studies).

Quality of systematic reviews

Fewer than half of the reviews reported study flow (49%), assessed gray literature (27%), or addressed language bias (29%) (Table 1). Only 2% of reviews abstracted sponsorship of individual studies, and none abstracted the disclosure of conflict of interest by the authors of individual studies that were eligible for the reviews. Pooling was performed in 137 reviews; of these, 62% used a random effects model, 57% reported detecting significant heterogeneity across the studies, and 19% did not provide any information about statistical heterogeneity in pooled estimates. The proportion of systematic reviews that met quality criteria including study flow, assessment of gray literature, or the abstraction of funding sources of included studies did not show significant trends from 1990 through 2008. The proportion of systematic reviews that assessed language bias increased from 8% during 1995-1999 to 41% during 2005-2008. In later years, more reviews reported using random effects models (79% during 2005-2008 vs 39% during 1995-1999) and tests for statistical heterogeneity (89% during 2005-2008 vs 65% during 1995-1999).

Internal quality evaluation

Planned and detailed quality assessment of included studies was reported in 37% of systematic reviews, and 18% abstracted more than 1 criterion of external or internal quality; significant positive trends were observed over the evaluated period (Table 1). Quality assessment was masked in 3 reviews. Development of the appraisals, including references to previously published tools, was reported in 32 reviews, but only 6 tested interobserver agreement for quality assessment.

Quality of systematic review by explanatory factors

The quality of systematic reviews did not differ much by study location or by the journal of publication. Systematic reviews of prevalence or incidence
or risk factors of the diseases did not differ in their quality measures. Sponsorship was not associated with quality of the reviews. The role of conflict of interest was impossible to establish because the authors of 56 reviews did not disclose funding and authors of 106 reviews did not disclose conflict of interest.

Explanatory factors of internal quality evaluation of included studies

The journal of publication, topic of the review, and continent where the review was conducted were not associated with the likelihood of internal quality evaluation. Systematic reviews of risk factors tended to conduct internal quality evaluation of the included studies more often than reviews of incidence or prevalence or of levels of risk factors. Systematic reviews sponsored by nonprofit organizations conducted internal quality evaluations of individual studies more often than reviews that received corporate funding. Systematic reviews that disclosed conflict of interest conducted internal quality evaluation of individual studies less frequently (10 of 39 studies; 26%) than reviews with no disclosure (44 of 106 studies; 42%). Odds of formal internal quality evaluation (OR, 1.10 per year; 95% CI, 1.02-1.19) and either planned, formal internal quality evaluation or abstraction of quality criteria (OR, 1.17 per year; 95% CI, 1.08-1.26) increased over time. Disclosure of conflict of interest by the authors of systematic reviews was not associated with greater odds of internal quality evaluation.
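The comparison by disclosure status can be laid out as a 2×2 table. The Python sketch below is illustrative only and is not the original SAS analysis: it uses the counts reported in this paragraph (10 of 39 vs 44 of 106) and computes an OR with a Woolf (log-scale) 95% CI and a 2-sided Fisher exact P value. The function names are hypothetical, and the 2-sided rule used here (summing all tables no more probable than the observed one) is an assumption about the convention.

```python
from math import comb, exp, log, sqrt


def odds_ratio_woolf_ci(a, b, c, d, z=1.96):
    """OR and Woolf (log-scale Wald) CI for the 2x2 table [[a, b], [c, d]]."""
    or_ = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return or_, exp(log(or_) - z * se), exp(log(or_) + z * se)


def fisher_exact_two_sided(a, b, c, d):
    """2-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(k):
        # Hypergeometric probability of k events in row 1, margins fixed
        return comb(row1, k) * comb(row2, col1 - k) / comb(total, col1)

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_obs * (1 + 1e-9))


# Counts from the text: internal quality evaluation in 10 of 39 reviews that
# disclosed conflict of interest and in 44 of 106 reviews with no disclosure.
a, b = 10, 39 - 10
c, d = 44, 106 - 44

or_, ci_low, ci_high = odds_ratio_woolf_ci(a, b, c, d)
p_value = fisher_exact_two_sided(a, b, c, d)
print(f"OR {or_:.2f} (95% CI, {ci_low:.2f}-{ci_high:.2f}); Fisher P = {p_value:.3f}")
```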

Quality of systematic reviews by internal quality evaluation

Complete documentation of the literature search including time period, databases searched, and exact literature search strings was less common among reviews with planned, formal internal quality evaluation (48 studies, 35%) than among reviews without it (90 studies, 65%) (Table 2). However, reviews that either abstracted selected quality criteria or planned, formal internal quality evaluation reported partial (6 studies) or complete (74 studies) information about the literature search more often than studies that did not evaluate quality of included studies (64 studies). Reviews that did not justify exclusion of non-English studies ignored quality of individual studies more often (72 studies) than reviews with planned, formal internal quality evaluation (31 studies). The same pattern was present for publication bias: the reviews that did not mention gray literature also ignored the quality of individual studies. The reviews reporting attempts to contact the authors of included studies either performed planned, formal internal quality evaluation or abstracted selected quality criteria more often than reviews without such attempts (OR, 2.3; 95% CI, 1.1-4.7). Reviews with complete reporting of study flow performed planned, formal internal quality evaluation or abstracted quality criteria more often (51 studies) than reviews without study flows (20 studies). More than half of systematic reviews without planned, formal internal quality evaluation (44 studies) also did not report study flow.
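As a reading aid, the OR reported above for attempts to contact authors (OR, 2.3; 95% CI, 1.1-4.7) can be checked for internal consistency. The check assumes that the interval is a Wald interval computed on the log-odds scale; this is an assumption, because the text does not state how this particular interval was obtained.

```latex
\begin{align*}
\widehat{\mathrm{SE}}(\ln \mathrm{OR})
  &\approx \frac{\ln 4.7 - \ln 1.1}{2 \times 1.96}
   \approx \frac{1.548 - 0.095}{3.92}
   \approx 0.37, \\
\exp\!\left(\frac{\ln 1.1 + \ln 4.7}{2}\right)
  &= \sqrt{1.1 \times 4.7}
   \approx 2.27 .
\end{align*}
```

Under that assumption, the geometric midpoint of the limits (about 2.27) agrees with the reported point estimate of 2.3 after rounding, and the lower limit above 1 corresponds to a 2-sided P value below .05.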

The association between quality of systematic reviews and sponsor participation in the data collection, analyses, and interpretation was difficult to analyze because this information was either omitted or reported in various ways. Less than 10% of systematic reviews contained a clear statement that the sponsors did not play any role in gathering the studies or analyzing or interpreting the results and did not influence the content of the manuscript. Other reviews omitted mention of the role
of the sponsor in approval of the manuscript or provided a general statement that sponsors did not influence the conclusions or the content of the paper. Two reviews included statements of unconditional or unrestricted sponsorship of the meta-analyses.

Discussion

Our analyses showed that fewer than half of the systematic reviews of nontherapeutic observational studies that were published in core clinical journals met each quality criterion. Quality of systematic reviews did not improve over time. Planned, formal internal quality evaluation of the included studies was reported in fewer than half of systematic reviews, but the prevalence of internal quality evaluations has increased during the last decade. Our findings are in concordance with previously published
methodologic analyses of systematic reviews that also found inconsistent quality and incomplete internal quality evaluation of individual studies (6). Methodologic analyses of systematic reviews that focused on particular diseases or conditions demonstrated that half of the publications had major flaws in design and reporting. For instance, systematic reviews of therapies for renal diseases failed to assess the methodologic quality of included studies (177). Methodologic analyses of systematic
reviews of interventions showed that 69% of those randomly selected in MEDLINE meta-analyses did not analyze quality of trials (22). Most (68%) systematic reviews of diagnostic tests
for cancer did not provide formal assessments of study quality (178). We also found that the quality of reviews did not differ among types of studies (incidence or risk factors for diseases), types of diseases, or journal of publication.

Journal commitment to high-quality research, however, was associated with improved reporting quality of the publications. For example, adoption by journals of the Consolidated Standards of Reporting Trials (CONSORT) improved the quality of the publications of interventional studies (179,180). An endorsement of the developed standards for observational studies, including the MOOSE and STROBE checklists, may also improve quality of the publications. We did not analyze how many core clinical journals adopted these standards or how quality of the publications changed after adoption. Peer review of submitted manuscripts should include quality assessment using validated tools (12).

We could not identify the factors that can explain differences in quality of systematic reviews. The role of sponsorship and conflict of interest could not be estimated because of poor reporting of this information.
The quality and reliability of the quality evaluations of the included studies are unclear because development of the appraisals was described in a minority of systematic reviews (32 of 80 studies), and only 6 of 80 studies tested interobserver agreement for quality assessment. We did not evaluate all reviews of observational studies that were published in epidemiologic journals. However, it is unlikely that the quality of reviews published in other journals would be better than that of reviews in core clinical journals. Future research should investigate the factors that can explain differences in the quality of systematic reviews.

Peer-reviewed publications of high-quality systematic reviews can provide the best available research evidence for evidence-based public health (24). Evidence-based decisions can improve public health practice in preventing incidence and progression of chronic diseases (25). In our analysis, fewer than half of the systematic reviews of observational nontherapeutic studies met quality criteria established in the MOOSE, STROBE, and AMSTAR statements. Internal quality evaluation of included studies should be an essential part of evidence synthesis, but only half of the reviews reported such evaluation. Collaborative efforts from investigators and journal editors are needed to improve quality of systematic reviews.

Acknowledgments

This article is based on research conducted by the Minnesota Evidence-based Practice Center under contract to the Agency for Healthcare Research and Quality (AHRQ), Rockville, Maryland (contract no. 290-02-0009).

We thank our reviewers David Atkins, MD, John Hoey, MD, and Christine Laine, MD, for reviewing and commenting on the draft; our collaborating experts, Mohammed Ansari, MBBS, Ethan Balk, MD, Nancy Berkman, PhD, Chantelle Garritty, Mark Grant, MD, Gail Janes, PhD, Margaret Maglione, MPP, David Moher, PhD, Mona Nasser, DDS, Gowri Raman, MD, Karen Robinson, MD, Jodi Segal, MD, and Thomas Trikalinos, PhD, for their scientific input throughout this project; and Carmen Kelly, PharmD, our task order officer, and Stephanie Chang, MD, medical officer, at AHRQ for their guidance throughout the project. We also thank librarian Judith Stanke for her contributions to the literature search; research assistants Emily Zabor, candidate for the master of science degree (MS) in biostatistics, and Akweley Ablorh, candidate for MS in biostatistics, for the data abstraction, quality control, and synthesis of evidence; Zhihua Bian, candidate for MS in biostatistics, for her statistical help; Zhiyuan Xu, candidate for MS in applied economics, for his work creating the ACCESS database; Dean McWilliams for his assistance in database development; Qi Wang, research fellow, for her statistical expertise in reliability testing; Susan Duval, PhD, for her help estimating sample size; Marilyn Eells for editing and formatting the report; Nancy Russell, MLS, and Rebecca Schultz for their assistance gathering data from the experts and formatting the tables; and Christa Prodzinski for quality control of the data.

US Preventive Services Task Force, Agency for Healthcare Research and Quality. The guide to clinical preventive services, 2008: recommendations of the US Preventive Services Task Force. Rockville (MD): Agency for Healthcare Research and Quality; 2008.

Principles of epidemiology in public health practice: an introduction to applied epidemiology and biostatistics. 3rd edition. Atlanta (GA): US Department of Health and Human Services, Centers for Disease Control and Prevention, Office of Workforce and Career Development; 2006.

Appendix

Table 1. Search Strategy and Exact Search Strings Used to Identify Systematic Reviews of Observational Studies, Scales and Checklists for Internal Quality Evaluation, and Studies About Bias in Observational Research, 1966 Through June 2008

“Epidemiologic Studies”[MeSH] AND “Research Design/standards”[MeSH] AND (“Evaluation Studies as Topic/classification”[MeSH] OR “Evaluation Studies as Topic/methods”[MeSH] OR “Evaluation Studies as Topic/standards”[MeSH]) Limits: Humans, Journal Article, English

Figure. Study flow to identify systematic reviews of observational studies, scales and checklists for planned formal internal quality evaluation, and studies about bias in observational research, 1990 through June 2008.

Table 2. Quality of Systematic Review and Meta-Analyses of Nontherapeutic Observational Studies Published in Core Clinical Journals, 1990 Through June 2008

Publication Characteristics

Outcome

Estimate

Assessment of Quality of Included Studies

Bracken, 1990 (32). Country: United States; Journal: Obstet Gynecol; Sponsorship: Not reported; Conflict of interest (COI): Not reported; Sponsor participation in data analyses: Not reported

Hatsukami and Fischman, 1996 (53). Country: United States; Journal: JAMA; Sponsorship: Not reported; COI: Not reported; Sponsor participation in data analyses: Not reported

Use of crack cocaine and cocaine hydrochloride

Prevalence

No

Hill and Schoener, 1996 (54). Country: United States; Journal: Am J Psychiatry; Sponsorship: Not reported; COI: Not reported; Sponsor participation in data analyses: Not reported

Attention deficit hyperactivity disorder

Prevalence

No

Hackshaw et al, 1997 (55). Country: United Kingdom; Journal: BMJ; Sponsorship: Government; COI: Reported as not a conflict of interest; Sponsor participation in data analyses: “The views expressed are those of the authors and not necessarily those of the Department of Health.”

Law and Hackshaw, 1997 (57). Country: United Kingdom; Journal: BMJ; Sponsorship: None; COI: Reported as not a conflict of interest; Sponsor participation in data analyses: None

Hip fracture

Risk

No

Law et al, 1997 (58). Country: United Kingdom; Journal: BMJ; Sponsorship: Government; COI: Reported as not a conflict of interest; Sponsor participation in data analyses: “The Department of Health (England) supported this work, although the views are our own.”

WHO Collaborative Study Team on the Role of Breastfeeding on the Prevention of Infant Mortality, 2000 (78). Country: Brazil; Journal: Lancet; Sponsorship: Nonprofit organization; COI: Not reported; Sponsor participation in data analyses: Not reported

Juul et al, 2002 (90). Country: Denmark; Journal: Blood; Sponsorship: Government, nonprofit organization; COI: Not reported; Sponsor participation in data analyses: “They had no role in gathering, analyzing, or interpreting the data and had no right to approve or disapprove the submitted paper.”

Thurnham et al, 2003 (107). Country: United Kingdom; Journal: Lancet; Sponsorship: Government, fellowship; COI: Reported as not a conflict of interest; Sponsor participation in data analyses: “The funding source had no role in study design, data collection, data analysis, data interpretation, or in the writing of this report.”

Fazel et al, 2005 (125). Country: United Kingdom; Journal: Lancet; Sponsorship: Nonprofit organization; COI: Reported as not a conflict of interest; Sponsor participation in data analyses: “The sponsors of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication.”

Serious mental disorder

Prevalence

Quality criteria abstracted

García-Closas et al, 2005 (126). Country: United States; Journal: Lancet; Sponsorship: Nonprofit organization; COI: Reported as not a conflict of interest; Sponsor participation in data analyses: “The study sponsors had no role in the design of the study; in the collection, analysis, or interpretation of the data; or in the writing of the report. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit the paper for publication.”

Boudville et al, 2006 (132). Country: Canada; Journal: Ann Intern Med; Sponsorship: Government, fellowship; COI: Reported as not a conflict of interest; Sponsor participation in data analyses: “The study sponsors had no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the paper for publication.”

Di Castelnuovo et al, 2006 (135). Country: Italy; Journal: Arch Intern Med; Sponsorship: Government; COI: Not reported; Sponsor participation in data analyses: “The sponsor of the study had no involvement in study design; data collection, analysis, or interpretation; writing of the report; or in the decision to submit the paper for publication.”

Eichler et al, 2007 (155). Country: Switzerland; Journal: Am Heart J; Sponsorship: Nonprofit organization; COI: Not reported; Sponsor participation in data analyses: “The funding source had no influence on study design; in the collection, analysis, and interpretation of the data; in the writing of the manuscript; and in the decision to submit the manuscript for publication.”

Grulich et al, 2007 (157). Country: Australia; Journal: Lancet; Sponsorship: Government, fellowship, scholarship; COI: Reported as a conflict of interest; Sponsor participation in data analyses: “There was no funding source for this study. All authors had access to all the data. The corresponding author had final responsibility for the decision to submit for publication.”

Hirtz et al, 2007 (159). Country: United States; Journal: Neurology; Sponsorship: Not reported; COI: Reported as not a conflict of interest; Sponsor participation in data analyses: Not reported

Common neurologic disorders

Prevalence

Yes

Huxley et al, 2007 (160). Country: Australia; Journal: Am J Clin Nutr; Sponsorship: Government, nonprofit organization; COI: Reported as not a conflict of interest; Sponsor participation in data analyses: “None of the funding sources had any role in the study design, data analysis, data interpretation, writing of the paper, or the decision to submit the paper for publication.”

Langan et al, 2007 (162). Country: United Kingdom; Journal: Arch Dermatol; Sponsorship: Nonprofit organization; COI: Reported as not a conflict of interest; Sponsor participation in data analyses: “The sponsor had no role in the design and conduct of the study; in the collection, analysis, and interpretation of data; or in the preparation, review, or approval of the manuscript.”

Larsson and Wolk, 2007 (164). Country: Sweden; Journal: Gastroenterology; Sponsorship: Nonprofit organization; COI: Reported as not a conflict of interest; Sponsor participation in data analyses: “The sponsor had no role in the study design or in the collection, analysis, and interpretation of the data.”

Polanczyk et al, 2007 (168). Country: Brazil; Journal: Am J Psychiatry; Sponsorship: Industry, foreign grants; COI: Not reported; Sponsor participation in data analyses: “There was no involvement of any funding source in the study design, data collection, analysis, interpretation of data, and writing of this article or in the decision to submit the article for publication.”

Conde-Agudelo et al, 2008 (175). Country: United States; Journal: Am J Obstet Gynecol; Sponsorship: Government; COI: Not reported; Sponsor participation in data analyses: “The views expressed in this document are solely the responsibility of the authors and do not necessarily represent the views of the World Health Organization.”