Category Archives: evidence-based practice


Another Study Warns That Evidence From Observational Studies Provides Unreliable Results For Therapies

We have previously mentioned the enormous contributions made by John Ioannidis MD in the area of understanding the reliability of medical evidence. [Ioannidis, Delfini Blog, Giannakakis] We want to draw your attention to a recent publication dealing with the risks of relying on observational data for cause and effect conclusions. [Hemkens] In this recent study, Hemkens, Ioannidis and other colleagues assessed differences in mortality effect size reported in observational (routinely collected data [RCD]) studies as compared with results reported in RCTs.

Eligible RCD studies used propensity scores in an effort to address confounding bias in the observational studies. The authors compared the results of RCD and RCTs. The analysis included only RCD studies conducted before any RCT was published on the same topic. They assessed the risk of bias for RCD studies and randomized controlled trials (RCTs) using The Cochrane Collaboration risk of bias tools. The direction of treatment effects, confidence intervals and effect sizes (odds ratios) were compared between RCD studies and RCTs. The relative odds ratios were calculated across all pairs of RCD studies and trials.
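To make the comparison concrete, here is a minimal sketch of one common way to express a relative odds ratio: the odds ratio from the RCD study divided by the odds ratio from the matching RCT. The function names, the direction of the ratio and the numbers are illustrative assumptions, not details taken from the Hemkens paper.

```python
def odds_ratio(events_treated, n_treated, events_control, n_control):
    """Odds ratio for an event (e.g., death) in treated versus control subjects."""
    odds_treated = events_treated / (n_treated - events_treated)
    odds_control = events_control / (n_control - events_control)
    return odds_treated / odds_control


def relative_odds_ratio(or_rcd, or_rct):
    """Assumed convention: RCD odds ratio divided by RCT odds ratio.

    With this convention, a value below 1 means the observational study
    reported a more favorable mortality effect than the randomized trial.
    """
    return or_rcd / or_rct


# Hypothetical pair: the RCD study reports OR 0.70, the later RCT reports OR 0.90.
ror = relative_odds_ratio(0.70, 0.90)
print(f"Relative odds ratio: {ror:.2f}")
```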

These authors remind us yet again that if no randomized trials exist, clinicians and other decision-makers should not trust results from observational data from sources such as local or national databases, registries, cohort or case-control studies.


“Reading” a Clinical Trial Won’t Get You There—or Let’s Review (And Apply) Some Basics About Assessing The Validity of Medical Research Studies Claiming Superiority for Efficacy of Therapies

An obvious question raised by the title is, “Get you where?” Well, the answer is, “To where you know it is reasonable to think you can trust the results of the study you have just finished reading.” In this blog, our focus is on how to critically appraise medical research studies which claim superiority for efficacy of a therapy.

Because of Lack of Understanding Medical Science Basics, People May Be Injured or Die

Understanding basic requirements for valid medical science is very important. Numbers below are estimates, but are likely to be close or understated—

Over 63,000 people with heart disease died after taking encainide or flecainide because many doctors thought taking these drugs “made biological sense,” but did not understand the simple need for reliable clinical trial information to confirm what seemed to “make sense” [Echt 91].

An estimated 60,000 people in the United States died and another 140,000 experienced a heart attack resulting from the use of a nonsteroidal anti-inflammatory drug despite important benefit and safety information reported in the abstract of the pivotal trial used for FDA approval [Graham].

In another example, roughly 42,000 women with advanced breast cancer suffered excruciating side effects without any proof of benefit, many of them dying as a result, and at a cost of $3.4 billion [Mello].

At least 64 deaths out of 751 cases in nearly half the United States were linked to fungal meningitis thought to be caused by a contaminated treatment that is used for back and radicular pain—but there is no reliable scientific evidence of benefit from that treatment [CDC].

In the above instances, these were preventable deaths and harms—from common treatments—which patients might have avoided if their physicians had better understood the importance and methods of evaluating medical science.

Failures to Understand Medical Science Basics

Many health care professionals don’t know how to quickly assess a trial for reliability and clinical usefulness, and yet mastering the basics is not difficult. Over the years, we have given a pre-test of 3 simple questions to more than a thousand physicians, pharmacists and others who have attended our training programs. Approximately 70% fail, with “failure” defined as missing 2 or 3 of the questions.

One pre-test question is designed to see if people recognize the lack of a comparison group in a report of the “effectiveness” of a new treatment. Without a comparison group of people with similar prognostic characteristics who are treated exactly the same except for the intervention under study, you cannot discern cause and effect of an intervention because a difference between groups may explain or affect the results.

A second pre-test question deals with presenting results as relative risk reduction (RRR) without absolute risk reduction (ARR) or event rates in the study groups. A “relative” measure raises the question, “Relative to what?” Is the reported RRR in our test question 60 percent of 100 percent? Or 60 percent of 1 percent?
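The arithmetic behind this question can be sketched in a few lines. The event rates below are hypothetical and chosen only to mirror the “60 percent of 100 percent” versus “60 percent of 1 percent” contrast; the function name is ours.

```python
def risk_summary(control_event_rate, treated_event_rate):
    """Return absolute risk reduction, relative risk reduction and number needed to treat."""
    arr = control_event_rate - treated_event_rate
    rrr = arr / control_event_rate
    nnt = 1 / arr if arr > 0 else float("inf")
    return {"ARR": arr, "RRR": rrr, "NNT": nnt}


# Two hypothetical trials with the same 60% relative risk reduction.
high_baseline = risk_summary(1.00, 0.40)   # everyone in the control group has the event
low_baseline = risk_summary(0.01, 0.004)   # 1% of the control group has the event

for label, result in [("high baseline risk", high_baseline),
                      ("low baseline risk", low_baseline)]:
    print(f"{label}: RRR {result['RRR']:.0%}, "
          f"ARR {result['ARR']:.1%}, NNT about {result['NNT']:.0f}")
```

The RRR is identical in both cases, but the absolute benefit differs a hundredfold, which is why event rates or ARR need to be reported alongside any relative measure.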

The last of our pre-test questions assesses attendees’ basic understanding of only one of the two requirements to qualify as an Intention-to-Treat (ITT) analysis. The two requirements are that people should be analyzed as randomized (that is, in the groups to which they were originally assigned) and that all people should be included in the analysis whether they have discontinued, are missing or have crossed over to other treatment arms. The failure rate at knowing this last requirement is very high. (We will add that this last requirement means that a value has to be assigned if one is missing, and so one of the most important aspects of critically appraising an ITT analysis is the evaluation of the methods for “imputing” missing data.)
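One way to see why the imputation method matters is to recompute a dichotomous result under opposite assumptions about the missing subjects. The sketch below is an illustration with made-up counts, not a method prescribed by any particular trial; the function name and its parameters are hypothetical.

```python
def itt_risk_difference(events_t, missing_t, n_t,
                        events_c, missing_c, n_c,
                        assumed_rate_t, assumed_rate_c):
    """Risk difference (control minus treatment) with every randomized subject counted.

    assumed_rate_t and assumed_rate_c are the proportions of missing subjects
    in each group that we assume had the event.
    """
    risk_t = (events_t + assumed_rate_t * missing_t) / n_t
    risk_c = (events_c + assumed_rate_c * missing_c) / n_c
    return risk_c - risk_t


# Hypothetical trial: 100 randomized per group, 10 missing per group,
# 10 observed events on treatment and 20 on control.
counts = dict(events_t=10, missing_t=10, n_t=100,
              events_c=20, missing_c=10, n_c=100)

# Scenario favoring treatment: no missing treated subject had the event,
# every missing control subject did.
best_case = itt_risk_difference(**counts, assumed_rate_t=0.0, assumed_rate_c=1.0)

# Scenario against treatment: every missing treated subject had the event,
# no missing control subject did.
worst_case = itt_risk_difference(**counts, assumed_rate_t=1.0, assumed_rate_c=0.0)

print(f"Best case risk difference:  {best_case:.2f}")
print(f"Worst case risk difference: {worst_case:.2f}")
```

If the conclusion survives the scenario that works against the treatment, missing data are unlikely to explain the result. In this made-up example it does not survive, since the apparent benefit disappears entirely.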

By the end of our training programs, success rates have always markedly improved. Others have reported similar findings.

There is a Lot of Science + Much of It May Not Be Reliable

Each week more than 13,000 references are added to the world’s largest medical library, the National Library of Medicine (NLM). Unfortunately, many of these studies are seriously flawed. One large review of 60,352 studies reported that only 7 percent passed criteria of high quality methods and clinical relevancy [McKibbon]. We and others have estimated that up to (and maybe more than) 90% of the published medical information that health care professionals rely on is flawed [Freedman, Glasziou].

Bias Distorts Results

We cannot know if an intervention is likely to be effective and safe without critically appraising the evidence for validity and clinical usefulness. We need to evaluate the reliability of medical science prior to seriously considering the reported therapeutic results because biases such as lack of or inadequate randomization, lack of successful blinding or other threats to validity (which we will describe below) can distort reported results by up to 50 percent or more [see Risk of Bias References].

Patients Deserve Better

Patients cannot make informed choices regarding various interventions without being provided with quantified projections of benefits and harms from valid science.

Some Simple Steps To Critical Appraisal

Below is a short summary of our simplified approach to critically appraising a randomized superiority clinical trial. Our focus is on “internal validity” which means “closeness to truth” in the context of the study. “External validity” is about the likelihood of reaching truth outside of the study context and requires judgment about issues such as fit with individuals or populations in circumstances other than those in the trial.

Is reading this study worth my time? If the results are true, would they change my practice? Do they apply to my situation? What is the likely impact to my patients?

Can anything explain the results other than cause and effect? Evaluate the potential for results being distorted by bias (anything other than chance leading away from the truth) or random chance effects.

Is there any difference between groups other than what is being studied? This is automatically a bias.

If the study appears to be valid, but attrition is high, sometimes it is worth asking, what conditions would need to be present for attrition to distort the results? Attrition does not always distort results, but may obscure a true difference due to the reduction in sample size.

Evaluating Bias

There are four stages of a clinical trial, and you should ask several key questions when evaluating bias in each stage.

Subject Selection & Treatment Assignment—Evaluation of Selection Bias

Important considerations include how were subjects selected for study, were there enough subjects, how were they assigned to their study groups, and were the groups balanced in terms of prognostic variables?

Your critical appraisal to-do list includes—

a) Checking to see if the randomization sequence was generated in an acceptable manner. (Minimization may be an acceptable alternative.)

b) Determining whether the investigators adequately concealed the allocation of subjects to each study group. That is, was the method for assigning treatment hidden so that an investigator could not manipulate the assignment of a subject to a selected study group?

c) Examining the table of baseline characteristics to determine whether randomization was likely to have been successful, i.e., that the groups are balanced in terms of important prognostic variables (e.g., clinical and demographic variables).

The Intervention & Context—Evaluation of Performance Bias

What is being studied, and what is it being compared to? Was the intervention likely to have been executed successfully? Was blinding likely to have been successful? Was duration reasonable for treatment as well as for follow-up? Was adherence reasonable? What else happened to study subjects in the course of the study such as use of co-interventions? Were there any differences in how subjects in the groups were treated?

Your to-do list includes evaluating:

a) Adequacy of blinding of subjects and all working with subjects and their data—including likely success of blinding;

b) Subjects’ adherence to treatment;

c) Inter-group differences in treatment or care except for the intervention(s) being studied.

Data Collection & Loss of Data—Evaluation of Attrition Bias

What information was collected, and how was it collected? What data are missing and is it likely that missing data could meaningfully distort the study results?

b) Classification and quantification of missing data in each group (e.g., discontinuations due to ADEs, unrelated deaths, protocol violations, loss to follow-up, etc.)

c) Whether missing data are likely to distort the reported results. This is the area in which the evidence on the distorting risk of bias provides the least help. And so, again, often it is worthwhile asking, “What conditions would need to be present for attrition to distort the results?”

Results & Assessing The Differences In The Outcomes Of The Study Groups—Evaluating Assessment Bias

Were outcome measures reasonable, pre-specified and analyzed appropriately? Was reporting selective? How was safety assessed? Remember that models are not truth.

Your to-do list includes evaluating—

a) Whether assessors were blinded.

b) How the effect size was calculated (e.g., absolute risk reduction, relative risk, etc.). You especially want to know benefit or risk with and without treatment.

c) Were confidence intervals included? (You can calculate these yourself online, if you wish. See our web links at our website for suggestions.)

d) For dichotomous variables, was a proper intention-to-treat (ITT) analysis conducted with a reasonable choice for imputing values for missing data?

e) For time-to-event trials, were censoring rules unbiased? Was the number of censored subjects reported?
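For readers unfamiliar with how censoring enters a time-to-event analysis, here is a minimal sketch of a Kaplan-Meier estimate in plain Python. The follow-up times and event flags are hypothetical, and the function name is ours; real analyses would normally use a statistics package.

```python
def kaplan_meier(times, event_observed):
    """Return (curve, number_censored).

    times: follow-up time for each subject.
    event_observed: True if the event occurred, False if the subject was censored.
    curve: list of (time, estimated survival probability) at each event time.
    """
    data = sorted(zip(times, event_observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        same_time = [observed for time, observed in data if time == t]
        events = sum(1 for observed in same_time if observed)
        if events:
            # Only observed events lower the survival estimate.
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after time t.
        at_risk -= len(same_time)
        i += len(same_time)
    censored = sum(1 for observed in event_observed if not observed)
    return curve, censored


# Hypothetical follow-up (months) for five subjects; two are censored.
curve, censored = kaplan_meier([2, 3, 3, 5, 8], [True, False, True, True, False])
for t, s in curve:
    print(f"month {t}: estimated survival {s:.2f}")
print(f"censored subjects: {censored}")
```

Censored subjects never count as events; they simply shrink the number at risk. That is why the estimate is only trustworthy if the reasons for censoring are unrelated to prognosis, and why the number censored in each group should be reported.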

After you have evaluated a study for bias and chance and have determined that the study is valid, the study results should be evaluated for clinical meaningfulness (e.g., the amount of clinical benefit and the potential for harm). Clinical outcomes include morbidity; mortality; symptom relief; physical, mental and emotional functioning; and quality of life, or any surrogate outcomes that have been demonstrated in valid studies to affect a clinical outcome.

Final Comment

It is not difficult to learn how to critically appraise a clinical trial. Health care providers owe it to their patients to gain these skills. Health care professionals cannot rely on abstracts and authors’ conclusions; they must assess studies first for validity and second for clinical usefulness. Authors are often biased, even with the best of intentions. Remember that authors’ conclusions are opinions, not evidence. Authors frequently use misleading terms or draw misleading conclusions. Physicians and others who lack critical appraisal skills are often misled by authors’ conclusions and summary statements. Critical appraisal knowledge is required to evaluate the validity of a study, which must be done prior to seriously considering reported results.

For those who wish to go more deeply, we have books available and do training seminars. See our website at www.delfini.org.


Sounding the Alarm (Again) in Oncology

Five years ago Fojo and Grady sounded the alarm about value in many of the new oncology drugs [1]. They raised the following issues and challenged oncologists and others to get involved in addressing these issues:

There is a great deal of uncertainty and confusion about what constitutes a benefit in cancer therapy; and,

How much should cost factor into these deliberations?

The authors review a number of oncology drug studies reporting increased overall survival (OS) ranging from a median of a few days to a few months with total new drug costs ranging from $15,000 to $90,000 plus. In some cases, there is no increase in OS, but only progression free survival (PFS) which is a weaker outcome measure due to its being prone to tumor assessment biases and is frequently assessed in studies of short duration. Adverse events associated with the new drugs are many and include higher rates of febrile neutropenia, infusion-related reactions, diarrhea, skin toxicity, infections, hypertension and other adverse events.

Fojo and Grady point out that—

“Many Americans would likely not regard a 1.2-month survival advantage as ‘significant’ progress, the much revered P value notwithstanding. But would an individual patient agree? Although we lack the answer to this question, we would suggest that the death of a mother of four at age 37 years would be no less painful were it to occur at age 37 years and 1 month, nor would the passing of a 67-year-old who planned to travel after retiring be any less difficult for the spouse were it to have occurred 1 month later.”

In a recent article [2] (thanks to Dr. Richard Lehman for drawing our attention to this article in his wonderful BMJ blog) Fojo and colleagues again point out that—

Cancer is the number one cause of mortality worldwide, and cancer cases are projected to rise by 75% over the next 2 decades.

Of the 71 therapies for solid tumors receiving FDA approval from 2002 to 2014, only 30 of the 71 approvals (42%) met the American Society of Clinical Oncology Cancer Research Committee’s “low hurdle” criteria for clinically meaningful improvement. Further, the authors tallied results from all the studies and reported very modest collective median gains of 2.5 months for PFS and 2.1 months for OS. Numerous surveys have indicated that patients expect much more.

Expensive therapies are stifling progress by (1) encouraging enormous expenditures of time, money, and resources on marginal therapeutic indications; and, (2) promoting a me-too mentality that is stifling innovation and creativity.

The last bullet needs a little explaining. The authors provide a number of examples of “safe bets” and argue that revenue from such safe and profitable therapies rather than true need has been a driving force for new oncology drugs. The problem is compounded by regulations—e.g., rules which require Medicare to reimburse patients for any drug used in an “anti-cancer chemotherapeutic regimen”—regardless of its incremental benefit over other drugs—as long as the use is “for a medically accepted indication” (commonly interpreted as “approved by the FDA”). This provides guaranteed revenues for me-too drugs irrespective of their marginal benefits. The authors also point out that when prices for drugs of proven efficacy fall below a certain threshold, suppliers often stop producing the drug, causing severe shortages.

What can be done? The authors acknowledge several times in their commentary that the spiraling cost of cancer therapies has no single villain; academia, professional societies, scientific journals, practicing oncologists, regulators, patient advocacy groups and the biopharmaceutical industry—all bear some responsibility. [We would add to this list physicians, P&T committees and any others who are engaged in treatment decisions for patients. Patients are not on this list (yet) because they are unlikely to really know the evidence.] This is like many other situations when many are responsible—often the end result is that “no one” takes responsibility. Fojo et al. close by making several suggestions, among which are—

Academicians must avoid participating in the development of marginal therapies;

Professional societies and scientific journals must raise their standards and not spotlight marginal outcomes;

All of us must also insist on transparency and the sharing of all published data in a timely and enforceable manner;

Actual gains of benefit must be emphasized—not hazard ratios or other measures that force readers to work hard to determine actual outcomes and benefits and risks;

We need cooperative groups with adequate resources to provide leadership to ensure that trials are designed to deliver meaningful outcomes;

We must find a way to avoid paying premium prices for marginal benefits; and,

We must find a way [federal support?] to secure altruistic investment capital.

Delfini Comment
While the authors do not make a suggestion for specific responsibilities or actions on the part of the FDA, they do make a recommendation that an independent entity might create uniform measures of benefits for each FDA-approved drug—e.g., quality-adjusted life-years. We think the FDA could go a long way in improving this situation.

And so, as pointed out by Fojo et al., only small gains have been made in OS over the past 12 years, and costs of oncology drugs have skyrocketed. However, to make matters even worse than portrayed by Fojo et al., many of the oncology drug studies we see have major threats to validity (e.g., selection bias, lack of blinding and other performance biases, attrition and assessment bias, etc.) raising the question, “Does the approximately 2-month gain in median OS represent an overestimate?” Since bias tends to favor the new intervention in clinical trials, the PFS and OS gains reported in many of the recent oncology trials may be exaggerated or even absent, or harms may outweigh benefits. On the other hand, if a study is valid, since a median is a midpoint in a range of results and a patient may achieve better results than indicated by the median, some patients may choose to accept a new therapy. The important thing is that patients are given information on benefits and harms in a way that allows them to have a reasonable understanding of all the issues and make the choices that are right for them.

We have often observed that evidence can be a neutralizing force. This editorial highlights for us that this means involving the patient in a meaningful way and finding ways to support decisions based on patients’ personal requirements. These personal “patient requirements” include health care needs and wants and a recognition of individual circumstances, values and preferences.

To achieve this, we believe that patients should receive the same information as clinicians: what alternatives are available; a quantified assessment of the potential benefits and harms of each, including the strength of evidence for each; and the potential consequences of making various choices, including things like vitality and cost.

Decisions may differ between patients, and physicians may make incorrect assumptions about what matters most to patients. There are many examples of this in the literature, such as in the citations below.


Demo: Critical Appraisal of a Randomized Controlled Trial

We recently had a great opportunity to listen to a live demonstration of a critical appraisal of a randomized controlled trial conducted by Dr. Brian Alper, Founder of DynaMed; Vice President of EBM Research and Development, Quality & Standards at EBSCO Information Services.

Dr. Alper is extremely knowledgeable about critical appraisal and does an outstanding job clearly describing key issues concerning his selected study for review. We are fortunate to have permission to share the recorded webinar with you.

Important: It takes about 60 seconds before the webinar starts. (Be sure your sound is on.)

More Chances to Learn about Critical Appraisal

There is a wealth of freely available information to help you both learn and accomplish critical appraisal tasks as well as other evidence-based quality improvement activities. Our website is www.delfini.org. We also have a little book available for purchase for which we are getting rave reviews and which is now being used to train medical and pharmacy residents and is being used in medical, pharmacy and nursing schools.

We highly recommend DynaMed. Although we urge readers to be aware that there is variation in all medical information sources, as members of the DynaMed editorial board (unpaid), we have opportunity to participate in establishing review criteria as well as getting a closer look into methods, staff skills, review outcomes, etc., and we think that DynaMed is a great resource. Depending upon our clinical question and project, DynaMed is often our starting point.

About DynaMed™ from the DynaMed Website

DynaMed™ is a clinical reference tool created by physicians for physicians and other health care professionals for use at the point-of-care. With clinically-organized summaries for more than 3,200 topics, DynaMed provides the latest content and resources with validity, relevance and convenience, making DynaMed an indispensable resource for answering most clinical questions during practice.

Updated daily, DynaMed editors monitor the content of over 500 medical journals on a daily basis. Each article is evaluated for clinical relevance and scientific validity. The new evidence is then integrated with existing content, and overall conclusions are changed as appropriate, representing a synthesis of the best available evidence. Through this process of Systematic Literature Surveillance, the best available evidence determines the content of DynaMed.


Why Statements About Confidence Intervals Often Result in Confusion Rather Than Confidence

A recent paper by McCormack reminds us that authors may mislead readers by making unwarranted “all-or-none” statements and that readers should be mindful of this and carefully examine confidence intervals.

When examining results of a valid study, confidence intervals (CIs) provide much more information than p-values. The results are statistically significant if a confidence interval does not touch the line of no difference (zero in the case of measures of outcomes expressed as percentages such as absolute risk reduction and relative risk reduction and 1 in the case of ratios such as relative risk and odds ratios). However, in addition to providing information about statistical significance, confidence intervals also provide a plausible range for possibly true results within a margin of chance (5 percent in the case of a 95% CI). While the actual calculated outcome (i.e., the point estimate) is “the most likely to be true” result within the confidence interval, having this range enables readers to judge, in their opinion, if statistically significant results are clinically meaningful.

However, as McCormack points out, authors frequently do not provide useful interpretation of the confidence intervals, and authors at times report different conclusions from similar data. McCormack presents several cases that illustrate this problem, and this paper is worth reading.

As an illustration, assume two hypothetical studies report very similar results. In the first study of drug A versus drug B, the relative risk for mortality was 0.9, 95% CI (0.80 to 1.05). The authors might state that there was no difference in mortality between the two drugs because the difference is not statistically significant. That statement is misleading: the upper confidence limit is close to the line of no difference, so the confidence interval tells us that it is possible that a difference would have been found if more people were studied. A better statement for the first study would include the confidence interval and a neutral interpretation of what the results for mortality might mean. Example:

“The relative risk for overall mortality with Drug A compared to Drug B was 0.9, 95% CI (0.80 to 1.05). The confidence interval tells us that Drug A may reduce mortality by up to a relative 20% (i.e., the relative risk reduction), but may increase mortality, compared to Drug B, by approximately 5%.”

In a second study with similar populations and interventions, the relative risk for mortality might be 0.93, 95% CI (0.83 to 0.99). In this case, some authors might state, “Drug A reduces mortality.” A better statement for this second hypothetical study would ensure that the reader knows that the upper confidence limit is close to the line of no difference and, therefore, that the result is close to non-significance. Example:

“Although the mortality difference is statistically significant, the confidence interval indicates that the relative risk reduction may be as great as 17% but may be as small as 1%.”
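The two hypothetical studies can be read the same way with a few lines of arithmetic. This sketch simply restates each confidence limit as a relative change in risk; the function and key names are our own.

```python
def describe_rr_interval(lower, upper):
    """Translate a 95% CI for a relative risk into relative changes in risk."""
    return {
        # Significant only if the interval excludes 1, the line of no difference.
        "statistically_significant": upper < 1.0 or lower > 1.0,
        # Largest plausible relative risk reduction (from the lower limit).
        "largest_relative_reduction": 1.0 - lower,
        # Positive: possible relative increase. Negative: smallest plausible reduction.
        "relative_change_at_upper_limit": upper - 1.0,
    }


study_1 = describe_rr_interval(0.80, 1.05)   # point estimate 0.9
study_2 = describe_rr_interval(0.83, 0.99)   # point estimate 0.93

for name, result in [("Study 1", study_1), ("Study 2", study_2)]:
    print(f"{name}: significant={result['statistically_significant']}, "
          f"largest reduction {result['largest_relative_reduction']:.0%}, "
          f"change at upper limit {result['relative_change_at_upper_limit']:+.0%}")
```

Study 1 spans a 20% reduction to a 5% increase, and Study 2 spans a 17% reduction to a 1% reduction. The intervals overlap almost entirely, even though only the second is statistically significant.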

The Bottom Line

Remember that p-values refer only to statistical significance and confidence intervals are needed to evaluate clinical significance.

Watch out for statements containing the words “no difference” in the reporting of study results. A finding of no statistically significant difference may be a product of too few people studied (or insufficient time).

Watch out for statements implying meaningful differences between groups when one of the confidence limits approaches the line of no difference.

None of this means anything unless the study is valid. Remember that bias tends to favor the intervention under study.

If authors do not provide you with confidence intervals, you may be able to compute them yourself, if they have supplied you with sufficient data, using an online confidence interval calculator. For our favorites, search “confidence intervals” at our web links page: http://www.delfini.org/delfiniWebSources.htm
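If you prefer to compute an interval yourself rather than use an online calculator, the sketch below uses the standard log-transform approximation for the confidence interval of a relative risk. The counts are hypothetical, the function name is ours, and the approximation assumes reasonably large groups with events in both arms.

```python
import math


def relative_risk_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Relative risk with an approximate 95% CI (log-transform method).

    z=1.96 corresponds to a 95% interval under a normal approximation.
    """
    rr = (events_t / n_t) / (events_c / n_c)
    # Standard error of log(RR) from the 2x2 counts.
    se_log_rr = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical trial: 90 deaths among 1,000 treated, 100 deaths among 1,000 controls.
rr, lower, upper = relative_risk_ci(90, 1000, 100, 1000)
print(f"RR {rr:.2f}, 95% CI ({lower:.2f} to {upper:.2f})")
```

An interval that includes 1 means the result is not statistically significant, but, as discussed above, that is not the same as showing “no difference.”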


In a recent BMJ article, “Why we can’t trust clinical guidelines,” Jeanne Lenzer raises a number of concerns regarding clinical guidelines [1]. She begins by summarizing the conflict between 1990 guidelines recommending steroids for acute spinal injury versus 2013 clinical recommendations against using steroids in acute spinal injury. She then asks, “Why do processes intended to prevent or reduce bias fail?”

Her proposed answers to this question include the following—

Many doctors follow guidelines, even if not convinced about the recommendations, because they fear professional censure and possible harm to their careers.

Supporting this, she cites a poll of over 1000 neurosurgeons which showed that—

Only 11% believed the treatment was safe and effective.

Only 6% thought it should be a standard of care.

Yet when asked if they would continue prescribing the treatment, 60% said that they would. Many cited a fear of malpractice if they failed to follow “a standard of care.” (Note: the standard of care changed in March 2013 when the Congress of Neurological Surgeons stated there was no high quality evidence to support the recommendation.)

The Cochrane reviewer for the 1990 guideline she references had strong ties to industry.

Delfini Comment

Fear-based Decision-making by Physicians

We believe this is a reality. In our work with administrative law judges, we have been told that if you “run with the pack,” you better be right, and if you “run outside the pack,” you really better be right. And what happens in court is not necessarily true or just. The solution is better recommendations constructed from individualized, thoughtful decisions based on valid critically appraised evidence found to be clinically useful, patient preferences and other factors. The important starting place is effective critical appraisal of the evidence.

Financial Conflicts of Interest & Industry Influence

It is certainly true that money can sway decisions, be it coming from industry support or potential for income. However, we think that most doctors want to do their best for patients and try to make decisions or provide recommendations with the patient’s best interest in mind. Therefore, we think this latter issue may be more complex and strongly affected in both instances by the large number of physicians and others involved in health care decision-making who 1) do not understand that many research studies are not valid or reported sufficiently to tell; and, 2) lack the skills to be able to differentiate reliable studies from those which may not be reliable.

When it comes to industry support, one of the variables traveling with money includes greater exposure to information through data or contacts with experts supporting that manufacturer’s products. We suspect that industry influence may be less due to financial incentives than this exposure coupled with lack of critical appraisal understanding. As such, we wrote a Letter to the Editor describing our theory that the major problem of low quality guidelines might stem from physicians’ and others’ lack of competency in evaluating the quality of the evidence. Our response is reproduced here.

Delfini BMJ Rapid Response [2]:

We (Delfini) believe that we have some unique insight into how ties to industry may result in advocacy for a particular intervention due to our extensive experience training health care professionals and students in critical appraisal of the medical literature. We think it is very possible that the outcomes Lenzer describes are less due to financial influence than are due to lack of knowledge. The vast majority of physicians and other health care professionals do not have even rudimentary skills in identifying science that is at high to medium risk of bias or understand when results may have a high likelihood of being due to chance. Having ties to industry would likely result in greater exposure to science supporting a particular intervention.

Without the ability to evaluate the quality of the science, we think it is likely that individuals would be swayed and/or convinced by that science. The remedy for this and for other problems with the quality of clinical guidelines is ensuring that all guideline development members have basic critical appraisal skills and there is enough transparency in guidelines so that appraisal of a guideline and the studies utilized can easily be accomplished.


On Monday, May 20, 2013, we presented a webinar on “Using Real-World Data & Published Evidence in Pharmacy Quality Improvement Activities” for the member organizations of the Alliance of Community Health Plans (ACHP).

The 80-minute discussion addressed four topic areas, all of which have unique critical appraisal challenges. Webinar goals were to discuss issues that arise when conducting quality improvement efforts using real world data, such as data from claims, surveys and observational studies and other published healthcare evidence.

Key pitfalls were cherry picked for these four mini-seminars—

Pitfalls to avoid when using real-world data, dealing with heterogeneity, confounding-by-indication and causality.

Key issues in evaluating oncology studies — outcome issues and focus on how to address large attrition rates.

Important issues when conducting comparative safety reviews — assessing patterns through use of RCTs, systematic reviews, observational studies and registries.

Key issues in evaluating studies employing Kaplan-Meier estimates — time-to-event basics with attention to the important problem of censoring.


Review of Endocrinology Guidelines

Decision-makers frequently rely on the body of pertinent research when making clinical management decisions. The goal is to critically appraise and synthesize the evidence before making recommendations, developing protocols and making other decisions. Serious attention is paid to the validity of the primary studies to determine reliability before accepting them into the review. Brito and colleagues have described the rigor of systematic reviews (SRs) cited from 2006 until January 2012 in support of the clinical practice guidelines put forth by the Endocrine Society using the Assessment of Multiple Systematic Reviews (AMSTAR) tool [1].

The authors included 69 of 2817 studies. These 69 SRs had a mean AMSTAR score of 6.4 (standard deviation, 2.5) of a maximum score of 11, with scores improving over time. Thirty-five percent of the included SRs were of low quality (methodological AMSTAR score 1 or 2 of 5) and were cited in 24 different recommendations. These low-quality SRs were the main evidentiary support for five recommendations, of which only one acknowledged the quality of the SRs.

The authors conclude that few recommendations in the field of endocrinology are supported by reliable SRs and that the quality of the endocrinology SRs is suboptimal and is currently not being addressed by guideline developers. SRs should reliably represent the body of relevant evidence. The authors urge authors and journal editors to pay attention to bias and adequate reporting.

Delfini note: Once again we see a review of guideline work which suggests using caution in accepting clinical recommendations without critical appraisal of the evidence and knowing the strength of the evidence supporting clinical recommendations.


Review of Bias In Diabetes Randomized Controlled Trials

Healthcare professionals must evaluate the internal validity of randomized controlled trials (RCTs) as a first step in the process of considering the application of clinical findings (results) for particular patients. Bias has been repeatedly shown to increase the likelihood of distorted study results, frequently favoring the intervention.

Readers may be interested in a new systematic review of diabetes RCTs. Risk of bias (low, unclear or high) was assessed in 142 trials using the Cochrane Risk of Bias Tool. Overall, 69 trials (49%) had at least one out of seven domains with high risk of bias. Inadequate reporting frequently hampered the risk of bias assessment: the method of producing the allocation sequence was unclear in 82 trials (58%) and allocation concealment was unclear in 78 trials (55%). There were no significant reductions in the proportion of studies at high risk of bias over time nor in the adequacy of reporting of risk of bias domains. The authors conclude that these trials have serious limitations that put the findings in question and therefore inhibit evidence-based quality improvement (QI). There is a need to limit the potential for bias when conducting QI trials and improve the quality of reporting of QI trials so that stakeholders have adequate evidence for implementation. The entire freely-available study is available at—