Can Guidelines Harm Patients?

20-05-2012

Recently I saw an intriguing “personal view” in the BMJ by Grant Hutchison entitled “Can Guidelines Harm Patients Too?” Hutchison is a consultant anesthetist with, as he calls it, chronic guideline fatigue syndrome. He suffered an acute exacerbation of his “condition” when yet another set of guidelines arrived in his email inbox. Hutchison:

On reviewing the level of evidence provided for the various recommendations being offered, I was struck by the fact that no relevant clinical trials had been carried out in the population of interest. Eleven out of 25 of the recommendations made were supported only by the lowest levels of published evidence (case reports and case series, or inference from studies not directly applicable to the relevant population). A further seven out of 25 were derived only from the expert opinion of members of the guidelines committee, in the absence of any guidance to be gleaned from the published literature.

Hutchison’s personal experience is supported by evidence from two articles [2,3].

One paper, published in JAMA in 2009 [2], concludes that ACC/AHA (American College of Cardiology and American Heart Association) clinical practice guidelines are largely developed from lower levels of evidence or expert opinion, and that the proportion of recommendations for which there is no conclusive evidence is growing. Only 314 of 2711 recommendations (median, 11%) are classified as level of evidence A, i.e. based on evidence from multiple randomized trials or meta-analyses. The majority of recommendations (1246/2711; median, 48%) are level of evidence C, i.e. based on expert opinion, case studies, or standards of care. Strikingly, only 245 of 1305 class I recommendations are based on the highest level A evidence (median, 19%).

Another paper, published in Ann Intern Med in 2011 [3], reaches similar conclusions in its analysis of the Infectious Diseases Society of America (IDSA) practice guidelines. Of the 4218 individual recommendations found, only 14% were supported by the strongest (level I) quality of evidence; more than half were based on level III evidence only. As with the ACC/AHA guidelines, only a small proportion (23%) of the strongest IDSA recommendations were based on level I evidence (in this case ≥1 randomized controlled trial; see below). And here too, the new recommendations were mostly based on level II and III evidence.

Although there is little to argue with in Hutchison’s observations, I do not agree with his conclusions.

In his view guidelines are equivalent to a bullet-pointed list or flow diagram, allowing busy practitioners to move on from practice based on mere anecdote and opinion. It therefore seems contradictory that half of the EBM guidelines are based on little more than anecdote (case series, extrapolation from other populations) and opinion. He then argues that guidelines, like other therapeutic interventions, should be considered in terms of the balance between benefit and risk, and that the risk associated with the dissemination of poorly founded guidelines must also be considered. One of those risks is that doctors will simply tend to adhere to the guidelines, and may even change their own (adequate) practice in the absence of any scientific evidence against it. If a patient is harmed despite punctilious adherence to the guideline rules, “it is easy to be seduced into assuming that the bad outcome was therefore unavoidable”. But perhaps harm was done by following the guideline….

First of all, the overall evidence shows that adherence to guidelines can improve patient outcomes and provide more cost-effective care (Naveed Mustfa refers to [4] in a comment).

Hutchison’s piece is opinion-based and driven rather by (understandable) gut feelings and implicit assumptions that also surround EBM in general.

First there is the assumption that guidelines are a fixed set of rules, like a protocol, and that there is no room for preferences (of either the doctor or the patient), interpretation, or experience. In the same way that EBM is often degraded to “cookbook medicine”, EBM guidelines are turned into mere bullet-pointed lists made by a bunch of experts who just want to impose their opinions as truth.

The second assumption (shared by many) is that evidence-based medicine is synonymous with “randomized controlled trials”. By analogy, only those EBM guideline recommendations “count” that are based on RCTs or meta-analyses.

Before I continue, I would strongly advise all readers (and certainly all EBM and guideline skeptics) to read the excellent and clearly written BMJ editorial by David Sackett et al. that deals with the misconceptions, myths and prejudices surrounding EBM: Evidence based medicine: what it is and what it isn’t [5].

Sackett et al. define EBM as “the conscientious, explicit and judicious use of current best evidence in making decisions about the care of individual patients” [5]. Sackett emphasizes that “Good doctors use both individual clinical expertise and the best available external evidence, and neither alone is enough. Without clinical expertise, practice risks becoming tyrannised by evidence, for even excellent external evidence may be inapplicable to or inappropriate for an individual patient. Without current best evidence, practice risks becoming rapidly out of date, to the detriment of patients.”

Guidelines are meant to give recommendations based on the best available evidence. Guidelines should not be a set of rules, set in stone. Ideally, guidelines have gathered evidence in a transparent way and make it easier for the clinicians to grasp the evidence for a certain procedure in a certain situation … and to see the gaps.

Contrary to what many people think, EBM is not restricted to randomized trials and meta-analyses. It involves tracking down the best external evidence there is. As I explained in #NotSoFunny #16 – Ridiculing RCTs & EBM, evidence is not an all-or-nothing thing: RCTs (if well performed) are the most robust, but if they are not available we have to rely on “lower” evidence (from cohort studies to case-control studies to case series, or even expert opinion).
On the other hand, RCTs are often not even suitable for answering questions in domains other than therapy (etiology/harm, prognosis, diagnosis): by definition, the level of evidence for these kinds of questions will inevitably be low*. Also, for some interventions RCTs are not appropriate or feasible, or are too costly to perform (cesarean vs vaginal birth; experimental therapies; rare diseases; see also [3]).

It is also good to realize that guidance based on numerous randomized controlled trials is probably not, or only to a limited extent, applicable to groups of patients who are seldom included in an RCT: the cognitively impaired, patients with multiple comorbidities [6], the elderly [6], children, and (often) women.

Finally, not all RCTs are created equal (various forms of bias; surrogate outcomes; small sample sizes; short follow-up), and thus they should not all be assigned the same high level of evidence.*

Thus, in my opinion, low levels of evidence are not by definition problematic, even when they are the basis for strong recommendations, as long as it is clear how the recommendations were reached and as long as they are well underpinned (by whatever evidence or motivation). One could even see the exposed gaps in evidence as a positive thing, as they may highlight the need for clinical research in certain fields.

There is one BIG BUT: my assumption is that guidelines are “just” recommendations based on exhaustive and objective reviews of the existing evidence. No more, no less. This means that the clinician must have the freedom to deviate from the recommendations, based on his own expertise and/or the situation and/or the patient’s preferences. All the more so when the evidence on which these strong recommendations are based is scant. Sackett already warned against the possible hijacking of EBM by purchasers and managers (and, may I add, health insurers and governmental agencies) to cut the costs of health care and to impose “rules”.

I therefore think it is odd that the ACC/AHA guidelines prescribe that class I recommendations SHOULD be performed/administered even if they are based on level C evidence (see Figure).

I also find it odd that different guidelines use different nomenclatures. The ACC/AHA have class I, IIa, IIb and III recommendations and level A, B, and C evidence, where level A represents sufficient evidence from multiple randomized trials and meta-analyses. The strength of recommendations in the IDSA guidelines, by contrast, runs from A through C (or D/E for recommendations against use), and the quality of evidence ranges from level I through III, where level I indicates evidence from (just) one properly randomized controlled trial. As explained in [3], this system was introduced to evaluate the effectiveness of preventive health care interventions in Canada (for which RCTs are apt).

Finally, guidelines and guideline makers should probably be more open for input/feedback from people who apply these guidelines.

————————————————

*The new GRADE (Grading of Recommendations Assessment, Development, and Evaluation) scoring system, which also takes good-quality observational studies into account, may offer a potential solution.

13 responses

30-05-2012

Hype (00:52:06):

One problem is that few guidelines include guidance to explain to the patient that “we don’t really know what we’re doing here, and are pretty much just guessing. If you want to decide what to do for yourself, that might do better.”

A quick reply via my mobile phone.
…In that case I didn’t make myself clear (enough). Guideline makers (of EBM guidelines) don’t just make guesses. It is a lot of work to distill the questions, look for all the evidence in the literature, make a synthesis, and then make recommendations on the basis of the evidence, which may or may not be convincing. Guidelines and systematic reviews have often revealed current practice to be wrong (even causing many deaths). But people shouldn’t expect 100% certainty. And RCTs are no panacea, as I hope I have shown. Thus, expecting that only evidence from one RCT is good enough is wrong (sometimes it is, but not for each and every question or population). That guidelines are not always based on the strongest evidence doesn’t mean everyone can just do whatever they like. One should base decisions on the available evidence and use some common sense as well. For patients it is admittedly more difficult to grasp the guidelines and what is reasonably sound and what isn’t. The guidelines, though different, do clearly state how they arrived at the evidence and recommendations.

“Possible hijacking of EBM by purchasers and managers …”. In my experience in routine practice, the proper caveats expressed in the preamble of good guidelines are frequently lost, unread, and forgotten. The result is that they are regularly used as a substitute for critical and original thought and analysis by a host of different agencies, some of which are mentioned above. I am not sure how much changing the name from guidelines (it is remarkable how frequently the word “should” pops up) to something else would help, because the real problem is teaching people how to think critically. As a slightly similar lesson, look at how the DSM “cookbook” has been hijacked, and the profoundly negative consequences of that.
Then there is the real danger of treatment becoming a rote procession of self-fulfilling mediocrity, with the stifling of innovation and of the use of non-standard treatments: that, IMO, has already happened in the British NHS.

@Ken
You are right about the danger of hijacking by any agency. In reality this happens. I also agree that many guidelines (shouldn’t a better name be Evidence Based Recommendations?) often sound (or are meant to be) too imperative. I’m not sure whether the real problem is that people cannot or do not think properly. It seems like you consider “adherence to guidelines” and “critical thinking” as mutually exclusive. In my view it is still invaluable to have all the available evidence summarized, critically appraised and evaluated at one place. But that doesn’t mean that all recommendations should be taken for granted and that it relieves people from thinking. Furthermore, the other way about is dangerous too. In the past many interventions and screening methods have been applied that seemed to “make sense”, but later appeared to be ineffective or even harmful (see my previous posts about the CRASH-trial and antenatal corticosteroids in women expecting premature babies, Vitamin E and Selenium for prevention of prostate cancer, breast cancer and prostate cancer screening ).
IMO the comparison with the DSM “cookbook” doesn’t hold, for as far as I can tell this is a “handbook” (consensus-derived), not an evidence based guideline.

As the author of the article you critique [1], I’d like to remark that you seem to have slightly missed my point. After writing “I do not agree with his conclusions”, you write a great deal that I not only agree with, but had explicitly in mind as I wrote the article. I’m actually rather keen on Evidence Based Medicine (EBM): as a graduate of Sackett’s Clinical Epidemiology primer course at McMaster, it would be surprising if I weren’t, since I spent a happy summer there learning how to track down “the best external evidence there is”, and have been using those skills for the last twenty years or so.
Perhaps we lost each other at the point (third paragraph) where I wrote “Guidelines committees are cast in the role of distilling evidence from the relevant literature to reduce it to a bullet pointed list or flow diagram, allowing busy practitioners to move on from practice based on mere anecdote and opinion.” That sentence has undergone a little editorial revision, and I think my original text made it slightly more evident that this “role”, thrust upon or adopted by guidelines committees, is not one I believe is appropriate or even sensible.
The rest of my piece flows from that point: medical colleagues, hospital managers and the legal profession regularly treat these distillations of the medical literature *as if* they were rules to be followed, rather than complicated bits of EBM subject to critical appraisal and doubt. In clinical practice, compliance with guidelines has become an end in itself, and that is the cause of my guideline fatigue syndrome.
But there is a point at which “the best external evidence there is” descends into a region in which “false positive” studies outweigh “true positives”. Ioannidis has been exploring that territory for some years, and the problem is eloquently described in his paper “Why most published research findings are false” [2], which should be required reading for all members of guidelines committees. We know experts can give diametrically opposed advice [3]; we know that early published research findings are often overturned or downgraded in effect size [4]; we know that early adoption of such findings can lead to harm to patients and later revision of guidelines as more evidence comes in [5][6]. Hence my plea that guidelines committees should be mindful of how their guidelines are being put to inappropriate use in the real world, and should therefore resist the temptation to issue guidelines based on such low-level evidence.

@Grant
First of all, I really appreciate that you make the effort to reply to my blog post and continue the discussion here (I’m pleasantly surprised you found it so quickly).
Of course I can only read what is written in the paper ( 😉 ) and that too is subject to interpretation.

I agree with the following point you make:

“medical colleagues, hospital managers and the legal profession regularly treat these distillations of the medical literature *as if* they were rules to be followed, rather than complicated bits of EBM subject to critical appraisal and doubt. In clinical practice, compliance with guidelines has become an end in itself, and that is the cause of my guideline fatigue syndrome”

Perhaps, as an information specialist, I’m a bit more naive about the pushing and “hijacking” by agencies and (some) guideline makers/medical specialists (which Ken Gillman also referred to). I advise my clients just to use the EBM guidelines as “a source of evidence”, to assess the evidence themselves, and to see whether it is applicable to their situation (American guidelines, for instance, are not always applicable to the Dutch situation). But clinicians might be expected, or even “forced”, to follow (some of) the guidelines.

There is yet another point where we lose each other: I (still) don’t agree with your final conclusion:

“guidelines committees should be mindful of how their guidelines are being put to inappropriate use in the real world, and should therefore resist the temptation to issue guidelines based on such low-level evidence”

I know the work of Ioannidis (as a matter of fact, I have often referred to his work on this blog). The references you cite here (and not in your original piece) are very important, but I do not see how they support your view. Taken to the extreme, they indicate that you can never be absolutely sure that something actually works, but this also applies to “high” evidence obtained from an RCT or an SR. In your piece you emphasize that you don’t object to recommendations based on “high” evidence, but to recommendations based on “low” evidence. My points are that:

Evidence is not “proof” (thus never 100% certain).

Some questions can never be answered by RCTs, and thus can never attain a high evidence level (yet the evidence can still be strong), i.e. prognosis, diagnosis and harm/etiology questions.

Some procedures are so clearly beneficial, or some harm is so clearly evident, that it is unethical, impractical, too costly and/or unnecessary to test them.

Guidelines can be very useful as a source of evidence, because they give an overview of the present evidence and its strength. It would be a waste of time if each clinician reinvented the wheel (read: did an SR for each question each time).

It is good to see “the gaps” in the evidence exposed; it means these are the topics future research should focus on.

But, as said, I assume that guidelines are syntheses of evidence coupled to recommendations, and that clinicians must have the freedom to deviate from the recommendations. I also think there should be more differentiation among evidence (I would not call one RCT high-level evidence by definition). On the other hand, for some questions that cannot be answered by RCTs, good observational or cross-sectional studies might provide the best evidence and might thus score higher.

If strong recommendations are based on low evidence, the guideline makers should underpin their decision with references and/or a good motivation.

We agree, though, that evidence is not absolute, that guidelines and evidence can change over time, that guideline makers should be prudent in giving strong recommendations based on low levels of evidence (for that topic), and that guidelines should not be imposed upon clinicians.

Laika:
Fortunately, I don’t seek certainty. All clinicians are forced to make decisions using imperfect and incomplete information; and all clinicians have had the experience of a patient suffering because of a recommendation the clinician has made in good faith and after careful consideration of the evidence. All we ask is a *reasonable chance* that the advice we’re offering will help our patients more than it harms them. There are also deep moral instincts that make us want to stick with the status quo until we reach some threshold of evidence at which we will adopt a new intervention. At the level of statistics, that shows up in our (purely conventional) 5% cut-off for false positives; at the life-or-death level, it shows up in the much-discussed distinction in medical ethics between “letting die” (inaction) and “killing” (action).
So clinicians review the evidence, weigh it against their own situation and that of their patients, and will switch to a new intervention in a piecemeal way, varying from clinician to clinician and patient to patient, as the evidence becomes more robust.
Ioannidis and my other references then become important, because they show that early evidence of benefit is simply *not* robust, especially if the results of early studies are generalized beyond their area of application. Generalization from initial good results took place both in the case of tight glycaemic control and perioperative beta-blockade, referenced in my previous reply. And such early promising results are exactly the sort of thing that seems to drive an outbreak of new guidelines: I encountered blanket guidelines recommending tight glycaemic control and widespread perioperative beta-blockade in the mid-2000s, before the usual cycle of optimism-pessimism-realism had had a chance to play itself out as the literature expanded. Such guidelines were not generally well received, because clinicians noted that there was clear danger of harm from both interventions, and unpersuasive evidence of benefit in the enlarged target populations.
So low-level evidence simply doesn’t drive clinicians to “switch”: we wait and we watch. Guidelines based on low-level evidence therefore do not drive change. What then is their use? You’ve said that they provide a repository for evidence, and a means of highlighting gaps in our knowledge – but, as I said in my original article, there are other and (in my view) better ways of getting such information to clinicians’ attention without tagging it as a “guideline”.

On the matter of “level of evidence”, I’m quite content that many questions in medicine are not amenable to randomized controlled trials. Clinicians simply compare the evidence they’ve got with the evidence they would find persuasive, and they don’t budge if the evidence doesn’t match up to the task at hand. In rare instances, as you say, a case series will be enough to compel change (the efficacy of parachutes has never been tested beyond a case series, but I still advise my patients to use them when jumping out of aircraft). Such fine points of usage don’t in any way undermine the general argument.

“It seems like you consider “adherence to guidelines” and “critical thinking” as mutually exclusive.”
I am not sure you have expressed that idea optimally. They are certainly not mutually exclusive. However, many busy doctors who just want to keep out of trouble are inevitably going to follow the path of least resistance, which means following the guidelines and not sticking your head above the parapet. That sometimes, perhaps too often, means that people will be reluctant to try non-standard treatments simply because they’re not in the guidelines.

From your first comment I concluded that you found critical thinking *most* important. I gave some examples showing that “reasoning” alone may not always lead to good decisions. I also think that guidelines can be an important source of prefiltered evidence.

However, I too see the danger you point out “busy doctors who just want to keep out of trouble are inevitably going to follow the path of least resistance, which means following the guidelines and not sticking your head above the parapet”.

First, on re-reading everything, including Sackett’s original BMJ editorial, I would like to note that we are in complete agreement about almost everything. And thank you for highlighting that editorial which is worth our attention.

May I note that I did not say anything about “adherence to guidelines” (your quotation marks). I began my comment with a quote about the hijacking of EBM, intending my comments in that context. My analogy was with the framework of such statements. By this I mean that all proper guidelines (and in a sense the DSM is a diagnostic guideline) contain proper caveats at the beginning about the limitations of the evidence and about their being applied by people with appropriate training and expertise, etc. The lack of critical thought to which I alluded is a lack on the part of those external agencies, be they purchasers, managers, lawyers or whatever, who may (and often do) just look at the guideline summary without a proper understanding of its context and limitations.

My unoriginal thought, which bears repetition, is the danger that guidelines can so easily become a crutch for lazy thinkers. My analogy with DSM is on that level, the question that you raise about whether DSM can be considered as evidence-based is different, but interesting.