Abstract

Evidence based medicine has been a topic of considerable controversy in medical and health care circles over its short lifetime, because of the claims made by its exponents about the criteria used to assess the evidence for or against the effectiveness of medical interventions. This paper reviews the central epistemological debates underpinning the controversy over evidence based medicine, and suggests some areas where further work remains to be done. In particular, further work is needed on the theory of evidence and inference; causation and correlation; clinical judgment and collective knowledge; the structure of medical theory; and the nature of clinical effectiveness.

Evidence based medicine (EBM) is an important movement within medicine and health services, which has had considerable success over the past 30 years in promoting critical scientific and practical awareness of the status of different claims to therapeutic knowledge. Its exponents can generally be characterised as having a strong ethical sense of the importance of avoiding unnecessary harms to patients, and improving health care in the interests of the general good. At the same time, critics of this movement have drawn attention to some alleged weaknesses of the principles and practice of EBM, many of which concern its epistemological credentials.

Epistemology, or the theory of knowledge, is the branch of philosophy concerning the definition of “knowledge” and the establishment of criteria for evaluating claims that something is known, either by individuals or by the community in general. This paper takes epistemological issues as its primary focus, rather than ethical or policy issues raised by EBM, in the belief that many of the latter issues turn, or have been made to turn, on questions of methodology in the evaluation and testing of treatments, in outcome measurement, and in evidence synthesis. When thinking about epistemological issues, it is important to note that raising foundational questions is not identical to raising a sceptical challenge. Philosophical scepticism is a method in epistemology, but it tends to undermine knowledge claims as such, rather than asking, as I do here, what particular methods of inquiry or appraisal do and do not achieve, and how they do so. As I shall show, there are many open questions in the foundations of EBM. I think the challenge here is to solve them, rather than to treat them as fatal objections to the very idea of EBM.

The evidence based medicine movement is normally traced back to a series of lectures given in 1972 by the epidemiologist Archie Cochrane entitled Effectiveness and efficiency: random reflections on health services.1 Cochrane argued that too much medical care was using interventions of dubious or unknown safety and efficacy, causing harm at both individual and population levels, through iatrogenic injury, waste of resources, and failure to take up more effective treatments. He argued that treatments should be evaluated systematically, using unbiased methods of evaluation (such as the randomised controlled trial), and that individual practitioners and the medical profession as a whole should continuously review and appraise their own state of knowledge. This approach had a strong ethical imperative behind it, rooted in concern to do no harm, to do one’s best for one’s patients, and to do so justly by eliminating waste.

Since the programmatic outline of evidence based medicine in Cochrane’s lectures, various elements have been added, including cost effectiveness analysis, a deepened focus on the types of outcome measures used in evaluation, an expanded range of research synthesis tools (notably meta-analysis of existing data sets), and a heightened attention to patient relevant measures and to patient involvement in evaluation. Yet the essence of EBM is arguably the same as it was in 1972, namely, the use of randomised controlled trials (RCTs) to produce (ideally) unbiased evaluations of treatments (and diagnostic tests, health service delivery systems, and so on). Since Cochrane’s lectures, there has been a great deal of discussion of the so called hierarchy of evidence, a qualitative ranking of different types of evidential support for judgments of the clinical superiority of particular interventions over their comparators, which rests on the notion that it is possible to rank methods of inquiry by their susceptibility to bias. Alongside this discussion has been a discussion of how to combine different sorts of evidence, and how to compare different sorts of evidence.2–6

In the remainder of this paper I shall review the main live issues in discussions of the foundations of EBM. I shall not give a detailed discussion of the criticisms of EBM, nor of the consequent ethical issues surrounding EBM, as these are discussed in detail elsewhere in this issue.

WHAT IS KNOWLEDGE?

The standard account of knowledge in analytic philosophy is this: knowledge is justified true belief.

That is, for an individual X to know something (a proposition p), they must believe that p, p must in fact be true, and they must have a valid justification for believing that p. For a doctor to know a diagnosis, for example, he must believe the diagnosis is correct, it must actually be correct, and he must have a good reason for believing it is correct. What counts as a good reason here is hotly debated. Much of the argument for and against evidence based medicine turns on whether clinical experience or diagnostic skill is sufficient as a reason for belief that a patient has a particular condition or that a treatment will prove beneficial, or whether some further reason (such as good quality experimental evidence) is required. In other words: is “clinical intuition” ever self-justifying as a ground for a claim to know something?

This approach to defining knowledge was first proposed in Plato’s Theaetetus.7 In this dialogue Socrates sets up and then undermines this definition, by pointing out that it involves a vicious regress (how do we know our justification of our belief?). And since this time, a number of different sceptical challenges to particular knowledge claims, or the idea that we know anything at all, have been proposed, as have a number of different attempts to vary the classical definition in order to evade these challenges.

WHAT IS CLINICAL KNOWLEDGE?

The classical definition of knowledge gives a definition of what it is to know a proposition. As a number of philosophers have pointed out, most famously Gilbert Ryle, there are other kinds of knowledge, such as “know how”, which cannot be reduced to propositional knowledge.8 Consider the kinds of knowledge which a clinical pathologist might have: he or she might know how to take a bacterial culture; which stains to use when looking for certain cellular structures under a microscope; what Staphylococcus aureus looks like; when an infection is likely to be S aureus and so when to perform the relevant diagnostic tests; and so on. Hence clinical knowledge includes a range of know how, scientific knowledge, knowledge of rules of practice, and capacities for recognition and judgment. Some of these forms of knowledge are particularly resistant to formal analysis, although there has been quite a lot of work both in European and analytical philosophical traditions to try to clarify matters.9–11

Many of the standing criticisms of EBM have turned on the role of these other non-propositional forms of knowledge, and their possible inscrutability to objective evaluation. It should be obvious, however, that EBM is not designed to be a comprehensive account of medical knowledge, but only an account of that part of medical knowledge which is propositional. Secondly, whereas some knowledge that clinicians have could be characterised as capacities to make certain sorts of judgment reliably (such as the capacity to make and use a differential diagnosis), that a particular clinician, or a set of clinicians trained in a particular way, possesses these capacities is a proposition which can be evaluated for its truth.12,13 The assertion, for example, that a particular doctor knows by clinical skill, experience, and judgment what is best to do for his patients looks epistemologically problematic. What is this faculty of knowledge, to which he lays claim? Singular knowledge claims, such as, “this patient has this illness, and this treatment will be most beneficial under these circumstances”, are very difficult to evaluate, precisely because they involve a faculty of judgment (the application of general rules to particular situations).14 But the assertion that a doctor (or doctors in general) possesses the capacity to make such claims reliably can be evaluated both analytically and empirically. Analytically, whereas singular knowledge claims are defeasible because they can be false, the claim that one possesses a capacity to make such claims is no more than the claim that one can make such judgments at or above a certain threshold of reliability. And then one can analyse whether this claim of capacity is true, and what makes it true (what its justification is). The claim that one possesses such a capacity is a propositional assertion.
Empirically, there are various ways of evaluating whether someone in fact does possess this capacity, by comparing outcomes with other practitioners (using techniques of audit, epidemiology, and outcomes research), and there are ways of evaluating interventions designed to improve such capacities. Whereas it is perhaps more difficult to evaluate claims of skill in medicine and surgery than it is to evaluate the outcomes of particular drug treatments, clinical judgment is none the less within the scope of evidence based medicine’s analytical techniques.
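
As a rough illustration of the empirical route just described, the claim that a clinician judges "at or above a certain threshold of reliability" can be framed as a one-sided exact binomial test on audited cases. This is only a sketch under stated assumptions: the function names, the audit counts, and the 0.05 cut-off are hypothetical, and a real audit would also need to deal with case mix and with how "correct" is ascertained.

```python
from math import comb


def binomial_tail(successes: int, trials: int, p: float) -> float:
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def meets_threshold(correct: int, audited: int, threshold: float,
                    alpha: float = 0.05) -> bool:
    """One-sided exact binomial test (hypothetical helper).

    Returns True when the audit would be improbable (tail < alpha) if the
    clinician's true reliability were only `threshold`, i.e. when the data
    support the capacity claim "reliability exceeds threshold".
    """
    return binomial_tail(correct, audited, threshold) < alpha


# Hypothetical audit: 95 correct judgments out of 100 reviewed cases,
# tested against a claimed reliability threshold of 0.8.
supported = meets_threshold(95, 100, 0.8)
```

The point of the sketch is only that the capacity claim is a proposition with testable consequences; which threshold and which error rate are appropriate is a separate, substantive question.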

THE CONTENT OF CLINICAL THERAPEUTIC KNOWLEDGE

Concentrating from now on upon the part of clinical knowledge that is propositional, what sorts of propositions form the domain of clinical knowledge? Since EBM is concerned mainly with therapeutics, I will concentrate here on clinical therapeutic knowledge. This question is too broad to answer, but we can usefully distinguish between propositions about facts obtaining in particular situations and propositions about general truths. Evidence based medicine is concerned almost exclusively with the latter. EBM aims at the production and evaluation of law like generalisations about diagnostic tests, treatments, and other health care interventions. A typical statement in EBM might be that for condition C, the best evidence we currently have supports the use of treatment T as the most effective treatment for C.

Involved in this assertion are a number of epistemologically interesting claims.

E1. T is effective for the treatment of C: an unqualified statement about T’s effectiveness.

E2. T is more effective for the treatment of C than other treatments we know of: of all the treatments for C we have, T is in fact the best, independently of whether we really know that this is the case.

E3. T is more effective for the treatment of C than other treatments we know of on the evidence we have at present: of all the treatments for C we have, the evidence we have indicates that T is the best.

E1 raises an interesting metaphysical question: what is effectiveness? Elsewhere I have proposed an analysis of “effectivenesses” as properties of treatments, defined relative to specific therapeutic ends.15 These properties are best understood as causal powers or dispositions.16 This analysis then prompts two questions, not widely discussed in philosophy of medicine:

E1A. What makes T effective in the treatment of C?

E1B. What therapeutic ends can properly define effectivenesses?

E1A is a question about how the clinical property, T’s effectiveness in treating C, relates to the physical structure of T. E1B is a question about what sorts of ends can properly be understood as being caused by T’s clinical properties. Can—for example—treatments properly be understood to cause alterations in a patient’s quality of life? What sort of mechanism is involved in causing an alteration in someone’s quality of life? Although this question does take us into very deep metaphysical waters, the clinical point is a simple one—treatments are alleged to bring about all sorts of effects (patient satisfaction, raised CD4 counts, improved five year survival rates), and indeed many different sorts of endpoints are used in clinical trials. Are all of them really measurable and comparable, however, in the way physical endpoints are? We will return to this point when we consider the type of knowledge that clinical trial designs can deliver. In this context it is useful to recall Austin Bradford Hill’s famous criteria for identifying causal relationships in clinical epidemiology and clinical trials, and his requirement that there be a “biologically plausible” mechanism connecting putative cause and putative effect.17 Although often referred to, this point is sometimes more honoured in the breach than in the observance. The point here is that there are serious questions in the metaphysics of medicine and the foundations of clinical sciences that we have hardly begun to pose and which deserve further thought. As an example, consider the way in which cost effectiveness is attributed to treatments as if it were a property of the treatment, when, at best, it is a property of treatments in the context of a particular clinical and economic system.

This takes us to a consideration of the meaning of statement E2. It is relatively unusual to gain categorical knowledge in medical science. We can rarely, if ever, say that T is the treatment for C. Even when we can, T is generally compared with a reference class of other possible treatments for C, and may indeed have been formally evaluated through comparative trials against other members of this reference class (including the use of placebo and doing nothing). Two points are important here.

E2A. T’s effectiveness is judged superior to the other treatments in the reference class against a specific endpoint.

E2B. T’s reference class is defined both by the endpoint and by the set of options available at the time of assertion.

In E2B, availability is just as problematic in EBM as it was in the debate about the choice of control group in trials in the 2000 revision of the Declaration of Helsinki—what is meant by available? In maximal terms, however, availability here must mean something like: theoretically possible, given the total state of medical knowledge now.

This approach to interpreting E2 can be taken in two ways. The first, and simpler, way is this: E2 is a statement to the effect that treatment T is the most effective treatment for condition C of which we have good reason to be aware. The technical difficulty here is that statements about the effectiveness of T turn out to be statements about our knowledge of T, rather than statements about T directly. The second, more complex, way of interpreting E2 is this: the effectiveness of T is essentially relative to our background knowledge of T and its reference class of alternative treatments. This is to handle our knowledge of T and its properties in a way akin to nineteenth century Idealism, according to which all knowledge is relational, propositions are fictional statements, and we can have true knowledge only of the total system of beliefs and their relationships.19,20 Although neither of these alternatives is all that attractive from the point of view of common sense, the core of both is the following idea. All our statements about the clinical effectiveness of a treatment are provisional, and asserted in the light of existing evidence. There is a theoretical limit, according to which all our statements of effectiveness would become categorical statements when all the evidence is in and all reference classes for comparison become absolute reference classes (all conceivable alternative treatments). Under these conditions, statements about clinical effectiveness would become true or false assertions about the treatments themselves directly (rather than reports on our state of knowledge). Much the same processes would also be gone through to refine our disease concepts and our aetiological knowledge as well, of course. 
This approach to grasping the nature of effectiveness is a fairly standard strategy in pragmatist and realist philosophies of science, which presuppose that our current scientific knowledge is fallible (and may indeed be mostly false) but that at the end of inquiry we will have a true representation of the world in all its fine structure.21,22 Whereas this theory has many difficulties, the challenge it presents of determining what theoretical structure the body of clinical knowledge has and how to determine the truth of clinical propositions remains pertinent.
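
The relativity of E3-type statements to an endpoint, a reference class, and a state of evidence can be made concrete with a small data-structure sketch. Everything here is hypothetical and introduced only for illustration: the `Estimate` record, the treatment labels, the endpoint names, and the effect values are not drawn from any study.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Estimate:
    treatment: str
    endpoint: str
    effect: float  # hypothetical effect estimate; larger means better


def best_on_current_evidence(evidence: Iterable[Estimate],
                             reference_class: Set[str],
                             endpoint: str) -> Optional[str]:
    """Return the treatment in `reference_class` with the largest estimated
    effect on `endpoint`, or None when no relevant evidence exists."""
    candidates = [
        e for e in evidence
        if e.endpoint == endpoint and e.treatment in reference_class
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda e: e.effect).treatment


evidence = [
    Estimate("T", "survival", 0.30),
    Estimate("U", "survival", 0.25),
    Estimate("T", "quality_of_life", 0.10),
    Estimate("U", "quality_of_life", 0.40),
]

# Same evidence, different endpoint: the "best" treatment changes (E2A).
best_survival = best_on_current_evidence(evidence, {"T", "U"}, "survival")
best_qol = best_on_current_evidence(evidence, {"T", "U"}, "quality_of_life")

# New evidence and a wider reference class: the verdict is revised (E2B, E3).
later = evidence + [Estimate("V", "survival", 0.35)]
best_later = best_on_current_evidence(later, {"T", "U", "V"}, "survival")
```

On this reading the function's output is a report on a state of knowledge, not a property of T itself, which is exactly the difficulty the first interpretation of E2 raises.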

CLINICAL THERAPEUTIC KNOWLEDGE, THE ETHICS OF BELIEF, AND PROBABILITY KINEMATICS23,24

Statement E3 focuses our attention not on the theoretical structure of clinical knowledge, but on the structure of individuals’ or communities’ beliefs about clinical effectiveness at a given time. In other words, it addresses how individuals or communities should maintain their stock of beliefs about what works in medical treatment. There are two different dimensions to this: how should beliefs be updated in the light of new evidence, and what sort of evidence should be sought.

The updating problem is very interesting and has ramifications for philosophy of science more generally. When should new evidence be sought? This question has two elements. First, when, as a practitioner, should I seek to update my own knowledge base? The rational individual will not update his or her knowledge continuously, since rational (human) individuals are finite beings, and information has costs in time and other resources. If this is the case, what updating heuristic should the individual adopt? Some light has been thrown on this by Kenneth Goodman (in philosophy) and others (in the theory of meta-analysis and elsewhere), but it is a question which is as important as I suspect it may be intractable to theoretical analysis.6 Secondly, how often should the scientific community update its knowledge base and synthesise what it knows? Again, this may be an intractable question, but practical steps can be taken in terms of evidence synthesis through such groups as the Cochrane Collaboration. There remains a wide range of technical problems in the theory of evidential support for theories, but many of these appear to have no practical consequences.25,26 This appearance may be deceptive. Clinical epistemology appears, for example, to concentrate on proof or refutation of singular propositions, but this is a questionable assumption: statements of therapeutic efficacy should probably be understood as law-like statements, rather than singular statements. As such, they are theoretical statements, albeit at a low level of abstraction, and so are squarely in the domain of problems such as how a theory is to be tested; what counts as a fair test; when evidence can be said to corroborate a theory; and so on.
Much of the intellectual difficulty facing clinical therapeutics, I argue, is that the theoretical structure of medicine as an autonomous science (as opposed to a collection of knowledge drawn from diverse more basic sciences) is generally opaque to investigators, and there is the apparent possibility of testing propositional claims one at a time, rather than recognising the role they play in a web of theoretical and empirical commitments.

EVIDENCE, THEORY, AND EVIDENCE BASED MEDICINE

The theoretical opacity of medicine leads us to the question of what sort of evidence should be sought for testing propositions of clinical therapeutic effectiveness. Consider the following problem: when should we regard a clinical therapeutic proposition as proven? Given the provisional nature of any such statement when framed as a statement of type E3, the answer may well be never. Statements of type E3, however, are always proposed relative to a reference class of treatments. Does this solve the problem? No, not really, because it is always possible to require a further test or up the evidential ante by requesting a more robust or reliable experimental design or a more extensive data set for meta-analysis. Thus, there is a question of epistemological “good behaviour” to solve, which is: when is proof sufficient? At any given time, we could stop—and, as we shall see, there is a broad consensus that a certain type of clinical evidence is generally taken to be sufficient. Frequently, however, parties may dispute whether this is the proper way to resolve the cognitive and therapeutic dispute.

The so called gold standard of clinical evidence is the properly controlled and appropriately powered randomised controlled clinical trial, with appropriate blinding. (I dislike the phrase “gold standard” as it has a confusing and highly misleading economic meaning irrelevant to this context.) The role of the RCT in EBM has been controversial, for a number of reasons. First, many critics hold that RCT evidence may sometimes be unattainable for methodological or ethical reasons, and secondly they hold that the so called hierarchy of evidence downgrades other sorts of clinical evidence and provides no way of integrating them into an overall assessment of the evidence for the effectiveness of treatments. Thirdly, the RCT is methodologically wedded to a particular theory of statistical inference, which many statisticians and doctors dispute. Fourthly, the RCT is almost purely a methodological solution to clinical epistemology, in that it is blind to mechanisms of explanation and causation.

The first two objections have been discussed in many places, and I will not go over these again. It is true that some adherents of the EBM approach have been overenthusiastic about what can be tested with RCTs, the supposed meaninglessness or poor quality of other sorts of evidence, and the ethical superiority of the RCT over other sorts of design and approach to treatment under clinical uncertainty or equipoise.27,28 Indeed, there are significant questions about the possibility of ascertaining whether equipoise obtains and what follows from it epistemologically and ethically.29,30 None the less, few critics will deny that in general the RCT does give reliable and robust evidence, and that it has its place in clinical research. Criticisms of the RCT as a methodology can be found, however, which turn on a linked series of problems concerning the theories of inference used to frame and interpret RCTs. First, RCTs produce statements about the truth of E3 type statements within a given confidence interval (usually 95%). The methodology of RCTs is essentially comparative, so that whereas RCTs sometimes permit us to make estimates of the magnitude of effectiveness, that is not their main purpose. Much of the criticism of RCT methodology depends on this apparent dependency upon classical theories of statistical inference (which provide the foundation for talk of confidence intervals in the first place).31,32 Critics of this methodology argue that the RCT requires us to collect unnecessarily large sets of data, binds us to excessively large control groups, and requires us to continue with trials too long both when there is evidence of danger to patients and when there is evidence of superior effectiveness. They base these arguments on Bayesian theories of inference, which permit regular updating of degrees of belief in the truth of our E3 statement. 
Indeed, more than this, they hold that statistical information is always and only about rational subjective degrees of belief, rather than measures of objective probability; and indeed that the notion of objective probability as tied to the RCT is meaningless. This controversy will arguably never be settled, and indeed Donald Gillies argues that any reasonable theory of probability must allow both for objective chances (as in physics) and for subjective degrees of belief (as in psychology), and must live with the grammatical problems involved in trying to speak of both using the same basic language.33 Allowing for this diversity of interpretations causes us to ask, however, what the nature of a statement of clinical evidence actually is: a measurement of an objective probability, a statement of rational personal (subjective) probability, or a statement of rational collective (intersubjective) probability. In addition, there is a classical problem of statistical measurement which is unresolved in EBM, to wit, whether statistical experiments generate true causal knowledge or merely measurements of correlations.
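
The contrast between the two theories of inference can be sketched numerically. The block below sets a classical (Wald) 95% confidence interval for a difference in response proportions beside a conjugate Beta-Binomial update of a degree of belief about one arm's response rate. It is a minimal sketch, not a trial analysis: the counts, the uniform Beta(1, 1) prior, and the function names are all assumptions made for illustration.

```python
from math import sqrt


def risk_difference_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Wald confidence interval for the difference in event proportions
    (treatment minus control); z = 1.96 gives a nominal 95% interval."""
    p_t, p_c = events_t / n_t, events_c / n_c
    diff = p_t - p_c
    se = sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    return diff - z * se, diff + z * se


def update_beta(alpha, beta, successes, failures):
    """Conjugate Beta-Binomial update: returns the posterior (alpha, beta)."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Classical analysis of a hypothetical completed trial:
# 60/100 responders on treatment versus 45/100 on control.
ci_low, ci_high = risk_difference_ci(60, 100, 45, 100)

# Bayesian analysis of the treatment arm, updated as the data arrive in two
# interim batches, starting from an assumed uniform prior Beta(1, 1).
belief = (1, 1)
belief = update_beta(*belief, successes=30, failures=20)  # first 50 patients
belief = update_beta(*belief, successes=30, failures=20)  # next 50 patients
posterior_mean = beta_mean(*belief)
```

With a conjugate prior the sequence of interim updates ends at the same posterior as a single update on the pooled data, which is one reason Bayesian critics regard regular interim looks as unproblematic. What the resulting number means (chance, personal belief, or collective belief) is the interpretive question the text raises, and the arithmetic does not answer it.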

The question of causation is important in a number of ways. First, part of the supposed superiority of placebo controlled trials, in purely scientific terms, is that they permit judgments about the causal efficacy of interventions in bringing about their effects and to some extent allow estimation of the size of those effects.34,35 This justification may not apply if RCTs do not test causal hypotheses but merely establish correlations or contribute to the probability kinematics of degrees of belief.24–26
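
The standard argument for the causal reading of randomised comparisons can be illustrated with a toy simulation. In the sketch below, a hypothetical treatment raises the probability of recovery by 0.2 for every patient, and disease severity lowers it by 0.3. When sicker patients are more likely to be treated, the crude comparison largely hides the effect; when allocation is random, the comparison recovers it. All numbers, names, and the data-generating model are assumptions, and the simulation presupposes a causal structure rather than proving that real trials reveal one.

```python
import random


def simulate(n: int, randomised: bool, seed: int = 0) -> float:
    """Return the crude difference in recovery rates (treated minus control)
    in a simulated population of n patients."""
    rng = random.Random(seed)
    treated_outcomes, control_outcomes = [], []
    for _ in range(n):
        severe = rng.random() < 0.5
        if randomised:
            # Allocation by coin toss, independent of severity.
            treated = rng.random() < 0.5
        else:
            # Confounding by indication: severe cases are treated more often.
            treated = rng.random() < (0.8 if severe else 0.2)
        p_recover = 0.6 - (0.3 if severe else 0.0) + (0.2 if treated else 0.0)
        recovered = rng.random() < p_recover
        (treated_outcomes if treated else control_outcomes).append(recovered)
    return (sum(treated_outcomes) / len(treated_outcomes)
            - sum(control_outcomes) / len(control_outcomes))


observational_difference = simulate(20_000, randomised=False)
randomised_difference = simulate(20_000, randomised=True)
```

Under this model the expected crude difference is about 0.02 in the observational arm and 0.2 under randomisation. The sketch shows why randomisation is thought to license causal judgments; it leaves untouched the question of whether such estimates are best read as causal knowledge, as correlations, or as inputs to the updating of belief.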

Second, the beauty of the RCT as a methodology is that it seems to operate at a level of scientific theory autonomous from the basic sciences. Apparently we need to know little or nothing of pathogenesis or drug action in order for a randomised controlled trial to be designed and implemented and (perhaps) interpreted successfully. Indeed, our theories at this more basic level could simply be wrong. So long as the results of the RCT give an answer to our E3 question (which treatment does better in this population?), measured by using a suitably well defined and credible endpoint, then questions of mechanism and cause seem to drop out of the analysis. In this regard, RCTs are an admirably pragmatist methodology, in the metaphysical and epistemological senses of the term. Much of the appeal of RCTs to methodologists is the way they can be used to test hypotheses about the effects of interventions in a very wide range of contexts, from clinical pharmacology to social welfare, even when our theories of how interventions bring about their effects may be murky or merely speculative (as in social policy, perhaps).36,37 Most methodologists would challenge designs which had no prima facie theory to support them, but there is no strict methodological requirement for this. Bradford Hill’s famous biological plausibility requirement, as a necessary condition for an intervention to be testable ethically by RCT, is best understood as a way of screening off obviously implausible treatments from test. Here, however, as so often, what counts as plausible is contestable: classical examples include the plausibility or otherwise of psychoanalysis for neurosis, or the di Bella treatment for cancer.38,39 The relationship between the theoretical structure of clinical science and the theoretical structures of the more basic sciences is complex, and EBM may prove to be a major contribution to the establishment of clinical medicine as an autonomous scientific discipline.
As noted above, however, its own theoretical structure may be quite opaque. In any event, as Nancy Cartwright and others have suggested is the case in physics, we may be better off expecting medicine to produce a patchwork of phenomenal laws of relatively low generality, rather than a complete and consistent system of universal, metaphysically founded, laws.40

CONCLUDING REMARKS

In this paper I have tried to present a range of epistemological issues concerning evidence based medicine and randomised controlled trials. Many of these issues are highly technical; my purpose in drawing them to the reader’s attention is to stimulate philosophical debate and research into these problems. What, however, of the practice of medicine? My personal view is that most of these problems are quite generic problems in the philosophy of science: foundational questions, so to speak. As a patient I would still prefer to be treated in the light of the best clinical evidence, and I would still prefer to be randomised in a well designed experiment where genuine uncertainty prevailed about the status of possible treatments for my illness.41 Part of my rationale for this is simply to ensure that I benefit from the best treatment, given our state of knowledge at the time. But, as a philosopher, I would also mark a certain scepticism about the idea that many of these foundational questions admit of metaphysical solutions.42 Evidence based medicine is the best available bet, and by small methodological and analytical improvements we will make progress in the scientific basis of health services. The philosophical challenges to the foundations of EBM are, however, important: methodological modesty is the order of the day.

Acknowledgments

This work was produced with partial funding from the EVIBASE project, funded by the European Commission. The author thanks R ter Meulen and R Lie for their comments on a draft of this paper.

Worrall J. What evidence in evidence based medicine. London: London School of Economics, 2002 (Centre for Philosophy of the Natural and Social Sciences: Causality: metaphysics and methods. Technical report 01/02).