Diet and health. What can you believe: or does bacon kill you?

This article has been reposted on The Winnower, and now has a digital object identifier DOI: 10.15200/winn.142934.47856

This post is not about quackery, nor university politics. It is about inference. How do we know what we should eat? The question interests everyone, but what do we actually know? Not as much as you might think from the number of column-inches devoted to the topic. The discussion below is a synopsis of parts of an article called “In praise of randomisation”, written as a contribution to a forthcoming book, Evidence, Inference and Enquiry.

Eating one sausage or three rashers of bacon a day can increase the risk of bowel cancer by a fifth, a medical expert has said.

The warning involved only 1.8oz (50g) of processed meat daily.

It recommended that people eat less than 17.6 oz of cooked red meat a week and avoid all processed meat.

Researchers found that almost half of cancers could be prevented with lifestyle changes such as a healthier diet, using sunscreen, not smoking and limiting alcohol intake.

What, I wondered, was the evidence behind these dire warnings? They did not come from a lifestyle guru, a diet faddist or a supplement salesman. This is nothing to do with quackery. The numbers come from the 2007 report of the World Cancer Research Fund and American Institute for Cancer Research, with the title ‘Food, Nutrition, Physical Activity, and the Prevention of Cancer: a Global Perspective’. This is a 537 page report with over 4,400 references. Its panel was chaired by Professor Sir Michael Marmot, UCL’s professor of Epidemiology and Public Health. He is a distinguished epidemiologist, renowned for his work on the relation between poverty and health.

Nevertheless there has never been a randomised trial to test the carcinogenicity of bacon, so it seems reasonable to ask how strong is the evidence that you shouldn’t eat it? It turns out to be surprisingly flimsy.

In praise of randomisation

Everyone knows about the problem of causality in principle. Post hoc ergo propter hoc; confusion of sequence and consequence; confusion of correlation and cause. This is not a trivial problem. It is probably the main reason why ineffective treatments often appear to work. It is traded on by the vast and unscrupulous alternative medicine industry. It is, very probably, the reason why we are bombarded every day with conflicting advice on what to eat. This is a bad thing, for two reasons. First, we end up confused about what we should eat. But worse still, the conflicting nature of the advice gives science as a whole a bad reputation. Every time a white-coated scientist appears in the media to tell us that a glass of wine per day is good/bad for us (delete according to the phase of the moon) the general public just laugh.

In the case of sausages and bacon, suppose that there is a correlation between eating them and developing colorectal cancer. How do we know that it was eating the bacon that caused the cancer – that the relationship is causal? The answer is that there is no way to be sure if we have simply observed the association. It could always be that the sort of people who eat bacon are also the sort of people who get colorectal cancer. But the question of causality is absolutely crucial, because if it is not causal, then stopping eating bacon won’t reduce your risk of cancer. The recommendation to avoid all processed meat in the WCRF report (2007) is sensible only if the relationship is causal. Barker Bausell said:

[Page39] “But why should nonscientists care one iota about something as esoteric as causal inference? I believe that the answer to this question is because the making of causal inferences is part of our job description as Homo Sapiens.”

That should be the mantra of every health journalist, and every newspaper reader.

The essential basis for causal inference was established over 70 years ago by that giant of statistics Ronald Fisher, and that basis is randomisation. Its first popular exposition was in Fisher’s famous book, The Design of Experiments (1935). The Lady Tasting Tea has become the classical example of how to design an experiment.

Briefly, a lady claims to be able to tell whether the milk was put in the cup before or after the tea was poured. Fisher points out that to test this you need to present the lady with an equal number of cups that are ‘milk first’ or ‘tea first’ (but otherwise indistinguishable) in random order, and count how many she gets right. There is a beautiful analysis of it in Stephen Senn’s book, Dicing with Death: Chance, Risk and Health. As it happens, Google books has the whole of the relevant section Fisher’s tea test (geddit?), but buy the book anyway. Such is the fame of this example that it was used as the title of a book, The Lady Tasting Tea, by David Salsburg (my review of it is here).
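The arithmetic behind the tea test is short enough to sketch. The version below assumes the classic design of eight cups, four of each kind, with the lady told that there are four of each. The cup count and the function name are assumptions made here for illustration; they are not stated in the text above.

```python
from math import comb

def p_at_least(correct_needed, cups_per_kind=4):
    """Chance of identifying at least `correct_needed` of the milk-first
    cups by pure guessing, when she must pick exactly `cups_per_kind` cups."""
    # Number of equally likely ways to choose which cups to call 'milk first'
    total = comb(2 * cups_per_kind, cups_per_kind)
    # Ways to get k milk-first cups right and (cups_per_kind - k) wrong
    ways = sum(
        comb(cups_per_kind, k) * comb(cups_per_kind, cups_per_kind - k)
        for k in range(correct_needed, cups_per_kind + 1)
    )
    return ways / total

p_all_right = p_at_least(4)    # 1 way out of 70
p_three_plus = p_at_least(3)   # 17 ways out of 70
print(p_all_right, p_three_plus)
```

With this design only a perfect score would be convincing: one mistake is something a guesser would manage about a quarter of the time.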

Most studies of diet and health fall into one of three types: case-control studies, cohort (or prospective) studies, or randomised controlled trials (RCTs). Case-control studies are the least satisfactory: they look at people who already have the disease and look back to see how they differ from similar people who don’t have the disease. They are retrospective. Cohort studies are better because they are prospective: a large group of people is followed for a long period and their health and diet are recorded, and later their disease and death are recorded. But in both sorts of studies, each person decides for him/herself what to eat or what drugs to take. Such studies can never demonstrate causality, though if the effect is really big (like cigarette-smoking and lung cancer) they can give a very good indication. The difference in an RCT is that each person does not choose what to eat, but their diet is allocated randomly to them by someone else. This means that, on average, all other factors that might influence the response are balanced equally between the two groups. Only RCTs can demonstrate causality.

Randomisation is a rather beautiful idea. It allows one to remove, in a statistical sense, bias that might result from all the sources that you hadn’t realised were there. If you are aware of a source of bias, then measure it. The danger arises from the things you don’t know about, or can’t measure (Senn, 2004; Senn, 2003). Although it guarantees freedom from bias only in a long run statistical sense, that is the best that can be done. Everything else is worse.
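A toy simulation may make the point concrete. Everything in it is invented for illustration: a hidden “health” factor, a treatment that does nothing at all, and people who are more likely to choose the treatment if they are healthy. It is a sketch of the idea, not a model of any real study.

```python
import random

def compare_designs(n=20_000, seed=1):
    """Return (observational difference, randomised difference) in mean outcome.

    The 'treatment' has NO effect at all in this toy model.
    """
    rng = random.Random(seed)
    observed = {True: [], False: []}
    randomised = {True: [], False: []}
    for _ in range(n):
        health = rng.gauss(0, 1)               # hidden factor nobody measured
        outcome = health + rng.gauss(0, 1)     # depends on health only
        chose = health + rng.gauss(0, 1) > 0   # healthier people tend to opt in
        observed[chose].append(outcome)
        allocated = rng.random() < 0.5         # coin toss, ignores health
        randomised[allocated].append(outcome)

    def mean(xs):
        return sum(xs) / len(xs)

    return (mean(observed[True]) - mean(observed[False]),
            mean(randomised[True]) - mean(randomised[False]))

obs_diff, rct_diff = compare_designs()
print(f"self-selected groups differ by {obs_diff:.2f}")
print(f"randomised groups differ by    {rct_diff:.2f}")
```

The self-selected comparison shows a large apparent benefit of a treatment that does nothing, because the hidden factor travels with the choice. The coin toss breaks that link, so the randomised difference hovers around zero.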

Ben Goldacre has referred memorably to the newspapers’ ongoing “Sisyphean task of dividing all the inanimate objects in the world into the ones that either cause or cure cancer” (Goldacre, 2008). This has even given rise to a blog, “The Daily Mail Oncological Ontology Project”. The problem arises in assessing causality.

It wouldn’t be so bad if the problem were restricted to the media. It is much more worrying that the problem of establishing causality often seems to be underestimated by the authors of papers themselves. It is a matter of speculation why this happens. Part of the reason is, no doubt, a genuine wish to discover something that will benefit mankind. But it is hard not to think that hubris and self-promotion may also play a role. Anything whatsoever that purports to relate diet to health is guaranteed to get uncritical newspaper headlines.

At the heart of the problem lies the great difficulty in doing randomised studies of the effect of diet on health. There can be no better illustration of the vital importance of randomisation than in this field. And, notwithstanding the generally uncritical reporting of stories about diet and health, one of the best accounts of the need for randomisation was written by a journalist, Gary Taubes, and it appeared in the New York Times (Taubes, 2007).

The case of hormone replacement therapy

In the 1990s hormone replacement therapy (HRT) was recommended not only to relieve the unpleasant symptoms of the menopause, but also because cohort studies suggested that HRT would reduce heart disease and osteoporosis in older women. For these reasons, by 2001, 15 million US women (perhaps 5 million older women) were taking HRT (Taubes, 2007). These recommendations were based largely on the Harvard Nurses’ Study. This was a prospective cohort study in which 122,000 nurses were followed over time, starting in 1976 (these are the ones who responded out of the 170,000 requests sent out). In 1994, it was said (Manson, 1994) that nearly all of the more than 30 observational studies suggested a reduced risk of coronary heart disease (CHD) among women receiving oestrogen therapy. A meta-analysis gave an estimated 44% reduction of CHD. Although warnings were given about the lack of randomised studies, the results were nevertheless acted upon as though they were true. But they were wrong. When proper randomised studies were done, not only did it turn out that CHD was not reduced: it was actually increased.

The Women’s Health Initiative Study (Rossouw et al., 2002) was a randomized double blind trial on 16,608 postmenopausal women aged 50-79 years and its results contradicted the conclusions from all the earlier cohort studies. HRT increased risks of heart disease, stroke, blood clots, breast cancer (though possibly helped with osteoporosis and perhaps colorectal cancer). After an average 5.2 years of follow-up, the trial was stopped because of the apparent increase in breast cancer in the HRT group. The relative risk (HRT relative to placebo) of CHD was 1.29 (95% confidence interval 1.02 to 1.63) (286 cases altogether) and for breast cancer 1.26 (1.00 to 1.59) (290 cases). Rather than there being a 44% reduction of risk, it seems that there was actually a 30% increase in risk. Notice that these are actually quite small risks, and on the margin of statistical significance. For the purposes of communicating the nature of the risk to an individual person it is usually better to specify the absolute risk rather than relative risk. The absolute number of CHD cases per 10,000 person-years is about 29 on placebo and 36 on HRT, so the increased risk of any individual is quite small. Multiplied over the whole population though, the number is no longer small.
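The difference between relative and absolute risk is easy to work through with the figures just quoted. The sketch below uses the rounded rates of 29 and 36 per 10,000 person-years, and applies them to the 15 million US women mentioned earlier. That last step is a rough assumption made here for illustration, since the trial covered only women aged 50-79.

```python
# Rounded CHD rates per person-year, taken from the text
placebo_rate = 29 / 10_000
hrt_rate = 36 / 10_000

# Relative risk from the rounded rates (illustrative; the trial's own
# estimate, quoted above, was 1.29)
relative_risk = hrt_rate / placebo_rate

# Absolute risk: extra cases per 10,000 women per year
extra_per_10k = (hrt_rate - placebo_rate) * 10_000

# Person-years of HRT corresponding to one extra case
person_years_per_extra_case = 1 / (hrt_rate - placebo_rate)

# ASSUMPTION: the same excess applies to all 15 million users
women_on_hrt = 15_000_000
extra_cases_per_year = (hrt_rate - placebo_rate) * women_on_hrt

print(f"relative risk from rounded rates: {relative_risk:.2f}")
print(f"extra cases per 10,000 per year: {extra_per_10k:.0f}")
print(f"person-years per extra case: {person_years_per_extra_case:.0f}")
print(f"extra cases per year among users: {extra_cases_per_year:.0f}")
```

Seven extra cases per 10,000 women per year is a small risk for any one woman, but on this crude reckoning it would amount to something like ten thousand extra cases a year across all users.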

Several plausible reasons for these contradictory results are discussed by Taubes (2007): it seems that women who choose to take HRT are healthier than those who don’t. In fact the story has become a bit more complicated since then: the effect of HRT depends on when it is started and on how long it is taken (Vandenbroucke, 2009).

This is perhaps one of the most dramatic illustrations of the value of randomised controlled trials (RCTs). Reliance on observations of correlations suggested a 44% reduction in CHD, the randomised trial gave a 30% increase in CHD. Insistence on randomisation is not just pedantry. It is essential if you want to get the right answer.

Having dealt with the cautionary tale of HRT, we can now get back to the ‘Sisyphean task of dividing all the inanimate objects in the world into the ones that either cause or cure cancer’. The recommendations of the WCRF report (2007) include the following.

People who eat red meat to consume less than 500 g (18 oz) a week, very little if any to be processed.

If alcoholic drinks are consumed, limit consumption to no more than two drinks a day for men and one drink a day for women.

Avoid salt-preserved, salted, or salty foods; preserve foods without using salt. Limit consumption of processed foods with added salt to ensure an intake of less than 6 g (2.4 g sodium) a day.

Dietary supplements are not recommended for cancer prevention.

These all sound pretty sensible but they are very prescriptive. And of course the recommendations make sense only insofar as the various dietary factors cause cancer. If the association is not causal, changing your diet won’t help. Note that dietary supplements are NOT recommended. I’ll concentrate on the evidence that lies behind “People who . . . very little if any to be processed.”

The problem of establishing causality is discussed in the report in detail. In section 3.4 the report says

” . . . causal relationships between food and nutrition, and physical activity can be confidently inferred when epidemiological evidence, and experimental and other biological findings, are consistent, unbiased, strong, graded, coherent, repeated, and plausible.”

The case of processed meat is dealt with in chapter 4.3 (p. 148) of the report.

“Processed meats” include sausages and frankfurters; ‘hot dogs’, to which nitrates/nitrites or other preservatives are added, are also processed meats. Minced meats sometimes, but not always, fall inside this definition if they are preserved chemically. The same point applies to ‘hamburgers’.

The evidence for harmfulness of processed meat was described as “convincing”, and this is the highest level of confidence in the report, though this conclusion has been challenged (Truswell, 2009).

How well does the evidence obey the criteria for the relationship being causal?

Twelve prospective cohort studies showed increased risk for the highest intake group when compared to the lowest, though this was statistically significant in only three of them. One study reported non-significant decreased risk and one study reported that there was no effect on risk. These results are summarised in this forest plot (see also Lewis & Clark, 2001)

Each line represents a separate study. The size of the square represents the precision (weight) for each. The horizontal bars show the 95% confidence intervals. If it were possible to repeat the observations many times on the same population, the 95% confidence interval would be different on each repeat experiment, but 19 out of 20 (95%) of the intervals would contain the true value (and 1 in 20 would not contain the true value). If the bar does not overlap the vertical line at relative risk = 1 (i.e. no effect) this is equivalent to saying that there is a statistically significant difference from 1 with P < 0.05. That means, very roughly, that there is a 1 in 20 chance of making a fool of yourself if you claim that the association is real, rather than being a result of chance (more detail here).
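The link between a confidence interval and a P value can be checked by hand. The sketch below takes the two Women’s Health Initiative results quoted earlier and works backwards from each interval to an approximate two-sided P value. It assumes, as is conventional but not stated in the text, that the interval is symmetric on the log scale and based on a normal approximation; the function name is made up for the example.

```python
from math import log, sqrt, erfc

def p_from_ci(rr, lower, upper, z_crit=1.96):
    """Approximate two-sided P value for 'relative risk differs from 1',
    recovered from a 95% confidence interval (normal approximation on log RR)."""
    # Width of the interval on the log scale gives the standard error
    se = (log(upper) - log(lower)) / (2 * z_crit)
    # How many standard errors the estimate lies from 'no effect' (log 1 = 0)
    z = log(rr) / se
    # Two-sided tail area of the standard normal distribution
    return erfc(abs(z) / sqrt(2))

p_chd = p_from_ci(1.29, 1.02, 1.63)      # CHD on HRT vs placebo
p_breast = p_from_ci(1.26, 1.00, 1.59)   # breast cancer on HRT vs placebo
print(f"CHD: P = {p_chd:.3f}, breast cancer: P = {p_breast:.3f}")
```

An interval whose lower limit sits just above 1 gives a P value a little under 0.05, and one whose lower limit touches 1 gives a P value of almost exactly 0.05, which is what “on the margin of statistical significance” means in practice.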

There is certainly a tendency for the relative risks to be above one, though not by much. Pooling the results sounds like a good idea. The method for doing this is called meta-analysis.

Meta-analysis was possible on five studies, shown below. The outcome is shown by the red diamond at the bottom, labelled “summary effect”, and the width of the diamond indicates the 95% confidence interval. In this case the final result for association between processed meat intake and colorectal cancer was a relative risk of 1.21 (95% CI 1.04–1.42) per 50 g/day. This is presumably where the headline value of a 20% increase in risk came from.
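For readers who have not met it, the simplest form of pooling is a fixed-effect, inverse-variance average of the log relative risks. The sketch below uses three HYPOTHETICAL studies invented for the example; they are not the studies in the report, and the report may well have used a different (for example random-effects) method.

```python
from math import log, exp, sqrt

# HYPOTHETICAL studies, invented for illustration:
# (relative risk, lower 95% limit, upper 95% limit)
studies = [
    (1.30, 0.90, 1.88),
    (1.10, 0.85, 1.42),
    (1.25, 1.01, 1.55),
]

def pool_fixed_effect(studies, z_crit=1.96):
    """Inverse-variance pooled relative risk with its 95% confidence interval."""
    weights, weighted_logs = [], []
    for rr, lower, upper in studies:
        # Standard error of log RR recovered from the interval width
        se = (log(upper) - log(lower)) / (2 * z_crit)
        w = 1 / se ** 2            # precise studies get more weight
        weights.append(w)
        weighted_logs.append(w * log(rr))
    pooled_log = sum(weighted_logs) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    return (exp(pooled_log),
            exp(pooled_log - z_crit * pooled_se),
            exp(pooled_log + z_crit * pooled_se))

pooled_rr, pooled_lo, pooled_hi = pool_fixed_effect(studies)
print(f"summary effect {pooled_rr:.2f} (95% CI {pooled_lo:.2f} to {pooled_hi:.2f})")
```

The pooled interval is narrower than that of any single study, which is the whole attraction of the method. It does nothing, of course, to remove a bias that all the studies share.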

Support came from a meta-analysis of 14 cohort studies, which reported a relative risk for processed meat of 1.09 (95% CI 1.05–1.13) per 30 g/day (Larsson & Wolk, 2006). Since then another study has come up with similar numbers (Sinha et al., 2009). This consistency suggests a real association, but it cannot be taken as evidence for causality. Observational studies on HRT were just as consistent, but they were wrong.
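The two meta-analyses quote risks per different amounts of meat (30 g/day and 50 g/day), so they cannot be compared directly. A rough conversion is sketched below. It assumes that the log of the relative risk rises linearly with dose, which is an assumption made here for the purpose of the comparison, not something stated in either source.

```python
def rescale_rr(rr, from_grams, to_grams):
    """Convert a relative risk per `from_grams` to one per `to_grams`,
    ASSUMING log(RR) is proportional to dose."""
    return rr ** (to_grams / from_grams)

# Larsson & Wolk (2006): 1.09 (95% CI 1.05 to 1.13) per 30 g/day
rr_50 = rescale_rr(1.09, 30, 50)
ci_50 = (rescale_rr(1.05, 30, 50), rescale_rr(1.13, 30, 50))
print(f"per 50 g/day: {rr_50:.2f} ({ci_50[0]:.2f} to {ci_50[1]:.2f})")
```

On that assumption the Larsson & Wolk figure becomes roughly 1.15 per 50 g/day, in the same region as the 1.21 quoted from the report.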

The accompanying editorial (Popkin, 2009) points out that there are rather more important reasons to limit meat consumption, like the environmental footprint of most meat production, water supply, deforestation and so on.

So the outcome from vast numbers of observations is an association that only just reaches the P = 0.05 level of statistical significance. But even if the association is real, not a result of chance sampling error, that doesn’t help in the least in establishing causality.

There are two more criteria that might help, a good relationship between dose and response, and a plausible mechanism.

Dose – response relationship

It is quite possible to observe a very convincing relationship between dose and response in epidemiological studies. The relationship between number of cigarettes smoked per day and the incidence of lung cancer is one example. Indeed it is almost the only example.

There have been six studies that relate consumption of processed meat to incidence of colorectal cancer. All six dose-response relationships are shown in the WCRF report. Here they are.

This Figure was later revised to

This is the point where my credulity begins to get strained. Dose – response curves are part of the stock in trade of pharmacologists. The technical description of these six curves is, roughly, ‘bloody horizontal’. The report says “A dose-response relationship was also apparent from cohort studies that measured consumption in times/day”. I simply cannot agree that any relationship whatsoever is “apparent”.

They are certainly the least convincing dose-response relationships I have ever seen. Nevertheless a meta-analysis came up with a slope for the response curve that just reached the 5% level of statistical significance.

The conclusion of the report for processed meat and colorectal cancer was as follows.

“There is a substantial amount of evidence, with a dose-response relationship apparent from cohort studies. There is strong evidence for plausible mechanisms operating in humans. Processed meat is a convincing cause of colorectal cancer.”

But the dose-response curves look appalling, and it is reasonable to ask whether public policy should be based on a 1 in 20 chance of being quite wrong (1 in 20 at best; see Senn, 2008). I certainly wouldn’t want to risk my reputation on odds like that, never mind use it as a basis for public policy.

So we are left with plausibility as the remaining bit of evidence for causality. Anyone who has done much experimental work knows that it is possible to dream up a plausible explanation of any result whatsoever. Most are wrong and so plausibility is a pretty weak argument. Much play is made of the fact that cured meats contain nitrates and nitrites, but there is no real evidence that the amount they contain is harmful.

The main source of nitrates in the diet is not from meat but from vegetables (especially green leafy vegetables like lettuce and spinach) which contribute 70 – 90% of total intake. The maximum legal content in processed meat is 10 – 25 mg/100g, but lettuce contains around 100 – 400 mg/100g with a legal limit of 200 – 400 mg/100g. Dietary nitrate intake was not associated with risk for colorectal cancer in two cohort studies (Food Standards Agency, 2004; International Agency for Research on Cancer, 2006).

To add further to the confusion, another cohort study on over 60,000 people compared vegetarians and meat-eaters. Mortality from circulatory diseases and mortality from all causes were not detectably different between vegetarians and meat eaters (Key et al., 2009a). Still more confusingly, although the incidence of all cancers combined was lower among vegetarians than among meat eaters, the exception was colorectal cancer which had a higher incidence in vegetarians than in meat eaters (Key et al., 2009b).

Mente et al. (2009) compared cohort studies and RCTs for effects of diet on risk of coronary heart disease. “Strong evidence” for protective effects was found for intake of vegetables, nuts, and “Mediterranean diet”, and harmful effects of intake of trans–fatty acids and foods with a high glycaemic index. There was also a bit less strong evidence for effects of mono-unsaturated fatty acids and for intake of fish, marine ω-3 fatty acids, folate, whole grains, dietary vitamins E and C, beta carotene, alcohol, fruit, and fibre. But RCTs showed evidence only for “Mediterranean diet”, and for none of the others.

As a final nail in the coffin of case control studies, consider pizza. According to La Vecchia & Bosetti (2006), data from a series of case control studies in northern Italy led to the following conclusion: “An inverse association was found between regular pizza consumption (at least one portion of pizza per week) and the risk of cancers of the digestive tract, with relative risks of 0.66 for oral and pharyngeal cancers, 0.41 for oesophageal, 0.82 for laryngeal, 0.74 for colon and 0.93 for rectal cancers.”

What on earth is one meant to make of this? Pizza should be prescribable on the National Health Service to produce a 60% reduction in oesophageal cancer? As the authors say “pizza may simply represent a general and aspecific indicator of a favourable Mediterranean diet.” It is observations like this that seem to make a mockery of making causal inferences from non-randomised studies. They are simply uninterpretable.

Is the observed association even real?

The most noticeable thing about the effects of red meat and processed meat is not only that they are small but also that they only just reach the 5 percent level of statistical significance. It has been explained clearly why, in these circumstances, real associations are likely to be exaggerated in size (Ioannidis, 2008a; Ioannidis, 2008b; Senn, 2008). Worse still, there are some good reasons to think that many (perhaps even most) of the effects that are claimed in this sort of study are not real anyway (Ioannidis, 2005). The inflation of the strength of associations is expected to be bigger in small studies, so it is noteworthy that the large meta-analysis by Larsson & Wolk (2006) comments “In the present meta-analysis, the magnitude of the relationship of processed meat consumption with colorectal cancer risk was weaker than in the earlier meta-analyses”.
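The exaggeration is easy to reproduce in a simulation. The sketch below imagines many studies of the same small, real effect, each with a good deal of sampling error, and then looks only at those that happened to reach statistical significance. The true effect and the standard error are invented numbers chosen for illustration, not estimates from any of the papers cited.

```python
import random

def significance_filter(true_log_rr=0.05, se=0.08, n_studies=20_000, seed=0):
    """Simulate many noisy studies of one true effect and compare the average
    estimate over ALL studies with the average over 'significant' ones."""
    rng = random.Random(seed)
    estimates = [rng.gauss(true_log_rr, se) for _ in range(n_studies)]
    # Keep only studies that found a significantly raised risk (z > 1.96)
    significant = [e for e in estimates if e / se > 1.96]
    mean_all = sum(estimates) / len(estimates)
    mean_sig = sum(significant) / len(significant)
    return mean_all, mean_sig, len(significant) / n_studies

mean_all, mean_sig, frac_sig = significance_filter()
print(f"true log RR 0.05, mean of all studies {mean_all:.3f}")
print(f"mean of significant studies {mean_sig:.3f} ({frac_sig:.0%} of studies)")
```

Averaged over all the studies the estimate is about right, but the minority that reach significance overstate the effect several-fold. An estimate has to be well above the truth before it can cross the significance threshold at all.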

This is all consistent with the well known tendency for early randomized clinical trials to show a good effect of treatment, while subsequent trials tend to show smaller effects. The reasons, and the cures, for this worrying problem are discussed by Chalmers (Chalmers, 2006; Chalmers & Matthews, 2006; Garattini & Chalmers, 2009).

What do randomized studies tell us?

The only form of reliable evidence for causality comes from randomised controlled trials. The difficulties in allocating people to diets over long periods of time are obvious and that is no doubt one reason why there are far fewer RCTs than there are observational studies. But when they have been done the results often contradict those from cohort studies. The RCTs of hormone replacement therapy mentioned above contradicted the cohort studies and reversed the advice given to women about HRT.

Three more illustrations of how plausible suggestions about diet can be refuted by RCTs concern nutritional supplements and weight-loss diets.

Many RCTs have shown that various forms of nutritional supplement do no good and may even do harm (see Cochrane reviews). At least we now know that anti-oxidants per se do you no good. The idea that anti-oxidants might be good for you was never more than a plausible hypothesis, and like so many plausible hypotheses it has turned out to be a myth. The word anti-oxidant is now no more than a marketing term, though it remains very profitable for unscrupulous salesmen.

Contrary to much dogma about weight loss, Sacks et al. (2009) found no differences in weight loss over two years between four very different diets. They assigned randomly 811 overweight adults to one of four diets. The percentages of energy derived from fat, protein, and carbohydrates in the four diets were 20, 15, and 65%; 20, 25, and 55%; 40, 15, and 45%; and 40, 25, and 35%. No difference could be detected between the different diets: all that mattered for weight loss was the total number of calories. It should be added, though, that there were some reasons to think that the participants may not have stuck to their diets very well (Katan, 2009).

The impression one gets from RCTs is that the details of diet are not anything like as important as has been inferred from non-randomised observational studies.

So does processed meat give you cancer?

After all this, we can return to the original question. Do sausages or bacon give you colorectal cancer? The answer, sadly, is that nobody really knows. I do know that, on the basis of the evidence, it seems to me to be an exaggeration to assert that “The evidence is convincing that processed meat is a cause of bowel cancer”.

In the UK there were around 5 cases of colorectal cancer per 10,000 population in 2005, so a 20% increase, even if it were real, and genuinely causative, would result in 6 rather than 5 cases per 10,000 people, annually. That makes the risk sound trivial for any individual. On the other hand there were 36,766 cases of colorectal cancer in the UK in 2005. A 20% increase would mean, if the association were causal, about 7000 extra cases as a result of eating processed meat, but no extra cases if the association were not causal.
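The sums in the last paragraph are simple, but it is worth seeing the individual and the population figures side by side. The sketch below follows the same rough reasoning, treating the 2005 figures as the baseline to which the 20% is applied; the variable names are made up for the example.

```python
baseline_per_10k = 5        # UK colorectal cancer cases per 10,000 in 2005 (from the text)
cases_2005 = 36_766         # total UK cases in 2005 (from the text)
relative_increase = 0.20    # the headline 20% increase

# What the increase means for an individual
raised_per_10k = baseline_per_10k * (1 + relative_increase)
extra_per_10k = raised_per_10k - baseline_per_10k

# What it means across the whole population, IF the association is causal
extra_cases_uk = cases_2005 * relative_increase

print(f"individual: {baseline_per_10k} -> {raised_per_10k:.0f} per 10,000 per year")
print(f"population: about {extra_cases_uk:.0f} extra cases per year")
```

One extra case per 10,000 people and several thousand extra cases nationally are the same number seen from two viewpoints. Both are conditional on the association being causal.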

For the purposes of public health policy about diet, the question of causality is crucial. One has sympathy for policy makers and the difficult decisions that they have to make, because they are forced to decide on the basis of inadequate evidence.

If it were not already obvious, the examples discussed above make it very clear that the only sound guide to causality is a properly randomised trial. The only exceptions to that are when effects are really big. The relative risk of lung cancer for a heavy cigarette smoker is 20 times that of a non-smoker and there is a very clear relationship between dose (cigarettes per day) and response (lung cancer incidence), as shown above. That is a 2000% increase in risk, very different from the 20% found for processed meat (and many other dietary effects). Nobody could doubt seriously the causality in that case.

The decision about whether to eat bacon and sausages has to be a personal one. It depends on your attitude to the precautionary principle. The observations do not, in my view, constitute strong evidence for causality, but they are certainly compatible with causality. It could be true so if you want to be on the safe side then avoid bacon. Of course life would not be much fun if your actions were based on things that just could be true.

My own inclination would be to ignore any relative risk based on observational data if it was less than about 2. The National Cancer Institute (Nelson, 2002) advises that relative risks less than 2 should be “viewed with caution”, but fails to explain what “viewing with caution” means in practice, so the advice isn’t very useful.

In fact hardly any of the relative risks reported in the WCRF report (2007) reach this level. Almost all relative risks are less than 1.3 (or greater than 0.7 for alleged protective effects). Perhaps it is best to stop worrying and get on with your life. At some point it becomes counterproductive to try to micromanage people’s diet on the basis of dubious data. There is a price to pay for being too precautionary. It runs the risk of making people ignore information that has got a sound basis. It runs the risk of excessive medicalisation of everyday life. And it brings science itself into disrepute when people laugh at the contradictory findings of observational epidemiology.

The question of how diet and other ‘lifestyle interventions’ affect health is fascinating to everyone. There is compelling reason to think that it matters. For example one study demonstrated that breast cancer incidence increased almost threefold in first-generation Japanese women who migrated to Hawaii, and up to fivefold in the second generation (Kolonel, 1980). Since then enormous effort has been put into finding out why. The first great success was cigarette smoking but that is almost the only major success. Very few similar magic bullets have come to light after decades of searching (asbestos and mesothelioma, or UV radiation and skin cancer count as successes).

The result of addition of the new data was to reduce slightly the apparent risk from eating processed meat from 1.21 (95% CI = 1.04-1.42) in the original study to 1.18 (95% CI = 1.10-1.28) in the update. The change is too small to mean much, though it is in the direction expected for false correlations. More importantly, the new data confirm that the dose-response curves are pathetic. The evidence for causality is weakened somewhat by addition of the new data.

58 Responses to Diet and health. What can you believe: or does bacon kill you?

Simply brilliant. A wonderful, clear-eyed account of the way in which observational epidemiology can fail to deliver. It also highlights the complete inability of journos to see past the press release and assess critically.

I think the issue you mention about people laughing at the pointy-headed boffins is crucial. Uncritical reporting with a constant cycle of contradictory findings erodes trust in medical science. This all spills over into consultations and it is deeply challenging to get past this massive bombardment of dubious information to get across any coherent message.

The World Cancer Research Fund’s remit is narrow: it looks only at diet and cancer. It’s a very limited brief and, since they are awash with funds, they fund some possibly flaky studies. And maybe flaky studies come up with flaky conclusions.

Great post, David. I’m only surprised you managed to resist the temptation to entitle it:

Telling porkies…. (?)

Here at Casa Aust, Mrs Dr Aust, who was farm-raised in a place where they generally made sure they used every last bit of the farm pig, is generally rather scornful of the advice on pork. By way of example, we have just tucked into roast pork belly for our Sunday dinner…which when put together with the Penne carbonara yesterday, and the salami we ate for lunch, means we in the Aust Family are presumably all doomed… or not.

Although given the (earlier generations) Aust family history, and Mrs Dr Aust’s non-low-fat cooking, I predict hypertension and ischaemic / atherosclerotic disease is going to get me way before anything to do with dangerous bacon.

BTW, the jokers on the Swine Flu blogs are predicting that pork is going to be especially cheap this Summer, so the smell of barbecue ribs and chops will likely be wafting over many a suburban street.

Coming back to risk factors, my own little foray into the world of context-free relative risk figures was in the consideration of whether chlorinated water increases your cancer risk (a line favoured by that, er, distinguished UCL graduate Dr John Briffa), though I didn’t do the proper statistical job you have here.

Thank you very much for that David, especially the dense references. The effort is much appreciated here in Muscle Villas. I remember when this came out and iirc Michael Marmot was all over the media throwing his academic credentials around very heavily. I thought at the time that if the data were so compelling why the heavy appeal to authority?

What I have always got from the food/cancer link studies particularly when comparing populations is the most you can do is choose your cancer risk. Japanese women may not have high breast cancer rates on a traditional diet but don’t they get gullet cancers from all the salt and the high fish input?

Also here at Muscle Villas we have determined what is bad for us (individually: fat; gluten; sugar), but beyond those affected avoiding the relevant foodstuffs, exercise is the big thing. People focus on diet because it is easier to eat than to get on your bike or hit the road.

So Parma Ham lardons or chorizo in the casserole has never been threatened here. Also the 5 a day that is everywhere has no scientific basis, but it is consistent with a healthy diet. Apparently some MinHealth functionary was asked by a journo once how much fruit’n’veg we should be eating and the panicked functionary said ‘five’.

Oh and @Dr Aust, dammit forgot to check the pork yesterday in the supermarket. We ate well of beef in the last big and silly BSE scare.

Excellent article – thoroughly enjoyed reading that. I had a similar discussion not so long ago with a friend, but was unable to put the case as eloquently and accurately as above. I’ll direct them to it!

Fantastic post. Very helpful as I’ve been debating a non-randomised ‘trial’ recently. Poking around the internet I have come across an interesting article by Peter Armitage comparing and contrasting the approaches of Hill and Fisher to randomisation.

magnificent, brilliant demonstration of how blogs by people who know what they are talking about can be 18 million times better than anything anywhere in mainstream media. detailed on every last necessity with not a jot of dumbing down, self-indulgently digressive without ever getting boring, an appropriately high opinion of the reader but thoroughly accessible, with references and diagrams, we are in the presence of greatness.

Ulrich Berger.
Thanks for the Orac link. It’s good as always, but surprisingly it barely mentions causality. To my mind, that is the main point. It doesn’t matter a damn if the association is real if it isn’t causal, because if it isn’t causal it can’t lead to useful action. I’m not going to go into this one myself because of an obvious vested interest, because of the lack of data about pipe smoke, and because it is a field plagued by near-religious zealotry. All that I’ll risk saying is that if the highest reported risks for cigarette smoke are real and causal, then it has a very weird dose-response curve.

Except that in other animals, chimps for example, low status and the loss of status do lead to measurably high stress hormones and bad health. Note that low status of itself is not harmful; only its association with high stress is. So a village of poor people can be a good place to be, as can being a vassal who is happy with his lot and has no realistic concept of changing it. But living in a Rio favela just across the road from conspicuous consumption, bombarded by media extolling unattainable things, is harmful, because that is actively stressful. So to completely discount it you have to believe either that chronic stress is not harmful or that that sort of condition is somehow not stressful.

Great piece, it makes several salient points that are lost to most people reading media reports on x or y in one’s diet. However, as an epidemiologist, I feel compelled to provide a counter-point. Yes, RCTs are the holy grail, but we can’t always do those, especially for individual nutritional factors and rare diseases like cancer. For example, it’s infeasible to tell group A to eat 4 slices of bacon a day and group B to eat none in order to tell whether people are at an increased risk for colorectal cancer. You would have to tell thousands of people to eat a certain amount of a food and tell thousands of others not to for years, and then follow them to see if they get cancer.

Where epidemiologic studies come in is how you can use these observational studies (cohort and case-control) to fill in gaps where RCTs aren’t feasible.

These increased risks (20-30%) may not seem like much for an individual, but a moderate increase in risk for an individual might translate to a substantial risk at a population level. While it seems meddling to say that someone should eat less than 18 oz of red meat a week (and I agree that there are entirely too many of these recommendations out there), the recommendations can have an influence on a population level.

jodelanc
Yes, of course you are right that it’s very hard to do RCTs. My point was that if they haven’t been done, or can’t be done, it may be necessary to say we don’t know. A 20% increase in risk will “translate to a substantial risk at a population level” only if it is causal. If the association is not causal, it translates to zero risk.
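
The arithmetic behind this exchange can be sketched with Levin's formula for the population attributable fraction. This is only an illustration: the function name, the `causal` flag and the 50% exposure prevalence are assumptions made for the example, not figures from the WCRF report.

```python
def attributable_fraction(prevalence, relative_risk, causal=True):
    """Levin's population attributable fraction.

    prevalence    -- fraction of the population exposed (hypothetical here)
    relative_risk -- risk in the exposed divided by risk in the unexposed
    causal        -- if False, removing the exposure prevents nothing
    """
    if not causal:
        return 0.0
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs: half the population exposed, relative risk 1.2.
print(attributable_fraction(0.5, 1.2))                # about 0.09 if causal
print(attributable_fraction(0.5, 1.2, causal=False))  # 0.0 if not causal
```

On these made-up inputs a 20% relative risk would account for roughly 9% of cases, but only under the causal reading. The data alone cannot say which branch of the function applies.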

If nothing else, the research on the dangers of bacon is proving to be rather useful as a teaching aid. David Spiegelhalter used it to illustrate different ways of spinning the risk and now we have this on randomisation, dose-response relationships, and the problem of establishing causality. Brilliant.

Yes, I re-posted Ben Goldacre’s Twitter post yesterday, & have done so again today. But like yours, compared to his vast army, my happy few will have little or no effect, too.

Clear, insightful & compelling science writing is rarer than the proverbial rocking horse cr*p, so hopefully your blog will become a popular destination for scientists, journalists & anyone with a healthy curiosity for good science writing.

Nice research, but I was puzzled that professional scientists should have made quite such a mess of interpreting the data. Looking at Goldbohm 1994 and presenting some of their data in what may be a clearer form (http://www.ucl.ac.uk/~ucgbarg/colon_cancer.jpg ), the processed meat – cancer relation doesn’t look so vacuous. Note that the dose-response relation is replicated almost identically in males and females. For sure, RCTs would give clearer conclusions – but how? Would many now volunteer (even with 50% chance of reprieve) to adopt whatever went into Dutch meat sandwiches in 1994? This seems more of a case for follow-up animal studies than RCTs. Sure, the verdict about causality remains “don’t know” on the basis of this evidence alone, but I wouldn’t do more than downgrade the WCRF’s “convincing” to “pretty convincing”.
Since this verges on proper pharmacology, it might help to clarify the difference between nitrites and nitrates, and how vitamin C may (or may not) fit into the picture. Perhaps Bacon Lettuce and Tomato deserves its plaudits as one of America’s great inventions.

Tonygm
The points on the Goldbohm graph are so close together (Fig 4.3.7 above) that the slope is ill-defined. In fact when men and women are combined it only just reaches the 5% level of significance (and that is about the best of the curves). It’s interesting that Goldbohm also contradicts entirely the WCRF conclusions about red meat.

No trends in relative rates of colon cancer were detected for intake of energy or for the energy-adjusted intake of fats, protein, fat from meat, and protein from meat. Consumption of total fresh meat, beef, pork, minced meat, chicken, and fish was not associated with risk of colon cancer either. Processed meats, however, were associated with an increased risk in men and women (relative rate, 1.17 per increment of 15 g/day; 95% confidence interval, 1.03 – 1.33). The increased risk appeared to be attributable to one of the five questionnaire items on processed meat, which comprised mainly sausages. This study does not support a role of fresh meat and dietary fat in the etiology of colon cancer in this population. As an exception, some processed meats may increase the risk, . . .
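
To compare the quoted "1.17 per increment of 15 g/day" with the 50 g/day figure used in the headlines, one can rescale it. The sketch below assumes that risk multiplies by the same factor for every extra 15 g (a log-linear dose-response), which the abstract does not state, and it extrapolates well beyond the doses actually studied. The function name is made up for the example.

```python
def scale_relative_rate(rate, per_grams, target_grams):
    """Rescale a relative rate quoted per `per_grams` to `target_grams`.

    Assumes a log-linear dose-response: each extra `per_grams`
    multiplies the risk by `rate` (an assumption, not a finding).
    """
    return rate ** (target_grams / per_grams)


# Point estimate and 95% confidence limits from the quoted abstract.
point = scale_relative_rate(1.17, 15, 50)
lower = scale_relative_rate(1.03, 15, 50)
upper = scale_relative_rate(1.33, 15, 50)
print(f"per 50 g/day: {point:.2f} ({lower:.2f} to {upper:.2f})")
```

Under that assumption the point estimate comes out at roughly 1.7 per 50 g/day, with an interval running from about 1.1 to about 2.6. The width of that interval is another way of seeing how ill-defined the slope is.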

It’s probably just me, but you said
“There have been six studies that relate consumption of processed meat to incidence of colorectal cancer.”
Yet in figures 4.3.7 and 4.3.8 I count seven curves/studies!

Bender. I presume that Goldbohm (men and women) is counted as one study. There is a similar thing in Fig 4.3.6. The report refers to five studies suitable for meta-analysis whereas Fig 4.3.6 shows 6, presumably because Chao (2005) men and women count as one study.

The points on the WCRF Fig. 4.3.7 are indeed close together on the dose axis. This means that if there is a genuine causal relation to pathology, then the toxicity levels must have been high. It seems that in this Dutch study most people liked processed meat (only 15% were in the zero consumption category). However, 20g/day was considered quite a large dose: the highest category in the questionnaire survey was “>20g”. Perhaps Dutch salami slices are/were small but strong!

The dose-response relation is surely well enough defined (as my graphs illustrate) that one can scarcely dismiss it out of hand. Indeed, a P-value of <0.05 should ring some alarm bells for an issue as important as this, and I certainly hope that some sort of follow-up was done. Whether this happened is not I think very clear from the WCRF review. It may be that the result for processed meat was a freak that wasn’t really all that unlikely given the number of dietary categories under study – though the similarity of results for males and females argues somewhat against that. Obviously there have been other studies that have not produced clear results; but it would be quite inappropriate to dismiss tentative concerns from one study on the basis of studies that relate to quite different types of sausages, maybe from Cumberland, Milano, Kentucky or Berlin. Would it be physically possible to eat 300g/day of pepperoni (a dose that also appears on Fig. 4.3.7, presumably for some other kind of processed meat)?
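
The "freak" possibility can be given a rough size. If several independent exposures are each tested at the 5% level and none has any real effect, the chance that at least one comes up "significant" grows quickly. The sketch below assumes independent tests, which dietary categories certainly are not, and uses 12 as a rough count of the exposures named in the Goldbohm abstract quoted earlier. Both are assumptions for illustration.

```python
def chance_of_false_alarm(n_tests, alpha=0.05):
    """Probability that at least one of `n_tests` independent null
    tests reaches significance at level `alpha` by chance alone."""
    return 1.0 - (1.0 - alpha) ** n_tests


for n_tests in (1, 6, 12):
    print(n_tests, round(chance_of_false_alarm(n_tests), 2))
```

With a dozen independent tests the chance of at least one false alarm is a little under one half, so a single P < 0.05 among many categories is weaker evidence than it looks in isolation.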

I have no concern to raise alarm about diet. Nor does WCRF, which scarcely concludes more than the old mantra “exercise moderation”. But I do have concern when I read a nicely persuasive blog that properly raises important points and has elicited many deserved compliments, but that appears to skate merrily over genuine scientific issues that might weaken the impact. Such, I suppose, is a blog.

It is not “blogs” that you are accusing of skating over genuine scientific issues, it is me whom you are accusing of doing that.

WCRF does not counsel “moderation”, it says that the evidence is “convincing” and counsels that you eat “very little if any” processed meat.

What you (and I) said about Goldbohm is precisely right, but for some reason you are focussing attention entirely on the trial with the biggest effect. The other trials in Figs 4.3.6, 4.3.7 and 4.3.8 got quite different results. The whole idea of meta-analysis is to try to look at all the data, not just the bits that suit your own agenda. The result of the meta-analysis did (as I say clearly) just reach the 5% level of significance. It could well be real. That, of course, is no help at all with the problem of causality which was my main point.

I’m not dismissing “tentative concerns” at all. I’m merely saying that the totality of the evidence seems to me to be surprisingly weak as a basis for lecturing the public about what to eat.

Great post. The one point that you seem to miss, however, is that in observational studies, such as the ones used to indict processed meat, the statistical significance is not nearly as meaningful as the confounders: those other factors that might explain the statistically significant effect observed. With cigarettes and cancer, it’s virtually impossible to imagine anything that could explain the 20-fold increase in lung cancer among heavy smokers (not that the tobacco industry didn’t try). With these smaller relative risks, even those considerably larger than 2, it’s all too easy to imagine confounding variables that the researchers either didn’t measure or didn’t properly assess. That was the message of my 2007 New York Times Magazine article that you were so kind to compliment.

Imagine for instance all the possible ways that the highest quintile of processed-meat eaters in the 1990s or 2000s might differ from the lowest quintile, particularly considering the fact that processed meats have been generally perceived as carcinogenic for thirty years or more. What you’re comparing are people who don’t seem to give a damn whether something is healthy or not (or people on the Atkins diet who are predisposed to gain weight easily) to health-conscious, quasi-vegetarians. The latter are probably better educated (a typical finding in all these studies) and of a higher socioeconomic class; they go to better doctors, get better medical care, eat generally healthier diets (whatever that means), etc. etc. The reason to do randomized trials is to render irrelevant all these possible confounders, by disseminating them equally among all the arms of the study. Without randomization, that an effect is statistically significant says virtually nothing at all about whether or not the cause of that effect is what you set out to study. The fact that RCTs are effectively impossible to do in these kinds of situations, as Tony GM points out, doesn’t negate the fact that they’re necessary to learn anything meaningful.

Meta-analysis can be meaningless in this context because it’s quite easy for every study done, in every population, to have the same confounders. The only way to learn anything meaningful — short of getting an effect as large as the lung-cancer/cigarette association — is to do a randomized trial.
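
The confounding argument can be made concrete with a toy simulation. Everything in it is invented for illustration: a single hidden trait ("health-conscious") lowers both bacon eating and cancer risk, and bacon itself has no effect at all. The probabilities are made up and do not come from any study.

```python
import random


def relative_risk(randomised, n=200_000, seed=1):
    """Relative risk of 'cancer' in bacon eaters vs non-eaters in a toy
    population where bacon has NO causal effect on cancer."""
    rng = random.Random(seed)
    cases = {True: 0, False: 0}
    totals = {True: 0, False: 0}
    for _ in range(n):
        health_conscious = rng.random() < 0.5
        if randomised:
            # Trial: a coin flip decides who eats bacon.
            eats_bacon = rng.random() < 0.5
        else:
            # Observation: health-conscious people eat bacon less often.
            eats_bacon = rng.random() < (0.3 if health_conscious else 0.7)
        # Risk depends only on the hidden trait, never on bacon.
        gets_cancer = rng.random() < (0.05 if health_conscious else 0.08)
        totals[eats_bacon] += 1
        cases[eats_bacon] += gets_cancer
    return (cases[True] / totals[True]) / (cases[False] / totals[False])


print("observational:", round(relative_risk(randomised=False), 2))
print("randomised:   ", round(relative_risk(randomised=True), 2))
```

With these invented numbers the observational comparison is expected to show a relative risk of about 1.2 (0.071 against 0.059), even though bacon does nothing, while the randomised comparison is expected to sit near 1. Repeating the observational study in other populations with the same hidden trait would reproduce the same spurious 1.2, which is why pooling such studies does not remove the problem.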

One of the lessons I learned from my early life reporting on high energy physics is known in that field as Panofsky’s law (after Pief Panofsky, founder of the Stanford Linear Accelerator Laboratory): If you throw money at an effect, and it doesn’t get bigger, it means it’s not really there.

In nutritional epidemiology, if you throw money at an effect and it doesn’t get bigger, you do a meta-analysis. It always struck me that the very fact of having to do a meta-analysis is pretty compelling evidence that the effect you’re trying to nail down isn’t real. I may be wrong, but I’ve yet to meet the epidemiologist who could explain why.

Gary Taubes
Good to hear from the author of the wonderful NYT article that I cited.

I agree entirely with what you say, and perhaps I should have spelled out in more detail the problem of confounders which is what makes randomisation so important.

In lab experiments, as well as diet studies, systematic errors are often much more important than random sampling errors. I recall that for years each estimate of the speed of light that was made was outside the error limits of the preceding study. Confounders of the sort you mention come into the same category: errors that are reproducible from trial to trial, so giving a spurious appearance of reproducibility.

It is very striking, once again, how very difficult it can be to get firm answers to what sound like simple questions.

Probability Theory doesn’t really “solve” the problem of establishing causality. It does, however, provide a rigorous and internally consistent mathematical framework for reasoning with incomplete information. You could, for instance, quantitatively compare observational vs. randomized controlled studies, and numerically show that the former provides little support for hypotheses of causality. Also very useful for designing experiments, as you can quantify which design is likely to provide greater insight.

Probability Theory would also raise discussions of the relative merits of different study types above that of philosophy. It’s just math.

davedixon
What you say is quite true, but I suspect it doesn’t help much with the real problem.
My point was simply that no amount of mathematics will extract information that isn’t in the data.

It might be fun, though, to do an addendum on the beauty of randomisation tests. They are surely one of the best ways of demonstrating the fundamental role of randomisation as the basis of significance testing. I’ve been enthusiastic about them for quite a long time now (DC, Lectures on Biostatistics, 1971, Clarendon Press) and used them for teaching.
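
For readers who have not met one, a two-sample randomisation test can be sketched in a few lines. This is a generic Monte Carlo version written for illustration, not the treatment in the book cited above, and the two small samples are invented.

```python
import random


def randomisation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided randomisation (permutation) test for a difference in means.

    Repeatedly re-allocates the pooled observations to two groups at
    random and counts how often the re-allocated difference is at least
    as large as the one actually observed.
    """
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        group_a, group_b = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_extreme += 1
    # Count the observed allocation itself so that P is never exactly 0.
    return (at_least_as_extreme + 1) / (n_resamples + 1)


# Invented measurements for two randomly allocated groups.
treated = [1, 2, 3, 4, 5]
control = [11, 12, 13, 14, 15]
print(randomisation_test(treated, control))
```

The logic only holds if the allocation to groups really was random. When it was not, as in an observational study, the same arithmetic can be run but the resulting P value no longer has that justification.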

The bacon problem illustrates that significance testing is only a small part of the problem. Causality is much harder to show. Having huge numbers of subjects helps a lot for getting “significant” differences but with the risk that you may end up simply detecting smaller and smaller associations that are ever more susceptible to misinterpretation because of confounders.

Probability Theory is no magic bullet for sure. It just elevates the discussion of evidential support from “he said/she said” to mathematics. That pushes the argument to either one of mathematical errors or incorrect inputs. Those inputs also can be extended beyond just the data, and include other information.

Consider the case of establishing causality in the face of a large number of potential confounders. The inputs are not just the data, but other potentially relevant information. For an observational study, such information is really your explicit specification of what you know you don’t know, e.g. “I’m measuring a bunch of variables, but I do not know the relationships amongst them”, in particular the arrow of causality. So when you test the hypothesis “eating bacon causes heart disease” vs. “heart disease makes you eat bacon”, you would quantitatively find no preference, as you included no information in the problem that indicates the arrow of causality.

So it’s an extension of your point: you can’t get more information from a problem than what you put in. Information extends beyond data, but also includes other pieces of knowledge. Probability Theory allows you to include this other knowledge (usually excluded from statistical analyses) and make rigorous and consistent assessments of how ALL of the included information supports hypotheses.

Great article. Someone who I showed it to, got a bit despairing: “we don’t know anything about what to eat or not to eat!” In order to reassure people, I wonder if you know of any forest plots or dose-response plots that summarise (for example) the protective effect of eating fresh fruit+veg, which I believe is well-attested?

I did some quick searching and found a few reviews. Figure 1 in each of (Ness AR, Powles JW. Fruit and vegetables, and cardiovascular disease: a review. Int J Epidemiol 1997;26:1–13) and (Steinmetz KA, Potter JD. Vegetables, fruit, and cancer prevention: a review. J Am Diet Assoc 1996;96:1027–39) seem to indicate protective effects of fruit+veg against some cancers and against cardiovascular disease. Nice to know that there are some things we think we know…?

Where we agree with David Colquhoun is that it is very likely that the more that people eat processed meat, the higher their risk of bowel cancer. Of course we can never be absolutely certain – no studies can ever do that – but the question is whether we are sure enough to make a recommendation. This is where we differ from David.

The difference is in interpretation of the evidence. We think the evidence is convincing and should form the basis of public health advice. David thinks the small element of doubt means the evidence is not strong enough to warrant advising people about processed meat.

David also suggests we should not be giving advice on processed meat because there has not been a randomised control trial. While there is still an element of doubt in randomised trials, everyone agrees they are the most robust type of study.

But while this type of trial is relatively straightforward when measuring the effect of a pill for treating a condition, it is neither practical nor realistic to do this for something like the impact of habitual lifelong consumption of processed meat on a condition such as colorectal cancer which has a latency of decades.

The trouble is – as David accepts – some scientific questions simply are not testable using double blind randomised trials, and the impact of food, nutrition and physical activity over a lifetime on cancer risk is one of those. David feels that therefore no recommendation should be made; but this is a vitally important question and we feel that people should be advised according to the best available evidence.

This is a question of judgement and people can make their own minds up about who is right and wrong. But the decision to describe the evidence as convincing comes from a panel of 21 world-renowned scientists.

But if I was about to eat something that someone was confident – even if not certain – was going to be harmful to me, I would want that person to warn me about it.

So in the absence of a randomised control trial, should you never give advice about diet and lifestyle? We would argue this would be an irresponsible approach and this is at the heart of where we disagree with David.

I’m very grateful to Professor Wiseman for taking the trouble to put the other side of the case. I do wish, though, that he’d addressed more directly the strength of the evidence for causality. It really is the crucial point. Even if the relative risk was much greater than it appears to be, the advice to stop eating ham would make no sense unless the ham caused the cancer. What caused me to write this piece was seeing the astonishing flimsiness of the alleged dose-response relationship. As Wiseman’s report itself states, the existence of a dose-response relationship is about all you can do in the absence of randomised trials to establish causality, but only one study shows anything approaching a convincing relationship.

I think perhaps that Wiseman is also exaggerating a bit the extent to which there is unanimity about the question of causality. If I may resort, for a moment, to an anecdote, my recent encounter with the Royal Marsden Hospital provides an example. I noticed that the refreshment counter in their waiting room was selling ham sandwiches. When I mentioned this to my rather distinguished oncologist he just laughed. It is true that the authors of the report include some very distinguished people. But however distinguished you may be, you can’t divine causality without data to support your conclusion, and the data are really very thin.

There are two opposing problems to dietary advice. On one hand, every small association can be interpreted as causal and very detailed advice offered. This extreme precautionary approach is what the WCRF has adopted and it can certainly be argued that this is the responsible thing to do. The other side of the argument is that excessively detailed advice can be counterproductive. It makes people laugh at science and runs the risk that really soundly-based advice will be dismissed. It is very common to hear people saying that "these people can’t make up their mind". It happens every time a report on the beneficial/harmless/lethal effect of red wine appears in the press. Still worse, the example of hormone replacement therapy shows that advice that is not based on sound evidence for causality may do harm rather than good.

I suppose that my thesis, at heart, is that scientists should say rather more often than they do "I don’t know". Exaggeration about the strength of evidence, however well intentioned, is, in the end, counterproductive.

I haven’t read much about this issue, so I cannot make any specific comments.

IMHO I feel like science can only give us a perspective. If we are really looking for highest proof or causality in everything, all we have is proof based on a narrow set of observations.

I think that the hardest part is to teach the public that everything is not as black and white as they think, or want to think. There are so many shades of gray. And these shades will hopefully move towards white or black as science develops.

This is a great sledge hammer to what seems like an egg-shell that has resisted cracking. I disagree with David’s implicit suggestion that these are scientists. Many are MDs who have little experience with science and, in fact, David has pin-pointed the danger in the public thinking that this is science.

I also disagree with the idea that random control trials (RCTs) are any kind of standard or, in fact, that there is one kind of experiment that fits all questions. RCTs, in my opinion, have never recovered from Smith and Pell ( http://bit.ly/mXfyFw ). You don’t need an RCT to test the efficacy of penicillin. It is a foreign substance so you might need an RCT to test safety. But in these nutritional studies, it is really the arbitrary choice of independent variable that is the problem. Why sausage? Because somebody thinks they look greasy and might cause cancer. The OR or any other outcome variable should be tested against the OR of any number of things that might pop into somebody’s head: pancakes, pasta primavera, taro root. I offer my own take on this at my post on “Red Meat and the New Puritans” http://wp.me/p16vK0-4l

@rumford
I don’t agree at all about RCTs. Smith and Pell was a good joke (or perhaps a rather bad joke).

It’s blindingly obvious that occasionally effects are so dramatic that you don’t need an RCT but it’s also obvious that, unfortunately, big dramatic effects are rare in medicine.

And one case where the effects are certainly not dramatic is the effect of what you eat on your health. It is, more than most, a field that desperately needs RCTs to be sure about causality. It’s very sad that they are so hard to do. That is why there is so little hard information despite the vast amount that’s been written on the topic.

@David Colquhoun
First, let me say something about the case at hand, the crux of the nutrition mess: diet for diabetes and metabolic syndrome. In people with diabetes, blood glucose will be reduced as the dietary carbohydrate is reduced. The lower the carbohydrate, the better the reduction. Very low dietary carbohydrate is also accompanied by reduction in triglycerides, HDL. And these effects are seen even in the absence of weight loss (Gannon & Nuttal’s work). Now, before we think of an RCT, we recognize that the results of these single-meal, one-week or five-week trials follow sensibly from the underlying biochemistry of the glucose-insulin axis and, in fact, although the down-stream effects play out in lipid metabolism, it has been known forever that diabetes is a disease of carbohydrate intolerance. Perhaps most important, drugs are reduced as carbohydrate is lowered. Westman reports “I have been able to taper a patient off 150 U of insulin [very high dose] in a few weeks.” (This does not happen spontaneously and is not true of other diets. In any case, it is extreme but not isolated, not n=1).

There is no conflicting data. Any deleterious effect of dietary fat is seen only in the presence of high carbohydrate consistent with the theory of glucose-insulin control of lipid metabolism (Volek’s work). In addition, the effects are stable and persist as long as the regimen is carried out. At constant weight for 10 weeks, at hypocaloric conditions for different periods depending on the paradigm. It seems that the burden of proof is on anybody who says that this isn’t the “default” treatment for type 2 diabetes. No reason not to do an RCT but, clinically, the results are as clear-cut as a parachute jump. The hard information is there. It is simply being ignored for non-scientific reasons.

@David Colquhoun
David’s argument is absolutely dead on and the case for randomization is agreed upon, but that was not my objection. My objection to the RCT and Smith and Pell’s is that even before you show the mistakes in a particular study of associations, you have to ask why it was done. Opposite to what everybody says, if you think about it, hypotheses generate associations, not the other way around. Why test bacon? Why not, as the other guy did, pizza? Because you have a hypothesis about the causal nature of bacon on cancer. David describes this in the case of antioxidants: “that anti-oxidants might be good for you was never more than a plausible hypothesis, and like so many plausible hypotheses it has turned out to be a myth.” Whether antioxidants are or are not good for you, it is the plausibility of the hypothesis that will determine whether an association is useful. So, if you had an intuition about bacon and you did find an OR of 22, your hypothesis would have been supported. This looks like an exception because we are bombarded with OR=1.22 in dumb studies whereas it is a continuum. The cut-off for taking it seriously will be all the things on our mind when we start and, most of all, the hypothesis that generated the study, like maybe cigarette smoke contains otherwise known carcinogens. Does this make sense?

Mr Colquhoun
You comment above on “The Women’s Health Initiative Study” on HRT treatment (Rossouw et al, 2002) which you seem to regard as a good example of randomised testing.
I presume you failed to give the study sufficient consideration. The study was a benchmark in how not to do statistical studies. Here are some of the issues which are not addressed in the paper.
1. The study was ended early after a perceived increase in certain health risks. Nowhere does the paper consider that this might have been a spike in results which, if the test had continued, would have regressed to the mean. Rule 1 of a statistical study is you cannot stop just because you get an interesting result.
2. If we then look at the apparent randomness of the study:
a) The participants were selected and contacted by mailshot and some then volunteered to take part. No mention is made of take-up rate but I would expect it would be very low. There is no discussion about the fact that those involved in the trial were all volunteers. Using volunteers is a self-selection process. Who has time to take part in such a long-term study where you will be taking medicines daily? This is a very select group. The paper ignores this and gives little information on the background of the volunteers. Reading the study, my first question (knowing this was a US study and that most women have to pay for HRT drugs) was how many of the volunteers were women who were already taking HRT and joined the study because they would be given the drugs free.
b) Much is made of the fact that volunteers were selected randomly to receive either a placebo or the relevant drugs. A proper paper would recognise that HRT is a powerful drug with side effects. While the initial selection was random, most involved in the study would quickly be aware of whether they were taking the placebo or HRT. If, as I suggest above, many volunteers were those who were already taking HRT, then they would quickly realise if they were getting the placebo and drop out from the study. Can I suggest you look at the drop-out rates; this is a possible explanation for the high level of initial drop-out from those getting the placebo.
The study was far from randomised and ignored any consideration of the background of volunteers or that most volunteers would quickly realise if they were receiving the placebo or HRT. It then stopped when it got an interesting result.
These gaps mean the conclusions of the study, which show only minor increases in some health risks, have no validity.
Subsequent studies have highlighted the problems with a similar UK study with all the same problems again being stopped but highlighting completely different alleged health risks.
It is almost impossible to do any long-term randomised medical studies on people; add in testing a powerful drug and it is impossible.
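
The general point in item 1, that stopping as soon as a result looks interesting inflates false alarms, can be illustrated with a toy simulation. It assumes a naive rule (stop the first time |z| exceeds 1.96 at any of ten interim looks) applied to data with no true effect. Real trials normally guard against this with pre-specified stopping boundaries, and the sketch says nothing about how the study discussed above was actually monitored. All names and numbers are invented.

```python
import math
import random


def false_positive_rate(peek, n_trials=2000, looks=10, block=100, seed=2):
    """Fraction of null 'trials' declared significant at |z| > 1.96.

    peek=True  -- test at every interim look and stop at the first hit.
    peek=False -- test once, after all the data are in.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        total = 0.0
        z = 0.0
        significant = False
        for look in range(1, looks + 1):
            # Sum of `block` standard-normal observations, no true effect.
            total += rng.gauss(0.0, math.sqrt(block))
            z = total / math.sqrt(look * block)
            if peek and abs(z) > 1.96:
                significant = True
                break
        if not peek:
            significant = abs(z) > 1.96
        hits += significant
    return hits / n_trials


print("stop at first 'significant' look:", false_positive_rate(peek=True))
print("single test at the end:          ", false_positive_rate(peek=False))
```

With a single final test the false alarm rate stays near the nominal 5%, whereas the naive peeking rule raises it several-fold. That is the sense in which an uncorrected early stop can capture a chance spike.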