Lancet roundup and literature review

by Daniel on November 11, 2004

Well, the Lancet study has been out for a while now, and it seems as good a time as any to take stock of the state of the debate and wrap up a few comments which have hitherto been buried in comments threads. Lots of heavy lifting here has been done by Tim Lambert and Chris Lightfoot; I thoroughly recommend both posts, and while I’m recommending things, I also recommend a short statistics course as a useful way to spend one’s evenings (sorry); it really is satisfying to be able to take part in these debates as a participant, and, I would imagine, pretty embarrassing and frustrating not to be able to. As Tim Lambert commented, this study has been “like flypaper for innumerates”; people have been lining up to take a pop at it despite being manifestly not in possession of the baseline level of knowledge needed to understand what they’re talking about. (Being slightly more cynical, I suggested to Tim that it was more like “litmus paper for hacks”; it’s up to each individual to decide for themselves whether they think a particular argument is an innocent mistake or not.) Below the fold, I summarise the various lines of criticism and whether they’re valid or (mostly) not.

Starting with what I will describe as “Hack critiques”, without prejudice that they might in isolated individual cases be innocent mistakes. These are arguments which are purely and simply wrong and should not be made because they are, quite simply, slanders on the integrity of the scientists who wrote the paper. I’ll start with the most widespread one.

The Kaplan “dartboard” confidence interval critique

I think I pretty much slaughtered this one in my original Lancet post, but it still spread; apparently not everybody reads CT (bastards). To recap: Fred Kaplan of Slate suggested that because the confidence interval was very wide, the Lancet paper was worthless and we should believe something else, like the IBC total.

This argument is wrong for three reasons.

1) The confidence interval describes a range of values which are “consistent” with the model[1]. But it doesn’t mean that all values within the confidence interval are equally likely, so that you can simply pick whichever one you prefer. In particular, the most likely values are the ones in the centre of a symmetrical confidence interval. The single most likely value is, in fact, the central estimate of 98,000 excess deaths. Furthermore, as I pointed out in my original CT post, the truly shocking thing is that, wide as the confidence interval is, it does not include zero. You would expect to get a sample like this fewer than 2.5 times out of a hundred if the true number of excess deaths was less than zero (that is, if the war had made things better rather than worse).

2) As the authors themselves pointed out in correspondence with the management of Lenin’s Tomb:

“Research is more than summarizing data, it is also interpretation. If we had just visited the 32 neighborhoods without Falluja and did not look at the data or think about them, we would have reported 98,000 deaths, and said the measure was so imprecise that there was a 2.5% chance that there had been less than 8,000 deaths, a 10% chance that there had been less than about 45,000 deaths,….all of those assumptions that go with normal distributions. But we had two other pieces of information. First, violence accounted for only 2% of deaths before the war and was the main cause of death after the invasion. That is something new, consistent with the dramatic rise in mortality, and reduces the likelihood that the true number was at the lower end of the confidence range. Secondly, there is the Falluja data, which imply that there are pockets of Anbar, or other communities like Falluja, experiencing intense conflict, that have far more deaths than the rest of the country. We set these data aside in the statistical analysis because the result in this cluster was such an outlier, but it tells us that the true death toll is far more likely to be on the high side of our point estimate than on the low side.”

That is, the sample contains important information which is not summarised in the confidence interval, but which tells you that the central estimate is not likely to be a massive overestimate. The idea that the central 98,000 number might be an underestimate seems to have blown the minds of a lot of commentators; they just acted as if it Did Not Compute.

3) This gave rise to what might be called the use of “asymmetric rhetoric about a symmetric confidence interval”, but which I will give the more catchy name of “Kaplan’s Fallacy”. If your critique of an estimate is that the range is too wide, then that is one critique you can make. However, if this is all you are saying (“this isn’t an estimate, it’s a dartboard”), then intellectual honesty demands that you refer to the whole range when using this critique, not just the half of it that you want to think about. In other words, it is dishonest to title your essay “100,000 dead – or 8,000?” when all you actually have arguments to support is “100,000 dead – or 8,000 – or 194,000?”. This is actually quite a common way to mislead with statistics: say in paragraph 1 “it could be more, it could be less” and then talk for the rest of the piece as if you’ve established “it’s probably less”.
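The arithmetic behind point 1) can be checked with nothing more than the published interval. This is a sketch assuming the usual normal approximation; the intermediate figures are derived by me from the reported 8,000–194,000 range, not quoted from the paper:

```python
import math

lo, mid, hi = 8_000, 98_000, 194_000   # reported 95% interval and central estimate
se = (hi - mid) / 1.96                 # implied standard error, roughly 49,000
z = mid / se                           # the centre sits about 2 SEs above zero

# one-sided probability that the true excess-death figure is below zero
p_below_zero = 0.5 * math.erfc(z / math.sqrt(2))
# comes out at roughly 0.023 -- i.e. fewer than 2.5 chances in a hundred
```

Which is exactly the “fewer than 2.5 times out of a hundred” claim above: the interval is wide, but it sits almost entirely on the “war made things worse” side of zero.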

The Kaplan piece was really very bad; as well as the confidence interval fallacy, there are the germs of several of the other fallacious arguments discussed below. It really looks to me as if Kaplan had decided he didn’t want to believe the Lancet number and so started looking around for ways to rubbish it, in the erroneous belief that this would make him look hard-headed and scientific and would add credibility to his endorsement of the IBC number. I would hazard a guess that anyone looking for more Real Problems For The Left would do well to lift their head up from the Bible for a few seconds and ponder what strange misplaced and hypertrophied sense of intellectual charity it was that made Kaplan, an antiwar Democrat, decide to engage in hackish critiques of a piece of good science that supported his point of view.

The cluster sampling critique

There are shreds of this in the Kaplan article, but it reached its fullest and most widely-cited form in a version by Shannon Love on the Chicago Boyz website. The idea here is that the cluster sampling methodology used by the Lancet team (for reasons of economy, and of reducing the very significant personal risks for the field team) reduces the power of the statistical tests and makes the results harder to interpret. It was backed up (wayyyyy down in comments threads) by people who had gained access to a textbook on survey design; most good textbooks on the subject do indeed suggest that it is not a good idea to use cluster sampling when one is trying to measure rare effects (like violent death) in a population which has been exposed to heterogeneous risks of those rare events (i.e. some places were bombed a lot, some a little, and some not at all).

There are two big problems with the cluster sampling critique, and I think that they are both so serious that this argument is now a true litmus test for hacks; anyone repeating it either does not understand what they are saying (in which case they shouldn’t be making the critique) or does understand cluster sampling and thus knows that the argument is fallacious. The problems are:

1) Although sampling textbooks warn against the cluster methodology in cases like this, they are very clear that the reason it is risky is that it carries a very significant danger of underestimating the rare effects, not overestimating them. This can be seen with a simple intuitive illustration: imagine that you have been given the job of checking out a suspected minefield by throwing rocks into it.

This is roughly equivalent to cluster sampling a heterogeneous population; the dangerous bits are a fairly small proportion of the total field, and they’re clumped together (the mines). Furthermore, the stones that you’re throwing (your “clusters”) only sample a small bit of the field at a time. The larger each individual stone, the better, obviously, but equally obviously it’s the number of stones that you have that is really going to drive the precision of your estimate, not their size. So, let’s say that you chuck 33 stones into the field. There are three things that could happen:

a)By bad luck, all of your stones could land in the spaces between mines. This would cause you to conclude that the field was safer than it actually was.
b)By good luck, you could get a situation where most of your stones fell in the spaces between mines, but some of them hit mines. This would give you an estimate that was about right regarding the danger of the field.
c)By extraordinary chance, every single one of your stones (or a large proportion of them) might chance to hit mines, causing you to conclude that the field was much more dangerous than it actually was.

How likely is the third of these possibilities (analogous to an overestimate of the excess deaths) relative to the other two? Not very likely at all. Cluster sampling tends to underestimate rare effects, not overestimate them[2].

2) This problem, and other issues with cluster sampling (basically, it reduces your effective sample size to something closer to the number of clusters than to the number of individuals sampled), are dealt with at length in the sampling literature. Cluster sampling ain’t ideal, but needs must, and it is frequently used in bog-standard epidemiological surveys outside war zones. The effects of clustering on the standard results of sampling theory are known, and there are standard pieces of software that can be used to adjust (widen) one’s confidence interval to take account of these design effects. The Lancet team used one of these procedures, which is why their confidence intervals are so wide (although, to repeat, not wide enough to include zero). I have not seen anybody making the clustering critique who has any argument at all, from theory or data, which might give a reason to believe that the normal procedures are wrong for use in this case. As Richard Garfield, one of the authors, said in a press interview, epidemics are often pretty heterogeneously distributed too.
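For the record, the standard adjustment works like this. This is a sketch of the textbook design effect formula, not the Lancet team’s actual software, and the cluster size and intra-cluster correlation below are illustrative numbers of my own, not the study’s:

```python
import math

def clustered_se(se_srs, cluster_size, icc):
    """Widen a simple-random-sample standard error by the design effect.

    deff = 1 + (m - 1) * rho, where m is the average cluster size and
    rho is the intra-cluster correlation of the outcome being measured.
    """
    deff = 1 + (cluster_size - 1) * icc
    return se_srs * math.sqrt(deff)

# illustrative only: 30 households per cluster, modest ICC of 0.05
widened = clustered_se(1.0, 30, 0.05)   # about 1.57x the naive SE
```

Even a modest intra-cluster correlation inflates the confidence interval by half as much again, which is exactly why the published interval is as wide as it is.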

There is a variant of this critique which is darkly hinted at by both Kaplan and Love, but neither of them appears to have the nerve to say it in so many words[3]. This would be the critique that there is something much nastier about the sample; that it is not a random sample, but is cherry-picked in some way. In order to believe this, if you have read the paper, you have to be prepared to accuse the authors of telling a disgusting barefaced lie, and presumably to accept the legal consequences of doing so. They picked the clusters by the use of random numbers selected from a GPS grid. In the few cases in which this was logistically difficult (read: insanely dangerous), they picked locations off a map and walked to the nearest household. There is no realistic way in which a critique of this sort can get off the ground; in any case, it affected only a small minority of clusters.

The argument from the UNICEF infant mortality figures

I think that the source for this is Heiko Gerhauser, in various weblog comments threads, but again it can be traced back to a slightly different argument about death rates in the Kaplan piece. The idea here is that the Lancet study finds a prewar infant mortality rate of 29 per 1000 live births and a postwar infant mortality rate of 54 per 1000 live births. Since the prewar infant mortality rate was estimated by UNICEF to be over 100, this (it is argued) suggests that the study is giving junk numbers and all of its conclusions should be rejected.

This argument was difficult to track down to its lair, but I think we have managed it. One weakness is similar to the point I’ve made above: if you believe that the study has structurally underestimated infant mortality, then isn’t it also likely to have underestimated adult mortality? The authors do discuss a few reasons why the movement in infant mortality might be exaggerated (mainly, issues of poor recall by the interview subjects), though, and it is good form to look very closely at any anomalies in data.

Basically, the UNICEF estimate is quoted as a 2002 number, but it is actually based on detailed, comprehensive, on-the-ground work carried out between 1995 and 1999 and extrapolated forward. The method of extrapolation is not one which would take into account the fact that 1999 was the year in which the oil-for-food program began to have significant effects on child malnutrition in Iraq. No detailed on-the-ground survey has been carried out since 1999, and there is certainly no systematic data-gathering apparatus in Iraq which could give any more solid number. The authors of the study believe that the infant mortality rates in neighbouring countries are a better comparator than pre-oil for food Iraq, and since one of them is Richard Garfield, who was acknowledged as the pre-eminent expert on sanctions-related child deaths in the 1990s, there is no reason to gainsay them.

I’d add to Chris’s work a theory of my own, based on the cluster sampling issue discussed above. Infant mortality is rare, and it is quite possibly heterogeneously clustered in Iraq (not least because, post-war, a part of the infant mortality was attributed to babies being born at home because it was too dangerous to go to hospital). So it’s not necessarily the case that one needs an explanation of why infant deaths might have been undersampled in this case. Since this undersampling would tend to underestimate infant mortality both before and after the war, it wouldn’t necessarily bias the estimate of the relative risk ratio and therefore the excess deaths. I’d note that my theory and Chris’s aren’t mutually exclusive; I suspect that his is the main explanation.

We now move into the area of what might be called “not intrinsically hack” critiques. These are issues which one could raise with respect to the study which are not based on either definite or likely falsehoods, but which do not impugn the integrity of the study, and which are not themselves based on evidence strong enough to make anyone believe that the study’s estimates were wrong unless they thought so anyway.

There are two of these that I’ve seen around and about.

The first might be called the “Lying Iraqis” theory. This would be the theory that the interview subjects systematically lied to the survey team. In fact, the team did attempt to check against death certificates in a subsample of the interviews and found that in 81% of cases, subjects could produce them. This would lead me to believe that there is no real reason to suppose that the subjects were lying. Furthermore, I would suspect that if the Iraqis hate us enough to invent deaths of household members to make us look bad in the Lancet, that’s probably a fairly serious problem too. However, the possibility of lying subjects can’t be ruled out in any survey, so it can’t be ruled out in this one, so this critique is not intrinsically hackish. Any attempt to bolster it either with an attack on the integrity of the researchers, or with a suggestion that the researchers mainly interviewed “the resistance” (they didn’t), however, is hack city.

The second, which I haven’t really seen anyone adopt yet, although some people looked like they might, could be called the “Outlier theory”. This is basically the theory that this survey is one gigantic outlier, and that a 2.5% probability event has happened. This would be a fair enough thing to believe, as long as one admitted that one was believing in something quite unlikely, and as long as it wasn’t combined with an attack on the integrity of the Lancet team.

Finally, we come onto two critiques of the study which I would say are valid. The first is the one that I made myself in the original CT post; that the extrapolated number of 98,000 is a poor way to summarise the results of the analysis. I think that the simple fact that we can say with 97.5% confidence that the war has made things worse rather than better is just as powerful and doesn’t commit one to the really quite strong assumptions one would need to make for the extrapolation to be valid.

The second one is one that is attributable to the editors of the Lancet rather than the authors of the study. The Lancet’s editorial comment on the study contained the phrase “100,000 civilian deaths”. The study itself counts excess deaths and does not attempt to classify them as combatants or civilians. The Lancet editors should not have done this, and their denial that they did it to sensationalise the claim ahead of the US elections is unconvincing. This does not, however, affect the science; to claim that it does is the purest imaginable example of argumentum ad hominem.

Finally, beyond the ultra-violet spectrum of critiques are those which I would classify as “beyond hackish”. These are things which anyone who gave them a moment’s thought would realise are irrelevant to the issue.

In this category, but surprisingly and disappointingly common in online critiques, is the attempt to use the IBC numbers as a stick to beat the Lancet study. The two studies are simply not comparable. One final time; the Iraq Body Count is a passive reporting system[4], which aims to count civilian deaths as a result of violence. Of course it is going to be lower than the Lancet number. Let that please be an end of this.

And there are a number of odds and ends around the web of the sort “each death in this study is being taken to stand for XXYY deaths and that is ridiculous”. In other words, arguments which, if true, would imply that there could be no valid form of epidemiology, econometrics, opinion polling, or indeed pulling up a few spuds to see if your allotment has blight. This truly is flypaper for innumerates.

I would also include in this category attempts like that of the Obsidian Order weblog to chaw down the 98,000 number by making more or less arbitrary assumptions about what proportion of the excess deaths one might be able to call “combatants” and thus people who deserved to die. This is exactly what people accuse the Lancet of doing; it’s skewing a number by means of your own subjective assessment. Not only is there no objective basis for the actual subjective adjustments that people make, but the entire distinction between combatants and civilians is one which does not exist in nature. As a reason for not caring that 98,000 people might have died, because you think most of them were Islamofascists, it just about passes muster. As a criticism of the 98,000 figure, it’s wretched.

Finally, there is the strange world of Michael Fumento, a man who is such a grandiose and unselfconscious hack that he brings a kind of grandeur to the role. I can no more summarise what a class A fool he’s made of himself in these short paragraphs than I could summarise King Lear. Read the posts on Tim’s site and marvel. And if your name is Jamie Doward of the Guardian, have a word with yourself; not only are you citing blogs rather than reading the paper, you’re treating Flack Central Station as a reliable source!

The bottom line is that the Lancet study was a good piece of science, and anyone who says otherwise is lying. Its results (and in particular, its central 98,000 estimate) are not the last word on the subject, but then nothing is in statistics. There is a very real issue here, and any pro-war person who thinks that we went to war to save the Iraqis ought to be thinking very hard about whether we made things worse rather than better (see this from Marc Mulholland, and a very honourable mention for the Economist). It is notable how very few people who have rubbished the Lancet study have shown the slightest interest in getting any more accurate estimates; often you learn a lot about people from observing the way that they protect themselves from news they suspect will disconcert them.

Footnotes:
[1]This is not the place for a discussion of Bayesian versus frequentist statistics. Stats teachers will tell you that it is a fallacy and wrong to interpret a confidence interval as meaning that “there is a 95% chance that the true value lies in this range”. However, I would say with 95% confidence that a randomly selected stats teacher would not be able to give you a single example of a case in which someone made a serious practical mistake as a result of this “fallacy”, so I say think about it this way.
[2]Pedants would perhaps object that the more common mines are in the field, the less the tendency to underestimate. Yes, but a) by the time you got to a stage where an overestimate became seriously likely, you would be talking not about a minefield, but a storage yard for mines with a few patches of grass in it and b) we happen to know that violent death in Iraq is still the exception rather than the norm, so this quibble is irrelevant.
[3]And quite rightly so; if said in so many words, this accusation would clearly be defamatory.
[4]That is, they don’t go out looking for deaths like the Lancet did; they wait for someone to report them. Whatever you think about whether there is saturation media coverage of Iraq (personally, I think there is saturation coverage of the green zone of Baghdad and precious little else), this is obviously going to be a lower bound rather than a central estimate, and in the absence of any hard evidence about casualties there is no reason at all to suppose that we have any basis other than convenient subjective air-pulling to adjust the IBC count for how much of an undersample we might want to believe they are making.

To be honest, the editor of the Lancet leaves a slightly funny taste in my mouth. By being so blatantly political, he gives a lot of succour to the people who are claiming that there were defects in the refereeing process, a point which I haven’t covered here because it would only matter if there were flaws in the actual paper, which there aren’t. (Some people, mainly people with no real idea about science, seem to regard “peer review” as magic oofle dust that has to be sprinkled on a piece of research to turn the words into Science). I agree with the Lancet guy that violent death is a genuine public health problem in Iraq and therefore the factors which cause it are a proper object of study for the Lancet, but, in any case, it was too late in the campaign to have ever made it as a real campaign issue, so why bother? God I feel like Fred Kaplan after all that.

I expect to see a comment, other than this one, which ignores everything else in the main post and criticizes you for being a “moral relativist” due to this statement, “…the entire distinction between combatants and civilians is one which does not exist in nature.”

You can count me among the innumerate (no pun intended), so my comment can immediately be discounted. However, I would be interested to know in which class of hackery you would put it.

According to the Guardian article, the study was carried out in “33 randomly-chosen neighbourhoods of Iraq representative of the entire population.” How much confidence do they have about this? In a war zone, and in a country closed to the outside world by sanctions for a dozen years, how accurately can one estimate demographic variables such as population density, total population, employment, illness, etc. Employment and illness might seem to be irrelevant, since the study is measuring violent death, but maybe if unemployment is high in one town, there are more people at home when the bombs fall. Maybe in one region, people are more likely to go to the hospital when they are sick. Seems like small variations in these variables could result in significant differences in the confidence interval.

But what do I know, I’m not a statistician. Yet let me say this: many people found this study bogus for the simple reason that extrapolating large numbers from small numbers seems to almost always be pure guesswork. The analogy that first occurred to me was crowd size estimates: the people who put on the rally estimate that 100,000 people were there, while the cops put it at closer to 20,000. Presumably they both have some rational basis for their estimate, but the results are so divergent that there seems to be no science involved at all.

Anyway, those are my thoughts on why people seem to have such a hard time with this, rather than (in most cases) an unwillingness to admit that the war could have killed people.

George; the procedure for ensuring a random and representative sample is detailed in the paper and it’s not hard to understand – it’s nuts and bolts stuff rather than anything specifically mathematical. It looks good to me and I haven’t seen anyone seriously criticising it.

I don’t agree that this is like estimating crowd sizes. As I say, I don’t like the extrapolated number, but the relative risk ratio is solid, and that’s the big thing for me. It would be very difficult to get this result from a random sample if the death rate had not risen.

If people were honest supporters of improving Iraq they would look at these numbers and try to come up with some solutions to improve the situation instead of insisting that everything is a-ok. At the very least some introspection is called for. As Daniel says, the exact number isn’t as important as the fact that things are much worse now than they were before.

Hi,
I was curious about something. Is the “cluster sampling method” anything like a Monte Carlo technique?
I am glad you have taken the time to debunk the hack critiques that many of us have seen. Personally, if a journal like Lancet publishes a study of this nature, I would be very careful in critiquing and rubbishing it.

This was really useful, thanks. Now if only I could get my students to actually read news articles referencing the Lancet study so they can begin the inevitable process of trying in vain to rubbish it. I’m (almost) envious of those of you who live amongst the rubbishers, at least they care enough to engage in self-serving and fallacious reasoning.

First let me take a minute or two for some wild enthusiastic cheering. Good job DD. It more than makes up for my disappointment when you attacked Steve Landsburg not on his libertarian sins, but for that paper on quantum game theory.

Okay, two comments. First, why can’t major news organizations hire people like you to explain studies like this? (Consider that to be part of the cheering). Second, why can’t they manage to spend more than thirty seconds mentioning them? This story was a one-day wonder in the American press–one story, usually containing some dismissive remarks, and then that’s all the time we had for investigating the utterly irrelevant story about whether we’ve made things worse in Iraq.

Several of the external links in this article have spaces around the href attribute, inside the quotes. Most web browsers just strip them out, but they confuse our news aggregator and probably many others.

More generally, I’d agree with Daniel that most of the objections to the study either misunderstand the methods or just assert that because it doesn’t conform to an ideal design it can therefore be dismissed out of hand as garbage. This kind of objection is common in the first year of graduate social science programs, where people who have yet to do any empirical research find it very, very easy to trash the work of others who have. Later they tend to realise that the available methods really are very powerful, all things considered, and that even a modest empirical paper takes a lot of work to pull off. The fact that these guys did work like this under the circumstances they faced is really remarkable. Of course it’s not the last word on the subject, but it’s a real contribution with a striking finding. To write it off with malformed (or stock) objections is sheer ideology.

Browsing around, I found there was a poll in Iraq commissioned by the American Enterprise Institute which came up with interesting results, reported or misreported by Karl Zinsmeister in the September 10 2003 Wall Street Journal opinion page online at the following location–

Anyway, according to Karl, half of the Iraqis knew someone killed by Saddam’s security forces (though someone else at the CASI website said that Karl was misrepresenting the poll question—50 percent said they knew someone who either died in one of Saddam’s wars or were murdered by Saddam’s security forces).

Karl goes on to say less than 30 percent (which presumably means nearly 30 percent) knew someone who died in the spring fighting.

Can you tell anything about relative death rates from this kind of thing? I fell back on my Poisson-distribution-for-dummies approach, assuming that each person in Iraq knows the same number of people, and setting either 0.7 (for the invasion) or 0.5 (for the Saddam years) equal to the probability of nobody dying in a given group of known people. If I did it right, you find that the number of people who died violently under Saddam is only twice the invasion number, which seems awfully high for the invasion figure unless the Saddam total has been inflated. I’m not clever enough to come up with alternative models that might be more realistic, and I don’t know if the Poisson distribution approach makes sense.
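Here is the little calculation written out, for anyone who wants to check my assumptions (everyone knows the same number of people, and deaths among one’s acquaintances follow a Poisson distribution):

```python
import math

p_know_nobody_war = 0.7      # ~70% knew nobody killed in the spring fighting
p_know_nobody_saddam = 0.5   # 50% knew nobody killed under Saddam

# if deaths among acquaintances are Poisson(lam), P(know nobody) = exp(-lam),
# so lam = -ln(P); the unknown "number of people known" cancels in the ratio
lam_war = -math.log(p_know_nobody_war)        # about 0.36
lam_saddam = -math.log(p_know_nobody_saddam)  # about 0.69
ratio = lam_saddam / lam_war                  # about 1.9
```

So the Saddam-era toll comes out at only about twice the invasion toll, which is the result that surprised me.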

My perhaps incorrect approach aside, I’m surprised that only 50 percent of Iraqis knew someone either killed in the wars or murdered by Saddam. If you took the 300,000 figure for the murders and added an equal number of war casualties, that’s 600,000 out of 24 million (these days–the population has grown, presumably). The NYC metro area has around 15-20 million, I think, and out of the 3000 who died on 9/11 I didn’t know anyone, though I knew people who knew people who died. But I think that if, say, 400,000 had died I’d have known quite a few, and probably most people around here would have known a victim.
The 50 percent number might make sense if most of the victims came from a subset of the population that didn’t mingle with the rest, I’m guessing, or if most people in Iraq don’t know very many people. (Which seems silly).

I throw these amateurish efforts out there in hopes that someone (ahem) who actually understands statistics would be willing to say what, if anything, can be learned from this earlier poll.

dquared and kieran: thanks. I will have to take your word for it, though. If a statistics degree is necessary for one to swallow the claim that 61 deaths can be extrapolated to hundreds of thousands, you guys in the reality-based community have got a problem with the rest of us.

Add another critique to the study – the model the researchers fit allowed for an increase (and only an increase, based on their write-up – they never explicitly write down their model) in the death rate after the invasion. Thus the lowest estimate for excess deaths they could come up with would be 0 – no chance for a decrease in deaths. I don’t have enough experience with survival analysis to know how this will affect the size of confidence bands, but it is suspect. It’s also another example of the authors’ bias.

Mike, that’s simply not true. They did explain what their model was and it did allow for a decrease. They calculated a relative risk ratio associated with the intervention (that’s their model) and if that risk ratio had been less than 1 (it wasn’t) then this would have corresponded to a falling death rate. They in fact did observe a falling death rate in the Kurdish provinces and reported it.

Kieran brings up an interesting point, and not one limited to the social sciences. In my chosen field (microbiology) I found similar patterns, and certainly participated in them. In the first year of grad school, every paper you read is crap, and any moron could do it better. By the fifth year, you start to see how hard it is to do a really great piece of research, and you learn not to throw out too many babies with the bathwater. Maybe if all the whiners and criers (yeah, you Fumento!) would actually spend some time doing research, and learning about what they are critiquing, they might have a bit more perspective.
Admittedly, however, the ones who get paid to publish this drivel probably have a leg up on us poor oppressed academics!

I think George is in for a shock when he finds out how economic indicators here in America (e.g. unemployment) are calculated, since the extrapolations from data are just as dramatic.

But it should be obvious this kind of thing is basically sound. We regard it as a news story when opinion polls (which extrapolate from 1000 voters to 100 million) are off by 3 or 4%. If extrapolation was as bad as people are claiming here, getting within 10% would be miraculous, which it pretty clearly isn’t. If this study is off by 3 or 4% in its mean figures, or even 10 or 15%, that means the Iraq war has led (so far at least) to waaaay more deaths than it has prevented.

You might do your readers a service by providing the actual numbers on infant mortality, in order to show just how badly off the Lancet study is.

The Lancet study measured an infant mortality rate in the 18 months prior to the war (Oct 2001-Mar 2003) of 29/1000. UNICEF reported the 2002 rate at 102/1000.
Richard Garfield’s study, to which you link, reports (pp. 7-8) that:

“In the context of an increase in the number of total visits to hospitals and a rapid decline in malnutrition since 2000, it is reasonable to assume that there has indeed been a decline from the very high levels of child mortality in the late 1990s. The decline from a high rate of 131/1,000 among under five-year-olds during 1995-1999 may well be to a rate of 90–100/1,000.”

Since under-five mortality and infant mortality track each other, it is reasonable to assume that infant mortality dropped by a similar proportion, which would put Garfield in agreement with UNICEF.

There is absolutely no way that the 2002 Iraqi infant mortality rate was 29/1000. That would be the best rate ever recorded in Iraq. We have to conclude that either the Lancet study or all the other pre-war studies mis-measured the pre-war infant mortality rate. Since the Lancet study is the odd one out, Occam’s razor says that it’s the flawed study.

The really sick thing about all this is that the same people who now trumpet the Lancet study and its very low measure of pre-war infant mortality are the very same people who trumpeted the other studies’ high rates. The only commonality here is those individuals’ opposition to US policy in Iraq. They portray pre-war Iraq either as a charnel house or as Eden, depending on which gives the best advantage of the moment.

Dead children don’t matter, only screwing over their internal political enemies does.

“…people have been lining up to take a pop at it despite being manifestly not in possession of the baseline level of knowledge needed to understand what they’re talking about.”

Gosh, that makes it like most blog comments, alas. It frequently drives me crazy, to be sure.

To take a random pop myself: the stuff early in this post about “mines” makes me wonder about its relevance, given that an anti-tank mine is set off by a magnetic field or by weight, so “tossing a rock” at one is irrelevant. The example seems to rest on an unreal idea of a “mine”, and once an example that purports to represent the real world rests on a non-existent device, whatever proceeds from it makes no sense to me. Anti-personnel mines are even trickier in their variability, but given their imaginary use in this theoretical example, I’m uninterested in pursuing the fantasy of an accurate conclusion any further. Any argument based on things that don’t exist, however popularly people imagine them, is suspect in my book.

The thing is, support for the Iraq invasion has become a matter of religion for many of its supporters, especially the so-called leftists among them, so that every time negative news about it is published it has to be denied. Which is why you see these increasingly desperate attempts to rubbish the study.

Then there are the professionals, who know their objections are false but who don’t care as long as they can mislead others into thinking the Lancet piece was rubbish.

Shannon, this objection is dealt with in my piece and (with much greater thoroughness) at Chris Lightfoot’s site. UNICEF did not “report” the 2002 rate as anything. They extrapolated a 2002 rate based on the 1999 research. And from then on in, you are citing Richard Garfield as an authority in an argument against Richard Garfield. You’re also exaggerating the extent to which a mortality rate of 29 is an outlier; given the very wide confidence intervals on the estimation process as a whole (and the substantial bias toward underestimation, which, as a “service” to your own readers you have consistently failed to acknowledge), it strikes me that a 1999 rate of about 100, a guesstimated fall of 40% and a confidence interval of +/- 20 would include the sampled figure of 29 at its lower end.

I see that you are no longer defending the clustering critique, for which thanks, but you seem to now be advancing a much less intellectually respectable critique that “Occam’s Razor” tells us that there are strange murky flaws in the methodology which you can’t identify but nevertheless insist are there. For which, not thanks. I see you are also continuing to attribute the worst of motives to people who produce results that you disagree with; you are presumably doing this because you are judging the behaviour of others by your own standards.

Gary: I define a “mine” in the simplest way possible; as “a stochastic finite state automaton defined on the integer field with mutually exclusive states sampled conditionally on a matrix M”. Doesn’t everybody?

But it should be obvious this kind of thing is basically sound. We regard it as a news story when opinion polls (which extrapolate from 1000 voters to 100 million) are off by 3 or 4%

Exactly, and you never see people attempting to rubbish polls especially when the numbers give pretexts for other ideological conclusions. Last year’s EU survey about European opinions on the war in Iraq and about Israel was a good example of that.
Or the “moral values” election poll. Everyone has drawn all sorts of conclusions, but I haven’t seen anyone disputing the actual numbers.

People do think there is a lot of coverage about Iraq because there’s always something in the news about it, but how much of it has been about civilian deaths? So, because they don’t hear or see it on tv, it must not exist. Civilian deaths in Iraq, oh, it must be some foggy concept that can be easily put aside. (Possibly with “anyway, Saddam would have killed more” – now that’s a scientific method the Lancet authors should learn something from!).

The combination of attacks on Lancet is also so self-contradictory. I remember the IBC website was considered as reliable as Indymedia and now it’s become more authoritative than a peer-reviewed medical journal.

I don’t even understand the argument about UNICEF infant mortality rates presumably proving the Lancet study is all wrong.

How can the criticism simultaneously be that the Lancet underestimates those pre-war rates and that it must therefore have overestimated the war casualties?

Another way of looking at the opinion poll example is that a 5 per cent swing (50 people in a sample of 1000) in a well-conducted survey is (correctly) taken as strong evidence that somewhere between 2 million and 8 million people in an electorate of 100 million will change their votes.
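The opinion-poll arithmetic above follows from the standard formula for the sampling error of a proportion. A quick sketch, under the usual normal approximation; the function name and the specific figures below are illustrative, not taken from any of the surveys under discussion:

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of a 95% confidence interval for a sampled proportion p
    estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# A 50/50 split in a 1,000-person poll gives the familiar +/- 3% or so...
moe = margin_of_error(0.5, 1000)
print(f"margin of error: +/- {moe:.1%}")

# ...so a 5-point swing (50 respondents out of 1,000), scaled up to an
# electorate of 100 million, lands roughly in the 2-8 million range
# mentioned in the comment above.
electorate = 100_000_000
low, high = (0.05 - moe) * electorate, (0.05 + moe) * electorate
print(f"implied swing: {low/1e6:.1f}m to {high/1e6:.1f}m voters")
```

The same formula is why quadrupling the sample size only halves the interval: precision scales with the square root of n.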

For me the big negative about the Lancet survey is that they had to do it in the first place – the Geneva Conventions insist that occupying forces keep track of these figures. Of course there is little chance of the US doing this; rather, they just assume that everyone blown up by a bomb was probably an insurgent anyway.

Anyway, according to Karl, half of the Iraqis knew someone killed by Saddam’s security forces (though someone else at the CASI website said that Karl was misrepresenting the poll question: 50 percent said they knew someone who either died in one of Saddam’s wars or was murdered by Saddam’s security forces).

Karl goes on to say less than 30 percent (which presumably means nearly 30 percent) knew someone who died in the spring fighting.

Can you tell anything from relative death rates from this kind of thing?

I think the answer would be “maybe, but not in any straightforward way”. Saddam was in power for twenty years, and within that period his murders were concentrated in 1985-90 (Kurds) and 1991 (Shia). The coalition, on the other hand, has only been in Iraq for eighteen months. So I think it would be difficult to straighten out the time periods and get an apples-to-apples comparison. I’m not saying it couldn’t be done, but I suspect you would have to make some truly heroic assumptions.

“In the past year and a half, has your household been directly affected by violence in terms of death, handicap, or significant monetary loss? (Close family member, up to 4th degree)”

Now, suppose that there are ten people in an average “close” family group; then these figures imply that about 2% of individuals have been “directly affected by violence” (because each person who is killed or injured or whatever is known to about ten others). That gives a raw number of about 600,000 victims; we can’t, of course, tell how many of those were killed, how many injured and so forth (or who did the killing and injuring).

But the important point about this is that, whatever the difficulties of conducting an opinion poll in Iraq and the imprecision of the number which comes out, it gives an independent estimate of the number of casualties — and that estimate is of about the same magnitude as Roberts et al.’s estimate.
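The scaling in that back-of-envelope estimate (divide the share of affected respondents by the size of a “close” family group, then multiply by population) can be sketched as below. The ~20% affected share and 27 million population are my assumptions, chosen to be consistent with the ~2% and ~600,000 figures quoted above:

```python
def implied_victims(share_affected, family_size, population):
    """Scale a survey's 'directly affected' share down by family size
    (each victim is known to ~family_size people) to get an implied
    count of individual victims."""
    return share_affected / family_size * population

# Assumed inputs, all rough, as in the comment above.
n = implied_victims(0.20, 10, 27_000_000)
print(f"implied victims: {n:,.0f}")  # same order as the ~600,000 quoted
```

The answer is obviously only an order-of-magnitude check; the family-size divisor is the weakest link in the chain.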

Starting with the IBC estimate of c.15,000, then filling in the gaps – indirectly caused deaths (I doubt fewer than 5,000) and non-civilian deaths (surely at least as many as the IBC figure) – gives a number well into the tens of thousands.

Since the IBC figures are by design systematically at the low end, the indirect figure is probably low and the military deaths figure very likely out by a factor of two or more, there seems nothing at all implausible about the Lancet numbers.

I am surprised not to have heard more about the numbers Saddam is not killing any more. I don’t think they would wash anyway but they would be hard to count and therefore to dismiss. Has this point now been conceded?

I think the fundamental problem with this study is not in the area of statistics, but rather moral interpretation. Is it not true that most of the excess deaths have occurred, not as a result of the original invasion and overthrow of the tyrant, but as a consequence of the “insurgency’s” efforts to retake the country and prevent it from becoming a free democracy?

Daniel, I’m not statistically educated, and the Lancet site doesn’t seem to allow a non-subscriber to read the actual study, so excuse me if these questions are addressed in the study or are simply ignorant.

1) Does the study distinguish between combatants and civilians (I can’t tell for sure from the posts and comments here)? During the first Gulf War, the US estimated Iraqi military deaths at 100,000. Is it reasonable to believe that Iraqi combat deaths from the current war would have been comparable or even higher? I.e., is it possible that the figure of 100,000 deaths reported by the Lancet is indeed under-reported but consists largely of combatants?

2) The CIA Factbook reports Iraqi population at 25.3 million and a death rate of 5.66 per 1,000, which I think suggests that about 150,000 people might die in Iraq in an ordinary year. Does this mean that 100,000 excess deaths represents a 2/3 jump in the death rate (I may be misusing figures, so please correct me)? It strikes me that even a well-functioning city would have trouble finding morgue and grave space to deal with such a number of additional bodies, and so it seems that, under the chaotic wartime conditions in Iraq’s cities, this amount of excess deaths would have resulted in heaps of bodies or mass graves. If so, I would think that America’s opponents (whether the Baathists, Zarqawi, or whomever) would have every incentive to film these corpses and smuggle out the video much as they do with the beheadings. Obviously, this is not a critique of the study but simply uninformed intuition. Am I thoroughly off base, though; is it reasonable to conclude that these excess deaths were more or less absorbed by the Iraqi infrastructure?
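For what it’s worth, the arithmetic in question 2 can be checked directly. Taking the Factbook rate as 5.66 deaths per 1,000 per year (a percentage reading would give over a million deaths a year, inconsistent with the 150,000 the commenter derives) and the Lancet point estimate over roughly 18 months, the implied jump comes out nearer 45% than two-thirds:

```python
population = 25_300_000          # CIA Factbook figure quoted above
crude_death_rate = 5.66 / 1000   # deaths per person per year
excess_deaths = 98_000           # Lancet point estimate
study_period_years = 1.5         # roughly 18 months

baseline_per_year = population * crude_death_rate     # ~143,000 deaths/year
excess_per_year = excess_deaths / study_period_years  # ~65,000 deaths/year

print(f"baseline deaths/year: {baseline_per_year:,.0f}")
print(f"relative increase:    {excess_per_year / baseline_per_year:.0%}")
```

Either way, the order of magnitude is the point: the excess is comparable to a large fraction of all normal mortality in the country.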

I’m just asking because these questions occurred to me; I don’t mean to be combative toward the study or your defenses thereof, or to suggest that any number of deaths should be minimized or dismissed.

Way to go, Brett. Fall back on the “other guy made me do it” argument. Americans have no choice but to use helicopter gunships and 500-2000 lb bombs in Iraqi cities.

It is possible, you know, to blame both sides in a war for their brutality.

Thanks for the response, Daniel. I agree that it’d take a lot of assumptions to get any number out of the earlier poll, but it still makes me wonder what sort of assumptions would be required to give an answer of many hundreds of thousands for the Saddam years and low tens of thousands (military and civilian) for the Spring 2003 invasion, if 50 percent knew people who died under Saddam and nearly 30 percent knew people who died in the invasion.

Well, obvious question is: If you get blown up while waiting in line to join the Iraqi police, or gunned down execution style on your way home from boot camp, why is it OUR fault?

Yes, the case can be made that we ought to be willing to accept more casualties on our side, in an effort to reduce civilian casualties. (Good luck selling that idea to the American people, by the way!) But the fact is, we already liberated Iraq, and now we’re fighting a force that’s trying to enslave the country again, and not only are THEY responsible directly for a lot of the deaths Lancet is complaining of, they’re indirectly responsible for all of them, because there wouldn’t be a war if not for their aggression.

Tom: 1) No, the survey does not distinguish. I’m guessing that not a small reason for this would be the safety implications for the researchers if they went round Fallujah and Sadr City with a clipboard asking “Does your household contain members of the resistance?”. But in any case the property of being a combatant is a political one, not a natural one.

2) Yes; the estimate suggests about this order of magnitude increase in the death rate. But no, I don’t think that this would cause the death infrastructure to collapse. I would guess that the order of magnitude of the effect is equivalent to a medium-sized flu epidemic.

You know Brett, what a great argument!!!! If only the US had capitulated to Osama, 3000 people would still be alive. Obviously it’s the fault of the US! If the Israelis would just leave, they wouldn’t suffer terrorist attacks. Every death is their own fault! If only Native Americans had surrendered even faster, they wouldn’t have been so thoroughly wiped out!

If you hit a hornet’s nest with a stick, are you not at least partially responsible for the increase in stings?

We knew going in that there would be an insurgency, there was inadequate planning for it, and obviously the administration considered the losses that would come from an insurgency to be at an acceptable level. So even if a significant number of those deaths were due to the actions of insurgents, it wouldn’t show that the administration bore no responsibility for them.

Perhaps more important is that regardless of whether the cause of death was the bullet or bomb of an American or an insurgent, the numbers matter for determining whether this war satisfies the macroproportionality condition on standard versions of the just war theory. It is starting to look like they don’t. Even those of us on the left gracious enough to play along and imagine that there was somewhere in the shifting sands a just cause for this war and to set aside speculation about the purity of the President’s motives can cite the Lancet number as solid evidence that this war quite simply doesn’t pass the tests imposed by traditional just war theory.

I don’t think your explication of the problems of cluster sampling with rare events makes clear the main points:
1. Most of the time there will be a small underestimate.
2. Rarely, there will be a large overestimate.
3. The estimate is unbiased on average.

To take your example (nice example, BTW): suppose there is a 1% chance of hitting a mine with a stone toss. 10 stones are tossed. Then: 90% of the time no stones hit mines, leading to an estimate of zero instead of 1%. 10% of the time, 1 stone hits a mine*, leading to an estimate of 10%. On average, the estimate is about 1% (although we’ll never actually get this estimate with 10 stones).

Admittedly, this is just a pedantic point, not a devastating critique of the Lancet study: the standard errors (confidence interval) reported by the authors presumably account for the uncertainty.

FN * Actually, the chance of 1 *or more* stones hitting a mine is 1-.99^10. I’m ignoring the possibility of hitting 2 or more mines, since it is tiny, and I’m rounding, but calculating things exactly would make little difference.
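The footnote’s point is easy to check by simulation. A minimal Monte Carlo sketch of the stone-tossing example (the 1% mine rate and 10 stones are from the example above; the seed and trial count are arbitrary):

```python
import random

def estimate_mine_rate(p_mine=0.01, n_stones=10):
    """Toss n_stones at random spots; estimate the mine rate as the
    fraction of tosses that hit a mine."""
    hits = sum(random.random() < p_mine for _ in range(n_stones))
    return hits / n_stones

random.seed(0)
trials = [estimate_mine_rate() for _ in range(100_000)]

# Most trials see no hits at all and so estimate exactly zero...
share_zero = sum(t == 0 for t in trials) / len(trials)   # ~0.90
# ...but the rare large overestimates balance them out on average.
mean_estimate = sum(trials) / len(trials)                # ~0.01

print(f"trials estimating zero: {share_zero:.2f}")
print(f"mean estimate:          {mean_estimate:.4f}")
```

This reproduces the three points above: the typical single survey underestimates, the occasional one grossly overestimates, and the estimator is unbiased on average.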

You are ignoring the dynamic issues, and discussing the study as if it were a cross-sectional (point in time) analysis.

Other studies which use this technique don’t push it nearly as hard, and apologize for it more.

For example, Grein et al. conduct a very similar study, but it is superior because:
1. The recall period is shorter (less than half).
2. They are careful to inquire about those entering and leaving the household.

Also, their reference period is clearer, and I’d guess that in an urbanized society like Iraq, people enter and leave the household more often (schooling, military service, hospitals, etc.) than in the Angolan refugee camps studied by Grein et al.

Nonetheless, Grein et al acknowledge (referring to the same method used by the Lancet study) that “the WHO/EPI sampling technique (originally conceived for immunisation and nutrition surveys) has not been fully validated as a tool to estimate mortality and alternative methods have been suggested.”

This is the same point I’m making: the sampling technique is suited for cross-sectional studies, but not so well suited for longitudinal studies. This is especially the case for studies that are sloppy about including people before they entered or after they’ve left the household, as in the Lancet study.

To speak in jargon, you should take the issue of informative censoring much more seriously.

Why are you claiming that the Lancet study was “sloppy about including people before they entered or after they’ve left the household”? They weren’t; there’s a discussion of this issue on pages 2 and 3 of the paper.

Note also that your point about recall bias is significantly attenuated by the Lancet team’s use of death certificates, which were not available in the Angolan study which you cite. This is a version of the “Lying Iraqis” critique rather than anything which really relates to the methodology of the survey. It is therefore, as I note above, potentially a valid objection, but one which depends on making assumptions about the interviewees for which there is no evidence.

Your comment about the methodology not having been “validated” has a flavour of Steven Milloy about it to me. This is a blanket disclaimer pointing out, for anyone unaware, that cluster sampling is a difficult business (hence the wide confidence intervals). If Grein et al. had really thought that their survey methodology was incapable of delivering useful results, they wouldn’t have bothered.

The Lancet study does discuss the issue of people entering and leaving the household. As I argued in comments on your previous post, their discussion makes clear that they are doing it wrong.

The Lancet authors are basically attempting to exclude household members who leave the household, rather than include them, which is what other studies that use this technique do. This is very important, because there are many reasons to think that leaving the household could be associated with death.

The disclaimer in the BMJ article I cited isn’t about cluster sampling, it’s really about the definition of the sample universe. I’m prepared to believe that this was reasonably clear in the BMJ study, but not the Lancet study.

It isn’t well-defined who is supposed to be included in the Lancet study’s universe. This is an invitation to overcount the kinds of deaths the Lancet authors were looking for. This isn’t lying. The surveyors in the field may well have thought that by counting violent deaths to people whose membership in the study universe was unclear, they were collecting complete information and doing a good job.

And the death certificates don’t validate anything, since they were only collected from a small, non-random, subset of the respondents.

This is a great summary post, and I take the Lancet’s findings seriously. If I may offer a thought, not a critique: keep counterfactuals in mind when making any humanitarian assessment.

Had the war not happened, Saddam’s regime would have dissolved at some point anyway, possibly ending in civil war or the reign of his (now dead) sons. Presumably, numerous deaths would have followed. The Lancet’s excess deaths figure (obviously) can’t take these counter-factual deaths into account.

In other words, many of the 100,000 may have perished anyway when the Ba’ath regime (“naturally”) crumbled, however long down the road. I acknowledge this is counter-factual speculation, but I think it deserves consideration in the humanitarian argument (it is not a critique, in any way, of the very honorable Lancet study).

In light of this I’m still not sure where I stand on the humanitarian case for war. If Iraq is successful in 10 years, any ex-post justification of the war will still require pondering the question: how many lives is freedom from political repression worth?

This was getting to be a long comment, so I split it up. Both parts are important to what I am trying to say.

Part 1:

Brian W and John Q use the example of opinion polls to demonstrate how accurate statistical sampling can be. But that comparison only reinforces my skepticism, since in fact opinion polls are almost never right. Look at this table of various polls taken the week before the American election: http://www.realclearpolitics.com/polls.html. All are (I presume) reputable and statistically sound, and all purport to measure the same thing, yet there’s a spread of 4% between the highest and the lowest. In state polls, the spreads were higher. In other words, it’s not news if an opinion poll is 3% to 4% off – it’s news if it’s accurate. And this is in a population that pollsters know abundantly well; how much more difficult must accuracy be in a population we do not know well, and that has in fact recently been through enormous change?

Granted, this is not necessarily a critique of the study itself, since I gather that that uncertainty is reflected in the wide confidence interval. But then I fall back on something like the Fred Kaplan argument, which, despite Daniel’s thorough and probably accurate criticism, nevertheless seems to hint at something true: when the actual results might be anywhere in such a wide range, does the study really say anything useful for judging whether the war was a good idea or not?

Let me be clear about something: I supported this war, and I did so in the full knowledge that it would kill people, including innocent people. Every fair-minded advocate of the invasion of Iraq should be willing to acknowledge that. But we would have to kill an awful lot of people even to come close to Saddam’s body count. Here’s how I figure the numbers:

Saddam was in power for, what, 23 years? He was a murderous tyrant from day one (literally), but as you say, he put up the really big numbers after 1985, in his purges against the Kurds and the Shia. The estimates I’ve seen put the death toll from internal terror alone at somewhere between 150,000 and 300,000. (Also a wide confidence interval.) Over 9 years, that’s an average of 17,000-33,000 (in round numbers) per annum.

But that’s not all: Saddam also prosecuted two wars of aggression, against Iran and Kuwait, in which it is estimated that 1.5 to 2.0 million people died. These wars took place during the same period (i.e., after 1985), so that’s an average of several hundred thousand per annum.

(Many people respond that that’s all irrelevant, since Saddam was in a box, hamstrung by the sanctions and the no-fly zones. Someone on CT, I believe, calculated that only about 2,000 people were being killed by Saddam each year at the time we invaded, so this figure represents the alternative case to the war. I think that’s naïve. Prior to 9/11, the sanctions regime was already falling apart, as (it has been shown) Saddam was actively working with members of the very organization that had put him in the box to dismantle it. Sooner or later, Saddam was going to get fully out of the box and resume business. The Kurds, the Iranians, the Kuwaitis, the Marsh Arabs, the Shia – Saddam was a serial aggressor and mass murderer; it was absurdly optimistic to think he – or, perhaps worse, his sons — would just give it all up.)

Against the status quo of several hundred thousand violent deaths (on average) per year, put the casualties that have come about from the invasion. And here’s where quantity becomes important. The Lancet study says (if I understand it) it is equally likely that either more or less than 98,000 people have been killed in a year and a half. If the actual number is less than 98,000, I’d say those numbers clearly work: if (as I assert) Saddam was inevitably going to return to his bloody average of several hundred thousand a year, then the people of the region (including for this analysis both Iraqis and Iranians) are better off for the war. If the actual casualty figure is *above* 98,000…well, that gives some pause, as that figure approaches Saddam’s tally.

But remember, in either case, it is going to get better. Whatever the actual casualty figure, it will almost certainly be lower next year, and if we are successful (I am still optimistic, and this week’s news makes me more so) it will decline further in the years to come.

Look, I don’t mean to dismiss a single one of those innocent deaths. Even if “only” 8,000 innocent Iraqis died as a result of the American invasion, every one is a tragedy. But people would have died as a result of our not invading, too, and a lot more of them. Action is not alone in having moral consequences; inaction does too.

And if they didn’t die then, they’d have died eventually, so it’s all good, eh?

Actually, no. We’re talking about excess deaths. Natural death would occur irrespective of the counter-factual.

Depends if it’s my life or not, don’t it? (Well, it does to me).

True, I could reframe the question to capture that effect. Suppose the US was overtaken by a nasty tyrant, and X amount of lives from the population, chosen randomly, were required to remove said tyrant via revolution. Suppose you are among the population from which the sample is chosen. How small would X have to be for you to support revolution?

Pacifist Response: 0
Pro-war nut response: 300 million

Anyway I don’t want to veer this thread off its discussion of the Lancet. My point was that the humanitarian case has taken a serious blow from the Lancet, but it’s not out the window just yet.

On a moderately off topic point, there’s a semi-serious suggestion in the above post that readers should spend their evenings taking statistics. Does anyone know of a respectable university-level statistics course offered online? I’ve wanted a better statistics grounding for a while (mostly because I get so irritated by having to take arguments like this one on faith), but haven’t got the time to do it in person.

D’oh! Despite proofreading, I made a big goof in my arithmetic. The period of time between 1985 and the invasion was 18 years, not 9, so cut all my “status quo” numbers in half. Not that it derails my basic argument, but talk about innumerate…

Sorry Dave, I see where I may have been misinterpreted. When I said “many of the 100,000 may have perished anyway” I was trying to make the counter-factual intuitive, but my statement was misleading. I should have said “excess deaths comparable to those found in the Lancet may have resulted from counter-factual succession/civil war”.

I think you need to ask the Iraqis. Preferably by going from house to house of a representative sample and asking them directly, then reporting back what you find….

(But be careful, they may be all lying/insurgents/spoilt rotten with exceedingly high expectations from living under dictatorship for three decades. In which case you can certainly take on the heavy burden of answering that question for them and declaring the humanitarian dilemma solved.)

I discuss the confidence interval issue in my ZNet piece, “100,000 Iraqis Dead: Should We Believe It?”: http://www.zmag.org/content/showarticle.cfm?SectionID=15&ItemID=6565
What I conclude is that a major reason the CIs are so large is that the authors engaged in a conservative (in a statistical sense) analysis of their data and included the Kurdish region. In this region, there was a reversal of the pattern in the rest of the country: more deaths prior to invasion than after. If this region had been excluded, the CIs would have been narrower, and the point estimate (98,000) higher.

The fact that the authors did not exclude the Kurdish region is one piece of evidence that they were NOT seeking to maximize the estimate of casualties. This type of thing is one of the things we researchers look for when judging the quality of another’s study. This study was one of the best I’ve seen. Still, given its discrepancy with the estimates of others (Iraq Body Count, Project for Defense Alternatives), I tend to believe the “true” estimate is somewhat below 98,000, but still in the many tens of thousands.

The Lancet authors are basically attempting to exclude household members who leave the household, rather than include them, which is what other studies that use this technique do. This is very important, because there are many reasons to think that leaving the household could be associated with death.

But this would lead to an underestimate rather than an overestimate, most likely. I would guess that more people left their households after the invasion than before. Indeed, I suspect that the researchers chose this criterion precisely in order to avoid picking up people who left home to join the Al-Mahdi Army and were “missing, presumed dead”.

The disclaimer in the BMJ article I cited isn’t about cluster sampling, it’s really about the definition of the sample universe. I’m prepared to believe that this was reasonably clear in the BMJ study, but not the Lancet study.

You’ll have to explain this more clearly. The sample universe in the Lancet study is perfectly clear; it’s the population of Iraq.

It isn’t well-defined who is supposed to be included in the Lancet study’s universe. This is an invitation to overcount the kinds of deaths the Lancet authors were looking for.

I disagree. It’s perfectly clear. What gets counted is household members who died, and the membership criterion is specific – people who were part of the household when they died and had been for the preceding two months. What specific sentences in the article make you think that the criterion isn’t clear?

And the death certificates don’t validate anything, since they were only collected from a small, non-random, subset of the respondents.

I don’t see how this is true. They were checking for two death certificates (for noninfants) per cluster. They had 66 death certificates out of 182 deaths. That’s not a small sample. I’m troubled that there is no description of how they decided whether or not to ask for death certificates, though.

In general, the potential for recall bias and lying (what you’re calling “informative censoring”) is discussed at fairly decent length on the last two pages of the article. The team concludes that it’s not enough of a problem to credibly threaten their conclusion and I agree with them.

Ragout,
Do you find the results of this survey to any great extent surprising? The IBC numbers plus the number of combatant deaths would give you a lowball figure rather less than an order of magnitude smaller than the Lancet numbers.

If they were really only after the sensation value they could have given estimates including the Falluja figures.

At best you are going to find holes in what is already known to be a net. Whatever its shortcomings, it is the best figure we have to go on, and it has been peer reviewed by a reputable journal. It doesn’t say anything very surprising beyond crystallising the effects of a gradually developing situation. I’m not sure why you think its burden of proof should be so high or that it fails to meet it.

Why do you think there are no official figures? Surely this number, whatever its true value, is an important performance indicator? In any case you will convince more people if you can provide a persuasive case for a significantly different number rather than just trying to find ways this number might not be right.

I think ragout’s actually being quite intellectually honest here and I’d appreciate it if people didn’t pile on (Jack; this is a general and pre-emptive warning, not one aimed at you, I think your questions are pretty fair).

There is a potential issue of reporting bias, and I was probably wrong to trivialise it with the nickname “lying Iraqis”. I maintain my view that the dataset and the design of the experiment don’t give any real reason to believe that there is a problem here, but ragout is not blowing smoke; that’s why I’m asking him for clarification.

One way you can tell that ragout isn’t acting the hack, by the way, is that he’s not throwing around hysterical accusations at the paper’s authors. They’ve designed a good piece of science and carried it out well; I count it as significant progress that the debate we’re now having is one about how even a legitimate piece of science might have given us wrong numbers.

While I’m on the subject, I’d add that if there was recall bias (and if it differentially affected responses so as to overestimate the post-invasion death rate relative to the pre-invasion death rate), then I’m surprised that it was so consistent across non-Kurdish Iraq (including the Shia provinces, who are meant to support the coalition) and so consistently the other way in the Kurdish North.

I’m not positive, but I don’t think this article addressed the strongest argument against the Lancet study: the numbers are utterly ridiculous.
For 100,000 deaths to have occurred since the end of the war, that would be an average of around 175 a day (I don’t know the exact number, but it’s been discussed since the study came out). The 100,000 may have included additional infant mortality, and may have included additional ‘natural’ deaths that weren’t occurring under Saddam, but as I recall, it specifically said that ‘most’ deaths were occurring in violence, and ‘most’ of those were due to American (presumably unintentional) weapons. So at most 49,999 were additional infant mortality figures (leaving ‘most’ due to violence), and at most 24,999 were non-American weapons (leaving ‘most’ due to Americans). So of 180 deaths per day, you’ve got 90 or so deaths per day due to violence, and 45 or so deaths per day due to American violence. 90 (or 45) per day, every day, for the last year and a half. Thus, in the last year and a half, when you have read in the papers about a bombing that killed 70 people, you were actually reading about a slow day! The news media has been writing headlines about the days when violence was actually lower than the norm! 100 people would have had to have been killed for a normal run-of-the-mill day, I guess.
So maybe you should address the “these numbers are absurd” statistical test.
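For what it’s worth, the daily-rate arithmetic behind this objection is easy to reproduce. A back-of-envelope sketch (the ~548-day window is my rough approximation of the 18-month post-invasion period, not a figure taken from the study):

```python
# Back-of-envelope check of the "around 175 a day" figure quoted above.
# The ~548-day window is an assumption (roughly 18 months post-invasion).
excess_deaths = 100_000
days = 548

per_day = excess_deaths / days
print(round(per_day))  # roughly 180 deaths per day
```

So the commenter’s starting figure is in the right ballpark; the dispute is over whether a rate like that could have gone unreported, which is addressed below.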

Quick question: has anybody in the media or blogosphere both a) criticised the Lancet study and b) criticised the Allawi government’s decision to stop publicising the Iraqi Ministry of Health’s collation of death figures from Iraqi hospitals? I think maybe Fred Kaplan; otherwise, everyone who did a) but not b) wins this year’s Intellectual Dishonesty Prize.

In light of this I’m still not sure where I stand on the humanitarian case for war. If Iraq is successful in 10 years, any ex-post justification of the war will still require pondering the question: how many lives is freedom from political repression worth?

A fair question, but it would be a better one if the case for war had actually rested on humanitarian grounds. It only really did ex post facto when the WMD and “terrorist links” arguments went into the crapper.

I think you should edit the parenthetical “(that is, if the war had made things better rather than worse).” For Iraqis, Saddam’s last couple years in power were pretty mild (according to HRW), and, as a matter of course, the war was going to make things worse in the short run. This is a longer short-run than Iraqis would have hoped for, but, if BushCo pulls its head from its arse and starts treating Iraqis like human beings[1], things could improve quickly.

The discussion of the number of people who knew people who died under Saddam, versus now, also suffers because the timespan is shorter for information to have travelled, even if the intensity of information travel might be higher.

And, Rajeev Advani’s mention of succession, the process whereby control is relinquished from one leadership group and given to another, ignores the rigged election process so prevalent in modern despotic regimes. Uday or Qusay would have “run” for strong-man, and won between 90 and 99.99 percent of cast ballots.

[1] The murder rate by african-american males halved in the first thirty years after Civil Rights, declining quite steadily in all age ranges. Sadly, largely laid at the feet of the drug wars, late in that period, in the younger men’s age range, it started rising again. Something tells me that if one treats people like people, there is more incentive for them to act like people, while if one is brutal and unjust…

I’m not positive, but I don’t think this article addressed the strongest argument against the Lancet study: the numbers are utterly ridiculous.

That’s not the strongest argument against the study. If anything, it’s the weakest.

As Marc Mulholland has pointed out, 175 deaths/day is about ten times worse than the worst year in Northern Ireland. This doesn’t seem all that unreasonable, given that the British Army was not in the habit of calling in airstrikes on the Falls Road.

Forget the media reports. There is saturation coverage of the green zone of Baghdad. There is no good information about the dangerous bits of Iraq. As Tim points out above, journalists aren’t allowed to visit the morgue.

I’m not saying the 100K number is gospel; as I hope I’ve been consistent in saying, I don’t like linear extrapolation. But the entire problem is that we have literally no information of sufficient quality to allow us to gainsay it.

‘But the entire problem is that we have literally no information of sufficient quality to allow us to gainsay it.’
…Although we might well have had such information had Allawi not ordered his Ministry of Health to stop publishing the death figures they were collecting from the hospitals.

Also- if anyone objects that collated death rates from Iraqi hospitals would be unreliable since not all corpses would be taken to hospital: agreed. But injured people would be, and the military do a lot of studies of killed:wounded ratios which would enable a good estimate of fatalities. The Iraqi Health Ministry apparently still collates these figures but is under orders from Allawi to keep them silent. This was covered in some detail back in September by Knight-Ridder and AP.

Similarly, if we’re arguing about infant mortality and other forms of death related to the war but not directly caused by violence, the obvious way to go about things would be for the Iraqi Health Ministry to collate the figures (which it may well be doing) and publish them. This would also rather tend to help such matters as the supply of medicines, funding of health services etc. Those who object to the Lancet figures need to admit that the Iraqi Health Ministry, in September of this year, publicised the figures of deaths-from-violence that it had collated from April 2004 on, and estimated that persons were twice as likely to be killed by coalition forces as by ‘insurgents’. Now I suspect that a lot of those killed were in fact insurgents, but 348 of them were women and children. Two thirds of 348 is approx. 232 women and children in five months, which is frankly a pretty abusive use of firepower.

And it is also a fact that as far as I know, the only person to have a pop both at the Lancet and at the suppression of the Iraqi Health Ministry figures was Kaplan. The US government did not raise a peep, nor did the courageous Mr Straw, who has been so loud in his condemnations of the Lancet paper.

There are several clear problems with the Lancet report that have not been adequately addressed by this blogger. I would like to address two. They are the infant mortality rate, and the clear, overwhelming overestimation of the violent deaths in the Fallujah area.

First, let me consider the infant mortality rate. Quoting the report:

First, the preconflict infant mortality rate (29 deaths per 1000 livebirths) we recorded is similar to estimates from neighbouring countries.

The use of neighboring countries appears reasonable…given the tacit assumption that no better comparisons are available from Iraq itself.

However, as the initial article here mentioned, there was a study, sponsored by UNICEF. It is available at:

24,000 households in the South/Center of Iraq (85% of the population, Arabs)
16,000 households in the North of Iraq (15% of the population, Kurds)

were randomly sampled.

This is far greater than the Lancet survey, which surveyed, roughly, 1000 households.

Second, the infant mortality rate was measured at:

108 per 1000 live births in the South/Center
not given in the North but said to have fallen like the under 5 mortality rate

The under 5 mortality rate was

138 per 1000 live births in the South/Center
72 per 1000 live births in North.

So, we have an under-5 mortality rate of 12.8%, and an estimated infant mortality rate of 10% for the entire country in 1999. This contrasts with the remembered estimate of 29 per 1000 live births pre-invasion in the Lancet study.

It is considered good technique to address previous literature in the field. This study stands out as important relevant literature. Yet, it was ignored. If these numbers were correct, then one would expect a death rate of 130/5 = 26 per 1000 per year for children under 5. With infant mortality being the majority of this, we can estimate that the 1006 under-5-year-olds seen correspond to at least 1100 live births in the last 5 years. (This gives a birth rate that is slightly lower than the pre-invasion birth rate of 226/year given in the Lancet study.)

Using these UN figures, we would expect that there would be 32 under 5 deaths during the pre-invasion interval. The Lancet report gives only 12 deaths under 15…and doesn’t break it down by under 5 and over 5. This difference of 20 deaths over 14.6 months is absolutely critical. With that age cohort representing 39% of the total population, it also represents a difference in the under 15 contribution to the mortality rate of 6.4 deaths/1000. With an Iraq population of, approximately, 25 million, this translates to over 125,000 deaths per year…more than the 98,000 given in the Lancet.
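The commenter’s expected-deaths figure can be reconstructed on his stated assumptions (the UNICEF under-5 rate of roughly 130 per 1000 over five years, about 1006 under-5s in the sample, and the 14.6-month pre-invasion recall window); I haven’t re-checked these inputs against the paper, but the arithmetic itself does come out near 32:

```python
# Reconstructs the commenter's expected under-5 deaths figure,
# not the Lancet authors' own calculation.
under5_in_sample = 1006
annual_rate = 130 / 1000 / 5      # 130 per 1000 over five years -> 0.026 per child-year
window_years = 14.6 / 12          # pre-invasion recall period

expected_deaths = under5_in_sample * annual_rate * window_years
print(round(expected_deaths))     # about 32
```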

Given this, the authors of the Lancet study have a duty to mention the UN study and then explain why they think the death toll showed an unprecedented decline between ’99 (Feb to May) and ’02–’03 (Jan ’02 to mid-Mar ’03). AFAIK, this decline far surpasses any other three-year decline.

Since the author of the blog states that the start of the oil-for-food program is the cause of the improvement, let me address that. There are a couple of problems with this: first, the oil-for-food program started in ’96. With infant mortality in particular, it is hard to see why it would not have had an effect on infants born two years later. Second, during that time, the oil-for-food program was raided by Hussein for other purposes. One would need to provide evidence that this stopped in ’98 and ’99 in order to argue that the infant mortality rate fell tremendously.

The second suspect number is the Fallujah death toll in 8-04 and 9-04. Reading the graph, I obtain 33 deaths in two months in Fallujah. Given that the total sample rate is, roughly, 1 in every 3200 Iraqis, this can be extrapolated to more than 100,000 deaths in this area during those two months.
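The extrapolation implicit in that reading of the graph is a single multiplication (taking the commenter’s 33 deaths and rough 1-in-3200 sampling fraction at face value):

```python
# Linear extrapolation of the commenter's 33 sampled Fallujah deaths
# by the rough 1-in-3200 sampling fraction he quotes.
falluja_sample_deaths = 33
people_per_sampled_person = 3200

implied_deaths = falluja_sample_deaths * people_per_sampled_person
print(implied_deaths)  # 105600, i.e. "more than 100,000"
```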

The city only has a population of 250,000. Since civilians were allowed to escape, why would they stay when people were dying at that rate? Further, how would it go unnoticed? As mentioned in the Lancet study, deaths must be taken care of quickly in Arab countries. Wouldn’t 100,000 requests for death certificates, 100,000 funerals, be noticed? Further, since there are usually >1 wounded per death in military actions of this type, wouldn’t nearly everyone else be wounded? Wouldn’t the hospitals and morgues be more than overwhelmed? Wouldn’t someone notice the smell of the rotting bodies, since there wouldn’t be enough able-bodied people to take care of the dead?

Yet, the authors simply glide over this point, stating that there was a low possibility that this outlier was improperly sampled. Even decent technique would require this type of cross checking between sampling and more direct measurement.

Given these two examples of bad technique; given contrary data that appears to be much more solid, I don’t think the Lancet article can be taken seriously. My question is what sort of peer review goes on at the Lancet. It might be the same folks who reviewed Sokal’s paper. :-)

I searched for “infant mortality” and “UNICEF”, and obtained one reference to that study: it is below:

and the last estimate of under-five mortality was from a UNICEF-sponsored demographic survey from 1999.11,12

So, it is true that it was mentioned. But it appears that you and I differ on what properly addressing a previous study with far better statistics means. For example, they also quoted numbers on infant mortality. That wasn’t mentioned. So, it was only briefly mentioned in the introduction, and not considered as relevant data when the infant mortality rate was discussed.

Second, the consideration of the Fallujah data was poor. Their data indicate that 100k died in two months when there was minimal US activity in Fallujah. Even if you only take half of that, it’s still 50k.

Cluster sampling is a technique that is fraught with difficulties. Done perfectly well, it works. But if there are _any_ indications that there are serious problems with the technique, a serious researcher is obliged to deal with them directly. What they did, instead, is just say it was an outlier, and gave results with and without it.

A scientist is supposed to search for every possible way his data could be wrong before publishing. Cross checking with other data sources should be done, whenever possible. A reasonable scientist, would have cross checked the Fallujah data, and known that there was a serious problem with the sampling. The authors wrote as though it was a minor, or modest problem at most.

So, to be accurate, I shouldn’t say they didn’t address these two issues; rather I should say that they didn’t address them as one would expect in a serious piece of work.

If I implied the former, I apologize.

One other thing: it is possible that I miscalculated a number, since I don’t perform the same due diligence in posts as I do in professional papers that I write. If you can catch an important numerical error that I made, I’d appreciate you letting me know what my mistake was.

Sorry, Dan. Mrs. dsquared is currently out with her friends, and as a result I am quite drunk, but that’s no real excuse.

To answer your questions directly:

1) The UNICEF study is from 1999, and there is decent reason to believe that infant mortality fell substantially in Iraq between 1999 and 2002.

1a) Furthermore, the cluster sampling methodology would always tend to undersample infant deaths; but this would not affect the longitudinal study (ie, the undersample would be more or less the same for the pre- and post-invasion death rates).

2) The Fallujah cluster was not used in the analysis for precisely that reason. The issue you raise here was dealt with explicitly in the study.

Brett Bellmore has quite some balls to claim that the insurgents are the only aggressors here, and by implication the US/UK are not.

The Iraq occupation is an aggressive military occupation which came about through a war based on lies. We have quite literally bombed hospitals and other civilian buildings in Fallujah. How is that not aggression?

You can debate how legitimate or not it was to go to war and how legitimate or not it was to attack Fallujah, sure.

But I wish people like Brett Bellmore would recognise that perhaps there might be valid reasons for the Iraqis (and indeed the foreign fighters – let’s not be arbitrary!) to oppose the occupation, and hence some of their “aggression” might be justified. It’s like “We are the most magnanimous, humanitarian creatures ever to walk the earth; they are irrational savages”. The same old exaggerated propaganda – treat our enemies as uniformly subhuman.

I wasn’t being critical of the lack of distinction between combat and civilian deaths. I can understand why it wasn’t done, and I was asking just to educate myself.

When I mentioned combatants, though, I wasn’t being clear; I principally had in mind the Iraqi army that opposed the US forces prior to the fall of Baghdad (whose status as combatants is clearer), rather than the insurgents. By referring to the first Gulf War, I meant to be asking whether, since US forces killed 100,000 of those troops in taking Kuwait, is it reasonable to believe that US forces would have killed some similar (or higher) number of Iraqi army soldiers in taking Iraq?

Note that this question is not meant as an attack on the study. Rather, I’m suggesting that our experience with the first Gulf War supports the 100,000 figure.

Also, I’m not trying to say that 100,000 deaths is OK if and only if they were primarily Iraqi army soldiers. I do think it raises slightly different moral issues, but obviously, just as with the first Gulf War, one still has to ask how many deaths are too many.

Tom: I see what you mean. Iraqi soldiers would not have been counted unless they had returned to a household at least two months before they died (or unless their family decided to ignore this stipulation and mention them to the researchers anyway). We’re talking about “civilian” deaths here, in an attenuated sense of the word in which terrorists are civilians unless they draw an army paycheck.

(I’ve reposted [and slightly amended, hopefully making myself clearer] these musings from another blog thread which has sadly reduced to arguments pro/contra the war, a subject in which I long ago lost interest.)

I’m no statistician, but it seems to me that:

(a) 24.4 million (the population of Iraq) divided by 100,000 (the Lancet authors’ non-Fallujah estimate of excess deaths in the 18 months since the invasion) = 244; therefore 1 in 244 Iraqis has died, by this estimate, thanks to the war – about 4 in every thousand, 40 in every ten thousand;

(b) 7868 (the number of living Iraqi individuals in the Lancet’s sample) divided by 244 = 32, which implies that the figure of 100,000 deaths is extrapolated from 32 excess deaths discovered by the Lancet.
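Those two divisions check out as rough scaling arithmetic (using the figures as the commenter gives them):

```python
# The commenter's back-of-envelope scaling: national figures down to the sample.
population = 24_400_000
excess_deaths = 100_000
sample_size = 7868

one_death_per = population / excess_deaths        # one excess death per 244 Iraqis
implied_sample_deaths = sample_size / one_death_per

print(round(one_death_per), round(implied_sample_deaths))  # 244 32
```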

Of the increase in deaths (omitting Fallujah) reported by the study, roughly 60% is due directly to violence, while the rest is due to a slight increase in accidents, disease and infant mortality.

21 is the number given for non-Fallujah violent deaths in Table 2 of the paper, and 21 is of course “roughly” 60 percent of 32.

If so, then “100,000” is the number of deaths one would discover in Iraq if the Lancet’s sample of 7868 happened to be perfectly representative of excess deaths in the country as a whole. In other words, it assumes that the 7868 were a perfectly random sample, and that the phenomenon of excess deaths due to the war was uniformly distributed among the population (the entire Iraqi nation). How confident (in the normal usage of the term) can anyone be that this “perfect-world” number of 100,000 is, in fact, the “most likely” number of excess deaths country-wide?

From what I understand of the issues involved, there’s nothing controversial about the survey’s falling short of perfection: this is to be expected from the fact that it used cluster sampling – as would any other survey of its kind – and that clusters differed from each other to some significant degree (we don’t know by exactly how much, because the bar charts in Figure 1 relate to Governorates, some of which had several clusters each), and the survey’s size and design explains the width of the confidence interval (CI) ranging from 8,000 to 194,000 excess deaths. But the CI deals with the purely random errors arising from the size and mathematical design of the survey and (unless I am mistaken about the authors’ methods) the inter-cluster results obtained: it doesn’t address measurement or sampling errors due to poor survey design (if any). In that respect the CI is “unintelligent” – it knows nothing about Kurdish areas or Sunni or the way in which the fighting was distributed, nor does it compensate for “recall bias,” “respondents lying,” or what have you.

So I’m at a loss to explain the confidence (again, in the normal usage of the term) of the Lancet authors in the “perfect” 100,000 (or 98,000) number which lies midway between the endpoints of the CI. Surely it should never have been promulgated as enthusiastically as it has been. I respect the position taken by dsquared (that the interval is more interesting, in that it shows an indisputably elevated relative risk of death post-war, especially from violence), but for good or ill it’s with “100,000 excess deaths” that the Lancet study is now associated, and this is down to no-one but the authors, who have promoted this number from the beginning.
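The point about the CI capturing only between-cluster sampling variation can be made concrete with a toy simulation. This is invented data, not the Lancet dataset or their exact estimator; it just shows that a cluster sample’s standard error is built from the variance of the cluster means:

```python
import random
import statistics

random.seed(1)

# Toy illustration (invented numbers): 33 clusters of 30 people each,
# with the underlying death risk varying between clusters. The CI from
# a cluster sample has to be built from the variance of the *cluster*
# means, which is what widens it relative to a same-sized simple random
# sample when clusters genuinely differ from one another.
clusters = []
for _ in range(33):
    p = random.uniform(0.0, 0.02)  # this cluster's underlying death rate
    clusters.append([1 if random.random() < p else 0 for _ in range(30)])

cluster_means = [sum(c) / len(c) for c in clusters]
overall_rate = sum(sum(c) for c in clusters) / sum(len(c) for c in clusters)

# Standard error from between-cluster variance, then a 95% CI.
se = statistics.stdev(cluster_means) / len(clusters) ** 0.5
ci_low, ci_high = overall_rate - 1.96 * se, overall_rate + 1.96 * se
print(overall_rate, (ci_low, ci_high))
```

Note that nothing in this calculation “knows” anything about which clusters are Kurdish or Sunni, which is exactly the commenter’s point: the CI quantifies sampling noise, not design flaws.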


3. Several ten thousand violent deaths is a fair estimate, but the Lancet study does not provide the kind of evidence I’d like to see to negate the notion that most were combatants or at least, not innocent civilians killed by coalition bombing.

The coalition and Iraqi government do provide estimates of people killed. They also say that they do their utmost to avoid civilian casualties and that most of those killed by them are combatants. They cannot do detailed body counts of civilians/combatants.

As a war supporter I say that so far, in my judgement, the life of Iraqis has not improved overall, and, so far, we have in all likelihood not saved innocents when compared to what Saddam would have done.

I do, however, also believe that on balance, so far, life for Iraqis overall hasn’t deteriorated much either (it’s gotten significantly better in some parts, reconstruction is real and effective in parts of Iraq), and that Iraq will in the future be much better off, both in overall quality of life, and where raw death statistics (such as infant mortality) are concerned.

As the Icelandic prime minister has said, before, there was no hope for Iraqis, now there is.

I believe the transformation of the Middle East in general and Iraq in particular will be swift historically speaking and for the better, but that still implies many years, not 18 months.

Heiko, coalition claims that they try to avoid civilian deaths don’t mean much when they then go on to use artillery, helicopters, and 500 lb bombs in cities. There is some evidence (apart from this Lancet study) that the coalition forces are killing more civilians than the insurgents–Nancy Youssef reported this in a Knight-Ridder story last Sept 25. (I mentioned that in an earlier thread to which I never returned, so I don’t know if anyone commented on it.)

There was also a story in the October 12 issue of the NYT (I think I have the date right) where a Pentagon official told the Times that they saw a good side to civilian casualties in Fallujah, because it might make the inhabitants have to choose between supporting the insurgents and supporting the coalition forces. Presumably they were supposed to choose the Americans dropping the bombs on their town. I suspect that this story gives us a glimpse of the real thinking that lies behind the use of heavy weapons in urban areas, but official government spokesmen aren’t going to be so naively open about it. Americans are extremely good at doublethink – we are perfectly capable of dropping bombs on cities and seeing the useful side of collateral damage while sincerely proclaiming our desire not to see any civilians killed.

You wrote “The sample universe in the Lancet study is perfectly clear; it’s the population of Iraq.”

I’m trying to make a distinction between the idealized universe that the authors hope to study, and the actual universe they’re sampling from. This may be a more subtle point than I thought, and I’m not sure that there’s good jargon for it. Here’s an example. The Lancet sampling method appears to be more likely to sample housing units on large lots (it’s partially based on area). So this is one way in which the study sample systematically differs from the idealized population universe. (I don’t know enough about Iraq to know if this is an important problem. It probably isn’t).

I don’t think the idealized universe is the “population of Iraq”; it seems to be the non-institutional population of Iraq. It excludes people living in army barracks, hospitals, prisons, etc. At least that seems to be implied by the concept of surveying households, if not specifically stated.

Here’s the gist of how the study universe is defined:
“We defined households as a group of people living together and sleeping under the same roof(s)… Respondents were also asked to describe the composition of their household on Jan 1, 2002, and asked about any births, deaths, or visitors who stayed in the household for more than 2 months… The deceased had to be living in the household at the time of death and for more than 2 months before to be considered a household death.”

This is probably an adequate definition of a household at a point in time. But “the composition of their household on Jan 1, 2002” is horribly ambiguous. Consider a newlywed couple that’s been living together for a year when interviewed. Earlier, they were living with their respective parents (separately). Are their parents included in the “composition of their household on Jan 1, 2002”? There are many examples, since households are not static over time.

The problem with ambiguity is that it is an invitation to the interviewers to overcount violent deaths. The interviewers presumably know the aim of the study is to estimate the effects of the war. So what happens in an ambiguous situation, as with the newlyweds’ parents discussed above? My guess is that conscientious interviewers, never mind lying ones, are much more likely to count the parents as “in sample” when they’ve suffered a violent death.

Finally, “The deceased had to be living in the household at the time of death” just makes no sense at all. They can’t possibly be saying that deaths in a hospital don’t count, can they? The deaths of infants who died in the hospital soon after childbirth don’t count? It’s possible I’m misreading this, or that they didn’t really follow this protocol in practice. But overall, this statement really makes me think that their sampling technique is a mess.

The numbers I find fairly plausible are the figures for the post-war violent death rate. I think the authors’ definition of a household is less problematic over shorter time periods.

They don’t actually seem to report the violent death rate specifically, but some calculations show it to be 1.8 per 1000 (6 per 1000 with Falluja). Which translates to 44,000 per year (152,000 with Falluja). So, they’re pretty big numbers.

In the US, very high murder rates are something like 0.7 per 1000 (in places like Detroit in the early 1990s). So living in Iraq today seems to be something like living in the worst neighborhoods of Detroit, DC, or New Orleans during a high crime year.
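The rate-to-count conversion above is easy to check (a sketch only; the ~24.4 million population figure for Iraq at the time is my assumption, not given in the comment):

```python
# Sketch checking the rate-to-count conversion (assumption: ~24.4M population).
pop = 24.4e6

violent_rate = 1.8 / 1000                    # post-war violent deaths per person-year (ex Falluja)
annual_violent_deaths = violent_rate * pop   # roughly 44,000 per year

# For scale: a very high US murder rate, e.g. Detroit in the early 1990s
detroit_rate = 0.7 / 1000
ratio = violent_rate / detroit_rate          # Iraq's rate is roughly 2.6x that
```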

So, even though I don’t believe the study’s pre-war figures, I think these are pretty depressing and believable numbers.

Dsquared says: recall bias and lying are what I’m calling “informative censoring.” That isn’t what I mean.

When people leave the household in this study, they’re out of sample, their deaths don’t count. In the jargon, they’re “censored.” This isn’t recall bias or some kind of error: it’s how the Lancet authors define their sample.

Censoring is a problem when it’s non-random (informative). I would guess that people who leave home are more likely to die than those who don’t. They may be leaving to fight Americans, for example. So the study will systematically undercount deaths.

I’ve looked at a couple other similar studies (the BMJ study I cited is one). These studies ask about the deaths of people who’ve left the household, which I think is a much better procedure.

You’re right that this argument doesn’t tell you if there’s more, less, or the same bias after the war compared to before. I think there’s every reason to think that the bias wasn’t the same in both periods, but I don’t know which way it goes.

What a shame that this study has become a symbol of pro-war vs anti-war (which I guess is largely because of the sensationalist headlines and the timing of its publishing). Imagine if it had been conducted at the request of the coalition forces. It would appear as a limited but worthwhile attempt to measure how well they were achieving one of their objectives (improve the lot of the Iraqi people) under difficult conditions and with admirable speed. Instead it has become a battle-ground in its own right.


Let me clarify a few things. Even if we assume that something is seriously wrong with the sampling (e.g. because prisons were excluded), the numbers for violent deaths are pretty overwhelming: 1 before to 21 after (ex Fallujah; the sampling periods and population sizes aren’t quite the same, so make that 1.4 to 21 or so).

Saddam wasn’t involved in mass killings at the time of the invasion, as he had been at earlier points in Iraq’s history, and there was little fighting going on. Another study has been mentioned that instead looked at torture, and Shiites in the South did report that torture went on in the period just before the invasion at levels similar to earlier years.

And it appears that the Lancet study does not count deaths in prisons, which is where most of Saddam’s victims would have died in the time period in question (estimates of between 2,000 and 10,000 deaths per year in the pre-war period from direct killing in prisons by Saddam seem realistic to me).

Still, clearly the violent death rate has gone up substantially.

Secondly, the reason this study is getting so much flak is that its results were presented to the press in a terribly politicised fashion. The message widely perceived was that 100,000 innocent civilians, mostly women and children, had been killed by coalition forces, released and timed just before the election.

And I must say that the authors, the Lancet and the press have received remarkably little flak on this from the study’s defenders.

It’s the presentation of their findings that’s getting people like Worstall to immediately dismiss them.

However, the findings themselves are hardly surprising, or, where they are strong, add much to what we already know anyway, namely that violent death has increased substantially.

The findings shed relatively little fresh light on the question of attributing the deaths, are most killed either terrorists / those killed by terrorists, or are most of those killed innocent civilian victims of coalition bombing?

The Lancet study also appears very weak on the question of infant mortality (it finds little change in non-violent adult mortality).

Unicef presents data for both 2002 (102) and 2001 (107); the WHO even gives confidence intervals for its figures for under-five mortality in Iraq in 2002.

Daniel wonders why one wouldn’t suspect that other numbers are undercounted, if infant mortality is.

On the other hand, if under-five mortality in Iraq in 2002 really was 131, undercounting it is nearly a prerequisite for being able to claim substantial increases. The Lancet study found a near doubling of infant mortality. Applied to a true pre-war figure of over a hundred, that would mean between 1 in 5 and 1 in 4 of Iraq’s infants dying after the war.
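A quick sketch of that doubling arithmetic (the pre-war figures of 100 and 131 per 1000 are the ones quoted in this thread):

```python
# Sketch of the doubling arithmetic. Pre-war under-five mortality figures
# (100 and 131 per 1000) are the ones quoted in this thread.
odds = {}
for pre_war in (100, 131):             # deaths per 1000 live births
    post_war = 2 * pre_war             # the study's "near doubling"
    odds[pre_war] = 1000 / post_war    # "1 in N" children dying post-war
# odds[100] is 1 in 5; odds[131] is about 1 in 3.8, i.e. roughly 1 in 4
```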

An infant mortality of 29 before the invasion, if true, would suggest that neither sanctions nor Saddam’s regime were any longer responsible for excess deaths compared to other countries. And it would be an unbelievably sharp fall, and a record low that had never been approached before by Iraq, and only achieved by neighbouring countries over a decade or more of improvement.

It matters enormously what true infant mortality was pre-war. Only if it really was above 100 is there enormous scope for improvement, and only then did the people who believed sanctions to be the problem, or who laid the blame on Saddam, have a much greater humanitarian case.

If infant mortality was no different than in neighbouring countries, either argument (for lifting sanction or for removing Saddam Hussein) becomes much weaker.

I have now written to Unicef. It is vital that they publish good figures. And that means that when they say 2002, they mean 2002, not 1994-1999, and that they don’t list 102 when the true number is 29.

That’s just not acceptable. If they mean 1994-1999 they should say exactly that, and if they are so unsure about the number that it could be anything between 25 and 250, they should indicate that clearly as well, rather than list 102. Some error is OK, but not being off by more than a factor of 3. That renders the number meaningless.

And we aren’t only talking about Iraq here. Unicef and WHO publish numbers for all countries. I’d find it rather worrying if those numbers might be wild guesses based on outdated data, without their bothering to mention the fact.

I had a look at that article and did not find it credible and I didn’t find any reasonably official confirmation.

It’s very hard to definitively classify people as combatants or civilians in such a conflict, particularly when one side (the insurgents/terrorists) has both a propaganda objective and very few, if any, checks on its ability to get away with lies.

That’s why I don’t believe the Iraqi Body Count figures. Most of these deaths may be made up of terrorist sympathisers, or may be of combatants that are claimed to be civilians.

Iraqi Body Count makes no attempt to verify any of this on the ground, and most western journalists (whose reports they rely on) sit either in a heavily fortified and guarded Baghdad hotel relying on Iraqi informants, or are travelling with coalition forces.

On the second point you make, I see this in a much less sinister light. All I expect him to have expressed is a belief that Fallujans would recognise that the insurgents had no positive plan for Iraq, and meant nothing but violence.

————————–

Robin,

for violence to be legitimate in my eyes, it’s got to both have a moral goal and have some chance of achieving it.

What are the insurgents doing that is benefiting ordinary Iraqis in any way?

They can have elections, and coalition forces will leave when asked to do so by a fairly elected government.

Now that you just have a hangover :-), you might wish to discuss my objections to the Lancet article in greater detail. Or, you might consider such a discussion outside of the scope of this site; which will be acceptable to me. If it is the former, then I will try to make statements/ask questions that will allow us to focus on the core of our disagreement, while defining the areas in which we do agree. If it is the latter, just say so, and I’ll stop posting on the subject.

Let me start with an area that I think we agree on. The Fallujah numbers are clearly wrong. They are not merely an outlier, they are not merely suspect, they have been falsified by other observations.

As a final note, I am assuming that any experiment needs to be tested by both external and internal consistency. Although there are examples of singular experiments that stand alone, most verification and falsification of models require a number of different experiments. Thus, it is reasonable to work hard to cross-check any numbers from one experiment/study against those obtained in another experiment/study.

Let me start with an area that I think we agree on. The Fallujah numbers are clearly wrong. They are not merely an outlier, they are not merely suspect, they have been falsified by other observations.

No, we don’t agree on that. In particular, I think that your reason for believing this about the Fallujah numbers (excerpted below) is, to be frank, nonsense.

The second suspect number is the Fallujah death toll in 8-04 and 9-04. Reading the graph, I obtain 33 deaths in two months in Fallujah. Given that the total sample rate is, roughly, 1 in every 3200 Iraqis, this can be extrapolated to more than 100,000 deaths in this area during those two months.

What you’ve done is taken the Fallujah death rate, multiplied it by the sampling rate for the whole of Iraq, and attributed it to Fallujah. This isn’t right. What you should have done would be to multiply it by the local sampling rate. We don’t have this, but we do know that there was one cluster in Fallujah, and that a cluster for this study is 33 households. So in other words, the Fallujah study found roughly one death per household in the immediate postwar period. That’s a hell of a lot, but it’s not at all impossible as an outlier in a heavily bombed neighbourhood.
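To make the contrast concrete, here is a sketch of the two calculations (the 1-in-3200 national sampling rate and 33 households per cluster are the figures used in this thread):

```python
# Sketch contrasting the two extrapolations. Figures from the thread: 33 deaths
# in the Falluja cluster (Aug-Sep '04), a national sampling rate of about
# 1 in 3200, and 33 households per cluster.
deaths_in_cluster = 33
national_sampling = 3200       # one sampled person represents ~3200 Iraqis
households_per_cluster = 33

# The mistaken step: scaling the local count by the NATIONAL sampling rate
# and attributing the result to Falluja alone.
wrong_estimate = deaths_in_cluster * national_sampling     # 105,600 "in this area"

# What the cluster actually says locally: about one death per sampled household.
deaths_per_household = deaths_in_cluster / households_per_cluster   # 1.0
```

The first number only makes sense as a contribution to a national total, never as a local death toll; the second is what the cluster itself records.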

I think that the simple fact that we can say with 97.5% confidence that the war has made things worse rather than better…

You have just demonstrated that this is not a “simple fact”!

You go on and on at length showing how the central estimate could be an underestimate, which I am willing to concede here.

But having just demonstrated that the nonsampling error has shifted the central estimate lower, you still accept a 97.5% probability that war made things worse, even though that’s based on no underestimation–and only accounts for sampling variability.

If you really think 98K really is an underestimate, isn’t 97.5% too low?

If, in the face of your evidence of underestimation, you still want to stick to 97.5%, then aren’t you assuming that there’s an equal possibility that nonsampling errors made 98K an overestimate?

You don’t seem to believe that in your writing above.

Either way, 97.5% is one very important piece of evidence, but not really a “simple fact”.
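For readers wondering where the 97.5% comes from: if a 95% confidence interval for the relative risk lies entirely above 1, the one-sided confidence that risk increased is at least 97.5%. A sketch, assuming (as is conventional for risk ratios, though the paper doesn’t spell it out) that the interval is symmetric on the log scale:

```python
import math

# Sketch: a 95% CI for the relative risk that excludes 1 implies at least
# 97.5% one-sided confidence that risk increased. Assumption: the interval
# (RR 1.5, 95% CI 1.1-2.3) is symmetric on the log scale.
lo, mid, hi = 1.1, 1.5, 2.3
se = (math.log(hi) - math.log(lo)) / (2 * 1.96)  # implied SE on the log scale
z = math.log(mid) / se                           # distance of the estimate from RR = 1

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

p_worse = phi(z)   # ~0.98: probability (sampling error only) that RR > 1
```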

So I’m at a loss to explain the confidence (again, in the normal usage of the term) of the Lancet authors in the “perfect” 100,000 (or 98,000) number which lies midway between the endpoints of the CI

dsquared replied:

The authors have no such confidence; they have repeatedly said that they believe that the central estimate is an underestimate.

We may be talking past each other on this point, but when authors of a study describe a numerical estimate derived from their work in every press interview as “conservative”, they are indeed expressing confidence in it. In particular, they express confidence that the true number is unlikely to be lower (whatever the CI has to say about it).

The explanation given for this is that the Fallujah “outlier” would have produced a central estimate twice as large. But the Fallujah data doesn’t just point to possible underestimation, it also raises question marks about the survey’s reliability as a nation-wide estimator of excess mortality. One would have thought that this would preclude the possibility of making confident claims about “100,000” being a “conservative” extrapolation because it excludes Fallujah. But it seems not.

D^2 suggested that I used suspect technique in my analysis. In particular, he suggests we don’t know the sample size. It wasn’t given directly, that’s true. But, the authors were kind enough to give death rates with and without Fallujah, as well as the number of post invasion deaths in Fallujah. Thus, we have a direct means of checking my observations.

“During the period before the invasion, from Jan 1, 2002, to March 18, 2003, the interviewed households had 275 births and 46 deaths. The crude mortality rate was 5·0 per 1000 people per year…The crude mortality rate during the period of war and occupation was 12·3 per 1000 people per year…If the Falluja cluster is excluded, the post-attack mortality is 7·9 per 1000 people per year”

and

“When included, we estimate that the rate of death increased 2·5-fold after the invasion (relative risk 2·5 [95% CI 1·6–4·2]) compared with before the war. When Falluja was excluded, we estimated the relative risk of death for the rest of the country was 1·5 (95% CI 1·1–2·3).”

Now, in the first quote there is some ambiguity as to whether the method changed between including invasion-time deaths and excluding them. Since the ratios in the two quotes are equivalent to within rounding, I’ll assume the same technique is used throughout.

We also have 1.48 years post invasion. Just doing arithmetic, I obtained the following deaths per 1000 for the whole invasion period

All Iraq: 18.2
Fallujah: 6.5
Without Fallujah: 11.7
All Iraq, no increase: 7.4

Multiplying by the population of Iraq (24.4 million), we obtain:

All Iraq: 445k
Fallujah: 159k
Without Fallujah: 286k
All Iraq, no increase: 181k

By these calculations, we would get 105k excess, instead of 98k. This is consistent with the rounding errors in the numbers I used, so that’s not too bad. The paper also states that 53 were killed in Fallujah during the post-invasion period. Since 33 were killed in Aug-Sep. ’04, the authors would extrapolate (to within rounding) 33/53 * 159k during that period. That comes to 99k.

We can even normalize down for my rounding error, and get 92k. So, as you pointed out, my 100k estimate wasn’t exactly the same as what the authors estimated. Still, it’s close.

In short, even though I’m just a humble, aging plumber :-), I can still look at numbers in a paper and see what conclusions are obtained. Typically, I calculate “back of the envelope” numbers in my head using the authors’ technique to check their work. Now that I’ve spent a bit more time making direct calculations from the authors’ numbers, I’d argue that my estimate was close enough to prove my point.
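The back-of-envelope arithmetic above can be reproduced in a few lines (a sketch only, using the 1.48-year period, the per-1000 rates, and the 24.4 million population quoted in the comment):

```python
# Reproduction of the back-of-envelope numbers above (figures as quoted:
# 1.48 years post-invasion, rates per 1000 per year, 24.4M population).
years = 1.48
pop = 24.4e6

def deaths(rate_per_1000_per_year):
    return rate_per_1000_per_year / 1000 * years * pop

all_iraq = deaths(12.3)        # ~445k, at the post-invasion rate for all Iraq
ex_fallujah = deaths(7.9)      # ~286k, excluding the Falluja cluster
baseline = deaths(5.0)         # ~181k, at the pre-invasion rate

excess_ex_fallujah = ex_fallujah - baseline   # ~105k, vs the paper's 98k
fallujah_share = all_iraq - ex_fallujah       # ~159k attributed to Falluja
aug_sep = 33 / 53 * fallujah_share            # ~99k for Aug-Sep '04
```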

Dan, what you’re doing is, effectively, multiplying the Fallujah death rate by the population of Iraq to get a number of deaths, then comparing it to the population of Fallujah. This is why you’re getting a number that looks too big relative to the population of Fallujah, and it’s why the authors threw the Fallujah outlier out; to avoid doing exactly this.

…
Iraqis like to call this mess ‘the situation.’ When asked ‘how are things?’ they reply: ‘the situation is very bad.’

What they mean by situation is this: the Iraqi government doesn’t control most Iraqi cities, there are several car bombs going off each day around the country killing and injuring scores of innocent people, the country’s roads are becoming impassable and littered by hundreds of landmines and explosive devices aimed to kill American soldiers, there are assassinations, kidnappings and beheadings. The situation, basically, means a raging barbaric guerilla war. In four days, 110 people died and over 300 got injured in Baghdad alone. The numbers are so shocking that the ministry of health — which was attempting an exercise of public transparency by releasing the numbers — has now stopped disclosing them.
[…]
I heard an educated Iraqi say today that if Saddam Hussein were allowed to run for elections he would get the majority of the vote.
…

Heiko, the reasoning of the Pentagon official in the Oct 12 NYT story was strikingly similar to the reasoning of Americans in Vietnam who would bomb VC-held villages hoping to force the villagers to “choose” the American side. Your dismissal of the Nancy Youssef story as not credible was, well, not credible to me. I think you wish to see Iraq through rose-colored glasses and I find this frustrating, so I’ll drop out of this conversation.

There is an interesting correlation between two of the critiques Daniel itemizes here: the inconsistency of the infant mortality data may tend to support the Lying Iraqis theory.

Back in 2002, DD and I had a brief exchange in which I pointed out (1) that _pre-war_ Iraqi infant mortality numbers were controversial, (2) that the UNICEF estimate was much higher than the CIA’s, (3) that the CIA estimate was much easier to reconcile with total Iraqi population numbers 1990-2001, and (4) that the mortality rate Saddam was attributing to the sanctions regime was implausibly high based on UNICEF numbers, and absurdly high based on CIA numbers. See original link at

Anyway, two years later we have the same numbers, but everyone looking at them has changed sides. The Lancet study finds current infant mortality rates that are supposed to have _doubled_ since the war, but remain far _below_ UNICEF’s prewar estimate…and almost exactly in line with the CIA’s.

It seems to me that one model which explains all the observed facts is that Iraqi infant mortality has fallen slowly from 60+ per mil to 50+ over the last 15 years, without much impact from sanctions or war. In this model, the CIA and Lancet have always been right in the present tense, and UNICEF and Lancet have both been misled by anecdote in the past tense, in each case in the direction the Lying Iraqis hypothesis would favor. That is, during the sanctions regime, UNICEF investigators on the ground were told about fake deaths (or real deaths of cousins resampled as siblings, or other bias), while today Lancet investigators are losing half the _real_ deaths from before 2003.

Since the Iraqi “story” used to favor a high mortality rate under the sanctions regime, and now favors a low past mortality rate for the same sample period, the facts that UNICEF heard 110 per mil, and that Lancet now hears 29, cannot be viewed in isolation from each other. Rather, they begin to make a fair Bayesian case for serial deception.

“Not only is there no objective basis for the actual subjective adjustments that people make, but the entire distinction between combatants and civilians is one which does not exist in nature. As a reason for not caring that 98,000 people might have died, because you think most of them were Islamofascists, it just about passes muster. As a criticism of the 98,000 figure, it’s wretched.”

Shorter d-squared: You shouldn’t criticize the ‘science’ just because they didn’t bother with the important part.

Guess what, they tried to influence the election with their hack summary. Failing to discriminate between combatants and non-combatants is failure to discriminate between politically powerful and politically crass.

Mr Holsclaw, the fact that the USA never even tried to keep a reliable and public statistic of the deaths and other harms inflicted on the Iraqis is enough to negate their innocence. To put it starkly: the Nazis thought they were right and so documented most of their crimes; that made the Nuremberg trials that much easier. Their modal (not moral; after all, the Nazis claimed to do what was right) offspring of nowadays, like Milosevic, take care that no official trace can lead back to them. The USA is doing the same, so while soldiers can believe their indoctrination about fighting terrorism and terrorists, the officialdom knows it is a lie.

The unemployment figures also came out during the election campaign. Perhaps the BLS ought to have split them up into “proper” unemployed, plus people who were in their opinion lazy, workshy or not trying hard enough.

It’s certainly been suggested in the past to the medical statistics profession that they ought to break down AIDS numbers into the categories of “innocent victims” who got the disease through blood transfusions, plus gays and drug users, whose illness obviously matters a lot less. To their eternal credit, they have always resisted this pressure.

Thanks for your analysis of the paper. It’s helped me clear up at least a couple of the reservations I had about the methodology. However, there is a point that concerns me which I didn’t see addressed, and I wonder if you could help me with it.

At both the province level and the locality level, decisions on where to sample appear to have been made based on the available prewar census data. But during war, there are frequently population dislocations, and these dislocations are non-random. Whenever possible, people relocate from high-risk areas to safer areas. If this has been the case in Iraq, wouldn’t using the prewar census data carry a significant risk of resulting in the oversampling of high-risk areas and the undersampling of low-risk areas? And couldn’t that, in turn, result in an elevated estimate of the nationwide mortality?

Sebastian, if the survey had asked Iraqi respondents to say whether the dead were combatants or non-combatants, and computed combatant/non-combatant death rates on that basis, you’d be telling us that the Iraqis were lying to inflate the number of non-combatants killed.

I note that:
i) a complete disinterment of all corpses in Iraq, or even of a large sample of them, followed by forensic testing for evidence of weapons use, seems a rather impractical way of doing things;
ii) the Allawi government, with not a word of public protest from the Bush administration, has discontinued publishing the Iraqi Ministry of Health’s figures for corpses seen in Iraqi hospitals (figures of war wounded seen in Iraqi hospitals, which in my view would provide a strong basis for an extrapolation of war dead, are also not being published);
iii) the Coalition forces are estimating Iraqi deaths but not publishing those estimates.

Given these facts, could you please tell us what you would accept as a basis for responsibly estimating the Iraqi death rate?

Mike: In principle, you might be onto something here; they certainly did use unadjusted January 2003 population estimates (I think that these were estimates rather than census data) for calculating the sampling scheme, and on the face of it, I’d say that this would tend to oversample people who’d stayed in the dangerous zones (although it would by the same token undersample households who had fled the scene after losing a family member). Note that households would have to move from one governorate to another in order to affect the chances of being sampled; my guess is that this would mean that Baghdad would be inflated relative to rural areas, as this is the only town where I would guess there were lots of people who had family to stay with in another governorate.

On the other hand, I think I would guess that, given the extraordinary consistency of the story across the non-Kurdish governorates, this effect might not be too large in magnitude. But I can’t rule out a larger effect; this is a bloody good spot on your part. This article is about to drop off the front page, but I’ll keep in touch by email if I think of anything else.