Gymno

succumbing to peer pressure

Friday, February 01, 2008

Iraqi Civilian Deaths

Les Roberts et al. have published two well-known articles in The Lancet estimating Iraqi civilian deaths. The references are Lancet 2004; 364: 1857–64 and Lancet 2006; 368: 1421–28. I know that's all crude and unattractive, but my links to the PDFs of the articles themselves are through electronic journal access via my university, so they wouldn't do you much good. I believe The Lancet offers them up for free; you just have to register at their site.

Each of these articles estimated 'excess' civilian deaths at rates much, much higher than any other estimate out there ('official' or otherwise), and as such, of course, suffered much lampooning and alleged debunking. First, a few clarifications. That word 'excess' is important, and frequently misunderstood. It means that the researchers attempted to estimate the expected, baseline mortality rate (from illness, malnutrition, violence, etc.) and then estimate the difference between that baseline rate and the currently observed rate. It doesn't assert any causal links between violent deaths and, oh, I don't know, a violent occupation, but it's the standard quantity for analyzing the potential effects of violent conflict.

Next, the sampling methodology could potentially sound really wonky and highly suspect (this is where many of the 'debunkers' gained traction). While there is certainly plenty of room for criticism of the choices the researchers had to make, it is simply unreasonable to expect textbook sampling methods to translate directly to the implementation of any survey, much less a survey conducted under such dangerous conditions.

Lastly, a disclaimer: I had the pleasure of hearing Les Roberts discuss his papers at the Joint Statistical Meetings last year, but missed the counterarguments from his primary detractor, David Kane. As a statistician, I have a definite bias toward Roberts, since I am already predisposed to trust him based on his decade-plus of experience collecting and analyzing data in conflict regions. That said, I will attempt to be objective and lay out the various arguments happening here. (I know this post probably seems out of date, but hang in there, and you'll get an up-to-the-minute (almost) update on civilian casualty research.)
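To make the 'excess deaths' idea concrete, here's the back-of-the-envelope arithmetic. The rates and population below are ballpark figures in the neighborhood of what the 2004 study reported, not the authors' exact inputs, so treat this as an illustration of the concept rather than a reproduction of their analysis:

```python
# Illustrative excess-mortality arithmetic.  All numbers are rough
# stand-ins, not the Lancet authors' exact inputs.
baseline_rate = 5.0 / 1000   # deaths per person per year, pre-invasion
observed_rate = 7.9 / 1000   # deaths per person per year, post-invasion
population = 24.4e6          # rough population of Iraq
years = 17.8 / 12            # roughly 17.8 months of post-invasion follow-up

# Excess deaths = (observed rate - baseline rate) x people x time at risk
excess_deaths = (observed_rate - baseline_rate) * population * years
print(f"{excess_deaths:,.0f} excess deaths")  # on the order of 100,000
```

Note that nothing in this subtraction attributes any particular death to any particular cause; it just measures how far the observed rate departs from the baseline.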

Ok. First, to non-survey researchers, Roberts et al.'s methods seem odd. For the 2006 survey, 50 clusters were initially selected with probability proportional to population size; then an 'administrative unit' was selected, again proportional to (estimated) size; then a main street was selected, followed by a residential street off that main street; and lastly a random household was selected, with adjacent households included until a sample of size 40 was reached. For the 2004 survey, 33 clusters were selected; then GPS units were used to map rectangles onto the selected area, subdivided in 100m increments; the GPS unit randomly selected a point within the rectangle, and the 30 nearest households to that point were visited. (The GPS units were not used for the later survey because of concern that they would be mistaken for bomb detonators at checkpoints and endanger the interviewers.) This is all methodologically sound, so long as all of these steps were properly taken into account in the analysis stage. In many environments you need to reach a point where you can hand a survey to (or verbally interview) an individual, and if you want your research to be representative of the larger population from which you are sampling, you have to figure out how to make a series of decisions to eventually reach that individual about whom you collect data. At this stage the only valid criticism is of the number of clusters, which is small compared to a more recent study (which we'll get to in a minute) but perfectly reasonable by WHO standards (which recommend 30 clusters). There's also potential for clusters to lose their representativeness if there is a systematic pattern to those that are reachable vs. those that are not. This certainly happened in both Lancet studies, with more unstable and violent regions being less likely to be covered in the survey. This results in a) underestimating civilian deaths and b) more variability in the ultimate nationally representative estimate.
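The first stage of that procedure, probability-proportional-to-size (PPS) selection, is easy to sketch. The region names and sizes below are invented for illustration; the point is just that bigger places get proportionally more clusters:

```python
import random

# Sketch of PPS cluster selection, the first sampling stage described
# above.  Region names and population sizes are made up.
regions = {"A": 1_200_000, "B": 800_000, "C": 500_000, "D": 3_500_000}

def pps_sample(units, k, rng=random):
    """Draw k cluster locations with probability proportional to size,
    with replacement (a large unit can host several clusters)."""
    names = list(units)
    sizes = [units[name] for name in names]
    return rng.choices(names, weights=sizes, k=k)

clusters = pps_sample(regions, k=50, rng=random.Random(0))
# The biggest region ("D") should receive the most clusters on average.
```

The later stages (administrative unit, main street, residential street, household) repeat the same logic at finer and finer scales, which is why the analysis has to account for each stage's selection probabilities.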

In the nitty-gritty details there are plenty of other things to debate, such as how to define a household, how to define a family unit/who lives in that household, etc. etc. etc. There are lots of difficulties involved in estimating a population in flux, one comprised of millions of internally displaced persons as well as refugees. However, as long as these decisions are held consistent throughout the survey, which they were in this case, none of them invalidates the results. You may choose to define these things differently, but within a reasonable set of possible answers, there is no blatantly wrong choice. Additionally, by traditional scientific standards, even if you disagree with those choices, using those same definitions, you should still be able to replicate similar results. If one was attempting to make a valid and productive criticism, the best way would be to duplicate results with consistent decisions used in the study under attack, then transparently vary those decisions and compare results.

Now what I find interesting is that the results from these two studies are most often compared to 'official' estimates from the Iraqi Ministry of Health and 'unofficial' estimates from Iraq Body Count. We'll come back to IBC. For now, I want to focus on just what we're comparing when we say the Lancet articles estimated more than 20 times the casualties reported by President Bush in December '05 (in that speech Bush claimed 30,000 civilian deaths; the 2004 Lancet article estimated 98,000 excess deaths, and in 2006 the estimate was more than 650,000). First, there's that pesky word excess. So right off the bat we may not be comparing apples to apples. Second, for all their flaws, and there are plenty, the Lancet studies were scientifically, statistically sound attempts to estimate a nationally representative mortality rate in a conflict zone. Bush's 'official' numbers rely primarily (if not solely) on data provided by Iraqi government entities. Am I the only one who sees a flaw in that logic? In a country with a clearly crumbling infrastructure, how can we possibly expect government-provided vital statistics to be reliable? Does it really seem reasonable to assume that death certificates filled out by the local health department (if such a thing exists) in Basrah and Kirkuk and Mosul and the western regions of Anbar are being transported consistently to Baghdad? Or that consistent records are being maintained by the Ministry of Health? Again, it is perfectly valid to criticize Roberts et al. on their selection of clusters and other methodological decisions (about which Roberts et al. are quite transparent), but then isn't it also valid to criticize President Bush's numbers for, frankly, lacking any scientific validity whatsoever? If we're going to argue over which numbers are 'less wrong' (and wrong in more transparent, estimable ways), I'll side with the statisticians every day of the week.

Now for comparisons to Iraq Body Count. This method is very interesting, because it's entirely passive. They rely on "news media, NGOs and 'primary' sources, and official cumulative figures." Although their methodology is also transparent, and appealing in terms of their handling of conflicting information sources, I have a hard time believing that it is truly representative of the mortality rate across the country. Surely voluntary reporting of civilian deaths varies from region to region. The site is run by biostatisticians, so they may indeed be accounting for this, but I see no clear indication of how representative they believe their numbers to be.

Now, I have been talking a lot about transparency and reproducibility. This is the main criticism of David Kane, who complains that Roberts et al won't share their data with him. Having not requested the data myself, I have no evidence to back this up, but Roberts claims that Kane is one of only two people to whom he has denied the data, due to his belief that they were pushing their own agendas and were not planning to attempt to reproduce his analyses in an honest pursuit of scientific integrity.

Kane's other primary attack is on Roberts's choice to omit the cluster that contained Fallujah from his 2004 study. Given that data collection for this study overlapped with a major offensive in Fallujah, it seems quite plausible that this cluster was an outlier. Nevertheless, it certainly would have been nice to see results presented both including and excluding this cluster (a standard way to handle outliers and to defend one's decision to include or exclude them).
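The with/without presentation is simple enough to sketch. The cluster death counts below are invented, with one extreme cluster standing in for a Fallujah-like outlier; the point is just that reporting both estimates makes the sensitivity transparent:

```python
# Sketch of the standard with/without-outlier presentation.  Cluster
# death counts are invented; the last one stands in for an extreme
# cluster like Fallujah.
cluster_deaths = [2, 1, 3, 0, 2, 1, 4, 2, 52]

def mean(xs):
    return sum(xs) / len(xs)

with_outlier = mean(cluster_deaths)          # pulled way up by one cluster
without_outlier = mean(cluster_deaths[:-1])  # the 'conservative' estimate
print(with_outlier, without_outlier)
```

A single extreme cluster can multiply the estimate several times over, which is exactly why readers want to see both numbers rather than a silent exclusion.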

Now, on to the new stuff (how's that for burying the lede?). This month's New England Journal of Medicine features a compelling new study of Iraqi civilian mortality, conducted by the Iraq Family Health Survey Study Group. They collected data from 971 clusters, achieving a nationally representative survey of 9,345 households. With such a large sample size, this study is much more likely to be truly representative of the country, and to estimate mortality rates more precisely. Although I'll have to think some more about their use of Iraq Body Count data to impute mortality data for the 115 clusters they were unable to visit due to security concerns, I do otherwise find their methods and results to be quite sound. Their use of the growth balance method for demographic estimation (to account for households that simply cease to exist after a death) is an appealing and, again, transparent way to handle the only somewhat predictable ways in which populations change during conflict. (The WHO has a nice, fairly lay-termed Q&A about the study here.)
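The imputation idea, as I understand it, works roughly like this: use an external passive count (an IBC-style tally) to scale the rate observed in visited clusters up to the clusters that couldn't be visited. Every number below is invented, and this is only my rough sketch of the logic, not the IFHS authors' actual procedure:

```python
# Rough sketch of rate imputation via an external passive count.
# All numbers are invented for illustration.
visited_rate = 1.0     # violent deaths per 1,000 person-years, visited clusters
ibc_visited = 20_000   # IBC-style deaths recorded in areas that were visited
ibc_missed = 15_000    # IBC-style deaths in areas of the 115 missed clusters

# If passive reporting catches deaths at a similar rate everywhere,
# the missed clusters' rate can be imputed from the ratio of counts.
imputed_rate = visited_rate * (ibc_missed / ibc_visited)
print(imputed_rate)  # 0.75 per 1,000 person-years
```

The obvious weak point, and the reason I want to think more about it, is the assumption that passive reporting has similar coverage in the visited and unvisited (i.e., most dangerous) areas.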

Recall of deaths in household surveys with very few exceptions suffer from underreporting of deaths. None of the methods to assess the level of underreporting provide a clear indication of the numbers of deaths missed in the IFHS. All methods presented here have shortcomings and can suggest only that as many as 50% of violent deaths may have gone unreported. Household migration affects not only the reporting of deaths but also the accuracy of sampling and computation of national rates of death.

Taking that underreporting (and other sources of error) into account, the Iraq Family Health Survey Study Group estimates 151,000 violent deaths in Iraq from March 2003 through June 2006, with a 95% confidence interval from 104,000 to 223,000. That's three times as many deaths as reported by IBC, but only about a quarter of the violent deaths estimated in the 2006 Lancet article (and five times as many as reported by President Bush, for a slightly different time period).
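For anyone keeping score, here are those ratios spelled out. The IBC figure is my rounded assumption for their count over a comparable period, and the Lancet figure is that study's violent-death estimate (not its larger total-excess figure):

```python
# Ratios among the headline numbers discussed above (rounded figures).
ifhs = 151_000     # IFHS violent-death estimate
ibc = 48_000       # assumption: approximate IBC count, comparable period
lancet = 601_000   # 2006 Lancet violent-death estimate
bush = 30_000      # President Bush's December 2005 figure

print(round(ifhs / ibc, 1))     # roughly 3x IBC
print(round(ifhs / lancet, 2))  # roughly a quarter of the Lancet estimate
print(round(ifhs / bush, 1))    # roughly 5x the Bush figure
```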

Overall I am inclined to agree with the authors that this is probably the most accurate estimate we're going to be able to achieve. "About half of the violent deaths were estimated to have occurred in Baghdad... Overall mortality from nonviolent causes was about 60% higher in the post-invasion period than in the pre-invasion period." Regardless of which set of estimates you choose to believe, the take-home message is one of "an ongoing humanitarian crisis."

In the health section of today's New York Times is this gem of a headline: "Tainted Drugs Tied to Maker of Abortion Pill." Oh no! Yet another thing that makes abortions bad and evil and scary and dangerous! But wait, what's this? Right there in the very first sentence of the article, it turns out the tainted drugs have absolutely nothing to do with RU-486 - they're cancer drugs!

A huge state-owned Chinese pharmaceutical company that exports to dozens of countries, including the United States, is at the center of a nationwide drug scandal after nearly 200 Chinese cancer patients were paralyzed or otherwise harmed last summer by contaminated leukemia drugs.

Oh, but cancer isn't nearly as sexy and attention grabbing as abortion! Why go for accuracy over show-stopping?

I sent an e-mail criticizing the headline writer and got this response: "You are not the first person to make that point." Well, at least my fellow Times readers appear to be, you know, literate. More than I can say for some of the Times employees, apparently.

In discussing the upcoming primaries (and my rapidly approaching need to make a decision) with Eli (a longtime and ardent Obama supporter), I was saying how I worried that my decision may come down to a gut reaction. My grand plan is to spend the weekend reading all about both Clinton and Obama, but I feel that a) in general their policies are quite similar, and b) where they differ, each presents a potential deal breaker for me (her record on Iraq, his potential to be wrong about healthcare mandates).

But then I think about the last two Democratic debates, and am filled with this unexpected feeling - hope. Perhaps the last four years have just beaten me down more than I care to admit, perhaps I've been so brainwashed that I no longer accurately remember the Gore and Kerry campaigns, but I have been genuinely, pleasantly surprised by the language employed and topics covered by both Obama and Clinton (thanks in large part to Edwards's influence on them both). Have I really been treated over the past several weeks to political dialogue about the moral atrocity of poverty? About ideas like social justice and human dignity? While I was pounding my head against the conservative wall that has become this country, has the Democratic party really become (gasp!) progressive? Maybe I've been living in Georgia for too long, or reading too many comments over at PeachPundit, but the fact that I could live out my West Wing fantasy* of caring about a candidate, of being inspired and moved by a candidate...well, it's almost too much to hope for.

*"Because I'm tired of it: year after year after year after year having to choose between the lesser of who cares. Of trying to get myself excited about a candidate who can speak in complete sentences. Of setting the bar so low, I can hardly bear to look at it. They say a good man can't get elected President. I don't believe that. Do you?"