1) I have indeed been neglecting my EphBlog duties the last few months. Apologies!

2) Note that I offered to write an Eph Diary on this project. Alas, no one was interested. I started a separate blog on the topic.

3) I wish that left wing blogs would also highlight the paper. Anyone have a contact at DailyKos or Talking Points Memo? I sought to post it at Deltoid, where all the Lancet geeks hang out. See the excellent discussion at the last link which Lowell provides. Malkin picked it up from there, and then it spread.

4) Comments are welcome from EphBlog readers too!

5) Note that two Ephs helped with the paper, Arjun Narayan ’10 and Aaron Schwartz ’09. Don’t blame them for any mistakes.

I’m no statistician, but this seems less like any real disproof than a methodological critique, which in and of itself is important; still, it is perhaps easy to overstate what Dave has and has not “proven.” I would also assert that Dave’s argument depends on Saddam having kept wonderful and unskewed census records or their equivalent, something he may not have been wont to do. It seems that he is right that there are issues with an inability to account for that many deaths over time, but while he has cast some doubt on the Lancet study (and the most damning aspect is that their data is not readily available and thus not verifiable), it is not clear to me that he has actually debunked that study.

It seems to me that David has underscored the proposition that at the very least the Lancet is willing to engage in loose talk – for which of course there was an earthier, more direct, probably more appropriate, certainly more conclusionary term in my old neighborhood.

It’s only too bad the left didn’t come up with more “flawed” studies and propaganda…if the Democrats had engaged in a little more swift-boating and unethical election tactics of their own, they would have defeated Bush/Cheney, and this disastrous war would be nothing but a bad memory instead of an ongoing fiasco of epic proportions. I don’t really see the point of the debate. There is no doubt that over 3,000 American troops have died, many multiples of that figure have been seriously wounded, and many multiples of THAT figure of innocent Iraqis have died in this conflict, which has undoubtedly only strengthened Al Qaeda and fomented more anti-American sentiment in the Muslim world. How many deaths do we need, exactly, before we admit that this was a terrible idea and cut our (and the Iraqis’) losses?

That’s a foolish, foolish conclusion, Frank. Recent history is pretty clear that the right has had significantly more integrity flaws than the left. If the Lancet study is proof of an “integrity flaw” of the left, then the left is quite widely defined to include laymen not in politics. You really want to go that route?

Anyway, as for David’s article, its math is actually not that intense, just scary-looking (no matter what Malkin says), and it is an interesting methodological critique. That all said, the argument seems mathematically impressive, but common sense seems to run the other way. One of David’s main points is that including Falluja makes the standard deviation extremely large because of Falluja’s outlier role (the authors of the original study exclude Falluja for exactly that reason), so that IF we include Falluja, the confidence interval grows so big that it includes a non-finding.

However, common sense dictates that Falluja–being in the midst of a massive assault at the time–had an abnormally high mortality rate. So though it mathematically does expand the standard deviation (and thus the CI), in reality, its effect should only raise mortality rates. In other words, though we may quibble about the statistics and how to account for Falluja in such a study (and such quibbling is useful so that the next time such a study is needed, it can be improved), the practical end result is the same:

A whole lot of Iraqis died because of our invasion.
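To illustrate the statistical point with a toy sketch (the numbers below are invented for illustration, NOT the actual cluster data from the paper): a single Falluja-like high outlier inflates the standard deviation, and with it a naive normal-theory confidence interval, enough to swallow the null.

```python
import statistics as stats

# Hypothetical cluster-level risk ratios (made-up numbers): modest
# post-invasion increases, plus one Falluja-like extreme high outlier.
typical = [1.2, 0.9, 1.5, 1.1, 1.8, 1.3, 0.8, 1.6, 1.4, 1.0]
outlier = 12.0

def normal_ci(sample, z=1.96):
    """Naive normal-theory 95% CI for the mean."""
    m = stats.mean(sample)
    se = stats.stdev(sample) / len(sample) ** 0.5
    return m - z * se, m + z * se

print(normal_ci(typical))              # lower bound above 1: "significant"
print(normal_ci(typical + [outlier]))  # CI balloons and now straddles 1
```

The irony is exactly the one described above: under normal-theory assumptions, adding an extremely high-mortality cluster makes the estimated increase look *less* certain.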

I’m actually tempted to use this whole argument in TAing my social statistics class this fall. It’s an interesting quirk of statistics.

UPDATE: And to be fair to David, he does note the argument I make (the right-skewing of Falluja) in his comment on Deltoid. I think it’s worth mentioning in the paper as well…else political people who don’t know stats (coughMalkincough) might get, or already have gotten, carried away.

I’m no statistician, but this seems less like any real disproof than a methodological critique, which in and of itself is important; still, it is perhaps easy to overstate what Dave has and has not “proven.” I would also assert that Dave’s argument depends on Saddam having kept wonderful and unskewed census records or their equivalent, something he may not have been wont to do. It seems that he is right that there are issues with an inability to account for that many deaths over time, but while he has cast some doubt on the Lancet study (and the most damning aspect is that their data is not readily available and thus not verifiable), it is not clear to me that he has actually debunked that study.

I don’t think it was his intention to “debunk” the premise of the Lancet study. I read through a couple of David’s posts on the blogs linked to above, and the pro-Lancet crowd was seriously arguing that there were no objections raised to the actual results. This is now obviously false, and I think David makes a good point on whether or not Fallujah deserved to be included in the study.

As a somewhat-green-behind-the-ears student of Statistics/Quantitative Finance, I know that it’s always tempting to “clean” the data to prove your original hypothesis. In this case I wholly agree with DK that the authors’ predisposition to prove the significant increase in deaths perhaps led them to (consciously or not) skew the actual numbers.

I think no one will disagree that the Iraq war has caused a huge number of civilian casualties whatever the actual number may be. This is obviously the main tragedy and the quibbling over the numbers in the Lancet study is somewhat of a red herring.

Yet I’m also certain that the Lancet study sought to prove a point in the period leading up to the actual 2004 vote. According to their study, the Iraq war caused a likely 98,000 innocent civilian deaths (correct me if I’m wrong). If we look at the data, however, the confidence interval on this estimate is (in my opinion) unacceptably wide.

98,000 sounds a lot more menacing than the low end of the 95% confidence band, which is about 8,000.

A couple points, having now gone through the entire deltoid back and forth.

In this case, “cleaning” Falluja is a conservative cleaning, the type that is very welcome in most statistical efforts. The idea that Falluja represents a different distribution (areas in warlike environments vs. the bulk of Iraq) is quite convincing and reasonable. Further, even if one decides not to clean Falluja, there is a lot of reason to believe that David’s point gives little cause to doubt the bulk of the paper.

As for David attempting to “debunk” the paper: he claims not to be doing so out of one side of his mouth, while out of the other he outright calls, in the Deltoid thread, for it to be withdrawn or at least corrected. That’s potentially even worse than a “debunking.”

Let’s be clear about what the pro-Lancet group was and still is arguing: any errors made have no meaningful impact on the value of the report. Yes, they could have used different methods (you always can), and yes, Falluja is a problem cluster, but the take-home is still true. Further, the authors are very clear about when and how they deal with Falluja. A lot of the Deltoid discussion is about bootstrapping vs. unimodal distributions…the argument is about the methods of stats, not really the take-home message about excess deaths. Even David is willing to accept that about 100,000 innocent Iraqis have died from the war (a quite low estimate, in my opinion, but still a disturbingly high number).

Finally, the Lancet authors’ use of the 98,000 number is perfectly standard and not worthy of your rebuke. They certainly report the CI as well, but it is quite standard to give the statistical estimate’s actual value. And it isn’t surprising that that is the number reported in the mainstream press. That is their actual estimate, after all. 98,000 is also much less menacing than nearly 200,000, which you didn’t see in the press.

I’ll also note that professors holding their actual data close to their chests is not new, surprising, or a sign of something going wrong. It may just be an IRB requirement, their desire to get more papers out of it before others get to it, a desire to protect confidentiality, etc.

For example, the US census does not release individual-level data until that census is (I believe this is the right number) 72 years old. So right now, I could not, as a researcher, get the census information for individuals from 1940. Such restrictions on data are normal, and those who fault the authors for not giving their data or code to others are working from a natural-science paradigm. In the social sciences–for better or worse–it’s quite different.

A similar brouhaha occurred in economics over a study purporting to show, using IV analysis, that competition has a positive effect on education. The original author would not give her data to someone trying to replicate it. When she finally did, he didn’t find the same results and published a paper much like David’s about the Lancet. It has since devolved, basically, into vicious name-calling. It was between Caroline Hoxby and Jesse Rothstein, I believe.

Rory, before you get too worked up about my “rebuke”, note that I said:

I think no one will disagree that the Iraq war has caused a huge number of civilian casualties whatever the actual number may be. This is obviously the main tragedy and the quibbling over the numbers in the Lancet study is somewhat of a red herring.

Given that we are discussing the actual statistical concepts behind the paper, it is perfectly reasonable to question why we would all of a sudden remove the Falluja cluster from the final results. Is it an outlier? Of course. But don’t forget that the authors then take this pre- and post-invasion survey of 250-330 households and extrapolate an estimate for all 28 million Iraqis.

How and why Falluja would present issues with such a study is, in my opinion, irrelevant.

Obviously we can (and do) disagree on this point. Further discussion on why I would object to the final conclusion is useless.

That being said, the take-home does indeed remain the same. The fact that I think the Iraq war’s effects on civilians are a tremendous tragedy does not prevent me from objecting to their “cleaning” the data to improve their final statistics.

How and why something is an outlier is critical to how one should consider it. If Falluja is an outlier because it is in a different distribution, it is perfectly reasonable to take it out of the analysis. If, however, Falluja’s status as an outlier cannot be explained in such a manner, it would be inappropriate to exclude it. Much of David’s point seems to be that the authors of the original study inadequately consider and address Falluja in their calculations. Had they done so in the way David wanted, they would have found they could not reject the null hypothesis.

How and why Falluja is an outlier is critical to the argument.

Further, taking a 300-household survey and extrapolating is dangerous to do, certainly, but that’s why the CIs are so damn wide. As a sample size, 250-300 is quite reasonable considering what the authors are trying to do…size of the country be damned (assuming the country is one population, not two with different distributions). It’d be nice if the sample were bigger (so that the result would hold even with Falluja, and even granting David’s critique, if one buys it), but that doesn’t make the study questionable.
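A quick back-of-envelope sketch of why "size of the country be damned" is right: the half-width of a normal-theory CI shrinks only with the square root of the sample size, regardless of the population size. (The sigma below is an arbitrary assumed between-cluster spread, purely for illustration, not a figure from the paper.)

```python
import math

sigma = 1.0  # assumed spread of the quantity being averaged (illustrative only)

def half_width(n, z=1.96):
    """Half-width of a normal-theory 95% CI for a mean of n observations."""
    return z * sigma / math.sqrt(n)

for n in (30, 300, 3000):
    print(n, round(half_width(n), 3))
# Tenfold more data narrows the interval only by a factor of sqrt(10) ~ 3.2.
```

So a bigger sample buys narrower intervals only slowly; 28 million Iraqis vs. 300 households is irrelevant to the validity of the estimate, it just shows up as width.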

As for our agreement on the take-home, fine. We can move on, somewhat. But it is important to come, as best as possible, to a consensus on statistical approximations and methods.

Let me be clear:
1. The authors do not clean the data (though, in a roundabout way, that’s what David is arguing they did). In the article, they report their analysis both with and without Falluja.
2. It is important to clear up these statistical issues because laypeople/media give these things more credit than they deserve…Lancet got a huge push; now Malkin and Lowell are both calling this study a “debunking.” It isn’t.
3. As to your argument that nobody disagrees that the war has caused a large number of casualties…I invite you to read the comments on littlegreenfootballs about this. Not only do they show that people are questioning that (how??), but also that these types of academic debates are misread to prove that academics have an agenda and manipulate, deceive, and clean data for political reasons.

Actually, one of the ironies of the Deltoid debates is that David is pointing out how an outlier can have a weird and unexpected result (a high outlier leading to a failure to reject the null hypothesis), but his fallback in response to the bootstrapping/not-unimodal argument is that because the CI is symmetrical, the underlying distribution must be unimodal (as though it couldn’t just happen to be symmetrical in this case).
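That fallback is easy to knock down with a toy simulation (an arbitrary mixture distribution, chosen purely for illustration): a distribution can be perfectly symmetric while being nothing like unimodal, so a symmetric interval proves neither unimodality nor normality.

```python
import random

random.seed(0)

# A clearly bimodal but symmetric distribution: an even mixture of two
# well-separated normals centered at -3 and +3.
draws = sorted(
    random.gauss(-3 if random.random() < 0.5 else 3, 0.5)
    for _ in range(100_000)
)

lo, hi = draws[2_500], draws[97_500]          # empirical 95% interval
print(f"95% interval: ({lo:.2f}, {hi:.2f})")  # symmetric about 0...
frac_center = sum(1 for x in draws if abs(x) < 1) / len(draws)
print(f"mass within 1 of the mean: {frac_center:.5f}")  # ...but a hole in the middle
```

The interval comes out symmetric about the mean, yet almost no probability mass sits near the center, so symmetry of the CI alone tells you very little about the shape of the distribution.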

I think your qualification regarding the huge loss of life is the right point. My question is why post your “debunking” with Malkin? The wingnuts have latched onto your work as proof of a liberal media conspiracy to distort the “truth” in Iraq.

Your arguments should be part of the discussion on this topic. I am troubled that the data is not available (the explanation by a researcher as to the reasons for that seems plausible but unfortunate).

However, intentionally or unintentionally, your work is being used by commentators who are not credible (on really any aspect of Iraq policy, frankly) and will be distorted by folks who will twist it and use it to justify whatever “unreality” based position they happen to be advocating.

Ultimately, I think the likes of Malkin discredit your work and the legitimate questions it may raise. Given your choice of venue, rather than shedding light, your legitimate analysis is discredited by those who wield it irresponsibly to justify the unjustifiable.

If you don’t believe me, visit the links discussing your work among wingnuts.

I agree that the discussion among some of Malkin’s fans is not, uh, that useful. But I did not seek to place my work at her site. I only sought to get it posted at Deltoid, where the discussion has been excellent, as expected.

Malkin did ask me for permission. (She reads Deltoid, despite (or because of) the fact that the focus there is beating up on (stupid) right-wingers.) Should I have denied Malkin permission just because of her politics? If someone from DailyKos requested the same permission, should I deny it? That makes no sense to me. I am an (occasional) scientist. Once I release a paper (even a draft), I am happy to have anyone and everyone read it. Is that a bad policy?

David —
By the way — Good for you. It’s not your fault that Malkin is a shrill harpy hack, and a writer rarely has control over who takes on and embraces their work. Some of my most prominent pieces were featured at Instapundit and at the Wall Street Journal’s roundup, and my politics almost never mesh with theirs.

Further, taking a 300-household survey and extrapolating is dangerous to do, certainly, but that’s why the CIs are so damn wide. As a sample size, 250-300 is quite reasonable considering what the authors are trying to do…size of the country be damned (assuming the country is one population, not two with different distributions). It’d be nice if the sample were bigger (so that the result would hold even with Falluja, and even granting David’s critique, if one buys it), but that doesn’t make the study questionable.

We can obviously give the authors a free pass on this point. I would imagine it is exceedingly difficult to obtain a larger sample size during a war. The fact that the CIs are so wide is accordingly excusable, but I mostly take issue with the fact that the authors were, from what I gather, writing a polemic on the tragedies of the war using data that no one else has seen.

Viewed as such, I’m exceedingly skeptical of removing an entire portion of the raw data to improve confidence intervals and reduce variance. You have obviously followed the discussion far better than I have (in my half-hour sifting through DK’s paper and some blog comments), so I welcome corrections to my impressions on the issue.

Please, a bit of truth would help here. David has been ideologically opposed to the Lancet study from the start. Remember: he accused them of fraud in a post on a Harvard website, which had to be taken down for its obvious extremism. The current paper is simply a continuation of his ideological crusade. Read through the Deltoid comments and you will see what expert opinion thinks of it. Here is a link to another discussion: http://crookedtimber.org/2007/07/27/alice-in-wonderland-and-the-lancet-study/
Here is some of what they say:
“The paper is a disaster. As the comments thread at Deltoid gradually teases out, it’s full of silly mistakes (the author constantly fails to make a distinction between an estimate and its confidence interval) and is based on a fundamental misreading of the paper (in that it assumes that the relative risk rate was estimated parametrically using a normal distribution when it wasn’t). But one doesn’t need to go into the maths of the thing to understand what’s wrong with it…”
Thank goodness he is identified with Harvard. All of this brings only shame and embarrassment to Williams.

If we think Fallujah is subject to a different data generating process from the rest of Iraq, then we should not include it in the same model. Since Fallujah is then a single data point, how should that single data point inform our estimate for the rest of Iraq?

Including Fallujah increases the dependent variable substantially, so one might expect confidence intervals to be larger. However, Fallujah is an outlier on the high side. Excluding high outliers would suggest that, unless we are also excluding comparably peaceful regions of Iraq from the analysis, we have underestimated the effect. That is, our lower confidence bound should be less affected than our upper confidence bound by including Fallujah.

Bootstrapping could verify this supposition pretty easily. Could you run the bootstrap and report back the results, Kane?
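In that spirit, here is a sketch of what such a bootstrap might look like on invented cluster numbers (emphatically not the Lancet data): resample clusters with replacement and read off percentile bounds.

```python
import random
import statistics as stats

random.seed(42)

# Hypothetical cluster-level risk ratios (made up for illustration):
# ordinary clusters plus one Falluja-like high outlier.
clusters = [1.2, 0.9, 1.5, 1.1, 1.8, 1.3, 0.8, 1.6, 1.4, 1.0, 12.0]

def bootstrap_ci(sample, n_boot=10_000, alpha=0.05):
    """Percentile-bootstrap 95% CI for the mean: resample with
    replacement, collect the means, take empirical percentiles."""
    means = sorted(
        stats.mean(random.choices(sample, k=len(sample)))
        for _ in range(n_boot)
    )
    return means[int(n_boot * alpha / 2)], means[int(n_boot * (1 - alpha / 2))]

lo, hi = bootstrap_ci(clusters)
m = stats.mean(clusters)
print(f"mean {m:.2f}, bootstrap 95% CI ({lo:.2f}, {hi:.2f})")
```

On this toy data the interval comes out strongly right-skewed: the upper bound is pulled far out by resamples that happen to draw the outlier several times, while the lower bound stays comparatively put, which is exactly the asymmetry a symmetric normal-theory CI cannot express.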

I want to start with the idea of not giving out data. I want to be as clear as possible, as well, because this error seems to be popping up all over Malkin’s and the right-wing blogs regarding replication and data availability:

The replication model of science that is so common in lab sciences (and I will admit to jealousy of this practice) is not nearly as common in the social sciences. Much of the data in the social sciences is sensitive and/or confidential. Further, the political nature of such data also makes it a much more closed environment in some cases (in this case, David has such a personal grudge against the Lancet studies, why would the Lancet authors give him data so he can use it to create more bashing studies to get right-wing press?). I sincerely wish, as many other social scientists have also agreed, that replication were more common in the social-science model, but it is not. In other words, for the Lancet authors to not share the data is not an act of hiding or obfuscation, nor is it weird, but rather quite normal.

Going back to the Hoxby debate as a prototype, here’s a paragraph from the WSJ on it:
“Dr. Rothstein also complained that for years Dr. Hoxby ignored his requests for the data she used. Economists frequently argue about the availability of data. In this case, Dr. Hoxby said her ability to circulate all her data was limited because some came from the National Center for Education Statistics, which restricts public access to some of its information to protect the identities of students and school districts. Dr. Rothstein contends that the data she ultimately made available don’t match up to the data used in her paper.”

Second, the authors do not leave out data points; they report results both with and without the outlier (standard practice). David claims that they used the incorrect method for calculating their CI and thus should not have rejected the null hypothesis. This most recent post on Deltoid is quite helpful (from “Bill”):

“Your argument, so far as I understand it, boils down to three parts:
1) L1 uses different distributional assumptions to calculate the CMR CIs from what it uses to calculate the RR CI.
2) If one uses the assumption used to calculate the CMR CIs to calculate the RR CI (underlying distribution is normal), then the RR CI includes one.
3) The assumption underlying the calculation of the CMR CIs is appropriate, and should be applied to calculate the RR CI.
On #1, you seem to be correct. On #2, I actually think your math looks ok, but this is irrelevant, because:

3 is so wrong. 180 degrees off. Normality is obviously a bad assumption here; nonparametric methods, e.g., the bootstrap, will be better. If you want to reconcile the discrepancy you found in #1, it would be much better to do completely the opposite of what you have done, and fix the CMR CIs. (Which, I should note, is exactly what Tim said in the post that kicked this off.)”

The replication model of science that is so common in lab sciences (and I will admit to jealousy of this practice) is not nearly as common in social sciences.

Perhaps this is why many people don’t consider the “social sciences” to be actual sciences? Replication is a pretty important part of the scientific process.

in this case, David has such a personal grudge against the Lancet studies, why would the Lancet authors give him data so he can use it to create more bashing studies to get right-wing press?

To improve the perceived credibility of their own study. If you’ve got nothing to hide, then there’s no reason not to release your data. Sure, people can distort it, but that happens anyway. Having the data allows others to verify and defend your results, assuming your results are in fact defensible.

In other words, for the Lancet authors to not share the data is not an act of hiding or obfuscation, nor is it weird, but rather quite normal.

False dichotomy. Just because it’s normal to hide data doesn’t mean that they’re not still hiding data, and it doesn’t make it right.

Sorry for the rant, I’m just really irritated at scientific proprietarism right now. The project I’m working on right now is basically a replication of a project that someone else did last year under slightly different circumstances, but because they won’t share any of their code or data with us we have to waste huge amounts of time reimplementing everything from scratch. Several people have spent months working on stuff that would be completely unnecessary if we could just use the already-developed, already-working code that they won’t share. And this is in the hard sciences too; the place you’d most expect people to have grown beyond petty ego-tripping.

I understand the rant–I hate it when people don’t make their data available for replication, but I think that doubting the Lancet authors on those grounds is being far too suspicious.

Perhaps your reason is why people doubt social science. If so, then people have a pretty piss-poor understanding of science (I think it’s more because of qualitative social sciences and how we teach undergrads, but we can save that for another posting). Replication does not necessarily mean you must always use the exact same sample (impossible at times). Further, sometimes code can identify sensitive information and thus can’t be shared. Replication in the social sciences is focused not on the data but on the analysis: can someone come up with the same analytic end point? Replicate the question and see if you get the same answer. That’s the key…not just “run the same analysis on the same data.” So when one person finds something interesting about education in the ADDHEALTH database (a popular one), someone else will see if they can find the same thing in the NELS database (another popular one). Some researchers have access to the first, some to the second. That’s still replication. In fact, in statistical analysis of quantitative findings, it might even be better replication.

In response to your point that the Lancet people might be covering their asses: yes, but that question was more rhetorical than anything else–David has shown his ideological point of view on their study quite clearly. Is that really who you want replicating your analysis? I’d try to find someone more, I don’t know, impartial?

Confidential/private information is crucial to much if not most of quantitative social science. For someone to give me confidential information, it generally takes promising secrecy throughout the process…that nobody but me can identify the person through any means. Considering the situation in Iraq, I wouldn’t be surprised if the Lancet authors had to promise that their data would not be shareable, i.e., that any shared data be aggregated (like the census, as I note in an earlier post). That’s not unreasonable.

And if that (or a similar rationale) is behind them not giving out the data, then it is not a reason to cast doubt on their study. Otherwise, social science would be without much very important data.

Do I like it that data sharing isn’t as common as it could be? No. But is it reasonable at times for data to be kept unshared, considering that we have human subjects and often politically charged questions? Yes. Is this one of those times? Quite possibly…I’d even say probably.

Nor am I naive enough to think anyone in any science is anything but egotistical:)