Should impact evaluation be justified by clinical equipoise or policy equipoise?

I finally read through a much-discussed paper by Stephen Ziliak and Edward Teather-Posadas, entitled “The Unprincipled Randomization Principle”. Casey Mulligan and Jessica Goldberg [Author edit: It is Jessica Goldberg, not Penelope, apologies!] exchanged thoughts on this paper a few weeks back, and Martin Ravallion also mentioned it [Author edit: Martin contacted me to note he actually references a different paper, apologies again!]. The overall topic of the paper relates to last month’s discussion on RCTs and ethics that was so ably conducted by Martin, Berk, David, and Markus. I stayed out of that particular exchange as I had written several previous posts on the topic, but having found “The Unprincipled Randomization Principle” so painful, in parts, to read, I now believe a bit of repetition is in order…

Yet before I get to my main point on the ethics of a control group in policy evaluation research, here are two asides on other aspects of “The Unprincipled Randomization Principle”:

Ziliak and Teather-Posadas (henceforth ZTP) make great hay of the randomista meme. A randomista is apparently defined as someone who only accepts evidence derived from randomized trials and rejects all other evidentiary sources – balanced observational studies, historical analysis, etc. As I have yet to encounter one, please tell me on which continent or planet the genus randomista resides. Here I echo Berk’s previous call: “could someone please point out their [randomistas] identities, provide a quote from one of these scapegoats?”

ZTP review the formative debate around inferential reasoning with statistics held between Fisher and Student in the 1920s. They contrast Fisher’s randomized experimental design with Student’s stratified observational design, and they criticize randomized studies as often subject to confounding imbalance in key characteristics (with practitioners simultaneously unaware of such confounding). This is a bizarre characterization given contemporary practice. Many, if not most, randomized designs are stratified random designs; some take Student’s advice to the extreme with pair-matched randomization. The overriding motivation for randomized evaluations is to control for imbalance in influential unobserved characteristics. Hence RCTs are typically very concerned with the balance of possibly correlated observable characteristics. Standard balance tables may now occasionally be excluded from the final published version of selected studies, but that does not mean these checks aren’t conducted.

OK, to the main point. According to ZTP, the “Unprincipled Randomization Principle” is unprincipled due to the existence of a leave-out observational control group, while the treated group receives a benefit. [Note: a leave-out control is not unique to RCTs, but… hmmm… randomistas and all.]

ZTP’s example of this unethical practice is the 2012 paper by Glewwe, Park, and Zhao, which measures the effectiveness and cost-effectiveness of providing corrective lenses to rural Chinese students. Behind ZTP’s stance is the sentiment that – here I’m paraphrasing – “students in need of corrective lenses will obviously benefit from their provision, so why on earth do we need a study to demonstrate this?” This sentiment engages the principle of clinical equipoise – the principle that justifies a leave-out control only when there is uncertainty over the benefit of a treatment. ZTP’s stated belief is that the eyeglass study does not meet the standard.

But as both David and I have previously written, it is far from clear that clinical equipoise should constitute the guiding ethical principle in the design of social policy research. From the standpoint of policy, the salient question is not whether an intervention produces any benefit, but how much benefit it produces and at what cost. While clinical medicine is chiefly concerned with the improvement of health, the motivation for all economic inquiry comes down to the founding observation that resources are finite. The question is not “does it work?” but rather “should public resources be utilized in this way at the expense of alternative uses?” Perhaps we need a parallel principle, call it “policy equipoise”, to guide the ethical assessment of socio-economic field trials.

Let’s leave aside the particular question of whether you or I believe the eyeglass study is ethical or not – maybe there are ways to assess the cost-effectiveness of this program without a leave-out control group. Although let’s be clear: this other, non-existent, study would have to explicitly consider or model the behavioral responses and investments that vision-challenged students and their families undertake in the absence of freely provided glasses (so this non-existent observational study would face rather severe challenges).

Rather, let’s focus on the existing ethical governance of field research – after all, such a force does exist in the world, namely the Institutional Review Board (IRB). I presume the eyeglass study underwent independent ethical review, as is the norm.

One main task of an IRB is to consider precisely these questions of equipoise. As the Ethical Guidelines for Medical Research published by the Indian Council of Medical Research phrase it, the IRB must determine whether the proposed research is “Essential” as well as ensure the “Non-Exploitation of Study Subjects”. So the decision to conduct the study comes down to whether the research question has been deemed essential by at least one accredited peer group. A critic may feel that the requirements of equipoise have not been met in the eyeglass study, but clearly the body tasked with this assessment did feel this threshold was met.

Framing the discussion in this fashion shifts the point of contention in a productive direction. Let’s no longer deal with the inert question of the ethical validity of one particular method. Let’s review the existing governance structure that is meant to ensure the ethical conduct of research. This would entail considering the following (non-exhaustive) questions:

Is the current IRB accreditation system sufficient to provide proper guidance for researchers?

If not, how should IRBs be accredited and governed?

How should IRBs be financed? (Note that some IRBs are paid fees directly by the projects reviewed, thus raising the possibility of conflict of interest.)

Does social science need to develop guidelines on “policy equipoise” as opposed to adopting the principle of “clinical equipoise”?

Are too many studies that a professional consensus might deem unethical being allowed to go forward?

Should “professional consensus” even constitute the proper threshold to assess ethicality? What about professional unanimity? Or a majoritarian view?

Should the IRB system be centralized, i.e. do we want something like a global clearinghouse for ethical review?

Or should we maintain the decentralized system we have now?

I have opinions on some of these questions, but I can’t answer many of them with full certainty at the moment. In my mind though, these are the critical questions of practice, not those taken up by ZTP.

Comments

Thanks for a good post, with lots to think about, including traction for 'policy equipoise'. I do have an immediate question. On this point (and with Jeff Hammer's recent comments about 'our' ability to answer questions about opportunity-cost (in Lahore) in mind), 'The question is not “does it work” but rather “should public resources be utilized in this way at the expense of alternative uses”?'... This strikes me as an argument for testing 2 interventions (or variations on one) against each other rather than leave-out(?).

Hi Heather, great comment, thanks! I completely agree - in the studies I know in the field, a policy-related intervention (or several) is tested against at least the "business-as-usual" approach, which constitutes a policy already in place.

Thanks for this, it echoes the frustration that so many people working in the randomized trial space feel at the straw man critiques repeatedly dredged up. One small note: The author of the CGD rebuttal to Casey Mulligan was Jessica, not Penelope, Goldberg.