The MCC's approach to impact evaluation

Submitted by Markus Goldstein
On Wed, 07/24/2013

co-authors: David McKenzie

In our continuing series discussing institutional approaches to impact evaluation, DI virtually sat down with Jack Molyneaux, Director of Independent Evaluations at the Millennium Challenge Corporation. (Please note that these are Jack’s opinions, not those of the MCC.)

DI: Impact Evaluation seems to be something that's pretty important at the MCC. Can you tell us a bit about how this focus came about?
JM: Since its inception, MCC’s mandate has included demonstrating results. Rigorous impact evaluations have always been a key component of that mandate.

DI: Given the MCC approach, where you have country programs with local implementing agencies implementing five-year compacts, who has the responsibility for impact evaluations?
JM: Typically MCC contracts independent evaluators, who have primary responsibility for the evaluations. In rare instances, the local Millennium Challenge Account (MCA) – the partner countries’ units charged with implementing the compacts – can contract the project evaluator. In all cases, MCC and MCA work with the independent evaluator to try to integrate the evaluation with program implementation.

DI: How do you decide what gets an impact evaluation?
JM: All significant components of our investments require some evaluation. Whether that is an impact evaluation or a performance evaluation depends on the feasibility, costs and perceived value of the evaluation alternatives. You can imagine that this concept of “perceived value” is very hard to quantify but it encompasses assessing whether the investments are productive uses of public funds. Theoretically this assessment should consider the value of demonstrating whether an investment contributes to sustained economic growth, as well as the value of learning lessons that can better inform future investment decisions.

DI: Relative to other forms of evaluation, in impact evaluation there is a premium on the evaluator/researcher being engaged earlier in the project, ideally in the design. Indeed, this could help the project team expand the set of different interventions that they are testing in the evaluation. However, within evaluation departments there is a long tradition of independence which can create tension with the more integrated role that impact evaluation can play. How are you dealing with this at MCC?
JM: The tension of independence is most evident and problematic when our evaluations focus narrowly on accountability – or on what you, Markus, define even more narrowly as “judgment”. Our early evaluations tended to have a narrower accountability focus and often encountered significant challenges integrating the evaluation with the program implementation. Evaluations that have focused more on understanding and assessing all components of the program design have generally been better integrated with our sector teams and the implementer, and have had fewer difficulties overcoming these tensions. Our current evaluation management approach requires more consistent attention to unpacking the components of the program theory/logic and better integrating the evaluators, sector experts and country implementation teams.

But we still struggle to make good use of the learning opportunities that rigorous evaluations could provide. We often find limited appetite among our country implementation teams to test components of the programs they are implementing. These are often people who believe it is their job to “know” what works. Some of our sector teams have started to push on this learning agenda.

DI: Do you think that impact evaluation has more of an impact on what policymakers (both in developed and developing countries) think than other kinds of evaluative evidence?
JM: My sense is that the easier it is for policymakers to understand and trust the results of an evaluation, the more influential the results can be. This is the beauty of RCTs; when our policymaker audience can understand what did or did not work in a program, it is hard for them to ignore the results. By contrast, results founded on complicated modeling are much more easily dismissed by implementers and policymakers.

DI: What are you doing to make sure that the results of evaluations you fund or conduct get used, both inside the MCC and outside? Can you give us a concrete example of where the results of an impact evaluation have changed or are changing the way something is done at the MCC?
JM: From the evaluation perspective, much of what we learned from our first five agriculture evaluations was about how we could have learned more, or learned better, rather than about how to implement better. However, our agriculture sector colleagues are actively using these results to more critically question the context and design of farmer training and technical assistance programs.

One of the clearest examples comes from the evaluation of the Armenia farmer training program. The agriculture sector team agreed early in the program to continue the farmer training program (and its randomized roll-out evaluation) even though implementation delays meant that it could be several years before the trained farmers would have access to project-improved irrigation. While the original purpose of the training was to prepare farmers for the irrigation, at the time there was a sense that the farmer training program should continue as planned. In retrospect the agriculture sector team feels we should have “pressed pause” and delayed the farmer training. That would have improved the potential effectiveness of the training, and the randomized roll-out of the training would have been timed to measure its short-term effects.

An application of this lesson appeared almost immediately. A Moldova farmer training program with a similar link to irrigation improvements was also experiencing delays in the planned irrigation improvements. The agriculture team pointedly insisted that the farmer training be delayed to minimize the gap between the training and the completion of the irrigation improvements.

MCC is transparent about its findings from the evaluations and shares its lessons with development stakeholders, especially other U.S. Government agencies. For example, we have been working closely with our colleagues at USAID – both in Policy Planning and Learning, and at Feed the Future. USAID is embarking on ambitious efforts to rapidly integrate rigorous evaluations in what they do. They have been extremely attentive to and engaged with us in our efforts to address our evaluation challenges; they are helping us think through the systems we put in place to address these challenges.

DI: How have you dealt with results that show less than stellar impacts?
JM: We are working with our sector counterparts to strengthen the feedback loop from evaluation to implementation improvements: we are identifying how the lessons from disappointing evaluations will change how we design, implement and learn. These less-than-stellar impacts create better opportunities to learn than do evaluations of successful programs. Our agriculture team has used unexpected results to seriously question components of their programs that had previously gone unquestioned. They are now asking challenging questions about how we implement farmer training programs and the conditions under which we expect them to work. They are actively seeking opportunities to test program components that were previously accepted doctrine.

DI: The MCC places a particular emphasis on the economic rate of return. Could you tell us a bit about this and how you use it in project selection and how the impact evaluation results that you support (and that others do) fit into these calculations?
JM: Our ERR calculations serve multiple functions. When they are conducted prior to an investment, they are a key input to each investment-specific assessment of whether its expected benefits warrant its costs. The ERRs formally identify the specific benefit streams that are used to assess the value of an investment, along with the relevant costs. They also explicitly identify the assumptions in the program logic that underlie the investment. ERRs are also re-estimated at compact close-out, reflecting any major revisions to program design or evidence of benefits, as well as actual costs and measured program outputs. When the impact evaluations provide improved information on program benefits or (less likely) costs, the evaluator is also asked to produce ERRs based on the parameters estimated in the impact evaluation.

Published impact evaluations, or any other relevant peer-reviewed publications, form the basis for the assumptions of the ERRs. We expect that our own impact evaluation results, along with other published results, will inform future ERR calculations whenever these results are relevant.
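To make the mechanics concrete, here is a minimal sketch of an ERR calculation, assuming the standard definition of an economic rate of return as the discount rate at which the net present value of yearly benefits minus costs is zero. This is an illustration only, not MCC's actual model: the function names, the bisection solver and all of the numbers are hypothetical. It compares an ex-ante ERR with one re-estimated from (made-up) impact evaluation benefit estimates.

```python
from typing import Sequence


def npv(rate: float, net_flows: Sequence[float]) -> float:
    """Net present value of yearly net flows; index 0 is the investment year."""
    return sum(flow / (1.0 + rate) ** year for year, flow in enumerate(net_flows))


def economic_rate_of_return(
    benefits: Sequence[float],
    costs: Sequence[float],
    low: float = -0.99,
    high: float = 10.0,
    tol: float = 1e-9,
) -> float:
    """Find the discount rate where NPV(benefits - costs) = 0, by bisection."""
    if len(benefits) != len(costs):
        raise ValueError("benefits and costs must cover the same years")
    net = [b - c for b, c in zip(benefits, costs)]
    f_low, f_high = npv(low, net), npv(high, net)
    if f_low * f_high > 0:
        raise ValueError("NPV does not change sign on the search interval")
    while high - low > tol:
        mid = (low + high) / 2.0
        f_mid = npv(mid, net)
        if f_low * f_mid <= 0:
            high = mid  # the root lies in [low, mid]
        else:
            low, f_low = mid, f_mid  # the root lies in [mid, high]
    return (low + high) / 2.0


if __name__ == "__main__":
    # Hypothetical figures: one up-front cost, two years of benefits.
    costs = [100.0, 0.0, 0.0]
    ex_ante_benefits = [0.0, 60.0, 72.0]   # assumed before the investment
    evaluated_benefits = [0.0, 55.0, 60.5]  # assumed impact evaluation estimates

    print(f"Ex-ante ERR:      {economic_rate_of_return(ex_ante_benefits, costs):.1%}")
    print(f"Re-estimated ERR: {economic_rate_of_return(evaluated_benefits, costs):.1%}")
```

With these illustrative numbers the ex-ante ERR is 20% and the re-estimated ERR is 10%, which shows how lower measured benefits from an evaluation would feed directly into a revised rate of return.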

DI: How do you approach -- both in terms of funding and implementation -- impact evaluations which may need to be finished outside of the lifecycle of a given country compact?
JM: MCC has a Due Diligence budget that supports most of the costs of the independent evaluations. (Most data collection costs incurred during the five-year life of a Compact are paid for out of the beneficiary countries’ M&E budgets; all else is paid for through the Due Diligence budget.) This budget is independent of the compact budgets and timelines. Some of it is used to help develop evaluation designs, and it continues to be used throughout the relevant evaluation period. Currently this evaluation period is limited only by our ability to find cost-justified opportunities to learn from or account for our investments.

DI: Where do you see your priorities or balance between funding efficacy trials (e.g. a particular intervention or policy works under ideal conditions on a small scale) versus effectiveness trials (e.g. a particular intervention works under real-world conditions with typical messy implementation) versus mechanism experiments (e.g. testing particular mechanisms through which a policy is supposed to work when testing the policy itself may be difficult)?
JM: MCC’s approach is to support five-year investments that our country partners and we agree are reasonably likely to contribute to poverty reduction through sustained economic growth. To do this, we typically seek investments that are ready to implement at reasonable scale and within our limited time-frame.

As a result, we try to build on existing evidence; we have done relatively little in the way of pilots and efficacy trials. But when we do conduct trials, it is typically because our country partners have proposed investments that we cannot find adequate evidence to support. In these settings, we have tried to use pilot programs to help our partners learn about their viability.

DI: How do you evaluate the quality of the evaluations that MCC supports?
JM: Our current approach involves both prospective and retrospective assessments. The prospective assessments focus on both the selection of evaluators and our management of their contracts. Our selection of evaluators is competitive: we engage our Economic Analysis and sector team members in selecting our evaluators for each evaluation. And we have recently adopted a procurement approach that enables us to consider a much broader set of potential evaluators.

We are now implementing much stricter management of our evaluation contracts, requiring that our full implementation team provide input on evaluation design materials and other early evaluation products, to ensure there is a shared, fully informed understanding of the program to be evaluated and the questions that need to be addressed.

The retrospective assessments are a little delicate. We make it clear in our evaluation contracts that the evaluators alone are responsible for their assessments of program impacts. We do provide feedback on issues of factual accuracy and technical competence of the evaluations, but we make it clear that they hold the pen, and are ultimately responsible for the content of their evaluations.

However, we also often engage independent peer reviewers, whom we ask to comment on the evaluation’s factual accuracy, technical rigor, clarity and contributions to the literature.