An eloquent 3 year-old would have been better asking “What the dickens are you talking about? Who is defining success? Who says failure is bad, anyway?” – Joe

Earlier I blogged about aid cheerleaders and critics. Each camp argues about the mean outcome of aid rather than the distribution of impact among projects. Both camps agree that some projects have positive results and others negative. So why not try to figure out which projects work and focus our resources on them?

I got some great and insightful comments and a few nice aid distribution graphs from readers. Here are some key themes:

The mean *does* matter if the distribution is random. In other words, if we can’t predict in advance what types of projects will succeed, we should only spend more resources if the mean outcome is positive.

Many people believe that on average the biggest positive returns come from investment in health projects.

We should also look at the distribution of impact even within successful projects, because even projects that are successful on average can have negative impacts on poorer or more vulnerable people.

Given the difficulty in predicting ex-ante what will work, a lot of experimentation is necessary. But do we believe that existing evaluation systems provide the feedback loops necessary to shift aid resources toward successful initiatives?

“Joe,” the commenter above, argues that in any case traditional evaluators (aid experts) are not in the best position to decide what works and what doesn’t.

From reader Steve White: "Here is my graph based on two stylized facts about aid projects: 1) most projects have very marginal impacts (agricultural tools to villages, microcredit, school construction, textbooks, scholarships, deworming...) and 2) some health projects have HUGE impacts (vaccinations, DDT, bednets)." The two bars represent impacts between -1 and 0, and between 0 and 1.

From reader Daniel Kyba: "Those which do a good job are the ones with defined and observable measures - profit/loss; live/die and so on. These measures provide a form of a feedback mechanism at the project level to which the aid provider can respond. As you move towards the world of fuzzy concepts and measures that is where the ineffectiveness occurs, due to the lack of feedback mechanisms and because there is less definition of success/failure."

Petr Jansky sent a paper he is working on with colleagues at Oxford about cocoa farmers in Ghana. The local trade association was upset that they could not get pervasive adoption of a new package of fertilizer and other inputs designed to increase yields. According to their models, the benefits to farmers should be very high. The study found that – on average – that was true, but that the package of inputs has negative returns to farmers with certain types of soil or other constraints. Farmers with zero or negative returns were simply opting out.

At first glance, these findings seem obvious and trivial. But they are profound, in at least two ways. First, retention rates are an implicit and easily observable proxy for net returns to farmers. We don’t need expensive outside evaluations to tell us whether the overall project is working or not. And second, permitting farmers to decide acknowledges differential impacts on different people even within a single project.
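The retention-as-proxy idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-farmer returns are invented (they are not data from the Ghana study), and the rule that a farmer adopts whenever their own net return is positive is a simplification.

```python
from statistics import mean

# Hypothetical net returns per farmer (arbitrary units); NOT data from the study.
returns = [4.0, 3.5, 5.0, -1.0, 0.0, 2.5, -0.5, 6.0]

def adopters(net_returns):
    """Assumed rule: a farmer keeps using the package only if it pays off for them."""
    return [r for r in net_returns if r > 0]

def retention_rate(net_returns):
    """Share of farmers who stay in: an observable proxy for positive returns."""
    return len(adopters(net_returns)) / len(net_returns)

average_return = mean(returns)   # can be positive on average ...
rate = retention_rate(returns)   # ... while some farmers still opt out

print(f"average return: {average_return:.2f}")
print(f"retention rate: {rate:.0%}")
```

With numbers like these the average return is positive even though a sizeable minority opts out, which is the pattern the paper describes.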

What other ways could we design aid projects to allow the beneficiaries themselves to evaluate the impact and opt in or out depending on the impact for them personally? And how would it change the life of aid workers if their projects were evaluated not by outside experts and formal analyses but by beneficiaries themselves speaking through the proxy of adoption?

Witness: Let A be the event of terrorism, and B be the event of Muslimism. Then P(A|B)≠P(B|A)

Congressman: What are you talking about?

Witness: You seem to be confusing the probability that a Muslim person will be a terrorist with the probability that a terrorist person will be a Muslim

Congressman: And you seem to be confusing everyone in this hearing, smartass.

Witness:

Congressman: What did you just call me?

Witness: It’s simple: the probability that a Muslim will be a terrorist is 13,000 times lower than the probability that a terrorist will be a Muslim. That is, the ratio of the probability of being a terrorist to the probability of being a Muslim is about 1 over 13,000 (P(A)/P(B)).

Congressman: So even the math department has been taken over by politically correct academic radicals who hate America?

Witness: Even if you think that the probability of a terrorist being a Muslim is 95.3%, the probability of a Muslim being a terrorist is only 0.0007%. That is less than the probability of a left-handed octogenarian Olympic discus-thrower being struck by lightning.

Congressman: Or maybe even less than the probability that anyone is listening to you?

Witness: Maybe this picture will help.

Congressman: I’m calling your state legislature right now to fire your radical butt.

POSTSCRIPT: response to commentator:

Mr. McKinney, perhaps your prejudices led you to misread the piece. 13,000 was how much larger one conditional probability was than another, which is helpful for understanding Bayes’ Theorem but not for policy. The policy-relevant probability is that of a Muslim being a terrorist, which based on a Rand report was calculated here as 0.007 percent.

One other probability you may want to consider is that Al-Qaeda’s recruiting will become more successful by a δ >= 0.0007 percent after you have persecuted the 99.9993 percent of Muslims who are innocent.
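For readers who want to see the witness's point about conditional probabilities worked through, here is a minimal Python sketch of Bayes' Theorem. The function name `flip_conditional` and all three input probabilities are assumptions chosen only for illustration; they are deliberately generic and are not the figures from the transcript or the Rand report.

```python
def flip_conditional(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    if not 0 < p_b <= 1:
        raise ValueError("P(B) must be in (0, 1]")
    return p_b_given_a * p_a / p_b

# Hypothetical inputs, for illustration only.
p_a = 0.0001        # P(A): event A is rare
p_b = 0.2           # P(B): event B is common
p_b_given_a = 0.9   # P(B|A): most A cases are also B

p_a_given_b = flip_conditional(p_b_given_a, p_a, p_b)

# The ratio of the two conditionals equals the ratio of the base rates.
ratio = p_a_given_b / p_b_given_a

print(f"P(B|A) = {p_b_given_a}, P(A|B) = {p_a_given_b:.6f}")
print(f"P(A|B) / P(B|A) = {ratio:.6f}, P(A) / P(B) = {p_a / p_b:.6f}")
```

The two conditionals differ by exactly the factor P(A)/P(B), so when A is far rarer than B, a high P(B|A) says almost nothing about P(A|B).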

Solving the education puzzle? Test scores and growth

Fri, 04 Mar 2011

There has long been a mystery as to why the rapid growth of education in poor countries did not pay off in growth of production per worker, above all in Africa (best captured by a classic paper by Lant Pritchett, Where has all the education gone?, ungated here).

Eric Hanushek at Stanford has been working for the past several years on test scores as a possible resolution of the puzzle. If education doesn’t translate into higher test scores, then there is something else wrong along the way, which likely includes well-known problems like absent teachers and missing textbooks. He showed this picture in a 2008 paper, and he has published a stream of papers since, all with coauthor Ludger Woessmann.

Growth is growth of income per person 1960-2000. Both growth and test scores are measured “conditionally,” that is, by how well a country does relative to its initial educational enrollment and income in 1960.

Of course, test scores are a potentially sensitive subject, as some will think they are tests of intrinsic intelligence. Is this whole area of research racist?

Not necessarily, of course. Let’s take racist stories of differing intelligence between nations off the table, and consider all the other factors that could be reflected in such widely varying test scores relative to educational enrollment and income.

I was just on a committee that selected a small number of papers from a large number of submissions for a conference. We each graded each paper and then we had to come up with a rule to go from our individual grades to a ranking of the papers to decide which ones got into the conference. So here are some possible rules:

(1) one veto kills the paper

The overall grade for the paper equals the minimum of all of our grades: if even just one of us flunks the paper, the paper flunks. You need to satisfy all of us. In econ lingo, you can’t SUBSTITUTE one of us with a positive opinion for another one of us with a negative opinion.

ANALOGY: the “weakest link” production function, in which whatever input the economy has least constrains the whole output. Note that zero substitution means that all inputs/committee members are perfect complements. This is the world view of those who like Big Pushes to increase all the development inputs at once.

(2) simple average

Averaging our grades goes to the other extreme of perfect SUBSTITUTION between us. One of us with a positive opinion cancels out (i.e. substitutes for) another one of us with a negative opinion. We committee members are not complements at all: the value of my grade is not influenced by your grade.

ANALOGY: the old Human Development Index.

Also in production functions relating Development to inputs, this rule implies extreme flexibility. Rich economies feature this selectively to compensate for weakest links — if the whole system is going to fail because of one input, then have a backup input that is a perfect substitute.

(3) geometric averages

This exotic animal (cube root of the 3 grades multiplied together) is in between (1) and (2). You can partially but not completely substitute for one of us with another one of us. So for example if we were just grading A,B,C (numerically 3,2,1), then a paper with the grades (2,2,2) has a higher geometric average than a paper with the grades (3,1,2), although they both have the same simple average under rule (2). We are also partial complements: the higher your grade, the stronger the effect of my grade.

ANALOGY: the new Human Development Index, which an Aid Watch post criticized for TOO MUCH complementarity. The higher was committee member Per Capita Income, the stronger was the effect of another committee member Life Expectancy, which has the unappealing property that we value lives of rich people far more than those of poor people. Makes more sense for production functions than for HDI.
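The three rules above can be compared directly on the two example papers. The sketch below is a hedged illustration, not the committee's actual procedure; the function names and the paper labels `even` and `uneven` are made up for this example.

```python
import math

def veto_rule(grades):
    """Rule (1): the lowest grade decides (no substitution)."""
    return min(grades)

def simple_average(grades):
    """Rule (2): grades substitute for each other one-for-one."""
    return sum(grades) / len(grades)

def geometric_average(grades):
    """Rule (3): n-th root of the product of the grades (partial substitution)."""
    return math.prod(grades) ** (1 / len(grades))

# The two papers from the text, graded A=3, B=2, C=1 by three committee members.
papers = {"even": (2, 2, 2), "uneven": (3, 1, 2)}

for name, grades in papers.items():
    print(name,
          veto_rule(grades),
          simple_average(grades),
          round(geometric_average(grades), 3))
```

The simple average cannot tell the two papers apart, the veto rule punishes the uneven paper hardest, and the geometric average lands in between.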

The ending of the actual committee story: qualitative discussions were necessary for choosing the final papers after constructing the mechanical indexes. Let me see what is the analogy here…

UPDATE 2: FEB 8 4:30PM: concluding coverage by @PSIHealthyLives @viewfromthecave of the Great Twitter War prompted by this post between @aidwatch and @earthinstitute, with collateral attacks on @m_clem, ending in a non-acceptance of debate by head of @earthinstitute.

UPDATE Feb 8 11:30am: sent comment with link to this post to DFID Independent Commission for Aid Impact. Spokesperson promptly responded, rejected my comment for consideration on technical grounds, but did warmly invite me to complete the anonymous online mass Survey Monkey. It probably doesn’t mean much, unless the Independent Commission has already learned the brilliant strategy of bureaucratizing the critics? I do feel a wee bit sorry for one of your Commissioners, the great John Githongo, who presumably did not risk his life so he could be reading anonymous results from Survey Monkey.

Nobody is more tired of the interminable Sachs-Easterly debate than one guy named Easterly…alas, I seem to be stuck in a kind of Critic Trap, in which the ideas criticized keep reappearing unchanged, requiring equally unchanged criticisms, keeping me in chronic peril of taking myself way too seriously.

So it was with great weariness that I heard the news that the British aid agency DFID (otherwise probably the best bilateral aid agency) is close to financing a brand new Millennium Village in northern Ghana, near Bolgatanga. I had hoped for something better from the new UK government, which had seemed like an improvement over the Blair and Brown (“We know the answers, just double aid”) team.

As it happened, I passed by the proposed MVP site last summer. The proposed villages are right on the main road in one of the most NGO-intensive places anywhere (see the sign below, in which NGOs apparently own the region).

The usual critique that selection bias of the Millennium Villages makes evaluation impossible may be somewhat relevant given the political realities that (1) the current government chose the villages for the MVP, (2) the incumbents have frequently promised to do more for the North, (3) the MVP came along and may be a high visibility way to keep that promise, and (4) ergo, the government will likely do everything possible to make the project succeed, showing nothing about scalability for thousands of villages elsewhere. In short, this new MV may be about as informative as my feeding my own children is informative on whether child nutrition programs work.

And how good is the track record of the MVP taking evaluation seriously? Michael Clemens and Gabriel Demombynes posted the following on the World Bank Africa blog last Friday:

In a June 2010 report called Harvests of Development, the Project claimed that the impacts of the project included expanded cell phone ownership. For example, the MVP claimed that increases in cell phone ownership at the Ghana site were caused by the project, in this extract from page 91 of the MVP report:

This claim has little basis, because cell phone ownership has been expanding at about the same rate all around the MVP site in areas untouched by the project. ….

But on Tuesday, months after multiple discussions we’ve had with MVP leaders on our research, a post on the MVP’s blog restated the claim that the increase in mobile phone ownership at the intervention sites was caused by the Project…

Sauri looks back on five years of success

Infrastructure: … The proportion of households owning a mobile phone has increased four-fold….

In short, independent observers made an irrefutable argument that a claim was invalid, the MVP heard the argument, seemed to accept it, and then repeated the previous claim unchanged.

Or in other words, if nobody is listening to any evaluations anyway, if I am bored and I am boring everyone else, why should I want to be Official Sachs Critic any longer?

Messrs. Clemens and Demombynes, you may want to check out a new job opening…

Development in 3 Sentences

Sat, 05 Feb 2011

I liked this formulation from the blog The Coming Prosperity, posted today as a link on Twitter:

If solutions are known, need $$. If solutions are knowable, need evaluations. If solutions are evolving, need entrepreneurs.

Consumer Warnings: This comes at the end of a long diatribe against You-Know-Who (associated with $$). I’m not sure the author is a reliable guide to other people’s work, since Yours Truly is incorrectly associated with “evaluations.” But I still like the 3 sentences above.

UPDATE 2: oops, author of only nomination so far says it’s not so positive– see comments

UPDATE: received first nomination of positive review

On Twitter, @bill_easterly noted yesterday’s Aid Watch post:

On Millennium Villages: this is not my own predictable response, this is independent guest post

Which immediately got the reply on Twitter:

intentional irony? your guest posts are as “independent” as any MV self-assessment

Aid Watch will let its guest posters defend their own independence, but in the meantime let’s find another guest poster that will pass our critic’s most stringent independence test. In short…

…could somebody please send us a strongly positive evaluation of the Millennium Villages.

Our critic rightly notes that self-assessment is not what anybody is looking for, so the only restriction is that the evaluators of course must not be part of the MV program themselves, i.e. must be independent.

This is not satire. Aid Watch would be very happy to hear from those evaluators of the MVs who have the strongest possible positive portrayal of the results of the MV intervention. We will post summaries of these evaluations without comment on Aid Watch.

What’s it like to live in a Millennium Village?

Fri, 21 Jan 2011

In Mayange, a cluster of villages about an hour’s drive south of Kigali, Rwanda, interventions by the Millennium Village Project across all sectors (in seeds, fertilizer, malaria nets, health clinics, vaccines, ambulances, water sources, classrooms, computers, cell towers, microloans and lots more) aim to lift villagers out of poverty within five to ten years.

What do we know about the effects of such ambitious projects on the lives of the people living in these impoverished, rural communities? A qualitative study by Elisabeth King, a post-doctoral fellow with Columbia’s Earth Institute, produces a fascinating if limited* “snapshot” of social impacts of the Millennium Village Project in Mayange. A few observations:

The villagers King talked to were reluctant to bare all to a foreigner asking questions about delicate social topics. Her questions about quality of life, trust, and social exclusion elicited some contradictory and evasive answers: “Life in Mayange…In general it is not bad, it is not good, it’s in between.” “I know people well. But then, people are private and one only knows one’s own problems.” “There are no problems. But there are always some small problems between people though.” King explained that in her previous research she found that Rwandans “downplayed negative aspects of social life and tended to embed negative reflections within positive pro-government ‘bookends.’”

MVP has good brand recognition and outreach; cooperatives sometimes increase cooperation. King found that the project was well-known among villagers, and almost all could name a change that had resulted from the project. Most were members of some kind of cooperative (farming, basket-weaving, bee-keeping) created by the project, and some described these groups as strengthening social bonds in the community or increasing women’s confidence by helping them provide income for their families.

Villagers thought that benefits from the project were unevenly distributed. In response to “Who gained the most from the project?” villagers answered most frequently “MV staff,” followed by “local leaders,” and villagers most willing to adopt new practices suggested by the project.

MVP may not be doing so well on the most basic thing – letting people say what THEY want. The most common suggestion was that the project should consult more with people in the community about what they want. One woman told King: “The MV has to meet with local community to learn more about what people really want because sometimes the MV brings things that the community doesn’t need or want. People may have good ideas.”

—

*King’s study is limited in several ways beyond lack of statistical significance (she spoke with 35 individuals and 8 focus groups in a population of 25,000 people). One, as a visiting Westerner asking questions about MVP, she can’t avoid being seen as tied to the project. Whether this makes her interviewees more timid in voicing complaints (for fear of losing some project benefit or subsidy), or more bold (in the hopes of gaining resources to address their troubles) is hard to say. Two, the Rwandan ban on talking about ethnic divisions prevents people from speaking candidly about this obvious issue in a place where resettled genocide survivors and released prisoners now live side by side. Three, King has no baseline data, so she can’t talk about changes in quality of life or social cohesion based on statements from before the project vs. during/after the project (see also: Clemens and Demombynes).

What a nice illustration of a serious problem in development empirics, known by the lusty, sensuous name of “heterogeneous effects.” If you find that handing out free bed nets lowers malaria, that result still only applies ON AVERAGE to the group covered by the study. Within this group, individual effects likely vary around the positive average, and there could be some sub-group for which the effect is zero. This is analogous to the Texas effect on voting: on average, being Texan makes you vote Republican, but this is an average of heterogeneous groups, some of whom (like the burgeoning Hispanic population) vote Democratic.

You could solve this problem by analyzing all the possible sub-groups. Unfortunately, both in politics and in development, this is unlimited, while research budgets and data are limited.
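A small Python sketch makes the heterogeneous-effects problem concrete. Everything in it is invented for illustration: the sub-group names, their population shares, and their effect sizes are assumptions, not results from any bed net study.

```python
# Hypothetical sub-groups: (share of study population, effect of the intervention).
# These numbers are invented for illustration only.
subgroups = {
    "group_a": (0.5, 0.30),
    "group_b": (0.3, 0.10),
    "group_c": (0.2, 0.00),  # no effect at all for this sub-group
}

def average_effect(groups):
    """Population-weighted mean effect across sub-groups."""
    total_share = sum(share for share, _ in groups.values())
    return sum(share * effect for share, effect in groups.values()) / total_share

def groups_without_benefit(groups):
    """Sub-groups whose own effect is zero or negative."""
    return [name for name, (_, effect) in groups.items() if effect <= 0]

print(f"average effect: {average_effect(subgroups):.2f}")
print("no benefit for:", groups_without_benefit(subgroups))
```

A study that reports only the weighted average would show a clearly positive effect while staying silent about the sub-group that gets nothing, and finding that sub-group requires already knowing how to slice the data.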

To illustrate imaginative sub-group possibilities, my own pathbreaking insight is that one reliable group of Republican voters is, well, how to be polite about this(!?), persons with somewhat larger belt sizes. Notice how many of the brownest, reddest states are Red States, while the Blue State strongholds are in the relatively thinner Northeast.

Also some sub-group effects could be spurious correlations. During my own struggles against middle-aged spread, I have not noticed any more inclination to vote Republican when my jeans size increases.

If this is all too methodological and obscure for you, then, congratulations, you are normal. On the off chance that you are willing to work hard on this stuff, you can get many unexpected lessons. For example, if you want a roly-poly Santa for the office party, ask a Republican.

Human Development Index debate…you still want more?

Sun, 19 Dec 2010

I suspect that we long ago exhausted the patience of our readers with our multiple rounds of debate on the Human Development Report’s new methodology for its Human Development Index. At the same time, I feel an obligation to let the other side of the debate have their say as much as they want. So here is UNDP’s new response to Martin Ravallion’s response to UNDP’s previous response to our original blog post criticizing the new Human Development Index as, well, crazy.