The Chief Minister Posed Questions We Couldn’t Answer

I was recently at a conference in Lahore, Pakistan, sponsored by the International Growth Centre, where the keynote address was given by Shahbaz Sharif, the Chief Minister of the province of Punjab, Pakistan (100+ million people). While it was fun to see old friends and colleagues, the conference was a little depressing in the way it reflected the state of the development economics profession.

The Chief Minister posed serious questions that have traditionally been the bread and butter of the economics profession. Unfortunately, we are not even trying to answer them any more. His specific questions were “Should I put more money into transport? Infrastructure (power, roads, water)? Law and order? Social services? Or what? And where am I going to get the money?” What questions could be more solidly part of the core of economics than these? Unfortunately, none of these were even remotely the focus of the “evidence-based” policy making discussed.

Almost all of the cases analyzed were single, simple policy “tweaks” that were, first of all, isolated from the broader market context in which they occurred and, second, had no conception of opportunity cost – what we would have to give up to pursue these things. We had an answer to “how to improve a public food distribution system” but even with a precise answer (to whether a tweak would work) we had no idea whether the substantial amount of money funding such a system is a good idea. Maybe the Chief Minister would be better off improving education or road networks or police or rural electricity. Some of these alternative policies could have more impact on food consumption than food distribution if we thought about how the world worked. Getting food to market securely (roads, better cold storage, trustworthy police and safe roads – this is Pakistan, which no one seemed to notice) may increase food availability much more than any tunnel-visioned food program. Or not – maybe the food distribution system is better. We just don’t know. And none of us “experts” are trying to find out.

On spending priorities, what we need is the old-fashioned notion of opportunity cost. “Evidence” now is “did something work?”, meaning did it have any effect at all, or “can we get it to work a little better?” But the real question in such a resource-constrained economy is “does it work well enough to take money away from the power plant it prevented or any other thing money could have been used for?” Or even, “is it better than leaving the money in private hands by not collecting the taxes to pay for it?” Besides not knowing the marginal welfare cost of taxation (anyone remember that?), we forget that poor people use their money for food, so the first-order effect of tax revenues is to make poor children hungrier. Is the benefit from secondary education or bicycles or the fertilizer subsidy so good as to impose this cost on these children? We don’t know who ultimately pays taxes (when wages, for example, respond to indirect taxation) but it is likely that poor people, the majority of the population, pay at least some substantial share. And we don’t know how badly distorted the tax system is – in its very structure, not just in its administration. The incidence and efficiency loss of the whole structure of taxation are the first-order answers the Chief Minister needs. No one studies these anymore.
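
The comparison described above can be written down as a rough sketch. Everything in it is an assumption made for illustration: the project names, the cost and benefit figures, and above all the marginal cost of public funds (set here to a hypothetical 1.3), which is exactly the number the post says nobody measures any more. The sketch drops any project whose estimated social benefit does not cover its outlay valued at that marginal cost, i.e. anything that does worse than leaving the money in private hands, and ranks the rest by net value per unit of budget.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    cost: float     # budget outlay, in any common currency unit
    benefit: float  # estimated social benefit (private returns plus externalities)


def net_social_value(project: Project, mcf: float) -> float:
    """Benefit minus the outlay valued at the marginal cost of public funds (mcf)."""
    return project.benefit - mcf * project.cost


def rank_projects(projects: list[Project], mcf: float) -> list[Project]:
    """Keep projects that beat not collecting the tax at all, best first."""
    worthwhile = [p for p in projects if net_social_value(p, mcf) > 0]
    return sorted(
        worthwhile,
        key=lambda p: net_social_value(p, mcf) / p.cost,
        reverse=True,
    )


# All figures below are hypothetical placeholders, not estimates from any study.
MCF = 1.3  # assumed welfare cost of raising one unit of tax revenue
projects = [
    Project("food distribution tweak", cost=100.0, benefit=120.0),
    Project("rural roads", cost=100.0, benefit=180.0),
    Project("urban sewerage", cost=100.0, benefit=150.0),
]

ranked = rank_projects(projects, MCF)
for p in ranked:
    print(p.name, round(net_social_value(p, MCF), 1))
```

With these made-up numbers, a tweak that "works" (its benefit exceeds its budget cost) still fails once the cost of raising the money is counted. The hard part is not this arithmetic but estimating the benefit figures and the marginal cost of public funds in the first place.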

When someone says “we should have more “X” because we have evidence that it works”, the response should be “compared to what?” What should we cut in order to promote your particular interest? My hobby horse these days is more sanitation in South Asia. I should have to defend it against (at least) a few alternatives.

It’s not like we have no basis for making this comparison. We usually try to determine which things the private sector (i.e. almost everyone – farmers, bicycle manufacturers and repairmen, truckers, shopkeepers, halal sausage casing makers) can be safely relied upon to produce, where it goes somewhat wrong (exactly how bad are private schools or doctors?) and where it is a flaming disaster such that the government is utterly indispensable. While we’ve all drawn the gap between public and private costs (or benefits) to help us talk about optimal Pigouvian taxes, when was the last time anyone tried to measure this one, central concept for valuing interventions in developing countries? Or in developed countries, for that matter? We look at enrollment rates (or even learning rates) but never ask “how much is this secondary education worth, and how much of that isn’t captured by the student?” Further, since there is no reason to think the number is the same in any two places, even if there were a couple of such studies, it wouldn’t make up the bulk of what we call policy-relevant research. And it’s not like it’s easy to do, so we can’t just say “let the practitioner-types do the (routine) calculations”. There is nothing routine to it at all.

In the conference, several research projects measured an effect (not an externality, not a welfare loss – just an effect) that could well be part of an almost completely private good with no serious market failures to speak of. Can it really be the case that date exporters genuinely didn’t know that packaging for export was available (and wouldn’t a phone call to either the exporters or the marketing wing of the packaging producers suffice)? Did football producers really need to know a better pattern for cutting pentagons out of leather when mechanized stitching (as the commentator on that paper noted) is swiftly changing the entire production process worldwide? Will the competition that is currently mechanizing allow firms to exist even with the 10% higher profits that a better pattern enables? And are policy makers (even with Ivy League economists as their advisors) really going to make better decisions than those producers or, much more importantly, the competitive forces in the economy?

My defense of my promoting sanitation is that I contrasted the value of health via providing public goods (sewer systems in cities) to spending on publicly provided health care (a rival and excludable service – I’m avoiding the “p” word, this being the sub-continent). I don’t know if I’ve cleanly identified the effect that I purport to have measured – whether open defecation without sewage in slums damages the health of its residents – but it makes sense, is tied to most people’s notions of the nature of public and private goods, and gives some evidence of an externality. One reason to avoid specifying which service should be sacrificed is to avoid fights. Even fairly convincing evidence that publicly provided healthcare is of questionable value can provoke uncomfortable arguments. But not even mentioning the opportunity cost of a proposed policy is irresponsible.

On collecting more taxes: this is, of course, a core government activity. Any way we can efficiently get more money into government coffers to support critical public services is to be applauded. But what we were treated to was a two-year experiment on something that looks like tax-farming (and indeed, was titled as such). Higher powered incentives to collect taxes? When you’re being watched? Tax inspectors didn’t know an experiment was underway? Even if it was double blind (which it was not), can a two-year project using currently recruited tax inspectors (i.e., those that entered public service expecting to get a salary without having to work too much) anticipate what happens in equilibrium when everyone figures out how to make money from these high-powered incentives? That is, core government service or not, there is a labor market in which the people whom this experiment purports to study operate. It is the nature of the long-run equilibrium of that market that is the proper level of analysis for policy purposes, not the behavior of the particular individuals who happen to have the job at present. As the commentator on that paper noted, the proposal looked like the medieval version of tax farming. But that scheme always deteriorated in time (longer than a two-year experiment would tell us) into an ugly system that brought down rulers.

The Chief Minister is a committed and capable man. With the recent elections behind him, he has the opportunity to actually accomplish things. He deserves much better support than we’re giving him.

Comments

Very true. The tidal wave of RCTs (which I am not against) has pushed development economics more and more into answering relatively small and ultra-precise questions ("is it better to sell fertilizer to small farmers at price X, Y, or Z? The extent of hyperbolic discounting?") to the detriment of the less-precise but bigger-order questions.

Long-time friend and colleague, Jeff Hammer, seeks to get us back on track in development economics. Way to go. Some time back, Robert Frank of Cornell reported on how technical economics distracts students from key issues. In his NYTimes columns he emphasized the same point as Jeff: Look at the opportunity cost associated with any investment choice. Calculate, or at least estimate, the benefits and costs associated with, and derived from, a given investment. Focus principal attention there and we can help those Chief Ministers make sound decisions. The year 1984 was my last to visit Pakistan, three decades ago. Let's hope the next three decades boost Pakistan's and the Punjab's economies by more than the last three. Thanks, Jeff.

Great post, and I fully endorse the sentiment. Governments that do think about evidence systematically often go for exactly the kinds of important questions the Chief Minister was asking, not the narrow and very precise ones that budget officials really don't care that much about. In fact, I very recently did a blog post on that very topic, arguing that RCTs are interesting, but quite marginal for real-life governments:

Jeff, thanks for the wide-ranging thoughts. There is a lot to agree with here, including the points on the decline of research on topics such as taxes and thinking about the welfare implications of public versus private provision.

One thing I am less clear on is how to measure opportunity costs well (or maybe better). If we put a dollar into sanitation intervention X at the expense of education intervention Y, shouldn't we know what the effects (private returns + externalities) of X and Y are? And I think impact evaluation is a really good way to measure them, which suggests to me we don't have enough impact evaluations to get a fuller sense of the tradeoffs we are making (instead of doing the one or two things we know have a positive return).

Markus,
Thanks for the comments. On practical and technical grounds I don’t see the set of alternatives to any real policy yielding to isolated experiments. I don’t see an alternative to having theory (studied in whatever ways we can) filling in lots of gaps. For many policies (anything to do with changing prices, for example) studies of the RCT variety will not have a sufficient number of “arms” to study all relevant cross price effects (let alone the market failure in each of these related markets). Not enough power. And if we were to go for studies that would cover enough such alternatives, we’ll still need some structure (like demand systems, production possibility frontiers) to guide us over the, necessary, gaps in data. These could be built up derivative by derivative, product by product, country by country by individual experiments but all such policies based on them will be irrelevant before we’re half done.
The other big problem is that of the size of the relevant problem. Are we going to experiment with road systems (not the individual roads – unconnected roads are irrelevant: I know this because of… ummm… common derision of “roads to nowhere”? Or have people build in useless places just to avoid the endogenous placement problem), port location, power grids, sewer systems, mass transit? The big problems of development could be those of large public infrastructure. Our experiments are often with private goods since these are easily measured by household or factory surveys.
In order to make a problem tractable for experiments it is often conceived in ridiculously narrow terms. One experiment I like is the northern Ghana work of Karlan and Udry since it captured the conditions of farmers in broader terms than other studies of fertilizer use. When a fuller picture, involving the risk that farmers face in output markets, was included, the “irrationality” we imputed to them seemed to be attenuated by realizing they were solving a different problem than the researchers were. But that is an old insight – Lipton 1976? (72? 78?).
If experiments help us with containing air pollution (or global warming?) or measuring the labor market equilibrium of a better-trained workforce (and the externality of that training), that would be great. But I doubt they will.
Some things should be studied with experiments if that’s the right method for the right problem. But the problem should come first, the appropriate method to be matched to it.
Jeff

Interesting and thought-provoking post. As to what to make of all the little experiments showing that a default change here, a reminder there, a nudge there, an incentive there, etc. lead to little improvements, I have two possible responses:
1) We have trouble finding any evidence that any big policy really works that well. So then perhaps development really is a slow process of making small tweak after small tweak to an existing system - and if you make enough changes which each gain you 2-5%, over time they add up.
2) We should think of all these little experiments like venture capital, where every now and then we discover something that delivers exceedingly high returns/big changes (e.g. defaulting to opt-out rather than opt-in on organ donations).

In terms of your point about tax collection, theory might tell us to scrap the entire tax system and start again from scratch (certainly that seems optimal in the US system), but politics makes this impossible - so then small tweaks to an existing system may be the best we can hope for, and then we want to know which ones seem to matter.

David - no argument really with point 2)... of course you should have venture funds, especially if it's not the Gov't of Pakistan but international donors who fund it.

But 1) doesn't sound right to me. Yes there's no evidence on big policy, but we do know that some countries are way better at "government" than others. Big policy happens and senior officials as well as politicians have to make do with what they've got and take big decisions accordingly.

You can accumulate rock solid evidence on small decisions at the margin, but the big prize is to help governments make big decisions marginally better.

Hmmmm… no big policy changes that worked. Let me see… Oh yeah. Korea did something in 1962. It might have been liberalizing interest rates, promoting savings and having producers face international prices. It might have been industrial policy and deliberate export promotion. I rather think the former but this is a period worth studying having raised the income of the country from dirt poor to quite rich in two generations, massively raising the incomes of (now 50) millions of people.
China in 1978 took a wild guess that agricultural supply curves sloped up (where they got that idea, I’ll never know). They were right. Rural incomes skyrocketed. This was followed by attempts to get the same results in industry. Was that an “experiment”? The first, not at all – it was theory (supply and demand) and common sense. The latter? In the sense that it was done in stages and they watched what happened, perhaps. No one recommends abandoning any empirical work. But no randomization, no careful application (by graduate students) of some cute gimmick. The result was hundreds of millions of people getting massively better off. Second order big policies included managing transport networks along the (one – so no chance of a control) coast and ensuring a reliable electricity grid. The contributions of those are eminently possible to study but certainly not by an RCT.
India in 1991 dropped tariff rates dramatically (politics or not – a long story) and allowed quite a few markets to operate much more freely. Just as policy analysts familiar with economic theory (even as long ago as that!) would have, and did, suggest. Income of hundreds of millions of people rose dramatically since. Twenty years later the fact that not all of the billion people fared so well is a real problem that is, also, eminently study-able or would be if the statistical office had kept more standard data (as in rich countries) so that things could be tracked, analyzed, theories developed and checked against data.
As with the rest of Southeast Asia. Lots of policies happened and either they worked or were irrelevant but, in any case, can be, and have been, studied with other kinds of data. These studies are never perfect but have yielded a large number of well-agreed upon conclusions along with some stubborn areas of disagreement that may be studied but will not yield to anything like RCT’s.
So, I would contend that all the policies inspired by the little experiments that ever were and ever will be conducted will never help even a small fraction of the many hundreds of millions of people that important policy changes have helped, of which the above were just a few examples. Of course, they will not be studied by our best economists because of what is and is not acceptable in the profession.

Martin, a very interesting post, more so because it looks at a fundamental if complicated situation in Pakistan. We should meet soon so you can brief the ambassador and me on your views on the conference and any tangible and practical outcomes that may have come about at the end of it.

Take a look at MCC.gov that uses evaluation to test assumptions about which approaches work. AND, cost-benefit analysis to inform selection of cost-effective investments. It doesn't have to be about this evidence or that evidence. . .

Hi Salman,
Of course we need a "compared to what". Fortunately, that one's an easy one, with lots of private goods with few externalities and not-particularly-well-targeted subsidies being provided by government that sanitation, with a growing body of evidence and obvious public good characteristics (particularly in cities), is very likely to beat. Getting rid of open defecation in rural areas is harder, of course, and we should study that - maybe even with experiments if appropriate. Check out www.riceinstitute.org for some of that evidence.
Jeff

Salman - You need to be careful about statistics given in reports. I have worked on water and sanitation issues in Pakistan for over 30 years. The percentage of people with access to a toilet is much higher than 50% - probably in excess of 90% in both rural and urban areas in northern Punjab, for instance. The challenges around sanitation relate to wastewater disposal (very little) and operation and maintenance. Cities like Lahore are largely sewered (albeit informally in many peripheral areas). People without access to sewerage normally connect their toilets to the nearest drain, often but not always via a crude interceptor tank. There is virtually no wastewater treatment. The issue with both formal and informal sewage disposal systems is maintenance and that brings us to the main challenge in Pakistan, as far as municipal infrastructure is concerned. I agree strongly with Jeffrey's basic point, that economists and international aid agencies in general are often concerned about peripheral issues and don't tackle some of the clearly important issues. I agree with other respondents that cost-benefit analysis is not always easy, but it does not take rocket science to see that a water supply system or sewer that is not maintained or to which no-one is connected will not cover its costs. I would recommend that anyone who is interested in development in Pakistan should read the book 'So Much Aid: So Little Development' by Samia Altaf. One point to come out of that book is the need to be realistic and honest about what our initiatives achieve - development planners need to get beyond the meetings with top government officials and explore reality on the ground, which is almost always very different - more difficult and more complicated - than we assume.

Thx for your post. This idea is not new. Please have a look at the "Growth Diagnostics" (Hausmann, Velasco, Rodrik) - quite qualitative, but the basic work for development economists when discussing lifting constraints to development.
RCTs complement this overall diagnostic tool by helping to assess the specific instruments or partial engagements in specific sectors. Thus, both are needed.

Julian: As I recall, the Hausmann-Rodrik-Velasco growth diagnostics did not identify interventions in terms of the rationale for public intervention. That is, they identified binding constraints to growth, regardless of whether the government should be spending scarce public resources on relaxing it (because there's a market failure) or not. Yet, this is the point that Jeff was making in this post. I think we need to devote more research resources to identifying and measuring these market failures because that is the relevant answer to the chief minister's question of how much he should spend. Shanta

Agree completely with Jeff.
I am setting up India's first independent evaluation office and what we need are more impact evaluations in a cost-benefit framework. I just chaired a wonderful talk at the Institute of Economic Growth by Prof James Heckman, who has spent his whole life doing this kind of work on human development. For example, in India we have spent over $40 billion on the Mahatma Gandhi Rural Employment Guarantee Program but we don't know how useful it's been and whether we could have spent the money differently to achieve the same objectives. Right on, Jeff.

Your point on the need to study the general equilibrium effects of such interventions is well taken; however, it seems a bit contradictory when the meat of the blog post is about not being able to address the Chief Minister's concerns.

Unfortunately, the CM is also not concerned with the GE effects of such an intervention; he wants the biggest and quickest bang for the buck or, rather, vote for the buck. And it seems he has figured out that the answer to that lies in large road and transport infrastructure projects.

Also, I'm glad you mentioned the welfare impacts of taxation, as that's what the real policy issue on taxation should be. Again, however, it's not something that the CM is interested in. In the current political economy, all the talk in Pakistan is about increasing tax collection and improving the tax-to-GDP ratio. To meet WB and IMF benchmarks, the government has focused on the more politically viable deepening of the tax base instead of the more optimal widening of the tax base. In such an environment, talking of the welfare impact of taxation will fall on deaf ears.

Also, on your very reasonable expectation that the intervention should be compared to other alternatives, we can look at two policies that the Punjab government tried to implement to improve tax collection. Firstly, the rather arbitrary luxury tax on large houses (which thankfully has been taken to court and is currently under litigation): http://www.nation.com.pk/business/06-Sep-2013/lhc-asks-punjab-why-luxury-tax-imposed-on-houses-in-selected-areas

Secondly, an attempt to increase property tax rates has failed to pass through parliament because of the politics surrounding such policies: http://www.dailytimes.com.pk/punjab/15-Feb-2014/old-property-tax-rates-restored-pa-adopts-punjab-finance-bill

Compared to these two policies, tax farming may be the most feasible and least disruptive, even though it represents low-hanging fruit when it comes to ways to increase tax revenue.

This article by Jeffrey Hammer is very welcome and important in that it raises a systemic issue (regarding tax incidence) behind whether projects in a particular sector are worth doing or not relative to other projects in other sectors, and it also points to the highly incremental nature of the current spate of impact evaluation research, which raises serious questions about its ultimate ability to guide broad strategy for development. The taxation issue is, however, even more important systemically than the paper suggests. One of the key reasons for project failure, as we all know, is failure of local 'ownership.' Local ownership in turn is closely linked to accountability, as we all know. One of the key elements of accountability arises from a financial stake in results. Where externally aided projects are part of a broad and intrusive foreign aid sector (i.e. where there is high aid dependency, as in many African countries) there is a break in the principal/agent chain that weakens local accountability, but there is also a weakening in the financial stake in results of the Government and people, since it is not necessary (to the extent of the level of aid) for Parliament to appropriate tax revenue to pay for public expenditure and not necessary to undergo the difficult negotiation with the citizens and voters about raising their taxes which is at the heart of a democracy. This aggravates the serious deficiencies in institutional development that we all know about. While impact evaluation of a project can potentially provide useful technical pointers to what to do within specific sectors within specific locations and conceivably beyond these specific situations (assuming that the experimental design is OK, the econometrics are done right, the participants cooperate and the funding is enough to pay for it all), the systemic problems that undermine aid effectiveness are, however, not touched.
If systemic issues are not touched then impact evaluation does not shed light on a key set of reasons why projects fail. If we want development aid to have any success it might be better to put much more effort into understanding how political and institutional barriers undermine programs and projects and how they can be designed to compensate for such barriers and less into developing technical libraries of hopeful good practices. (www.developmentwithoutaid.com).

I am glad to see Jeff pointing out the straitjackets that the RCT culture has imposed on the development discourse. It almost feels like we are living in a world of “have a method, propose a question that can be answered with this method” without asking whether the question is the right one to ask.

Clean experimental design often requires that the world does not move fast enough to “mess up” the experiment and so the experiment must be evaluated within a short time frame. This often tends to focus our attention on quick fixes – what Jeff calls tweaks – rather than long term solutions or solutions that depend on several changes working synergistically.

This equivalent of searching for keys below the lamp post has overwhelmed the field of development studies to a level where it crowds out other, more fundamental, questions as Jeff points out. However, the policy world does not stop just because economists are worrying about whether giving bicycles to girls has an impact on school attendance or not, it goes right on making large policies that are often enshrined in constitutions (e.g. Right to Education Act in India) with long lasting consequences while simultaneously pushing aside social science research as being irrelevant.

Further to my rant - some people will object to the argument about taking account of systemic issues by asserting that these issues are a given - and meanwhile we have to address the 'technical' questions. Quite apart from the highly optimistic view that we can definitively solve the technical issues, especially across countries and regions, there is a simple response to this objection. Why not treat project success as a given (there are good projects and bad projects and given the characteristics of the aid industry there always will be) and home in on the systemic, institutional issues ("ownership", "accountability") that doom many projects before they have even got off the ground?
