The watchdogs never achieved scale. The largest among them, Charity Navigator, evaluates about 8,000 of the active charities in the country, or 1.1%. And none of the new effectiveness organizations are on a trajectory to get much larger, even using relatively inexpensive and simple evaluation methods. For example, GiveWell, one of the best of the bunch, measures only 413 organizations, or .059% of the active charities in the country. Why is this so? None has yet discovered a revenue model that can achieve dramatic growth.

While Dan is spot on to highlight the importance of evaluation in the social sector, his focus on scale is premature. The fact is that none of the evaluation methodologies available to us today are worth scaling up.

Evaluation is complex, and Dan’s call for one evaluation agency to rule them all reflects the oversimplification common among those who wrongly believe that evaluating non-profits is a specialty in itself. Part of the reason groups like Charity Navigator have failed to evaluate non-profits effectively is that they have tried to evaluate agencies that are too dissimilar from one another. Comparing the incomparable necessarily forces evaluators to rely on pointless metrics like overhead ratios.

Perhaps more damaging, the insistence on evaluating all types of agencies distracts from the more important work of developing sound methodologies in any one of the numerous philanthropic sub-sectors, such as environmental issues or poverty. Our ratings suck not just because of a lack of money or imagination, as Dan suggests, but because of a lack of focus and specialization.

An evaluation specialist is not necessarily someone who understands the managerial side of non-profits. Ideally, an evaluator would be an expert in the social or environmental issue a non-profit aims to influence.

If I want to evaluate an environmental organization, forget the non-profit consultant; give me an environmental expert. Yet the lauded evaluation agencies of today by and large fail to take an issue-specific approach to outcome evaluation.

We have a lot of work to do before bringing evaluations to scale. The first step is abandoning the pointless pursuit of a single evaluation framework that compares all organizations with one another, no matter what they do. If we can agree on that, we can begin the serious work of evaluating social and environmental outcomes. Eventually, we might even have something worthy of scale.