At the Yale field experiments workshop today, Katherine Casey from Brown University (soon to be Stanford GSB) presented a brilliantly executed study by herself, Rachel Glennerster, and Edward Miguel evaluating the impact of a community driven development (CDD) program on local public goods and community social institutions in rural Sierra Leone. Here is a link to the working paper (PDF). I think this paper is a must-read for those interested not only in decentralization and democratization of rural social institutions in poor countries, but in field experiments, policy analysis, and causal inference more generally. In fact, I would suggest that if you have an interest in any of these things, you stop what you are doing (including reading this post) and look carefully at their paper right now. Here is the abstract:
Although institutions are believed to be key determinants of economic performance, there is limited evidence on how they can be successfully reformed. The most popular strategy to improve local institutions in developing countries is “community driven development” (CDD). This paper estimates the impact of a CDD program in post-war Sierra Leone using a randomized experiment and novel outcome measures. We find positive short-run effects on local public goods provision, but no sustained impacts on fund-raising, decision-making processes, or the involvement of marginalized groups (like women) in local affairs, indicating that CDD was ineffective at durably reshaping local institutions.
They indicate that, for the most part, these results are consistent with what other CDD studies have produced, raising serious questions about donors’ presumptions that CDD programs can really affect local social institutions. In a recent review of CDD impact evaluations, my co-authors and I found the same thing (see here, gated link). Given the centrality of CDD programs in current development programming, this comes as a call to reflect a bit on why things might not be going as we would like.
For those who don’t really care that much about CDD, there are four methodological aspects of this paper that are simply terrific and therefore warrant that you read it:
- They very effectively address the multiple outcomes, multiple comparisons, and associated “data dredging” problems that have plagued research on CDD in particular (see again our review essay) and pretty much every recent analysis of a field experiment that I have read. Their approach involves a few steps, with the last step being the most innovative. The steps are, first, articulating a clear set of core hypotheses and registering (via a Poverty Action Lab evaluation registry) these before the onset of the program; second, grouping outcome indicators as the bases of tests for these hypotheses; third, pre-specifying and registering their econometric models; and, finally, using seemingly-unrelated regressions (SUR, link) to produce standard errors on individual outcomes while taking into account dependence across indicators, and then using omnibus mean-effects tests to obtain a single standardized effect and p-value for each core hypothesis.
For example, to test the hypothesis that the program would increase lasting social capital, they have about 40 measures. The SUR produces dependence-adjusted standard errors on each of these outcomes, and then the omnibus mean-effects test allows them to combine the results from these individual regressions to present a single standardized effect and p-value for the social capital hypothesis. That’s a huge step forward for analyses of field experiments. Effect synthesis and omnibus testing like this need to become much more routine in our statistical practice (see here for a recent post on omnibus tests of covariate balance).
- Their hypotheses are motivated by a clear theoretical model that formalizes what the authors understand as being donors’ and the Bank’s thinking about how CDD affects community-level social dynamics. The model explains what constraints and costs they hypothesize as being alleviated such that the program might improve public goods and, potentially, social capital outcomes. This really shores up one’s confidence in the results of the empirical analysis, because it is clear that the hypotheses were established ex ante, before any data were analyzed.
- Apropos of some recent discussion over at the World Bank Development Impact blog (link), they study outcomes measured both during the program cycle and some time afterward, to assess programmatic effects on provision of public goods and downstream effects on social capital.
- To measure effects on social capital, they created minimally intrusive performance measures based on “structured group activities” that closely resemble real-world situations in which collective problem solving would be required. For example, a social capital measure was based on the offer of a matching grant to communities, with the only condition to receive the grant being that the community had to coordinate to come up with matching funds and apply for the grant. In the event, they found that only about half of communities overall were able to take up the matching grant, and the treatment effect on this take-up rate was effectively zero.
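To make the mean-effects idea from the first point above a bit more concrete, here is a minimal sketch in Python. It is not the authors' code and it skips the SUR step: it uses a simpler, closely related shortcut in which each outcome is standardized against the control group, the standardized outcomes are averaged into one index per community, and treatment and control means of that index are compared. The data are simulated, the function name `mean_effects_index` is hypothetical, and the sketch ignores covariates, missing data, and any clustering or stratification in the real design.

```python
import numpy as np


def mean_effects_index(outcomes, treated):
    """Simplified mean-effects test (illustrative, not the paper's estimator).

    outcomes: (n_units, n_outcomes) array, all coded so that higher = better.
    treated:  (n_units,) boolean array of treatment assignment.
    Returns (effect, se, z) for the difference in the standardized index.
    """
    outcomes = np.asarray(outcomes, dtype=float)
    treated = np.asarray(treated, dtype=bool)

    # Standardize every outcome using the control group's mean and SD.
    control = outcomes[~treated]
    mu = control.mean(axis=0)
    sd = control.std(axis=0, ddof=1)
    z_scores = (outcomes - mu) / sd

    # One index per unit: the average of its standardized outcomes.
    index = z_scores.mean(axis=1)

    # Difference in means on the index, with an unequal-variance SE.
    t, c = index[treated], index[~treated]
    effect = t.mean() - c.mean()
    se = np.sqrt(t.var(ddof=1) / t.size + c.var(ddof=1) / c.size)
    return effect, se, effect / se


# Simulated example: 200 communities, 40 correlated outcomes, no true effect.
rng = np.random.default_rng(0)
n_units, n_outcomes = 200, 40
treated = rng.permutation(n_units) < n_units // 2
common = rng.normal(size=(n_units, 1))  # shared factor induces correlation
outcomes = 0.5 * common + rng.normal(size=(n_units, n_outcomes))

effect, se, z = mean_effects_index(outcomes, treated)
print(f"index effect = {effect:.3f} control SDs, se = {se:.3f}, z = {z:.2f}")
```

Because the index is built per community before the comparison, correlation across the 40 outcomes is reflected in the standard error automatically, which is the same concern the SUR step addresses in the paper. The payoff is one effect estimate and one test per hypothesis rather than 40 separate ones.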
Katherine indicated that for them, the null result on social capital effects was the most important take-away point. This provoked a salient question during the Q&A: how will journal editors react to this, that the core finding of the paper is a null result on a hypothesis that was derived from a theory that was motivated only because it seemed to characterize what donors and Bank program staff thought would happen? As a political scientist, I am sympathetic to this concern. I can imagine the cranky political science journal editor saying, “Aw, well, this was a stupid theory anyway. Why should I publish a null result on an ill-conceived hypothesis? Why aren’t they testing a better theory that actually explains what’s going on? I mean, why don’t they use the data to prove the point that they want to make and teach us what is really going on?” Reactions like this, which I do hear fairly often, are in direct tension with ex ante science, and essentially beg researchers to do post-hoc analysis. Hopefully publishing norms in economics won’t force the authors to spoil what is a great paper and probably the most well-packaged, insightful null result I’ve ever read.