For those involved in evidence review or meta-analysis projects, I highly recommend a few (rather old, but still relevant) articles by Robert Slavin:
- Slavin, R.E. (1984). Meta-Analysis in Education: How Has It Been Used? Educational Researcher 13(8):6-15. [gated link]
- Slavin, R.E. (1986). Best-Evidence Synthesis: An alternative to meta-analytic and traditional reviews. Educational Researcher 15(9):5-11. [gated link]
- Slavin, R.E. (1995). Best evidence synthesis: An intelligent alternative to meta-analysis. Journal of Clinical Epidemiology 48(1):9–18. [gated link]
Slavin, whom I take to be a strong proponent of good quantitative social research, makes some great points about how meta-analysis has been misused in attempts to synthesize evidence on social programs. By my reading, what Slavin emphasizes is the fact that in very few cases will we have enough high-quality and truly comparable studies for meta-analytic methods to be applied in a way that makes sense. That being the case, what we need is a scientifically defensible compromise between the typically unattainable ideal of a meta-analysis on the one hand, and narrative reviews that have too little in the way of structure or replicability on the other.
Unfortunately, proponents of meta-analysis often suggest (see the literature that Slavin cites) that a literature review can be considered “scientific” only if it uses meta-analytic methods for effect aggregation and heterogeneity analysis. Those doing reviews are then faced with either doing a review that will be deemed “unscientific” or trying to apply meta-analytic methods in situations where they shouldn’t be applied. Because of this pressure, we end up with reviews that compromise on either study quality standards or comparability standards so as to obtain a set of studies large enough to meet the sample size needs of a meta-analysis! These compromises are masked by the use of generic effect size metrics. This is the ultimate in the tail wagging the dog, and the result is a lot of crappy evidence synthesis (see the studies that he reviews in the 1984 article for examples). I’m inclined to view some attempts at applying meta-analysis to development interventions (including some of my own!) in this light. See also the recent CBO review of minimum wage laws.
Slavin’s alternative is a compromise approach that replaces rote subservience to meta-analysis requirements with scientific judgment in relation to a clear policy question. He recommends retaining the rigorous and replicable literature search methods that form the first stage of a meta-analysis (including search strategies that attend to potential publication biases). The review should then apply stringent standards for study quality (internal validity) and relevance of the treatments (ecological validity), and gather ample evidence to assess the applicability of results to the target context (external validity). The nature of the question will determine what counts as the “best evidence,” and the review should focus on that evidence. From here, the method of synthesis and exploration of heterogeneity will depend on the amount of evidence available. It may be that only one or a few studies meet the stringent selection criteria, in which case the review should scrutinize those studies with reference to the policy questions at hand. In the very rare circumstance that many studies are available, or in cases where micro data that capture substantial forms of heterogeneity are available, statistical analyses may be introduced, but in a manner that is focused on addressing the policy questions (and not a cookbook application of homogeneity tests, random effects estimates, and so on). As Slavin writes, “[no] rigid formula for presenting best evidence synthesis can be prescribed, as formats must be adapted to the literature being reviewed” (1986, p. 9).
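For readers who have not seen the cookbook machinery mentioned above, here is a minimal Python sketch of what a homogeneity test statistic (Cochran's Q) and a random-effects estimate compute. I use the DerSimonian-Laird method-of-moments estimator purely for illustration; Slavin does not prescribe it. The function names and the effect sizes in the usage example are hypothetical and not drawn from any of the studies cited here.

```python
import math


def fixed_effect(effects, variances):
    """Inverse-variance pooled estimate and Cochran's Q statistic."""
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_w
    # Q: weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return pooled, q, weights


def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate (DerSimonian-Laird, assumed here)."""
    k = len(effects)
    _, q, weights = fixed_effect(effects, variances)
    total_w = sum(weights)
    c = total_w - sum(w * w for w in weights) / total_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c)
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return {"q": q, "df": k - 1, "tau2": tau2, "pooled": pooled, "se": se}


if __name__ == "__main__":
    # Hypothetical standardized effects and sampling variances for 3 studies.
    result = dersimonian_laird([0.10, 0.30, 0.50], [0.04, 0.04, 0.04])
    for name, value in result.items():
        print(name, value)
```

With only a handful of studies, Q has just k - 1 degrees of freedom and the between-study variance is estimated from very little information, which is one way to see why the amount of available evidence should drive the choice of synthesis method.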
For most policy questions in development, we face a paucity of high-quality studies and limited comparability between such studies. Abandoning the meta-analysis model in favor of Slavin’s more realistic but no less principled approach seems like the right compromise.