More on matching and identification

At the Social Science Statistics blog, Richard Nielsen posts a response (link) to Chris Blattman’s recent post (link) on problems with the way people interpret what matching does:

Matching is generally a pretty smart way to condition on observables, but it doesn’t buy you anything if you believe that there are unobserved variables that systematically influence treatment assignment…[P]eople I [Nielsen] talk to who are skeptical of matching almost always argue that there will always be problematic unobservables lurking no matter how hard you try to measure them. In general, these types of people prefer instrumental variables approaches (and tend to be economists rather than statisticians, interestingly enough)….[W]hat always gets me is that the same people who tell me that lurking unobservables are everywhere tend to be fairly comfortable making the types of exclusion restrictions that make IV approaches work. The crazy thing is that just like matching, these assumptions rely on assumptions about unobservable causal pathways. The claim that an instrumental variable is valid is the claim that there are no unobserved (or observed) variables linking the instrument to the outcome except through the path of the instrumented variable. So it always puzzles me that the same people who think that lurking unobservables are everywhere in matching somehow think that all these lurking unobservables go away as soon as you call something an instrument and try to defend it as exogenous.

Nielsen is arguing that IV analyses are on no firmer ground than are the matching analyses that Blattman complains about. This is because, according to Nielsen, claims of “exclusion” in typical IV analyses are no less dubious than claims of “no unmeasured confounding” in typical matching analyses.
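To make the symmetry in Nielsen's argument concrete, here is a minimal simulation sketch. Nothing in it comes from either post: the data-generating process, the coefficients, and the function names (`simulate`, `matching_estimate`, `wald_iv_estimate`) are all illustrative assumptions. It assumes a constant treatment effect, a binary observed covariate, a binary instrument, and one unobserved variable `u`. Setting `confounding` above zero lets `u` drive treatment (violating "no unmeasured confounding"); setting `direct_effect` above zero lets the instrument affect the outcome directly (violating exclusion).

```python
import random


def simulate(n, tau=1.0, confounding=0.0, direct_effect=0.0, seed=0):
    """Draw n units as (x, z, d, y) tuples from a hypothetical process.

    x: observed binary covariate, z: binary instrument (a coin flip),
    d: binary treatment, y: outcome. u is unobserved and always affects y;
    it affects treatment only when confounding > 0.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = rng.gauss(0, 1)          # unobserved variable
        x = rng.randint(0, 1)        # observed covariate
        z = rng.randint(0, 1)        # instrument
        latent = 2.0 * z + 0.5 * x + confounding * u + rng.gauss(0, 1)
        d = 1 if latent > 1.25 else 0
        # direct_effect > 0 opens a path from z to y that bypasses d.
        y = tau * d + x + u + direct_effect * z + rng.gauss(0, 1)
        rows.append((x, z, d, y))
    return rows


def mean(values):
    return sum(values) / len(values)


def matching_estimate(rows):
    """Exact matching on x: size-weighted within-stratum difference in means."""
    total = 0.0
    for x_val in (0, 1):
        stratum = [r for r in rows if r[0] == x_val]
        treated = [r[3] for r in stratum if r[2] == 1]
        control = [r[3] for r in stratum if r[2] == 0]
        total += len(stratum) / len(rows) * (mean(treated) - mean(control))
    return total


def wald_iv_estimate(rows):
    """Wald estimator: reduced-form difference divided by first-stage difference."""
    z1 = [r for r in rows if r[1] == 1]
    z0 = [r for r in rows if r[1] == 0]
    reduced_form = mean([r[3] for r in z1]) - mean([r[3] for r in z0])
    first_stage = mean([r[2] for r in z1]) - mean([r[2] for r in z0])
    return reduced_form / first_stage


if __name__ == "__main__":
    scenarios = [
        ("both assumptions hold", 0.0, 0.0),
        ("unobserved confounder", 1.0, 0.0),
        ("confounder + exclusion violated", 1.0, 0.5),
    ]
    for label, confounding, direct_effect in scenarios:
        rows = simulate(100000, confounding=confounding,
                        direct_effect=direct_effect, seed=1)
        print(f"{label}: matching={matching_estimate(rows):.2f}, "
              f"IV={wald_iv_estimate(rows):.2f} (true effect 1.0)")
```

Under these assumptions, both estimators should land near the true effect when both assumptions hold; matching should drift upward once `u` drives treatment while IV stays close; and IV should drift once the instrument has its own path to the outcome (with a constant effect, by roughly `direct_effect` divided by the first-stage difference). In none of the scenarios would the observed data alone flag the violated assumption, which is the sense in which both approaches lean on claims about unobservables.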

A reaction: A matching analysis is convincing only if you have a verifiable story (or a collection of verifiable stories) explaining why two units that share the same covariate values nonetheless differ in treatment status. One way this can happen is if you have something that acts like an instrument, determining treatment status and nothing else among units that are otherwise the same. (This is more or less what Gilligan, Mvukiyehe, and I did in our ex-combatant reintegration paper [link], although the identification was far from pristine.) Or you may have a collection of such “exogenous sources of variation,” that is, a collection of fortuitous events that determine treatment but nothing else for the different matched sets of cases in the analysis. Another way this can happen is if you can really reconstruct the assignment mechanism, using matching to account for all the information that was available to those who made assignment decisions, and then demonstrating that something akin to a coin flip determined treatment status within matched sets.

From that perspective, matching is just a robust conditioning technique for implementing a causal identification strategy that rests on either verifiable sources of exogenous variation or a certifiably reconstructed assignment process. The technique needs to be wedded to these sources of identification. The comparison, then, should not be between a “matching” analysis and an “IV” analysis, but between a “non-IV exogenous variation” analysis or a “reconstructed assignment mechanism” analysis on the one hand and “IV” on the other. Many applications of matching fail to specify where identification actually comes from. IV analyses, by necessity, have to address this issue head on. For this reason, I disagree with the claim that analyses based on these two approaches are similarly problematic.