{"id":18,"date":"2010-10-29T18:01:03","date_gmt":"2010-10-29T22:01:03","guid":{"rendered":"https:\/\/cyrussamii.com\/?p=18"},"modified":"2010-11-10T10:34:20","modified_gmt":"2010-11-10T15:34:20","slug":"tips-on-observational-study-research-design","status":"publish","type":"post","link":"https:\/\/cyrussamii.com\/?p=18","title":{"rendered":"Tips on observational study research design"},"content":{"rendered":"<p>Here are slides from a talk I gave last week as part of a series hosted by the Applied Statistics Center at Columbia: <a href=\"https:\/\/cyrussamii.com\/wp-content\/uploads\/2010\/10\/samii_designing_quasiexperiments1.pdf\">PDF<\/a><\/p>\n<p>The talk was intended for grad students working on dissertation research plans. \u00a0The focus was on strategies for collecting data to analyze the effects of micro-level development policies. \u00a0I tried to make a few points<!--more-->, including:<\/p>\n<ol>\n<li>If your research design uses matching to control for confounding (rather than, say, an instrumental variable), you should still have a verifiable source (or sources) of exogenous variation that can explain why two units who have the same background characteristics may nonetheless differ in whether they were exposed to the program. \u00a0To simply match on the available variables and then claim that differences in program exposure can now be considered &#8220;random&#8221; is not convincing. \u00a0There may be a single exogenous source of variation, or it may be a collection of fortuitous accidents. \u00a0I gave as an example my study with Michael Gilligan and Eric Mvukiyehe on the effects of an ex-combatant reintegration program in Burundi (<a href=\"http:\/\/www.columbia.edu\/~cds81\/docs\/bdi09_reintegration100701.pdf\">PDF<\/a>). \u00a0There, some ex-combatants did not get the program because a bureaucratic dispute caused one of the implementing NGOs to fail to deliver benefits. 
\u00a0The matching in that study was used to control for the &#8220;incidental&#8221; differences in the personal and community-level characteristics of ex-combatants who were designated to receive benefits from that particular NGO. \u00a0These thoughts echo some of what Chris Blattman has said recently about studies that fallaciously claim that matching somehow solves causal identification problems (<a href=\"http:\/\/chrisblattman.com\/2010\/10\/28\/the-cardinal-sin-of-matching-continued\/\">link<\/a>). I agree with Chris: matching is nothing more than a way to circumvent certain modeling assumptions.<\/li>\n<li>There is a lot of work that you can do before you hit the field to improve your data collection strategy. \u00a0This includes using available data from studies comparable to the one that you are proposing in order to design simulations that assess power or the robustness of various design alternatives. \u00a0There are tons of datasets out there that you can find on Google.<\/li>\n<\/ol>\n<p>There was a lively Q&amp;A. \u00a0One person asked how we find those &#8220;verifiable sources of exogenous variation&#8221; and I answered that there is no recipe. \u00a0You come across them when you are working through the fine details of whatever you are studying. \u00a0You just need to be trained to recognize them. \u00a0Another person asked a technical question about how to integrate sampling weights into a matching estimator. \u00a0My reply was that if you are interested in estimating the effect of the treatment on the treated, you simply set for yourself the target of reweighting the &#8220;control&#8221; units so that they balance the weighted sample of &#8220;treated&#8221; units.<\/p>\n<p>All in all a nice discussion. \u00a0Macartan Humphreys also presented, discussing some of the challenges of implementing a rigorous sampling design in places where ex ante information on population sizes, locations, etc. 
is sparse.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Here are slides from a talk I gave last week as part of a series hosted by the Applied Statistics Center at Columbia: PDF The talk was intended for grad students working on dissertation research plans. \u00a0The focus was on strategies for collecting data to analyze the effects of micro-level development policies. \u00a0I tried to &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/cyrussamii.com\/?p=18\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Tips on observational study research design&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-18","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/cyrussamii.com\/index.php?rest_route=\/wp\/v2\/posts\/18","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cyrussamii.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cyrussamii.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cyrussamii.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/cyrussamii.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=18"}],"version-history":[{"count":9,"href":"https:\/\/cyrussamii.com\/index.php?rest_route=\/wp\/v2\/posts\/18\/revisions"}],"predecessor-version":[{"id":230,"href":"https:\/\/cyrussamii.com\/index.php?rest_route=\/wp\/v2\/posts\/18\/revisions\/230"}],"wp:attachment":[{"href":"https:\/\/cyrussamii.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=18"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cyrussamii.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=18"},{"taxonomy":"post_tag","embeddable":tr
ue,"href":"https:\/\/cyrussamii.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=18"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}