R and Stata code for inverse covariance weighting

A previous post discussed the differences between dimension reduction through principal components and factor analysis on the one hand and inverse covariance weighting (ICW) on the other: [link].

Here is a link to a GitHub repository with code for ICW index construction, including both an R example and a Stata .do file that loads a program to construct indices: [.git]. The .do file itself contains instructions on using the program “make_index_gr”, which generates an ICW index that can incorporate weights and can be standardized with respect to any subset of the data (e.g., the control group).

Please give them a try and if you find any bugs, please let me know. Also, if anyone wants to do a more professional job with the coding, and even integrate them into broader packages, please be my guest.
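
For readers who want to see the arithmetic behind an ICW index without opening the repository, here is a minimal sketch in Python. It is not the code from the repository and does not reproduce “make_index_gr”; the function names (solve, icw_index) are made up for illustration, and the choices to estimate the covariance matrix over all units and to ignore sampling weights are assumptions of this sketch.

```python
from statistics import mean, stdev


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def icw_index(rows, ref):
    """Hypothetical helper: rows holds one list of outcomes per unit,
    ref holds one bool per unit marking the standardization group."""
    n, k = len(rows), len(rows[0])
    # 1. Standardize each outcome against the standardization group.
    z_cols = []
    for col in zip(*rows):
        ref_vals = [v for v, r in zip(col, ref) if r]
        mu, sd = mean(ref_vals), stdev(ref_vals)
        z_cols.append([(v - mu) / sd for v in col])
    # 2. Covariance matrix of the standardized outcomes (all units: an assumption).
    mus = [mean(c) for c in z_cols]
    cov = [[sum((z_cols[a][i] - mus[a]) * (z_cols[b][i] - mus[b]) for i in range(n)) / (n - 1)
            for b in range(k)] for a in range(k)]
    # 3. Weights proportional to the row sums of the inverse covariance matrix.
    w = solve(cov, [1.0] * k)
    total = sum(w)
    w = [x / total for x in w]
    # 4. Weighted average of the standardized outcomes for each unit.
    raw = [sum(w[j] * z_cols[j][i] for j in range(k)) for i in range(n)]
    # 5. Re-center and re-scale so the index has mean 0 and SD 1 in the standardization group.
    ref_raw = [v for v, r in zip(raw, ref) if r]
    mu, sd = mean(ref_raw), stdev(ref_raw)
    return [(v - mu) / sd for v in raw], w


# Made-up data: six units, three outcomes, first three units form the control group.
outcomes = [[1.0, 2.0, 0.5], [2.0, 1.5, 1.0], [3.0, 3.5, 0.0],
            [4.0, 3.0, 2.0], [5.0, 5.5, 1.5], [6.0, 4.0, 3.0]]
control = [True, True, True, False, False, False]
index, weights = icw_index(outcomes, control)
```

Outcomes that are highly correlated with the others receive less weight, so the index does not count the same information twice. The final step mirrors the idea of centering the index on the standardization group.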


  • The make_index_gr Stata program was modified on 2018-05-03 so that the resulting index indeed centers on the standardization group.
  • Post was edited on 2018-05-04 to link to the GitHub repository.

EGAP platform

I am grateful for your consideration of me as a candidate for Executive Director. I have been active within EGAP since 2009, and it has been singularly important to me intellectually and professionally. I think we can increase the value that EGAP offers to its members, the broader social science community, and policy makers. The organization needs to balance many priorities. If I were elected as Executive Director, I would aim to promote the following to the extent that we can, taking into consideration resource constraints and the need for balance:

1. Methodological training

EGAP events offer unique opportunities for increasing our methodological sophistication as a research community. We can devote more to this, including lectures and initiatives on the use of statistical and substantive theory to inform research design and analysis plans. EGAP can be the hub for methodological excellence in field-experimental and otherwise quantitative fieldwork-driven social science.

2. Policy engagement

We can be more systematic in promoting policy engagement. For example, we could host our meetings in national capitals and then hold expert sessions with local policy makers as separate events alongside the regular meeting.

3. New venues for scholarly publication

We can use the EGAP network to establish new venues for scholarly publication to overcome the fact that conventional journals are too slow and unreliable. A modest goal would be a working paper series (along the lines of NBER or BREAD); an ambitious one would be EGAP “proceedings” journals that operate in a manner similar to proceedings outlets in other disciplines, such as computer science.

4. Geographic diversity

I think that it is important for us to continue broadening the geographic reach of the network in terms of membership, sites for events, and education activities such as our “learning days.” This includes doing more to engage scholars in Latin America, the Middle East, Africa, and Asia.

5. Deconcentrating leadership

I would like to distribute leadership positions over a broader set of members. My sense is that, at present, administration of meetings, member selection, and research programs is concentrated among too few individuals, and that this has only worked to date because of the exceptional commitment and energy of these few individuals. But this is unsustainable. I will look into different possibilities for delegation of tasks such as member selection, event organization, and management of research initiatives on the basis of region or thematic groups.


A selection of amazing papers from 2017

As 2017 draws to a close, I am marking the year by posting links to a selection of papers circulated in the past year. These are papers that I read and that I think are especially impressive in moving the methodological frontier forward for people, like myself, who do field experimental and quasi-experimental policy research. I have linked to ungated versions whenever I could:

Foundational Work in Causal Inference

These are papers that challenge conventional wisdom or otherwise newly reveal some basic considerations in causal inference:

  • Pei et al. on when placebo tests are preferred to control [pdf].
  • Abadie et al. on causal standard errors [arxiv].
  • Morgan on permutation vs bootstrap for experimental data [arxiv].
  • Young on how IV inference tends to be overconfident [pdf].
  • Lenz and Sahn on p-hacking through covariate control and why we do indeed want to see the zero-order correlations [osf].
  • Christian and Barrett on spurious panel IV [pdf].
  • Spiess on the importance of statistical unbiasedness in situations where researchers have to persuade skeptics [pdf].
  • Sävje et al. on when we do and don’t need to worry about interference as a first-order problem [arxiv].
  • D’Amour et al. on a curse of dimensionality for strong ignorability [arxiv].

Generalization, Policy Extrapolation, and Equilibrium Effects

Current program evaluation work is pushing past the old “reduced-form vs. structural” divide and making use of tools from both worlds to address questions of generalization, policy extrapolation, and equilibrium effects. Here are some good recent examples:

  • Muralidharan et al. on equilibrium effects of improving access to a cash for work program in India [pdf].
  • Banerjee et al. on using an experiment and model to infer an optimal strategy to deter drunk driving [pdf].
  • Davis et al. on how to design experiments to improve their ability to inform scaled-up interventions [nber-gated].

Beautifully Designed Field Studies

Finally, here are papers that I read this year that taught me a lot about framing and designing field experiments. Reading them offers a master class on research design and analysis:

  • Benhassine et al. on the low returns from firm formalization [pdf].
  • Blattman and Dercon on the drawbacks of factory work [ssrn].
  • Munger “bots” experiment to reduce partisan incivility on Twitter [pdf].
  • Green et al. experiment on how mass media campaigns can reduce intimate partner violence [pdf].
  • Matz et al. on using psychological profiling to target persuasion [pnas].
  • Weigel experiment on how state taxation induces more civic engagement [pdf].

An alternative to conventional journals and peer review: the “proceedings” model

Agonizing over peer review is a perennial theme in conversations among scholars. I have given this some thought, and in this attached document, I propose an alternative “proceedings” model for publication in political science, my home discipline: [PDF]

A point that I make in the document is that, in other disciplines like computer science, proceedings-type publications are the highest-prestige outlets, and conventional journals are considered second-tier. So, there is nothing essential about conventional journals for granting prestige.

Something that is implicit in this model, though I do not make it explicit, is that there is ample scope for scholars to be entrepreneurial in organizing new events, perhaps even one-off events or short-term series, that generate new proceedings outlets. An overarching governing body (like an APSA section) could serve to “certify” such proceedings. This would be an alternative to the “special issues” of journals that are sometimes arranged to serve a similar purpose but that tend to be bogged down unnecessarily by the hurdles of conventional publication processes.

P.S.: For those interested in models of publication alternative to the conventional closed-review, closed-access formats, here are two to consider:

  • NIPS Proceedings (link) are the peer-reviewed proceedings of the annual Neural Information Processing Systems conference, a major forum for advances in machine learning. Note that papers are posted along with their reviews.
  • Theoretical Economics (link) is an open-access journal focusing on economic theory and published by the Econometric Society. Note that it is hosted using Open Journal Systems, software developed by the Public Knowledge Project (link).

Theoretical models and RCTs

In my research, I typically try to inform decisions on the design of policies. Sometimes this amounts to a binary “adopt” / “do not adopt” decision, but usually it is more complicated than that. To the extent that it is, I would like to have an experiment that sets me up to inform the more complicated decision. This often requires that I pose some kind of theoretical model that relates a range of options for policy inputs to outcomes of interest.

To the greatest extent possible, I would like my experimental design to deliver estimates of key parameters in the model with minimal additional assumptions needed at the analysis stage. That is, I want as many of the identifying assumptions as possible to be guaranteed by my design. This is, in essence, the approach developed by Chassang et al. in the context of treatments that work only if recipients put in some effort to make them work. Here is a link to the paper: [published] [ungated PDF]. In this approach, the model informs the design ex ante. Ex post, after the experimental data are in, we can just estimate some simple conditional means to get the parameters of interest. À la Rubin (link), design trumps analysis; the revision is that it is model-informed design.

Now, sometimes I cannot do everything that I want in my design. For example, suppose my theoretical model suggests a potentially nonlinear relationship between inputs and outcomes. Suppose as well that I can only assign treatment to a few points on the potential support of the inputs (maybe even just two points). Then, I may need to do more with the analysis to get a sense of what outcomes might look like in areas of the support of the inputs where I have no direct evidence. This would be important if we want to propose optimal policies over the full support of input levels. One example is the approach of Banerjee et al., who try to estimate the optimal allocation of police posts to reduce drunk driving: [ungated PDF].
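
To make the two-point problem concrete, here is a small sketch in Python with made-up numbers. The input levels, arm means, and both functional forms are hypothetical and are not taken from Banerjee et al.; the point is only that two different models can fit the design points exactly while disagreeing elsewhere on the support.

```python
import math


def fit_linear(x0, y0, x1, y1):
    """Return the line y = a + b * x that passes through both design points."""
    b = (y1 - y0) / (x1 - x0)
    a = y0 - b * x0
    return lambda x: a + b * x


def fit_sqrt(x0, y0, x1, y1):
    """Return the concave curve y = a + b * sqrt(x) that passes through both design points."""
    b = (y1 - y0) / (math.sqrt(x1) - math.sqrt(x0))
    a = y0 - b * math.sqrt(x0)
    return lambda x: a + b * math.sqrt(x)


# Hypothetical experiment: treatment assigned at input levels 1 and 4,
# with estimated mean outcomes of 2 and 4 in the two arms.
linear = fit_linear(1.0, 2.0, 4.0, 4.0)
concave = fit_sqrt(1.0, 2.0, 4.0, 4.0)

# Both models reproduce the two arm means, so the data cannot tell them apart.
for x in (1.0, 4.0):
    print(x, linear(x), concave(x))

# At an input level outside the design, the implied outcomes diverge.
print(9.0, linear(9.0), concave(9.0))
```

Which prediction to trust at the unobserved input level depends entirely on the theoretical model, which is why the analysis has to carry more of the load when the design covers only a few points.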

(These issues are central to a working group that Leonard Wantchekon and I are now running for NYC-area economists and political scientists. We had our first event last week at Princeton and it was great! This post is inspired by the thought-provoking talks given by Erik Snowberg, Brendan Kline, Pierre Nguimpeu, and Ethan Bueno de Mesquita at that event.)