Cyrus Samii – Page 14

February 5, 2014February 5, 2014

Does Islamic rule boost women’s opportunities? RD evidence from Turkey

From a remarkable study by Erik Meyersson in the new Econometrica, highlights from the abstract:

In 1994, an Islamic party [in Turkey] won multiple municipal mayor seats across the country. Using a regression discontinuity (RD) design, I compare municipalities where this Islamic party barely won or lost elections…The RD results reveal that, over a period of six years, Islamic rule increased female secular high school education. Corresponding effects for men are systematically smaller and less precise. In the longer run, the effect on female education remained persistent up to 17 years after, and also reduced adolescent marriages. An analysis of long-run political effects of Islamic rule shows increased female political participation and an overall decrease in Islamic political preferences. The results are consistent with an explanation that emphasizes the Islamic party’s effectiveness in overcoming barriers to female entry for the poor and pious.

Ungated version posted on Meyersson’s website: link. On the Econometrica website: link.

February 4, 2014

Two year research scientist post at the Hertie School of Governance, Berlin

Piero (webpage) at Hertie forwarded this announcement:

The Hertie School has an opening beginning in April 2014 for a Research Scientist (f/m). We are looking for highly motivated candidates interested in conducting research on governance indicators for the School ́s Governance Report (www.governancereport.org) and related projects. The Governance Report seeks to understand the challenges of multi-actor, multi-level governance, analysing the reasons behind governance successes and failures, identifying and examining governance innovations, and developing governance indicators. The results are published in annual reports and other publications.

See more on the official announcement: link. Berlin is of course a fabulous city and this is interesting and engaged research.

February 3, 2014February 15, 2016

Should you use frequentist standard errors with causal estimates on population data? Yes.

Suppose you are studying the effects of some policy adopted at the state level in the United States, and you are using data from all 50 states to do it. Well,

When a researcher estimates a regression function with state level data, why are there standard errors that differ from zero? Clearly the researcher has information on the entire population of states. Nevertheless researchers typically report conventional robust standard errors, formally justified by viewing the sample as a random sample from a large population. In this paper we investigate the justification for positive standard errors in cases where the researcher estimates regression functions with data from the entire population. We take the perspective that the regression function is intended to capture causal effects, and that standard errors can be justified using a generalization of randomization inference. We show that these randomization-based standard errors in some cases agree with the conventional robust standard errors, and in other cases are smaller than the conventional ones.

From a new working paper on “Finite Population Causal Standard Errors ” by the econometrics all-star team of Abadie, Athey, Imbens, and Wooldridge (updated link): link.

I have been to a few presentations of papers like this where someone in the audience thinks they are making a smart comment by noting that the paper uses population data, and so the frequentist standard errors “don’t really make sense.” Abadie et al. show that such comments are often misguided, arising from a confusion over how causal inference differs from descriptive inference. Sure — there is no uncertainty as to what is the value of the regression coefficient for this population given the realized outcomes. But the value of the regression coefficient is not the same as the causal effect.

To understand the difference, it helps to define causal effects precisely. A causal effect for a given unit in the population is most coherently defined to be a comparison between the outcome observed under a given treatment (being the “state level policy” in the case of the example above) and what would obtain were that same unit to be given another treatment. It is useful to imagine this schedule of treatment-value-specific outcomes as an array of “potential outcomes.” Population average causal effects take the average of the unit level causal effects in a given population. Now, suppose that there is some random (at least with respect to what the analyst can observe) process through which units in the population are assigned treatment values. Maybe this random process occurred because a bona fide randomized experiment was run on the population, or maybe it was the result of “natural” stochastic processes (that is, not controlled by the analyst). Then, for each unit we only get to observe the potential outcome associated with the treatment received, and not the other “counterfactual” potential outcomes associated with the other possible treatments. As such, we cannot actually construct the population average causal effect directly. Doing so would require that we were able to compute each of the unit-level causal effects. So, we have to estimate the population average causal effect using the incomplete potential outcomes data available to us. If the results of the random treatment assignment processes had turned out differently, the estimate we would obtain could very well differ as well (since there would be a different set of observed and unobserved potential outcomes). Even though we have data from everyone in the population, we are lacking the full schedule of potential outcomes that would allow us to estimate causal effects without uncertainty.

As it turns out, the random treatment assignment process is directly analogous to the random sampling of potential outcomes, in which case we can use standard sample theoretic results to quantify our uncertainty and compute standard errors for our effect estimates. Furthermore, as a happy coincidence, such sample theoretic standard errors are algebraically equivalent or even conservatively approximated by the “robust” standard errors that are common in current statistical practice. (This previous sentence was revised so that its meaning is now clearer.) It’s a point that Peter Aronow and I made in our 2012 paper on using regression standard errors for randomized experiments (link), and a point that Winston Lin develops even further (link; also see Berk’s links in the comments below to Winston’s great discussion of these results). Abadie et al. take this all one step further by indicating that this framework for inference makes sense for observational studies too.

Now, some of you might know of Rosenbaum’s work (e.g. this) and think this has all already been said. That’s true, to a point. But whereas Rosenbaum’s randomization inference makes use of permutation distributions for making probabilistic statements about specific causal hypotheses, Abadie et al.’s randomization inference allows one to approximate the randomization distribution of effect estimates without fixing causal hypotheses a priori. (See more on this point in this old blog post, especially in the comments: link).

February 3, 2014

One-year post at UC-Berkeley with Berkeley Initiative for Transparency in the Social Sciences

The fine CEGA folks at UC-Berkeley are recruiting a quant-savvy social science grad to work with them on an important research transparency initiative:

Interested in improving the standards of rigor in empirical social science research? Eager to collaborate with leading economists, political scientists and psychologists to promote research transparency? Wishing to stay abreast of new advances in empirical research methods and transparency software development? The Berkeley Initiative for Transparency in the Social Sciences (BITSS) is looking for a Program Associate to support the initiative’s evaluation and outreach efforts. The candidate is expected to engage actively with social science researchers to raise awareness of new and emerging tools for research transparency. Sounds like fun? Apply now!

More information is here: link.

December 1, 2013December 5, 2013

Big data and social science: Mullainathan’s Edge talk

The embedded video links to an Edge talk with Sendhil Mullainathan on the implications of big data for social science. His thoughts come out of research he is doing with computer scientist Jon Kleinberg [website] applying methods for big data to questions in behavioral economics.

Mullainathan focuses on how inference is affected when datasets increase widthwise in the number of features measured—that is, increasing “K” (or “P” for you ML types). The length of the dataset (“N”) is, essentially, just a constraint on how effectively we can work with K. From this vantage point, the big data “revolution” is the fact that we can very cheaply construct datasets that are very deep in K. He proposes that with really big K, such that we have data on “everything,” we can switch to more “inductive” forms of hypothesis testing. That is, we can dump all those features into a machine learning algorithm to produce a rich predictive model for the outcome of interest. Then, we can test an hypothesis about the importance of some variable by examining the extent to which the model relies on that variable for generating predictions.

I see three problems with this approach. First, just like traditional null hypothesis testing it is geared toward up or down judgments about “significance” rather than parameter (or “effect size”) estimation. That leaves the inductive approach just as vulnerable to fishing, p-hacking, and related problems that occur with current null hypothesis testing.* It is also greatly limits what we really learn from an analysis (statistical significance is not substantive significance, and so on). Second, scientific testing is typically some form of causal inference, and yet the inductive-predictive approach that Mullainathan described in his talk is oddly blind to questions of causal identification. (To be fair, it is a point that Mullainathan admits in his talk.) The possibilities of post-treatment bias and bias amplification are two reasons that including more features does not always yield better results when doing causal inference (although bias amplification problems would typically diminish as one approaches having data on “everything”). Thus, without careful attention to post-treatment bias for example, the addition of features in an analysis can lead you to conclude mistakenly that a variable of interest has no causal effect when in fact it does. The third reason goes along with a point that Daniel Kahneman makes toward the end of the video: the predictive strength of a variable relative to other variables is not an appropriate criterion for testing an hypothesized cause-effect relationship. But, the inductive approach that Mullainathan describes would be based, essentially, on measuring relative predictive strength.

Nonetheless, the talk is thought provoking and well worth watching. I also found the comments by Nicholas Christakis toward the end of the talk to be very thoughtful.

*Zach raises a good question about this in the comments below. My reply basically agrees with him.