How to judge a theoretical model

Theoretical models are simplified approximations of the world, or more specifically, thought experiments predicated on assumptions that try to approximate first-order aspects of behavior. As an empiricist I cannot knock theorists for using approximations; heck, I do it all the time too (see asymptotics, or the delta method). But we should ask whether the approximations are reasonable against what we see in the real world. That is,

  1. If verisimilitude is not a criterion for assumptions, any result can be reverse engineered by picking the assumptions that deliver the result.

  2. If any result can be engineered then results themselves have no special ontological status.

  3. Exploring the implications of assumptions for its own sake can be technically demanding, but that does not make it any more credible as a way to map reality. This practice generates “bookshelf” models whose practical utility depends on filtering the assumptions and implications against data and our beliefs about the real world. Without filtering we are building a Tower of Babel (or maybe an art museum). (Note this goes beyond Friedman’s famous arguments about predictive validity without regard for assumptions: we need to filter assumptions too because the implications of a model are not unique to one set of assumptions, by points 1 and 2.)

  4. How complicated can the problems be that we allow our agents to solve in a model? Is a dynamic program ever admissible as a reasonable assumption on the objective function of an agent? That depends on the situation. If the goal that the agent is seeking is sufficiently clear (albeit complicated to achieve) and the agent has lots of opportunity to experiment and come upon something that works well, it may be reasonable to assume that the agent’s actions will converge in a way that makes it appear as if the agent is solving such a program. The validity of the “as if” assumption should be vetted in this way, though.

All of this from an essay by Paul Pfleiderer on “chameleon” models and the misuse of theory in economics: link. (Ht @noahpinion)


Why do countries work so hard to *lose* their access to World Bank loans?

[Figure: World Bank income classification scheme as applied in 2000]

In a new working paper posted to SSRN, Peter Aronow, Allison Carnegie and I propose an answer to the puzzle: by doing so, countries can reap “status” gains that outweigh the material costs of losing access to loans [SSRN link].

The World Bank’s loan program is an interesting setting for analyzing how international pressures affect the behavior of governments. This is because the terms of loans and other sorts of support that the World Bank offers depend on where a country falls along a fixed schedule of income classifications. The figure above illustrates the classifications as they were applied in the year 2000. Our paper focuses on what happens when countries cross the threshold shown at $5225 GNI/capita in the figure. Here, countries are made eligible for “graduation,” which, when achieved, means that they can no longer receive the type of generosity that the Bank provides to middle income and lower income countries. We reviewed the case histories of countries crossing this threshold and found, quite curiously, that countries seem always to welcome this offer to graduate. The case histories provide no evidence of a tendency for countries to try to avoid or stall graduation even though this means losing access to benefits.

Moreover, the clear-cut nature of the classification rules allows us to use a regression discontinuity design to study just how countries react to crossing the threshold, that is, to obtain a very credible estimate of the causal effect of becoming eligible to graduate. We find, remarkably, that countries tend to react by liberalizing. This is not what we expected: we expected to see that countries would react in a manner indicative of becoming extra sensitive to risks, perhaps even by reining in liberties.
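For readers unfamiliar with the mechanics, here is a minimal sketch of how a sharp regression discontinuity estimate can be computed. It is not the specification from our paper. The data are simulated, the outcome is a hypothetical "liberalization index," and the bandwidth, the slope, and the size of the jump are all assumptions made for illustration. Only the $5225 cutoff comes from the discussion above. The idea is to fit a separate line on each side of the cutoff, using only observations close to it, and take the gap between the two fitted lines at the cutoff.

```python
import random


def ols_line(xs, ys):
    """Fit y = a + b*x by ordinary least squares; return (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sharp_rd_estimate(running, outcome, cutoff, bandwidth):
    """Local linear RD: gap between the two fitted lines at the cutoff."""
    # Center the running variable so each intercept is the fitted value at the cutoff.
    left = [(r - cutoff, y) for r, y in zip(running, outcome)
            if cutoff - bandwidth <= r < cutoff]
    right = [(r - cutoff, y) for r, y in zip(running, outcome)
             if cutoff <= r <= cutoff + bandwidth]
    intercept_left, _ = ols_line(*zip(*left))
    intercept_right, _ = ols_line(*zip(*right))
    return intercept_right - intercept_left


def simulate(n=4000, cutoff=5225.0, true_jump=0.5, seed=1):
    """Simulated data (hypothetical): smooth trend in income plus a jump at the cutoff."""
    rng = random.Random(seed)
    running = [rng.uniform(2000, 8500) for _ in range(n)]
    outcome = [0.0002 * r + (true_jump if r >= cutoff else 0.0) + rng.gauss(0, 0.3)
               for r in running]
    return running, outcome


running, outcome = simulate()
estimate = sharp_rd_estimate(running, outcome, cutoff=5225.0, bandwidth=1000.0)
print("estimated jump at the cutoff:", round(estimate, 2))
```

In the simulation the estimate should land near the assumed jump of 0.5. With real data the choice of bandwidth matters a great deal, and one would also want standard errors, which this sketch omits.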

We investigated various possible explanations for this seemingly puzzling behavior, and the one that seems best supported by the evidence is that governments view graduation as an opportunity to increase their institutionally conferred status and join the “club” of developed nations. The liberalization that we witness is part of that exchange, given the hegemony of liberal western governments in defining terms of “success” within the World Bank.


Estimation and inference with dyadic data

For those interested in some statistical self-flagellation, here’s a link to work in progress on estimation and inference with dyadic data, joint with Peter Aronow and Valentina Assenova: link. Dyadic data are ubiquitous in various fields of social science, including network sociology, international relations, and even research on “speed dating.” The problem of dyadic dependence complicates inference for such data. From what we’ve seen, most people either make hopeful assumptions about the nature of this dependence or just sweep it under the rug entirely. What we’ve done is to derive some results under highly “agnostic” assumptions, to show that on the one hand, the heavy parameterizations used in current approaches may be unnecessary, while on the other hand, ignoring dyadic dependence can be extremely misleading. We’re working on more applications and efficient software implementation. Comments appreciated.
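To see why ignoring dyadic dependence can mislead, here is a small simulation sketch. It is not the estimator from our paper. The data-generating process (each dyad's outcome is the sum of two unit-level effects plus noise), the number of units, and the number of replications are assumptions chosen for illustration. It compares the standard error you would report if you treated dyads as independent with the actual spread of the dyadic mean across repeated samples.

```python
import random
import statistics
from itertools import combinations


def draw_dyadic_mean(n_units, rng):
    """Simulate one dyadic dataset; return (mean outcome, naive iid standard error)."""
    # Each unit carries its own effect, shared by every dyad the unit belongs to.
    unit_effect = [rng.gauss(0, 1) for _ in range(n_units)]
    ys = [unit_effect[i] + unit_effect[j] + rng.gauss(0, 1)
          for i, j in combinations(range(n_units), 2)]
    # Naive standard error: pretends all dyads are independent draws.
    naive_se = statistics.stdev(ys) / len(ys) ** 0.5
    return statistics.fmean(ys), naive_se


rng = random.Random(7)
draws = [draw_dyadic_mean(30, rng) for _ in range(300)]
means = [m for m, _ in draws]
naive_ses = [se for _, se in draws]

true_sd = statistics.stdev(means)          # actual sampling variability of the mean
avg_naive_se = statistics.fmean(naive_ses)  # what the iid formula reports on average
print("sd of the dyadic mean across samples:", round(true_sd, 3))
print("average naive standard error:", round(avg_naive_se, 3))
```

Because every unit appears in many dyads, the dyads sharing a unit are correlated, and the naive standard error comes out far too small under this setup.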


There is too much idolatry of whether contexts are “representative” or effects “generalizable”

From Wikipedia:

Idolatry is a pejorative term for the worship of an idol or a physical object such as a cult image as a god, or practices believed to verge on worship, such as giving honour and regard to created forms…. In current context, however, idolatry is not limited to religious concepts. It can also refer to a social phenomenon where false perceptions are created and worshipped….

In the recent past I reviewed a paper for an academic journal. The paper covered an interesting subject and was well done, so I recommended some revisions and that the author resubmit once those were done. Other reviewers disagreed, arguing most centrally that the context in which the study was undertaken was highly specific and therefore not “representative,” in which case the empirical results may not be “generalizable”. They recommended rejection.

Even more recently on the blog, I pointed to Meyersson’s newly published paper on the effects of the rule of Islamic parties in Turkish municipalities (post). Meyersson’s most remarkable finding was that opportunities for women seemed to expand substantially under the municipal rule of Islamic parties. I received a few responses via Twitter and in person critiquing Meyersson’s findings, suggesting that the constellation of historical, economic, and institutional conditions in Turkey undermines the “generality” of these effects on women’s opportunity.

While I appreciate that academic papers sometimes underplay scope conditions for their results, I find such obsession with whether an empirical result “generalizes” or whether the empirical context is “representative” to be poorly motivated in many cases. First, there are no research designs or analytical methods that can reliably deliver “representative” or “generalizable” findings. For example, using “representative” data does not guarantee that your results will be representative even for units in your dataset. (See here for more: [link 1] [link 2] [link 3] [link 4].) To pursue a “representative” estimate is often to chase a mirage.
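One mechanism that can produce this is worth sketching. I am assuming it for illustration, since the linked posts are not reproduced here. A regression that controls for a covariate weights each unit's effect by how much its treatment varies conditional on that covariate, so the coefficient need not equal the average effect in the very sample you estimated it on. In the simulation below, the two strata, their treatment probabilities, and their effect sizes are all hypothetical.

```python
import random


def within_ols(d, y, stratum):
    """OLS coefficient on d with stratum fixed effects, via within-stratum demeaning."""
    groups = {}
    for di, yi, si in zip(d, y, stratum):
        groups.setdefault(si, []).append((di, yi))
    num = 0.0
    den = 0.0
    for rows in groups.values():
        mean_d = sum(di for di, _ in rows) / len(rows)
        mean_y = sum(yi for _, yi in rows) / len(rows)
        num += sum((di - mean_d) * (yi - mean_y) for di, yi in rows)
        den += sum((di - mean_d) ** 2 for di, _ in rows)
    return num / den


rng = random.Random(3)
p_treat = {0: 0.5, 1: 0.05}   # hypothetical: treatment is rare in stratum 1
effect = {0: 1.0, 1: 3.0}     # hypothetical: effect is larger in stratum 1

stratum = [i % 2 for i in range(20000)]   # two equally sized strata
d = [1 if rng.random() < p_treat[s] else 0 for s in stratum]
y = [effect[s] * di + s + rng.gauss(0, 1) for s, di in zip(stratum, d)]

sample_ate = sum(effect[s] for s in stratum) / len(stratum)  # average effect in this sample
ols_estimate = within_ols(d, y, stratum)
print("average effect in the sample:", sample_ate)
print("regression estimate:", round(ols_estimate, 2))
```

Treatment is randomized within each stratum, so nothing is confounded. Even so, the regression leans toward the stratum where treatment varies most, and its estimate falls well below the sample average effect.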

Second, working with “non-representative” groups may provide more theoretical traction. If existing theory suggests that effects should go one way with a particular group of units but you find the effects go the other way, well this is the kind of anomaly that allows theoretical elaboration to advance.

Third, it is often unclear what would count as a “representative” or “general” case. Was the skepticism toward Meyersson’s paper coming from some implicit comparison to, say, Saudi Arabia? If so, why on earth should we take findings for Saudi Arabia to be “general” and dismiss those from Turkey as “idiosyncratic”? The fact that effects vary by context is interesting and worth understanding.

If the objective is to learn about such heterogeneity across contexts or, the other side of the coin, to demonstrate stability across contexts, then one should conduct studies that seek unusual contextual conditions!


Does Islamic rule boost women’s opportunities? RD evidence from Turkey

From a remarkable study by Erik Meyersson in the new Econometrica, highlights from the abstract:

In 1994, an Islamic party [in Turkey] won multiple municipal mayor seats across the country. Using a regression discontinuity (RD) design, I compare municipalities where this Islamic party barely won or lost elections…The RD results reveal that, over a period of six years, Islamic rule increased female secular high school education. Corresponding effects for men are systematically smaller and less precise. In the longer run, the effect on female education remained persistent up to 17 years after, and also reduced adolescent marriages. An analysis of long-run political effects of Islamic rule shows increased female political participation and an overall decrease in Islamic political preferences. The results are consistent with an explanation that emphasizes the Islamic party’s effectiveness in overcoming barriers to female entry for the poor and pious.

Ungated version posted on Meyersson’s website: link. On the Econometrica website: link.