Monthly Archives: June 2015

summer reading: “Reinventing the Bazaar” by McMillan

For a market to function well, [1] you must be able to trust most of the people most of the time [to live up to contractual obligations]; [2] you must be secure from having your property expropriated; [3] information about what is available where at what quality must flow smoothly; [4] any side effects of third parties must be curtailed; and [5] competition must be at work.

So concludes John McMillan in his magisterial and highly engaging 2002 book on institutions and markets, Reinventing the Bazaar: A Natural History of Markets (amazon). McMillan provides fantastic examples from across time and around the world on how formal and informal institutions have served in meeting these five conditions. Examples range from produce vendors in the Makola market in Accra to bidders for public construction contracts in Tokyo.

I read this while traveling through DR Congo over the past two weeks. It helped open my eyes to the various third-party roles that the state, armed groups, and traditional elites play in market exchange there.


“Random routes” and other methods for sampling households in the field

Himelein et al. have a draft working paper (link) covering methods for household sampling in the field when you don’t have administrative lists of households and full on-site enumeration is not possible. These include various “random route”/“random walk” methods as well as methods that use satellite data. Some choice tidbits:

  • On using satellite maps to construct a frame: “Based on the experience mapping the three PSUs used in the paper, it takes about one minute per household to construct an outline. If the PSUs contain approximately 250 structures (the ones used here contain 68, 309, and 353 structures, respectively), mapping the 106 PSUs selected for the full Mogadishu High Frequency Survey would have required more than 50 work days.” Yikes! Of course they probably could have cut this time down if they sampled subclusters within the PSUs and only enumerated those. Nonetheless, the 1-minute/household estimate is a useful rule of thumb.
  • They define the “Mecca method” as choosing a random set of GPS locations in an area, and then walking in a fixed direction (e.g., the direction of Mecca, which almost everyone in Mogadishu knows) until you hit an eligible structure. The method amounts to a form of probability proportional to size (PPS) sampling, where “size” in this case is the area on the ground that allows for an unobstructed path to the structure. This may not be such an easy thing to measure, although the authors propose that one could approximate the PPS weights using the distance between the selected household and the next household going up the line that was traveled. It is also possible that some random points induce paths that never come upon an eligible structure. This would create field complications, particularly in non-urban settings where domicile layouts may be sparse.
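As a rough check on the mapping-time arithmetic in the first bullet, here is a minimal Python sketch. The one-minute-per-structure rate, the 250 structures per PSU, and the 106 PSUs come from the quote. The 8-hour work day, the function name `mapping_workdays`, and the `share_enumerated` knob (for enumerating only sampled subclusters within each PSU) are my own assumptions.

```python
def mapping_workdays(n_psus, structures_per_psu, minutes_per_structure=1.0,
                     hours_per_workday=8.0, share_enumerated=1.0):
    """Back-of-envelope work days needed to outline structures from satellite maps.

    hours_per_workday and share_enumerated are illustrative assumptions,
    not figures from the paper.
    """
    total_minutes = (n_psus * structures_per_psu * share_enumerated
                     * minutes_per_structure)
    return total_minutes / 60.0 / hours_per_workday


# Full mapping of all structures in every PSU.
full = mapping_workdays(106, 250)

# Hypothetical alternative: enumerate only a quarter of each PSU.
quarter = mapping_workdays(106, 250, share_enumerated=0.25)

print(f"full mapping: {full:.1f} work days; quarter subclusters: {quarter:.1f} work days")
```

Under these assumptions the full mapping comes to roughly 55 work days, which is consistent with the “more than 50 work days” in the quote, and enumerating a quarter of each PSU would bring it down to about 14.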

The authors take images of domicile patterns in Mogadishu and some information on consumption variable distributions to construct simulations. They use the simulations to evaluate satellite-based full enumeration, field listing within PSU segments, interviewing within GPS-defined grid squares, the Mecca method, and the Afrobarometer “random walk” approach. No surprise that satellite-based full enumeration was the least biased, with segmentation next, and then the Mecca method with PPS weights and with approximate PPS weights third and fourth. All four of these performed well, with little bias. The grid, random walk, and unweighted Mecca methods were quite biased. Such bias needs to be weighed against costs and ability to validate. Satellite full enumeration is costly but one can validate. The segment method is also costly and rather hard to enumerate. The grid method fares poorly on both counts. The Mecca method with true PPS weights is somewhat costly, but with approximate PPS weights is quite good on both counts. The random walk is cheap but hard to validate. Again, I would say that some of these results may be particular to the setting (relatively dense settlement in an urban area). But the insights are certainly useful.
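To see why the unweighted Mecca method can be biased while the weighted version is not, here is a toy one-dimensional sketch in Python. This is not the authors’ simulation. It assumes households sit along a single street, that a walker starts at a uniformly random point and walks in one fixed direction, and that consumption happens to rise with the empty stretch in front of each household. All names and numbers (`simulate_mecca_1d`, the gap range, the consumption formula) are made up for illustration.

```python
import bisect
import random


def simulate_mecca_1d(n_households=500, n_points=20000, seed=1):
    """Toy 1-D Mecca-method simulation; returns (true, unweighted, weighted) means."""
    rng = random.Random(seed)

    # Hypothetical layout: gaps[i] is the empty stretch of street in front of
    # household i, so a walker starting anywhere in that stretch reaches i first.
    gaps = [rng.uniform(1.0, 10.0) for _ in range(n_households)]

    # Household positions are the running totals of the gaps.
    positions = []
    running = 0.0
    for g in gaps:
        running += g
        positions.append(running)
    street_length = positions[-1]

    # Assumed outcome: consumption rises with the gap, so selection
    # probability and outcome are correlated (the case where weights matter).
    consumption = [100.0 + 20.0 * g for g in gaps]
    true_mean = sum(consumption) / n_households

    plain_sum = 0.0
    weighted_sum = 0.0
    weight_total = 0.0
    for _ in range(n_points):
        start = rng.uniform(0.0, street_length)
        # Walk in the fixed direction: first household at or beyond the start.
        i = min(bisect.bisect_left(positions, start), n_households - 1)
        w = 1.0 / gaps[i]  # inverse of the (relative) selection probability
        plain_sum += consumption[i]
        weighted_sum += w * consumption[i]
        weight_total += w

    return true_mean, plain_sum / n_points, weighted_sum / weight_total


true_mean, unweighted, weighted = simulate_mecca_1d()
print(f"true {true_mean:.1f}  unweighted {unweighted:.1f}  weighted {weighted:.1f}")
```

In this setup a household’s chance of selection is proportional to the gap in front of it, so the unweighted sample mean over-represents households behind long gaps, and weighting by the inverse gap should pull the estimate back toward the true mean. In one dimension the gap is exactly the distance to the previous household along the line traveled. On the ground that distance is only an approximation to the true PPS “size”.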

I found this paper through David Evans’s fantastic summary of the recently concluded Annual Bank Conference on Confronting Fragility and Conflict in Africa: link.


More on external (and construct) validity

The Papers and Hot Beverages (PHB) blog had a nice discussion (link) of some of the points I raised in my previous post about “pursuing external validity by letting treatments vary” (link). PHB starts by proposing that we can rewrite a simple treatment effects model along the lines of the following (modified from PHB’s expression to make things clearer):

$latex Y = \mu + \rho T + \epsilon$
$latex \hspace{2em} = \mu+\left(\sum_k \alpha_k x_k + \sum_j \sum_k \kappa_{jk} z_j x_k\right) T + \epsilon$.

The idea is that the treatment may bundle various components, captured by the $latex x$ terms, each of which has its own effect. Moreover, each of these components may interact with features of the context, captured by the $latex z$ terms.

The proposal to explore external validity by “letting treatments vary” amounts to trying to identify the effect of one of the $latex x$ components by generating variation in that component that is independent of the other $latex x$ components. Doing so does not, of course, resolve the problem of covariation with the $latex z$ components. So I understand why PHB was not “convinced” that letting treatments vary is sufficient for testing a parsimonious proposition, that is, one that focuses on the effect of a particular component of a treatment bundle without incorporating contextual conditions. I was not trying to propose that the strategy is sufficient in this way, just that it is another way to think about accumulating knowledge across studies.

We can also go further and provide a more complete characterization of the problem of interpreting a treatment effect. Indeed, PHB’s characterization imposes some restrictions relative to the following:

$latex Y = \mu + \rho T + \epsilon$

$latex = \mu + \left( \sum_{k} \alpha_k x_k + \sum_{k}\sum_{k'\ne k} \beta_{kk'}x_kx_{k'} \right.$
$latex \left. + \sum_{j} \sum_{k} \kappa_{jk}z_jx_k + \sum_{j} \sum_{k} \sum_{k'\ne k} \delta_{jkk'}z_jx_kx_{k'} \right)T + \epsilon $

The $latex \alpha$s are effects of elements in $latex x$ that depend on neither other elements of the treatment bundle $latex x$ nor the context $latex z$. The $latex \beta$s are the ways that elements of the treatment bundle modify each other’s effects regardless of context. The $latex \kappa$s are ways that the context modifies the effects of elements of $latex x$ separately. Finally, the $latex \delta$s are ways that context modifies the ways that elements of $latex x$ modify the effects of each other.
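To make the bookkeeping concrete, here is a small Python sketch that evaluates the term in parentheses (the implied $latex \rho$) for given values of $latex x$ and $latex z$. The function name `implied_rho` and every coefficient value are made up for illustration. Leaving `beta` and `delta` out gives the restricted, PHB-style expression.

```python
def implied_rho(x, z, alpha, kappa, beta=None, delta=None):
    """Evaluate the coefficient on T implied by the expansion in the text.

    x: treatment components, z: context features.
    alpha[k], kappa[j][k], beta[k][kp], delta[j][k][kp] mirror the subscripts
    in the text. As in the text, the pairwise sums run over all ordered pairs
    with kp != k, so both beta[0][1] and beta[1][0] enter.
    """
    K, J = len(x), len(z)
    pairs = [(k, kp) for k in range(K) for kp in range(K) if kp != k]

    rho = sum(alpha[k] * x[k] for k in range(K))
    rho += sum(kappa[j][k] * z[j] * x[k] for j in range(J) for k in range(K))
    if beta is not None:
        rho += sum(beta[k][kp] * x[k] * x[kp] for k, kp in pairs)
    if delta is not None:
        rho += sum(delta[j][k][kp] * z[j] * x[k] * x[kp]
                   for j in range(J) for k, kp in pairs)
    return rho


# Hypothetical example: two treatment components, one context feature.
x = [1.0, 1.0]
z = [1.0]
alpha = [0.5, 0.2]
kappa = [[0.1, -0.3]]
beta = [[0.0, 0.4], [0.4, 0.0]]
delta = [[[0.0, 0.1], [0.1, 0.0]]]

restricted = implied_rho(x, z, alpha, kappa)
full = implied_rho(x, z, alpha, kappa, beta, delta)
print(f"restricted rho = {restricted:.2f}, full rho = {full:.2f}")
```

With both components switched on, the restricted expression gives 0.5 while the full one gives 1.5 in this made-up example. Switching the second component off (`x = [1.0, 0.0]`) makes the two agree, since every $latex \beta$ and $latex \delta$ term involves a product of two different components.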

When using causal estimates to develop theories, we typically want to interpret manipulations of $latex T$ in parsimonious terms. The upshot is that in trying to be parsimonious we may ignore elements of $latex x$ or $latex z$. Even if the effect of $latex T$ is well identified, our parsimonious interpretation may not be valid.

This is a mess of an expression. But I find it strangely mesmerizing. It gives some indication of how complicated is the work of interpreting causal effects.