“Random routes” and other methods for sampling households in the field

Himelein et al. have a draft working paper (link) covering methods for household sampling in the field when you don’t have administrative lists of households or when full on-site enumeration is not possible. These include various “random route”/“random walk” methods as well as methods that use satellite data. Some choice tidbits:

On using satellite maps to construct a frame: “Based on the experience mapping the three PSUs used in the paper, it takes about one minute per household to construct an outline. If the PSUs contain approximately 250 structures (the ones used here contain 68, 309, and 353 structures, respectively), mapping the 106 PSUs selected for the full Mogadishu High Frequency Survey would have required more than 50 work days.” Yikes! Of course they probably could have cut this time down if they sampled subclusters within the PSUs and only enumerated those. Nonetheless, the 1-minute/household estimate is a useful rule of thumb.
They define the “Mecca method” as choosing a random set of GPS locations in an area and then walking from each one in a fixed direction (e.g., the direction of Mecca, which almost everyone in Mogadishu knows) until you hit an eligible structure. The method amounts to a form of probability proportional to size (PPS) sampling, where “size” is the area on the ground from which there is an unobstructed path to the structure. This may not be an easy thing to measure, although the authors propose that one could approximate the PPS weights using the distance between the selected household and the next household going up the line that was traveled. It is also possible that some random points induce paths that never come upon an eligible structure. This would create field complications, particularly in non-urban settings where domicile layouts may be sparse.
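To make the PPS logic concrete, here is a minimal sketch of my own (not the authors' code) that collapses the problem to one dimension: structures sit along a single street, the walker always heads in the +x direction, and a structure's “size” is the empty stretch behind it. The function names, street layout, starting points, and consumption values are all made up for illustration.

```python
import bisect


def capture_gaps(positions):
    """Empty stretch behind each structure: its 'size' in this 1-D PPS setup."""
    gaps, prev = [], 0.0
    for p in positions:
        gaps.append(p - prev)
        prev = p
    return gaps


def mecca_select(positions, start):
    """Walk in the +x direction from `start`; return the index of the first
    structure reached, or None if the path never meets one."""
    i = bisect.bisect_left(positions, start)
    return i if i < len(positions) else None


def weighted_mean(values, gaps, picks):
    """Ratio (Hajek-style) estimate using weights 1/gap for selected structures."""
    num = sum(values[i] / gaps[i] for i in picks)
    den = sum(1.0 / gaps[i] for i in picks)
    return num / den


# Hypothetical street of length 10 with four structures (sorted positions).
positions = [1.0, 2.0, 6.0, 7.0]
consumption = [10.0, 12.0, 30.0, 11.0]   # made-up outcome values
gaps = capture_gaps(positions)            # [1.0, 1.0, 4.0, 1.0]

starts = [0.5, 3.5, 5.0, 8.0]             # made-up random GPS points
picks = [mecca_select(positions, s) for s in starts]   # [0, 2, 2, None]

# Drop starting points whose path never reached a structure.
hits = [i for i in picks if i is not None]
unweighted = sum(consumption[i] for i in hits) / len(hits)
weighted = weighted_mean(consumption, gaps, hits)
print(unweighted, weighted)
```

The structure at 6.0 has a long empty stretch behind it, so it gets picked more often; weighting by the inverse of that stretch pulls the estimate back down. The starting point at 8.0 illustrates a path that never meets a structure.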

The authors take images of domicile patterns in Mogadishu and some information on consumption variable distributions to construct simulations. They use the simulations to evaluate satellite-based full enumeration, field listing within PSU segments, interviewing within GPS-defined grid squares, the Mecca method, and the Afrobarometer “random walk” approach. No surprise that satellite-based full enumeration was the least biased, with segmentation next, and then the Mecca method with PPS weights and with approximate PPS weights third and fourth. All four of these performed quite well, with little bias. The grid, random walk, and unweighted Mecca methods were quite biased. Such bias needs to be weighed against costs and ability to validate. Satellite full enumeration is costly but one can validate. The segment method is also costly and rather hard to enumerate. The grid method fares poorly on both counts. The Mecca method with true PPS weights is somewhat costly, but with approximate PPS weights is quite good on both counts. The random walk is cheap but hard to validate. Again, I would say that some of these results may be particular to the setting (relatively dense settlement in an urban area). But the insights are certainly useful.
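For intuition on why the unweighted Mecca method comes out biased while the weighted version does not, here is a toy Monte Carlo of my own. It is not the authors' simulation and uses none of their data: it assumes a one-dimensional street, randomly generated gaps between structures, and an invented outcome model in which consumption rises with the empty stretch behind a structure. Every parameter and name below is a hypothetical choice.

```python
import bisect
import random
from itertools import accumulate


def simulate(n_structures=500, n_sample=50, n_reps=2000, seed=1):
    rng = random.Random(seed)
    # Synthetic street: the gap behind each structure is at least 0.1 units.
    gaps = [0.1 + rng.expovariate(1.0) for _ in range(n_structures)]
    positions = list(accumulate(gaps))
    length = positions[-1]
    # Assumed outcome model: consumption rises with the gap behind a structure.
    y = [10.0 + 5.0 * g + rng.gauss(0.0, 1.0) for g in gaps]
    true_mean = sum(y) / n_structures

    unweighted, weighted = [], []
    for _ in range(n_reps):
        picks = []
        for _ in range(n_sample):
            # Drop a random point and walk in the +x direction to the next structure.
            start = rng.uniform(0.0, length)
            idx = bisect.bisect_left(positions, start)
            picks.append(min(idx, n_structures - 1))
        unweighted.append(sum(y[i] for i in picks) / n_sample)
        # Inverse-gap weights undo the selection proportional to gap length.
        num = sum(y[i] / gaps[i] for i in picks)
        den = sum(1.0 / gaps[i] for i in picks)
        weighted.append(num / den)

    return {
        "true_mean": true_mean,
        "bias_unweighted": sum(unweighted) / n_reps - true_mean,
        "bias_weighted": sum(weighted) / n_reps - true_mean,
    }


result = simulate()
print(result)
```

In this toy setup the unweighted estimate overshoots because structures with long empty stretches behind them are both oversampled and, by construction, better off. How large the gap is in practice depends on how strongly the outcome tracks the spacing of structures, which this sketch simply assumes.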

I found this paper via David Evans’s fantastic summary of the recently concluded Annual Bank Conference on Confronting Fragility and Conflict in Africa: link.