matching with multilevel data, discussing some strategies

Nyasha, a PhD candidate from the Netherlands, writes,

I am evaluating a food aid program for HIV/AIDS afflicted families and individuals in Zambia. This is the data I have:
  1. 4 zones were selected in an urban area. This purposive selection was based on HIV prevalence data; these 4 zones appear to have slightly higher rates than others.
  2. Proxy means testing was then used to select households in these 4 zones. The selection criteria were based on a combination of household and individual characteristics.
  3. I have data from 200 “treated” households from the 4 selected zones and data from 200 similar “control” households from 4 zones that were not selected.
  4. Within these households, I am interested in assessing outcomes at individual level. I have personal medical data for HIV patients (300 observations), household consumption data (400 observations) and personal labour supply data for everyone in each household (1935 observations). The first chapter of my thesis looks at patient level data, the second chapter looks at household consumption data and the last one looks at individual labour supply data.
A few questions:
  • How can I proceed with creating propensity scores? I would like to analyse the program’s impact on both household level outcomes and individual level outcomes. Do I use a propensity score model defined at household level when I am assessing individual outcomes? Is it justified to use three different propensity score models, one for each level of analysis?
  • How do I incorporate the geographical selection in the matching? When I include the zone dummies in the logit model (psmatch2), I have a separation problem, with 6 of the dummies being dropped from the model. Should I continue to use this logit model? Or should I completely drop the zone dummies?
  • What do I do with the HIV rate? There is a difference between selected zones and non-selected zones. However, HIV rates do not appear to be correlated with any of my outcomes. I was thinking of using it as an instrument in IV regressions. Is it a justifiable instrument, especially when the variable is only available for 8 cluster zones? My preliminary diagnostic tests show it’s a valid instrument.
  • Can I also include the HIV rate variable as a cluster/geographic level covariate in PSM? Or do I exclude it as it appears more to be an instrument?

Some of the questions are probably best posed to someone working in your discipline. But let me respond to the general question about matching on multilevel data. From what I understand, your treated and control individuals are from different communities. Whether or not this is the case, there are different ways to do the matching with multilevel data:

One way is to first use community level data to match communities that include treated people with communities that do not. Then, within each of the matched pairs or matched sets, find individual level matches for each of the treated individuals by drawing controls only from the matched community. That would be a two-stage matching approach, and it makes sense if you think that community level factors are really important.
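
To make the two-stage idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the toy communities, the single community covariate (think HIV rate), the single individual covariate (think age), and the function names. It uses plain nearest-neighbor matching with replacement at both stages, which is just one of many possible choices.

```python
# Hypothetical toy data: community id -> (treated?, community-level covariate).
communities = {
    "A": (True, 0.21), "B": (True, 0.16),
    "C": (False, 0.20), "D": (False, 0.15),
}
# Hypothetical individuals: (person id, community id, individual covariate).
people = [
    ("a1", "A", 30), ("a2", "A", 45),
    ("b1", "B", 52),
    ("c1", "C", 28), ("c2", "C", 47), ("c3", "C", 60),
    ("d1", "D", 50), ("d2", "D", 33),
]

def match_communities(communities):
    """Stage 1: pair each treated community with the control community
    that is closest on the community-level covariate (with replacement)."""
    controls = {c: x for c, (t, x) in communities.items() if not t}
    pairs = {}
    for c, (t, x) in communities.items():
        if t:
            pairs[c] = min(controls, key=lambda k: abs(controls[k] - x))
    return pairs

def match_individuals(people, pairs):
    """Stage 2: for each treated individual, pick the closest control
    individual, drawing only from the matched control community."""
    matches = {}
    for pid, comm, x in people:
        if comm in pairs:
            pool = [(q, xq) for q, cq, xq in people if cq == pairs[comm]]
            matches[pid] = min(pool, key=lambda p: abs(p[1] - x))[0]
    return matches

community_pairs = match_communities(communities)
individual_matches = match_individuals(people, community_pairs)
print(community_pairs)
print(individual_matches)
```

The point of the structure is that a treated person in community A can never be matched to someone in D, however similar they look individually, because A was paired with C at the first stage.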

Another way is to simply load in the community level variables along with the individual level variables and match on everything at the same time. It makes sense when you think that community level variables are no more or less important than individual level variables.
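
A rough sketch of this one-stage alternative, again with made-up data and names: each row carries an individual covariate and a community covariate, both are scaled by their standard deviations, and each treated unit is matched to the nearest control on the combined distance. The equal weighting of the two covariates is an assumption of the sketch, and it is exactly the "no more or less important" stance described above.

```python
from statistics import pstdev

# Hypothetical rows: (id, treated, individual covariate, community covariate).
rows = [
    ("t1", 1, 30, 0.21),
    ("t2", 1, 50, 0.16),
    ("c1", 0, 31, 0.20),
    ("c2", 0, 49, 0.15),
    ("c3", 0, 40, 0.18),
]

def standardize(covariates):
    """Scale each covariate by its standard deviation so that no variable
    dominates the distance merely because of its units."""
    sds = [pstdev(col) for col in zip(*covariates)]
    return [[v / s for v, s in zip(row, sds)] for row in covariates]

def nearest_control(rows):
    """Match every treated unit to its nearest control, using individual
    and community covariates in the same distance (with replacement)."""
    z = standardize([row[2:] for row in rows])
    treated = [i for i, row in enumerate(rows) if row[1] == 1]
    controls = [i for i, row in enumerate(rows) if row[1] == 0]

    def dist(i, j):
        return sum((a - b) ** 2 for a, b in zip(z[i], z[j]))

    return {
        rows[i][0]: rows[min(controls, key=lambda j: dist(i, j))][0]
        for i in treated
    }

matches = nearest_control(rows)
print(matches)
```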

In practice, the two approaches may generate nearly identical solutions. But such may not be the case for you, in which case you need to decide whether you think the community level variables are of paramount importance or not.

The matching can be done with pscores, coarsened exact matching, nearest neighbor, genetic matching or something else, whatever you like. There are benefits and downsides to each. I have used genetic matching because in theory it obtains the best outcome that either pscores or Mahalanobis distance nearest neighbor matching can obtain. I have also used coarsened exact matching because of its transparency and ease of interpretation. Another alternative would be to use a generalized weighting algorithm, but I don’t think there is readily available software for it yet (although some of Jens Hainmueller’s current work seems to be promising).
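
To illustrate why coarsened exact matching is so transparent, here is a bare-bones sketch of its core step. The household data, the covariates, and the cut points are all invented for illustration, and the sketch stops at forming matched strata (it does not compute the stratum weights that a full implementation would use).

```python
from collections import defaultdict

# Hypothetical household rows: (id, treated, household size, monthly income).
households = [
    ("h1", 1, 3, 120), ("h2", 1, 7, 90), ("h3", 1, 5, 400),
    ("h4", 0, 4, 140), ("h5", 0, 8, 60), ("h6", 0, 2, 900),
]

def coarsen(size, income):
    """Hand-chosen (hypothetical) bins. In practice the cut points are a
    substantive choice that should be reported."""
    size_bin = "small" if size <= 4 else "large"
    income_bin = "low" if income < 200 else "high"
    return (size_bin, income_bin)

def cem(rows):
    """Group units by their coarsened covariates and keep only strata that
    contain at least one treated and at least one control unit."""
    strata = defaultdict(lambda: {0: [], 1: []})
    for uid, treated, size, income in rows:
        strata[coarsen(size, income)][treated].append(uid)
    return {key: units for key, units in strata.items() if units[0] and units[1]}

matched_strata = cem(households)
for key, units in matched_strata.items():
    print(key, units)
```

Here h3 and h6 fall into strata with no counterpart and are dropped, which is easy to explain to a reader: those units simply have no comparable match at the chosen level of coarsening.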

On some of your other questions: the separation problem with the zone dummies seems to be due to the fact that some zones had no treated or no controls. If there are some zones that have both, you might restrict your analysis to those zones. You might then do a second analysis that adds in matched data created according to the first option above.
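
A quick way to see where the separation comes from is to tabulate treatment status by zone before fitting anything. The sketch below uses invented records; zone 5, which has both treated and control households, is purely illustrative and is not something the data described above are known to contain.

```python
from collections import defaultdict

# Hypothetical (zone, treated) records: zones 1-2 contain only treated
# households, zones 3-4 only controls, zone 5 (illustrative) has both.
records = [(1, 1), (1, 1), (2, 1), (3, 0), (3, 0), (4, 0), (5, 1), (5, 0)]

def treatment_status_by_zone(records):
    """Collect the set of treatment values observed in each zone."""
    seen = defaultdict(set)
    for zone, treated in records:
        seen[zone].add(treated)
    return dict(seen)

status = treatment_status_by_zone(records)

# Zones whose dummy perfectly predicts treatment (the source of separation).
separated = sorted(z for z, s in status.items() if len(s) == 1)
# Zones with both treated and controls, where within-zone matching is possible.
overlap = sorted(z for z, s in status.items() if s == {0, 1})

print("separated:", separated)
print("overlap:", overlap)
```

If the overlap list comes back empty, there is nothing to restrict to, and the zone dummies carry no information beyond the treatment indicator itself.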

Indeed, if you think something has the properties of an instrument, then you do not want to include it in the matching algorithm. Matching on an instrument can amplify whatever bias remains from unmeasured confounders.

Nyasha followed up,

Another question I have is this: four zones or communities in my data have treated households only. The other four have controls only. That is why I think I am encountering the separation problem when I add community dummies into the logit equation.

Most certainly that will create such problems.

I have a few other observed community characteristics, which I have already included, but how best do I then control for unobserved effects (especially endogenous program placement) at community level, if I cannot include the community dummies in the propensity score model?

Alas, you cannot. You need to make an assumption that the measured covariates capture all of the relevant differences between the communities, and then match using these measured covariates.

Should I also carry out further regression analysis on the matched sample, where I then include the community dummies (as fixed effects)?

You will not be able to do this because of perfect collinearity with the treatment indicators. There is really no way to account for unmeasured community level factors. The best you can do is use the measured information to match, and you can also include these community level covariates in whatever regressions you use. Then, perhaps you can conduct a sensitivity analysis.

Or should I also look into using an IV with community fixed effects? Would this work for cross section data?

Again, you won’t be able to do it because the community fixed effects will be perfectly collinear with the treatment indicator, so a second stage regression with community fixed effects would not be identified.
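
The collinearity is easy to verify directly. In the sketch below (toy data, hypothetical names), zones 1 and 2 are entirely treated and zones 3 and 4 entirely control, so the treatment column is just the sum of the first two zone dummies. Adding it to the matrix of zone dummies therefore does not raise the rank. The rank is computed by plain Gaussian elimination with exact fractions, to avoid leaning on any outside library.

```python
from fractions import Fraction

def rank(matrix):
    """Matrix rank by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

# Hypothetical households: zones 1 and 2 all treated, zones 3 and 4 all control.
zones = [1, 1, 2, 2, 3, 3, 4, 4]
treated = [1, 1, 1, 1, 0, 0, 0, 0]

dummies = [[int(z == k) for k in (1, 2, 3, 4)] for z in zones]
with_treatment = [row + [t] for row, t in zip(dummies, treated)]

rank_dummies = rank(dummies)
rank_with_treatment = rank(with_treatment)
print(rank_dummies, rank_with_treatment)
```

Five columns but only rank four: once the zone fixed effects are in the model, there is no variation in treatment left over to identify its coefficient.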

Did I miss something here?

letter to Senators regarding USIP defunding

UPDATE: A petition is available online: link.

Dear Senator __________,

The House has voted to cut funding for the United States Institute of Peace. I write to urge you not to make the same error.

What USIP provides is a means by which the government ensures itself access to a diverse, independent, and up-to-date pool of expert knowledge on conflicts around the world. The current events in the Arab world should make it clear how important it is to have such a resource.

USIP supports independent researchers who help our government to navigate adeptly a turbulent and complex world. Adept management of Americans’ foreign affairs, like clean air, is a classic “public good.” Thus, it requires government support. It is a basic economic principle that private market forces will fail in providing these kinds of goods. No private market actor is capable of internalizing all of the relevant costs and benefits. Therefore, no private market actor will find it in their interest to look after the national interest in an efficient and effective manner. USIP is an important means by which our government looks after Americans’ well-being.

The internal mechanisms for knowledge creation in the government are inadequate on their own and cannot substitute for the diverse network of researchers that USIP brings together.

The gap left by the potential removal of USIP, a nonpartisan agency that helps guide research in a direction serving our country’s goals overseas, will most certainly be filled by considerably less reliable partisan voices and business interests. This would handicap our government’s foreign policy.

USIP’s budget is a minuscule fraction of overall spending, but its impact is great. The cost benefit equation is clearly on the side of sustaining USIP support, especially when considered relative to other items that draw on considerably more government funds with considerably less reward.

I hope you will make the wise decision and reject any move to defund USIP.

Sincerely,

Cyrus D. Samii
Fellow, MacMillan Center, Yale University
Assistant Professor (as of July 2011), Politics Department, New York University

(technical) yet more on clustering and standard errors: clustering in the regressors

A little technical note on how correlation in regressors, which can be measured, can sometimes provide guidance in choosing what kind of standard error to use: correlation_in_x110217

This is pretty much straight Moulton factor (why is there no Wikipedia entry on Moulton factor to link to?). I still need to reconcile this stuff with what I had shown a few months ago about how the Moulton factor leads to the wrong conclusion in the context of correlated potential outcomes, even if treatment assignment is at the unit level (yes, I said unit level, not cluster level). For a refresher on that, see here: link1 link2. A big difference in the document that is attached to this post is that we are looking at a vanilla “constant effects” set-up, whereas the potential outcomes stuff was agnostic on unit-by-unit differences in effect sizes.
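
For readers who have not seen it, here is a small sketch of the Moulton factor in its common textbook form, which may differ in detail from the derivation in the attached note. The function name and arguments are my own labels: `rho_x` is the intraclass correlation of the regressor, `rho_e` is the intraclass correlation of the errors, and the factor is the approximate ratio of the true sampling variance to the conventional OLS variance.

```python
from statistics import mean, pvariance

def moulton_factor(cluster_sizes, rho_x, rho_e):
    """Approximate variance inflation from ignoring clustering:
    1 + (Var(n_g) / n_bar + n_bar - 1) * rho_x * rho_e,
    where n_g are the cluster sizes and n_bar is their mean."""
    n_bar = mean(cluster_sizes)
    var_n = pvariance(cluster_sizes)
    return 1 + (var_n / n_bar + n_bar - 1) * rho_x * rho_e

sizes = [10] * 5  # hypothetical: five clusters of ten units each

# Regressor uncorrelated within clusters: no inflation, whatever rho_e is.
print(moulton_factor(sizes, rho_x=0.0, rho_e=0.3))
# Regressor fixed within clusters: reduces to 1 + (n - 1) * rho_e.
print(moulton_factor(sizes, rho_x=1.0, rho_e=0.1))
```

The standard error inflation is the square root of this factor. The first call shows the practical point: the within-cluster correlation of the regressor can be measured, and when it is near zero the conventional standard errors lose little under the constant effects set-up.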

because we feel like it

Sometimes we just “feel” like doing something. By my reading of recent neuroscience, these situations may arise because somewhere in our brain there are processes that have determined that this “something” is optimal and the signals from these processes have overwhelmed signals from others that may have come to a contrary conclusion.

Our thoughts and actions are the result of numerous parallel processes. They are sometimes combined in an apparently sensible way giving us the illusion of an integrated self (link). But sometimes they do not come together in a sensible way and so we cannot immediately intuit a reason. We just feel like it.

The number of ways in which external stimuli and those parallel processes can mix is vast. So our urges to do things may take into account a vast number of dimensions of which we are barely aware. So long as we let ourselves occasionally take actions because we “feel like it,” these processes reveal a preference ordering that we cannot access intentionally. In doing so we discover features of our inaccessible inner preference ordering. One implication is that we can misjudge ourselves just as much as we can misjudge others (link).
