Clustering, unit-level randomization, and insights from multisite trials

Another update to the previous post (link) on clustering of potential outcomes even when randomization occurs at the unit level within clusters: researching the topic a bit more, I discovered that the literature on “multisite trials” addresses precisely these issues. For example, a paper by Raudenbush and Liu (2000; link) examines the consequences of site-level heterogeneity in outcomes and treatment effects. They formalize a balanced multisite experiment with a hierarchical linear model, $latex Y_{ij} = \beta_{0j} + \beta_{1j}X_{ij} + r_{ij}$, where $latex r_{ij} \sim \mathrm{i.i.d.}\ N(0,\sigma^2)$ and $latex X_{ij}$ is a centered treatment variable (-0.5 for control, 0.5 for treated). In this case, an unbiased estimator for the site-specific treatment effect, $latex \hat \beta_{1j}$, is given by the difference in means between treated and control units at site $latex j$. The variance of this estimator over repeated experiments in different sites is $latex \tau_{11} + 4\sigma^2/n$, where $latex \tau_{11}$ is the variance of the $latex \beta_{1j}$’s over sites and $latex n$ is the (constant) number of units at each site. An unbiased estimator for the average treatment effect over all sites, $latex 1,\ldots,J$, is then simply the average of these site-specific estimates, with variance $latex \frac{\tau_{11} + 4\sigma^2/n}{J}$.

What distinguishes this model from the one that I examined in the previous post is that, once the site-specific intercept is taken into account, no residual clustering remains (hence the i.i.d. $latex r_{ij}$’s). Also, heterogeneity in treatment effects is expressed in terms of a simple random effect (implying constant within-group correlation conditional on treatment status). These assumptions are what deliver the clean and simple expression for the variance of the site-specific treatment effect estimator. That expression may understate the variance in the situations that I examined, where residual clustering was present.
It would be useful to study how well this expression approximates what happens in the more complicated data generating process that I set up.
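
As a starting point for that kind of check, here is a minimal simulation sketch of the simple model itself, not of the more complicated data generating process from the previous post. It draws repeated balanced multisite experiments and compares the empirical variance of the averaged difference-in-means estimator with $latex \frac{\tau_{11} + 4\sigma^2/n}{J}$. Several things here are my own assumptions for illustration: the function name `simulate_ate_variance`, all parameter values, the normal distribution for the site-level effects, an even split of $latex n/2$ units per arm at each site, and an intercept variance `tau00` (which drops out of the difference in means anyway).

```python
import numpy as np


def simulate_ate_variance(J=20, n=40, tau11=0.25, sigma2=1.0, tau00=0.5,
                          mean_effect=0.3, reps=4000, seed=0):
    """Compare the simulated variance of the ATE estimator with the formula.

    All defaults are illustrative assumptions, not values from the paper.
    Returns (empirical_variance, theoretical_variance).
    """
    rng = np.random.default_rng(seed)
    half = n // 2  # assumed balanced design: n/2 control, n/2 treated per site

    # Centered treatment indicator: -0.5 for control, 0.5 for treated.
    x = np.concatenate([np.full(half, -0.5), np.full(half, 0.5)])

    # Site-level intercepts and treatment effects, one draw per (rep, site).
    beta0 = rng.normal(0.0, np.sqrt(tau00), size=(reps, J))
    beta1 = rng.normal(mean_effect, np.sqrt(tau11), size=(reps, J))

    # i.i.d. unit-level residuals: no clustering beyond the site intercept.
    r = rng.normal(0.0, np.sqrt(sigma2), size=(reps, J, n))

    # Outcomes Y_ij = beta0_j + beta1_j * X_ij + r_ij, shape (reps, J, n).
    y = beta0[..., None] + beta1[..., None] * x + r

    # Site-specific estimate: treated mean minus control mean.
    site_est = y[..., half:].mean(axis=-1) - y[..., :half].mean(axis=-1)

    # ATE estimate for each replication: average over the J sites.
    ate_est = site_est.mean(axis=1)

    empirical = ate_est.var(ddof=1)
    theoretical = (tau11 + 4.0 * sigma2 / n) / J
    return empirical, theoretical


if __name__ == "__main__":
    emp, theo = simulate_ate_variance()
    print(f"empirical variance:   {emp:.5f}")
    print(f"theoretical variance: {theo:.5f}")
```

With these default values the formula gives (0.25 + 4/40)/20 = 0.0175, and the simulated variance should land close to it. To probe the situation from the previous post, one would replace the i.i.d. draw of `r` with a residual process that stays clustered within sites and see how far the empirical variance drifts from the formula.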