### Fun math: sum of integers

Math is often about finding the right analogy, frequently a spatial one.  Suppose you want to compute the sum of the first $latex N$ integers, $latex S = 1+2+3+ \hdots + N$. Consider decomposing the sum as,

$latex \begin{array}{cccccc}S = & 1 & + 1 & + 1 & + \hdots & + 1\\ & & + 1 & + 1 & + \hdots & + 1 \\ & & & + 1 & + \hdots & + 1\\ & & & & \vdots & \\ & & & & & + 1,\end{array}$

where you’ll note the column sums equal the integers in the sequence.  You’ll have a bunch of ones that form a triangle. Imagine taking this triangle, copying it, flipping the copy, and then joining it to the original triangle. Removing the “$latex +$” signs, you’d get something that looks like,

$latex \begin{array}{ccccc} 1 & 1 & 1 & \hdots & 1\\ (1) & 1 & 1 & \hdots & 1\\ (1) & (1) & 1 & \hdots & 1\\ & & & \vdots & \\ (1) & (1) & (1) & \hdots & 1 \\ (1) & (1) & (1) & \hdots & (1), \end{array}$

where I’ve put parentheses on the $latex 1$’s from the second, copied triangle. By analogy, the sum of the integers from the original sequence is equal to half the area of the rectangle formed by this matrix of ones—that is, a rectangle of height $latex N+1$ and width $latex N$. As such, $latex S = N(N+1)/2$.  This comes up, e.g., in the asymptotic approximation for the Wilcoxon signed rank test (link).
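As a quick sanity check, here is a small, purely illustrative Python sketch verifying the identity for a few values of $latex N$:

```python
# Check S = N(N+1)/2 -- the "two triangles make a rectangle" argument --
# for a handful of values of N. (Illustrative only; N is just a loop bound.)
for N in (1, 5, 10, 100):
    S = sum(range(1, N + 1))      # 1 + 2 + ... + N, the column sums
    assert S == N * (N + 1) // 2  # half of the (N+1)-by-N rectangle of ones
```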

### Mundane algebra: stratified mean and IPW mean

Came up in a conversation, so I just wanted to store it: the stratified mean and inverse-probability weighted mean are algebraically equivalent:

$latex \underbrace{N^{-1}\sum_{s=1}^S \sum_{i \in s}y_{is}\frac{R_{is}N_s}{n_s}}_{\text{IPW mean}} = \sum_{s=1}^S\frac{N_s}{N}\frac{1}{n_s}\sum_{i \in s}y_{is}R_{is} = \underbrace{\sum_{s=1}^S\frac{N_s}{N}\bar{y}_s}_{\text{stratified mean}}$,

where $latex N$ is the population size; $latex N_s$ and $latex n_s$ are the stratum-$latex s$ population and sample sizes, respectively; $latex R_{is}$ is the response indicator for unit $latex i$ in stratum $latex s$; and $latex \bar{y}_s = n_s^{-1}\sum_{i \in s}y_{is}R_{is}$ is the stratum-$latex s$ sample mean.
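The identity is easy to check numerically. Here is a sketch on a made-up two-stratum population (all sizes and outcome values below are hypothetical):

```python
# Numeric check: the IPW mean with weights N_s / n_s equals the stratified
# mean. The "y" lists hold the respondents' outcomes in each stratum.
strata = [
    {"N_s": 100, "y": [1.0, 2.0, 3.0]},       # stratum 1, n_s = 3 respondents
    {"N_s": 300, "y": [2.0, 4.0, 4.0, 6.0]},  # stratum 2, n_s = 4 respondents
]
N = sum(s["N_s"] for s in strata)  # population size

# IPW mean: sum over respondents of y * (N_s / n_s), divided by N.
ipw = sum(y * s["N_s"] / len(s["y"]) for s in strata for y in s["y"]) / N

# Stratified mean: population-share-weighted stratum sample means.
strat = sum((s["N_s"] / N) * sum(s["y"]) / len(s["y"]) for s in strata)

assert abs(ipw - strat) < 1e-12
```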

### “Positive” versus “negative” causal identification

Reflecting on some of the recent discussions of matching as a tool for causal analyses in social science (see here as well as this really nice commentary—hat tip, Chris Blattman), I wonder if it’s useful to make a distinction between “positive” versus “negative” causal identification.

### Clustering, unit level randomization, and inference (updated, Nov 5)

I wanted to look into the case where you have an experiment in which your units of analysis are naturally clustered (e.g., households in villages), but you randomize within clusters.  The goal is to estimate a treatment effect in terms of difference in means, using design-based principles and frequentist methods for inference.

Randomization ensures that the difference in means is unbiased for the sample average treatment effect.  Using randomization alone as the basis for inference, the variance of this estimator is not identified for the sample, because it depends on the covariance of the potential outcomes, which is never observed.  The usual sample estimators for the variance are conservative, however.  If the experiment is run on a random sample from an infinitely large population, then the standard methods are unbiased both for the mean and for the variance of the difference-in-means estimator applied to that population (refer to Neyman, 1990; Rubin, 1990; Imai, 2008; for finite populations, things are more complicated, and the infinite-population assumption is often a reasonable approximation).  As I understand it, these are the principles that justify the usual frequentist estimation techniques for inference on population-level treatment effects in randomized experiments.
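These two properties are easy to see in a toy example. The sketch below uses hypothetical potential outcomes for six units and complete randomization of half of them to treatment; it enumerates every assignment to check that the difference in means averages to the sample ATE, and that the Neyman-style variance estimator is conservative (its expectation exceeds the true randomization variance when treatment effects are heterogeneous):

```python
import itertools
import statistics

# Hypothetical potential outcomes (y0, y1) for a sample of 6 units.
po = [(0, 1), (1, 3), (2, 2), (1, 2), (0, 0), (3, 4)]
n = len(po)
m = n // 2  # number of units assigned to treatment
sate = statistics.mean(y1 - y0 for y0, y1 in po)  # sample ATE

# Enumerate all complete randomizations; record the difference-in-means
# estimate and the usual (Neyman) variance estimate for each.
ests, vhats = [], []
for treated in itertools.combinations(range(n), m):
    t = set(treated)
    y1s = [po[i][1] for i in t]                    # observed treated outcomes
    y0s = [po[i][0] for i in range(n) if i not in t]  # observed controls
    ests.append(statistics.mean(y1s) - statistics.mean(y0s))
    vhats.append(statistics.variance(y1s) / m
                 + statistics.variance(y0s) / (n - m))

# Unbiasedness for the sample ATE, and conservativeness of the variance
# estimator relative to the exact randomization variance.
true_var = statistics.pvariance(ests)
assert abs(statistics.mean(ests) - sate) < 1e-9
assert statistics.mean(vhats) > true_var
```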

The question I had was: how should we account for dependencies in potential outcomes within clusters?