The stable unit treatment value assumption (SUTVA).

A referee raised the SUTVA in a revision that I’m working on. It’s an important topic that gets less discussion than it deserves. Since the topic is often folded into the random assignment assumption,1 I know less about the history of the idea than I should. This post is going to be a quick collection of notes while I chase the idea through the history of econometrics.

I always start with Wooldridge (2010) because it’s where I learned everything the first time.

Wooldridge (2010):

The following discussion assumes that we have an independent, identically distributed sample from the population. This assumption rules out cases where the treatment of one unit affects another’s outcome (possibly through general equilibrium effects as in Heckman, Lochner, and Taber, 1998). The assumption that treatment of unit $i$ only affects unit $i$ is called the stable unit treatment value assumption (SUTVA) in the treatment literature (see, for example AIR2). We are making a stronger assumption because random sampling implies SUTVA. (p. 905, Ch. 21: Estimating Average Treatment Effects)

To pull on the SUTVA thread, let’s look at the discussion in AIR, and see where that takes us.

Angrist, Imbens, and Rubin (1996):

Assumption 1: Stable Unit Treatment Value Assumption (SUTVA) (Rubin 1978,1980,1990). The SUTVA implies that potential outcomes for each person $i$ are unrelated to the treatment of other individuals. … SUTVA is an important limitation, and situations where this assumption is not plausible cannot be analyzed using the simple techniques outlined here, although generalizations of these techniques can be formulated with SUTVA replaced by other assumptions.

The AIR paper and its placement of Instrumental Variables within the context of the Rubin Causal Model is worth its own post. In fact it’s worth a whole book, and there are at least three very good ones (Mostly Harmless Econometrics, Causal Inference: A Mixtape, and Mastering ‘Metrics: The Path from Cause to Effect). But I’m interested in the SUTVA, so I’m after Rubin (1978,1980,1990).

Rubin (1990):

…agricultural experimenters had used guard rows between neighboring plots treated differently to avoid “interference between units.” … I call (Rubin, 1980) the assumption [of no interference between units] the “stable-unit-treatment-value assumption”, SUTVA, or simply “the stability assumption.”

This quote is a little chopped up to avoid some notation that is explained in a more compact manner in Rubin (1980). I’m also avoiding the discussion of how to relax this assumption, as that deserves (and will soon get) its own post. For now I’m just chasing the SUTVA back to its origin.

Rubin (1980):

In the paired comparison experiment, let $Y_{ij}$ be the response of the $i$th unit ($i=1,\dots,2n$) if exposed to treatment $j$ $(j=1,2)$, where $Y=\{Y_{ij}\}$ is the $2n \times 2$ matrix of values of $Y_{ij}$. The assumption that such a representation is adequate may be called the stable unit-treatment value assumption: If unit $i$ is exposed to treatment $j$, the observed value of $Y$ will be $Y_{ij}$; that is there is no interference between units (Cox 1958, p. 19) leading to different outcomes depending on the treatments other units received…

This notation is what I elided from Rubin (1990) where it was spread over a much longer passage. I believe that this passage is where Rubin coined the SUTVA, though, as we’ll see, the idea is in Rubin (1978) and comes from Cox (1958).
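One way to write the no-interference part of this compactly, in notation that is mine rather than Rubin’s: let $\mathbf{w} = (w_1,\dots,w_{2n})$ be the full vector of treatment assignments and let $Y_i(\mathbf{w})$ be unit $i$’s response under that assignment (both symbols are assumptions introduced for illustration). The SUTVA then says that the response depends on $\mathbf{w}$ only through unit $i$’s own entry:

$$
Y_i(\mathbf{w}) = Y_i(\mathbf{w}') \quad \text{whenever } w_i = w_i' = j, \qquad \text{so that } Y_i(\mathbf{w}) = Y_{ij}.
$$

Without this restriction, each unit could have up to $2^{2n}$ distinct potential responses (one per assignment vector) rather than the two columns of the matrix $Y$.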

An even more compact version of this assumption is in Rosenbaum & Rubin (1983):

The assumption that there is a unique value $r_{it}$ corresponding to unit $i$ and treatment $t$ has been called the stable unit-treatment value assumption (Rubin, 1980)

Rubin (1978):

…for each $k$, $Y_{ki}^{t}$ represents the $i$th experimental unit’s observable value of $Y_k$ for all values of $W$ such that $W_i = t (> 0)$. This assumption, called “no ‘interference’ between different units” by Cox (1958, page 19), is usually made in practice. A common exception is cross-over designs with additional carry-over effects assumed. Our model can be extended to include more general assumptions about interference effects between units, but for notational and descriptive simplicity we assume $T$ versions of each $Y_k$ are adequate. The general results of this paper do not rely on the no-interference assumption.

Before we move on to Cox (1958, page 19), note that the extension mentioned here is in Rubin (1990), where the notation is augmented to allow for carry-over.

Cox (1958, page 19):

The last aspect of the assumption (1) to need discussion is the requirement that the observation on one unit should be unaffected by the particular assignment of treatments to the other units, i.e. that there is no “interference” between units. In many experiments the different units are physically distinct and the assumption is automatically satisfied. If, however, the same object is used as a unit several times, or if different units are in physical contact, difficulties can arise…

The discussion continues with examples of how the assumption can be violated, and how estimates of the true treatment values can be recovered.

Summary:

Key points:

  1. The SUTVA seems to start with Cox (1958), but is formalized by Rubin (1978,1980,1990).
  2. The analogy of adjoining plots contaminating each other is very useful, both in terms of stating the problem concretely, and in thinking about how to implement solutions.
  3. The SUTVA is subsumed by the assumption of random assignment (Wooldridge 2010), which explains why it is often not mentioned in Econometrics textbooks. It is interesting to note that the work Rubin was doing as he formalized this was directly related to the role of randomization. In particular, Rubin (1978) is exploring the utility of randomization in Bayesian analysis, thus he could not assume random assignment.
  4. Relaxation of the SUTVA assumption is included in Cox (1958), Rubin (1990), and Angrist et al. (1996).
  5. The discussion in the statistics literature focuses on the “no interference” formulation.
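
To make the adjoining-plots analogy concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration and comes from none of the papers above: the function name `simulate_plots`, the row-of-plots layout, the effect size, and the rule that fertilizer leaks only into untreated plots from each treated neighbor. It compares a naive difference in means with one that keeps only control plots that have no treated neighbor, a crude stand-in for guard rows.

```python
import random
import statistics


def simulate_plots(n_plots=4000, effect=10.0, leak=4.0, seed=0):
    """Return (naive, guarded) difference-in-means estimates for a row of plots.

    Hypothetical setup: a treated plot yields baseline + effect; an untreated
    plot yields baseline + leak * (number of treated neighbors), which is the
    interference between units that the SUTVA rules out.
    """
    rng = random.Random(seed)
    treated = [rng.random() < 0.5 for _ in range(n_plots)]
    baseline = [rng.gauss(50.0, 1.0) for _ in range(n_plots)]

    def treated_neighbors(i):
        # Count treated plots immediately to the left and right of plot i.
        return sum(treated[j] for j in (i - 1, i + 1) if 0 <= j < n_plots)

    yields = []
    for i in range(n_plots):
        if treated[i]:
            yields.append(baseline[i] + effect)
        else:
            yields.append(baseline[i] + leak * treated_neighbors(i))

    treated_yields = [yields[i] for i in range(n_plots) if treated[i]]
    all_controls = [yields[i] for i in range(n_plots) if not treated[i]]
    # "Guarded" controls: untreated plots with no treated neighbor.
    clean_controls = [
        yields[i]
        for i in range(n_plots)
        if not treated[i] and treated_neighbors(i) == 0
    ]

    treated_mean = statistics.mean(treated_yields)
    naive = treated_mean - statistics.mean(all_controls)
    guarded = treated_mean - statistics.mean(clean_controls)
    return naive, guarded


if __name__ == "__main__":
    naive, guarded = simulate_plots()
    print(f"naive estimate:   {naive:.2f}")
    print(f"guarded estimate: {guarded:.2f}")
```

Under these assumptions the naive estimate should fall well below the true effect of 10, because an untreated plot has on average about one treated neighbor and so the control mean is inflated by roughly `leak`. The guarded estimate should sit close to 10, at the cost of discarding most control plots.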

“No general equilibrium effects” formulation of the SUTVA

Cameron & Trivedi (2005)

In my collection of econometrics textbooks, the only one other than Wooldridge (2010) that discusses the SUTVA is Cameron and Trivedi:

Matching is a persuasive and attractive methodology if (1) we can control for a rich set of x variables, (2) there are many potential controls, and (3) the [average treatment effect on the treated (ATET)] is the parameter of interest. It also requires the “no general equilibrium effects” assumption or stable unit treatment values assumption (SUTVA), which implies that treatment does not indirectly affect untreated observations. (p. 872)

Several things are interesting about this discussion. First, while Wooldridge (2010), Angrist et al. (1996), Rubin (1990,1980,1978), and Cox (1958) discuss the SUTVA as an assumption required, or used, to characterize treatment effects in the potential outcomes framework (i.e. the Rubin causal model), Cameron and Trivedi (2005) take a different approach. They introduce treatment evaluation in a random assignment setting, then introduce matching as a solution to the problems that arise when random assignment is not possible, and only then introduce the SUTVA in the discussion of matching. Though the presentations differ slightly, the approaches are not contradictory.

Second, this echoes the “no general equilibrium effects” formulation of the SUTVA mentioned by Wooldridge, which is what I want to explore next. Since I’m interested in how this assumption is addressed in applied econometrics, let’s turn to Heckman, Lochner, and Taber (1998).

Heckman, Lochner, and Taber (1998)

In this paper, Heckman et al. lay out the ‘no general equilibrium effects’ version of the SUTVA. Both versions share the notion that interference between units must be eliminated from estimates of the treatment effect, through design or adjustment. The ‘no general equilibrium effects’ version adds that general equilibrium effects can lead to results that do not generalize from the experiment to the population. This is an important difference. Testing fertilizer, an experimenter might leave space between treated and untreated plots; testing the benefits of a policing program, a social scientist might likewise leave space between treated and control neighborhoods. In both cases the space between the locations where treatment and control are measured could reasonably limit interference between the treated and control locations. Though the SUTVA of Cox (1958) and Rubin (1978,1980,1990) would be satisfied, part of the contribution of Heckman et al. is to point out that such an experiment may still not correctly estimate the treatment effects that would be observed if the treatment were applied to the entire population. I’m going to quote at length as the nuance here is important.

The impact of tuition policies on earnings is [typically] evaluated using a schooling - earning relationship3 fit on pre-intervention data and does not account for the enrollment effects of the taxes raised to finance the tuition subsidy.

The following reiterates the point, and is interesting as it places the ‘no general equilibrium effects’ version of the SUTVA in close proximity to the Lucas critique:

The danger in this widely used practice is that what is true for policies affecting a small number of individuals need not be true for policies that affect the economy at large. A national tuition-reduction policy that stimulates substantial college enrollment will likely reduce college skill prices, as advocates of the policy claim. However, agents who account for these changes will not enroll in school at the level calculated from conventional procedures, which ignore the impact of the induced enrollment on earnings. As a result, standard policy evaluation practices are likely to be misleading about the effects of tuition policy on schooling attainment and wage inequality.

This is the notion that I’m summarizing above as implying that general equilibrium effects not only threaten the estimates of effects in the experiment that you conducted, they also (via the Lucas critique) limit the ability of the experiment to generalize to a larger policy recommendation.

Evaluation of the general-equilibrium effect of a national tuition policy requires more information than the tuition-enrollment parameter that is the centerpiece of partial-equilibrium policy analysis. Most policy proposals extrapolate well outside the range of known experience and ignore the effects of induced changes in skill quantities on skill prices.

The statistical and econometric literature on “treatment effects” is remarkable for its inattention to the market consequences of the programs it evaluates. The widely used “Rubin” model (Rubin, 1978) assumes no interactions among the agents being analyzed. The paradigm in the econometric literature on treatment effects is that of evaluating the effectiveness of a drug. It assumes that there are no spillovers to society at large that flow from the drug use (or “treatment”) by individuals.

Then they go on to argue that in the education setting, spillovers are known to exist. This is the salient point for those of us interested in causal inference. The following discussion sets up their causal model (which is very similar to the one from Rubin):

The standard framework for a microeconometric program evaluation is partial-equilibrium in nature (see Heckman and Robb, 1985). For a given individual $i$, $Y_{0,i}$ is defined to be the outcome the individual experiences if he does not participate in the program, and $Y_{1,i}$ is the outcome if he does participate. The treatment effect for person $i$ is $\Delta_{i}=Y_{1,i}-Y_{0,i}$. When interventions have general equilibrium consequences, these effects depend on who else is treated and the market interaction between the treated and the untreated.

This is essentially Rubin’s point that if there is interference between the units, then this type of notation is not sufficient. I think that the distinction between this point, which is the SUTVA, and their earlier point that general-equilibrium effects make generalization from a local policy to a national (or global) one difficult, is important.
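
A toy sketch of why $\Delta_i$ can depend on who else is treated. This is not the model in Heckman et al.; the linear skill-price rule, the function names `college_premium` and `earnings_gain`, and every number below are hypothetical choices made only to illustrate the mechanism. The idea is that the college earnings premium falls as the share of college graduates rises, so the gain measured in a small pilot overstates the gain under a national policy.

```python
def college_premium(college_share, base_premium=0.60, price_slope=0.50):
    """Hypothetical skill price: the premium falls as graduates become more common."""
    return base_premium - price_slope * college_share


def earnings_gain(induced_share, initial_share=0.30):
    """Earnings gain for a person induced into college, given how many others are induced."""
    return college_premium(initial_share + induced_share)


if __name__ == "__main__":
    # A small pilot barely moves the college share, so skill prices stay put.
    pilot = earnings_gain(induced_share=0.001)
    # A national policy moves the share a lot, so the premium itself changes.
    national = earnings_gain(induced_share=0.10)
    print(f"gain measured in pilot:      {pilot:.4f}")
    print(f"gain under national rollout: {national:.4f}")
```

The sketch leaves out the second half of the argument in the quote above, that agents who anticipate the lower premium will enroll less than the pilot suggests. It only shows the first step: the same individual’s treatment effect differs across the two settings because the number of other treated people differs.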

Bibtex entries for the papers above:

@book{wooldridge2010econometric,
  title={Econometric analysis of cross section and panel data},
  author={Wooldridge, Jeffrey M},
  year={2010},
  publisher={MIT press}
}

@article{angrist1996identification,
  title={Identification of causal effects using instrumental variables},
  author={Angrist, Joshua D and Imbens, Guido W and Rubin, Donald B},
  journal={Journal of the American statistical Association},
  volume={91},
  number={434},
  pages={444--455},
  year={1996},
  publisher={Taylor \& Francis}
}

@article{rubin1990comment,
  title={Comment: Neyman (1923) and causal inference in experiments and observational studies},
  author={Rubin, Donald B},
  journal={Statistical Science},
  volume={5},
  number={4},
  pages={472--480},
  year={1990},
  publisher={Citeseer}
}

@article{rubin1980randomization,
  title={Comment: Randomization analysis of experimental data: The Fisher randomization test},
  author={Rubin, Donald B},
  journal={Journal of the American statistical association},
  volume={75},
  number={371},
  pages={591--593},
  year={1980},
  publisher={JSTOR}
}

@article{rubin1978bayesian,
  title={Bayesian inference for causal effects: The role of randomization},
  author={Rubin, Donald B},
  journal={The Annals of statistics},
  pages={34--58},
  year={1978},
  publisher={JSTOR}
}

@book{cox1958planning,
  title={Planning of experiments},
  author={Cox, David Roxbee},
  year={1958},
  publisher={Wiley}
}

@article{rosenbaum1983central,
  title={The central role of the propensity score in observational studies for causal effects},
  author={Rosenbaum, Paul R and Rubin, Donald B},
  journal={Biometrika},
  volume={70},
  number={1},
  pages={41--55},
  year={1983},
  publisher={Oxford University Press}
}

@article{heckman1998general,
  title={General-Equilibrium Treatment Effects: A Study of Tuition Policy},
  author={Heckman, James J and Lochner, Lance and Taber, Christopher},
  journal={American Economic Review},
  volume={88},
  number={2},
  pages={381--386},
  year={1998},
  publisher={American Economic Association}
}

  1. For example, Mostly Harmless Econometrics uses the benchmark assumption of random assignment, and thus does not discuss the SUTVA.

  2. AIR refers to Angrist, Imbens, and Rubin (1996).

  3. Just flagging this for the “people have relationships, not data” pedants among us.