No Free Lunch: Natural Experiments and the Construction of Instrumental Variables


Thad Dunning, Department of Political Science, Yale University. June 28,

Abstract

Social scientists have increasingly exploited sources of random or quasi-random variation, including natural experiments, to construct instrumental variables for use in regression analysis. In many applications, researchers seek to defend the plausibility of a key assumption: namely, while an instrument or set of instruments is empirically correlated with an endogenous regressor in a linear regression model, it is independent of the error term in that model. I argue here that while fulfilling this exogeneity criterion may be necessary for a valid application of the instrumental variables approach, it is far from sufficient. In particular, in the regression context the identification of causal effects depends not just on the exogeneity of the instrument(s) but also on the validity of the underlying model. In this paper, I focus attention on the implications of one feature of characteristic models: the assumption of common effects across exogenous and endogenous portions of the problematic regressor(s). In many applications, this assumption may be quite strong, but relaxing it can limit our ability to estimate parameters of greatest theoretical interest. After discussing two substantive examples, I discuss analytic results (simulations are reported elsewhere). I also present a specification test that may be useful for determining the relevance of these issues in a given application.

1 Introduction

Social scientists increasingly exploit natural experiments as sources of instrumental variables for use in regression analysis. Unlike in true experiments, in a natural experiment the manipulation of a treatment variable is not under the control of an experimental researcher; instead, analysts take advantage of interventions they observe in the social and political world. Akin to randomized controlled experiments, however, and unlike other observational studies, a researcher exploiting a natural experiment can make a credible claim that the observed assignment of non-experimental subjects to treatment and control conditions is done at random or "as if" at random.

In any given study, it may happen that units of analysis are "as if" randomized to the treatment of theoretical interest. In this case, a natural experiment may be very close to a true experiment, in which the researchers have planned and introduced the randomized intervention. Perhaps more often, however, nature randomizes units of analysis to levels of some variable \(Z\) that is different from the treatment variable \(X\). Under further assumptions to be discussed below, randomization of subjects to such a variable \(Z\) may nonetheless allow identification of the causal effect of the non-randomly assigned treatment \(X\), so long as \(Z\) is correlated with treatment.

The well-known idea is as follows. Consider the regression equation

\[ Y = X\beta + \varepsilon, \tag{1} \]

where \(Y\) is a column vector of observations on a dependent variable, \(X\) is a matrix of observations on treatment variables and covariates, \(\beta\) is a vector of parameters, and \(\varepsilon\) is a vector of unobserved, mean-zero error terms. Unlike the classical regression model, here at least some columns of \(X\) may be dependent on the error term, that is, endogenous. The Ordinary Least Squares (OLS) estimator of \(\beta\) will therefore be biased by a quantity related to the expected value of the error term, given \(X\).
However, under additional assumptions, Instrumental Variables Least Squares (IVLS) regression provides a way to obtain consistent estimates of \(\beta\). To use IVLS, we must find a matrix of instrumental variables \(Z\), with at least as many columns as \(X\) (exogenous columns of \(X\) may be included in \(Z\)), for which the following conditions hold: \(Z'Z\) and \(Z'X\) have full rank, and \(Z\) is independent of the unobserved error term (Greene 2003: 74-80; Freedman 2005: 175). The latter requirement is the hard one, and it is the one for which natural experiments are often exploited.

For instance, Miguel, Satyanath, and Sergenti (2004) take advantage of the "as if" random assignment of African countries to inclement weather to instrument for GDP growth, in a study of the influence of growth on civil conflict. Acemoglu, Johnson, and Robinson (2001) use variation in historic settler mortality rates across former European colonies to instrument for current institutional quality, in a regression of measures of economic development on institutions. Angrist and Lavy (1999) exploit "as if" random variation in the size of Israeli classes to estimate the effect of class size on educational attainment. In these and other applications, researchers tend to devote substantial attention to defending the plausibility that a set of instrumental variables \(Z\) is independent of the error term in an equation like (1), as required for consistent estimation of \(\beta\).

In the context of an equation like (1), however, it is not merely the exogeneity of the instrument(s) that allows for estimation of the causal impact of \(X\) on \(Y\). Instead, \(Z\) is tied to inferences about the impact of \(X\) on \(Y\) through a particular causal model. This underlying causal model in turn lends itself to a regression equation like (1), which is used to estimate the effect of treatment. Exogeneity is therefore necessary but not sufficient for valid application of the instrumental variables approach: the validity of the model is always at issue as well.
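The contrast between OLS and IVLS under equation (1) can be illustrated with a small simulation. The following Python sketch is not from the paper: the data-generating process, the coefficient values, and the function name `simulate_ols_vs_ivls` are assumptions chosen purely for illustration, using a scalar regressor and a single instrument.

```python
import numpy as np

def simulate_ols_vs_ivls(n=200_000, beta=2.0, seed=0):
    """Compare OLS and IVLS slopes when x is endogenous (illustrative DGP only)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                # instrument: independent of the error term
    u = rng.normal(size=n)                # unobserved confounder
    x = 0.8 * z + u + rng.normal(size=n)  # regressor depends on both z and u
    eps = u + rng.normal(size=n)          # error shares u with x, so x is endogenous
    y = beta * x + eps                    # scalar version of equation (1)

    ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)    # Cov(x, y) / Var(x)
    ivls = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]  # Cov(z, y) / Cov(z, x)
    return ols, ivls

ols, ivls = simulate_ols_vs_ivls()
print(f"OLS: {ols:.3f}  IVLS: {ivls:.3f}  (true beta = 2.0)")
```

Under these assumed values, Cov(x, eps) = 1 and Var(x) = 2.64, so the OLS slope should settle near 2 + 1/2.64 (about 2.38), while the IVLS slope should settle near the true value of 2. The sketch presumes that the model itself is correctly specified, which is exactly the assumption the rest of the paper examines.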
Though this observation is in itself unremarkable, I argue here that an under-appreciated aspect of a statistical model like equation (1) plays an important role in sustaining causal inferences about the impact of \(X\) on \(Y\). In brief, the statistical model in equation (1) assumes common effects across exogenous and endogenous portions of the treatment variable. While this assumption may be innocuous in some settings, it is far from clear that it holds in other contexts in which we would commonly use IVLS.

Below, I discuss at length two substantive examples in which a compelling natural experiment provides plausible "as if" randomization and thus supplies an instrumental variable that is credibly exogenous. Because of the exogeneity of the instruments, these examples lend themselves to particularly credible applications of the IVLS approach. However, as these examples will also suggest, the assumption that endogenous and exogenous portions of a problematic regressor have the same effects on the outcome of interest may be quite strong.

Suppose that \(X\) is a (mean-zero) scalar random variable, and we have \(X = X_1 + X_2\), with \(X_1\) endogenous and \(X_2\) exogenous; this partition of \(X\) into endogenous and exogenous portions emerges in a natural application-specific way in one of the examples discussed below. One alternative is that the true data-generating process is such that we should estimate

\[ Y = \beta(X_1 + X_2) + \varepsilon. \tag{2} \]

Another alternative, however, is instead

\[ Y = \beta_1 X_1 + \beta_2 X_2 + \varepsilon, \tag{3} \]

with \(\beta_1 \neq \beta_2\). We can think of the rows of equation (1) as i.i.d. realizations of the data-generating process implied by equation (2) or (3).

The simple point I make here is that in many applications, as the substantive examples discussed below suggest, equation (3) may be more natural than equation (2). Since the point of using IVLS is often to recover estimates of the coefficient on an endogenous variable such as \(X = X_1 + X_2\), positing a model like equation (2) is an important part of the technique. However, erroneously assuming constant coefficients can also produce IVLS estimates that are misleading. The point is not that there is a general failure in IVLS applications. Rather, the point is that in the regression context, the identification of causal effects depends not just on the exogeneity of the instrument(s) but also on the validity of the underlying model. This is, of course, a general point that goes beyond applications of IVLS, yet it is one we tend to forget in focusing our attention only on the exogeneity of the instruments.

The spirit of the discussion might also be put as follows. In a typical randomized controlled experiment, questions are sometimes raised about the extent to which the effect of one treatment can be generalized to another (perhaps similar) treatment. The standard response to such questions would be: "we need to conduct another experiment to find out." Yet the natural-experiment-cum-instrumental-variables approach seems to offer another alternative: although natural experiments often randomize nations or other units of analysis to treatments (weather patterns, settler mortality rates, and so on) other than the treatment of theoretical interest (such as GDP growth or institutional quality), it appears that we may use IVLS to recover the effect of the latter. This approach can substitute for conducting a new experiment (or for finding a different natural experiment in which units are in fact randomized to the treatments of primary interest) only under assumptions that may be quite strong. In these contexts, the IVLS approach may not provide a "free lunch," that is, a way to surmount the unfortunate fact that nature has randomized our units of analysis not to the treatments about which we care the most but rather to some other, related treatment.

Whether these issues are germane in any given application is mostly a matter for a priori reflection, though at the end of this article I sketch a statistical specification test that might be of some use. The specification test requires an additional instrument, however, and therefore may be of limited practical utility. The main goal of the paper is thus to underscore the general relevance of the issues and to encourage their discussion in applications.
Indeed, specification of the model, especially the assumption of constant effects, should perhaps be defended with the same energy with which we often defend exogeneity.

This paper relates to but is distinct from several strands of literature in econometrics, political science, statistics, and program evaluation. There is a large literature that discusses the relative merits of IVLS (Kennedy 1985: 115; Hanushek and Jackson 1977: 238). Bartels (1991) uses simulations to study a bias-variance trade-off under the assumption that the instrument itself is (weakly) endogenous. On a different topic, there is also a literature on instruments that may be exogenous but that are only weakly correlated with an endogenous regressor or set of regressors (e.g., Bound et al. 1995). The focus of the current paper differs from this previous work, in that it considers instruments that are strictly exogenous and that are also well-correlated with endogenous regressors. In this case, IVLS estimates are consistent, and the efficiency loss will not be too great because of the high correlation between the instrument and the endogenous regressors. Nonetheless, IVLS may fail to produce accurate estimates, due to a particular form of model misspecification. The paper thus focuses on inferential difficulties that arise even when the standard requirements for a valid instrument are met.

More related to the present article is an important recent literature on understanding instrumental variables in the presence of "causal heterogeneity" or "essential heterogeneity." Many recent papers have clarified what instrumental variables can estimate in such settings. In some settings, for example, the instrumental variables approach estimates treatment effects for individuals whose behavior is modified by instruments (Heckman and Robb 1985, 1986); instrumental variables can identify what Imbens and Angrist (1994) call "local average treatment effects" (see also Angrist, Imbens, and Rubin 1996; for discussion, Heckman, Urzua and Vytlacil 2006). Rosenzweig and Wolpin (2000) also show that what IVLS estimates depends on the underlying behavioral models that are posited. In these papers, however, which are often formulated in the context of the Neyman-Holland-Rubin potential outcomes model, the heterogeneity of interest comes across units (individuals, countries, etc.), that is, across \(i\).
In the present paper, we suppose that coefficients are constant across \(i\) and instead investigate the consequences of heterogeneity across variables, that is, heterogeneity across exogenous and endogenous portions of the regressors in \(X\).

2 An example on lottery winnings

In a recent paper, Doherty, Green and Gerber (2005) are interested in assessing the relationship between income and political attitudes.[1] They surveyed 342 people who had won a lottery in an Eastern state between 1983 and 2000 and asked a variety of questions about attitudes towards estate taxes, government redistribution, and social and economic policies more generally. Given the number and kinds of lottery tickets that individuals buy, levels of lottery winnings are randomly assigned among lottery players.[2] Abstracting from sample non-response and other issues that might threaten the validity of the inferences,[3] Doherty, Green, and Gerber can exploit the lottery to make compelling claims about the causal impact of winnings on political beliefs. It turns out that winning large amounts in a lottery has an effect on some relatively narrow political attitudes (e.g., those who win more in the lottery favor the estate tax less), but lottery winnings have relatively little impact on broader political attitudes, for instance, towards the proper role of government in the economy writ large.

[1] Portions of the material in this section are based on Dunning (forthcoming).
[2] Lottery winners are paid a large range of dollar amounts. In Doherty et al.'s sample, the minimum total prize was $47,581, while the maximum was $15.1 million, both awarded in annual installments.
[3] See Doherty et al. (2005) for further details.

However, the question of perhaps greater social-scientific interest concerns the political effects of overall income: while relatively few people have lottery winnings, many have incomes. Does the natural experiment also allow us to generalize from the impact of lottery winnings to the effect of overall income on attitudes? It does not, without assumptions that may be quite strong in this context. As Doherty et al. (2005) carefully point out, the effect on political attitudes of windfall lottery winnings may be very different from that of other kinds of income: for example, income earned through work, interest on wealth inherited from a rich parent, and so on. These kinds of concerns may also limit our ability to use IVLS to estimate the causal effect of overall income on political attitudes.

Consider the regression equation

\[ \text{ATTITUDES}_i = \beta\,\text{INCOME}_i + \varepsilon_i, \tag{4} \]

where \(\text{ATTITUDES}_i\) measures the political attitudes of respondent \(i\), \(\text{INCOME}_i\) is the self-reported income (from all sources) of respondent \(i\), and \(\beta\) is a regression coefficient common to all respondents.[4] For ease of exposition, the variables are mean-deviated, and covariates are not included.[5] The error term \(\varepsilon_i\) is a random variable, independently and identically distributed across respondents with \(E(\varepsilon_i) = 0\). The goal is to estimate the value of the parameter \(\beta\), defined here as the impact of overall income on political attitudes.

[4] Note that according to equation (4), subject \(i\)'s response depends on the values of \(i\)'s right-hand side variables; values for other subjects are irrelevant. The analog in Rubin's formulation of the Neyman model is the stable unit treatment value assumption (SUTVA) (Neyman 1923, Dabrowska and Speed 1990; Holland 1986).
[5] In a similar regression model, Doherty et al. (2005) include various covariates, including a vector of variables to control for the kind of lottery tickets bought, and another vector of demographic variables to boost statistical efficiency by adjusting for slight imbalances due to the randomization. Doherty et al. (2005) also estimate a series of ordered probit models to estimate the impact of lottery winnings per se on attitudes.

Equation (4) is the standard linear regression set-up, except for one catch: the error term is not independent of income, because unobserved (unmeasured) variables may be associated with both overall income and political attitudes. For instance, rich parents may teach their children how to play the stock market and also influence their attitudes towards government intervention. Peer-group networks may influence both economic success and political values. Ideology may itself shape economic returns, perhaps through the channel of beliefs about the returns to hard work. Even if some of these variables could be measured and controlled, clearly there are many unobserved variables that could conceivably confound inferences about the causal impact of overall income on political attitudes.

From the perspective of standard approaches to instrumental variables regression, however, the innovative research design of Doherty et al. (2005) supplies the perfect instrument: namely, a variable that is both correlated with overall income and independent of the error term in equation (4). This variable is the level of lottery winnings of respondent \(i\). An accounting identity is

\[ \text{INCOME}_i \equiv \text{EARNED INCOME}_i + \text{WINNINGS}_i, \tag{5} \]

where \(\text{WINNINGS}_i\) are the lottery winnings of survey respondent \(i\) and \(\text{EARNED INCOME}_i\) is shorthand for all other income sources of respondent \(i\), net of lottery winnings. (We call this "earned income," though it could of course include any additional income source beyond lottery winnings.) Equation (5) implies that

\[ \operatorname{Cov}(\text{INCOME}_i, \text{WINNINGS}_i) \neq 0, \tag{6} \]

since the variable \(\text{WINNINGS}_i\) is a component of \(\text{INCOME}_i\).[6] Moreover, since levels of lottery winnings are randomly assigned to the lottery-playing survey respondents, winnings should be statistically independent of other characteristics of the respondents, including characteristics that might influence political attitudes. Thus:

\[ \text{WINNINGS}_i \perp \varepsilon_i, \tag{7} \]

where \(A \perp B\) means \(A\) is independent of \(B\). Viewed in the context of equation (4), equation (7) is an exclusion restriction (Greene 2003: 74-80). Together with equation (6), it says that there exists a variable that is correlated with the endogenous regressor \(\text{INCOME}_i\) in equation (4) but that is independent of the error term in that equation.

[6] This assumes (eminently plausibly) that \(\operatorname{Cov}(\text{EARNED INCOME}_i, \text{WINNINGS}_i) \neq -\operatorname{Var}(\text{WINNINGS}_i)\).

The Instrumental Variables Least Squares (IVLS) estimator is

\[ \hat{\beta}_{IVLS} = \frac{\widehat{\operatorname{Cov}}(\text{WINNINGS}, \text{ATTITUDES})}{\widehat{\operatorname{Cov}}(\text{WINNINGS}, \text{INCOME})}, \tag{8} \]

that is, the sample covariance of lottery winnings and attitudes divided by the sample covariance of lottery winnings and overall income.[7] Under the assumptions of the model, equation (8) will provide a consistent estimator for \(\beta\) in equation (4).

Note, however, that our ability to generalize from the effect of one treatment (lottery winnings) to the effect of another treatment (total income) is ensured only by the model in equation (4). Given the model, we can use the instrumental variables technique to obtain a consistent estimator of \(\beta\), since lottery winnings are correlated with income but independent of the error term in equation (4). Yet the model itself does an important portion of the inferential work. To see this, note that by equation (5), the total income of individual \(i\) will be the sum of income from lottery winnings and income from all other sources (or "earned income"). It is useful to rewrite equation (4) as

\[ \text{ATTITUDES}_i = \beta(\text{EARNED INCOME}_i + \text{WINNINGS}_i) + \varepsilon_i. \tag{9} \]

Writing the equation this way makes the assumptions of the model clearer. Among the most important assumptions is that \(\beta\) is assumed to be constant for all \(i\), and, especially, constant for all forms of income. According to the model, it does not matter whether income comes from lottery winnings or from other sources: a marginal increment in either lottery winnings or in earned income will be associated with the same expected marginal increment in political attitudes. Put differently, the slope coefficient is assumed to be the same across endogenous and exogenous portions of the regressor \(\text{INCOME}_i\). The model therefore assumes away the important inferential issue that Doherty et al. (2005) point out.
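Returning to the estimator in equation (8): with a single instrument and a single regressor, the ratio of sample covariances coincides numerically with a two-stage procedure. The Python sketch below checks this on simulated data. It is only an illustration: the function names, the simulated stand-in for the survey data, and every numerical value are assumptions, not anything reported by Doherty et al. (2005).

```python
import numpy as np

def ivls_ratio(z, x, y):
    """Equation (8): sample Cov(z, y) divided by sample Cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

def two_stage_ls(z, x, y):
    """Two-stage version: regress x on z, then regress y on the fitted values."""
    zc, xc, yc = z - z.mean(), x - x.mean(), y - y.mean()
    x_hat = zc * (zc @ xc) / (zc @ zc)     # first-stage fitted values (mean-deviated)
    return (x_hat @ yc) / (x_hat @ x_hat)  # second-stage slope

def make_lottery_data(n=5_000, seed=0):
    """Simulated stand-in for the survey data; all numbers are made up."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                         # unobserved confounder
    winnings = rng.exponential(scale=1.0, size=n)  # randomized instrument
    earned = 2.0 + u + rng.normal(size=n)          # endogenous income component
    income = earned + winnings                     # identity (5)
    attitudes = 0.5 * income + u + rng.normal(size=n)  # model (4) with beta = 0.5
    return winnings, income, attitudes

winnings, income, attitudes = make_lottery_data()
print(ivls_ratio(winnings, income, attitudes))
print(two_stage_ls(winnings, income, attitudes))
```

The two printed values should agree up to floating-point error, because the first-stage slope cancels algebraically in the second stage. Note that the simulated data here obey the constant-coefficient model (4) by construction.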
[7] Equivalently, we can use Two-Stage Least Squares (IISLS); see Freedman (2005: ) on the equivalence of these estimators.

Suppose instead that we allowed lottery winnings to have its own slope parameter in equation (9), thus assuming that political attitudes are a linear combination of earned income, lottery winnings, and an error term. Then we can rewrite equation (9) as

\[ \text{ATTITUDES}_i = \beta_1\,\text{EARNED INCOME}_i + \beta_2\,\text{WINNINGS}_i + \varepsilon_i. \tag{10} \]

The variable \(\text{WINNINGS}_i\) is independent of the error term among lottery winners, due to the randomization provided by the natural experiment. However, \(\text{EARNED INCOME}_i\) remains endogenous, perhaps because factors such as education or parental attitudes influence both earned income and political attitudes. We could again resort to the instrumental variables approach, but since we need as many instruments as there are regressors in (10), we will need some new instrument in addition to \(\text{WINNINGS}_i\). Even if we could find one, we would need to assume a constant coefficient \(\beta_1\) across the exogenous and endogenous portions of \(\text{EARNED INCOME}_i\).

Suppose the data were generated according to equation (10), with \(\beta_1 \neq \beta_2\), and we (erroneously) assume equation (9). Now we estimate the model with IVLS, using WINNINGS as an instrument for INCOME. As I show analytically in Section 4, given the independence of EARNED INCOME and WINNINGS, IVLS will asymptotically estimate the coefficient \(\beta_2\) on WINNINGS. Yet the coefficient on the endogenous variable EARNED INCOME may be of theoretical interest. (After all, if we only cared about \(\beta_2\), we could simply regress ATTITUDES on the exogenous variable WINNINGS.) In this context, IVLS provides no free lunch, despite the fact that we have an ideal instrumental variable for the endogenous regressor in equation (4). The identification provided by the natural experiment can help us recover the impact of lottery winnings, but not the impact of overall income, unless the data were indeed generated according to (4). Again, the point is not that there is a general flaw in the IVLS approach. The point is that the model is misspecified: for the IVLS approach to recover the effect of overall income, the data must have been generated according to (9), not (10).
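The claim that IVLS recovers the coefficient on winnings rather than the coefficient on earned income can be checked with a small simulation. The following Python sketch generates data from a version of equation (10) and then applies the estimator in equation (8) as if equation (9) held. The coefficient values, the distributions, and the function name are assumptions made for illustration only; this is not one of the simulations the paper reports elsewhere.

```python
import numpy as np

def simulate_heterogeneous_effects(n=200_000, beta1=1.0, beta2=0.2, seed=1):
    """IVLS under equation (10) when equation (9) is wrongly assumed (illustrative)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)             # unobserved confounder
    earned = u + rng.normal(size=n)    # endogenous component of income
    winnings = rng.normal(size=n)      # randomized: independent of earned and of eps
    eps = 0.5 * u + rng.normal(size=n)

    attitudes = beta1 * earned + beta2 * winnings + eps  # equation (10)
    income = earned + winnings                           # identity (5)

    # Equation (8): instrument total income with winnings.
    return np.cov(winnings, attitudes)[0, 1] / np.cov(winnings, income)[0, 1]

print(simulate_heterogeneous_effects())
```

With these assumed values the estimate should fall close to `beta2` (0.2) and far from `beta1` (1.0), in line with the argument above: the instrument identifies the effect of the randomized component, not the effect of the endogenous one.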

3 An example on economic growth and civil conflict in Africa

Before turning to the analytic results, however, I explore a different example drawn from the growing literature that applies the IVLS approach in comparative political economy and comparative politics. As we will see, there are issues that are parallel to those raised above, in the context of the study of lottery winnings. Just as in that study, the point here is neither that IVLS is necessarily the wrong approach nor that there are fundamental flaws in the study: indeed, the study reviewed here is one of the most creative and compelling of recent applications of IVLS in the field. The point is simply that purging estimates of their endogeneity through an application of IVLS depends on an important assumption about homogeneous effects, and this is an assumption that it may be useful to discuss.

Miguel, Satyanath, and Sergenti (2004) are interested in the effects of economic recession on the likelihood of civil conflict in Africa. Although the idea that adverse economic conditions may incite conflict is an old one, recent studies have posited specific mechanisms through which economic recessions may increase the likelihood of civil war. For instance, according to the influential models advanced by Paul Collier and Anke Hoeffler of the World Bank (1998, 2001, 2002), economic factors help to explain the incidence of civil war because of the important role they play in rebel recruitment. Miguel et al. (2004: 727) summarize the approach as follows: "Collier and Hoeffler stress the gap between the returns from taking up arms relative to those from conventional economic activities, such as farming, as the causal mechanism linking low income to the incidence of civil war."[8] Indeed, Collier and Hoeffler suggest that the economic motives of potential rebels far outweigh other factors, such as indicators of social injustice, in explaining the incidence of rebellion. In their well-known formulation, it is "greed," not "grievance," that mainly explains variation in the occurrence of civil wars.

[8] Miguel et al. (2004) also discuss an alternative, though possibly complementary, approach, that of Fearon and Laitin (2003), who emphasize the importance of state capacity and road coverage in explaining the outbreak and duration of civil war.

However, there is an important problem for purposes of testing such theories about the influence of economic conditions on civil conflict: civil conflict may influence economic conditions, and there may be confounding, too. As Miguel et al. (2004) put it, the existing literature does not adequately address the endogeneity of economic variables to civil war and thus does not convincingly establish a causal relationship. In addition to endogeneity, omitted variables (for example, government institutional quality) may drive both economic outcomes and conflict, producing misleading cross-country estimates (2004: 726). In other words, in a regression of civil conflict on economic growth, the latter may be dependent on the error term in the underlying regression model.

The solution to this problem, Miguel et al. (2004) suggest, is instrumental variables regression. The instrument for economic growth they propose is weather shocks stemming from variation in rainfall. In sub-Saharan Africa, as these authors demonstrate empirically, there is a strong positive correlation between percentage change in rainfall over the previous year and economic growth (and the correlation holds up for both lagged and contemporaneous annual change in rainfall). Drought hinders economic growth. Rainfall thus passes one key requirement for a potential instrument, that it be correlated with the endogenous regressor. The other key requirement, and always the harder one to fulfill, is that rainfall is exogenous, that is, independent of the error term in the underlying regression model. This assertion is, of course, essentially untestable; but Miguel et al. (convincingly) probe its plausibility at length. Although establishing the plausibility of an instrument's exogeneity is always an important component of the instrumental variables approach, it is not the issue here. For purposes of this discussion, we will therefore assume rainfall is exogenous, which seems very sensible.
The IVLS estimates presented by Miguel et al. suggest a strong negative relationship between economic growth and civil conflict: a five-percentage-point drop in annual economic growth increases the likelihood of a civil conflict (at least 25 deaths per year) in the following 14

15 year by over 12 percentage points which amounts to an increase of more than one-half in the likelihood of civil war (2004:727). 9 Miguel et al. also find perhaps surprisingly that the impact of income shocks on civil conflict is not significantly different in richer, more democratic, more ethnically diverse, or more mountainous African countries. This appears to be compelling evidence of a causal relationship, and Miguel et al. also have a plausible mechanism to explain the effect namely, the impact of drought on the recruitment of rebel soldiers. But have Miguel et al. really estimated the effect of economic growth on conflict? This is not so clear. Making this assertion, as we will see, depends on specific assumptions about the way growth produces conflict. In particular, it depends on positing a model in which economic growth has a constant effect on civil conflict constant, that is, across the components of growth. As with the example on lottery winnings, using the IVLS machinery to identify causal effects depends not just on the validity of the exclusion restriction it also depends on the validity of this model. The point might be elaborated as follows. Suppose for purposes of this argument that we can model economic growth in country i in year t as a function of growth in two sectors, agriculture and industry. We then want to consider two alternate ways in civil conflict could be a function of economic variables. On the one hand, it might be the case that the probability of civil conflict is given by Prob{C it = 1} = γ Y it + ɛ it (11) where C it is a binary variable for conflict in country i in year t (with C it = 1 indicating conflict), Y it is the economic growth rate of country i in year t, and ɛ it is a latent mean-zero random variable meant to capture unmeasured characteristics that affect the probability of civil war. 
10 According to 9 The dependent variable is dichotomous; it measures the incidence of civil conflict in which there are more than 25 (alternatively, more than 1,000) battle deaths in a given year. The main equation thus appears to be specified as a linear probability model. 10 Equation (11) resembles the main equation found in Miguel et al. (2004: 737), though we abstract from control variables as well as lagged growth values here for ease of presentation. Miguel et al. in fact specify C it = γ( Y it ) + X it β + ɛ it, so the dichotomous C it is assumed to be a linear combination of continuous right-hand side covariates and a 15

the model, if we intervene to increase the economic growth rate in country $i$ and year $t$ by one unit, the probability of conflict in that country-year is expected to increase by $\gamma$ units (or to decrease, if $\gamma$ is negative). However, the model is agnostic about the source of this increase in economic growth. Indeed, if we want to influence the probability of conflict we might consider different interventions to boost growth: for example, we might target foreign aid with an eye to increasing industrial productivity, or we might hope that more rainfall will boost agricultural productivity.

Suppose, on the other hand, that growth in agriculture and growth in industry, which both influence overall economic growth, have different effects on conflict, as in the following model:

$$\mathrm{Prob}\{C_{it} = 1\} = \alpha I_t + \beta A_t + \epsilon_{it} \qquad (12)$$

where $I_t$ and $A_t$ are the annual growth rates in industry and agriculture, respectively. What might motivate such an alternative model? As the causal mechanism posited by Collier and Hoeffler suggests, decreases in agricultural productivity may increase the difference in returns to taking up arms and farming, making it more likely that the rebel force will grow and civil conflict will increase; yet in a context in which many rebels are recruited from the countryside, changes in (urban) industrial productivity may have no effect, or at least a different effect, on the probability of conflict. In this context, heterogeneous effects on the probability of conflict across components of growth may be the conservative assumption.

Moving from either equation (11) or equation (12) to data, and thus to estimation of $\gamma$ or of $\alpha$ and $\beta$, requires further statistical assumptions. In a standard linear probability model, the $Y_{it}$ in equation (11) would be independent of the $\epsilon_{it}$, and OLS would give unbiased estimates of $\gamma$. The problem is that $Y_{it}$ is dependent on the $\epsilon_{it}$, that is, endogenous. The solution that Miguel et al. propose is IVLS, with rainfall growth as an instrument for the $Y_{it}$. Given the response schedule in equation (11), this seems like a very plausible solution. Rainfall growth is correlated with economic growth in Africa. If rainfall growth is also exogenous, as Miguel et al. argue, IVLS delivers the goods.

[Footnote, continued from the preceding page: ... continuous error term; yet the authors clearly have in mind a linear probability model, so in the text I write equation (11) instead.]

If the true data-generating process is equation (12), however, another approach is needed. It seems reasonable that industrial growth and agricultural growth will both be dependent on the error term in equation (12), for the same reasons that Miguel et al. (2004) suggest that overall economic growth is endogenous. For instance, conflict may depress agricultural growth, and harm urban productivity as well. If rainfall growth is correlated with agricultural growth but not with industrial growth (as seems plausible intuitively), we have a good instrument for $A_t$. But we cannot estimate $\alpha$ without an additional instrument for industrial productivity.

The point here is not that equation (12) is the right response schedule; indeed, it is as stylized as equation (11), and there may well be contexts in which it is more appropriate to estimate (12). There are important policy implications, of course: if growth reduces conflict no matter what the source, we might counsel more foreign aid for the urban industrial sector, while if only agricultural productivity matters, the policy recommendations would be quite different. The objective here, however, is merely to point out that what IVLS estimates in this context (or any context) depends importantly on the assumed model, and not just on the plausible exogeneity of the instrument.

4 Analysis

If the data-generating process involves heterogeneous effects across endogenous and exogenous portions of $X$, and we instead assume homogeneous effects, what does IVLS estimate? In this section, I analyze a case akin to the example on lottery winnings, where an endogenous regressor breaks down into the sum of exogenous and endogenous portions.
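The contrast between equations (11) and (12) can also be checked numerically. The following Python sketch is not part of the original analysis. It assumes illustrative coefficient values, normally distributed variables, a continuous outcome standing in for the binary conflict indicator, and hypothetical names (`simulate`, `iv_slope`, `rain`, `agri`, `ind`). Under those assumptions rainfall is correlated with agricultural growth only, so the single-instrument IVLS estimate for total growth should land near the agricultural coefficient $\beta$ rather than near $\alpha$ or an average of the two.

```python
import numpy as np


def iv_slope(z, x, y):
    """Single-instrument IVLS slope: Cov(z, y) / Cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


def simulate(alpha, beta, n=200_000, seed=0):
    """Draw data in the spirit of equation (12); every number is an assumption."""
    rng = np.random.default_rng(seed)
    rain = rng.normal(size=n)                         # instrument (rainfall growth)
    u = rng.normal(size=n)                            # unobserved confounder
    agri = 0.8 * rain + 0.5 * u + rng.normal(size=n)  # depends on rainfall
    ind = 0.5 * u + rng.normal(size=n)                # independent of rainfall
    eps = -u + rng.normal(size=n)                     # error, correlated with growth
    conflict = alpha * ind + beta * agri + eps        # continuous stand-in outcome
    growth = ind + agri                               # pooled regressor, as in (11)
    return rain, growth, conflict


rain, growth, conflict = simulate(alpha=0.0, beta=-2.0)
ols = np.cov(growth, conflict)[0, 1] / np.var(growth, ddof=1)
ivls = iv_slope(rain, growth, conflict)
print(f"OLS on total growth:  {ols:.3f}")
print(f"IVLS on total growth: {ivls:.3f}  (agricultural coefficient is -2.0)")
```

Changing `alpha` while holding `beta` fixed should leave the IVLS estimate essentially unchanged in this setup, which is one way to see that the instrument carries no information about the industrial coefficient.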

For each observation $i$, the true data-generating process is

$$y_i = \beta_1 x_{1i} + \beta_2 x_{2i} + \epsilon_i, \qquad (13)$$

where $\beta_1$ and $\beta_2$ are parameters. When $\beta_1 \neq \beta_2$, equation (13) is identical to equation (10) in the example on lottery winnings; $x_1$ is analogous to earned income and $x_2$ is analogous to lottery winnings. The subjects are i.i.d., and $E(\epsilon_i) = E(x_{1i}) = E(x_{2i}) = 0$, with $\epsilon \perp x_2$ but $\mathrm{Cov}(\epsilon, x_1) \neq 0$. Also,

$$x_1 \perp x_2, \qquad (14)$$

as in the example on lottery winnings: subjects are randomized by the natural experiment to levels of $x_2$. Suppose we (erroneously) assume that data were generated according to

$$y_i = \beta (x_{1i} + x_{2i}) + \epsilon_i. \qquad (15)$$

Equation (15) is the usual regression model, with one exception: the regressor $x_{Ti} \equiv x_{1i} + x_{2i}$ (with $T$ for "total") is endogenous, because $x_1$ and $\epsilon$ are dependent. However, we have available to us (by construction) a valid instrument, since $x_2$ is correlated with the endogenous regressor but also independent of the error term. The instrumental variables estimator is

$$\hat{\beta}_{IVLS} = \frac{\mathrm{Cov}(x_2, y)}{\mathrm{Cov}(x_2, x_T)}, \qquad (16)$$

where the covariances are taken over data. [Footnote 11: Equation (16) is valid because all of the $x$'s have expectation 0.] Now, substituting for $x_T$ and distributing covariances, we have

$$\hat{\beta}_{IVLS} = \frac{\mathrm{Cov}(x_2, y)}{\mathrm{Cov}(x_2, x_1) + \mathrm{Var}(x_2)}. \qquad (17)$$

By assumption, $x_1$ and $x_2$ are independent, so $\mathrm{Cov}(x_2, x_1)$ should be near zero. Moreover, by equation (13), $\mathrm{Cov}(x_2, y) = \beta_1 \mathrm{Cov}(x_2, x_1) + \beta_2 \mathrm{Var}(x_2) + \mathrm{Cov}(x_2, \epsilon)$, and the first and last terms vanish in the limit. Hence

$$\lim_{n \to \infty} \frac{\mathrm{Cov}(x_2, y)}{\mathrm{Var}(x_2)} = \beta_2. \qquad (18)$$

IVLS here delivers an estimate of the impact of the exogenous portion of treatment, to which subjects have been randomized, but not of the endogenous portion of the aggregate variable of interest, $x_T$. Yet we are ultimately interested in the effect of $x_1$, or at least $x_T$; otherwise, we could simply regress the dependent variable on $x_2$. This is the sense in which there is no free lunch provided by the exogeneity of the instrument, given that the data-generating process is equation (13) and not (15).

In other cases, the situation may be somewhat more complicated. For instance, when $\mathrm{Cov}(x_1, x_2) \neq 0$, the IVLS estimate of $\beta$ in equation (15) will converge to a mixture of $\beta_1$ and $\beta_2$, the weights being $w = \mathrm{Cov}(x_2, x_1) / [\mathrm{Cov}(x_2, x_1) + \mathrm{Var}(x_2)]$ and $1 - w$. In simulations reported on the author's website, I investigate what IVLS estimates under a range of other assumptions about the true data-generating process.

[Footnote 12: In the formula for $w$, Var and Cov operate on random variables, and $w$ could be negative.]

4.1 A specification test

The discussion above suggests a natural specification test, which requires the availability of an additional instrument, $z_1$, with the following properties:

$$z_1 \perp \epsilon \qquad (19)$$
and

$$\mathrm{Cov}(z_1, x_1) \neq 0, \qquad (20)$$

where the notation follows the section above. We will then use IVLS to estimate the model in equation (13) above, that is,

$$y_i = \beta_1 x_{1i} + \beta_2 x_{2i} + \epsilon_i, \qquad (21)$$

using $z_1$ and $x_2$ (which is exogenous) as the instruments. Let $\Sigma$ be the estimated variance-covariance matrix for the coefficient estimates:

$$\widehat{\mathrm{Cov}}(\hat{\beta}_1, \hat{\beta}_2 \mid x_1, z_1) = \Sigma. \qquad (22)$$

Using the diagonal and off-diagonal elements of this $2 \times 2$ matrix, we can calculate

$$\mathrm{Var}(\hat{\beta}_1 - \hat{\beta}_2) = \mathrm{Var}\,\hat{\beta}_1 + \mathrm{Var}\,\hat{\beta}_2 - 2\,\widehat{\mathrm{Cov}}(\hat{\beta}_1, \hat{\beta}_2). \qquad (23)$$

The coefficient estimates are asymptotically normal, and $z$-tests for the difference can be applied (see Greene 2003 for details). If pooling is appropriate, then the estimated coefficient on $x_1$ should be the same as the estimated coefficient on $x_2$, up to random error. Statistical tests should therefore fail to reject the null hypothesis that $\beta_1$ and $\beta_2$ are equal. This adaptation of a standard test compares a pooling estimator to a splitting estimator; it could be viewed as a Hausman test, in which an additional instrument is needed to test the pooling restriction because $x_1$ is endogenous.

In simulations, the specification test is able to detect model specification failures with a high degree of accuracy. Of course, like most specification tests, this one is robust only against a limited class of alternatives: we stipulate that the data are generated according to equation (13), and the alternatives are that $\beta_1 = \beta_2$ or $\beta_1 \neq \beta_2$. Moreover, since the test requires the availability of an additional instrument, it may only be useful in certain classes of
applications. For instance, we do not attempt to key the test to data from the examples discussed in this paper because we do not see an available additional instrument.

[Footnote 13: For simulations, see the author's website.]

5 Conclusion

In a given natural experiment, nature may assign the units of analysis not to levels of the treatment variable $X$ that is of greatest interest but to levels of another variable $Z$. Nonetheless, the causal effect of $X$ may be recovered through instrumental variables regression analysis, provided that the assumptions of the IVLS model are valid. In most applications, analysts tend to focus attention on two canonical requirements for a valid instrumental variable: the variable $Z$ must be correlated with the endogenous regressor $X$, and it must itself be exogenous, that is, independent of the error term. The first assumption can be checked from the data. The second, essentially untestable, assumption is generally the more difficult one, and it is the one for which a good natural experiment can be particularly useful.

However, satisfying these requirements is not enough for valid application of the instrumental variables approach. In particular, the regression model linking $X$ to $Y$ must also be valid. While this may seem obvious, in this article I have drawn attention to a too-infrequently remarked feature of the canonical IVLS regression model: the assumption of constant effects across exogenous and endogenous portions of the problematic regressor $X$. Violations of this assumption can limit the ability of the natural-experiment-cum-instrumental-variables approach to recover causal parameters. For example, in order to use lottery income to estimate the effect of overall income on political attitudes, we must assume that the effects of lottery income and earned income are the same. To use rainfall changes to estimate the effect of economic growth on civil conflict, we must assume that growth in the agricultural sector has the same effect as growth in the industrial sector. These are strong assumptions, and they should be defended with the same kind of energy that is used to defend exogeneity.

If the assumption of constant effects across endogenous and exogenous portions of $X$ is wrong, then IVLS estimates can be quite misleading. When heterogeneity takes the simple form I have investigated here (that is, the outcome variable is a sum of independent exogenous and endogenous portions), instrumental variables regression simply estimates the coefficient of the exogenous portion. In more complicated settings, IVLS may estimate a mixture of the true coefficients of interest. Thus, if the model is incorrectly specified, exogeneity may not be much help.

Ultimately, of course, the question of model specification is a theoretical and not a technical one. Whether it is proper to specify constant coefficients across exogenous and endogenous portions of a treatment variable, in examples like those discussed in this paper, is a matter for a priori reflection. This is not unique to applications of IVLS (indeed, similar issues may arise even if there is no endogeneity), yet special issues are raised with IVLS because we often hope to use the technique to recover the causal impact of endogenous treatments.

What about the potential problem of infinite regress? In the lottery example, for instance, it might well be that different kinds of earned income have different impacts on political attitudes; in the Africa example, different sorts of agricultural income could have different effects on conflict. To test many permutations, given the endogeneity of the variables, we would need many instruments, and these are not usually available. This is exactly the point. Deciding when it is appropriate to assume constant effects is a crucial theoretical issue. That issue tends to be given short shrift in current applications of the natural-experiment-cum-instrumental-variables approach, where the focus is on exogeneity.

The basic point emphasized here is therefore not that data analysis or regression diagnostics are the key. Rather, in any particular application, a priori and theoretical reasoning should be brought to bear to justify the crucial assumption of constant effects across endogenous and exogenous portions of the problematic regressor. In some settings, the constancy assumption may be innocuous; the point here is not that the IVLS approach is necessarily flawed, but that the validity of the underlying regression model should be carefully considered. The "no free lunch" principle suggests that randomization to $Z$ instead of $X$ may not be enough to recover the causal impact of $X$.
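The specification test built from equations (19) to (23) can be sketched in code. This Python example is illustrative only and is not the author's implementation. It assumes a just-identified IVLS estimator with the usual homoskedastic variance formula, simulated normal data with mean zero (so no intercept is fitted), and hypothetical function names (`ivls`, `pooling_test`, `simulate`). The additional instrument `z1` is constructed to satisfy (19) and (20) by design, which is exactly what is hard to guarantee in real applications.

```python
import numpy as np


def ivls(y, X, Z):
    """Just-identified IVLS: beta = (Z'X)^{-1} Z'y, with the variance estimate
    sigma^2 (Z'X)^{-1} Z'Z (X'Z)^{-1} (homoskedastic errors assumed)."""
    n, k = X.shape
    zx_inv = np.linalg.inv(Z.T @ X)
    beta = zx_inv @ (Z.T @ y)
    resid = y - X @ beta                  # residuals use the original regressors
    sigma2 = resid @ resid / (n - k)
    cov = sigma2 * zx_inv @ (Z.T @ Z) @ zx_inv.T
    return beta, cov


def pooling_test(y, x1, x2, z1):
    """z-statistic for H0: beta1 == beta2 in equation (21)."""
    X = np.column_stack([x1, x2])
    Z = np.column_stack([z1, x2])         # x2 is exogenous and instruments itself
    beta, cov = ivls(y, X, Z)
    var_diff = cov[0, 0] + cov[1, 1] - 2.0 * cov[0, 1]   # equation (23)
    return beta, (beta[0] - beta[1]) / np.sqrt(var_diff)


def simulate(beta1, beta2, n=20_000, seed=0):
    """Assumed data-generating process in the form of equation (13)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                # confounder that makes x1 endogenous
    z1 = rng.normal(size=n)               # extra instrument, independent of eps
    x2 = rng.normal(size=n)               # randomized portion of treatment
    x1 = z1 + u + rng.normal(size=n)      # endogenous portion, correlated with z1
    eps = u + rng.normal(size=n)
    y = beta1 * x1 + beta2 * x2 + eps
    return y, x1, x2, z1


for b1, b2 in [(1.0, 1.0), (1.0, -1.0)]:
    beta, z_stat = pooling_test(*simulate(b1, b2))
    print(f"true=({b1}, {b2})  estimates={np.round(beta, 3)}  z={z_stat:.2f}")
```

A large absolute $z$-statistic is evidence against pooling $x_1$ and $x_2$ into a single regressor. A small one is consistent with constant effects but, as the text stresses, only within the class of models in which equation (13) holds.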

References

[1] Acemoglu, Daron, Simon Johnson, and James Robinson. The Colonial Origins of Comparative Development: An Empirical Investigation. The American Economic Review Vol. 91 (5).

[2] Angrist, Joshua D., Guido W. Imbens, and Donald B. Rubin. Identification of Causal Effects Using Instrumental Variables. Journal of the American Statistical Association 91 (434).

[3] Angrist, Joshua D. and Victor Lavy. Using Maimonides' Rule to Estimate the Effect of Class Size on Student Achievement. Quarterly Journal of Economics 114.

[4] Bartels, Larry M. Instrumental and Quasi-Instrumental Variables. American Journal of Political Science Vol. 35 (3).

Bound, John, David Jaeger, and Regina Baker. Problems with Instrumental Variables Estimation when the Correlation between the Instruments and the Endogenous Explanatory Variables is Weak. Journal of the American Statistical Association Vol. 90 (430).

[5] Doherty, Daniel, Donald Green, and Alan Gerber. Personal Income and Attitudes toward Redistribution: A Study of Lottery Winners. Political Psychology Vol. 27 (3).

[6] Dunning, Thad. Forthcoming. Improving Causal Inference: Strengths and Limitations of Natural Experiments. Political Research Quarterly. Previous version presented at the meetings of the American Political Science Association, August 31-September 5, Washington, D.C.

[7] Freedman, David. Statistical Models: Theory and Practice. Cambridge: Cambridge University Press.

[8] Freedman, David, Robert Pisani, and Roger Purves. Statistics. 3rd ed. New York: W.W. Norton, Inc.

[9] Greene, William H. Econometric Analysis. Fifth Edition. Upper Saddle River, NJ: Prentice Hall.

[10] Hanushek, Eric A. and John E. Jackson. Statistical Methods for Social Scientists. San Diego, CA: Academic Press, Harcourt Brace Company.

[11] Heckman, James J. Randomization as an Instrumental Variable. The Review of Economics and Statistics Vol. 78 (2).

[12] Heckman, James J. and R. Robb. Alternative Methods for Evaluating the Impact of Interventions. In James J. Heckman and Burton Singer, eds., Longitudinal Analysis of Labor Market Data, Volume 10. New York: Cambridge University Press.

[13] Heckman, James J. and R. Robb. Alternative Methods for Solving the Problem of Selection Bias in Evaluating the Impact of Treatments on Outcomes. In Howard Wainer, ed., Drawing Inferences from Self-Selected Samples. New York: Springer-Verlag.

[14] Heckman, James J., Sergio Urzua, and Edward Vytlacil. Understanding Instrumental Variables in Models with Essential Heterogeneity. Paper given by Heckman as the Tjalling C. Koopmans Lecture, Cowles Foundation, Yale University, September 26-27.

[15] Holland, Paul W. Statistics and Causal Inference. Journal of the American Statistical Association 8 (with discussion).

[16] Kennedy, Peter. A Guide to Econometrics. 2d ed. Cambridge: MIT Press.

[17] Krasno, Jonathan S. and Donald P. Green. Do Televised Presidential Ads Increase Voter Turnout? Evidence from a Natural Experiment. Manuscript, Department of Political Science, Yale University.

[18] Neyman, Jerzy. Sur les applications de la théorie des probabilités aux experiences agricoles: Essai des principes. Roczniki Nauk Rolniczych 10: 1-51, in Polish. English translation by D.M. Dabrowska and T.P. Speed (1990), Statistical Science 5 (with discussion).

[19] Rosenzweig, Mark R. and Kenneth I. Wolpin. Natural "Natural Experiments" in Economics. Journal of Economic Literature Vol. 38 (4).

[20] Rubin, Donald. Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies. Journal of Educational Psychology 66.
