Regression Discontinuity Design on Model Schools Value-Added Effects: Empirical Evidence from Rural Beijing


Kai Hong
CentER Graduate School, Tilburg University
April 2010

Abstract

In this study we examine the value-added effects of model schools on students' achievements. We apply a regression discontinuity design to data from Daxin District in rural Beijing. Both parametric and nonparametric approaches are adopted, and the estimated results are heterogeneous. For science students, significant positive effects ranging from 20 to 80 are found, while for art students we find little evidence of positive effects. Three robustness checks, covering additional covariates, school-specific cutoffs and peer effects, are also performed and further confirm the robustness of our results. Two policy-related issues are also discussed: compliers versus noncompliers, and the partially fuzzy design, for which we find smaller effects for the full population and almost the same effects for eligible participants.

Keywords: regression discontinuity design, value-added effect, model school, economics of education

1 Introduction

How to allocate limited educational resources to obtain maximum achievement has been a controversial issue for a long time. On the one hand, education is crucial for further development. For example, in economics, education plays an important role in both economic growth and eliminating inequality [1]. On the other hand, educational resources are usually limited. For example, public spending on education in most countries accounts for less than 5% of GDP. In 2005 in China this percentage was only 2.82%. How to balance these two considerations and guarantee that limited educational resources are used efficiently is the central problem in the economics of education.

In China, establishing key schools or model schools is recognized as an effective way to solve this problem. Policies regarding model schools have been implemented since 1994. Now, 15 years later, these model schools perform very well in almost all fields, especially in students' achievements. However, because whether a student can be enrolled in a model school largely depends on his or her previous performance, students in model schools usually have excellent achievements even before they enter their current schools and would likely obtain similar achievements in normal schools. It therefore remains questionable whether the value-added effects of model schools on students' achievements are large enough to justify the generous educational inputs. In this paper we examine the effects of model high schools on students' achievements in rural Beijing with the regression discontinuity design (RD design for short).

The effects of teachers or schools have been drawing attention from researchers for several decades. Such effects are usually known as value-added effects and can be interpreted either descriptively or causally, for example as treatment effects of certain policies concerning teachers or schools.
Many specific topics, from theory to practice, are involved in this field, such as the realization of treatments, how to obtain reliable causal estimates of treatment effects, and how to deal with certain specific econometric techniques [2]. The main question to be answered is usually framed as what effect being in school A has on students' subsequent test scores, or how much value a particular school or teacher has added to their students' test scores. To answer such questions we usually need to compare post-test scores with test scores before the treatment assignment and establish to what extent the difference reflects a causal effect. Before going deeper into this field, it is necessary to review the methods used to identify and estimate causal effects, with an emphasis on a special one called regression discontinuity design.

The rest of the paper is organized as follows. The second section is an intensive review of the RD design in economics of education, including an introduction to randomized experiments, which are recognized as the standard method for estimating causal effects; the development of the RD design, which is commonly recognized as a quasi-experimental design; a few recent applications of such designs to value-added analysis; and a brief conclusion with several relevant prospects. The third section introduces the empirical background, including the relevant exams in Beijing, the model school policies and a description of the data. The fourth section deals with the evidence of validity, where the discontinuity of variables is analyzed graphically and the density of the treatment-determining variable is further tested. The fifth section presents the empirical analysis. After an introduction of the RD design framework, both parametric and nonparametric estimation are performed. Robustness checks, including additional covariates, multiple cutoffs and peer effects, are also discussed.
In the sixth section, policy extensions dealing with compliers, noncompliers and eligible participants are discussed in depth. The seventh section concludes.

[1] For the relation between education and economic growth, see Stevens and Weale (2003). For the relation between education and eliminating inequality, see Mickelson (1987).
[2] See Rubin et al. (2004) for a short review of value-added assessment in education.

2 RD Design in Economics of Education

2.1 Basic Settings

Usually the RD design begins with a population of objects $N = \{1, 2, \dots, I\}$, with a typical object denoted by $i$. Such objects can be individuals, households, schools and so on. For each $i$ several attributes are observed. One is the outcome $Y_i$, whose variation across objects we want to explain. Another is the treatment $T_i$. If we assume for simplicity that there are only two levels of treatment, we have $T_i = 1$ for objects in the treatment group and $T_i = 0$ for those in the control group. Other characteristics are denoted by $X_i$. The treatment effect is measured by the difference between the outcomes of the same object in the treatment group and in the control group, with the other characteristics $X_i$ unchanged. For a given individual, we have the following formulation:

$$Y_i = \alpha + \beta_i T_i + \epsilon_i,$$

where $\alpha$ is a constant term and $\epsilon_i$ is the error term. If $T_i = 0$, we have $Y_i = \alpha + \epsilon_i$. So with the assumption that $E(\epsilon_i) = 0$, we have $\alpha = E(Y_i^0)$, where $Y_i^0$ is the outcome without the treatment, and $E(Y_i^1) = \alpha + E(\beta_i)$, where $Y_i^1$ is the outcome with the treatment. Then we have $E(Y_i^1) - E(Y_i^0) = E(\beta_i)$, which is the average treatment effect of the treatment $T$.

2.2 Random Experiment

The central idea of causal inference about treatment effects goes back to Rubin (1974), where the concepts of the randomized experiment and the potential outcome were introduced. As pointed out there, the effect of a treatment $T$ on individual $i$ can be written as $Y_i^1 - Y_i^0$, where $Y_i^1$ and $Y_i^0$ are the outcomes with and without the treatment respectively. However, it is usually impossible to observe these two outcomes for a given individual simultaneously. Holland (1986) summarizes two potential ways to deal with this problem: the scientific solution and the statistical solution. Which one is useful depends on the validity of the assumptions [3].
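The setting of Section 2.1 and the observation problem just described can be sketched with a small simulation. This is only an illustration under assumed values: the constant `alpha = 10`, the distribution of the individual effects `beta_i`, and the coin-flip assignment are all hypothetical and are not taken from the paper's data. In the simulated world both potential outcomes are known, so the average of $Y_i^1 - Y_i^0$ can be compared with $E(\beta_i)$; an analyst would see only one outcome per unit.

```python
import random
from statistics import fmean

def simulate_population(n=50_000, seed=0):
    """Simulate Y_i = alpha + beta_i * T_i + eps_i with heterogeneous effects.

    All numeric values are hypothetical choices for illustration only.
    """
    rng = random.Random(seed)
    alpha = 10.0
    units = []
    for _ in range(n):
        beta_i = rng.gauss(2.0, 1.0)   # individual treatment effect
        eps_i = rng.gauss(0.0, 1.0)    # error term with mean zero
        y0 = alpha + eps_i             # potential outcome without treatment
        y1 = alpha + beta_i + eps_i    # potential outcome with treatment
        t = 1 if rng.random() < 0.5 else 0   # coin-flip assignment
        units.append((y0, y1, t, beta_i))
    return units

units = simulate_population()

# Average treatment effect computed from BOTH potential outcomes
# (possible only inside a simulation).
ate = fmean(y1 - y0 for y0, y1, _, _ in units)
mean_beta = fmean(b for _, _, _, b in units)

# What is actually observable: one outcome per unit, chosen by T_i.
observed = [(t, y1 if t == 1 else y0) for y0, y1, t, _ in units]
diff_in_means = (fmean(y for t, y in observed if t == 1)
                 - fmean(y for t, y in observed if t == 0))

print(f"ATE from potential outcomes: {ate:.3f}")
print(f"mean of beta_i:              {mean_beta:.3f}")
print(f"observed difference in means: {diff_in_means:.3f}")
```

Because assignment here is a coin flip, the observed difference in means should land close to the average of `beta_i`; with a non-random assignment rule that would generally not hold.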
In the scientific solution, special assumptions such as the untestable unit homogeneity assumption are imposed. For example, we assume that the potential outcomes with and without treatment are the same for two objects, that is, $Y_1^0 = Y_2^0$ and $Y_1^1 = Y_2^1$. Then, if object 1 is in the treatment group and object 2 is in the control group, we observe $Y_1^1$ and $Y_2^0$, and the treatment effect can be measured by the observed difference $Y_1^1 - Y_2^0$.

In the statistical solution the average treatment effect is identified under certain conditions, such as the well-known randomization or independence assumption [4]. In more detail, individuals are assigned randomly so that the treatment is independent of all other variables, such as background characteristics or the outcomes themselves. Mathematically, the independence assumption is $(Y^0, Y^1) \perp T$. In this case we have $E(Y^1) = E(Y^1 \mid T = 1)$ and $E(Y^0) = E(Y^0 \mid T = 0)$, so the average treatment effect can be expressed as $E(Y^1 \mid T = 1) - E(Y^0 \mid T = 0)$.

Usually it is difficult to realize such pure random assignment, which calls for well-designed experiments. If these experiments are not available, selection bias in the estimated average treatment effect may arise. One common case occurs when the assignment to treatment is determined by a predictor [5], denoted by $S_i$. There is a cutoff point $S_0$ of this covariate [6], and units are treated if the value of the covariate is on one side of this point and not treated if it is on the other side. So we have $T_i = 1$ if $S_i \ge S_0$ and $T_i = 0$ if $S_i < S_0$. This idea leads to another analytical framework, the regression discontinuity design. The following sections review the development of this design, concentrating on applications to economics of education.

[3] The former is commonly used in the science laboratory and the latter is usually preferred in social experiments.
[4] Of course, there are also some additional assumptions needed to make causal inference simpler, such as the Stable Unit Treatment Value Assumption (SUTVA) argued by Rubin (1986). The SUTVA contains two components: one is that all objects in a certain group, such as the treatment group or control group, should receive the same treatment; the other is that the potential outcomes of a certain object should not be affected by the treatment status of another object.
[5] There are also many other methods to deal with non-random experiments. Sometimes researchers can replicate a random experiment by matching methods on observables, or by IV strategies on unobservables. For details see Heckman, Ichimura and Todd (1998) and Imbens and Angrist (1994) respectively.

2.3 Origin

The concept of the RD design was first introduced by Thistlethwaite and Campbell (1960), who analyze the effect of recognition (a Certificate of Merit) on several factors relating to high school students [7]. The award of the Certificate of Merit is mainly decided on the basis of qualifying scores. The paper presents its results briefly in graphical form. The problem of tests of significance is also discussed and the t-test from Mood (1950) [8] is adopted. The main purpose of the paper is to compare the RD analysis with the ex post facto experiment [9] in both methods and results. One advantage of the RD analysis is emphasized: it does not rely upon matching to equate experimental and control groups [10].

The crucial idea behind the RD design is that the assignment to treatment is determined, completely or partially, by other observed variables according to certain administrative decisions. If the relationship between the assignment to treatment and the observed variables is completely deterministic, as in Thistlethwaite and Campbell (1960), the design is called a sharp RD design. If the relationship is not deterministic [11], the design is called a fuzzy RD design, which was introduced by Campbell (1969).
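The difference between sharp and fuzzy assignment can be sketched numerically. The following is a minimal illustration under assumed values: the cutoff of 50, the uniform distribution of the treatment-determining variable, and the treatment probabilities 0.2 and 0.8 on the two sides of the cutoff are hypothetical, as are the function names. The sketch estimates the jump in the treatment rate by comparing units just above and just below the cutoff.

```python
import random
from statistics import fmean

CUTOFF = 50.0  # hypothetical cutoff S_0

def simulate_assignment(n=50_000, seed=1, p_below=0.2, p_above=0.8):
    """Draw S_i and assign treatment under a sharp and a fuzzy rule."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        s = rng.uniform(0.0, 100.0)
        # Sharp rule: treatment is a deterministic function of S_i.
        sharp = 1 if s >= CUTOFF else 0
        # Fuzzy rule: crossing the cutoff only raises the treatment probability.
        p = p_above if s >= CUTOFF else p_below
        fuzzy = 1 if rng.random() < p else 0
        rows.append((s, sharp, fuzzy))
    return rows

def jump_in_treatment_rate(rows, column, h=2.0):
    """Treatment rate just above the cutoff minus the rate just below it."""
    above = [r[column] for r in rows if CUTOFF <= r[0] < CUTOFF + h]
    below = [r[column] for r in rows if CUTOFF - h <= r[0] < CUTOFF]
    return fmean(above) - fmean(below)

rows = simulate_assignment()
sharp_jump = jump_in_treatment_rate(rows, column=1)
fuzzy_jump = jump_in_treatment_rate(rows, column=2)
print(f"sharp design jump: {sharp_jump:.2f}")
print(f"fuzzy design jump: {fuzzy_jump:.2f}")
```

Under the sharp rule the estimated jump is exactly one by construction, while under the fuzzy rule it should be close to the assumed difference in probabilities, that is, strictly between zero and one.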
In mathematical terms, in the former case we have $T_i = 1\{S_i \ge S_0\}$ and

$$\lim_{s \to S_0^+} \Pr(T_i = 1 \mid S_i = s) - \lim_{s \to S_0^-} \Pr(T_i = 1 \mid S_i = s) = 1,$$

while in the latter case $T_i$ is not fully determined by $S_i$ and we only have

$$\lim_{s \to S_0^+} \Pr(T_i = 1 \mid S_i = s) - \lim_{s \to S_0^-} \Pr(T_i = 1 \mid S_i = s) > 0$$

[12], so that the jump at the cutoff point can be smaller than one.

Though introduced as early as the 1960s, the RD design saw few theoretical or practical developments until the mid-1990s, apart from several applications in psychology and education by the pioneers themselves, such as Cook and Campbell (1979). In the mid-1990s, however, research started to boom after a 30-year silence [13].

[6] This covariate can be a single variable or a combination of several variables. It is called the assignment, treatment-determining, selection, running, forcing or ratings variable in various literatures. In the remaining sections we prefer the name treatment-determining variable.
[7] These factors include attitudes toward intellectualism, the number of students planning to seek an MD or PhD degree, the number planning to become college teachers or scientific researchers, and the number who succeed in obtaining scholarships from other granting agencies.
[8] It is a test of the significance of the deviation of the first experimental value beyond the cutoff from the value predicted by a linear fit to the control group.
[9] In an ex post facto experiment, the treatment and control groups are not selected before the experiment, so the treatment cannot be manipulated. The researcher studies the treatment effects after the naturally occurring treatment.
[10] Usually these matching methods, such as the propensity score method, are not applicable in the RD design setting because of the violation of the strong ignorability condition. For details see Rosenbaum and Rubin (1983).
[11] However, the author argues that under this setting the framework of regression discontinuity design does not work. Though this argument has been revised in several subsequent papers, the essence of the fuzzy design remained unclear until recently.
[12] Of course the expression on the left is negative if the rule of assignment to treatment operates in the opposite direction.
[13] See Cook (2008) for a review of the history of the RD design in psychology, education, statistics and economics.

2.4 Early Applications

There are two possible reasons for the popularity of the RD design since the mid-1990s. One is that more and more programs, not only in education, assign treatment in this way. The other is that several advantages of the RD design, such as its mild assumptions, have been recognized by more and more researchers.

Van der Klaauw (1997), who studies the effect of financial aid on college enrollment, reveals the relation between the fuzzy RD design and instrumental variable methods. In the fuzzy RD design there are also unobserved variables which affect the assignment to treatment. If such unknown factors are not independent assignment errors, then the simple regression of outcomes on the treatment will give biased estimates [14]. To solve this problem the treatment is replaced by the propensity score, which has to be estimated first. Such a two-stage procedure leads to a consistent estimate of the treatment effect. A semi-parametric estimation method is adopted, and a sensitivity analysis checks how the estimates respond to different specifications.

Angrist and Lavy (1999) also analyze a fuzzy RD design within the framework of instrumental variables, studying the effect of class size on academic achievement, measured by pupils' scores. Here Maimonides' rule [15] is used to divide students into classes of equal size, which makes class size correlated with enrollment. So instead of the propensity score, class size itself serves as the dependent variable in the first stage.

At the time of these early applications instrumental variables (IV) was the dominant method, so many early studies of the RD design also formulated the causal analysis in this framework, like Imbens and Angrist (1994), though they already considered cutoffs on treatment-determining variables. To understand the RD design better, it is therefore necessary to briefly discuss its relation to alternative methods, such as IV methods and matching, before going deeper into specific issues.

2.5 Matching, IV and RD Design

2.5.1 Matching Methods and (Sharp) RD Design

Generally speaking, we have four potential outcomes: $Y_{1i}^1$, $Y_{1i}^0$, $Y_{0i}^1$ and $Y_{0i}^0$. The first two are outcomes of objects in the treatment group: $Y_{1i}^1$ is the observed outcome while $Y_{1i}^0$ is the unobserved counterfactual outcome. The last two are outcomes of those in the control group: $Y_{0i}^1$ is the unobserved counterfactual outcome while $Y_{0i}^0$ is the observed outcome.
Then the average observed difference in outcomes between the two groups can be expressed as

$$E(Y_{1i}^1) - E(Y_{0i}^0) = \left[E(Y_{1i}^1) - E(Y_{1i}^0)\right] + \left[E(Y_{1i}^0) - E(Y_{0i}^0)\right],$$

where $E(Y_{1i}^1) - E(Y_{1i}^0)$ is the average treatment effect on the treated (ATET for short) and $E(Y_{1i}^0) - E(Y_{0i}^0)$ is the selection bias. If treatment is randomly assigned, we have $E(Y_{1i}^0) = E(Y_{0i}^0)$ and the ATET equals the average observed difference in outcomes between the two groups.

If the treatment is assigned on observables only, matching estimators of treatment effects become useful. They require additional assumptions. One is the conditional independence assumption $(Y^0, Y^1) \perp T \mid X$, which implies $(Y^0, Y^1) \perp T \mid P(X)$, where $P(X) = \Pr(T = 1 \mid X)$ (Rosenbaum and Rubin, 1983). Another is the overlap condition, which means that for every value of the characteristics $X$ found in the treatment group there should also be objects in the control group. Under this condition we have $0 < \Pr(T = 1 \mid X = x) < 1$ for all $x$ in the treatment group.

There are two kinds of matching methods: exact matching and inexact matching. In exact matching, we match objects in the treatment group and the control group on their observable characteristics $X$. This procedure requires that the characteristics $X$ are discrete and that for each value there are many objects in both groups. If these conditions are not satisfied, exact matching becomes impractical and we have to turn to inexact matching. One of the most popular methods of inexact matching is the propensity score method. The propensity score is the conditional probability of receiving treatment given $X$. This method matches on the propensity score and compares objects in the two groups whose propensity scores are closest. To estimate the propensity score we can use parametric models such as the logit or nonparametric methods. The general formula of the matching ATET is

$$ATET_M = \frac{1}{N_T} \sum_{i:\, T_i = 1} \left[ Y_i^1 - \sum_{j} \omega(i, j)\, Y_j^0 \right],$$

where $N_T$ is the number of objects in the treatment group, $j$ indexes the objects in the comparison group of object $i$ in the treatment group, and this comparison group is expressed as $C_j(X) = \{ j \mid X_j \in c(X_i) \}$, where $c(X_i)$ is a neighborhood of the characteristics $X_i$. $\omega(i, j)$ is the weight, with $0 < \omega(i, j) \le 1$. By choosing different weights we obtain different estimators.

Both matching methods and the sharp RD design are cases of assignment on observables only. But we cannot apply matching methods such as propensity score methods to the sharp RD design, because the setting of the latter violates the overlap condition. In the sharp RD design we have $\Pr(T = 1 \mid X) = 0$ for all $X < X_0$ and $\Pr(T = 1 \mid X) = 1$ for all $X \ge X_0$, where $X$ here plays the role of the treatment-determining variable, so there is no region of common support, as would be required for matching. Formally, the treatment effect of the sharp RD design can be derived as

$$\beta_s = \lim_{s \to S_0^+} E(Y_i \mid S_i = s) - \lim_{s \to S_0^-} E(Y_i \mid S_i = s).$$

[14] This idea goes back to Barnow et al. (1980).
[15] As interpreted by Maimonides, the rule can be stated as follows: Twenty-five children may be assigned to one teacher. If there are more than forty children, two teachers must be appointed. If the number of children is between twenty-five and forty, an additional assistant is needed.

2.5.2 Instrumental Variables and (Fuzzy) RD Design

In a fuzzy RD design, the jump in the probability of treatment at the cutoff point is smaller than one, which implies that the relation between the treatment-determining variables and the treatment is not deterministic. There are several reasons for such non-deterministic assignment. Sometimes we have $T_i(X)$, so that for individual $i$ the treatment assignment is a deterministic function of the treatment-determining variable, but this function may differ across individuals. Sometimes there is a unique deterministic function assigning treatment for all individuals, but one or more treatment-determining variables are not observable. In that case we obtain a fuzzy RD design, though essentially it should be a sharp RD design.

In the first case, where we have various deterministic functions, the setting is associated with the instrumental variable methods introduced by Imbens and Angrist (1994).
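Returning to the sharp design of Section 2.5.1, the failure of overlap and the comparison-of-limits idea can be sketched on simulated data. Everything numeric here is an assumption made for illustration: the cutoff of 50, the linear outcome equation with a true jump of 5, the bandwidth `h = 1`, and the function names. The local comparison is a crude difference of means in a narrow window, not the estimator used later in the paper.

```python
import random
from statistics import fmean

def simulate_sharp(n=50_000, cutoff=50.0, effect=5.0, seed=2):
    """Sharp design: T_i = 1{S_i >= cutoff}; the outcome also rises with S_i."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        s = rng.uniform(0.0, 100.0)
        t = 1 if s >= cutoff else 0
        y = 10.0 + 0.2 * s + effect * t + rng.gauss(0.0, 1.0)
        data.append((s, t, y))
    return data

def treated_share_by_bin(data, width=10.0):
    """Share of treated units within bins of S: an empirical propensity score."""
    bins = {}
    for s, t, _ in data:
        bins.setdefault(int(s // width), []).append(t)
    return {b: fmean(ts) for b, ts in sorted(bins.items())}

def naive_difference(data):
    """Difference in mean outcomes between all treated and all control units."""
    return (fmean(y for _, t, y in data if t == 1)
            - fmean(y for _, t, y in data if t == 0))

def local_difference(data, cutoff=50.0, h=1.0):
    """Difference in mean outcomes within a window of width h on each side."""
    above = [y for s, _, y in data if cutoff <= s < cutoff + h]
    below = [y for s, _, y in data if cutoff - h <= s < cutoff]
    return fmean(above) - fmean(below)

data = simulate_sharp()
shares = treated_share_by_bin(data)
naive = naive_difference(data)
local = local_difference(data)
print("treated share by bin of S:", shares)
print(f"naive difference: {naive:.2f}")
print(f"local difference near the cutoff: {local:.2f}")
```

Every bin has a treated share of either zero or one, so no treated unit has a control counterpart with the same value of the treatment-determining variable. The naive difference mixes the treatment effect with the trend in $S_i$, while the comparison close to the cutoff should come much nearer to the assumed jump.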
Imbens and Angrist (1994) discuss in depth the identification and estimation of a special average treatment effect, called the local average treatment effect (LATE), which can be identified even when no subpopulation has zero probability of treatment [16]. Again we have potential outcomes $Y_i^1$ and $Y_i^0$. Moreover, we have an instrumental variable $Z_i$ that is independent of the potential outcomes and related to the treatment. So we have $T_i(z) = 1$ if individual $i$ would be treated when $Z = z$ and $T_i(z) = 0$ if he or she would not be treated when $Z = z$. Then a latent index model is introduced, where the treatment is related to a latent treatment index and this index is determined by the instrumental variable. Mathematically we have

$$T_i^* = \gamma_0 + \gamma_1 Z_i + \tau_i, \qquad T_i = \begin{cases} 1, & T_i^* > 0 \\ 0, & T_i^* \le 0. \end{cases}$$

To identify the treatment effect, we need some mild assumptions. One is the existence of instruments: for a random variable $Z$ and for all $z$, $(Y_i^0, Y_i^1, T_i(z))$ are jointly independent of $Z_i$, and $P(z) = E(T_i \mid Z_i = z)$ is a nontrivial function of $z$ [17]. Another is the monotonicity assumption: for all $z$ and $w$, we have either $T_i(z) \ge T_i(w)$ for all individuals $i$ or $T_i(z) \le T_i(w)$ for all individuals $i$, where both $T_i(z)$ and $T_i(w)$ can be zero or one. Then the LATE can be identified and estimated through the IV approach:

$$LATE_{z,w} = E\left[Y_i^1 - Y_i^0 \mid T_i(z) \ne T_i(w)\right].$$

This is the average effect for individuals who change their treatment status when the instrument is changed.

In a fuzzy RD design with various deterministic assignment functions, the indicator $1\{S_i \ge S_0\}$, as well as polynomial functions of the treatment-determining variables and other covariates, if any, serve as instrumental variables. Then under similar assumptions, we can also identify the LATE at the cutoff point. That is the treatment effect for individuals whose treatment status changes discontinuously, say from non-treatment to treatment, when the value of the treatment-determining variable crosses the cutoff point.

[16] In fact, the LATE and the IV estimates are equivalent only with a binary instrument. With multiple instruments they are different in general. For details, see Angrist, Imbens and Rubin (1996).
[17] Here both $T_i(z)$ and $Z_i$ are random variables; that is, $Z_i$ is randomly assigned and does not determine $T_i(z)$ deterministically.
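The LATE interpretation with a binary instrument can be sketched with a simulation. This is an illustrative setup only: the population shares of always-takers, compliers and never-takers, the effect sizes of 6 and 3, and the function names are hypothetical assumptions, and there are no defiers, so monotonicity holds by construction. The Wald ratio of the outcome difference to the treatment difference across instrument values is compared with a naive treated-versus-untreated comparison.

```python
import random
from statistics import fmean

def simulate_iv(n=100_000, seed=3):
    """Binary instrument Z with always-takers, compliers and never-takers."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = rng.random()
        kind = "always" if u < 0.25 else ("complier" if u < 0.75 else "never")
        z = 1 if rng.random() < 0.5 else 0          # randomly assigned instrument
        if kind == "always":
            t = 1
        elif kind == "complier":
            t = z                                   # treatment follows the instrument
        else:
            t = 0
        # Hypothetical effects: 6 for always-takers, 3 for compliers.
        effect = 6.0 if kind == "always" else 3.0
        y = rng.gauss(0.0, 1.0) + effect * t
        rows.append((z, t, y))
    return rows

def wald_estimate(rows):
    """[E(Y|Z=1) - E(Y|Z=0)] / [E(T|Z=1) - E(T|Z=0)]."""
    y1 = fmean(y for z, _, y in rows if z == 1)
    y0 = fmean(y for z, _, y in rows if z == 0)
    t1 = fmean(t for z, t, _ in rows if z == 1)
    t0 = fmean(t for z, t, _ in rows if z == 0)
    return (y1 - y0) / (t1 - t0)

def naive_estimate(rows):
    """Mean outcome of treated units minus mean outcome of untreated units."""
    return (fmean(y for _, t, y in rows if t == 1)
            - fmean(y for _, t, y in rows if t == 0))

rows = simulate_iv()
wald = wald_estimate(rows)
naive = naive_estimate(rows)
print(f"Wald (IV) estimate: {wald:.2f}")
print(f"naive comparison:   {naive:.2f}")
```

In this setup the Wald ratio should be close to the effect assumed for compliers, the only group whose treatment status responds to the instrument, whereas the naive comparison also picks up the larger effect assumed for always-takers.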

In more detail, we can estimate the treatment effect of the fuzzy RD design with a two-stage procedure. In the first stage, the propensity score function is estimated:

$$E(T_i \mid S_i) = f(S_i) + \lambda\, 1\{S_i \ge S_0\},$$

where the function $f$ can be estimated parametrically, semi-parametrically or non-parametrically. In the second stage the control function-augmented outcome equation is estimated:

$$Y_i = \alpha + \beta\, E(T_i \mid S_i) + l(S_i) + \epsilon_i,$$

where the estimated propensity score from the first stage replaces the treatment variable. Both $f$ and $l$ are polynomials of $S_i$ and can be chosen separately, so they can be different or the same. Finally we obtain the treatment effect of the fuzzy RD design as

$$\beta_{fuzzy} = \frac{\lim_{s \to S_0^+} E(Y_i \mid S_i = s) - \lim_{s \to S_0^-} E(Y_i \mid S_i = s)}{\lim_{s \to S_0^+} E(T_i \mid S_i = s) - \lim_{s \to S_0^-} E(T_i \mid S_i = s)}.$$

2.6 Identification and Estimation

At almost the same time as the early studies mentioned above, the identification and estimation of the treatment effect in the RD design were also developed by several researchers. Two important questions remained unclear for early researchers. One is what the sources of identification are, and the other is how to estimate the treatment effect under minimal restrictions or assumptions on the functions or parameters involved. In this section we briefly discuss the relevant issues and leave details to the following formal analytical sections.

Hahn, Todd and Van der Klaauw (2001) briefly discuss these two questions. They show that under several weak functional continuity assumptions, the treatment effect can be identified nonparametrically by comparing persons arbitrarily close to the point at which the probability of receiving treatment changes discontinuously [18]. For estimation, they show that the kernel estimator is numerically equivalent to a standard local Wald estimator [19] under certain conditions, such as a particular choice of kernel and subsample.
But the inferences based on them are different because the kernel estimator is asymptotically biased due to its poor boundary behavior [20]. To avoid this problem the method of local linear regression is introduced, whose bias is smaller than that of the standard kernel estimator and does not depend on the density of the data [21].

Porter (2003) follows the discussion of Hahn et al. (2001) and focuses on the bias problems in the estimation of the RD design. To overcome such problems, several estimators are investigated with a view to attaining the optimal convergence rate under various smoothness conditions, including the Nadaraya-Watson estimator, the partially linear estimator and a local polynomial estimator [22]. The last two are advocated because of their bias-reduction property. The local polynomial estimator is even more robust because this property also holds under the smoothness conditions of the partially linear estimator.

Whichever is adopted, the local polynomial estimator or the partially linear estimator, their asymptotic properties rely on the smoothness conditions of the control functions involved in the regression equations. If the degree of smoothness is unknown, the estimates from these methods may inflate the bias rather than decrease it. To solve this potential problem Sun (2005) introduces the adaptive estimator, which first estimates the degree of smoothness before applying the estimators mentioned above.

[18] In fact, without the constant effect assumption, the treatment effect can only be identified at the discontinuity point.
[19] The Wald estimator was first introduced in Wald (1940). In the binary IV setting, it can be described as $\beta_{Wald} = [E(Y \mid Z = 1) - E(Y \mid Z = 0)] / [E(X \mid Z = 1) - E(X \mid Z = 0)]$.
[20] At boundary points, the order of the bias of the standard kernel estimator is $O(h)$ while it is $O(h^2)$ at interior points. So the convergence rate at boundary points is slower and the bias will be substantial in finite samples.
[21] In local linear regression, a local straight line is used to fit the underlying function, while in standard kernel estimation the straight line is simplified to a constant.
[22] Here the Nadaraya-Watson estimator is just the local Wald estimator and the local polynomial estimator is just the local linear estimator, both of which are discussed in Hahn et al. (2001).
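The boundary behavior described above can be illustrated with a small, noise-free sketch. The setup is entirely hypothetical: a grid of values of the treatment-determining variable on $[0, 1]$, a cutoff at 0.5, a linear regression function with slope 2 and a true jump of 0.5, a uniform (rectangular) kernel and a bandwidth of 0.1. With no noise, any gap between an estimate and 0.5 is pure bias.

```python
from statistics import fmean

CUTOFF = 0.5      # hypothetical cutoff S_0
TRUE_JUMP = 0.5   # hypothetical treatment effect at the cutoff

# Noise-free outcomes on a grid: linear in s with a jump at the cutoff.
points = [(i / 1000, 1.0 + 2.0 * (i / 1000) + (TRUE_JUMP if i >= 500 else 0.0))
          for i in range(1001)]

def window(points, x0, h, side):
    """Observations within h of x0 on one side (uniform kernel)."""
    if side == "right":
        return [(x, y) for x, y in points if x0 <= x <= x0 + h]
    return [(x, y) for x, y in points if x0 - h <= x < x0]

def local_constant(sample):
    """Kernel (Nadaraya-Watson) fit with a uniform kernel: the local mean of y."""
    return fmean(y for _, y in sample)

def local_linear(sample, x0):
    """Intercept of a least-squares line in (x - x0): the fitted value at x0."""
    xs = [x - x0 for x, _ in sample]
    ys = [y for _, y in sample]
    xbar, ybar = fmean(xs), fmean(ys)
    slope = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
             / sum((x - xbar) ** 2 for x in xs))
    return ybar - slope * xbar

def rd_jump(points, h, method):
    """Estimated limit from the right minus estimated limit from the left."""
    right = window(points, CUTOFF, h, "right")
    left = window(points, CUTOFF, h, "left")
    if method == "constant":
        return local_constant(right) - local_constant(left)
    return local_linear(right, CUTOFF) - local_linear(left, CUTOFF)

kernel_jump = rd_jump(points, h=0.1, method="constant")
linear_jump = rd_jump(points, h=0.1, method="linear")
print(f"local constant estimate of the jump: {kernel_jump:.3f}")
print(f"local linear estimate of the jump:   {linear_jump:.3f}")
```

The local constant estimate overstates the jump by roughly the slope times the bandwidth, which is the first-order boundary bias, while the local linear fit recovers the jump because the assumed regression function is exactly linear on each side.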

All the estimators discussed so far are non-parametric or semi-parametric, an approach adopted by a large body of research because of its weak assumptions and limited misspecification bias [23]. There are also several parametric estimators, such as the control function approach. However, the validity of these parametric estimators relies on the correct specification of the control functions and on the stronger global continuity assumption. Though with such an assumption we can use data far from the cutoff, the potentially large bias may greatly reduce the accuracy of the estimation.

The non-parametric and semi-parametric estimators have prompted many further discussions. Some focus on the choice of bandwidth used in kernel estimation, such as Imbens and Lemieux (2008) and Ludwig and Miller (2005). A special case is introduced by Black, Galdo and Smith (2007), who mainly discuss the order of the polynomial in local polynomial estimation.

To conclude this subsection, we summarize the basic assumptions and results of identification. If we assume constant treatment effects and a continuous error term at the cutoff, the treatment effects of the sharp and fuzzy RD designs can be derived as

$$\beta_{sharp} = \lim_{s \to S_0^+} E(Y_i \mid S_i = s) - \lim_{s \to S_0^-} E(Y_i \mid S_i = s),$$

$$\beta_{fuzzy} = \frac{\lim_{s \to S_0^+} E(Y_i \mid S_i = s) - \lim_{s \to S_0^-} E(Y_i \mid S_i = s)}{\lim_{s \to S_0^+} E(T_i \mid S_i = s) - \lim_{s \to S_0^-} E(T_i \mid S_i = s)}.$$

2.7 Special Subjects and Relevant Tests

Since its revival, research on the RD design has developed in two directions: theoretical improvements and empirical applications [24]. In this section we mainly focus on the former and give relevant applications for each theoretical development. We also give more weight to topics closely related to potential developments of the value-added problem in our application.

2.7.1 Covariates Other than Treatment-determining Variables

One natural starting point is to consider the roles of covariates other than the treatment-determining variables.
Then the regression model for the observed outcomes becomes

$$Y_i = \alpha + \beta_i T_i + \gamma_i X_i + \epsilon_i.$$

According to Imbens and Lemieux (2008), these covariates are mainly used for three purposes: examining the validity of the RD design, eliminating small-sample biases, and improving precision.

Firstly, if the treatment is locally randomized, then the observed covariates should be locally balanced on either side of the cutoff. Lee and Lemieux (2009) introduce Seemingly Unrelated Regression (SUR) to check whether predetermined covariates are influenced by the treatment-determining variable, that is, whether they are discontinuous at the cutoff. Mathematically, we have $J$ covariates with regression equations

$$X_j = \alpha_j + \beta_j T + \tau_j S + \mu_j, \qquad j = 1, \dots, J.$$

Then we just need to perform a $\chi^2$ test of whether the $\beta_j$ are jointly equal to zero. An empirical application can be found in Lee, Moretti and Butler (2004).

Secondly, in practice we sometimes include observations whose treatment-determining variables are not close to the cutoff, and then additional covariates may eliminate some of the bias from these observations while the estimator remains consistent [25]. Given that some observed covariates are controlled for, the identifying assumptions can still hold even if observations somewhat far from the cutoff are included. For example, suppose there is a covariate related to the potential outcome. If it is not controlled for, then with observations far from the cutoff both the potential outcome and the error term are likely to jump at the cutoff. It is then difficult to distinguish the treatment effect from the effect of this covariate, and a spurious effect will be induced.

Thirdly, even if the RD design is valid without additional covariates, incorporating them may still help to improve efficiency [26]. Generally the variance of the estimator with additional covariates is smaller than that with only the treatment-determining variable. Furthermore, the former decreases with the number of additional covariates [27].

[23] See van der Klaauw (2008b) for a detailed comparison between non-parametric and parametric estimation.
[24] This is a rough classification because few studies focus on only one aspect. Most of them mainly pay attention to one aspect while also giving consideration to the other.
[25] For these observations, the assignment of treatment may no longer be independent of the covariates.

2.7.2 Discrete Treatment-Determining Variables

Another significant improvement concerns discrete treatment-determining covariates. Card and Shore-Sheppard (2004) is an early investigation of this topic, examining the effects of two large expansions that offer Medicaid coverage to low-income children in certain age ranges. Several variables, such as Medicaid enrollment, are compared on either side of the age cutoff. The model involves a discrete treatment-determining covariate, namely a dummy variable indicating whether age is above the cutoff. A low-order polynomial function of age and income is also included in the model to smooth the change in the outcome variables.

Lee and Card (2008) discuss this problem in depth from a purely theoretical viewpoint. They show that if the treatment-determining variable is discrete, the conditions for non-parametric or semi-parametric methods do not hold and consequently the treatment effect is not non-parametrically identified [28].
Then identification can be achieved by introducing an underlying function in parametric form to approximate the relation between the treatment-determining variable and the expected outcome, as is done in Card and Shore-Sheppard (2004). The intuition can be illustrated by the following example. Suppose that we have a discrete treatment-determining variable $X = (x_1, \dots, x_J)$. The regression function can be expressed as

$$E(Y \mid X = x_j) = T_j \beta_0 + h(x_j)$$

which is equivalent to the micro-data model

$$Y_{ij} = T_j \beta_0 + h(x_j) + \varepsilon_{ij}$$

where $h(x_j)$ is a continuous function that can be approximated by a certain form, such as a polynomial, and $\varepsilon_{ij}$ is the error term, defined as $\varepsilon_{ij} = Y_{ij} - E(Y_{ij} \mid X = x_j)$. However, unlike previous studies, the model does not assume that the form of the underlying regression function is correct: it allows the expected value of the outcome to deviate from the value predicted by the function. Such a deviation is called the random specification error. Under the polynomial function assumption, the regression can be written as:

$$Y_{ij} = \alpha_0 + T_j \beta_0 + X_j \gamma_0 + \alpha_j + \varepsilon_{ij}$$

Here $\alpha_j$ is the random specification error, defined as $\alpha_j = h(x_j) - X_j \gamma_0$. Under this framework the standard errors of the estimates are discussed in depth, that is, under what conditions heteroskedasticity-consistent standard errors, cluster-consistent standard errors and further adjusted standard errors are appropriate. Finally, how to obtain more efficient estimators, and the relation between such estimators and Bayesian estimation, are also discussed.

Continuity of Density

As shown in Hahn et al. (2001), the RD design can be related to treatment effects and can be as good as a randomized experiment as long as the expectations of the potential outcomes under the treatment and control states are both continuous in the treatment-determining variable, especially around the cutoff.
Such validity of the RD design can be tested by examining whether the density of the treatment-determining variable is continuous at the cutoff. However, as shown by several studies, this assumption does not always hold naturally. Martorell (2004) is one of the early studies mentioning this problem; it examines the effects of high school graduation exams on several outcomes, such as graduation and earnings. There, whether unobservable determinants of the student outcome exhibit discontinuous behavior at the passing cutoff serves as the crucial condition under which the causal effect is valid.

26 However, including additional covariates does not necessarily improve the efficiency of estimation. See Lee (2008) for a counterexample.
27 For details about the RD design with additional covariates, see Frolich (2007).
28 That is because there are no observations in an arbitrarily small neighborhood of the cutoff even with infinite data. Then the kernel estimator in the limit will put all weight on the empty neighborhood extremely close to the cutoff.

Lee (2008) finds that if individuals are able to perfectly manipulate the treatment-determining variable, then the density of such a variable is likely to be discontinuous at the cutoff. The continuity condition is formalized as follows: the cdf of the treatment-determining variable $s$ conditional on the error term $\varepsilon$, say $F(s \mid \varepsilon)$, is continuously differentiable and satisfies $F(s \mid \varepsilon) \in (0, 1)$ at the cutoff for each $\varepsilon$. It is also shown that the treatment-determining variable could contain two components. One is systematic and can be affected by the actions of individuals; the other is an exogenous random chance component. With the second component, though endogenous sorting still exists to some extent, local random assignment can occur because manipulation is imprecise. Unbiased impact estimates can then still be obtained. An example regarding U.S. House elections is analyzed to illustrate the main points.

Though the test of the continuity of the expectation of pre-determined characteristics in the treatment-determining variable is a powerful procedure, sometimes these characteristics are unobserved or unavailable. As discussed above, in that case the test about the discontinuity of the density function of the treatment-determining variable is not suitable. McCrary (2008) introduces a special density test for this issue, which is based on an estimator of the discontinuity in the density function of the treatment-determining variable at the cutoff. It is a Wald test with the null hypothesis that the discontinuity is zero. It is implemented in two steps: first a finely gridded histogram is obtained, and then local linear regression is used to smooth the histogram on either side of the cutoff point.
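The two steps just described can be sketched in a few lines of Python. This is only an illustrative simplification, not McCrary's full procedure: the function name `density_jump`, its arguments, and the use of a triangular kernel with user-supplied bin width and bandwidth are assumptions made for this example, and the standard error needed for the Wald statistic is not computed.

```python
import numpy as np

def density_jump(running, cutoff, bin_width, bandwidth):
    """Estimate the log difference in density height at the cutoff.

    Step 1: build a fine histogram whose bins never straddle the cutoff.
    Step 2: smooth the histogram with a kernel-weighted local linear
            regression, separately on each side of the cutoff.
    (Hypothetical helper for illustration; no standard error is returned.)
    """
    r = np.asarray(running, dtype=float)
    # Bin index relative to the cutoff, so the cutoff is always a bin edge.
    idx = np.floor((r - cutoff) / bin_width).astype(int)
    lo, hi = idx.min(), idx.max()
    counts = np.bincount(idx - lo, minlength=hi - lo + 1)
    mids = cutoff + (np.arange(lo, hi + 1) + 0.5) * bin_width
    heights = counts / (len(r) * bin_width)  # normalized histogram

    def boundary_fit(side):
        # Keep only bins on the requested side and within the bandwidth.
        mask = side & (np.abs(mids - cutoff) < bandwidth)
        x = mids[mask] - cutoff
        w = 1.0 - np.abs(x) / bandwidth  # triangular kernel weights
        X = np.column_stack([np.ones_like(x), x])
        WX = X * w[:, None]
        # Weighted least squares: solve (X'WX) b = X'Wy.
        beta = np.linalg.solve(X.T @ WX, WX.T @ heights[mask])
        return beta[0]  # intercept = fitted density at the cutoff

    f_left = boundary_fit(mids < cutoff)
    f_right = boundary_fit(mids >= cutoff)
    return np.log(f_right) - np.log(f_left), f_left, f_right
```

A value of the first returned element close to zero is consistent with a continuous density; a large positive value suggests bunching just above the cutoff. In practice the bin width and bandwidth matter a great deal, so results from a sketch like this should be checked against several choices.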
Two additional conditions for the validity of this test are also discussed. One is the monotonicity of manipulation, that is, all individuals should manipulate the treatment-determining variable in the same direction. The other is that the identification actually fails because of such manipulation. The continuous density of the treatment-determining variable is neither necessary nor sufficient for identification 29.

Fuzzy RD Design

In our potential settings of the value-added problem, as well as in many other quasi-experimental settings in education, the treatment is rarely completely determined by the treatment-determining variables. A separate discussion of the fuzzy RD design is therefore necessary. A fuzzy RD design means that the size of the discontinuity in the probability of treatment at the cutoff is less than one. The treatment effect can be identified by processes similar to those of the sharp RD design, but more and stronger conditions are required to interpret it. For example, it can be defined as the mean effect on the subpopulation of compliers in a neighborhood of the cutoff 30. We will discuss the identification and interpretation of the fuzzy RD design in greater detail in other relevant sections.

Chay, McEwan and Urquiola (2005) is an early, typical empirical study on this topic. It analyzes the effects of Chile's 900 Schools Program, which allocated resources based on cutoffs in schools' mean test scores. Because noise and mean reversion can bias estimates from conventional strategies, an RD design is used to solve this problem. Meanwhile, because the assignment does not rely exclusively on the test scores, a fuzzy RD design is preferred. Some other aspects, like unobserved exact cutoffs, are also discussed to make the estimation more precise.

Battistin and Rettore (2008) analyze the partially fuzzy design, in which eligible individuals can participate based on self-selection.
Using information on three groups (ineligibles, eligible non-participants and participants), they show that in this case the identification of the mean effect on participants who are marginally eligible in a right-neighborhood of the cutoff requires the same conditions as those in a sharp design. A specification test of the validity of non-experimental estimators through these local identifications is also discussed. Such a test allows us to test the ignorability condition, say $Y_0 \perp T \mid S, X$, directly by checking whether the selection bias equals zero, say

$$\lim_{s \to S_0^+} \{E(Y_0 \mid T = 1, s, x) - E(Y_0 \mid T = 0, s, x)\} = 0$$

This test is informative only at the cutoff of eligibility, but if it rejects the non-experimental estimators locally then it is enough to reject them altogether. An application to the PROGRESA program, which aims at encouraging investments in education, health and nutrition through large monetary transfers in rural Mexico, is used to illustrate the crucial points of the paper.

29 The manipulation will lead to a discontinuous density. However, if such manipulation is randomly performed, the treatment effect can still be identified.
30 The concept of compliers and the relevant settings are similar to those used in the IV approach. For details see Angrist, Imbens and Rubin (1996).

Yang (2009) generalizes the problem by pointing out the dual nature of the RD design: a borderline experiment near the cutoff and a strong, valid exclusion restriction in the selection equation. Focusing on the fuzzy RD design, the paper proposes two estimators for the average treatment effect in the presence of multiple selection biases 31: the RD robust estimator and the correction function estimator. The former is used for the population near the cutoff, where selection is based on observables, while the latter is used for a population away from the cutoff, where selection is based on unobservables. Which estimator is appropriate depends on the research question. The choice between them is essentially a balance between internal and external validity 32. The empirical analysis by Chay et al. (2005) is reexamined to show the improvements brought by these two new estimators.

2.9 Applications of RD Design to Value-Added Problem

In this section we introduce several recent applications of the RD design to value-added problems, in the hope of forming a complete picture of such designs. For clarity, these applications are organized in three groups with different treatments: school based, class based and teacher based 33. In all of these applications the potential outcome is a test score or another similar variable, such as the graduation rate.
The treatment-determining variables are test scores (usually for school based and teacher based cases) or student enrollment (usually for class based cases).

School Based Applications

In school based applications, the treatment is usually the assignment to a certain special kind of school or to policies imposed on characteristics of schools. One typical group of studies in this field focuses on the effect of summer schools. Matsudaira (2008), for example, explores the effect of mandatory summer school on students' achievement 34. The treatment-determining variables are the math and reading scores in the year 2000; the potential outcomes are math and reading scores in the year 2002; the treatment is attending summer school. Here students in Chicago who fail to meet any of several criteria may be mandated to attend summer school, so in general the problem falls within a fuzzy RD design framework. Following Porter (2003), a third-order polynomial function of the test score is included in the regression equations and the effect is estimated parametrically. The empirical results show that the average effects are much smaller than those of other studies. Furthermore, they are heterogeneous across grades.

Teacher Based Applications

Most of the teacher based applications focus on the effects of incentives for teachers on students' performance 35. Lavy (2004) evaluates such an effect in Israel with two identification strategies: propensity score matching and RD design. Here the potential outcomes are the scores in several subjects; the treatment-determining variable is the 1999 school matriculation rate; the treatment is the assignment to cash bonuses for teachers. The propensity score matching is feasible because of the very rich and unique data available on all schools and students. In the RD design two variations are considered. One is to exploit the random measurement error in the treatment-determining variable: conditional on the true value, the treatment assignment is random 36, so the treatment effect can be identified by non-parametrically matching schools on the basis of the true value. The other is a sharp RD design with a bandwidth of about 10 percent. Both estimations involve a panel data structure with school-level fixed effects. Several additional effects of the incentive program are also discussed, such as the spillover effects on other non-treated subjects and the effects on teachers' pedagogy, effort and grading ethics. The empirical results show that such incentives actually increase student achievement because of improvements in effort and pedagogy rather than artificial inflation of test scores.

31 The RD robust estimator removes selection bias on observables and controls for the heterogeneous bias from the interaction between observables and the treatment. The correction function estimator deals with omitted-variable bias and controls for the sorting bias.
32 The RD robust estimator is only valid near the cutoff but imposes few functional form restrictions and has a flexible parameterization, while the correction function estimator can deal with biases on unobservables but has a more restrictive parameterization.
33 There is also another popular kind of application concerning the effect of financial matters. We will not discuss it because it is not very closely related to the value-added problem. Readers interested in this topic can turn to Guryan (2001), Canton and Blom (2004), Leuven and Oosterbeek (2007) and Van der Klaauw (2008).
34 Jacob and Lefgren (2004a) is another example that also discusses the effects of summer school on students' test scores in the RD design framework. For other in-depth studies, see Roderick, Engel and Nagaoka (2003).
35 For the effects of teacher training programs, see Jacob and Lefgren (2004b), which analyzes an example from elementary schools in Chicago with school averages of test scores as treatment-determining variables.

Class Based Applications

The relation between class size and students' achievement is the crucial question in this kind of application. One recent study on this topic is Urquiola (2006), which analyzes the effects of class size on test scores in rural Bolivia. The potential outcome is test scores. The treatment-determining variable is the enrollment at the school level. The treatment is class size. Two identification strategies are presented. One focuses only on schools with fewer than 30 students, which helps eliminate manipulation through school and parental choice. The other is similar to that in Angrist et al. (1999), which also generates a fuzzy discontinuity in the relation between class size and enrollment.
Such a relation can be described as follows: $C_{jk} = E_k / n_k$, where $C_{jk}$ is the size of class $j$ in school $k$, $E_k$ is the enrollment of school $k$, and $n_k$ is the number of classes in school $k$. Empirical results from both strategies confirm the negative relation between class size and test scores. However, the effect is larger for the RD strategy and smaller for the first strategy with smaller class sizes.

Conclusion and Prospects

In the previous sections we mainly reviewed the origins and development of the RD design along methodological lines, concentrating on applications to the economics of education. Several recent applications to value-added problems were also summarized. Considering the research question at hand, we can develop the work in several potential directions, though not all of them will appear in the remaining sections.

Firstly, we could go deeper into the problem of sorting or manipulation of the treatment-determining variables. In our problem, it seems that the assignment to model schools is partially determined by scores on the high school entrance exam, so a fuzzy RD design is feasible. But though the exact cutoff remains unknown before the exam, the rules are public knowledge and information from previous years is also available. In that case endogenous sorting might invalidate the RD design and lead to biased estimates 37.

Secondly, we could also exploit the partially fuzzy RD design in our problem. It seems that the cutoff on the high school entrance exam is just an eligibility threshold rather than an actual treatment threshold. Students with scores above the cutoff can still decline the treatment for reasons of their own.

Thirdly, we could compare empirical results from the RD design with those from other methods, say matching or IV. Several relevant aspects can also be considered, such as testing the randomization property near the cutoff and the actual treatment-receiving mechanism for the fuzzy RD design.
Finally, combining empirical results with the actual educational policies implemented in China, we could try to offer some policy suggestions. At least in China there are few studies on value-added problems which use the RD design framework. So if some interesting and unique results are obtained, they may also be very meaningful in a political sense.

36 Limited to the sample of 97 schools that were eligible for the treatment, the correlation between the correct matriculation rate and the measurement error is very low. In the end a sample of 29 schools is used. Of these, 17 schools with a correct matriculation rate higher than the cutoff were chosen for the treatment erroneously, while the other 12 schools have a similar correct matriculation rate but were not treated because their measurement errors, if any, were not negative enough.
37 See Urquiola and Verhoogen (2007) for a recent study on the effects of class size focusing on the endogenous sorting problem.

3 Empirical Backgrounds

3.1 Test Scores and Model School Program

Entrance Exam of High Schools in Beijing

The entrance exam of high schools is held once per year in summer (usually in late June). Its score serves as the only criterion for enrollment in high schools in most cases 38. Middle school students can participate in the entrance exam if they satisfy one of the following criteria. First, the student has citizenship of Beijing. He or she can be a third-grade student or an already graduated student younger than 18 years old. Second, there are several exceptions for those without citizenship of Beijing, such as children of post-doctoral researchers. In our sample from Daxin District few students fall under these exceptions, so we will not describe them in detail 39.

The questions or items in the exam are the same for all students. Six subjects are involved: Chinese, Mathematics, English 40, Physics, Chemistry and PE. The full scores are 120, 120, 120, 100, 80 and 30 respectively, so the total full score is 570.

Admission to High Schools

Before the exam, students should submit their choices, which include at most eight schools in order of preference. High schools will admit students according to their scores and choices. Here is an example to illustrate the process. Suppose that School A plans to admit 10 students. Then it will rank the students who list it as their first choice by score and admit the first 10. Students who are not admitted by School A will be returned to the pool and wait for consideration by the remaining schools in their choices. If fewer than 10 students list School A as their first choice, say 8, then School A will admit all of them and rank by score the students who remain in the pool and list it as their second choice. The first 2 will be admitted.
If there are still vacancies, the same process will be followed for students who list School A as their third choice, and so on. If School A still has vacancies after considering students who list it as their eighth choice, it can contact those who remain in the pool but did not list it. The ranking of students is based on scores that already include the special extra points mentioned above. If two students have the same score, there are several priorities to distinguish them 41. If they are still tied, they will be ranked by their scores in Mathematics, Chinese and English sequentially. If the scores in all of these subjects are the same for two students, the student with the smaller pre-assigned random serial number will be admitted.

38 There are two kinds of exceptions. One is that students with excellent awards, such as the Jin Fan and Yin Fan awards, can enter high schools assigned in advance. The other is that students satisfying several conditions, such as belonging to a minority or being the child of a martyr, can obtain additional points.
39 All of the observations involved in this study are eligible for the entrance exam. We do not have exact data concerning the second criterion. However, we can explore it indirectly. For example, in our sample the proportions of parents who hold a graduate degree are only approximately 0.5% and 0.6% for fathers and mothers respectively. For post-doctoral researchers the proportions are much smaller than those. Furthermore, in our sample only 3.6% of students do not study in their assigned middle schools. Students without citizenship of Beijing belong to that group but are not the sole reason for it, so the proportion of these students will be even less than 3.6%.
40 Very few students in special schools will choose other foreign languages, such as French or Spanish.
41 These priorities include children of servicemen and diplomats.
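The admission procedure described in this subsection can be illustrated with a minimal Python sketch. It rests on one reading of the text, namely that all schools process first choices, then second choices, and so on, round by round; this ordering is an assumption. The names `Student`, `rank_key` and `admit` are hypothetical, the special priorities in footnote 41 are left out, and the final step in which a school contacts students who did not list it is not modeled.

```python
from dataclasses import dataclass

@dataclass
class Student:
    serial: int      # pre-assigned random serial number
    total: float     # total exam score, including any extra points
    math: float
    chinese: float
    english: float
    choices: list    # up to eight school names, in order of preference

def rank_key(s):
    # Higher scores first: total, then Mathematics, Chinese, English.
    # On a complete tie the smaller serial number is ranked first.
    return (-s.total, -s.math, -s.chinese, -s.english, s.serial)

def admit(students, quotas, max_choices=8):
    """Round k considers only the k-th choice of students still in the pool."""
    vacancies = dict(quotas)
    assigned = {}
    pool = list(students)
    for k in range(max_choices):
        # Collect this round's applicants for schools that still have seats.
        applicants = {}
        for s in pool:
            if k < len(s.choices) and vacancies.get(s.choices[k], 0) > 0:
                applicants.setdefault(s.choices[k], []).append(s)
        for school, group in applicants.items():
            group.sort(key=rank_key)
            admitted = group[: vacancies[school]]
            for s in admitted:
                assigned[s.serial] = school
            vacancies[school] -= len(admitted)
        # Students not admitted in this round return to the pool.
        pool = [s for s in pool if s.serial not in assigned]
    return assigned
```

For instance, with `quotas = {"A": 2, "B": 1}` and three students listing School A first, the two highest-ranked are admitted to A and the third returns to the pool, where he or she is considered by B only if B still has a vacancy after its own first-choice round. This feature is what makes the order of listed choices matter for students near a school's cutoff.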


More information

Regression Discontinuity Designs in Economics

Regression Discontinuity Designs in Economics Regression Discontinuity Designs in Economics Dr. Kamiljon T. Akramov IFPRI, Washington, DC, USA Training Course on Applied Econometric Analysis September 13-23, 2016, WIUT, Tashkent, Uzbekistan Outline

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University Joint

More information

Estimating the Dynamic Effects of a Job Training Program with M. Program with Multiple Alternatives

Estimating the Dynamic Effects of a Job Training Program with M. Program with Multiple Alternatives Estimating the Dynamic Effects of a Job Training Program with Multiple Alternatives Kai Liu 1, Antonio Dalla-Zuanna 2 1 University of Cambridge 2 Norwegian School of Economics June 19, 2018 Introduction

More information

Statistical Models for Causal Analysis

Statistical Models for Causal Analysis Statistical Models for Causal Analysis Teppei Yamamoto Keio University Introduction to Causal Inference Spring 2016 Three Modes of Statistical Inference 1. Descriptive Inference: summarizing and exploring

More information

Potential Outcomes Model (POM)

Potential Outcomes Model (POM) Potential Outcomes Model (POM) Relationship Between Counterfactual States Causality Empirical Strategies in Labor Economics, Angrist Krueger (1999): The most challenging empirical questions in economics

More information
