TESTING IDENTIFYING ASSUMPTIONS IN FUZZY REGRESSION DISCONTINUITY DESIGN


YOICHI ARAI (a), YU-CHIN HSU (b), TORU KITAGAWA (c), ISMAEL MOURIFIÉ (d), YUANYUAN WAN (e)

GRIPS, ACADEMIA SINICA, UCL, UNIVERSITY OF TORONTO

Date: Wednesday 25th January,
*This paper is based on independent work of Arai and Kitagawa (2016) and Hsu, Mourifié, and Wan (2016).
a. National Graduate Institute for Policy Studies (GRIPS), yarai@grips.ac.jp.
b. Institute of Economics, Academia Sinica, ychsu@econ.sinica.edu.tw.
c. Department of Economics, University College London, t.kitagawa@ucl.ac.uk.
d. Department of Economics, University of Toronto, ismael.mourifie@utoronto.ca.
e. Department of Economics, University of Toronto, yuanyuan.wan@utoronto.ca.

ABSTRACT. We propose a test for the key identification and estimation conditions in the Fuzzy Regression Discontinuity (FRD) design. We characterize the set of sharp testable implications of the FRD assumptions and propose a test of these implications that is uniformly valid over a class of distributions, is consistent against all fixed alternatives, and has non-trivial power against some local alternatives.

Keywords: Fuzzy Regression Discontinuity Design, Heterogeneous Treatment Effect, Nonparametric Test, Inequality Restriction

1. INTRODUCTION

In recent years the Regression Discontinuity (RD) design, first introduced by Thistlethwaite and Campbell (1960), has become one of the most widely used quasi-experimental methods for identifying and estimating the causal effects of policy interventions in applied research. For example, Van der Klaauw (2002) estimates the effect of financial aid offers on college enrollment, Angrist and Lavy (1999) estimate the effect of class size on student performance, Lee (2008) investigates the electoral advantage of incumbency in the United States House of Representatives, and DiNardo and Lee (2004) study the economic impacts of unionization on employers, among many others. See Imbens and Lemieux (2008) and Lee and Lemieux (2010) for more complete surveys.

Parallel to its wide application in empirical research, the econometric theory of identification and estimation in the RD design has also developed. Hahn, Todd, and Van der Klaauw (2001, HTV) establish conditions under which the (Fuzzy) RD design identifies the average treatment effect for certain subpopulations in the presence of heterogeneity, Porter (2003) discusses estimation of RD

design, Lee (2008) and McCrary (2008) discuss testing the specification of the RD design by examining the continuity of the density of the running variable or of other pre-determined variables, Frandsen, Frölich, and Melly (2012) study identification and estimation of quantile treatment effects, Calonico, Cattaneo, and Titiunik (2014) provide robust confidence intervals for treatment effects, Dong (2014) considers cases in which the first-stage regression can exhibit either a jump or a kink at the cutoff point, and recently Gerard, Rokkanen, and Rothe (2016) consider partial identification in the RD design when the running variable can potentially be manipulated.

Different sets of identifying assumptions have been proposed in the literature. HTV is one of the earliest papers to formally provide assumptions that non-parametrically identify heterogeneous treatment effects in the RD design. To identify the average treatment effect for the group of compliers (in the language of Imbens and Angrist, 1994), HTV impose two main assumptions: (i) independence between the potential variables and the running variable near the cutoff (local independence) and (ii) a monotonic response of the treatment to the running variable near the cutoff (local monotonicity). These two assumptions play a role analogous to that of the independence and monotonicity assumptions in Imbens and Angrist (1994) for identification of the local average treatment effect (LATE). Frandsen, Frölich, and Melly (2012, FFM) relax the local independence assumption by assuming that the conditional distribution (CDF) of potential outcomes given complying status and the running variable is continuous in the latter near the cutoff (which FFM refer to as local smoothness). Under local smoothness and local monotonicity,1 FFM identify the distribution of potential outcomes of compliers and hence the local quantile treatment effect (LQTE).

1 The local monotonicity assumptions used in FFM and HTV have different forms but coincide in the limit. For now we refer to both as local monotonicity. See more discussions in Section

In this paper, we aim to test the sharp implications of local monotonicity and an FFM-type local smoothness assumption (see the main text for formal definitions). We focus on FFM-type local smoothness for two reasons. First, it is weaker than local independence and is empirically more relevant in some applications, as discussed in Dong (2016). Second, the sharp characterization of local independence plus local monotonicity is in fact the same as that of local smoothness plus local monotonicity combined with other suitable conditions, as we show in Section 7.1.

Our paper contributes to the RD design literature by providing "sharp testable implications" of the RD assumptions, that is, it provides the most informative set of testable implications for detecting

observable violations of the Fuzzy Regression Discontinuity (FRD) design assumptions. To the best of our knowledge, this is the first such result in the RD design literature. Interestingly, analogous to the observation that Angrist, Imbens, and Rubin (1996) and Kitagawa (2015) make in the instrumental variable framework, our sharp testable implication can also be interpreted as the non-negativity of the potential outcome density functions for the compliers at the cut-off. Recently, Fiorini and Stevens (2014) discuss the importance of specification testing in the FRD design, suggest a testing idea based on Kitagawa (2015) and Huber and Mellace (2015), and interpret it as a test of local monotonicity under the assumption that local independence holds. Fiorini and Stevens (2014), however, do not discuss the sharpness of the testable implications, and they test different assumptions from ours.

We propose nonparametric tests for the sharp implications of the FRD assumptions. We construct the test statistics using the local polynomial method (Fan and Gijbels, 1992) and propose two methods to compute the critical values: a multiplier bootstrap method and a pooled bootstrap method. We combine both procedures with the generalized moment selection (GMS) method (see Andrews and Soares, 2010; Andrews and Shi, 2013) to improve the power of the test. The proposed tests are uniformly valid over a class of distributions, are consistent against all fixed alternatives, and have non-trivial power against some local alternatives. In the main theorems we establish the results without additional covariates; in Section 7.3 we show that our testable implications and tests can be adjusted straightforwardly to incorporate them.

Our test focuses on different aspects from existing specification tests in the FRD framework, which often test the continuity of the density function of the running variable and/or of other predetermined variables around the cutoff (see the discussion in Lee, 2008). Rejecting the null hypothesis of continuity is interpreted as evidence of manipulation of the design. McCrary (2008) proposes a formal test for continuity of the density of the running variable at the cutoff point. It is worth noting that continuity of the density of the running variable (and of other predetermined variables) is neither sufficient nor necessary for identification of the LATE or the LQTE. In contrast, we test the necessary and sufficient conditions on the distribution of the observable variables for rationalizing the FRD design model under analysis. Our test is therefore substantially different from the tests suggested by McCrary (2008) and Lee (2008) and is a useful complement to the literature. We will discuss this in more detail in Section

Our paper also contributes to the growing literature on specification tests in causal inference. Recently, Kitagawa (2008, 2015), Huber and Mellace (2015), and Mourifié and Wan (2016) propose testing procedures to assess the validity of key assumptions in the instrumental variable model discussed in Imbens and Angrist (1994). Even though the parameter that one can identify in an FRD design can be interpreted as a LATE, and some identifying assumptions play roles similar to those of their counterparts in instrumental variable models, the FRD assumptions are in general invoked differently, and the above-mentioned tests cannot be directly applied to assess their validity. A formal test of the FRD assumptions is valuable because it gives empirical researchers a tool to assess the validity of those assumptions, rather than relying only on economic intuition or on specification tests that are not very informative.

The rest of the paper is organized as follows. In Section 2 we lay out the main identifying assumptions of interest, and we derive their sharp characterization in Section 3. We formally provide a test statistic and show how to obtain its critical values in Section 4. In Section 7 we discuss several extensions. All mathematical proofs are deferred to the Appendix.

2. ANALYTICAL FRAMEWORK AND IDENTIFICATION

Consider the potential outcome model introduced in Rubin (1974). Let $Y = Y_1 D + Y_0 (1 - D)$, where $Y \in \mathcal{Y} \subseteq \mathbb{R}$ is the observed outcome, $D \in \{0, 1\}$ is the observed treatment indicator, and $(Y_1, Y_0)$ are the potential outcomes. We define the potential treatment status when the running variable $R \in \mathcal{R} \subseteq \mathbb{R}$ is externally set to $r$ as $D(r)$, so the observed treatment status is $D = D(R)$. The main research interest in such models is to identify the causal impact of the treatment $D$ on the outcome $Y$. Let $P$ be the joint distribution of $(Y, D, R)$ and let $F_R(r)$ denote the marginal distribution function of $R$. We suppress covariates $X$ for now and show in Section 7.3 that it is straightforward to include covariates in the model.

The main feature of the RD design is the existence of a running variable $R$ that influences the probability of being treated in a discontinuous way when it takes values around a threshold $r_0$. Two main types of discontinuity design are considered in the literature: the sharp design and the fuzzy design. In the sharp RD design, the potential treatment $D(r)$ is deterministic, while in the FRD design, $D(r)$ is a Bernoulli random variable whose probability of taking the value 1 varies discontinuously at $r_0$. This paper considers the FRD design.

Various sets of assumptions have been proposed in the literature to identify the LATE and the LQTE in the FRD design context; see HTV for the LATE and FFM for the LQTE. Identification of the LQTE is especially important for the analysis of heterogeneous treatment effects. As mentioned by Imbens and Rubin (1997) and Angrist and Pischke (2008), such an object is useful for policy makers who want to take into account differences in the dispersion of the outcome of interest when weighing the merits of one program or treatment against another. In this paper we focus mainly on the testability of the FFM-type assumptions (local smoothness and monotonicity), for at least two reasons: (i) under the FFM-type assumptions, the distribution of potential outcomes for the subpopulation of compliers, and hence its density when well defined, is identified, which enables us to recover a wider class of parameters of interest in addition to the LATE and the LQTE; and (ii) the economic intuition behind the set of assumptions required to identify only the LATE is similar to that behind the FFM-type assumptions required to identify the LQTE. Furthermore, we will also discuss the testability of a variant of the FFM-type assumptions (local independence instead of local smoothness) that is often invoked to identify the LQTE.

We follow FFM and define $D^+ = \lim_{r \downarrow r_0} D(r)$ and $D^- = \lim_{r \uparrow r_0} D(r)$, if the limits exist. Based on this definition, we can define the compliance status as shown in the following table:

TABLE 1. Subpopulations

Type                 D^-   D^+   Proportion
A: Always-takers      1     1    π_11
D: Defiers            1     0    π_10
C: Compliers          0     1    π_01
N: Never-takers       0     0    π_00

We use $\pi_{ij}$, $i, j \in \{0, 1\}$, to denote the probability mass of the corresponding groups. There may exist a subpopulation for which $D^+$ and $D^-$ are not simultaneously well defined; however, as in FFM, we implicitly maintain that such a subpopulation has null mass, that is, $\pi_{11} + \pi_{10} + \pi_{01} + \pi_{00} = 1$.

Throughout the paper, we implicitly assume that there are some observations close to the cutoff point $r_0$. Hereafter, let $E_P$ be the expectation under $P$ and let $P_P$ denote the probability under $P$.
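To make the compliance types of Table 1 concrete, the following is a minimal simulation sketch, not part of the original paper. The function name `simulate_frd`, the type shares, the uniform running variable, and the constant treatment effect of 1 are all hypothetical choices made for illustration; the design has no defiers, so the jump in the treatment probability at the cutoff equals the complier share $\pi_{01}$.

```python
import numpy as np

def simulate_frd(n, r0=0.0, shares=(0.2, 0.5, 0.3), seed=0):
    """Draw (Y, D, R) from a toy FRD model with no defiers.

    shares = (always-takers, compliers, never-takers); hypothetical values.
    """
    rng = np.random.default_rng(seed)
    R = rng.uniform(-1.0, 1.0, size=n)                 # running variable
    T = rng.choice(np.array(["A", "C", "N"]), size=n, p=shares)
    D_minus = (T == "A").astype(int)                    # treatment just below r0
    D_plus = ((T == "A") | (T == "C")).astype(int)      # treatment just above r0
    D = np.where(R >= r0, D_plus, D_minus)              # observed D = D(R)
    Y0 = rng.normal(0.0, 1.0, size=n)                   # potential outcome, untreated
    Y1 = Y0 + 1.0                                       # potential outcome, treated
    Y = Y1 * D + Y0 * (1 - D)                           # observed outcome
    return Y, D, R, T

Y, D, R, T = simulate_frd(200_000)
jump = D[R >= 0].mean() - D[R < 0].mean()               # close to the complier share
```

Because the types are drawn independently of $R$ here, the sketch satisfies local independence, which is stronger than the local smoothness condition used below.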

Assumption 1 (Discontinuity). The limits $\pi^+ \equiv \lim_{r \downarrow r_0} P_P(D = 1 \mid R = r)$ and $\pi^- \equiv \lim_{r \uparrow r_0} P_P(D = 1 \mid R = r)$ exist and $\pi^+ \neq \pi^-$.

Assumption 1 represents the main feature of the FRD design: the probability of receiving treatment differs on either side of the cutoff value. Let $T$ denote the complying status and let A, C, and N denote always-takers, compliers, and never-takers, respectively.

Assumption 2 (Local Smoothness). For $d = 0, 1$, $t \in \{A, C, N\}$, and any measurable set $B \subseteq \mathcal{Y}$, the conditional probability $P_P(Y_d \in B, T = t \mid R = r)$ is continuous in $r$ at $r_0$.

Assumption 3 (Local Monotonicity-1). $\lim_{r \to r_0} P_P(D^+ \geq D^- \mid R = r) = 1$.

Assumption 2 essentially plays the same role as FFM Assumption I2.2 The local smoothness condition is imposed on the conditional distribution rather than on the conditional mean (as in the spirit of HTV), and this stronger requirement allows us to identify a wider class of parameters of interest, such as the LQTE. Assumption 3 is the same as FFM Assumption I3, since we already assume that the subpopulation for which $D^+$ and $D^-$ are not well defined has mass zero. The direction of Assumption 3 is without loss of generality, and the assumption implies that $\pi_{10} = 0$. Together with Assumption 1, it also implies that $\pi^+ > \pi^-$.

2 FFM-I2 assumes that (i) $F_{Y_d \mid D^+, D^-, R}(y \mid d^+, d^-, r; P)$ is continuous in $r$ at $r_0$ for $d^+, d^- \in \{0, 1\}$ and for all $y \in \mathcal{Y}$, and (ii) $E_P[D^+ \mid R = r]$ and $E_P[D^- \mid R = r]$ are both continuous in $r$ at $r_0$, where $F_{Y_d \mid D^+, D^-, R}(y \mid d^+, d^-, r; P)$ denotes the conditional distribution of $Y_d$ given $D^+ = d^+$, $D^- = d^-$, and $R = r$ under $P$.

Let $1\{\cdot\}$ denote the indicator function. It can be shown that under Assumptions 1 to 3 (the argument follows the proof of FFM's Lemma 1 and is omitted here), the compliers' potential outcome distributions are identified:

$$F_{Y_1 \mid C, R = r_0}(y; P) = \frac{\lim_{r \downarrow r_0} E_P[1\{Y \leq y\} D \mid R = r] - \lim_{r \uparrow r_0} E_P[1\{Y \leq y\} D \mid R = r]}{\lim_{r \downarrow r_0} E_P[D \mid R = r] - \lim_{r \uparrow r_0} E_P[D \mid R = r]}, \quad (1)$$

and

$$F_{Y_0 \mid C, R = r_0}(y; P) = \frac{\lim_{r \uparrow r_0} E_P[1\{Y \leq y\}(1 - D) \mid R = r] - \lim_{r \downarrow r_0} E_P[1\{Y \leq y\}(1 - D) \mid R = r]}{\lim_{r \downarrow r_0} E_P[D \mid R = r] - \lim_{r \uparrow r_0} E_P[D \mid R = r]}. \quad (2)$$

If we strengthen Assumption 2 to local independence, then it is possible to identify a wider class of parameters. See Section 7.1 for more discussion.
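The identification formulas (1) and (2) suggest a plug-in estimator in which each one-sided limit is replaced by a boundary regression estimate. The sketch below is only illustrative and is not the authors' implementation: the helper names `boundary_fit` and `complier_cdfs`, the use of a single bandwidth `h` on both sides, and the triangular kernel are assumptions made here.

```python
import numpy as np

def boundary_fit(x, v, h, side):
    """Local linear estimate of the one-sided limit of E[v | R = r0 + x] as x -> 0.

    x is the centered running variable R - r0; side is "+" or "-".
    Uses a triangular kernel with bandwidth h (illustrative choice).
    """
    mask = (x >= 0) if side == "+" else (x < 0)
    xs, vs = x[mask], v[mask]
    w = np.clip(1.0 - np.abs(xs / h), 0.0, None)        # triangular kernel weights
    X = np.column_stack([np.ones(xs.size), xs])         # intercept and slope
    WX = X * w[:, None]
    coef = np.linalg.solve(X.T @ WX, WX.T @ vs)         # weighted least squares
    return coef[0]                                      # intercept = boundary value

def complier_cdfs(Y, D, R, r0, h, grid):
    """Plug-in versions of equations (1) and (2) on a grid of y values."""
    x = R - r0
    Df = D.astype(float)
    denom = boundary_fit(x, Df, h, "+") - boundary_fit(x, Df, h, "-")
    F1, F0 = [], []
    for y in grid:
        ind = (Y <= y).astype(float)
        num1 = boundary_fit(x, ind * Df, h, "+") - boundary_fit(x, ind * Df, h, "-")
        num0 = boundary_fit(x, ind * (1 - Df), h, "-") - boundary_fit(x, ind * (1 - Df), h, "+")
        F1.append(num1 / denom)
        F0.append(num0 / denom)
    return np.array(F1), np.array(F0)
```

In finite samples the resulting estimates need not be monotone or lie in [0, 1]; a systematic violation of that kind is exactly what the testable implications in the next section are designed to detect.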

3. SHARP TESTABLE IMPLICATIONS

In this section, we show that the assumptions above together imply a set of restrictions on the data $(Y, D, R)$, which gives the model empirical content. Let $\mathcal{B}_{\mathcal{Y}}$ be the collection of all Borel sets generated from $\mathcal{Y}$ and let $B$ be an arbitrary element of $\mathcal{B}_{\mathcal{Y}}$. The following proposition states the testable implication of Assumptions 1 to 3.

Proposition 1. Let Assumptions 1 to 3 hold. Then

$$\lim_{r \uparrow r_0} E_P[1\{Y \in B\} D \mid R = r] - \lim_{r \downarrow r_0} E_P[1\{Y \in B\} D \mid R = r] \leq 0, \quad (3)$$

$$\lim_{r \downarrow r_0} E_P[1\{Y \in B\}(1 - D) \mid R = r] - \lim_{r \uparrow r_0} E_P[1\{Y \in B\}(1 - D) \mid R = r] \leq 0. \quad (4)$$

For ease of exposition, we collect all the proofs in the appendix. Inequalities (3) and (4) represent the testable implications of Assumptions 1 to 3. While these inequalities form a basis for assessing the validity of the FRD assumptions, some important questions are worth discussing. First, from an identification point of view, are inequalities (3) and (4) sharp testable implications of Assumptions 1 to 3? Could we identify a smaller set of inequalities that contains all the information delivered by inequalities (3) and (4) regarding the testability of Assumptions 1 to 3? This is related to the issue of finding a low (ideally the lowest) cardinality class of sets that can characterize all the restrictions on the data imposed by the RD assumptions.3 Second, from both an inferential and an empirical point of view, the set of inequalities (3) and (4) is difficult to analyze; for example, the class of functions indexed by all Borel sets need not be Donsker, and it is certainly desirable to reduce the cardinality of the class of sets. Theorem 1 addresses these concerns, and we discuss the statistical properties of our tests in detail in Section 4.

3 See Galichon and Henry (2006, 2011) and Chesher and Rosen (2015) for discussion of core determining classes.

Theorem 1. Let $\{D(r), Y_1, Y_0\}$ be a vector of unobserved potential treatments and outcomes, and let $(Y, D, R)$ be the vector of observed variables, where $Y = Y_1 D + Y_0 (1 - D)$ and the observed $D$ is defined as $D \equiv D(R)$. Let $r_0$ be the known cut-off point.

(i) Under Assumptions 1 to 3, the following inequalities hold:

$$\eta_{P,1}(y, y') = \lim_{r \uparrow r_0} E_P[1\{y \leq Y \leq y'\} D \mid R = r] - \lim_{r \downarrow r_0} E_P[1\{y \leq Y \leq y'\} D \mid R = r] \leq 0, \quad (5)$$

$$\eta_{P,0}(y, y') = \lim_{r \downarrow r_0} E_P[1\{y \leq Y \leq y'\}(1 - D) \mid R = r] - \lim_{r \uparrow r_0} E_P[1\{y \leq Y \leq y'\}(1 - D) \mid R = r] \leq 0, \quad (6)$$

for all $y, y' \in \mathcal{Y}$, and the inequalities hold strictly if $\mathcal{Y} \subseteq [y, y']$.

(ii) If inequalities (5) and (6) hold, there exists a joint distribution of $(\tilde{D}(r), \tilde{Y}_1, \tilde{Y}_0)$ such that Assumptions 1 to 3 hold, and the conditional distributions of $(\tilde{Y}, \tilde{D}) \mid R = r$ and $(Y, D) \mid R = r$ satisfy

$$\lim_{r \uparrow r_0} F_{Y, D \mid R = r} = \lim_{r \uparrow r_0} F_{\tilde{Y}, \tilde{D} \mid R = r} \quad \text{and} \quad \lim_{r \downarrow r_0} F_{Y, D \mid R = r} = \lim_{r \downarrow r_0} F_{\tilde{Y}, \tilde{D} \mid R = r}, \quad (7)$$

that is, the observed distribution and the counterfactual distribution are the same as $r$ approaches $r_0$ from each side.

Remark: Part (i) of Theorem 1 gives a necessary condition that the distribution of the observable variables must satisfy under the FRD assumptions. Part (ii) is more important. It shows that inequalities (5) and (6) are the most informative way to screen out all observable violations of the FRD assumptions. First, this means that we do not lose any information by reducing the collection of Borel sets $\mathcal{B}_{\mathcal{Y}}$ to the set of closed intervals. Second, inequalities (5) and (6) are necessary and sufficient conditions for the distribution of the observable variables $(Y, D, R)$ to rationalize the FRD design entertained here, i.e., the potential outcome model and Assumptions 1 to 3. To the best of our knowledge, this paper is the first in the literature to provide a set of inequalities that are the most informative for screening out violations of the FRD assumptions.

Remark: Economic interpretation of the "sharp" testable implications. Following Angrist, Imbens, and Rubin (1996) and Kitagawa (2008, 2015), we interpret the testable implication of these assumptions as the non-negativity of the compliers' potential outcome distributions. Consider a closed interval $[y, y + h]$ for some $h > 0$. Following the argument in the proof of Proposition 1, inequality

(8) implies that

$$\lim_{r \downarrow r_0} E_P[1\{y \leq Y \leq y + h\} D \mid R = r] - \lim_{r \uparrow r_0} E_P[1\{y \leq Y \leq y + h\} D \mid R = r] = E_P[1\{y \leq Y_1 \leq y + h\} \mid C, R = r_0] \, P_P(C \mid R = r_0) \geq 0.$$

Dividing both sides by $h$ and letting $h \to 0$ (supposing that we can switch the order of the limits and that the corresponding densities are well defined), we have

$$\lim_{r \downarrow r_0} P_P(D = 1 \mid R = r) f_{Y \mid D, R}(y \mid 1, r; P) - \lim_{r \uparrow r_0} P_P(D = 1 \mid R = r) f_{Y \mid D, R}(y \mid 1, r; P) = f_{Y_1 \mid C, R}(y \mid r_0; P) \, P_P(C \mid R = r_0) \geq 0,$$

where $f_{Y_1 \mid C, R}(y \mid r; P)$ denotes the probability density function of the potential outcome $Y_1$ for the subpopulation of compliers (when it exists). Therefore, our test can be interpreted as a test of the non-negativity of the potential outcome density functions for the compliers at the cut-off.

4. TESTING

Assume that we observe a random sample of $(Y, D, R)$ of size $n$ generated from the distribution $P$. Let $F_R$ denote the marginal distribution of $P$ on $R$. Let $n_+$ and $n_-$ denote the numbers of observations such that $R \geq r_0$ and $R < r_0$, respectively. Throughout the paper, we assume that $n_+ / n$ converges to a constant bounded away from 0 and 1.

For the purpose of inference, we define a class of inequalities equivalent to (5) and (6). Let $\mathcal{G}$ be the set of indicator functions of a class of closed intervals $C_l \subseteq \mathcal{Y}$, $l \in L$:

$$\mathcal{G} = \{g_l(\cdot) = 1\{\cdot \in C_l\} : l \in L\}.$$

We will later propose different choices of the interval class $L$ such that our test statistic has desirable asymptotic properties and is easy to compute. As we show in Corollary 5 in the appendix, the set of inequalities (5) and (6) is equivalent to inequalities (8) and (9) below:

$$\nu_{P,1}(l) = \lim_{r \uparrow r_0} E_P[g_l(Y) D \mid R = r] - \lim_{r \downarrow r_0} E_P[g_l(Y) D \mid R = r] \leq 0, \quad (8)$$

$$\nu_{P,0}(l) = \lim_{r \downarrow r_0} E_P[g_l(Y)(1 - D) \mid R = r] - \lim_{r \uparrow r_0} E_P[g_l(Y)(1 - D) \mid R = r] \leq 0, \quad (9)$$

for all $l \in L$. We therefore define the hypotheses $H_0$ and $H_1$ as

$$H_0: \nu_{P,1}(l) \leq 0 \text{ and } \nu_{P,0}(l) \leq 0 \text{ for all } l \in L, \qquad H_1: H_0 \text{ does not hold}. \quad (10)$$

4.1. Test Statistics. We construct our test statistic from standardized nonparametric estimates of $\nu_{P,1}(l)$ and $\nu_{P,0}(l)$. To be specific, let

$m_{P,1,+}(l) = \lim_{r \downarrow r_0} E_P[g_l(Y) D \mid R = r]$, $\quad m_{P,1,-}(l) = \lim_{r \uparrow r_0} E_P[g_l(Y) D \mid R = r]$,
$m_{P,0,+}(l) = \lim_{r \downarrow r_0} E_P[g_l(Y)(1 - D) \mid R = r]$, $\quad m_{P,0,-}(l) = \lim_{r \uparrow r_0} E_P[g_l(Y)(1 - D) \mid R = r]$.

Then we can estimate $\nu_{P,1}(l)$ and $\nu_{P,0}(l)$ respectively by

$$\hat{\nu}_1(l) = \hat{m}_{1,-}(l) - \hat{m}_{1,+}(l), \qquad \hat{\nu}_0(l) = \hat{m}_{0,+}(l) - \hat{m}_{0,-}(l),$$

where the right-hand-side terms $\hat{m}_{1,+}(l)$, $\hat{m}_{1,-}(l)$, $\hat{m}_{0,+}(l)$, and $\hat{m}_{0,-}(l)$ are local linear estimators. These can be constructed as the constant terms $\hat{a}_{1,+}(l)$, $\hat{a}_{1,-}(l)$, $\hat{a}_{0,+}(l)$, and $\hat{a}_{0,-}(l)$ in the regressions

$$(\hat{a}_{1,+}(l), \hat{b}_{1,+}(l)) = \arg\min_{a, b} \sum_{i=1}^{n} 1(R_i \geq r_0) \, K\!\left(\frac{R_i - r_0}{h_+}\right) \left[g_l(Y_i) D_i - a - b (R_i - r_0)\right]^2,$$

$$(\hat{a}_{1,-}(l), \hat{b}_{1,-}(l)) = \arg\min_{a, b} \sum_{i=1}^{n} 1(R_i < r_0) \, K\!\left(\frac{R_i - r_0}{h_-}\right) \left[g_l(Y_i) D_i - a - b (R_i - r_0)\right]^2,$$

$$(\hat{a}_{0,+}(l), \hat{b}_{0,+}(l)) = \arg\min_{a, b} \sum_{i=1}^{n} 1(R_i \geq r_0) \, K\!\left(\frac{R_i - r_0}{h_+}\right) \left[g_l(Y_i)(1 - D_i) - a - b (R_i - r_0)\right]^2,$$

$$(\hat{a}_{0,-}(l), \hat{b}_{0,-}(l)) = \arg\min_{a, b} \sum_{i=1}^{n} 1(R_i < r_0) \, K\!\left(\frac{R_i - r_0}{h_-}\right) \left[g_l(Y_i)(1 - D_i) - a - b (R_i - r_0)\right]^2,$$

where $K(\cdot)$ is a symmetric kernel function and $(h_+, h_-)$ are the bandwidths for the regressions above and below the threshold, respectively. In particular, we let $h_+ = c_+ h$ and $h_- = c_- h$, where $(c_+, c_-)$ are positive constants and $h \to 0$. We follow the RD literature and suggest using the triangular kernel for boundary local linear estimators. As in Fan and Gijbels (1992), we write the local linear estimators as

$$\hat{m}_{1,+}(l) = \sum_{i=1}^{n} w^+_{ni} \, g_l(Y_i) D_i, \qquad \hat{m}_{1,-}(l) = \sum_{i=1}^{n} w^-_{ni} \, g_l(Y_i) D_i, \quad (11)$$

$$\hat{m}_{0,+}(l) = \sum_{i=1}^{n} w^+_{ni} \, g_l(Y_i)(1 - D_i), \qquad \hat{m}_{0,-}(l) = \sum_{i=1}^{n} w^-_{ni} \, g_l(Y_i)(1 - D_i), \quad (12)$$

where the weights are

$$w^+_{ni} = \frac{1(R_i \geq r_0) \, K\!\left(\frac{R_i - r_0}{h_+}\right) \left[S^+_{n,2} - S^+_{n,1} (R_i - r_0)\right]}{\sum_{i=1}^{n} 1(R_i \geq r_0) \, K\!\left(\frac{R_i - r_0}{h_+}\right) \left[S^+_{n,2} - S^+_{n,1} (R_i - r_0)\right]}, \qquad w^-_{ni} = \frac{1(R_i < r_0) \, K\!\left(\frac{R_i - r_0}{h_-}\right) \left[S^-_{n,2} - S^-_{n,1} (R_i - r_0)\right]}{\sum_{i=1}^{n} 1(R_i < r_0) \, K\!\left(\frac{R_i - r_0}{h_-}\right) \left[S^-_{n,2} - S^-_{n,1} (R_i - r_0)\right]},$$

and, for $j = 0, 1, 2, \ldots$,

$$S^+_{n,j} = \sum_{i=1}^{n} 1(R_i \geq r_0) \, K\!\left(\frac{R_i - r_0}{h_+}\right) (R_i - r_0)^j, \qquad S^-_{n,j} = \sum_{i=1}^{n} 1(R_i < r_0) \, K\!\left(\frac{R_i - r_0}{h_-}\right) (R_i - r_0)^j.$$

Finally, let $\hat{\sigma}_1(l)$ and $\hat{\sigma}_0(l)$ be consistent estimators of the asymptotic standard errors of $\hat{\nu}_1$ and $\hat{\nu}_0$, respectively. We can then define a Kolmogorov-Smirnov (KS) type test statistic as

$$\hat{S}_n = \sqrt{nh} \sup_{d = 0, 1, \; l \in L} \frac{\hat{\nu}_d(l)}{\max\{\hat{\sigma}_d(l), \xi_d\}}, \quad (13)$$

where $\xi_d$, $d = 0, 1$, are trimming constants chosen by the researcher.

4.2. Critical Values. We propose two resampling procedures to obtain critical values. To estimate the $(1 - \alpha)$-th quantile of the asymptotic distribution of $\hat{S}_n$, we consider a multiplier bootstrap and a pooled bootstrap. The first approach is based on Hsu (2016b). The second approach follows the idea of the pooled bootstrap procedure considered in Abadie (2002) and Kitagawa (2015), which we tailor to the current context of the fuzzy regression discontinuity design. In this section, we outline the steps of each procedure; we establish the asymptotic properties in the next subsection.

For both procedures, we need to specify bandwidths and kernel functions. We follow the RD literature and suggest using the triangular kernel. In simulations, we choose undersmoothed versions of conventional bandwidth options in the RDD framework, such as the optimal bandwidths proposed by Imbens and Kalyanaraman (2011, IK), Calonico, Cattaneo, and Titiunik (2014, CCT), and Arai and Ichimura (2016, AI).

4.2.1. Multiplier Bootstrap. Let $B$ be a large positive integer.
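Before turning to the resampling steps, the computation of the weights $w^{\pm}_{ni}$ and of the statistic in (13) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: it uses one common bandwidth `h` (that is, $c_+ = c_- = 1$), a triangular kernel, a truncated interval class on $[0, 1]$ built by the hypothetical helper `interval_class`, one trimming constant `xi` for both $d$, and standard errors built from the sample influence functions that appear in the multiplier bootstrap steps.

```python
import numpy as np

def interval_class(Q):
    """Intervals [j/q, (j+1)/q], j = 0..q-1, q = 1..Q (a truncated, hypothetical class)."""
    return [(j / q, (j + 1) / q) for q in range(1, Q + 1) for j in range(q)]

def ll_weights(x, h, side):
    """Boundary local linear weights w_ni in closed form, with x = R - r0."""
    mask = (x >= 0) if side == "+" else (x < 0)
    k = mask * np.clip(1.0 - np.abs(x / h), 0.0, None)   # one-sided triangular kernel
    s1, s2 = np.sum(k * x), np.sum(k * x**2)              # S_{n,1}, S_{n,2}
    raw = k * (s2 - s1 * x)
    return raw / raw.sum()                                # weights sum to one

def ks_statistic(Y, D, R, r0, h, intervals, xi=0.01):
    """KS-type statistic: sqrt(nh) * sup over d and l of nu_hat / max(sigma_hat, xi)."""
    n, x = len(Y), R - r0
    wp, wm = ll_weights(x, h, "+"), ll_weights(x, h, "-")
    # n x |L| matrix of indicators g_l(Y_i)
    G = np.array([(Y >= lo) & (Y <= hi) for lo, hi in intervals], dtype=float).T
    stats = []
    for d, V in ((1, G * D[:, None]), (0, G * (1 - D)[:, None])):
        m_plus, m_minus = wp @ V, wm @ V                  # equations (11) and (12)
        nu = m_minus - m_plus if d == 1 else m_plus - m_minus
        phi = np.sqrt(n * h) * (wp[:, None] * (V - m_plus) - wm[:, None] * (V - m_minus))
        sigma = np.sqrt((phi**2).sum(axis=0))             # influence-function standard error
        stats.append(np.sqrt(n * h) * nu / np.maximum(sigma, xi))
    return max(s.max() for s in stats)
```

The sign of the influence function does not affect `sigma`, so a single expression is used for both values of `d`. Outcomes are assumed to be already mapped into $[0, 1]$ before calling `ks_statistic`.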

(1) Based on the original sample, compute $\hat{m}_{1,+}(l)$, $\hat{m}_{1,-}(l)$, $\hat{m}_{0,+}(l)$, and $\hat{m}_{0,-}(l)$ for each $l \in L$, as described in Equations (11) and (12). Calculate the test statistic using Equation (13).

(2) For each $l \in L$, calculate the sample analogs of the influence functions

$$\hat{\phi}_{\nu_1, ni}(l) = \sqrt{nh} \left( w^+_{ni} \left(g_l(Y_i) D_i - \hat{m}_{1,+}(l)\right) - w^-_{ni} \left(g_l(Y_i) D_i - \hat{m}_{1,-}(l)\right) \right),$$

$$\hat{\phi}_{\nu_0, ni}(l) = \sqrt{nh} \left( w^-_{ni} \left(g_l(Y_i)(1 - D_i) - \hat{m}_{0,-}(l)\right) - w^+_{ni} \left(g_l(Y_i)(1 - D_i) - \hat{m}_{0,+}(l)\right) \right).$$

(3) For each $b = 1, \ldots, B$, draw $U_{b1}, U_{b2}, \ldots, U_{bn}$ as i.i.d. pseudo random variables with $E[U_{bi}] = 0$, $E[U_{bi}^2] = 1$, and $E[U_{bi}^4] < \infty$ that are independent of the sample path.

(4) Calculate the simulated processes (in $l$) $\Phi^u_{\nu_1, n, b}(l)$ and $\Phi^u_{\nu_0, n, b}(l)$ as

$$\Phi^u_{\nu_1, n, b}(l) = \sum_{i=1}^{n} U_{bi} \, \hat{\phi}_{\nu_1, ni}(l), \qquad \Phi^u_{\nu_0, n, b}(l) = \sum_{i=1}^{n} U_{bi} \, \hat{\phi}_{\nu_0, ni}(l).$$

(5) Estimate the asymptotic variance by $\hat{\sigma}^2_d(l) = \sum_{i=1}^{n} \hat{\phi}^2_{\nu_d, ni}(l)$.

(6) Let $B_n$ be a sequence of non-negative numbers. For $d = 0$ and 1, define $\psi_{nd}(l)$ as

$$\psi_{nd}(l) = -B_n \cdot 1\left(\sqrt{nh} \, \hat{\nu}_d(l) < -a_n\right),$$

where $a_n$ is a sequence of non-negative numbers.

(7) Let $P^u_B$ denote the multiplier probability measure, computed over the draws $b = 1, \ldots, B$. For significance level $\alpha < 1/2$, calculate the simulated critical value $\hat{c}_{n,\eta}(\alpha)$ as

$$\hat{c}_{n,\eta}(\alpha) = \sup\left\{ q : P^u_B\left( \sup_{d = 1, 0, \; l \in L} \frac{\Phi^u_{\nu_d, n, b}(l) + \psi_{nd}(l)}{\max\{\hat{\sigma}_d(l), \xi_d\}} \leq q \right) \leq 1 - \alpha + \eta \right\} + \eta,$$

where $\eta > 0$ is an arbitrarily small positive number. That is, $\hat{c}_{n,\eta}(\alpha)$ is the $(1 - \alpha + \eta)$-th quantile of the simulated null distribution of $\sup_{d = 1, 0, \; l \in L} \{\Phi^u_{\nu_d, n, b}(l) + \psi_{nd}(l)\} / \max\{\hat{\sigma}_d(l), \xi_d\}$, plus $\eta$.

(8) Reject $H_0$ if $\hat{S}_n > \hat{c}_{n,\eta}(\alpha)$.

Remark. Like most papers in the moment inequality literature, our paper uses the generalized moment selection (GMS) method to construct the critical value (Step 6). The GMS method is introduced

by Andrews and Soares (2010) and Andrews and Shi (2013, 2010, 2016). It is similar to the recentering method in Hansen (2012) and Donald and Hsu (2016), as well as to the contact set approach in Linton, Song, and Whang (2010). By doing this, one can construct a more powerful test than one based on the least favorable configuration. Following Andrews and Shi (2013, 2016), we suggest $a_n = (0.3 \ln(n))^{1/2}$ and $B_n = (0.4 \ln(n) / \ln\ln(n))^{1/2}$.

Remark. The constant $\eta$ is called an infinitesimal uniformity factor and is introduced by Andrews and Shi (2013) to avoid the problems that arise due to the presence of the infinite-dimensional nuisance parameters $\nu_{P,1}(l)$ and $\nu_{P,0}(l)$.

4.2.2. Pooled Bootstrap. We introduce some additional notation. Let $\eta_n = n_- h_- / (n_- h_- + n_+ h_+)$. For $l \in L$ and $d \in \{0, 1\}$, let $b_n(l, d)$ denote the rescaled bias of the local linear estimator $\hat{\nu}_d(l)$, where the bias itself is $E[\hat{\nu}_d(l)] - \nu_d(l)$. It can be shown that

$$b_n(l, 1) \to_p b(l, 1) \equiv \kappa \left[ \frac{\sqrt{(1 - \eta) \lambda_-}}{2} \frac{\partial^2}{\partial r^2} P(Y \in A, D = 1 \mid r_-) - \frac{\sqrt{\eta \lambda_+}}{2} \frac{\partial^2}{\partial r^2} P(Y \in A, D = 1 \mid r_+) \right], \quad (14)$$

$$b_n(l, 0) \to_p b(l, 0) \equiv \kappa \left[ \frac{\sqrt{\eta \lambda_+}}{2} \frac{\partial^2}{\partial r^2} P(Y \in A, D = 0 \mid r_+) - \frac{\sqrt{(1 - \eta) \lambda_-}}{2} \frac{\partial^2}{\partial r^2} P(Y \in A, D = 0 \mid r_-) \right], \quad (15)$$

where $\kappa$ is a constant that depends only on the kernel function,

$$\kappa \equiv \frac{\kappa_2^2 - \kappa_1 \kappa_3}{\kappa_2 \kappa_0 - \kappa_1^2}, \qquad \kappa_l \equiv \int_0^{\infty} u^l k(u) \, du,$$

and $\lambda_+$, $\lambda_-$, and $\eta$ are the probability limits of $n_+ h_+^5$, $n_- h_-^5$, and $\eta_n$, respectively.

We now outline the steps of the pooled bootstrap procedure.

(1) Based on the original sample, compute $\hat{m}_{1,+}(l)$, $\hat{m}_{1,-}(l)$, $\hat{m}_{0,+}(l)$, and $\hat{m}_{0,-}(l)$ for each $l \in L$, as described in Equations (11) and (12). Calculate the test statistic using Equation (13).

(2) (Bias estimation) Estimate the bias terms shown in Equations (14) and (15) by plugging in estimators of the second derivatives obtained from, e.g., local cubic regressions.

(3) (Construction of mixtures) Let $Q_{n-}$ be the empirical distribution constructed from $\{(Y_i, D_i, R_i) : R_i < r_0, \; i = 1, \ldots, n_-\}$ and let $\check{Q}_{n-}$ be the empirical distribution constructed from $\{(Y_i, D_i, 2 r_0 - R_i) : R_i < r_0, \; i = 1, \ldots, n_-\}$. Similarly, let $P_{n+}$ be the empirical distribution constructed from $\{(Y_j, D_j, R_j) : R_j \geq r_0, \; j = 1, \ldots, n_+\}$ and let $\check{P}_{n+}$ be the empirical distribution constructed from $\{(Y_j, D_j, 2 r_0 - R_j) : R_j \geq r_0, \; j = 1, \ldots, n_+\}$. Note that $\check{Q}_{n-}$ and $\check{P}_{n+}$ flip the original values of $R$ symmetrically around the cut-off value. Define the mixtures of the empirical distributions

$$H_- \equiv (1 - \eta_n) Q_{n-} + \eta_n \check{P}_{n+}, \qquad H_+ \equiv (1 - \eta_n) \check{Q}_{n-} + \eta_n P_{n+}.$$

(4) (Bootstrap) Draw $n_-$ i.i.d. observations of $(Y, D, R)$ from $H_-$ and $n_+$ i.i.d. observations from $H_+$. Using the bootstrap sample thus generated, compute $\hat{\nu}_d(l)$ and $\hat{\sigma}_d(l)$, denoted by $\hat{\nu}^*_d(l)$ and $\hat{\sigma}^*_d(l)$. We then form the bootstrap statistic

$$\hat{S}^*_{boot} = \max\left\{ \sup_{l \in L} \left[\xi_1 \vee \hat{\sigma}^*_1(l)\right]^{-1} \left[\hat{\nu}^*_1(l) + \hat{b}(l, 1)\right], \; \sup_{l \in L} \left[\xi_0 \vee \hat{\sigma}^*_0(l)\right]^{-1} \left[\hat{\nu}^*_0(l) + \hat{b}(l, 0)\right] \right\},$$

where $\hat{b}(l, d)$ are the estimates of the bias constructed in Step 2.

(5) Repeat Step 4 many times and use $\hat{c}_{1 - \alpha}$, the empirical $(1 - \alpha)$-th quantile of $\hat{S}^*_{boot}$, as the critical value of the test. Reject the null hypothesis if $\hat{S}_n > \hat{c}_{1 - \alpha}$.

Remark. If the bias is controlled by under-smoothing, then $\lambda_- = \lambda_+ = 0$, implying that $b_n(l, d)$ is asymptotically negligible.

4.3. Asymptotics of the Proposed Tests.

4.3.1. Multiplier Bootstrap. In this subsection we derive the asymptotic properties of the multiplier bootstrap procedure. We let the observed outcome $Y \in \mathcal{Y} \subseteq [0, 1]$.4

4 Assuming that the support of $Y$ is a subset of $[0, 1]$ is without loss of generality. If it is not, we can define $\tilde{Y} = \Phi(Y)$, where $\Phi(\cdot)$ is the CDF of the standard normal distribution.

We establish the results for

the following choice of the set $L$: $\mathcal{G} = \{g_l(\cdot) = 1(\cdot \in C_l) : l \equiv (y, c) \in L\}$, where $C_l = [y, y + c] \cap \mathcal{Y}$ and

$$L = \left\{ (y, c) : c^{-1} = q, \text{ and } q \cdot y \in \{0, 1, 2, \ldots, (q - 1)\} \text{ for } q = 1, 2, \ldots \right\}. \quad (16)$$

Let $h_2(\cdot, \cdot)$ be a covariance kernel on $L \times L$ and let $\mathcal{H}_2$ be the collection of all possible covariance kernel functions on $L \times L$. For any pair $h_2^{(1)}$ and $h_2^{(2)}$, we define the distance between them as

$$d(h_2^{(1)}, h_2^{(2)}) = \sup_{l_1, l_2 \in L} \left| h_2^{(1)}(l_1, l_2) - h_2^{(2)}(l_1, l_2) \right|. \quad (17)$$

Let $\sigma^2_{P,1,+}(l_1, l_2) = \lim_{r \downarrow r_0} Cov_P(g_{l_1}(Y) D, g_{l_2}(Y) D \mid R = r)$ be the conditional covariance of $g_{l_1}(Y) D$ and $g_{l_2}(Y) D$ as $r$ approaches $r_0$ from above. Let $\sigma^2_{P,0,+}(l_1, l_2) = \lim_{r \downarrow r_0} Cov_P(g_{l_1}(Y)(1 - D), g_{l_2}(Y)(1 - D) \mid R = r)$ be the conditional covariance of $g_{l_1}(Y)(1 - D)$ and $g_{l_2}(Y)(1 - D)$ as $r$ approaches $r_0$ from above. Define $\sigma^2_{P,1,-}(l_1, l_2)$ and $\sigma^2_{P,0,-}(l_1, l_2)$ similarly. For $j = 0, 1, 2, \ldots$, let $\vartheta_j = \int_0^{\infty} u^j K(u) \, du$. For $d = 0, 1$, define

$$h_{2,P,d}(l_1, l_2) = \frac{\int_0^{\infty} (\vartheta_2 - u \vartheta_1)^2 K^2(u) \, du \cdot \left( \sigma^2_{P,d,+}(l_1, l_2) + \sigma^2_{P,d,-}(l_1, l_2) \right)}{(\vartheta_2 \vartheta_0 - \vartheta_1^2)^2 \, f_r(r_0)},$$

which is the covariance kernel of $\Phi^u_{\nu_d, n}(l)$ under the distribution $P$. Here $f_r(r_0)$ is the density function of $R$ evaluated at $R = r_0$, which is assumed not to depend on $P$.

Let $\mathcal{P}$ be the collection of probability distributions. Let $f_R(r)$ denote the probability density function of $F_R$ under $P \in \mathcal{P}$. We consider the case in which $f_R(r)$ is the same for all $P \in \mathcal{P}$. For $d = 0, 1$, let $f_{Y \mid D, R}(y \mid d, r; P)$ denote the conditional pdf of $Y$ given $D = d$ and $R = r$, which is equivalent to $f_{Y_d \mid R}(y \mid r; P)$ under $P$. Let $\rho_{P,+}(r) = E_P[D(r) \mid R = r]$ for $r \geq r_0$ and $\rho_{P,-}(r) = E_P[D(r) \mid R = r]$ for $r < r_0$. Let $f'$ and $f''$ denote the first and second derivatives of a function $f$. For any $\delta > 0$, let $N_\delta(r_0) = \{r : |r - r_0| \leq \delta\}$ denote a neighborhood of $r_0$. Let $N^+_\delta(r_0) = \{r : 0 \leq r - r_0 \leq \delta\}$ and $N^-_\delta(r_0) = \{r : 0 < r_0 - r \leq \delta\}$.

Assumption 4. Assume that for all $P \in \mathcal{P}$,
(1) $P_R$ is the same and $f_R(r)$ is twice continuously differentiable in $r$ on $N_\delta(r_0)$,
(2) $f_R(r)$ is bounded away from zero on $N_\delta(r_0)$,

(3) $\rho_{P,+}(r)$ is twice continuously differentiable in $r$ on $N^+_\delta(r_0)$ and $\rho_{P,-}(r)$ is twice continuously differentiable in $r$ on $N^-_\delta(r_0)$,
(4) for $d = 0, 1$ and for each $y \in \mathcal{Y}$, $f_{Y \mid D, R}(y \mid d, r; P)$ is twice continuously differentiable in $r$ on $N_\delta(r_0)$,
(5) $|d\rho_{P,+}(r)/dr| \leq M$ and $|d^2\rho_{P,+}(r)/dr^2| \leq M$ on $N^+_\delta(r_0)$, and $|d\rho_{P,-}(r)/dr| \leq M$ and $|d^2\rho_{P,-}(r)/dr^2| \leq M$ on $N^-_\delta(r_0)$,
(6) for $d = 0, 1$, $|\partial f_{Y \mid D, R}(y \mid d, r; P)/\partial r| \leq M$ and $|\partial^2 f_{Y \mid D, R}(y \mid d, r; P)/\partial r^2| \leq M$ on $N_\delta(r_0)$ for each $y \in \mathcal{Y}$,
(7) for $d = 0, 1$, $f_{Y \mid D, R}(y \mid d, r; P)$ is continuous in $y$ on $\mathcal{Y}$ for all $r \in N_\delta(r_0)$,
(8) $M$ and $\delta$ are positive constants that do not depend on $P$.

Assumptions 4(2) to 4(6) are standard in nonparametric estimation. Assumptions 4(5) and 4(6) are needed to show that the bias terms of $\hat{\nu}_1(l)$ and $\hat{\nu}_0(l)$ are asymptotically negligible uniformly over $l \in L$. Assumption 4(7) is made for notational simplicity. It requires that the conditional distribution of $Y$ given $R = r$ and $D = d$ be continuous, which essentially rules out the cases in which $Y$ is a discrete variable or a mixture of discrete and continuous variables. Extensions of our results to these cases are straightforward, and we omit them for brevity.

Assumption 5. Assume that
(1) $K(\cdot)$ is a non-negative, symmetric, bounded kernel with compact support in $\mathbb{R}$ (say $[-1, 1]$),
(2) $\int K(u) \, du = 1$,
(3) $h \to 0$, $nh \to \infty$, and $nh^5 \to 0$ as $n \to \infty$.

Assumption 5 is standard for nonparametric estimation. Note that $nh^5 \to 0$ as $n \to \infty$ implies undersmoothing, so that the bias terms converge to zero even after they are multiplied by $\sqrt{nh}$. This condition is standard if one wants to obtain asymptotic normality of the estimators.

Assumption 6. Let $\{U_i : 1 \leq i \leq n\}$ be a sequence of i.i.d. random variables with $E[U] = 0$, $E[U^2] = 1$, and $E[|U|^4] < M_1$ for some $M_1 > 0$, and let $\{U_i : 1 \leq i \leq n\}$ be independent of the sample path $\{(Y_i, D_i, R_i) : 1 \leq i \leq n\}$.
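To illustrate how multipliers satisfying Assumption 6 enter the critical value, the following sketch simulates the multiplier bootstrap with the GMS recentering term. It is a simplified illustration, not the authors' implementation: the function name `multiplier_critical_value` is hypothetical, the inputs `phi` (an $n \times m$ matrix of sample influence functions stacked over all $(d, l)$) and `nu_hat` (the $m$ corresponding estimates) are assumed to be precomputed, Rademacher multipliers are one convenient choice with mean 0, variance 1, and bounded fourth moment, and the default `eta` is a placeholder for illustration only.

```python
import numpy as np

def multiplier_critical_value(phi, nu_hat, n, h, alpha=0.05, eta=1e-6,
                              xi=0.01, B=500, seed=0):
    """Simulated GMS critical value from multiplier bootstrap draws.

    phi    : (n, m) sample influence functions, one column per (d, l)
    nu_hat : (m,) estimates of nu_d(l), in the same column order
    eta    : placeholder uniformity factor (illustrative default)
    Requires n >= 3 so that log(log(n)) is positive.
    """
    rng = np.random.default_rng(seed)
    a_n = np.sqrt(0.3 * np.log(n))                         # GMS threshold sequence
    B_n = np.sqrt(0.4 * np.log(n) / np.log(np.log(n)))     # GMS penalty sequence
    sigma = np.maximum(np.sqrt((phi**2).sum(axis=0)), xi)  # trimmed standard errors
    # Moments that look clearly slack are shifted down by B_n
    psi = -B_n * (np.sqrt(n * h) * nu_hat < -a_n)
    U = rng.choice([-1.0, 1.0], size=(B, phi.shape[0]))    # Rademacher multipliers
    sims = ((U @ phi + psi) / sigma).max(axis=1)           # sup over (d, l), per draw
    return np.quantile(sims, 1 - alpha + eta) + eta
```

A test would then reject when the statistic in (13) exceeds the returned value. Shifting slack moments down can only lower the simulated supremum, which is the source of the power gain relative to the least favorable configuration.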

Assumption 6 is standard for the multiplier bootstrap as in Hsu (2016b), and $E[|U|^4] < M_1$ is needed for the multiplier bootstrap for nonparametric methods.

Assumption 7. Assume that:
(i) $a_n$ is a sequence of non-negative numbers satisfying $\lim_{n \to \infty} a_n = \infty$ and $\lim_{n \to \infty} a_n/\sqrt{nh} = 0$.
(ii) $B_n$ is a non-decreasing sequence of non-negative numbers satisfying $\lim_{n \to \infty} B_n = \infty$ and $\lim_{n \to \infty} B_n/a_n = 0$.

The following theorem summarizes the uniform size of our test. Let $\lambda(\cdot)$ denote the Lebesgue measure. Define $\mathcal{Y}^o_{P,d} \equiv \{y \in \mathcal{Y} : f_{Y_d|C,R}(y \mid r_0; P)\,P_P(C \mid R = r_0) = 0\}$ and $\mathcal{L}^o_{P,d} \equiv \{l \in \mathcal{L} : \nu_{P,d}(l) = 0\}$. It is straightforward to see that when $\lambda(\mathcal{Y}^o_{P,d}) > 0$, $\mathcal{L}^o_{P,d}$ is not an empty set.

Assumption 8. Let $\mathcal{P}_0$ be the subset of $\mathcal{P}$ that satisfies Assumption 4 such that the null hypothesis in Equation (10) holds under $P$ if $P \in \mathcal{P}_0$.

Assumption 9. Assume that there exists $P_c \in \mathcal{P}_0$ such that
(i) $\max_{d=0,1} \lambda(\mathcal{Y}^o_{P_c,d}) > 0$ under $P_c$,
(ii) for $d = 0, 1$, $h_{2,P_c,d} \in \mathcal{H}_{2,cpt}$,
(iii) either $h_{2,P_c,1}$ restricted to $\mathcal{L}^o_{P_c,1} \times \mathcal{L}^o_{P_c,1}$ is not a zero function, or $h_{2,P_c,0}$ restricted to $\mathcal{L}^o_{P_c,0} \times \mathcal{L}^o_{P_c,0}$ is not a zero function.

Theorem 2. Suppose Assumptions 4-8 hold. We reject $H_0$ when $\hat{S}_n > \hat{c}_{n,\eta}(\alpha)$. Then, for every compact subset $\mathcal{H}_{2,cpt}$ of $\mathcal{H}_2$:
(a) $\limsup_{n \to \infty} \sup_{\{P \in \mathcal{P}_0 :\, h_{2,P,1},\, h_{2,P,0} \in \mathcal{H}_{2,cpt}\}} P(\hat{S}_n > \hat{c}_{n,\eta}(\alpha)) \le \alpha$.
(b) If Assumption 9 also holds, then $\lim_{\eta \to 0} \limsup_{n \to \infty} \sup_{\{P \in \mathcal{P}_0 :\, h_{2,P,1},\, h_{2,P,0} \in \mathcal{H}_{2,cpt}\}} P(\hat{S}_n > \hat{c}_{n,\eta}(\alpha)) = \alpha$.

Theorem 2(a) shows that our test has correct uniform asymptotic size over compact sets of covariance kernels, which is similar to Theorem 2(a) of Andrews and Shi (2013). Theorem 2(b) shows that our test is at most infinitesimally conservative asymptotically when there exists at least one $P_c$ such that $\max_{d=0,1} \lambda(\mathcal{Y}^o_{P_c,d}) > 0$ under $P_c$ and one of the $h_{2,P_c,d}$ restricted to $\mathcal{L}^o_{P_c,d} \times \mathcal{L}^o_{P_c,d}$ is not a zero function. Theorem 2 is similar to Theorem 2 of Andrews and Shi (2013) and Theorem 5.1 of Hsu (2016a), except that we consider a testing problem in the regression discontinuity design.

We now present the power of our test against fixed alternatives. Let $P_1 \in \mathcal{P}$ be such that, for at least one $d \in \{0, 1\}$, $f_{Y_d|C,R}(y^* \mid r_0; P_1)\,P_{P_1}(C \mid R = r_0) > 0$ for some $y^* \in \mathcal{Y}$ under $P_1$. Then, by the continuity of $f_{Y_d|C,R}(y \mid r_0; P_1)$ in $y$, there exists a neighborhood $N(y^*)$ around $y^*$ such that $f_{Y_d|C,R}(y \mid r_0; P_1)\,P_{P_1}(C \mid R = r_0) > 0$ for all $y \in N(y^*)$ and $\lambda(N(y^*)) > 0$. This implies that there exists $l^* \in \mathcal{L}$ such that $\nu_{P_1,d}(l^*) > 0$. The following theorem shows the consistency of our test.

Theorem 3. Suppose Assumptions 4-7 hold and $\alpha < 1/2$. Let $P_1 \in \mathcal{P}$ be such that, for at least one $d \in \{0, 1\}$, $f_{Y_d|C,R}(y^* \mid r_0; P_1)\,P_{P_1}(C \mid R = r_0) > 0$ for some $y^* \in \mathcal{Y}$ under $P_1$. Then $\lim_{n \to \infty} P(\hat{S}_n > \hat{c}_{n,\eta}(\alpha)) = 1$.

Theorem 3 follows from the fact that the test statistic $\hat{S}_n$ diverges to positive infinity under $P_1$ while the critical value $\hat{c}_{n,\eta}(\alpha)$ is bounded in probability.

We next show that our test is unbiased against some $(nh)^{-1/2}$ local alternatives. Define $A \backslash B \equiv \{y : y \in A \text{ but } y \notin B\}$ for any two sets $A$ and $B$. We consider a sequence of $P_n \in \mathcal{P} \backslash \mathcal{P}_0$ such that

$$f_{Y_1|C,R}(y \mid r_0; P_n) = f_{Y_1|C,R}(y \mid r_0; P_o) + \frac{\delta_1(y)}{\sqrt{nh}}, \qquad f_{Y_0|C,R}(y \mid r_0; P_n) = f_{Y_0|C,R}(y \mid r_0; P_o) + \frac{\delta_0(y)}{\sqrt{nh}}, \qquad (18)$$

for some $P_o \in \mathcal{P}_0$. We impose the following conditions on the local alternatives we consider.

Assumption 10. Let $\{P_n \in \mathcal{P} \backslash \mathcal{P}_0 : n \ge 1\}$ and $P_o \in \mathcal{P}_0$ be such that:
(i) Equation (18) holds under $P_n$,
(ii) $\max_{d=0,1} \lambda(\mathcal{Y}^o_{P_o,d}) > 0$,
(iii) for $d = 0, 1$, $\delta_d(y) \ge 0$ if $y \in \mathcal{Y}^o_{P_o,d}$,
(iv) $\max_{d=0,1} \lambda(\{y : \delta_d(y) > 0\} \cap \mathcal{Y}^o_{P_o,d}) > 0$,
(v) for $d = 0, 1$, $\lim_{n \to \infty} d(h_{2,P_n,d}, h_{2,d}) = 0$ for some $h_{2,d} \in \mathcal{H}_2$.

Assumption 10(i) requires that the local alternatives converge to a null hypothesis at the rate $(nh)^{-1/2}$. Assumption 10(ii) requires that the null hypothesis is not an interior one, so that we can have nontrivial local power. Assumption 10(iii) ensures that our test is unbiased.
Assumption 10(v) specifies the asymptotic behavior of the covariance kernels, and this is similar to LA1(c) of Andrews and Shi (2013). The following theorem shows that the asymptotic local power of our test is greater than or equal to $\alpha$ when $\eta$ tends to zero, i.e., our test is unbiased against those local alternatives that satisfy Assumption 10.

Theorem 4. Suppose Assumptions 4-7 hold and $\alpha < 1/2$. Under $\{P_n : n \ge 1\}$ that satisfies Assumption 10, $\lim_{\eta \to 0} \lim_{n \to \infty} P(\hat{S}_n > \hat{c}_{n,\eta}(\alpha)) \ge \alpha$.

Pooled Bootstrap.

5. SIMULATION

In this section, we provide extensive Monte Carlo experiments to investigate the finite sample performance of the proposed procedures. We consider ten DGPs: four for examining the size property and six for the power property. For ease of exposition, we collect the details of the design and the full sets of simulation results in Appendix 3. For each DGP, we generate random samples of four sizes: 1000, 2000, 4000 and [...]. These sample sizes are comparable to, if not smaller than, the sample sizes of many empirical applications of RD designs. We consider three choices of bandwidth: AI, IK and CCT. For each bandwidth, we undersmooth by multiplying the bandwidth by a factor of $n^{1/5 - 1/c}$, with $c = 4.5$. We also conduct simulations for other choices of $c \in [3, 5]$, and the results are qualitatively robust.

We consider two classes of intervals. One is the interval class defined in Equation (16), for which we choose $q = 1, 2, \ldots, L$ with $L = 10$. We also replicate all the simulations with $L = 15$ as a robustness check. The second class of intervals is formed from the observed $Y$-values. For the observations with $D = 1$, we first sort the $Y$-values. Suppose there are in total $L_1$ distinct observed values of $Y$. We then create the equally spaced integer subsequence of $(1, 2, \ldots, L_1)$ with length $L_1/10$ (rounded up if fractional), and construct the grid by picking the corresponding order statistics of $Y$.
To put it another way, we construct grid points $\{g_1, g_2, \ldots, g_{L_1/10}\}$ such that there are 10 distinct $Y$-values in each interval $[g_j, g_{j+1})$. Based on these grid points, we construct the intervals associated with $D = 1$ as the intervals formed by every possible combination of those grid points. The same operation applies to the observations with $D = 0$.
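The data-driven interval class described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the grid is assumed to start at the smallest distinct $Y$-value, and "every possible combination" is read as all pairs of grid points $g_a < g_b$ forming $[g_a, g_b)$. The paper's own implementation may differ in these details.

```python
import math
from itertools import combinations


def grid_points(y_values, per_cell=10):
    """Pick every `per_cell`-th distinct order statistic of Y as a grid point.

    Assumption: the first grid point is the smallest distinct value, so each
    cell [g_j, g_{j+1}) contains exactly `per_cell` distinct Y-values.
    """
    distinct = sorted(set(y_values))
    n_grid = math.ceil(len(distinct) / per_cell)  # L_1 / 10, rounded up
    return [distinct[j * per_cell] for j in range(n_grid)]


def intervals_from_grid(grid):
    """Form all intervals [g_a, g_b) with g_a < g_b from the grid points."""
    return list(combinations(grid, 2))


def build_interval_class(y, d, treated, per_cell=10):
    """Build the interval class from observations with D equal to `treated`."""
    y_sub = [yi for yi, di in zip(y, d) if di == treated]
    return intervals_from_grid(grid_points(y_sub, per_cell))
```

Calling `build_interval_class(y, d, treated=1)` and `build_interval_class(y, d, treated=0)` gives the two interval collections, one per treatment status. With $m$ grid points the class contains $m(m-1)/2$ intervals, so the number of moment inequalities grows quadratically in the number of distinct outcome values.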

To briefly summarize, we consider ten DGPs. For each DGP, we consider four sample sizes, three nominal levels (1%, 5%, 10%), three bandwidth choices (AI, IK, CCT), two interval classes, and two bootstrap approaches. We conduct 1000 replications for each simulation and choose the bootstrap sample size $B = 300$. Overall, the simulation results demonstrate very good finite sample performance. Please see Appendix 3 for details.

6. APPLICATION

To be written.

7. EXTENSIONS AND DISCUSSIONS

7.1. Local smoothness and local independence. In this section, we discuss identification and specification testing under a set of alternative assumptions that are often invoked in the literature. Let $\epsilon > 0$ be a small constant.

Assumption 11 (Local Independence). $(Y_d, D(r))$ is jointly independent of $R$ in a small neighborhood $(r_0 - \epsilon, r_0 + \epsilon)$.

Assumption 12 (Local Monotonicity-2). $D(r_0 + e) \ge D(r_0 - e)$ for any $e \in (0, \epsilon)$.

Assumption 11 is often referred to as the local independence assumption. It states that the potential outcomes and treatment variables are independent of the running variable near the cut-off point. Note that local independence implies local smoothness (Assumption 2): if the former is true, the conditional distribution is flat and hence differentiable in $r$ in the neighborhood of $r_0$. Recently, Dong (2016) gives an interesting discussion of local smoothness versus local independence and provides some empirical evidence for which local smoothness would be a more interesting assumption. Assumption 12, which we refer to as local monotonicity-2, is an alternative monotonicity assumption to Assumption 3 in this paper. Local monotonicity-2 imposes restrictions on $D(r)$ in the neighborhood of $r_0$, rather than just at the limit. It is interesting to note that with Assumption 11, one can identify some parameters directly.
For example, a researcher may be interested in the impact of a treatment or a program on a dispersion measure such as the variance, because even when a treatment has a zero (local) average treatment effect, it could improve social welfare by lowering the dispersion of the potential outcome distribution. The following corollary provides such results.

Corollary 1. Suppose Assumptions 1, 11 and 12 are satisfied. Let $G(\cdot) : \mathcal{Y} \to \mathbb{R}$ be a measurable function. Then

$$E_P[G(Y_1) \mid C, R = r_0] = \frac{\lim_{r \downarrow r_0} E_P[G(Y)D \mid R = r] - \lim_{r \uparrow r_0} E_P[G(Y)D \mid R = r]}{\lim_{r \downarrow r_0} E_P[D \mid R = r] - \lim_{r \uparrow r_0} E_P[D \mid R = r]},$$

and

$$E_P[G(Y_0) \mid C, R = r_0] = \frac{\lim_{r \uparrow r_0} E_P[G(Y)(1 - D) \mid R = r] - \lim_{r \downarrow r_0} E_P[G(Y)(1 - D) \mid R = r]}{\lim_{r \downarrow r_0} E_P[D \mid R = r] - \lim_{r \uparrow r_0} E_P[D \mid R = r]}.$$

Note that for $G(Y) = Y$ we recover the LATE estimand derived in HTV, and for $G(Y) = 1\{Y \le y\}$ we obtain the identification of the CDF of the potential outcomes for the compliers derived in FFM. An interesting direct application of this general identification result is the identification of the variance of the potential outcome distribution, $V_P(Y_1 \mid C)$:

$$V_P(Y_1 \mid C) = \frac{\lim_{r \downarrow r_0} E_P[Y^2 D \mid R = r] - \lim_{r \uparrow r_0} E_P[Y^2 D \mid R = r]}{\lim_{r \downarrow r_0} E_P[D \mid R = r] - \lim_{r \uparrow r_0} E_P[D \mid R = r]} - \left( \frac{\lim_{r \downarrow r_0} E_P[Y D \mid R = r] - \lim_{r \uparrow r_0} E_P[Y D \mid R = r]}{\lim_{r \downarrow r_0} E_P[D \mid R = r] - \lim_{r \uparrow r_0} E_P[D \mid R = r]} \right)^2.$$

It is worth noting that $V_P(Y_1 \mid C)$ can also be identified from the result in Equations (1) and (2), since they identify the potential outcome distribution of the compliers, $F_{Y_1|C}(\cdot)$, from which $V_P(Y_1 \mid C)$ can subsequently be calculated. However, this procedure involves multiple steps and makes inference more involved, which may explain why empirical researchers do not report such an important parameter of interest in their studies. The present identification result offers an easy way to identify and estimate $V_P(Y_1 \mid C)$. Notice that because the variance is always non-negative, this identification result also implies an additional restriction on the data $(Y, D, R)$:

$$\frac{\lim_{r \downarrow r_0} E_P[Y^2 D \mid R = r] - \lim_{r \uparrow r_0} E_P[Y^2 D \mid R = r]}{\lim_{r \downarrow r_0} E_P[D \mid R = r] - \lim_{r \uparrow r_0} E_P[D \mid R = r]} \ge \left( \frac{\lim_{r \downarrow r_0} E_P[Y D \mid R = r] - \lim_{r \uparrow r_0} E_P[Y D \mid R = r]}{\lim_{r \downarrow r_0} E_P[D \mid R = r] - \lim_{r \uparrow r_0} E_P[D \mid R = r]} \right)^2.$$

More generally, local independence and local monotonicity-2, i.e., Assumptions 1, 11 and 12, deliver the following set of testable restrictions. Let $G^+$ be a measurable function $G^+ : \mathcal{Y} \to \mathbb{R}_+$.

Under Assumptions 1, 11 and 12, for any $e \in (0, \epsilon)$ we have:

$$E_P[G^+(Y)D \mid R = r_0 + e] = E_P[G^+(Y_1)D(r_0 + e) \mid R = r_0 + e] = E_P[G^+(Y_1)D(r_0 + e)]$$
$$\ge E_P[G^+(Y_1)D(r_0 - e)] = E_P[G^+(Y_1)D(r_0 - e) \mid R = r_0 - e] = E_P[G^+(Y)D \mid R = r_0 - e],$$

where the first and fourth equalities hold by the definition of the potential variables, the second and third equalities hold by the local independence assumption, and the inequality is due to monotonicity and the non-negativity of $G^+$. Similarly, we can derive the second testable implication involving $Y_0$. Both can be summarized as follows. For any $e \in (0, \epsilon)$ we have:

$$E_P[G^+(Y)D \mid R = r_0 - e] - E_P[G^+(Y)D \mid R = r_0 + e] \le 0, \qquad (19)$$
$$E_P[G^+(Y)(1 - D) \mid R = r_0 + e] - E_P[G^+(Y)(1 - D) \mid R = r_0 - e] \le 0. \qquad (20)$$

Remark 3: Local independence appears to impose more restrictions on the observables. Indeed, by letting $\epsilon$ converge to 0 and restricting the class of positive measurable functions $G^+$ to $\mathcal{G}$, we obtain inequalities (8) and (9). This is not surprising, since local independence is a stronger requirement. As before, the class of positive measurable functions $G^+$ can also be reduced to $\mathcal{G}$ without losing information on the testability of Assumptions 1, 11 and 12.

Remark 4: If the $\epsilon$ appearing in Assumptions 11 and 12 were unknown to researchers, the sharp testable implications of local independence and local monotonicity-2 would actually be the same as those described in inequalities (8) and (9), since without knowledge of $\epsilon$ we can only look at the limit. However, if $\epsilon$ were known, then Assumptions 11 and 12 would impose more restrictions on the data. It is worth stressing that while we may think that $\epsilon$ is unknown in general, in practice applied researchers usually move away from the cut-off point to collect more observations in order to increase the significance of their estimates. This deviation is sometimes captured by the bandwidth they use in their nonparametric estimation. This result shows that researchers should be careful about the bandwidth they use, since a wider one imposes more restrictions on the data, which makes the RD assumptions more likely to be violated. The following corollaries summarize our results.

Corollary 2. Let $D^+$ and $D^-$ be well defined. If $\epsilon$ is unknown, then the statement of Theorem 1 holds with Assumption 2 and Assumption 3 replaced by Assumption 11 and Assumption 12, respectively.

Corollary 3. Let $D^+$ and $D^-$ be well defined and assume that $\epsilon$ is known. Let $\{D(r), Y_1, Y_0\}$ be a vector of unobserved potential treatments and outcomes, and $(Y, D, R)$ the vector of observed variables, where $Y = Y_1 D + Y_0 (1 - D)$ and the observed $D$ is defined as $D \equiv D(R)$. Let $r_0$ be the observed cut-off point.

(i) Under Assumptions 1, 11 and 12,

$$\eta_1(l, e) = E_P[1\{Y \in C_l\}D \mid R = r_0 + e] - E_P[1\{Y \in C_l\}D \mid R = r_0 - e] \ge 0, \qquad (21)$$
$$\eta_0(l, e) = E_P[1\{Y \in C_l\}(1 - D) \mid R = r_0 + e] - E_P[1\{Y \in C_l\}(1 - D) \mid R = r_0 - e] \le 0, \qquad (22)$$

for all $l \in \mathcal{L}$ and $e \in (0, \epsilon)$, and the inequalities hold strictly for $C_l = \mathcal{Y}$ and sufficiently small $e$.

(ii) If inequalities (21) and (22) hold, there exists a joint distribution of $(\tilde{D}(r), \tilde{Y}_1, \tilde{Y}_0)$ such that Assumptions 1, 11 and 12 hold for $r \in (r_0 - \epsilon, r_0 + \epsilon)$, and the conditional distributions of $(\tilde{Y}, \tilde{D}) \mid R = r$ and $(Y, D) \mid R = r$ satisfy

$$F_{Y,D|R=r} = F_{\tilde{Y},\tilde{D}|R=r} \quad \forall\, r \in (r_0 - \epsilon, r_0 + \epsilon). \qquad (23)$$

7.2. Discussion on other specification tests. McCrary's (2008) density test is one of the tests most used by empirical researchers to provide evidence that either supports or discounts the validity of the regression discontinuity design. McCrary (2008) suggests examining the density of the running variable, especially at the cut-off point $r_0$. If there is a discontinuity in the density at $r_0$, this may suggest that agents were able to perfectly manipulate their treatment status; in other words, being just above or below the cut-off would no longer be random. In such a case, it would "intuitively" provide some suggestive evidence of a violation of the local independence Assumption 11. Although this test has a very intuitive interpretation, its theoretical justification was not very clear, especially in the presence of a fuzzy design.
Dong (2016) recently suggested a theoretical justification of this test, which we investigate below. Dong (2016) proposed the following assumption as an alternative to Assumption 11 or Assumption 2 for identification of the parameters of interest. Denote by $f_{R|Y_1,Y_0,D^+,D^-}(\cdot)$ the conditional density of the running variable $R$ conditional on the joint vector of potential outcomes and the compliance groups.

Assumption 13 (Strong Smoothness). $f_{R|Y_1,Y_0,D^+,D^-}(\cdot)$ is continuous in the neighborhood of $r_0$, and $f_R(\cdot)$ is continuous and strictly positive in the neighborhood of $r_0$.

First, notice that Assumption 13 implies Assumption 2(i), and is therefore stronger than what is required for identification.5 Second, because the conditioning variables involve the unobserved potential outcomes, Assumption 13 is not directly testable, but it has as a testable implication the continuity of the unconditional density of the running variable at $r_0$, which is what the test proposed by McCrary (2008) examines. In addition, McCrary's (2008) test is not related to the monotonicity assumption, which is required for identification in the presence of heterogeneous treatment effects, as in HTV and FFM. Therefore, unlike our proposed test, McCrary's (2008) test is neither sufficient nor necessary for assessing the validity of the FRD assumptions, as illustrated in Figure 1. We assume that all data generating processes (DGPs) represented in this figure satisfy Assumption 1. The intersection of the "black" and the "red" ellipses represents all DGPs that respect the FRD design's assumptions. The "blue" ellipse depicts the DGPs that respect McCrary's null hypothesis. We clearly see that McCrary's test is not, in general, very informative about the validity of the FRD assumptions, especially when the treatment effect is heterogeneous as entertained here. However, our proposed null hypothesis, depicted in "green", covers all DGPs that respect the FRD assumptions. This shows the necessity of our test. In this figure, the sharpness (or sufficiency of our inequalities, in Pearl (1994)'s language) means that there are no testable inequalities whose set representation covers the FRD assumptions and is included in the "green" ellipse.

7.3. Incorporating Covariates. We briefly illustrate that it is straightforward to incorporate additional covariates $X \in \mathcal{X} \subset \mathbb{R}^{d_x}$. Again, we implicitly assume that there are observations near the cutoff point conditional on each realization of $x$. We make the following assumptions.

Assumption 14. The limits $\pi^+(x) \equiv \lim_{r \downarrow r_0} P_P(D = 1 \mid R = r, X = x)$ and $\pi^-(x) \equiv \lim_{r \uparrow r_0} P_P(D = 1 \mid R = r, X = x)$ exist and $\pi^+(x) \neq \pi^-(x)$ for all $x \in \mathcal{X}$.

5 Refer to the proof of Dong (2016)'s Lemma.


More information

Bounds on Treatment Effects in Regression Discontinuity Designs with a Manipulated Running Variable

Bounds on Treatment Effects in Regression Discontinuity Designs with a Manipulated Running Variable Bounds on Treatment Effects in Regression Discontinuity Designs with a Manipulated Running Variable François Gerard, Miikka Rokkanen, and Christoph Rothe Abstract The key assumption in regression discontinuity

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Regression Discontinuity Design Econometric Issues

Regression Discontinuity Design Econometric Issues Regression Discontinuity Design Econometric Issues Brian P. McCall University of Michigan Texas Schools Project, University of Texas, Dallas November 20, 2009 1 Regression Discontinuity Design Introduction

More information

Principles Underlying Evaluation Estimators

Principles Underlying Evaluation Estimators The Principles Underlying Evaluation Estimators James J. University of Chicago Econ 350, Winter 2019 The Basic Principles Underlying the Identification of the Main Econometric Evaluation Estimators Two

More information

What s New in Econometrics? Lecture 14 Quantile Methods

What s New in Econometrics? Lecture 14 Quantile Methods What s New in Econometrics? Lecture 14 Quantile Methods Jeff Wooldridge NBER Summer Institute, 2007 1. Reminders About Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile Regression

More information

Applied Microeconometrics. Maximilian Kasy

Applied Microeconometrics. Maximilian Kasy Applied Microeconometrics Maximilian Kasy 7) Distributional Effects, quantile regression (cf. Mostly Harmless Econometrics, chapter 7) Sir Francis Galton (Natural Inheritance, 1889): It is difficult to

More information

Optimal Bandwidth Choice for the Regression Discontinuity Estimator

Optimal Bandwidth Choice for the Regression Discontinuity Estimator Optimal Bandwidth Choice for the Regression Discontinuity Estimator Guido Imbens and Karthik Kalyanaraman First Draft: June 8 This Draft: September Abstract We investigate the choice of the bandwidth for

More information

Estimation and Inference for Distribution Functions and Quantile Functions in Endogenous Treatment Effect Models. Abstract

Estimation and Inference for Distribution Functions and Quantile Functions in Endogenous Treatment Effect Models. Abstract Estimation and Inference for Distribution Functions and Quantile Functions in Endogenous Treatment Effect Models Yu-Chin Hsu Robert P. Lieli Tsung-Chih Lai Abstract We propose a new monotonizing method

More information

Regression Discontinuity

Regression Discontinuity Regression Discontinuity Christopher Taber Department of Economics University of Wisconsin-Madison October 16, 2018 I will describe the basic ideas of RD, but ignore many of the details Good references

More information

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION VICTOR CHERNOZHUKOV CHRISTIAN HANSEN MICHAEL JANSSON Abstract. We consider asymptotic and finite-sample confidence bounds in instrumental

More information

Generated Covariates in Nonparametric Estimation: A Short Review.

Generated Covariates in Nonparametric Estimation: A Short Review. Generated Covariates in Nonparametric Estimation: A Short Review. Enno Mammen, Christoph Rothe, and Melanie Schienle Abstract In many applications, covariates are not observed but have to be estimated

More information

Regression Discontinuity

Regression Discontinuity Regression Discontinuity Christopher Taber Department of Economics University of Wisconsin-Madison October 24, 2017 I will describe the basic ideas of RD, but ignore many of the details Good references

More information

Why High-Order Polynomials Should Not Be Used in Regression Discontinuity Designs

Why High-Order Polynomials Should Not Be Used in Regression Discontinuity Designs Why High-Order Polynomials Should Not Be Used in Regression Discontinuity Designs Andrew GELMAN Department of Statistics and Department of Political Science, Columbia University, New York, NY, 10027 (gelman@stat.columbia.edu)

More information

Independent and conditionally independent counterfactual distributions

Independent and conditionally independent counterfactual distributions Independent and conditionally independent counterfactual distributions Marcin Wolski European Investment Bank M.Wolski@eib.org Society for Nonlinear Dynamics and Econometrics Tokyo March 19, 2018 Views

More information

Optimal bandwidth selection for differences of nonparametric estimators with an application to the sharp regression discontinuity design

Optimal bandwidth selection for differences of nonparametric estimators with an application to the sharp regression discontinuity design Optimal bandwidth selection for differences of nonparametric estimators with an application to the sharp regression discontinuity design Yoichi Arai Hidehiko Ichimura The Institute for Fiscal Studies Department

More information

A Simple Adjustment for Bandwidth Snooping

A Simple Adjustment for Bandwidth Snooping A Simple Adjustment for Bandwidth Snooping Timothy B. Armstrong Yale University Michal Kolesár Princeton University October 18, 2016 Abstract Kernel-based estimators are often evaluated at multiple bandwidths

More information

Economics 583: Econometric Theory I A Primer on Asymptotics

Economics 583: Econometric Theory I A Primer on Asymptotics Economics 583: Econometric Theory I A Primer on Asymptotics Eric Zivot January 14, 2013 The two main concepts in asymptotic theory that we will use are Consistency Asymptotic Normality Intuition consistency:

More information

The problem of causality in microeconometrics.

The problem of causality in microeconometrics. The problem of causality in microeconometrics. Andrea Ichino University of Bologna and Cepr June 11, 2007 Contents 1 The Problem of Causality 1 1.1 A formal framework to think about causality....................................

More information

Semi and Nonparametric Models in Econometrics

Semi and Nonparametric Models in Econometrics Semi and Nonparametric Models in Econometrics Part 4: partial identification Xavier d Haultfoeuille CREST-INSEE Outline Introduction First examples: missing data Second example: incomplete models Inference

More information

Inference For High Dimensional M-estimates. Fixed Design Results

Inference For High Dimensional M-estimates. Fixed Design Results : Fixed Design Results Lihua Lei Advisors: Peter J. Bickel, Michael I. Jordan joint work with Peter J. Bickel and Noureddine El Karoui Dec. 8, 2016 1/57 Table of Contents 1 Background 2 Main Results and

More information

The Bootstrap: Theory and Applications. Biing-Shen Kuo National Chengchi University

The Bootstrap: Theory and Applications. Biing-Shen Kuo National Chengchi University The Bootstrap: Theory and Applications Biing-Shen Kuo National Chengchi University Motivation: Poor Asymptotic Approximation Most of statistical inference relies on asymptotic theory. Motivation: Poor

More information

Bayesian nonparametric predictive approaches for causal inference: Regression Discontinuity Methods

Bayesian nonparametric predictive approaches for causal inference: Regression Discontinuity Methods Bayesian nonparametric predictive approaches for causal inference: Regression Discontinuity Methods George Karabatsos University of Illinois-Chicago ERCIM Conference, 14-16 December, 2013 Senate House,

More information

Causal Inference with Big Data Sets

Causal Inference with Big Data Sets Causal Inference with Big Data Sets Marcelo Coca Perraillon University of Colorado AMC November 2016 1 / 1 Outlone Outline Big data Causal inference in economics and statistics Regression discontinuity

More information

Simultaneous Selection of Optimal Bandwidths for the Sharp Regression Discontinuity Estimator

Simultaneous Selection of Optimal Bandwidths for the Sharp Regression Discontinuity Estimator GRIPS Discussion Paper 14-03 Simultaneous Selection of Optimal Bandwidths for the Sharp Regression Discontinuity Estimator Yoichi Arai Hidehiko Ichimura April 014 National Graduate Institute for Policy

More information

Regression Discontinuity Designs in Stata

Regression Discontinuity Designs in Stata Regression Discontinuity Designs in Stata Matias D. Cattaneo University of Michigan July 30, 2015 Overview Main goal: learn about treatment effect of policy or intervention. If treatment randomization

More information

IDENTIFICATION OF MARGINAL EFFECTS IN NONSEPARABLE MODELS WITHOUT MONOTONICITY

IDENTIFICATION OF MARGINAL EFFECTS IN NONSEPARABLE MODELS WITHOUT MONOTONICITY Econometrica, Vol. 75, No. 5 (September, 2007), 1513 1518 IDENTIFICATION OF MARGINAL EFFECTS IN NONSEPARABLE MODELS WITHOUT MONOTONICITY BY STEFAN HODERLEIN AND ENNO MAMMEN 1 Nonseparable models do not

More information

Robust Backtesting Tests for Value-at-Risk Models

Robust Backtesting Tests for Value-at-Risk Models Robust Backtesting Tests for Value-at-Risk Models Jose Olmo City University London (joint work with Juan Carlos Escanciano, Indiana University) Far East and South Asia Meeting of the Econometric Society

More information

Time Series and Forecasting Lecture 4 NonLinear Time Series

Time Series and Forecasting Lecture 4 NonLinear Time Series Time Series and Forecasting Lecture 4 NonLinear Time Series Bruce E. Hansen Summer School in Economics and Econometrics University of Crete July 23-27, 2012 Bruce Hansen (University of Wisconsin) Foundations

More information

Supplement to Fuzzy Differences-in-Differences

Supplement to Fuzzy Differences-in-Differences Supplement to Fuzzy Differences-in-Differences Clément de Chaisemartin Xavier D Haultfœuille June 30, 2017 Abstract This paper gathers the supplementary material to de Chaisemartin and D Haultfœuille (2017).

More information

A Simple Adjustment for Bandwidth Snooping

A Simple Adjustment for Bandwidth Snooping A Simple Adjustment for Bandwidth Snooping Timothy B. Armstrong Yale University Michal Kolesár Princeton University June 28, 2017 Abstract Kernel-based estimators such as local polynomial estimators in

More information

A Course in Applied Econometrics. Lecture 5. Instrumental Variables with Treatment Effect. Heterogeneity: Local Average Treatment Effects.

A Course in Applied Econometrics. Lecture 5. Instrumental Variables with Treatment Effect. Heterogeneity: Local Average Treatment Effects. A Course in Applied Econometrics Lecture 5 Outline. Introduction 2. Basics Instrumental Variables with Treatment Effect Heterogeneity: Local Average Treatment Effects 3. Local Average Treatment Effects

More information

It s never too LATE: A new look at local average treatment effects with or without defiers

It s never too LATE: A new look at local average treatment effects with or without defiers It s never too LATE: A new look at local average treatment effects with or without defiers by Christian M. Dahl, Martin Huber and Giovanni Mellace Discussion Papers on Business and Economics No. 2/2017

More information

Fuzzy Differences-in-Differences

Fuzzy Differences-in-Differences Review of Economic Studies (2017) 01, 1 30 0034-6527/17/00000001$02.00 c 2017 The Review of Economic Studies Limited Fuzzy Differences-in-Differences C. DE CHAISEMARTIN University of California at Santa

More information

TESTING REGRESSION MONOTONICITY IN ECONOMETRIC MODELS

TESTING REGRESSION MONOTONICITY IN ECONOMETRIC MODELS TESTING REGRESSION MONOTONICITY IN ECONOMETRIC MODELS DENIS CHETVERIKOV Abstract. Monotonicity is a key qualitative prediction of a wide array of economic models derived via robust comparative statics.

More information

Distributional Tests for Regression Discontinuity: Theory and Empirical Examples

Distributional Tests for Regression Discontinuity: Theory and Empirical Examples Distributional Tests for Regression Discontinuity: Theory and Empirical Examples Shu Shen University of California, Davis shushen@ucdavis.edu Xiaohan Zhang University of California, Davis xhzhang@ucdavis.edu

More information

A Discontinuity Test for Identification in Nonparametric Models with Endogeneity

A Discontinuity Test for Identification in Nonparametric Models with Endogeneity A Discontinuity Test for Identification in Nonparametric Models with Endogeneity Carolina Caetano 1 Christoph Rothe 2 Nese Yildiz 1 1 Department of Economics 2 Department of Economics University of Rochester

More information

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies Kosuke Imai Department of Politics Princeton University November 13, 2013 So far, we have essentially assumed

More information

Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity

Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity Songnian Chen a, Xun Lu a, Xianbo Zhou b and Yahong Zhou c a Department of Economics, Hong Kong University

More information

Identification Analysis for Randomized Experiments with Noncompliance and Truncation-by-Death

Identification Analysis for Randomized Experiments with Noncompliance and Truncation-by-Death Identification Analysis for Randomized Experiments with Noncompliance and Truncation-by-Death Kosuke Imai First Draft: January 19, 2007 This Draft: August 24, 2007 Abstract Zhang and Rubin 2003) derives

More information

Regression Discontinuity Design on Model Schools Value-Added Effects: Empirical Evidence from Rural Beijing

Regression Discontinuity Design on Model Schools Value-Added Effects: Empirical Evidence from Rural Beijing Regression Discontinuity Design on Model Schools Value-Added Effects: Empirical Evidence from Rural Beijing Kai Hong CentER Graduate School, Tilburg University April 2010 Abstract In this study we examine

More information

Lecture 8 Inequality Testing and Moment Inequality Models

Lecture 8 Inequality Testing and Moment Inequality Models Lecture 8 Inequality Testing and Moment Inequality Models Inequality Testing In the previous lecture, we discussed how to test the nonlinear hypothesis H 0 : h(θ 0 ) 0 when the sample information comes

More information

large number of i.i.d. observations from P. For concreteness, suppose

large number of i.i.d. observations from P. For concreteness, suppose 1 Subsampling Suppose X i, i = 1,..., n is an i.i.d. sequence of random variables with distribution P. Let θ(p ) be some real-valued parameter of interest, and let ˆθ n = ˆθ n (X 1,..., X n ) be some estimate

More information

ON THE CHOICE OF TEST STATISTIC FOR CONDITIONAL MOMENT INEQUALITES. Timothy B. Armstrong. October 2014 Revised July 2017

ON THE CHOICE OF TEST STATISTIC FOR CONDITIONAL MOMENT INEQUALITES. Timothy B. Armstrong. October 2014 Revised July 2017 ON THE CHOICE OF TEST STATISTIC FOR CONDITIONAL MOMENT INEQUALITES By Timothy B. Armstrong October 2014 Revised July 2017 COWLES FOUNDATION DISCUSSION PAPER NO. 1960R2 COWLES FOUNDATION FOR RESEARCH IN

More information

SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011

SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011 SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER By Donald W. K. Andrews August 2011 COWLES FOUNDATION DISCUSSION PAPER NO. 1815 COWLES FOUNDATION FOR RESEARCH IN ECONOMICS

More information

Inference For High Dimensional M-estimates: Fixed Design Results

Inference For High Dimensional M-estimates: Fixed Design Results Inference For High Dimensional M-estimates: Fixed Design Results Lihua Lei, Peter Bickel and Noureddine El Karoui Department of Statistics, UC Berkeley Berkeley-Stanford Econometrics Jamboree, 2017 1/49

More information

Optimal Bandwidth Choice for Robust Bias Corrected Inference in Regression Discontinuity Designs

Optimal Bandwidth Choice for Robust Bias Corrected Inference in Regression Discontinuity Designs Optimal Bandwidth Choice for Robust Bias Corrected Inference in Regression Discontinuity Designs Sebastian Calonico Matias D. Cattaneo Max H. Farrell September 14, 2018 Abstract Modern empirical work in

More information

Comparison of inferential methods in partially identified models in terms of error in coverage probability

Comparison of inferential methods in partially identified models in terms of error in coverage probability Comparison of inferential methods in partially identified models in terms of error in coverage probability Federico A. Bugni Department of Economics Duke University federico.bugni@duke.edu. September 22,

More information

Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix

Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix Yingying Dong and Arthur Lewbel California State University Fullerton and Boston College July 2010 Abstract

More information

Lecture 10 Regression Discontinuity (and Kink) Design

Lecture 10 Regression Discontinuity (and Kink) Design Lecture 10 Regression Discontinuity (and Kink) Design Economics 2123 George Washington University Instructor: Prof. Ben Williams Introduction Estimation in RDD Identification RDD implementation RDD example

More information

Taisuke Otsu, Ke-Li Xu, Yukitoshi Matsushita Empirical likelihood for regression discontinuity design

Taisuke Otsu, Ke-Li Xu, Yukitoshi Matsushita Empirical likelihood for regression discontinuity design Taisuke Otsu, Ke-Li Xu, Yukitoshi Matsushita Empirical likelihood for regression discontinuity design Article Accepted version Refereed Original citation: Otsu, Taisuke, Xu, Ke-Li and Matsushita, Yukitoshi

More information

Density estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas

Density estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas 0 0 5 Motivation: Regression discontinuity (Angrist&Pischke) Outcome.5 1 1.5 A. Linear E[Y 0i X i] 0.2.4.6.8 1 X Outcome.5 1 1.5 B. Nonlinear E[Y 0i X i] i 0.2.4.6.8 1 X utcome.5 1 1.5 C. Nonlinearity

More information

What s New in Econometrics. Lecture 13

What s New in Econometrics. Lecture 13 What s New in Econometrics Lecture 13 Weak Instruments and Many Instruments Guido Imbens NBER Summer Institute, 2007 Outline 1. Introduction 2. Motivation 3. Weak Instruments 4. Many Weak) Instruments

More information

Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods

Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods Robert V. Breunig Centre for Economic Policy Research, Research School of Social Sciences and School of

More information

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 4 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 23 Recommended Reading For the today Serial correlation and heteroskedasticity in

More information

Inference in Regression Discontinuity Designs with a Discrete Running Variable

Inference in Regression Discontinuity Designs with a Discrete Running Variable Inference in Regression Discontinuity Designs with a Discrete Running Variable Michal Kolesár Christoph Rothe June 21, 2016 Abstract We consider inference in regression discontinuity designs when the running

More information