Regression Discontinuity Designs with a Continuous Treatment


Yingying Dong, Ying-Ying Lee, Michael Gou

First version: April 2017; this version: December 2018

Abstract

This paper provides identification and inference theory for the class of regression discontinuity (RD) designs with a continuous treatment. Causal identification is achieved by utilizing any changes in the treatment distribution at the RD threshold (including the usual mean change as a special case). We provide bias-corrected robust inference for the identified treatment effects, either at a given treatment quantile or averaging over the treatment distribution. Also provided are asymptotically mean squared error optimal bandwidths. Our model applies to a large class of policies that target parts or features of the treatment distribution, such as changing the mean, changing the variance, or shifting one or both tails of the distribution. We apply the proposed approach to investigate the impacts of bank capital on bank stability, taking advantage of the minimum capital requirements in the early 20th century. This policy targets small banks, creating shifts in the lower tail of the capital distribution at certain policy thresholds.

JEL codes: C1, C5, I3

Keywords: Regression discontinuity (RD) design, Continuous treatment, Control variable, Robust inference, Distributional change, Rank invariance, Rank similarity, Capital regulation, Bank stability

1 Introduction

RD designs originated in the early 1960s (Thistlethwaite and Campbell, 1960). Hahn, Todd, and van der Klaauw (2001) formally establish the identification theory for the RD design. Since then, a large number of studies have applied and extended RD designs for treatment effect identification and policy evaluation.[1] The standard RD design assumes a binary treatment. Let $Y \in \mathcal{Y} \subseteq \mathbb{R}$ be the outcome of interest, which can be continuous or discrete, and $T \in \mathcal{T} \subseteq \mathbb{R}$ be a treatment. Let $R \in \mathcal{R} \subseteq \mathbb{R}$ be the continuous running or forcing variable that partly determines

University of California Irvine. Yingying Dong, yyd@uci.edu. Ying-Ying Lee, yingying.lee@uci.edu.

[1] For theoretical discussion of the standard RD design, see, e.g., Porter (2003), Lee (2009), Imbens and Kalyanaraman (2012), Frandsen, Frölich, and Melly (2012), Calonico, Cattaneo, and Titiunik (2014), Cattaneo, Frandsen, and Titiunik (2015), Otsu, Xu, and Matsushita (2015), Dong and Lewbel (2015), Angrist and Rokkanen (2015), Feir, Lemieux, and Marmer (2016), Bertanha (2016), Dong (2018), Arai et al. (2017), Chiang, Hsu, and Sasaki (2018), Bugni and Canay (2018), Canay and Kamat (2018), Cattaneo, Jansson, and Ma (2018), and Gerard, Rokkanen, and Rothe (2018).

the treatment. Define $Z \equiv 1(R \geq r_0)$ for a known cutoff value of the running variable $r_0$, where $1(\cdot)$ is an indicator function equal to 1 if the expression in the parentheses is true and 0 otherwise. When the treatment $T$ is binary, let $Y_t$, $t = 1, 0$, be the potential outcome under treatment or no treatment (Rubin, 1974), and further let $C$ be the set of compliers who are treated if and only if they are above the RD threshold, i.e., $T = Z$ for compliers. In a seminal paper, Hahn, Todd, and van der Klaauw (2001) show that the following local Wald ratio identifies a local average treatment effect (LATE) for compliers at the RD threshold:

$$\frac{\lim_{r \to r_0^+} E[Y \mid R = r] - \lim_{r \to r_0^-} E[Y \mid R = r]}{\lim_{r \to r_0^+} E[T \mid R = r] - \lim_{r \to r_0^-} E[T \mid R = r]} = E[Y_1 - Y_0 \mid R = r_0, C].$$

In practice, many empirical applications of RD designs involve continuous treatments (see, for recent examples, Oreopoulos, 2006, Card, Chetty, and Weber, 2007, Schmieder, von Wachter, and Bender, 2012, Pop-Eleches and Urquiola, 2013, and Clark and Royer, 2013). Empirical researchers typically apply the above standard RD estimand, derived for a binary treatment, to applications with continuous treatments. Intuitively, with a continuous treatment, under certain conditions this local Wald ratio might identify a causal effect, provided that the mean of the treatment changes at the RD cutoff, i.e., $\lim_{r \to r_0^+} E[T \mid R = r] \neq \lim_{r \to r_0^-} E[T \mid R = r]$. Even when it is valid, the interpretation of the above local Wald ratio for a continuous treatment is more complicated than in the case of a binary treatment.

More importantly, many public policies target not necessarily the average units, but only those in some parts (e.g., the top or bottom) of the distribution for treatment. Examples include minimum school leaving age, minimum wage, maximum welfare benefits, etc. When there is little or no mean change in the treatment, identification in the standard RD design can be weak or can fail. This can occur, e.g., if treatment changes concentrate in a few selected quantiles. Unlike the weak IV approach (Feir, Lemieux, and Marmer, 2016), we seek to provide alternative ways of identifying and estimating causal treatment effects. By focusing on where the true changes are in the treatment distribution, we provide what are likely to be the most policy relevant treatment effects.

Consider our empirical scenario for concreteness. We are interested in investigating how capital regulation affects bank capital and thereby affects banks' behaviors and long-run suspension rates. We take advantage of the minimum capital requirement rule in the early 20th century United States, which features abrupt changes at certain policy thresholds. Figure 1 provides a scatter plot of the treatment variable against the running variable, where the treatment variable is bank capital, and the running variable is the population of the town where a bank was located. What is striking about this scatter plot is its staircase-shaped bottom contour. The bottom contour is dictated by the minimum capital requirements at different town populations, which we mark using the solid line. For example, in towns with a population just below 3,000, banks were required to hold a minimum capital of $25,000, while in towns with a population just above 3,000, banks were

Figure 1: Minimum capital requirements in 1905

required to hold twice as much. There were two other thresholds, at populations of 6,000 and 50,000. The minimum capital requirements target banks with low levels of capital, so changes in bank capital occur primarily at the bottom of the capital distribution at each policy threshold. Empirical analysis in Section 7 shows that there is no significant mean change in bank capital around the first threshold, where most of the banks are located. Applying the standard RD design would be difficult in this case.

Let $T_0$ ($T_1$) be the potential capital a bank would hold if the bank operated in a town with a population just below (above) the 3,000 policy threshold. Write $T = \alpha + \beta Z$, where the random variables $\alpha = T_0$ and $\beta = T_1 - T_0$. One may then utilize this first-stage heterogeneity or random coefficient to identify causal effects of the treatment. Following Imbens and Newey (2009), we construct a control variable, which is the conditional cumulative distribution function of T given Z and R. Intuitively, this control variable isolates exogenous distributional changes in T, instead of mean changes, for identification.

We establish new identification results for the class of RD designs with a continuous treatment, utilizing any distributional changes in the treatment at the RD cutoff. The distributional changes include the usual mean change as a special case. We identify local average treatment effects at different treatment quantiles. These quantile-specific local average treatment effects (Q-LATEs) provide information on treatment effect heterogeneity at different treatment levels. We also identify a local weighted average treatment effect averaging over the treatment distribution (WQ-LATE). Importantly, the WQ-LATE estimand incorporates the standard RD local Wald ratio as a special case. It works (and is the same) when the standard RD estimand works, and can still work when the standard RD estimand does not. We further provide both conventional undersmoothing and bias-corrected robust inference for Q-LATE and WQ-LATE, as well as their associated asymptotic mean squared error (AMSE) optimal bandwidths.

The new theory allows us to quantify how capital regulation affects bank capital and thereby affects

bank outcomes, particularly among those banks targeted by the capital regulation. We show that while capital requirements lead small banks to hold more capital, these banks respond in ways that prevent the regulation from having its intended effects. On average, a 1% increase in capital leads to almost a 1% increase in assets among banks at the lower quantiles of the capital distribution. Leverage is not significantly lowered, so these banks' long-run (up to 24 years) risk of suspension stays unchanged.

Our paper complements the existing studies of the classical RD design with a binary treatment. It is further related to several other important strands of literature. For example, we discuss causal model identification with a continuous treatment. Existing studies on causal model identification typically focus on binary treatments. This includes the LATE literature (see, e.g., Imbens and Angrist, 1994, and Angrist, Imbens, and Rubin, 1996), the local quantile treatment effect (LQTE) literature (see, e.g., Abadie, Angrist, and Imbens, 2002, Abadie, 2003, and Frölich and Melly, 2013), and the marginal treatment effect (MTE) literature (see, e.g., Vytlacil, 2002, Heckman and Vytlacil, 2005, 2007, and Carneiro, Heckman, and Vytlacil, 2010). Important work discussing causal identification with a continuous treatment includes Angrist, Graddy, and Imbens (2000) and Florens et al. (2008), among others.

More broadly, our paper is related to the non-separable IV literature with continuous endogenous covariates, where identification typically requires a scalar unobservable (rank invariance) in either the first-stage or the outcome equation, or both (see, e.g., Chesher, 2003, Horowitz and Lee, 2007, Chernozhukov, Imbens, and Newey, 2007, Florens et al., 2008, Imbens and Newey, 2009, D'Haultfoeuille and Février, 2015, and Torgovitsky, 2015). In contrast, we allow for multidimensional unobservables in both the first-stage and outcome equations and exemplify identification with a binary IV in this case. Torgovitsky (2015) discusses the identifying power of imposing restrictions on heterogeneity or the dimensions of unobservables. He shows that by imposing rank invariance in both the first-stage and outcome equations, one can identify an infinite-dimensional object even with a discrete or binary IV. See also the similar discussion in D'Haultfoeuille and Février (2015).

Our paper is also related to the IV literature using first-stage heterogeneity for identification. See, for recent examples, Brinch et al. (2017) and Caetano and Escanciano (2017). These existing studies consider heterogeneity of treatment responses in covariates. In contrast, we consider heterogeneity of treatment responses along the treatment distribution. In addition, the RD QTE model of Frandsen, Frölich, and Melly (2012) considers a binary treatment in the RD design, but identifies heterogeneous treatment effects along the outcome distribution.

The rest of the paper proceeds as follows. Section 2 defines the causal parameters of interest and provides our main identification results. Section 3 provides a robust estimand that incorporates the standard RD estimand as a special case. Section 4 proposes convenient tests for our identifying assumptions and discusses including covariates. Section 5 describes estimation. Section 6 provides conventional and robust inference as well as the AMSE optimal bandwidths. Section 7 presents the empirical analysis. Short concluding remarks are provided in Section 8. All proofs are in the Appendix.
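For concreteness, the sketch below illustrates the standard fuzzy RD local Wald ratio discussed in the introduction, estimated by triangular-kernel local linear regressions on each side of the cutoff. It is an illustrative implementation rather than the authors' code; the function names, kernel, and bandwidth h are placeholder choices.

```python
# Minimal sketch (assumed names and bandwidth), not the authors' implementation.
import numpy as np


def boundary_value(x, r, cutoff, h, side):
    """Local linear fit of x on (R - cutoff) on one side; the intercept estimates the boundary limit."""
    d = r - cutoff
    mask = (d >= 0) if side == "+" else (d < 0)
    w = np.clip(1.0 - np.abs(d[mask]) / h, 0.0, None)   # triangular kernel weights
    keep = w > 0
    d_k, x_k, w_k = d[mask][keep], x[mask][keep], w[keep]
    X = np.column_stack([np.ones_like(d_k), d_k])
    beta = np.linalg.solve((X * w_k[:, None]).T @ X, (X * w_k[:, None]).T @ x_k)
    return beta[0]


def fuzzy_rd_wald(y, t, r, cutoff, h):
    """Jump in E[Y | R] divided by jump in E[T | R] at the cutoff."""
    dy = boundary_value(y, r, cutoff, h, "+") - boundary_value(y, r, cutoff, h, "-")
    dt = boundary_value(t, r, cutoff, h, "+") - boundary_value(t, r, cutoff, h, "-")
    return dy / dt
```

For instance, fuzzy_rd_wald(y, t, r, cutoff, h) returns the ratio of the estimated jump in the conditional mean of Y to the estimated jump in the conditional mean of T at the cutoff; when the first-stage mean jump is near zero, this ratio becomes unstable, which is the weak-identification problem the paper addresses.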

2 Identification

We present causal identification in RD designs with a continuous treatment, following the control variable approach of Imbens and Newey (2009). Various discussions of the control variable approach to simultaneous equations models include Blundell and Powell (2003), Newey, Powell, and Vella (1999), Pinkse (2000), and Ma and Koenker (2006).

Consider an outcome equation $Y = G(T, R, \varepsilon)$, where $\varepsilon \in \mathcal{E} \subseteq \mathbb{R}^{d_\varepsilon}$ is allowed to be of arbitrary dimension.[2] The continuous treatment variable can be written as $T = T_1 1(R \geq r_0) + T_0 1(R < r_0)$, where $T_z$, for $z = 0, 1$, is the potential treatment when $Z \equiv 1(R \geq r_0)$ is set to a hypothetical value $z$. Assume $T_z$ has a reduced-form equation $T_z = q_z(R, U_z)$ with an unobserved disturbance $U_z$. Let $F_{\cdot \mid \cdot}(\cdot, \cdot)$ and $f_{\cdot \mid \cdot}(\cdot, \cdot)$ be the conditional cumulative distribution function (CDF) and probability density function (PDF), respectively, and $f_{\cdot}(\cdot)$ be the unconditional PDF. The following discussion focuses on $R \in \mathcal{R}$, where $\mathcal{R}$ is an arbitrarily small compact interval around the known cutoff $r_0$.

Assumption 1 (Monotonic Quantile). For $z = 0, 1$, $T_z$ is a continuous random variable with a conditional CDF $F_{T_z \mid R}(T_z, r)$ for any $r \in \mathcal{R}$ that is strictly increasing on its support; further, $q_z(R, U_z)$ is strictly monotonic in $U_z$ with probability 1.

Under Assumption 1, without loss of generality, one can let $q_z(R, U_z)$ be the conditional quantile function of $T_z$ given $R$, where $U_z = F_{T_z \mid R}(T_z, R)$ is the conditional rank of $T_z$ given $R$, and $U_z \sim \mathrm{Unif}(0, 1)$.[3]

Assumption 2 (Smoothness). $q_z(\cdot, u)$, $z = 0, 1$, is a continuous function for all $u \in (0, 1)$. $G(\cdot, \cdot, \cdot)$ is a continuous function. $f_{\varepsilon \mid U_z, R}(e, u, r)$ for all $e \in \mathcal{E}$ and $u \in (0, 1)$ is continuous in $r \in \mathcal{R}$. $f_R(r)$ is continuous and strictly positive at $r = r_0$.

Assumption 3 (Local rank invariance or local rank similarity). 1. $U_0 = U_1$ conditional on $R = r_0$; or, more generally, 2. $U_0 \mid (\varepsilon, R = r_0) \sim U_1 \mid (\varepsilon, R = r_0)$.

Assumption 2 assumes that the running variable has only smooth effects on the potential treatments and that the running variable, treatment, and unobservables all have smooth impacts on the outcome. It further assumes that at a given rank of the potential treatment, the distribution of the unobservables in the outcome model is smooth near the RD threshold. The last condition, that the running variable is continuous with a positive density at the RD threshold, is a standard assumption that is essentially required for any RD design. Assumption 2 in practice requires the no-manipulation condition that units cannot sort to be just above or below the RD threshold (McCrary, 2008).

[2] For clarity, we assume that any observable covariates are subsumed in $\varepsilon$. All the discussion can be extended to explicitly condition on covariates.

[3] Suppose more generally $T_z = H_z(R, V_z)$, $z = 0, 1$, where $V_z$ is a continuous random variable that is not necessarily uniformly distributed and $H_z$ is strictly monotonic in $V_z$. One can simply normalize $T_z = H_z(R, V_z)$ to its quantile representation by a strictly monotonic transformation of $V_z$, i.e., set $U_z = F_{V_z \mid R}(V_z, R)$.
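As a conceptual illustration of the conditional rank $U_z = F_{T_z \mid R}(T_z, R)$ introduced above (not part of the estimation procedure of Section 5), the following sketch computes kernel-weighted empirical conditional ranks of T given R separately on each side of the cutoff. The triangular kernel, bandwidth h, and function name are illustrative assumptions.

```python
# Rough, assumed illustration: kernel-weighted empirical conditional CDF of T given R,
# computed within each side of the cutoff (i.e., given Z). O(n^2), for exposition only.
import numpy as np

def control_variable(t, r, cutoff, h):
    """Return estimated conditional ranks U_i of T_i given the side of the cutoff and R_i."""
    u = np.empty_like(t, dtype=float)
    side = r >= cutoff
    for s in (True, False):
        idx = np.where(side == s)[0]
        for i in idx:
            w = np.clip(1.0 - np.abs(r[idx] - r[i]) / h, 0.0, None)  # triangular kernel in R
            u[i] = np.sum(w * (t[idx] <= t[i])) / np.sum(w)          # weighted empirical CDF at T_i
    return u
```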

Assumption 3 imposes local rank restrictions. In particular, rank invariance or rank similarity is required to hold only at the RD cutoff. Assumption 3.1 requires units to stay at the same rank of the potential treatment distribution right above or below the RD threshold. Assumption 3.2 assumes local rank similarity, a weaker condition than Assumption 3.1. Without conditioning on $\varepsilon$, $U_0$ and $U_1$ given $R = r_0$ both follow a uniform distribution over the unit interval, i.e., $U_0 \mid (R = r_0) \sim U_1 \mid (R = r_0)$ by construction. Local rank similarity here permits random slippages from the common rank level in the treatment distribution just above or just below the RD cutoff. Rank similarity has been proposed to identify quantile treatment effects (QTEs) in IV models (Chernozhukov and Hansen, 2005, 2006). Unlike Chernozhukov and Hansen (2005, 2006), we impose the similarity assumption on the ranks of potential treatments, instead of the ranks of potential outcomes.

In our empirical analysis, Assumption 3 requires that if a bank tends to hold more capital when operating in a town with a population just below 3,000, then it would also tend to hold more capital when operating in a town with a population at or right above 3,000. These local rank restrictions (along with Assumptions 1 and 2) have readily testable implications. See Dong and Shen (2018) and Frandsen and Lefgren (2018) for discussion of the testable implications of rank invariance or rank similarity in treatment models. In Section 4 we discuss a convenient test for these assumptions in our setting.

The following lemma shows that, conditioning on $U \equiv U_1 1(R \geq r_0) + U_0 1(R < r_0)$, any changes in the outcome at the RD threshold are causally related to changes in the treatment.

Lemma 1. Let Assumptions 1-3 hold. 1. $T \perp \varepsilon \mid (U, R)$. 2. For any integrable function of Y, $\Gamma(Y)$, and any $u \in (0, 1)$,

$$\lim_{r \to r_0^+} E[\Gamma(Y) \mid U = u, R = r] - \lim_{r \to r_0^-} E[\Gamma(Y) \mid U = u, R = r] = \int \left( \Gamma\big(G(q_1(r_0, u), r_0, e)\big) - \Gamma\big(G(q_0(r_0, u), r_0, e)\big) \right) dF_{\varepsilon \mid U, R}(e, u, r_0).$$

Lemma 1.1 states that $U$ is a control variable conditional on $R$. The defining feature of any control variable is that, conditional on this variable, treatment is exogenous to the outcome of interest. Note that here the IV $Z \equiv 1(R \geq r_0)$ is binary and is a deterministic function of a possibly endogenous covariate $R$. When $U = u$ and $R = r \in \mathcal{R} \setminus \{r_0\}$, T is deterministic, i.e., $T = q_1(r, u)$ for $r > r_0$ and $T = q_0(r, u)$ for $r < r_0$; whereas when $U = u$ and $R = r_0$, T can randomly take on two potential values, $q_0(r_0, u)$ and $q_1(r_0, u)$. Therefore, causal identification with this control variable is still local to the RD cutoff, which is a generic feature of the RD design. Lemma 1.1 follows analogously from Theorem 1 of Imbens and Newey (2009).[4]

[4] Note, however, that Imbens and Newey (2009) focus on a continuous IV, which, when equipped with a large-support

Lemma 1.2 states that conditional on $U = u$, the mean difference in $\Gamma(Y)$ above and below the cutoff represents the impact of an exogenous change in treatment from $q_0(r_0, u)$ to $q_1(r_0, u)$. Lemma 1.2 gives the average effect of the IV Z on the outcome $\Gamma(Y)$ for individuals at the $u$ quantile of the treatment distribution.

Based on Lemma 1, we define our parameters of interest and discuss identification. Let $\mathcal{U} \equiv \{u \in (0, 1): |q_1(r_0, u) - q_0(r_0, u)| > 0\}$. For any $u \in \mathcal{U}$, define the treatment quantile specific LATE (Q-LATE) as

$$\tau(u) \equiv \frac{\int \big( G(q_1(r_0, u), r_0, e) - G(q_0(r_0, u), r_0, e) \big)\, dF_{\varepsilon \mid U, R}(e, u, r_0)}{q_1(r_0, u) - q_0(r_0, u)} = \frac{E[Y_1 \mid U = u, R = r_0] - E[Y_0 \mid U = u, R = r_0]}{q_1(r_0, u) - q_0(r_0, u)}. \quad (1)$$

Essentially, Q-LATE $\tau(u)$ is the ratio of the reduced-form effect of Z on Y to that of Z on T at the $u$ quantile of the treatment T.[5]

Further define the weighted average of Q-LATEs or, for short, WQ-LATE as $\pi(w) \equiv \int_{\mathcal{U}} \tau(u) w(u)\, du$, where $w(u) \geq 0$ and $\int_{\mathcal{U}} w(u)\, du = 1$. That is, $w(u)$ is a properly defined weighting function.

When the function $G(T, r_0, \varepsilon)$ is continuously differentiable in its first argument, both parameters can be expressed as weighted average derivatives of the outcome $G(T, r_0, \varepsilon)$ with respect to T. In particular, following Lemma 5 of Angrist, Graddy, and Imbens (2000), one can write the numerator of Q-LATE in (1) as

$$E\big[ G(q_1(r_0, u), r_0, \varepsilon) - G(q_0(r_0, u), r_0, \varepsilon) \mid U = u, R = r_0 \big] = E\left[ \int_{q_0(r_0, u)}^{q_1(r_0, u)} \partial_t G(t, r_0, \varepsilon)\, dt \,\middle|\, U = u, R = r_0 \right] = \int_{q_0(r_0, u)}^{q_1(r_0, u)} E\big[ \partial_t G(t, r_0, \varepsilon) \mid U = u, R = r_0 \big]\, dt,$$

where the second equality follows from the fact that, under standard regularity conditions, one can interchange

[4, cont.] assumption that essentially requires the IV to vary a lot after possibly conditioning on exogenous covariates, yields more general identification results.

[5] Consider the special case where Y is given by a linear correlated random coefficients model of the form $Y_i = a_i + b_i T_i$. Then Q-LATE $\tau(u) = E[b_i \mid U = u, R = r_0]$, which captures the average partial effect for units ranked at the $u$ quantile of the treatment distribution at the RD threshold.

the order of integration. $\tau(u)$ is then given by

$$\tau(u) = \int_{q_0(r_0, u)}^{q_1(r_0, u)} E\big[ \partial_t G(t, r_0, \varepsilon) \mid U = u, R = r_0 \big]\, \varkappa(u)\, dt, \quad \text{for } \varkappa(u) \equiv \frac{1}{q_1(r_0, u) - q_0(r_0, u)}.$$

That is, Q-LATE $\tau(u)$ is a weighted average derivative, averaging over the change in T at a given quantile $u$ at $r_0$. It follows that WQ-LATE $\pi(w)$ is also a weighted average derivative, averaging over both changes in T at a given quantile $u$ and over $u \in \mathcal{U}$ at the RD threshold.

To identify Q-LATE and WQ-LATE, we impose the following first-stage assumption.

Assumption 4 (First stage). $q_1(r_0, u) \neq q_0(r_0, u)$ for at least some $u \in (0, 1)$.

Assumption 4 requires that $F_{T_1 \mid R}(t, r_0) = F_{T_0 \mid R}(t, r_0)$ does not hold for all $t \in \mathcal{T}_0 \cap \mathcal{T}_1$, where $\mathcal{T}_z$, $z = 0, 1$, denotes the conditional support of $T_z$ given $R = r_0$.[6] This is in contrast to the standard RD first-stage assumption requiring a mean change, i.e., $E[T_1 \mid R = r_0] \neq E[T_0 \mid R = r_0]$. The weak IV literature considers the case when $E[T \mid Z = 1] - E[T \mid Z = 0]$ is close to zero. There, inference on the Wald ratio estimator is known to be difficult. We instead seek to provide alternative ways to identify and estimate causal treatment effects. Intuitively, if treatment changes concentrate in a few quantiles, then instead of looking at the average change, focusing on where the true changes are in the treatment distribution may strengthen identification.[7]

For notational convenience, let $T = q(R, U) \equiv q_0(R, U_0)(1 - Z) + q_1(R, U_1) Z$. Given smoothness of $q_z(r, U_z)$ by Assumption 2, $q(r, u)$ is right and left continuous in $r$ at $r = r_0$ for all $u \in (0, 1)$. Then define $q^+(u) \equiv \lim_{r \to r_0^+} q(r, u)$ and $q^-(u) \equiv \lim_{r \to r_0^-} q(r, u)$. Let $m(t, r) \equiv E[Y \mid T = t, R = r]$, and similarly define $m^+(u) \equiv \lim_{r \to r_0^+} m(q^+(u), r)$ and $m^-(u) \equiv \lim_{r \to r_0^-} m(q^-(u), r)$. $q^\pm(u)$ and $m^\pm(u)$ can be consistently estimated from the data. The following theorem provides identification of Q-LATE and WQ-LATE.

Theorem 1 (Identification). Under Assumptions 1-4, for any $u \in \mathcal{U}$, Q-LATE $\tau(u)$ is identified and is given by

$$\tau(u) = \frac{m^+(u) - m^-(u)}{q^+(u) - q^-(u)}. \quad (2)$$

Further, WQ-LATE $\pi(w) \equiv \int_{\mathcal{U}} \tau(u) w(u)\, du$ is identified for any known or estimable weighting function $w(u)$ such that $w(u) \geq 0$ and $\int_{\mathcal{U}} w(u)\, du = 1$.

[6] When either $F_{T_1 \mid R}(t, r_0)$ or $F_{T_0 \mid R}(t, r_0)$ is not defined for $t \in \mathcal{T}_0 \cup \mathcal{T}_1$, we assume that $F_{T_1 \mid R}(t, r_0) = F_{T_0 \mid R}(t, r_0)$ does not hold. Assumption 4 ensures that $\mathcal{U}$ is non-empty.

[7] When changes at some treatment quantiles are small, one can modify our trimming threshold for $\mathcal{U}$ to be above some constant $c > 0$ instead of 0.
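To see what Assumption 4 buys relative to the standard mean-change first stage, the following toy simulation (invented data, not from the paper) generates a treatment whose distribution shifts only in its lower tail at the cutoff, loosely mimicking a minimum-requirement policy: the mean change is small, resembling a weak standard RD first stage, while low-quantile changes are pronounced.

```python
# Toy illustration only: all distributions, the floor level, and the window are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
r = rng.uniform(-1.0, 1.0, n)                    # running variable, cutoff at 0
z = r >= 0.0
t = rng.lognormal(mean=0.0, sigma=0.5, size=n)   # latent treatment absent the policy
t = np.where(z, np.maximum(t, 0.6), t)           # policy imposes a floor only above the cutoff

near = np.abs(r) < 0.1                           # crude local window around the cutoff
above, below = t[near & z], t[near & ~z]
print("mean change:", above.mean() - below.mean())
for q in (0.05, 0.10, 0.50, 0.90):
    print(f"quantile {q}: change = {np.quantile(above, q) - np.quantile(below, q):.3f}")
```

In this design the mean change is an order of magnitude smaller than the change at the 5th percentile, while the median and upper quantiles barely move, which is exactly the situation in which the quantile-level first stage of Assumption 4 remains informative.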

To aggregate Q-LATEs, one simple weighting function is equal weighting, i.e., $w(u) = w_S(u) \equiv 1 / \int_{\mathcal{U}} 1\, du$. One may choose other properly defined weighting functions. $w(u)$ is required to be nonnegative; otherwise, $\pi(w)$ can be a weighted difference of the average treatment effects among those who change treatment levels at the RD threshold. The next section shows that the standard RD estimand can be expressed as a WQ-LATE, using a particular weighting function. In the special case when the treatment effect is locally constant, the weighting function does not matter: with any valid weighting function, one identifies the same homogeneous treatment effect.

Replacing Y by any integrable function $\Gamma(Y)$ in the above, one can readily identify the Q-LATE and WQ-LATE on $\Gamma(Y)$. In addition to Q-LATE and WQ-LATE, one may identify other parameters. Conditional on $U = u$, there are two potential treatment values at $r = r_0$, in particular, $t_0 \equiv q_0(r_0, u)$ and $t_1 \equiv q_1(r_0, u)$. One can then identify the potential outcome distributions at each $u \in (0, 1)$ at these two treatment values. Assume that Y is continuous. Let the potential outcome corresponding to the treatment value $t \in \mathcal{T}$ be $Y_t \equiv G(t, R, \varepsilon)$. Under Assumptions 1-3, $F_{Y_{t_1} \mid U, R}(y, u, r_0) = \lim_{r \to r_0^+} E\big[ 1(Y \leq y) \mid T = q^+(u), R = r \big]$ for any $u \in (0, 1)$. $F_{Y_{t_0} \mid U, R}(y, u, r_0)$ can be analogously identified. Further, one can identify the LQTE at each $u$ when treatment changes from $q_0(r_0, u)$ to $q_1(r_0, u)$. It is given by $Q_{Y_{t_1} \mid U, R}(v, u, r_0) - Q_{Y_{t_0} \mid U, R}(v, u, r_0)$ for any $v \in (0, 1)$ and $u \in \mathcal{U}$, where $Q_{Y_{t_z} \mid U, R}(v, u, r_0) \equiv F^{-1}_{Y_{t_z} \mid U, R}(v, u, r_0)$.

3 Robust estimand

In this section, we discuss the standard RD estimand and show that it can be expressed as a WQ-LATE, using a particular weighting function. We then provide a robust estimand that incorporates the standard RD estimand as a special case. That is, it works and is equivalent to the standard RD estimand when the standard RD estimand works, and it continues to work under our assumptions when the standard RD estimand does not.

3.1 Standard RD estimand

Consider the standard RD estimand in the form of the local Wald ratio, and rewrite it as follows:

$$\pi_{RD} \equiv \frac{\lim_{r \to r_0^+} E[Y \mid R = r] - \lim_{r \to r_0^-} E[Y \mid R = r]}{\lim_{r \to r_0^+} E[T \mid R = r] - \lim_{r \to r_0^-} E[T \mid R = r]}$$
$$= \frac{\int_0^1 \big\{ E[Y_1 \mid U = u, R = r_0] - E[Y_0 \mid U = u, R = r_0] \big\}\, du}{\int_0^1 \big( q_1(r_0, u) - q_0(r_0, u) \big)\, du} \quad (3)$$
$$= \int_{\mathcal{U}} \frac{E[Y_1 \mid U = u, R = r_0] - E[Y_0 \mid U = u, R = r_0]}{\Delta q(u)} \cdot \frac{\Delta q(u)}{\int_{\mathcal{U}} \Delta q(u)\, du}\, du \quad (4)$$
$$= \int_{\mathcal{U}} \tau(u)\, \frac{\Delta q(u)}{\int_{\mathcal{U}} \Delta q(u)\, du}\, du, \quad (5)$$

where $\Delta q(u) \equiv q_1(r_0, u) - q_0(r_0, u)$; equation (3) follows from the smoothness conditions in Assumption 2, (4) follows from rearranging, and (5) follows from Assumption 3 and the definition of Q-LATE $\tau(u)$.

Equation (5) shows that under our assumptions, the standard RD estimand identifies a weighted average of Q-LATEs, using as weights $w_{RD}(u) \equiv \Delta q(u) / \int_{\mathcal{U}} \Delta q(u)\, du$. The above requires $\Delta q(u) \geq 0$ or $\Delta q(u) \leq 0$ for all $u \in \mathcal{U}$ to ensure $w_{RD}(u) \geq 0$ over $\mathcal{U}$. Otherwise, when $\Delta q(u)$ can switch signs over $\mathcal{U}$, the local Wald ratio $\pi_{RD}$ would be undefined if the denominator $\int_{\mathcal{U}} \Delta q(u)\, du = 0$. Further, if $\int_{\mathcal{U}} \Delta q(u)\, du \neq 0$, $\pi_{RD}$ would be a weighted difference of average treatment effects for units with positive treatment changes and units with negative treatment changes. The following assumption is sufficient for $w_{RD}(u) \geq 0$.

Assumption 3b (Monotonicity). $\Pr(T_1 \geq T_0 \mid R = r_0) = 1$ or $\Pr(T_1 \leq T_0 \mid R = r_0) = 1$.

Assumption 3b requires that treatment T is weakly increasing or weakly decreasing almost surely in Z. Under Assumption 3b, $\Delta q(u) \geq 0$ or $\Delta q(u) \leq 0$ for all $u \in (0, 1)$. Unlike Assumption 3, which imposes rank restrictions, Assumption 3b imposes a sign restriction on the treatment changes at the RD threshold. Angrist, Graddy, and Imbens (2000, Assumption 4) make a similar assumption in identifying a general simultaneous equations system with binary IVs.

When Assumption 3 (local rank invariance or rank similarity) does not hold, the Q-LATE involved in equation (5) does not have a causal interpretation. However, the RD estimand may still identify a causal parameter under Assumption 3b (monotonicity). This is parallel to the standard RD design with a binary treatment, where the local Wald ratio identifies a LATE for compliers. We formally state this result in the following lemma.

Lemma 2. Let Assumptions 1, 2, 3b, and 4 hold. Then $\pi_{RD}$ identifies a local weighted average effect of T on Y at $R = r_0$.

The proof of Lemma 2 shows that, under Assumptions 1, 2, 3b, and 4, the standard RD estimand with a continuous treatment identifies a weighted average of individual treatment effects among those individuals who change their treatment intensity at the RD threshold. When further $G(T, R, \varepsilon)$ is continuously differentiable in its first argument, the identified effect can be expressed as a weighted average derivative of Y with respect to T. The exact form of the weighted average derivative is provided in the proof of Lemma 2 in the Appendix.

3.2 Robust WQ-LATE estimand

The discussion so far suggests that, regardless of whether the rank restriction holds, the standard RD estimand in general requires monotonicity in order to be a causal estimand. Monotonicity does not always hold in practice.
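The following sketch, with made-up quantile changes, contrasts the standard RD weights $w_{RD}(u) = \Delta q(u) / \int_{\mathcal{U}} \Delta q(u)\, du$ of equation (5) with nonnegative weights proportional to $|\Delta q(u)|$, which the robust estimand of Section 3.2 below employs: when $\Delta q(u)$ switches sign, some RD weights turn negative, while the absolute-value weights remain a proper weighting scheme.

```python
# Illustrative only: dq values are invented; a real application would use estimated
# quantile changes over the trimmed grid of quantiles.
import numpy as np

def rd_and_robust_weights(dq):
    """dq: quantile changes Delta q(u) on an equally spaced grid."""
    w_rd = dq / dq.sum()                      # standard RD weights; can be negative
    w_abs = np.abs(dq) / np.abs(dq).sum()     # nonnegative weights proportional to |Delta q(u)|
    return w_rd, w_abs

# Changes concentrated in the lower tail, with small negative changes at the top:
dq = np.array([0.60, 0.35, 0.10, 0.02, 0.0, 0.0, 0.0, -0.05, -0.05])
w_rd, w_abs = rd_and_robust_weights(dq)
print(np.round(w_rd, 3))    # contains negative entries
print(np.round(w_abs, 3))   # all entries nonnegative, summing to one
```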

Both monotonicity and the local rank assumptions impose restrictions on the first-stage heterogeneity, but neither assumption implies the other. Monotonicity imposes a sign restriction on $T_1 - T_0$ at $R = r_0$, while local rank invariance or rank similarity imposes a rank restriction on the joint distribution of $T_1$ and $T_0$ at $R = r_0$. It is then useful to have an estimand that is valid under either assumption. Theorem 2 below provides a robust WQ-LATE estimand that is valid under either monotonicity or local rank invariance or rank similarity.

Theorem 2 (Robust Estimand). Let Assumptions 1, 2, and 4 hold. Then under either Assumption 3 or 3b,

$$\pi = \int_{\mathcal{U}} \frac{m^+(u) - m^-(u)}{q^+(u) - q^-(u)} \cdot \frac{\left| q^+(u) - q^-(u) \right|}{\int_{\mathcal{U}} \left| q^+(u) - q^-(u) \right| du}\, du \quad (6)$$

identifies a local weighted average effect of T on Y at $R = r_0$.

When monotonicity holds, $\pi = \pi_{RD}$; otherwise, when monotonicity does not hold but our rank assumption holds, $\pi = \pi(\tilde{w}) \equiv \int_{\mathcal{U}} \tau(u) \tilde{w}(u)\, du$ for $\tilde{w}(u) \equiv |\Delta q(u)| / \int_{\mathcal{U}} |\Delta q(u)|\, du$.[8] Either way, $\pi$ identifies a weighted average treatment effect among those individuals who change their treatment intensity at the RD threshold. The two alternative assumptions specify different ways in which individuals can respond to crossing the RD threshold.[9] Our robust estimand $\pi$ thus provides a robust way to aggregate these individual treatment effects.

4 Discussion

We impose Assumptions 1-4 for identification. The monotonic quantile and first-stage conditions are assumptions imposed on estimable functions and observables, respectively. They can be directly verified in the data. In the following, we briefly discuss the testable implications of the local smoothness and rank restrictions and propose convenient tests for them. It is worth emphasizing that what we propose is a joint test for the implications of both assumptions, similar to Kitagawa (2015) and Arai et al. (2018).

Recall $Y = G(T, R, \varepsilon)$, where $\varepsilon$ contains any other (observable and unobservable) covariates of Y other than R. Let $X \in \mathcal{X} \subseteq \mathbb{R}$ be some observable component of $\varepsilon$. Under either local rank invariance or local rank similarity, $U_0 \mid (\varepsilon, R = r_0) \sim U_1 \mid (\varepsilon, R = r_0)$. By Bayes' theorem,

[8] Theorems 1 and 2 in fact only require Assumptions 3 and 3b to hold for $U_z$ over the sub-support $\mathcal{U} \subseteq (0, 1)$, for $z = 0, 1$.

[9] Under monotonicity $\Pr(T_1 \geq T_0 \mid R = r_0) = 1$, $\pi$ identifies a weighted average of $\frac{G(q_1(r_0, u_1), r_0, \varepsilon) - G(q_0(r_0, u_0), r_0, \varepsilon)}{q_1(r_0, u_1) - q_0(r_0, u_0)}$ among those having $q_1(r_0, u_1) - q_0(r_0, u_0) > 0$ (or among those having $q_1(r_0, u_1) - q_0(r_0, u_0) < 0$ under $\Pr(T_1 \leq T_0 \mid R = r_0) = 1$). Under the rank restriction, $\pi$ identifies a weighted average of $\frac{G(q_1(r_0, u), r_0, \varepsilon) - G(q_0(r_0, u), r_0, \varepsilon)}{q_1(r_0, u) - q_0(r_0, u)}$ among those having $\varepsilon \mid (U_1 = u, R = r_0) \sim \varepsilon \mid (U_0 = u, R = r_0)$ for any $u \in \mathcal{U}$.

$\varepsilon \mid (U_0 = u, R = r_0) \sim \varepsilon \mid (U_1 = u, R = r_0)$ for any $u \in (0, 1)$, and hence $F_{X \mid U_1, R}(x, u, r_0) = F_{X \mid U_0, R}(x, u, r_0)$ for all $x \in \mathcal{X}$ and $u \in (0, 1)$. Further, by Assumption 2, $F_{\varepsilon \mid U_z, R}(e, u, r)$, and hence $F_{X \mid U_z, R}(x, u, r)$, $z = 0, 1$, is continuous at $r = r_0$. It follows that the right and left limits of $F_{X \mid U, R}(x, u, r) = Z F_{X \mid U_1, R}(x, u, r) + (1 - Z) F_{X \mid U_0, R}(x, u, r)$ exist at $r = r_0$. Therefore, to test the implications of the local smoothness and rank restrictions, one can test the following null hypothesis:

$$H_0: \lim_{r \to r_0^+} E[1(X \leq x) \mid U = u, R = r] - \lim_{r \to r_0^-} E[1(X \leq x) \mid U = u, R = r] = 0, \quad \forall\, x \in \mathcal{X},\ u \in \mathcal{U}. \quad (7)$$

The left-hand side of equation (7) corresponds to the numerator of equation (2) with Y replaced by $1(X \leq x)$. Testing our identifying assumptions then amounts to testing that the Q-LATEs or WQ-LATEs on the covariate distribution are zero. Such tests are essentially falsification tests showing that treatment has no spurious significant impacts on covariates at any treatment quantile.

Complementary to our proposed tests, conventional RD validity tests can be used to provide suggestive evidence for local smoothness. In particular, one can test smoothness of the conditional means of covariates or of the density of the running variable.[10] See, e.g., McCrary (2008), Otsu, Xu, and Matsushita (2013), Bugni and Canay (2018), Canay and Kamat (2018), and Cattaneo, Jansson, and Ma (2018). In addition, Arai et al. (2018) provide a joint test for the identifying assumptions of the standard fuzzy RD design with a binary treatment.

The above leverages covariates for testing purposes. Our discussion of identification does not directly condition on covariates. In practice, rank similarity may be more plausible when conditioning on relevant covariates (see, e.g., Chernozhukov and Hansen, 2005). One may relax the rank restrictions to assume that local rank similarity holds after conditioning on additional covariates other than the running variable. Let Assumptions 1, 2, and 3 hold conditional on covariates. All our discussion and results would then hold conditional on covariates.[11] To obtain the (unconditional on covariates) Q-LATEs, one may average over the covariate distribution at a given treatment quantile.

5 Estimation

The proposed estimands for Q-LATE and WQ-LATE involve conditional means and quantiles at a boundary point. Following the standard practice of the RD literature, we estimate Q-LATE and WQ-LATE by local linear mean and quantile regressions.

For simplicity, we use the same kernel function $K(\cdot)$ for all estimation. Let the bandwidths for

[10] Note, however, that smoothness of the conditional means of covariates or of the running variable density is neither necessary nor sufficient for the standard RD identification.

[11] To test rank similarity imposed on the conditional ranks of potential treatments, one needs additional covariates other than those in the conditioning set of the conditional ranks.

T and R be $h_T$ and $h_R$, respectively. Define $h_T \equiv h \sigma_T$ and $h_R \equiv h \sigma_R$, where $\sigma_T$ and $\sigma_R$ are the standard deviations of T and R, respectively. Given a sample of n i.i.d. observations $\{(Y_i, T_i, R_i)\}_{i=1}^n$ from $(Y, T, R)$, we estimate Q-LATE and WQ-LATE by the following procedure.

Step 1: Partition the unit interval (0, 1) into a grid of equally spaced quantiles $\mathcal{U}(l) \equiv \{u_1, u_2, \ldots, u_l\}$. For $u \in \mathcal{U}(l)$, obtain estimates of $q^+(u)$ and $q^-(u)$, denoted $\hat{q}^+(u)$ and $\hat{q}^-(u)$, as $\hat{a}_0^+$ and $\hat{a}_0^-$, respectively, from the following local linear quantile regressions:

$$\big( \hat{a}_0^+, \hat{a}_1^+ \big) = \arg\min_{a_0^+, a_1^+} \sum_{\{i:\, R_i \geq r_0\}} K\!\left( \frac{R_i - r_0}{h_R} \right) \rho_u\!\big( T_i - a_0^+ - a_1^+ (R_i - r_0) \big),$$
$$\big( \hat{a}_0^-, \hat{a}_1^- \big) = \arg\min_{a_0^-, a_1^-} \sum_{\{i:\, R_i < r_0\}} K\!\left( \frac{R_i - r_0}{h_R} \right) \rho_u\!\big( T_i - a_0^- - a_1^- (R_i - r_0) \big),$$

where $\rho_u(\alpha) = \alpha\big(u - 1(\alpha < 0)\big)$ is the standard check function.[12]

Step 2: Let $\tilde{\mathcal{U}} \equiv \big\{ u \in \mathcal{U}(l) : |\hat{\Delta q}(u)| > \epsilon_n \big\}$, where $\hat{\Delta q}(u) \equiv \hat{q}^+(u) - \hat{q}^-(u)$ and $\epsilon_n$ is a positive sequence satisfying $\epsilon_n^{-1} \sup_u |\hat{\Delta q}(u) - \Delta q(u)| = o_p(1)$ and $\epsilon_n^{-2} \big( \sup_u |\hat{\Delta q}(u) - \Delta q(u)| \big)^2 \log n = o_p(1)$.[13] For all $u \in \tilde{\mathcal{U}}$, obtain estimates of $m^+(u)$ and $m^-(u)$, denoted $\hat{m}^+(u)$ and $\hat{m}^-(u)$, as $\hat{b}_0^+$ and $\hat{b}_0^-$, respectively, from the following local linear regressions:

$$\big( \hat{b}_0^+, \hat{b}_1^+, \hat{b}_2^+ \big) = \arg\min_{b_0^+, b_1^+, b_2^+} \sum_{\{i:\, R_i \geq r_0\}} K\!\left( \frac{R_i - r_0}{h_R} \right) K\!\left( \frac{T_i - \hat{q}^+(u)}{h_T} \right) \Big( Y_i - b_0^+ - b_1^+ (R_i - r_0) - b_2^+ \big( T_i - \hat{q}^+(u) \big) \Big)^2,$$
$$\big( \hat{b}_0^-, \hat{b}_1^-, \hat{b}_2^- \big) = \arg\min_{b_0^-, b_1^-, b_2^-} \sum_{\{i:\, R_i < r_0\}} K\!\left( \frac{R_i - r_0}{h_R} \right) K\!\left( \frac{T_i - \hat{q}^-(u)}{h_T} \right) \Big( Y_i - b_0^- - b_1^- (R_i - r_0) - b_2^- \big( T_i - \hat{q}^-(u) \big) \Big)^2.$$

Step 3: Plug the above estimates into equation (2) to obtain $\hat{\tau}(u) = \dfrac{\hat{m}^+(u) - \hat{m}^-(u)}{\hat{q}^+(u) - \hat{q}^-(u)}$ for $u \in \tilde{\mathcal{U}}$.

Step 4: Estimate $\pi$ by $\hat{\pi} = \sum_{u \in \tilde{\mathcal{U}}} \hat{\tau}(u)\, \dfrac{|\hat{\Delta q}(u)|}{\sum_{u \in \tilde{\mathcal{U}}} |\hat{\Delta q}(u)|}$.

Our identification theory requires trimming out treatment quantiles where there are no changes at the RD threshold, i.e., where $\Delta q(u) = 0$, whereas in practice we do not know the true $\Delta q(u)$. To avoid any pre-testing problem, we trim out all quantiles having $|\hat{\Delta q}(u)| \leq \epsilon_n$ for some chosen $\epsilon_n$. Lemma 6 in Appendix B shows that when $\epsilon_n$ satisfies the listed conditions, this trimming procedure is asymptotically equivalent to trimming out those treatment quantiles where $\Delta q(u) = 0$ and preserves the asymptotic properties of our estimator.

[12] If desired, one could monotonize $\hat{q}^{\pm}(u)$ using the inequality constraints or rearrangement methods in Chernozhukov, Fernández-Val, and Galichon (2010) or Qu and Yoon (2015). Both papers show that the monotonized estimators share the same first-order limiting distribution as the initial local linear estimator.

[13] If one wishes to focus on quantiles such that $|\Delta q(u)| > c$ for some small $c > 0$, then one can define the trimming parameter to be $c_n = c + \epsilon_n$, where $\epsilon_n$ is defined in the same way.
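The compact sketch below illustrates Steps 1-4. It is for illustration only: the triangular kernel, the bandwidths, the quantile grid, and the trimming constant are placeholder choices rather than the theory-based ones of Sections 5-6, no bias correction or standard errors are computed, and kernel weights are absorbed into the quantile regression by rescaling, which is valid because the check function is positively homogeneous.

```python
# Rough sketch of the four-step Q-LATE / WQ-LATE estimator; names and tuning
# parameters are assumptions, not the authors' implementation.
import numpy as np
import statsmodels.api as sm
from statsmodels.regression.quantile_regression import QuantReg


def tri_kernel(v):
    return np.clip(1.0 - np.abs(v), 0.0, None)


def llq_at_cutoff(t, rc, u, h_r):
    """Step 1: local linear quantile fit of T on centered R; the intercept estimates q(u) at the cutoff."""
    w = tri_kernel(rc / h_r)
    keep = w > 0
    X = np.column_stack([np.ones(keep.sum()), rc[keep]])
    # rho_u(w * e) = w * rho_u(e) for w > 0, so scaling rows by w yields kernel-weighted check loss
    fit = QuantReg(w[keep] * t[keep], w[keep][:, None] * X).fit(q=u)
    return fit.params[0]


def llm_at_cutoff(y, t, rc, q_hat, h_r, h_t):
    """Step 2: local linear fit of Y on centered R and (T - q_hat); the intercept estimates m(u) at the cutoff."""
    w = tri_kernel(rc / h_r) * tri_kernel((t - q_hat) / h_t)
    keep = w > 0
    X = np.column_stack([np.ones(keep.sum()), rc[keep], t[keep] - q_hat])
    return sm.WLS(y[keep], X, weights=w[keep]).fit().params[0]


def q_late_and_wq_late(y, t, r, cutoff, h_r, h_t, grid, eps):
    rc = r - cutoff
    above = rc >= 0
    kept_u, taus, abs_dq = [], [], []
    for u in grid:
        q_plus = llq_at_cutoff(t[above], rc[above], u, h_r)
        q_minus = llq_at_cutoff(t[~above], rc[~above], u, h_r)
        dq = q_plus - q_minus
        if abs(dq) <= eps:                        # trim quantiles with (nearly) no treatment change
            continue
        m_plus = llm_at_cutoff(y[above], t[above], rc[above], q_plus, h_r, h_t)
        m_minus = llm_at_cutoff(y[~above], t[~above], rc[~above], q_minus, h_r, h_t)
        kept_u.append(u)
        taus.append((m_plus - m_minus) / dq)      # Step 3: Q-LATE at quantile u
        abs_dq.append(abs(dq))
    taus, abs_dq = np.array(taus), np.array(abs_dq)
    wq_late = np.sum(taus * abs_dq / abs_dq.sum())  # Step 4: weights proportional to |q+(u) - q-(u)|
    return np.array(kept_u), taus, wq_late
```

For example, q_late_and_wq_late(y, t, r, cutoff, h_r, h_t, grid=np.linspace(0.05, 0.95, 19), eps=0.05) would return the surviving quantiles, the Q-LATE estimates at those quantiles, and the aggregated WQ-LATE, with the grid and eps treated purely as illustrative inputs.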

14 In practice, one can choose ɛ n max u l) se qu) ) 1.96, where qu) is a preliminary Step 1 estimator of the treatment quantile change, using the bandwidth hσ R such that h/h, and n h /h. We discuss the choice of h next in Section The associated standard errors satisfy se qu)) = O p n h ) 1/) ) 1/ ) > se ˆq u)) = O p nh. It follows that ɛ 1 n sup u ˆqu) qu) = O p se qu) ) 1 ) ) se ˆqu) = o p 1) and ɛ ) n supu ˆqu) qu) 1 = O p log n se qu) ) ) ) 1 se ˆqu) = o p 1). By this procedure, insignificant estimates at the 5% significance level) of ˆqu) along with some significant but small estimates will be trimmed out and the asymptotic behavior of our estimator is not affected. The above describes estimation of Q-LATEs or WQ-LATE. To estimate LQTEs conditional on = u described in Section.1, one may simply replace the local linear mean regressions in Step by local linear quantile regressions. Other steps remain the same. 6 Inference The proposed estimators have several distinct features which make analyzing their asymptotic properties challenging. First, the local polynomial estimator in Step involves a continuous treatment variable T, in addition to the running variable R. Evaluating T over its interior support and evaluating R at the boundary point r complicates the analysis. Second, we need to account for the sampling variation of ˆq ± u) from Step 1, which appear in both the numerator and denominator of τu), as well as in the weighting function ŵ u) for ˆπ. Third, our estimation involves a trimming procedure that is based on the estimated ˆqu). To deal with these complications, we build on the results of Kong, Linton, and Xia 1) and Qu and Yoon 15). Qu and Yoon 15) provide uniform convergence results for local linear quantile regressions, while Kong, Linton, and Xia 1) establish strong uniform convergence results for local polynomial estimators. We first present conventional inference assuming undersmoothing, i.e., applying a bandwidth sequence such that the leading bias is asymptotically first-order negligible. We then discuss the leading bias under more general bandwidth conditions. Following the popular approach of Calonico, Cattaneo, and Titiunik 14), we bias-correct the main estimators and then develop robust inference for the bias-corrected estimators. The robust inference takes into account the added variability due to bias correction and allows for a wider range of bandwidth choices. 15 Additionally, we present the asymptotically mean squared error AMSE) optimal bandwidths for both the Q-LATE and WQ-LATE 14 Consider the bandwidth sequences h = cn a and h = cn b for some constants < a, b < 1 and c >. The required conditions for ɛ n are satisfied when choosing b such that a < b < a + 1)/. Intuitively, there is a trade-off on the convergence rate of ɛ n : on the one hand, we need ɛ n converge to zero not too fast compared with ˆq q, so that the sampling variation of ˆq in the trimming procedure is asymptotically ignorable; on the other hand, ɛ n needs to converge to zero fast enough in order to keep all the quantiles in. 15 See Cattaneo, Titiuni, and Vazquez-Bare 16) for a comparison of different inference approaches for the standard RD design. 14

15 estimators by minimizing the AMSE. Imbens and Kalyanaraman 1) propose the AMSE optimal bandwidth for the standard RD estimator. The robust confidence intervals for the bias-corrected estimators deliver valid inference when the AMSE optimal bandwidths are used. We impose the following assumptions for asymptotics. Assumption 5 Asymptotics). 1. For any t T z, z =, 1, r R, and u, f Tz Rt, r) is bounded and bounded away from zero, and has bounded first order derivatives with respect to t, r); j q z r, u)/ r j is finite and Lipschitz continuous over r, u) for j = 1,, 3; q z r, u) and q z r, u)/ u are finite and Lipschitz continuous in u.. For any t T z, z =, 1, and r R, E [ GT z, R, ε) T z = t, R = r ] has bounded fourth order derivatives; the conditional variance V [ GT z, R, ε) T z = t, R = r ] is continuous and bounded away from zero; the conditional density f Tz R Y t, r, y) is bounded for any y Y. E [ Y E [ Y Tz, R ] 3] < for z =, The kernel function K is bounded, positive, compactly supported, symmetric, having finite firstorder derivative, and satisfying v K v)dv >. Assumption 5.1 imposes sufficient smoothness conditions to derive the asymptotic linear representations of ˆq ± u). In particular, the bounded joint density implies a compact support where the stochastic expansions of ˆq ± u) hold uniformly over u. Together with the smoothness conditions on q z r, u), the remainder terms in the stochastic expansions are controlled to be small. Assumption 5. imposes additional conditions to derive the asymptotic linear representation of Ê [Y T, R] and asymptotic normality of our estimators. Assumption 5.3 lists the standard regularity conditions for the kernel function. 6.1 Asymptotic distributions under undersmoothing Theorem 3 below presents the asymptotic distribution of ˆτu) under a bandwidth sequence h = h n that goes to zero fast enough with the sample size n i.e., satisfying nh 6 instead of nh 6 c, )), so that the bias is asymptotically negligible. Theorem 3 Asymptotic distribution of ˆτu)). Let Assumptions 1-5 hold. If h = h n, nh 3, and nh 6, then for u ˆτu) τu) Vτ,n u) d N, 1), where V τ,n u) V τ u) nh. The exact form of V τ u) is given by equation 9) of Lemma 4 in Appendix B. The bandwidth conditions in Theorem 3 imply a bandwidth choice h = h n = C τ n a for some constant a 1/6, 1/3) and C τ, ). Theorem 3 implies nh ˆτu) τu) ) d N, V τ u)), 15

16 where V τ u) is the asymptotic variance of [ nh ] ˆτu). The 11 α)% confidence interval for τu) is then given by ˆτu) ± 1 1 α/ Vτ u)/nh ), where 1 1 α/ is the 1 α/)-quantile of the standard normal distribution. One can estimate V τ u) by the usual plug-in estimator, i.e., replacing the unknown parameters involved with their consistent estimates. Theorem 4 below similarly presents the asymptotic distribution of ˆπ using a bandwidth sequence that goes to zero fast enough with the sample size i.e., satisfying nh 5 instead of nh 5 c, )), so that the bias is asymptotically negligible. Theorem 4 Asymptotic distribution of ˆπ ). Let Assumptions 1-5 hold. If h = h n, nh 4, nh 5, and the number of grid points l, then ˆπ π Vπ,n d N, 1), where V π,n V π nh. The exact form of V π is given by equation 11) of Lemma 5 in Appendix B. The bandwidth conditions in Theorem 4 imply a bandwidth choice h = h n = C π n a for a 1/5, 1/4) and C π, ). Based on Theorem 4, nh ˆπ π ) d N, V π ), where V π is the asymptotic variance of [ ] nh ˆπ. The 11 α)% confidence interval for π is then given by ˆπ ± 1. V π can be estimated by the usual plug-in estimator. 1 α/ Vπ /nh) As a convenient alternative, one may apply the standard bootstrap based on drawing n observations with replacement to obtain standard errors and confidence intervals. The bootstrap is known to be valid for the local linear mean and quantile estimators in ˆmu) and qu) for any u see, e.g., Horowitz, 1). Since ˆmu) and ˆτu) are differentiable functions of the local linear mean and quantile estimators, the bootstrap is valid for ˆτu) by the standard delta method. To show the bootstrap validity for π, it suffices to show the uniform stochastic expansion for π based on the bootstrap sample. 16 This additional technicality is out of the scope of the current paper. 6. Bias-corrected robust inference The asymptotic distributions of ˆτu) and ˆπ presented in the previous section are valid only when the bandwidths shrink to zero fast enough with the sample size, which prevents overly large bandwidth choices, as are typical in empirical practice. In particular, when nh 6 c, ) or nh 5 c, ), a leading bias term appears in the asymptotic distribution of ˆτu) or ˆπ see for details 16 More specifically, by replacing the observation Y i, T i, R i ) with the bootstrap data Yi, T i, Ri ) and replacing the probability measure p with p implied by bootstrap sampling, the asymptotic linear representations in Lemma 3 hold for bootstrap estimators q u) and ˆm u) for any u. Bootstrap validity then follows from the fact that τu) is a differentiable function of mu) and qu). ˆπ is a differentiable function of ˆτu) and ŵ u), while ŵ u) is Hadamard differentiable in qu). By Theorem 3.1 in Fang and Santos 18), the standard bootstrap is valid for inference on u l) πu) qu) u l) qu), where l) {u l) qu) > ɛ n } for a fixed number of grid points l. As l, uniformity over u is sufficient for valid inference on π. 16

17 Lemma 4 and Lemma 5 in Appendix B). Denote the bias for ˆτu) as h B τ u), and the bias for ˆπ as h B π. The exact forms of B τ u) and B π are presented, respectively, in equation 8) of Lemma 4 and equation 1) of Lemma 5 in Appendix B. These biases depend on changes in the curvatures of the conditional quantile and mean functions in Step 1 and Step estimation. We propose the following bias-corrected estimator for τu) ˆτ bc u) ˆτu) h ˆBτ u), where ˆB τ u) is a consistent estimator for B τ u). We similarly propose the following bias-corrected estimator for π ˆπ bc ˆπ h ˆBπ, where ˆB π is a consistent estimator of B π. Bias correction reduces biases, but also introduces variability. When the added variability is not accounted for, the empirical coverage of the resulting confidence intervals can be well below their nominal target, which implies that conventional confidence intervals may substantially over-reject the null hypothesis of no treatment effect. Following the robust inference approach of Calonico, Cattaneo, and Titiunik 14), we present the asymptotic distributions of the bias-corrected estimators ˆτ bc u) and ˆπ bc by taking into account the sampling variation induced by bias correction. Denote as b the bandwidth used in the bias estimation. Theorem 5 Asymptotic distribution of ˆτ bc u)). Let Assumptions 1-5 hold. If h = h n, b = b n, h/b ρ [, ], n min{h 6, b 6 } max{h, b }, n min{h, b 6 h 4 }, and nh 3 max{1, h 6 /b 6 }, then for any u, ˆτ bc u) τu) V bc τ,n u) d N, 1), where V bc τ,n u) V τ u) nh + V B τ u) nb 6 h 4 + C τ u; ρ) nhb. V τ u) is defined in Theorem 3. The exact forms of V Bτ u) and C τ u; ρ) are given in equations 19) and ), respectively, in Appendix B. V bc τ,n u) consists of three terms. V τ u) comes from the variance of ˆτu), V Bτ u) comes from the variance of ˆB τ, and C τ u; ρ) comes from the covariance between ˆτu) and ˆB τ. 17 Theorem 5 incorporates three limiting cases depending on h/b ρ [, ]. When h/b, the actual estimator ˆτu) is first-order while the bias estimator B τ u) is of smaller order, i.e., V Bτ u)/nb 6 h 4 )+ C τ u; ρ)/nhb) = o p Vτ u)/ nh )), and hence the variance reduces to V bc τ,n u) = V τ u)/nh ). 17 For the standard RD design with a binary treatment, Calonico, Cattaneo, and Titiunik 14) derive the conditional variance given the sample data. In contrast, we derive the asymptotic unconditional variance. These two approaches are asymptotically equivalent. In finite samples, the resulting confidence interval based on the conditional variance can be larger or smaller than the confidence interval based on the asymptotic unconditional variance. 17

18 When h/b ρ, ), then both ˆτu) and B τ u) contribute to the asymptotic variance. When h/b, the bias estimator B τ u) is first-order while the actual estimator ˆτu) is of smaller order and hence V bc τ,n u) = V B τ u)/nb 6 h 4 ). Note that the additional terms due to bias correction V Bτ u) and C τ u; ρ) depend on V τ u) and some constants determined by the kernel function see the proof of Theorem 5 in Appendix B for details). As a result, V bc τ,n u) only depends on V τ u) and some constants, which implies that estimating the robust variance is not computationally more demanding than estimating the conventional variance provided in the previous section. For example, for the niform kernel and ρ = 1, V bc [ τ,nu) = 13.89V τ u)/nh ). Based on Theorem 5, the 11 α)% confidence interval for τu) is ˆτ bc u) ± 1 1 α/ V bc τ,n ]. u) Theorem 6 Asymptotic distribution of ˆπ bc ). Let Assumptions 1-5 hold. If h = h n, b = b n, h/b ρ [, ], n min{h 5, b 5 } max{h, b }, n min{h, b 5 h 4 }, nh 4 max{1, h 5 b 5 }, and l, then ˆπ bc π V bc π,n d N, 1), where V bc π,n V π nh + V B π nb 5 h 4 + C π nb h 1. V π is defined in Theorem 4. The exact forms of V Bπ and C π are given in equations 3) and 4), respectively, in Appendix B. V bc π,n consists of three terms. V π comes from the variance of ˆπ, V Bπ comes from the variance of ˆB π, and C π comes from the covariance between ˆπ and ˆB π. Similar to Theorem 5, Theorem 6 also incorporates three limiting cases depending on h/b ρ [, ]. When h/b, the actual estimator ˆπ is first-order while the bias estimator B τ is of smaller order, and hence V bc π,n V π/ nh). When h/b ρ, ), then both ˆπ and B π contribute to the asymptotic variance. When h/b, the bias estimator B τ is first-order while the actual estimator ˆπ is of smaller order, and hence V bc [ π,n V B π /nbρ 4 ). Based on Theorem 6, the 11 α)% confidence interval for π w ) is ] ˆπ bc ± 1 1 α/ V bc π,n. Given our results, one can estimate the robust variances by the plug-in estimators. Details for estimating the biases and variances are provided in Appendix C. Alternatively, one can bootstrap the standard errors or confidence intervals for the bias-corrected estimators ˆτ bc u) and ˆπ bc. Bootstrap validity follows the same arguments as those for ˆτu) and ˆπ in Section AMSE optimal bandwidth Choosing a bandwidth is known to be a delicate task in nonparametric estimation. Following Imbens and Kalyanaraman 1), we derive the bandwidths for ˆτu) and ˆπ that minimize the AMSE. These 18

19 results are presented in Theorem 7 and Theorem 8 below. Further details for estimating these AMSE optimal bandwidths are provided in Appendix C. Theorem 7 AMSE optimal bandwidth for ˆτu)). [ Let Assumptions 1-5 hold. If h = h n and nh ) ], then the mean squared error of ˆτu) is E ˆτu) τu) = h 4 B τ u) + nh ) 1 Vτ u)+ o h 4 + nh ) ) 1 ; further if B τ u), the bandwidth that minimizes the AMSE is h τ = V τ u)/ B τ u))) 1/6 n 1/6. The AMSE optimal bandwidth for ˆτu) is of the form C τ n 1/6 for some constant C τ >, which satisfies the bandwidth conditions specified in Theorem 5. Therefore, one can apply the above AMSE optimal bandwidth and then conduct the bias-corrected robust inference provided in Theorem 5. Theorem 8 AMSE optimal bandwidth for ˆπ ). Let Assumptions 1-5 hold. If h = h n and nh, then the mean squared error of ˆπ is E [ ˆπ π ) ] = h 4 B π + nh) 1 V π + o h 4 + nh) 1) ; further if B π, the bandwidth that minimizes the AMSE is h π = V π / 4B π)) 1/5 n 1/5. The AMSE optimal bandwidth for ˆπ is of the form C π n 1/5 for some constant C π >, which satisfies the bandwidth conditions in Theorem 6. These AMSE optimal bandwidths trade off squared biases with variances, so when the biases are small, the AMSE optimal bandwidths can be large. 7 Empirical analysis Since the establishment of the national banking system in , capital regulation has been a primary tool used to promote bank stability. Earlier capital regulation is motivated by the concern that a bank may hold less capital than is socially optimal, as negative externalities resulting from bank default are not reflected in the market value of bank capital. However, it is challenging to evaluate the causal impacts of capital regulation and hence the induced higher capital. Historical changes in capital regulation are constantly responses to economic events like financial crises. This endogeneity generates a correlation between higher capital and bank failure, complicating the task of isolating causal relationships. The regulation regime in the early th century nited States provides a unique opportunity for one to nonparametrically identify the true causal impacts of capital requirements on bank responses and outcomes. As shown previously, the minimum capital requirements were assigned based on the population size of the town a bank operated in. The requirements changed abruptly at various population thresholds. Figure presents a close-up of banks operating in towns with a population around 3,, the first regulatory threshold. Over 8% of the towns in our sample have a population near the 3, threshold. These towns represent rural farming regions where low population density required, numerous widely, 19

20 Figure : Minimum capital requirements around the town population 3, in 195 dispersed banking offices White, 1983). Arguably these small banks are the right target of the capital regulation. We therefore focus on the first regulatory threshold and explore the exogenous changes in the capital distribution at this threshold for identification. As is clear from Figure, the bottom of the capital distribution shifts up at the 3, population threshold. Given Figure, one may be concerned that the data are censored at the minimum capital, implying mass points at $5, and $5, around the 3, population threshold. Assumption 1 would then be invalid. However, our data do not suggest a censored distribution for bank capital. In our local estimation sample around the 3, population threshold, less than 1% of the banks below and less than % of the banks above the threshold hold the minimum capital. 18 Later, we present the estimated quantile curves of log capital right above or right below the policy threshold left panel of Figure 3 in Section 7.). There are no flat regions at the lower ends of both quantile curves. We estimate the impacts of capital requirements on bank capital i.e., the first stage impact of Z on T ), and further the causal relationships between the induced higher bank capital and three outcomes of interest i.e., the impacts of T on Y ), total assets, leverage, and the suspension probability in the long run. We quantify the possible heterogenous effects or lack of those) of increased capital at various capital levels. 7.1 Data description We gather first-hand data from three sources: the annual reports of the Office of the Comptroller of the Currency OCC), Rand McNally s Bankers Directory, and the nited States population census. The OCC s annual report includes the balance sheet information for all nationally chartered banks. On the asset side, this information includes loans, discounts, investments in securities and bonds, holdings of 18 In our full sample, the fraction of the banks holding the minimum capital is less than % below and less than 1% above.

21 real estate, cash on hand, deposits in other banks, and overdrafts. On the liability side, this information includes capital, surplus and undivided profits, circulation, and deposits. We collect detailed balance sheet data on individual national banks in 195, and their suspension outcome in the following 4 years up to 199). The minimum capital requirements changed in 199. Before 19, the first regulatory threshold did not exist. National banks were required to have a minimum capital of $5, regardless of whether they operated in a town above or below the 3, population threshold. National banks established before 19 might be subject to either the old or new regulatory regime, depending on when they were rechartered. We do not have the recharter information, so we focus on national banks that were established after 19 for clean identification. 19 In our analysis, bank assets are defined as the sum of a bank s total amount of assets, and capital is the sum of a bank s capital and surplus. We further define accounting) leverage as the ratio of a bank s total assets to capital, or the amount of assets a bank holds for each dollar of capital they own. This leverage is a measure of the amount of risk a bank engages in. Higher leverage is associated with lower survival rates during financial crises Berger and Bouwman, 13). However, banks generally have an incentive to increase their leverage so they can accumulate higher rates of returns on their capital. We use logged values for all three variables since they have rather skewed distributions. Table 1 Sample summary statistics Z= Z=1 N Mean SD) N Mean SD) Difference SE) Logcapital) ) ).66.4)*** Logassets) ) ).77.6)*** Logleverage) ) ).11.4)*** Suspension ) ) -.4.3) Bank age ) ).33.11)** Black population %) ) ).1.) Farmland %) ) ) -.6.3)** Logmanufacturing output) ) ).66.1)*** Note: The sample consists of all national banks established between 19 and 195 and located in towns with a town population less than 6,; ***Significant at the 1% level, **Significant at the 5% level The OCC s annual report also indicates the town, county, and state in which each bank located. We match this information with the nited States Population Census to determine town populations. Since all banks in our sample were established between 19 and 195, their capital requirements in 195 were determined by their town population population in 19, as reported by the 19 census. Our sample consists of 8 banks in 45 towns, among which 717 had a population below 3, and 15 had a population at or above 3, but below 6,). In addition, we gather information on county characteristics that measure their business and agricultural conditions, including the percentage 19 This is unlike Guo 16), who analyzed a larger sample of banks established both before and after 199. This is different from various leverage ratios used in the bank regulation, which are defined as the ratio of a bank s capital to its possibly risk-adjusted) assets. 1

22 of black population, the percentage of farmland, and manufacturing output per capita per square miles. Brief sample summary statistics are provided in Table 1. Banks operating in towns with more than 3, people have more capital on average; they also hold more assets, and have higher measured leverages. However, these simple correlations may not reflect the true causal relationships. As we can see, towns with more than 3, people are associated with older banks, a lower percentage of farm land in their counties, and higher manufacturing output per capita. These results highlight the importance to seek for local identification. Causal relationships would be confounded if comparing banks far away from the threshold. 7. Estimation results Table reports the estimated changes in log capital from.1 to.9 quantiles and the estimated mean change. Figure 3 visualizes the estimated quantile curves of log capital right above or below the policy threshold left) and the estimated quantile changes right) along with their 95% point-wise confidence bands. These first-stage estimates are based on a bandwidth implied by the undersmoothing conditions in Theorems 3 and 4. In particular, h R = 4σ R n.3 = Further analysis using other bandwidths suggests that the estimated quantile changes are robust to different bandwidth choices. 1 Table Changes in logcapital) at the population threshold 3, Quantile Quantile )*** ) )*** ) )*** ) )*** ) )** ) ) ) ) ) ) ) ) Average ) Note: The top panel presents estimated changes in log capital at different quantiles, while the bottom row reports the estimated average change; The bandwidth is set to be h R = 4σ R n.3 = 139.5, which satisfies the undersmoothing conditions for the Q-LATE or WQ-LATE estimator in Theorems 3 and 4; Standard errors are in parentheses; *** Significant at the 1% level; ** Significant at the 5% level. Consistent with the visual evidence in Figure, results in Table and Figure 3 suggest that significant changes only occur at roughly the bottom 3 percentiles of the distribution of log capital. The estimated changes are also larger at lower quantiles. No significant change is found in the average level of log capital. At the same time, there are visible quantile crossings at the high end of the quantiles in 1 For our main analysis, we conduct both conventional undersmoothing inference and bias-corrected robust inference using the AMSE optimal bandwidths.

Figure 3: Estimated quantile curves above and below the population threshold 3,000 (left) and the estimated changes at different quantiles (right).

The estimated changes at these high quantiles are negative, even though they are not statistically significant. Given that there is no mean change in log capital at the policy threshold, we cannot apply the standard RD design. Nevertheless, our new approach takes advantage of any changes in the treatment distribution, making it possible to estimate the causal impacts of bank capital in this case.

Table 3 Impacts of log(capital) on bank outcomes (undersmoothing). The first panel presents the estimated Q-LATEs at equally spaced quantiles for Log(assets), Log(leverage), and Suspension; the last row presents the estimated WQ-LATEs. Note: The bandwidths are set to h_R = 4 σ_R n^{-0.3} = 1039.5 and h_T = 4 σ_T n^{-0.3} = 0.395, which satisfy the undersmoothing conditions in Theorems 3 and 4; the trimming thresholds are determined using a preliminary bandwidth for R equal to (3/4) h_R, or 779.6; bootstrapped standard errors are clustered at the town level and are in parentheses; *** significant at the 1% level, ** significant at the 5% level.

Tables 3 and 4 present the estimated impacts τ̂(u) and π̂ of log capital on the outcomes of interest. Estimates in Table 3 use bandwidths consistent with undersmoothing, so no bias correction is made. Bias-corrected counterparts are presented in Table 4. These bias-corrected estimates use the AMSE optimal bandwidth given in Theorem 8. We report bootstrapped standard errors. Since capital regulation varies at the town level, bootstrapping facilitates clustering the standard errors at the town level. (Note that the AMSE optimal bandwidth does not take into account the clustered nature of the errors, so it is not necessarily AMSE optimal in this particular empirical application; rather, we use it as a reference point and later present estimates over a larger range of bandwidths.)
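Because the standard errors above come from a bootstrap that resamples whole towns, one way to reproduce that step is a cluster (block) bootstrap over towns. The sketch below treats the paper's estimator as a placeholder function and uses an arbitrary number of replications; both are assumptions, not the authors' exact implementation:

```python
import numpy as np
import pandas as pd

def town_cluster_bootstrap(df, estimator, town_col="town_id", n_boot=500, seed=0):
    """Block bootstrap that resamples whole towns with replacement.

    `estimator` is any function mapping a resampled DataFrame to a scalar
    (e.g., a WQ-LATE estimate); it stands in for the paper's estimator.
    """
    rng = np.random.default_rng(seed)
    towns = df[town_col].unique()
    draws = []
    for _ in range(n_boot):
        sampled = rng.choice(towns, size=len(towns), replace=True)
        # Stack the full set of banks from each sampled town (towns may repeat).
        boot_df = pd.concat([df[df[town_col] == t] for t in sampled], ignore_index=True)
        draws.append(estimator(boot_df))
    draws = np.asarray(draws)
    return draws.std(ddof=1)  # town-clustered bootstrap standard error
```

Percentile confidence intervals can be read off the same bootstrap draws if preferred.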

Figure 4: Estimates of Q-LATEs at different quantiles.

Results with analytical standard errors (without clustering) are presented in Appendix D. For brevity, Tables 3 and 4 present the estimated Q-LATEs at selected quantiles; Figures 4 and 5 illustrate the estimated Q-LATEs at a finer grid of quantiles along with the 95% confidence intervals.

Table 4 Impacts of log(capital) on bank outcomes (bias-corrected estimates). The first panel presents the bias-corrected estimates of Q-LATEs at equally spaced quantiles for Log(assets), Log(leverage), and Suspension; the last row presents the bias-corrected estimates of WQ-LATEs. Note: The standardized AMSE optimal bandwidth for the WQ-LATE estimator is h_π = 0.91 (the standardized AMSE optimal bandwidth for the Q-LATE estimator, h_τ, ranges from 0.7 to 1.1); the bandwidths in the estimation are then set to h_R = h_π σ_R = 1108 and h_T = h_π σ_T = 0.4173; the bandwidths used to estimate the biases are a fixed multiple of the main bandwidths; the trimming thresholds are determined using a preliminary bandwidth for R equal to (3/4) h_R = 831; bootstrapped standard errors are clustered at the town level and are in parentheses; *** significant at the 1% level, ** significant at the 5% level.

Overall, the estimates by undersmoothing and those by explicit bias correction show similar patterns. For example, the estimated impacts on log assets are significant for banks at the low quantiles of log capital, while the estimated impacts on log leverage and suspension are all insignificant. In addition, the estimates for log assets are slightly larger for banks at the lower quantiles of log capital. Note that the bias-corrected estimates use larger bandwidths, and hence there is no loss of precision compared with the estimates by undersmoothing. In the following, we focus on interpreting the bias-corrected estimates. The bias-corrected estimates for log assets in Table 4 range from 0.89 to almost 0.977 at various low quantiles of log capital.
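The bandwidths reported in the note to Table 4 are obtained by rescaling a single standardized bandwidth by the sample standard deviations of the running variable and the treatment. The following is a minimal sketch of that conversion; the pilot-bandwidth multiple used for the bias estimation is an assumed placeholder, since the exact factor is not recoverable from the text above:

```python
import numpy as np

def main_bandwidths(pop, log_capital, h_pi=0.91, bias_multiple=2.0):
    """Convert a standardized bandwidth into variable-specific bandwidths.

    h_pi is the standardized (AMSE-optimal) bandwidth quoted in the text;
    bias_multiple is an assumed factor for the pilot bandwidths used in the
    bias estimation step.
    """
    h_R = h_pi * np.std(pop, ddof=1)          # bandwidth for the running variable R
    h_T = h_pi * np.std(log_capital, ddof=1)  # bandwidth for the treatment T
    return h_R, h_T, bias_multiple * h_R, bias_multiple * h_T
```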

Figure 5: Bias-corrected estimates of Q-LATEs at different quantiles.

Figure 6: Estimated WQ-LATEs by different bandwidths.

Figure 7: Bias-corrected estimates of WQ-LATEs by different bandwidths.

All estimates are significant at the 1% level. That is, a 1% increase in bank capital leads to an increase of 0.89% to 0.977% in total assets for banks at the bottom of the capital distribution. The corresponding weighted average is estimated to be 0.873, which is also significant at the 1% level: on average, a 1% increase in bank capital leads to a 0.873% increase in a bank's total assets among all the banks affected by the minimum capital requirement. As a result, the estimated decreases in log leverage are all small and insignificant, so the increased minimum capital requirements do not significantly lower leverage among the affected small banks. Not surprisingly, the estimated impacts of bank capital on the long-run suspension probability are also small and insignificant.

Figures 6 and 7 show the sensitivity of our estimates to bandwidth choices. Both types of estimates are robust to a wide range of bandwidths. Not surprisingly, with the same bandwidth, bias correction leads to wider 95% confidence intervals.

7.3 Validity checks

We have estimated the impacts of increased bank capital among banks with low levels of capital. Validity of these estimates requires our local smoothness and rank restrictions to hold. In the following, we evaluate the validity of these assumptions.

We first perform the usual standard RD tests to provide suggestive evidence for the smoothness conditions. These smoothness conditions are imposed to ensure that banks, as well as their associated business and agricultural conditions, are comparable above and below the policy threshold. Given the differential capital requirements, one may be concerned that banks took advantage of the lower capital requirements and hence were more likely to operate in towns with populations just under 3,000. Following standard practice, we test smoothness of the density of town population near the policy threshold. We also test smoothness of the conditional means of pre-determined covariates. These covariates include bank age and county characteristics, particularly the percentage of the black population, the percentage of farmland, and log manufacturing output per capita per square mile.

Figures 8 and 9 provide visual evidence of smoothness. The left graph in Figure 8 presents the histogram of town population, while the right graph presents the log frequency of town population within each bin; superimposed on the right graph is the estimated log density along with its 95% confidence interval. Formal test results are reported in Table 5. No significant discontinuities are found in the conditional means of these covariates or in the density of town population. These results suggest that the smoothness conditions are plausible in our empirical setting.

We next perform our proposed joint test. For simplicity, instead of testing the entire distribution of covariates, we test the low-order (raw) moments of the covariates. That is, we replace the outcome variable by each of the first and second moments of the four covariates (i.e., bank age, percentage of the black population, percentage of farmland, and log manufacturing output per capita) and re-estimate the Q-LATEs and WQ-LATEs.

Figure 8: Histogram and the empirical density of town population.

Figure 9: Conditional means of covariates conditional on town population.

Table 5 Tests for smoothness of covariates and density. Panel I: estimated discontinuities in the conditional means of Bank age, Farmland (%), Black population (%), and Log(manufacturing output). Panel II: estimated discontinuity in the density of town population. Note: Robust standard errors for Panel I are clustered at the town level and are in parentheses; Panel II presents the t statistic of the estimated density discontinuity along with its p-value, computed using the Stata command rddensity; h_R = h_π σ_R = 1108 for all estimation.
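The covariate smoothness checks in Table 5 amount to running a standard RD regression with each pre-determined covariate as the outcome. A minimal sketch, using a local linear fit with a uniform kernel and town-clustered standard errors (the paper's exact kernel and implementation may differ), is:

```python
import numpy as np
import statsmodels.api as sm

def covariate_discontinuity(pop, covariate, town_id, cutoff=3000.0, h=1108.0):
    """Local linear RD estimate of the jump in E[covariate | town population] at the cutoff.

    A jump statistically indistinguishable from zero supports the local smoothness
    condition for that covariate.
    """
    pop, y = np.asarray(pop, float), np.asarray(covariate, float)
    r = pop - cutoff
    keep = np.abs(r) < h                               # uniform kernel within the bandwidth
    r, y, towns = r[keep], y[keep], np.asarray(town_id)[keep]
    z = (r >= 0).astype(float)                         # above-threshold indicator
    X = sm.add_constant(np.column_stack([z, r, z * r]))
    fit = sm.OLS(y, X).fit(cov_type="cluster", cov_kwds={"groups": towns})
    return fit.params[1], fit.bse[1]                   # estimated jump and its clustered SE
```

The density smoothness test is run with the Stata command rddensity, as noted in Table 5, and is not re-implemented here.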

Table 6 Tests for local rank invariance or rank similarity. For each of Bank age, Black population (%), Farmland (%), and Log(manufacturing output), the table reports the estimated effect on the covariate's first moment and second moment. Note: Bias-corrected estimates of WQ-LATEs are reported; the standardized AMSE optimal bandwidth for the WQ-LATE estimator is h_π = 0.91, so the bandwidths for the estimation are set to h_R = 1108 and h_T = 0.4173 for R and T, respectively; the trimming thresholds are determined using a preliminary bandwidth for R equal to (3/4) h_R = 831; the bandwidths used to estimate the biases are a fixed multiple of the main bandwidths; bootstrapped standard errors are in parentheses.

Figure 10: Bias-corrected estimates of Q-LATEs on covariates (first moments).

Figure 11: Bias-corrected estimates of Q-LATEs on covariates (second moments).
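One way to read Table 6 and Figures 10 and 11 is as a set of placebo estimations: the same WQ-LATE (and Q-LATE) machinery is re-run with covariate moments in place of the outcome, and the resulting "effects" should all be close to zero under local rank invariance or rank similarity. A minimal sketch, treating the paper's estimator as a black-box function and using hypothetical column names, is:

```python
import pandas as pd

def rank_similarity_checks(df, wq_late,
                           covariates=("bank_age", "black_pop_pct",
                                       "farmland_pct", "log_manuf_output")):
    """Re-estimate the bias-corrected WQ-LATE with covariate moments as placebo outcomes.

    `wq_late` stands in for the paper's estimator and is assumed to return
    (estimate, standard error); the covariate column names are hypothetical.
    """
    rows = []
    for cov in covariates:
        for moment, series in (("first", df[cov]), ("second", df[cov] ** 2)):
            est, se = wq_late(df.assign(outcome=series), outcome="outcome")
            rows.append({"covariate": cov, "moment": moment,
                         "estimate": est, "se": se})
    return pd.DataFrame(rows)
```

The formal joint test statistic is defined in the earlier sections of the paper; the table of placebo estimates above is only the raw input to that test.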
