Regression III Regression Discontinuity Designs
Dave Armstrong
University of Western Ontario
Department of Political Science
Department of Statistics and Actuarial Science (by courtesy)
e: dave.armstrong@uwo.ca

Motivation
- Often, we want to use regression analysis to make causal statements.
- We can only do this if all of our modeling assumptions hold, including independence between X and $\varepsilon$.
- Normally, with observational data, these assumptions are unlikely to hold.
- Some research designs can leverage near-random assignment to mimic an experimental situation.

Example: State-building in Vietnam
- What were the effects of different military strategies on security, development, governance, civil society, etc. in Vietnam?
- Why can't we just estimate Modernization = $b_0 + b_1$ Bombing + Z, where each observation is a hamlet in Vietnam?
- Citation: Dell, Melissa and Pablo Querubin (2018) "Nation Building Through Foreign Intervention: Evidence from Discontinuities in Military Strategies." Quarterly Journal of Economics, 133(2).

US Government Metrics
- The US DoD used several metrics to guide military strategy.
- A battery of 169 questions about security, politics and economics was combined using Bayes' rule to produce a security score: $S \in [0, 5]$.
- The mainframe wouldn't print out the continuous score, so it was rounded and only the rounded numbers were printed.
- Identification of causal effects can be obtained by comparing hamlets that are close on the continuous score but get rounded into different categories.
Discontinuity
- We know that in the assignment of a score, a discontinuity exists at the rounding threshold.
- How can we estimate the effect of bombings, which are assigned largely based on the discontinuity?
- How do we know that the effect is real and not some modeling artifact?
- What assumptions are needed to motivate this type of analysis?

Reference
- This lecture is based primarily on the working manuscript: Matias D. Cattaneo, Nicolás Idrobo & Rocío Titiunik (2018) A Practical Introduction to Regression Discontinuity Designs. Cambridge University Press.
- Running example (Meyersson, 2014):
- Units of observation: municipalities in Turkey's 1994 mayoral election.
- Outcome: educational attainment of women.
- Score (running) variable: margin of victory of the (largest) Islamic party.
- Treatment: Islamic party electoral victory (win if margin of victory > 0).

Preliminary Notation
- Formalizing the design, there are n units indexed by $i = 1, 2, \ldots, n$.
- Each unit has a value on the score or running variable $X_i$.
- c is a known cutoff such that
$$T_i = \begin{cases} 1 & \text{if } X_i > c \\ 0 & \text{otherwise.} \end{cases}$$

Sharp vs. Fuzzy RDD
- The probability of treatment assignment changes discontinuously at the cutoff.
Potential Outcomes Framework
- Each observation has two potential outcomes:
- $Y_i(0)$: the outcome observed under the control, and
- $Y_i(1)$: the outcome observed under the treatment.
- Comparing these two outcomes should give us a sense of the causal effect of a variable.
- However, we only observe one or the other for each observation:
$$E[Y_i \mid X_i = x] = \begin{cases} E[Y_i(0) \mid X_i = x] & \text{if } x < \bar{x} \\ E[Y_i(1) \mid X_i = x] & \text{if } x \geq \bar{x} \end{cases}$$

Extrapolation and RDD
- In the sharp design, there is no joint support over $Y_i(0)$ and $Y_i(1)$.
- Extrapolation is required to identify the causal effect.

The Main Idea
- The fundamental idea is that the discontinuity can provide a measure of the causal impact if $E[Y_i(0) \mid X_i = x]$ and $E[Y_i(1) \mid X_i = x]$ are both continuous in x at the discontinuity $X_i = \bar{x}$.
- Under that condition,
$$\tau_{SRD} = E[Y_i(1) - Y_i(0) \mid X_i = \bar{x}] = \lim_{x \downarrow \bar{x}} E[Y_i \mid X_i = x] - \lim_{x \uparrow \bar{x}} E[Y_i \mid X_i = x]$$

Fuzzy Design
- In the fuzzy design, Pr(Treated) changes at $\bar{x}$, but not from 0 to 1 as in the sharp design.
- This could happen if everyone above $\bar{x}$ was eligible for the treatment, but only some took part.
$$\tau_{FRD} = \frac{E[(D_i(1) - D_i(0))(Y_i(1) - Y_i(0)) \mid X_i = \bar{x}]}{E[D_i(1) - D_i(0) \mid X_i = \bar{x}]} = \frac{\lim_{x \downarrow \bar{x}} E[Y_i \mid X_i = x] - \lim_{x \uparrow \bar{x}} E[Y_i \mid X_i = x]}{\lim_{x \downarrow \bar{x}} E[D_i \mid X_i = x] - \lim_{x \uparrow \bar{x}} E[D_i \mid X_i = x]}$$
- Here $D_i(0)$ is the treatment take-up indicator for those assigned to the control group, and $D_i(1)$ is the treatment take-up indicator for those assigned to the treatment.
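The two estimands above can be checked numerically with a small sketch. The conditional-mean functions `mu0` and `mu1`, the take-up probabilities, and the constant effect of 2 are hypothetical values chosen for illustration, not quantities from the lecture. The code evaluates the observed regression function ever closer to the cutoff from each side and shows that the gap (sharp design) and the ratio of gaps (fuzzy design) approach the assumed effect.

```r
# Hypothetical conditional means of the potential outcomes (illustrative only).
mu0 <- function(x) 1 + 0.5 * x               # E[Y(0) | X = x]
mu1 <- function(x) 3 + 0.5 * x + 0.2 * x^2   # E[Y(1) | X = x]
cutoff <- 0

# Sharp design: the observed regression function switches at the cutoff.
sharp_mean <- function(x) ifelse(x >= cutoff, mu1(x), mu0(x))

# Fuzzy design (assumed): take-up jumps from 0.2 to 0.8 at the cutoff and
# taking the treatment shifts the outcome by a constant 2.
takeup     <- function(x) ifelse(x >= cutoff, 0.8, 0.2)   # E[D | X = x]
fuzzy_mean <- function(x) mu0(x) + 2 * takeup(x)          # E[Y | X = x]

eps <- 10^-(1:6)   # distances from the cutoff, shrinking toward zero

# Difference between the right and left limits of E[Y | X = x].
sharp_gap <- sharp_mean(cutoff + eps) - sharp_mean(cutoff - eps)

# Ratio of the outcome jump to the take-up jump.
fuzzy_ratio <- (fuzzy_mean(cutoff + eps) - fuzzy_mean(cutoff - eps)) /
  (takeup(cutoff + eps) - takeup(cutoff - eps))

tau_srd <- mu1(cutoff) - mu0(cutoff)   # effect at the cutoff by construction
print(cbind(eps, sharp_gap, fuzzy_ratio))
```

Because only one potential outcome is ever observed on each side, the gap is recovered only in the limit, which is why real data require extrapolation from observations near the cutoff.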
Let's Get Kinky
- The Kink RD tries to estimate first derivatives of the regression function rather than the function itself.

Kink Designs
$$\tau_{SKRD} = \left.\frac{d}{dx} E[Y_i(1) - Y_i(0) \mid X_i = x]\right|_{x = \bar{x}} = \lim_{x \downarrow \bar{x}} \frac{d}{dx} E[Y_i \mid X_i = x] - \lim_{x \uparrow \bar{x}} \frac{d}{dx} E[Y_i \mid X_i = x]$$
$$\tau_{FKRD} = \frac{\left.\frac{d}{dx} E[(D_i(1) - D_i(0))(Y_i(1) - Y_i(0)) \mid X_i = x]\right|_{x = \bar{x}}}{\left.\frac{d}{dx} E[D_i(1) - D_i(0) \mid X_i = x]\right|_{x = \bar{x}}} = \frac{\lim_{x \downarrow \bar{x}} \frac{d}{dx} E[Y_i \mid X_i = x] - \lim_{x \uparrow \bar{x}} \frac{d}{dx} E[Y_i \mid X_i = x]}{\lim_{x \downarrow \bar{x}} \frac{d}{dx} E[D_i \mid X_i = x] - \lim_{x \uparrow \bar{x}} \frac{d}{dx} E[D_i \mid X_i = x]}$$

Other Designs
- Multi-cutoff Designs
- Multiple Score/Geographic Designs

RD Effects are Local
- The difference between $E[Y_i(1) \mid X]$ and $E[Y_i(0) \mid X]$ is calculated at a single point ($\bar{x}$) along the support of X.
- The effect will not necessarily generalize as we move away from the threshold without strong (usually unjustified) assumptions about the regression function.
Could Be (but isn't) Useful

    library(haven)
    data <- read_dta("...")  # file path is truncated in the source
    Y <- data$Y
    X <- data$X
    Z <- data$Z
    Z_X <- Z*X
    plot(Y ~ X, xlab = "Islamic Victory", ylab = "Female High School Share")
    abline(v=0)

[Figure: scatterplot of Female High School Share against Islamic Victory, with a vertical line at 0]

Binning Estimator
- We can partition the observations into bins and then take the average of Y within bins to get a sense of how the discontinuity looks.
$$\bar{Y}_{-,j} = \frac{1}{\#\{X_i \in B_{-,j}\}} \sum_{i: X_i \in B_{-,j}} Y_i \quad \text{and} \quad \bar{Y}_{+,j} = \frac{1}{\#\{X_i \in B_{+,j}\}} \sum_{i: X_i \in B_{+,j}} Y_i$$

RD Plot

    library(rdrobust)
    out <- rdplot(Y, X, nbins = c(20, 20), binselect = "esmv")

[Figure: RD Plot]

Notes on the Previous Slide
1. The binning and global parametric model certainly make it easier to see what is happening with respect to the discontinuity.
2. Global polynomials are not necessarily great because they are known to be unstable in the tails, and the tail (the boundary at the cutoff) is, by definition, the place we're looking.
Binning Estimators
- Bins can be:
- Evenly spaced (ES), with different numbers of observations in each bin.
- Quantile spaced (QS), with different distances between bin boundaries.
- There are a number of methods to optimally pick the number of bins.

Bins Example

    out = rdplot(Y, X, binselect = 'es')
    out = rdplot(Y, X, binselect = 'qs')

[Figure: two RD plots, one per bin type]

Optimally Choosing Bins: IMSE
- Some methods optimize Integrated Mean Squared Error (IMSE), so as to make the optimal tradeoff between bias and variance.
- This is not always best because it could produce an overly smooth plot.
- Omitting the nbins argument and specifying binselect = 'es' or binselect = 'qs' will generate these optimal bins for evenly spaced and quantile-spaced bins, respectively.

Optimally Choosing Bins: Mimicking Variability
- Bins can be chosen such that the variability in the binned means mimics the variability in the raw data.
- The result is not overly smooth like the IMSE binned estimator.
- This generally results in more bins than the IMSE method.
- ES bins can sometimes result in high variance in sparse regions of the data because the bins have to be small to accommodate the higher density around the cutoff.
- binselect = 'esmv' and binselect = 'qsmv' will generate the mimicking variance estimators.
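To make the difference between evenly spaced and quantile-spaced bins concrete, here is a minimal base R sketch of the binned means on one side of the cutoff. The simulated data, the helper `bin_means`, and the choice of 5 bins are all assumptions for illustration; this is not how `rdplot` chooses or computes its bins.

```r
set.seed(1)
n <- 400
x <- runif(n, -1, 1)^3                       # score, denser near the cutoff at 0
y <- 0.5 * x + 1 * (x >= 0) + rnorm(n, sd = 0.3)

# Average of x and y within each bin defined by `breaks`.
bin_means <- function(x, y, breaks) {
  bin <- cut(x, breaks = breaks, include.lowest = TRUE)
  data.frame(
    n    = as.vector(table(bin)),            # observations per bin
    xbar = as.vector(tapply(x, bin, mean)),  # mean score per bin
    ybar = as.vector(tapply(y, bin, mean))   # binned outcome mean
  )
}

right <- x >= 0   # work on the treated side only
J <- 5            # number of bins (hypothetical choice)

# Evenly spaced: equal-width bins, unequal counts.
es_breaks <- seq(0, max(x[right]), length.out = J + 1)
# Quantile spaced: (nearly) equal counts, unequal widths.
qs_breaks <- quantile(x[right], probs = seq(0, 1, length.out = J + 1))

es <- bin_means(x[right], y[right], es_breaks)
qs <- bin_means(x[right], y[right], qs_breaks)
print(es)
print(qs)
```

With a score that piles up near the cutoff, the evenly spaced bins far from the cutoff hold few observations, so their means are noisier, while the quantile-spaced bins hold roughly the same number of observations each.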
Bins Example

    out = rdplot(Y, X, binselect = 'esmv')
    out = rdplot(Y, X, binselect = 'qsmv')

[Figure: two RD plots, one per bin type]

RD Plots
- Good for illustration and investigation, but not for estimating the treatment effect.
- Global polynomials are too variable at the boundary points.
- Use MV bins (both QS and ES side by side) to illustrate the design, with a global 4th- or 5th-order polynomial.

Continuity-based Approach
- Better for point estimates and inference of the treatment effect.
- Use polynomial methods local to the cutoff to model $E[Y_i \mid X_i = x]$ from either side and treat $\tau_{SRD}$ as a parameter to be estimated.
- Either global polynomials (when all observations are used) or local polynomials (when only observations near the cutoff are used) can model the treatment effect.

Fundamentals
- The running variable (X) is assumed to be continuous, so there are few, if any, observations at $X = \bar{x}$.
- To estimate $E[Y_i(1) \mid X_i = \bar{x}]$ and $E[Y_i(0) \mid X_i = \bar{x}]$, points near (but not at) the cutoff need to be used.
- The main point of interest here is how the regression function is specified. This has huge effects on the robustness and credibility of the design and inference.
- The primary tool for estimating the effect is a low-order local polynomial regression (LPR).
LPR in RDD
1. Choose the order of the polynomial, p.
2. Choose a bandwidth h, such that only observations in $[\bar{x} - h, \bar{x} + h]$ are used to fit the LPR.
3. Estimate the following two LPR models, the first using observations above the cutoff and the second using observations below it:
$$\hat{Y}_i = \hat{\mu}_{+} + \sum_{j=1}^{p} \hat{\mu}_{+,j} (X_i - c)^j \quad (1)$$
$$\hat{Y}_i = \hat{\mu}_{-} + \sum_{j=1}^{p} \hat{\mu}_{-,j} (X_i - c)^j \quad (2)$$
using weights $w_i = K\left(\frac{X_i - c}{h}\right)$.
4. $\hat{\tau}_{SRD} = \hat{\mu}_{+} - \hat{\mu}_{-}$, which estimates $E[Y_i(1) \mid X_i = \bar{x}] - E[Y_i(0) \mid X_i = \bar{x}]$.

Example: First-order LPR
[Figure: first-order LPR example]

Choices to make in LPR
- Kernel: the triangular kernel (with MSE-optimal bandwidth selection) leads to a point estimate with optimal MSE properties. Here, the weight declines linearly moving away from $\bar{x}$. Other common options are the uniform and Epanechnikov kernels, but results tend to be robust with respect to this choice.
- Polynomial order: to make the appropriate bias-variance tradeoff, a polynomial order of p = 1 or p = 2 is usually recommended, with optimal bandwidth selection to maximize the accuracy of the estimate. Most research relies on local linear regression.
- Bandwidth: automatically selected given the two choices above (more below) to make the appropriate bias-variance tradeoff.

Bias and Bandwidth
[Figure: bias and bandwidth]
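The four LPR steps can be sketched in base R with weighted least squares. This is a minimal illustration on simulated data with an assumed jump of 2 and a hand-picked bandwidth of 0.3; it is not the `rdrobust` implementation, which also selects the bandwidth and corrects the inference.

```r
set.seed(42)
n   <- 2000
x   <- runif(n, -1, 1)                      # simulated running variable
tau <- 2                                    # assumed true jump at the cutoff
y   <- 1 + 0.8 * x + tau * (x >= 0) + rnorm(n, sd = 0.5)

cutoff <- 0      # c
h      <- 0.3    # step 2: bandwidth (hand-picked here, not data-driven)

# Step 1: p = 1 (local linear), so each side gets an intercept and a slope.
d   <- data.frame(y = y, xc = x - cutoff)
# Triangular kernel weights: decline linearly to zero at distance h.
d$w <- pmax(0, 1 - abs(d$xc) / h)
d   <- d[d$w > 0, ]                         # keep observations inside the window

# Step 3: one weighted regression per side of the cutoff.
fit_right <- lm(y ~ xc, data = d[d$xc >= 0, ], weights = w)
fit_left  <- lm(y ~ xc, data = d[d$xc < 0, ],  weights = w)

# Step 4: the estimate is the difference between the two intercepts.
tau_hat <- unname(coef(fit_right)[1] - coef(fit_left)[1])
print(tau_hat)
```

Centering the score at the cutoff is what makes each intercept an estimate of the regression function at the boundary, so the difference in intercepts is the estimated discontinuity.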
Optimal Bandwidth Choice
- The bandwidth is generally chosen to minimize MSE = Bias$^2$ + Variance.
- The bias is found by relating the local polynomial estimator to the curvature of the unknown regression function, and depends primarily on the (p + 1)th derivative of the function.
- The variance term is a function of the density of the running variable around the cutoff (higher density means lower variance) and the conditional variability of the outcome.
- Different bandwidths can be chosen on either side of the cutoff, since the treatment effect is the difference between two one-sided estimates.
- A regularization term is often included to prevent strange behavior when the bias is nearly zero (i.e., when a global linear model fits well).

Optimal BW Selection in R

    summary(rdbwselect(Y, X, kernel = 'triangular', p = 1, bwselect = 'msetwo'))

- Output settings: BW type msetwo, kernel triangular, VCE method NN, estimation order p = 1 and bias order q = 2 on both sides of the cutoff.
- The output table reports BW est. (h) and BW bias (b) separately left of c and right of c.
- Use the argument bwselect = 'mserd' for a single bandwidth across both regions.

Using rdrobust to Calculate Treatment Effect

    summary(rdrobust(Y, X, kernel = "triangular", p = 1, bwselect = "mserd"))

- Output settings: BW type mserd, kernel triangular, VCE method NN, estimation order 1, bias order 2.
- 95% C.I.: Conventional [0.223, 5.817]; Robust [-0.309, 6.276].

Using rdrobust to Calculate Treatment Effect (2)

    summary(rdrobust(Y, X, kernel = "triangular", p = 1, bwselect = "msetwo"))

- Output settings: BW type msetwo, kernel triangular, VCE method NN, estimation order 1, bias order 2.
- 95% C.I.: Conventional [0.243, 5.695]; Robust [-0.245, 6.152].
- In both cases the conventional interval excludes zero, while the robust interval includes it.
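The bias-variance tradeoff behind the MSE-optimal bandwidth can be written in a stylized form. This is a sketch in the spirit of the Cattaneo, Idrobo and Titiunik text; the symbols B and V are generic leading bias and variance constants introduced here for illustration and do not appear on the slides.

```latex
% Approximate MSE of the local polynomial RD estimator of order p
\mathrm{MSE}(\hat{\tau}_{SRD}) \approx h^{2(p+1)} B^2 + \frac{V}{nh}

% Setting the derivative with respect to h to zero:
2(p+1)\, h^{2p+1} B^2 - \frac{V}{n h^2} = 0
\quad\Longrightarrow\quad
h_{MSE} = \left( \frac{V}{2(p+1) B^2} \right)^{1/(2p+3)} n^{-1/(2p+3)}
```

Under this approximation, a local linear fit (p = 1) has an optimal bandwidth that shrinks at rate $n^{-1/5}$, and the expression grows without bound as B approaches zero, which is the situation the regularization term is meant to guard against.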
RD Plot, Optimal Bandwidth

    bandwidth <- rdrobust(Y, X, kernel = 'triangular', p = 1, bwselect = 'mserd')$bws[1,1]
    out <- rdplot(Y[abs(X) <= bandwidth], X[abs(X) <= bandwidth], p = 1, kernel = 'triangular')

[Figure: RD Plot restricted to the optimal bandwidth]

Inference
- Inference is less straightforward here, for reasons similar to those we've seen before.
- The bandwidth has been selected to make the optimal bias-variance tradeoff.
- An implication of this is that the model is almost necessarily misspecified, because the algorithm didn't minimize bias but a combination of bias and variance.
- Cattaneo et al. propose a robust, bias-corrected confidence interval for hypothesis testing:
- It is centered around a bias-corrected parameter estimate.
- Its variance takes into account the variability in the bias-correction phase as well as sampling variability.

Inference in Practice

    out <- rdrobust(Y, X, kernel = "triangular", p = 1, bwselect = "mserd", all = TRUE)
    cbind(out$coef, out$ci)

- The output has one row each for the Conventional, Bias-Corrected, and Robust methods, with columns Coeff, CI Lower, and CI Upper.

Another Alternative
- Another alternative is to use a different bandwidth for inference than for point estimation.
- An interesting choice here is $h_{CER}$, a bandwidth defined to minimize the coverage error of confidence intervals.

    out <- rdrobust(Y, X, kernel = "triangular", p = 1, bwselect = "cerrd", all = TRUE)
    cbind(out$coef, out$ci)

- The output has the same layout: Conventional, Bias-Corrected, and Robust rows with Coeff, CI Lower, and CI Upper.
Covariates in the Meyersson Data

    searchVarLabels(data, "")

    variable         ind  label
    X                1    Islamic Vote Margin in 1994
    Y                2    Share Women aged 15-20 with High School Education
    Z                3    Islamic mayor in 1994
    ageshr19         4    Population share below 19 in 2000
    ageshr60         5    Population share above 60 in 2000
    buyuk            6    Metro center
    hischshr1520m    7    Share Men aged 15-20 with High School Education
    i89              8    Islamic Mayor in 1989
    lpop1994         9    Log Population in 1994
    merkezi          10   District center
    merkezp          11   Province center
    partycount       12   Number of parties receiving votes 1994
    sexr             13   Gender ratio in 2000
    shhs             14   Household size in 2000
    subbuyuk         15   Sub-metro center
    vshr_islam1994   16   Islamic vote share 1994

Including Covariates
- Covariates can be included in the RD design with the covs argument in rdrobust.
- The estimate is only really considered a treatment effect if the covariates are determined and fixed before the assignment of the treatment.
- In the best-case scenario, covariates reduce sampling variability without increasing bias.

    Z = data[,c("vshr_islam1994", "partycount", "lpop1994", "merkezi", "merkezp", "subbuyuk", "buyuk")]
    outcov <- rdrobust(Y, X, covs = Z, kernel = 'triangular', scaleregul = 1, p = 1, bwselect = 'mserd')
    cbind(outcov$coef, outcov$ci)

- The output has one row each for the Conventional, Bias-Corrected, and Robust methods, with columns Coeff, CI Lower, and CI Upper.

Randomization Inference Approach
- The previous approach leveraged the assumption of continuity and smoothness of $E[Y_i(0) \mid X_i = x]$ and $E[Y_i(1) \mid X_i = x]$ at the cutoff to make inferences.
- Randomization inference views the RD design as a randomized experiment around the cutoff $\bar{x}$.
- The sharp differences in treatment status at the cutoff resemble a randomized controlled trial at the cutoff.
- Units whose score values (values on the running variable) are in a small window around the cutoff can be analyzed as if they came from a randomly assigned experiment.
- Local randomization inference is particularly useful when the running variable is discrete or has relatively few points.
- It can be used as a robustness check for continuity-based designs, but local randomization requires stronger assumptions.

Local Randomization Overview
- We assume that:
- For points in a small window around the cutoff, $W_0 = [\bar{x} - w_0, \bar{x} + w_0]$, assignment to treatment or control can be considered random (also known as as-if random assignment).
- Not only is the assignment random, but the running variable in the window must be unrelated to the outcome.

Similarity of RD and Experiments
[Figure: similarity of RD and experiments]
Formalization
- In the strongest version, we assume that for $X_i \in W_0$, $Y_i(X_i, T_i) = Y_i(T_i)$: the running variable only influences Y through the treatment indicator.
- In a weaker version, we could relax the above to $\phi(Y_i(X_i, T_i), X_i, T_i) = \tilde{Y}_i(T_i)$: there exists a transformation $\phi$ for which the first condition is true.

Estimation and Inference
- Estimation could take the form of large-sample statistical estimators if there are lots of $X_i \in W_0$, but this is often not the case.
- Randomization inference has exact, finite-sample properties, which makes it quite attractive for this case.
- Fisherian inference: potential outcomes are non-stochastic (i.e., fixed, no random sampling assumed).
$$H_0^F: Y_i(0) = Y_i(1) \quad \forall i$$
- Under the null, all potential outcomes are known, because for each observation the two outcomes are the same.

Hypothetical Example of Fisherian Inference
- Imagine we have 5 units in $W_0$ and we randomly assign $n_{W_0,+} = 3$ units to the treatment and $n_{W_0,-} = n_{W_0} - n_{W_0,+} = 2$ units to the control.
- Under complete randomization, we can treat $n_{W_0,+}$, and by extension $n_{W_0,-}$, as fixed and find all possible treatment vectors t that preserve the marginal distribution of T.
- In our example, there are $\binom{5}{3} = 10$ possible assignments to treatment and control.
- Assume that $Y = (5, 2, 2, 5, 5)$ and that $T = (1, 0, 0, 1, 1)$; then the observed difference in means is $S_{obs} = \bar{Y}_{+} - \bar{Y}_{-} = 5 - 2 = 3$.

Distribution of Test Statistic under Null
- If complete enumeration of all possible assignments is not feasible, simulation can be used.
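The hypothetical example is small enough to enumerate completely in base R. The sketch below uses the Y and T vectors from the slide; the helper `diff_means` and the two-sided p-value definition are choices made here for illustration.

```r
y     <- c(5, 2, 2, 5, 5)   # outcomes from the hypothetical example
t_obs <- c(1, 0, 0, 1, 1)   # observed treatment assignment

# Difference in means for a given set of treated indices.
diff_means <- function(y, treated) mean(y[treated]) - mean(y[-treated])

s_obs <- diff_means(y, which(t_obs == 1))

# All ways of choosing 3 treated units out of 5 (fixed margins).
assignments <- combn(length(y), sum(t_obs))

# Under the sharp null the outcomes do not change with the assignment,
# so the statistic can be recomputed for every possible assignment.
s_null <- apply(assignments, 2, function(idx) diff_means(y, idx))

# Share of assignments with a statistic at least as extreme as observed.
p_value <- mean(abs(s_null) >= abs(s_obs))

print(sort(s_null))
print(p_value)
```

Only the observed assignment, which treats all three units with outcome 5, produces a difference as large as 3, so the exact p-value is 1 in 10. This also shows that with five units no test can reject at the 5% level, since the smallest attainable p-value is 0.1.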
Test Statistics
- Fisherian inference is general and should work for any test statistic. Some other common choices for RD designs are:
- Kolmogorov-Smirnov (KS) statistic: $S_{KS} = \sup_y |\hat{F}_1(y) - \hat{F}_0(y)|$, the biggest absolute difference between the two empirical CDFs. It is better than the difference of means when departures from the null are in other moments or quantiles.
- Wilcoxon rank sum statistic: $S_{WR} = \sum_{i: T_i = 1} R_i^y$, where $R_i^y$ is the rank of the outcome. $S_{WR}$ is not affected by the cardinal values of the outcome, only their ordering.

    library(rdlocrand)
    rdrandinf(Y, X, wl = -2.5, wr = 2.5, seed = 50, reps = 2500)

- Output settings: selected window = [-2.5; 2.5] (set by user), number of obs = 2629, order of poly = 0, kernel type = uniform, reps = 2500, H0: tau = 0, randomization = fixed margins, cutoff c = 0.
- The output reports, left and right of c, the number of observations, effective number of observations, and the mean and standard deviation of the outcome within the window.
- The test statistic is the difference in means, reported with finite-sample and large-sample p-values and power against d = 4.27.

Choosing the Window
- Some options:
1. Ad hoc or theoretically defined: both are different flavors of arbitrary.
2. Use pre-treatment covariates to select the window.
- This assumes that there exists a variable Z that is related to the running variable outside the window, but not inside the window. (Without this assumption, the procedure breaks down.)
- The effect of the treatment on Z, since Z is pre-determined, is 0 by construction.

Data-driven Choice of Window
1. Identify $H_0^F$: Z is unrelated to T, or balanced on T.
2. Start with the smallest possible window and test $H_0^F$.
3. Continue to widen the window until $H_0^F$ is rejected at a pre-specified significance level.
4. The chosen window is the largest one that continues to fail to reject $H_0^F$.
- We need to choose the following things:
- Relevant covariates
- Test statistic
- Randomization mechanism
- Minimum n in the smallest window
- Significance level
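The two alternative test statistics can be computed directly in base R. The sketch below uses a small made-up sample (the values in `y1` and `y0` are hypothetical, chosen to avoid ties), and the helper names `ks_stat` and `wr_stat` are introduced here for illustration. It also checks the claim that the rank sum depends only on the ordering of the outcomes.

```r
y1 <- c(5.1, 6.3, 7.0)   # hypothetical treated outcomes
y0 <- c(4.2, 5.5)        # hypothetical control outcomes
y  <- c(y1, y0)
treated <- c(TRUE, TRUE, TRUE, FALSE, FALSE)

# KS statistic: largest absolute gap between the two empirical CDFs,
# evaluated at every observed outcome value.
ks_stat <- function(y, treated) {
  grid <- sort(unique(y))
  max(abs(ecdf(y[treated])(grid) - ecdf(y[!treated])(grid)))
}

# Wilcoxon rank sum: sum of the pooled-sample ranks of the treated units.
wr_stat <- function(y, treated) sum(rank(y)[treated])

s_ks <- ks_stat(y, treated)
s_wr <- wr_stat(y, treated)

# A monotone transformation changes the values but not the ranks.
s_wr_exp <- wr_stat(exp(y), treated)

print(c(KS = s_ks, WR = s_wr, WR_after_exp = s_wr_exp))
```

Either function could replace the difference in means in a randomization test: the statistic is recomputed over reassignments of `treated` and compared with its observed value.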
Formalization of the Procedure
1. Start with a symmetric window of length $2w_j$, $W_j = [\bar{x} - w_j, \bar{x} + w_j]$.
2. Compute the test statistic either for each covariate individually or compute the omnibus test p-value.
3. Find the smallest p-value $p_{min}$ and evaluate whether $p_{min} > \alpha$.
- If yes, then fail to reject $H_0$ and increase the size of the window by a pre-specified step.
- If no, then use the window $W_{j-1}$.
- The step procedure can be defined by a fixed length (wstep in R) or such that a certain number of observations is included (wobs in R).

Window Selection in R

    Z <- data[, c("i89", "vshr_islam1994", "partycount", "lpop1994", "merkezi", "merkezp", "subbuyuk", "buyuk")]
    rdwinselect(X, Z, seed = 50, reps = 1000, wobs = 2)

- Output settings: window selection for RD under local randomization, number of obs = 2629, order of poly = 0, kernel type = uniform, reps = 1000, testing method = rdrandinf, balance test = diffmeans, cutoff c = 0.
- For each candidate window, the output lists the window length / 2, the minimum p-value, the name of the covariate that attains it, the binomial test p-value, and the number of observations below and above c.
- Recommended window is [-0.944; 0.944] with 38 observations (17 below, 21 above).

    rdrandinf(Y, X, wl = -.944, wr = .944, seed = 50, reps = 2500)

- Output settings: selected window = [-0.944; 0.944] (set by user), number of obs = 2629, order of poly = 0, kernel type = uniform, reps = 2500, H0: tau = 0, randomization = fixed margins, cutoff c = 0.
- The test statistic is the difference in means, reported with finite-sample and large-sample p-values.

Local Randomization or Continuity Approach?
- Local randomization requires stronger assumptions than the continuity-based approach, thus one might use this approach to probe the conditions under which inference makes sense.
- The continuity-based approach requires reasonable data density around the cutoff. If this isn't the case, then the local randomization approach might be better.
- When the running variable is discrete (even potentially with lots of values, e.g., age in years), the local randomization approach could be better because there will be mass points with multiple observations.
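The nested-window procedure can be sketched in base R with a permutation balance test on a single covariate. Everything here is an assumption for illustration: the simulated data (a covariate built to be unrelated to the score within 3 units of the cutoff), the helper `perm_pvalue`, the fixed step of 1, and the significance level of 0.15. It is not the `rdwinselect` algorithm, which handles several covariates and other step rules.

```r
set.seed(50)
n <- 1000
x <- runif(n, -10, 10)                                   # running variable, cutoff at 0
# Pre-treatment covariate: unrelated to the score near the cutoff only.
z <- ifelse(abs(x) <= 3, 0, 0.5 * x) + rnorm(n)

# Permutation p-value for a difference in covariate means between groups.
# sample(tr) shuffles the labels, so group sizes stay fixed.
perm_pvalue <- function(z, tr, reps = 500) {
  s_obs  <- abs(mean(z[tr]) - mean(z[!tr]))
  s_perm <- replicate(reps, {
    tp <- sample(tr)
    abs(mean(z[tp]) - mean(z[!tp]))
  })
  mean(s_perm >= s_obs)
}

alpha   <- 0.15               # pre-specified significance level (assumed)
windows <- seq(1, 10, by = 1) # candidate half-widths, smallest first
chosen  <- NA                 # stays NA if even the smallest window fails

for (w in windows) {
  inside <- abs(x) <= w
  p <- perm_pvalue(z[inside], x[inside] >= 0)
  if (p <= alpha) break       # balance rejected: stop widening
  chosen <- w                 # largest window so far that passes
}
print(chosen)
```

Because each balance test can reject by chance, the loop may stop before the true boundary; a relatively large significance level makes the procedure more conservative about widening the window.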
Validation
- There are threats to validity with RD designs.
- If the cutoff is known to the units ahead of time, this can threaten the validity of the RD design.
- Units may try to actively manipulate their score if they are just below the cutoff.
- There are empirical tests aimed at evaluating the validity of the design:
1. Continuity of the score density around the cutoff.
2. Null treatment effects on pre-treatment covariates and placebos.
3. Regression function continuity at arbitrary alternative cutoffs.

RD Plots of Covariates and Placebos

    rdplot(data$i89, X, x.label = "Score", y.label = "", title = "", x.lim = c(-100,100), cex.axis = 1.5, cex.lab = 1.5)
    rdplot(data$merkezi, X, x.label = "Score", y.label = "", title = "", x.lim = c(-100,100), cex.axis = 1.5, cex.lab = 1.5)

[Figure: RD plots of i89 and merkezi against Score]

Density of the Running Variable
- If units don't have the ability to manipulate their score, then there should be similar data density on both sides of the cutoff.

    library(rddensity)
    summary(rddensity(X))

- Output settings: RD manipulation test using local polynomial density estimation, number of obs = 2629, model = unrestricted, kernel = triangular, BW method = comb, VCE method = jackknife, cutoff c = 0, order est. (p) = 2, order bias (q) = 3.
- The output reports the number and effective number of observations and the bandwidth on each side of c, followed by the robust test statistic T and its p-value.

Density Histogram

    library(ggplot2)
    bw_left <- as.numeric(rddensity(X)$h[1])
    bw_right <- as.numeric(rddensity(X)$h[2])
    tempdata <- as.data.frame(X)
    colnames(tempdata) <- c("v1")
    plot2 <- ggplot(data = tempdata, aes(x = v1)) +
      theme_bw(base_size = 17) +
      geom_histogram(aes(x = v1, y = ..count..), breaks = seq(-bw_left, 0, 1), fill = "blue") +
      geom_histogram(aes(x = v1, y = ..count..), breaks = seq(0, bw_right, 1), fill = "red") +
      labs(x = "Score", y = "Number of Observations") +
      geom_vline(xintercept = 0, color = "black")
    plot2

[Figure: histogram of Number of Observations by Score on each side of the cutoff]
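A much cruder check of sorting around the cutoff than the local polynomial density test is an exact binomial test on the counts just below and just above the cutoff. The sketch below swaps in that simpler technique; the counts of 17 and 21 are taken as an illustrative input, and the null probability of 0.5 is an assumption that units near the cutoff are equally likely to land on either side.

```r
n_below <- 17   # observations just below the cutoff (illustrative counts)
n_above <- 21   # observations just above the cutoff

# Exact two-sided test of H0: Pr(below) = 0.5 within the window.
bt <- binom.test(n_below, n_below + n_above, p = 0.5)

print(bt$p.value)
print(bt$conf.int)   # confidence interval for the share below the cutoff
```

A large p-value means the split is consistent with no sorting, but the test only looks at counts in one window, so it complements rather than replaces the density test.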
Null Effects on Pre-treatment Covariates and Placebos
- If the design is valid, the treatment should have no effect on pre-treatment covariates or placebo outcomes.
- Anything determined before the treatment counts as a pre-treatment covariate.
- Placebo outcomes are context-specific.

Covariates and Placebos

    Z <- data[, c("i89", "vshr_islam1994", "partycount", "lpop1994", "merkezi", "merkezp", "subbuyuk", "buyuk")]
    robs <- lapply(1:ncol(Z), function(x) rdrobust(Z[[x]], X))
    names(robs) <- colnames(Z)
    t(round(sapply(robs, function(x) cbind(x$coef, x$ci)[3,]), 3))

- The output has one row per covariate (i89, vshr_islam1994, partycount, lpop1994, merkezi, merkezp, subbuyuk, buyuk), with columns Coeff, CI Lower, and CI Upper for the robust estimate.

With Randomization Inference

    robs <- lapply(1:ncol(Z), function(x) rdrandinf(Z[[x]], X, wl = -.944, wr = .944))
    names(robs) <- colnames(Z)
    t(round(sapply(robs, function(x) c(stat = x$obs.stat, pval = x$p.value)), 4))

- The output has one row per covariate, with columns stat and pval.

Regression Function Continuity
- One of the assumptions we made before was that the regression functions are continuous at the cutoff for both treatment and control groups.
- To check this at placebo cutoffs, use only control observations for cutoffs below 0 and only treated observations for cutoffs above 0, so that the true discontinuity does not contaminate the estimate.

    treat <- which(X >= 0)
    contr <- which(X < 0)
    cutoffs <- seq(-5, 5, by = 1)
    cutoffs <- cutoffs[-which(cutoffs == 0)]
    res <- list()
    for(i in 1:length(cutoffs)){
      if(cutoffs[i] < 0){
        res[[i]] <- rdrobust(Y[contr], X[contr], c = cutoffs[i])
      } else {
        res[[i]] <- rdrobust(Y[treat], X[treat], c = cutoffs[i])
      }
    }
    cbind(cutoff = cutoffs, t(round(sapply(res, function(x) cbind(x$coef, x$ci)[3,]), 3)))

- The output has one row per placebo cutoff (10 rows), with columns cutoff, Coeff, CI Lower, and CI Upper.
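The placebo-cutoff logic can be illustrated on simulated data where the truth is known. This is a minimal base R sketch with an assumed jump of 2 at zero, a hand-picked bandwidth, and a hypothetical helper `rd_llr` (a weighted local linear fit with a triangular kernel); it does not reproduce the bias correction or bandwidth selection of `rdrobust`.

```r
set.seed(7)
n <- 4000
x <- runif(n, -1, 1)
y <- 1 + 0.8 * x + 2 * (x >= 0) + rnorm(n, sd = 0.5)   # true jump only at 0

# Local linear RD estimate at `cutoff` with triangular weights and bandwidth h.
rd_llr <- function(y, x, cutoff, h = 0.3) {
  d <- data.frame(y = y, xc = x - cutoff)
  d$w     <- pmax(0, 1 - abs(d$xc) / h)
  d$right <- d$xc >= 0
  # Separate intercept and slope on each side; `rightTRUE` is the jump.
  fit <- lm(y ~ xc * right, data = d[d$w > 0, ], weights = w)
  unname(coef(fit)["rightTRUE"])
}

true_cut <- rd_llr(y, x, cutoff = 0)

# Placebo cutoffs use one side of the real cutoff only.
placebo_left  <- rd_llr(y[x < 0],  x[x < 0],  cutoff = -0.5)
placebo_right <- rd_llr(y[x >= 0], x[x >= 0], cutoff = 0.5)

print(c(true = true_cut, left = placebo_left, right = placebo_right))
```

Estimates near zero at the placebo cutoffs, alongside a clear jump at the real cutoff, are what a continuous regression function would produce.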
Sensitivity to Observations Close to Cutoff
- If there is potential for manipulation, the observations closest to the cutoff are the most susceptible.
- Take them out and re-evaluate the effect.

    rdrobust(Y[abs(X) >= 0.25], X[abs(X) >= 0.25])[c("coef", "ci")]

- The output lists $coef (Coeff) and $ci (CI Lower, CI Upper) for the Conventional, Bias-Corrected, and Robust methods.

Donut-hole Estimation

    out <- t(sapply(seq(0, 1.25, by = .25), function(i)
      with(rdrobust(Y[abs(X) >= i], X[abs(X) >= i]), c(coef = coef[3], ci[3,]))))
    out <- cbind(radius = seq(0, 1.25, by = .25), out)
    out

- The output has one row per donut radius (0 to 1.25 in steps of 0.25), with columns radius, coef, CI Lower, and CI Upper.

Conclusion
- The RDD approach can be valuable with the right data and question.
- We have to be careful that the causal effect is not a modeling artifact.
- Use data-driven tools to estimate the appropriate bandwidth, window width, etc.
- Do sensitivity testing to make sure that your results are not sensitive to modeling choices.
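The donut-hole idea can be sketched on simulated data where no manipulation is present, so the estimate should stay close to the assumed jump of 2 as observations near the cutoff are dropped. The data, the bandwidth, the radii, and the helper `rd_llr` are assumptions for illustration and stand in for the `rdrobust` calls used in the lecture.

```r
set.seed(11)
n <- 5000
x <- runif(n, -1, 1)
y <- 1 + 0.8 * x + 2 * (x >= 0) + rnorm(n, sd = 0.3)   # assumed jump of 2 at 0

# Local linear RD estimate at `cutoff` with triangular weights and bandwidth h.
rd_llr <- function(y, x, cutoff = 0, h = 0.3) {
  d <- data.frame(y = y, xc = x - cutoff)
  d$w     <- pmax(0, 1 - abs(d$xc) / h)
  d$right <- d$xc >= 0
  fit <- lm(y ~ xc * right, data = d[d$w > 0, ], weights = w)
  unname(coef(fit)["rightTRUE"])   # jump in the intercept at the cutoff
}

radii <- c(0, 0.05, 0.10)   # donut radii (hypothetical values)

# Re-estimate after excluding observations within each radius of the cutoff.
est <- sapply(radii, function(r) {
  keep <- abs(x) >= r
  rd_llr(y[keep], x[keep])
})

print(cbind(radius = radii, estimate = est))
```

Larger donuts force the fit to extrapolate further to the cutoff, so the estimates become noisier even when the design is valid; a stable point estimate with widening uncertainty is the expected pattern.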
More informationRegression Discontinuity
Regression Discontinuity Christopher Taber Department of Economics University of Wisconsin-Madison October 24, 2017 I will describe the basic ideas of RD, but ignore many of the details Good references
More informationChapter 11. Regression with a Binary Dependent Variable
Chapter 11 Regression with a Binary Dependent Variable 2 Regression with a Binary Dependent Variable (SW Chapter 11) So far the dependent variable (Y) has been continuous: district-wide average test score
More informationAddressing Analysis Issues REGRESSION-DISCONTINUITY (RD) DESIGN
Addressing Analysis Issues REGRESSION-DISCONTINUITY (RD) DESIGN Overview Assumptions of RD Causal estimand of interest Discuss common analysis issues In the afternoon, you will have the opportunity to
More informationRegression Discontinuity
Regression Discontinuity Christopher Taber Department of Economics University of Wisconsin-Madison October 16, 2018 I will describe the basic ideas of RD, but ignore many of the details Good references
More informationGov 2002: 3. Randomization Inference
Gov 2002: 3. Randomization Inference Matthew Blackwell September 10, 2015 Where are we? Where are we going? Last week: This week: What can we identify using randomization? Estimators were justified via
More information41903: Introduction to Nonparametrics
41903: Notes 5 Introduction Nonparametrics fundamentally about fitting flexible models: want model that is flexible enough to accommodate important patterns but not so flexible it overspecializes to specific
More informationApplied Statistics and Econometrics
Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple
More informationApplied Statistics and Econometrics
Applied Statistics and Econometrics Lecture 5 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 44 Outline of Lecture 5 Now that we know the sampling distribution
More informationReview of Statistics 101
Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods
More informationUse of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:
Use of Matching Methods for Causal Inference in Experimental and Observational Studies Kosuke Imai Department of Politics Princeton University April 27, 2007 Kosuke Imai (Princeton University) Matching
More informationLecture Outline. Biost 518 Applied Biostatistics II. Choice of Model for Analysis. Choice of Model. Choice of Model. Lecture 10: Multiple Regression:
Biost 518 Applied Biostatistics II Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture utline Choice of Model Alternative Models Effect of data driven selection of
More informationSelection on Observables
Selection on Observables Hasin Yousaf (UC3M) 9th November Hasin Yousaf (UC3M) Selection on Observables 9th November 1 / 22 Summary Altonji, Elder and Taber, JPE, 2005 Bellows and Miguel, JPubE, 2009 Oster,
More informationRegression with a Single Regressor: Hypothesis Tests and Confidence Intervals
Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (SW Chapter 5) Outline. The standard error of ˆ. Hypothesis tests concerning β 3. Confidence intervals for β 4. Regression
More informationAn Alternative Assumption to Identify LATE in Regression Discontinuity Designs
An Alternative Assumption to Identify LATE in Regression Discontinuity Designs Yingying Dong University of California Irvine September 2014 Abstract One key assumption Imbens and Angrist (1994) use to
More informationted: a Stata Command for Testing Stability of Regression Discontinuity Models
ted: a Stata Command for Testing Stability of Regression Discontinuity Models Giovanni Cerulli IRCrES, Research Institute on Sustainable Economic Growth National Research Council of Italy 2016 Stata Conference
More informationPassing-Bablok Regression for Method Comparison
Chapter 313 Passing-Bablok Regression for Method Comparison Introduction Passing-Bablok regression for method comparison is a robust, nonparametric method for fitting a straight line to two-dimensional
More informationAn Analysis of College Algebra Exam Scores December 14, James D Jones Math Section 01
An Analysis of College Algebra Exam s December, 000 James D Jones Math - Section 0 An Analysis of College Algebra Exam s Introduction Students often complain about a test being too difficult. Are there
More informationLecture 18: Simple Linear Regression
Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength
More informationSTATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
More informationHypothesis testing. Data to decisions
Hypothesis testing Data to decisions The idea Null hypothesis: H 0 : the DGP/population has property P Under the null, a sample statistic has a known distribution If, under that that distribution, the
More informationBusiness Statistics. Lecture 10: Course Review
Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,
More informationRegression #8: Loose Ends
Regression #8: Loose Ends Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #8 1 / 30 In this lecture we investigate a variety of topics that you are probably familiar with, but need to touch
More informationWhy High-Order Polynomials Should Not Be Used in Regression Discontinuity Designs
Why High-Order Polynomials Should Not Be Used in Regression Discontinuity Designs Andrew GELMAN Department of Statistics and Department of Political Science, Columbia University, New York, NY, 10027 (gelman@stat.columbia.edu)
More informationCS 147: Computer Systems Performance Analysis
CS 147: Computer Systems Performance Analysis Summarizing Variability and Determining Distributions CS 147: Computer Systems Performance Analysis Summarizing Variability and Determining Distributions 1
More informationCorrelation and regression
NST 1B Experimental Psychology Statistics practical 1 Correlation and regression Rudolf Cardinal & Mike Aitken 11 / 12 November 2003 Department of Experimental Psychology University of Cambridge Handouts:
More informationHypothesis testing, part 2. With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal
Hypothesis testing, part 2 With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal 1 CATEGORICAL IV, NUMERIC DV 2 Independent samples, one IV # Conditions Normal/Parametric Non-parametric
More informationTwo Sample Problems. Two sample problems
Two Sample Problems Two sample problems The goal of inference is to compare the responses in two groups. Each group is a sample from a different population. The responses in each group are independent
More informationBias Variance Trade-off
Bias Variance Trade-off The mean squared error of an estimator MSE(ˆθ) = E([ˆθ θ] 2 ) Can be re-expressed MSE(ˆθ) = Var(ˆθ) + (B(ˆθ) 2 ) MSE = VAR + BIAS 2 Proof MSE(ˆθ) = E((ˆθ θ) 2 ) = E(([ˆθ E(ˆθ)]
More informationThe Economics of European Regions: Theory, Empirics, and Policy
The Economics of European Regions: Theory, Empirics, and Policy Dipartimento di Economia e Management Davide Fiaschi Angela Parenti 1 1 davide.fiaschi@unipi.it, and aparenti@ec.unipi.it. Fiaschi-Parenti
More informationBIOS 312: Precision of Statistical Inference
and Power/Sample Size and Standard Errors BIOS 312: of Statistical Inference Chris Slaughter Department of Biostatistics, Vanderbilt University School of Medicine January 3, 2013 Outline Overview and Power/Sample
More informationreview session gov 2000 gov 2000 () review session 1 / 38
review session gov 2000 gov 2000 () review session 1 / 38 Overview Random Variables and Probability Univariate Statistics Bivariate Statistics Multivariate Statistics Causal Inference gov 2000 () review
More informationStatistical Data Analysis
DS-GA 0 Lecture notes 8 Fall 016 1 Descriptive statistics Statistical Data Analysis In this section we consider the problem of analyzing a set of data. We describe several techniques for visualizing the
More informationChapter 9. Non-Parametric Density Function Estimation
9-1 Density Estimation Version 1.1 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least
More informationMultiple Regression Analysis
Multiple Regression Analysis y = β 0 + β 1 x 1 + β 2 x 2 +... β k x k + u 2. Inference 0 Assumptions of the Classical Linear Model (CLM)! So far, we know: 1. The mean and variance of the OLS estimators
More informationThe Simple Linear Regression Model
The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate
More informationTable of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).
Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). For example P(X.04) =.8508. For z < 0 subtract the value from,
More informationChapter 7. Hypothesis Tests and Confidence Intervals in Multiple Regression
Chapter 7 Hypothesis Tests and Confidence Intervals in Multiple Regression Outline 1. Hypothesis tests and confidence intervals for a single coefficie. Joint hypothesis tests on multiple coefficients 3.
More information(1) Sort all time observations from least to greatest, so that the j th and (j + 1) st observations are ordered by t j t j+1 for all j = 1,..., J.
AFFIRMATIVE ACTION AND HUMAN CAPITAL INVESTMENT 8. ONLINE APPENDIX TO ACCOMPANY Affirmative Action and Human Capital Investment: Theory and Evidence from a Randomized Field Experiment, by CHRISTOPHER COTTON,
More informationStatistical Inference: Estimation and Confidence Intervals Hypothesis Testing
Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire
More informationEMERGING MARKETS - Lecture 2: Methodology refresher
EMERGING MARKETS - Lecture 2: Methodology refresher Maria Perrotta April 4, 2013 SITE http://www.hhs.se/site/pages/default.aspx My contact: maria.perrotta@hhs.se Aim of this class There are many different
More informationReview of Statistics
Review of Statistics Topics Descriptive Statistics Mean, Variance Probability Union event, joint event Random Variables Discrete and Continuous Distributions, Moments Two Random Variables Covariance and
More informationApplied Statistics and Econometrics
Applied Statistics and Econometrics Lecture 7 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 68 Outline of Lecture 7 1 Empirical example: Italian labor force
More informationECO220Y Simple Regression: Testing the Slope
ECO220Y Simple Regression: Testing the Slope Readings: Chapter 18 (Sections 18.3-18.5) Winter 2012 Lecture 19 (Winter 2012) Simple Regression Lecture 19 1 / 32 Simple Regression Model y i = β 0 + β 1 x
More informationChapter 2: Resampling Maarten Jansen
Chapter 2: Resampling Maarten Jansen Randomization tests Randomized experiment random assignment of sample subjects to groups Example: medical experiment with control group n 1 subjects for true medicine,
More informationLecture 21: October 19
36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 21: October 19 21.1 Likelihood Ratio Test (LRT) To test composite versus composite hypotheses the general method is to use
More informationMy data doesn t look like that..
Testing assumptions My data doesn t look like that.. We have made a big deal about testing model assumptions each week. Bill Pine Testing assumptions Testing assumptions We have made a big deal about testing
More informationRegression Discontinuity
Regression Discontinuity Christopher Taber Department of Economics University of Wisconsin-Madison October 9, 2016 I will describe the basic ideas of RD, but ignore many of the details Good references
More informationBusiness Analytics and Data Mining Modeling Using R Prof. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee
Business Analytics and Data Mining Modeling Using R Prof. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee Lecture - 04 Basic Statistics Part-1 (Refer Slide Time: 00:33)
More informationExperiments and Quasi-Experiments
Experiments and Quasi-Experiments (SW Chapter 13) Outline 1. Potential Outcomes, Causal Effects, and Idealized Experiments 2. Threats to Validity of Experiments 3. Application: The Tennessee STAR Experiment
More informationGov 2002: 4. Observational Studies and Confounding
Gov 2002: 4. Observational Studies and Confounding Matthew Blackwell September 10, 2015 Where are we? Where are we going? Last two weeks: randomized experiments. From here on: observational studies. What
More informationUnit 14: Nonparametric Statistical Methods
Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 8/8/2003 Unit 14 - Stat 571 - Ramón V. León 1 Introductory Remarks Most methods studied so far have been based
More informationRegression Discontinuity Designs Using Covariates
Regression Discontinuity Designs Using Covariates Sebastian Calonico Matias D. Cattaneo Max H. Farrell Rocío Titiunik May 25, 2018 We thank the co-editor, Bryan Graham, and three reviewers for comments.
More informationChapter 9. Non-Parametric Density Function Estimation
9-1 Density Estimation Version 1.2 Chapter 9 Non-Parametric Density Function Estimation 9.1. Introduction We have discussed several estimation techniques: method of moments, maximum likelihood, and least
More informationAP Statistics Cumulative AP Exam Study Guide
AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics
More informationIntroduction to hypothesis testing
Introduction to hypothesis testing Review: Logic of Hypothesis Tests Usually, we test (attempt to falsify) a null hypothesis (H 0 ): includes all possibilities except prediction in hypothesis (H A ) If
More informationHypothesis Tests and Confidence Intervals. in Multiple Regression
ECON4135, LN6 Hypothesis Tests and Confidence Intervals Outline 1. Why multipple regression? in Multiple Regression (SW Chapter 7) 2. Simpson s paradox (omitted variables bias) 3. Hypothesis tests and
More informationWhat s New in Econometrics. Lecture 1
What s New in Econometrics Lecture 1 Estimation of Average Treatment Effects Under Unconfoundedness Guido Imbens NBER Summer Institute, 2007 Outline 1. Introduction 2. Potential Outcomes 3. Estimands and
More informationTerminology Suppose we have N observations {x(n)} N 1. Estimators as Random Variables. {x(n)} N 1
Estimation Theory Overview Properties Bias, Variance, and Mean Square Error Cramér-Rao lower bound Maximum likelihood Consistency Confidence intervals Properties of the mean estimator Properties of the
More informationDensity estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas
0 0 5 Motivation: Regression discontinuity (Angrist&Pischke) Outcome.5 1 1.5 A. Linear E[Y 0i X i] 0.2.4.6.8 1 X Outcome.5 1 1.5 B. Nonlinear E[Y 0i X i] i 0.2.4.6.8 1 X utcome.5 1 1.5 C. Nonlinearity
More informationDistribution-Free Procedures (Devore Chapter Fifteen)
Distribution-Free Procedures (Devore Chapter Fifteen) MATH-5-01: Probability and Statistics II Spring 018 Contents 1 Nonparametric Hypothesis Tests 1 1.1 The Wilcoxon Rank Sum Test........... 1 1. Normal
More informationPotential Outcomes Model (POM)
Potential Outcomes Model (POM) Relationship Between Counterfactual States Causality Empirical Strategies in Labor Economics, Angrist Krueger (1999): The most challenging empirical questions in economics
More informationCh. 16: Correlation and Regression
Ch. 1: Correlation and Regression With the shift to correlational analyses, we change the very nature of the question we are asking of our data. Heretofore, we were asking if a difference was likely to
More informationOptimal Data-Driven Regression Discontinuity Plots. Supplemental Appendix
Optimal Data-Driven Regression Discontinuity Plots Supplemental Appendix Sebastian Calonico Matias D. Cattaneo Rocio Titiunik November 25, 2015 Abstract This supplemental appendix contains the proofs of
More informationLecture 7: Hypothesis Testing and ANOVA
Lecture 7: Hypothesis Testing and ANOVA Goals Overview of key elements of hypothesis testing Review of common one and two sample tests Introduction to ANOVA Hypothesis Testing The intent of hypothesis
More informationIntroduction to Econometrics. Multiple Regression (2016/2017)
Introduction to Econometrics STAT-S-301 Multiple Regression (016/017) Lecturer: Yves Dominicy Teaching Assistant: Elise Petit 1 OLS estimate of the TS/STR relation: OLS estimate of the Test Score/STR relation:
More informationQED. Queen s Economics Department Working Paper No Hypothesis Testing for Arbitrary Bounds. Jeffrey Penney Queen s University
QED Queen s Economics Department Working Paper No. 1319 Hypothesis Testing for Arbitrary Bounds Jeffrey Penney Queen s University Department of Economics Queen s University 94 University Avenue Kingston,
More informationAGEC 621 Lecture 16 David Bessler
AGEC 621 Lecture 16 David Bessler This is a RATS output for the dummy variable problem given in GHJ page 422; the beer expenditure lecture (last time). I do not expect you to know RATS but this will give
More informationLecture 10 Regression Discontinuity (and Kink) Design
Lecture 10 Regression Discontinuity (and Kink) Design Economics 2123 George Washington University Instructor: Prof. Ben Williams Introduction Estimation in RDD Identification RDD implementation RDD example
More informationL6: Regression II. JJ Chen. July 2, 2015
L6: Regression II JJ Chen July 2, 2015 Today s Plan Review basic inference based on Sample average Difference in sample average Extrapolate the knowledge to sample regression coefficients Standard error,
More informationLab 07 Introduction to Econometrics
Lab 07 Introduction to Econometrics Learning outcomes for this lab: Introduce the different typologies of data and the econometric models that can be used Understand the rationale behind econometrics Understand
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationApplied Statistics and Econometrics
Applied Statistics and Econometrics Lecture 13 Nonlinearities Saul Lach October 2018 Saul Lach () Applied Statistics and Econometrics October 2018 1 / 91 Outline of Lecture 13 1 Nonlinear regression functions
More informationStatistics: Learning models from data
DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial
More information