Regression III Regression Discontinuity Designs
Dave Armstrong
University of Western Ontario
Department of Political Science
Department of Statistics and Actuarial Science (by courtesy)
e: dave.armstrong@uwo.ca

Motivation
- Often, we want to use regression analysis to make causal statements.
- We can only do this if all of our modeling assumptions hold, including independence between $X$ and $\varepsilon$.
- Normally, with observational data, these assumptions are unlikely to hold.
- Some research designs can leverage near-random assignment to mimic an experimental situation.

Example: State-building in Vietnam
- What were the effects of different military strategies on security, development, governance, civil society, etc. in Vietnam?
- Why can't we just estimate
  $$\text{Modernization} = b_0 + b_1 \text{Bombing} + Z + e$$
  where each observation is a hamlet in Vietnam?

US Government Metrics
- The US DoD used several metrics to guide military strategy.
- A battery of 169 questions about security, politics and economics was combined using Bayes' rule to identify a security score: $S \in [0, 5]$.
- The mainframe wouldn't print out the continuous score, so they rounded it and printed out the rounded numbers.
- Identification of causal effects can be obtained by considering hamlets that are close on the continuous score, but get rounded into different categories.
Discontinuity
- We know that in the assignment of a score, a discontinuity exists at the rounding threshold.
- How can we estimate the effect of bombings, which are assigned largely based on the discontinuity?
- How do we know that the effect is real and not some modeling artifact?
- What assumptions are needed to motivate this type of analysis?

Figure
[Figure]

Reference
This lecture is based primarily on the working manuscript: Matias D. Cattaneo, Nicolás Idrobo & Rocío Titiunik (2017), A Practical Introduction to Regression Discontinuity Designs.

Sharp vs. Fuzzy RDD
[Figure]
Potential Outcomes Framework
Each observation has two potential outcomes:
- $Y_i(0)$: the outcome observed under the control, and
- $Y_i(1)$: the outcome observed under the treatment.
Comparing these two outcomes should give us a sense of the causal effect of a variable. However, we only observe one or the other of them for each observation.

Extrapolation and RDD
$$E[Y_i \mid X_i] = \begin{cases} E[Y_i(0) \mid X_i] & \text{if } X_i < \bar{x} \\ E[Y_i(1) \mid X_i] & \text{if } X_i \geq \bar{x} \end{cases}$$
- In the sharp design, there is no joint support over $Y_i(0)$ and $Y_i(1)$.
- Extrapolation is required to identify the causal effect.

The Main Idea
The fundamental idea is that the discontinuity can provide a measure of the causal impact if $E[Y_i(0) \mid X_i = x]$ and $E[Y_i(1) \mid X_i = x]$ are both continuous in $x$ at the discontinuity $X_i = \bar{x}$. Then
$$\tau_{SRD} = E[Y_i(1) - Y_i(0) \mid X_i = \bar{x}] = \lim_{x \downarrow \bar{x}} E[Y_i \mid X_i = x] - \lim_{x \uparrow \bar{x}} E[Y_i \mid X_i = x]$$

Fuzzy Design
In the fuzzy design:
- Pr(Treated) changes at $\bar{x}$, but not from 0 to 1 like in the sharp design.
- This could happen if everyone above $\bar{x}$ was eligible for the treatment, but only some took part.
$$\tau_{FRD} = \frac{E[(D_i(1) - D_i(0))(Y_i(1) - Y_i(0)) \mid X_i = \bar{x}]}{E[(D_i(1) - D_i(0)) \mid X_i = \bar{x}]} = \frac{\lim_{x \downarrow \bar{x}} E[Y_i \mid X_i = x] - \lim_{x \uparrow \bar{x}} E[Y_i \mid X_i = x]}{\lim_{x \downarrow \bar{x}} E[D_i \mid X_i = x] - \lim_{x \uparrow \bar{x}} E[D_i \mid X_i = x]}$$
where $D_i(0)$ is the treatment take-up indicator for those assigned to the control group, and $D_i(1)$ is the treatment take-up indicator for those assigned to the treatment group.
Let's get Kinky
The Kink RD tries to estimate first derivatives of the regression function rather than the function itself.

Kink Designs
$$\tau_{SKRD} = \frac{d}{dx} E[Y_i(1) - Y_i(0) \mid X_i = x]\Big|_{x = \bar{x}} = \lim_{x \downarrow \bar{x}} \frac{d}{dx} E[Y_i \mid X_i = x] - \lim_{x \uparrow \bar{x}} \frac{d}{dx} E[Y_i \mid X_i = x]$$
$$\tau_{FKRD} = \frac{\frac{d}{dx} E[(D_i(1) - D_i(0))(Y_i(1) - Y_i(0)) \mid X_i = x]\big|_{x = \bar{x}}}{\frac{d}{dx} E[(D_i(1) - D_i(0)) \mid X_i = x]\big|_{x = \bar{x}}} = \frac{\lim_{x \downarrow \bar{x}} \frac{d}{dx} E[Y_i \mid X_i = x] - \lim_{x \uparrow \bar{x}} \frac{d}{dx} E[Y_i \mid X_i = x]}{\lim_{x \downarrow \bar{x}} \frac{d}{dx} E[D_i \mid X_i = x] - \lim_{x \uparrow \bar{x}} \frac{d}{dx} E[D_i \mid X_i = x]}$$

Other Designs
- Multi-cutoff Designs
- Multiple Score/Geographic Designs

RD Effects are Local
- The difference between $E[Y_i(1) \mid X]$ and $E[Y_i(0) \mid X]$ is calculated at a single point ($\bar{x}$) along the support of $X$.
- The effect will not necessarily generalize as we move away from the threshold without strong (usually unjustified) assumptions about the regression function.
Could Be (but isn't) Useful

    library(readstata13)
    data <- read.dta13("polecon.dta")
    # Column names as transcribed; adjust their case to match the file if needed
    Y <- data$y
    X <- data$x
    Z <- data$z
    Z_X <- Z*X
    # Raw scatterplot of the outcome against the running variable
    plot(Y ~ X, xlab = "Islamic Victory",
         ylab = "Female High School Share")
    abline(v = 0)  # cutoff

[Figure: scatterplot of Female High School Share against Islamic Victory, with a vertical line at the cutoff 0]

Binning Estimator
We can partition the observations into bins and then take the average of $Y$ within bins to get a sense of how the discontinuity looks.
$$\bar{Y}_{-,j} = \frac{1}{\#\{X_i \in B_{-,j}\}} \sum_{i: X_i \in B_{-,j}} Y_i \quad \text{and} \quad \bar{Y}_{+,j} = \frac{1}{\#\{X_i \in B_{+,j}\}} \sum_{i: X_i \in B_{+,j}} Y_i$$

RD Plot

    library(rdrobust)
    out <- rdplot(Y, X, nbins = c(20, 20), binselect = "esmv")

[Figure: RD Plot]

Notes on the Previous Slide
1. The binning and global parametric model certainly make it easier to see what is happening with respect to the discontinuity.
2. Global polynomials are not necessarily great because they are known to be unstable in the tails, and the tail is, by definition, the place we're looking.
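The binning estimator above can be sketched in base R without rdplot. This is only an illustration on simulated data: the data-generating process, the helper name bin_means, and the choice of 10 evenly spaced bins per side are all assumptions made for the example, not part of the lecture.

```r
# Simulated data (hypothetical): running variable x, true jump of 2 at the cutoff 0
set.seed(1)
n <- 500
x <- runif(n, -1, 1)
y <- 1 + 0.5 * x + 2 * (x >= 0) + rnorm(n, sd = 0.5)

# Average of y within J evenly spaced bins on one side of the cutoff
# (bin_means is a hypothetical helper, not an rdrobust function)
bin_means <- function(x, y, lower, upper, J) {
  breaks <- seq(lower, upper, length.out = J + 1)
  bins <- cut(x, breaks = breaks, include.lowest = TRUE)
  data.frame(
    mid  = (head(breaks, -1) + tail(breaks, -1)) / 2,  # bin midpoints
    ybar = as.numeric(tapply(y, bins, mean)),          # within-bin mean of y
    n    = as.numeric(table(bins))                     # observations per bin
  )
}

left  <- bin_means(x[x < 0],  y[x < 0],  -1, 0, J = 10)
right <- bin_means(x[x >= 0], y[x >= 0],  0, 1, J = 10)

print(left)
print(right)
```

Plotting ybar against mid for both data frames gives a hand-rolled version of the evenly spaced RD plot; quantile-spaced bins would replace the seq() call with quantiles of x on each side.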
Binning Estimators
Bins can be:
- Evenly spaced (with different numbers of observations in each bin)
- Quantile spaced (with different distances between bin boundaries)
There are a number of methods to optimally pick the number of bins.

Bins Example

    out = rdplot(Y, X, binselect = 'es')
    out = rdplot(Y, X, binselect = 'qs')

[Figure: two RD plots, one with evenly spaced bins and one with quantile-spaced bins]

Optimally Choosing Bins: IMSE
- Some methods optimize on Integrated Mean Squared Error (IMSE), so as to make the optimal tradeoff between bias and variance.
- Not always best because it could produce an overly smooth plot.
- Omitting the nbins argument and specifying binselect = 'es' or binselect = 'qs' will generate these optimal bins for evenly spaced and quantile-spaced bins, respectively.

Optimally Choosing Bins: Mimicking Variability
- Bins can be chosen such that the variability in the binned means mimics the variability in the raw data.
- Not overly smooth like the IMSE binned estimator.
- Generally results in more bins than the IMSE method.
- ES bins can sometimes encourage ...
- binselect = 'esmv' and binselect = 'qsmv' will generate the mimicking-variance estimators.
Bins Example

    out = rdplot(Y, X, binselect = 'esmv')
    out = rdplot(Y, X, binselect = 'qsmv')

[Figure: two RD plots, one with ES mimicking-variance bins and one with QS mimicking-variance bins]

RD Plots
- Good for illustration and investigation, but not for estimating the treatment effect: global polynomials are too variable at the boundary points.
- Use MV bins (both QS and ES side-by-side) to illustrate the design, with a global 4th or 5th order polynomial.

Continuity-based Approach
- Better for point estimates and inference of the treatment effect.
- Use polynomial methods local to the cutoff to model $E[Y_i \mid X_i = x]$ from either side and treat $\tau_{SRD}$ as a parameter to be estimated.
- Either global polynomials (when all observations are used) or local polynomials (when only observations near the cutoff are used) model the treatment effect.

Fundamentals
- The running variable ($X$) is assumed to be continuous, and so there are few, if any, observations at $X = \bar{x}$.
- To estimate $E[Y_i(1) \mid X_i = \bar{x}]$ and $E[Y_i(0) \mid X_i = \bar{x}]$, points near (but not at) the cutoff need to be used.
- The main point of interest and attention here is how the regression function is specified. This has huge effects on the robustness and credibility of the design and inference.
- The primary tool for estimating the effect is a low-order local polynomial regression.
LPR in RDD
1. Choose the order of the polynomial.
2. Choose the bandwidth $h$, such that only observations in $[\bar{x} - h, \bar{x} + h]$ are used to fit the LPR.
3. In the LPR above the cutoff, use weights $w_i = K\left(\frac{x_i - \bar{x}}{h}\right)$. The intercept from this LPR is an estimate $\hat{\mu}_+$ of $\mu_+ = E[Y_i(1) \mid X_i = \bar{x}]$.
4. In the same way, using observations below the cutoff, obtain the estimate $\hat{\mu}_-$ of $\mu_- = E[Y_i(0) \mid X_i = \bar{x}]$.
5. $\hat{\tau}_{SRD} = \hat{\mu}_+ - \hat{\mu}_-$.

Example: First-order LPR
[Figure]

Choices to make in LPR
- Kernel: the triangular kernel (with MSE-optimal bandwidth selection) leads to a point estimate with optimal MSE properties. Here, the weight declines linearly moving away from $\bar{x}$. Other common options are the Uniform and Epanechnikov kernels, but results tend to be robust with respect to this choice.
- Polynomial Order: in an effort to make the appropriate bias-variance tradeoff, a polynomial order of $p = 1$ or $p = 2$ is usually recommended, with optimal bandwidth selection to maximize the accuracy of the estimate. Most research relies on local linear regression.
- Bandwidth: automatically selected given the two choices above (more below) to make the appropriate bias-variance tradeoff.

Bias and Bandwidth
[Figure]
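The five LPR steps can be sketched by hand with lm() and triangular-kernel weights. This is a minimal illustration under stated assumptions: the data are simulated with a true jump of 2, the bandwidth h = 0.3 is fixed by hand rather than chosen by an MSE-optimal rule, and the helper names tri and side_intercept are hypothetical. It produces a point estimate only; it does not reproduce the bias-corrected inference of rdrobust.

```r
# Simulated data (hypothetical): true jump of 2 at the cutoff
set.seed(2)
n <- 1000
x <- runif(n, -1, 1)
y <- 1 + 0.5 * x + 2 * (x >= 0) + rnorm(n, sd = 0.5)

cutoff <- 0
h <- 0.3  # bandwidth fixed by hand for illustration only

# Triangular kernel: weight declines linearly to 0 at |u| = 1
tri <- function(u) pmax(0, 1 - abs(u))

# Local linear (p = 1) fit on one side; returns the intercept at the cutoff
side_intercept <- function(keep) {
  xc  <- x[keep] - cutoff          # center the running variable at the cutoff
  w   <- tri(xc / h)               # kernel weights
  use <- w > 0                     # only observations inside the bandwidth
  yy  <- y[keep][use]
  xx  <- xc[use]
  ww  <- w[use]
  fit <- lm(yy ~ xx, weights = ww)
  unname(coef(fit)[1])
}

mu_plus  <- side_intercept(x >= cutoff)  # estimate of E[Y(1) | X = cutoff]
mu_minus <- side_intercept(x <  cutoff)  # estimate of E[Y(0) | X = cutoff]
tau_hat  <- mu_plus - mu_minus           # sharp RD point estimate

print(c(mu_plus = mu_plus, mu_minus = mu_minus, tau_hat = tau_hat))
```

Because the running variable is centered at the cutoff, the intercept of each weighted regression is the fitted value exactly at the cutoff, which is why step 3 reads the estimate off the intercept.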
Optimal Bandwidth Choice
- Generally chosen to minimize MSE = Bias$^2$ + Variance.
- The bias is found by relating the local linear estimator to the curvature of the unknown regression function and depends primarily on the $(p+1)$th derivative of the function.
- The variance term is a function of the density of the running variable around the cutoff (which is negatively related to variance) and the conditional variability of the estimate.
- Different bandwidths can be chosen on either side of the cutoff since the treatment effect is the difference between two one-sided estimates.
- A regularization term is often included to prevent strange behavior when bias is nearly zero (i.e., when a global linear model fits well).

Optimal BW Selection in R

    summary(rdbwselect(Y, X, kernel = 'triangular', p = 1, bwselect = 'msetwo'))

[Output: rdbwselect summary with BW type msetwo, Triangular kernel, VCE method NN, Order est. (p) 1, Order bias 2, reporting BW est. (h) and BW bias (b) left and right of c; numeric values not preserved]

Use the argument bwselect = 'mserd' for a single bandwidth across both regions.

Using rdrobust to Calculate Treatment Effect

    summary(rdrobust(Y, X, kernel = "triangular", p = 1, bwselect = "mserd"))

[Output: rdrobust summary with BW type mserd, Triangular kernel, VCE method NN, Order est. (p) 1, Order bias 2; only the confidence intervals are preserved]

    Method         95% C.I.
    Conventional   [0.223, 5.817]
    Robust         [-0.309, 6.276]

Using rdrobust to Calculate Treatment Effect (2)

    summary(rdrobust(Y, X, kernel = "triangular", p = 1, bwselect = "msetwo"))

[Output: rdrobust summary with BW type msetwo, Triangular kernel, VCE method NN, Order est. (p) 1, Order bias 2; only the confidence intervals are preserved]

    Method         95% C.I.
    Conventional   [0.243, 5.695]
    Robust         [-0.245, 6.152]
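The bias-variance tradeoff described under Optimal Bandwidth Choice can be written out explicitly. The sketch below uses the usual large-sample approximation for a local polynomial of order $p$ with a common bandwidth $h$; the constants $B$ (leading bias, driven by the $(p+1)$th derivatives of the regression functions) and $V$ (leading variance, driven by the conditional variance of the outcome and the density of the running variable at the cutoff) are notation assumed here for illustration and are not defined in the slides.

```latex
\begin{align*}
\mathrm{MSE}(\hat{\tau}_{SRD}) &\approx \mathrm{Bias}^2 + \mathrm{Variance}
  = h^{2(p+1)} B^2 + \frac{V}{nh} \\[4pt]
\frac{\partial}{\partial h}\left[ h^{2(p+1)} B^2 + \frac{V}{nh} \right]
  &= 2(p+1)\, h^{2p+1} B^2 - \frac{V}{n h^2} = 0 \\[4pt]
\Rightarrow \quad h_{MSE}
  &= \left( \frac{V}{2(p+1) B^2} \right)^{1/(2p+3)} n^{-1/(2p+3)}
\end{align*}
```

With $p = 1$ the bandwidth shrinks at rate $n^{-1/5}$, and the expression blows up as $B \to 0$, which is the situation the regularization term is meant to guard against.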
RD Plot, Optimal Bandwidth

    # Extract the estimated (left) bandwidth and plot only observations inside it
    bandwidth <- rdrobust(Y, X, kernel = 'triangular', p = 1, bwselect = 'mserd')$h_l
    out <- rdplot(Y[abs(X) <= bandwidth], X[abs(X) <= bandwidth],
                  p = 1, kernel = 'triangular')

[Figure: RD Plot within the optimal bandwidth]

Inference
- Inference is less straightforward here, for reasons similar to those we've seen before.
- Bandwidth has been selected to make the optimal bias-variance tradeoff.
- An implication of this is that the model is almost necessarily mis-specified, because the algorithm didn't minimize bias, but a combination of bias and variance.
- Cattaneo et al. propose a robust, bias-corrected confidence interval for hypothesis testing:
  - It is centered around a bias-corrected parameter estimate.
  - Its variance takes into account the variability in the bias-correction phase as well as sampling variability.

Inference in Practice

    out <- rdrobust(Y, X, kernel = "triangular", p = 1, bwselect = "mserd", all = TRUE)
    cbind(out$coef, out$ci)

[Output: columns Coeff, CI Lower, CI Upper; rows Conventional, Bias-Corrected, Robust; numeric values not preserved]

Including Covariates
- Covariates can be included in the RD design with the covs argument in rdrobust.
- The estimate is only really considered a treatment effect if the covariates are determined and fixed before the assignment of the treatment.
- In the best-case scenario, covariates can reduce sampling variability without increasing bias.

    Z = data[, c("vshr_islam1994", "partycount", "lpop1994", "merkezi",
                 "merkezp", "subbuyuk", "buyuk")]
    outcov <- rdrobust(Y, X, covs = Z, kernel = 'triangular', scaleregul = 1,
                       p = 1, bwselect = 'mserd')
    cbind(outcov$coef, outcov$ci)

[Output: columns Coeff, CI Lower, CI Upper; rows Conventional, Bias-Corrected, Robust; numeric values not preserved]
Randomization Inference Approach
- The previous approach leveraged the assumption of continuity and smoothness of $E[Y_i(0) \mid X_i = x]$ and $E[Y_i(1) \mid X_i = x]$ at the cutoff to make inferences.
- Randomization inference views the RD design as a randomized experiment around the cutoff $\bar{x}$.
- The sharp differences in treatment status at the cutoff resemble a randomized controlled trial at the cutoff.
- Units whose score values (values on the running variable) are in a small window around the cutoff can be analyzed as being from a randomly assigned experiment.
- Local randomization inference is particularly useful when the running variable is discrete or has relatively few points.
- It can be used as a robustness check for continuity-based designs, but local randomization requires stronger assumptions.

Local Randomization Overview
We assume that:
- For points in a small window around the cutoff, $W_0 = [\bar{x} - w_0, \bar{x} + w_0]$, assignment to treatment or control can be considered to be random (a.k.a. as-if random assignment).
- Not only is the assignment random, but the running variable in the window must be unrelated to the outcome.
Similarity of RD and Experiments:
[Figure]

Formalization
- In the strongest version, we assume: for $X_i \in W_0$, $Y_i(X_i, T_i) = Y_i(T_i)$; that is, the running variable only influences $Y$ through the treatment indicator.
- In a weaker version, we could relax the above to $\phi(Y_i(X_i, T_i), X_i, T_i) = \tilde{Y}_i(T_i)$; that is, there exists a transformation for which the first condition mentioned above is true.

Estimation and Inference
- Estimation could take the form of large-sample statistical estimators if there are lots of $X_i \in W_0$, but this is often not the case.
- Randomization inference has exact, finite-sample properties, which makes it quite attractive for this case.
- Fisherian inference: potential outcomes are non-stochastic (i.e., fixed, no random sampling assumed).
  $$H_0^F: Y_i(0) = Y_i(1) \quad \forall i$$
- Under the null, all potential outcomes are observed because for each observation the two outcomes are the same.
Hypothetical Example of Fisherian Inference
- Imagine we have 5 units in $W_0$ and we randomly assign $n_{W_0,+} = 3$ units to the treatment and $n_{W_0,-} = n_{W_0} - n_{W_0,+} = 2$ units to the control.
- Under full randomization, we could assume that $n_{W_0,+}$, and by extension $n_{W_0,-}$, are fixed and find all possible vectors $t$ of treatment and control that preserve the marginal distribution of $T$.
- In our example, there are $\binom{5}{3} = 10$ possible assignments to treatment and control.
- Assume that $Y = (5, 2, 2, 5, 5)$ and that $T = (1, 0, 0, 1, 1)$; then the observed difference in means is $S^{obs} = \bar{Y}_+ - \bar{Y}_- = 5 - 2 = 3$.
- If complete enumeration of all possible assignments is not feasible, simulation can be used.

Distribution of Test Statistic under Null
[Figure]

Test Statistics
Fisherian inference is general and should work for any test statistic. Some other common choices for RD designs are:
- Kolmogorov-Smirnov (KS) statistic: $S_{KS} = \sup_y |\hat{F}_1(y) - \hat{F}_0(y)|$, the biggest absolute difference between the two empirical CDFs. Better than the difference of means when departures from the null are in other moments or quantiles.
- Wilcoxon rank sum statistic: $S_{WR} = \sum_{i: T_i = 1} R_i^y$, where $R_i^y$ is the outcome rank. $S_{WR}$ is not affected by the cardinal values of the outcome, only their ordering.

Randomization Inference for a Regression Coefficient

    library(MASS)
    set.seed(493)
    # Three correlated regressors
    X <- mvrnorm(100, c(0,0,0), matrix(c(1,.25,.25,.25,1,.25,.25,.25,1), ncol=3))
    b <- c(.3, -1, 2)
    y <- X %*% b + rnorm(100, 0, 1.5)
    printCoefmat(summary(mod <- lm(y ~ X))$coef)

[Output: coefficient table with columns Estimate, Std. Error, t value, Pr(>|t|) and rows (Intercept), X1, X2, X3; numeric values not preserved]

    ranb <- NULL
    for(i in 1:2500){
      # Shuffle the first regressor to break its link with y, then refit
      X[,1] <- sample(X[,1], nrow(X), replace=FALSE)
      ranb <- rbind(ranb, coef(update(mod)))
    }
    # Compare the observed coefficient with its permutation distribution
    2*(1-mean(coef(mod)[2] > ranb[,2]))

[Output: p-value; numeric value not preserved]
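The five-unit hypothetical example is small enough to enumerate completely. The sketch below, in base R, lists all 10 assignments that keep three treated units, computes the difference in means under the sharp null for each, and compares the observed statistic against that distribution. The two-sided p-value definition (share of assignments with an absolute statistic at least as large as the observed one) is one common convention and is assumed here.

```r
# Outcomes and observed assignment from the hypothetical example
y     <- c(5, 2, 2, 5, 5)
t_obs <- c(1, 0, 0, 1, 1)

# Test statistic: difference in means between treated and control
diff_means <- function(t) mean(y[t == 1]) - mean(y[t == 0])
s_obs <- diff_means(t_obs)

# All ways of choosing which 3 of the 5 units are treated (fixed margins)
treated_sets <- combn(length(y), sum(t_obs))

# Under the sharp null Y_i(0) = Y_i(1), outcomes stay fixed and only
# the assignment vector changes
s_null <- apply(treated_sets, 2, function(idx) {
  t <- rep(0, length(y))
  t[idx] <- 1
  diff_means(t)
})

# Share of assignments at least as extreme as the observed one
p_value <- mean(abs(s_null) >= abs(s_obs))

print(s_obs)
print(sort(s_null))
print(p_value)
```

Only the observed assignment produces a difference as large as 3, so the exact p-value is 1/10. With so few units, 0.1 is the smallest p-value this design can ever deliver, which is why the number of observations in the window matters.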
Example

    library(rdlocrand)
    rdrandinf(Y, X, wl = -2.5, wr = 2.5, seed = 50, reps = 2500)

[Output: Selected window = [-2.5;2.5]; Number of obs = 2629; Order of poly = 0; Kernel type = uniform; Reps = 2500; Window = set by user; H0: tau = 0; Randomization = fixed margins; Cutoff c = 0; Power vs d = 4.27; statistic: Diff. in means. The remaining numeric values (observations, means and standard deviations by side, test statistic and p-values) are not preserved]

Choosing the Window
Some options:
1. Ad hoc or theoretically defined: both are different flavors of arbitrary.
2. Use pre-treatment covariates to select the window.
   - Assumes that there exists a variable $Z$ that is related to the running variable outside the window, but not inside the window (without this assumption, the procedure breaks down).
   - The effect of the treatment on $Z$, since it is pre-determined, is 0 by construction.

Data-driven Choice of Window
1. Identify $H_0^F$: $Z$ is unrelated to $T$, or balanced on $T$.
2. Start with the smallest possible window and test $H_0^F$.
3. Continue to widen the window until $H_0^F$ is rejected at a pre-specified significance level.
4. The chosen window is the largest one that continues to fail to reject $H_0^F$.
We need to choose the following things:
- Relevant covariates
- Test statistic
- Randomization mechanism
- Minimum $n$ in the smallest window
- Significance level

Formalization of the Procedure
1. Start with a symmetric window of length $2w_j$, $W_j = \bar{x} \pm w_j$.
2. Compute the test statistic either for each covariate individually or compute the omnibus test p-value.
3. Find the smallest p-value $p_{min}$ and evaluate whether $p_{min} > \alpha$.
   - If yes, then fail to reject $H_0^F$ and increase the size of the window by a pre-specified step.
   - If no, then use the window $W_{j-1}$.
The step procedure can be defined by a fixed length (wstep in R) or such that a certain number of observations is included (wobs in R).
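The window-widening procedure can be sketched in base R with a single covariate and a permutation test of the difference in means. Everything here is an assumption made for illustration: the simulated data, the helper name perm_pvalue, the grid of half-widths, the 500 permutations, and the significance level of 0.15. It mirrors the logic of the steps above, not the internals of rdwinselect.

```r
# Simulated data (hypothetical): covariate z is related to the running
# variable x, so balance should fail once the window gets wide enough
set.seed(3)
n <- 2000
x <- runif(n, -10, 10)          # running variable, cutoff at 0
z <- 0.3 * x + rnorm(n)         # pre-treatment covariate
treat <- as.numeric(x >= 0)

# Permutation p-value for the difference in means of z between groups
perm_pvalue <- function(z, t, reps = 500) {
  obs  <- mean(z[t == 1]) - mean(z[t == 0])
  null <- replicate(reps, {
    ts <- sample(t)             # reshuffle labels, keeping group sizes fixed
    mean(z[ts == 1]) - mean(z[ts == 0])
  })
  mean(abs(null) >= abs(obs))
}

alpha  <- 0.15                      # significance level (assumed)
widths <- seq(0.25, 5, by = 0.25)   # candidate half-widths w_j (assumed grid)
chosen <- NA                        # largest half-width that still passes

for (w in widths) {
  inside <- abs(x) <= w
  p <- perm_pvalue(z[inside], treat[inside])
  if (p <= alpha) break             # balance rejected: keep the previous window
  chosen <- w
}

print(chosen)
```

With several covariates, the same loop would compute one p-value per covariate and compare the smallest of them with alpha, as in step 3 of the formalization.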
Window Selection in R

    Z <- data[, c("i89", "vshr_islam1994", "partycount", "lpop1994",
                  "merkezi", "merkezp", "subbuyuk", "buyuk")]
    rdwinselect(X, Z, seed = 50, reps = 1000, wobs = 2)

[Output: Window selection for RD under local randomization; Number of obs = 2629; Order of poly = 0; Kernel type = uniform; Reps = 1000; Testing method = rdrandinf; Balance test = diffmeans; Cutoff c = 0. The table of window lengths, p-values, variable names, binomial tests and observation counts is not preserved]

Recommended window is [-0.944;0.944] with 38 observations (17 below, 21 above).

Example

    library(rdlocrand)
    rdrandinf(Y, X, wl = -.944, wr = .944, seed = 50, reps = 2500)

[Output: Selected window = [-0.944;0.944]; Number of obs = 2629; Order of poly = 0; Kernel type = uniform; Reps = 2500; Window = set by user; H0: tau = 0; Randomization = fixed margins; Cutoff c = 0; statistic: Diff. in means. The remaining numeric values are not preserved]

Local Randomization or Continuity Approach?
- Local randomization requires stronger assumptions than the continuity-based approach; thus, one might use this approach to probe the conditions under which inference makes sense.
- The continuity-based approach requires reasonable data density around the cutoff. If this isn't the case, then the local randomization approach might be better.
- When the running variable is discrete (even potentially with lots of values, e.g., age in years), the local randomization approach could be better because there will be mass points with multiple observations.

Validation
- There are threats to validity with RD designs.
- If the cutoff is known to the observations ahead of time, this can threaten the validity of the RD design.
- Observations may try to actively manipulate their score if they are just below the cutoff.
- There are empirical tests aimed at evaluating the validity of the design:
  1. Continuity of the score density around the cutoff.
  2. Null treatment effects on pre-treatment covariates and placebos.
  3. Regression function continuity at arbitrary alternative cutoffs.
Density of the Running Variable
If units don't have the ability to manipulate their score, then there should be similar data density on both sides of the cutoff.

    library(rddensity)  # provides rddensity(); must be loaded first
    summary(rddensity(X))

Null Effects on Pre-treatment Covariates and Placebos
- If the effect is causal, then the treatment should not be related to pre-treatment covariates or placebo conditions.
- Anything determined before the treatment counts as a pre-treatment covariate.
- Placebo outcomes are context-specific.

Covariates and Placebos

    # Robust RD estimate with each covariate in turn as the outcome
    robs <- lapply(1:ncol(Z), function(x) rdrobust(Z[,x], X))
    names(robs) <- colnames(Z)
    t(round(sapply(robs, function(x) cbind(x$coef, x$ci)[3,]), 3))

[Output: columns Coeff, CI Lower, CI Upper; rows i89, vshr_islam1994, partycount, lpop1994, merkezi, merkezp, subbuyuk, buyuk; numeric values not preserved]

With Randomization Inference

    # Randomization test within the selected window, one covariate at a time
    robs <- lapply(1:ncol(Z), function(x) rdrandinf(Z[,x], X, wl = -.944, wr = .944))
    names(robs) <- colnames(Z)
    t(round(sapply(robs, function(x) c(stat = x$obs.stat, pval = x$p.value)), 4))

[Output: columns stat, pval; rows i89, vshr_islam1994, partycount, lpop1994, merkezi, merkezp, subbuyuk, buyuk; numeric values not preserved]
Regression Function Continuity
One of the assumptions we made before was that the regression functions are continuous at the cutoff for both treatment and control groups.

    treat <- which(X >= 0)
    contr <- which(X < 0)
    # Placebo cutoffs on either side of the true cutoff (0 itself excluded)
    cutoffs <- seq(-5, 5, by = 1)
    cutoffs <- cutoffs[-which(cutoffs == 0)]
    res <- list()
    for(i in 1:length(cutoffs)){
      if(cutoffs[i] < 0){
        # Negative placebo cutoffs use control observations only
        res[[i]] <- rdrobust(Y[contr], X[contr], c = cutoffs[i])
      } else {
        # Positive placebo cutoffs use treated observations only
        res[[i]] <- rdrobust(Y[treat], X[treat], c = cutoffs[i])
      }
    }
    cbind(cutoff = cutoffs,
          t(round(sapply(res, function(x) cbind(x$coef, x$ci)[3,]), 3)))

[Output: columns cutoff, Coeff, CI Lower, CI Upper; 10 rows, one per placebo cutoff; numeric values not preserved]

Sensitivity to Observations Close to Cutoff
If there is potential for manipulation, the observations closest to the cutoff are the most susceptible. Take them out and re-evaluate the effect.

    rdrobust(Y[abs(X) >= 0.25], X[abs(X) >= 0.25])[c("coef", "ci")]

[Output: $coef with rows Conventional, Bias-Corrected, Robust; $ci with columns CI Lower, CI Upper; numeric values not preserved]

Donut-hole Estimation

    # Robust estimate and CI after dropping observations within radius i of the cutoff
    out <- t(sapply(seq(0, 1.25, by = .25), function(i)
      with(rdrobust(Y[abs(X) >= i], X[abs(X) >= i]),
           c(coef = coef[3], ci[3,]))))
    out <- cbind(radius = seq(0, 1.25, by = .25), out)
    out

[Output: columns radius, coef, CI Lower, CI Upper; 6 rows, one per radius; numeric values not preserved]

Conclusion
- The RDD approach can be valuable with the right data and question.
- We have to be careful that the causal effect is not a modeling artifact.
- Use data-driven tools to estimate the appropriate bandwidth, window width, etc.
- Do sensitivity testing to make sure that your results are not sensitive to modeling choices.
More informationStatistical Inference: Estimation and Confidence Intervals Hypothesis Testing
Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire
More informationApplied Statistics and Econometrics
Applied Statistics and Econometrics Lecture 5 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 44 Outline of Lecture 5 Now that we know the sampling distribution
More informationBias Variance Trade-off
Bias Variance Trade-off The mean squared error of an estimator MSE(ˆθ) = E([ˆθ θ] 2 ) Can be re-expressed MSE(ˆθ) = Var(ˆθ) + (B(ˆθ) 2 ) MSE = VAR + BIAS 2 Proof MSE(ˆθ) = E((ˆθ θ) 2 ) = E(([ˆθ E(ˆθ)]
More informationRegression Discontinuity Designs.
Regression Discontinuity Designs. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 31/10/2017 I. Brunetti Labour Economics in an European Perspective 31/10/2017 1 / 36 Introduction
More informationRegression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont.
TCELL 9/4/205 36-309/749 Experimental Design for Behavioral and Social Sciences Simple Regression Example Male black wheatear birds carry stones to the nest as a form of sexual display. Soler et al. wanted
More informationSection 3 : Permutation Inference
Section 3 : Permutation Inference Fall 2014 1/39 Introduction Throughout this slides we will focus only on randomized experiments, i.e the treatment is assigned at random We will follow the notation of
More informationGov 2002: 3. Randomization Inference
Gov 2002: 3. Randomization Inference Matthew Blackwell September 10, 2015 Where are we? Where are we going? Last week: This week: What can we identify using randomization? Estimators were justified via
More informationBusiness Statistics. Lecture 10: Correlation and Linear Regression
Business Statistics Lecture 10: Correlation and Linear Regression Scatterplot A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form
More informationGeneral Linear Model (Chapter 4)
General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients
More informationAGEC 621 Lecture 16 David Bessler
AGEC 621 Lecture 16 David Bessler This is a RATS output for the dummy variable problem given in GHJ page 422; the beer expenditure lecture (last time). I do not expect you to know RATS but this will give
More informationDistribution-Free Procedures (Devore Chapter Fifteen)
Distribution-Free Procedures (Devore Chapter Fifteen) MATH-5-01: Probability and Statistics II Spring 018 Contents 1 Nonparametric Hypothesis Tests 1 1.1 The Wilcoxon Rank Sum Test........... 1 1. Normal
More informationL6: Regression II. JJ Chen. July 2, 2015
L6: Regression II JJ Chen July 2, 2015 Today s Plan Review basic inference based on Sample average Difference in sample average Extrapolate the knowledge to sample regression coefficients Standard error,
More information36-309/749 Experimental Design for Behavioral and Social Sciences. Sep. 22, 2015 Lecture 4: Linear Regression
36-309/749 Experimental Design for Behavioral and Social Sciences Sep. 22, 2015 Lecture 4: Linear Regression TCELL Simple Regression Example Male black wheatear birds carry stones to the nest as a form
More informationPassing-Bablok Regression for Method Comparison
Chapter 313 Passing-Bablok Regression for Method Comparison Introduction Passing-Bablok regression for method comparison is a robust, nonparametric method for fitting a straight line to two-dimensional
More informationLecture 7: Hypothesis Testing and ANOVA
Lecture 7: Hypothesis Testing and ANOVA Goals Overview of key elements of hypothesis testing Review of common one and two sample tests Introduction to ANOVA Hypothesis Testing The intent of hypothesis
More informationAn Alternative Assumption to Identify LATE in Regression Discontinuity Design
An Alternative Assumption to Identify LATE in Regression Discontinuity Design Yingying Dong University of California Irvine May 2014 Abstract One key assumption Imbens and Angrist (1994) use to identify
More informationApplied Microeconometrics Chapter 8 Regression Discontinuity (RD)
1 / 26 Applied Microeconometrics Chapter 8 Regression Discontinuity (RD) Romuald Méango and Michele Battisti LMU, SoSe 2016 Overview What is it about? What are its assumptions? What are the main applications?
More informationLinear Modelling in Stata Session 6: Further Topics in Linear Modelling
Linear Modelling in Stata Session 6: Further Topics in Linear Modelling Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 14/11/2017 This Week Categorical Variables Categorical
More informationEMERGING MARKETS - Lecture 2: Methodology refresher
EMERGING MARKETS - Lecture 2: Methodology refresher Maria Perrotta April 4, 2013 SITE http://www.hhs.se/site/pages/default.aspx My contact: maria.perrotta@hhs.se Aim of this class There are many different
More informationCorrelation and regression
NST 1B Experimental Psychology Statistics practical 1 Correlation and regression Rudolf Cardinal & Mike Aitken 11 / 12 November 2003 Department of Experimental Psychology University of Cambridge Handouts:
More informationWhy high-order polynomials should not be used in regression discontinuity designs
Why high-order polynomials should not be used in regression discontinuity designs Andrew Gelman Guido Imbens 6 Jul 217 Abstract It is common in regression discontinuity analysis to control for third, fourth,
More informationRegression and the 2-Sample t
Regression and the 2-Sample t James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Regression and the 2-Sample t 1 / 44 Regression
More informationThe Economics of European Regions: Theory, Empirics, and Policy
The Economics of European Regions: Theory, Empirics, and Policy Dipartimento di Economia e Management Davide Fiaschi Angela Parenti 1 1 davide.fiaschi@unipi.it, and aparenti@ec.unipi.it. Fiaschi-Parenti
More informationThe Simple Linear Regression Model
The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate
More informationWhat s New in Econometrics. Lecture 1
What s New in Econometrics Lecture 1 Estimation of Average Treatment Effects Under Unconfoundedness Guido Imbens NBER Summer Institute, 2007 Outline 1. Introduction 2. Potential Outcomes 3. Estimands and
More informationVariance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression.
10/3/011 Functional Connectivity Correlation and Regression Variance VAR = Standard deviation Standard deviation SD = Unbiased SD = 1 10/3/011 Standard error Confidence interval SE = CI = = t value for
More informationWhen Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?
When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University Joint
More informationRank-Based Methods. Lukas Meier
Rank-Based Methods Lukas Meier 20.01.2014 Introduction Up to now we basically always used a parametric family, like the normal distribution N (µ, σ 2 ) for modeling random data. Based on observed data
More informationLecture 21: October 19
36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 21: October 19 21.1 Likelihood Ratio Test (LRT) To test composite versus composite hypotheses the general method is to use
More informationTwo Sample Problems. Two sample problems
Two Sample Problems Two sample problems The goal of inference is to compare the responses in two groups. Each group is a sample from a different population. The responses in each group are independent
More informationMy data doesn t look like that..
Testing assumptions My data doesn t look like that.. We have made a big deal about testing model assumptions each week. Bill Pine Testing assumptions Testing assumptions We have made a big deal about testing
More information1 Independent Practice: Hypothesis tests for one parameter:
1 Independent Practice: Hypothesis tests for one parameter: Data from the Indian DHS survey from 2006 includes a measure of autonomy of the women surveyed (a scale from 0-10, 10 being the most autonomous)
More informationLecture 3: Inference in SLR
Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals
More informationCh. 16: Correlation and Regression
Ch. 1: Correlation and Regression With the shift to correlational analyses, we change the very nature of the question we are asking of our data. Heretofore, we were asking if a difference was likely to
More informationREVIEW 8/2/2017 陈芳华东师大英语系
REVIEW Hypothesis testing starts with a null hypothesis and a null distribution. We compare what we have to the null distribution, if the result is too extreme to belong to the null distribution (p
More informationUse of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:
Use of Matching Methods for Causal Inference in Experimental and Observational Studies Kosuke Imai Department of Politics Princeton University April 27, 2007 Kosuke Imai (Princeton University) Matching
More informationChapter 12 - Lecture 2 Inferences about regression coefficient
Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous
More informationG-ESTIMATION OF STRUCTURAL NESTED MODELS (CHAPTER 14) BIOS G-Estimation
G-ESTIMATION OF STRUCTURAL NESTED MODELS (CHAPTER 14) BIOS 776 1 14 G-Estimation ( G-Estimation of Structural Nested Models 14) Outline 14.1 The causal question revisited 14.2 Exchangeability revisited
More information41903: Introduction to Nonparametrics
41903: Notes 5 Introduction Nonparametrics fundamentally about fitting flexible models: want model that is flexible enough to accommodate important patterns but not so flexible it overspecializes to specific
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationChapter 7. Hypothesis Tests and Confidence Intervals in Multiple Regression
Chapter 7 Hypothesis Tests and Confidence Intervals in Multiple Regression Outline 1. Hypothesis tests and confidence intervals for a single coefficie. Joint hypothesis tests on multiple coefficients 3.
More informationST505/S697R: Fall Homework 2 Solution.
ST505/S69R: Fall 2012. Homework 2 Solution. 1. 1a; problem 1.22 Below is the summary information (edited) from the regression (using R output); code at end of solution as is code and output for SAS. a)
More informationStatistics: Learning models from data
DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial
More informationAn Analysis of College Algebra Exam Scores December 14, James D Jones Math Section 01
An Analysis of College Algebra Exam s December, 000 James D Jones Math - Section 0 An Analysis of College Algebra Exam s Introduction Students often complain about a test being too difficult. Are there
More information(1) Sort all time observations from least to greatest, so that the j th and (j + 1) st observations are ordered by t j t j+1 for all j = 1,..., J.
AFFIRMATIVE ACTION AND HUMAN CAPITAL INVESTMENT 8. ONLINE APPENDIX TO ACCOMPANY Affirmative Action and Human Capital Investment: Theory and Evidence from a Randomized Field Experiment, by CHRISTOPHER COTTON,
More informationSection 3: Permutation Inference
Section 3: Permutation Inference Yotam Shem-Tov Fall 2015 Yotam Shem-Tov STAT 239/ PS 236A September 26, 2015 1 / 47 Introduction Throughout this slides we will focus only on randomized experiments, i.e
More information1 Impact Evaluation: Randomized Controlled Trial (RCT)
Introductory Applied Econometrics EEP/IAS 118 Fall 2013 Daley Kutzman Section #12 11-20-13 Warm-Up Consider the two panel data regressions below, where i indexes individuals and t indexes time in months:
More informationReview of Statistics
Review of Statistics Topics Descriptive Statistics Mean, Variance Probability Union event, joint event Random Variables Discrete and Continuous Distributions, Moments Two Random Variables Covariance and
More informationQuantitative Understanding in Biology Module II: Model Parameter Estimation Lecture I: Linear Correlation and Regression
Quantitative Understanding in Biology Module II: Model Parameter Estimation Lecture I: Linear Correlation and Regression Correlation Linear correlation and linear regression are often confused, mostly
More informationAP Statistics Cumulative AP Exam Study Guide
AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics
More informationHypothesis Testing hypothesis testing approach
Hypothesis Testing In this case, we d be trying to form an inference about that neighborhood: Do people there shop more often those people who are members of the larger population To ascertain this, we
More informationECO220Y Simple Regression: Testing the Slope
ECO220Y Simple Regression: Testing the Slope Readings: Chapter 18 (Sections 18.3-18.5) Winter 2012 Lecture 19 (Winter 2012) Simple Regression Lecture 19 1 / 32 Simple Regression Model y i = β 0 + β 1 x
More informationRegression Discontinuity Designs Using Covariates
Regression Discontinuity Designs Using Covariates Sebastian Calonico Matias D. Cattaneo Max H. Farrell Rocío Titiunik May 25, 2018 We thank the co-editor, Bryan Graham, and three reviewers for comments.
More informationHoliday Assignment PS 531
Holiday Assignment PS 531 Prof: Jake Bowers TA: Paul Testa January 27, 2014 Overview Below is a brief assignment for you to complete over the break. It should serve as refresher, covering some of the basic
More informationGov 2000: 9. Regression with Two Independent Variables
Gov 2000: 9. Regression with Two Independent Variables Matthew Blackwell Harvard University mblackwell@gov.harvard.edu Where are we? Where are we going? Last week: we learned about how to calculate a simple
More informationQED. Queen s Economics Department Working Paper No Hypothesis Testing for Arbitrary Bounds. Jeffrey Penney Queen s University
QED Queen s Economics Department Working Paper No. 1319 Hypothesis Testing for Arbitrary Bounds Jeffrey Penney Queen s University Department of Economics Queen s University 94 University Avenue Kingston,
More informationIntroduction to hypothesis testing
Introduction to hypothesis testing Review: Logic of Hypothesis Tests Usually, we test (attempt to falsify) a null hypothesis (H 0 ): includes all possibilities except prediction in hypothesis (H A ) If
More information36-463/663: Multilevel & Hierarchical Models
36-463/663: Multilevel & Hierarchical Models Causal Inference Brian Junker 132E Baker Hall brian@stat.cmu.edu 1 Outline Causal Inference [G&H Ch 9] The Fundamental Problem Confounders, and how Controlled
More informationStatistical Methods III Statistics 212. Problem Set 2 - Answer Key
Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423
More informationIntroduction to Econometrics. Multiple Regression (2016/2017)
Introduction to Econometrics STAT-S-301 Multiple Regression (016/017) Lecturer: Yves Dominicy Teaching Assistant: Elise Petit 1 OLS estimate of the TS/STR relation: OLS estimate of the Test Score/STR relation:
More informationANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS
ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS Ravinder Malhotra and Vipul Sharma National Dairy Research Institute, Karnal-132001 The most common use of statistics in dairy science is testing
More informationBusiness Analytics and Data Mining Modeling Using R Prof. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee
Business Analytics and Data Mining Modeling Using R Prof. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee Lecture - 04 Basic Statistics Part-1 (Refer Slide Time: 00:33)
More informationCausal Inference Basics
Causal Inference Basics Sam Lendle October 09, 2013 Observed data, question, counterfactuals Observed data: n i.i.d copies of baseline covariates W, treatment A {0, 1}, and outcome Y. O i = (W i, A i,
More informationChapter 2: Resampling Maarten Jansen
Chapter 2: Resampling Maarten Jansen Randomization tests Randomized experiment random assignment of sample subjects to groups Example: medical experiment with control group n 1 subjects for true medicine,
More information