leebounds: Lee s (2009) treatment effects bounds for non-random sample selection for Stata

Similar documents
ECON Introductory Econometrics. Lecture 17: Experiments

Econometrics II Censoring & Truncation. May 5, 2011

Interpreting coefficients for transformed variables

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

Problem set - Selection and Diff-in-Diff

Problem Set #3-Key. wage Coef. Std. Err. t P> t [95% Conf. Interval]

Correlation and Simple Linear Regression

Exploring Marginal Treatment Effects

treatment effect models

Problem Set 10: Panel Data

ECO220Y Simple Regression: Testing the Slope

-redprob- A Stata program for the Heckman estimator of the random effects dynamic probit model

Economics 326 Methods of Empirical Research in Economics. Lecture 14: Hypothesis testing in the multiple regression model, Part 2

Problem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics

Regression #8: Loose Ends

Measurement Error. Often a data set will contain imperfect measures of the data we would ideally like.

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation

Constructive Proposals for Dealing with Attrition: An Empirical Example

Exam ECON3150/4150: Introductory Econometrics. 18 May 2016; 09:00h-12.00h.

(a) Briefly discuss the advantage of using panel data in this situation rather than pure crosssections

8. Nonstandard standard error issues 8.1. The bias of robust standard errors

THE MULTIVARIATE LINEAR REGRESSION MODEL

David M. Drukker Italian Stata Users Group meeting 15 November Executive Director of Econometrics Stata

Extended regression models using Stata 15

Treatment interactions with nonexperimental data in Stata

Introductory Econometrics. Lecture 13: Hypothesis testing in the multiple regression model, Part 1

sociology 362 regression

Nonlinear Econometric Analysis (ECO 722) : Homework 2 Answers. (1 θ) if y i = 0. which can be written in an analytically more convenient way as

Binary Dependent Variables

Dealing With and Understanding Endogeneity

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Trimming for Bounds on Treatment Effects with Missing Outcomes *

Weak Stochastic Increasingness, Rank Exchangeability, and Partial Identification of The Distribution of Treatment Effects

Answer all questions from part I. Answer two question from part II.a, and one question from part II.b.

Identification Analysis for Randomized Experiments with Noncompliance and Truncation-by-Death

Sensitivity checks for the local average treatment effect

Flexible Estimation of Treatment Effect Parameters

Question 1 [17 points]: (ch 11)

Thursday Morning. Growth Modelling in Mplus. Using a set of repeated continuous measures of bodyweight

New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation

4 Instrumental Variables Single endogenous variable One continuous instrument. 2

Fixed and Random Effects Models: Vartanian, SW 683

sociology 362 regression

Applied Statistics and Econometrics

Handout 12. Endogeneity & Simultaneous Equation Models

Handout 11: Measurement Error

Control Function and Related Methods: Nonlinear Models

Semiparametric Estimation of a Sample Selection Model in the Presence of Endogeneity

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M.

University of California at Berkeley Fall Introductory Applied Econometrics Final examination. Scores add up to 125 points

Section I. Define or explain the following terms (3 points each) 1. centered vs. uncentered 2 R - 2. Frisch theorem -

Practice exam questions

4 Instrumental Variables Single endogenous variable One continuous instrument. 2

Lecture 3 Linear random intercept models

What s New in Econometrics. Lecture 1

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

ECON Introductory Econometrics. Lecture 16: Instrumental variables

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois

Appendix A. Numeric example of Dimick Staiger Estimator and comparison between Dimick-Staiger Estimator and Hierarchical Poisson Estimator

Lab 07 Introduction to Econometrics

Lecture 4: Multivariate Regression, Part 2

An explanation of Two Stage Least Squares

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

ECON Introductory Econometrics. Lecture 7: OLS with Multiple Regressors Hypotheses tests

Unemployment Rate Example

Econometrics Midterm Examination Answers

1 The basics of panel data

Lecture 8: Instrumental Variables Estimation

New Developments in Econometrics Lecture 16: Quantile Estimation

ECON 594: Lecture #6

Generated Covariates in Nonparametric Estimation: A Short Review.

Sharp Bounds on Causal Effects under Sample Selection*

Microeconometrics. C. Hsiao (2014), Analysis of Panel Data, 3rd edition. Cambridge, University Press.

What s New in Econometrics? Lecture 14 Quantile Methods

Lecture 14. More on using dummy variables (deal with seasonality)

1 A Review of Correlation and Regression

Exercices for Applied Econometrics A

Practice 2SLS with Artificial Data Part 1

NBER WORKING PAPER SERIES A NOTE ON ADAPTING PROPENSITY SCORE MATCHING AND SELECTION MODELS TO CHOICE BASED SAMPLES. James J. Heckman Petra E.

Supplemental Appendix to "Alternative Assumptions to Identify LATE in Fuzzy Regression Discontinuity Designs"

Outline. Linear OLS Models vs: Linear Marginal Models Linear Conditional Models. Random Intercepts Random Intercepts & Slopes

Lab 6 - Simple Regression

Lecture 4: Multivariate Regression, Part 2

The relationship between treatment parameters within a latent variable framework

Lab 10 - Binary Variables

Multiple Regression Analysis: Estimation. Simple linear regression model: an intercept and one explanatory variable (regressor)

Training, Wages, and Sample Selection: Estimating Sharp Bounds on Treatment Effects *

Testing methodology. It often the case that we try to determine the form of the model on the basis of data

Practical Econometrics. for. Finance and Economics. (Econometrics 2)

Econometrics. 8) Instrumental variables

Confidence intervals for the variance component of random-effects linear models

2. (3.5) (iii) Simply drop one of the independent variables, say leisure: GP A = β 0 + β 1 study + β 2 sleep + β 3 work + u.

Acknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression

Bounds on Average and Quantile Treatment Effects of Job Corps Training on Wages*

Treatment Effects with Normal Disturbances in sampleselection Package

Monday 7 th Febraury 2005

Econ 273B Advanced Econometrics Spring

Section Least Squares Regression

Statistical Inference with Regression Analysis

Transcription:

leebounds: Lee s (2009) treatment effects bounds for non-random sample selection for Stata Harald Tauchmann (RWI & CINCH) Rheinisch-Westfälisches Institut für Wirtschaftsforschung (RWI) & CINCH Health Economics Research Centre 1. June 2012 2012 German Stata Users Group Meeting, WZB, Berlin

Introduction Selection Bias Introduction Random assignment of treatment: ideal setting for estimating treatment effects Randomized trials Non-random sample attrition (selection) still undermines validity of econometric estimates Selection bias Typical examples: Dropout from program Denied information on outcome Death during clinical trial Possibly severe attrition bias Direction of bias a priory unknown Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 2 / 20

Correction for Attrition Bias Classical Approaches Selection Correction Estimators Modeling the mechanism of sample selection/attrition Classical Heckman (1976, 1979) parametric selection correction estimator Stata command heckman Assumes joint normality Exclusion restrictions beneficial Identification through non-linearity in principle possible Parametric approach relying on strong assumptions Semi-parametric approaches (e.g. Ichimura and Lee, 1991; Ahn and Powell, 1993) Assumption of joint normality not required Exclusion restrictions essential Valid exclusion restrictions may not be available Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 3 / 20

Treatment Effect Bounds Non-Parametric Approaches Treatment Effect Bounds Rather than correcting point estimate of treatment effect Determining interval for effect size Correspond to extreme assumptions about the impact of selection on estimated effect 1. Horowitz and Manski (2000) bounds No assumptions about the the selection mechanism required Outcome variable needs to be bounded Missing information is imputed an basis of minimal and maximal possible values of the outcome variable Frequently yields very wide (i.e. hardly informative) bounds Useful benchmark for binary outcome variables Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 4 / 20

Treatment Effect Bounds Non-Parametric Approaches Treatment Effect Bounds II 2. Lee (2009) bounds Assumptions: (i) Besides random assignment of treatment (ii) Monotonicity assumption about selection mechanism Intuition: Assignment to treatment can only affect attrition in one direction I.e. (in terms of sign) no heterogeneous effect of treatment on selection Average treatment effect for never-attriters Sample trimmed such that the share of observed individuals is equal for both groups Trimming either from above or from below Corresponds to extreme assumptions about missing information that are consistent with (i) (ii) The observed data and A one-sided selection model Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 5 / 20

Treatment Effect Bounds Estimation Estimating Lee (2009) bounds Let denote Y the outcome, T a binary treatment indicator, W a binary selection indicator, and i individuals. Calculate: 1. q T i 1(T i=1,w i =1) i 1(T and q i=1) C i 1(T i=0,w i =1) i 1(T, i=0) i.e. the shares of individuals with observed Y 2. q (q T q C )/q T, if q T > q C (If q T < q C, exchange C for T) 3. y T q = G 1 Y (q T = 1, W = 1) and yt 1 q = G 1 (1 q T = 1, W = 1), i.e. qth and the (1 q)th quantile of observed outcome in the treatment group 4. Upper bound ˆθ upper and lower bound ˆθ lower as Y ˆθ upper = ˆθ lower = ( i 1 T i = 1, W i = 1, Y i yq) T Yi ( ) i 1 T i = 1, W i = 1, Y i yq T i 1( T i = 1, W i = 1, Y i y1 q) T Yi i 1( ) T i = 1, W i = 1, Y i y1 q T i 1 (T i = 0, W i = 1) Y i i 1 (T i = 0, W i = 1) i 1 (T i = 0, W i = 1) Y i i 1 (T i = 0, W i = 1) Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 6 / 20

Treatment Effect Bounds Tightened Bounds Tightening Bounds Lee (2009) bounds rest on comparing unconditional means of (trimmed) subsamples No covariates considered Using covariates yields tighter bounds: 1. Choose (discrete) variable(s) that have explanatory power for attrition 2. Split sample into cells defined by these variables 3. Compute bounds for each cell 4. Take weighted average Lee (2009) shows that such bounds are tighter than unconditional ones Researcher can generate such variables by deliberately varying the effort on preventing attrition (DiNardo et al., 2006) Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 7 / 20

Treatment Effect Bounds Standard Errors and Confidence Intervals Standard Errors and Confidence Intervals Lee (2009) derives analytic standard errors for bounds Allows for straightforward calculation of a naive confidence interval Covers the interval [θ lower, θ upper ] with probability 1 α Imbens and Manski (2004) derive confidence interval for the treatment effect itself Tighter than confidence interval for the interval Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 8 / 20

leebounds for Stata Syntax leebounds: Syntax Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 9 / 20

leebounds for Stata Syntax leebounds: Saved Results Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 10 / 20

Empirical Application The Experiment Experimental Design Research question: Do financial incentives aid obese in reducing bodyweight? Ongoing randomized trial (Augurzky et al., 2012) 698 obese (BMI 30) individuals recruited during rehab hospital stay Individual weight-loss target (typically 6 8% of body weight) Participants prompted to realize weight-loss target within four months Randomly assigned to on of three experimental groups: i. No financial incentive (control group) ii. 150 e reward for realizing weight-loss target iii. 300 e reward for realizing weight-loss target After four months: weight-in at assigned pharmacy Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 11 / 20

Empirical Application The Experiment Attrition Problem Experimental groups: group size compliers attrition control group 233 155 33.5% 150 e group 236 172 27.1% 300 e group 229 193 15.7% 698 520 25.5% Attrition rate negatively correlated with size of reward Plausible since (successful) members of incentive group have stronger incentive not to dropout Selection on success (in particular for incentive groups) likely Overestimation of incentive effect likely downward bias still possible Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 12 / 20

Empirical Application Eonometric Analysis Simple Bivariate OLS (comparison of means) Outcome variable: weightloss (percent of body weight) Focus on comparing group 300 e with control group. regress weightloss group300 Source SS df MS Number of obs = 348 F( 1, 346) = 23.17 Model 686.575435 1 686.575435 Prob > F = 0.0000 Residual 10253.2078 346 29.6335486 R-squared = 0.0628 Adj R-squared = 0.0601 Total 10939.7832 347 31.5267528 Root MSE = 5.4437 weightloss Coef. Std. Err. t P> t [95% Conf. Interval] group300 2.826111.5871336 4.81 0.000 1.671311 3.980911 _cons 2.34758.4372461 5.37 0.000 1.487585 3.207575 Highly significant inventive effect Roughly three percentage points Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 13 / 20

Empirical Application Eonometric Analysis Heckman (two-step) Selection Correction Estimator Exclusion restriction: nearby_pharmacy (assigned pharmacy within same ZIP-code area as place of residence) Captures cost of attending weight-in, no direct link to weight loss No further controlls Two-step estimation Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 14 / 20

Empirical Application Eonometric Analysis Heckman (two-step) Selection Correction Estimator II. heckman weightloss group300, select(group300 nearby_pharmacy) twostep Heckman selection model -- two-step estimates Number of obs = 462 (regression model with sample selection) Censored obs = 114 Uncensored obs = 348 Wald chi2(1) = 1.37 Prob > chi2 = 0.2415 weightloss Coef. Std. Err. z P> z [95% Conf. Interval] weightloss group300 3.126055 2.669154 1.17 0.242-2.105391 8.357501 _cons 1.716602 5.493513 0.31 0.755-9.050485 12.48369 select group300.5777289.1312605 4.40 0.000.3204631.8349947 nearby_phar ~ y.1358984.1344283 1.01 0.312 -.1275763.399373 _cons.3406349.1201113 2.84 0.005.1052211.5760487 mills lambda 1.158006 10.04912 0.12 0.908-18.5379 20.85392 rho 0.21123 sigma 5.4821209 Similar point estimate as for OLS Large S.E.s insignificant incentive effect Low explanatory power of nearby_pharmacy (if regional characteristics are not controlled for) Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 15 / 20

Empirical Application Eonometric Analysis Lee Bounds. leebounds weightloss group300 Lee (2009) treatment effect bounds Number of obs. = 462 Number of selected obs. = 348 Trimming porportion = 0.2107 weightloss Coef. Std. Err. z P> z [95% Conf. Interval] group300 lower.983459.6431066 1.53 0.126 -.2770069 2.243925 upper 4.783921.6677338 7.16 0.000 3.475187 6.092655 Bounds cover OLS and Heckman point estimate Fairly wide interval Lower bound does not significantly differ from zero Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 16 / 20

Empirical Application Eonometric Analysis Lee Bounds with Effect Confidence Interval. leebounds weightloss group300, cie Lee (2009) treatment effect bounds Number of obs. = 462 Number of selected obs. = 348 Trimming porportion = 0.2107 Effect 95% conf. interval : [-0.0744 5.8822] weightloss Coef. Std. Err. z P> z [95% Conf. Interval] group300 lower.983459.6431066 1.53 0.126 -.2770069 2.243925 upper 4.783921.6677338 7.16 0.000 3.475187 6.092655 Effect confidence interval covers zero Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 17 / 20

Empirical Application Eonometric Analysis Tightened Lee Bounds Variable nearby_pharmacy used for tightening bounds Following the suggestion of DiNardo et al. (2006). leebounds weightloss group300, cie tight(nearby_pharmacy) Tightened Lee (2009) treatment effect bounds Number of obs. = 462 Number of selected obs. = 348 Number of cells = 2 Overall trimming porportion = 0.2107 Effect 95% conf. interval : [-0.0595 5.8448] weightloss Coef. Std. Err. z P> z [95% Conf. Interval] group300 lower 1.000043.6441664 1.55 0.121 -.2625003 2.262585 upper 4.727485.6792707 6.96 0.000 3.396139 6.058831 Bounds just marginally tighter Effect confidence interval still covers zero Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 18 / 20

Empirical Application Eonometric Analysis Tightened Lee Bounds II Further covariates for tightening bounds: i. age50 (indicator for age 50) ii. woman (indicator for sex). leebounds weightloss group300, cie tight(nearby_pharmacy age50 woman) Tightened Lee (2009) treatment effect bounds Number of obs. = 462 Number of selected obs. = 348 Number of cells = 8 Overall trimming porportion = 0.2107 Effect 95% conf. interval : [ 0.0608 5.3804] weightloss Coef. Std. Err. z P> z [95% Conf. Interval] group300 lower 1.282951.7429877 1.73 0.084 -.1732782 2.73918 upper 4.065244.7995777 5.08 0.000 2.498101 5.632388 Bounds substantially tighter Effect confidence interval does not covers zero Confirms existence of incentive effect Size of (potential) attrition bias remains somewhat unclear Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 19 / 20

References References Ahn, H. and Powell, J. L. (1993). Semiparametric estimation of censored selection models with a nonparametric selection mechanism, Journal of Econometrics 58: 3 29. Augurzky, B., Bauer, T. K., Reichert, A. R., Schmidt, C. M. and Tauchmann, H. (2012). Does money burn fat? Evidence from a randomized experiment, mimeo. DiNardo, J., McCrary, J. and Sanbonmatsu, L. (2006). Constructive Proposals for Dealing with Attrition: An Empirical Example, University of Michigan Working Paper. Heckman, J. J. (1976). The common structure of statistical models of truncation, sample selection and limited dependent variables and a simple estimator for such models, Annals of Economics and Social Measurement 5: 475 492. Heckman, J. J. (1979). Sample selection bias as a specification error, Econometrica 47: 153 161. Horowitz, J. L. and Manski, C. F. (2000). Nonparametric analysis of randomized experiments with missing covariate and outcome data, Journal of the American Statistical Association 95: 77 84. Ichimura, H. and Lee, L. (1991). Semiparametric least squares estimation of multiple index models: Single equation estimation, Vol. 5 of International Symposia in Economic Theory and Econometrics, Cambridge University Press, pp. 3 32. Imbens, G. and Manski, C. F. (2004). Confidence intervals for partially identified parameters, Econometrica 72: 1845 1857. Lee, D. S. (2009). Training, Wages, and Sample Selection: Estimating Sharp Bounds on Treatment Effects, Review of Economic Studies 76: 1071 1102. Harald Tauchmann (RWI & CINCH) leebounds 1. June 2012 20 / 20