Endogenous Treatment Effects for Count Data Models with Endogenous Participation or Sample Selection

Similar documents
Endogenous Treatment Effects for Count Data Models with Sample Selection or Endogenous Participation

Selection endogenous dummy ordered probit, and selection endogenous dummy dynamic ordered probit models

A Joint Tour-Based Model of Vehicle Type Choice and Tour Length

DEEP, University of Lausanne Lectures on Econometric Analysis of Count Data Pravin K. Trivedi May 2005

ECON 594: Lecture #6

Applied Health Economics (for B.Sc.)

Even Simpler Standard Errors for Two-Stage Optimization Estimators: Mata Implementation via the DERIV Command

A simple bivariate count data regression model. Abstract

Limited Dependent Variables and Panel Data

Discrete Choice Modeling

A double-hurdle count model for completed fertility data from the developing world

Gibbs Sampling in Latent Variable Models #1

Mixed Models for Longitudinal Ordinal and Nominal Outcomes

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent

Short Questions (Do two out of three) 15 points each

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Binary Models with Endogenous Explanatory Variables

Econometric Analysis of Cross Section and Panel Data

Econometric Analysis of Panel Data. Final Examination: Spring 2018

Beyond the Target Customer: Social Effects of CRM Campaigns

Women. Sheng-Kai Chang. Abstract. In this paper a computationally practical simulation estimator is proposed for the twotiered

Model Assumptions; Predicting Heterogeneity of Variance

Applied Microeconometrics (L5): Panel Data-Basics

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p )

Econometrics Lecture 5: Limited Dependent Variable Models: Logit and Probit

Generalized Linear Models for Non-Normal Data

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction

Comments on: Panel Data Analysis Advantages and Challenges. Manuel Arellano CEMFI, Madrid November 2006

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing

Correlation and regression

Outline. Overview of Issues. Spatial Regression. Luc Anselin

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation

Child Maturation, Time-Invariant, and Time-Varying Inputs: their Relative Importance in Production of Child Human Capital

Price Discrimination through Refund Contracts in Airlines

Bootstrap Approach to Comparison of Alternative Methods of Parameter Estimation of a Simultaneous Equation Model

Harvard University. A Note on the Control Function Approach with an Instrumental Variable and a Binary Outcome. Eric Tchetgen Tchetgen

Switching Regime Estimation

Simplified Implementation of the Heckman Estimator of the Dynamic Probit Model and a Comparison with Alternative Estimators

Short T Panels - Review

David M. Drukker Italian Stata Users Group meeting 15 November Executive Director of Econometrics Stata

A Spatial Multiple Discrete-Continuous Model

Daisuke Nagakura University of Washington. Abstract

A Note on the Correlated Random Coefficient Model. Christophe Kolodziejczyk

Gravity Models, PPML Estimation and the Bias of the Robust Standard Errors

Ch 7: Dummy (binary, indicator) variables

Agricultural and Applied Economics 637 Applied Econometrics II. Assignment III Maximum Likelihood Estimation (Due: March 31, 2016)

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics

Testing exogeneity of multinomial regressors in count data models: does two stage residual inclusion work?

Non-linear panel data modeling

Mohammed. Research in Pharmacoepidemiology National School of Pharmacy, University of Otago

Discrete Choice Modeling

Cross-sectional space-time modeling using ARNN(p, n) processes

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1

Econometrics in a nutshell: Variation and Identification Linear Regression Model in STATA. Research Methods. Carlos Noton.

Panel Data Exercises Manuel Arellano. Using panel data, a researcher considers the estimation of the following system:

Estimation of Dynamic Regression Models

FIML estimation of an endogenous switching model for count data

Studies in Nonlinear Dynamics & Econometrics

Bayes methods for categorical data. April 25, 2017

Hypothesis Registration: Structural Predictions for the 2013 World Schools Debating Championships

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

Applied Econometrics Lecture 1

An Introduction to Mplus and Path Analysis

Analysing geoadditive regression data: a mixed model approach

Economics 671: Applied Econometrics Department of Economics, Finance and Legal Studies University of Alabama

Regression Analysis Tutorial 77 LECTURE /DISCUSSION. Specification of the OLS Regression Model

Partial effects in fixed effects models

Maximum Likelihood and. Limited Dependent Variable Models

STA 216, GLM, Lecture 16. October 29, 2007

COUNTS WITH AN ENDOGENOUS BINARY REGRESSOR: A SERIES EXPANSION APPROACH* WP-AD

Exam D0M61A Advanced econometrics

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case

Econometrics for PhDs

Constrained Maximum Likelihood Estimation for Model Calibration Using Summary-level Information from External Big Data Sources

PhD/MA Econometrics Examination. January, 2015 PART A. (Answer any TWO from Part A)

Stat 5101 Lecture Notes

Treatment Effects with Normal Disturbances in sampleselection Package

Functional Form. Econometrics. ADEi.

Linear Methods for Prediction

Practice exam questions

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

Agricultural and Applied Economics 637 Applied Econometrics II. Assignment III Maximum Likelihood Estimation (Due: March 25, 2014)

WISE International Masters

Statistics 203: Introduction to Regression and Analysis of Variance Course review

Categorical and Zero Inflated Growth Models

RIETI Discussion Paper Series 15-E-090

INTRODUCTION TO MULTILEVEL MODELLING FOR REPEATED MEASURES DATA. Belfast 9 th June to 10 th June, 2011

Joint Modeling of Longitudinal Item Response Data and Survival

Florian Hoffmann. September - December Vancouver School of Economics University of British Columbia

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

Introduction to Linear Regression Analysis

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

Computational Systems Biology: Biology X

Determining Changes in Welfare Distributions at the Micro-level: Updating Poverty Maps By Chris Elbers, Jean O. Lanjouw, and Peter Lanjouw 1

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices

Transcription:

Endogenous Treatment Effects for Count Data Models with Endogenous Participation or Sample Selection Massimilano Bratti & Alfonso Miranda Institute of Education University of London c Bratti&Miranda (p. 1 of 27)

Motivation Previous work The model Application Results Conclusions References Sample Sample selection selection vs vs Participation Participation In many fields of applied work researchers need to model a count In many fields of applied work researchers need to model a count dependent dependent variable y that variable is only y that observed is only for a observed proportion for of a the proportion sample (i.e., of the a sample (i.e., selection a sample problem) selection or that problem) is defined or that only is fordefined a subset only of for the apopulation subset of(i.e., the population a participation (i.e., problem) a participation problem) Note. Observed Note. variables Observed arevariables written are inside written a square inside a square and and unobserved unobserved variables variables are written are written inside ainside circle. a circle. S S is a selection dummy and Pand is apparticipation is a participation dummy is a selection dummy dummy Institute Institute of Education of Education University University of London of London c Bratti c Bratti&Miranda & Miranda (p. 2 (p. of 2 of 28) 27)

Motivation Previous work The model Application Results Conclusions References Particiapation Particiapation + Treatment Treatment Often the count dependent variable is, itself, function of dichotomous variable that indicates a treatment condition T Often the count dependent variable is, itself, function of a dichotomous variable that indicates a treatment condition T Note. Solid arrows indicate functional relationships. x, z, Note. Solid arrows indicate functional relationships. x, z, and r are and r are vectors of explanatory variables. ζ and ξ are vectors of explanatory random errors. variables. β, γ, ξθand are ψvectors are random of coefficients. errors. And β, γ, δ abd θ are vector and of ϕ are coefficients. And δ and ϕ are coefficients Institute Institute of Education of Education University University of London of London c Bratti c Bratti&Miranda & (p. 3(p. of3 28) of 27)

Example Effect of physician advice on individual alcohol consumption (Kenkel et al. 2001) Institute of Education University of London c Bratti&Miranda (p. 4 of 27)

The challenge The participation (sample selection) and treatment dummies are both likely to be endogenous. In the drinking example: Health oriented individuals are more likely to to seek advice and less likely to participate (drink), and abuse drinking given participation Individuals who have been given advice against drinking are less likely to participate Physicians are less likely to give advise to individuals with low levels of consumption Institute of Education University of London c Bratti&Miranda (p. 5 of 27)

The aim We aim to develop methods to deal with an endogenous participation (or a sample selection) problem together with an endogenous treatment problem in a model for a count variable The method should be able to deal with cases in which unobserved factors affecting sample selection or participation are correlated with those affecting exposure to the treatment and those relevant for the intensive margin of the activity of interest (smoking, drinking, etc.) Institute of Education University of London c Bratti&Miranda (p. 6 of 27)

Previous work To the best of our knowledge, there are models that are able to address only one of these issues at the time, or that address a similar but not exactly the same issue Endogenous participation (selection) and endogenous treatment Terza et al. (2008) propose a two-step estimator for a dependent variable observed in interval form to model sample selection and endogenous treatment; Li and Trivedi (2009) use a Bayesian approach to estimate a model for a continuous and non negative dependent variable with endogenous participation and multivariate treatments Endogenous participation or endogenous treatment Greene 2009 (excellent review) Kenkel and Terza (2001), which will be described later in more details Institute of Education University of London c Bratti&Miranda (p. 7 of 27)

Motivation Previous work The model Application Results Conclusions References EPET-Poisson model Endogenous Participation + Endogenous Treatment Note. η represents an unobserved or latent variable. 1 and Note. η represents an unobserved or latent variable. λ λ 2 are parameters. 1 and λ 2 are parameters Institute of Education University of London c Bratti & Miranda (p. 8 of 28) Institute of Education University of London c Bratti&Miranda (p. 8 of 27)

Formal definition I We start with the Endogenous Participation Endogenous Treatment (EPET) Poisson model we assume that the treatment dummy T and the participation dummy P are generated as follows: T = z γ + v, (1) P = r θ + φt + q (2) with T = 1(T > 0), P = 1(P > 0). The count y is generated according to the following cumulative conditional distribution function, { not defined if P = 0 G (y η) Pr (y η) = [µ y exp ( µ)] /[1 exp( µ)]y! if P = 1 with, y = { 0 if P = 0 1, 2,... if P = 1 (3) ln (µ) = x β + δt + η, (4) Institute of Education University of London c Bratti&Miranda (p. 9 of 27)

Formal definition II An important feature of the EPET Poisson model is the y = 0 count is generated by a different data generating mechanism from y > 0 counts. This is equivalent to say that the decisions To drink vs. not to drink # of drinks, given current drinking status are different and should be modelled separately, although the two processes may be correlated Correlation between T, P and y is induced by imposing some structure on the residuals of equations (1) and (2), v = λ 1 η + ζ q = λ 2 η + ξ, (5) Institute of Education University of London c Bratti&Miranda (p. 10 of 27)

Main assumptions To close the model we require the covariates to be all exogenous and impose some distributional conditions D(η x, z, r, ζ, ξ) = D(η) D(ζ x, z, r, η) = D(ζ η) D(ξ x, z, r, η) = D(ξ η) ζ ξ η, (C1) (C2) (C3) (C4) In what follows, we assume that η N(0, σ 2 η) and that ζ η and ξ η are both distributed as independent standard normal variates, and that all covariates (except the treatment of interest) are exogenous Institute of Education University of London c Bratti&Miranda (p. 11 of 27)

Implied correlations Correlations between y, T, and S are functions of the factor loadings λ 1 and λ 2. In particular, the model implies the following correlations: ρ η,v = ρ η,q = λ 1 ση 2 ση(λ 2 2 1 σ2 η + 1) λ 2 ση 2 ση(λ 2 2 2 σ2 η + 1) (6) (7) ρ v,q = λ 1 λ 2 ση 2. (8) (λ 2 1 σ2 η + 1)(λ 2 2 σ2 η + 1) Institute of Education University of London c Bratti&Miranda (p. 12 of 27)

Estimation We estimate the model by Maximum Simulated Likelihood (MSL) and it is identified by restrictions on the covariance matrix and by functional form. So, x, z, and r can all have the same elements. However, specifying some exclusion restrictions for the selection and/or treatment equations is always advisable when it is possible Institute of Education University of London c Bratti&Miranda (p. 13 of 27)

Likelihood function log(l) = ω τ ln { Pr P (0 η)pr T (τ η)φ(η)dη } i,p i =0 τ + ω τ ln { Pr P (1 η)pr T (τ η)g(y η)φ(η)dη } i,p i =1 τ Note 1 Pr P (0 η) is the conditional probability of P = 0 given η and Pr P (1 η) is the conditional probability of P = 1 given η Note 2 Pr T (τ η) represents the probability of T = τ given η, with τ = 0, 1 Note 3 G(y η) is the cumulative distribution of y Note 4 ω 0 = 1(T = 0), and ω 1 = 1(T = 1) Note 5 To simplify notation, conditioning on observable variables is not explicitly writen Institute of Education University of London c Bratti&Miranda (p. 14 of 27)

SET-Poisson model Sample selection + endogenous treatment count model Institute of Education University of London c Bratti & Miranda (p. 16 of 28) Institute of Education University of London c Bratti&Miranda (p. 15 of 27)

SET-Poisson In the Sample Selection Endogenous Treatment (SET) Poisson model we assume that the treatment dummy T and the sample selection dummy S are generated as follows: T = z γ + λ 1 η + ζ, (9) S = r θ + ϕt + η + ξ (10) with T = 1(T > 0), S = 1(S > 0) and, F (y η) Pr (y η) = y = { not defined if S = 0 [µ y exp ( µ)] /y! if S = 1. { missing if S = 0 0, 1, 2,... if S = 1, (11) (12) ln (µ) = x β + δt + η, (13) Institute of Education University of London c Bratti&Miranda (p. 16 of 27)

Application: Kenkel and Terza 2001 We apply the EPET Poisson model to Kenkel and Terza s (2001) data (KT, hereafter). KT analyze the effect of phsycian advice on drinking using the 1990 National Health Interview Survey (U.S.) The authors drop from the analysis lifetime abstainers and former drinkers with no drinking in the past year. Because the physician advice to cut drinking was recommended as a way of reducing high blood pressure, the authors focus only on men who have drunk alcohol at least once in the last 12 months and report having been told at some time that they had high blood pressure In spite of this sample selection, KT observe in their sample that 21% of current drinkers (according to their definition) did not drink at all in the last two weeks (excess of zeroes) Institute of Education University of London c Bratti&Miranda (p. 17 of 27)

Application: Kenkel and Terza 2001 Excess zeros: recent quitters or people who were actively trying to stop drinking all together in the last 12 months individuals who drink only in very special occasions such as weddings, birthdays, or Christmas day (occasional drikers) frequent drinkers who, by chance, did not drink any alcohol in the past two weeks; although this last scenario is less likely Since KT compute drinking as the product of self-reported drinking fre- quency (the number of days in the past two weeks with any drinking) and drinking intensity (the average number of drinks on a day with any drink- ing), (p. 171-172), the measure refers to the last two weeks, a su?ciently long time-span. For this reason, we expect the first two explanations above to be more relevant in this specific case Institute of Education University of London c Bratti&Miranda (p. 18 of 27)

Application: Kenkel and Terza 2001 KT account for this excess zero by using a flexible functional form for the conditional mean of drinking based on the inverse Box-Cox transforma- tion. In contrast, here we account for it using a standard Poisson model, but treating the zeros and the positive drinking outcomes as if they were generated by two separate DGPs. Our idea is that the DGP for occasional drinkers or quitters should be a different one from that ruling day-by-day drinking by frequent drinkers Institute of Education University of London c Bratti&Miranda (p. 19 of 27)

Results We estimated several models: A Poisson model An Endogenous Treatment (ET) Poisson model (neglecting endogenous participation) An EPET model In what follows all estimations by MSL use 1600 Halton draws to perform the integration. Re-estimating the models with 2000 draws did not change significantly coe?cient or standard errors Institute of Education University of London c Bratti&Miranda (p. 20 of 27)

Sample selection + endogenous treatment count model Table 2 Marginal effects of physician advice on the number of drinks consumed in the last two weeks and on the probability of drinking 30 ET- EP- EPET- Poisson Probit Poisson Poisson Poisson y (a) Pr(y>0) (b) y (a) Pr(y>0) (b) y>0 (a) Pr(y>0) (b) y>0 (a) (1) (2) (3) (4) (5) (6) (7) Physician advice (T ) 3.679*** 0.079*** -5.395*** 0.079*** 3.377*** -0.045-4.072*** (.558) (.017) (.386) (.017) (.557) (.049) (.864) ρ η,v 0.832*** 0.689*** (.029) (.092) ρ η,q.082 0.378*** (.141) (.088) ρ v,q 0.261*** (.084) σ η 2 2.190*** 1.207*** 1.456*** (.069) (.023) (.099) N.obs. 2,467 2,467 2,467 2,467 2,467 Log-likelihood -32,263-1,247-10,184-8,660-10,062 Pr(y = 0) (c) 0.00 0.21 0.11 0.21 0.22 BIC (d) 70,360 23,418 20,675 20,671 *** significant at 1%. Eicker-Huber-White robust standard errors in parentheses. (a) The marginal effect is computed for a discrete change in T (from 0 to 1) at the sample median of the dependent variable, in analogy to KT; Institute of Education University of London c Bratti&Miranda (p. 21 of 27) (b) The marginal effect is computed for a discrete change in T (from 0 to 1) at the sample mean of the other independent variables;

Notes *** significant at 1%. Eicker-Huber-White robust standard errors in parentheses. (a) The marginal effect is computed for a discrete change in T (from 0 to 1) at the sample median of the dependent variable, in analogy to KT; (b) The marginal effect is computed for a discrete change in T (from 0 to 1) at the sample mean of the other independent variables; (c) Probability of the zero outcome predicted by the model. (d) Bayesian information criterion. For the sake of comparability all BICs refer to a three-equation model. In the ET-Poisson the BIC refers to the ET-Poisson and the probit equation for exogenous participation; in the EP-Poisson the BIC refers to the EP-Poisson and the probit equation for exogenous treatment. The BIC in columns (1)-(2) refers to the probit models for exogenous participation and exogenous treatment and the Poisson model with exogenous participation and exogenous treatment (including the zeros). Note. y is the number of alcoholic drinks consumed in the last two weeks, Pr(y>0) the probability of drinking in the last two weeks, y>0 the number of alcoholic drinks consumed in the last two weeks conditional on drinking and T a dichotomous indicator of individual treatment status. Estimation refers to the 1990 NHIS with the sample selection and covariates used in KT. ET, EP and EPET stand for Endogenous Treatment, Endogenous Participation and Endogenous Participation Endogenous Treatment, respectively. ET-Poisson, EP-Poisson and EPET-Poisson models were estimated using MSL and 1, 600 Halton draws. The joint Wald test statistic for ρ y,t = ρ y,p = ρ T,P = 0 in the EPET-Poisson model, distributed as a χ 2 (3), is 57.904 (p-value=0.00). Institute of Education University of London c Bratti&Miranda (p. 22 of 27)

Main findings Neglecting the potential endogeneity of the treatment produces wrongly signed estimates of the treatment effects Neglecting endogenous participation leads to an overestimate of the treatment effects The EPET Poisson model produces estimates of the treatment effects that are in between those produced by the ET Poisson model and the estimates by KT Institute of Education University of London c Bratti&Miranda (p. 23 of 27)

Remark The EPET Poisson model also allows the same covariates to have different effects on the the intensive and the extensive drinking margins Institute of Education University of London c Bratti&Miranda (p. 24 of 27)

Conclusions In this paper, we: Propose a model suitable for all cases in which an endogenous treatment affects a count outcome variables, and in which there are endogenous participation or sample selection issues Show that accounting for the endogeneity of the treatment status and endgenous participation issues is paramount to obtaining consistent estimates of treatment effects, using data from KT study of physician advice on drinking In the future the are plans to make the model s code available to the general public through a Stata s ado file Institute of Education University of London c Bratti&Miranda (p. 25 of 27)

Thanks! Institute of Education University of London c Bratti&Miranda (p. 26 of 27)

References Bratti M. and Miranda A (2011). Endogenous treatment effects for count data models with endogenous participation or sample selection. Health Economics 20 (9): 1090 1109. Greene, W., 2009. Models for count data with endogenous participation. Empirical Economics 36, 133 173. Kenkel, D., Terza, J., 2001. The e?ect of physician advice on alcohol consumption: Count regression with an endogenous treatment e?ect. Journal of Applied Econometrics 16, 165 184. Li, Q., Trivedi, P. K., 2009. Impact of prescription drug coverage on drug expenditure of the elderly evidence from a two-part model with endogeneity, mimeo. Riphahn, R., Wambach, A., Million, A., 2003. Incentive e?ects in the demand for health care: A bivariate panel count data estimation. Journal of Applied Econometrics 4, 387 405. Terza, J. V., Kenkel, D. S., Lin, T.-F., Sakata, S., 2008. Care-giver advice as a preventive measure for drinking during pregnancy: Zeros, categorical outcome responses, and endogeneity. Health Economics 17, 41 54. Windmaijer, F., Santos Silva, J., 1997. Endogeneity in count data models: An application to demand for health care. Journal of Applied Econometrics 12, 281 94. Institute of Education University of London c Bratti&Miranda (p. 27 of 27)