Estimating the Fractional Response Model with an Endogenous Count Variable

Similar documents
ECON 594: Lecture #6

Jeffrey M. Wooldridge Michigan State University

Econometric Analysis of Cross Section and Panel Data

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

Control Function and Related Methods: Nonlinear Models

Non-linear panel data modeling

GMM Estimation in Stata

CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M.

-redprob- A Stata program for the Heckman estimator of the random effects dynamic probit model

xtunbalmd: Dynamic Binary Random E ects Models Estimation with Unbalanced Panels

David M. Drukker Italian Stata Users Group meeting 15 November Executive Director of Econometrics Stata

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008

CENSORED DATA AND CENSORED NORMAL REGRESSION

Homework Solutions Applied Logistic Regression

ERSA Training Workshop Lecture 5: Estimation of Binary Choice Models with Panel Data

The Case Against JIVE

A simple alternative to the linear probability model for binary choice models with endogenous regressors

Generalized linear models

Treatment interactions with nonexperimental data in Stata

Applied Econometrics Lecture 1

fhetprob: A fast QMLE Stata routine for fractional probit models with multiplicative heteroskedasticity

Economics 620, Lecture 18: Nonlinear Models

Limited Information Econometrics

Economics 326 Methods of Empirical Research in Economics. Lecture 14: Hypothesis testing in the multiple regression model, Part 2

Generalized method of moments estimation in Stata 11

FIML estimation of an endogenous switching model for count data

THE BEHAVIOR OF THE MAXIMUM LIKELIHOOD ESTIMATOR OF DYNAMIC PANEL DATA SAMPLE SELECTION MODELS

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

New Developments in Econometrics Lecture 16: Quantile Estimation

On the Power of Tests for Regime Switching

Dealing With and Understanding Endogeneity

Exploring Marginal Treatment Effects

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation

Environmental Econometrics

Econometrics II Censoring & Truncation. May 5, 2011

i (x i x) 2 1 N i x i(y i y) Var(x) = P (x 1 x) Var(x)

Extended regression models using Stata 15

Introduction: structural econometrics. Jean-Marc Robin

Parametric and semiparametric estimation of ordered response models with sample selection and individual-specific thresholds

Syllabus. By Joan Llull. Microeconometrics. IDEA PhD Program. Fall Chapter 1: Introduction and a Brief Review of Relevant Tools

Econometrics Midterm Examination Answers

leebounds: Lee s (2009) treatment effects bounds for non-random sample selection for Stata

Economics 671: Applied Econometrics Department of Economics, Finance and Legal Studies University of Alabama

A Two-Regime CRC Model with Nonnegative Dependent Variable

Instrumental Variables. Ethan Kaplan

Nonlinear Econometric Analysis (ECO 722) : Homework 2 Answers. (1 θ) if y i = 0. which can be written in an analytically more convenient way as

Thoughts on Heterogeneity in Econometric Models

ECON Introductory Econometrics. Lecture 11: Binary dependent variables

Further simulation evidence on the performance of the Poisson pseudo-maximum likelihood estimator

Indirect estimation of a simultaneous limited dependent variable model for patient costs and outcome

use conditional maximum-likelihood estimator; the default constraints(constraints) apply specified linear constraints

Compare Predicted Counts between Groups of Zero Truncated Poisson Regression Model based on Recycled Predictions Method

Short T Panels - Review

Harvard University. A Note on the Control Function Approach with an Instrumental Variable and a Binary Outcome. Eric Tchetgen Tchetgen

Problem set - Selection and Diff-in-Diff

The Stata Journal. Editor Nicholas J. Cox Geography Department Durham University South Road Durham City DH1 3LE UK

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

An Efficient Estimation of Average Treatment Effect by Nonlinear Endogenous Switching Regression with an Application in Botswana Fertility

Appendix A. Numeric example of Dimick Staiger Estimator and comparison between Dimick-Staiger Estimator and Hierarchical Poisson Estimator

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

ECONOMET RICS P RELIM EXAM August 24, 2010 Department of Economics, Michigan State University

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

A Course on Advanced Econometrics

Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels.

Essential of Simple regression

Multivariate probit regression using simulated maximum likelihood

Markov-Switching Models with Endogenous Explanatory Variables. Chang-Jin Kim 1

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria

Simultaneous Equations Models: what are they and how are they estimated

Applied Health Economics (for B.Sc.)

Chapter 11. Regression with a Binary Dependent Variable

Exam ECON5106/9106 Fall 2018

Partial effects in fixed effects models

What s New in Econometrics? Lecture 14 Quantile Methods

Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels

xtdpdqml: Quasi-maximum likelihood estimation of linear dynamic short-t panel data models

Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson

Non-maximum likelihood estimation and statistical inference for linear and nonlinear mixed models

Limited Dependent Variable Models II

Chapter 9 Regression with a Binary Dependent Variable. Multiple Choice. 1) The binary dependent variable model is an example of a

Generalized Linear Models 1

Empirical Methods in Applied Microeconomics

Panel Data Seminar. Discrete Response Models. Crest-Insee. 11 April 2008

Day 4 Nonlinear methods

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

Binary Dependent Variables

Simple tests for exogeneity of a binary explanatory variable in count data regression models

Testing for Regime Switching: A Comment

Day 3B Nonparametrics and Bootstrap

Practice exam questions

1 Estimation of Persistent Dynamic Panel Data. Motivation

Econometrics II. Lecture 4: Instrumental Variables Part I

Some Stata commands for endogeneity in nonlinear panel-data models

ivporbit:an R package to estimate the probit model with continuous endogenous regressors

Christopher Dougherty London School of Economics and Political Science

PSC 8185: Multilevel Modeling Fitting Random Coefficient Binary Response Models in Stata

Marginal effects and extending the Blinder-Oaxaca. decomposition to nonlinear models. Tamás Bartus

Exercise Sheet 4 Instrumental Variables and Two Stage Least Squares Estimation

Short Questions (Do two out of three) 15 points each

Transcription:

Estimating the Fractional Response Model with an Endogenous Count Variable Estimating FRM with CEEV Hoa Bao Nguyen Minh Cong Nguyen Michigan State Universtiy American University July 2009 Nguyen and Nguyen (MSU & AU) Estimating FRM with CEEV 07/30 1 / 14

Econometric Issues and Motivation For nonlinear model, accounting for endogeneity is essentially complicated because simple two-stage regression strategies analogous to the Heckman (1979) method are only approximate and inference based on such procedures may lead to wrong conclusions (Wooldridge 2002). Many count endogenous explanatory variable (CEEV) have been treated as continuous endogenous variable and therefore increasing the value in the CEEV by 1 (e.g., from c k to c k+1 ) at di erent interesting values of the CEEV has roughly same e ect on the dependent variable. There is no routine for estimation of the fractional response model (FRM) with CEEV. Full-Information Maximum Likelihood approach is possible but it requires major computations and time consumption that it is usually used in binary response model or count model (Romeu and Vera-Hernandez 2005, Weiss 1999, Terza 1998, Heckman 1978). Nguyen and Nguyen (MSU & AU) Estimating FRM with CEEV 07/30 2 / 14

Contribution This presentation will describe the implementation of the Quasi-Maximum Likelihood (QMLE) techniques with control function approach (Blundell and Powell 2003, 2004) to estimate such FRM with CEEV (based on the theoretical paper by Hoa Nguyen and Je rey Wooldridge 2009). Properties of this estimator and other estimators will be compared using Monte Carlo simulations. Stata has several commands for continuous outcome with endogenous continuous or dummy explanatory variable (ivprobit, ivtobit, treatreg, ssm or gllamm). However, there are currently no commands for a CEEV in a nonlinear model. The frcount command will set a base for estimating nonlinear model with CEEV. Nguyen and Nguyen (MSU & AU) Estimating FRM with CEEV 07/30 3 / 14

Model Speci cation The conditional mean is expressed as: E (y 1 jy 2, z, a 1 ) = Φ(α 1 y 2 + z 1 δ 1 + η 1 a 1 ) y 2 is the count endogenous variable with the conditional mean: E (y 2 jz, a 1 ) = exp(zδ 2 + a 1 ) y 2 jz, a 1 has a Poisson distribution while y 2 jz is NBII distributed. z, z 1 are 1 K and 1 L vector of exogenous explanatory variables (L > K) a 1 is independent of z and exp(a 1 ) = c 1 Gamma(δ 0, δ 0 ) Nguyen and Nguyen (MSU & AU) Estimating FRM with CEEV 07/30 4 / 14

Model Speci cation We are interested in: E (y 1 jy 2, z) = R + Φ(α 1y 2 + z 1 δ 1 + η 1 a 1 )f (a 1 jy 2, z)da 1 = µ(θ 1 ; y 2, z) where θ 1 = (α 1, δ1, 0 η 1 ) 0 and g i = (y 2i, z 1i, a 1i ) The conditional mean E (y 1 jy 2, z) cannot simply have the closed-form solution so we need to use numerical approximation. The numerical routine for integration of unobserved heterogeneity in the conditional mean equation is based on the Adaptive Gauss Hermite Quadrature. Once the evaluation has been done, the numerical values can be passed on to a maximizer in order to nd the QMLE ^θ 1 The QMLE of θ 1 is obtained from the maximization problem: Max θ 1 2Θ n i=1 l i (θ 1 ) where the Bernoulli log-likelihood function is given by: l i (θ 1 ) = y i1 ln µ i + (1 y i1 ) ln(1 µ i ) with µ i is evaluated with numerical approximation. Nguyen and Nguyen (MSU & AU) Estimating FRM with CEEV 07/30 5 / 14

Model Estimation Two-step Estimation Procedure: Estimate the reduced form of CEEV by using maximum likelihood of y i2 on z i in the NB model. Obtain the estimated parameters ˆδ 2 and ˆδ 0. Use the QMLE of y i1 on y i2, z i1 to estimate α 1, δ 1 and η 1 with the approximated conditional mean. Approximate the conditional mean using the estimated parameters in the rst step and by using the Adaptive Gauss-Hermite method. Obtain estimated parameters ˆα 1, ˆδ 1 and ˆη 1. Standard errors in the second stage is adjusted for the rst stage estimation p and computed by using delta method. p N(^δ2 δ 2 ) = N 1/2 N i=1 r i2 + o p (1) N(^θ 1 θ 1 ) = A1 1(N 1/2 N i=1 r i1 (θ 1 ; δ 2 )) + o p (1) ^r i1 (θ 1 ; δ 2 ) = ^s i (θ 1 ; δ 2 ) ^F 1^r i2 (δ 2 ) \ Avar(^θ 1 ) = 1 N ^A 1 1 N 1 N i=1^r i1^r 0 i1 ^A 1 1 Nguyen and Nguyen (MSU & AU) Estimating FRM with CEEV 07/30 6 / 14

Average Partial E ects We are interested in partial e ects of the explanatory variables in non-linear models in order to get comparable magnitudes. - The conditional mean model is: E (y 1 jy 2, z, a 1 ) = Φ(α 1 y 2 + z 1 δ 1 + η 1 a 1 ) - The APE is obtained, for given z 1, y 2, by averaging the partial e ect across the distribution of a 1 in the population For a continuous z 1, APE = E a1 [δ 11 φ(α 1 y 2 + z 1 δ 1 + η 1 a 1 )] = δ 11 R + φ(α 1y 2 + z 1 δ 1 + η 1 a 1 )f (a 1 )da 1 [APE = ˆδ 11 N 1 N i=1 N Z + φ(ˆα 1 y 2i + z 1i^δ1 + ˆη 1 a 1 )f (a 1 )da 1 i=1 For a count y 2, for y 2 going from c k to c k+1 APE = Φ(α 1 c k+1 + z 1 δ 1 + η 1 a 1 ) Φ(α 1 c k + z 1 δ 1 + η 1 a 1 ) R +! [APE = N 1 Φ(ˆα 1c k+1 + z 1i^δ1 + ˆη 1 a 1 )f (a 1 )da R 1 + Φ(ˆα 1c k + z 1i^δ1 + ˆη 1 a 1 )f (a 1 )da 1 Nguyen and Nguyen (MSU & AU) Estimating FRM with CEEV 07/30 7 / 14

Implementation of the Command Syntax The general syntax of the command is as follows: frcount depvar varlist [if] [in], endog(varname) iv(varlist) quad(#) maximize_options Details: - endo (varname) speci es that endogenous variable be included in varname - iv(varlist) speci es that instrument(s) be included in varlist - quad(#) sets the number of quadrature - maximize_options are passed to QMLE or NLS. Theses methods have to be indicated clearly. Nguyen and Nguyen (MSU & AU) Estimating FRM with CEEV 07/30 8 / 14

Monte Carlo Simulation Data generating process: - The CEEV y 2 is generated from a Poisson distribution with conditional mean: λ = E(y 2 jz, x 1, x 2, a 1 ) = exp(0.01 x 1 +0.01 x 2 +2 iv + a 1 ) using independent draws of the normal variables iv N(0, 0.2 2 ), x 1 N(0, 0.005 2 ), x 2 N(0, 0.01 2 ) and exp(a 1 ) Gamma(1, 1/δ 0 ) where δ 0 = 2. - We generate the dependent variable y 1 by rst drawing a binomial random variable x with n trials and probability p and then y 1 = x/n. In this simulation, we use n = 100 and p = E(y 1 jy 2, x 1, x 2, a 1 ) =Φ(0.1 x 1 +0.1 x 2 0.1 y 2 +0.5 a 1 ) We run 500 replications for three sample sizes 500, 1000 and 5000 respectively. Nguyen and Nguyen (MSU & AU) Estimating FRM with CEEV 07/30 9 / 14

Table 1: Simulation Results We report sample means and sample standard deviations of these 500 estimates. Table 1 show the results of simulations for parameter α 1 estimated by QMLE and three alternative estimators for di erent sample size. Table of Results Simulation Result Sample Size 500 1000 5000 Method Mean (S.d.) Mean (S.d.) Mean (S.d.) OLS.0124.0044.0122.0030.0122.0013 2SLS.0389.0162.0386.0107.0382.0050 QMLE.1025.0124.1010.0089.1000.0036 NLS.1025.0124.1011.0089.0999.0036 Nguyen and Nguyen (MSU & AU) Estimating FRM with CEEV 07/30 10 / 14

Conclusion The QMLE method produces unbiased and more e cient estimates compared to three alternative estimators. The QMLE estimates are signi cant and APEs are available to get di erent magnitudes for increasing values of the CEEV. Other traditional estimates only produce the same partial e ect for increasing values of the CEEV. Nguyen and Nguyen (MSU & AU) Estimating FRM with CEEV 07/30 11 / 14

Example. frcount y1 x1 x2, endog(y2) iv(iv) quad(50) qmle Getting Initial Values: Iteration 0: log likelihood = 845.94337 Iteration 1: log likelihood = 845.33452 Iteration 2: log likelihood = 845.3345 Resetting the parameters... Iteration 0: log likelihood = 848.3366 Iteration 1: log likelihood = 848.2526 Iteration 2: log likelihood = 848.2526 Resetting the parameters... Iteration 0: log likelihood = 848.44551 Iteration 1: log likelihood = 848.44551 Fitting Quasi MLE Model: Iteration 0: log likelihood = 679.63122 Iteration 1: log likelihood = 674.69828 Iteration 2: log likelihood = 674.69328 Iteration 3: log likelihood = 674.69327 Quasi MLE Fractional Response Model Number of obs = 1000 Corr(y1, y1hat) =.0665616 Log likelihood: 674.69327 Pseudo R2 =.0656221 y1 Coef. Std. Err. z P> z [95% Conf. Interval] y2.0940061.0053054 17.72 0.000.1044046.0836077 x1 1.315443 2.302827 0.57 0.568 3.198015 5.828902 x2 1.63701 1.187422 1.38 0.168.6902948 3.964314 eta1.4940986.0296782 16.65 0.000.4359305.5522667 Nguyen and Nguyen (MSU & AU) Estimating FRM with CEEV 07/30 12 / 14

Future Work Run simulations for di erent values of η 1. Adopt Simulated Maximum Likehood method in order to get approximation and compare with the Adaptive Gauss-Hermite method. Simulate data with censored fractional response variable and compare the QML estimator of the FRM and the Tobit estimator. Nguyen and Nguyen (MSU & AU) Estimating FRM with CEEV 07/30 13 / 14

Thank You THANK YOU! Nguyen and Nguyen (MSU & AU) Estimating FRM with CEEV 07/30 14 / 14