Panel Data Seminar. Discrete Response Models. Crest-Insee. 11 April 2008

Panel Data Seminar: Discrete Response Models
Romain Aeberhardt, Laurent Davezies
Crest-Insee, 11 April 2008

Contents
1. Overview and Strategies
2. Simple Approaches and their Drawbacks
   - Linear Probability Model
   - Fixed Effects: the Incidental Parameters Problem
   - Random Effects: the assumptions are too strong
3. Classical Remedies
   - Conditional Logit: removing the Fixed Effects
   - Chamberlain's and Mundlak's Approaches: relaxing the Random Effects assumption
4. Extensions
   - Dynamic framework
   - Semi-Parametric approach

Overview and Strategies: Introduction
- Panel data are characterized by an outcome of the form $y_{it} = F(x_{it}\beta + \alpha_i + u_{it})$
- Main advantage of panel data: the possibility of taking the unobserved heterogeneity $\alpha_i$ into account
- Main difficulty with panel data: dealing with unobserved heterogeneity, in particular the relationship between $\alpha_i$ and $x_{it}$

Overview and Strategies: Important reminder
- The usual denomination of Fixed Effects and Random Effects is misleading
- Fixed Effects means no assumption concerning the dependence between $\alpha_i$ and $x_{it}$
- Random Effects means in general an independence assumption between $\alpha_i$ and $x_{it}$ (although it can be relaxed)

Overview and Strategies: Simple strategies
Linear Probability Model
- Good for a quick start
- But bad properties (worse than in cross section)
Probit / Logit with Fixed Effects as dummies
- Conceptually simple
- But ML estimators are consistent only when $N \to \infty$ and $T \to \infty$ (incidental parameters problem)
Simple Random Effects Probit
- Computationally quite easy (already implemented)
- But one strong assumption: no correlation between unobserved heterogeneity and covariates
- So one misses the point of using panel data

Overview and Strategies: Classical Remedies
Conditional Logit
- In the spirit of the Within or FD transformations
- No assumptions required on the correlation between unobserved heterogeneity and covariates
- But the identification hinges on the functional form (logit)
Chamberlain's and Mundlak's Approaches
- Based on the RE framework, computationally easy
- Relax the no-correlation assumption
- Allow only for a restricted relation between unobserved heterogeneity and covariates

Overview and Strategies: Extensions
Dynamic framework
- Relaxes the strict exogeneity assumption
- In particular, allows for the presence of the lagged dependent variable among the covariates
- Question of state dependence vs. unobserved heterogeneity
- Raises a new issue: the initial conditions problem
Semi-parametric models

Overview and Strategies: Main reference for this class
- Econometric Analysis of Cross Section and Panel Data, J.M. Wooldridge

Simple Approaches and their Drawbacks: Linear Probability Model
Linear Probability Model: good for a quick start
- Main advantage: it allows the use of all the simple and well-known methods developed for linear models (FE, RE, Chamberlain's approach, ...)
- Same problems as in the cross section case (predicted values outside the unit interval, heteroskedasticity)
- Even less appealing: for the response probability to stay in the unit interval, it implies $-x_i\beta \le \alpha_i \le 1 - x_i\beta$

Simple Approaches and their Drawbacks: Fixed Effects, the Incidental Parameters Problem
First idea: using dummies for fixed effects
- Interest: no assumption on the correlation structure between $\alpha_i$ and $x_{it}$
- A priori simple: just add dummies in the equation and use standard estimation procedures
- Danger: ML estimators are asymptotically unbiased and consistent only if $N \to \infty$ and $T \to \infty$
- Intuition: in the ML framework the number of regressors is fixed, and here it increases with $N$
- Fixed effects are biased and poorly estimated when $T$ is small
- This contaminates the rest of the coefficients through the ML procedure
- Difference with the linear case: there, the estimation of $\beta$ did not depend on the $\alpha_i$ (Frisch-Waugh)
- This is called the incidental parameters problem

Simple Approaches and their Drawbacks: Fixed Effects, the Incidental Parameters Problem
Chamberlain's illustration of the incidental parameters problem
- Very simple framework: ML estimation of a logit model with two independent time periods, fixed effects and one explanatory variable $x_{it}$ such that, for all $i$, $x_{i1} = 0$ and $x_{i2} = 1$
- $P(y_{it} = 1 \mid x, \alpha) = \dfrac{e^{\alpha_i + x_{it}\beta}}{1 + e^{\alpha_i + x_{it}\beta}}$
- If $y_{i1} = 0$ and $y_{i2} = 0$ then $\hat\alpha_i = -\infty$
- If $y_{i1} = 1$ and $y_{i2} = 1$ then $\hat\alpha_i = +\infty$
- If $y_{i1} + y_{i2} = 1$ then $\hat\alpha_i = -\hat\beta/2$
- And $\hat\beta = 2\log(\tilde n_2/\tilde n_1) \xrightarrow{P} 2\beta$, with $\tilde n_1 = \#\{i : y_{i1} = 1, y_{i2} = 0\}$ and $\tilde n_2 = \#\{i : y_{i1} = 0, y_{i2} = 1\}$
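A small simulation can make the $2\beta$ limit concrete. The sketch below is only an illustration under assumptions not in the slides: the $\alpha_i$ are drawn from a standard normal, $\beta = 1$, and the function name `chamberlain_two_period` is hypothetical. It uses the closed form $\hat\beta = 2\log(\tilde n_2/\tilde n_1)$ from the slide rather than a numerical optimizer, and also reports $\log(\tilde n_2/\tilde n_1)$, which is what conditioning on $y_{i1} + y_{i2} = 1$ gives in this design.

```python
import numpy as np

def chamberlain_two_period(n, beta, seed=0):
    """Simulate the T=2 logit with x_i1 = 0, x_i2 = 1.

    Returns (dummy-variable ML estimate, conditional-logit estimate).
    The N(0, 1) distribution of alpha_i is an assumption made for illustration.
    """
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=n)
    p1 = 1.0 / (1.0 + np.exp(-alpha))            # P(y_i1 = 1) since x_i1 = 0
    p2 = 1.0 / (1.0 + np.exp(-(alpha + beta)))   # P(y_i2 = 1) since x_i2 = 1
    y1 = rng.uniform(size=n) < p1
    y2 = rng.uniform(size=n) < p2
    n1 = np.sum(y1 & ~y2)                        # individuals with (y_i1, y_i2) = (1, 0)
    n2 = np.sum(~y1 & y2)                        # individuals with (y_i1, y_i2) = (0, 1)
    beta_cond = np.log(n2 / n1)                  # conditional-logit estimate
    beta_ml = 2.0 * beta_cond                    # ML with one dummy per individual
    return beta_ml, beta_cond

beta_ml, beta_cond = chamberlain_two_period(n=200_000, beta=1.0)
print(f"dummy-variable ML: {beta_ml:.3f}, conditional logit: {beta_cond:.3f}")
```

Increasing `n` does not help the dummy-variable estimate: with $T = 2$ fixed it stays near twice the true value.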

Simple Approaches and their Drawbacks: Random Effects, the assumptions are too strong
RE: simple procedure but strong assumptions
- Basic assumptions:
  - $P(y_{it} = 1 \mid x_{it}, \alpha_i) = \Phi(x_{it}\beta + \alpha_i)$
  - $y_{i1}, y_{i2}, \dots, y_{iT}$ independent conditional on $(x_i, \alpha_i)$
- Density of $(y_{i1}, \dots, y_{iT})$ conditional on $(x_i, \alpha_i)$:
  $f(y_{i1}, \dots, y_{iT} \mid x_i, \alpha_i, \beta) = \prod_{t=1}^{T} f(y_{it} \mid x_{it}, \alpha_i, \beta) = \prod_{t=1}^{T} \Phi(x_{it}\beta + \alpha_i)^{y_{it}} \, [1 - \Phi(x_{it}\beta + \alpha_i)]^{1 - y_{it}}$

Simple Approaches and their Drawbacks: Random Effects, the assumptions are too strong
RE: simple procedure but strong assumptions
- One needs to integrate out $\alpha_i$, which requires an additional assumption: $\alpha_i \mid x_i \sim N(0, \sigma_\alpha^2)$
- The conditional density becomes
  $f(y_{i1}, \dots, y_{iT} \mid x_i, \beta, \sigma_\alpha) = \int_{-\infty}^{+\infty} \left[ \prod_{t=1}^{T} f(y_{it} \mid x_{it}, \alpha, \beta) \right] \frac{1}{\sigma_\alpha} \varphi\!\left( \frac{\alpha}{\sigma_\alpha} \right) d\alpha$
- This is already implemented or easy to implement in standard software
- The independence assumption between $\alpha_i$ and $x_i$ is very strong
- One misses the point of using panel data
- But this procedure will be the basis for more complicated approaches
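The one-dimensional integral over $\alpha$ is typically approximated numerically. The following is a minimal sketch, assuming Gauss-Hermite quadrature (one common choice, not something the slides prescribe); the names `norm_cdf` and `re_probit_likelihood` and the example values are hypothetical. With the change of variable $\alpha = \sqrt{2}\,\sigma_\alpha z$, the integral becomes $\pi^{-1/2} \sum_k w_k \prod_t f(y_{it} \mid x_{it}, \sqrt{2}\sigma_\alpha z_k, \beta)$.

```python
import math
from itertools import product

import numpy as np

def norm_cdf(z):
    """Standard normal CDF, applied elementwise via math.erf."""
    z = np.asarray(z, dtype=float)
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def re_probit_likelihood(y, xb, sigma_alpha, n_nodes=20):
    """Likelihood contribution of one individual in the RE probit.

    y  : array of shape (T,) with 0/1 outcomes
    xb : array of shape (T,) holding the index x_it * beta
    The heterogeneity alpha ~ N(0, sigma_alpha^2) is integrated out
    by Gauss-Hermite quadrature with n_nodes nodes.
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    alpha = math.sqrt(2.0) * sigma_alpha * nodes        # quadrature points for alpha
    p = norm_cdf(xb[None, :] + alpha[:, None])          # shape (n_nodes, T)
    cond = np.prod(np.where(y[None, :] == 1, p, 1.0 - p), axis=1)
    return float(np.sum(weights * cond) / math.sqrt(math.pi))

# Sanity check: the probabilities of all 2^T outcome sequences sum to one.
xb_example = np.array([0.2, -0.5, 1.0])
total = sum(
    re_probit_likelihood(np.array(seq), xb_example, sigma_alpha=1.3)
    for seq in product([0, 1], repeat=3)
)
print(total)
```

The sample log-likelihood is the sum of the logs of these contributions over individuals, to be maximized over $(\beta, \sigma_\alpha)$.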

Classical Remedies: Conditional Logit, removing the Fixed Effects
Conditional Logit: make the $\alpha_i$ vanish
- In the spirit of the linear FE model
- Requires no assumption on $\alpha_i$
- $y_{i1}, \dots, y_{iT}$ independent conditional on $(x_i, \alpha_i)$
- The distribution of $(y_{i1}, \dots, y_{iT})$ conditional on $x_i$, $\alpha_i$ and $n_i = \sum_{t=1}^{T} y_{it}$ does not depend on $\alpha_i$

Classical Remedies: Conditional Logit, removing the Fixed Effects
Conditional Logit: make the $\alpha_i$ vanish
- Example with $T = 2$; the result is based on
  $\dfrac{P(y_{i1} = 1, y_{i2} = 0 \mid \alpha_i, x_i)}{P(y_{i1} = 0, y_{i2} = 1 \mid \alpha_i, x_i)} = e^{\beta(x_{i1} - x_{i2})}$
- and then
  $P(y_{i1} = 0, y_{i2} = 1 \mid y_{i1} + y_{i2} = 1, \alpha_i, x_i) = \dfrac{1}{1 + e^{\beta(x_{i1} - x_{i2})}}$, which is independent of $\alpha_i$
- and hence
  $P(y_{i1} = 0, y_{i2} = 1 \mid y_{i1} + y_{i2} = 1, x_i) = \dfrac{1}{1 + e^{\beta(x_{i1} - x_{i2})}}$

Classical Remedies: Conditional Logit, removing the Fixed Effects
Conditional Logit: make the $\alpha_i$ vanish
- The conditional log-likelihood for observation $i$ is
  $cll_i(\beta) = 1\{n_i = 1\} \left( w_i \log \Lambda[(x_{i2} - x_{i1})\beta] + (1 - w_i) \log(1 - \Lambda[(x_{i2} - x_{i1})\beta]) \right)$
  where $w_i = 1$ if $(y_{i1}, y_{i2}) = (0, 1)$ and $w_i = 0$ if $(y_{i1}, y_{i2}) = (1, 0)$
- Same properties as the usual likelihood
- The identification uses only the individuals who change state
- Only drawback: the identification hinges on the functional form (logit) and there is no similar strategy with probit, for example
- There is still a conditional independence assumption for the $y_{it}$, i.e. no serial correlation in the $u_{it}$ and no state dependence
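For $T = 2$ the conditional log-likelihood is an ordinary logit of $w_i$ on $x_{i2} - x_{i1}$, without intercept, estimated on the switchers only. The sketch below illustrates this on simulated data; the data-generating process (a scalar covariate equal to $\alpha_i$ plus noise, so that heterogeneity and covariate are strongly correlated), the Newton iterations and the function names are all assumptions made for the example.

```python
import numpy as np

def simulate_panel(n, beta, rng):
    """Two-period logit panel whose covariate is correlated with alpha_i (illustrative DGP)."""
    alpha = rng.normal(size=n)
    x = alpha[:, None] + rng.normal(size=(n, 2))          # x_it depends on alpha_i
    p = 1.0 / (1.0 + np.exp(-(alpha[:, None] + beta * x)))
    y = (rng.uniform(size=(n, 2)) < p).astype(int)
    return x, y

def conditional_logit_t2(x, y, n_iter=25):
    """Maximize the T=2 conditional log-likelihood over a scalar beta by Newton steps."""
    switch = y.sum(axis=1) == 1                            # only n_i = 1 contributes
    dx = x[switch, 1] - x[switch, 0]                       # x_i2 - x_i1
    w = y[switch, 1].astype(float)                         # w_i = 1 for (0, 1), 0 for (1, 0)
    b = 0.0
    for _ in range(n_iter):
        lam = 1.0 / (1.0 + np.exp(-dx * b))
        grad = np.sum((w - lam) * dx)
        hess = -np.sum(lam * (1.0 - lam) * dx ** 2)
        b -= grad / hess
    return b

x_sim, y_sim = simulate_panel(50_000, beta=1.0, rng=np.random.default_rng(1))
beta_hat = conditional_logit_t2(x_sim, y_sim)
print(f"conditional logit estimate: {beta_hat:.3f}")
```

No assumption on the distribution of $\alpha_i$ enters the estimator; individuals with $y_{i1} = y_{i2}$ are simply dropped.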

Classical Remedies: Chamberlain's and Mundlak's Approaches, relaxing the Random Effects assumption
Back to the RE
- Relaxing the crucial RE assumption $\alpha_i \mid x_i \sim N(0, \sigma_\alpha^2)$ by specifying a special form of dependence
- Mundlak (1978): $\alpha_i \mid x_i \sim N(\psi + \bar{x}_i \xi, \sigma_a^2)$, where $\bar{x}_i$ is the time average of the $x_{it}$
- Chamberlain (1980), more general form: instead of $\bar{x}_i$, he uses the vector of all explanatory variables across all time periods, $x_i$
- We can use standard RE probit software by just adding all the $x_i$ to all time periods (Chamberlain), or only the $\bar{x}_i$ (Mundlak)
- Restrictive in the sense that it specifies a distribution of $\alpha_i$ given $x_i$
- Still strong assumptions on the distribution tails for $\alpha_i$
- At least allows for some correlation
- Can be extended, for instance by specifying the distribution of the higher moments of $\alpha_i \mid x_i$
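In practice both devices amount to augmenting the regressor matrix before calling a random effects probit routine. The helper functions below are a hypothetical sketch of that data step only (the names `mundlak_design` and `chamberlain_design` and the balanced-panel array layout `(N, T, K)` are assumptions); the estimation itself is left to whatever RE probit software is used.

```python
import numpy as np

def mundlak_design(x):
    """Append the individual time averages to each period's regressors.

    x : array of shape (N, T, K), balanced panel
    Returns an (N*T, 2K) matrix with rows [x_it, xbar_i].
    """
    n, t, k = x.shape
    xbar = x.mean(axis=1, keepdims=True)                  # (N, 1, K)
    xbar_rep = np.broadcast_to(xbar, (n, t, k))
    return np.concatenate([x, xbar_rep], axis=2).reshape(n * t, 2 * k)

def chamberlain_design(x):
    """Append the full history x_i = (x_i1, ..., x_iT) to each period's regressors.

    Returns an (N*T, K + T*K) matrix with rows [x_it, x_i1, ..., x_iT].
    """
    n, t, k = x.shape
    full = x.reshape(n, 1, t * k)                         # (N, 1, T*K)
    full_rep = np.broadcast_to(full, (n, t, t * k))
    return np.concatenate([x, full_rep], axis=2).reshape(n * t, k + t * k)

x_demo = np.arange(12, dtype=float).reshape(2, 3, 2)      # N=2, T=3, K=2
print(mundlak_design(x_demo)[0])       # first row: x_11 followed by xbar_1
print(chamberlain_design(x_demo).shape)
```

The Mundlak version adds $K$ columns, the Chamberlain version $T \times K$, which is the sense in which the latter is more general and less parsimonious.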

Classical Remedies: Chamberlain's and Mundlak's Approaches, relaxing the Random Effects assumption
Strict exogeneity
- All the previous procedures hinge on the strict exogeneity of $x_{it}$ conditional on $\alpha_i$: $x_{it}$ is independent of $u_{is}$ at all time periods $s$
- It is very difficult to correct for endogeneity in nonlinear models
- But an easy test can be implemented:
  - Let $w_{it}$ be a subset of $x_{it}$ which potentially fails the strict exogeneity assumption
  - Include $w_{i,t+1}$ as an additional set of covariates
  - Under the null hypothesis of strict exogeneity, the coefficients on $w_{i,t+1}$ should be statistically insignificant
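The data step behind this test is just the construction of one-period leads, which costs the last time period. A minimal sketch, assuming a balanced panel stored as an `(N, T, K)` array and a hypothetical helper name `add_leads`:

```python
import numpy as np

def add_leads(x, w_cols):
    """Append the leads w_{i,t+1} of the suspect columns to the regressors.

    x      : array of shape (N, T, K), balanced panel
    w_cols : list of column indices of x forming w_it
    Returns an array of shape (N, T-1, K + len(w_cols)); the last period
    is dropped because w_{i,T+1} is not observed.
    """
    lead = x[:, 1:, w_cols]                    # w_{i,t+1} for t = 1, ..., T-1
    return np.concatenate([x[:, :-1, :], lead], axis=2)

x_demo = np.arange(12, dtype=float).reshape(2, 3, 2)   # N=2, T=3, K=2
x_aug = add_leads(x_demo, w_cols=[1])
print(x_aug[0])
```

The model is then re-estimated on the augmented regressors (with the outcome restricted to the same $T - 1$ periods), and the joint significance of the lead coefficients is tested.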

Extensions: Dynamic framework
State dependence vs. unobserved heterogeneity
- Dynamic framework: $P(y_{it} = 1 \mid y_{i,t-1}, \dots, y_{i0}, x_i, \alpha_i) = G(x_{it}\delta + \rho y_{i,t-1} + \alpha_i)$
- The $x_{it}$ are supposed to be strictly exogenous, but $y_{i,t-1}$ appears on the RHS, so we lose strict exogeneity ($y_{i,t-1}$ depends on $u_{i,t-1}$)
- Extensions of the previous approaches:
  - Conditional logit, cf. Chamberlain (1985, 1993), Magnac (2000), Honoré and Kyriazidou (1997)
  - Extension of the RE framework, but this raises the initial conditions problem

Extensions: Dynamic framework
Conditional Logit in a dynamic framework
- You need at least 4 observations per individual
- Intuition: in order to make the $\alpha_i$ vanish, you need to consider the two sets of events
  $A = \{y_{i0} = d_0, y_{i1} = 0, y_{i2} = 1, y_{i3} = d_3\}$ and $B = \{y_{i0} = d_0, y_{i1} = 1, y_{i2} = 0, y_{i3} = d_3\}$
- With no other covariates, see Chamberlain (1985), Magnac (2000)
- Extensions with strictly exogenous covariates, see Honoré and Kyriazidou (2000)
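The claim that conditioning on $A \cup B$ removes $\alpha_i$ can be checked numerically in the covariate-free case. The sketch below assumes the logit specification $P(y_{it} = 1 \mid y_{i,t-1}, \alpha_i) = \Lambda(\rho y_{i,t-1} + \alpha_i)$; multiplying out the transition probabilities gives $P(A \mid A \cup B, \alpha_i) = 1 / (1 + e^{\rho(d_0 - d_3)})$, which the code compares against for several values of $\alpha_i$. The function names are hypothetical.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def path_prob(path, rho, alpha):
    """P(y_1, ..., y_3 | y_0, alpha) for a sequence path = (y_0, y_1, y_2, y_3)."""
    prob = 1.0
    for prev, cur in zip(path[:-1], path[1:]):
        p_one = logistic(rho * prev + alpha)   # P(y_t = 1 | y_{t-1} = prev, alpha)
        prob *= p_one if cur == 1 else 1.0 - p_one
    return prob

def prob_a_given_a_or_b(d0, d3, rho, alpha):
    """P(A | A or B, alpha) with A = (d0, 0, 1, d3) and B = (d0, 1, 0, d3)."""
    pa = path_prob((d0, 0, 1, d3), rho, alpha)
    pb = path_prob((d0, 1, 0, d3), rho, alpha)
    return pa / (pa + pb)

rho = 0.8
for alpha in (-2.0, 0.0, 3.0):
    print(alpha, [round(prob_a_given_a_or_b(d0, d3, rho, alpha), 6)
                  for d0 in (0, 1) for d3 in (0, 1)])
```

The printed rows are identical across values of `alpha`, and only the individuals with $d_0 \neq d_3$ carry information about $\rho$.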

Extensions: Dynamic framework
Back to the RE framework: the initial conditions problem
- Form of the joint density of the observations ranging from $0$ to $T$ for an individual $i$:
  $f(y_{i0}, y_{i1}, \dots, y_{iT} \mid \alpha_i, x_i, \beta) = \prod_{t=1}^{T} f(y_{it} \mid y_{i,t-1}, x_{it}, \alpha_i, \beta) \, f(y_{i0} \mid x_{i0}, \alpha_i)$
- Goal: integrating out $\alpha_i$ in order to obtain
  $f(y_{i0}, y_{i1}, \dots, y_{iT} \mid x_i, \beta) = \int \prod_{t=1}^{T} f(y_{it} \mid y_{i,t-1}, x_{it}, \alpha_i, \beta) \, f(y_{i0} \mid x_i, \alpha_i) \, g(\alpha_i \mid x_i) \, d\alpha_i$
- Initial conditions problem: specifying $f(y_{i0} \mid x_i, \alpha_i)$

Extensions: Dynamic framework
Initial conditions problem: Heckman's approach
- Specify $f(y_{i0} \mid x_i, \alpha_i)$ and then specify a density for $\alpha_i$ given $x_i$
- For instance, assume that $y_{i0}$ follows a probit model with success probability $\Phi(\eta + x_i\pi + \gamma\alpha_i)$
- Then integrate out $\alpha_i$ by specifying, for instance, $\alpha_i \mid x_i \sim N(m_i, \sigma_i^2)$
- Problem: it is very difficult to specify the density of $y_{i0}$ given $(x_i, \alpha_i)$
- Problem: because the true density of $y_{i0}$ given $(x_i, \alpha_i)$ is not known and is supposed to depend on $y_{i,-1}$, estimators are biased when $T < +\infty$

Extensions: Dynamic framework
Initial conditions problem: Wooldridge's approach
- Instead of working on the full density $f(y_{i0}, y_{i1}, \dots, y_{iT} \mid \alpha_i, x_i, \beta)$, Wooldridge prefers to work on the conditional density $f(y_{i1}, \dots, y_{iT} \mid y_{i0}, \alpha_i, x_i, \beta)$
- Advantage: remaining agnostic on the density of $y_{i0}$ given $(x_i, \alpha_i)$
- Then specify a density for $\alpha_i$ given $(y_{i0}, x_i)$ and keep conditioning on $y_{i0}$ in addition to $x_i$:
  $f(y_{i1}, \dots, y_{iT} \mid y_{i0}, x_i, \theta) = \int_{-\infty}^{+\infty} f(y_{i1}, \dots, y_{iT} \mid y_{i0}, x_i, \alpha, \beta) \, h(\alpha \mid y_{i0}, x_i, \gamma) \, d\alpha$
- For example, with $h(\alpha \mid y_{i0}, x_i, \gamma) \sim N(\psi + \xi_0 y_{i0} + x_i\xi, \sigma_a^2)$:
  $y_{it} = 1\{\psi + x_{it}\delta + \rho y_{i,t-1} + \xi_0 y_{i0} + x_i\xi + a_i + e_{it} > 0\}$
- We can use standard RE probit software by just adding $y_{i0}$ and $x_i$ to all time periods
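As with the static case, the practical work is a data step: each period's row gets the lagged outcome, the initial outcome and the covariate history as extra regressors. The helper below is a hypothetical sketch for a balanced panel; it assumes `y` has shape `(N, T+1)` with `y[:, 0]` holding $y_{i0}$, `x` has shape `(N, T, K)` for periods $1, \dots, T$, and that $x_i$ means the full history (using time averages instead would be a more parsimonious variant).

```python
import numpy as np

def wooldridge_design(y, x):
    """Build outcome and regressors for the dynamic RE probit, Wooldridge-style.

    y : array (N, T+1) of 0/1 outcomes, column 0 is the initial condition y_i0
    x : array (N, T, K) of strictly exogenous covariates for t = 1, ..., T
    Returns (outcome, design) with design rows [x_it, y_{i,t-1}, y_i0, x_i1, ..., x_iT].
    """
    n, t, k = x.shape
    y_lag = y[:, :-1, None].astype(float)                        # y_{i,t-1}, shape (N, T, 1)
    y0 = np.broadcast_to(y[:, :1, None].astype(float), (n, t, 1))
    x_all = np.broadcast_to(x.reshape(n, 1, t * k), (n, t, t * k))
    design = np.concatenate([x, y_lag, y0, x_all], axis=2).reshape(n * t, -1)
    outcome = y[:, 1:].reshape(n * t)
    return outcome, design

y_demo = np.array([[1, 0, 1], [0, 0, 1]])                        # N=2, T=2
x_demo = np.arange(4, dtype=float).reshape(2, 2, 1)              # K=1
outcome, design = wooldridge_design(y_demo, x_demo)
print(outcome)
print(design)
```

A random effects probit of `outcome` on `design` (plus a constant) then delivers estimates of $\delta$, $\rho$, $\xi_0$, $\xi$ and $\sigma_a$.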

Extensions: Semi-Parametric approach
Reminder on Manski's approach in cross section (1988)
- Model: $y_i = 1\{x_i\beta + \varepsilon_i > 0\}$
- Until now, the conditional density $f(\varepsilon \mid x_i)$ was specified. Can we relax this assumption?
- $E(\varepsilon \mid X) = 0$ is not enough to identify $\beta$ (Manski, 1988)
- $\mathrm{med}(\varepsilon \mid X) = 0$ will allow us to identify $\beta / \|\beta\|$ under one more technical assumption concerning $X$: there must be one continuous variable $X_k$ such that the density of $X_k \mid X_{-k}$ is positive everywhere a.s.
- With $\beta$ normalized to unit norm:
  $\beta_0 = \arg\max_{\beta} E\big((2Y - 1) \, 1\{X\beta > 0\}\big)$
  $\hat\beta_{MS} \in \arg\max_{\beta} \sum_{i=1}^{n} Y_i \, 1\{X_i\beta \ge 0\} + (1 - Y_i) \, 1\{X_i\beta < 0\}$

Extensions: Semi-Parametric approach
Reminder on Manski's approach in cross section (1988)
- $\hat\beta_{MS} \xrightarrow{P} \beta_0$
- $n^{1/3}(\hat\beta_{MS} - \beta_0) \xrightarrow{L} D$
- See Kim and Pollard (1990) for the exact definition of $D$
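Because the maximum score objective is a step function of $\beta$, gradient-based optimizers do not apply, and with two regressors a grid search over the unit circle is a simple workaround. The sketch below is an illustration under assumptions of my own choosing: two standard normal regressors, a true direction $(1, 1)/\sqrt{2}$, and heteroskedastic normal errors with conditional median zero. The function name `max_score_2d` is hypothetical.

```python
import numpy as np

def max_score_2d(x, y, n_grid=720):
    """Grid-search maximum score estimator for two regressors, with ||b|| = 1.

    x : array (n, 2) of regressors, y : array (n,) of 0/1 outcomes.
    Returns the unit vector maximizing sum_i y_i 1{x_i b >= 0} + (1 - y_i) 1{x_i b < 0}.
    """
    angles = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
    candidates = np.column_stack([np.cos(angles), np.sin(angles)])   # (n_grid, 2)
    index = x @ candidates.T                                         # (n, n_grid)
    score = np.sum(np.where(index >= 0, y[:, None], 1 - y[:, None]), axis=0)
    return candidates[np.argmax(score)]

rng = np.random.default_rng(0)
n = 5000
beta0 = np.array([1.0, 1.0]) / np.sqrt(2.0)
x_sim = rng.normal(size=(n, 2))
eps = 0.5 * (1.0 + np.abs(x_sim[:, 0])) * rng.normal(size=n)   # median zero given x, heteroskedastic
y_sim = (x_sim @ beta0 + eps > 0).astype(int)

b_hat = max_score_2d(x_sim, y_sim)
print(b_hat, float(b_hat @ beta0))
```

Only the direction of $\beta$ is recovered, consistent with the identification of $\beta / \|\beta\|$, and the slow $n^{1/3}$ rate shows up as noticeably noisier estimates than a correctly specified probit or logit would give.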

Extensions: Semi-Parametric approach
Extensions to panel data
- See Honoré and Kyriazidou (1997): extension to dynamic panel data with exogenous covariates
  $P(y_{i0} = 1 \mid x_i, \alpha_i) = p_0(x_i, \alpha_i)$
  $P(y_{it} = 1 \mid x_i, \alpha_i, y_{i0}, \dots, y_{i,t-1}) = F(x_{it}\beta + \gamma y_{i,t-1} + \alpha_i)$
- With $T = 4$, $\beta$ and $\gamma$ may be estimated by maximizing with respect to $b$ and $g$:
  $\sum_{i=1}^{n} 1\{x_{i2} - x_{i3} = 0\} \, (y_{i2} - y_{i1}) \, \mathrm{sgn}\big((x_{i2} - x_{i1})b + g(y_{i3} - y_{i0})\big)$
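The objective function is straightforward to evaluate for given $(b, g)$. The sketch below assumes a scalar covariate, arrays of shape `(n, 4)` whose columns are periods $0$ to $3$, and an exact-equality indicator $1\{x_{i2} = x_{i3}\}$ as written on the slide (with a continuous covariate, exact ties do not occur and a kernel weight on $x_{i2} - x_{i3}$ would be needed instead). The function name `hk_objective` and the toy data are hypothetical.

```python
import numpy as np

def hk_objective(b, g, y, x):
    """Evaluate the four-period objective for scalar b and g.

    y, x : arrays of shape (n, 4); column t holds period t = 0, 1, 2, 3.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    same = (x[:, 2] == x[:, 3]).astype(float)               # 1{x_i2 - x_i3 = 0}
    index = (x[:, 2] - x[:, 1]) * b + g * (y[:, 3] - y[:, 0])
    return float(np.sum(same * (y[:, 2] - y[:, 1]) * np.sign(index)))

y_toy = np.array([[0, 0, 1, 1],
                  [1, 1, 0, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]])
x_toy = np.array([[0, 0, 1, 1],
                  [0, 1, 0, 0],
                  [0, 0, 1, 2],
                  [0, 1, 1, 1]])
value = hk_objective(b=1.0, g=0.5, y=y_toy, x=x_toy)
print(value)
```

Individuals with $y_{i1} = y_{i2}$ or $x_{i2} \neq x_{i3}$ contribute zero, and since the objective is a step function in $(b, g)$ it has to be maximized by search methods, up to a scale normalization.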