Lecture 10: Panel Data

Lecture 10: Instructor: Department of Economics Stanford University 2011

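Random Effect Estimator: $\beta_R$

$$y_{it} = x_{it}'\beta + u_{it}, \qquad u_{it} = \alpha_i + \epsilon_{it}, \qquad i = 1, \dots, N, \quad t = 1, \dots, T.$$

Assumptions:

$$E(\alpha_i \mid x_i) = E(\epsilon_{it} \mid x_i) = 0, \quad E\alpha_i^2 = \sigma_\alpha^2, \quad E\epsilon_{it}^2 = \sigma_\epsilon^2, \quad E\epsilon_{it}\epsilon_{is} = 0 \ (t \neq s), \quad E\alpha_i\epsilon_{it} = 0 \ \ \forall t.$$

In vector notation,

$$\underbrace{y}_{NT \times 1} = (y_{11}, \dots, y_{1T}, \dots, y_{N1}, \dots, y_{NT})', \qquad \underbrace{\alpha}_{N \times 1} = (\alpha_1, \dots, \alpha_N)', \qquad \epsilon = (\epsilon_{11}, \dots, \epsilon_{1T}, \dots, \epsilon_{N1}, \dots, \epsilon_{NT})',$$

$$y = X\beta + u, \qquad u = \alpha \otimes l + \epsilon, \qquad \underbrace{l}_{T \times 1} = (1, \dots, 1)'.$$

$\beta_R$ is essentially GLS.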

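Note that

$$Euu' = E(\alpha \otimes l + \epsilon)(\alpha \otimes l + \epsilon)' = E[\alpha\alpha' \otimes ll' + \epsilon\epsilon'] = \sigma_\alpha^2\, I_N \otimes ll' + \sigma_\epsilon^2\, I_N \otimes I_T = \sigma_\epsilon^2\,\Omega,$$

where

$$\Omega = I_N \otimes \left[ I_T + T\frac{\sigma_\alpha^2}{\sigma_\epsilon^2}\,\frac{1}{T}\,ll' \right] = I_N \otimes \left[ I_T + T\frac{\sigma_\alpha^2}{\sigma_\epsilon^2}\,P_T \right], \qquad P_T = \frac{1}{T}\,ll'.$$

To get $\Omega^{-1}$, use

$$[C + xy']^{-1} = C^{-1} - \frac{1}{1 + y'C^{-1}x}\,C^{-1}xy'C^{-1},$$

where $C$ is $m \times m$ and $x$ and $y$ are $m \times 1$. (This is also useful for updating OLS when an additional observation comes in.)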
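The rank-one inverse formula above can be checked numerically. The following is a minimal sketch, assuming NumPy; the function name `sherman_morrison_update` and the simulated regressors are illustrative and not part of the lecture. It updates $(X'X)^{-1}$ after one extra observation arrives and compares the result with a direct inversion.

```python
import numpy as np

def sherman_morrison_update(C_inv, x, y):
    """Return (C + x y')^{-1} given C^{-1}, for vectors x and y of length m."""
    x = x.reshape(-1, 1)
    y = y.reshape(-1, 1)
    denom = 1.0 + (y.T @ C_inv @ x).item()
    return C_inv - (C_inv @ x @ y.T @ C_inv) / denom

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))   # 20 existing observations, 3 regressors (simulated)
x_new = rng.normal(size=3)     # regressors of one additional observation

XtX_inv = np.linalg.inv(X.T @ X)
updated = sherman_morrison_update(XtX_inv, x_new, x_new)
direct = np.linalg.inv(X.T @ X + np.outer(x_new, x_new))
print(np.allclose(updated, direct))
```

The update costs $O(m^2)$ operations instead of the $O(m^3)$ of a fresh inversion, which is why it is handy for recursive least squares.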

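Then

$$\left( I_T + \frac{\sigma_\alpha^2}{\sigma_\epsilon^2}\,ll' \right)^{-1} = I_T - \frac{\sigma_\alpha^2/\sigma_\epsilon^2}{1 + T\sigma_\alpha^2/\sigma_\epsilon^2}\,ll' = I_T - \frac{T\sigma_\alpha^2}{\sigma_\epsilon^2 + T\sigma_\alpha^2}\,\frac{ll'}{T} = I_T - (1 - \theta^2)\,P_T = Q_T + \theta^2 P_T,$$

where

$$\theta^2 = \frac{\sigma_\epsilon^2}{\sigma_\epsilon^2 + T\sigma_\alpha^2} \qquad \text{and} \qquad Q_T = I_T - P_T.$$

So $\Omega^{-1} = I_N \otimes (Q_T + \theta^2 P_T)$, and

$$\beta_R = (X'\Omega^{-1}X)^{-1}(X'\Omega^{-1}y) = \left( X'\left[I_N \otimes (Q_T + \theta^2 P_T)\right]X \right)^{-1}\left( X'\left[I_N \otimes (Q_T + \theta^2 P_T)\right]y \right).$$

Obviously

$$\mathrm{Var}(\beta_R) = \sigma_\epsilon^2\,(X'\Omega^{-1}X)^{-1} = \sigma_\epsilon^2 \left( X'\left[I_N \otimes (Q_T + \theta^2 P_T)\right]X \right)^{-1}.$$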
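As a sanity check on the closed form for $\Omega^{-1}$, here is a small numerical sketch, assuming NumPy. The helper name `re_omega_blocks` and the variance values are hypothetical choices for illustration only.

```python
import numpy as np

def re_omega_blocks(T, sigma2_alpha, sigma2_eps):
    """Build one T x T block of Omega and its claimed inverse Q_T + theta^2 P_T."""
    P_T = np.ones((T, T)) / T
    Q_T = np.eye(T) - P_T
    theta2 = sigma2_eps / (sigma2_eps + T * sigma2_alpha)
    omega_block = np.eye(T) + T * (sigma2_alpha / sigma2_eps) * P_T
    omega_inv_block = Q_T + theta2 * P_T
    return omega_block, omega_inv_block, theta2

N, T = 3, 4
omega_block, omega_inv_block, theta2 = re_omega_blocks(T, sigma2_alpha=2.0, sigma2_eps=0.5)

# Full NT x NT matrices via the Kronecker product with I_N.
Omega = np.kron(np.eye(N), omega_block)
Omega_inv = np.kron(np.eye(N), omega_inv_block)
print(np.allclose(Omega @ Omega_inv, np.eye(N * T)))
```

The check works because $P_T$ and $Q_T$ are orthogonal projections, so $(I_T + cP_T)(Q_T + \theta^2 P_T) = Q_T + \theta^2(1 + c)P_T$ with $1 + c = 1/\theta^2$.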

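Fixed Effect, Within Estimator: $\beta_W$

Use within-group variation only: demean the data first, then run OLS.

$$Q_T y_i = Q_T X_i \beta + Q_T u_i.$$

Note that $Q_T u_i = Q_T \epsilon_i$, because $Q_T l = 0$.

$$\beta_W = \left( X'(I_N \otimes Q_T)X \right)^{-1}\left( X'(I_N \otimes Q_T)y \right) = \left( \sum_i X_i'Q_T X_i \right)^{-1}\left( \sum_i X_i'Q_T y_i \right),$$

$$\mathrm{Var}(\beta_W) = \sigma_\epsilon^2 \left( X'(I_N \otimes Q_T)X \right)^{-1} = \sigma_\epsilon^2 \left( \sum_i X_i'Q_T X_i \right)^{-1}.$$

$\beta_W$ cannot estimate the coefficients on time-invariant regressors. The coefficients on time-varying regressors are consistently estimated whether or not $\alpha_i$ is correlated with $X_i$, given the strict exogeneity of $\epsilon_{it}$ assumed above.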
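The "demean, then run OLS" recipe can be sketched as follows, assuming NumPy and a balanced panel whose rows are stacked unit by unit. The function name `within_estimator` and the simulated data are illustrative. To make the check exact, $\epsilon_{it}$ is set to zero and the regressors are deliberately correlated with $\alpha_i$.

```python
import numpy as np

def within_estimator(y, X, N, T):
    """OLS on data demeaned within each unit; rows are ordered unit by unit."""
    k = X.shape[1]
    X3 = X.reshape(N, T, k)
    y2 = y.reshape(N, T)
    Xd = (X3 - X3.mean(axis=1, keepdims=True)).reshape(N * T, k)
    yd = (y2 - y2.mean(axis=1, keepdims=True)).reshape(N * T)
    beta_w, *_ = np.linalg.lstsq(Xd, yd, rcond=None)
    return beta_w

rng = np.random.default_rng(1)
N, T = 40, 5
beta = np.array([1.5, -0.7])
alpha = rng.normal(size=N)
# Regressors are correlated with alpha_i on purpose.
X = rng.normal(size=(N * T, 2)) + np.repeat(alpha, T)[:, None]
# epsilon_it is set to zero so that beta is recovered up to rounding error.
y_noiseless = X @ beta + np.repeat(alpha, T)
beta_w = within_estimator(y_noiseless, X, N, T)

# The same estimator written with the Kronecker formula.
Q_T = np.eye(T) - np.ones((T, T)) / T
Q = np.kron(np.eye(N), Q_T)
beta_w_kron = np.linalg.solve(X.T @ Q @ X, X.T @ Q @ y_noiseless)
print(beta_w, beta_w_kron)
```

Demeaning avoids forming the $NT \times NT$ matrix $I_N \otimes Q_T$, which matters once $N$ is large.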

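Between Estimator: $\beta_B$

Use only the variation in group means (between-group variation).

$$P_T y_i = P_T X_i \beta + P_T u_i.$$

Note that $P_T u_i = \alpha_i l + P_T \epsilon_i$.

$$\beta_B = \left( X'(I_N \otimes P_T)X \right)^{-1}\left( X'(I_N \otimes P_T)y \right) = \left( \sum_i X_i'P_T X_i \right)^{-1}\left( \sum_i X_i'P_T y_i \right) = \left( \sum_{i=1}^N \bar{x}_i \bar{x}_i' \right)^{-1}\left( \sum_{i=1}^N \bar{x}_i \bar{y}_i \right),$$

$$\mathrm{Var}(\beta_B) = T\sigma_{\bar{u}}^2 \left( X'(I_N \otimes P_T)X \right)^{-1} = T\left( \sigma_\alpha^2 + \frac{\sigma_\epsilon^2}{T} \right)\left( X'(I_N \otimes P_T)X \right)^{-1}.$$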
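The equality between the group-means form and the Kronecker form of $\beta_B$ can be illustrated with a short sketch, assuming NumPy. All data are simulated and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, k = 25, 3, 2
X = rng.normal(size=(N * T, k))          # rows stacked unit by unit
alpha = rng.normal(size=N)
y = X @ np.array([0.5, 2.0]) + np.repeat(alpha, T) + rng.normal(size=N * T)

# Between estimator from group means (no intercept, as in the formula above).
x_bar = X.reshape(N, T, k).mean(axis=1)  # N x k matrix of group means
y_bar = y.reshape(N, T).mean(axis=1)
beta_b_means = np.linalg.solve(x_bar.T @ x_bar, x_bar.T @ y_bar)

# The same estimator with P = I_N kron P_T.
P = np.kron(np.eye(N), np.ones((T, T)) / T)
beta_b_kron = np.linalg.solve(X.T @ P @ X, X.T @ P @ y)
print(beta_b_means, beta_b_kron)
```

The two agree because $X_i'P_T X_i = T\,\bar{x}_i\bar{x}_i'$ and $X_i'P_T y_i = T\,\bar{x}_i\bar{y}_i$, so the factor $T$ cancels.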

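The random effect estimator is a matrix-weighted combination of $\beta_W$ and $\beta_B$. Writing $I$ for $I_N$ inside the Kronecker products,

$$\beta_R = \left( X'\left[I \otimes (Q_T + \theta^2 P_T)\right]X \right)^{-1}\left[ X'(I \otimes Q_T)y + \theta^2 X'(I \otimes P_T)y \right]$$

$$= \left( X'\left[I \otimes (Q_T + \theta^2 P_T)\right]X \right)^{-1} X'(I \otimes Q_T)X \,\left( X'(I \otimes Q_T)X \right)^{-1} X'(I \otimes Q_T)y$$

$$\quad + \left( X'\left[I \otimes (Q_T + \theta^2 P_T)\right]X \right)^{-1} \theta^2 X'(I \otimes P_T)X \,\left( X'(I \otimes P_T)X \right)^{-1} X'(I \otimes P_T)y$$

$$= \Delta\,\beta_W + (I - \Delta)\,\beta_B,$$

for

$$\Delta = \left( X'\left[I \otimes (Q_T + \theta^2 P_T)\right]X \right)^{-1} X'(I \otimes Q_T)X, \qquad I - \Delta = \left( X'\left[I \otimes (Q_T + \theta^2 P_T)\right]X \right)^{-1} \theta^2 X'(I \otimes P_T)X.$$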
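The decomposition of $\beta_R$ into within and between parts is an exact algebraic identity, so it can be verified on any data set. The sketch below assumes NumPy, simulated data, and a value of $\theta^2$ that is simply taken as known; the weight matrix is called `Delta` here.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, k = 30, 4, 2
theta2 = 0.3   # assumed known for this illustration

P_T = np.ones((T, T)) / T
Q_T = np.eye(T) - P_T
Q = np.kron(np.eye(N), Q_T)
P = np.kron(np.eye(N), P_T)

X = rng.normal(size=(N * T, k))
alpha = rng.normal(size=N)
y = X @ np.array([1.0, -2.0]) + np.repeat(alpha, T) + rng.normal(size=N * T)

A_w = X.T @ Q @ X   # within cross-product matrix
A_b = X.T @ P @ X   # between cross-product matrix
beta_w = np.linalg.solve(A_w, X.T @ Q @ y)
beta_b = np.linalg.solve(A_b, X.T @ P @ y)
beta_r = np.linalg.solve(A_w + theta2 * A_b, X.T @ (Q + theta2 * P) @ y)

# Matrix weight on the within estimator.
Delta = np.linalg.solve(A_w + theta2 * A_b, A_w)
combo = Delta @ beta_w + (np.eye(k) - Delta) @ beta_b
print(np.allclose(beta_r, combo))
```

As $\theta^2 \to 0$ (large $T$ or large $\sigma_\alpha^2$) the weight `Delta` tends to the identity and $\beta_R$ approaches $\beta_W$; with $\theta^2 = 1$ it reduces to pooled OLS.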

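You need neither $\sigma_\alpha^2$ nor $\sigma_\epsilon^2$ to compute either $\beta_W$ or $\beta_B$. The only place where $\sigma_\alpha^2$ and $\sigma_\epsilon^2$ are needed for $\beta_R$ is to calculate $\theta$.

$$\hat\sigma_\epsilon^2 = \frac{1}{N(T-1)} \sum_{i=1}^N \left\| Q_T y_i - Q_T X_i \beta_W \right\|^2 \quad \text{consistently estimates } \sigma_\epsilon^2,$$

$$\hat\sigma_{\bar{u}}^2 = \frac{1}{N} \sum_{i=1}^N \left( \bar{y}_i - \bar{x}_i'\beta_B \right)^2 \quad \text{consistently estimates } \sigma_{\bar{u}}^2 = \sigma_\alpha^2 + \frac{1}{T}\sigma_\epsilon^2.$$

Reminder: if you first-difference the data to get rid of the fixed effect $\alpha_i$ and then run LS, you don't get the same result as $\beta_W$, except in the simple case where $T = 2$. But if you run GLS on the differenced data, the result is the same as $\beta_W$.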
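The reminder about first differencing can be illustrated numerically. The sketch below assumes NumPy, simulated data, and $T = 4$; `D` is the $(T-1) \times T$ first-difference matrix and `pooled` is a hypothetical helper. The differenced errors $D\epsilon_i$ have covariance $\sigma_\epsilon^2 DD'$, so GLS weights by $(DD')^{-1}$, and $D'(DD')^{-1}D$ is the projection off $l$, which is $Q_T$.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, k = 50, 4, 2
X = rng.normal(size=(N, T, k))
alpha = rng.normal(size=(N, 1))
y = X @ np.array([1.0, -1.0]) + alpha + rng.normal(size=(N, T))

D = np.eye(T)[1:] - np.eye(T)[:-1]          # row j computes y_{j+1} - y_j
Q_T = np.eye(T) - np.ones((T, T)) / T
M = D.T @ np.linalg.inv(D @ D.T) @ D        # GLS weighting of the differenced data

def pooled(W):
    """Solve (sum_i X_i' W X_i) b = sum_i X_i' W y_i for a T x T weight matrix W."""
    A = np.einsum('itk,ts,isl->kl', X, W, X)
    b = np.einsum('itk,ts,is->k', X, W, y)
    return np.linalg.solve(A, b)

beta_w = pooled(Q_T)           # within estimator
beta_fd_ols = pooled(D.T @ D)  # LS on first differences
beta_fd_gls = pooled(M)        # GLS on first differences
print(beta_w, beta_fd_gls, beta_fd_ols)
```

With $T = 2$, $D'D = 2Q_T$, which is why LS on first differences coincides with $\beta_W$ only in that case.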

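Time-invariant Regressors

Consider

$$y_{it} = x_{it}'\beta + z_i'\gamma + \alpha_i + \epsilon_{it}.$$

$\gamma$ can only be identified from between-group variation.

Case 1: $z_i$ uncorrelated with $\alpha_i$:

$$\bar{y}_i = \bar{x}_i'\beta + z_i'\gamma + \alpha_i + \bar\epsilon_i = \bar{x}_i'\beta_W + z_i'\gamma + \alpha_i + \bar\epsilon_i + \bar{x}_i'\beta - \bar{x}_i'\beta_W = \bar{x}_i'\beta_W + z_i'\gamma + \alpha_i + \hat\epsilon_i,$$

for $\hat\epsilon_i = \bar\epsilon_i + \bar{x}_i'\beta - \bar{x}_i'\beta_W$.

Then just estimate $\gamma$ by a second-step LS regression:

$$\hat\gamma = \left( \sum_{i=1}^N z_i z_i' \right)^{-1}\left( \sum_{i=1}^N z_i\left( \bar{y}_i - \bar{x}_i'\beta_W \right) \right) = \gamma + \left( \sum_{i=1}^N z_i z_i' \right)^{-1}\left( \sum_{i=1}^N z_i\left( \alpha_i + \bar\epsilon_i + \bar{x}_i'\beta - \bar{x}_i'\beta_W \right) \right) \xrightarrow{p} \gamma,$$

since $\beta_W \xrightarrow{p} \beta$.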
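The two-step procedure for Case 1 can be sketched as follows, assuming NumPy and a balanced panel stacked unit by unit. The function name `two_step_gamma` and the simulated design are illustrative. To keep the check exact, $\alpha_i$ and $\epsilon_{it}$ are set to zero, so both steps should recover the true coefficients up to rounding error.

```python
import numpy as np

def two_step_gamma(y, X, Z, N, T):
    """Step 1: within estimator of beta. Step 2: LS of (ybar_i - xbar_i' beta_W) on z_i."""
    k = X.shape[1]
    X3 = X.reshape(N, T, k)
    y2 = y.reshape(N, T)
    x_bar = X3.mean(axis=1)
    y_bar = y2.mean(axis=1)
    Xd = (X3 - x_bar[:, None, :]).reshape(N * T, k)
    yd = (y2 - y_bar[:, None]).reshape(N * T)
    beta_w, *_ = np.linalg.lstsq(Xd, yd, rcond=None)
    gamma_hat, *_ = np.linalg.lstsq(Z, y_bar - x_bar @ beta_w, rcond=None)
    return beta_w, gamma_hat

rng = np.random.default_rng(4)
N, T = 60, 4
beta, gamma = np.array([1.0, 2.0]), np.array([0.5, -1.5])
X = rng.normal(size=(N * T, 2))
Z = rng.normal(size=(N, 2))   # one row of time-invariant regressors per unit
# alpha_i and epsilon_it are zero here so the two steps can be checked exactly.
y = X @ beta + np.repeat(Z @ gamma, T)
beta_w, gamma_hat = two_step_gamma(y, X, Z, N, T)
print(beta_w, gamma_hat)
```

With nonzero $\alpha_i$ uncorrelated with $z_i$, the second step stays consistent but is no longer exact in finite samples.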

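Case 2: some $z_i$ endogenous (correlated with $\alpha_i$). Consider:

$x_{1it}$: $k_1$ of them, exogenous.
$z_{1i}$: $g_1$ of them, exogenous.
$x_{2it}$: $k_2$ of them, endogenous.
$z_{2i}$: $g_2$ of them, endogenous.

You need at least $k_1 \geq g_2$ in order to estimate $\gamma$. Use $A = (X_{1it} : Z_{1i})$ to instrument the equation

$$\bar{y}_i - \bar{x}_i'\beta_W = z_i'\gamma + \left( \alpha_i + \bar\epsilon_i + \bar{x}_i'\beta - \bar{x}_i'\beta_W \right),$$

so that $\hat\gamma_{IV}$ is defined as

$$\hat\gamma_{IV} = (Z'P_A Z)^{-1}\left( Z'P_A(\bar{y} - \bar{X}\beta_W) \right),$$

where $P_A$ is the projection matrix onto the column space of $A$.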

Recall the equation in the entire sample:

$$y_{it} = x_{it}'\beta + z_i'\gamma + \alpha_i + \epsilon_{it}.$$

$\hat\gamma_{IV}$ gives a consistent but inefficient estimate of $\gamma$. The inefficiency comes from two sources.

(1) Not making use of all instruments. An efficient set of instruments is given by

$$A = (QX_1, \; QX_2, \; PX_1, \; Z_1), \qquad Q = I_N \otimes Q_T, \quad P = I_N \otimes P_T.$$

The reason that $QX_2$ can be used as an instrument is that the within-group variation is always uncorrelated with $\alpha_i$:

$$E\,X_2'Q(\alpha \otimes l) = E\,X_2'\,0 = 0.$$

Note also that $X_1$ has been used as two sets of IVs: $QX_1$ is used to instrument $X_1$, while $PX_1$ is used to instrument $Z$.

In contrast to the standard simultaneous equation model, where you need EXCLUDED exogenous variables to instrument INCLUDED endogenous variables, here you use INCLUDED exogenous variables to instrument INCLUDED endogenous variables. This is a special feature of the time invariance of the fixed effect $\alpha_i$.

(2) Ignoring heteroscedasticity, that is, the non-scalar covariance of the composite error. Recall that

$$\mathrm{Var}(\alpha \otimes l + \epsilon) = \sigma_\epsilon^2\, I_N \otimes \left( I_T + \frac{1 - \theta^2}{\theta^2}\,P_T \right) = \sigma_\epsilon^2\,\Omega,$$

where you can estimate $\sigma_\epsilon^2$ and $\theta^2$ using the consistent $\hat\gamma_{IV}$.

So the efficient IV estimator described in Hausman and Taylor (1981) is obtained by applying IV with instruments $A = (QX_1, QX_2, PX_1, Z_1)$ to the transformed equation

$$\hat\Omega^{-1/2}y = \hat\Omega^{-1/2}X\beta + \hat\Omega^{-1/2}Z\gamma + \hat\Omega^{-1/2}(\alpha \otimes l + \epsilon).$$

Lagged Dependent Variable

Recall

$$y_{it} = x_{it}'\beta + \alpha_i + \epsilon_{it},$$

where it is assumed that $E(\epsilon_i \mid x_i) = 0$, or $E\,\epsilon_{it}x_{is} = 0$ for all $t, s$.

For $\beta_W$ to be consistent, it is crucially important that no lagged $y_{it}$ appears in $x_{it}$. Adding lagged INdependent variables is never a problem:

$$y_{it} = x_{it}'\beta_1 + x_{it-1}'\beta_2 + \alpha_i + \epsilon_{it}.$$

The problem comes once you have a single lagged $y_{it}$ appearing on the right-hand side:

$$y_{it} = x_{it}'\beta_1 + y_{it-1}\beta_2 + \alpha_i + \epsilon_{it},$$

$$y_{it-1} = x_{it-1}'\beta_1 + y_{it-2}\beta_2 + \alpha_i + \epsilon_{it-1}.$$

Now $\beta_W$ is no longer consistent, since even after you difference out $\alpha_i$,

$$y_{it} - y_{it-1} = (x_{it} - x_{it-1})'\beta_1 + (y_{it-1} - y_{it-2})\beta_2 + \epsilon_{it} - \epsilon_{it-1},$$

LS or GLS on this differenced equation can't be consistent:

$$E\left( (y_{it-1} - y_{it-2})(\epsilon_{it} - \epsilon_{it-1}) \right) \neq 0 \quad \Longleftarrow \quad E\,y_{it-1}\epsilon_{it-1} \neq 0.$$

But since $E\,y_{it-2}(\epsilon_{it} - \epsilon_{it-1}) = 0$, you can use $y_{it-2}$ to instrument $y_{it-1} - y_{it-2}$.

The more lagged $y$'s you add to the regressors, the more lagged $y$'s you will need as instruments. In other words, the panel needs to be long enough relative to the number of lagged $y$'s in the regression, no matter how big the cross section is.
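A small Monte Carlo sketch can illustrate the contrast between LS on the differenced equation and instrumenting with $y_{it-2}$. It assumes NumPy, a pure AR(1) panel without $x_{it}$, i.i.d. standard normal $\epsilon_{it}$ and $\alpha_i$, and a stationary initial condition; these design choices and the function names are assumptions made for this illustration, not part of the lecture.

```python
import numpy as np

def simulate_ar1_panel(N, T, beta, rng):
    """y_it = beta * y_{it-1} + alpha_i + eps_it, started from the stationary distribution."""
    alpha = rng.normal(size=N)
    y = np.empty((N, T + 1))
    y[:, 0] = alpha / (1 - beta) + rng.normal(size=N) / np.sqrt(1 - beta**2)
    for t in range(1, T + 1):
        y[:, t] = beta * y[:, t - 1] + alpha + rng.normal(size=N)
    return y

def fd_ols_and_iv(y):
    """Estimate beta from the differenced equation by LS and by IV with y_{it-2}."""
    dy = y[:, 2:] - y[:, 1:-1]        # y_it - y_{it-1}
    dy_lag = y[:, 1:-1] - y[:, :-2]   # y_{it-1} - y_{it-2}
    z = y[:, :-2]                     # instrument y_{it-2}
    ols = np.sum(dy_lag * dy) / np.sum(dy_lag ** 2)
    iv = np.sum(z * dy) / np.sum(z * dy_lag)
    return ols, iv

rng = np.random.default_rng(5)
beta_true = 0.5
y = simulate_ar1_panel(N=50_000, T=5, beta=beta_true, rng=rng)
ols, iv = fd_ols_and_iv(y)
print(f"true beta = {beta_true}, differenced LS = {ols:.3f}, IV = {iv:.3f}")
```

In this design the LS estimate on differences is pulled far below the true value, while the IV estimate should land close to it; the exact numbers depend on the seed.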

This assumes that there is no serial correlation in the $\epsilon_{it}$'s. If $\epsilon_{it}$ is MA(1), $\epsilon_{it} = \rho u_{it-1} + u_{it}$, then

$$E\,y_{it-2}\epsilon_{it-1} = E\,y_{it-2}(\rho u_{it-2} + u_{it-1}) = \rho\,E\,y_{it-2}u_{it-2} \neq 0,$$

so $y_{it-3}$ instead of $y_{it-2}$ can be used to instrument

$$y_{it} - y_{it-1} = (x_{it} - x_{it-1})'\beta_1 + (y_{it-1} - y_{it-2})\beta_2 + \epsilon_{it} - \epsilon_{it-1}.$$

If $\epsilon_{it}$ is AR(p), then there is nothing you can do with lagged $y$'s, unless you exclude some $x_i$ (e.g., lead and lagged values) from the regressors so that they can be used as instruments.

So a single $y_{it-1}$ as a regressor is sufficient to bring up all these problems created by serial correlation in the $\epsilon_{it}$'s, while lagged INdependent variables $x_{it}$ alone do not.

Incidental Parameter

Consider estimating the coefficient on a lagged dependent variable:

$$y_{i1} = \beta y_{i0} + \alpha_i + \epsilon_{i1},$$

$$y_{i2} = \beta y_{i1} + \alpha_i + \epsilon_{i2}.$$

If $T$ is fixed, the MLE for $\beta$ is not consistent. Also, we can't estimate the nuisance parameters $\alpha_i$, $i = 1, \dots, N$, consistently.

Assume $\epsilon_{it} \sim N(0, 1)$. The log-likelihood function is

$$\text{Const} - \frac{1}{2}\sum_{i=1}^N \left[ (y_{i1} - \beta y_{i0} - \alpha_i)^2 + (y_{i2} - \beta y_{i1} - \alpha_i)^2 \right].$$

First concentrate out $\alpha_i$ by taking the first-order condition:

$$\hat\alpha_i = \frac{y_{i1} + y_{i2}}{2} - \hat\beta\,\frac{y_{i1} + y_{i0}}{2}.$$

Put this back to get the concentrated log-likelihood and simplify. Up to an additive constant and a positive factor of proportionality, it is

$$-\sum_{i=1}^N \left( y_{i2} - y_{i1} - \hat\beta\,(y_{i1} - y_{i0}) \right)^2.$$

Maximizing it is just regressing $y_{i2} - y_{i1}$ on $y_{i1} - y_{i0}$, which is exactly $\beta_W$, and we know that $\beta_W$ is inconsistent when the lagged dependent variables $y_{i1}$ and $y_{i0}$ appear as regressors.
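Both claims can be illustrated with a short sketch, assuming NumPy. The function names are hypothetical, and the simulation assumes standard normal $\alpha_i$ and $\epsilon_{it}$ with a stationary initial condition for $y_{i0}$. Under that particular design, a back-of-the-envelope calculation gives a probability limit of $(\beta - 1)/2$ for the regression of $y_{i2} - y_{i1}$ on $y_{i1} - y_{i0}$, which is far from $\beta$.

```python
import numpy as np

def simulate(N, beta, rng):
    """Two-period AR(1) panel with fixed effects and a stationary start (assumed design)."""
    alpha = rng.normal(size=N)
    y0 = alpha / (1 - beta) + rng.normal(size=N) / np.sqrt(1 - beta**2)
    y1 = beta * y0 + alpha + rng.normal(size=N)
    y2 = beta * y1 + alpha + rng.normal(size=N)
    return y0, y1, y2

def beta_concentrated(y0, y1, y2):
    """Regress y2 - y1 on y1 - y0 without intercept (maximizer of the concentrated likelihood)."""
    d1, d2 = y1 - y0, y2 - y1
    return np.sum(d1 * d2) / np.sum(d1 ** 2)

def beta_joint_mle(y0, y1, y2):
    """Maximize the Gaussian likelihood over beta and all alpha_i jointly (dummy-variable LS)."""
    N = len(y0)
    lagged = np.concatenate([y0, y1])
    dummies = np.vstack([np.eye(N), np.eye(N)])   # one dummy per unit, both periods
    design = np.column_stack([lagged, dummies])
    coef, *_ = np.linalg.lstsq(design, np.concatenate([y1, y2]), rcond=None)
    return coef[0]

rng = np.random.default_rng(6)
beta_true = 0.5

# Small sample: concentrating out alpha_i gives the same beta as the joint maximization.
small = simulate(30, beta_true, rng)
b_conc_small = beta_concentrated(*small)
b_joint_small = beta_joint_mle(*small)

# Large N with T fixed: the estimate does not approach beta_true.
b_conc_large = beta_concentrated(*simulate(200_000, beta_true, rng))
print(b_conc_small, b_joint_small, b_conc_large)
```

Increasing $N$ only tightens the estimate around the wrong limit, since each new unit brings a new nuisance parameter $\alpha_i$ estimated from just two observations.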