Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE

Similar documents
Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data

Econometrics of Panel Data

Panel Data Models. Chapter 5. Financial Econometrics. Michael Hauser WS17/18 1 / 63

Advanced Econometrics

Multiple Equation GMM with Common Coefficients: Panel Data

Linear dynamic panel data models

1 Estimation of Persistent Dynamic Panel Data. Motivation

Dealing With Endogeneity

Dynamic Panel Data Models

Panel Data Econometrics

Economics 582 Random Effects Estimation

Dynamic panel data methods

Panel Data Model (January 9, 2018)

Dynamic Panels. Chapter Introduction Autoregressive Model

Lecture 8 Panel Data

Topic 10: Panel Data Analysis

Chapter 2. Dynamic panel data models

Short T Panels - Review

ECONOMETRICS II (ECO 2401) Victor Aguirregabiria (March 2017) TOPIC 1: LINEAR PANEL DATA MODELS

Econometrics I. Ricardo Mora

Non-linear panel data modeling

Lecture 6: Dynamic panel models 1

Applied Microeconometrics (L5): Panel Data-Basics

Linear Panel Data Models

Panel Data: Linear Models

Dynamic Panel Data Workshop. Yongcheol Shin, University of York University of Melbourne

Econometrics of Panel Data

Panel Data Exercises Manuel Arellano. Using panel data, a researcher considers the estimation of the following system:

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics of Panel Data

Econ 582 Fixed Effects Estimation of Panel Data

EC327: Advanced Econometrics, Spring 2007

Specification testing in panel data models estimated by fixed effects with instrumental variables

Instrumental Variables and GMM: Estimation and Testing. Steven Stillman, New Zealand Department of Labour

PANEL DATA RANDOM AND FIXED EFFECTS MODEL. Professor Menelaos Karanasos. December Panel Data (Institute) PANEL DATA December / 1

Consistent OLS Estimation of AR(1) Dynamic Panel Data Models with Short Time Series

Lecture 10: Panel Data

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

Econometrics of Panel Data

Instrumental Variables, Simultaneous and Systems of Equations

Panel Data Models. James L. Powell Department of Economics University of California, Berkeley

Econometric Analysis of Cross Section and Panel Data

Econometrics. Week 6. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

10 Panel Data. Andrius Buteikis,

Robust Standard Errors to spatial and time dependence in. state-year panels. Lucciano Villacorta Gonzales. June 24, Abstract

Efficient Estimation of Dynamic Panel Data Models: Alternative Assumptions and Simplified Estimation

Økonomisk Kandidateksamen 2004 (I) Econometrics 2. Rettevejledning

Asymptotic Properties and simulation in gretl

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments

Dynamic Panel Data Models

Econometric Methods and Applications II Chapter 2: Simultaneous equations. Econometric Methods and Applications II, Chapter 2, Slide 1

Applied Quantitative Methods II

Short Questions (Do two out of three) 15 points each

Econometrics of Panel Data

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16)

Linear Regression with Time Series Data

Linear models. Linear models are computationally convenient and remain widely used in. applied econometric research

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Quantile Regression for Dynamic Panel Data

Casuality and Programme Evaluation

Heteroskedasticity. We now consider the implications of relaxing the assumption that the conditional

1 Introduction The time series properties of economic series (orders of integration and cointegration) are often of considerable interest. In micro pa

Lecture 4: Linear panel models

Instrumental Variables

Strength and weakness of instruments in IV and GMM estimation of dynamic panel data models

Lecture 4: Heteroskedasticity

Cluster-Robust Inference

Economics 472. Lecture 10. where we will refer to y t as a m-vector of endogenous variables, x t as a q-vector of exogenous variables,

Econ 510 B. Brown Spring 2014 Final Exam Answers

Instrumental Variables and the Problem of Endogeneity

Chapter 6: Endogeneity and Instrumental Variables (IV) estimator

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43

Lecture 7: Dynamic panel models 2

Fixed Effects Models for Panel Data. December 1, 2014

Econometrics Summary Algebraic and Statistical Preliminaries

ECON Introductory Econometrics. Lecture 16: Instrumental variables

Econometrics of Panel Data

Chapter 1: A Brief Review of Maximum Likelihood, GMM, and Numerical Tools. Joan Llull. Microeconometrics IDEA PhD Program

Linear Regression with Time Series Data

Regression with time series

Intermediate Econometrics

Regression and Statistical Inference

Notes on Panel Data and Fixed Effects models

We begin by thinking about population relationships.

INTRODUCTION TO BASIC LINEAR REGRESSION MODEL

Bias Correction Methods for Dynamic Panel Data Models with Fixed Effects

Econometric Methods for Panel Data

Weak Instruments and the First-Stage Robust F-statistic in an IV regression with heteroskedastic errors

8. Instrumental variables regression

4.8 Instrumental Variables

1 Outline. 1. Motivation. 2. SUR model. 3. Simultaneous equations. 4. Estimation

Linear Regression with Time Series Data

Linear Regression. Junhui Qian. October 27, 2014

A Course in Applied Econometrics Lecture 4: Linear Panel Data Models, II. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

Max. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes

Transcription:

Chapter 6. Panel Data Joan Llull Quantitative Statistical Methods II Barcelona GSE

Introduction Chapter 6. Panel Data 2

Panel data The term panel data refers to data sets with repeated observations over time (or other dimensions) for a given cross-section of units. Units can be persons, households, rms, countries,... It is dierent from repeated cross-sections. Main advantages of panel data: Permanent unobserved heterogeneity. Dynamic responses and error components. Chapter 6. Panel Data 3

Cigarette Consumption We will use the same example throughout the chapter. Consider the estimation of the demand for cigarettes: ln C it = β 0 + β 1 ln P it + β 2 ln Y it + η i + v it, where: ln C it is log consumption of cigarettes, ln P it is log price of the cigarettes, ln Y it is log income of the individual, η i + v it is unobserved. Chapter 6. Panel Data 4

Static Models Chapter 6. Panel Data 5

General notation We consider the following model: y it = x itβ + (η i + v it ). where y it and x it are observed, and η i + v it is unobserved. Let {y it, x it } t=1,...,t i=1,...,n be our sample. We dene: y i y y i1. y it y 1. y N, X i =, X = x i1. x it X 1. X N, η i = η i ι T, and v i =, η = where ι T is a size T vector of ones. η 1. η N, and v = Hence, we can use the following compact notation: v i1. v it v 1. v N y i = X i β + (η i + v i ), and y = Xβ + (η + v). Chapter 6. Panel Data 6,,

General assumptions for static models For static models, we assume: Fixed eects: E[x it η i ] 0 or random eects: E[x it η i ] = 0. Strict exogeneity: E[x it v is ] = 0 s, t. This assumption rules out eects of past v is on current x it (e.g. x it cannot include lagged dependent variables). Error components: E[η i ] = E[v it ] = E[η i v it ] = 0. Serially uncorrelated shocks: E[v it v is ] = 0 s t. Homoskedasticity and i.i.d. errors: η i iid(0, σ 2 η) and v it iid(0, σ 2 v), which does not aect any crucial result, but simplies some derivations. Chapter 6. Panel Data 7

Pooled OLS A simple approach: dene: u η + v and estimate β by OLS: ˆβ OLS = (X X) 1 X y. The properties of ˆβ OLS depend on E[x it η i ], as E[x it v it ] = 0: If E[x itη i] = 0 E[x ju j] = 0 (random eects): E[x it v it ]=0 ˆβ OLS is consistent as N, or T, or both. it is ecient only if σ 2 η = 0. If E[x itη i] 0 E[x ju j] 0 (xed eects): ˆβ OLS is inconsistent as N, or T, or both. cross-section results are also inconsistent (but panel helps in constructing a consistent alternative). Chapter 6. Panel Data 8

Pooled OLS in Cigarette Demand In our previous example, we would redene our regression as: ln C j = β 0 + β 1 ln P j + β 2 ln Y j + u j, where I used subindex j to emphasize that each observation it is considered as one independent observation. Potential problems: preferences, education, OLS ln Prices (β 1 ) -0.083 (0.015) ln Income (β 2 ) -0.032 (0.006) parental smoking behavior,... Chapter 6. Panel Data 9

The xed eects model. Within groups estimation. Chapter 6. Panel Data 10

Within groups estimator Write the model in deviations from individual means, ỹ it y it ȳ i, where ȳ i T 1 T t=1 y it: ỹ it = (x it x i ) β + (η i η i ) + (v it v i ) = x itβ + ṽ it. Given the previous assumptions: E[ x it ṽ it ] = 0. Therefore, OLS on the transformed model: ˆβ W G = ( X X) 1 X ỹ, is a consistent estimator either if E[x it η i ] 0 or E[x it η i ] = 0. Strict exogeneity is a crucial assumption (see next slide). Chapter 6. Panel Data 11

The role of strict exogeneity In the case where N and T is xed, consistency depends on strict exogeneity. To see it, recall that: x it = x it 1 T (x i1 +... + x it ) and ṽ it = v it 1 T (v i1 +... + v it ). Therefore E[ x it ṽ it ] = 0 requires E[x it v is ] = 0 s, t unless T. This has motivated the development of dynamic panel data models, to relax this assumption. Chapter 6. Panel Data 12

Pros and cons of within groups estimator Advantage: consistent either if E[x it η i ] 0 or E[x it η i ] = 0. Limitations: Not ecient: When N but T is xed, less ecient that e.g. ˆβGLS if E[x it η i ] = 0. If E[x it η i ] 0, it is only ecient when all regressors are correlated with η i. It does not allow to identify coecients for time-invariant regressors, and identication is through switchers. Chapter 6. Panel Data 13

Within Groups in Cigarette Demand In our example: OLS WG ln Prices (β 1 ) -0.083 (0.015) -0.292 (0.023) ln Income (β 2 ) -0.032 (0.006) 0.107 (0.019) Chapter 6. Panel Data 14

Least Squares Dummy Variables The Within Groups estimator can also be obtained by including a set of N individual dummy variables: y it = x itβ + η 1 D 1i +... + η N D Ni + v it, where D hi = 1{h = i} (e.g. D 1i takes the value of 1 for the observations on individual 1 and 0 for all other observations). OLS estimation of this model gives numerically equivalent estimates to WG (that's why ˆβ W G is a.k.a. ˆβLSDV ). This gives intuition on why WG is not very ecient if there is only limited time-series variation (degrees of freedom are N T K N = N(T 1) K). Chapter 6. Panel Data 15

LSDV in Cigarette Demand We can generate individual (rm) dummies and estimate by OLS, to check that it delivers the same results: OLS WG LSDV ln Prices (β 1) -0.083-0.292-0.292 (0.015) (0.023) (0.023) ln Income (β 2) -0.032 0.107 0.107 (0.006) (0.019) (0.019) Individual 1 (β 0 + η 1) 2.804 (0.288) Individual 2 (β 0 + η 2) 3.455 (0.398) Individual 3 (β 0 + η 3) 2.891 (0.416) Individual 4 (β 0 + η 4) 2.908 (0.384) Individual 5 (β 0 + η 5) 3.490 (0.433) Individual 6 (β 0 + η 6) 2.092 (0.325) Individual 7 (β 0 + η 7) 1.769 (0.393)...... Chapter 6. Panel Data 16

First-Dierenced Least Squares Another transformation we can consider is rst dierences: y it = x itβ + v it, for i = 1,..., N; t = 2,..., T where y it = y it y it 1. Takes out time-invariant individual eects ( η i = η i η i = 0), so OLS on the dierenced model is consistent. Consistency requires E[ x it v it ] = 0 which is implied by but weaker than strict exogeneity. WG more ecient than FDLS under classical assumptions. FDLS more ecient if v it random walk ( v it = ε it iid(0, σ 2 ε)). Chapter 6. Panel Data 17

FDLS in Cigarette Demand To get FDLS we generate st dierences and estimate by OLS: OLS WG LSDV FDLS ln Prices (β 1 ) -0.083-0.292-0.292-0.413 (0.015) (0.023) (0.023) (0.035) ln Income (β 2 ) -0.032 0.107 0.107 0.178 (0.006) (0.019) (0.019) (0.055) Individual 1 (β 0 + η 1 ) 2.804 (0.288) Individual 2 (β 0 + η 2 ) 3.455 (0.398) Individual 3 (β 0 + η 3 ) 2.891 (0.416) Individual 4 (β 0 + η 4 ) 2.908 (0.384) Individual 5 (β 0 + η 5 ) 3.490 (0.433) Individual 6 (β 0 + η 6 ) 2.092 (0.325) Chapter 6. Panel Data Individual 7 (β 0 + η 7 ) 1.769 18

The random eects model. Error components. Chapter 6. Panel Data 19

Uncorrelated eects Now we assume uncorrelated or random eects: E[x it η i ] = 0. In this case, OLS is consistent, but not ecient. The ineciency is provided by the serial correlation induced by η i : E[u it u is ] = E[(η i + v it )(η i + v is )] = E[η 2 i ] = σ 2 η. The variance of the unobservables (under classical assumptions) is: E[u 2 it] = E[η 2 i ] + E[v 2 it] = σ 2 η + σ 2 v. Chapter 6. Panel Data 20

Error structure Therefore, the variance-covariance matrix of the unobservables is: ση 2 + σv 2 ση 2... σ 2 η E[u i u ση 2 ση 2 + σv 2... σ 2 η i] =...... = Ω i, ση 2 ση 2... ση 2 + σv 2 whose dimensions are T T, and E[u i u h ] = 0 i h, or: Ω 1 0... 0 E[uu 0 Ω 2... 0 ] =...... = Ω, 0 0... Ω N which is block-diagonal with dimension NT NT. Chapter 6. Panel Data 21

Generalized Least Squares Under the classical assumptions, GLS (Balestra-Nerlove) estimator is consistent and ecient if E[x it η i ] = 0: ˆβ GLS = ( X Ω 1 X ) 1 X Ω 1 y. If E[x it η i ] 0 GLS is inconsistent as N and T is xed. This estimator is unfeasible because we do not know σ 2 η and σ 2 v. Chapter 6. Panel Data 22

Theta-dierencing ˆβ GLS is equivalent to OLS on the theta-dierenced model: y it = x it β + u it, where: and: y it = y it (1 θ)ȳ i, σ 2 v θ 2 = σv 2 + T ση 2. Consistency relies on E[x it η i ] = 0 (as η i not eliminated). Two special cases: If σ 2 η = 0 (i.e. no individual eect), OLS is ecient. If T, then θ 0, and y it ỹ it = y it ȳ i: WG is ecient. Chapter 6. Panel Data 23

Feasible GLS ˆβ GLS is unfeasible because we do not know ση 2 and σv. 2 A consistent estimator of σv 2 is provided by the WG residuals: ˆṽ it ỹ it x ˆβ it W G ˆσ 2 v = ˆṽ ˆṽ N(T 1) K Then, a consistent estimator of σ 2 η is given by the BG residuals: ȳ i = x iβ + η i + v i, i = 1,..., N ˆβ BG ( ˆσ ū 2 = ση 2 + 1 ) T σ2 v ˆū i ȳ i x i ˆβ BG = ˆū ˆū N K ˆσ2 η = ˆσ 2 ū 1 T ˆσ2 v. Chapter 6. Panel Data 24

Feasible GLS in Cigarette Demand In our example, if we now estimate ˆβ F GLS, we get: OLS WG FGLS ln Prices (β 1 ) -0.083-0.292-0.122 (0.015) (0.023) (0.014) ln Income (β 2 ) -0.032 0.107-0.012 (0.006) (0.019) (0.004) Chapter 6. Panel Data 25

Testing for correlated individual eects. Chapter 6. Panel Data 26

Testing for correlated eects (Hausman test) ˆβ W G is consistent regardless of E[x it η i ] being equal to zero or not. ˆβ F GLS is consistent only if E[x it η i ] = 0. we can test whether both estimates are similar! The Hausman test does exactly this comparison: h = ˆq [avar(ˆq)] 1 ˆq a χ 2 (K), under the null hypothesis E[x it η i ] = 0, where: ˆq = ˆβ W G ˆβ F GLS, and: ( ) ( ) avar(ˆq) = avar ˆβ W G avar ˆβ F GLS. Requires classical assumptions (FGLS to be more ecient than WG). Chapter 6. Panel Data 27

Hausman test in Cigarette Demand Output from software packages often includes the Hausman test. In our example: Statistic P-value Hausman test 24.661 0.000 Chapter 6. Panel Data 28

Dynamic Models Chapter 6. Panel Data 29

Autoregressive models with individual eects Chapter 6. Panel Data 30

Autorregressive panel data model We consider the following model: y it = αy it 1 + η i + v it α < 1. Other regressors can be included, but main results unaected. We assume: Error components: E[η i ] = E[v it ] = E[η i v it ] = 0. Serially uncorrelated shocks: E[v it v is ] = 0 s t. Predetermined initial cond.: E[y i0 v it ] = 0 t = 1,..., T. Chapter 6. Panel Data 31

Properties of pooled OLS and WG estimators Even assuming E[y it 1v it] = 0, still OLS delivers: because E[y it 1η i] = σ 2 η ( plim ˆα OLS > α, N 1 α t 1 1 α ) + α t 1 E[y i0η i] > 0. Doing the within groups transformation we see that: because E[ỹ it 1ṽ it] = Aσ 2 v < 0. plim ˆα W G < α, N ( ( ( A = (1 α) 1+T 1 α t 1 α T 1 t)) +αt (1 α T 1) ) T 2 (1 α) 2 WG bias vanishes as T (bias not small even with T = 15). Supposedly consistent estimators that give ˆα >> α OLS or ˆα << ˆα W G should be seen with suspicion. Chapter 6. Panel Data 32

Anderson and Hsiao Consider the model in rst dierences: y it = α y it 1 + v it. OLS in rst dierences is inconsistent: E[ y it 1 v it ] = σv 2 < 0. However, y it 2 or y it 2 are valid instruments for y it 1 : Relevance: E[ y it 2 y it 1] 0, E[y it 2 y it 1] 0. Orthogonality: E[ y it 2 v it] = E[y it 2 v it] = 0. Anderson and Hsiao (1981) proposed this 2SLS estimators: ) 1 ˆα AH = ( y 1 y 1 y 1 y, where: y 1 = Z (Z Z) 1 Z y 1, where Z can be y 2 or y 2. Requires min. three periods (T = 2 and y i0 ). Only ecient if T = 2. Chapter 6. Panel Data 33

AR(1) Cigarette Demand (No Covariates) In our example, we redene the model to be an AR(1) process (for now without regressors): ln C it = α ln C it 1 + η i + v it. The Anderson-Hsiao results (together with OLS and WG) are: OLS WG Anderson- Hsiao Lagged cons. (ln C it 1 ) 0.982 0.884 1.395 (0.003) (0.061) (0.090) Chapter 6. Panel Data 34

Dierenced GMM estimation. Chapter 6. Panel Data 35

GMM in 3 slides (I): the setup GMM nds parameter estimates that come as close as possible to satisfy orthogonality conditions (or moment conditions) in the sample. E.g., consider K regressors x i and L instruments z i : u i = y i f(x i, β) z i = g(x i ). The model species L moment conditions: E[z i u i ] = 0. Sample analogue: b N (β) = 1 N N z i u i (β). i=1 Chapter 6. Panel Data 36

GMM in 3 slides (II): the estimation Two possible cases cases: L = K (just identied): b N ( ˆβ GMM ) = 0. L > K (overidentied): ˆβGMM = arg min β b N (β) W N b N (β). W N is a positive denite weighting matrix. Optimal GMM (ecient) uses ( 1 N N i=1 E[z iu i u i z i ] ) 1, the inverse of the variance-covariance matrix, as weighting matrix. Chapter 6. Panel Data 37

GMM in 3 slides (III): asymptotic properties ˆβ GMM is a consistent estimator of β. It is asymptotically normal, with the following variance: where: avar( ˆβ GMM ) = (D W D) 1 D W S 0 W D(D W D) 1, b N (β) D plim N β, W plim W N, N S 0 1 N N E[z i u i u iz i]. i=1 Chapter 6. Panel Data 38

Moment conditions Given previous assumptions, several moment conditions: Equation Instruments Orthogonality cond. y i2 = α y i1 + v i2 y i0 E[ v i2 y i0 ] = 0 [ ( )] yi0 y i3 = α y i2 + v i3 y i0, y i1 E v i3 = 0 y i1 y i0 y i4 = α y i3 + v i4 y i0, y i1, y i2 E v i4 y i1 = 0 y i2. y it = α y it 1 + v it y i0, y i1, y i2,..., y it 2 E.. v it y i0 y i1 y i2. y it 2 = 0 We end up with (T 1)T/2 moment conditions. Chapter 6. Panel Data 39

Moment conditions in matrix form We can write these moment conditions as E[Z i v i] = 0, where: y i0 0 0 0... 0 0... 0 0 y i0 y i1 0... 0 0... 0 Z i =................... 0 0 0 0... y i0 y i1... y it 2 v i2 v i3 and vi =., v it and the sample analogue is: b N (α) = 1 N N Z i v i(α). Hence, the GMM estimator (proposed by Arellano and Bond, 1991) is: ( ) ( ) 1 N ˆα GMM = arg min v 1 N α i(α)z i W N Z i v i(α) = N N i=1 i=1 i=1 = ( y 1 ZWN Z y 1 ) 1 y 1 ZWN Z y. Chapter 6. Panel Data 40

Optimal weighting matrix The optimal weighting matrix (ecient GMM) is: ( 1 1 N W N = E[Z i v i v iz i]). N The sample analogue is obtained in a two-step procedure: i=1 ( ) 1 1 N W N = [Z i v N i(ˆα) v i (ˆα)Zi]. i=1 Windeijer (2005) proposes a nite sample correction of the variance that accounts for α being estimated. The most common one-step (and rst-step) matrix uses the structure of E[ v i v i]: 2 1 0 0... 0 E[ v i v i] = σv 2 1 2 1 0... 0.......... 0 0 0 0... 2 Chapter 6. Panel Data 41

GMM in Cigarette Demand GMM results in our example are: Coecient Standard Error Least Squares (OLS) 0.982 (0.003) Within Groups (WG) 0.884 (0.061) Anderson-Hsiao 1.395 (0.090) One-step GMM 1.023 (0.104) Two-step GMM 0.994 (0.040) Two-step GMM small sample 0.994 (0.121) Chapter 6. Panel Data 42

Potential limitations of Arellano-Bond Weak instruments: When α 1, relevance of the instrument decreases. Instruments are still valid, but have poor small sample properties. Monte Carlo evidence shows that with α > 0.8, estimator behaves poorly unless huge samples available. There are alternatives in the literature. Overtting: Too many instruments if T relative to N is relatively large. We might want to restrict the number of instruments used. It is always good practice to check robustness to dierent combinations of instruments. Chapter 6. Panel Data 43

Dynamic linear model Once we include regressors, the model is: y it = αy it 1 + x itβ + η i + v it α < 1. We maintain the previous assumptions: error components, serially uncorrelated shocks, and predetermined initial conditions. Therefore, moment conditions of the kind: are still valid. E[y it s v it ] = 0 t = 2,..., T, s 2, Chapter 6. Panel Data 44

Assumptions on regressors Dierent assumptions regarding x it will enable dierent additional orthogonality conditions: x it can be correlated or uncorrelated with η i. x it can endogenous, predetermined, or strictly exogenous with respect to v it. For instance, if assumptions are analogous to those for y it 1, we may use x it 1 (and longer lags) as instruments: y i0 x i0 x i1... 0... 0 0... 0 Z i =......................... 0 0 0... y i0... y it 2 x i0... x it 1 Chapter 6. Panel Data 45

Dynamic Cigarette Demand with Regressors In our example, we rewrite the employment equations as follows: ln C it = α ln C it 1 + β 1 ln P it + β 2 ln Y it + η i + v it. Results are: OLS WG GMM Lagged dep (α) 0.947 0.528 0.495 (0.011) (0.064) (0.127) ln Prices (β 1 ) 0.010-0.501-0.607 (0.006) (0.098) (0.143) ln Income (β 2 ) 0.049 0.369 0.338 (0.011) (0.044) (0.051) Chapter 6. Panel Data 46

System GMM estimation Chapter 6. Panel Data 47

Additional orthogonality conditions Recall our (T 1)T/2 moment conditions: E[y it s v it ] = 0 t = 2,..., T ; s 2. System GMM (Arellano and Bover, 1995) uses the assumption E[y i0 η i ] = η i 1 α : E[ y it η i ] = 0, t or, alternatively: E[ y it s u it ] = 0, u it = η i + v it, s = 1,..., T 1. Chapter 6. Panel Data 48

The System GMM estimator Analogously to the rst-dierenced GMM, the estimator is given by E[(Z ) u i ] = 0: where: ˆα Sys GMM = ( (X ) Z W N (Z ) X ) 1 X Z W N (Z ) y, ( ) ( ) ( ) Zi Zi 0... 0 = 0, u vi i =, X y 1i i = y i1... y it 1 η i + v it y it 1 ( ) and y yi i = y it This estimator is more ecient, as it uses additional moment conditions. It reduces small sample bias, especially when α 1. Chapter 6. Panel Data 49

System-GMM in Cigarette Demand GMM results in our example are: Coecient Standard Error Least Squares (OLS) 0.982 (0.003) Within Groups (WG) 0.884 (0.061) Anderson-Hsiao 1.395 (0.090) One-step GMM 1.023 (0.104) Two-step GMM 0.994 (0.040) Two-step GMM small sample 0.994 (0.121) One-step System-GMM 0.926 (0.023) Two-step System-GMM small 0.911 (0.032) Chapter 6. Panel Data 50

Specication tests Chapter 6. Panel Data 51

Specication tests There are several relevant aspects for the validity of the estimation that can be tested formally. Orthogonality conditions: are they small enough to not reject that they are zero (overidentifying restrictions). Validity of a subset of more restrictive assumptions (dierence Sargan test, Hausman test). Serial correlation in the data: vital for the validity of the instruments (Arellano-Bond test). Chapter 6. Panel Data 52

Sargan/Hansen overidentifying restrictions test The null hypothesis is whether the orthogonality conditions are satised (i.e. moments are equal to zero). The test can only be implemented if the problem is overidentied (otherwise the sample moments are exactly zero by construction). The test is: S = NJ N (β) = N 1 N ( N û 1 iz i N i=1 ) 1 N Z iû iû iz i i=1 1 N N Z iû i, i=1 where û are predicted residuals from the rst step and û are those predicted from the second stage, and: S a χ 2 (L K). Chapter 6. Panel Data 53

Testing stronger assumptions Some of the assumptions that we make are stronger than others. If the problem is overidentied, we can test whether results change if we include or exclude the orthogonality conditions generated by them. If they are true, increase eciency, but if not, inconsistent! Two ways of testing it: Overidentifying restrictions (dierence in Sargan should be close to zero: S = S S A a χ 2 (L L A )). Dierences in coecients (Hausman test: ˆq = ˆδ A GMM ˆδ GMM ). Chapter 6. Panel Data 54

Direct test for serial correlation The test was proposed by Arellano-Bond (1991). Tests for the presence of second order autocorrelation in the rst-dierenced residuals. If dierences in residuals are second-order correlated, some instruments would not be valid! The test is: m 2 = v 2 v se a N (0, 1), where v 2 is the second lagged residual in dierences, and v is the part of the vector of contemporaneous rst dierences for the periods that overlap with the second lagged vector. Values close to zero do not reject the hypothesis of no serial correlation. Chapter 6. Panel Data 55