Panel Data Models. James L. Powell Department of Economics University of California, Berkeley

Overview

Like Zellner's seemingly unrelated regression models, the dependent and explanatory variables for panel data models (or "pooled cross section and time series" models) are typically denoted using two (or more) subscripts, where the different subscripts indicate different characteristics of the variable, usually individual and time, but possibly location, group, etc. Some variants of the model ("fixed effects" models) can be viewed as special cases of the classical linear regression model, while others ("random effects" models) are special cases of the generalized regression model.

The simplest model assumes that the dependent variable $y_{it}$ satisfies a linear model with an intercept that is specific to individual $i$,

$$y_{it} = x_{it}'\beta + \alpha_i + \varepsilon_{it}, \qquad i = 1,\dots,N;\quad t = 1,\dots,T.$$

This is known as a balanced panel, since all individual observations are assumed observable for every time period (and vice versa); here, Kronecker product notation will be useful when using matrix notation for the data set. In contrast, clustered or grouped data models typically assume the number of observations per group $i$ can vary across groups, which in this notation would replace the common number of time periods $T$ with a group-specific number $T_i$ of individuals. As for seemingly unrelated equations, the number of time periods $T$ in most panel data applications is usually small relative to the number of individuals $N$.
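However, unlike SUR models, where it is more natural to stack observations by the second subscript first (that is, across individuals for each equation, and then by equation), for panel data models the usual convention is to stack observations in the opposite order of subscripts, that is, first collecting the observations across time for each individual as

$$\underset{(T\times 1)}{y_i} = \underset{(T\times K)}{X_i}\,\beta + \alpha_i\,\iota_T + \underset{(T\times 1)}{\varepsilon_i}$$

for $i = 1,\dots,N$, where $y_i$ and $\varepsilon_i$ are $T$-vectors and $X_i$ is a $T\times K$ matrix,

$$\underset{(T\times 1)}{y_i} = \begin{pmatrix} y_{i1}\\ y_{i2}\\ \vdots\\ y_{iT}\end{pmatrix},\qquad \underset{(T\times 1)}{\varepsilon_i} = \begin{pmatrix} \varepsilon_{i1}\\ \varepsilon_{i2}\\ \vdots\\ \varepsilon_{iT}\end{pmatrix},\qquad \underset{(T\times K)}{X_i} = \begin{pmatrix} x_{i1}'\\ x_{i2}'\\ \vdots\\ x_{iT}'\end{pmatrix},$$

and $\iota_T$ is a $T$-dimensional column vector of ones. Then, stacking the entire data set by individuals,

$$\underset{(NT\times 1)}{y} = \begin{pmatrix} y_1\\ y_2\\ \vdots\\ y_N\end{pmatrix},\qquad \underset{(NT\times 1)}{\varepsilon} = \begin{pmatrix} \varepsilon_1\\ \varepsilon_2\\ \vdots\\ \varepsilon_N\end{pmatrix},\qquad \underset{(NT\times K)}{X} = \begin{pmatrix} X_1\\ X_2\\ \vdots\\ X_N\end{pmatrix},$$

and defining

$$\underset{(N\times 1)}{\alpha} = \begin{pmatrix} \alpha_1\\ \alpha_2\\ \vdots\\ \alpha_N\end{pmatrix},$$

the data can be represented by the single (relatively simple) equation

$$y = X\beta + D\alpha + \varepsilon, \qquad\text{where}\qquad \underset{(NT\times N)}{D} \equiv I_N \otimes \iota_T.$$

In panel data (or clustered data) models, the individual intercept $\alpha_i$ is meant to control for the effect of unobservable regressors that are specific to individual $i$, so that

$$\alpha_i = w_i'\gamma,$$

where the unobservable, individual-specific regressors $w_i$ might include ability, intelligence, family background, ambition, etc. Different assumptions on the relationship of the observable regressors $x_{it}$ to the intercept term $\alpha_i$, and thus to the unobservable regressors $w_i$, yield different variations on the classical and generalized regression models.

With this notation, it is understood that the matrix $X$ does not include a column vector of ones, since otherwise a linear combination of the columns of the matrix $D$ of individual-specific dummy variables would yield this vector of ones,

$$D\,\iota_N = (I_N \otimes \iota_T)(\iota_N \otimes 1) = (I_N\iota_N)\otimes \iota_T = \iota_N \otimes \iota_T = \iota_{NT},$$

so that the combined matrix $[X, D]$ of regressors would not have full column rank.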
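The rank argument can be checked numerically. The following is a minimal sketch in NumPy, not part of the original notes; the dimensions `N`, `T`, `K` and the simulated regressor matrix `X` are illustrative assumptions.

```python
import numpy as np

# Illustrative (assumed) dimensions: N individuals, T periods, K regressors.
N, T, K = 4, 3, 2
rng = np.random.default_rng(0)

# D = I_N (Kronecker product) iota_T: one dummy column per individual,
# with observations stacked by individual first, then by time.
iota_T = np.ones((T, 1))
D = np.kron(np.eye(N), iota_T)            # shape (N*T, N)

# Simulated regressors without a column of ones (assumed data).
X = rng.normal(size=(N * T, K))

# The columns of D sum to the NT-vector of ones: D @ iota_N = iota_NT.
sums_to_ones = np.allclose(D @ np.ones(N), np.ones(N * T))

# [X, D] has full column rank, but adding a constant to X destroys it.
rank_Z = np.linalg.matrix_rank(np.hstack([X, D]))
rank_Z_const = np.linalg.matrix_rank(np.hstack([np.ones((N * T, 1)), X, D]))

print(sums_to_ones, rank_Z, rank_Z_const)
```

With a constant included, the regressor matrix has $K+N+1$ columns but its rank stays at $K+N$, which is the collinearity described in the text.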

Fixed Effect Model

When the individual intercepts $\alpha_i$ are treated as fixed constants, which can be arbitrarily related to the regression vectors $x_{it}$, the resulting model, known as the fixed effect model, can be viewed as a special case of the classical linear model under the usual assumptions that $X$ is nonrandom, $[X, D]$ is of full column rank, $E(\varepsilon) = 0$, and $V(\varepsilon) = \sigma^2 I_{NT}$. By the usual residual regression formulae, the classical LS estimator of the subvector $\beta$ of regression coefficients is

$$\hat\beta_{FE} = \left(\tilde X'\tilde X\right)^{-1}\tilde X'\tilde y,$$

where

$$\tilde y \equiv \left(I_{NT} - D(D'D)^{-1}D'\right)y$$

are the residuals of a regression of $y$ on $D$, with an analogous definition of $\tilde X$. (The subscript "FE" stands for "fixed effects.") By the special structure of the matrix $D$ (which has a lot of ones in it), it is straightforward to show that the subvector of $\tilde y$ corresponding to individual $i$ can be expressed as

$$\tilde y_i = \left(I_T - \iota_T(\iota_T'\iota_T)^{-1}\iota_T'\right)y_i = y_i - \bar y_{i\cdot}\,\iota_T,$$

where $\bar y_{i\cdot}$ is the average value of $y_{it}$ over $t$,

$$\bar y_{i\cdot} \equiv \frac{\iota_T'y_i}{\iota_T'\iota_T} = \frac{1}{T}\sum_{t=1}^{T} y_{it}.$$

This result can be interpreted as follows: from the original structural equation

$$y_{it} = x_{it}'\beta + \alpha_i + \varepsilon_{it}$$

it follows that

$$\bar y_{i\cdot} = \bar x_{i\cdot}'\beta + \alpha_i + \bar\varepsilon_{i\cdot},$$

and subtracting the second equation from the first yields

$$y_{it} - \bar y_{i\cdot} = (x_{it} - \bar x_{i\cdot})'\beta + (\alpha_i - \alpha_i) + (\varepsilon_{it} - \bar\varepsilon_{i\cdot}) = (x_{it} - \bar x_{i\cdot})'\beta + (\varepsilon_{it} - \bar\varepsilon_{i\cdot}),$$

so deviating $y_{it}$ and $x_{it}$ from their time-averages eliminates the fixed effect $\alpha_i$ from the structural equation, much as taking deviations from means eliminates the intercept term in a classical regression model (with intercept).
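The equivalence between the "within" regression and LS with a full set of individual dummies can be illustrated as follows. This is a minimal sketch on simulated data; the helper names `within_transform` and `fe_estimator`, the parameter values, and the data-generating process are assumptions made for illustration only.

```python
import numpy as np

def within_transform(a, N, T):
    """Subtract individual time-averages; rows are stacked by individual, then time."""
    a3 = np.asarray(a, dtype=float).reshape(N, T, -1)
    return (a3 - a3.mean(axis=1, keepdims=True)).reshape(N * T, -1)

def fe_estimator(y, X, N, T):
    """LS regression of demeaned y on demeaned X (the fixed effect estimator)."""
    beta, *_ = np.linalg.lstsq(within_transform(X, N, T),
                               within_transform(y, N, T), rcond=None)
    return beta.ravel()

# Simulated balanced panel (assumed design): regressors correlated with alpha_i.
rng = np.random.default_rng(1)
N, T, K = 50, 4, 2
alpha = rng.normal(size=N)
X = rng.normal(size=(N * T, K)) + np.repeat(alpha, T)[:, None]
beta_true = np.array([1.0, -0.5])
y = X @ beta_true + np.repeat(alpha, T) + 0.1 * rng.normal(size=N * T)

beta_fe = fe_estimator(y, X, N, T)

# Same coefficients from LS of y on [X, D] with D = I_N kron iota_T.
D = np.kron(np.eye(N), np.ones((T, 1)))
coef, *_ = np.linalg.lstsq(np.hstack([X, D]), y, rcond=None)
beta_lsdv = coef[:K]

print(beta_fe, beta_lsdv)
```

The two coefficient vectors agree up to floating-point error, which is the residual regression result in numerical form; demeaning avoids building the $NT\times N$ dummy matrix when $N$ is large.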

An alternative way to eliminate the fixed effect $\alpha_i$ uses first-differences over time, rather than deviations from time-averages:

$$\Delta y_{it} \equiv y_{it} - y_{i,t-1} = \Delta x_{it}'\beta + \Delta\alpha_i + \Delta\varepsilon_{it} = \Delta x_{it}'\beta + \Delta\varepsilon_{it},$$

for $i = 1,\dots,N$ and $t = 2,\dots,T$. When $T = 2$, LS regression of $\Delta y_{it}$ on $\Delta x_{it}$ gives identical results to LS regression of $y_{it} - \bar y_{i\cdot}$ on $x_{it} - \bar x_{i\cdot}$. When $T > 2$ the two LS regressions generally differ, because differencing induces serial correlation in $\Delta\varepsilon_{it}$; GLS applied to the differenced equation, which accounts for that correlation, reproduces $\hat\beta_{FE}$. Considering either interpretation, it is clear that the regression coefficients on any component of $x_{it}$ that is constant over time ($x_{it} \equiv x_{is}$) will be unidentified, since that component of either $\Delta x_{it}$ or $x_{it} - \bar x_{i\cdot}$ will be identically zero. It is only variation in the regressors across time for a given individual that allows the corresponding coefficient to be identified relative to the individual-specific, time-invariant intercept $\alpha_i$.

If the error terms $\varepsilon_{it}$ are i.i.d. and normally distributed with zero mean and variance $\sigma^2$, then the fixed effect estimator $\hat\beta_{FE}$ is also the maximum likelihood (ML) estimator of $\beta$, and the ML estimator of $\sigma^2$ is

$$\hat\sigma^2_{ML} = \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}\left(y_{it} - \bar y_{i\cdot} - (x_{it} - \bar x_{i\cdot})'\hat\beta_{FE}\right)^2.$$

This estimator is biased, and, if $N \to \infty$ for fixed $T$ (a reasonable approximation if $N$ is large relative to $T$), the ML estimator of $\sigma^2$ is also inconsistent, with

$$\hat\sigma^2_{ML} \to^p \frac{T-1}{T}\,\sigma^2.$$

This inconsistency of the ML estimator of $\sigma^2$ is a classic example of the nefarious "incidental parameters" problem described by Neyman and Scott. Because the number $N$ of intercept terms $\alpha_i$ increases to infinity as $N$ increases, and the corresponding ML estimator

$$\hat\alpha_i \equiv \bar y_{i\cdot} - \bar x_{i\cdot}'\hat\beta_{FE} \to^p \alpha_i + \bar\varepsilon_{i\cdot}$$

is inconsistent for $\alpha_i$, this inconsistency translates into inconsistency of the ML estimator $\hat\sigma^2_{ML}$. Fortunately, this doesn't cause inconsistency of $\hat\beta_{FE}$, and an unbiased and consistent estimator

$$s^2_{FE} = \frac{1}{N(T-1) - K}\sum_{i=1}^{N}\sum_{t=1}^{T}\left(y_{it} - \bar y_{i\cdot} - (x_{it} - \bar x_{i\cdot})'\hat\beta_{FE}\right)^2$$

of $\sigma^2$ is readily available.

It is worth mentioning that the "fixed effect" label does not mean that the regressors $x_{it}$ or intercept terms $\alpha_i$ must be viewed as nonrandom, or "fixed" in a statistical sense; they may indeed be viewed as random and jointly distributed, provided the conditions on $\varepsilon_{it}$ are assumed to hold conditional on the realizations of $X$ and $\alpha$. What characterizes a "fixed effect" model is that no structure on the relationship between $\alpha_i$ and $x_{it}$ is imposed.

Random Effect Model

A drawback of the fixed-effect model is its failure to identify any components of $\beta$ corresponding to regressors that are constant over time for a given individual; for such coefficients to be identified, stronger conditions on the relation of the individual-specific intercept $\alpha_i$ to the regressors $x_{it}$ must be imposed. The random effects model uses a simple but very strong assumption to restrict this relationship: namely, that the intercept $\alpha_i$ is a random variable which is not related to $x_{it}$ and $\varepsilon_{it}$, in the sense that

$$E(\alpha_i) = \alpha, \qquad Var(\alpha_i) = \sigma^2_\alpha, \qquad Cov(\alpha_i, \varepsilon_{it}) = 0,$$

all assumed independent of $x_{it}$. (These moments should be interpreted as conditional on $X$ if the regressors are viewed as random.) Relabeling the variance of $\varepsilon_{it}$ as $Var(\varepsilon_{it}) \equiv \sigma^2_\varepsilon$, the original panel data model can be rewritten as

$$y_{it} = x_{it}'\beta + \alpha + u_{it},$$

where $u_{it} \equiv (\alpha_i - \alpha) + \varepsilon_{it}$ has

$$E(u_{it}) = 0, \qquad Var(u_{it}) = \sigma^2_\alpha + \sigma^2_\varepsilon, \qquad Cov(u_{it}, u_{is}) = \sigma^2_\alpha \ \text{ for } t \neq s, \qquad Cov(u_{it}, u_{js}) = 0 \ \text{ if } i \neq j.$$

This implies that the model can be written in matrix form as

$$y = X\beta + \alpha\,\iota_{NT} + u,$$

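where

$$E(u) = 0, \qquad V(u) = \sigma^2_\alpha DD' + \sigma^2_\varepsilon I_{NT} \equiv \sigma^2_\varepsilon\,\Omega,$$

with

$$\Omega \equiv I_{NT} + \frac{\sigma^2_\alpha}{\sigma^2_\varepsilon}DD' = I_{NT} + \frac{\sigma^2_\alpha}{\sigma^2_\varepsilon}\left(I_N \otimes \iota_T\iota_T'\right).$$

In the unlikely event that the ratio $\theta \equiv \sigma^2_\alpha/\sigma^2_\varepsilon$ were known, so that $\Omega$ did not contain unknown parameters, Aitken's GLS estimator of $\beta$ and $\alpha$ would have the usual form

$$\begin{pmatrix}\hat\beta_{GLS}\\ \hat\alpha_{GLS}\end{pmatrix} = \left(Z'\Omega^{-1}Z\right)^{-1}Z'\Omega^{-1}y,$$

where $Z \equiv [X, \iota_{NT}]$. As with the fixed effects estimator, there are a couple of algebraically-equivalent representations of the GLS estimator $\hat\beta_{GLS}$ of the slope coefficients. One interpretation is the coefficients of a LS regression of $y^*_{it}$ on $x^*_{it}$, where

$$y^*_{it} \equiv y_{it} - \bar y_{i\cdot} + \omega\,(\bar y_{i\cdot} - \bar y_{\cdot\cdot}), \qquad x^*_{it} \equiv x_{it} - \bar x_{i\cdot} + \omega\,(\bar x_{i\cdot} - \bar x_{\cdot\cdot}),$$

for

$$\bar y_{\cdot\cdot} \equiv \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T} y_{it}, \qquad \bar x_{\cdot\cdot} \equiv \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T} x_{it},$$

and

$$\omega \equiv \sqrt{\frac{\sigma^2_\varepsilon}{T\sigma^2_\alpha + \sigma^2_\varepsilon}} = \sqrt{\frac{1}{T\theta + 1}},$$

where as above $\theta \equiv \sigma^2_\alpha/\sigma^2_\varepsilon$. As the variability of the random effect $\sigma^2_\alpha$ declines to zero for $\sigma^2_\varepsilon$ fixed, $\omega \to 1$, and the GLS estimator reduces to the usual LS regression of the deviations $y_{it} - \bar y_{\cdot\cdot}$ in the dependent variable from its mean value on the corresponding deviations $x_{it} - \bar x_{\cdot\cdot}$ in regressors. At the other extreme, as $\sigma^2_\alpha \to \infty$ for $\sigma^2_\varepsilon$ fixed, $\omega \to 0$ and the GLS estimator reduces to the fixed effect estimator. Using this result, we can obtain another interpretation of the GLS estimator as a matrix-weighted average

$$\hat\beta_{GLS} = A\hat\beta_{FE} + (I - A)\hat\beta_B$$

of the fixed effect estimator $\hat\beta_{FE}$ defined above and the "between" estimator

$$\hat\beta_B \equiv \left[\sum_{i=1}^{N}(\bar x_{i\cdot} - \bar x_{\cdot\cdot})(\bar x_{i\cdot} - \bar x_{\cdot\cdot})'\right]^{-1}\sum_{i=1}^{N}(\bar x_{i\cdot} - \bar x_{\cdot\cdot})(\bar y_{i\cdot} - \bar y_{\cdot\cdot})$$

of $\beta$ coming from the LS regression of the time-averages $\bar y_{i\cdot}$ on $\bar x_{i\cdot}$ and a constant. (The form of the matrix $A = A(\theta)$ is given in Ruud's textbook.)

When the variances $\sigma^2_\alpha$ and $\sigma^2_\varepsilon$ are unknown (as is always the case), an estimator of $\theta \equiv \sigma^2_\alpha/\sigma^2_\varepsilon$ or $\omega \equiv (1 + T\theta)^{-1/2}$ is needed to construct a "Feasible GLS" estimator. As noted above, the unbiased estimator $s^2_{FE}$ of $\sigma^2_\varepsilon$ based upon the fixed-effect estimator $\hat\beta_{FE}$ will be consistent; for fixed $T$, the corresponding variance estimator for the between estimator

$$s^2_B \equiv \frac{1}{N - K - 1}\sum_{i=1}^{N}\left((\bar y_{i\cdot} - \bar y_{\cdot\cdot}) - (\bar x_{i\cdot} - \bar x_{\cdot\cdot})'\hat\beta_B\right)^2$$

will be unbiased and consistent for $Var(\alpha_i + \bar\varepsilon_{i\cdot}) = \sigma^2_\alpha + \sigma^2_\varepsilon/T$. Hence

$$\hat\omega \equiv \sqrt{\frac{s^2_{FE}}{T s^2_B}} \to^p \omega,$$

and can be used in place of $\omega$ to construct a Feasible GLS estimator. Note that the corresponding estimator of $\sigma^2_\alpha$,

$$s^2_\alpha \equiv s^2_B - s^2_{FE}/T,$$

is not guaranteed to be positive.

Time Effects and Differences in Differences

In addition to the assumption that the intercept term varies across individuals $i$, it might also be reasonable to assume that it varies across time $t$; defining $\eta_t$ to be the time-specific intercept term, a generalization of the basic panel data model would be

$$y_{it} = x_{it}'\beta + \alpha_i + \eta_t + \varepsilon_{it}, \qquad i = 1,\dots,N;\quad t = 1,\dots,T,$$

which, in matrix form, might be written as

$$y = X\beta + D\alpha + R\eta + \varepsilon,$$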

where

$$\underset{(T\times 1)}{\eta} = \begin{pmatrix}\eta_1\\ \eta_2\\ \vdots\\ \eta_T\end{pmatrix} \qquad\text{and}\qquad \underset{(NT\times T)}{R} \equiv \iota_N \otimes I_T.$$

Since $D\iota_N = \iota_{NT} = R\iota_T$, the columns of the matrix of regressors $[X, D, R]$ would not be linearly independent, and a normalization on $\alpha$ or $\eta$ would need to be imposed, e.g., $\eta_1 \equiv 0$, which would imply that the first column of $R$ could be dropped, along with the first component of $\eta$. Treating the $\alpha$ and $\eta$ parameters as fixed effects, the corresponding fixed effect estimator of $\beta$ is obtained by a regression of $\tilde y_{it}$ on $\tilde x_{it}$, where

$$\tilde y_{it} \equiv y_{it} - \bar y_{i\cdot} - \bar y_{\cdot t} + \bar y_{\cdot\cdot}, \qquad\text{with}\qquad \bar y_{\cdot t} \equiv \frac{1}{N}\sum_{i=1}^{N} y_{it},$$

and corresponding definitions for $\tilde x_{it}$ and $\bar x_{\cdot t}$.

If $T$ is small (relative to $N$), the time-specific intercepts $\eta$ are typically treated as fixed, which implies that the coefficients of any regressors that are time-specific (i.e., do not vary across individuals) would be unidentified. If such coefficients are of interest, the $\eta$ coefficients can be treated as random effects, with corresponding variance component $\sigma^2_\eta$, along with the individual effects $\alpha$. The corresponding GLS estimator of $\beta$ would combine the fixed effects estimator with two different "between" estimators, one involving the regression of $\bar y_{i\cdot}$ on $\bar x_{i\cdot}$ and a constant and the other regressing $\bar y_{\cdot t}$ on $\bar x_{\cdot t}$ and a constant. The details are a bit messy, and can be found in many graduate texts.

A special case of the fixed effects model with individual- and time-specific effects is the so-called "differences in differences" (or "diffs in diffs") framework, a name more descriptive of the estimation method than the model itself. The simplest version of this model has $T = 2$ and a single time-varying regressor $x_{it}$ which is binary. Specifically, the $N$ individual observations are classified into two groups, the "controls" (for $i = 1,\dots,N_c$), for which $x_{it} \equiv 0$, and the "treated" ($i = N_c+1,\dots,N$), for which $x_{i1} = 0$ and $x_{i2} = 1$. The scalar coefficient $\beta$ is the "treatment effect," i.e., the change in the average value of the dependent variable between the pre-treatment ($t = 1$) and post-treatment ($t = 2$) periods that is attributable to the treatment. To repeat,

$$x_{it} = \begin{cases} 0 & \text{if } i = 1,\dots,N_c,\\ 0 & \text{if } i = N_c+1,\dots,N \text{ and } t = 1,\\ 1 & \text{if } i = N_c+1,\dots,N \text{ and } t = 2.\end{cases}$$

Writing the structural equation

$$y_{it} = x_{it}\beta + \alpha_i + \eta_t + \varepsilon_{it}, \qquad i = 1,\dots,N;\quad t = 1, 2,$$

and taking first differences yields

$$\Delta y_{i2} = y_{i2} - y_{i1} = \Delta x_{i2}\,\beta + \Delta\eta_2 + \Delta\varepsilon_{i2} = \begin{cases}\Delta\eta_2 + \Delta\varepsilon_{i2} & \text{if } i = 1,\dots,N_c,\\ \beta + \Delta\eta_2 + \Delta\varepsilon_{i2} & \text{if } i = N_c+1,\dots,N,\end{cases}$$

where $\Delta\eta_2 \equiv \eta_2 - \eta_1$. A classical LS regression of $\Delta y_{i2}$ on a constant and $\Delta x_{i2}$ yields

$$\hat\beta_{FE} = \Delta\bar y_2 - \Delta\bar y_1,$$

where

$$\Delta\bar y_1 \equiv \frac{1}{N_c}\sum_{i=1}^{N_c}(y_{i2} - y_{i1}) \qquad\text{and}\qquad \Delta\bar y_2 \equiv \frac{1}{N - N_c}\sum_{i=N_c+1}^{N}(y_{i2} - y_{i1})$$

are the average changes in $y_{it}$ for the control and treatment groups, respectively. So the estimate of the treatment effect is the difference in the average change in the dependent variable across the two groups. As $N_c$ and $N - N_c$ tend to infinity, it is easy to see that

$$\Delta\bar y_1 \to^p \Delta\eta_2, \qquad \Delta\bar y_2 \to^p \beta + \Delta\eta_2,$$

so the diffs-in-diffs estimator of the treatment effect is consistent. Note that, with this fixed-effects approach, there is no need to assume the treatment assignment (or "choice") is independent of the individual effect $\alpha_i$ or time effect $\eta_t$.
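The algebra of the diffs-in-diffs estimator can be illustrated with a small simulation. This is a hedged sketch, not part of the original notes: the sample sizes, the true $\beta$, the time effects, and the choice to make $\alpha_i$ correlated with treatment status are all assumptions made for illustration.

```python
import numpy as np

# Assumed design: N_c controls followed by N - N_c treated individuals, T = 2.
rng = np.random.default_rng(2)
N, N_c = 200, 120
beta = 2.0
eta = np.array([0.0, 0.7])                  # time effects, with eta_1 normalized to 0
treated = np.arange(N) >= N_c

# Individual effects deliberately correlated with treatment status.
alpha = rng.normal(size=N) + 1.5 * treated

x = np.zeros((N, 2))
x[treated, 1] = 1.0                          # treated only in period 2
eps = 0.5 * rng.normal(size=(N, 2))
y = x * beta + alpha[:, None] + eta[None, :] + eps

# First differences remove alpha_i.
dy = y[:, 1] - y[:, 0]
dx = x[:, 1] - x[:, 0]

# Difference in average changes between treated and controls.
did = dy[treated].mean() - dy[~treated].mean()

# LS regression of dy on a constant and dx gives the same slope.
coef, *_ = np.linalg.lstsq(np.column_stack([np.ones(N), dx]), dy, rcond=None)
beta_ls = coef[1]

print(did, beta_ls)
```

The two numbers coincide up to rounding, and both stay close to the assumed $\beta$ even though $\alpha_i$ differs systematically across groups, since differencing removes it before the comparison is made.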

Robust Covariance Estimation

As in other GLS applications, statistical inference with panel data can be sensitive to heteroskedasticity or serial correlation of the error terms. Writing the panel data model as

$$y_{it} = z_{it}'\theta + u_{it},$$

where $z_{it}'$ includes the row vector of regressors $x_{it}'$ and the relevant rows of the matrices $D$ and $R$ of dummy variables for the fixed effects (and $\theta$ now denotes the full coefficient vector, not the variance ratio of the random effects model), we may want to assume that the errors $u_{it}$ are uncorrelated across individuals $i$ but have arbitrary variance-covariance patterns over time $t$ for each individual $i$. That is, suppose

$$E(u_{it}) = 0, \qquad Cov(u_{it}, u_{is}) = \sigma_{i,ts}, \qquad Cov(u_{it}, u_{js}) = 0 \ \text{ if } i \neq j.$$

Stacking the observations in the usual matrix form

$$y = Z\theta + u,$$

it follows that

$$V(u) \equiv \Sigma = \begin{pmatrix}\Sigma_1 & 0 & \cdots & 0\\ 0 & \Sigma_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & \Sigma_N\end{pmatrix} \qquad\text{for}\qquad \underset{(T\times T)}{\Sigma_j} \equiv [\sigma_{j,ts}].$$

In the absence of a parametric form for $\Sigma_j$, it would be reasonable to use the classical least squares estimator

$$\hat\theta_{LS} = (Z'Z)^{-1}Z'y$$

of the $\theta$ coefficients (i.e., the slope coefficients $\beta$ and any individual or time fixed effects), which should be consistent and asymptotically normally distributed under fairly general conditions on $u$. As in other GLS applications, the trick is to find a consistent estimator for

$$\operatorname{plim}\, N\cdot V(\hat\theta_{LS}) = \operatorname{plim}\left(\frac{1}{N}Z'Z\right)^{-1}\left(\frac{1}{N}Z'\Sigma Z\right)\left(\frac{1}{N}Z'Z\right)^{-1},$$

the asymptotic covariance matrix of the LS estimator. The middle matrix is the tricky one; since

$$\frac{1}{N}Z'\Sigma Z \equiv \frac{1}{N}\sum_{i=1}^{N}\sum_{t=1}^{T}\sum_{s=1}^{T}\sigma_{i,ts}\,z_{it}z_{is}',$$

the same reasoning that led to the Huber-Eicker-White robust covariance matrix estimator yields

$$\frac{1}{N}Z'\hat\Sigma Z \equiv \frac{1}{N}\sum_{i=1}^{N}\sum_{t=1}^{T}\sum_{s=1}^{T}\hat u_{it}\hat u_{is}\,z_{it}z_{is}'$$

as a consistent estimator of the middle matrix of $\operatorname{plim}\, N\cdot V(\hat\theta_{LS})$, where $\hat u_{it}$ are the LS residuals

$$\hat u_{it} \equiv y_{it} - z_{it}'\hat\theta_{LS}.$$

This estimator will be consistent as $N \to \infty$ under conditions similar to those for consistency of the Huber-Eicker-White heteroskedasticity-robust covariance estimator. This robust covariance matrix estimator immediately extends to "clustered data" problems, where the number of time periods or group members $T_i$ depends upon the group index $i$; the formulae above are easily extended by changing $T$ to $T_i$ throughout.

Lagged Dependent Variables in Panel Data

A more serious inference problem arises when the regressors include a lagged dependent variable in models with fixed effects. Consider the special case with $T = 2$ and

$$y_{it} = x_{it}'\beta + \gamma\,y_{i,t-1} + \alpha_i + \varepsilon_{it},$$

where the $\{\alpha_i\}$ are considered fixed effects and $\varepsilon_{it}$ satisfies the usual Gauss-Markov assumptions. Assuming $y_{i0}$ is observable for all $i$ and considered a nonrandom starting value, the fixed-effect estimator of $\beta$ and $\gamma$ is obtained by a LS regression on the differenced model

$$\Delta y_{i2} = y_{i2} - y_{i1} = \Delta x_{i2}'\beta + \gamma\,\Delta y_{i1} + \Delta\varepsilon_{i2}.$$

But now

$$Cov(\Delta y_{i1}, \Delta\varepsilon_{i2}) = Cov\big((y_{i1} - y_{i0}),\,(\varepsilon_{i2} - \varepsilon_{i1})\big) = -Cov(y_{i1}, \varepsilon_{i1}) = -Var(\varepsilon_{i1}) \neq 0,$$

so the LS regression of $\Delta y_{i2}$ on $\Delta x_{i2}$ and $\Delta y_{i1}$ will yield biased and inconsistent estimators of $\beta$ and $\gamma$ in general. This problem is the same as the problem of inconsistency of LS with lagged dependent variables and serially-correlated errors; elimination of the fixed effect by differencing (or deviating from individual means) yields a differenced error term which is serially correlated, and thus related to the lagged dependent variable. A solution to the inconsistency of the LS estimator for this problem is the instrumental variables method to be discussed next.
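The size of this inconsistency can be seen in a simple simulation. The sketch below is an illustration under strong simplifying assumptions that are not in the original notes: there are no regressors $x_{it}$, the starting value is $y_{i0} = 0$, the $\alpha_i$ are drawn with mean zero and variance $\sigma^2_\alpha$, and the errors are normal. Under those assumptions the probability limit of the LS slope in the differenced regression works out to $\gamma - \sigma^2_\varepsilon/(\sigma^2_\alpha + \sigma^2_\varepsilon)$.

```python
import numpy as np

# Assumed design: no x_it, y_i0 = 0, T = 2, large N.
rng = np.random.default_rng(3)
N = 100_000
gamma, sigma_alpha, sigma_eps = 0.5, 1.0, 1.0

alpha = sigma_alpha * rng.normal(size=N)
eps1 = sigma_eps * rng.normal(size=N)
eps2 = sigma_eps * rng.normal(size=N)

y0 = np.zeros(N)                      # nonrandom starting value
y1 = gamma * y0 + alpha + eps1
y2 = gamma * y1 + alpha + eps2

# Differenced model: d_y2 = gamma * d_y1 + (eps2 - eps1).
d_y1 = y1 - y0
d_y2 = y2 - y1

# LS slope (no constant, matching the differenced equation).
gamma_hat = (d_y1 @ d_y2) / (d_y1 @ d_y1)

# Probability limit under the assumed design:
# gamma + Cov(d_y1, d_eps2) / E[d_y1^2], with Cov = -sigma_eps^2.
plim_gamma_hat = gamma - sigma_eps**2 / (sigma_alpha**2 + sigma_eps**2)

print(gamma_hat, plim_gamma_hat)
```

With these assumed values the LS estimate settles near $0$ rather than the true $\gamma = 0.5$, and increasing $N$ does not help, which is what inconsistency means here.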