Econometrics of Panel Data

Similar documents
Econometrics of Panel Data

Econometrics of Panel Data

Econometrics of Panel Data

Econometrics of Panel Data

Econometrics of Panel Data

Applied Microeconometrics (L5): Panel Data-Basics

Panel Data Models. Chapter 5. Financial Econometrics. Michael Hauser WS17/18 1 / 63

Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data

Advanced Econometrics

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Applied Econometrics Lecture 1

Applied Quantitative Methods II

Panel Data: Fixed and Random Effects

ECON 4551 Econometrics II Memorial University of Newfoundland. Panel Data Models. Adapted from Vera Tabakova s notes

An overview of applied econometrics

1 Estimation of Persistent Dynamic Panel Data. Motivation

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43

Non-linear panel data modeling

Topic 10: Panel Data Analysis

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Introductory Econometrics

10 Panel Data. Andrius Buteikis,

Econometrics Summary Algebraic and Statistical Preliminaries

Econometrics. Week 6. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Ordinary Least Squares Regression

Fixed Effects Models for Panel Data. December 1, 2014

INTRODUCTION TO BASIC LINEAR REGRESSION MODEL

Christopher Dougherty London School of Economics and Political Science

Econometric Analysis of Cross Section and Panel Data

Panel Data: Linear Models

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE

Applied Economics. Panel Data. Department of Economics Universidad Carlos III de Madrid

Econ 582 Fixed Effects Estimation of Panel Data

12E016. Econometric Methods II 6 ECTS. Overview and Objectives

Economics 582 Random Effects Estimation

Linear Panel Data Models

Intermediate Econometrics

Longitudinal Data Analysis Using SAS Paul D. Allison, Ph.D. Upcoming Seminar: October 13-14, 2017, Boston, Massachusetts

Introductory Econometrics

EC327: Advanced Econometrics, Spring 2007

Econometrics - 30C00200

Chapter 1 Introduction. What are longitudinal and panel data? Benefits and drawbacks of longitudinal data Longitudinal data models Historical notes

α version (only brief introduction so far)

Econometrics Master in Business and Quantitative Methods

Quantitative Analysis of Financial Markets. Summary of Part II. Key Concepts & Formulas. Christopher Ting. November 11, 2017

Advanced Quantitative Methods: panel data

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

Panel Data Model (January 9, 2018)

Panel Data Models. James L. Powell Department of Economics University of California, Berkeley

Økonomisk Kandidateksamen 2004 (I) Econometrics 2. Rettevejledning

Heteroskedasticity. Part VII. Heteroskedasticity

Simultaneous Equations with Error Components. Mike Bronner Marko Ledic Anja Breitwieser

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

Econometric Methods for Panel Data Part I

Notes on Panel Data and Fixed Effects models

1. The OLS Estimator. 1.1 Population model and notation

AN EVALUATION OF PARAMETRIC AND NONPARAMETRIC VARIANCE ESTIMATORS IN COMPLETELY RANDOMIZED EXPERIMENTS. Stanley A. Lubanski. and. Peter M.

Multiple Regression Analysis. Part III. Multiple Regression Analysis

Applied Econometrics (QEM)

Econometrics. 9) Heteroscedasticity and autocorrelation

Economics 671: Applied Econometrics Department of Economics, Finance and Legal Studies University of Alabama

Week 2: Pooling Cross Section across Time (Wooldridge Chapter 13)

Dynamic Panel Data Workshop. Yongcheol Shin, University of York University of Melbourne

Short T Panels - Review

Making sense of Econometrics: Basics

ECON The Simple Regression Model

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois

Dynamic Panel Data Ch 1. Reminder on Linear Non Dynamic Models

On The Comparison of Two Methods of Analyzing Panel Data Using Simulated Data

Week 11 Heteroskedasticity and Autocorrelation

Homoskedasticity. Var (u X) = σ 2. (23)

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

Syllabus. By Joan Llull. Microeconometrics. IDEA PhD Program. Fall Chapter 1: Introduction and a Brief Review of Relevant Tools

Sample Problems. Note: If you find the following statements true, you should briefly prove them. If you find them false, you should correct them.

Kausalanalyse. Analysemöglichkeiten von Paneldaten

ADVANCED ECONOMETRICS I. Course Description. Contents - Theory 18/10/2017. Theory (1/3)

ECONOMETRICS HONOR S EXAM REVIEW SESSION

the error term could vary over the observations, in ways that are related

Dynamic Panel Data Models

Two-Variable Regression Model: The Problem of Estimation

WISE International Masters

Econometric Analysis of Panel Data. Final Examination: Spring 2013

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

Nonstationary Panels

Test of hypotheses with panel data

The Multiple Linear Regression Model

New York University Department of Economics. Applied Statistics and Econometrics G Spring 2013

We can relax the assumption that observations are independent over i = firms or plants which operate in the same industries/sectors

Motivation for multiple regression

Econ 510 B. Brown Spring 2014 Final Exam Answers

Dealing With Endogeneity

PANEL DATA RANDOM AND FIXED EFFECTS MODEL. Professor Menelaos Karanasos. December Panel Data (Institute) PANEL DATA December / 1

Panel data methods for policy analysis

Applied Econometrics (QEM)

Simple Linear Regression: The Model

xtseqreg: Sequential (two-stage) estimation of linear panel data models

Linear dynamic panel data models

Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017

Please discuss each of the 3 problems on a separate sheet of paper, not just on a separate page!

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

Transcription:

Econometrics of Panel Data Jakub Mućk Meeting # 1 Jakub Mućk Econometrics of Panel Data Meeting # 1 1 / 31

Outline 1 Course outline 2 Panel data Advantages of Panel Data Limitations of Panel Data 3 Pooled OLS estimator 4 Fixed effects model The Least Squares Dummy Variable Estimator The Fixed Effect (Within Group) Estimator Jakub Mućk Econometrics of Panel Data Course outline Meeting # 1 2 / 31

Literature Basic: 1 Baltagi B. H., (2014), Econometric Analysis of Panel Data, 5th edition, Wiley. 2 Wooldridge J. M., (2010), Econometric Analysis of Cross Section and Panel Data, 2nd edition, The MIT Press. Additional: 1 Angrist J. D. and Pischke J.-S., (2009), Mostly Harmless Econometrics: An Empiricist s Companion, Princeton University Press. 2 Arellano M., (2004), Panel Data Econometrics. Advanced Texts in Econometrics, Oxford University Press. 3 Baltagi B. H. (ed.), (2015), The Oxford Handbook of Panel Data, Oxford University Press. 4 Cameron C. A. and Trivedi P. K., (2006),Microeconometrics: Methods and Applications, Cambridge University Press. 5 Hsiao Ch., (2014), Analysis of Panel Data, 3rd edition, Cambridge University Press. 6 Pesaran H. (ed.), (2015), Time Series and Panel Data Econometrics, Oxford University Press. Jakub Mućk Econometrics of Panel Data Course outline Meeting # 1 3 / 31

Topics 1 Panel data - basic definitions, characteristics, etc. 2 Linear static model - common types. Fixed and random effects approaches. 3 Fixed effects estimation. 4 Random effects estimation. 5 Verification and forecasting with the use of linear static models. 6 Basic Hausman-Taylor estimator, Swamy estimator. 7 Problem of missing observations, applications of Hausman s test. 8 IV and GIVE estimators in the estimation of static panel data models. 9 The use of IV and GIVE - developped HT method, endogeneity of variables. 10 Estimation of dynamic models - estimator of Arellano and Bond. 11 Limited dependent variables and panel data. 12 Estimating average treatment effects. 13 RE probit and FE logit models: estimation, verification, inference. 14 Spatial panel data. 15 Nonstationary panels. Jakub Mućk Econometrics of Panel Data Course outline Meeting # 1 4 / 31

Office hours: TBA. E-mail: jakub.muck@gmail.com. WWW: http://akson.sgh.waw.pl/ jm39985/ = english = teaching = Econometrics of Panel Data. Software: Stata (+ R). Exam rules: Exam: 60 %. Classes: 40% houseworks and activity. Jakub Mućk Econometrics of Panel Data Course outline Meeting # 1 5 / 31

Outline 1 Course outline 2 Panel data Advantages of Panel Data Limitations of Panel Data 3 Pooled OLS estimator 4 Fixed effects model The Least Squares Dummy Variable Estimator The Fixed Effect (Within Group) Estimator Jakub Mućk Econometrics of Panel Data Panel data Meeting # 1 6 / 31

Panel data Panel of data consists of a group of cross-section units (people, firms, states, countries) that are observed over the time: Cross-section: y i where i {1,..., N }. Time series: y t where t {1,..., T}. Panel data: y it where i {1,..., N } t {1,..., T}. In general, N - the cross-sectional dimension. T - the time dimension. Jakub Mućk Econometrics of Panel Data Panel data Meeting # 1 7 / 31

Panel data We might describe panel data using T and N: long/short describes the time dimension (T); wide/narrow describes the cross-section dimension (N ); For example: panel with relatively large N and T: long and wide panel. Jakub Mućk Econometrics of Panel Data Panel data Meeting # 1 8 / 31

Panel data We might describe panel data using T and N: long/short describes the time dimension (T); wide/narrow describes the cross-section dimension (N ); For example: panel with relatively large N and T: long and wide panel. In a balanced panel, each individuals(unit) has the same number of observation. Unbalanced panel is a panel in which the number of time series observations is different across units. Jakub Mućk Econometrics of Panel Data Panel data Meeting # 1 8 / 31

Examples of panel data Micro data PSID the Panel Study of Income Dynamics collected by the Institute for Social Research at the University of Michigan. NLS the National Longitudinal Surveys collected by the Bureau of Labor Statistics. CPS Current Population Survey conducted by the Bureau of Census (monthly frequency). Macro data Penn World Table, WDI Indicators. Sectoral data KLEMS database. Jakub Mućk Econometrics of Panel Data Panel data Meeting # 1 9 / 31

Advantages of Panel Data (Baltagi, 2014) Controlling for individual heterogeneity. Panel data offer more informative data, more variability, less collinearity among the dependent variables, more degrees of freedom and more efficiency in estimation. Identification and measurement of effects that are simply not detectable in pure cross-section or pure time-series data. Testing more complicated behavioral models than purely cross-section or time-series data. Reduction in biases resulting from aggregation over firms or individuals. Overcome the problem of nonstandard distributions typical of unit roots tests = macro panels. Jakub Mućk Econometrics of Panel Data Panel data Meeting # 1 10 / 31

Limitations of Panel Data Design and data collection problems: coverage; nonresponse; frequency of interviewing; Distortions of measurement errors Selectivity problems: self-selectivity; nonresponse; attrition; Short T. Cross-sectional dependence. Jakub Mućk Econometrics of Panel Data Panel data Meeting # 1 11 / 31

Two examples (Arellano, 2003) Classical example Agricultural Cobb-Douglas production function. Consider the following model: y it = βx it + u it + η i (1) yit the log output. xit the log of a variable output; ηi an farm-specific input that is constant over time, e.g., soil quality. uit a stochastic input that is outside framer s control, e.g., rainfalls. β - the technological parameter. Jakub Mućk Econometrics of Panel Data Panel data Meeting # 1 12 / 31

Two examples (Arellano, 2003) Classical example Agricultural Cobb-Douglas production function. Consider the following model: y it = βx it + u it + η i (1) yit the log output. xit the log of a variable output; ηi an farm-specific input that is constant over time, e.g., soil quality. uit a stochastic input that is outside framer s control, e.g., rainfalls. β - the technological parameter. An example in which panel data does not work Returns to education. Consider the following model: y it = α + βx it + u it (2) yit the log wage; xit years of the full-time education; β returns to education. In addition: u it = η i + ε it (3) where η i stands for the unobserved individual ability. Problem: x it lacks of time variation. Jakub Mućk Econometrics of Panel Data Panel data Meeting # 1 12 / 31

Outline 1 Course outline 2 Panel data Advantages of Panel Data Limitations of Panel Data 3 Pooled OLS estimator 4 Fixed effects model The Least Squares Dummy Variable Estimator The Fixed Effect (Within Group) Estimator Jakub Mućk Econometrics of Panel Data Pooled OLS estimator Meeting # 1 13 / 31

Pooled OLS estimator Pooled model is one where the data on different units are pooled together with no assumption on individual differences: where y it = α + β 1 x 1it +... + β k x kit + u it (4) yit the dependent variable; xkit the k th explanatory variable; uit the error/disturbance term; α the intercept; β1,..., β k the structural parameters; Note that the coefficients α, β 1,..., β k are the same for all unit (don t have i or t subscript). Jakub Mućk Econometrics of Panel Data Pooled OLS estimator Meeting # 1 14 / 31

Pooled OLS estimator Pooled model is one where the data on different units are pooled together with no assumption on individual differences: where y it = α + β 1 x 1it +... + β k x kit + u it (4) yit the dependent variable; xkit the k th explanatory variable; uit the error/disturbance term; α the intercept; β1,..., β k the structural parameters; Note that the coefficients α, β 1,..., β k are the same for all unit (don t have i or t subscript). Jakub Mućk Econometrics of Panel Data Pooled OLS estimator Meeting # 1 14 / 31

Pooled OLS estimator In vector form: y = Xβ + u (5) where y is NT 1, β is (K + 1) 1, X is NT K and u is NT 1 and: 1 x 11... x k1 y 1 α 1 x 12... x k2 X =...... and y = y 2. and β = β 1. 1 x 1N... x kn y N β K where x ij is the vector of observations of the jth independent variable for unit i over the time and y i is the vector of observation of the explained variable for unit i. Then: ˆβ POOLED = (X X) 1 X y = ˆα POOLED ˆβ POOLED 1. ˆβ POOLED K Jakub Mućk Econometrics of Panel Data Pooled OLS estimator Meeting # 1 15 / 31 (6)

Pooled OLS estimator Assumptions (for linear pooled model): E(u) = 0 (7) E(uu ) = σui 2 (8) rank(x) = K + 1 < NT (9) E(u X) = 0 (10) (10): X is nonstochastic and is not correlated with u. (8): the error term (u) is not autocorrelated and homoscedastic. (10) = strictly exogeneity of independent variables. Gauss-Markov Theorem If (7)-(10)and are satisfied then ˆβ POOLED is BLUE (the best linear unbiased estimator). Jakub Mućk Econometrics of Panel Data Pooled OLS estimator Meeting # 1 16 / 31

Empirical example Grunfeld s (1956) investment model: where I it = α + β 1 K it + β 2 F it + u it (11) I it the gross investment for firm i in year t; K it the real value of the capital stocked owned by firm i; F it is the real value of the firm (shares outstanding). The dataset consists of 10 large US manufacturing firms over 20 years (1935 54). Jakub Mućk Econometrics of Panel Data Pooled OLS estimator Meeting # 1 17 / 31

Empirical example Pooled OLS α 42.714 (9.512) [0.000] β 1 0.231 (0.025) [0.000] β 2 0.116 (0.006) [0.000] Note: The expressions in round and squared brackets stand for standard errors and probability values corresponding to the null about parameter s insignificance, respectively. I it = α + β 1 K it + β 2 F it + u it Jakub Mućk Econometrics of Panel Data Pooled OLS estimator Meeting # 1 18 / 31

Robust/ Clustered Standard Errors The general assumption in pooled regression on the error terms are very strong or even unrealistic. The lack of correlation between errors corresponding to the same individuals. Let s relax the above assumption: cov(u i,t, u i,s ) 0 (12) Then we have problem of both autocorrelation and heteroskedasticity. The OLS estimator is still consistent but the standard errors are incorrect. We might use the clustered/robust standard errors. Here, the time series for each individual are clusters. Jakub Mućk Econometrics of Panel Data Pooled OLS estimator Meeting # 1 19 / 31

Standard errors If the assumptions (7)-(10) are satisfied then the variance-covariance matrix of β is equal to: Var(β) = σ 2 (X X) 1 = D (13) where σ 2 is the variance of the error term. The standard errors stand for the diagonal elements: One step back in (13): s.e.(β i ) = D ii (14) Var(β) = (X X) 1 X ΣX (X X) 1 (15) (15) simplifies to (13) when Var(u) = σ 2 I. But when we don t know Σ? White s (1980) heteroscedasticity-consistent estimator of the variance-covariance matrix: Σ = diag ( û 2 1, û 2 2,..., û 2 N), (16) where û is the vector of residuals. Jakub Mućk Econometrics of Panel Data Pooled OLS estimator Meeting # 1 20 / 31

Empirical example Pooled OLS Pooled OLS robust se α 42.714 42.714 (9.512) (11.575) [0.000] (0.000) β 1 0.231 0.231 (0.025) (0.049) [0.000] [0.000] β 2 0.116 0.231 (0.006) (0.007) [0.000] [0.000] Note: The expressions in round and squared brackets stand for standard errors and probability values corresponding to the null about parameter s insignificance, respectively. I it = α + β 1 K it + β 2 F it + u it Jakub Mućk Econometrics of Panel Data Pooled OLS estimator Meeting # 1 21 / 31

Outline 1 Course outline 2 Panel data Advantages of Panel Data Limitations of Panel Data 3 Pooled OLS estimator 4 Fixed effects model The Least Squares Dummy Variable Estimator The Fixed Effect (Within Group) Estimator Jakub Mućk Econometrics of Panel Data Fixed effects model Meeting # 1 22 / 31

One-way Error Component Model In the model: y it = α + X itβ + u it i {1,..., N }, t {1,..., T} (17) it is assumed that all units are homogeneous. Why? One-way error component model: u i,t = µ i + ε i,t (18) where: µi the unobservable individual-specific effect; εi,t the remainder disturbance. Jakub Mućk Econometrics of Panel Data Fixed effects model Meeting # 1 23 / 31

Fixed effects model We can relax assumption that all individuals have the same coefficients y it = α i + β 1 x 1it +... + β k x kit + u it (19) Note that an i subscript is added to only intercept α i but the slope coefficients, β 1,..., β k are constant for all individuals. An individual intercept (α i ) are include to control for individual-specific and time-invariant characteristics. That intercepts are called fixed effects. Fixed effects capture the individual heterogeneity. The estimation: i) The least squares dummy variable estimator ii) The fixed effects estimator Jakub Mućk Econometrics of Panel Data Fixed effects model Meeting # 1 24 / 31

The Least Squares Dummy Variable Estimator The natural way to estimate fixed effect for all individuals is to include an indicator variable. For example, for the first unit: { 1 i = 1 D 1i = (20) 0 otherwise The number of dummy variables equals N. It s not feasible to use the least square dummy variable estimator when N is large We might rewrite the fixed regression as follows: Why α is missing? y it = N α i D ji + β 1 x 1it +... + β k x kit + u i,t (21) j=1 Jakub Mućk Econometrics of Panel Data Fixed effects model Meeting # 1 25 / 31

The Least Squares Dummy Variable Estimator In the matrix form: y = Xβ + u (22) where y is NT 1, β is (K + N ) 1, X is NT (K + N ) and u is NT 1. 1 0... 0 x 11... x k1 0 1... 0 x 12... x k2 X =................. 0 0... 1 x 1N... x kn and y = y 1 y 2. y N and β = where x ij is the vector of observations of the jth independent variable for unit i over the time and y i is the vector of observation of the explained variable for unit i. Then: α 1. α N β 1. β K ˆβ LSDV = ( X X ) 1 X y = [ˆα 1 LSDV... ˆα N LSDV ˆβ 1 LSDV LSDV... ˆβ K ] (23) Jakub Mućk Econometrics of Panel Data Fixed effects model Meeting # 1 26 / 31

The Least Squares Dummy Variable Estimator We can test estimates of intercept to verify whether the fixed effects are different among units: H 0 : α 1 = α 2 =... = α N (24) To test (24) we estimate: i) unrestricted model (the least squares dummy variable estimator) and ii) restricted model (pooled regression). Then we calculate sum of squared errors for both models: SSE U and SSE R. F = (SSE R SSE U )/(N 1) SSE U /(NT K) (25) if null is true then F F (N 1,NT K). Jakub Mućk Econometrics of Panel Data Fixed effects model Meeting # 1 27 / 31

The Fixed Effect (Within Group) Estimator Let s start with simple fixed effects specification for individual i: y it = α i + β 1x 1it +... + β k x kit + u it t = 1,..., T (26) Average the observation across time and using the assumption on time-invariant parameters we get: ȳ i = α i + β 1 x 1i +... + β k x ki + ū i (27) where ȳ i = 1 T T t=1 yit, x1i = 1 T T t=1 x1it, x ki = 1 T T t=1 x kit and ū i = 1 T T t=1 ui Now we substract (27) from (26) (y it ȳ i) = (α i α i) + β 1(x 1it x 1i) +... + β k (x itk x ki ) + (u it ū i) (28) }{{} =0 Using notation: ỹ it = (y it ȳ i), x 1it = (x 1it x 1i), x kit = (x kit x ki ) ũ it = (u it ū i), we get ỹ it = β 1 x 1t +... + β k x kt + ũ i,t (29) Note that we don t estimate directly fixed effects. Jakub Mućk Econometrics of Panel Data Fixed effects model Meeting # 1 28 / 31

The Fixed Effect (Within Group) Estimator The Fixed Effect Estimator or the within (group) estimator in vector form: ỹ = Xβ + ũ (30) where ỹ is NT 1, β is K 1, X is NT K and ũ is NT 1 and: x 11... x k1 ỹ 1 x 12... x k2 ỹ 2 X =....... and ỹ = and β =. x 1N... x kn ỹ N Then: ˆβ FE = ( 1 X X) X ỹ FE FE = [ ˆβ 1... ˆβ K ] (31) The standard errors should be adjusted by using correction factor: NT K (32) NT N K where K is number of estimated parameters without constant. The fixed effects estimates can be recovered by using the fact that OLS fitted regression passes the point of the means. ˆα FE i = ȳ i β 1. β K FE FE FE ˆβ 1 x 1i ˆβ 2 x 2i... ˆβ k x ki (33) Jakub Mućk Econometrics of Panel Data Fixed effects model Meeting # 1 29 / 31

Empirical example Pooled OLS FE α 42.714 58.744 (9.512) (12.454) [0.000] [0.000] β 1 0.231 0.310 (0.025) (0.017) [0.000] [0.000] β 2 0.116 0.110 (0.006) (0.012) [0.000] [0.000] Note: The expressions in round and squared brackets stand for standard errors and probability values corresponding to the null about parameter s insignificance, respectively. I it = α + β 1 K it + β 2 F it + u it Jakub Mućk Econometrics of Panel Data Fixed effects model Meeting # 1 30 / 31

Empirical example Pooled OLS Pooled OLS FE FE robust se robust se α 42.714 42.714 58.744 58.744 (9.512) (11.575) (12.454) (27.603) [0.000] [0.000] [0.000] [0.062] β 1 0.231 0.231 0.310 0.310 (0.025) (0.049) (0.017) (0.053) [0.000] [0.000] [0.000] [0.000] β 2 0.116 0.231 0.110 0.110 (0.006) (0.007) (0.012) (0.015) [0.000] [0.000] [0.000] [0.000] Note: The expressions in round and squared brackets stand for standard errors and probability values corresponding to the null about parameter s insignificance, respectively. I it = α + β 1 K it + β 2 F it + u it Jakub Mućk Econometrics of Panel Data Fixed effects model Meeting # 1 31 / 31