Dynamic Panels. Chapter Introduction Autoregressive Model

Chapter 11 Dynamic Panels

This chapter covers the econometric methods used to estimate dynamic panel data models and presents examples in Stata to illustrate these procedures. The topics in this chapter are discussed in detail in Arellano and Bond (1991), Arellano and Bover (1995), and Blundell and Bond (1998). (This chapter mostly follows Bond (2002).)

11.1 Introduction

Panel data are widely used to estimate dynamic models. The advantage over cross-sectional data is obvious: we cannot estimate dynamic models with a single observation in time. The advantages over time-series data include the possibility that underlying dynamics may be obscured by aggregation biases, and the scope that panel data offer to investigate heterogeneity in adjustment dynamics between different types of cross-sectional units.

The focus of this chapter is on the estimation of single-equation, autoregressive-distributed lag models from panels with a large number of cross-section units, each observed for a small number of periods. These methods are important because they can be used in the absence of any strictly exogenous explanatory variables or instruments, and they can be extended to models with predetermined or endogenous variables. Strict exogeneity rules out any feedback from current or past shocks to current values of the variable, which is often not a reasonable restriction.

11.2 Autoregressive Model

Let us first consider the simple panel AR(1) model:

$$y_{it} = \alpha y_{i,t-1} + (\nu_i + v_{it}), \qquad |\alpha| < 1 \tag{11.1}$$

for $i = 1, 2, \ldots, N$ and $t = 2, 3, \ldots, T$, where $y_{it}$ is an observation on some series for individual $i$ in period $t$, $y_{i,t-1}$ is the observation for the same individual lagged once, $\nu_i$ is the unobserved individual-specific time-invariant effect, which allows for heterogeneity in the means of the $y_{it}$ series across individuals, and the disturbances $v_{it}$ are independent across individuals. There are $N$ individuals observed over $T$ periods, and the asymptotic properties are considered as $N$ becomes large with $T$ fixed.

Simply applying OLS to Equation 11.1 yields inconsistent estimates because $y_{i,t-1}$ is positively correlated with the error term $(\nu_i + v_{it})$ due to the presence of the individual effects. The Fixed Effects estimator eliminates $\nu_i$, but Nickell (1981) shows that it is still inconsistent for fixed $T$. Maximum likelihood estimators are also inconsistent when the initial condition process is misspecified (see Hsiao, 1986).

Arellano and Bond (1991) propose first-differencing Equation 11.1 to eliminate the individual effects:

$$\Delta y_{it} = \alpha \Delta y_{i,t-1} + \Delta v_{it} \tag{11.2}$$

where $\Delta y_{it} = y_{it} - y_{i,t-1}$. The idea is then to estimate Equation 11.2 using lagged levels of $y_{it}$ as instruments. When $t = 3$, only $y_{i1}$ is available as an instrument, but when $t = 4$ both $y_{i1}$ and $y_{i2}$ can be used. As more time periods are added, more instruments become available. In general, the vector $(y_{i1}, y_{i2}, \ldots, y_{i,T-2})$ can be used as instruments when $t = T$. Because the model is over-identified (more instruments than instrumented variables), Arellano and Bond propose estimating Equation 11.2 via the Generalized Method of Moments (GMM). The matrix of instruments has the form:

$$
Z_i = \begin{bmatrix}
y_{i1} & 0 & 0 & \cdots & 0 & \cdots & 0 \\
0 & y_{i1} & y_{i2} & \cdots & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & y_{i1} & \cdots & y_{i,T-2}
\end{bmatrix}
$$

where the rows correspond to the first-differenced equations for periods $t = 3, 4, \ldots, T$ for individual $i$, and exploit the moment conditions

$$E[Z_i' \Delta v_i] = 0 \quad \text{for } i = 1, 2, \ldots, N \tag{11.3}$$

where $\Delta v_i = (\Delta v_{i3}, \Delta v_{i4}, \ldots, \Delta v_{iT})'$.
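The block structure of $Z_i$ can be easier to see when it is built explicitly. The following is a minimal sketch, not part of the original chapter: the function name `build_instrument_matrix` and the toy series are hypothetical, and only the Python standard library is used.

```python
def build_instrument_matrix(y):
    """Build the instrument matrix Z_i of Equation 11.3 for one individual.

    y is the list [y_i1, ..., y_iT]. Each row corresponds to the
    first-differenced equation for one period t = 3, ..., T and holds
    (y_i1, ..., y_{i,t-2}) in its own block of columns, zeros elsewhere.
    """
    T = len(y)
    if T < 3:
        raise ValueError("need at least T = 3 periods")
    # The equation for period t has t - 2 instruments, for t = 3, ..., T.
    block_sizes = [t - 2 for t in range(3, T + 1)]
    n_cols = sum(block_sizes)  # equals (T - 1)(T - 2) / 2
    Z = []
    start = 0
    for size in block_sizes:
        row = [0.0] * n_cols
        row[start:start + size] = [float(v) for v in y[:size]]
        Z.append(row)
        start += size
    return Z


# Toy series with T = 5, so the equations are for t = 3, 4, 5.
Z = build_instrument_matrix([1, 2, 3, 4, 5])
for row in Z:
    print(row)
```

With $T = 5$ the matrix has three rows and $1 + 2 + 3 = 6$ columns, which shows how quickly the instrument count grows with $T$.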
The asymptotically efficient GMM estimator based on this set of moment conditions minimizes the criterion

$$J_N = \left(\frac{1}{N}\sum_{i=1}^{N} \Delta v_i' Z_i\right) W_N \left(\frac{1}{N}\sum_{i=1}^{N} Z_i' \Delta v_i\right) \tag{11.4}$$

using the weight matrix

$$W_N = \left[\frac{1}{N}\sum_{i=1}^{N} \left(Z_i' \widehat{\Delta v}_i \widehat{\Delta v}_i' Z_i\right)\right]^{-1} \tag{11.5}$$

where the $\widehat{\Delta v}_i$ are consistent estimates of the first-differenced residuals obtained from a preliminary consistent estimator. Hence, this is known as the two-step GMM estimator. Notice that when $T > 3$ the model is overidentified, and the validity of the assumptions used to obtain the moments in Equation 11.3 can be tested using the standard Sargan test of overidentifying restrictions. In particular, $N J_N$ from Equation 11.4 has an asymptotic $\chi^2$ distribution under the null hypothesis that these moment conditions are valid. In this context, the key identifying assumption that there is no serial correlation in the $v_{it}$ disturbances can be tested by testing for no second-order serial correlation in the first-differenced residuals.

11.3 Multivariate Dynamic Models

11.3.1 Difference GMM Estimator

The GMM estimators for the autoregressive models just described can be extended to estimate autoregressive-distributed lag models of the form

$$y_{it} = \alpha y_{i,t-1} + \beta x_{it} + (\nu_i + v_{it}), \qquad |\alpha| < 1 \tag{11.6}$$

for $i = 1, 2, \ldots, N$ and $t = 2, 3, \ldots, T$, where $x_{it}$ can be a vector of current and lagged values of additional explanatory variables. An attractive feature of this approach is that it does not require models for the $x_{it}$ series to be specified in order to estimate the parameters $(\alpha, \beta)$. Taking first differences of Equation 11.6 we obtain:

$$\Delta y_{it} = \alpha \Delta y_{i,t-1} + \beta \Delta x_{it} + \Delta v_{it} \tag{11.7}$$

Different moment conditions, analogous to those in Equation 11.3, will be available depending on what is assumed about the correlation between $x_{it}$ and the error term. Maintaining that the $v_{it}$ disturbances are serially uncorrelated, the $x_{it}$ series may be:

1. Endogenous. Then $x_{it}$ is correlated with $v_{it}$ and earlier shocks, but $x_{it}$ is uncorrelated with $v_{i,t+1}$ and subsequent shocks. In this case $x_{it}$ is treated symmetrically with the variable $y_{it}$. This means that $x_{i,t-2}$, $x_{i,t-3}$ and longer lags will be instruments (in $Z_i$) for the first-differenced equations (Equation 11.7).
2. Predetermined. Then $x_{it}$ and $v_{it}$ are uncorrelated, but $x_{it}$ may still be correlated with $v_{i,t-1}$ and earlier shocks. In this case $x_{i,t-1}$ becomes available as an additional instrument.
3. Strictly exogenous. Then $x_{it}$ is uncorrelated with all past, present and future realizations of $v_{is}$. In this case the complete time series $x_i = (x_{i1}, x_{i2}, \ldots, x_{iT})$ will be valid instrumental variables for the first-differenced equations.

GMM based on the moment conditions in Equation 11.3 is known as the Arellano and Bond difference GMM dynamic panel estimator.

11.3.2 System GMM Estimator

When the series are persistent (high value of $\alpha$), Blundell and Bond (1998) showed that the instruments in $Z_i$ are weak and may fail to identify the parameters of interest. Blundell and Bond proposed combining the equation in first differences with the equation in levels to form a system estimator. The additional assumption in this system estimator is that the first differences $\Delta x_{it}$ are uncorrelated with the unobserved individual effects $\nu_i$. In this case $\Delta y_{i,t-1}$ and $\Delta x_{it}$ can be used as instrumental variables in the levels equation. Hence, the following additional moment conditions are also available:

$$E[\Delta y_{i,t-1}(\nu_i + v_{it})] = 0 \quad \text{for } i = 1, 2, \ldots, N, \; t = 3, 4, \ldots, T \tag{11.8}$$

and

$$E[\Delta x_{it}(\nu_i + v_{it})] = 0 \quad \text{for } i = 1, 2, \ldots, N, \; t = 2, 3, \ldots, T \tag{11.9}$$

That is, the system estimator combines the first-difference equation (Equation 11.7) and the levels equation (Equation 11.6) using the moment conditions in Equations 11.3, 11.8, and 11.9. The choice between the system and the difference estimators depends on the validity of the assumptions and the time-series properties of the series. The system estimator performs better if the series are persistent, but it requires more assumptions.

11.4 Estimation in Stata

The simplest models can be estimated using the commands xtabond and xtdpdsys. More flexible specifications are available with xtdpd and with the options inst, lags, maxldep and maxlags, which are useful when one is willing to explore different assumptions about the correlation between $x_{it}$ and $v_{it}$.
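Returning to the weak-instrument argument of Section 11.3.2, a small calculation can illustrate why lagged levels lose power when the series is persistent. The sketch below is an illustration under stated assumptions, not a result taken from the chapter: it assumes the AR(1) model of Equation 11.1 is mean-stationary, that $\nu_i$ and $v_{it}$ are uncorrelated, and it uses the hypothetical names `instrument_correlation`, `sigma_v` and `sigma_nu`. It evaluates the population correlation between the instrument $y_{i,t-2}$ and the instrumented variable $\Delta y_{i,t-1}$.

```python
import math


def instrument_correlation(alpha, sigma_v=1.0, sigma_nu=1.0):
    """Population corr(y_{i,t-2}, Delta y_{i,t-1}) in a stationary AR(1) panel.

    Assumes y_it = nu_i / (1 - alpha) + w_it, where w_it is a stationary
    AR(1) process with innovation variance sigma_v**2 (an assumption).
    """
    if not -1.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between -1 and 1")
    var_w = sigma_v ** 2 / (1.0 - alpha ** 2)              # variance of the AR(1) part
    var_level = sigma_nu ** 2 / (1.0 - alpha) ** 2 + var_w  # Var(y_{i,t-2})
    var_diff = 2.0 * sigma_v ** 2 / (1.0 + alpha)           # Var(Delta y_{i,t-1})
    cov = -sigma_v ** 2 / (1.0 + alpha)                     # Cov(y_{i,t-2}, Delta y_{i,t-1})
    return cov / math.sqrt(var_level * var_diff)


for a in (0.2, 0.5, 0.8, 0.9, 0.99):
    print(f"alpha = {a:4.2f}  corr = {instrument_correlation(a):+.4f}")
```

With equal variances the expression simplifies to $-(1 - \alpha)/2$, so the correlation shrinks toward zero as $\alpha$ approaches one. This is the sense in which the instruments in $Z_i$ become weak for persistent series.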

11.4.1 The Difference GMM Estimator

Consider the following example from the Stata manual. The estimation controls for the endogeneity of the lagged dependent variable and assumes strictly exogenous regressors.

    use http://www.stata-press.com/data/r11/abdata
    xtabond n l(0/1).w l(0/2).(k ys) yr1980-yr1984 year, lags(2) noconstant vce(robust)

    Arellano-Bond dynamic panel-data estimation  Number of obs         =       611
    Group variable: id                           Number of groups      =       140
    Time variable: year
                                                 Obs per group:    min =         4
                                                                   avg =  4.364286
                                                                   max =         6

    Number of instruments =     41               Wald chi2(16)         =   1727.45
                                                 Prob > chi2           =    0.0000
    One-step results
                                         (Std. Err. adjusted for clustering on id)
                 |               Robust
               n |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               n |
             L1. |   .6862261   .1445943     4.75   0.000     .4028266    .9696257
             L2. |  -.0853582   .0560155    -1.52   0.128    -.1951467    .0244302
               w |
             --. |  -.6078208   .1782055    -3.41   0.001    -.9570972   -.2585445
             L1. |   .3926237   .1679931     2.34   0.019     .0633632    .7218842
               k |
             --. |   .3568456   .0590203     6.05   0.000      .241168    .4725233
             L1. |  -.0580012   .0731797    -0.79   0.428    -.2014308    .0854284
             L2. |  -.0199475   .0327126    -0.61   0.542    -.0840631    .0441681
              ys |
             --. |   .6085073   .1725313     3.53   0.000     .2703522    .9466624
             L1. |  -.7111651   .2317163    -3.07   0.002    -1.165321   -.2570095
             L2. |   .1057969   .1412021     0.75   0.454    -.1709542     .382548
          yr1980 |   .0029062   .0158028     0.18   0.854    -.0280667    .0338791
          yr1981 |  -.0404378   .0280582    -1.44   0.150    -.0954307    .0145552
          yr1982 |  -.0652767   .0365451    -1.79   0.074    -.1369038    .0063503
          yr1983 |  -.0690928    .047413    -1.46   0.145    -.1620205    .0238348
          yr1984 |  -.0650302   .0576305    -1.13   0.259    -.1779839    .0479235
            year |   .0095545   .0102896     0.93   0.353    -.0106127    .0297217
    Instruments for differenced equation
            GMM-type: L(2/.).n
            Standard: D.w LD.w D.k LD.k L2D.k D.ys LD.ys L2D.ys D.yr1980
                      D.yr1981 D.yr1982 D.yr1983 D.yr1984 D.year

Notice how Stata details the list of instruments. The estimation uses the first-differenced equation, and the strictly exogenous variables serve as instruments for themselves.
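The figure of 41 instruments in the output above can be reproduced by hand. The sketch below is a hedged check rather than a description of Stata's internals: it assumes that the abdata panel runs from 1976 to 1984 and that every available lagged level of n is used. The function name `count_gmm_instruments` is hypothetical.

```python
def count_gmm_instruments(first_year, last_year, n_lags):
    """Count the lagged-level (GMM-type) instruments for the differenced equations.

    Assumes all levels dated t - 2 and earlier are used for the equation
    in period t, with no cap on the lag depth.
    """
    # With n_lags lagged differences on the right-hand side, the first
    # usable differenced equation is n_lags + 1 periods after the start.
    first_equation = first_year + n_lags + 1
    total = 0
    for t in range(first_equation, last_year + 1):
        # Levels dated first_year, ..., t - 2 are valid for period t.
        total += (t - 2) - first_year + 1
    return total


# Standard instruments exactly as listed in the Stata output.
standard = ["D.w", "LD.w", "D.k", "LD.k", "L2D.k", "D.ys", "LD.ys", "L2D.ys",
            "D.yr1980", "D.yr1981", "D.yr1982", "D.yr1983", "D.yr1984", "D.year"]

gmm_type = count_gmm_instruments(1976, 1984, n_lags=2)  # sample years are an assumption
print(gmm_type, len(standard), gmm_type + len(standard))
```

Under these assumptions the differenced equations for 1979 to 1984 contribute 2 + 3 + 4 + 5 + 6 + 7 = 27 GMM-type instruments, and the 14 standard instruments bring the total to 41. Subtracting the 16 estimated coefficients leaves the 25 degrees of freedom reported by the Sargan test for this model.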
The GMM-type instruments follow from the $Z_i$ matrix detailed after Equation 11.2. With one-step results, as here, the vce(robust) option reports the Arellano-Bond robust estimator of the variance-covariance matrix; the Windmeijer (2005) finite-sample correction applies when the two-step estimator is requested. To test the validity of the moment conditions we use the Sargan test of overidentifying restrictions:

    xtabond n l(0/1).w l(0/2).(k ys) yr1980-yr1984 year, lags(2) noconstant
    (output omitted)
    estat sargan

    Sargan test of overidentifying restrictions
            H0: overidentifying restrictions are valid

            chi2(25)     =  65.81806
            Prob > chi2  =    0.0000

Only for a homoskedastic error term does the Sargan test have an asymptotic chi-squared distribution. Because its asymptotic distribution is not known under the assumptions of the vce(robust) model, xtabond does not compute it when vce(robust) is specified, which is why the model is re-estimated without that option. Based on the results we reject the null that the overidentifying restrictions are valid. Taken at face value, the moment conditions are not valid, although the rejection may also reflect heteroskedasticity in the errors.

To test for second-order serial correlation in the first-differenced residuals we use

    xtabond n l(0/1).w l(0/2).(k ys) yr1980-yr1984 year, lags(2) noconstant vce(robust)
    (output omitted)
    estat abond

    Arellano-Bond test for zero autocorrelation in first-differenced errors
      Order |    z       Prob > z
      ------+--------------------
          1 | -3.5996     0.0003
          2 | -.51603     0.6058
      H0: no autocorrelation

First-order serial correlation in the first-differenced residuals is expected by construction, so its rejection is not a concern. The large p-value for the second-order test means we cannot reject the absence of second-order serial correlation in the first-differenced residuals, which is consistent with no serial correlation in the levels disturbances $v_{it}$. Hence, this test gives no evidence of misspecification.

11.4.2 The System GMM Estimator

For the system estimator we have the command

    xtdpdsys n l(0/1).w l(0/2).(k ys) yr1980-yr1984 year, lags(2) noconstant vce(robust)

    System dynamic panel-data estimation         Number of obs         =       751
    Group variable: id                           Number of groups      =       140
    Time variable: year
                                                 Obs per group:    min =         5
                                                                   avg =  5.364286
                                                                   max =         7

    Number of instruments =     48               Wald chi2(16)         =  10231.68
                                                 Prob > chi2           =    0.0000
    One-step results
                 |               Robust
               n |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
               n |
             L1. |   .8960982   .1292138     6.94   0.000     .6428438    1.149353
             L2. |  -.0765991   .0603137    -1.27   0.204    -.1948118    .0416136
               w |
             --. |  -.6477924   .1812963    -3.57   0.000    -1.003127   -.2924582
             L1. |   .5048449   .1669834     3.02   0.003     .1775633    .8321264
               k |
             --. |   .3447254   .0625943     5.51   0.000     .2220427     .467408
             L1. |  -.1395779    .073419    -1.90   0.057    -.2834765    .0043207
             L2. |  -.0509138   .0410563    -1.24   0.215    -.1313826     .029555
              ys |
             --. |   .6393548    .173655     3.68   0.000     .2989972    .9797124
             L1. |  -.8204468   .2330248    -3.52   0.000    -1.277167   -.3637266
             L2. |   .1112503   .1566238     0.71   0.478    -.1957267    .4182274
          yr1980 |   .0197832   .0144734     1.37   0.172    -.0085841    .0481505
          yr1981 |  -.0163387   .0261617    -0.62   0.532    -.0676147    .0349373
          yr1982 |  -.0215201   .0273468    -0.79   0.431    -.0751188    .0320785
          yr1983 |   -.016892   .0280498    -0.60   0.547    -.0718686    .0380846
          yr1984 |  -.0067004   .0281021    -0.24   0.812    -.0617795    .0483788
            year |   .0005229   .0002955     1.77   0.077    -.0000564    .0011021
    Instruments for differenced equation
            GMM-type: L(2/.).n
            Standard: D.w LD.w D.k LD.k L2D.k D.ys LD.ys L2D.ys D.yr1980
                      D.yr1981 D.yr1982 D.yr1983 D.yr1984 D.year
    Instruments for level equation
            GMM-type: LD.n

Notice how Stata specifies the instruments for the levels equation. For the Sargan test of overidentifying restrictions we have

    xtdpdsys n l(0/1).w l(0/2).(k ys) yr1980-yr1984 year, lags(2) noconstant
    (output omitted)
    estat sargan

    Sargan test of overidentifying restrictions
            H0: overidentifying restrictions are valid

            chi2(32)     =  64.89071
            Prob > chi2  =    0.0005

Again, the test rejects the validity of the overidentifying restrictions, subject to the same caveat about heteroskedasticity. The second-order serial correlation test yields

    xtdpdsys n l(0/1).w l(0/2).(k ys) yr1980-yr1984 year, lags(2) noconstant vce(robust)
    estat abond
    artests not computed for one-step system estimator with vce(gmm)

    Arellano-Bond test for zero autocorrelation in first-differenced errors
      Order |    z       Prob > z
      ------+--------------------
          1 | -4.6584     0.0000
          2 | -.90919     0.3632
      H0: no autocorrelation

There is no evidence of second-order serial correlation in the first-differenced residuals, which is what serially uncorrelated disturbances in the levels equation imply. For more on these estimators, consult the original papers and the Stata manual for additional options.
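As a reading aid for the estat abond tables, the reported p-values can be recovered from the z statistics, since the Arellano-Bond test statistic is asymptotically standard normal under the null. The following is a small sketch using only the Python standard library; the helper name `two_sided_p` is hypothetical, and the z values are copied from the output above.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a statistic that is N(0, 1) under the null."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# z statistics as reported by estat abond in the two examples.
reported = {
    ("difference GMM", 1): -3.5996,
    ("difference GMM", 2): -0.51603,
    ("system GMM", 1): -4.6584,
    ("system GMM", 2): -0.90919,
}

for (estimator, order), z in reported.items():
    print(f"{estimator:15s} order {order}: z = {z:8.4f}  p = {two_sided_p(z):.4f}")
```

The first-order statistics give p-values close to zero, as expected for first-differenced errors, while the second-order statistics give p-values well above conventional significance levels for both estimators.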