Bias in Dynamic Panel Models under Time Series Misspecification

Yoonseok Lee

August 2

Abstract

We consider within-group estimation of higher-order autoregressive panel models with exogenous regressors and fixed effects, where the lag order is possibly misspecified. Even when disregarding the misspecification bias, the fixed-effect bias formula is quite different from the correctly specified case, though its asymptotic order remains the same under stationarity. We suggest some bias reduction methods under the possible misspecification.

Keywords: Bias, dynamic panel, fixed effects, misspecification, bias reduction.

JEL Classifications: C23, C33

An earlier version of this paper is part of my dissertation at Yale University and it was circulated as "A General Approach to Bias Correction in Dynamic Panels under Time Series Misspecification." I thank Peter Phillips, Donald Andrews, Yuichi Kitamura and seminar participants at Cornell, MSU, Texas A&M, SUNY-Binghamton, Yale, the 2005 Inter University Conference at Princeton, the 2007 Midwest Econometrics Group Meeting, and the Conference in Honor of Peter C. B. Phillips. I also would like to thank an anonymous referee and coeditors for their detailed and constructive comments. All errors are solely mine. Address: Department of Economics, University of Michigan, 611 Tappan Street, Ann Arbor, MI 48109-1220. E-mail: yoolee@umich.edu.

1 Introduction

Since the influential papers by Nerlove (1967) and Nickell (1981), finite sample autoregressive bias in fixed-effect dynamic panel models has been well understood and many bias reduction methods were proposed in the context of within-group estimators (e.g., Kiviet, 1995; Hahn and Kuersteiner, 2002; Alvarez and Arellano, 2003; Bun and Carree, 2005; and Phillips and Sul, 2007, to name a few). Dynamic panel studies have long relied on the first-order autoregressive structure, which is indeed unavoidable especially when the length of the time series ($T$) is small. As longer panel data become available, however, it is more natural to consider higher order dynamics when first-order models are suspected to be misspecified. This paper evaluates the effects of this type of misspecification, particularly on the fixed-effect bias in dynamic panel regressions.

Specifically, we extend the bias formula of Nickell (1981) to the case that the dynamics follow general autoregressive forms with exogenous regressors, ARX($p$), but the lag order could be misspecified. We also develop the $\sqrt{NT}$-normalized limit distribution of the within-group estimator that allows for lag order misspecification, when both $N$ (the cross section sample size) and $T$ get large at the same rate. Besides the order misspecification bias and the fixed-effect bias, the analytical results reveal an additional bias, which is generated from combining order misspecification and the incidental parameters problem (Neyman and Scott, 1948). The additional bias is, however, still of the same order of magnitude as the standard fixed-effect bias (i.e., $O(1/T)$) and thus it is possible to develop a bias reduction method.
Though attempts to adjust for the bias using formulae that correct for first-order dynamics could be wrong and even exacerbate the bias under such misspecification, we find that the fixed-effect bias can be reduced robustly to lag order misspecification using the penalized likelihood function approach (e.g., Hahn and Kuersteiner, 2004; Arellano and Hahn, 2006; Bester and Hansen, 2007) or the model selection based approach (e.g., Lee, 2006).

2 The Model

We consider a panel process $\{y_{i,t}\}$ generated from the homogeneous $p$th-order univariate autoregressive model with exogenous regressors $X_{i,t} \in \mathbb{R}^r$ (i.e., ARX($p$)) given by

$$y_{i,t} = \mu_i + \sum_{j=1}^{p} \alpha_{pj}\, y_{i,t-j} + \beta' X_{i,t} + u_{i,t} \quad \text{for } i = 1, \ldots, N \text{ and } t = 1, \ldots, T, \tag{1}$$

where the lag order $p$ is assumed to be finite.¹ We let the initial values $(y_{i,0}, y_{i,-1}, \ldots, y_{i,-p+1})$ be observed for all $i$. We first assume the following conditions.

Assumption E (i) $\{u_{i,t}\}$ is i.i.d. across $i$ and $t$; (ii) $E(u_{i,t}\mu_i) = E(u_{i,t}X_{i,s}) = 0$, $Eu_{i,t}^2 = \sigma^2$ with $0 < \sigma^2 < \infty$ and $Eu_{i,t}^8 < \infty$ for all $i$, $s$ and $t$; (iii) $\{X_{i,t}\}$ is i.i.d. across $i$ and strictly stationary in $t$ for each $i$; (iv) $\|E(X_{i,t}X_{i,t}')\| < \infty$.

Assumption S $\sum_{j=1}^{p} |\alpha_{pj}| < \infty$ and all roots of the characteristic equation $1 - \sum_{j=1}^{p} \alpha_{pj} z^j = 0$ lie outside the unit circle.

Assumption E implies weak (or sequential) exogeneity in $y_{i,t-1}$ but strict exogeneity in $X_{i,t}$. It still allows for non-zero correlation between the unobserved individual effect $\mu_i$ and $X_{i,t}$. Note that $X_{i,t}$ could be serially correlated; but we assume that $X_{i,t}$ and the higher order lags of $y_{i,t}$ capture all the persistence and thus the error term does not have any serial correlation. Assuming independence of $u_{i,t}$ over $t$ makes the expressions simple. We also exclude cross sectional dependence in $u_{i,t}$. Finally, note that the eighth moment condition is required to derive the joint asymptotic CLT in the later section.

We eliminate the fixed effects $\mu_i$ by subtracting the individual sample average over time (i.e., within-group transformation) from equation (1) to obtain

$$y^0_{i,t} = \sum_{j=1}^{p} \alpha_{pj}\, y^0_{i,t-j} + \beta' X^0_{i,t} + u^0_{i,t}, \tag{2}$$

where for any variable $w_{i,t}$ we define $w^0_{i,t-j} = w_{i,t-j} - \bar{w}_{i,-j}$ and $\bar{w}_{i,-j} = (1/T)\sum_{s=1}^{T} w_{i,s-j}$ for $j = 0, 1, \ldots, p$. The formula (2) readily transforms into the first-order $p$-dimensional vector autoregressive process with exogenous regressors given by

$$Y^0_{i,t} = A Y^0_{i,t-1} + B X^0_{i,t} + U^0_{i,t}, \tag{3}$$

where $Y^0_{i,t-j} = (y^0_{i,t-j}, y^0_{i,t-j-1}, \ldots, y^0_{i,t-j-p+1})'$ for $j = 0, 1$, and

$$A = \begin{bmatrix} \alpha(p)' \\ I_{p-1} \quad 0 \end{bmatrix}$$

with $\alpha(p) = (\alpha_{p1}, \alpha_{p2}, \ldots, \alpha_{pp})'$ and $I_k$ being the identity matrix of rank $k$. We also let $U^0_{i,t} = e_p u^0_{i,t}$ and $B = e_p \beta'$, where $e_p$ is the $p \times 1$ column vector with one in the first element and zeros elsewhere. Note that, from (3), Assumption S is equivalent to $\det[I_p - Az] \neq 0$ for all $|z| \le 1$, or that each eigenvalue of $A$ has modulus less than one. It thus guarantees that the sequence $\{A^j\}$ is absolutely summable and $\sum_{j=0}^{\infty} A^j = (I_p - A)^{-1}$ exists. Hence, if we define vector linear processes $V_{i,t} = \sum_{j=0}^{\infty} A^j U_{i,t-j}$ and $Z_{i,t} = \sum_{j=0}^{\infty} A^j B X_{i,t-j}$ with $U_{i,t} = e_p u_{i,t}$, then both $V_{i,t}$ and $Z_{i,t}$ exist in the mean square sense and we can rewrite (3) as $Y^0_{i,t} = Z^0_{i,t} + V^0_{i,t}$. Moreover, if we let $\Gamma_j = E(V_{i,t}V_{i,t+j}')$ for $j = 0, 1, \ldots$, we have $\Gamma_j = \Gamma_0 A'^{j}$, where $\Gamma_0 = \sigma^2 \sum_{j=0}^{\infty} A^j e_p e_p' A'^{j}$. For later convenience we also introduce the long-run covariance matrix of $V_{i,t}$ as $\Omega = \sum_{j=-\infty}^{\infty} \Gamma_j = \Lambda + \Gamma_0 + \Lambda'$, where $\Lambda = \sum_{j=1}^{\infty} \Gamma_j$ and $\Lambda' = \sum_{j=1}^{\infty} \Gamma_j' = \sum_{j=1}^{\infty} \Gamma_{-j}$. Assumptions E and S guarantee that $\Omega$ exists.

In most cases, the true lag order $p$ of the underlying autoregressive process $\{y_{i,t}\}$ in (1) is unknown. Hence we consider the situation that $\{y_{i,t}\}$ is fitted to an ARX($q$) process instead of ARX($p$), where $q \le p$.² By stacking cross section observations first and then time series observations, (3) can be rewritten as $Y = Y_{-1}A' + XB' + U$, from which the within-group estimators in this case are defined as

$$\hat\alpha(p,q) = \left(J_q' Y_{-1}' Q_X Y_{-1} J_q\right)^{-1} J_q' Y_{-1}' Q_X Y e_p \tag{4}$$

$$\hat\beta = (X'X)^{-1} X'\left(Y e_p - Y_{-1} J_q\, \hat\alpha(p,q)\right), \tag{5}$$

where $Q_X = I_{NT} - X(X'X)^{-1}X'$, $\hat\alpha(p,q) = (\hat\alpha_{p,q1}, \hat\alpha_{p,q2}, \ldots, \hat\alpha_{p,qq})'$ and $\hat\alpha_{p,qr}$ for $r = 1, 2, \ldots, q$ is the within-group estimator for the coefficient of $y_{i,t-r}$ when the ARX($p$) process is fitted to ARX($q$). The $p \times q$ choice matrix $J_q$ is defined as $J_q = [I_q, 0]'$ if $q < p$; $J_q = I_p$ if $q = p$. We further assume the invertibility of $J_q' Y_{-1}' Q_X Y_{-1} J_q$ and $X'X$, which is implied by the following condition.

Assumption R (i) For given $p$, $Y_{-1}' Q_X Y_{-1}$ is a full column rank matrix; (ii) $X$ is a full column rank matrix and it does not include time-invariant variables; (iii) $q \le p$.

¹ As noted in Bhansali (1981) and Kunitomo and Yamamoto (1985), we can consider the infinite case $p = \infty$ given suitable definitions of the infinite dimensional autoregressive coefficient matrix $A$ and the choice matrices $e_p$ and $J_q$. This is the case of estimating an approximate ARX($p_T$) model with $p_T \to \infty$, where $p_T^3/T \to 0$ and $T^{1/2}\sum_{j=p_T+1}^{\infty} |\alpha_j| \to 0$ as $T \to \infty$ (e.g., Bhansali, 1978). See Lee (2006) for further discussions.
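The within-group estimator in (4) can be sketched in a few lines for the special case without exogenous regressors, where $Q_X$ plays no role. This is a minimal illustration rather than the paper's implementation: the function names `within_group_ar` and `solve_linear` and the data layout (each series holding $q$ initial values followed by $T$ observations) are assumptions made here.

```python
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def within_group_ar(panel: Sequence[Sequence[float]], q: int) -> List[float]:
    """Within-group estimate of an AR(q) fit (hypothetical helper, no X).

    Each series holds q initial values followed by T observations, so the
    regressand is y_t and the regressors are y_{t-1}, ..., y_{t-q}.
    """
    xtx = [[0.0] * q for _ in range(q)]
    xty = [0.0] * q
    for series in panel:
        t_obs = len(series) - q
        y = series[q:]
        # lags[r - 1][t] is y_{t-r}, aligned with y[t]
        lags = [series[q - r: q - r + t_obs] for r in range(1, q + 1)]
        # within transformation: subtract each variable's own time average
        y_dm = [v - sum(y) / t_obs for v in y]
        lags_dm = [[v - sum(lag) / t_obs for v in lag] for lag in lags]
        for r in range(q):
            xty[r] += sum(u * v for u, v in zip(lags_dm[r], y_dm))
            for s in range(q):
                xtx[r][s] += sum(u * v for u, v in zip(lags_dm[r], lags_dm[s]))
    return solve_linear(xtx, xty)
```

Calling `within_group_ar(panel, 1)` on data generated by an AR(2) process corresponds to the $(p, q) = (2, 1)$ case discussed later; with exogenous regressors the lags and the regressand would first have to be projected off $X$, which this sketch omits.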
² The case of $q > p$ is less interesting since there is no variable omission bias. In this case, we simply let $\alpha(p,q) = (\alpha(p)', 0_{q-p}')'$. An interesting example is when $(p,q) = (0,1)$ without exogenous regressors $X_{i,t}$ (i.e., $y_{i,t} = \mu_i + u_{i,t}$ is the data generating model). In this case, for the within-group estimator $\hat\alpha(0,1) = \sum_{i=1}^{N}\sum_{t=1}^{T} u^0_{i,t-1}u^0_{i,t} \big/ \sum_{i=1}^{N}\sum_{t=1}^{T} (u^0_{i,t-1})^2$, we can derive that $\mathrm{plim}_{N\to\infty}\, \hat\alpha(0,1) = -1/T$.

3 Bias Formulae

When the number of lags is not correctly chosen, and especially when the data is fitted to a lower order ARX model, the within-group estimators in (4) and (5) are expected to

have variable omission bias on top of the standard fixed-effect bias. Specifically, we let $\hat\Gamma_0 = Y_{-1}'Q_XY_{-1}/NT$, $\hat\Gamma_1 = Y_{-1}'Q_XY/NT$, $\Gamma_0^X = \Gamma_0 + \mathrm{plim}_{N\to\infty} Z_{-1}'Q_XZ_{-1}/NT$ and $\Gamma_1^X = \Gamma_1 + \mathrm{plim}_{N\to\infty} Z_{-1}'Q_XZ_{-1}A'/NT$. Then, using the identity $C_1^{-1} = (I - C_1^{-1}(C_1 - C_2))C_2^{-1}$ for symmetric invertible matrices $C_1$ and $C_2$, we can rewrite (4) as

$$\hat\alpha(p,q) - J_q'\alpha(p) = \{\alpha(p,q) - J_q'\alpha(p)\} + \{\hat\alpha(p,q) - \alpha(p,q)\} \tag{6}$$

$$= \{\alpha(p,q) - J_q'\alpha(p)\} + \left\{ (J_q'\Gamma_0^X J_q)^{-1} J_q'(\hat\Gamma_1 - \Gamma_1^X)e_p - (J_q'\hat\Gamma_0 J_q)^{-1}\left[J_q'(\hat\Gamma_0 - \Gamma_0^X)J_q\right](J_q'\Gamma_0^X J_q)^{-1} J_q'\hat\Gamma_1 e_p \right\}$$

$$\equiv b_1(\hat\alpha(p,q)) + b_2(\hat\alpha(p,q)),$$

where $\alpha(p,q) = (J_q'\Gamma_0^X J_q)^{-1}J_q'\Gamma_1^X e_p = (\alpha_{p,q1}, \alpha_{p,q2}, \ldots, \alpha_{p,qq})'$ is the theoretical parameter value from the ARX($q$) fitting. For example, $\alpha(p,1) = \alpha_{p,11}$ is the first-order autocorrelation coefficient of $y_{i,t}$ for the pure autoregressive case. In addition, since we can rewrite (3) as $J_q'Y^0_{i,t} = J_q'AJ_qJ_q'Y^0_{i,t-1} + J_q'BX^0_{i,t} + J_q'U^0_{i,t}$, (5) can be decomposed as

$$\hat\beta - \beta = \left\{(X'X)^{-1}X'Y_{-1}J_q\right\}\left\{J_q'\alpha(p) - \alpha(p,q)\right\} + \left\{(X'X)^{-1}X'Y_{-1}J_q\right\}\left(\alpha(p,q) - \hat\alpha(p,q)\right) + o_p(1) \tag{7}$$

$$\equiv b_1(\hat\beta) + b_2(\hat\beta) + o_p(1),$$

by letting $(X'X)^{-1}X'Ue_p = o_p(1)$ from the strict exogeneity. Note that in these two bias expressions (6) and (7), $b_1(\hat\alpha(p,q))$ and $b_1(\hat\beta)$ are the variable omission biases, which cannot be eliminated unless the model is correctly specified. This part of the bias, therefore, should be taken care of by proper lag order selection methods (e.g., Lee, 2006, 2b). On the other hand, $b_2(\hat\alpha(p,q))$ and $b_2(\hat\beta)$ are the biases from the within-group transformation. In this section, we will show that the second part of the biases has different expressions from the standard fixed-effect bias formula and indeed has additional terms polluted by the misspecification. $b_2(\hat\alpha(p,q))$ and $b_2(\hat\beta)$ are, however, shown to be still $O(1/T)$, which will disappear as $T \to \infty$, whereas the misspecification biases $b_1(\hat\alpha(p,q))$ and $b_1(\hat\beta)$ do not vanish even when $N, T \to \infty$. The main interest here is, therefore, in the biases $b_2(\hat\alpha(p,q))$ and $b_2(\hat\beta)$, which are manageable, instead of the entire biases $\hat\alpha(p,q) - J_q'\alpha(p)$ and $\hat\beta - \beta$.³
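To make the split between the theoretical parameter and the true parameter concrete, the following worked case may help. It assumes a pure AR(2) process without exogenous regressors fitted as AR(1), writes $\alpha_{21}, \alpha_{22}$ for the true AR(2) coefficients and $\gamma_j$ for the autocovariances of $y_{i,t}$, and uses only the Yule-Walker relation $\gamma_1 = \alpha_{21}\gamma_0 + \alpha_{22}\gamma_1$. The numerical values are illustrative choices.

```latex
\alpha(2,1) = \frac{\gamma_1}{\gamma_0} = \frac{\alpha_{21}}{1-\alpha_{22}},
\qquad
b_1 = \alpha(2,1) - \alpha_{21} = \frac{\alpha_{21}\,\alpha_{22}}{1-\alpha_{22}} .
% alpha_21 = alpha_22 =  0.4:  alpha(2,1) =  0.4/0.6 \approx  0.667,  b_1 \approx 0.267
% alpha_21 = alpha_22 = -0.4:  alpha(2,1) = -0.4/1.4 \approx -0.286,  b_1 \approx 0.114
```

In both cases $b_1$ does not depend on $N$ or $T$, which is why this component can only be removed by choosing the lag order correctly.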
³ The theoretical parameters $\alpha(p,q)$ indeed correspond to the autocorrelation coefficients of $y_{i,t}$ up to the $q$th-order, which are not model specific. Thus, the results in this paper could be alternatively seen as the bias in estimating autocorrelations in fixed-effects models. The bias from the theoretical parameter value is also studied in the standard time series literature under misspecification (e.g., Bhansali, 1981; Kunitomo and Yamamoto, 1985).

This is the case when

the researcher is only able to run the ARX($q$) regression (with $q < p$) due to the lack of enough time periods or simply believes ARX($q$) is true; but she tries to find the bias formula from the parameters in ARX($q$) in order to correct the fixed-effect bias. A leading example is fitting an AR($\infty$) process using a finite number of lags. The main lesson should be that the standard bias formula for the within-group estimator is no longer valid even around the theoretical parameter values (i.e., even if we disregard the misspecification bias) and thus the standard bias correction methods would not work properly in this case.

3.1 Nickell bias

We first examine the asymptotic bias of the within-group estimator when $N$ tends to infinity but $T$ is fixed. Even in the case of correct specification, the standard within-group estimator in AR(1) fixed-effects models is not consistent for large $N$ (e.g., Nerlove, 1967; Nickell, 1981). As well described in Phillips and Sul (2007), such autoregressive bias arises from the correlation of the error and the lagged dependent variables after the unknown mean is estimated and removed. Not surprisingly, such bias becomes more complicated when the lag order is not correctly specified. The first theorem generalizes the Nickell bias to the case of possible time series misspecification.

Theorem 1 Let $\{y_{i,t}\}$ be generated from (1) and $S_X = \mathrm{plim}_{N\to\infty} Z_{-1}'Q_XZ_{-1}/NT$ exists. Under Assumptions E, R and S,

$$\mathrm{plim}_{N\to\infty}\left(\hat\alpha(p,q) - \alpha(p,q)\right) = -(I_q - R_{q,0})^{-1}R_{q,2} - (I_q - R_{q,0})^{-1}\left(R_{q,1} - R_{q,0}D_q^X J_q'(S_XA' + \Gamma_1)e_p\right), \tag{8}$$

where $D_q^X = (J_q'(\Gamma_0 + S_X)J_q)^{-1}$, $R_{q,0} = D_q^X J_q' G_0 J_q$, $R_{q,1} = D_q^X J_q' G_1 e_p$ and $R_{q,2} = D_q^X J_q' G_2 e_p$. Expressions of $G_0$, $G_1$ and $G_2$ are as (A.3), (A.4) and (A.5) in the Appendix.

The first term of the bias (8) is mainly from the nonzero correlation between $Y^0_{i,t-1}$ and $U^0_{i,t}$, and the second term is mainly from order misspecification (more precisely, from the combination of endogeneity and order misspecification). It can be easily verified that the asymptotic bias of $\hat\alpha(p,q)$ around $\alpha(p,q)$ becomes negligible as $T$ grows.
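The bias expression of (5) follows easily from (7) and (8). If there are no exogenous regressors, the expression for $\mathrm{plim}_{N\to\infty}(\hat\alpha(p,q) - \alpha(p,q))$ remains the same without the $S_X$ term, which can be approximated as

$$-\frac{\sigma^2}{T}D_qJ_q'(I_p - A)^{-1}e_p - \frac{1}{T}D_qJ_q'\,\Omega\left(I_p - J_qD_qJ_q'\Gamma_0\right)A'e_p + O\!\left(\frac{1}{T^2}\right) \tag{9}$$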

for large $T$, where $D_q = (J_q'\Gamma_0J_q)^{-1}$.

Remark 1 When $q = 1$, (9) could be further reduced to expression (10), which is stated in terms of $\sigma^2$, $\ell$ and $\omega_k$ up to an $O(1/T^2)$ remainder, where $\ell = 1/(1 - \sum_{k=1}^{p}\alpha_{pk})$ and $\omega_k = \sum_{j}\gamma_{j+k}$ with $\gamma_j$ being the autocovariances of $y_{i,t}$. For a particular example with $(p,q) = (2,1)$, the asymptotic bias can be obtained as

$$-\frac{1}{T-2}(1+\alpha_{22})(1+\rho_1) - \frac{1}{T-2}\,\frac{\alpha_{22}(1+\alpha_{22})}{1-\alpha_{22}}\,(1+\rho_1) + O\!\left(\frac{1}{T^2}\right) \tag{11}$$

using the Yule-Walker equation, where $\rho_1 = \alpha_{21}/(1-\alpha_{22})$ is the first-order autocorrelation coefficient of $y_{i,t}$ in AR(2), which indeed corresponds to the theoretical coefficient $\alpha(2,1) = \alpha_{2,11}$.⁴ If $p = q = 1$, then $\alpha_{22} = 0$ and (11) is equivalent to the bias expression of Nickell (1981), which could be also seen in (10) by letting $\alpha_{pk} = 0$ for $k \ge 2$. Note that since the true parameters satisfy $|\alpha_{22}| < 1$, $\alpha_{22} + \alpha_{21} < 1$ and $\alpha_{22} - \alpha_{21} < 1$ under Assumption S (e.g., Marmol, 1995), it can be verified that the first part of the bias expression (11) is always negative, whereas the direction of the second part depends on the sign of $\alpha_{22}$. On the other hand, the sum of these two terms, $-(1/(T-2))((1+\alpha_{22})/(1-\alpha_{22}))(1+\rho_1)$, is always negative under stationarity, so that both the direction and the asymptotic order of the fixed-effect bias of the AR(1) coefficient $\hat\alpha(2,1)$ about the first-order autocorrelation coefficient $\rho_1 = \alpha(2,1)$ still remain the same as in the standard (correctly specified) case.⁵ Another finding is that the absolute value of the bias (11) increases with positive $\alpha_{22}$. In particular, the second part of the bias, which is mainly from the misspecification, explodes as $\alpha_{22}$ gets close to unity even when $\alpha_{21} = 0$. A similar phenomenon was also discussed by Kunitomo and Yamamoto (1985) in the time series context.

⁴ In an independent work by Okui (2008), it is also shown that the within-group estimator of the panel AR(1) coefficient converges to the first-order autocorrelation coefficient as $N \to \infty$ and $T \to \infty$ even under dynamic misspecification. This finding is a particular example of Theorem 1 since $\hat\alpha(p,1) - \alpha(p,1) \to_p 0$ as $N \to \infty$ and $T \to \infty$, where $\alpha(p,1)$ should be the first-order autocorrelation coefficient for any $p$ by construction.

⁵ Based on a number of simulation studies, we expect that the direction of the bias (11) still remains negative even for $p \ge 3$.

3.2 Noncentrality in the asymptotic distribution

We now consider the asymptotic distribution of the within-group estimator when both $N$ and $T$ are large. Hahn and Kuersteiner (2002), Alvarez and Arellano (2003), and Lee (2a)

consider a similar case, where the asymptotic ratio of $N$ to $T$ is a finite constant. More precisely, we assume the following condition.

Assumption NT $\lim_{N,T\to\infty} N/T = \kappa$, where $0 < \kappa < \infty$.

Under Assumption NT, the standard $\sqrt{NT}$-normalized within-group estimator has nondegenerating asymptotic bias, which is proportional to the square root of the limiting sample size ratio, $\kappa$. As in Theorem 1, however, the asymptotic bias could increase when the dynamic specification is incorrect. We also assume the following initial condition.

Assumption I (i) $Y_{i,0} \sim \mathrm{iid}\left(\mu_i(I_p - A)^{-1}e_p + E(Z_{i,t}),\ \Gamma_0 + \mathrm{var}(Z_{i,t})\right)$ for each $i$, where $\mathrm{var}(Z_{i,t}) < \infty$; (ii) $(1/N)\sum_{i=1}^{N}\mu_i^2 = O(1)$.

The initial condition in Assumption I is standard in time series models when studying stationary autoregressive processes. Since we assume large $T$, we use this initial condition without loss of generality as the initial values become less important for longer stationary panels.

Theorem 2 We assume that $\lim_{T\to\infty} S_X = \mathrm{plim}_{N,T\to\infty} Z_{-1}'Q_XZ_{-1}/NT > 0$ exists. Under Assumptions E, R, S, I and NT,

$$\sqrt{NT}\left(\hat\alpha(p,q) - \alpha(p,q)\right) \to_d N\!\left(-\sqrt{\kappa}\,\mathcal{B}_q^X,\ \sigma^2 D_q^X\right) \tag{12}$$

as $N, T \to \infty$ jointly, where $\mathcal{B}_q^X = \sigma^2 D_q^X J_q'(I_p - A)^{-1}e_p + D_q^X J_q'\Omega A'e_p - D_q^X(J_q'\Omega J_q)D_q^X J_q'(\Gamma_1 + \lim_{T\to\infty}S_XA')e_p$ and $D_q^X = (J_q'(\Gamma_0 + \lim_{T\to\infty}S_X)J_q)^{-1}$.

The asymptotic distribution of (5) follows easily from (7) and (12). Theorem 2 shows that the asymptotic bias depends on the sample size ratio, $\kappa$, and thus large $T$ does not attenuate the bias unless $\kappa$ is zero. If there are no exogenous regressors, the expression reduces to

$$\sqrt{NT}\left(\hat\alpha(p,q) - \alpha(p,q)\right) \to_d N\!\left(-\sqrt{\kappa}\,\mathcal{B}_q,\ \sigma^2 D_q\right) \tag{13}$$

with $\mathcal{B}_q = \sigma^2 D_qJ_q'(I_p - A)^{-1}e_p + D_qJ_q'\Omega(I_p - J_qD_qJ_q'\Gamma_0)A'e_p$, where the first component mainly contributes to the well-known negative bias from the within-group transformation, which always underestimates $\alpha(p,q)$ even when the number of lags is correctly chosen. As discussed in Alvarez and Arellano (2003), we can further derive that $\sqrt{NT}\left(\hat\alpha(p,q) - [\alpha(p,q) - (1/T)\mathcal{B}_q]\right) \to_d N(0, \sigma^2 D_q)$ using a higher order expansion of the

bias term (e.g., Lee, 2006) provided that $N/T^3 \to 0$. The asymptotic distribution under the correct time series specification (e.g., Hahn and Kuersteiner, 2002) is a special case of (13) by letting $p = q$, which is $\sqrt{NT}(\hat\alpha(p) - \alpha(p)) \to_d N(-\sqrt{\kappa}\,\sigma^2(\Gamma_0 - \Gamma_1')^{-1}e_p,\ \sigma^2\Gamma_0^{-1})$. In particular, when $p = 1$, we have $\sqrt{NT}(\hat\alpha(1) - \alpha(1)) \to_d N(-\sqrt{\kappa}(1 + \alpha(1)),\ 1 - \alpha(1)^2)$.

4 Bias Reductions

Theorems 1 and 2 show that the within-group estimators have additional bias even around the theoretical parameter values when the lag order is not correctly chosen. Therefore, most existing bias corrections would not work properly because the correction formulae assume correct model specification. In fact, attempts to adjust for the bias using formulae that correct for AR(1) dynamics would be wrong and may even exacerbate the bias when the true lag order is larger than one. For example, we consider estimating an AR(1) panel regression with a Hahn and Kuersteiner (2002) type bias correction when the true data generating process is AR(2): $\tilde\alpha(2,1) = ((T+1)/T)\hat\alpha(2,1) + (1/T)$, where $\hat\alpha(2,1)$ is the within-group estimator. From (11) and using $\alpha(2,1) = \rho_1$, however, $\tilde\alpha(2,1) - \alpha(2,1) = (\hat\alpha(2,1) - \alpha(2,1)) + (1/T)(\hat\alpha(2,1) + 1) \to_p -(1/T)(1 + \rho_1)(2\alpha_{22}/(1 - \alpha_{22})) + O(1/T^2)$ as $N \to \infty$, where the leading bias term can be zero only when $\alpha_{22} = 0$ (i.e., AR(1) is indeed the true data generating process) under stationarity. Unfortunately, the direction and the size of the bias after the correction would vary depending on the unknown parameter value $\alpha_{22}$. Note that this result is even without considering the pure misspecification bias (i.e., $\alpha(2,1) - \alpha_{21}$), which will add additional bias toward the true parameter value, $\tilde\alpha(2,1) - \alpha_{21}$. This example shows that a precise dynamic specification is important, particularly when we attempt to correct the fixed-effect bias.

A reasonable approach is to conduct model selection before any bias corrections as suggested in Lee (2006).⁶ By doing so, the additional bias terms from the misspecification will disappear asymptotically and we can focus on eliminating the pure fixed-effect bias using the bias formulae derived.
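The AR(1)-type correction example above can be checked numerically with a short sketch. It evaluates only the leading $O(1/T)$ terms for an AR(2) process fitted as AR(1), using a plain $1/T$ scaling (which differs from other first-order scalings only at $O(1/T^2)$). The function names and the chosen parameter values are assumptions made for illustration.

```python
def rho1(a21: float, a22: float) -> float:
    """First-order autocorrelation of a stationary AR(2), from Yule-Walker."""
    return a21 / (1.0 - a22)


def leading_fe_bias(t: int, a21: float, a22: float) -> float:
    """Leading O(1/T) fixed-effect bias of the AR(1) fit around rho1."""
    return -(1.0 / t) * ((1.0 + a22) / (1.0 - a22)) * (1.0 + rho1(a21, a22))


def leading_bias_after_ar1_correction(t: int, a21: float, a22: float) -> float:
    """Leading bias left after adding the AR(1)-type term (1/T)(1 + estimate)."""
    return -(1.0 / t) * (1.0 + rho1(a21, a22)) * (2.0 * a22 / (1.0 - a22))


if __name__ == "__main__":
    for coef in (0.4, -0.4):  # illustrative values with a21 = a22
        before = leading_fe_bias(100, coef, coef)
        after = leading_bias_after_ar1_correction(100, coef, coef)
        print(f"a21=a22={coef:+.1f}: before={before:+.4f}, after={after:+.4f}")
```

With a positive second-order coefficient the correction leaves a negative residual bias, while with a negative one it overshoots and the residual turns positive, which is the sign dependence on $\alpha_{22}$ described in the text.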
An additional benefit of conducting model selection first is that the bias correction works not only toward the theoretical parameter but also toward the true parameter, since the pure misspecification bias (e.g., $b_1(\hat\alpha(p,q))$ in (6)) will disappear from the model selection stage.

⁶ Also see Lee (2b) for a modified lag order selection criterion developed for the AR($p$) dynamic panel models with fixed effects. It is important to note that we cannot use the standard information-based lag order selection criteria (e.g., AIC, BIC) in this case. Intuitively, this is because the ML estimator and the KL information criterion could be inconsistent for small $T$ in the presence of incidental parameters (e.g., the fixed effects).

Alternatively, particularly when we are interested in bias correction toward the theoretical parameter, we can address it, even under possible lag order misspecification, using the penalized likelihood function approach (e.g., Hahn and Kuersteiner, 2004; Arellano and Hahn, 2006; Bester and Hansen, 2007). Though the original formulae were developed under the correct model specification, we can use this bias correction approach robustly to the dynamic order misspecification since they are formulated using the HAC estimator for the variance of the scores. Note that any lag order misspecification yields erroneous serial correlation in the error term (or in the scores in general). More precisely, we can derive the penalty term in this context as

$$\frac{1}{2\sigma^2 T}\sum_{i=1}^{N}\sum_{\ell=-m}^{m}\ \sum_{t=\max\{1,\ell+1\}}^{\min\{T,T+\ell\}} u_{i,t}\,u_{i,t-\ell} \tag{14}$$

for some truncation parameter $m$ satisfying $m/T^{1/2} \to 0$, which is based on the HAC estimator for the long-run variance of $u_{i,t}$ and thus indeed allows for general forms of serial correlations in $u_{i,t}$. Note that Bester and Hansen (2007) suggest using small $m$ in practice (e.g., $m = 1$ for the AR(1) model) but we need to use large $m$ as in the standard HAC estimation to cover general forms of serial correlations from possible order misspecifications.

Tables I and II summarize some Monte Carlo simulation results showing how the bias reductions work even under lag order misspecification. The true model is AR(2) without exogenous regressors but it is fitted to AR(1). The true coefficients are $\alpha_{21} = \alpha_{22} = 0.4$ or $-0.4$. The theoretical parameter value $\alpha_{2,11} = \alpha(2,1)$ is the first-order autocorrelation coefficient in the given AR(2) process (about $0.67$ and $-0.29$, respectively), which is calculated using the true parameter values. $\mu_i$ are randomly drawn from $U(-0.5, 0.5)$ and $u_{i,t}$ from $N(0,1)$.
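The simulation design just described can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the paper's code: the panel dimensions, the number of replications, the burn-in length and all function names are choices made here, and only the uncorrected within-group AR(1) estimator is computed.

```python
import random
from typing import List


def simulate_ar2_panel(n: int, t: int, a1: float, a2: float,
                       rng: random.Random, burn_in: int = 100) -> List[List[float]]:
    """Draw n series (one initial value plus t observations) from an AR(2) with fixed effects."""
    panel = []
    for _ in range(n):
        mu = rng.uniform(-0.5, 0.5)      # individual effect, U(-0.5, 0.5)
        mean = mu / (1.0 - a1 - a2)      # stationary mean of this series
        prev2, prev1 = mean, mean
        series = []
        for step in range(burn_in + t + 1):
            cur = mu + a1 * prev1 + a2 * prev2 + rng.gauss(0.0, 1.0)
            prev2, prev1 = prev1, cur
            if step >= burn_in:          # discard the burn-in draws
                series.append(cur)
        panel.append(series)
    return panel


def within_group_ar1(panel: List[List[float]]) -> float:
    """Within-group AR(1) coefficient: demean y_t and y_{t-1} per individual."""
    num = den = 0.0
    for s in panel:
        y, lag = s[1:], s[:-1]
        ybar, lbar = sum(y) / len(y), sum(lag) / len(lag)
        num += sum((l - lbar) * (v - ybar) for l, v in zip(lag, y))
        den += sum((l - lbar) ** 2 for l in lag)
    return num / den


def mean_ar1_estimate(n: int, t: int, a1: float, a2: float,
                      reps: int, seed: int) -> float:
    """Average the misspecified AR(1) estimate over independent replications."""
    rng = random.Random(seed)
    draws = [within_group_ar1(simulate_ar2_panel(n, t, a1, a2, rng))
             for _ in range(reps)]
    return sum(draws) / reps


if __name__ == "__main__":
    # hypothetical sizes: N = 100, T = 25, 20 replications
    print(mean_ar1_estimate(100, 25, 0.4, 0.4, reps=20, seed=0))
```

Comparing the averaged estimate with the first-order autocorrelation $\alpha_{21}/(1-\alpha_{22})$ gives the bias from the theoretical parameter, and comparing it with $\alpha_{21}$ gives the bias from the true parameter.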
$\hat\alpha_{2,11}$ is the within-group estimator before bias correction; $\tilde\alpha^{AR(1)}_{2,11}$ is the Hahn-Kuersteiner bias-corrected estimator assuming AR(1) is correct; $\tilde\alpha^{PL}_{2,11}$ is the bias-corrected ML estimator from the penalized likelihood using (14) with $m$ being the smallest integer larger than $T^{1/4}$, which is supposed to be robust to order misspecification; $\tilde\alpha_{\hat p 1} = e_{\hat p}'\left(\hat\alpha(\hat p) - \mathrm{plim}_{N\to\infty}[\hat\alpha(\hat p,\hat p) - \alpha(\hat p,\hat p)]\right)$ and $\tilde\alpha_{\hat p,11} = \hat\alpha_{2,11} - \mathrm{plim}_{N\to\infty}(\hat\alpha(\hat p,1) - \alpha(\hat p,1))$ are the bias-corrected estimators using the bias formula (8) and assuming the selected lag order $\hat p$ is true, where $\hat p$ is chosen as in Lee (2006 or 2b).⁷ Table I tabulates the biases from the theoretical parameter values, which are $b_2(\hat\alpha(p,q))$ in (6), whereas Table II tabulates the biases from the true parameter values, which thus include the misspecification biases (i.e., $b_1(\hat\alpha(p,q)) + b_2(\hat\alpha(p,q))$ in (6)). The values are obtained from averaging over replications.

⁷ In other words, $\tilde\alpha_{\hat p 1}$ is the first element of the standard bias-corrected within-group estimator of the correctly specified AR($\hat p$); $\tilde\alpha_{\hat p,11}$ is the bias-corrected estimator of AR(1) but the correction formula (8) assumes $\hat p$ is true. Both cases believe $\hat p$ is correctly chosen but the target parameters are different.

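Table I: Bias from the theoretical parameter and RMSE of bias corrected estimates (N=25 ; percentage of biases to the target parameter values are in the parentheses)

| $\alpha_{2,11}$ | $T$ | $\hat\alpha_{2,11}-\alpha_{2,11}$: bias (%) | rmse | $\tilde\alpha^{AR(1)}_{2,11}-\alpha_{2,11}$: bias (%) | rmse | $\tilde\alpha^{PL}_{2,11}-\alpha_{2,11}$: bias (%) | rmse | $\tilde\alpha_{\hat p,11}-\alpha_{2,11}$: bias (%) | rmse |
|---|---|---|---|---|---|---|---|---|---|
| .67 | 2 | -.357(53.6) | .358 | -.248(37.2) | .25 | -.27(4.5) | .27 | -.27(4.5) | .27 |
| | 25 | -.7(25.5) | .7 | -.(6.5) | .2 | -.(6.6) | .2 | -.6(5.9) | .7 |
| | 5 | -.8(2.) | .8 | -.49 (7.3) | .5 | -.5 (7.4) | .5 | -.42 (6.3) | .44 |
| | | -.39 (5.8) | .4 | -.23 (3.4) | .24 | -.2 (3.) | .22 | -.8 (2.7) | .2 |
| | 25 | -.4 (2.) | .5 | -.8 (.2) | .9 | -.7 (.) | .9 | -.6 (.9) | .8 |
| -.29 | 2 | -.25 (8.9) | .28 | .32(.2) | .35 | -.8 (2.8) | .5 | -.6 (2.) | .3 |
| | 25 | -.2 (4.3) | .5 | .6 (5.5) | .8 | -.8 (2.7) | . | -.2 (.5) | .8 |
| | 5 | -.6 (2.) | .8 | .8 (2.9) | . | -.3 (.9) | .6 | . (.) | .6 |
| | | -.3 (.) | .5 | .4 (.5) | .6 | . (.) | .4 | . (.) | .4 |
| | 25 | -. (.4) | .3 | .2 (.7) | .3 | . (.) | .3 | . (.) | .3 |

Table II: Bias from the true parameter and RMSE of bias corrected estimates (N=25 ; percentage of biases to the target parameter values are in the parentheses)

| $\alpha_{21}$ | $T$ | $\hat\alpha_{2,11}-\alpha_{21}$: bias (%) | rmse | $\tilde\alpha^{AR(1)}_{2,11}-\alpha_{21}$: bias (%) | rmse | $\tilde\alpha^{PL}_{2,11}-\alpha_{21}$: bias (%) | rmse | $\tilde\alpha_{\hat p 1}-\alpha_{21}$: bias (%) | rmse |
|---|---|---|---|---|---|---|---|---|---|
| .4 | 2 | -.9(22.7) | .94 | .8 (4.6) | .32 | -.3 (.8) | .24 | -.62(4.4) | .67 |
| | 25 | .97(24.2) | .98 | .57(39.) | .57 | .56(39.) | .57 | -.55(3.8) | .57 |
| | 5 | .86(46.6) | .87 | .28(54.5) | .28 | .27(54.3) | .27 | -.2 (5.3) | .23 |
| | | .228(57.) | .228 | .244(6.) | .244 | .246(6.5) | .246 | -.9 (2.2) | . |
| | 25 | .252(63.) | .252 | .259(64.8) | .259 | .26(64.9) | .26 | -.3 (.6) | .5 |
| -.4 | 2 | .89(22.3) | .9 | .46(36.6) | .47 | .6(26.6) | .7 | -.24 (5.9) | .5 |
| | 25 | .2(25.5) | .2 | .3(32.5) | .3 | .7(26.7) | .7 | -.5 (.2) | .3 |
| | 5 | .9(27.) | .9 | .23(3.7) | .23 | .2(27.9) | .2 | -.2 (.4) | .9 |
| | | .(27.9) | .2 | .9(29.6) | .9 | .5(28.7) | .5 | -. (.2) | .6 |
| | 25 | .3(28.3) | .3 | .6(29.) | .6 | .5(28.7) | .5 | . (.) | .4 |

The results make several important points. First, for the original patterns of the fixed-effect bias, Table I shows that the direction of the bias is always negative (under stationarity) and it decreases as $T$ gets large, which corresponds to the analytical findings in the main theorems, whereas Table II shows it is not the case when the misspecification bias is also considered.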
Second, the standard bias correction ($\tilde\alpha^{AR(1)}_{2,11}$) could exacerbate the overall bias depending on the parameter values though it reduces the bias from $\alpha_{2,11}$ to some degree in most cases (Table I). It is mainly because the Hahn-Kuersteiner bias correction term is positive unless $\hat\alpha_{2,11} < -1$ (which is a rare case under stationarity) and it would offset the negative fixed-effect bias. The correction, however, could increase the bias when the

correction size is too large comaring to the xed-e ect bias. Third, the enalized likelihood based aroach (e P L 2;) well reduces the bias from the theoretical arameter value robustly to the order misseci cation (Table I) and it is comarable to the model-selection-embedded rocedure (e ;). In addition to the robustness, the enalized likelihood based aroach never increases the mean square error, which tells the bias correction barely increases the samle variance. However, e P 2; L as well as e AR() 2; does not work roerly toward the true arameter 2 (Table II), which is well exected since these two rocedures are not designed to correct the ure misseci cation bias (e.g., b (b (; q)) in (6)). In comarison, the correction method involving model selection (e and e ;) outstands for most of the cases (both for 2; and 2 ) and the erformance imroves as T increases, which would be mainly because that the correct selection robability of the lag order selection rocedure imroves with T. 5 Concluding Remarks This aer calls into question the simle ARX () structure in dynamic anels with xed e ects. When the lag orders are unknown, the rst-order models are most likely misseci ed. In such cases, attemts to adjust for the bias using formulae that correct for ARX () models would be wrong and may even exacerbate the bias. To address these concerns, we undertake an in-deth investigation of the asymtotic bias of the within-grou estimator, where the bias formulae are derived under ossible lag order misseci cation. It should be noted that, therefore, the main focus of this aer is di erent from develoing bias correction methods that is robust to the serial correlations in the error (e.g., Hahn and Kuersteiner, 24). It is closely related with Solon (984), who considers autocorrelation estimators of the serially correlated error term, but deriving exlicit bias formulae of the autoregressive arameters is new. 
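The concern about applying a first-order correction to a misspecified model can be illustrated with a small Monte Carlo sketch. Everything in it is an assumption made for illustration and is not the simulation design behind Tables I and II: the data are generated from a panel AR(2) with fixed effects and standard normal shocks, an AR(1) is fitted by the within-group estimator, and the correction used is the familiar first-order adjustment, estimate + (1 + estimate)/T. The function names and parameter values are hypothetical.

```python
import random


def simulate_ar2_panel(n_units, n_periods, a1, a2, burn_in=50, seed=0):
    """Simulate y_it = mu_i + a1*y_{i,t-1} + a2*y_{i,t-2} + u_it, u_it ~ N(0, 1).

    Returns one list per unit with n_periods + 1 observations, so that
    n_periods (lag, current) pairs are available for a first-order fit.
    """
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        mu = rng.gauss(0.0, 1.0)          # fixed effect (illustrative choice)
        lag1 = lag2 = 0.0
        series = []
        for t in range(burn_in + n_periods + 1):
            y = mu + a1 * lag1 + a2 * lag2 + rng.gauss(0.0, 1.0)
            if t >= burn_in:              # discard the burn-in draws
                series.append(y)
            lag2, lag1 = lag1, y
        panel.append(series)
    return panel


def within_group_ar1(panel):
    """Within-group (fixed-effects) estimate of a first-order coefficient."""
    num = den = 0.0
    for series in panel:
        lagged, current = series[:-1], series[1:]
        mean_lag = sum(lagged) / len(lagged)
        mean_cur = sum(current) / len(current)
        for x, y in zip(lagged, current):
            num += (x - mean_lag) * (y - mean_cur)
            den += (x - mean_lag) ** 2
    return num / den


def first_order_correction(estimate, n_periods):
    """First-order bias adjustment: estimate + (1 + estimate) / T."""
    return estimate + (1.0 + estimate) / n_periods


if __name__ == "__main__":
    a1, a2, n_periods = 0.4, 0.3, 10      # hypothetical design
    panel = simulate_ar2_panel(2000, n_periods, a1, a2, seed=1)
    wg = within_group_ar1(panel)
    corrected = first_order_correction(wg, n_periods)
    pseudo_true = a1 / (1.0 - a2)         # first autocorrelation of the AR(2)
    print("within-group AR(1) estimate:", round(wg, 3))
    print("after first-order correction:", round(corrected, 3))
    print("pseudo-true AR(1) value:", round(pseudo_true, 3), "| true a1:", a1)
```

Comparing the two estimates with both the pseudo-true first-order value and the true first lag coefficient shows which target, if any, the correction moves toward in a given design.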
For dynamic panel regression, instrumental variables estimation after first differencing (e.g., Arellano and Bond, 1991) is an alternative approach, which does not require any bias correction. However, the instruments are taken from the lagged values of the dependent variable, presuming that the error term is not serially correlated. With lag order misspecification, the regression error could be serially correlated, so the instruments may no longer be valid, which makes the estimator inconsistent. This paper emphasizes lag order misspecification in the linear model. When linearity is in doubt, however, one could consider the nonparametric approach of Lee (2010a), in which a proper bias correction is also developed for nonlinear models.
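The point about instrument validity can be checked numerically with a short sketch. It does not implement the Arellano and Bond GMM estimator. As a simpler stand-in it uses a just-identified first-difference IV estimator of the Anderson-Hsiao type, with the level y_{t-2} as the instrument for the lagged difference. The data-generating process, parameter values and function names are assumptions made for illustration only.

```python
import random


def simulate_panel(n_units, n_periods, a1, a2, burn_in=50, seed=0):
    """Simulate y_it = mu_i + a1*y_{i,t-1} + a2*y_{i,t-2} + u_it, u_it ~ N(0, 1).

    Setting a2 = 0 gives a correctly specified first-order model.
    Returns one list of n_periods observations per unit.
    """
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        mu = rng.gauss(0.0, 1.0)          # fixed effect (illustrative choice)
        lag1 = lag2 = 0.0
        series = []
        for t in range(burn_in + n_periods):
            y = mu + a1 * lag1 + a2 * lag2 + rng.gauss(0.0, 1.0)
            if t >= burn_in:              # discard the burn-in draws
                series.append(y)
            lag2, lag1 = lag1, y
        panel.append(series)
    return panel


def first_difference_iv_ar1(panel):
    """IV estimate of alpha in dy_t = alpha * dy_{t-1} + de_t.

    The level y_{t-2} instruments dy_{t-1}; this is valid only if the
    error of the first-order model is serially uncorrelated.
    """
    num = den = 0.0
    for series in panel:
        for t in range(2, len(series)):
            instrument = series[t - 2]
            num += instrument * (series[t] - series[t - 1])
            den += instrument * (series[t - 1] - series[t - 2])
    return num / den


if __name__ == "__main__":
    a1 = 0.4                              # hypothetical first lag coefficient
    correct = simulate_panel(4000, 12, a1, 0.0, seed=2)
    misspecified = simulate_panel(4000, 12, a1, 0.3, seed=2)
    print("IV estimate, true AR(1):", round(first_difference_iv_ar1(correct), 3))
    print("IV estimate, true AR(2):", round(first_difference_iv_ar1(misspecified), 3))
```

When the second lag is omitted, it is absorbed into the error of the first-order model, so y_{t-2} is correlated with the differenced error and the estimate no longer settles near the first lag coefficient.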

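Appendix: Mathematical proofs

For $V_{i,t} = \sum_{j=0}^{\infty} A^{j} U_{i,t-j}$, we first derive the following lemmas. See Lee (2009) for the proofs.

Lemma A1 Under Assumptions E and S,
$$\operatorname*{plim}_{N\to\infty} \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T} V_{i,t-1}U_{i,t}' = -\frac{\sigma^{2}}{T}(I-A)^{-1}(I-H_T)M,$$
$$\operatorname*{plim}_{N\to\infty} \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T} V_{i,t-1}V_{i,t-1}' = \Gamma - \frac{1}{T}(I-A)^{-1}(I-H_TA)\Gamma - \frac{1}{T}\{(I-A)^{-1}(I-H_T)A\Gamma\}',$$
where $M = e_1e_1'$ and $H_T = (I-A)^{-1}(I-A^{T})/T$.

Lemma A2 Under Assumptions E, S, I and NT,

NT N X TX i= t= vec Vi;t Ui;t!d N 2 vec((i A) M); 2 (M )

as N, T → ∞ jointly, where $\Gamma = E V_{i,t}V_{i,t}' = \sigma^{2}\sum_{j=0}^{\infty} A^{j}MA^{j\prime}$.

Proof of Theorem 1 Recall that $Y = Y_{-1}A' + XB + U = Z + V$. Lemma A1 implies that
$$\operatorname*{plim}_{N\to\infty}\hat\Gamma = \operatorname*{plim}_{N\to\infty}\frac{1}{NT}Z_{-1}'Q_XZ_{-1} + \operatorname*{plim}_{N\to\infty}\frac{1}{NT}V_{-1}'V_{-1} = S_X + \Gamma - G_0, \qquad \text{(A.1)}$$
$$\operatorname*{plim}_{N\to\infty}\hat\Gamma_1 = \operatorname*{plim}_{N\to\infty}\frac{1}{NT}Y_{-1}'Q_XY_{-1}A' + \operatorname*{plim}_{N\to\infty}\frac{1}{NT}V_{-1}'U = S_XA' + \Gamma_1 - G_1 - G_2, \qquad \text{(A.2)}$$
from the strict exogeneity of $X_{i,t}$, where
$$G_0 = \frac{1}{T}\left[(I-A)^{-1}(I-H_TA)\Gamma + \{(I-A)^{-1}(I-H_T)A\Gamma\}'\right], \qquad \text{(A.3)}$$
$$G_1 = \frac{1}{T}\left[(I-A)^{-1}(I-H_TA)\Gamma_1 + \{A(I-A)^{-1}(I-H_T)\Gamma_1'\}'\right], \qquad \text{(A.4)}$$
$$G_2 = \frac{\sigma^{2}}{T}(I-A)^{-1}(I-H_T)M. \qquad \text{(A.5)}$$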

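Note that $\Gamma_1 = \Gamma A'$. Then, from (6), the bias expression for $\hat\alpha(p;q)$ follows as
$$
\begin{aligned}
\operatorname*{plim}_{N\to\infty}\left(\hat\alpha(p;q)-\alpha(p;q)\right)
&= -(J_q'(S_X+\Gamma)J_q)^{-1}J_q'(G_1+G_2)e_1 \\
&\quad + \left[I_q-(J_q'(S_X+\Gamma)J_q)^{-1}J_q'G_0J_q\right]^{-1}(J_q'(S_X+\Gamma)J_q)^{-1}J_q'G_0J_q\,(J_q'(S_X+\Gamma)J_q)^{-1}J_q'\left[S_XA'+\Gamma_1-G_1-G_2\right]e_1 \\
&= -(R_{q,1}+R_{q,2}) + (I_q-R_{q,0})^{-1}R_{q,0}\left[(J_q'(S_X+\Gamma)J_q)^{-1}J_q'(S_X+\Gamma)A'e_1 - R_{q,1} - R_{q,2}\right] \\
&= -(I_q-R_{q,0})^{-1}R_{q,2} - (I_q-R_{q,0})^{-1}\left\{R_{q,1} - R_{q,0}(J_q'(S_X+\Gamma)J_q)^{-1}J_q'(S_X+\Gamma)A'e_1\right\}
\end{aligned}
$$
since $(J_q'(S_X+\Gamma)J_q - J_q'G_0J_q)^{-1} = (I_q - (J_q'(S_X+\Gamma)J_q)^{-1}J_q'G_0J_q)^{-1}(J_q'(S_X+\Gamma)J_q)^{-1}$, and by letting $R_{q,0} = (J_q'(S_X+\Gamma)J_q)^{-1}J_q'G_0J_q$, $R_{q,1} = (J_q'(S_X+\Gamma)J_q)^{-1}J_q'G_1e_1$ and $R_{q,2} = (J_q'(S_X+\Gamma)J_q)^{-1}J_q'G_2e_1$.

Proof of Theorem 2 First note that

NT ( b r N ) = T T (by jx )A + Y Q X U, NT

in which the first term satisfies N=T T ( b )A! A as N, T → ∞ from (A.1) and Lemma B2 in Lee (2009). For the second term, we have
$$\frac{1}{\sqrt{NT}}Y_{-1}'Q_XU = \frac{1}{\sqrt{NT}}Z_{-1}'Q_XU + \frac{1}{\sqrt{NT}}V_{-1}'Q_XU \equiv C^{1}_{N,T} + C^{2}_{N,T}.$$
It can be verified that
$$\mathrm{vec}(C^{1}_{N,T}) \to_d N\left(0,\ \sigma^{2}\Big(M \otimes \operatorname*{plim}_{N,T\to\infty} Z_{-1}'Q_XZ_{-1}/NT\Big)\right)$$
as N, T → ∞, provided $\operatorname*{plim}_{N,T\to\infty}(Z_{-1}'Q_XZ_{-1}/NT)$ is nonzero and exists. However, since $X'X/NT = O_p(1)$, $X'U/\sqrt{NT} = O_p(1)$ and $V_{-1}'X/NT = o_p(1)$ from the strict exogeneity, we have
$$C^{2}_{N,T} = \frac{1}{\sqrt{NT}}V_{-1}'U - \frac{V_{-1}'X}{NT}\left(\frac{X'X}{NT}\right)^{-1}\frac{X'U}{\sqrt{NT}} = \frac{1}{\sqrt{NT}}V_{-1}'U + o_p(1).$$
Lemma A2 therefore implies that

vec(c 2 N;T )! d N ( 2 vec((i A) M); 2 (M ))

as N, T → ∞. In sum, we have

NT vec( b )! d N ; 2 M ( + lim N;T! (Z Q X Z =NT )), where = vec( 2 (I A) M + A )

since $C^{1}_{N,T}$ and $C^{2}_{N,T}$ are uncorrelated. Further note that, as shown in Lee (2009), $\lim_{T\to\infty} S_X = \operatorname*{plim}_{N,T\to\infty} Z_{-1}'Q_XZ_{-1}/NT$ (i.e., equivalence between the sequential and the joint probability limits; e.g., Phillips and Moon, 1999).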

Therefore, from (6), for a random vector W N ( ( 2 (I A) M + A )e ; 2 ( + lim T! S X )), we can conclude that (using (A.1), (A.2) and Lemma B2 in Lee (2009))

NT (b (; q) = (J q (; q)) J q ) J qf NT ( b (Jq b J q ) [Jq N=T T ( b! d (J q( + lim T! S X)J q ) J qw + (J q( + lim T! S X)J q ) [J q = d N ( X q ; 2 D X q ) as N; T!, )e g )J q ](J q J q ) Jq b e Jq ](J q( + lim T! S X)J q ) J q( + lim T! S XA )e

where D X q = lim T! D X q = (J q( + lim T! S X )J q ) and X q = D X q J q( 2 (I A) M + A )e D X q (J qj q )D X q J q( + lim T! S X A )e.

References

[1] Alvarez, J. and M. Arellano (2003). The time series and cross-section asymptotics of dynamic panel data estimators, Econometrica, 71, 1121-1159.
[2] Arellano, M. and S. Bond (1991). Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations, Review of Economic Studies, 58, 277-297.
[3] Arellano, M. and J. Hahn (2006). A likelihood-based approximate solution to the incidental parameter problem in dynamic nonlinear models with multiple effects, CEMFI Working Paper No. 0613.
[4] Bester, C.A. and C. Hansen (2007). A Penalty Function Approach to Bias Reduction in Nonlinear Panel Models with Fixed Effects, Journal of Business and Economic Statistics, forthcoming.
[5] Bhansali, R.J. (1978). Linear prediction by autoregressive model fitting in the time domain, Annals of Statistics, 6, 224-231.
[6] Bhansali, R.J. (1981). Effects of not knowing the order of an autoregressive process on the mean squared error of prediction I, Journal of the American Statistical Association, 76, 588-597.
[7] Bun, M.J.G. and M.A. Carree (2005). Bias-corrected Estimation in Dynamic Panel Data Models, Journal of Business and Economic Statistics, 23, 200-210.
[8] Hahn, J. and G. Kuersteiner (2002). Asymptotically unbiased inference for a dynamic panel model with fixed effects, Econometrica, 70, 1639-1657.

[9] Hahn, J. and G. Kuersteiner (2004). Bias reduction for dynamic nonlinear panel models with fixed effects, unpublished manuscript, UCLA.
[10] Kiviet, J.F. (1995). On bias, inconsistency, and efficiency of various estimators in dynamic panel models, Journal of Econometrics, 68, 53-78.
[11] Kunitomo, N. and T. Yamamoto (1985). Properties of predictors in misspecified autoregressive time series models, Journal of the American Statistical Association, 80, 941-950.
[12] Lee, Y. (2006). Nonparametric Approaches to Dynamic Panel Modelling and Bias Correction, Ph.D. dissertation, Yale University.
[13] Lee, Y. (2009). Supplementary Appendix to Bias in Dynamic Panel Models under Time Series Misspecification, available at www-personal.umich.edu/~yoolee/research.html.
[14] Lee, Y. (2010a). Nonparametric Estimation of dynamic panel models with fixed effects, unpublished manuscript, University of Michigan.
[15] Lee, Y. (2010b). Model selection in the presence of incidental parameters, unpublished manuscript, University of Michigan.
[16] Marmol, F. (1995). The stationarity conditions for an AR(2) process and Schur's theorem, Econometric Theory, 11, 1180-1182.
[17] Nerlove, M. (1967). Experimental evidence on the estimation of dynamic economic relations from a time series of cross-sections, Economic Studies Quarterly, 18, 42-74.
[18] Neyman, J. and E. Scott (1948). Consistent estimates based on partially consistent observations, Econometrica, 16, 1-32.
[19] Nickell, S. (1981). Biases in dynamic models with fixed effects, Econometrica, 49, 1417-1425.
[20] Okui, R. (2008). Panel AR(1) estimators under misspecification, Economics Letters, 101, 210-213.
[21] Phillips, P.C.B. and H.R. Moon (1999). Linear Regression Limit Theory for Nonstationary Panel Data, Econometrica, 67, 1057-1111.
[22] Phillips, P.C.B. and D. Sul (2007). Bias in Dynamic Panel Estimation with Fixed Effects, Incidental Trends and Cross Section Dependence, Journal of Econometrics, 137, 162-188.
[23] Solon, G. (1984). Estimating autocorrelations in fixed-effects models, NBER Technical Working Paper No. 32.