GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails

Similar documents
Chapter 1. GMM: Basic Concepts

Chapter 2. Dynamic panel data models

GMM based inference for panel data models

On GMM Estimation and Inference with Bootstrap Bias-Correction in Linear Panel Data Models

Inference about Clustering and Parametric. Assumptions in Covariance Matrix Estimation

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley

13 Endogeneity and Nonparametric IV

On the optimal weighting matrix for the GMM system estimator in dynamic panel data models

Chapter 2. GMM: Estimating Rational Expectations Models

Simple Estimators for Semiparametric Multinomial Choice Models

Simple Estimators for Monotone Index Models

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43

On Standard Inference for GMM with Seeming Local Identi cation Failure

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008

Strength and weakness of instruments in IV and GMM estimation of dynamic panel data models

Estimating the Number of Common Factors in Serially Dependent Approximate Factor Models

Parametric Inference on Strong Dependence

A Course on Advanced Econometrics

A New Approach to Robust Inference in Cointegration

Econometrics of Panel Data

Notes on Time Series Modeling

Testing for Regime Switching: A Comment

Judging contending estimators by simulation: tournaments in dynamic panel data models

Robust Standard Errors in Transformed Likelihood Estimation of Dynamic Panel Data Models with Cross-Sectional Heteroskedasticity

Short T Dynamic Panel Data Models with Individual and Interactive Time E ects

1 The Multiple Regression Model: Freeing Up the Classical Assumptions

Notes on Generalized Method of Moments Estimation

Subsets Tests in GMM without assuming identi cation

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

Robust Standard Errors in Transformed Likelihood Estimation of Dynamic Panel Data Models with Cross-Sectional Heteroskedasticity

Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity

Estimation and Inference with Weak Identi cation

Asymptotic distributions of the quadratic GMM estimator in linear dynamic panel data models

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2016 Instructor: Victor Aguirregabiria

GMM estimation of spatial panels

1 Estimation of Persistent Dynamic Panel Data. Motivation

LECTURE 12 UNIT ROOT, WEAK CONVERGENCE, FUNCTIONAL CLT

Comment on HAC Corrections for Strongly Autocorrelated Time Series by Ulrich K. Müller

Speci cation of Conditional Expectation Functions

Comparing the asymptotic and empirical (un)conditional distributions of OLS and IV in a linear static simultaneous equation

Pseudo panels and repeated cross-sections

Improving GMM efficiency in dynamic models for panel data with mean stationarity

Lecture Notes on Measurement Error

Nonparametric Identi cation of a Binary Random Factor in Cross Section Data

11. Bootstrap Methods

Introduction: structural econometrics. Jean-Marc Robin

The Impact of a Hausman Pretest on the Size of a Hypothesis Test: the Panel Data Case

Chapter 6: Endogeneity and Instrumental Variables (IV) estimator

Economics 241B Estimation with Instruments

x i = 1 yi 2 = 55 with N = 30. Use the above sample information to answer all the following questions. Show explicitly all formulas and calculations.

1 Procedures robust to weak instruments

ECON2285: Mathematical Economics

We begin by thinking about population relationships.

Estimation and Inference with Weak, Semi-strong, and Strong Identi cation

Modeling Expectations with Noncausal Autoregressions

The Hausman Test and Weak Instruments y

GMM Estimation with Noncausal Instruments

MC3: Econometric Theory and Methods. Course Notes 4

On the Power of Tests for Regime Switching

Testing Weak Convergence Based on HAR Covariance Matrix Estimators

A Conditional-Heteroskedasticity-Robust Con dence Interval for the Autoregressive Parameter

A CONDITIONAL-HETEROSKEDASTICITY-ROBUST CONFIDENCE INTERVAL FOR THE AUTOREGRESSIVE PARAMETER. Donald W.K. Andrews and Patrik Guggenberger

1. GENERAL DESCRIPTION

Robust Con dence Intervals in Nonlinear Regression under Weak Identi cation

UNIVERSITY OF CALIFORNIA, SAN DIEGO DEPARTMENT OF ECONOMICS

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria

Tests for Cointegration, Cobreaking and Cotrending in a System of Trending Variables

Bias Correction Methods for Dynamic Panel Data Models with Fixed Effects

Specification Test for Instrumental Variables Regression with Many Instruments

Dynamic Panels. Chapter Introduction Autoregressive Model

Lecture 4: Linear panel models

Economics 241B Review of Limit Theorems for Sequences of Random Variables

Panel VAR Models with Spatial Dependence

Weak - Convergence: Theory and Applications

Equivalence of several methods for decomposing time series into permananent and transitory components

An estimate of the long-run covariance matrix, Ω, is necessary to calculate asymptotic

1 Introduction The time series properties of economic series (orders of integration and cointegration) are often of considerable interest. In micro pa

CAE Working Paper # Fixed-b Asymptotic Approximation of the Sampling Behavior of Nonparametric Spectral Density Estimators

GROWTH AND CONVERGENCE IN A MULTI-COUNTRY EMPIRICAL STOCHASTIC SOLOW MODEL

Modeling Expectations with Noncausal Autoregressions

Models, Testing, and Correction of Heteroskedasticity. James L. Powell Department of Economics University of California, Berkeley

Bootstrapping Long Memory Tests: Some Monte Carlo Results

Some Recent Developments in Spatial Panel Data Models

GMM Based Tests for Locally Misspeci ed Models

Bootstrapping Long Memory Tests: Some Monte Carlo Results

The Predictive Space or If x predicts y; what does y tell us about x?

SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011

Combining Macroeconomic Models for Prediction

Testing for a Trend with Persistent Errors

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE

Chapter 6. Maximum Likelihood Analysis of Dynamic Stochastic General Equilibrium (DSGE) Models

Spatial panels: random components vs. xed e ects

Efficient Estimation of Dynamic Panel Data Models: Alternative Assumptions and Simplified Estimation

ECONOMETRICS II (ECO 2401) Victor Aguirregabiria (March 2017) TOPIC 1: LINEAR PANEL DATA MODELS

Rank Estimation of Partially Linear Index Models

CAE Working Paper # A New Asymptotic Theory for Heteroskedasticity-Autocorrelation Robust Tests. Nicholas M. Kiefer and Timothy J.

Economics 620, Lecture 14: Lags and Dynamics

A New Computational Algorithm for Random Coe cients Model with Aggregate-level Data

Nearly Unbiased Estimation in Dynamic Panel Data Models

The Time Series and Cross-Section Asymptotics of Empirical Likelihood Estimator in Dynamic Panel Data Models

Transcription:

GMM-based inference in the AR() panel data model for parameter values where local identi cation fails Edith Madsen entre for Applied Microeconometrics (AM) Department of Economics, University of openhagen, Denmark April 9 PRELIMIARY Abstract We are interested in making inference on the AR parameter in an AR() panel data model when the time-series dimension is xed and the cross-section dimension is large. We consider a GMM estimator based on the second order moments of the observed variables. It turns out that when the AR parameter equals unity and certain restrictions apply to the other parameters in the model then local identi cation fails. We show that the parameters and in particular the AR parameter can still be estimated consistently but at a slower rate than usual. Also we derive the asymptotic distribution of the parameter estimators which turns out to be non-standard. The properties of this estimator of the AR parameter are very di erent from those of the widely used Arellano-Bond GMM estimator which is obtained by taking rst-di erences of the variables and then use lagged levels as instruments. It is well-known that this estimator performs poorly when the AR parameter is close to unity. The reason for this di erence between the two estimators is that with the Arellano-Bond moment conditions global identi cation fails for the speci c values of the parameters considered here whereas with the non-linear GMM estimator that uses all information from the second order moments only local identi cation fails. We illustrate the results in a simulation study. Keywords: AR() panel data model, GMM estimation, local identi cation failure, rate of convergence, non-standard limiting distribution

Introduction We are interested in making inference on the parameters in the simple AR() panel data model. This type of model is used to distinguish between two types of persistency often found in economic variables at the individual level such as incomes and wages. Within the framework of this model one source of persistency is attributed to the AR parameter which is common across all crosssection units and it describes how persistent unanticipated shocks to the variables of interest are. The other source of persistency is attributed to unobserved individual-speci c e ects and re ects di erences that remains constant over time for speci c cross-section units. Within this type of model the focus has mainly been on estimating the AR parameter whereas the individual-speci c e ects or the parameters describing speci c features of their distribution are treated as nuisance parameters. One of the most widely used estimators of the AR parameter is the Arellano-Bond GMM estimator which is obtained by taking rst-di erences of the variables in the equation in order to eliminate the individual-speci c e ects and then use lagged levels of the variable as instruments. This estimator has several advantages. First of all, it is based on moment conditions that are linear in the AR parameter and consequently it is straight forward to calculate. In addition, it only requires very basic assumptions about the di erent terms in the AR() panel data model. In particular, it does not require that the exact relationship between the initial values and the individual-speci c e ects is speci ed. However, it is also well-known that this estimator sometimes su ers from a weak instrument problem when the AR parameter is close to unity. The contribution of this paper is to get a better understanding of when there is a problem with identi cation (weak instruments in the linear case) and to show that the very basic assumptions implies moment conditions that do contain information about the AR parameter also when this is not the case for the linear Arellano-Bond moment conditions. Using all of the moment conditions gives a simple non-linear GMM estimator and we derive the asymptotic properties of this estimator. We will consider micro-panels where the cross-section dimension is large and the time-series dimension T is relatively small. This means that the asymptotic approximations to the statistics of interest will be based on letting! while T is xed. Under the same basic assumptions Ahn & Schmidt (995, 997) obtain the Arellano-Bond moments conditions and additional moment conditions that are quadratic in the AR parameter. These moment conditions do not depend on the remaining parameters in the model and they are obtained by transforming the original set of moment conditions. The approach in this paper is a somewhat di erent as we will use the full set of moment conditions and obtain estimators of all parameters used in the speci cation of the model. The ndings in this paper are based on a result from the statistical literature. Rotnitzky, ox, Bottai & Robins () consider a likelihood-based inference procedure when the standard assumptions about identi cation are not satis ed. The application to the GMM-based framework

is somewhat similar but there are also some di erences. We are able to derive the limiting distribution of the GMM estimator for the particular parameter values where local identi cation fails. It turns out that it is necessary to consider a Taylor expansion of the GMM objective function of higher order than usual in order to be able to explain its behavior. In particular, we nd that the GMM estimator of the AR parameter is =4 -consistent and has a limiting distribution which is a non-standard distribution. In a simulation study we show that this limiting distribution provides a good approximation to the actual distribution of the parameter estimators. Finally, the ndings in Stock & Wright () about the properties of GMM estimators under weak instruments are interesting in relation to the results of this paper. Stock & Wright () consider a general framework with moment conditions that are non-linear in the parameters but linear in the instruments under the assumption that the moment conditions provide only weak identi cation of one of the parameters. In this case they nd that the GMM estimator of the parameter that is only weakly identi ed is inconsistent whereas the GMM estimator of the remaining parameters is consistent at the usual rate but with a non-standard limiting distribution. In their paper the limiting distributions of the GMM estimators are expressed as being the distribution of the values at which a non-central chi-squared process is minimized. Here in this paper, the moment conditions provide identi cation of all parameters even though the assumption about rst order local identi cation is not satis ed which implies that the convergence rate is slower than usual. In addition we provide the exact limiting distribution. The model We consider the AR() panel data model with individual-speci c e ects y it = y it + i + " it for i = ; :::; and t = ; :::; T () For notational convenience it is assumed that y i is observed such that the number of observations over time equals T +. By repeated substitution in equation () we nd that y it = t y i +! Xt s i + s= The following assumptions are imposed: Assumption (y i ; i ; " i ; :::; " it ) is iid across i and has nite fourth order moments E (" it ) =, E " it = " and E (" is " it ) = for s 6= t and s; t = ; :::; T E ( i ) = and E i = E (y i ) = and E y i = E (" it i ) = and E (" it y i ) = for t = ; :::; T E (y i i ) = tx t s " is for t = ; ; ::: () s= 3

These assumptions are very general and in particular they do not require that the exact relationship between the initial values y i and the individual-speci c e ects i is speci ed. The assumptions imply that the observed variables y i = (y i ; :::; y it ) are iid across i with rst order moments equal to zero and second order moments that can be obtained by using the expression in (). In the next section this is used as the basis for the estimation of the parameters in the model. ote that we have assumed that the error terms " it have homoskedastic variances over time. This assumption could be relaxed in order to allow for heteroskedastic variances over time. Before we continue let us examine the assumption about mean stationarity. This assumption is often applied when deriving the asymptotic properties of parameter estimators in the simple AR() panel data model. It imposes a restriction on the relationship between the initial values and the individual-speci c e ects such that the covariance between the observed variable y it and the individual-speci c e ect i is the same for all t = ; ; :::; T. Mean stationarity can be formulated in the following two ways: ) y i = i + " i and i = ( ) i ) y i = i + " i for 6= where " i is iid across i with E (" i ) = and E " i = " for. Solving equation () recursively and using the assumption about mean stationarity we have ) y it = i + t " i + t " i + ::: + " it for all t = ; ; ; ::: (3) ) y it = i + t " i + t " i + ::: + " it for all t = ; ; ; ::: (4) This implies that (and therefore the term mean stationary ): ) E (y it j i ) = i for all t = ; ; ; ::: ) E (y it j i ) = i for all t = ; ; ; ::: As long as the AR parameter is considered as being xed the two ways of formulating mean stationarity are equivalent. However, when the value of approaches unity (usually formulated as being local-to-unity, i.e. = c= p ) then the two data generating processes are very di erent. In general under the formulation in ) we have = ( ) and = ( ) such that ( ) =. This means that = implies that = and =. It turns out that under the formulation in ) the correlation between y i and i is xed as! and zero when = whereas under the formulation in ) the correlation between y i and i tends to as! and in addition the variables y i ; :::; y it are dominated by the term i = ( to being linear dependent as!. ) and close From the expressions in (3)-(4) it is clear that the Arellano-Bond GMM estimator where lagged levels are used as instruments for the variables in rst-di erences su ers from a weak instrument 4

problem under mean stationarity and when is close to unity. Under the formulation in ) the rst-di erences y it can be written as y it = t t " i + ::: + ( ) " it + " it whereas the instruments y i ; :::; y it depend on i ; " i ; :::; " it. This means that there is a weak instrument problem when is close to unity. Under the formulation in ) this is even more serious as the instruments are dominated by the individual-speci c term i = ( ) which does not appear in the rst-di erences. In the existing literature it is common to consider the formulation in ). For example Blundell & Bond (998) and Blundell, Bond & Windmeijer () explain the weak instrument problem in the AR() panel data model within this framework from simulation studies and recently Hahn, Hausman & Kursteiner (7) have derived the asymptotic behavior of the Arellano-Bond GMM estimator under the assumption that is local-to-unity. In the present paper we derive the asymptotic properties of a speci c GMM estimator based on the restrictions in Assumption when the observed variables are generated by mean stationary AR() processes as formulated in ) with an AR parameter equal to unity. More speci cally, we consider the behavior of a GMM estimator when the true parameter values are = ; = and =. It is important to emphasize that only the restrictions in Assumption and not the additional restrictions implied by mean stationarity are used in the estimation of the parameters. 3 Estimation For the moment, we assume that we observe y i ; y i ; y i for all i (3 observations over time). The model contains the following 5 parameters: ; ; ; ; " The 5 vector of parameters is de ned as = ; ; ; ; ". We let denote the true value of the parameter. In the following we are interested in estimation of the parameter when its true value equals with = ; ; ; ; " where and " are the true (unknown) values of the parameters and ". By using the expression in () we have that y i = y i + i + " i y i = y i + ( + ) i + " i + " i 5

Under Assumption the model yields the following 6 moment conditions: E yi = E (y i y i ) = + E (y i y i ) = + ( + ) E yi = + + + " E (y i y i ) = 3 + ( + ) + ( + ) + " E yi = 4 + ( + ) + ( + ) + + " Using vector notation this can be written as E [g (y i ; )] = (5) where the 6 vector g (y i ; ) corresponding to the expressions above is de ned in Appendix A. The moment conditions are based on second order moments of the observed variable y i = (y i ; y i ; y i ). We see that for a xed value of then the expressions on the right hand side in the equations above are linear in the remaining parameters ; ; ; ". ote that we do not use the additional moment conditions implied by mean stationarity here. The variance of moment function g (y i ; ) evaluated at the true parameter value = is given by Var [g (y i ; )] = E g (y i ; ) g (y i ; ) = (6) where is positive de nite a 6 6 matrix that depends on the fourth order moments of the observed variables (y i ; y i ; y i ). Ahn & Schmidt (995, 997) show that under Assumption and when there are 3 observations over time then the following two moment conditions hold E (y i (y i y i )) = (7) E (y i y i ) (y i y i ) = (8) The rst is the linear Arellano-Bond moment condition and the other is quadratic in the AR parameter. ote that they do not depend on the variance parameters ; ; ; ". These two moment conditions consist of a subset of the full set of moment conditions in (5) and therefore there might be a loss of information by only using these two moment conditions. repon, Kramarz & Trognon (997) show that when the usual rst order asymptotic approximations apply then the GMM estimator of the AR parameter based on the moment conditions in (5) and the GMM estimator based on the two moments above are asymptotically equivalent. This might not be the case here in this paper since the rst order asymptotic approximations are not valid. Therefore we will consider the full set of moment conditions in (5). 6

3. Identi cation The condition for global identi cation is The condition for rst order local identi cation is @g rank E @ (y i; ) E [g (y i ; )] = if and only if = (9) = 5 () It turns out that rst order local identi cation fails at = since @g @g rank E @ (y i; ) = rank @ (y i; ) = 4 () In Appendix A it is shown that the derivatives with respect to ; ; ; " are linear dependent. This does not imply that these parameters are not identi ed when = but it does mean that the behavior of the GMM estimator can not be explained by the usual rst-order asymptotic approximations. When () holds then there is a reparametrization = () such that = and ~ = ( ; ::; 5 ) and where it holds that the expected value of the rst order derivative of g (y i ; ) with respect to equals zero at = ( ) and the 6 4 matrix of the expected value of the rst order derivatives of g (y i ; ) with respect to ~ has full rank equal to 4 at = ( ). This means that in a neighborhood around the moment conditions can be written as @g ~ E [g (y i ; )] = E @ ~ (y i; ) ~ @ g + E (y i ; ) =! ( ) () The right hand side in the expression above equals zero if and only if = when the following condition holds: @ g E @ @ @g (y i ; ) 6= and it is not a linear combination of E @ ~ (y i; ) In particular, this means that rst order local identi cation is satis ed for the parameters ( ) = ( ) and ( ~ ~ ). It turns out that (3) is satis ed in the case we consider in this paper. An implication of this is that when the true value of equals it is possible to estimate the parameters ( ) and ( ~ ~ ) consistently at the usual p -rate. This is shown in the next section where we also show that it is possible to estimate the parameter of interest consistently at a rate of =4 when its true value equals. Let us brie y have a look at the moment conditions underlying the Arellano-Bond GMM estimator. They are given by E [(y i y i ) y i ] = E yi (3) ( ) E (y i y i ) + E (y i y i ) (4) They only depend on and are linear in this parameter. With moment conditions that are linear in the parameters it holds that the concepts of rst order local and global identi cation are equivalent. 7

It turns out that identi cation fails when = and = and in particular mean-stationarity and an AR parameter equal to unity lead to lack of identi cation. In general identi cation fails when E y i = E (yi y i ) = E (y i y i ). ote that the Arellano-Bond moment conditions in (4) can be written as linear combinations of the moment conditions in (5). In particular, they are a subset of the moment conditions in (5) meaning that they contain less information about the AR parameter. 3. The GMM estimator We consider the GMM estimator ^ that minimizes a quadratic form in the sample moments P i= g (y i; ), that is where The weight matrix W i= ^ = arg min Q () (5) " # " # X X Q () = g (y i ; ) W g (y i ; ) i= (6) is positive semi-de nite and may depend on the data but converges in probability to a positive de nite matrix W as!. In the following we will use the identity matrix as a weight matrix. This is done for simplicity and a di erent weight matrix might be used in the future. Also it is not clear if the weight matrix that is found to be optimal under the standard assumption about local identi cation is optimal when this assumption fails. Since we are rst of all interested in the AR parameter and since the moment conditions in (5) for xed are linear in the other parameters we will consider the concentrated objective function. For a xed value of we let ^ () denote the GMM estimator of = ; ; ; ", that is ^ () = arg min Q (; ) (7) In Appendix A we show that for any value of the minimization problem above always have a unique solution and a closed-form expression for ^ () is easily obtained. In addition ^ ( ) is a p -consistent estimator of if the true value of equals. This results from the matrix in () having reduced rank equal to 4 (the number of parameters minus ) and it is crucial for the results in the paper to hold. The concentrated objective function is de ned as Q () = Q ; ^ () = g () g () (8) where g () = W = P i= g y i ; ^ () is the sample moments evaluated at ; ^ (). ote here that ^ () depends on the weight matrix W but for simplicity this is suppressed in the notation. The GMM estimator of can then be expressed as ^ = arg min Q () (9) 8

When we use the usual optimal weight matrix, i.e W =, then asymptotically times the concentrated objective function as a function of is a non-central -distribution with degrees of freedom (this results from starting with 6 moment conditions and concentrating out 4 parameters). The non-centrality parameter depends on and is given by F () I 6 F () F () F () F () F () where F () = = S (). In particular, since the non-centrality parameter equals zero at = we have that asymptotically Q () is a -distribution with degrees of freedom. It is also clear that the non-centrality parameter is increasing in the number of cross-section observations. Altogether we might consider Q () as a non-central chi-squared process indexed by. It is then possible to simulate this process for di erent values of and obtain the value of where the minimum is obtained. The distribution of these values of would then re ect the asymptotic distribution of the GMM estimator ^. This approach is used in Stock & Wright () within a weak instrument framework such that the non-centrality parameter is considered as being constant as tends to in nity. However, here we will use an approximation of the concentrated objective function in a neighborhood around unity in order to explain the behavior of the GMM estimator ^. 3.3 Asymptotic approximations The derivative of order k of the concentrated objective function is de ned as Q (k) () = dk Q () () dk Expressions can be found in Appendix A. A Taylor expansion of order k of the concentrated objective function Q () around the true value gives Q () = Q ( ) + kx j= Q (j) ( ) ( ) j =j! + L k () () where L k () = Q (k+) () ( ) k+ = (k + )! for some with j j < j j. When the "standard regularity conditions" (including rst order local identi cation) are satis ed then the following results concerning derivatives of the concentrated objective function hold p () Q ( ) = g () ( ) p g ( )! w ; 4G ~ G as! () Q () ( ) as = g () ( ) g () ( )! P G G > as! (3) Q (3) ( ) = O P () (4) where we have used that p g ( )! w ; ~ ; g () ( )! P G 6= and g (k) ( )! P G k as!. For p ( ) bounded then the expansion in () together with the results above imply that the centered (equal to zero at = ) and appropriate scaled concentrated objective 9

function can be approximated by a second order polynomial in p ( ) in a neighborhood around. We have the following (Q () Q ( )) as = p np o np Q () ( ) ( ) + Q () ( ) =! ( )o (5) When considered as a function of p ( unique minimum which is asymptotically equivalent to p (^ ) the right hand side in the expression above has a of which minimizes the left hand side in the expression above. This gives p (^ ) as = p Q () ( ) =Q () ( ) w! ) since ^ is de ned as the value ; (G G ) G ~ G (G G ) as! (6) It means that under the "standard regularity conditions" (including rst order local identi cation) the GMM estimator ^ is a p -consistent estimator of and it is asymptotically normally distributed. In the case considered here in this paper when local identi cation fails for = we need a Taylor expansion of higher order in order to derive asymptotic properties of the GMM estimator. We have the following as! (for details see Appendix A) 3=4 Q () () = =4 g () p () g ()! P as! (7) p () as Q () = g () p () g ()! w ; 4G ~ G as! (8) =4 Q (3) () P! as! (9) Q (4) as () = 6g () () g () ()! P 6G G > as! (3) Q (5) as () = g () () g (3) ()! P G G 3 as! (3) where we have used that p g ()! w ; ~ ; =4 g () ()! P, g () ()! P G 6= and g (k) () P! G k as!. For =4 ( ) bounded then the expansion in () together with the results above imply that the centered and scaled concentrated objective function can be approximated by a fourth order polynomial in =4 ( have the following ) in a neighborhood around unity. We = (Q () Q ()) n o pq o =4 () ( ) () =! + Q(4) n () =4! =4 ( ) + R () (3) where R () =! () 3=4 Q () () + =4 Q (3) () =3!! () + =4 Q (5) () =5!! ()4 (33) n o! () = =4 ( ) (34) It holds that R () P! as!. otice that the limiting distributions of the coe cients in the polynomial on the right hand side in equation (3) above are of the same form as in the

second order polynomial obtained when the assumption about local identi cation is satis ed. We have the following (Q () n o Q ()) as n o = =4 ( ) Z ~ + =4G G =4 ( ) where the right and left hand side in the expression above are asymptotically equivalent and ~Z ; G ~ G. In the case where we have a positive value of Z ~ (this will happen with probability /) then the right hand side in the expression above has a global minimum at =. In the case where we have a negative value of ~ Z (this will happen with probability /) then the expression on the right hand side will have a local maximum at = and two global minima attained at =4 ( ) = q ~Z= (=G G ). This means that ^ = with probability / and ^ = =4 q ~Z= (=G G ) with probability /. In order to determine which of these values gives the minimum of the objective function we will need to consider the deviation of the scaled objective function and the expression on the right hand side in the expression above evaluated at the two values. Using equation (33) we write R () =! () T () where T () = 3=4 Q () () + =4 Q (3) () =3!! () + =4 Q (5) () =5!! ()4 (36) q q We see that T (^ ) = T (^ ) for ^ = ~Z= (=G G )= =4 and ^ = + ~Z= (=G G )= =4. This means that the sign of T (^) will determine at which of the two values the minimum is attained. If the sign of T () is negative (positive) the minimum is attained at ^ (^ ). This has to be conditional on getting a negative draw of p Q () as () =! = Z. ~ So in order to determine where the minimum is attained we have to consider the conditional distribution of =4 T (^) given p () Q () =! and determine the probability that T (^) < conditional on p Q () () <. The di erence between the framework considered here in this paper and the one in Rotnitzky, ox, Bottai & Robins () is that the expected value of the rst h order i derivative of the concentrated moment function at equal to unity equals zero, i.e. E g () () =, while Rotnitzky, ox, Bottai & Robins () assume that the rst order derivative of their objective function (the log-likelihood function) at this point is equal to zero with probability. This means that in our framework p g () () = O P () such that expressions in =4 T (^) involving this term will appear and that complicate things. In fact it turns out that if the usual optimal weight matrix W = is used and if it was the case that g () () = then we would have that =4 T (^) is asymptotically normal with mean zero and independent of p Q () () =! such that the minimum is attained at ^ and ^ with equal probability. (35) The limiting distribution of the GMM estimator of the AR parameter ^ is given in the proposition below. Proposition Under Assumption and when the true value of equals then the following holds: =4 (^ ) w! (Z>) + (Z<) ( ) B p Z as! (37)

where B Ber (:5) and Z (; ) with = 4 (G G ) G ~ G (G G ). In addition B and Z are independent of each other (as explained above this is not shown yet). The result shows that ^ is a =4 -consistent estimator of the AR parameter and that its asymptotic distribution is a mixture with 5% of the probability mass at the true value of unity and 5% each at unity plus/minus the square root of a half-normal distribution. This gives a distribution which is symmetric around the true value of unity with 3 modes of which one is at unity and where the two other modes will become closer to unity as the sample size increases. Also note that the result implies that the parameter ( ) can be estimated consistently at the usual p -rate and that the limiting distribution of p (^ ) is a mixture with 5% of the probability mass at the true value which equals and 5% distributed according to the positive half-normal of Z. This result is similar to what is found for the behavior of parameter estimators when the true value of the parameter is on the boundary of the parameter space, see Geyer (994) and Andrews (999). The limiting distribution of the GMM estimator of the remaining parameters follows by the results in the proposition below. Proposition Under Assumption and when the true value of equals then the following holds as! : p ^ (^) H (^ ) H (^ ) w! Z (38) p ^ (^) H (^ ) H (^ ) =4 (^ ) w! (Z j Z ) (39) where Z Z ; (4) with = 4 (G G ) G ~ G (G G ) H = 6 4 = M () W = W = M () = (G G ) G ~ W = M () M () = F () F () F () " " 3 7 5 H = M () F () () H F () () The result in (38) shows which possibly non-linear transformations of the parameter vector that can be estimated consistently at the usual p -rate when the true value of equals

. The result also gives the marginal asymptotic distributions of the GMM estimator of the parameter since it implies that ^ =4 (^) as = H (^ ). In particular, this means as = whereas the asymptotic distributions of the remaining parameters are that =4 ^ similar to the one obtained for the GMM estimator of the AR parameter ^, see Proposition. In particular, they can be estimated consistently at the rate =4. The result also shows that when using the usual optimal weight matrix W = then the estimator of the transformation of the parameters that can be estimated p -consistently is asymptotically independent of the estimator ^. The result in (39) shows the conditional distribution of p ^ (^) given the value of =4 (^ ). In general we nd that conditional on Z (see Proposition ) then ^ (^) is asymptotically normal with mean + H (^ ) + H (^ ) Z and variance =. Since =4 (^ ) = p Z for Z < and =4 (^ ) = for Z > this means that conditional on (^ ) then with 5% probability ^ (^) is asymptotically normal with mean + E (Z j Z < ) and with 5% probability ^ (^) is asymptotically normal with mean + H (^ ) + H + (^ ). When the usual optimal weight matrix is used, that is W =, then conditional on (^ ) then with 5% probability ^ (^) is asymptotically normal with mean and with 5% probability ^ (^) is asymptotically normal with mean + H (^ ) + H (^ ). 4 A simulation study The results derived in the previous section are now illustrated in a simulation study. We consider the AR() panel data model with parameter values where local identi cation fails, i.e. = ; = ; = ; = ; " =. We consider 5 replications of the model: y it = y it + " it for i = ; :::; and t = ; y i iid ; across i " it iid ; " across i; t We have that = and = 5. In practice the minimum of the concentrated objective function Q () is found by a grid search in the interval [:; :8]. The reason for this is that the function Q () in approximately 5% of the replications have two local minima of which one is the global minimum. We have used the weight matrix W = where = Var [g (y i ; )]. Overall the outcome of this simulation experiment shows that the asymptotic results derived in the previous section explains the actual behavior of the GMM estimator of the AR parameter well (Figure -4). In addition the outcome shows that the joint distribution of the estimators of is also very di erent from what we usually nd when they are jointly normally distributed. (Figure 5-8). This follows by the result in Proposition which also explains the strong quadratic relationship (the green plots) between the estimators of for instance and. 3

Figure : Empirical distribution of ^ for = Figure : Empirical distribution of ^ when p Q () () < for = 4

Figure 3: Empirical distribution of ^ for = 5 Figure 4: Empirical distribution of ^ when p Q () () < for = 5 5

Figure 5: Plot of (^; ^ ) for = 5. The red lines are the true parameter values Figure 6: Plot of (^; ^ ) for = 5. The red lines are the true parameter values 6

Figure 7: Plot of (^; ^ ) for = 5. The red lines are the true parameter values Figure 8: Plot of (^; ^ " ) for = 5. The red lines are the true parameter values 7

Appendix A In this appendix the notation "X as = Y " means that X and Y are asymptotically equivalent as! such that X Y P! as!. The moment conditions: The 6 vector g (y i ; ) consists of the following elements g (y i ; ) = yi g (y i ; ) = y i y i + g 3 (y i ; ) = y i y i + ( + ) g 4 (y i ; ) = yi + + + " g 5 (y i ; ) = y i y i 3 + ( + ) + ( + ) + " g 6 (y i ; ) = yi 4 + ( + ) + ( + ) + + " ote that they can be written as g (y i ; ) = vech (y i y i) S () (4) where = ; ; ; " and the 6 4 matrix S () is de ned as We have that S () = 6 4 ( + ) 3 ( + ) ( + ) 4 ( + ) ( + ) + @g @ (y i; ) = h S () ().. S () where S () () denotes the rst order derivative of S () and is given by 3 S () () = 6 7 4 3 + 4 5 4 3 ( + ) 4 + 6 i 3 7 5 (4) (43) (44) We see that S () () = S () q with q = ; " ; ; " such that the matrix in (43) has reduced rank equal to 4 when = = ; ; ; ; ". 8

We have E [g (y i ; )] = (45) E g (y i ; ) g (y i ; ) = (positive de nite) (46) such that p X i= X i= g (y i ; ) w! (; ) as! (47) g (y i ; )! X g (y i ; ) i= g (y i ; ) X P g (y i ; )!! as! (48) i= The estimator ^ (): Using the expression in (43) we have that " @Q () = @ X g (y i ; )# W S () (49) i= The FO for the optimization problem in (7) give " X vech (y i y i) S () ^ ()# W S () = (5) i= The FO are necessary and su cient since S () W S () as = S () W S () > for all values of since W is positive de nite and S () has full rank equal to 4 for all values of. This implies that ^ () = S () W S () S () W X vech (y i yi) (5) i= The concentrated objective function Q () and its derivatives: We de ne the concentrated sample moment function g () = W = X i= g y i ; ; ^ () = W = The concentrated objective function can be written as X vech (y i yi) i= S () ^ ()! (5) Q () = g () g () (53) 9

Using this we obtain the following expressions for derivatives of Q Q () () = g () g () () (54) Q () () = g () g () () + g() () g () () (55) Q (3) () = g () g (3) () + 6g() () g () () (56) Q (4) () = g () g (4) () + 8g() () g (3) () + 6g() () g () () (57) Q (5) () = g () g (5) () + g() () g (4) () + g() () g (3) () (58) Inserting the expression for ^ () in (5) in equation (5) we have g () = I 6 F () F () F () F () W = X vech (yiy i ) (59) where F () = W = S (). ote that I 6 F () F () F () F () W = S () = such that when the true value of equals then the concentrated moment function evaluated at the true value can be written as g ( ) = I 6 F ( ) F ( ) F ( ) F ( ) W = i= X g (y i ; ) (6) The Lemma below gives the limiting distributions of p g () and p ^ () when the true value of equal. Lemma 3 When = the following holds as! where M () W = p (I 6 P ()) W = For W = we have that X i= p g (y i ; ) w! X i= g (y i ; ) w! F () = W = F () = W = S () i= ; M () W = W = M () (6) ; (I 6 P ()) W = W = (I 6 P ()) (6) = @g (y i ; ) S () = W @ M () = F () F () F () M () = F () F () F () P () = F () M () P () = F () M () (I 6 P ()) W = W = M () =

such that the two linear transformations of P i= g (y i; ) are orthogonal. The probability limit of ^ () when the true parameter value equals is given by where plim ^ () = M () S () (63)! M () = S () W S () S () W (64) In addition we have that p ^ () M () S () w! ; M () M () as! (65) This means that ^ () is a p -consistent estimator of since M () S () = I 4 or more generally that when evaluated at the true value of the estimator ^ is a p -consistent estimator of the true value of. Asymptotic results: For the rst order derivative of Q () we have that 3=4 Q () () = = g () W =4 g () ()! P as! (66) where we have used that = g ()! w ; ~ and that =4 g () ()! P as!. For the second order derivative of Q () we have that = Q () () =! = = g () g () () + =4 g () () =4 g () () as = = g () g () ()! w ; G ~ G as! (67) where we have used that = g ()! w ; ~, =4 g () ()! P and that g () ()! P G as!. For the third order derivative of Q () we have that =4 Q (3) ()!=3! = =3 =4 g () g (3) () + =4 g () () g () () P! as! (68) where we have used that =4 g ()! P and =4 g () ()! P as! and also that g () () and g (3) () both converge in probability as!. For the fourth order derivative of Q () we have that Q (4) () =4! = =g () g (4) () + =3g() () g (3) () + =4g() () g () () as = =4g () () g () ()! P =4G G as! (69)

where we have used that g ()! P and g () ()! P as! and also that g (3) () and g(4) () both converge in probability as!. For the fth order derivative of Q () we have that =4 Q (5) () =5! = =5! =4 g () g (5) () + =5! =4 g () () g (4) () + =5! =4 g () () g (3) () P! as! (7) where we use arguments similar to the ones above. This gives the following result (Q (^) Q ()) as = 3=4 Q () () =4 (^ ) + = Q () n () =4 (^ )o =! + =4 Q (3) () n =4 (^ ) o 3 4 (4) =3! + Q n () =4 (^ )o =4! 5 + =4 Q (5) n () =4 (^ )o =5! o as 4 = = g () g () n () =4 () (^ ) + =4g () g () n () =4 (^ )o (7) Derivatives of the concentrated moment function: g k () = dk g () (7) dk The rst order derivative g () = W = @g @ + @g ^() @ = W = S () () ^ () () S () ^ () (73) The second order derivative g () = W = S () () ^ () S () () () () ^ () S () ^ () (74) The third order derivative g (3) = W = S (3) () ^ () 3S () () () ^ 3S () () (3) () ^ () S () ^ () The fourth order derivative g (4) = W = S (4) () ^ () 4S (3) () () ^ 6S () () () ^ 3S () (3) () ^ S () (4) ^ (75) (76)

Limiting distribution of ^ (^): We use the following de nitions The rst order derivative of ^ (): ^() () = F () F () () F () W = F () = W = S () (77) F (k) = () = W S(k) () for k = ; ; 3; 4 (78) X vech (y i yi) i= F () F () h i F () () F () + F () S () F () () F () F () W = = F () F () () W = F () Using this we have that when = X vech (y i yi) i= h F () () F () + F () F () X vech (y i yi) i= () i ^ ()! =4^() () as = =4 F () F () F () () F () h i F () () F () + F () F () () = =4 F () F () F () F () () = =4 q (79) where the second line follows by using that p P i= (vech (y iy i ) S () ) w! (; ) as! and the last line follows from using that F () () = W = S () () = W = S () q = F () q. with q = ; " ; ; " The second order derivative of ^ (): ^() () = F () F () () (F () W = X vech (y i y i) i= F () F () h i F () () F () + F () () F () () + F () F () () ^ () F () F () h F () () F () + F () F () () i ^() () 3

Using this we have that when = as =4^() () = =4 F () F () (F () () F () h i + F () () F () + F () F () () q) = F () F () h F () () F () () + F () F ()i () =4 + F () F () h i F () () F () + F () F () () =4 q = =4 F () F () F () F () () q F () () h i F () () F () + F () () F () () + F () F () () P where the second line follows by using that p i= (vech (y iyi ) S () )! w (; ) as! and that ^ =4 () P! and =4 ^() P! () + q as! and the third line follows from using that S () q = S () (). (8) The limiting distribution of ^ (^) when the true value equals unity and the true value of equals = ; ; ; " p ^ (^) ^() () (^ ) ^() () (^ ) =! = p ^ (^) ^ () ^() () (^ ) ^() () (^ ) =! + ^ () as = w! p ^ () ; F () F () F () W = W = F () F () F () as! where we have used the result in (65) and that p ^ (^) ^ () ^() () (^ ) ^() () (^ ) =! = =4^(3) n 3 () =3! =4 (^ )o = op () This gives the following asymptotic distribution of ^ (^) =4 ^ (^) as () = ^ () =4 (^ ) + =4^() n o () =! =4 (^ ) as = q =4 (^ ) 4

Derivatives of S (): S () () = S () () = S (3) () = S (4) () = 6 4 6 4 6 4 6 4 3 + 4 4 3 ( + ) 4 + 6 6 4 4 + 6 4 4 3 7 5 3 7 5 3 7 5 3 7 5 (8) (8) (83) (84) 5

References Ahn, S.. and P. Smith, 995, E cient estimation of models for dynamic panel data, Journal of Econometrics 68, 5-7. Ahn, S.. and P. Smith, 997, E cient estimation of models for dynamic panel data: Alternative assumptions and simpli ed estimation, Journal of Econometrics 76, 39-3. Andrews, D.W.K., 999, Estimation when a parameter is on a boundary, Econometrica 67, 34-383. Blundell, R. and S. Bond, 998, Initial conditions and moment restrictions in a dynamic panel data models, Journal of Econometrics 87, 5-43. Blundell, R., S. Bond and F. Windmeijer,, Estimation in dynamic panel data models: Improving on the performance of the standard GMM estimator, in: Baltagi, B.H. (Ed), Advances in Econometrics vol. 5. repon, B., F. Kramarz and A. Trognon, 997, Parameters of interest, nuisance parameters and orthogonality conditions: An application to autoregressive error component models, Journal of Econometrics 8, 35-56. Geyer,.J., 994, On the asymptotics of constrained M-estimation, The Annals of Statistics, 993-. Hahn, J., J. Hausman and G. Kursteiner, 7, Long di erence instrumental variables estimation for dynamic panel models with xed e ects, Journal of Econometrics 4, 574-64. Madsen, E., 9, GMM estimators and unit root test in the AR() panel data model (submitted for publication). Rotnitzky, A., D.R. ox, M. Bottai and J. Robins,, Likelihood-based inference with singular information matrix, Bernoulli 6(), 43-84. Stock, J.H. and J.H. Wright,, GMM with weak identi cation, Econometrica 68, 55-96. 6