Should We Go One Step Further? An Accurate Comparison of One-step and Two-step Procedures in a Generalized Method of Moments Framework


Jungbin Hwang and Yixiao Sun

Department of Economics, University of California, San Diego

July 24, 2015

Abstract

According to the conventional asymptotic theory, the two-step Generalized Method of Moments (GMM) estimator and test perform at least as well as the one-step estimator and test in large samples. The conventional asymptotic theory, as elegant and convenient as it is, completely ignores the estimation uncertainty in the weighting matrix, and as a result it may not reflect finite sample situations well. In this paper, we employ the fixed-smoothing asymptotic theory that accounts for the estimation uncertainty, and we compare the performance of the one-step and two-step procedures in this more accurate asymptotic framework. We show that the two-step procedure outperforms the one-step procedure only when the benefit of using the optimal weighting matrix outweighs the cost of estimating it. This qualitative message applies to both the asymptotic variance comparison and the power comparison of the associated tests. A Monte Carlo study lends support to our asymptotic results.

JEL Classification: C12, C32

Keywords: Asymptotic Efficiency, Asymptotic Mixed Normality, Fixed-smoothing Asymptotics, Heteroskedasticity and Autocorrelation Robust, Increasing-smoothing Asymptotics, Nonstandard Asymptotics, Two-step GMM Estimation

1 Introduction

Efficiency is one of the most important problems in statistics and econometrics.
In the widely used GMM framework, it is standard practice to employ a two-step procedure to improve the

(Footnote: For helpful comments and suggestions, we would like to thank Brendan Beare, Graham Elliott, Bruce Hansen, Jonathan Hill, Min Seong Kim, Oliver Linton, Seunghwa Rho, Peter Robinson, Peter Schmidt, Andres Santos, Xiaoxia Shi, Valentin Verdier, Tim Vogelsang, Jeffrey Wooldridge and seminar participants at LSU, Madison, Michigan State, UNC/Duke/NCSU, the 2014 Shanghai Jiao Tong University and Singapore Management University Bi-party Conference, and the 2014 Shandong Econometrics Conference, China. Emails: j6hwang@ucsd.edu, yisun@ucsd.edu. Correspondence to: Department of Economics, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA.)

efficiency of the GMM estimator and the power of the associated tests. The two-step procedure requires the estimation of a weighting matrix. According to Hansen (1982), the optimal weighting matrix is the asymptotic variance of the (scaled) sample moment conditions. For time series data, which is our focus here, the optimal weighting matrix is usually referred to as the long run variance (LRV) of the moment conditions. To be completely general, we often estimate the LRV using the nonparametric kernel or series method. Under the conventional asymptotics, both the one-step and two-step GMM estimators are asymptotically normal. In general, the two-step GMM estimator has a smaller asymptotic variance. Statistical tests based on the two-step estimator are also asymptotically more powerful than those based on the one-step estimator. A driving force behind these results is that the two-step estimator and the associated tests have the same asymptotic properties as the corresponding ones when the optimal weighting matrix is known. However, given that the optimal weighting matrix is estimated nonparametrically in the time series setting, there is large estimation uncertainty. A good approximation to the distributions of the two-step estimator and the associated tests should reflect this relatively high estimation uncertainty. One of the goals of this paper is to compare the asymptotic properties of the one-step and two-step procedures when the estimation uncertainty in the weighting matrix is accounted for. There are two ways to capture the estimation uncertainty. One is to use the high order conventional asymptotic theory under which the amount of nonparametric smoothing in the LRV estimator increases with the sample size but at a slower rate. While the estimation uncertainty vanishes in the first order asymptotics, we expect it to remain in high order asymptotics.
The second way is to use an alternative asymptotic approximation that can capture the estimation uncertainty even with just a first-order asymptotics. To this end, we consider a limiting thought experiment in which the amount of nonparametric smoothing is held fixed as the sample size increases. This leads to the so-called fixed-smoothing asymptotics in the recent literature. In this paper, we employ the fixed-smoothing asymptotics to compare the one-step and two-step procedures. For the one-step procedure, the LRV estimator is used in computing the standard errors, leading to the popular heteroskedasticity and autocorrelation robust (HAR) standard errors. See, for example, Newey and West (1987) and Andrews (1991). For the two-step procedure, the LRV estimator not only appears in the standard error estimation but also plays the role of the optimal weighting matrix in the second-step GMM criterion function. Under the fixed-smoothing asymptotics, the weighting matrix converges to a random matrix. As a result, the second-step GMM estimator is not asymptotically normal but rather asymptotically mixed normal. The asymptotic mixed normality reflects the estimation uncertainty of the GMM weighting matrix and is expected to be closer to the finite sample distribution of the second-step GMM estimator. In a recent paper, Sun (2014b) shows that both the one-step and two-step test statistics are asymptotically pivotal under this new asymptotic theory. So a nuisance-parameter-free comparison of the one-step and two-step tests is possible. Comparing the one-step and two-step procedures under the new asymptotics is fundamentally different from the comparison under the conventional asymptotics. Under the new asymptotics, the two-step procedure outperforms the one-step procedure only when the benefit of using the optimal weighting matrix outweighs the cost of estimating it.
This qualitative message applies to both the asymptotic variance comparison and the local asymptotic power comparison of the associated

(Footnote: In this paper, the one-step estimator refers to the first-step estimator in a typical two-step GMM framework. This is not to be confused with the continuous-updating GMM estimator that involves only one step. We use the terms one-step and first-step interchangeably. Our use of one-step and two-step is the same as that in the Stata gmm command.)

tests. This is in sharp contrast with the conventional asymptotics, where the cost of estimating the optimal weighting matrix is completely ignored. Since the new asymptotic approximation is more accurate than the conventional asymptotic approximation, comparing the two procedures under this new asymptotics gives an honest assessment of their relative merits. This is confirmed by a Monte Carlo study. There is a large and growing literature on the fixed-smoothing asymptotics. For kernel LRV estimators, the fixed-smoothing asymptotics is the so-called fixed-b asymptotics first studied by Kiefer, Vogelsang and Bunzel (2000) and Kiefer and Vogelsang (2002a, 2002b, 2005) in the econometrics literature. For other studies, see, for example, Jansson (2004), Sun, Phillips and Jin (2008), Sun and Phillips (2009), Gonçalves and Vogelsang (2011), and Zhang and Shao (2013) in the time series setting; Bester, Conley, Hansen and Vogelsang (2014) in the spatial setting; and Gonçalves (2011), Kim and Sun (2013), and Vogelsang (2012) in the panel data setting. For orthonormal series LRV estimators, the fixed-smoothing asymptotics is the so-called fixed-K asymptotics. For its theoretical development and related simulation evidence, see, for example, Phillips (2005), Müller (2007), Sun (2011, 2013) and Sun and Kim (2015). The approximation approaches in some other papers can also be regarded as special cases of the fixed-smoothing asymptotics. These include, among others, Ibragimov and Müller (2010), Shao (2010) and Bester, Conley, and Hansen (2011). The fixed-smoothing asymptotics can be regarded as a convenient device to obtain some high order terms under the conventional increasing-smoothing asymptotics. The rest of the paper is organized as follows. The next section presents a simple overidentified GMM framework. Section 3 compares the two procedures from the perspective of point estimation. Section 4 compares them from the testing perspective. Section 5 extends the ideas to a general GMM framework.
Section 6 reports simulation evidence and provides some practical guidance. The last section concludes. Proofs are provided in the Appendix. A word on notation: for a symmetric matrix A, A^{1/2} is a matrix square root of A such that A^{1/2}(A^{1/2})' = A. Note that A^{1/2} does not have to be symmetric. We will specify A^{1/2} explicitly when it is not symmetric. If not specified, A^{1/2} is a symmetric matrix square root of A based on its eigen-decomposition. For matrices A and B, we use A ≥ B to signify that A − B is positive semidefinite. We use 0 and O interchangeably to denote a matrix of zeros whose dimension may be different at different occurrences. For two random variables X and Y, we use X ⊥ Y to indicate that X and Y are independent. For a matrix A, we use σ(A), σ_min(A) and σ_max(A) to denote the set of all singular values, the smallest singular value, and the largest singular value of A, respectively. For an estimator θ̂, we use avar(θ̂) to denote the asymptotic variance of the limiting distribution of √T(θ̂ − plim θ̂), where T is the sample size.

2 A Simple Overidentified GMM Framework

To illustrate the basic ideas of this paper, we consider a simple overidentified time series model of the form:

  y_{1t} = θ₀ + u_{1t}, y_{1t} ∈ R^d,
  y_{2t} = u_{2t},   y_{2t} ∈ R^q,   (1)

for t = 1, ..., T, where θ₀ ∈ R^d is the parameter of interest and the vector process u_t := (u'_{1t}, u'_{2t})' is stationary with mean zero. We allow u_t to have autocorrelation of unknown forms so that the

long run variance of u_t:

  Ω = lrvar(u_t) = Σ_{j=−∞}^{∞} E u_t u'_{t−j}

takes a general form. However, for simplicity, we assume that var(u_t) = σ²I_{d+q} for the moment (see the footnote below). Our model is just a location model. We initially consider a general GMM framework but later find out that our points can be made more clearly in the simple location model. From the asymptotic point of view, we show later that a general GMM framework can be reduced to the above simple location model. Embedding the location model in a GMM framework, the moment conditions are E f(y_t, θ₀) = 0 with f(y_t, θ) = ((y_{1t} − θ)', y'_{2t})', where y_t = (y'_{1t}, y'_{2t})'. Let

  g_T(θ) = ( T^{−1/2} Σ_{t=1}^T (y_{1t} − θ)', T^{−1/2} Σ_{t=1}^T y'_{2t} )'.

Then a GMM estimator of θ₀ can be defined as

  θ̂_GMM = arg min_θ g_T(θ)' W_T^{−1} g_T(θ)

for some positive definite weighting matrix W_T. Writing

  W_T = [ W_{11} W_{12} ; W_{21} W_{22} ],

where W_{11} is a d × d matrix and W_{22} is a q × q matrix, it is easy to show that

  θ̂_GMM = T^{−1} Σ_{t=1}^T (y_{1t} − W° y_{2t})  for  W° = W_{12} W_{22}^{−1}.

There are at least two different choices of W_T. First, we can take W_T to be the identity matrix W_T = I_m for m = d + q. In this case, W° = 0 and the GMM estimator θ̂₁ is simply

  θ̂₁ = T^{−1} Σ_{t=1}^T y_{1t}.

Second, we can take W_T to be the optimal weighting matrix W_T = Ω. With this choice, we obtain the GMM estimator:

  θ̃₂ = T^{−1} Σ_{t=1}^T (y_{1t} − Λ₀ y_{2t}),

(Footnote: If var(u_t) = V ≠ σ²I_{d+q} for some positive definite V with blocks V_{11}, V_{12}, V_{22}, we can let

  V^{1/2} = [ (V_{11·2})^{1/2} V_{12}(V_{22})^{−1/2} ; 0 (V_{22})^{1/2} ],

where V_{11·2} = V_{11} − V_{12}V_{22}^{−1}V_{21}. Then V^{−1/2}(y'_{1t}, y'_{2t})' can be written as a location model whose error variance is the identity matrix I_{d+q}. The estimation uncertainty in estimating V will not affect our asymptotic results.)
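The closed-form solution of the minimization problem above can be illustrated numerically. The following is a minimal sketch (our own code, with a hypothetical iid DGP chosen purely for illustration): with the identity weighting matrix, the GMM estimator reduces to the sample mean of y_{1t}.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, q = 500, 2, 3
theta0 = np.array([1.0, -0.5])

# hypothetical DGP: iid errors, for illustration only
u = rng.standard_normal((T, d + q))
y1 = theta0 + u[:, :d]
y2 = u[:, d:]

def gmm_location(y1, y2, W):
    """Closed-form GMM estimator in the location model with weighting W^{-1}:
    theta_hat = ybar1 - W12 @ W22^{-1} @ ybar2."""
    d = y1.shape[1]
    W12, W22 = W[:d, d:], W[d:, d:]
    return y1.mean(0) - W12 @ np.linalg.solve(W22, y2.mean(0))

theta_one = gmm_location(y1, y2, np.eye(d + q))  # identity weighting
# identity weighting => W12 = 0, so the estimator is the sample mean of y1
assert np.allclose(theta_one, y1.mean(0))
```

A non-identity W with W_{12} ≠ 0 shifts the estimate by −W_{12}W_{22}^{−1}ȳ₂, which is the source of the efficiency gain (and, once W must be estimated, the cost) discussed below.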

where Λ₀ = Ω_{12}Ω_{22}^{−1} is the long run regression coefficient matrix. While θ̂₁ completely ignores the information in {y_{2t}}, θ̃₂ takes advantage of this source of information. Under some moment and mixing conditions, we have

  √T(θ̂₁ − θ₀) ⇒ N(0, Ω_{11}) and √T(θ̃₂ − θ₀) ⇒ N(0, Ω_{11·2}),

where Ω_{11·2} = Ω_{11} − Ω_{12}Ω_{22}^{−1}Ω_{21}. So avar(θ̃₂) < avar(θ̂₁) unless Ω_{12} = 0. This is a well known result in the literature. Since we do not know Ω in practice, θ̃₂ is infeasible. However, given the feasible estimator θ̂₁, we can estimate Ω and construct a feasible version of θ̃₂. The common two-step estimation strategy is as follows. (i) Estimate the long run covariance matrix Ω by

  Ω̂ := Ω̂(û) = T^{−1} Σ_{s=1}^T Σ_{t=1}^T Q_h(s/T, t/T) (û_t − T^{−1} Σ_{τ=1}^T û_τ)(û_s − T^{−1} Σ_{τ=1}^T û_τ)',

where û_t = ((y_{1t} − θ̂₁)', y'_{2t})'. (ii) Obtain the feasible two-step estimator

  θ̂₂ = T^{−1} Σ_{t=1}^T (y_{1t} − Λ̂ y_{2t}), where Λ̂ = Ω̂_{12} Ω̂_{22}^{−1}.

In the above definition of Ω̂, Q_h(r, s) is a symmetric weighting function that depends on the smoothing parameter h. For conventional kernel LRV estimators, Q_h(r, s) = k((r − s)/b) and we take h = 1/b. For the orthonormal series (OS) LRV estimators, Q_h(r, s) = K^{−1} Σ_{j=1}^K Φ_j(r)Φ_j(s) and we take h = K, where {Φ_j(r)} are orthonormal basis functions on L²[0, 1] satisfying ∫₀¹ Φ_j(r) dr = 0. We parametrize h in such a way that h indicates the level or amount of smoothing for both types of LRV estimators. Note that we use the demeaned process {û_t − T^{−1} Σ_{τ=1}^T û_τ} in constructing Ω̂(û). For the location model, Ω̂(û) is numerically identical to Ω̂(u), where the unknown error process {u_t} is used. The moment estimation uncertainty is reflected in the demeaning operation. Had we known the true value of θ₀ and hence the true moment process {u_t}, we would not need to demean {u_t}. While θ̃₂ is asymptotically more efficient than θ̂₁, is θ̂₂ necessarily more efficient than θ̂₁, and in what sense? Is the Wald test based on θ̂₂ necessarily more powerful than that based on θ̂₁? One of the objectives of this paper is to address these questions.
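The two-step recipe above can be sketched in code. This is a minimal illustration, not the authors' implementation: the Fourier basis, all function names, and the DGP are our own (assumed) choices.

```python
import numpy as np

def os_lrv(u, K):
    """OS LRV estimator with a Fourier basis (K even):
    Omega_hat = K^{-1} sum_j Lam_j Lam_j', Lam_j = T^{-1/2} sum_t Phi_j(t/T) u_t,
    computed from the demeaned series as in the text."""
    T, m = u.shape
    u = u - u.mean(0)
    t = np.arange(1, T + 1) / T
    Om = np.zeros((m, m))
    for j in range(1, K // 2 + 1):
        for phi in (np.sqrt(2) * np.cos(2 * np.pi * j * t),
                    np.sqrt(2) * np.sin(2 * np.pi * j * t)):
            Lam = phi @ u / np.sqrt(T)
            Om += np.outer(Lam, Lam)
    return Om / K

def two_step(y1, y2, K):
    """Feasible two-step estimator: theta2 = ybar1 - Lam_hat @ ybar2."""
    d = y1.shape[1]
    theta1 = y1.mean(0)                      # one-step (first-step) estimator
    u_hat = np.hstack([y1 - theta1, y2])     # first-step residuals
    Om = os_lrv(u_hat, K)
    Lam_hat = Om[:d, d:] @ np.linalg.inv(Om[d:, d:])
    return theta1 - Lam_hat @ y2.mean(0)

# hypothetical DGP with correlated blocks, for illustration only
rng = np.random.default_rng(0)
T = 2000
e = rng.standard_normal((T, 1))
y2 = rng.standard_normal((T, 1))
y1 = 1.0 + 0.8 * y2 + 0.6 * e                # theta0 = 1
theta2 = two_step(y1, y2, K=8)
```

Because the second block is strongly correlated with the first, the feasible two-step estimate lands close to θ₀ = 1 in this design.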
3 A Tale of Two Asymptotics: Point Estimation

We first consider the conventional asymptotics where h → ∞ as T → ∞ but at a slower rate, i.e., h/T → 0. Sun (2014a, 2014b) calls this type of asymptotics the Increasing-smoothing Asymptotics, as h increases with the sample size. Under this type of asymptotics and some regularity conditions, we have Ω̂ →_p Ω. It can then be shown that θ̂₂ is asymptotically equivalent to θ̃₂, i.e., √T(θ̃₂ − θ̂₂) = o_p(1). As a direct consequence, we have

  √T(θ̂₁ − θ₀) ⇒ N(0, Ω_{11}), √T(θ̂₂ − θ₀) ⇒ N(0, Ω_{11·2}).

So θ̂₂ is still asymptotically more efficient than θ̂₁. The conventional asymptotics, as elegant and convenient as it is, does not reflect the finite sample situations well. Under this type of asymptotics, we essentially approximate the distribution of Ω̂ by the degenerate distribution concentrating on Ω. That is, we completely ignore the estimation uncertainty in Ω̂. The degenerate approximation is too optimistic, as Ω̂ is a nonparametric estimator, which by definition can have high variation in finite samples. To obtain a more accurate distributional approximation of √T(θ̂₂ − θ₀), we could develop a high order increasing-smoothing asymptotics that reflects the estimation uncertainty in Ω̂. This is possible but requires strong assumptions that cannot be easily verified. In addition, it is also technically challenging and tedious to rigorously justify the high order asymptotic theory. Instead of high order asymptotic theory under the conventional asymptotics, we adopt the type of asymptotics that holds h fixed (at a positive value) as T → ∞. Given that h is fixed, we follow Sun (2014a, 2014b) and call this type of asymptotics the Fixed-smoothing Asymptotics. This type of asymptotics takes the sampling variability of Ω̂ into consideration. Sun (2013, 2014a) has shown that critical values from the fixed-smoothing asymptotic distribution are higher order correct under the conventional increasing-smoothing asymptotics. So the fixed-smoothing asymptotics can be regarded as a convenient device to obtain some higher order terms under the conventional increasing-smoothing asymptotics. To establish the fixed-smoothing asymptotics, we maintain Assumption 1 on the kernel function and basis functions.

Assumption 1 (i) For kernel LRV estimators, the kernel function k(·) satisfies the following conditions: for any b ∈ (0, 1], k_b(x) = k(x/b) is symmetric, continuous, piecewise monotonic, and piecewise continuously differentiable on [−1, 1].
(ii) For the OS LRV estimator, the basis functions Φ_j(·) are piecewise monotonic, continuously differentiable and orthonormal in L²[0, 1], and ∫₀¹ Φ_j(x) dx = 0.

Assumption 1 on the kernel function is very mild. It includes many commonly used kernel functions such as the Bartlett kernel, Parzen kernel, and Quadratic Spectral (QS) kernel. Define

  Q*_h(r, s) = Q_h(r, s) − ∫₀¹ Q_h(τ, s) dτ − ∫₀¹ Q_h(r, τ) dτ + ∫₀¹ ∫₀¹ Q_h(τ₁, τ₂) dτ₁ dτ₂,

which is a centered version of Q_h(r, s), and

  Ω̃ = T^{−1} Σ_{s=1}^T Σ_{t=1}^T Q*_h(s/T, t/T) û_t û'_s.

Assumption 1 ensures that Ω̃ and Ω̂ are asymptotically equivalent. Furthermore, under this assumption, Sun (2014a) shows that, for both kernel LRV and OS LRV estimation, the centered weighting function Q*_h(r, s) satisfies:

  Q*_h(r, s) = Σ_{j=1}^∞ λ_j Φ_j(r) Φ_j(s),

where {Φ_j(r)} is a sequence of continuously differentiable functions satisfying ∫₀¹ Φ_j(r) dr = 0 and the series on the right hand side converges to Q*_h(r, s) absolutely and uniformly over (r, s) ∈
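As a quick numerical sanity check (our own, not from the paper), the Fourier functions √2 cos(2πjx) and √2 sin(2πjx) commonly used in OS LRV estimation satisfy the zero-mean and orthonormality requirements on L²[0, 1]; a midpoint-rule grid makes this visible:

```python
import numpy as np

# midpoint grid approximation of integrals over [0, 1]
x = (np.arange(20000) + 0.5) / 20000
basis = ([np.sqrt(2) * np.cos(2 * np.pi * j * x) for j in (1, 2)]
         + [np.sqrt(2) * np.sin(2 * np.pi * j * x) for j in (1, 2)])
for f in basis:
    assert abs(f.mean()) < 1e-8               # integral of Phi_j is 0
    assert abs((f * f).mean() - 1.0) < 1e-3   # unit L2 norm
for i in range(len(basis)):
    for j in range(i):
        assert abs((basis[i] * basis[j]).mean()) < 1e-6  # orthogonality
```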

[0, 1] × [0, 1]. The representation can be regarded as a spectral decomposition of the compact Fredholm operator with kernel Q*_h(r, s). See Sun (2014a) for more discussion. Now, letting Φ₀(·) := 1 and using the basis functions {Φ_j(·)}_{j=1}^∞ in the series representation of the weighting function, we make the following assumptions.

Assumption 2 The vector process {u_t}_{t=1}^T satisfies: (i) T^{−1/2} Σ_{t=1}^T Φ_j(t/T) u_t converges weakly to a continuous distribution, jointly over j = 0, 1, ..., J for every fixed J; (ii) for every fixed J and x ∈ R^m,

  P( T^{−1/2} Σ_{t=1}^T Φ_j(t/T) u_t ≤ x for j = 0, 1, ..., J )
  = P( Ω^{1/2} T^{−1/2} Σ_{t=1}^T Φ_j(t/T) e_t ≤ x for j = 0, 1, ..., J ) + o(1) as T → ∞,

where

  Ω^{1/2} = [ Ω_{11·2}^{1/2} Ω_{12} Ω_{22}^{−1/2} ; 0 Ω_{22}^{1/2} ]

is a matrix square root of the nonsingular LRV matrix Ω = Σ_{j=−∞}^∞ E u_t u'_{t−j} and e_t ~ iid N(0, I_m).

Assumption 3 Σ_{j=−∞}^∞ ||E u_t u'_{t−j}|| < ∞.

Proposition 1 Let Assumptions 1–3 hold. As T → ∞ for a fixed h > 0, we have:
(a) Ω̂ ⇒ Ω_∞ := Ω^{1/2} Ω̃_∞ (Ω^{1/2})', where

  Ω̃_∞ = ∫₀¹ ∫₀¹ Q*_h(r, s) dB_m(r) dB_m(s)',

B_m(·) is a standard Brownian motion of dimension m = d + q, and Ω_∞ and Ω̃_∞ are partitioned conformably with Ω into blocks (Ω_{∞,11}, Ω_{∞,12}; Ω_{∞,21}, Ω_{∞,22}) and (Ω̃_{∞,11}, Ω̃_{∞,12}; Ω̃_{∞,21}, Ω̃_{∞,22});
(b) √T(θ̂₂ − θ₀) ⇒ (I_d, −Λ_∞) Ω^{1/2} B_m(1), where Λ_∞ = Λ_∞(h, d, q) := Ω_{∞,12} Ω_{∞,22}^{−1} is independent of B_m(1).

Conditional on Λ_∞, the asymptotic distribution of √T(θ̂₂ − θ₀) is a normal distribution with variance

  V₂ = (I_d, −Λ_∞) Ω (I_d, −Λ_∞)'.

Given that V₂ is random, √T(θ̂₂ − θ₀) is asymptotically mixed-normal rather than normal. Since

  avar(θ̂₂) − avar(θ̃₂) = E V₂ − Ω_{11·2} = E (Λ_∞ − Λ₀) Ω_{22} (Λ_∞ − Λ₀)' ≥ 0,

the feasible estimator θ̂₂ has a larger variation than the infeasible estimator θ̃₂. This is consistent with our intuition. The difference avar(θ̂₂) − avar(θ̃₂) can be regarded as the cost of implementing the two-step estimator, i.e., the cost of having to estimate the weighting matrix. Under the fixed-smoothing asymptotics, we still have √T(θ̂₁ − θ₀) ⇒ N(0, Ω_{11}), as θ̂₁ does not depend on the smoothing parameter h. So

  avar(θ̂₁) − avar(θ̃₂) = Ω_{11} − Ω_{11·2} = Ω_{12} Ω_{22}^{−1} Ω_{21},

which can be regarded as the benefit of going to the second step. To compare the asymptotic variances of √T(θ̂₁ − θ₀) and √T(θ̂₂ − θ₀), we need to evaluate the relative magnitudes of the cost and the benefit. Define

  η̃ := η̃(h, d, q) := Ω̃_{∞,12} Ω̃_{∞,22}^{−1},  (2)

which does not depend on any nuisance parameter but depends on h, d, q. For notational economy, we sometimes suppress this dependence. Direct calculations show that

  Λ_∞ − Λ₀ = Ω_{11·2}^{1/2} η̃ Ω_{22}^{−1/2}.  (3)

Using this, we have:

  avar(θ̂₂) − avar(θ̂₁) = [avar(θ̂₂) − avar(θ̃₂)] (the cost) − [avar(θ̂₁) − avar(θ̃₂)] (the benefit)
  = Ω_{11·2}^{1/2} E(η̃ η̃') (Ω_{11·2}^{1/2})' − Ω_{12} Ω_{22}^{−1} Ω_{21}.  (4)

If the cost is larger than the benefit, i.e., Ω_{11·2}^{1/2} E(η̃ η̃') (Ω_{11·2}^{1/2})' − Ω_{12} Ω_{22}^{−1} Ω_{21} > 0, then the asymptotic variance of θ̂₂ is larger than that of θ̂₁. The following lemma gives a characterization of E[η̃(h, d, q) η̃(h, d, q)'].

Lemma 2 For any d ≥ 1, we have E[η̃(h, d, q) η̃(h, d, q)'] = E||η̃(h, 1, q)||² · I_d.

Using the lemma, we can prove that

  avar(θ̂₂) − avar(θ̂₁) = (1 + E||η̃(h, 1, q)||²) Ω_{11}^{1/2} [ g(h, q) I_d − ρρ' ] (Ω_{11}^{1/2})',

where

  g(h, q) := E||η̃(h, 1, q)||² / (1 + E||η̃(h, 1, q)||²) ∈ (0, 1) and ρ = Ω_{11}^{−1/2} Ω_{12} Ω_{22}^{−1/2} ∈ R^{d×q},

the latter being the long run correlation matrix between u_{1t} and u_{2t}. The proposition below then follows immediately.

Proposition 3 Let Assumptions 1–3 hold. Consider the fixed-smoothing asymptotics.
(a) If λ_max(ρρ') < g(h, q), then θ̂₂ has a larger asymptotic variance than θ̂₁.
(b) If λ_min(ρρ') > g(h, q), then θ̂₂ has a smaller asymptotic variance than θ̂₁.

To compute the eigenvalues of ρρ', we can use the fact that the nonzero eigenvalues of ρρ' coincide with those of Ω_{11}^{−1} Ω_{12} Ω_{22}^{−1} Ω_{21}. The eigenvalues of ρρ' are the squared long run correlation coefficients between c'₁u_{1t} and c'₂u_{2t} for some vectors c₁ and c₂, i.e., the squared long run canonical correlation coefficients between u_{1t} and u_{2t}. So the conditions in the proposition can be presented in terms of the smallest and largest squared long run canonical correlation coefficients. If ρ = 0, then λ_max(ρρ') < g(h, q) holds trivially. In this case, the asymptotic variance of θ̂₂ is larger than the asymptotic variance of θ̂₁. Intuitively, when the long run correlation is zero, there is no information that can be exploited to improve efficiency. If we insist on using the long run correlation matrix in an attempt to improve the efficiency, we may end up with a less efficient estimator, due to the noise in estimating the zero long run correlation matrix. On the other hand, if ρρ' = I_d after some possible rotation, which holds when the long run variation of u_{1t} is perfectly predicted by u_{2t}, then λ_min(ρρ') = 1 and we have λ_min(ρρ') > g(h, q). In this case, it is worthwhile estimating the long run variance and using it to improve the efficiency of the two-step GMM estimator. The two conditions λ_min(ρρ') > g(h, q) and λ_max(ρρ') < g(h, q) in the proposition may appear to be strong. However, the conclusions are also very strong. For example, that θ̂₂ has a smaller asymptotic variance than θ̂₁ means that avar(Rθ̂₂) ≤ avar(Rθ̂₁) for any matrix R ∈ R^{p×d} and for all p ≤ d. In fact, in the proof of the proposition, we show that the conditions are both necessary and sufficient. The two conditions λ_min(ρρ') > g(h, q) and λ_max(ρρ') < g(h, q) are not collectively exhaustive unless d = 1. When d > 1, it is possible that neither of the two conditions is satisfied, in which case avar(θ̂₂) − avar(θ̂₁) is indefinite. So, as a whole vector, the relative asymptotic efficiency of θ̂₂ to θ̂₁ cannot be compared.
However, there exist two matrices R₊ ∈ R^{d₊×d} and R₋ ∈ R^{d₋×d} with d₊ + d₋ = d, d₊ < d, and d₋ < d such that avar(R₊θ̂₂) ≤ avar(R₊θ̂₁) and avar(R₋θ̂₂) ≥ avar(R₋θ̂₁). An example of the indefinite case is when q < d and λ_max(ρρ') > g(h, q). In this case, λ_min(ρρ') = 0 and λ_min(ρρ') > g(h, q) does not hold. A direct implication is that avar(R₋θ̂₂) > avar(R₋θ̂₁) for some R₋. So when the degree of overidentification is not large enough, there are some directions, characterized by R₋, along which the two-step estimator is less efficient than the one-step estimator. When d = 1, ρρ' is a scalar, and the two conditions λ_min(ρρ') > g(h, q) and λ_max(ρρ') < g(h, q) become mutually exclusive and exhaustive (apart from the boundary case). So if ρρ' > g(h, q), then θ̂₂ is asymptotically more efficient than θ̂₁. Otherwise, it is asymptotically less efficient. In the case of kernel LRV estimation, it is hard to obtain an analytical expression for E||η̃(h, 1, q)||² and hence g(h, q), although we can always simulate g(h, q) numerically. The threshold g(h, q) depends on the smoothing parameter h = 1/b and the degree of overidentification q. Tables 1–3 report the simulated values of g(h, q) for a grid of b values and for q = 1, ..., 5. These values are nontrivial in that they are close to neither zero nor one. It is clear that g(h, q) increases with q and decreases with the smoothing parameter h = 1/b. When the OS LRV estimation is used, we do not need to simulate g(h, q), as we can obtain a closed-form expression.

Corollary 4 Let Assumptions 1–3 hold. In the case of OS LRV estimation, we have g(h, q) = q/(K − 1). So if λ_max(ρρ') < q/(K − 1) (or λ_min(ρρ') > q/(K − 1)), then θ̂₂ has a larger (or smaller) asymptotic variance than θ̂₁ under the fixed-smoothing asymptotics.
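The closed-form threshold in the corollary can be checked by simulation. Under the fixed-K limit with an OS LRV estimator, Ω̃_∞ has the distribution of a scaled Wishart matrix W_m(K, I_m)/K, so η̃ can be drawn directly; the sketch below (our own code, with an assumed Monte Carlo design, d = 1) compares the simulated g(h, q) with q/(K − 1) for K = 8, q = 2:

```python
import numpy as np

# Monte Carlo draws of eta = Om12 @ Om22^{-1} from the Wishart representation
# of the OS fixed-K limit (the scale 1/K cancels inside eta).
rng = np.random.default_rng(1)
K, q = 8, 2
m = 1 + q
vals = np.empty(40000)
for r in range(vals.size):
    xi = rng.standard_normal((K, m))
    S = xi.T @ xi                              # ~ W_m(K, I_m)
    eta = S[:1, 1:] @ np.linalg.inv(S[1:, 1:])
    vals[r] = float(eta @ eta.T)

m2 = vals.mean()                               # estimates E||eta||^2
g_mc = m2 / (1 + m2)                           # simulated threshold g(h, q)
assert abs(g_mc - q / (K - 1)) < 0.02          # closed form: q/(K - 1)
```

The same Monte Carlo design with a kernel-based weighting function (where no closed form is available) is how the tabulated g(h, q) values can be produced.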

Since θ̂₂ is not asymptotically normal, asymptotic variance comparison does not paint the whole picture. To compare the asymptotic distributions of θ̂₁ and θ̂₂, we consider the case of OS LRV estimation with d = q = 1 and K = 4 as an example. We use the sine and cosine basis functions given later in Section 6. Figure 1 reports the shapes of the probability density functions when (Ω_{11}, Ω²_{12}, Ω_{22}) = (1, 0.1, 1). In this case, σ²_{1·2} := Ω_{11} − Ω_{12}Ω_{22}^{−1}Ω_{21} = 0.9. The first graph shows √T(θ̂₁ − θ₀) ~a N(0, 1) and √T(θ̂₂ − θ₀) ~a N(0, 0.9) under the conventional asymptotics. The conventional limiting distributions of √T(θ̂₁ − θ₀) and √T(θ̂₂ − θ₀) are both normal, but the latter has a smaller variance, so the asymptotic efficiency of θ̂₂ is always guaranteed. However, this is not true in the second graph of Figure 1, which represents the limiting distributions under the fixed-smoothing asymptotics. While we still have √T(θ̂₁ − θ₀) ~a N(0, 1), now √T(θ̂₂ − θ₀) ~a MN(0, 0.9(1 + η̃²)). The mixed normality can be obtained by using a conditional version of (4). More specifically, the conditional asymptotic variance of θ̂₂ is

  avar(θ̂₂ | η̃) = V₂ = Ω_{11·2}^{1/2} η̃ η̃' (Ω_{11·2}^{1/2})' + Ω_{11·2} = 0.9(1 + η̃²).  (5)

Comparing these two different families of distributions, we find that the asymptotic distribution of θ̂₂ has fatter tails than that of θ̂₁. The asymptotic variance of θ̂₂ is

  avar(θ̂₂) = E V₂ = σ²_{1·2} {1 + E[||η̃(h, 1, q)||²]} = σ²_{1·2} (K − 1)/(K − q − 1) = 0.9 × 3/2 = 1.35,

which is larger than the asymptotic variance of θ̂₁.

4 A Tale of Two Asymptotics: Hypothesis Testing

We are interested in testing the null hypothesis H₀: Rθ₀ = r against the local alternative H₁: Rθ₀ = r + δ₀/√T for some p × d full rank matrix R and p-vectors r and δ₀. Nonlinear restrictions can be converted into linear ones using the Delta method.
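The arithmetic of this example can be replicated directly; a small check of the numbers above, using the expression E||η̃||² = q/(K − q − 1) implied by the OS closed form of Corollary 4:

```python
# Worked example: d = q = 1, K = 4, (Omega11, Omega12^2, Omega22) = (1, 0.1, 1).
K, q = 4, 1
sigma2_12 = 1 - 0.1                 # sigma^2_{1.2} = Omega11 - Omega12^2/Omega22
E_eta2 = q / (K - q - 1)            # E||eta~||^2 = 1/2 in the OS case
avar2 = sigma2_12 * (1 + E_eta2)    # fixed-smoothing avar of the two-step estimator
avar1 = 1.0                         # avar of the one-step estimator = Omega11

assert abs(avar2 - 1.35) < 1e-9     # 0.9 * 3/2 = 1.35
assert avar2 > avar1                # the feasible two-step estimator loses here
```

Here ρ² = 0.1 < g(4, 1) = 1/3, so Proposition 3(a) applies: the long run correlation is too weak to pay for estimating the weighting matrix.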
We construct the following two Wald statistics:

  W_{1T} := T(Rθ̂₁ − r)' (RΩ̂_{11}R')^{−1} (Rθ̂₁ − r),
  W_{2T} := T(Rθ̂₂ − r)' (RΩ̂_{11·2}R')^{−1} (Rθ̂₂ − r),

where Ω̂_{11·2} = Ω̂_{11} − Ω̂_{12}Ω̂_{22}^{−1}Ω̂_{21}. When p = 1 and the alternative is one sided, we can construct the following two t statistics:

  t_{1T} := √T(Rθ̂₁ − r) / (RΩ̂_{11}R')^{1/2},  (6)
  t_{2T} := √T(Rθ̂₂ − r) / (RΩ̂_{11·2}R')^{1/2}.  (7)

No matter whether the test is based on θ̂₁ or θ̂₂, we have to employ the long run covariance estimator Ω̂. Define the p × p matrices Δ₁ and Δ₂ according to Δ₁Δ₁' = RΩ_{11}R' and Δ₂Δ₂' = RΩ_{11·2}R'.
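For concreteness, the two statistics can be computed as follows in the scalar location model. This is a sketch with a hypothetical DGP and our own helper names; H₀ is true in the simulation, so both statistics should take unexceptional values.

```python
import numpy as np

rng = np.random.default_rng(3)
T, K, r = 1000, 8, 0.5
y1 = r + rng.standard_normal(T)              # H0: theta = r holds
y2 = 0.6 * (y1 - r) + rng.standard_normal(T)

t = np.arange(1, T + 1) / T
phis = ([np.sqrt(2) * np.cos(2 * np.pi * j * t) for j in (1, 2, 3, 4)]
        + [np.sqrt(2) * np.sin(2 * np.pi * j * t) for j in (1, 2, 3, 4)])

def lrv(u):
    """OS LRV of a (T, 2) series with K = 8 Fourier basis functions."""
    u = u - u.mean(0)
    L = np.stack([p @ u / np.sqrt(T) for p in phis])
    return L.T @ L / K

theta1 = y1.mean()                                    # one-step estimator
Om = lrv(np.column_stack([y1 - theta1, y2]))
lam = Om[0, 1] / Om[1, 1]                             # Lambda_hat
theta2 = theta1 - lam * y2.mean()                     # two-step estimator
om112 = Om[0, 0] - Om[0, 1] ** 2 / Om[1, 1]           # Omega_hat_{11.2}

W1 = T * (theta1 - r) ** 2 / Om[0, 0]
W2 = T * (theta2 - r) ** 2 / om112
```

Both statistics use the same LRV estimate Ω̂; the two-step statistic differs through the centering at θ̂₂ and the smaller variance block Ω̂_{11·2}.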

[Figure 1 here. Top panel, conventional asymptotic distributions: one-step estimator N(0, 1); two-step estimator N(0, 0.9). Bottom panel, fixed-smoothing asymptotic distributions: one-step estimator N(0, 1); two-step estimator MN(0, 0.9(1 + η̃²)).]

Figure 1: Limiting distributions of θ̂₁ and θ̂₂ based on the OS LRV estimator with K = 4.

In other words, Δ₁ and Δ₂ are matrix square roots of RΩ_{11}R' and RΩ_{11·2}R', respectively. Under the conventional increasing-smoothing asymptotics, it is straightforward to show that, under H₁: Rθ₀ = r + δ₀/√T:

  W_{1T} ⇒ χ²_p(||δ̃₁||²), W_{2T} ⇒ χ²_p(||δ̃₂||²), t_{1T} ⇒ N(δ̃₁, 1), t_{2T} ⇒ N(δ̃₂, 1),

where δ̃₁ = Δ₁^{−1}δ₀, δ̃₂ = Δ₂^{−1}δ₀, and χ²_p(||δ||²) is the noncentral chi-square distribution with noncentrality parameter ||δ||². When δ₀ = 0, we obtain the null distributions: W_{1T}, W_{2T} ⇒ χ²_p and t_{1T}, t_{2T} ⇒ N(0, 1). So under the conventional increasing-smoothing asymptotics, the null limiting distributions of W_{1T} and W_{2T} are identical. Since ||δ̃₂||² ≥ ||δ̃₁||², under the conventional asymptotics, the local asymptotic power of the test based on W_{2T} is higher than that based on W_{1T}.

The key driving force behind the conventional asymptotics is that we approximate the distribution of Ω̂ by the degenerate distribution concentrating on Ω. The degenerate approximation does not reflect the finite sample distribution well. As in the previous section, we employ the fixed-smoothing asymptotics to derive more accurate distributional approximations. Let

  C_pp = ∫₀¹∫₀¹ Q*_h(r, s) dB_p(r) dB_p(s)',  C_pq = ∫₀¹∫₀¹ Q*_h(r, s) dB_p(r) dB_q(s)',
  C_qq = ∫₀¹∫₀¹ Q*_h(r, s) dB_q(r) dB_q(s)',  C_qp = C'_pq,

and

  D_pp = C_pp − C_pq C_qq^{−1} C_qp,

where B_p(·) ∈ R^p and B_q(·) ∈ R^q are independent standard Brownian motion processes.

Proposition 5 Let Assumptions 1–3 hold. As T → ∞ for a fixed h, we have, under H₁: Rθ₀ = r + δ₀/√T:
(a) W_{1T} ⇒ W_{1∞}(||δ̃₁||²), where

  W_{1∞}(||λ||²) = [B_p(1) + λ]' C_pp^{−1} [B_p(1) + λ] for λ ∈ R^p.  (8)

(b) W_{2T} ⇒ W_{2∞}(||δ̃₂||²), where

  W_{2∞}(||λ||²) = [B_p(1) − C_pq C_qq^{−1} B_q(1) + λ]' D_pp^{−1} [B_p(1) − C_pq C_qq^{−1} B_q(1) + λ].  (9)

(c) t_{1T} ⇒ t_{1∞}(δ̃₁) := [B_p(1) + δ̃₁] / √C_pp for p = 1.
(d) t_{2T} ⇒ t_{2∞}(δ̃₂) := [B_p(1) − C_pq C_qq^{−1} B_q(1) + δ̃₂] / √D_pp for p = 1.

In Proposition 5, we use the notation W_{1∞}(||λ||²), which implies that the right hand side of (8) depends on λ only through ||λ||². This is true because, for any orthogonal matrix H:

  [B_p(1) + λ]' C_pp^{−1} [B_p(1) + λ] = [HB_p(1) + Hλ]' (HC_ppH')^{−1} [HB_p(1) + Hλ]
  =_d [B_p(1) + Hλ]' C_pp^{−1} [B_p(1) + Hλ].

If we choose H = (λ/||λ||, H̃)' for some H̃ such that H is orthogonal, then

  [B_p(1) + λ]' C_pp^{−1} [B_p(1) + λ] =_d [B_p(1) + ||λ|| e_p]' C_pp^{−1} [B_p(1) + ||λ|| e_p],

where e_p = (1, 0, ..., 0)' ∈ R^p. So the distribution of [B_p(1) + λ]' C_pp^{−1} [B_p(1) + λ] depends on λ only through ||λ||. Similarly, the distribution of the right hand side of (9) depends on λ only through ||λ||². When δ₀ = 0, we obtain the limiting distributions of W_{1T}, W_{2T}, t_{1T} and t_{2T} under the null hypothesis:

  W_{1T} ⇒ W_{1∞} := W_{1∞}(0) = B_p(1)' C_pp^{−1} B_p(1),
  W_{2T} ⇒ W_{2∞} := W_{2∞}(0) = [B_p(1) − C_pq C_qq^{−1} B_q(1)]' D_pp^{−1} [B_p(1) − C_pq C_qq^{−1} B_q(1)],
  t_{1T} ⇒ t_{1∞} := t_{1∞}(0) = B_p(1)/√C_pp,
  t_{2T} ⇒ t_{2∞} := t_{2∞}(0) = [B_p(1) − C_pq C_qq^{−1} B_q(1)]/√D_pp.
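The null distribution W_{1∞} is easy to simulate in the OS case, where C_pp = K^{−1} Σ_{j=1}^K ξ_j ξ_j' with ξ_j iid N(0, I_p) independent of B_p(1). The sketch below (our own simulation design, stated as an assumption) shows that the fixed-smoothing critical value exceeds the chi-square critical value used under the conventional asymptotics:

```python
import numpy as np

rng = np.random.default_rng(2)
p, K, R = 2, 8, 40000
draws = np.empty(R)
for r in range(R):
    B = rng.standard_normal(p)                # B_p(1)
    xi = rng.standard_normal((K, p))
    C = xi.T @ xi / K                         # C_pp in the OS case
    draws[r] = B @ np.linalg.solve(C, B)      # W_1inf = B' C^{-1} B

cv_fixed = np.quantile(draws, 0.95)           # fixed-smoothing 5% critical value
cv_chi2 = 5.991                               # 95% quantile of chi-square(2)
assert cv_fixed > cv_chi2
```

The inflation of the critical value is exactly the random-scaling effect of C_pp; for W_{2∞} the random location shift C_pq C_qq^{−1} B_q(1) pushes the critical value up further.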

These distributions are different from those under the conventional asymptotics. For W_{1∞} and t_{1∞}, the difference lies in the random scaling factor C_pp or √C_pp. The random scaling factor captures the estimation uncertainty of the LRV estimator. For W_{2∞} and t_{2∞}, there is an additional difference embodied by the random location shift C_pq C_qq^{−1} B_q(1), with a consequent change in the random scaling factor. The proposition below provides some characterization of the two limiting distributions W_{1∞} and W_{2∞}.

Proposition 6 For any x > 0, the following hold:
(a) W_{2∞}(0) first-order stochastically dominates W_{1∞}(0) in that P[W_{2∞}(0) ≥ x] > P[W_{1∞}(0) ≥ x].
(b) P[W_{1∞}(||λ||²) ≥ x] strictly increases with ||λ||², and lim_{||λ||→∞} P[W_{1∞}(||λ||²) ≥ x] = 1.
(c) P[W_{2∞}(||λ||²) ≥ x] strictly increases with ||λ||², and lim_{||λ||→∞} P[W_{2∞}(||λ||²) ≥ x] = 1.

Proposition 6(a) is intuitive. W_{2∞} first-order stochastically dominates W_{1∞} because W_{2∞} first-order stochastically dominates B_p(1)' D_pp^{−1} B_p(1), which in turn first-order stochastically dominates B_p(1)' C_pp^{−1} B_p(1), which is just W_{1∞}. According to a property of first-order stochastic dominance, we have W_{2∞} =_d W_{1∞} + W_e for some W_e ≥ 0. Intuitively, W_{2∞} shifts some of the probability mass of W_{1∞} to the right. A direct implication is that the asymptotic critical values for W_{2T} are larger than the corresponding ones for W_{1T}. The difference in critical values has implications for the power properties of the two tests. For x > 0, we have P(t_{1∞} > x) = (1/2)P(W_{1∞} > x²) and P(t_{2∞} > x) = (1/2)P(W_{2∞} > x²). It then follows from Proposition 6(a) that P(t_{2∞} > x) ≥ P(t_{1∞} > x) for x > 0. So for a one-sided test with the alternative H₁: Rθ₀ > r, critical values from t_{2∞} are larger than those from t_{1∞}. Similarly, we have P(t_{2∞} < x) ≥ P(t_{1∞} < x) for x < 0. This implies that for a one-sided test with the alternative H₁: Rθ₀ < r, critical values from t_{2∞} are smaller than those from t_{1∞}. Let W^α_{1∞} and W^α_{2∞} be the (1 − α) quantiles of the distributions W_{1∞} and W_{2∞}, respectively.
The local asymptotic power functions of the two tests are

  π₁(||δ̃₁||²; h, p, q, α) = P[W_{1∞}(||δ̃₁||²) > W^α_{1∞}],
  π₂(||δ̃₂||²; h, p, q, α) = P[W_{2∞}(||δ̃₂||²) > W^α_{2∞}].

While ||δ̃₂||² ≥ ||δ̃₁||², we also have W^α_{2∞} > W^α_{1∞}. The effects of the critical values and the noncentrality parameters move in opposite directions. It is not straightforward to compare the two power functions. However, Proposition 6 suggests that if the difference in the noncentrality parameters is large enough to offset the increase in critical values, then the two-step test based on W_{2T} will be more powerful.

To evaluate ||δ̃₂||² − ||δ̃₁||², we define

  ρ_R = (RΩ_{11}R')^{−1/2} (RΩ_{12}) Ω_{22}^{−1/2},  (10)

which is the long run correlation matrix between Ru_{1t} and u_{2t}. In terms of ρ_R ∈ R^{p×q}, we have

  ||δ̃₂||² − ||δ̃₁||² = δ̄' { (I_p − ρ_R ρ'_R)^{−1} − I_p } δ̄, where δ̄ := (RΩ_{11}R')^{−1/2} δ₀.

So the difference in the noncentrality parameters depends on the matrix ρ_R ρ'_R. Let ρ_R ρ'_R = Σ_{i=1}^p λ_{i,R} a_{i,R} a'_{i,R} be the eigen decomposition of ρ_R ρ'_R, where {λ_{i,R}} are the eigenvalues of ρ_R ρ'_R and {a_{i,R}} are the corresponding eigenvectors. Sorted in descending order, {λ_{i,R}} are the (squared) long run canonical correlation coefficients between Ru_{1t} and u_{2t}. Then

  ||δ̃₂||² − ||δ̃₁||² = Σ_{i=1}^p [λ_{i,R} / (1 − λ_{i,R})] (a'_{i,R} δ̄)².

Consider the special case in which λ_{p,R} := min_{i=1,...,p} λ_{i,R} approaches 1. If a'_{p,R} δ̄ ≠ 0, then ||δ̃₂||² − ||δ̃₁||² ≥ [λ_{p,R}/(1 − λ_{p,R})](a'_{p,R} δ̄)², and hence ||δ̃₂||² − ||δ̃₁||² approaches ∞ as λ_{p,R} approaches 1 from below. This case happens when the second block of moment conditions has very high long run prediction power for the first block. In this case, we expect the W_{2T} test to be more powerful, as lim_{λ_{p,R}→1} π₂(||δ̃₂||²) = 1. Consider another special case in which max_{i=1,...,p} λ_{i,R} = 0, i.e., ρ_R is a matrix of zeros. In this case, the second block of moment conditions contains no additional information, and we have ||δ̃₂||² = ||δ̃₁||². In this case, we expect the W_{2T} test to be less powerful. It follows from Proposition 6(b) and (c) that for any ||δ̃₁||², there exists a unique τ(||δ̃₁||²) := τ(||δ̃₁||²; h, p, q, α) such that

  π₂(τ(||δ̃₁||²)) = π₁(||δ̃₁||²).

As a function of ||δ̃₁||², τ(·) is defined implicitly via the above equation. Then π₂(||δ̃₂||²) < π₁(||δ̃₁||²) if and only if ||δ̃₂||² < τ(||δ̃₁||²). Using

  ||δ̃₂||² − τ(||δ̃₁||²) = Σ_{i=1}^p [ (λ_{i,R} − f(||δ̃₁||²)) / ((1 − λ_{i,R})(1 − f(||δ̃₁||²))) ] (a'_{i,R} δ̄)²,

where $f(\cdot)$ is defined according to
$$f(\delta^2) := f(\delta^2; h, p, q, \alpha) = 1 - \frac{\delta^2}{\delta_2^2\left(\pi_1(\delta^2; h, p, q, \alpha); h, p, q, \alpha\right)},$$
we can prove the proposition below.

Proposition 7 Let Assumptions 1-3 hold. Define $A(\delta_0^2) = \{\tilde\lambda \in \mathbb{R}^p : \|\tilde\lambda\|^2 = \delta_0^2\}$. Consider the local alternative $H_1(\delta_0^2): R\theta_0 = r + \tilde\lambda/\sqrt{T}$ for $\tilde\lambda \in A(\delta_0^2)$ and the fixed-smoothing asymptotics.
(a) If $\lambda_{\max}(\rho_R\rho_R') < f(\delta_0^2; h, p, q, \alpha)$, then the two-step test based on $W_{2T}$ has a lower local asymptotic power than the one-step test based on $W_{1T}$ for any $\tilde\lambda \in A(\delta_0^2)$.
(b) If $\lambda_{\min}(\rho_R\rho_R') > f(\delta_0^2; h, p, q, \alpha)$, then the two-step test based on $W_{2T}$ has a higher local asymptotic power than the one-step test based on $W_{1T}$ for any $\tilde\lambda \in A(\delta_0^2)$.

To compute $\lambda_{\max}(\rho_R\rho_R')$ and $\lambda_{\min}(\rho_R\rho_R')$, we can use the relationship
$$\lambda(\rho_R\rho_R') = \lambda\left\{ (\tilde R \Omega_{11} \tilde R')^{-1} (\tilde R \Omega_{12}) \Omega_{22}^{-1} (\Omega_{21} \tilde R') \right\}.$$
There is no need to compute the matrix square roots $(\tilde R \Omega_{11} \tilde R')^{1/2}$ and $\Omega_{22}^{1/2}$. As in the case of the variance comparison, the conditions on the canonical correlation coefficients in Proposition 7(a) and (b) are both sufficient and necessary; see the proof of the proposition for details. The conditions may appear to be strong, but the conclusions are equally strong: the power comparison results hold regardless of the direction of the local departure. If we have a particular direction in mind so that $\tilde\lambda$ is fixed and given, then we can evaluate $\|\delta_2\|^2 - \delta_2^2(\pi_1(\|\delta_1\|^2))$ directly for the given $\tilde\lambda$. If it is positive (negative), then the two-step test has a higher (lower) local asymptotic power along the given direction. When $p = 1$, which is of ultimate importance in empirical studies, $\rho_R\rho_R'$ is equal to the sum of the squared long run canonical correlation coefficients. In this case, $f(\delta_0^2; h, p, q, \alpha)$ is the threshold value of $\rho_R\rho_R'$ for assessing the relative efficiency of the two tests. More specifically, when $\rho_R\rho_R' > f(\delta_0^2; h, p, q, \alpha)$, the two-step test is more powerful than the one-step test; otherwise, the two-step test is less powerful. Proposition 7 is in parallel with Proposition 3.
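The square-root-free shortcut above is easy to use in practice. The following sketch (not from the paper; the long run variance blocks are hypothetical random inputs) checks numerically that the eigenvalues of the non-symmetric product coincide with those of $\rho_R\rho_R'$ built from explicit matrix square roots:

```python
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(0)
p, q = 2, 3

# A hypothetical SPD long run variance of (R~ u_1t, u_2t), partitioned into blocks
X = rng.standard_normal((p + q, 2 * (p + q)))
Omega = X @ X.T / X.shape[1]
O11, O12, O22 = Omega[:p, :p], Omega[:p, p:], Omega[p:, p:]

# Squared long run canonical correlations without any matrix square roots
M = np.linalg.solve(O11, O12 @ np.linalg.solve(O22, O12.T))
lam = np.sort(np.linalg.eigvals(M).real)[::-1]

# Same eigenvalues from rho_R rho_R' computed with explicit square roots
rho = np.linalg.inv(np.real(sqrtm(O11))) @ O12 @ np.linalg.inv(np.real(sqrtm(O22)))
lam2 = np.sort(np.linalg.eigvals(rho @ rho.T).real)[::-1]
print(np.allclose(lam, lam2))  # True
```

The two computations agree because the product inside $\lambda\{\cdot\}$ is similar to $\rho_R\rho_R'$, so only linear solves are needed.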
The qualitative messages of these two propositions are the same: when the long run correlation is high enough, we should estimate and exploit it to reduce the variation of our point estimator and improve the power of the associated tests. However, the thresholds are quantitatively different. The two propositions fully characterize the threshold for each criterion under consideration.

Proposition 8 Consider the case of OS LRV estimation. For any $\delta^2 \in \mathbb{R}^{+}$, we have $\delta_2^2(\pi_1(\delta^2; h, p, q, \alpha)) > \delta^2$ and hence $f(\delta^2; h, p, q, \alpha) > 0$.

Proposition 8 is intuitive. When there is no long run correlation between $\tilde R u_{1t}$ and $u_{2t}$, we have $\|\delta_2\|^2 = \|\delta_1\|^2$. In this case, the two-step $W_{2T}$ test is necessarily less powerful. The proof uses the theory of uniformly most powerful invariant tests and the theory of complete and

sufficient statistics. It is an open question whether the same strategy can be adopted to prove Proposition 8 in the case of kernel LRV estimation. Our extensive numerical work supports that $f(\delta^2; h, p, q, \alpha) > 0$ continues to hold in the kernel case. It is not easy to give an analytical expression for $f(\delta^2; h, p, q, \alpha)$, but we can compute it numerically without any difficulty. In Table 4, we consider the case of OS LRV estimation and compute the values of $f(\delta^2; K, p, q, \alpha)$ for $\delta^2 = 25$ and a range of values of $K$, $p \le 3$, and $q \le 3$. The values are nontrivial in that they are not close to the boundary values of zero or one. Similar to the asymptotic variance comparison, we find that these threshold values increase as the degree of overidentification increases and decrease as the smoothing parameter $K$ increases. For the case of kernel LRV estimation, results not reported here show that $f(\delta^2; h, p, q, \alpha)$ increases with $q$ and decreases with $h$. This is entirely analogous to the case of OS LRV estimation.

5 General Overidentified GMM Framework

In this section, we consider the general GMM framework. The parameter of interest is a $d \times 1$ vector $\theta \in \Theta \subseteq \mathbb{R}^d$. Let $v_t \in \mathbb{R}^{d_v}$ denote the vector of observations at time $t$. We assume that $\theta_0$ is the true value, an interior point of the parameter space. The moment conditions
$$E f(v_t, \theta) = 0, \quad t = 1, 2, \ldots, T,$$
hold if and only if $\theta = \theta_0$, where $f(v_t, \theta)$ is an $m \times 1$ vector of continuously differentiable functions. The process $f(v_t, \theta_0)$ may exhibit autocorrelation of unknown forms. We assume that $m \ge d$ and that the rank of $E[\partial f(v_t, \theta_0)/\partial\theta']$ is equal to $d$. That is, we consider a model that is possibly overidentified with the degree of overidentification $q = m - d$.

5.1
One-step and Two-step Estimation and Inference

Define the $m \times m$ contemporaneous covariance matrix and the LRV matrix as
$$\Sigma = E f(v_t, \theta_0) f(v_t, \theta_0)' \quad \text{and} \quad \Omega = \sum_{j=-\infty}^{\infty} \Gamma_j, \quad \text{where } \Gamma_j = E f(v_t, \theta_0) f(v_{t-j}, \theta_0)'.$$
Let
$$g_t(\theta) = \frac{1}{T} \sum_{j=1}^{t} f(v_j, \theta).$$
Given a simple positive-definite weighting matrix $W_T$ that does not depend on any unknown parameter, we can obtain an initial GMM estimator of $\theta_0$ as
$$\hat\theta_T = \arg\min_{\theta \in \Theta} g_T(\theta)' W_T^{-1} g_T(\theta).$$
For example, we may set $W_T$ equal to $I_m$. In the case of IV regression, we may set $W_T$ equal to $Z'Z/T$, where $Z$ is the matrix of the instruments. Using $\Sigma$ or $\Omega$ as the weighting matrix, we obtain the following two (infeasible) GMM estimators:
$$\tilde\theta_{1T} := \arg\min_{\theta \in \Theta} g_T(\theta)' \Sigma^{-1} g_T(\theta), \qquad (12)$$
$$\tilde\theta_{2T} := \arg\min_{\theta \in \Theta} g_T(\theta)' \Omega^{-1} g_T(\theta). \qquad (13)$$
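For linear moment conditions $f(v_t, \theta) = z_t(y_t - x_t'\theta)$, the minimizer with weighting $W^{-1}$ has a closed form, so the one-step procedure can be sketched without numerical optimization. This is an illustrative sketch under that linear-IV assumption (the simulated design and helper name are ours, not the paper's):

```python
import numpy as np

def gmm_linear_iv(y, X, Z, W):
    """GMM estimator for moments f_t = z_t (y_t - x_t' theta) with weighting W^{-1}.

    Closed form: theta = (S' W^{-1} S)^{-1} S' W^{-1} m, where S = Z'X/T, m = Z'y/T.
    """
    T = len(y)
    S = Z.T @ X / T
    m = Z.T @ y / T
    A = S.T @ np.linalg.solve(W, S)
    b = S.T @ np.linalg.solve(W, m)
    return np.linalg.solve(A, b)

# A hypothetical overidentified IV design: m = 4 instruments, d = 2 parameters
rng = np.random.default_rng(1)
T, d, mdim = 500, 2, 4
Z = rng.standard_normal((T, mdim))
theta0 = np.ones(d)
X = Z[:, :d] + 0.5 * rng.standard_normal((T, d))
y = X @ theta0 + rng.standard_normal(T)

theta_init = gmm_linear_iv(y, X, Z, np.eye(mdim))   # initial estimator, W_T = I_m
u = Z * (y - X @ theta_init)[:, None]               # moment process at theta_init
Sigma_hat = u.T @ u / T                             # contemporaneous variance estimate
theta_1 = gmm_linear_iv(y, X, Z, Sigma_hat)         # feasible "one-step" estimator
print(np.round(theta_1, 2))
```

The feasible two-step estimator would replace `Sigma_hat` with an LRV estimate of the moment process in the same call.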

For the estimator $\tilde\theta_{1T}$, we use the contemporaneous covariance matrix $\Sigma$ as the weighting matrix and ignore all the serial dependence in the moment vector process $\{f(v_t, \theta_0)\}_{t=1}^{T}$. In contrast to this procedure, the second estimator $\tilde\theta_{2T}$ accounts for the long run dependence. The feasible versions of these two estimators, $\hat\theta_{1T}$ and $\hat\theta_{2T}$, can be naturally defined by replacing $\Sigma$ and $\Omega$ with their estimates $\hat\Sigma(\hat\theta_T)$ and $\hat\Omega(\hat\theta_T)$, where
$$\hat\Sigma(\hat\theta_T) := \frac{1}{T}\sum_{t=1}^{T} f(v_t, \hat\theta_T) f(v_t, \hat\theta_T)', \qquad (14)$$
$$\hat\Omega(\hat\theta_T) := \frac{1}{T}\sum_{s=1}^{T}\sum_{t=1}^{T} Q_h\!\left(\frac{s}{T}, \frac{t}{T}\right) f(v_t, \hat\theta_T) f(v_s, \hat\theta_T)'. \qquad (15)$$
To test the null hypothesis $H_0: R\theta_0 = r$ against $H_1: R\theta_0 = r + \tilde\lambda/\sqrt{T}$, we construct two different Wald statistics as follows:
$$W_{1T} := T(R\hat\theta_{1T} - r)' \left\{R \hat V_1 R'\right\}^{-1} (R\hat\theta_{1T} - r), \qquad (16)$$
$$W_{2T} := T(R\hat\theta_{2T} - r)' \left\{R \hat V_2 R'\right\}^{-1} (R\hat\theta_{2T} - r),$$
where
$$\hat V_1 = \left[G_1' \hat\Sigma(\hat\theta_{1T})^{-1} G_1\right]^{-1} \left[G_1' \hat\Sigma(\hat\theta_{1T})^{-1} \hat\Omega(\hat\theta_{1T}) \hat\Sigma(\hat\theta_{1T})^{-1} G_1\right] \left[G_1' \hat\Sigma(\hat\theta_{1T})^{-1} G_1\right]^{-1}, \qquad (17)$$
$$\hat V_2 = \left[G_2' \hat\Omega(\hat\theta_{2T})^{-1} G_2\right]^{-1},$$
and
$$G_1 = \frac{1}{T}\sum_{t=1}^{T} \frac{\partial f(v_t, \theta)}{\partial\theta'}\Big|_{\theta=\hat\theta_{1T}}, \qquad G_2 = \frac{1}{T}\sum_{t=1}^{T} \frac{\partial f(v_t, \theta)}{\partial\theta'}\Big|_{\theta=\hat\theta_{2T}}.$$
These are the standard Wald test statistics in the GMM framework. To compare the two estimators $\hat\theta_{1T}$ and $\hat\theta_{2T}$ and the associated tests, we maintain the standard assumptions below.

Assumption 4 As $T \to \infty$ for a fixed $h$: $\hat\theta_T = \theta_0 + o_p(1)$, $\hat\theta_{1T} = \theta_0 + o_p(1)$, and $\hat\theta_{2T} = \theta_0 + o_p(1)$ for an interior point $\theta_0 \in \Theta$.

Assumption 5 Define $G_t(\theta) = T^{-1}\sum_{j=1}^{t} \partial f(v_j, \theta)/\partial\theta'$ for $t \ge 1$ and $G_0(\theta) = 0$. For any $\tilde\theta_T = \theta_0 + o_p(1)$, the following hold: (i) $\operatorname{plim}_{T\to\infty} G_{[rT]}(\tilde\theta_T) = rG$ uniformly in $r$, where $G = G(\theta_0)$ and $G(\theta) = E\,\partial f(v_t, \theta)/\partial\theta'$; (ii) $\hat\Sigma(\tilde\theta_T) \to^{p} \Sigma > 0$; (iii) $\Omega$, $\Sigma$, $G'\Sigma^{-1}G$, and $G'\Omega^{-1}G$ are all nonsingular.

With these assumptions and some mild conditions, the standard GMM theory gives us
$$\sqrt{T}(\hat\theta_{1T} - \theta_0) = -\left[G'\Sigma^{-1}G\right]^{-1} G'\Sigma^{-1} \frac{1}{\sqrt{T}}\sum_{t=1}^{T} f(v_t, \theta_0) + o_p(1).$$
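The two covariance estimates $\hat\Sigma$ and $\hat\Omega$ above can be sketched directly. In this illustration we take $Q_h(r,s) = k((r-s)/b)$ with the Bartlett kernel $k$, one admissible choice; the moment process is a simulated AR(1), and both choices are our assumptions for the example:

```python
import numpy as np

def sigma_hat(u):
    """Contemporaneous covariance estimate: (1/T) sum_t u_t u_t'."""
    return u.T @ u / u.shape[0]

def omega_hat(u, b):
    """Quadratic-form LRV estimate (1/T) sum_{s,t} Q_h(s/T, t/T) u_t u_s',
    with Q_h(r, s) = max(1 - |r - s|/b, 0), the Bartlett choice."""
    T = u.shape[0]
    r = np.arange(T) / T
    K = np.maximum(1.0 - np.abs(r[:, None] - r[None, :]) / b, 0.0)
    return u.T @ K @ u / T

rng = np.random.default_rng(2)
T, m = 400, 3
e = rng.standard_normal((T, m))
u = np.empty_like(e)
u[0] = e[0]
for t in range(1, T):            # AR(1) moment process: autocorrelated by design
    u[t] = 0.6 * u[t - 1] + e[t]

S, O = sigma_hat(u), omega_hat(u, b=0.1)
print(S.shape, O.shape)          # both (3, 3)
```

With positive autocorrelation, the LRV estimate is larger than the contemporaneous variance estimate, which is exactly the wedge the two-step weighting exploits.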

Under the fixed-smoothing asymptotics, Sun (2014b) establishes the representation
$$\sqrt{T}(\hat\theta_{2T} - \theta_0) = -\left[G'\Omega_\infty^{-1}G\right]^{-1} G'\Omega_\infty^{-1} \frac{1}{\sqrt{T}}\sum_{t=1}^{T} f(v_t, \theta_0) + o_p(1),$$
where $\Omega_\infty$ is defined in a similar way as in Proposition 1: $\Omega_\infty = \Omega^{1/2} \tilde C_\infty (\Omega^{1/2})'$. Due to the complicated structure of the two transformed moment vector processes, it is not straightforward to compare the asymptotic distributions of $\hat\theta_{1T}$ and $\hat\theta_{2T}$ as in Sections 3 and 4. To confront this challenge, we let
$$G = U_{m\times m} \Lambda_{m\times d} V_{d\times d}'$$
be a singular value decomposition (SVD) of $G$, where $\Lambda = (A_{d\times d}', O_{d\times q}')'$, $A$ is a $d \times d$ diagonal matrix, and $O$ is a matrix of zeros. Also, we define
$$f^u(v_t, \theta) = \left(f_1^u(v_t, \theta)', f_2^u(v_t, \theta)'\right)' := U' f(v_t, \theta) \in \mathbb{R}^m,$$
where $f_1^u(v_t, \theta) \in \mathbb{R}^d$ and $f_2^u(v_t, \theta) \in \mathbb{R}^q$ are the rotated moment conditions. The variance and long run variance matrices of $\{f^u(v_t, \theta_0)\}$ are
$$\Sigma^u := U'\Sigma U = \begin{pmatrix} \Sigma_{11}^u & \Sigma_{12}^u \\ \Sigma_{21}^u & \Sigma_{22}^u \end{pmatrix} \quad \text{and} \quad \Omega^u := U'\Omega U,$$
respectively. To convert the variance matrix into an identity matrix, we define the normalized moment conditions below:
$$f^o(v_t, \theta) = \left(f_1^o(v_t, \theta)', f_2^o(v_t, \theta)'\right)' := (\Sigma^{u,1/2})^{-1} f^u(v_t, \theta), \quad \Sigma^{u,1/2} = \begin{pmatrix} (\Sigma_{11\cdot 2}^u)^{1/2} & \Sigma_{12}^u (\Sigma_{22}^u)^{-1/2} \\ O & (\Sigma_{22}^u)^{1/2} \end{pmatrix}, \qquad (18)$$
where $\Sigma_{11\cdot 2}^u = \Sigma_{11}^u - \Sigma_{12}^u (\Sigma_{22}^u)^{-1} \Sigma_{21}^u$. More specifically,
$$f_1^o(v_t, \theta) := (\Sigma_{11\cdot 2}^u)^{-1/2}\left[ f_1^u(v_t, \theta) - \Sigma_{12}^u (\Sigma_{22}^u)^{-1} f_2^u(v_t, \theta) \right] \in \mathbb{R}^d,$$
$$f_2^o(v_t, \theta) := (\Sigma_{22}^u)^{-1/2} f_2^u(v_t, \theta) \in \mathbb{R}^q.$$
Then the contemporaneous variance of the time series $\{f^o(v_t, \theta_0)\}$ is $I_m$ and the long run variance is $\Omega^o := (\Sigma^{u,1/2})^{-1} \Omega^u [(\Sigma^{u,1/2})^{-1}]'$.

Lemma 9 Let Assumptions 1-5 hold with $u_t$ replaced by $f^o(v_t, \theta_0)$ in Assumptions 2 and 3. Then as $T \to \infty$ for a fixed $h > 0$,
$$(\Sigma_{11\cdot 2}^u)^{-1/2} A V' \sqrt{T}(\hat\theta_{1T} - \theta_0) = -\frac{1}{\sqrt{T}}\sum_{t=1}^{T} f_1^o(v_t, \theta_0) + o_p(1) \overset{d}{\Longrightarrow} N(0, \Omega_{11}^o), \qquad (19)$$
$$(\Sigma_{11\cdot 2}^u)^{-1/2} A V' \sqrt{T}(\hat\theta_{2T} - \theta_0) = -\frac{1}{\sqrt{T}}\sum_{t=1}^{T} \left[ f_1^o(v_t, \theta_0) - \Theta_\infty f_2^o(v_t, \theta_0) \right] + o_p(1) \overset{d}{\Longrightarrow} MN, \qquad (20)$$
a mixed normal limit, where $\Theta_\infty := \tilde C_{\infty,12} \tilde C_{\infty,22}^{-1}$ is the same as in Proposition 1.
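The SVD rotation and the block-triangular normalization can be verified numerically. A sketch with hypothetical random inputs (any Jacobian $G$ and SPD $\Sigma$ work) confirms that the square root above reproduces $\Sigma^u$ and that the normalized moments have identity contemporaneous variance:

```python
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(3)
m, d = 5, 2
q = m - d

G = rng.standard_normal((m, d))                 # Jacobian of the moment conditions
X = rng.standard_normal((m, 3 * m))
Sigma = X @ X.T / X.shape[1]                    # some SPD contemporaneous variance

# Rotate: G = U Lam V', then f^u = U' f
U, s, Vt = np.linalg.svd(G)                     # full SVD: U is m x m
Sigma_u = U.T @ Sigma @ U
S11, S12, S22 = Sigma_u[:d, :d], Sigma_u[:d, d:], Sigma_u[d:, d:]
S112 = S11 - S12 @ np.linalg.solve(S22, S12.T)  # Sigma^u_{11.2}

# Block upper-triangular square root of Sigma^u
Shalf = np.zeros((m, m))
Shalf[:d, :d] = np.real(sqrtm(S112))
Shalf[:d, d:] = S12 @ np.linalg.inv(np.real(sqrtm(S22)))
Shalf[d:, d:] = np.real(sqrtm(S22))
print(np.allclose(Shalf @ Shalf.T, Sigma_u))    # True: valid square root

# Normalized moments f^o = Shalf^{-1} f^u have identity contemporaneous variance
L = np.linalg.inv(Shalf)
print(np.allclose(L @ Sigma_u @ L.T, np.eye(m)))  # True
```

The triangular structure is what makes $f_2^o$ depend on $f_2^u$ only, so the second block of normalized moments stays "clean" of the first.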

Lemma 9 casts the stochastic expansions of the two estimators in the same form. To the best of our knowledge, these representations are new in the econometric literature and may be of independent interest. Lemma 9 enables us to directly compare the asymptotic properties of the one-step and two-step estimators and the associated tests. It follows from the proof of the lemma that
$$(\Sigma_{11\cdot 2}^u)^{-1/2} A V' \sqrt{T}(\tilde\theta_{2T} - \theta_0) = -\frac{1}{\sqrt{T}}\sum_{t=1}^{T} \left[ f_1^o(v_t, \theta_0) - \Theta_0 f_2^o(v_t, \theta_0) \right] + o_p(1),$$
where $\Theta_0 = \Omega_{12}^o (\Omega_{22}^o)^{-1}$ as defined before. So the difference between the feasible and infeasible two-step GMM estimators lies in the uncertainty in estimating $\Theta_0$. While the true value of $\Theta_0$ appears in the asymptotic distribution of the infeasible estimator $\tilde\theta_{2T}$, the fixed-smoothing limit $\Theta_\infty$ of the implied estimator $\hat\Theta := \hat\Omega_{12}^o (\hat\Omega_{22}^o)^{-1}$ appears in that of the feasible estimator $\hat\theta_{2T}$. It is important to point out that the estimation uncertainty in the whole weighting matrix $\hat\Omega$ matters only through that in $\hat\Theta$. If we let $(u_{1t}', u_{2t}')' = (f_1^o(v_t, \theta_0)', f_2^o(v_t, \theta_0)')'$, then the right hand sides of (19) and (20) are exactly the same as what we would obtain in the location model. The location model, as simple as it is, has implications for general settings from an asymptotic point of view. More specifically, define
$$y_{1t} = (\Sigma_{11\cdot 2}^u)^{-1/2} A V' \theta_0 + u_{1t}, \qquad y_{2t} = u_{2t},$$
where $u_{1t} = f_1^o(v_t, \theta_0)$ and $u_{2t} = f_2^o(v_t, \theta_0)$. The estimation and inference problems in the GMM setting are asymptotically equivalent to those in the above simple location model with $\{y_{1t}, y_{2t}\}$ as the observations. To present our next theorem, we transform $R$ into $\tilde R$, which has the same dimension as $R$:
$$\tilde R = R V A^{-1} (\Sigma_{11\cdot 2}^u)^{1/2}. \qquad (21)$$
We let
$$\tilde C_\infty(h, p, q) = \left[\int_0^1\!\!\int_0^1 Q_h^*(r, s)\, dB_p(r)\, dB_q(s)'\right] \left[\int_0^1\!\!\int_0^1 Q_h^*(r, s)\, dB_q(r)\, dB_q(s)'\right]^{-1},$$
which is compatible with the definition in (2). We define
$$\rho = (\Omega_{11}^o)^{-1/2} \Omega_{12}^o (\Omega_{22}^o)^{-1/2} \in \mathbb{R}^{d\times q} \quad \text{and} \quad \rho_R = (\tilde R \Omega_{11}^o \tilde R')^{-1/2} (\tilde R \Omega_{12}^o) (\Omega_{22}^o)^{-1/2} \in \mathbb{R}^{p\times q}.$$
While $\rho$ is the long run correlation matrix between $f_1^o(v_t, \theta_0)$ and $f_2^o(v_t, \theta_0)$, $\rho_R$ is the long run correlation matrix between $\tilde R f_1^o(v_t, \theta_0)$ and $f_2^o(v_t, \theta_0)$.
The corresponding long run canonical correlation coefficients are
$$\lambda(\rho\rho') \quad \text{and} \quad \lambda(\rho_R\rho_R') = \lambda\left\{ (\tilde R \Omega_{11}^o \tilde R')^{-1} (\tilde R \Omega_{12}^o)(\Omega_{22}^o)^{-1}(\Omega_{21}^o \tilde R') \right\}.$$
For the location model considered before, $G = (I_d, O_{d\times q})'$ and so $U = I_m$, $A = I_d$, and $V = I_d$. Given the assumption that $\Sigma = \Sigma^u = I_m$, which implies that $\Sigma_{11\cdot 2}^u = I_d$, we have $\tilde R = R$. So the above definition of $\rho_R$ is identical to that in (10).

Theorem 10 Let the assumptions in Lemma 9 hold. Define $A(\delta_0^2) = \{\tilde\lambda : \tilde\lambda'[R(G'\Omega^{-1}G)^{-1}R']^{-1}\tilde\lambda = \delta_0^2\}$. Consider the local alternative $H_1(\delta_0^2): R\theta_0 = r + \tilde\lambda/\sqrt{T}$ for $\tilde\lambda \in A(\delta_0^2)$ and the fixed-smoothing asymptotics.
(a) If $\lambda_{\max}(\rho_R\rho_R') < g(h, q)$, then $R\hat\theta_{2T}$ has a larger asymptotic variance than $R\hat\theta_{1T}$.
(b) If $\lambda_{\min}(\rho_R\rho_R') > g(h, q)$, then $R\hat\theta_{2T}$ has a smaller asymptotic variance than $R\hat\theta_{1T}$.
(c) If $\lambda_{\max}(\rho_R\rho_R') < f(\delta_0^2; h, p, q, \alpha)$, then the two-step test is asymptotically less powerful than the one-step test for any $\tilde\lambda \in A(\delta_0^2)$.
(d) If $\lambda_{\min}(\rho_R\rho_R') > f(\delta_0^2; h, p, q, \alpha)$, then the two-step test is asymptotically more powerful than the one-step test for any $\tilde\lambda \in A(\delta_0^2)$.

If $R = I_d$, then $\tilde R$ is a square matrix with a full rank. Since the long run canonical correlation coefficients are invariant to a full-rank linear transformation, we have $\lambda(\rho_R\rho_R') = \lambda(\rho\rho')$. It then follows from Theorem 10(a) and (b) that (i) if $\lambda_{\max}(\rho\rho') < g(h, q)$, then $\operatorname{avar}(\hat\theta_{2T}) > \operatorname{avar}(\hat\theta_{1T})$; (ii) if $\lambda_{\min}(\rho\rho') > g(h, q)$, then $\operatorname{avar}(\hat\theta_{2T}) < \operatorname{avar}(\hat\theta_{1T})$. These results are identical to what we obtain for the location model. The only difference is that in the general GMM case we need to rotate and standardize the original moment conditions before computing the long run correlation matrix. Theorem 10 can also be applied to a general location model with a nonscalar error variance, in which case $\tilde R = R(\Sigma_{11\cdot 2})^{1/2}$.

5.2 GMM Estimation and Inference with a Working Weighting Matrix

In the previous subsection, we employ two specific weighting matrices: the variance and long run variance estimators. In this subsection, we consider a general weighting matrix $W_T(\hat\theta_T)$, which may depend on the initial estimator $\hat\theta_T$ and the sample size $T$, leading to yet another GMM estimator:
$$\hat\theta_{aT} = \arg\min_{\theta\in\Theta} g_T(\theta)' \left[W_T(\hat\theta_T)\right]^{-1} g_T(\theta),$$
where the subscript $a$ signifies "another" or "alternative." An example of $W_T(\hat\theta_T)$ is the implied LRV matrix when we employ a simple approximating parametric model to capture the dynamics in the moment process. We could also use the general LRV estimator, but we choose a large $h$ so that the variation in $W_T(\hat\theta_T)$ is small.
In the kernel LRV estimation, this amounts to including only autocovariances of low orders in constructing $W_T(\hat\theta_T)$. We assume that $W_T(\hat\theta_T) \to^{p} W_\infty$, a positive definite nonrandom matrix, under the fixed-smoothing asymptotics. $W_\infty$ may not be equal to the variance or the long run variance of the moment process. We call $W_T(\hat\theta_T)$ a working weighting matrix. This is in the same spirit as using a working correlation matrix rather than a true correlation matrix in the generalized estimating equations (GEE) setting; see, for example, Liang and Zeger (1986). In parallel to (16), we construct the test statistic
$$W_{aT} := T(R\hat\theta_{aT} - r)' \left\{R \hat V_a R'\right\}^{-1} (R\hat\theta_{aT} - r),$$

where, for $G_a = T^{-1}\sum_{t=1}^{T} \partial f(v_t, \theta)/\partial\theta'|_{\theta=\hat\theta_{aT}}$, $\hat V_a$ is defined according to
$$\hat V_a = \left[G_a' W_T(\hat\theta_{aT})^{-1} G_a\right]^{-1} \left[G_a' W_T(\hat\theta_{aT})^{-1} \hat\Omega(\hat\theta_{aT}) W_T(\hat\theta_{aT})^{-1} G_a\right] \left[G_a' W_T(\hat\theta_{aT})^{-1} G_a\right]^{-1},$$
which is a standard variance estimator for $\hat\theta_{aT}$. Define $W^u = U' W_\infty U$ and
$$W^o = (\Sigma^{u,1/2})^{-1} W^u \left[(\Sigma^{u,1/2})^{-1}\right]' = \begin{pmatrix} W_{11}^o & W_{12}^o \\ W_{21}^o & W_{22}^o \end{pmatrix}, \qquad \Theta_a = W_{12}^o (W_{22}^o)^{-1}.$$
Using the same argument for proving Lemma 9, we can show that
$$(\Sigma_{11\cdot 2}^u)^{-1/2} A V' \sqrt{T}(\hat\theta_{aT} - \theta_0) = -\frac{1}{\sqrt{T}}\sum_{t=1}^{T} \left[ f_1^o(v_t, \theta_0) - \Theta_a f_2^o(v_t, \theta_0) \right] + o_p(1). \qquad (22)$$
The above representation is the same as that in (20), except that $\Theta_\infty$ is now replaced by $\Theta_a$. Let $V_a$ and $V_{a,R}$ be the long run variances of $f_1^o(v_t, \theta_0) - \Theta_a f_2^o(v_t, \theta_0)$ and $\tilde R[f_1^o(v_t, \theta_0) - \Theta_a f_2^o(v_t, \theta_0)]$, respectively. The long run correlation matrices are
$$\rho_a = V_a^{-1/2}(\Omega_{12}^o - \Theta_a \Omega_{22}^o)(\Omega_{22}^o)^{-1/2} \quad \text{and} \quad \rho_{a,R} = V_{a,R}^{-1/2} \tilde R(\Omega_{12}^o - \Theta_a \Omega_{22}^o)(\Omega_{22}^o)^{-1/2}.$$
The corresponding long run canonical correlation coefficients are
$$\lambda(\rho_a\rho_a') = \lambda\left\{ V_a^{-1}(\Omega_{12}^o - \Theta_a \Omega_{22}^o)(\Omega_{22}^o)^{-1}(\Omega_{12}^o - \Theta_a \Omega_{22}^o)' \right\}$$
and
$$\lambda(\rho_{a,R}\rho_{a,R}') = \lambda\left\{ V_{a,R}^{-1} \tilde R(\Omega_{12}^o - \Theta_a \Omega_{22}^o)(\Omega_{22}^o)^{-1}(\Omega_{12}^o - \Theta_a \Omega_{22}^o)' \tilde R' \right\}.$$

Theorem 11 Let the assumptions in Lemma 9 hold. Assume further that $W_T(\hat\theta_T) \to^{p} W_\infty$, a positive definite nonrandom matrix. Consider the local alternative $H_1(\delta_0^2)$ and the fixed-smoothing asymptotics.
(a) If $\lambda_{\max}(\rho_{a,R}\rho_{a,R}') < g(h, q)$, then $R\hat\theta_{2T}$ has a larger asymptotic variance than $R\hat\theta_{aT}$.
(b) If $\lambda_{\min}(\rho_{a,R}\rho_{a,R}') > g(h, q)$, then $R\hat\theta_{2T}$ has a smaller asymptotic variance than $R\hat\theta_{aT}$.
(c) If $\lambda_{\max}(\rho_{a,R}\rho_{a,R}') < f(\delta_0^2; h, p, q, \alpha)$, then the two-step test based on $W_{2T}$ is asymptotically less powerful than the test based on $W_{aT}$ for any $\tilde\lambda \in A(\delta_0^2)$.
(d) If $\lambda_{\min}(\rho_{a,R}\rho_{a,R}') > f(\delta_0^2; h, p, q, \alpha)$, then the two-step test based on $W_{2T}$ is asymptotically more powerful than the test based on $W_{aT}$ for any $\tilde\lambda \in A(\delta_0^2)$.

Theorem 11 is entirely analogous to Theorem 10. The only difference is that the second block of moment conditions is removed from the first block using the implied matrix coefficient $\Theta_a$ before computing the long run correlation coefficients. When $R = I_d$, $\tilde R$ becomes a square matrix, and we have $\lambda(\rho_{a,R}\rho_{a,R}') = \lambda(\rho_a\rho_a')$. Theorem 11(a) and (b) give the conditions under which $\hat\theta_{2T}$ is asymptotically more (or less) efficient than $\hat\theta_{aT}$.
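One concrete working weighting matrix of the kind described above is the LRV implied by fitting a first-order autoregression to the moment process. A sketch of that construction, assuming the standard least-squares fit and implied-LRV formula (the helper name and simulated data are ours):

```python
import numpy as np

def var1_implied_lrv(u):
    """Fit u_t = A u_{t-1} + e_t by least squares and return the implied LRV
    (I - A)^{-1} Sigma_e (I - A')^{-1}, a 'working' long run variance matrix."""
    Y, X = u[1:], u[:-1]
    A = np.linalg.lstsq(X, Y, rcond=None)[0].T       # fitted coefficient matrix
    E = Y - X @ A.T
    Sigma_e = E.T @ E / len(E)                       # innovation variance estimate
    B = np.linalg.inv(np.eye(u.shape[1]) - A)
    return B @ Sigma_e @ B.T

rng = np.random.default_rng(4)
T, m = 600, 3
u = np.empty((T, m))
u[0] = rng.standard_normal(m)
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + rng.standard_normal(m)   # true coefficient: 0.5 I

W = var1_implied_lrv(u)
print(W.shape)  # (3, 3)
```

If the moment process is not exactly AR(1), this $W$ converges to a pseudo-true limit $W_\infty$ that differs from $\Omega$, which is precisely the working-matrix setting of Theorem 11.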

To understand the theorem, we can see that the effective moment conditions behind $R\hat\theta_{aT}$ are
$$E f_a(v_t, \theta_0) = 0 \quad \text{for} \quad f_a(v_t, \theta_0) = \tilde R\left[ f_1^o(v_t, \theta_0) - \Theta_a f_2^o(v_t, \theta_0) \right].$$
$R\hat\theta_{aT}$ uses the information in $E f_2^o(v_t, \theta_0) = 0$ to some extent, but it ignores the residual information that is still potentially available from $E f_2^o(v_t, \theta_0) = 0$. In contrast, $R\hat\theta_{2T}$ attempts to explore the residual information. If there is no long run correlation between $f_a(v_t, \theta_0)$ and $f_2^o(v_t, \theta_0)$, i.e., $\rho_{a,R} = 0$, then all the information in $E f_2^o(v_t, \theta_0) = 0$ has been fully captured by the effective moment conditions underlying $R\hat\theta_{aT}$. As a result, the test based on $R\hat\theta_{aT}$ necessarily outperforms that based on $R\hat\theta_{2T}$. If the long run correlation $\rho_{a,R}$ is large enough in the sense given in Theorem 11(d), the test based on $R\hat\theta_{2T}$ could be more powerful than that based on $R\hat\theta_{aT}$ in large samples.

6 Simulation Evidence and Practical Guidance

This section compares the finite sample performances of the one-step and two-step estimators and tests using the fixed-smoothing approximations. We consider the location model given in (1) with the true parameter value $\theta_0 = (0, \ldots, 0)' \in \mathbb{R}^d$, but we allow for a nonscalar error variance. The error $\{u_t\}$ follows a VAR(1) process:
$$u_{1t}^i = \psi u_{1,t-1}^i + \frac{\psi}{\sqrt{q}} \sum_{j=1}^{q} u_{2,t-1}^j + e_{1t}^i \quad \text{for } i = 1, \ldots, d, \qquad (23)$$
$$u_{2t}^i = \psi u_{2,t-1}^i + e_{2t}^i \quad \text{for } i = 1, \ldots, q,$$
where $e_{1t}^i \sim$ iid $N(0,1)$ across $i$ and $t$, $e_{2t}^i \sim$ iid $N(0,1)$ across $i$ and $t$, and $\{e_{1t}, t = 1, 2, \ldots, T\}$ are independent of $\{e_{2t}, t = 1, 2, \ldots, T\}$. Let $u_t := ((u_{1t})', (u_{2t})')' \in \mathbb{R}^m$; then $u_t = \Lambda u_{t-1} + e_t$, where
$$\Lambda_{m\times m} = \begin{pmatrix} \psi I_d & \frac{\psi}{\sqrt{q}} J_{d,q} \\ O & \psi I_q \end{pmatrix}, \qquad e_t = \begin{pmatrix} e_{1t} \\ e_{2t} \end{pmatrix} \sim \text{iid } N(0, I_m),$$
and $J_{d,q}$ is the $d \times q$ matrix of ones.
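This design can be checked numerically. The sketch below (assuming SciPy for the Lyapunov solve) builds $\Lambda$, recovers the long run variance $(I_m - \Lambda)^{-1}(I_m - \Lambda')^{-1}$ and the contemporaneous variance from $\Sigma = \Lambda\Sigma\Lambda' + I_m$, and computes the largest squared long run canonical correlation between $u_{1t} - \Sigma_{12}\Sigma_{22}^{-1}u_{2t}$ and $u_{2t}$, which matches the closed form for $\lambda_{\max}(\rho\rho')$ derived in the text:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

d, q, psi = 3, 2, 0.5
m = d + q

# Companion matrix of the VAR(1) error design
Lam = np.zeros((m, m))
Lam[:d, :d] = psi * np.eye(d)
Lam[:d, d:] = psi / np.sqrt(q) * np.ones((d, q))
Lam[d:, d:] = psi * np.eye(q)

Iinv = np.linalg.inv(np.eye(m) - Lam)
Omega = Iinv @ Iinv.T                              # long run variance
Sigma = solve_discrete_lyapunov(Lam, np.eye(m))    # Sigma = Lam Sigma Lam' + I_m

# Residualize u_1 on u_2 contemporaneously, then long run canonical correlations
Tr = np.eye(m)
Tr[:d, d:] = -Sigma[:d, d:] @ np.linalg.inv(Sigma[d:, d:])
Om = Tr @ Omega @ Tr.T
M = np.linalg.solve(Om[:d, :d], Om[:d, d:] @ np.linalg.solve(Om[d:, d:], Om[d:, :d]))
lam_max = np.max(np.linalg.eigvals(M).real)

closed_form = d * psi**2 / (d * psi**2 + (1 - psi**2) ** 2)
print(np.isclose(lam_max, closed_form))  # True
```

Canonical correlations are invariant to the within-block standardizations, so the residualized series can be used directly without forming $\Sigma_{11\cdot 2}^{-1/2}$ or $\Sigma_{22}^{-1/2}$.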
Direct calculations give us the expressions for the long run and contemporaneous variances of $\{u_t\}$ as
$$\Omega = \sum_{j=-\infty}^{\infty} E u_t (u_{t-j})' = (I_m - \Lambda)^{-1}(I_m - \Lambda')^{-1} = \begin{pmatrix} \frac{1}{(1-\psi)^2} I_d + \frac{\psi^2}{(1-\psi)^4} J_{d,d} & \frac{\psi}{(1-\psi)^3\sqrt{q}} J_{d,q} \\[4pt] \frac{\psi}{(1-\psi)^3\sqrt{q}} J_{q,d} & \frac{1}{(1-\psi)^2} I_q \end{pmatrix}$$
and
$$\Sigma = \operatorname{var}(u_t) = \begin{pmatrix} \frac{1}{1-\psi^2} I_d + \frac{\psi^2(1+\psi^2)}{(1-\psi^2)^3} J_{d,d} & \frac{\psi^2}{\sqrt{q}(1-\psi^2)^2} J_{d,q} \\[4pt] \frac{\psi^2}{\sqrt{q}(1-\psi^2)^2} J_{q,d} & \frac{1}{1-\psi^2} I_q \end{pmatrix}.$$
Let $u_{1t}^o = (\Sigma_{11\cdot 2})^{-1/2}\left[u_{1t} - \Sigma_{12}(\Sigma_{22})^{-1} u_{2t}\right]$ and $u_{2t}^o = (\Sigma_{22})^{-1/2} u_{2t}$, and let $\rho$ be the long run correlation matrix between $u_{1t}^o$ and $u_{2t}^o$. With some algebraic manipulations, we have
$$\rho\rho' = \frac{\psi^2}{d\psi^2 + (1-\psi^2)^2}\, J_{d,d}. \qquad (24)$$

So the maximum eigenvalue of $\rho\rho'$ is given by
$$\lambda_{\max}(\rho\rho') = \frac{d\psi^2}{d\psi^2 + (1-\psi^2)^2} = \left(1 + \frac{(1-\psi^2)^2}{d\psi^2}\right)^{-1},$$
which is also the only nonzero eigenvalue. In addition to the VAR(1) error process, we also consider the following VARMA(1,1) process for $u_t$:
$$u_{1t}^i = \psi u_{1,t-1}^i + e_{1t}^i + \frac{\psi}{\sqrt{q}}\sum_{j=1}^{q} e_{2,t-1}^j \quad \text{for } i = 1, \ldots, d, \qquad (25)$$
$$u_{2t}^i = \psi u_{2,t-1}^i + e_{2t}^i \quad \text{for } i = 1, \ldots, q,$$
where $e_t = (e_{1t}', e_{2t}')' \sim$ iid $N(0, I_m)$. The corresponding long run covariance matrix and contemporaneous covariance matrix are
$$\Omega = \begin{pmatrix} \frac{1}{(1-\psi)^2} I_d + \frac{\psi^2}{(1-\psi)^2} J_{d,d} & \frac{\psi}{(1-\psi)^2\sqrt{q}} J_{d,q} \\[4pt] \frac{\psi}{(1-\psi)^2\sqrt{q}} J_{q,d} & \frac{1}{(1-\psi)^2} I_q \end{pmatrix}$$
and
$$\Sigma = \begin{pmatrix} \frac{1}{1-\psi^2} I_d + \frac{\psi^2}{1-\psi^2} J_{d,d} & \frac{\psi^2}{\sqrt{q}(1-\psi^2)} J_{d,q} \\[4pt] \frac{\psi^2}{\sqrt{q}(1-\psi^2)} J_{q,d} & \frac{1}{1-\psi^2} I_q \end{pmatrix}.$$
With some additional algebra, we have
$$\rho\rho' = \frac{\psi^2(1-\psi)^2}{1 + d\psi^2(1-\psi)^2}\, J_{d,d} \qquad (26)$$
and $\lambda_{\max}(\rho\rho') = \left(1 + 1/[d\psi^2(1-\psi)^2]\right)^{-1}$. Under the VARMA(1,1) design, the approximating AR(1) model is misspecified. It is not hard to obtain the probability limit of $W_T(\hat\theta_{aT})$ as
$$W_\infty = (I_m - \tilde\Lambda)^{-1} \tilde\Sigma_e (I_m - \tilde\Lambda')^{-1},$$
where $\tilde\Lambda$ and $\tilde\Sigma_e$ are the pseudo-true coefficient matrix and innovation variance of the approximating AR(1) model; $W_\infty$ is different from the true long run variance matrix $\Omega$. Based on $W_\infty$, $\Omega$, and $\Sigma$, we can compute $\lambda(\rho_a\rho_a')$ and $\lambda(\rho_{a,R}\rho_{a,R}')$.

For the basis functions in the OS LRV estimation, we choose the following orthonormal basis functions $\{\phi_j\}_{j=1}^{K}$ in the $L^2[0,1]$ space:
$$\phi_{2j-1}(x) = \sqrt{2}\cos(2j\pi x), \qquad \phi_{2j}(x) = \sqrt{2}\sin(2j\pi x) \quad \text{for } j = 1, \ldots, K/2,$$
where $K$ is an even integer. We also consider kernel-based LRV estimators with the three commonly used kernels: the Bartlett, Parzen, and QS kernels. For the choice of $K$ in the OS LRV estimation, we employ the following AMSE-optimal formula in Phillips (2005):
$$K_{\mathrm{MSE}} = \left\lceil \left( \frac{\operatorname{tr}\left[(I_{m^2} + K_{mm})(\Omega \otimes \Omega)\right]}{4\,\operatorname{vec}(B)'\operatorname{vec}(B)} \right)^{1/5} T^{4/5} \right\rceil,$$
where $\lceil\cdot\rceil$ is the ceiling function, $K_{mm}$ is the $m^2 \times m^2$ commutation matrix, and
$$B = -\frac{\pi^2}{6} \sum_{j=-\infty}^{\infty} j^2\, E u_t u_{t-j}'.$$
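The OS LRV estimator built from these basis functions averages $K$ outer products of transformed moments, $\hat\Omega = K^{-1}\sum_{j=1}^{K} \Lambda_j \Lambda_j'$ with $\Lambda_j = T^{-1/2}\sum_{t=1}^{T} \phi_j(t/T) u_t$. A minimal sketch of that standard construction (the function name and the white-noise test data are ours):

```python
import numpy as np

def os_lrv(u, K):
    """Orthonormal-series LRV estimator with paired cosine/sine basis functions."""
    assert K % 2 == 0, "K must be even"
    T, m = u.shape
    x = np.arange(1, T + 1) / T
    Phi = np.empty((K, T))
    for j in range(1, K // 2 + 1):
        Phi[2 * j - 2] = np.sqrt(2.0) * np.cos(2 * np.pi * j * x)
        Phi[2 * j - 1] = np.sqrt(2.0) * np.sin(2 * np.pi * j * x)
    Lam = Phi @ u / np.sqrt(T)     # K x m matrix of transformed moments Lambda_j
    return Lam.T @ Lam / K         # Omega_hat = K^{-1} sum_j Lambda_j Lambda_j'

rng = np.random.default_rng(5)
u = rng.standard_normal((500, 2))  # white noise: the LRV should be near I_2
Omega_hat = os_lrv(u, K=14)
print(Omega_hat.shape)  # (2, 2)
```

Because $\hat\Omega$ is an average of $K$ rank-one outer products, it is automatically positive semidefinite, and $K$ plays the role of the smoothing parameter discussed throughout.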


More information

Averaging and the Optimal Combination of Forecasts

Averaging and the Optimal Combination of Forecasts Averaging and the Optimal Combination of Forecasts Graham Elliott University of California, San Diego 9500 Gilman Drive LA JOLLA, CA, 92093-0508 September 27, 2011 Abstract The optimal combination of forecasts,

More information

Notes on Generalized Method of Moments Estimation

Notes on Generalized Method of Moments Estimation Notes on Generalized Method of Moments Estimation c Bronwyn H. Hall March 1996 (revised February 1999) 1. Introduction These notes are a non-technical introduction to the method of estimation popularized

More information

UNIVERSITY OF CALIFORNIA, SAN DIEGO DEPARTMENT OF ECONOMICS

UNIVERSITY OF CALIFORNIA, SAN DIEGO DEPARTMENT OF ECONOMICS 2-7 UNIVERSITY OF LIFORNI, SN DIEGO DEPRTMENT OF EONOMIS THE JOHNSEN-GRNGER REPRESENTTION THEOREM: N EXPLIIT EXPRESSION FOR I() PROESSES Y PETER REINHRD HNSEN DISUSSION PPER 2-7 JULY 2 The Johansen-Granger

More information

LECTURE 12 UNIT ROOT, WEAK CONVERGENCE, FUNCTIONAL CLT

LECTURE 12 UNIT ROOT, WEAK CONVERGENCE, FUNCTIONAL CLT MARCH 29, 26 LECTURE 2 UNIT ROOT, WEAK CONVERGENCE, FUNCTIONAL CLT (Davidson (2), Chapter 4; Phillips Lectures on Unit Roots, Cointegration and Nonstationarity; White (999), Chapter 7) Unit root processes

More information

The properties of L p -GMM estimators

The properties of L p -GMM estimators The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion

More information

Single Equation Linear GMM with Serially Correlated Moment Conditions

Single Equation Linear GMM with Serially Correlated Moment Conditions Single Equation Linear GMM with Serially Correlated Moment Conditions Eric Zivot October 28, 2009 Univariate Time Series Let {y t } be an ergodic-stationary time series with E[y t ]=μ and var(y t )

More information

On Size and Power of Heteroskedasticity and Autocorrelation Robust Tests

On Size and Power of Heteroskedasticity and Autocorrelation Robust Tests On Size and Power of Heteroskedasticity and Autocorrelation Robust Tests David Preinerstorfer and Benedikt M. Pötscher y Department of Statistics, University of Vienna Preliminary version: April 2012 First

More information

Supplemental Material 1 for On Optimal Inference in the Linear IV Model

Supplemental Material 1 for On Optimal Inference in the Linear IV Model Supplemental Material 1 for On Optimal Inference in the Linear IV Model Donald W. K. Andrews Cowles Foundation for Research in Economics Yale University Vadim Marmer Vancouver School of Economics University

More information

Estimation and Inference with Weak, Semi-strong, and Strong Identi cation

Estimation and Inference with Weak, Semi-strong, and Strong Identi cation Estimation and Inference with Weak, Semi-strong, and Strong Identi cation Donald W. K. Andrews Cowles Foundation Yale University Xu Cheng Department of Economics University of Pennsylvania This Version:

More information

(Y jz) t (XjZ) 0 t = S yx S yz S 1. S yx:z = T 1. etc. 2. Next solve the eigenvalue problem. js xx:z S xy:z S 1

(Y jz) t (XjZ) 0 t = S yx S yz S 1. S yx:z = T 1. etc. 2. Next solve the eigenvalue problem. js xx:z S xy:z S 1 Abstract Reduced Rank Regression The reduced rank regression model is a multivariate regression model with a coe cient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure,

More information

7 Semiparametric Methods and Partially Linear Regression

7 Semiparametric Methods and Partially Linear Regression 7 Semiparametric Metods and Partially Linear Regression 7. Overview A model is called semiparametric if it is described by and were is nite-dimensional (e.g. parametric) and is in nite-dimensional (nonparametric).

More information

Economics 620, Lecture 18: Nonlinear Models

Economics 620, Lecture 18: Nonlinear Models Economics 620, Lecture 18: Nonlinear Models Nicholas M. Kiefer Cornell University Professor N. M. Kiefer (Cornell University) Lecture 18: Nonlinear Models 1 / 18 The basic point is that smooth nonlinear

More information

Inference on a Structural Break in Trend with Fractionally Integrated Errors

Inference on a Structural Break in Trend with Fractionally Integrated Errors Inference on a Structural Break in rend with Fractionally Integrated Errors Seongyeon Chang Boston University Pierre Perron y Boston University November, Abstract Perron and Zhu (5) established the consistency,

More information

Single Equation Linear GMM with Serially Correlated Moment Conditions

Single Equation Linear GMM with Serially Correlated Moment Conditions Single Equation Linear GMM with Serially Correlated Moment Conditions Eric Zivot November 2, 2011 Univariate Time Series Let {y t } be an ergodic-stationary time series with E[y t ]=μ and var(y t )

More information

11. Bootstrap Methods

11. Bootstrap Methods 11. Bootstrap Methods c A. Colin Cameron & Pravin K. Trivedi 2006 These transparencies were prepared in 20043. They can be used as an adjunct to Chapter 11 of our subsequent book Microeconometrics: Methods

More information

Powerful Trend Function Tests That are Robust to Strong Serial Correlation with an Application to the Prebish Singer Hypothesis

Powerful Trend Function Tests That are Robust to Strong Serial Correlation with an Application to the Prebish Singer Hypothesis Economics Working Papers (22 26) Economics 4-23 Powerful Trend Function Tests That are Robust to Strong Serial Correlation with an Application to the Prebish Singer Hypothesis Helle Bunzel Iowa State University,

More information

On Standard Inference for GMM with Seeming Local Identi cation Failure

On Standard Inference for GMM with Seeming Local Identi cation Failure On Standard Inference for GMM with Seeming Local Identi cation Failure Ji Hyung Lee y Zhipeng Liao z First Version: April 4; This Version: December, 4 Abstract This paper studies the GMM estimation and

More information

Testing in GMM Models Without Truncation

Testing in GMM Models Without Truncation Testing in GMM Models Without Truncation TimothyJ.Vogelsang Departments of Economics and Statistical Science, Cornell University First Version August, 000; This Version June, 001 Abstract This paper proposes

More information

SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011

SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011 SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER By Donald W. K. Andrews August 2011 COWLES FOUNDATION DISCUSSION PAPER NO. 1815 COWLES FOUNDATION FOR RESEARCH IN ECONOMICS

More information

Lecture Notes on Measurement Error

Lecture Notes on Measurement Error Steve Pischke Spring 2000 Lecture Notes on Measurement Error These notes summarize a variety of simple results on measurement error which I nd useful. They also provide some references where more complete

More information

Inference with Dependent Data Using Cluster Covariance Estimators

Inference with Dependent Data Using Cluster Covariance Estimators Inference with Dependent Data Using Cluster Covariance Estimators C. Alan Bester, Timothy G. Conley, and Christian B. Hansen February 2008 Abstract. This paper presents a novel way to conduct inference

More information

When is it really justifiable to ignore explanatory variable endogeneity in a regression model?

When is it really justifiable to ignore explanatory variable endogeneity in a regression model? Discussion Paper: 2015/05 When is it really justifiable to ignore explanatory variable endogeneity in a regression model? Jan F. Kiviet www.ase.uva.nl/uva-econometrics Amsterdam School of Economics Roetersstraat

More information

ECON0702: Mathematical Methods in Economics

ECON0702: Mathematical Methods in Economics ECON0702: Mathematical Methods in Economics Yulei Luo SEF of HKU January 14, 2009 Luo, Y. (SEF of HKU) MME January 14, 2009 1 / 44 Comparative Statics and The Concept of Derivative Comparative Statics

More information

A Course on Advanced Econometrics

A Course on Advanced Econometrics A Course on Advanced Econometrics Yongmiao Hong The Ernest S. Liu Professor of Economics & International Studies Cornell University Course Introduction: Modern economies are full of uncertainties and risk.

More information

MC3: Econometric Theory and Methods. Course Notes 4

MC3: Econometric Theory and Methods. Course Notes 4 University College London Department of Economics M.Sc. in Economics MC3: Econometric Theory and Methods Course Notes 4 Notes on maximum likelihood methods Andrew Chesher 25/0/2005 Course Notes 4, Andrew

More information

GLS-based unit root tests with multiple structural breaks both under the null and the alternative hypotheses

GLS-based unit root tests with multiple structural breaks both under the null and the alternative hypotheses GLS-based unit root tests with multiple structural breaks both under the null and the alternative hypotheses Josep Lluís Carrion-i-Silvestre University of Barcelona Dukpa Kim Boston University Pierre Perron

More information

SIEVE INFERENCE ON SEMI-NONPARAMETRIC TIME SERIES MODELS. Xiaohong Chen, Zhipeng Liao and Yixiao Sun. February 2012

SIEVE INFERENCE ON SEMI-NONPARAMETRIC TIME SERIES MODELS. Xiaohong Chen, Zhipeng Liao and Yixiao Sun. February 2012 SIEVE INFERENCE ON SEMI-NONPARAMERIC IME SERIES MODELS By Xiaohong Chen, Zhipeng Liao and Yixiao Sun February COWLES FOUNDAION DISCUSSION PAPER NO. 849 COWLES FOUNDAION FOR RESEARCH IN ECONOMICS YALE UNIVERSIY

More information

PANEL DATA RANDOM AND FIXED EFFECTS MODEL. Professor Menelaos Karanasos. December Panel Data (Institute) PANEL DATA December / 1

PANEL DATA RANDOM AND FIXED EFFECTS MODEL. Professor Menelaos Karanasos. December Panel Data (Institute) PANEL DATA December / 1 PANEL DATA RANDOM AND FIXED EFFECTS MODEL Professor Menelaos Karanasos December 2011 PANEL DATA Notation y it is the value of the dependent variable for cross-section unit i at time t where i = 1,...,

More information

Robust Unit Root and Cointegration Rank Tests for Panels and Large Systems *

Robust Unit Root and Cointegration Rank Tests for Panels and Large Systems * February, 2005 Robust Unit Root and Cointegration Rank Tests for Panels and Large Systems * Peter Pedroni Williams College Tim Vogelsang Cornell University -------------------------------------------------------------------------------------------------------------------

More information

Optimal Bandwidth Selection in Heteroskedasticity-Autocorrelation Robust Testing

Optimal Bandwidth Selection in Heteroskedasticity-Autocorrelation Robust Testing Optimal Bandwidth Selection in Heteroskedasticity-Autocorrelation Roust esting Yixiao Sun Department of Economics University of California, San Diego Peter C. B. Phillips Cowles Foundation, Yale University,

More information

A CONDITIONAL-HETEROSKEDASTICITY-ROBUST CONFIDENCE INTERVAL FOR THE AUTOREGRESSIVE PARAMETER. Donald W.K. Andrews and Patrik Guggenberger

A CONDITIONAL-HETEROSKEDASTICITY-ROBUST CONFIDENCE INTERVAL FOR THE AUTOREGRESSIVE PARAMETER. Donald W.K. Andrews and Patrik Guggenberger A CONDITIONAL-HETEROSKEDASTICITY-ROBUST CONFIDENCE INTERVAL FOR THE AUTOREGRESSIVE PARAMETER By Donald W.K. Andrews and Patrik Guggenberger August 2011 Revised December 2012 COWLES FOUNDATION DISCUSSION

More information

Rank Estimation of Partially Linear Index Models

Rank Estimation of Partially Linear Index Models Rank Estimation of Partially Linear Index Models Jason Abrevaya University of Texas at Austin Youngki Shin University of Western Ontario October 2008 Preliminary Do not distribute Abstract We consider

More information

Department of Economics, UCSD UC San Diego

Department of Economics, UCSD UC San Diego Department of Economics, UCSD UC San Diego itle: Spurious Regressions with Stationary Series Author: Granger, Clive W.J., University of California, San Diego Hyung, Namwon, University of Seoul Jeon, Yongil,

More information

Simultaneous Choice Models: The Sandwich Approach to Nonparametric Analysis

Simultaneous Choice Models: The Sandwich Approach to Nonparametric Analysis Simultaneous Choice Models: The Sandwich Approach to Nonparametric Analysis Natalia Lazzati y November 09, 2013 Abstract We study collective choice models from a revealed preference approach given limited

More information

New Developments in Econometrics Lecture 16: Quantile Estimation

New Developments in Econometrics Lecture 16: Quantile Estimation New Developments in Econometrics Lecture 16: Quantile Estimation Jeff Wooldridge Cemmap Lectures, UCL, June 2009 1. Review of Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile

More information

What s New in Econometrics? Lecture 14 Quantile Methods

What s New in Econometrics? Lecture 14 Quantile Methods What s New in Econometrics? Lecture 14 Quantile Methods Jeff Wooldridge NBER Summer Institute, 2007 1. Reminders About Means, Medians, and Quantiles 2. Some Useful Asymptotic Results 3. Quantile Regression

More information

Choice of Spectral Density Estimator in Ng-Perron Test: Comparative Analysis

Choice of Spectral Density Estimator in Ng-Perron Test: Comparative Analysis MPRA Munich Personal RePEc Archive Choice of Spectral Density Estimator in Ng-Perron Test: Comparative Analysis Muhammad Irfan Malik and Atiq-ur- Rehman International Institute of Islamic Economics, International

More information

Weak - Convergence: Theory and Applications

Weak - Convergence: Theory and Applications Weak - Convergence: Theory and Applications Jianning Kong y, Peter C. B. Phillips z, Donggyu Sul x October 26, 2018 Abstract The concept of relative convergence, which requires the ratio of two time series

More information

Chapter 2. Dynamic panel data models

Chapter 2. Dynamic panel data models Chapter 2. Dynamic panel data models School of Economics and Management - University of Geneva Christophe Hurlin, Université of Orléans University of Orléans April 2018 C. Hurlin (University of Orléans)

More information

4.3 - Linear Combinations and Independence of Vectors

4.3 - Linear Combinations and Independence of Vectors - Linear Combinations and Independence of Vectors De nitions, Theorems, and Examples De nition 1 A vector v in a vector space V is called a linear combination of the vectors u 1, u,,u k in V if v can be

More information

Economic modelling and forecasting

Economic modelling and forecasting Economic modelling and forecasting 2-6 February 2015 Bank of England he generalised method of moments Ole Rummel Adviser, CCBS at the Bank of England ole.rummel@bankofengland.co.uk Outline Classical estimation

More information

Estimator Averaging for Two Stage Least Squares

Estimator Averaging for Two Stage Least Squares Estimator Averaging for Two Stage Least Squares Guido Kuersteiner y and Ryo Okui z This version: October 7 Abstract This paper considers model averaging as a way to select instruments for the two stage

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2016 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2016 Instructor: Victor Aguirregabiria ECOOMETRICS II (ECO 24S) University of Toronto. Department of Economics. Winter 26 Instructor: Victor Aguirregabiria FIAL EAM. Thursday, April 4, 26. From 9:am-2:pm (3 hours) ISTRUCTIOS: - This is a closed-book

More information

The Asymptotic Variance of Semi-parametric Estimators with Generated Regressors

The Asymptotic Variance of Semi-parametric Estimators with Generated Regressors The Asymptotic Variance of Semi-parametric stimators with Generated Regressors Jinyong Hahn Department of conomics, UCLA Geert Ridder Department of conomics, USC October 7, 00 Abstract We study the asymptotic

More information

A Fixed-b Perspective on the Phillips-Perron Unit Root Tests

A Fixed-b Perspective on the Phillips-Perron Unit Root Tests A Fixed-b Perspective on the Phillips-Perron Unit Root Tests Timothy J. Vogelsang Department of Economics Michigan State University Martin Wagner Department of Economics and Finance Institute for Advanced

More information

Notes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770

Notes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770 Notes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770 Jonathan B. Hill Dept. of Economics University of North Carolina - Chapel Hill November

More information

ESTIMATION AND INFERENCE WITH WEAK, SEMI-STRONG, AND STRONG IDENTIFICATION. Donald W. K. Andrews and Xu Cheng. October 2010 Revised July 2011

ESTIMATION AND INFERENCE WITH WEAK, SEMI-STRONG, AND STRONG IDENTIFICATION. Donald W. K. Andrews and Xu Cheng. October 2010 Revised July 2011 ESTIMATION AND INFERENCE WITH WEAK, SEMI-STRONG, AND STRONG IDENTIFICATION By Donald W. K. Andrews and Xu Cheng October 1 Revised July 11 COWLES FOUNDATION DISCUSSION PAPER NO. 1773R COWLES FOUNDATION

More information

Heteroskedasticity- and Autocorrelation-Robust Inference or Three Decades of HAC and HAR: What Have We Learned?

Heteroskedasticity- and Autocorrelation-Robust Inference or Three Decades of HAC and HAR: What Have We Learned? AEA Continuing Education Course ime Series Econometrics Lecture 4 Heteroskedasticity- and Autocorrelation-Robust Inference or hree Decades of HAC and HAR: What Have We Learned? James H. Stock Harvard University

More information

Simple and Trustworthy Cluster-Robust GMM Inference

Simple and Trustworthy Cluster-Robust GMM Inference Simple and Trustworthy Cluster-Robust MM Inference Jungbin Hwang Department of Economics, University of Connecticut August 30, 207 Abstract This paper develops a new asymptotic theory for two-step MM estimation

More information

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate

More information

Markov-Switching Models with Endogenous Explanatory Variables. Chang-Jin Kim 1

Markov-Switching Models with Endogenous Explanatory Variables. Chang-Jin Kim 1 Markov-Switching Models with Endogenous Explanatory Variables by Chang-Jin Kim 1 Dept. of Economics, Korea University and Dept. of Economics, University of Washington First draft: August, 2002 This version:

More information

OPTIMAL BANDWIDTH CHOICE FOR INTERVAL ESTIMATION IN GMM REGRESSION. Yixiao Sun and Peter C.B. Phillips. May 2008

OPTIMAL BANDWIDTH CHOICE FOR INTERVAL ESTIMATION IN GMM REGRESSION. Yixiao Sun and Peter C.B. Phillips. May 2008 OPIAL BANDWIDH CHOICE FOR INERVAL ESIAION IN G REGRESSION By Yixiao Sun and Peter C.B. Phillips ay 8 COWLES FOUNDAION DISCUSSION PAPER NO. 66 COWLES FOUNDAION FOR RESEARCH IN ECONOICS YALE UNIVERSIY Box

More information

MAXIMUM LIKELIHOOD ESTIMATION AND UNIFORM INFERENCE WITH SPORADIC IDENTIFICATION FAILURE. Donald W. K. Andrews and Xu Cheng.

MAXIMUM LIKELIHOOD ESTIMATION AND UNIFORM INFERENCE WITH SPORADIC IDENTIFICATION FAILURE. Donald W. K. Andrews and Xu Cheng. MAXIMUM LIKELIHOOD ESTIMATION AND UNIFORM INFERENCE WITH SPORADIC IDENTIFICATION FAILURE By Donald W. K. Andrews and Xu Cheng October COWLES FOUNDATION DISCUSSION PAPER NO. 8 COWLES FOUNDATION FOR RESEARCH

More information

Strength and weakness of instruments in IV and GMM estimation of dynamic panel data models

Strength and weakness of instruments in IV and GMM estimation of dynamic panel data models Strength and weakness of instruments in IV and GMM estimation of dynamic panel data models Jan F. Kiviet (University of Amsterdam & Tinbergen Institute) preliminary version: January 2009 JEL-code: C3;

More information

GMM, HAC estimators, & Standard Errors for Business Cycle Statistics

GMM, HAC estimators, & Standard Errors for Business Cycle Statistics GMM, HAC estimators, & Standard Errors for Business Cycle Statistics Wouter J. Den Haan London School of Economics c Wouter J. Den Haan Overview Generic GMM problem Estimation Heteroskedastic and Autocorrelation

More information

Economics 241B Review of Limit Theorems for Sequences of Random Variables

Economics 241B Review of Limit Theorems for Sequences of Random Variables Economics 241B Review of Limit Theorems for Sequences of Random Variables Convergence in Distribution The previous de nitions of convergence focus on the outcome sequences of a random variable. Convergence

More information

Cointegration Tests Using Instrumental Variables Estimation and the Demand for Money in England

Cointegration Tests Using Instrumental Variables Estimation and the Demand for Money in England Cointegration Tests Using Instrumental Variables Estimation and the Demand for Money in England Kyung So Im Junsoo Lee Walter Enders June 12, 2005 Abstract In this paper, we propose new cointegration tests

More information

Long-Run Covariability

Long-Run Covariability Long-Run Covariability Ulrich K. Müller and Mark W. Watson Princeton University October 2016 Motivation Study the long-run covariability/relationship between economic variables great ratios, long-run Phillips

More information