A formal statistical test for the number of factors in. the approximate factor models

Size: px
Start display at page:

Download "A formal statistical test for the number of factors in. the approximate factor models"

Transcription

1 A formal statistical test for the number of factors in the approximate factor models Alexei Onatski Economics Department, Columbia University September 26, 2006 Abstract In this paper we study i.i.d. sequences of n 1 vectors X t ; t = 1; :::; T such that, for each t; X t has an approximate factor structure with correlated Gaussian idiosyncratic terms. We develop a test of the null of r factors vs. the alternative that the number of factors is larger than k but smaller than k max +1; where k max is an a priori maximum number of factors. Our test statistic is equal to the ratio of k+1 kmax+1 to kmax+1 kmax+2; where i is the i-th largest eigenvalue of the sample covariance matrix 1 P T T t=1 X txt: 0 We describe the asymptotic distribution of the test statistic as n and T go to in nity proportionally as a function of the Tracy-Widom distribution. We tabulate the critical values of the test corresponding to k and k max relevant for nancial and macroeconomic applications. As an application, we test di erent hypotheses about the number of factors in arbitrage pricing theory. We reject the nulls of no or a single factor against alternatives of more than one factors by the test of 5%-level. We cannot reject the null of 5 factors by the 5%-level test. 1

2 1 Introduction Approximate factor analysis of high-dimensional data has attracted a lot of recent attention from researchers in macroeconomics and nance. Factors non-trivially in uencing thousands of stock returns observed over hundreds of time periods have been used to study the pricing of nancial assets, to evaluate the performance of nancial portfolios, and to test arbitrage pricing theory. In macroeconomics, pervasive factors extracted from half a century of data on hundreds of macroeconomic indicators have been used to monitor business cycles, to forecast individual macroeconomic time series, and to augment vector autoregressions used for monetary policy analysis. An important question to be addressed by any study which uses factor analysis is how many factors there are. Although there have been many recent studies which develop consistent estimators of the number of factors in the empirically relevant context of large and comparable time and cross-sectional dimensions of the data (see, for example, Forni et al (2000), Bai and Ng (2002, 2005), Stock and Watson (2005), Hallin and Liska (2005), Onatski (2005), and Watson and Amengual (2006)), the corresponding estimates of the number of factors driving stock returns and macroeconomic time series often considerably disagree. Such a disagreement indicates that there exists a large amount of statistical uncertainty about the point estimates of the number of factors. Unfortunately, none of the above studies proposes a formal statistical test of di erent hypotheses about the number of factors that would quantify the amount of the uncertainty. The purpose of this paper is to develop such a test. In this paper, we study i.i.d. observations of ndimensional vectors X t ; t = 1; :::; T; of data which admit the approximate factor structure (see Chamberlain and Rothschild, 1983) with correlated complex Gaussian idiosyncratic terms. If, as in the majority of the empirical applications, the original data is real, we construct X by adding the rst half of the original sample and the product of the imaginary unit and the second half of the original sample. We develop a test of the null of k factors vs. the alternative that the number of factors is larger 2

3 than k but smaller than k max +1; where k max is an a priori maximum number of factors. Our test statistics is equal to the ratio of k+1 kmax+1 to kmax+1 kmax+2; where i is the i-th P largest eigenvalue of the sample covariance matrix 1 T T t=1 X txt. 0 We describe the asymptotic distribution of the test statistics as n and T go to in nity proportionally as a function of the Tracy-Widom distribution (see Tracy and Widom, 1994), and tabulate the critical values of the test corresponding to k and k max relevant for nancial and macroeconomic applications. The logic of our test is based on the standard identi cation assumption (see Chamberlain and Rothschild, 1983) that the portion of the data s variation per cross-sectional unit explained by the factors remains non-trivial as n tends to in nity, whereas the portion of the variation per cross-sectional unit due to any idiosyncratic in uence tends to zero. This assumption is equivalent to the requirement that the rst r eigenvalues of the data s covariance matrix (where r is the true number of factors) rise proportionally to n, whereas the rest of the eigenvalues stay bounded. If k < r; we would, therefore, expect that k+1 kmax+1 is much larger than kmax+1 kmax+2: In contrast, when k = r; the eigenvalues k+1 ; kmax+1; and kmax+2 remain bounded as n tends to in nity, and we do not have a reason to expect the ratio of k+1 kmax+1 to kmax+1 kmax+2 to be large. In this paper, we establish the fact that, as both n and T tend to in nity, the joint distribution of the centered and scaled eigenvalues 1 r+1 ; :::; 1 kmax+1 ; where the centering and scaling constants and depend on n; T; and the covariance structure of the idiosyncratic terms, converges to the same limit as the joint distribution of the similarly centered and scaled k max r + 1 largest eigenvalues of the sample covariance matrix of the idiosyncratic terms. Further, we extend recent results of El Karoui (2006) to show that the latter joint distribution converges to the distribution described by Tracy and Widom (1994), which does not depend on the parameters of the data generating process. We conclude that, under the null hypothesis that k = r; the asymptotic distribution of the ratio of 1 k+1 1 kmax+1 to 1 kmax+1 1 kmax+2 is a function of the Tracy-Widom distribution. Finally, we note that the latter ratio is equal to the ratio of 3

4 k+1 kmax+1 to kmax+1 kmax+2: Therefore, its computation does not require the knowledge of the centering and the scaling constants, and hence, the knowledge of the covariance structure of the idiosyncratic terms. We take the ratio of k+1 kmax+1 to kmax+1 kmax+2 as our test statistics and tabulate the corresponding critical values. We study the nite sample performance of our test by running several Monte Carlo experiments. We nd that the test has correct size and good power for our Monte Carlo design and samples as small as n = 50 and T = 50: Moreover, the test seems to be reasonably robust with respect to di erent distributional assumptions about the idiosyncratic terms. Using Monte Carlo experiments we also study the performance of an analog of our test which does not require transforming the original real data into the complex form. Although we are unable to formally establish the asymptotic distribution of the corresponding test statistics, we conjecture the form of the asymptotic distribution and nd by simulation that the test works as well as our test designed for the complex data. As an application, we test di erent hypotheses about the number of factors in arbitrage pricing theory. Our test rejects the nulls of zero or only one factors against alternatives of more factors at 5% levels. We also can reject the nulls of 2, 3 and 4 factors at 5% level at least against some alternatives. We cannot reject the null of 5 factors against alternatives of more factors at 5% level. Although this paper is the rst to develop a formal statistical test of hypotheses about the number of factors in a situation when n and T tend to in nity simultaneously, Connor and Korajczyk (1993) were the rst to develop another test of similar hypotheses under a sequential asymptotics when rst n goes to in nity and then T goes to in nity. The Connor-Korajczyk test is not directly based on the eigenvalue-separation idea which forms the basis for our test, although their test s logic can be traced to this idea. We compare the performance of the Connor-Korajczyk test and our test using simulated data. We nd that in samples of various sizes our test has a much less distorted size than the Connor-Korajczyk test. At the same time, the power of our test is similar to that of the Connor-Korajczyk 4

5 test. The logic of our test (and that of Connor and Korajzyk, 1993) di ers from the logic of the standard likelihood ratio test (see Anderson (1984), chapter 14), which is based on the fact that the idiosyncratic covariance matrix is diagonal under the classical k-factor structure if the true number of factors is less than or equal to k: The approximate factor structure that we are concerned with in this paper does not require that all the cross-sectional correlation be due to the factors as is the case for the classical factor structure. Since the identi cation of factors is fundamentally di erent in the classical and the approximate factor model cases, it is not natural, in general, to compare our test and the classical test. The rest of the paper is organized as follows. In the next section we state our assumptions and develop the test. Section 3 contains Monte Carlo experiments. Section 4 tests di erent hypotheses about the number of factors in arbitrage pricing theory. Section 5 concludes. Technical proofs are contained in the Appendix. 2 The number of factors test We consider a sequence of approximate factor models indexed by n : X (n) = L (n) F (n)0 + e (n) (1) where X (n) is an n T (n) matrix of data; F (n) is a T (n) r matrix of T (n) observations of r factors; where r does not depend on n; L (n) is an n r deterministic matrix of factor loadings; and e (n) is an nt (n) noise matrix with i.i.d. N C (0; (n) ) columns independent from the elements of F (n). Here N C (0; (n) ) denotes a complex normal distribution, which is the distribution of a complex random variable whose real and imaginary parts are independent and identically distributed normals N(0; (n) =2): We assume that (1) satis es Assumptions 1 and 2 formulated below. In what follows we will omit superscript (n) over the variables that change with the dimensionality of data to simplify notations. 5

6 Assumption 1. There exist a positive de nite matrix B and a positive number b such that (L 0 L=n) (F 0 F=T ) p! B as n and T tend to in nity so that n=t! b: Assumption 1 can be thought of as a part of the identi cation restriction which allows us to identify the systematic component LF 0 of the data. The rest of the identi cation restriction is given by the rst inequality of Assumption 2 formulated below. If we further normalize factors so that F 0 F=T = I r ; then the assumption can be used to separately identify factors and factor loadings and may be interpreted as a requirement that the e ects of the factors per cross-sectional unit measured by L 0 L=n remain non-trivial as n tends to in nity. In this paper we do not impose any separate restrictions on the convergence or divergence of L 0 L=n and F 0 F=T because we are not going to separately identify factors and factor loadings. One consequence of putting no separate restrictions on factors and factor loadings is that much room is left for modeling factors. For example, they are allowed to be deterministic, or random and stationary, or random and non-stationary. The proportional n and T asymptotics di ers from the asymptotic assumptions made in the previous literature. Connor and Korajczyk (1993) develop their test using sequential limit asymptotics when, rst, n tends to in nity and, then, T tends to in nity. Bai and Ng (2002) allow n and T to go to in nity without any restrictions on the relative growth rates. The proportional asymptotics allows us to use the machinery of random matrix theory to establish the asymptotic distribution of our test statistics. Note that the limit b may be any positive number so that the asymptotics is consistent with a variety of empirically relevant nite sample situations. Our second assumption restricts the asymptotic behavior of the matrix of the crosssectional covariance of the idiosyncratic terms : Let l 1 ::: l n be the eigenvalues of : Denote by H the spectral distribution of ; that is H() = 1 1 n # fi n : l i > g ; where # fg denotes the number of elements in the indicated set. Further, let c be the unique root in 0; l 1 1 of the equation R (c= (1 c)) 2 dh() = T=n: Assumption 2. As n and T tend to in nity so that n=t! b, lim sup l 1 < 1; 6

7 lim inf l n > 0; and lim sup l 1 c < 1: Note that l 1 is the variance of the most variable or, loosly speaking, the most in uential weighted average of the idiosyncratic terms. Hence, the rst of the three inequalities of Assumption 2 requires that the strongest cumulative idiosyncratic in uence on the crosssectional units remains bounded as the number of the units tends to in nity. This requirement taken together with Assumption 1 constitute a slight modi cation of the standard (see Chamberlain and Rothschild, 1983) identi cation restriction imposed on model (1). The second inequality of Assumption 2 requires that, whatever the dimensionality of the data is, there is no multicollinearity among the idiosyncratic terms. The third inequality is crucial for the analytic apparatus developed in El Karoui (2006) for the analysis (which we rely on in this paper) of the spectral distribution of large complex Wishart matrices. In a nutshell, the inequality requires that the right tail of the spectral distribution of is not too thin. More precisely, it is not di cult to see that for the inequality lim sup l 1 c < 1 to hold, it is su cient that H weakly converges to a distribution H 1 with the upper boundary of support equal to lim sup l 1 and the density bounded away from zero in the vicinity of lim sup l 1 : As is pointed out by El Karoui (2006), such a su cient condition would hold, for example, for that have a symmetric Toeplitz structure with parameters a 0 ; a 1 ; ::: such that P k ja k j < 1 and for that have uniformly spaced eigenvalues on a given bounded segment. In this version of the paper we assume, in addition to Assumptions 1 and 2, that b < 1: We will get rid of this additional assumption in the later versions of the paper. Let 1 ::: n be the eigenvalues of ee 0 =T: De ne a centering constant and a scaling constant so that c = 1 + (n=t ) R (c) = (1 c) dh() and (c) 3 = 1 + (n=t ) R (c) 3 = (1 c) 3 dh () ; and let ~ i = T 2=3 1 ( i ) : Further, let W be a T T Hermitian matrix with i.i.d. N C (0; 1=T ) lower triangular entries and (independent from them) i.i.d N (0; 1=T ) diagonal entries. The collection of such matrices is called Gaussian Unitary Ensemble. It plays an important role in random matrix theory (see Mehta, 2004). Let d 1 ::: d T be the eigenvalues of W; and de ne ~ d i = T 2=3 (d i 2). We, rst, establish 7

8 the following Theorem 1. Let Assumptions 1 and 2 hold and let b < 1: Then, as n and T go to in nity so that n=t! b; for any nite positive integer j; the limiting joint distribution of ~ 1 ; :::; ~ j is equal to the limiting joint distribution of d1 ~ ; :::; d ~ j described by Tracy and Widom (1994). Theorem 1 is, essentially, due to El Karoui (2006). Although he proves the theorem only for the case of j = 1; his derivations imply that it holds for any xed positive integer j: An argument which establishes this fact is the same as that in Soshnikov (2002) which shows that Johnstone s (2001) result stated for the largest eigenvalue of the sample covariance matrix of uncorrelated data easily generalizes to the several largest eigenvalues. We will sketch the argument in the future version of the paper which will deal with the case b 1: Tracy and Widom (1994) studied the asymptotic distribution of a few largest eigenvalues of matrices from the Gaussian Unitary Ensemble when the dimensionality of the matrices tend to in nity. They showed that under appropriate centering and scaling the asymptotic distribution exists and can be expressed in terms of a solution of a system of partial di erential equations. The system simpli es to a single ordinary di erential equation q 00 (s) = sq(s) + 2q 3 (s) when we are interested in the asymptotic distribution of the largest eigenvalue only. In such a case, the asymptotic distribution is equal to F (x) exp R 1 x (x s)q2 (s)ds ; where q(s) is the solution of the above ordinary differential equation which is asymptotically equivalent to the Airy function Ai(s) as s! 1: 1 We describe the Tracy-Widom distribution in more detail in the Monte Carlo section of our paper. Our next theorem shows the equivalence of the joint asymptotic distribution of unobservable eigenvalues of ee 0 =T and that of observable eigenvalues of XX 0 =T: Let 1 ::: n be the eigenvalues of the sample covariance matrix XX 0 =T of the data, and let ~ i = T 2=3 1 ( i ) ; where and are as de ned above. Then, we have: 1 For the de nition and properties of the Airy function see Olver (1974). 8

9 Theorem 2. Let Assumptions 1 and 2 hold. Then, as n and T go to in nity so that n=t! b; for any nite positive integer j; the limiting joint distribution of ~ r+1 ; :::; ~ r+j is equal to the limiting joint distribution of ~ 1 ; :::; ~ j : A proof of the theorem is given in the Appendix. Theorem 2 is the basis of our test of the number of factors. To test a hypothesis that the number of factors is k vs. an alternative that the number of factors is larger than k but smaller than k max + 1; we form a test statistics k+1 kmax+1 = kmax+1 kmax+2 : Note that the test statistics is equal to the ratio of the normalized and centered eigenvalues ~ k+1 ~ kmax+1 = ~kmax+1 ~ kmax+2 : Therefore, by Theorem 2, the asymptotic distribution of our test statistics under the null of k factors is the same as the asymptotic distribution of the ratio ~1 kmax ~ k+1 = ~kmax k+1 ~ kmax k+2 : By Theorem 1 and the continuous mapping theorem, the asymptotic distribution of the latter ratio is equal to the distribution of (x 1 x kmax k+1) = (x kmax k+1 x kmax k+2), where x 1 ; :::; x j are random variables with the Tracy-Widom joint distribution. The critical values of the test statistics can therefore be tabulated based on the knowledge of the Tracy-Widom law. We do such a tabulation in the Monte Carlo section of the paper. In contrast, when the alternative hypothesis is true, k+1 rises proportionally to n while kmax+1 and kmax+2 remain bounded. Therefore, under the alternative hypothesis our test statistics explodes. Theorem 3 summarizes the properties of our test. Theorem 3. Let Assumptions 1 and 2 hold and let b < 1: Then, if the true number of factors is equal to k; the distribution of the ratio k+1 kmax+1 = kmax+1 kmax+2 converges to the distribution of (x 1 x kmax k+1) = (x kmax k+1 x kmax k+2), where x 1 ; :::; x j are random variables with the Tracy-Widom joint distribution. The convergence takes place as n and T go to in nity so that n=t! b: In contrast, when the true number of factors is larger than k but smaller than k max + 1; the ratio k+1 kmax+1 = kmax+1 kmax+2 diverges in probability to in nity. Theorem 3 is a simple consequence of Theorem 2. 9

10 3 Monte Carlo Study In this section we use Monte Carlo simulations to describe in some detail the Tracy-Widom distribution, to tabulate the critical values of our test, and to study its nite sample properties. We approximate the Tracy-Widom distribution by the distribution of a few largest eigenvalues of a matrix from the Gaussian Unitary Ensemble. We obtain an approximation for the latter distribution by simulating 30,000 independent matrices from the ensemble and numerically computing their 10 rst eigenvalues. The left panel of Figure 1 shows the empirical distribution function of the largest eigenvalue centered by 2 and scaled by T 2=3 = =3. The right panel of Figure 1 shows a scatterplot of 30,000 observations of the di erence between the rst and the second eigenvalues vs. the di erence between the second and the third eigenvalues (both di erences scaled by T 2=3 ): Tracy and Widom (2002) report that the mean of their univariate distribution 2 is about -1.77, the standard deviation is close to 0.90, the skewness is slightly larger than 0.22, and the kurtosis is around These characteristics are consistent with the left panel of Figure 1. The right panel of Figure 1 shows that it is not unlikely that x 1 x 2 is substantially larger than x 2 x 3 ; where x 1 ; x 2 ; and x 3 have joint Tracy-Widom distribution. This observation suggests that ad hoc methods of the determination of the number of factors based on visual inspection of the eigenvalues of the sample covariance matrix, and their separation into a group of large and a group of small eigenvalues may be misleading. It may happen, for example, that even though data have no factors, the rst eigenvalue of the sample covariance matrix is substantially larger than the second one, while the second eigenvalue is not much di erent from the third one. Table 1 contains approximate percentiles of the distributions of random variables (x 1 x j 1 ) = (x j 1 x j ) for j = 3; 4; :::; 10; where x 1 ; :::; x j have the joint Tracy-Widom distribution. The approximate percentiles were obtained as the Jackknifed sectioning es- 2 They denote the distribution as F 2 to distinguish it from the distributions F 1 and F 4 that correspond to the limiting distributions of the largest eigenvalues of matrices from the so called Gaussian Orthogonal and Simplectic Ensenbles, respectively. 10

11 Figure 1: Left panel: a univariate Tracy-Widom distribution. Right panel: 30,000 draws from the joint distribution of x 1 x 2 and x 2 x 3 ; where x 1 ; x 2 ; x 3 have the joint Tracy-Widom distribution 11

12 99% 8:81 (7:77;9:84) 95% 4:52 (4:43;4:61) 90% 3:33 (3:28;3:38) j = k max k :10 (13:93;16:28) 8:08 (7:89;8:28) 6:18 (6:09;6:27) 21:30 (19:38;23:22) 11:90 (11:67;12:14) 9:02 (8:91;9:12) 28:10 (23:73;32:47) 15:66 (15:42;15:90) 12:04 (11:81;12:27) 34:61 (30:26;38:96) 19:09 (18:69;19:49) 14:72 (14:62;14:83) 42:57 (37:87;47:28) 22:90 (22:18;23:61) 17:81 (17:45;18:18) 45:84 (43:54;48:13) 26:72 (26:22;27:21) 20:75 (20:48;21:02) 54:58 (51:97;57:17) 30:54 (30:03;31:04) 23:68 (23:41;23:96) Table 1: Approximate percentiles of the test statistics for the tests of k factors vs. alternative of more than k but less than kmax factors an timator (reference here) of the corresponding percentiles of the distribution of the ratio (y 1 y j 1 ) = (y j 1 y j ) ; where y 1 ; :::; y j are the rst j eigenvalues of matrices from the Gaussian Unitary Ensemble. The sectioning estimator uses 5 equal-length sections of 10,000 i.i.d. draws of (y 1 y j 1 ) = (y j 1 y j ) : The 90% con dence intervals for the true percentiles of the distribution of (y 1 y j 1 ) = (y j 1 y j ) are given in the parenthesis. The con dence intervals reported in Table 1 underestimate the uncertainty about the true percentiles of the asymptotic distribution of our test statistics because y 1 ; :::; y j are distributed according the joint Tracy-Widom law only asymptotically, as the dimensionality of the matrix of which y i s are eigenvalues increases to in nity. We hope that the amount of the additional uncertainty due to the fact that we used matrices to simulate y i s is small and do not try to improve the uncertainty estimates. Our hope is supported by El Karoui s (2006a) nding that the cumulative distribution function of y 1 converges to that of the univariate Tracy-Widom law at the very fast rate of n 2=3 ; where n is the dimensionality of the corresponding matrix: Table 1 can be used to nd critical values of our test that the number of factors is k vs. an alternative that it is more than k but less than k max + 1: According to Theorem 3, to determine, say, 5% critical value of the test, we need to consult Table 1 and nd the 95-th percentile of the distribution of (x 1 x j 1 ) = (x j 1 x j ) ; where j = k max k + 2: For example, the 5% critical value of our test of the hypothesis that there are 3 factors vs. an alternative that there are more than 3 factors but less than 7 factors is equal to To study the nite sample properties of our test, we simulated 1000 data sets having 12

13 k max % 0:008 0:010 0:011 0:009 0:012 0:011 0:008 0:013 95% 0:044 0:058 0:050 0:056 0:048 0:058 0:053 0:070 90% 0:090 0:112 0:111 0:102 0:097 0:103 0:125 0:126 Table 2: Empirical size of the tests of 3 factors vs. di erent alternatives, n=t=100 factor structure X = F + e; where is an n r matrix with i.i.d. standard normal entries, F is a r ~ T matrix with standard normal entries (hence, the true number of factors is r), and e is an n T ~ matrix with i.i.d. N (0; ) columns, where ij = ji jj : We considered three di erent choices of n and T ~ : n; T ~ = (100; 200) ; (50; 100) ; and (25; 50) ; ten di erent choices of : = 0; 0:1; 0:2; :::; 0:9; and nine di erent choices of r : r = 3; 4; :::; 11:.To create complex-valued data sets we added the rst ~ T =2 columns of X and p 1 times the last ~ T =2 columns of X: Hence, the dimensionality of our complex data sets were nt; where T = ~ T =2: Using the simulated data we tested a hypothesis that there are 3 factors vs. alternatives that there are more than 3, but less than 5, 6,..., 12 factors. Table 2 reports the empirical size of the performed tests for n = 100; ~ T = 200; and = 0. The last 3 rows of the table correspond to the tests with theoretical size equal to 1%, 5%, and 10%. Di erent columns of the table correspond to the tests that share the null of 3 factors but have di erent alternatives indexed by k max : We see that the size distortions due to the small sample size are small. Figure 2 shows contour plots of the empirical size and power of our tests (with theoretical size 5%) for di erent choices of and n; T ~ : The contour plots are drawn in the space of (horizontal axis) and k max 1 (vertical axis). The left panel of Figure 2 shows the tests size, and the right panel shows the tests power when the true number of factors is equal to k max : The rst row of the gure corresponds to the sample size n = 100; T ~ = 200; the second row corresponds to n = 50; T ~ = 100; and the third row corresponds to n = 25; T ~ = 50: We see that the size of our tests deteriorates when the amount of dependence in the idiosyncratic terms rises. This is especially noticeable for tests corresponding to large k max and for n = 25: 13

14 Figure 2: Contour plots of the empirical size and power of the tests of 3 factors vs. alternatives of less than k max + 1 factors. Horizontal axis: : Vertical axis: k max 1: Left panel: size. Right panel: power when the true number of fators is k max : First row: n = T = 100; second row: n = T = 50; third row: n = T = 25: For n 50 and k max = 4; the size distortions are small for all : For n 50 and relatively large k max ; the size distortions are larger. The power of the tests is very good for n 50 and virtually all and k max ; but becomes small for n = 25; large ; and small k max. The fact that the power of our test deteriorates as the amount of the idiosyncratic dependence rises and the sample size falls accords with the following intuition. When the true number of factors is k max ; the eigenvalue kmax+1 measures the largest contribution of the idiosyncratic terms into the data s variance. As the amount of the dependence in the idiosyncratic terms rises, this contribution increases so that it may become comparable to the factors shares of the data s variance measured by 1 ; :::; kmax : Furthermore, since factors 14

15 are common to all idiosyncratic units, their share in the data s variance rises proportionally to the number of the cross-sectional units. Therefore, eigenvalues 1 ; :::; kmax would be relatively small when the sample size is small and the gap between k and kmax+1 can become small even for relatively small amount of the idiosyncratic dependence. Hence, for a large amount of the idiosyncratic dependence or for small sample sizes, the numerator of our test statistics k+1 kmax+1 = kmax+1 kmax+2 may become relatively small even when the true number of factors is k max. This explains deterioration of the power of our test when the amount of the idiosyncratic dependence rises or the sample size falls. To check the robustness of our test to violations of Gaussianity and time independence of the idiosyncratic terms, we perform the following three experiments. First, we use an AR(1) process e i;t = e i 1;t +" it ; where " it are i.i.d. random variables having Student s t distribution with 5 degrees of freedom, 3 to generate relatively heavy-tailed idiosyncratic terms. Second, we use the same AR(1) process but assume that " it are i.i.d. random variables having 2 (1) distribution centered so that its mean is equal to zero. Hence, in the second experiment the idiosyncratic terms are skewed to the right. Finally, we simulate idiosyncratic terms that satisfy a version of the constant conditional correlation multivariate GARCH model of Bollerslev (1990): e i;t = p h i;t u i;t ; u i;t = u i 1;t +" it ; " i;t are i.i.d. N (0; 1 2 ) ; and h it = 0:1 + 0:8h i;t 1 + 0:1u 2 i;t 1: In the rst two experiments, we simulate data using the idiosyncratic terms after normalizing them so that they have unit sample variance. In the last experiment, the theoretical unconditional variance of e it is equal to 1, so we do not normalize the idiosyncratic terms. An equivalent of Figure 2 for the simulated data looks very similar to Figure 2 for all three experiments and it is not reported here. Instead, Figure 3 reports the size and the power of the test of the null of 3 factors vs. the alternative that the number of factors is 4 for the case N = 50; ~ T = 100. The left panel of Figure 3 shows the size of the test and the right panel shows the 3 We chose to consider Student s t distribution with 5 degrees of freedom because we wanted the distribution to have 4 moments, which is a necessary condition for many Large Random Matrix theory results to hold (see Bai, 1999). 15

16 Figure 3: Size and power of our test of 3 factors vs. 4 factors. Left panel: size. Right panel: power. Horizontal axis: : N = 50; ~ T = 100: Solid line: no violations of the assumptions. Dashed line: multivarite GARCH model for the idiosyncratic terms. Dash-dot line: skewed idiosyncratic terms. Dotted line: heavy tails. 16

17 power. The solid line corresponds to the benchmark simulations when no assumptions of the paper were violated. The dashed line corresponds to the idiosyncratic terms satisfying the multivariate GARCH model. The dash-dot line corresponds to the idiosyncratic terms heavily skewed to the right. The dotted line corresponds to the idiosyncratic terms which have heavy tails. We see that the small sample properties of our test are reasonably robust to di erent violations of our assumptions. The worst distortion of the size and the power happens when heavy tails are assumed for the idiosyncratic process. However, the distortions do not become prohibitively large until the amount of the cross-sectional correlation in the idiosyncratic terms measured by becomes as large as 0.7 even in the heavy-tail case. Our next task is to compare the nite sample properties of our test and that proposed by Connor and Korajczyk (1993). The procedure of the Connor-Korajczyk test of a null of k factors vs. an alternative of k + 1 factors is as follows. First, a k-factor model and a k + 1- factor model are estimated from the data, say, by principle components estimator. Then, the average over the cross-sectional units or the squared residuals for each of the models and each time period are obtained. Let us denote these averages as 1;t and 2;t ; where the rst subscript is equal to 1 if the average corresponds to the k-factor model and it is equal to 2 if the average corresponds to the k + 1-factor model. After that, a vector with components 1;2 1 2;2 ; = 1; :::; T=2 is formed. Connor and Korajzcyk (1993) show that, under the null, when n tends to in nity and T remains xed, the asymptotic joint distribution of the components of is zero mean Gaussian with some unspeci ed covariance matrix. In contrast, under the alternative, we would expect the residuals from the k + 1-factor model to be much smaller than those from the k-factor model. Therefore, the components of should be large. Hence, Connor and Korajzcyk propose as their test statistics a ratio of the mean of the components of and the heteroskedasticity and autocorrelation robust estimate of its standard error. Under the null, such a ratio should be asymptotically distributed as a standard normal random variable as rst n goes to in nity and then T goes to in nity. We implemented the Connor-Korajzcyk test of 3 factors relative to the alternative of 4 17

18 factors using the same 1000 simulations of the real data that we used to study the nite sample properties of our test. We focused on those simulations for which the true number of factors equals 3 or 4. Figure 4 shows the size and the power of our test (thick lines) and that of the Connor-Korajzcyk test (thin lines) for di erent amounts of the idiosyncratic dependence. The left panel of the gure shows the size and the right panel of the gure shows the power of the tests. The solid lines correspond to the sample size n = 100; T ~ = 200; the solid lines with dot markers on them correspond to the sample size n = 50; T ~ = 100; and the dashed lines correspond to the smallest sample size n = 25; T ~ = 50: We see that the Connor-Korajzcyk test has much too large a size (the theoretical size is equal to 5%) for all (horizontal axis). The size of the test becomes extremely distorted even for moderate amounts of the idiosyncratic dependence. The empirical size of our test behaves strikingly better than that of the Connor-Korajzcyk test. The power of both tests is very good for small amounts of the idiosyncratic dependence. For relatively large amounts of the idiosyncratic dependence and for relatively small sample sizes, the power of our test becomes much worse than that of the Connor-Korajzcyk test. However, given the huge size distortions of the latter, such a power advantage cannot exploited. Overall, the properties of our test are much better than those of the Connor-Korajzcyk at least in our Monte Carlo experiments. Before we turn to an application of our test, we would like formulate a conjecture which is an analog Theorem 1 for the case of real data. Conjecture 1: Assume that the columns of the matrix of idiosyncratic terms e (n) are multivariate real random variables with covariance matrix :Let Assumptions 1 and 2 hold. Then, as n and T go to in nity so that n=t! b; for any nite positive integer j; the limiting joint distribution of ~ 1 ; :::; ~ j is equal to the limiting joint distribution of T 2=3 (d i 2) ; i = 1; :::; j; where d i is the i-th largest eigenvalue of a matrix from the Gaussian Orthogonal Ensemble described by Tracy and Widom (1996) (denoted as TW1).. Conjecture 1 has been supported by numerical simulations in El Karoui (2006). If the conjecture is correct, then we have: 18

19 Figure 4: Empirical size and power of our test of 3 vs. 4 factors (thick lines) and the Connor- Korajzcyk test of 3 vs. 4 factors (thin lines). Left panel: size. Right panel: power. Solid lines: n = 100; ~ T = 200: Solid lines with dot markers: n = 50; ~ T = 100: Dashed lines: n = 25; ~ T = 50: Horizontla axis: : 19

20 99% 16:56 (14:79;18:33) 95% 6:89 (6:64;71:5) 90% 4:54 (4:45;4:63) j = k max k :75 (25:97;31:53) 12:41 (12:07;12:75) 8:53 (8:44;8:62) 41:57 (36:02;47:12) 18:16 (17:49;18:83) 12:46 (12:13;12:78) 55:07 (48:49;61:65) 23:99 (23:26;24:73) 16:40 (16:18;16:62) 67:53 (61:28;73:78) 29:41 (28:33;30:50) 20:29 (20:03;20:55) 79:13 (72:87;85:40) 35:05 (34:30;35:79) 24:34 (23:98;24:71) 91:90 (74:13;109:67) 39:89 (38:27;41:52) 27:42 (26:37;28:46) 106:01 (89:03;122:99) 47:35 (45:49;49:21) 32:62 (31:87;33:37) Table 3: Approximate percentiles of the test statistics for the tests of k factors vs. alternative of more than k but less than kmax factors. Real data. an Conjecture 2. Assume that the columns of the matrix of idiosyncratic terms e (n) are multivariate real random variables with covariance matrix :Let Assumptions 1 and 2 hold. Then, if the true number of factors is equal to k; the distribution of the ratio k+1 kmax+1 = kmax+1 kmax+2 converges to the distribution of (x 1 x kmax k+1) = (x kmax k+1 x kmax k+2), where x 1 ; :::; x j are random variables with the Tracy- Widom (TW1) joint distribution. The convergence takes place as n and T go to in nity so that n=t! b: In contrast, when the true number of factors is larger than k but smaller than k max + 1; the ratio k+1 kmax+1 = kmax+1 kmax+2 diverges in probability to in nity. Table 3 below is the analog of Table 1 for the test based on Conjecture 2. 4 The number of factors in asset returns As an application of our test, we would like to test di erent hypotheses about the number of pervasive factors driving stock returns. There is a large amount of controversy about this number in the literature. For example, Connor and Korajcyk (1993) nd evidence for between one and six pervasive factors in the stock returns. Trzcinka (1986) nds some support to the existence of ve pervasive factors. Five seems also to be a preferred number for Roll and Ross (1980) and Reinganum (1981). A study by Brown and Weinstein (1983) also suggested that the number of factors is unlikely to be greater than ve: 4 Huang and Jo (1995), Bai and Ng (2002), and Onatski (2005) identify only two common factors 4 Dhrymes, Friend, and Gultekin (1984) nd that the estimated number of factors grows with the sample size. However, their setting was the classical factor model as opposed to the approximate factor model. 20

21 To test hypotheses about the number of factors in stock returns we use CRSP data on monthly returns of 171 company for a period from January 1960 to December Our data set includes those and only those companies for which CRSP provides monthly holding period return data for all months in the studied time interval. To compute the excess returns we used monthly returns on 6-month Treasury bills provided by the Board of Governors of the Federal Reserve System. The cross-sectional dimensionality of our data is n = 171; and the time series dimensionality is T ~ = 552: To get a complex valued data set, we divide the real-valued data set into two periods: the rst containing all observations from January 1960 to December 1982, and the second containing all observations from January 1983 to December Then we add the data from the rst time period and square root from -1 times the data from the second period. The dimensionality of the obtained complex-valued data set is n = 171 and T = 276: Using the obtained data, we test hypotheses that the number of factors is 0,1,...,7 vs. alternatives that the number of factors is more than the null-hypothesis number but no more than up to 8 factors. the resulting test statistics are reported in Table 4. The numbers in parentheses given below the values of the test statistics are approximate p-values of the test. The approximate p-values were computed as the proportion of 30,000 simulated draws from the null distribution that fall above the corresponding test statistics. We see that the null hypothesis of no factors vs. any of the alternatives of the form: positive number of factors no larger than k max is overwhelmingly rejected by the data. The null of one factor vs. more factors is rejected at 5% level for all corresponding alternatives except the alternative that the number of factors is larger than 1 but no larger than 4. The null of 2 factors is not rejected at 5% level for all corresponding alternatives except the alternative that the number of factors is more than 2 but no larger than 5. The situation is the same for the null of 3 and 4 factors. The nulls of 5, 6, and 7 factors are not rejected by the data even at at 29% level. To interpret these results it is useful to analyze the largest eigenvalues of the sample 21

22 k max H 0 number of factors :72 (0:008) 2 75:19 (0:000) 3 126:71 (0:000) 4 46:88 (0:002) 5 592:27 (0:000) 6 204:54 (0:000) 7 257:62 (0:000) 8 349:00 (0:000) 7:01 (0:012) 13:33 (0:010) 5:26 (0:279) 77:44 (0:000) 27:04 (0:016) 35:15 (0:011) 48:78 (0:007) 1:66 (0:286) 0:98 (0:944) 24:46 (0:005) 8:78 (0:155) 12:57 (0:121) 17:89 (0:072) 0:37 (0:923) 16:91 (0:004) 6:18 (0:171) 8:99 (0:138) 13:49 (0:087) 12:37 (0:002) 4:61 (0:132) 7:03 (0:120) 10:84 (0:080) 0:34 (0:928) 1:69 (0:682) 3:62 (0:449) 1:25 (0:401) 3:04 (0:295) 1:35 (0:360) Table 4: Value of test statistics and approximate p-values for the tests of the number of factors in the second row vs. alternative of more factors but less than or equal to the number in the rst column covariance matrix of the data. Note that the eigenvalues can be interpreted as reductions in the mean square error which result from using an extra principal component to model systematic part of the data. The eigenvalues turned out to be (in percentage units relative to the largest eigenvalue): 100:00; 18:40; 10:00; 8:80; 8:09; 6:12; 5:96; 5:51; 5:14; 4:87; etc. The rejection of the null of no factors means that the gap between the explanatory power of the rst principal component and any consequent principal components is too large relative to the gaps between the explanatory power of more distant components. It is too large in the sense that such a gap is not consistent with the null that all principal components, including the rst one, represent just idiosyncratic in uences. The rst principal component must, therefore, contain a systematic or pervasive in uence on the cross-sectional units. Similarly, the gap between the explanatory power of the second principal component and virtually all consequent principal components is too large (in relative terms) to be consistent with the idiosyncratic nature of the second component. 22

23 As to the nulls of 2, 3 and 4 factors, they are rejected at 5% level only vs. an alternative of no more than 5 factors. Technically, this happens because the gaps between the explanatory power of, say, the second principal component and that of the 6th, 7th, 8th, and 9th principal components do not seem large (at 5% level) relative to all gaps but the gap between the explanatory power of the 6th and the 7th components. An important question arises whether we should consider the closeness of the 6th and 7th eigenvalues as a uke in the data, and, therefore, accept the null of only two factors, or we should interpret our results as suggesting that there are likely to be ve factors. We favor the latter interpretation. It is because, although we cannot reject the nulls of 2, 3 and 4 factors vs. alternatives of more factors but less than 7, 8, and 9 factors at 5% level, we reject these nulls vs. the alternatives at about 17% level, which is not a high level. Had the relative closeness of the 6th and 7th eigenvalues been just a uke, we would not expect the p-values of the tests against the alternatives of less than 7, 8, and 9 factors to be consistently at the low end. Note in this context, that the nulls of 5, 6 and 7 factors cannot be rejected with the corresponding p-values being everywhere in the range from to To check our nding that there likely be 5 factors driving the stock returns, we run the real data tests (which are conjectured but not proven to be correct procedures) on the untransformed real data. We get very similar results to those reported in Table 4. To check robustness of the nding to the choice of the data, we run our complex-data test on the CRSP data for 1148 stocks traded on the NYSE, AMEX, and NASDAQ during the period from January 1983 to December 2003, which we use in our previous work (see Onatski (2005)). For these data, the nulls of zero and one factors are, again, overwhelmingly rejected. The nulls of 2, 3, and 4 factors are rejected (at 5% level) again only vs. the alternatives of no more than 5 factors. This time, however, the p-values corresponding the alternatives of less than 7, 8, and 9 factors are not consistently small. They span a range between and Moreover, when we run the real-data test on the new data, we still overwhelmingly 23

24 reject hypotheses of zero and one factors, but no longer reject the hypotheses of 2, 3, etc. factors at any reasonable critical levels against any alternatives (the smallest p-value being for the test of two factors vs. the alternative of more than two but no more than four factors). The partial robustness of the our nding that there likely be 5 factors in the stock returns may be due either to the fact that much more stocks were included into the second data set, or to a possibility that the number of factors driving the stock returns decreased over time so that the second data set which spans more recent data period does not support the 5-factor hypothesis as strongly as the rst data set. To check the latter possibility, we split the original data set into two periods and then run our tests separately for the complex data constructed from the rst and the second periods. For both periods the nulls of zero and one factors are rejected, as usual. However, the nulls of 2 and 3 factors are now rejected at about 5% level against two alternatives: no more than 4 factors and no more than 7 factors for both time periods. The null of 4 factors is not rejected for any conventional critical levels against any of the studied alternatives for both time periods. Hence, the 5-factor phenomenon does not hold in both subsamples. To summarize, we nd that our tests overwhelmingly reject the hypotheses that there are either no factors or only one factor driving stock returns. There is some evidence that the number of factors may be ve, but this evidence is only partially robust with respect to the choice of the stocks to be included to the data set and with respect to the di erent choice of time periods. 5 Conclusion The new test shows good nite sample properties and uses simultaneous large n- large T asymptotics. It strongly outperforms the Connor-Korajczyk (1993) test for a variety of nite sample situations. The biggest weakness of the test is that it is developed for the situation 24

25 when the data are Gaussian and independent over time. The Monte Carlo experiments show, however, that the test is robust with respect to violations of the Gaussianity and the independence assumptions. We apply the test to test di erent hypotheses about the number of pervasive factors driving US stock returns. Our test rejects the nulls of zero or only one factors against alternatives of more factors at 5% levels. We also can reject the nulls of 2, 3 and 4 factors at 5% level at least against some alternatives. We cannot reject the null of 5 factors against alternatives of more factors at 5% level. The 5-factor nding turns out to be only partially robust with respect to di erent choices of the time interval and stocks to be included into the dataset. The rejection of zero and 1 factor hypotheses is very robust. 6 Appendix Proof of Theorem 2: We will rst formulate and prove a lemma, which our proof of Theorem 2 will be based upon. Let A (1) be a symmetric non-negative de nite nn matrix and A be an nn diagonal matrix of the form A = diag (a 1 ; :::; a k ; 0; 0; :::; 0) ; a 1 a 2 ::: a k > 0: Note that A and A (1) can be interpreted as matrix representations of linear operators acting in the space R n : For any linear bounded operator on R n ; B; we de ne its norm as kbk = (max eval (B B)) 1=2 ; which is the operator norm induced by the standard Euclidean norm of vectors in R n : We denote the j-th largest by absolute value eigenvalue of B as j (B); and the j-th largest by absolute value eigenvalue of B restricted to its invariant subspace M as i (BjM). Let M 0 be the invariant subspace of A corresponding to the eigenvalue 0 and let P 0 be the orthogonal projection on M 0 : We have the following Lemma 1: Let A ({) = A + {A (1) and let r 0 = a k =2: For real { such that 0 < { < r 0 = A (1) we have: k+1 (A ({)) { 1 P 0 A (1) P 0 jm 0 j{j 2 A (1) 2 3r0 (r 0 j{j ka (1) k) 2 25

26 Proof of Lemma 1: Let R (z; {) = (A ({) zi n ) 1 be the resolvent of A ({) de ned for all complex z not equal to any of the eigenvalues of A ({) : Then R (z; 0) is the resolvent of A: We will denote R (z; 0) as R(z): Let be a positively oriented circle in the complex plain with center at 0 and radius r 0 : According to Kato (1980), pages 67-68, R (z; {) can be expanded as R (z; {) = R (z) + X 1 ( j=1 {)j R (z) A (1) R (z) j ; where the sum on the right hand side is uniformly convergent on for j{j < min z2 A (1) 1 kr (z)k = r0 = A (1) ; where the last equality follows from the fact that kr (z)k = r 1 0 (2) for any z 2 : Furthermore, for j{j < r 0 = A (1) ; there are exactly n k eigenvalues (counted as many times as their algebraic multiplicity) inside the circle : We will call these eigenvalues 0-group eigenvalues and denote them as a (1) ({) ::: a (n k) ({) : Denote the direct sum of the invariant subspaces of A ({) corresponding to the 0-group eigenvalues as M 0 ({). Let P 0 be the orthogonal projection on M 0 and P 0 ({) be a projection R on M 0 ({) de ned as P 0 ({) = 1 2i R (z; {) dz: We have: P 0 ({) = P 0 1 2i 1X Z ( {) j j=1 R (z) A (1) R(z) j dz (3) where the series absolutely converges for j{j < r 0 = A (1) (see Kato (1980), pages 75-76). Equations (3) and (2) imply, that kp 0 ({) P 0 k 1X A j{j j (1) j j=1 r j 0 = j{j A (1) r 0 j{j ka (1) k : (4) Now consider an operator ~ A (1) ({) 1 { A ({) P 0 ({) : Since M 0 ({) is the invariant subspace of A ({) corresponding to the 0-group eigenvalues we have: 1 ~A (1) ({) jm 0 ({) = 1 { a(1) ({) : (5) 26

Testing hypotheses about the number of factors in. large factor models.

Testing hypotheses about the number of factors in. large factor models. Testing hypotheses about the number of factors in large factor models. Alexei Onatski Economics Department, Columbia University April 5, 2009 Abstract In this paper we study high-dimensional time series

More information

Asymptotics of the principal components estimator of large factor models with weak factors

Asymptotics of the principal components estimator of large factor models with weak factors Asymptotics of the principal components estimator of large factor models with weak factors Alexei Onatski Economics Department, Columbia University First draft: November, 2005 This draft: May, 2009 Abstract

More information

R = µ + Bf Arbitrage Pricing Model, APM

R = µ + Bf Arbitrage Pricing Model, APM 4.2 Arbitrage Pricing Model, APM Empirical evidence indicates that the CAPM beta does not completely explain the cross section of expected asset returns. This suggests that additional factors may be required.

More information

Estimating the Number of Common Factors in Serially Dependent Approximate Factor Models

Estimating the Number of Common Factors in Serially Dependent Approximate Factor Models Estimating the Number of Common Factors in Serially Dependent Approximate Factor Models Ryan Greenaway-McGrevy y Bureau of Economic Analysis Chirok Han Korea University February 7, 202 Donggyu Sul University

More information

Ross (1976) introduced the Arbitrage Pricing Theory (APT) as an alternative to the CAPM.

Ross (1976) introduced the Arbitrage Pricing Theory (APT) as an alternative to the CAPM. 4.2 Arbitrage Pricing Model, APM Empirical evidence indicates that the CAPM beta does not completely explain the cross section of expected asset returns. This suggests that additional factors may be required.

More information

Supplemental Material 1 for On Optimal Inference in the Linear IV Model

Supplemental Material 1 for On Optimal Inference in the Linear IV Model Supplemental Material 1 for On Optimal Inference in the Linear IV Model Donald W. K. Andrews Cowles Foundation for Research in Economics Yale University Vadim Marmer Vancouver School of Economics University

More information

Testing for a Trend with Persistent Errors

Testing for a Trend with Persistent Errors Testing for a Trend with Persistent Errors Graham Elliott UCSD August 2017 Abstract We develop new tests for the coe cient on a time trend in a regression of a variable on a constant and time trend where

More information

Chapter 1. GMM: Basic Concepts

Chapter 1. GMM: Basic Concepts Chapter 1. GMM: Basic Concepts Contents 1 Motivating Examples 1 1.1 Instrumental variable estimator....................... 1 1.2 Estimating parameters in monetary policy rules.............. 2 1.3 Estimating

More information

GMM estimation of spatial panels

GMM estimation of spatial panels MRA Munich ersonal ReEc Archive GMM estimation of spatial panels Francesco Moscone and Elisa Tosetti Brunel University 7. April 009 Online at http://mpra.ub.uni-muenchen.de/637/ MRA aper No. 637, posted

More information

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley Time Series Models and Inference James L. Powell Department of Economics University of California, Berkeley Overview In contrast to the classical linear regression model, in which the components of the

More information

Instead of using all the sample observations for estimation, the suggested procedure is to divide the data set

Instead of using all the sample observations for estimation, the suggested procedure is to divide the data set Chow forecast test: Instead of using all the sample observations for estimation, the suggested procedure is to divide the data set of N sample observations into N 1 observations to be used for estimation

More information

Parametric Inference on Strong Dependence

Parametric Inference on Strong Dependence Parametric Inference on Strong Dependence Peter M. Robinson London School of Economics Based on joint work with Javier Hualde: Javier Hualde and Peter M. Robinson: Gaussian Pseudo-Maximum Likelihood Estimation

More information

Econ 423 Lecture Notes: Additional Topics in Time Series 1

Econ 423 Lecture Notes: Additional Topics in Time Series 1 Econ 423 Lecture Notes: Additional Topics in Time Series 1 John C. Chao April 25, 2017 1 These notes are based in large part on Chapter 16 of Stock and Watson (2011). They are for instructional purposes

More information

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails GMM-based inference in the AR() panel data model for parameter values where local identi cation fails Edith Madsen entre for Applied Microeconometrics (AM) Department of Economics, University of openhagen,

More information

Inference about Clustering and Parametric. Assumptions in Covariance Matrix Estimation

Inference about Clustering and Parametric. Assumptions in Covariance Matrix Estimation Inference about Clustering and Parametric Assumptions in Covariance Matrix Estimation Mikko Packalen y Tony Wirjanto z 26 November 2010 Abstract Selecting an estimator for the variance covariance matrix

More information

Notes on Mathematical Expectations and Classes of Distributions Introduction to Econometric Theory Econ. 770

Notes on Mathematical Expectations and Classes of Distributions Introduction to Econometric Theory Econ. 770 Notes on Mathematical Expectations and Classes of Distributions Introduction to Econometric Theory Econ. 77 Jonathan B. Hill Dept. of Economics University of North Carolina - Chapel Hill October 4, 2 MATHEMATICAL

More information

Notes on Time Series Modeling

Notes on Time Series Modeling Notes on Time Series Modeling Garey Ramey University of California, San Diego January 17 1 Stationary processes De nition A stochastic process is any set of random variables y t indexed by t T : fy t g

More information

Econ 424 Time Series Concepts

Econ 424 Time Series Concepts Econ 424 Time Series Concepts Eric Zivot January 20 2015 Time Series Processes Stochastic (Random) Process { 1 2 +1 } = { } = sequence of random variables indexed by time Observed time series of length

More information

A Conditional-Heteroskedasticity-Robust Con dence Interval for the Autoregressive Parameter

A Conditional-Heteroskedasticity-Robust Con dence Interval for the Autoregressive Parameter A Conditional-Heteroskedasticity-Robust Con dence Interval for the Autoregressive Parameter Donald W. K. Andrews Cowles Foundation for Research in Economics Yale University Patrik Guggenberger Department

More information

Web Appendix to Multivariate High-Frequency-Based Volatility (HEAVY) Models

Web Appendix to Multivariate High-Frequency-Based Volatility (HEAVY) Models Web Appendix to Multivariate High-Frequency-Based Volatility (HEAVY) Models Diaa Noureldin Department of Economics, University of Oxford, & Oxford-Man Institute, Eagle House, Walton Well Road, Oxford OX

More information

1 The Multiple Regression Model: Freeing Up the Classical Assumptions

1 The Multiple Regression Model: Freeing Up the Classical Assumptions 1 The Multiple Regression Model: Freeing Up the Classical Assumptions Some or all of classical assumptions were crucial for many of the derivations of the previous chapters. Derivation of the OLS estimator

More information

Bootstrap Inference for Impulse Response Functions in Factor-Augmented Vector Autoregressions

Bootstrap Inference for Impulse Response Functions in Factor-Augmented Vector Autoregressions Bootstrap Inference for Impulse Response Functions in Factor-Augmented Vector Autoregressions Yohei Yamamoto University of Alberta, School of Business January 2010: This version May 2011 (in progress)

More information

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43 Panel Data March 2, 212 () Applied Economoetrics: Topic March 2, 212 1 / 43 Overview Many economic applications involve panel data. Panel data has both cross-sectional and time series aspects. Regression

More information

Chapter 2. Dynamic panel data models

Chapter 2. Dynamic panel data models Chapter 2. Dynamic panel data models School of Economics and Management - University of Geneva Christophe Hurlin, Université of Orléans University of Orléans April 2018 C. Hurlin (University of Orléans)

More information

Small versus big-data factor extraction in Dynamic Factor Models: An empirical assessment

Small versus big-data factor extraction in Dynamic Factor Models: An empirical assessment Small versus big-data factor extraction in Dynamic Factor Models: An empirical assessment Pilar Poncela and Esther Ruiz yz May 2015 Abstract In the context of Dynamic Factor Models (DFM), we compare point

More information

Lecture 6: Univariate Volatility Modelling: ARCH and GARCH Models

Lecture 6: Univariate Volatility Modelling: ARCH and GARCH Models Lecture 6: Univariate Volatility Modelling: ARCH and GARCH Models Prof. Massimo Guidolin 019 Financial Econometrics Winter/Spring 018 Overview ARCH models and their limitations Generalized ARCH models

More information

Estimation and Inference of Linear Trend Slope Ratios

Estimation and Inference of Linear Trend Slope Ratios Estimation and Inference of Linear rend Slope Ratios imothy J. Vogelsang and Nasreen Nawaz Department of Economics, Michigan State University October 4 Abstract We focus on the estimation of the ratio

More information

Lecture 4: Linear panel models

Lecture 4: Linear panel models Lecture 4: Linear panel models Luc Behaghel PSE February 2009 Luc Behaghel (PSE) Lecture 4 February 2009 1 / 47 Introduction Panel = repeated observations of the same individuals (e.g., rms, workers, countries)

More information

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008

ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008 ECONOMETRICS FIELD EXAM Michigan State University May 9, 2008 Instructions: Answer all four (4) questions. Point totals for each question are given in parenthesis; there are 00 points possible. Within

More information

Comment on HAC Corrections for Strongly Autocorrelated Time Series by Ulrich K. Müller

Comment on HAC Corrections for Strongly Autocorrelated Time Series by Ulrich K. Müller Comment on HAC Corrections for Strongly Autocorrelated ime Series by Ulrich K. Müller Yixiao Sun Department of Economics, UC San Diego May 2, 24 On the Nearly-optimal est Müller applies the theory of optimal

More information

On the Power of Tests for Regime Switching

On the Power of Tests for Regime Switching On the Power of Tests for Regime Switching joint work with Drew Carter and Ben Hansen Douglas G. Steigerwald UC Santa Barbara May 2015 D. Steigerwald (UCSB) Regime Switching May 2015 1 / 42 Motivating

More information

Bias in the Mean Reversion Estimator in the Continuous Time Gaussian and Lévy Processes

Bias in the Mean Reversion Estimator in the Continuous Time Gaussian and Lévy Processes Bias in the Mean Reversion Estimator in the Continuous Time Gaussian and Lévy Processes Aman Ullah a, Yun Wang a, Jun Yu b a Department of Economics, University of California, Riverside CA 95-47 b School

More information

2. Multivariate ARMA

2. Multivariate ARMA 2. Multivariate ARMA JEM 140: Quantitative Multivariate Finance IES, Charles University, Prague Summer 2018 JEM 140 () 2. Multivariate ARMA Summer 2018 1 / 19 Multivariate AR I Let r t = (r 1t,..., r kt

More information

Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity

Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity Nonparametric Identi cation and Estimation of Truncated Regression Models with Heteroskedasticity Songnian Chen a, Xun Lu a, Xianbo Zhou b and Yahong Zhou c a Department of Economics, Hong Kong University

More information

Weak and Strong Cross Section Dependence and Estimation of Large Panels

Weak and Strong Cross Section Dependence and Estimation of Large Panels Weak and Strong Cross Section Dependence and Estimation of Large Panels (with Alexander Chudik and Elisa Tosetti) Cambridge University and USC Econometric Society Meeting, July 2009, Canberra Background

More information

1 Regression with Time Series Variables

1 Regression with Time Series Variables 1 Regression with Time Series Variables With time series regression, Y might not only depend on X, but also lags of Y and lags of X Autoregressive Distributed lag (or ADL(p; q)) model has these features:

More information

LECTURE 12 UNIT ROOT, WEAK CONVERGENCE, FUNCTIONAL CLT

LECTURE 12 UNIT ROOT, WEAK CONVERGENCE, FUNCTIONAL CLT MARCH 29, 26 LECTURE 2 UNIT ROOT, WEAK CONVERGENCE, FUNCTIONAL CLT (Davidson (2), Chapter 4; Phillips Lectures on Unit Roots, Cointegration and Nonstationarity; White (999), Chapter 7) Unit root processes

More information

Chapter 2. GMM: Estimating Rational Expectations Models

Chapter 2. GMM: Estimating Rational Expectations Models Chapter 2. GMM: Estimating Rational Expectations Models Contents 1 Introduction 1 2 Step 1: Solve the model and obtain Euler equations 2 3 Step 2: Formulate moment restrictions 3 4 Step 3: Estimation and

More information

Robust Con dence Intervals in Nonlinear Regression under Weak Identi cation

Robust Con dence Intervals in Nonlinear Regression under Weak Identi cation Robust Con dence Intervals in Nonlinear Regression under Weak Identi cation Xu Cheng y Department of Economics Yale University First Draft: August, 27 This Version: December 28 Abstract In this paper,

More information

Approximately Most Powerful Tests for Moment Inequalities

Approximately Most Powerful Tests for Moment Inequalities Approximately Most Powerful Tests for Moment Inequalities Richard C. Chiburis Department of Economics, Princeton University September 26, 2008 Abstract The existing literature on testing moment inequalities

More information

Averaging and the Optimal Combination of Forecasts

Averaging and the Optimal Combination of Forecasts Averaging and the Optimal Combination of Forecasts Graham Elliott University of California, San Diego 9500 Gilman Drive LA JOLLA, CA, 92093-0508 September 27, 2011 Abstract The optimal combination of forecasts,

More information

Trending Models in the Data

Trending Models in the Data April 13, 2009 Spurious regression I Before we proceed to test for unit root and trend-stationary models, we will examine the phenomena of spurious regression. The material in this lecture can be found

More information

Normal Probability Plot Probability Probability

Normal Probability Plot Probability Probability Modelling multivariate returns Stefano Herzel Department ofeconomics, University of Perugia 1 Catalin Starica Department of Mathematical Statistics, Chalmers University of Technology Reha Tutuncu Department

More information

Intro VEC and BEKK Example Factor Models Cond Var and Cor Application Ref 4. MGARCH

Intro VEC and BEKK Example Factor Models Cond Var and Cor Application Ref 4. MGARCH ntro VEC and BEKK Example Factor Models Cond Var and Cor Application Ref 4. MGARCH JEM 140: Quantitative Multivariate Finance ES, Charles University, Prague Summer 2018 JEM 140 () 4. MGARCH Summer 2018

More information

SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011

SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011 SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER By Donald W. K. Andrews August 2011 COWLES FOUNDATION DISCUSSION PAPER NO. 1815 COWLES FOUNDATION FOR RESEARCH IN ECONOMICS

More information

Nonparametric Trending Regression with Cross-Sectional Dependence

Nonparametric Trending Regression with Cross-Sectional Dependence Nonparametric Trending Regression with Cross-Sectional Dependence Peter M. Robinson London School of Economics January 5, 00 Abstract Panel data, whose series length T is large but whose cross-section

More information

Likelihood Ratio Based Test for the Exogeneity and the Relevance of Instrumental Variables

Likelihood Ratio Based Test for the Exogeneity and the Relevance of Instrumental Variables Likelihood Ratio Based est for the Exogeneity and the Relevance of Instrumental Variables Dukpa Kim y Yoonseok Lee z September [under revision] Abstract his paper develops a test for the exogeneity and

More information

Applications of Subsampling, Hybrid, and Size-Correction Methods

Applications of Subsampling, Hybrid, and Size-Correction Methods Applications of Subsampling, Hybrid, and Size-Correction Methods Donald W. K. Andrews Cowles Foundation for Research in Economics Yale University Patrik Guggenberger Department of Economics UCLA November

More information

GMM based inference for panel data models

GMM based inference for panel data models GMM based inference for panel data models Maurice J.G. Bun and Frank Kleibergen y this version: 24 February 2010 JEL-code: C13; C23 Keywords: dynamic panel data model, Generalized Method of Moments, weak

More information

CAE Working Paper # Fixed-b Asymptotic Approximation of the Sampling Behavior of Nonparametric Spectral Density Estimators

CAE Working Paper # Fixed-b Asymptotic Approximation of the Sampling Behavior of Nonparametric Spectral Density Estimators CAE Working Paper #06-04 Fixed-b Asymptotic Approximation of the Sampling Behavior of Nonparametric Spectral Density Estimators by Nigar Hashimzade and Timothy Vogelsang January 2006. Fixed-b Asymptotic

More information

A CONDITIONAL-HETEROSKEDASTICITY-ROBUST CONFIDENCE INTERVAL FOR THE AUTOREGRESSIVE PARAMETER. Donald W.K. Andrews and Patrik Guggenberger

A CONDITIONAL-HETEROSKEDASTICITY-ROBUST CONFIDENCE INTERVAL FOR THE AUTOREGRESSIVE PARAMETER. Donald W.K. Andrews and Patrik Guggenberger A CONDITIONAL-HETEROSKEDASTICITY-ROBUST CONFIDENCE INTERVAL FOR THE AUTOREGRESSIVE PARAMETER By Donald W.K. Andrews and Patrik Guggenberger August 2011 Revised December 2012 COWLES FOUNDATION DISCUSSION

More information

Finite State Markov-chain Approximations to Highly. Persistent Processes

Finite State Markov-chain Approximations to Highly. Persistent Processes Finite State Markov-chain Approximations to Highly Persistent Processes Karen A. Kopecky y Richard M. H. Suen z This Version: November 2009 Abstract The Rouwenhorst method of approximating stationary AR(1)

More information

Measuring robustness

Measuring robustness Measuring robustness 1 Introduction While in the classical approach to statistics one aims at estimates which have desirable properties at an exactly speci ed model, the aim of robust methods is loosely

More information

Bootstrapping the Grainger Causality Test With Integrated Data

Bootstrapping the Grainger Causality Test With Integrated Data Bootstrapping the Grainger Causality Test With Integrated Data Richard Ti n University of Reading July 26, 2006 Abstract A Monte-carlo experiment is conducted to investigate the small sample performance

More information

This paper develops a test of the asymptotic arbitrage pricing theory (APT) via the maximum squared Sharpe

This paper develops a test of the asymptotic arbitrage pricing theory (APT) via the maximum squared Sharpe MANAGEMENT SCIENCE Vol. 55, No. 7, July 2009, pp. 1255 1266 issn 0025-1909 eissn 1526-5501 09 5507 1255 informs doi 10.1287/mnsc.1090.1004 2009 INFORMS Testing the APT with the Maximum Sharpe Ratio of

More information

Assessing the dependence of high-dimensional time series via sample autocovariances and correlations

Assessing the dependence of high-dimensional time series via sample autocovariances and correlations Assessing the dependence of high-dimensional time series via sample autocovariances and correlations Johannes Heiny University of Aarhus Joint work with Thomas Mikosch (Copenhagen), Richard Davis (Columbia),

More information

Lecture Notes on Measurement Error

Lecture Notes on Measurement Error Steve Pischke Spring 2000 Lecture Notes on Measurement Error These notes summarize a variety of simple results on measurement error which I nd useful. They also provide some references where more complete

More information

Modeling Expectations with Noncausal Autoregressions

Modeling Expectations with Noncausal Autoregressions Modeling Expectations with Noncausal Autoregressions Markku Lanne * Pentti Saikkonen ** Department of Economics Department of Mathematics and Statistics University of Helsinki University of Helsinki Abstract

More information

SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011 Revised March 2012

SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011 Revised March 2012 SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER By Donald W. K. Andrews August 2011 Revised March 2012 COWLES FOUNDATION DISCUSSION PAPER NO. 1815R COWLES FOUNDATION FOR

More information

Painlevé Representations for Distribution Functions for Next-Largest, Next-Next-Largest, etc., Eigenvalues of GOE, GUE and GSE

Painlevé Representations for Distribution Functions for Next-Largest, Next-Next-Largest, etc., Eigenvalues of GOE, GUE and GSE Painlevé Representations for Distribution Functions for Next-Largest, Next-Next-Largest, etc., Eigenvalues of GOE, GUE and GSE Craig A. Tracy UC Davis RHPIA 2005 SISSA, Trieste 1 Figure 1: Paul Painlevé,

More information

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis. 401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis

More information

x i = 1 yi 2 = 55 with N = 30. Use the above sample information to answer all the following questions. Show explicitly all formulas and calculations.

x i = 1 yi 2 = 55 with N = 30. Use the above sample information to answer all the following questions. Show explicitly all formulas and calculations. Exercises for the course of Econometrics Introduction 1. () A researcher is using data for a sample of 30 observations to investigate the relationship between some dependent variable y i and independent

More information

Testing for Panel Cointegration Using Common Correlated Effects Estimators

Testing for Panel Cointegration Using Common Correlated Effects Estimators Department of Economics Testing for Panel Cointegration Using Common Correlated Effects Estimators Department of Economics Discussion Paper 5-2 Anindya Banerjee Josep Lluís Carrion-i-Silvestre Testing

More information

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone

More information

Online Appendix to: Marijuana on Main Street? Estimating Demand in Markets with Limited Access

Online Appendix to: Marijuana on Main Street? Estimating Demand in Markets with Limited Access Online Appendix to: Marijuana on Main Street? Estating Demand in Markets with Lited Access By Liana Jacobi and Michelle Sovinsky This appendix provides details on the estation methodology for various speci

More information

CENTRE FOR ECONOMETRIC ANALYSIS

CENTRE FOR ECONOMETRIC ANALYSIS CENTRE FOR ECONOMETRIC ANALYSIS CEA@Cass http://www.cass.city.ac.uk/cea/index.html Cass Business School Faculty of Finance 106 Bunhill Row London EC1Y 8TZ Testing for Instability in Factor Structure of

More information

Nowcasting Norwegian GDP

Nowcasting Norwegian GDP Nowcasting Norwegian GDP Knut Are Aastveit and Tørres Trovik May 13, 2007 Introduction Motivation The last decades of advances in information technology has made it possible to access a huge amount of

More information

Testing Weak Convergence Based on HAR Covariance Matrix Estimators

Testing Weak Convergence Based on HAR Covariance Matrix Estimators Testing Weak Convergence Based on HAR Covariance atrix Estimators Jianning Kong y, Peter C. B. Phillips z, Donggyu Sul x August 4, 207 Abstract The weak convergence tests based on heteroskedasticity autocorrelation

More information

Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels.

Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels. Estimation of Dynamic Nonlinear Random E ects Models with Unbalanced Panels. Pedro Albarran y Raquel Carrasco z Jesus M. Carro x June 2014 Preliminary and Incomplete Abstract This paper presents and evaluates

More information

Least Squares Estimation-Finite-Sample Properties

Least Squares Estimation-Finite-Sample Properties Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions

More information

A Non-Parametric Approach of Heteroskedasticity Robust Estimation of Vector-Autoregressive (VAR) Models

A Non-Parametric Approach of Heteroskedasticity Robust Estimation of Vector-Autoregressive (VAR) Models Journal of Finance and Investment Analysis, vol.1, no.1, 2012, 55-67 ISSN: 2241-0988 (print version), 2241-0996 (online) International Scientific Press, 2012 A Non-Parametric Approach of Heteroskedasticity

More information

Finite sample distributions of nonlinear IV panel unit root tests in the presence of cross-section dependence

Finite sample distributions of nonlinear IV panel unit root tests in the presence of cross-section dependence Finite sample distributions of nonlinear IV panel unit root tests in the presence of cross-section dependence Qi Sun Department of Economics University of Leicester November 9, 2009 Abstract his paper

More information

Math 443/543 Graph Theory Notes 5: Graphs as matrices, spectral graph theory, and PageRank

Math 443/543 Graph Theory Notes 5: Graphs as matrices, spectral graph theory, and PageRank Math 443/543 Graph Theory Notes 5: Graphs as matrices, spectral graph theory, and PageRank David Glickenstein November 3, 4 Representing graphs as matrices It will sometimes be useful to represent graphs

More information

Markov-Switching Models with Endogenous Explanatory Variables. Chang-Jin Kim 1

Markov-Switching Models with Endogenous Explanatory Variables. Chang-Jin Kim 1 Markov-Switching Models with Endogenous Explanatory Variables by Chang-Jin Kim 1 Dept. of Economics, Korea University and Dept. of Economics, University of Washington First draft: August, 2002 This version:

More information

The Case Against JIVE

The Case Against JIVE The Case Against JIVE Related literature, Two comments and One reply PhD. student Freddy Rojas Cama Econometrics Theory II Rutgers University November 14th, 2011 Literature 1 Literature 2 Key de nitions

More information

11. Bootstrap Methods

11. Bootstrap Methods 11. Bootstrap Methods c A. Colin Cameron & Pravin K. Trivedi 2006 These transparencies were prepared in 20043. They can be used as an adjunct to Chapter 11 of our subsequent book Microeconometrics: Methods

More information

PANEL DATA RANDOM AND FIXED EFFECTS MODEL. Professor Menelaos Karanasos. December Panel Data (Institute) PANEL DATA December / 1

PANEL DATA RANDOM AND FIXED EFFECTS MODEL. Professor Menelaos Karanasos. December Panel Data (Institute) PANEL DATA December / 1 PANEL DATA RANDOM AND FIXED EFFECTS MODEL Professor Menelaos Karanasos December 2011 PANEL DATA Notation y it is the value of the dependent variable for cross-section unit i at time t where i = 1,...,

More information

Comparing Forecast Accuracy of Different Models for Prices of Metal Commodities

Comparing Forecast Accuracy of Different Models for Prices of Metal Commodities Comparing Forecast Accuracy of Different Models for Prices of Metal Commodities João Victor Issler (FGV) and Claudia F. Rodrigues (VALE) August, 2012 J.V. Issler and C.F. Rodrigues () Forecast Models for

More information

MULTIVARIATE POPULATIONS

MULTIVARIATE POPULATIONS CHAPTER 5 MULTIVARIATE POPULATIONS 5. INTRODUCTION In the following chapters we will be dealing with a variety of problems concerning multivariate populations. The purpose of this chapter is to provide

More information

Multivariate Time Series: VAR(p) Processes and Models

Multivariate Time Series: VAR(p) Processes and Models Multivariate Time Series: VAR(p) Processes and Models A VAR(p) model, for p > 0 is X t = φ 0 + Φ 1 X t 1 + + Φ p X t p + A t, where X t, φ 0, and X t i are k-vectors, Φ 1,..., Φ p are k k matrices, with

More information

3 Random Samples from Normal Distributions

3 Random Samples from Normal Distributions 3 Random Samples from Normal Distributions Statistical theory for random samples drawn from normal distributions is very important, partly because a great deal is known about its various associated distributions

More information

Dynamic Factor Models and Factor Augmented Vector Autoregressions. Lawrence J. Christiano

Dynamic Factor Models and Factor Augmented Vector Autoregressions. Lawrence J. Christiano Dynamic Factor Models and Factor Augmented Vector Autoregressions Lawrence J Christiano Dynamic Factor Models and Factor Augmented Vector Autoregressions Problem: the time series dimension of data is relatively

More information

Heteroskedasticity-Robust Inference in Finite Samples

Heteroskedasticity-Robust Inference in Finite Samples Heteroskedasticity-Robust Inference in Finite Samples Jerry Hausman and Christopher Palmer Massachusetts Institute of Technology December 011 Abstract Since the advent of heteroskedasticity-robust standard

More information

Economics Introduction to Econometrics - Fall 2007 Final Exam - Answers

Economics Introduction to Econometrics - Fall 2007 Final Exam - Answers Student Name: Economics 4818 - Introduction to Econometrics - Fall 2007 Final Exam - Answers SHOW ALL WORK! Evaluation: Problems: 3, 4C, 5C and 5F are worth 4 points. All other questions are worth 3 points.

More information

Short T Dynamic Panel Data Models with Individual and Interactive Time E ects

Short T Dynamic Panel Data Models with Individual and Interactive Time E ects Short T Dynamic Panel Data Models with Individual and Interactive Time E ects Kazuhiko Hayakawa Hiroshima University M. Hashem Pesaran University of Southern California, USA, and Trinity College, Cambridge

More information

Comparing Nested Predictive Regression Models with Persistent Predictors

Comparing Nested Predictive Regression Models with Persistent Predictors Comparing Nested Predictive Regression Models with Persistent Predictors Yan Ge y and ae-hwy Lee z November 29, 24 Abstract his paper is an extension of Clark and McCracken (CM 2, 25, 29) and Clark and

More information

CAE Working Paper # A New Asymptotic Theory for Heteroskedasticity-Autocorrelation Robust Tests. Nicholas M. Kiefer and Timothy J.

CAE Working Paper # A New Asymptotic Theory for Heteroskedasticity-Autocorrelation Robust Tests. Nicholas M. Kiefer and Timothy J. CAE Working Paper #05-08 A New Asymptotic Theory for Heteroskedasticity-Autocorrelation Robust Tests by Nicholas M. Kiefer and Timothy J. Vogelsang January 2005. A New Asymptotic Theory for Heteroskedasticity-Autocorrelation

More information

Notes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770

Notes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770 Notes on Asymptotic Theory: Convergence in Probability and Distribution Introduction to Econometric Theory Econ. 770 Jonathan B. Hill Dept. of Economics University of North Carolina - Chapel Hill November

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria SOLUTION TO FINAL EXAM Friday, April 12, 2013. From 9:00-12:00 (3 hours) INSTRUCTIONS:

More information

Bootstrap tests of multiple inequality restrictions on variance ratios

Bootstrap tests of multiple inequality restrictions on variance ratios Economics Letters 91 (2006) 343 348 www.elsevier.com/locate/econbase Bootstrap tests of multiple inequality restrictions on variance ratios Jeff Fleming a, Chris Kirby b, *, Barbara Ostdiek a a Jones Graduate

More information

Multivariate Time Series Analysis and Its Applications [Tsay (2005), chapter 8]

Multivariate Time Series Analysis and Its Applications [Tsay (2005), chapter 8] 1 Multivariate Time Series Analysis and Its Applications [Tsay (2005), chapter 8] Insights: Price movements in one market can spread easily and instantly to another market [economic globalization and internet

More information

Econometrics II. Nonstandard Standard Error Issues: A Guide for the. Practitioner

Econometrics II. Nonstandard Standard Error Issues: A Guide for the. Practitioner Econometrics II Nonstandard Standard Error Issues: A Guide for the Practitioner Måns Söderbom 10 May 2011 Department of Economics, University of Gothenburg. Email: mans.soderbom@economics.gu.se. Web: www.economics.gu.se/soderbom,

More information

A Test of Cointegration Rank Based Title Component Analysis.

A Test of Cointegration Rank Based Title Component Analysis. A Test of Cointegration Rank Based Title Component Analysis Author(s) Chigira, Hiroaki Citation Issue 2006-01 Date Type Technical Report Text Version publisher URL http://hdl.handle.net/10086/13683 Right

More information

Large Panels with Common Factors and Spatial Correlations

Large Panels with Common Factors and Spatial Correlations DISCUSSIO PAPER SERIES IZA DP o. 332 Large Panels with Common Factors and Spatial Correlations M. Hashem Pesaran Elisa osetti September 27 Forschungsinstitut zur Zukunft der Arbeit Institute for the Study

More information

Predicting bond returns using the output gap in expansions and recessions

Predicting bond returns using the output gap in expansions and recessions Erasmus university Rotterdam Erasmus school of economics Bachelor Thesis Quantitative finance Predicting bond returns using the output gap in expansions and recessions Author: Martijn Eertman Studentnumber:

More information

Panel cointegration rank testing with cross-section dependence

Panel cointegration rank testing with cross-section dependence Panel cointegration rank testing with cross-section dependence Josep Lluís Carrion-i-Silvestre y Laura Surdeanu z September 2007 Abstract In this paper we propose a test statistic to determine the cointegration

More information

1 A Non-technical Introduction to Regression

1 A Non-technical Introduction to Regression 1 A Non-technical Introduction to Regression Chapters 1 and Chapter 2 of the textbook are reviews of material you should know from your previous study (e.g. in your second year course). They cover, in

More information

A New Approach to Robust Inference in Cointegration

A New Approach to Robust Inference in Cointegration A New Approach to Robust Inference in Cointegration Sainan Jin Guanghua School of Management, Peking University Peter C. B. Phillips Cowles Foundation, Yale University, University of Auckland & University

More information

TESTS FOR STOCHASTIC SEASONALITY APPLIED TO DAILY FINANCIAL TIME SERIES*

TESTS FOR STOCHASTIC SEASONALITY APPLIED TO DAILY FINANCIAL TIME SERIES* The Manchester School Vol 67 No. 1 January 1999 1463^6786 39^59 TESTS FOR STOCHASTIC SEASONALITY APPLIED TO DAILY FINANCIAL TIME SERIES* by I. C. ANDRADE University of Southampton A. D. CLARE ISMA Centre,

More information

Discussion of Bootstrap prediction intervals for linear, nonlinear, and nonparametric autoregressions, by Li Pan and Dimitris Politis

Discussion of Bootstrap prediction intervals for linear, nonlinear, and nonparametric autoregressions, by Li Pan and Dimitris Politis Discussion of Bootstrap prediction intervals for linear, nonlinear, and nonparametric autoregressions, by Li Pan and Dimitris Politis Sílvia Gonçalves and Benoit Perron Département de sciences économiques,

More information