THE BEHAVIOUR OF LINEAR MODEL SELECTION TESTS UNDER GLOBALLY NON-NESTED HYPOTHESES


Sankhyā: The Indian Journal of Statistics, 2002, Volume 64, Series A, Pt. 1

By MITCHELL R. WATNIK and WESLEY O. JOHNSON
University of California, Davis

SUMMARY. We consider the asymptotic behaviour of tests for non-nested linear model selection under the alternative hypothesis. We consider three testing procedures: the J-test (cf. Davidson and MacKinnon, 1981), the JA-test (cf. Fisher and McAleer, 1981), and the modified Cox test (cf. Godfrey and Pesaran, 1983). All are shown to have positive means under the true alternative. Thus, for discriminating between models, they should be treated as one-sided tests. We derive asymptotic distributions for slightly modified versions of these statistics under the alternative in order to obtain relative efficiencies. The modified versions are selected since they have greater power than the original test statistics. Both algebraic and theoretical relationships among the tests are indicated.

1. Introduction

Choosing between non-nested linear regression models is an important practical problem in statistics. For example, suppose there are two schools of thought regarding a certain biological, economic or sociological phenomenon. Each school proceeds to analyze particular data accordingly, resulting in models that overlap to a degree but are still separate, i.e. non-nested. Alternatively, the models may have resulted from different model selection strategies, e.g. C_p (cf. Mallows, 1973) or the corrected Akaike Information Criterion (cf. Hurvich and Tsai, 1989). There is a clear interest in choosing between the two models. The problem of choosing between non-nested models traces back to Cox (1961), who formulated a testing procedure based on a likelihood ratio approach. Under appropriate conditions, differing variants of his procedure for non-nested linear models have well-behaved asymptotic properties under the null and local alternative hypotheses (cf. Dastoor and McAleer, 1989; Pesaran, 1982, 1987; Watnik, Johnson and Bedrick, 2001).

Paper received October 2000; revised July 2001.
AMS (1991) subject classifications. Primary 62J05, 62F05; secondary 62J15, 62F03.
Key words and phrases. Asymptotic relative efficiency; J-test; JA-test; multiple correlation; power; regression; separate hypotheses.

In the significance testing arena (cf. Cox, 1962), one does not assume that the two models make up the entire space of possibilities; the alternative is simply left unspecified. Here, unlike Dastoor and McAleer (1989), for example, we are interested in the discrimination problem (Cox, 1962); i.e., determining which of the two hypotheses is better (Efron, 1984). McAleer (1995) notes that, although this is not the way non-nested tests were derived, this is typically the way they are employed. Unlike the usual nested situation, it is possible to reject both hypotheses when they are non-nested. To be specific, if the two hypotheses are referred to as H_X and H_Z, one must calculate the test statistic when H_X is the null and H_Z is the alternative, as well as when H_Z is the null and H_X is the alternative. With each calculation, the practitioner must consider whether or not to reject the current null hypothesis. In the non-nested situation, then, four conclusions are possible (Green, 1971): (1) reject both hypotheses; (2) reject H_X but not H_Z; (3) reject H_Z but not H_X; and (4) do not reject either hypothesis.

Pesaran (1974) defines size and power for non-nested tests. Size, α, is the probability, calculated under H_X, of rejecting the null hypothesis when H_X is the null. Power, π_n, is defined to be the probability of correctly identifying the true model; namely, the probability under H_X of simultaneously (i) not rejecting H_X and (ii) rejecting H_Z. By definition, then, power in the non-nested situation is less than the power for a comparable test of nested hypotheses, π*_n, which in our case would simply be the probability under H_X of rejecting the null hypothesis H_Z. The last point is important, because standard definitions of efficiency, such as the Bahadur efficiency (Bahadur, 1960) and the Hájek and Šidák (1967, p. 267) relative efficiency, depend on the definition of power in the nested situation. We note, however, that π_n is approximately π*_n − α provided the chance of rejecting the false null, conditional on having rejected the true null, is nearly one. Thus, with nested power calculations, we need only make a simple adjustment to get the non-nested power under this very plausible assumption, which is supported by empirical work.

Cox (1961, 1962) considers the significance testing setting and uses it to motivate the idea of artificial nesting, wherein one creates an artificial supermodel and tests hypotheses about subsets of it. This is the tactic employed in both the Davidson and MacKinnon (1981) development of the popular J-test, and the Fisher and McAleer (1981) JA-test. Our results (cf. section 3) indicate that, in the discrimination setting, the above procedures should result in one-sided tests. Non-nested tests have often been considered as significance tests and thus treated as two-sided (cf. Godfrey, 1998).
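As a minimal illustration of the discrimination scheme just described, the following sketch (Python; the helper names are ours, not the paper's) classifies a pair of one-sided tests into Green's (1971) four conclusions and applies the π_n ≈ π*_n − α adjustment; the one-sided rejection rule anticipates the results of section 3.

```python
from scipy.stats import norm

def classify(stat_Z_null, stat_X_null, alpha=0.05):
    """Green's (1971) four possible conclusions from a pair of one-sided
    non-nested tests: one with H_Z as the null, one with H_X as the null."""
    crit = norm.ppf(1 - alpha)
    reject_Z, reject_X = stat_Z_null > crit, stat_X_null > crit
    if reject_Z and reject_X:
        return "reject both hypotheses"
    if reject_X:
        return "reject H_X but not H_Z"
    if reject_Z:
        return "reject H_Z but not H_X"
    return "do not reject either hypothesis"

def nonnested_power(nested_power, alpha=0.05):
    """Approximation pi_n ~ pi*_n - alpha discussed above."""
    return nested_power - alpha
```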

The primary purpose of this paper is to make asymptotic power comparisons for the modified Cox test of Godfrey and Pesaran (1983), the J-test, the JA-test, and some variations of these three procedures. Towards that goal, we also give a way to estimate the power of non-nested model selection tests when the true parameter vector and variance can be consistently estimated. In addition, we give a method for theoretically judging which procedure is the most powerful by using measures of asymptotic relative efficiency. Although the J-test usually stands out as preferable in large samples, it has been established that this test may not hold size in moderate samples, especially when the dimensions of the two models are different (Godfrey, 1998; McAleer, 1995; Watnik, Johnson and Bedrick, 2001).

Pesaran (1984) compares the Bahadur (1960) asymptotic relative efficiencies of some Cox-type testing procedures against each other and against the J-test using the two-sided versions; i.e., under the significance testing setting. Pesaran (1982) considers the power of the J-test against a Cox-type test for locally non-nested hypotheses, using both asymptotic and Monte Carlo methods. Watnik et al. (2001) give Monte Carlo results for the three aforementioned test procedures. Godfrey (1998) suggests a promising bootstrap J-test to correct for the J-test size problem, but asymptotic theory for this procedure is beyond the scope of this article. Other approaches to the non-nested testing problem in various settings have been discussed by Graybill and Iyer (1994), Efron (1984), Efron and Tibshirani (1993) and Victoria-Feser (1997).

In section 2, we develop notation, introduce the models, and give preliminary results. Test statistics are introduced and asymptotic properties given in section 3, with technical details given in the appendix. Section 4 discusses the asymptotic relative efficiency comparisons for the testing procedures. Concluding remarks are given in section 5.

2. Background

Throughout this paper, the model generating the n × 1 vector Y will be assumed to be one of two candidate models. First consider

Y = Vδ + Xβ + ε_x,   (2.1)

where V is an n × o matrix; X is n × p; and ε_x1, ..., ε_xn are independently and identically distributed with mean 0 and finite variance σ²_x > 0. The rank of X is defined to be p throughout. If model (2.1) generates Y, then we assume that δ_1, ..., δ_o, β_1, ..., β_p are all non-zero. As a competitor, the model

Y = Vδ + Zγ + ε_z   (2.2)

is suggested; Z is an n × q matrix; V is common to both models; and the ε_zi are independently and identically distributed with mean 0 and variance σ²_z for i = 1, ..., n. The rank of Z is q. If model (2.2) generates Y, then we assume that δ_1, ..., δ_o, γ_1, ..., γ_q are all non-zero.

We refer to the matrices V, X, Z, etc. and their respective column spaces (cf. Christensen, 1987, p. 25) interchangeably in order to ease the notational burden. We assume that the n × 1 vector of ones, e, is in V. The column vectors in X are linearly independent of those in Z and vice versa. Furthermore, neither X nor Z may be the null set, as this would result in nested models. For ease of tractability, and without loss of generality, we assume

V^T X = 0,   V^T Z = 0.   (2.3)

Thus the vector δ is the same in models (2.1) and (2.2). Otherwise, one can reparameterize by projecting both X and Z onto V^⊥, the orthogonal complement of V, in order to satisfy (2.3) without altering the models. Equations (2.3) allow us to consider the contributions due to X and Z individually; i.e., outside of whatever V already adds to the model.

We develop asymptotic properties for two statistics that are key to our later results. Our inferences are conditional upon X and Z, and we require

C_XX ≡ lim X^T X/n,   C_ZZ ≡ lim Z^T Z/n,   C_XZ = C_ZX^T ≡ lim X^T Z/n,   (2.4)

where C_XX and C_ZZ are finite, positive definite, symmetric matrices. In fact, we assume that X^T X/n = C_XX, etc., for large enough n. We also require the following standard assumptions (cf. McAleer, 1995) to hold for large n:

Assumptions 2.1.
a. The design matrices X and Z are completely separate (cf. Cox, 1961); i.e., no linear combination of columns in X lies in the column space of Z and vice-versa, or X ∩ Z = ∅.
b. Xβ does not lie in Z.
c. X and Z are non-singular.

Kent (1986) discusses the implications of violating these assumptions. We address the consequences of violating Assumption 2.1b in section 3.3.
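The reparameterization (2.3)-(2.4) is straightforward to carry out numerically. The sketch below (Python/numpy; the function name and array arguments are illustrative, not part of the paper) projects X and Z onto the orthogonal complement of V and forms the sample versions of the C matrices.

```python
import numpy as np

def orthogonalize(V, X, Z):
    """Reparameterize X and Z as in (2.3)-(2.4): project both onto the
    orthogonal complement of the column space of V (which contains the
    intercept), then form the sample cross-product matrices."""
    n = V.shape[0]
    P_V = V @ np.linalg.solve(V.T @ V, V.T)   # projection onto col(V)
    M_V = np.eye(n) - P_V                     # projection onto col(V)-perp
    Xp, Zp = M_V @ X, M_V @ Z                 # now V^T Xp = 0 and V^T Zp = 0
    C_XX = Xp.T @ Xp / n
    C_ZZ = Zp.T @ Zp / n
    C_XZ = Xp.T @ Zp / n
    return Xp, Zp, C_XX, C_ZZ, C_XZ
```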

Pesaran (1987) defines local alternatives to mean that (2.1) and (2.2) can be brought arbitrarily close to one another in our setting. His definition of global alternatives is that the two models cannot be brought arbitrarily close; i.e., they are distinct. With the parameterization that Pesaran (1987) uses, he asserts that linear models cannot be globally non-nested. However, with our parameterization, and because of the assumptions that the β_i (i = 1, ..., p) and γ_j (j = 1, ..., q) are all non-zero, we ensure that the models are globally non-nested and are not pair-wise local alternatives. Ours separates the common part of the models (V) from the distinct parts (X and Z), while his does not. We believe that the global setting is a more natural way to consider linear alternatives.

Now, define the set of linear statistics

W_X ≡ n^{-1/2} (X^T X)^{-1/2} X^T Y,   W_Z ≡ n^{-1/2} (Z^T Z)^{-1/2} Z^T Y.

Under (2.1), and with large n,

E(W_X) = C_XX^{1/2} β ≡ µ_x,   E(W_Z) = C_ZZ^{-1/2} C_ZX β ≡ µ_z.   (2.5)

Further, define

A ≡ (Z^T Z)^{-1/2} Z^T X (X^T X)^{-1/2},   R ≡ C_ZZ^{-1/2} C_ZX C_XX^{-1/2},   M ≡ R^T R.   (2.6)

Note that A = corr(n^{1/2} W_Z, n^{1/2} W_X) and that X and Z are centered, since e ∈ V and both are projected onto V^⊥, due to (2.3). We use A to denote the computational formula for the correlation matrix, while R represents the limiting correlation in the theoretical results. Notably, A^T A is also the matrix from which the sample canonical correlation coefficients are obtained for the two samples of column variables associated with X and Z, while M is the population counterpart. An important property of M is that its characteristic roots are between 0 and 1, inclusive, and hence M is nonnegative definite. We note in the Appendix to Section 2 that W_X and W_Z are asymptotically jointly normal under mild conditions.

Recalling the definitions of M from (2.6) and µ_x from (2.5), we define

ξ_i ≡ µ_x^T M^i µ_x,  i = 0, 1, ..., 5;   λ_i ≡ ξ_{i−1} − ξ_i = µ_x^T M^{i−1}(I − M)µ_x,  i = 1, ..., 5;   (2.7)

where M^0 ≡ I by convention. We note that λ_i ≥ 0 and that λ_i − λ_{i+1} = µ_x^T (I − M) M^{i−1} (I − M) µ_x ≥ 0, where the inequality holds because M = R^T R is symmetric with roots between 0 and 1. It follows that

ξ_0 ≥ ξ_1 ≥ ... ≥ ξ_5 ≥ 0,   and   λ_1 ≥ λ_2 ≥ λ_3 ≥ λ_4 ≥ λ_5 ≥ 0.   (2.8)
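A sample analogue of the quantities in (2.6)-(2.7) can be computed directly from the projected design matrices. The following sketch is illustrative only; it assumes Xp and Zp have already been orthogonalized against V as above, and takes a hypothesized β.

```python
import numpy as np

def _sym_inv_sqrt(S):
    """Inverse symmetric square root of a positive definite matrix."""
    w, U = np.linalg.eigh(S)
    return U @ np.diag(w ** -0.5) @ U.T

def canonical_quantities(Xp, Zp, beta):
    """Sample versions of A, M and the xi_i / lambda_i sequences of (2.6)-(2.7)."""
    n = Xp.shape[0]
    A = _sym_inv_sqrt(Zp.T @ Zp) @ (Zp.T @ Xp) @ _sym_inv_sqrt(Xp.T @ Xp)
    M = A.T @ A                                    # sample canonical correlation matrix
    w, U = np.linalg.eigh(Xp.T @ Xp / n)
    mu_x = U @ np.diag(w ** 0.5) @ U.T @ beta      # C_XX^{1/2} beta, cf. (2.5)
    xi = [mu_x @ np.linalg.matrix_power(M, i) @ mu_x for i in range(6)]
    lam = [xi[i - 1] - xi[i] for i in range(1, 6)]  # lambda_1, ..., lambda_5
    return A, M, xi, lam
```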

We note that, under Assumptions 2.1, statement (2.8) becomes λ_1 > λ_2 > λ_3 > λ_4 > λ_5 > 0. The following proposition is useful for establishing later results.

Proposition 2.1.

(a) λ_2² ≤ λ_1λ_3;   (b) λ_3² ≤ λ_2λ_4;   (c) λ_3² ≤ λ_1λ_5;   (d) λ_2λ_3 ≤ λ_1λ_4.   (2.9)

We prove claims (a) and (d) in the Appendix to Section 2; the others follow similarly.

Finally, we define a consistent estimator of the variance. Following Pesaran (1974), under (2.1),

s²_x ≡ ‖P̄_(V,X) Y‖²/(n − p − o) → σ²_x,   s²_z ≡ ‖P̄_(V,Z) Y‖²/(n − q − o) → σ²_x + ‖C_XX^{1/2}β‖² − ‖C_ZZ^{-1/2}C_ZXβ‖²,

in probability as n → ∞, where P̄_X ≡ I − P_X, P_X is the projection matrix onto the column space of X, and ‖·‖ denotes the Euclidean norm of a vector. Similarly, s²_z → σ²_z and s²_x → σ²_z + ‖C_ZZ^{1/2}γ‖² − ‖C_XX^{-1/2}C_XZγ‖², in probability under (2.2). Since the true model is unknown à priori, we use Royston and Thompson's (1995) estimate of variance: σ̂² ≡ min(s²_x, s²_z). Then σ̂² consistently estimates σ²_x under (2.1) and σ²_z under (2.2). Equivalently, σ̂² consistently estimates σ², the variance of the true model, regardless of whether σ² is σ²_x or σ²_z. We use σ̂ instead of either s_x or s_z to estimate σ because doing so increases the power of non-nested tests while not changing the asymptotic distributions under the null.

3. The Test Statistics

We seek to evaluate the performance of various non-nested testing procedures. We first present the test statistics associated with the J-test (Davidson and MacKinnon, 1981), the JA-test (Fisher and McAleer, 1981), and the modified Cox test (Godfrey and Pesaran, 1983). All of the test statistics considered herein have asymptotic standard normal distributions under the null hypothesis. We will derive the asymptotic distributions under the alternative.

3.1. The null hypothesis (2.2). The version of the test statistic for the J-test developed by Royston and Thompson (1995, equation 4.5), which they

argue has size closer to the nominal level under (2.2) than does the original Davidson and MacKinnon (1981) proposal, is

J_Z ≡ (‖P_X Y‖² − Y^T P_X P_Z Y) / {σ̂ (‖P_X Y‖² − ‖P_Z P_X Y‖²)^{1/2}}.   (3.1)

In the Appendix to Section 3, we present this and the other statistics in this section in a form which facilitates theoretical arguments. We consider the form of (3.1) to be more practical for calculations. The JA-test has test statistic (cf. Table 1 of Fisher, 1983)

JA_Z ≡ (Y^T P_X P_Z Y − ‖P_X P_Z Y‖²) / {σ̂ (‖P_X P_Z Y‖² − ‖P_Z P_X P_Z Y‖²)^{1/2}}.   (3.2)

Fisher (1983) shows that, when the appropriate term is substituted for σ̂ above, JA_Z has an exact t distribution under (2.2) when the errors are normally distributed. Our version of the modified Cox test statistic of Godfrey and Pesaran (1983) is the negative of the original,

G̃_Z ≡ [‖P_X Y‖² − ‖P_X P_Z Y‖² − s²_z {p − tr(P_Z P_X)}] / {4 s²_z (‖P_X P_Z Y‖² − ‖P_Z P_X P_Z Y‖²) + 2 s⁴_z tr(B²)}^{1/2},

where

tr(B²) ≡ p − tr{(P_X P_Z)²} − {p − tr(P_X P_Z)}²/(n − q).

The numerator of this statistic is identical to expression (89) of Cox (1961), and it has expectation zero under H_Z. We consider the simpler test statistic

G_Z ≡ (‖P_X Y‖² − ‖P_X P_Z Y‖²) / {4 σ̂² (‖P_X P_Z Y‖² − ‖P_Z P_X P_Z Y‖²)}^{1/2}.   (3.3)

We note that G_Z − G̃_Z = o_p(1) under (2.2) (cf. Godfrey and Pesaran, 1983).

Remarks. 1. The statistics J_X, JA_X, and G_X are defined analogously; namely, by exchanging Z for X and vice-versa in equations (3.1), (3.2), and (3.3).

2. The statistics J_Z, JA_Z, and G_Z are, with the exception of using σ̂ in place of their respective estimates of the standard deviation, exactly the same as those given in Davidson and MacKinnon (1981) (cf. Table 1 of Fisher, 1983), Fisher and McAleer (1981), and Godfrey and Pesaran (1983), respectively. Our use of σ̂ to estimate the standard deviation was done solely to increase the power of all of the tests (cf. section 2). Had we used s_z in place of σ̂ in (3.3), it would have been the case that G_Z − G̃_Z = o_p(1) under (2.1) (cf. Godfrey and Pesaran, 1983) and that G_Z would be equivalent to the test statistic given in equation (25) of Fisher and McAleer (1981) (cf. Table 1 of Fisher, 1983).

We define notation for the denominators of the J and JA tests, respectively:

τ̂_JZ ≡ {σ̂² (‖W_X‖² − ‖A W_X‖²)}^{1/2};   τ̂_JAZ ≡ {σ̂² (‖A^T W_Z‖² − ‖A A^T W_Z‖²)}^{1/2}.   (3.4)

Note that the denominator of G_Z is proportional to the denominator of JA_Z. Using (2.7) and (A.1), we see that, under (2.1),

τ̂_JX →p σ_x λ_2^{1/2},   τ̂_JAX →p σ_x λ_2^{1/2},   τ̂_JZ →p σ_x λ_1^{1/2},   τ̂_JAZ →p σ_x λ_3^{1/2}.   (3.5)

This verifies the result of Fisher (1983), who notes that the denominators of J_X and JA_X are asymptotically equivalent under H_X. So it may be interesting to investigate the properties of these statistics using the other's denominator; as Pesaran (1984, p. 251) notes, the efficiency of a Cox-type test depends on the choice of consistent estimate of the variance. We therefore define

F G_Z ≡ (‖P_X Y‖² − ‖P_X P_Z Y‖²) / (2 n^{1/2} τ̂_JZ);   (3.6)

F J_Z ≡ (‖P_X Y‖² − Y^T P_X P_Z Y) / (n^{1/2} τ̂_JAZ);   (3.7)

and

F JA_Z ≡ (Y^T P_X P_Z Y − ‖P_X P_Z Y‖²) / (n^{1/2} τ̂_JZ).   (3.8)

We note the correspondence

F G_Z = (J_Z + F JA_Z)/2,   G_Z = (F J_Z + JA_Z)/2.   (3.9)
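Under the reconstruction of (3.1)-(3.3) and (3.6)-(3.8) given above, all six statistics with H_Z as the null can be computed from a few projection matrices. The sketch below is illustrative only (the names are ours, not the paper's); sigma2_hat is min(s²_x, s²_z), Xp and Zp are assumed already projected off V, and the X-as-null versions are obtained by exchanging Xp and Zp.

```python
import numpy as np

def proj(A):
    """Projection matrix onto the column space of A."""
    return A @ np.linalg.solve(A.T @ A, A.T)

def nonnested_statistics(Y, Xp, Zp, sigma2_hat):
    """The six statistics of section 3 with H_Z as the null."""
    PX, PZ = proj(Xp), proj(Zp)
    PXY, PZPXY = PX @ Y, PZ @ (PX @ Y)
    PXPZY, PZPXPZY = PX @ (PZ @ Y), PZ @ (PX @ (PZ @ Y))
    s = np.sqrt(sigma2_hat)
    num_J  = PXY @ PXY - Y @ PXPZY           # ||P_X Y||^2 - Y'P_X P_Z Y
    num_JA = Y @ PXPZY - PXPZY @ PXPZY       # Y'P_X P_Z Y - ||P_X P_Z Y||^2
    num_G  = PXY @ PXY - PXPZY @ PXPZY       # ||P_X Y||^2 - ||P_X P_Z Y||^2
    den_J  = s * np.sqrt(PXY @ PXY - PZPXY @ PZPXY)           # n^{1/2} tau_JZ
    den_JA = s * np.sqrt(PXPZY @ PXPZY - PZPXPZY @ PZPXPZY)   # n^{1/2} tau_JAZ
    return {"J": num_J / den_J,        "JA": num_JA / den_JA,
            "G": num_G / (2 * den_JA), "FJ": num_J / den_JA,
            "FJA": num_JA / den_J,     "FG": num_G / (2 * den_J)}
```

The correspondence (3.9) can be checked directly from this sketch: averaging the "J" and "FJA" entries reproduces "FG", and averaging "FJ" and "JA" reproduces "G".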

As a consequence of (3.9), we thus expect the performance of F G_Z and G_Z to strike a balance between that of the corresponding two tests. We finally note that the statistics F G_X, F J_X, and F JA_X are defined by interchanging the roles of X and Z in equations (3.6), (3.7), and (3.8).

3.2. The alternative hypothesis (2.1). We now derive the distributions of J_Z, F J_Z, JA_Z, F JA_Z, G_Z, and F G_Z under (2.1). As the subscript indicates, (2.2) is considered to be the null. Using (2.5) and (2.7) it follows that, in probability and under (2.1),

n^{-1/2} J_Z → λ_1/(σλ_1^{1/2}),   n^{-1/2} F J_Z → λ_1/(σλ_3^{1/2}),
n^{-1/2} JA_Z → λ_2/(σλ_3^{1/2}),   n^{-1/2} F JA_Z → λ_2/(σλ_1^{1/2}),
n^{-1/2} G_Z → (λ_1 + λ_2)/(2σλ_3^{1/2}),   n^{-1/2} F G_Z → (λ_1 + λ_2)/(2σλ_1^{1/2}).   (3.10)

Under Assumptions 2.1, then, the asymptotic means of these test statistics are all strictly positive. Pesaran (1987) notes this about J, JA and G but, since he considers local alternatives, he does not suggest using the statistics as one-sided tests. In the model discrimination setting, however, it follows directly that the corresponding tests of hypotheses should be one-sided, and the null hypothesis should only be rejected if the observed test statistic is too large. We obtain the result

Proposition 3.2. Under (2.1), the assumptions for (A.1) and Assumptions 2.1, the following convergence results hold as n → ∞:

n^{1/2} {n^{-1/2} J_Z − λ_1^{1/2}/σ} → N(0, 1),

n^{1/2} {n^{-1/2} JA_Z − λ_2/(σλ_3^{1/2})} → N(0, (λ_2/λ_3)[(2λ_4 − λ_3)/λ_3 + λ_2(λ_4 − λ_5)/λ_3²]),

n^{1/2} {n^{-1/2} G_Z − (λ_1 + λ_2)/(2σλ_3^{1/2})} → N(0, ((λ_1 + λ_2)/λ_3)[λ_4/λ_3 + (λ_1 + λ_2)(λ_4 − λ_5)/(4λ_3²)] − 1),

n^{1/2} {n^{-1/2} F J_Z − λ_1/(σλ_3^{1/2})} → N(0, (1/λ_3)[2λ_1 − 3λ_2 + 2λ_1λ_4/λ_3 + λ_1²(λ_4 − λ_5)/λ_3²]),

n^{1/2} {n^{-1/2} F JA_Z − λ_2/(σλ_1^{1/2})} → N(0, (λ_2/λ_1)[1 − 3λ_2/λ_1 + 4λ_3/λ_1 − λ_2²/λ_1²]),

n^{1/2} {n^{-1/2} F G_Z − (λ_1 + λ_2)/(2σλ_1^{1/2})} → N(0, λ_2λ_3/λ_1² + (λ_1 + λ_2)²(λ_1 − λ_2)/(4λ_1³)).   (3.11)

The proof of Proposition 3.2 is in the Appendix to Section 3. Note that the asymptotic variances do not depend on the norm of β, while the means do. Also note that the asymptotic distributions under the alternative depend on the design matrices X and Z only through the canonical correlation matrix M. We have the following corollary to Proposition 3.2, which becomes important in the next section.

Corollary to Proposition 3.2. Under (2.1), the assumptions for (A.1) and Assumptions 2.1, the asymptotic variance of the JA test is greater than or equal to 1.

The proof of this corollary is in the Appendix to Section 3.

3.3. The case of X ⊥ Z. Although the situation with X ⊥ Z is excluded by Assumptions 2.1, it is a situation in which it should be easy to discriminate between the two models. So, when X is nearly perpendicular to Z, and the assumptions hold, one might expect high power from these test procedures. However, when X ⊥ Z, the JA-test has numerator 0. The denominators of both the JA-test and the modified Cox test, G_Z and G_X, are 0 in this situation as well (the statistics G̃_Z and G̃_X will have relatively small, but not 0, denominators). Fisher (1983) notes the failure of the JA-test in this situation. In fact, when X is perpendicular to Z, F G is the only one of these six statistics that does not necessarily have either the numerator or the denominator tending to 0.

To elaborate on this, we note that, when X ⊥ Z, ξ_i = 0 for i = 1, 2, ..., 5. So λ_i = 0 for i = 2, ..., 5. The proofs of asymptotic normality for JA_Z and

G_Z will fail in the case where X ⊥ Z, as noted in the proofs in the Appendix to Section 3. However, when X ⊥ Z, the J-test also fails. This observation about the J-test is made in a footnote of Davidson and MacKinnon (1981). From equation (3.1), we see that, in our way of calculating the J-test, it becomes ‖P_X Y‖/σ̂ when X ⊥ Z. It is easy to see that this cannot be asymptotically normal with mean 0 in this case, because the statistic is necessarily non-negative. One obvious possibility is to instead consider the model E(Y) = Vδ + Xβ + Zγ and to use conventional F-tests for β = 0 and γ = 0.

4. Relative Efficiency

We proceed to calculate the asymptotic relative efficiencies for J, F J, JA, F JA, G and F G. We define three measures of relative efficiency and then proceed to make comparisons. We conclude this section by giving an illustration of how to consistently estimate power in the general case.

4.1. Measures of Relative Efficiency. We first define the standard measure of relative efficiency for comparing one-sided tests. Suppose a statistic, say n^{1/2}S, is asymptotically standard normal under the null, and n^{1/2}(S − µ_a)/σ_a is asymptotically standard normal under the alternative. Then the nested asymptotic power of a size α test is 1 − Φ{(z_α − n^{1/2}µ_a)/σ_a}, where Φ(z_α) = 1 − α. Thus power is large provided µ_a is large and/or σ_a is small. Solving for the sample size that achieves power π*_n, we obtain n = (z_α − σ_a z_{π*_n})²/µ_a², provided z_α ≥ σ_a z_{π*_n}. Then define the asymptotic relative efficiency of, say, J to F G to be n_j/n_fg, the ratio of the sample sizes needed for the two size α tests to both achieve power π*_n. Then

e_S(J, F G) ≡ n_j/n_fg = {(z_α − σ_j z_{π*_n})/µ_j}² / {(z_α − σ_fg z_{π*_n})/µ_fg}²,   (4.1)

where µ_fg is the mean for F G under the alternative, etc. This measure of efficiency implies that J is superior to F G if e_S(J, F G) < 1. By (3.11) and (4.1), we see that relative efficiencies for any pair of tests will not depend on the length of the vector β. We may thus assume without loss of generality that ‖β‖ = 1 when making these calculations. In theory, this measure can be somewhat complicated since it may depend on many inputs. Simplifications are obtained if we let α = .5 or if π*_n → 1. In these instances

e_S(J, F G) ≈ e_HS(J, F G) = {µ_fg σ_j/(µ_j σ_fg)}²,   (4.2)

which is the Hájek and Šidák (1967, p. 267) measure of relative efficiency, which we will refer to as the HS criterion. On the other hand, if we let α → 0, or set π*_n = .5, we obtain

e_S(J, F G) ≈ e_B(J, F G) = (µ_fg/µ_j)²,   (4.3)

which is the Bahadur (1960) relative efficiency for comparing J to F G.

4.2. General conclusions about efficiencies. Using Proposition 3.2, we can order some of the statistics relative to these criteria. We begin with a comparison between JA and J under the condition π*_n ≥ 0.5. We note that

e_S(J, JA) = [{(z_α − z_{π*_n})/µ_j} / {(z_α − σ_ja z_{π*_n})/µ_ja}]².

Now, under the condition π*_n ≥ 0.50, z_{π*_n} ≤ 0, which implies e_S(J, JA) ≤ (µ_ja/µ_j)² = e_B(J, JA), since, by the Corollary to Proposition 3.2, σ_ja ≥ 1. Therefore, e_S(J, JA) ≤ λ_2²/(λ_1λ_3) ≤ 1, by Proposition 2.1. Hence J is superior to JA under the general criterion, as well as under both the HS and Bahadur criteria.

In the Appendix to Section 4, we establish that, according to the Bahadur criterion, F J ⪰ G ⪰ J ⪰ F G ⪰ F JA. So, according to the Bahadur criterion, F J is the theoretically optimal procedure to use, while F JA is theoretically worst. Considering only the three test statistics from the literature, we have established that G is preferable to J, which in turn is preferable to JA, using the Bahadur criterion. We shall see later that JA neither dominates nor is dominated by F G in this context.

Using HS efficiency, we have the following comparison between F G and J:

e_HS(J, F G) = (λ_1 + λ_2)² / {(λ_1 + λ_2)²(λ_1 − λ_2)/λ_1 + 4λ_2λ_3}.   (4.4)

In the Appendix to Section 4, we show that e_HS(J, F G) ≥ 1 and thus F G ⪰ J under HS. Interestingly, the reverse was true according to Bahadur.
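The three efficiency measures (4.1)-(4.3) are simple functions of the asymptotic means and standard deviations. A small sketch, using the paper's quantile convention Φ(z_β) = 1 − β (helper names are ours):

```python
from scipy.stats import norm

def e_bahadur(mu_a, mu_b):
    """Bahadur efficiency (4.3) of test A relative to test B: values below 1 favour A."""
    return (mu_b / mu_a) ** 2

def e_hs(mu_a, sig_a, mu_b, sig_b):
    """Hajek-Sidak efficiency (4.2): values below 1 favour A."""
    return (mu_b * sig_a / (mu_a * sig_b)) ** 2

def e_general(mu_a, sig_a, mu_b, sig_b, alpha, power):
    """General sample-size ratio (4.1)."""
    z_alpha, z_power = norm.ppf(1 - alpha), norm.ppf(1 - power)
    n_a = ((z_alpha - sig_a * z_power) / mu_a) ** 2
    n_b = ((z_alpha - sig_b * z_power) / mu_b) ** 2
    return n_a / n_b
```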

If we add the following assumptions

λ_3² = λ_2λ_4;   λ_4² = λ_3λ_5,   (4.5)

more conclusions under the HS criterion are possible. Equations (4.5) imply that λ_2λ_5 = λ_3λ_4. Note that equations (4.5) hold if the roots of M are equal, or if rank(M) = 1, or, more generally, if the non-zero roots of M are equal. So, these assumptions hold in the situations where either p = 1 or q = 1, as we see in the next two subsections. In the Appendix to Section 4, we show that e_HS(G, J) ≥ 1 and that e_HS(G, F J) ≤ 1. So, under the HS criterion and assumptions (4.5), we have the ordering F G ⪰ J ⪰ G ⪰ F J. We do not compare F JA here because, as we see in the succeeding subsections, no generalizations for F JA are possible. We do not compare JA to the others under HS because it is dominated by J.

4.3. Relative efficiencies; p = 1 case. With p = 1, results are particularly simple. Keep in mind that we are testing H_Z versus H_X, where the rank of X is one and the rank of Z is arbitrary. We note that M is now the squared multiple correlation between X and Z. Then, from (2.7), we have λ_i = C_XX β² M^{i−1}(1 − M) for i = 1, ..., 5. It follows that assumptions (4.5) hold. Proposition 3.2 yields

µ_j = µ_ja = C_XX^{1/2} β (1 − M)^{1/2}/σ,   µ_g = C_XX^{1/2} β (1 + M)(1 − M)^{1/2}/(2Mσ),
µ_fj = C_XX^{1/2} β (1 − M)^{1/2}/(Mσ),   µ_fja = C_XX^{1/2} β M (1 − M)^{1/2}/σ,
µ_fg = C_XX^{1/2} β (1 + M)(1 − M)^{1/2}/(2σ),
σ²_j = σ²_ja = 1,   σ²_g = (1 + M + 3M² − M³)/(4M³),   σ²_fj = (1 + M − M²)/M³,
σ²_fja = M(1 − 3M + 3M²),   σ²_fg = (1 + M − M² + 3M³)/4.   (4.6)

This implies that, when p = 1, J and JA will be asymptotically equivalent, as they are under local alternatives (McAleer, 1995). For the remainder of this subsection, then, we will only state results for J. We observe from (4.6) that, with small M, J and F G will be more powerful than F JA. In addition, (4.1) and (4.6) imply that changes in size will have the most dramatic effect on G and F J because of the presence of M in the denominators of their means. Note that the power of all

of the test statistics will be monotone and increasing in C_XX^{1/2}β for fixed M; more variability in the X values gives greater power. Also note that all relative efficiencies will depend on the model only through the value of M. We summarize our conclusions for the situations where α = 0.05 in Table 1. It is worth remembering that, as size decreases, e_S → e_B. In section 4.2, we saw that the Bahadur criterion gave F J ⪰ G ⪰ J ⪰ F G ⪰ F JA. In this special case, then, the only new conclusion is that JA ⪰ F G.

Table 1. Orderings for tests with p = 1.

M                | π*_n = 0.50                    | π*_n = 0.75
small (≈ 0.1)    | F J ⪰ G ⪰ J ⪰ JA ⪰ F G ⪰ F JA  | J ⪰ JA ⪰ G ⪰ F J ⪰ F G ⪰ F JA
moderate (≈ 0.5) | F J ⪰ G ⪰ J ⪰ JA ⪰ F G ⪰ F JA  | F J ⪰ G ⪰ J ⪰ JA ⪰ F G ⪰ F JA
large (≈ 0.9)    | F J ⪰ G ⪰ J ⪰ JA ⪰ F G ⪰ F JA  | F J ⪰ G ⪰ J ⪰ JA ⪰ F G ⪰ F JA

M                | π*_n = 0.95                    | π*_n = 1
small            | J ⪰ JA ⪰ F G ⪰ G ⪰ F J ⪰ F JA  | F G ⪰ J ⪰ JA ⪰ G ⪰ F J ⪰ F JA
moderate         | J ⪰ JA ⪰ G ⪰ F J ⪰ F G ⪰ F JA  | F G ⪰ J ⪰ JA ⪰ G ⪰ F J
large            | J ⪰ JA ⪰ G ⪰ F J ⪰ F G ⪰ F JA  | F JA ⪰ F G ⪰ J ⪰ JA ⪰ G ⪰ F J

When π*_n = 1, we obtain the conclusions summarized in the lower right-hand box of Table 1. In the previous subsection, we showed that F G ⪰ J ⪰ G ⪰ F J according to this measure, i.e., the HS criterion. We now see that F JA is most efficient for large values of M, while it is least efficient for small values. This is why we were unable to make general conclusions about F JA for the HS criterion. Its place for moderate values of M depends on M. We view these results as theoretically interesting, but of limited value in practice, since π*_n must be arbitrarily close to one for them to hold.

In Table 1 we also summarize the orderings for the situations when π*_n = 0.75 and π*_n = 0.95, with α = .05. When seeking larger power, J tends to be best, while F J may be preferable when seeking moderate power. The orderings are almost identical to the Bahadur case with π*_n = 0.50. Conclusions are practically identical for other values of α. The ordering tends towards that for the HS criterion as π*_n approaches one. The tests are practically equivalent with M > 0.6. The situation changes slightly with α = 0.10; F JA is technically most efficient, though by a practically insignificant margin, especially for larger values of M. For values of π*_n that are as large as 0.999, the orderings tend to be more like those for π*_n = 0.50, 0.75, or 0.95 than for π*_n = 1, so for realistic power and based on Table 1, it would appear that F J, G, J and JA are preferable to F G and F JA. The test F J is preferable for moderate power and J (equivalently JA) is preferable for higher power. The test G ranks either second or third.

We recall from (3.9) that F G is the average of J and F JA, and G is the average of F J and JA. It is thus not surprising that, when G is more efficient than F J, it tends to be less efficient than JA, and vice-versa, and analogously for F G, J, and F JA, as can be seen by consideration of Table 1.

4.4. Relative efficiencies; q = 1 case. Here, the rank of X is arbitrary and the rank of Z is 1. For this situation, we define r = RR^T = C_ZX C_XX^{-1} C_XZ/C_ZZ, the squared multiple correlation between Z and X. Also, define c = ‖β‖² and η = ρ^T β/c^{1/2}, where ρ^T ≡ C_ZX C_XX^{-1/2}/C_ZZ^{1/2} ≡ (ρ_1, ..., ρ_p), the vector of correlations between Z and X. Note that r = Σ_{i=1}^p ρ_i². As a consequence of these definitions and Assumptions 2.1, 0 < η² ≤ r < 1. Also note that, if p = 1, then η² = r. We use the assumptions C_XX = I and C_ZZ = 1, without loss of generality. For example, in setting up the models (2.1) and (2.2), we could simply have reparameterized X and Z appropriately, utilizing Gram-Schmidt, without altering the basic structure of the models. It then follows from (2.7) that

λ_1 = c(1 − η²),  λ_2 = cη²(1 − r),  λ_3 = cη²r(1 − r),  λ_4 = cη²r²(1 − r),  λ_5 = cη²r³(1 − r).

It follows directly that assumptions (4.5) hold. Additionally, Proposition 3.2 implies

µ_j = {c(1 − η²)}^{1/2}/σ,   µ_fj = c^{1/2}(1 − η²)/[σ{η²r(1 − r)}^{1/2}],
µ_ja = {cη²(1 − r)}^{1/2}/(σr^{1/2}),   µ_fja = c^{1/2}η²(1 − r)/[σ(1 − η²)^{1/2}],
µ_g = c^{1/2}(1 − rη²)/[2ση{r(1 − r)}^{1/2}],   µ_fg = c^{1/2}(1 − rη²)/[2σ(1 − η²)^{1/2}],
σ²_j = 1,   σ²_ja = 1,
σ²_g = (1 + 2rη² − 4rη⁴ + r²η⁴)/{4rη⁴(1 − r)},
σ²_fj = (1 + 2rη² + rη⁴ − 4η⁴)/{rη⁴(1 − r)},
σ²_fja = η²(1 − r){1 − η²(5 − 7r + 4r²) + η⁴(3 − 5r + 3r²)}/(1 − η²)³,
σ²_fg = (1 − rη²)²(1 − 2η² + rη²)/{4(1 − η²)³} + rη⁴(1 − r)²/(1 − η²)².   (4.7)
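For power approximations of the kind used in section 4.5 below, the nested power formula of section 4.1 can be evaluated directly once consistent estimates of the relevant λ's and of σ are plugged into Proposition 3.2. A brief sketch (the further adjustment from nested to non-nested power, roughly subtracting α, is discussed in section 1):

```python
from scipy.stats import norm

def nested_power(mu_a, sig_a, n, alpha=0.05):
    """Approximate nested power 1 - Phi{(z_alpha - n^{1/2} mu_a)/sigma_a} of
    section 4.1, given the asymptotic mean mu_a and standard deviation sig_a
    of a statistic under the true alternative."""
    z_alpha = norm.ppf(1 - alpha)
    return 1 - norm.cdf((z_alpha - n ** 0.5 * mu_a) / sig_a)

# Example: for the J-test, Proposition 3.2 gives mu_a = lam1 ** 0.5 / sigma
# and sig_a = 1, with lam1 and sigma replaced by consistent estimates.
```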

Recall that the efficiencies of the tests will not depend on c, but we have included it for potential use in power calculations. We must keep in mind that β here is the reparameterized version. Note that when p = 1, these formulae reduce to those in (4.6), as they should.

We have already established that J is asymptotically at least as efficient as JA according to the general criterion. When η² < r, J dominates JA; they are equally efficient only when η² = r. This illustrates how restrictive the assumption of local alternatives is (cf. Pesaran, 1987) because, under local alternatives, J is asymptotically equivalent to JA (cf. McAleer, 1995). Clearly, if η² ≈ 0, JA will have π_n ≈ α. The test F JA will also suffer if η² ≈ 0. We saw earlier that the Bahadur criterion's ordering of the procedures was F J ⪰ G ⪰ J ⪰ F G ⪰ F JA and J ⪰ JA ⪰ F JA (cf. section 4.2). Now, we can see that neither F G nor JA dominates the other according to the Bahadur criterion since, with r ≈ 1, F G ⪰ JA, while with η² = r, JA ⪰ F G. As was the case when p = 1, the HS measure of relative efficiency does not imply that F JA is superior or inferior to either J or F G everywhere. So, either F G or F JA will be superior when the power is 1. We present a summary of the conclusions we draw for the situation when α = 0.05 and π*_n = 0.75 and 0.95 in Table 2. We note that the conclusions when α = 0.10 are virtually identical to those in Table 2.

Table 2. Orderings for tests with q = 1.

r        | η²       | π*_n = 0.75                   | π*_n = 0.95
small    | small    | J ⪰ G ⪰ F J ⪰ F G ⪰ F JA      | J ⪰ F G ⪰ G ⪰ F J ⪰ F JA
moderate | small    | J ⪰ F J ⪰ G ⪰ F G ⪰ F JA      | J ⪰ F G ⪰ F J ⪰ G ⪰ F JA
moderate | moderate | F J ⪰ G ⪰ J ⪰ F G ⪰ F JA      | J ⪰ G ⪰ F J ⪰ F G ⪰ F JA
large    | small    | J ⪰ F J ⪰ G ⪰ F G ⪰ F JA      | J ⪰ F G ⪰ F J ⪰ G ⪰ F JA
large    | moderate | F J ⪰ G ⪰ J ⪰ F G ⪰ F JA      | J ⪰ F J ⪰ G ⪰ F G ⪰ F JA
large    | large    | F J ⪰ G ⪰ J ⪰ F G ⪰ F JA      | J ⪰ F J ⪰ G ⪰ F G ⪰ F JA

The conclusions in Table 2 are similar to the analogous ones in Table 1 in that they imply that, in general, either F J or J is superior while F JA is inferior.

4.5. An example: flow rate. We illustrate how our formulae can be used to make theoretical power calculations. Consider modelling flow rate from watersheds following storm episodes, as in Rawlings (1988). Though the sample size is 30, which is not particularly large, we use it nonetheless for illustrative purposes. The response variable is the log of the

peak flow rate, Y. Other variables are the average slope of the watershed in percent, s; longest stream flow in the watershed in thousands of feet, f; infiltration rate of water into soil in inches per hour, i; watershed area in square miles, a; estimated soil storage capacity in inches of water, c; rainfall in inches, r; time period during which rainfall exceeded 0.25 inches per hour, t; and surface absorbancy index, where 0 represents complete absorbancy and 100 represents no absorbancy, b. The model obtained through a forward stepwise selection procedure was

H_X: Y = δ_0 + δ_1 s + β_1 f + β_2 i + ε_X.

Rawlings (1988, p. 519) mentions that Mallows' C_p selects the model

H_Z: Y = δ_0 + δ_1 s + γ_1 a + γ_2 c + γ_3 r + γ_4 t + γ_5 b + ε_Z.

Comparing these models is of interest. Here, then, p = 2, q = 5 and o = 2. We projected the matrices X and Z onto V^⊥ and assumed that C_XX = X^T X/n, etc. Since β̂ and γ̂ are consistent estimates of β and γ under their respective hypotheses, and σ̂ = min(s_x, s_z) is a consistent estimate of σ, we were then able to calculate the following statistics:

λ̂_1 = 0.144,  λ̂_2 = ,  λ̂_3 = 0.058,  λ̂_4 = 0.037,  λ̂_5 = 0.027;
λ̂_1Z = ,  λ̂_2Z = 0.046,  λ̂_3Z = 0.04,  λ̂_4Z = 0.026,  λ̂_5Z = .

Note that assumptions (4.5) do not hold. We have s_X = 0.67 and s_Z = , so σ̂ = . Using Proposition 3.2, we obtained the following approximations to power, say π*_n, for .05 level tests assuming model X is true, namely

π̂*_n(J) = ,  π̂*_n(JA) = ,  π̂*_n(G) = 0.941,  π̂*_n(F J) = ,  π̂*_n(F JA) = ,  π̂*_n(F G) = .   (4.8)

Similarly, approximations to π*_n under model Z are:

π̂*_n(J) = ,  π̂*_n(JA) = ,  π̂*_n(G) = 0.956,  π̂*_n(F J) = ,  π̂*_n(F JA) = ,  π̂*_n(F G) = .   (4.9)

Thus J is most powerful, while JA and F JA are evidently considerably less powerful. These power estimates are for the nested power; the

actual power for this scenario will be about 5% less than the π̂*_n's in (4.8) and (4.9); cf. section 1. For these data, we obtained the following test statistics:

J_X = 2.91,  J_Z = 4.26,  JA_X = 1.01,  JA_Z = 1.68,  G_X = 0.20,  G_Z = 2.29,
F J_X = 5.04,  F J_Z = 8.40,  F JA_X = 0.59,  F JA_Z = 0.85,  F G_X = 1.16,  F G_Z = .

The multiple correlation between the models X and Z was 0.98, R²_X was approximately 86% and R²_Z was approximately 83%. The tests JA, F G and G lead us to the conclusion that model X is preferable to model Z. The tests J and F J reject both models at the 1% level of significance, while F JA does not reject either model. These results are not surprising, as the power calculations indicated that the tests G and F G would probably be decisive, while F JA would tend to be indecisive. The J-test may reject too often in practice (cf. McAleer, 1995), and we conjecture that F J will likewise have larger than nominal size. Therefore, as some preliminary work has been done to find finite-sample corrections for the J-test (cf. Royston and Thompson, 1995) and others have used the bootstrap as a means of correcting the size (Godfrey, 1998), we believe it would be worthwhile to thoroughly investigate such corrections for both J and F J. In the meantime, with relatively small sample sizes, practitioners may prefer to use the more robust method presented in Victoria-Feser (1997) or the less powerful alternatives G, F G, and JA.

5. Concluding Remarks

We have considered non-nested hypothesis testing procedures in the linear regression setting. We demonstrated that the asymptotic means of the most commonly-used methods (cf. McAleer, 1995) are positive and thus the procedures should be treated as one-sided tests when one is comparing two specified models. Furthermore, we presented a way to estimate the power of non-nested model selection tests when the true parameter vector and variance can be consistently estimated. In addition, we have given a method for theoretically judging which procedure is the most powerful by using measures of asymptotic relative efficiency. Interestingly, the asymptotic relative efficiencies did not depend upon the norm of the parameter vector. For the cases of p = 1 or q = 1, we saw that the efficiency depends only upon how close the design matrices are and how close the mean vector is to the alternative design's column space. This is consistent with the results obtained in the local alternatives case (Pesaran, 1987).

In the general case, the relative efficiencies will involve the canonical correlation matrix M and the vector β. We saw that J is theoretically preferable to JA in general. For the Bahadur (1960) measure of asymptotic relative efficiency (cf. (4.3)), we were able to conclude that F J ⪰ G ⪰ J ⪰ F G ⪰ F JA. For the Hájek and Šidák (1967, p. 267) measure, we saw that F G ⪰ J ⪰ G ⪰ F J, under assumptions (4.5). We remind the reader that these asymptotic results do not take into account the possibility of incorrect size. Therefore, if one does not have a large sample, the most powerful test, as determined by use of this methodology, may not be the best test to use (McAleer, 1995).

Acknowledgements. We are indebted to an anonymous referee for helpful suggestions. We also thank Ron Christensen for his numerous useful comments on earlier versions of this paper.

A. Appendix to Section 2

Under the conditions necessary for the Lindeberg-Feller Central Limit Theorem (cf. Loève, 1977), we have the following distributional result

n^{1/2} {(W_X, W_Z) − (µ_x, µ_z)} → N(0, σ²_x Σ),   Σ = [[I, R^T], [R, I]],   as n → ∞,   (A.1)

under (2.1). Practically speaking, the conditions sufficient for (A.1) to hold are that the matrices C_XX and C_ZZ are invertible and finite, i.e., the matrices X^T X and Z^T Z are all of order n and of full rank, and the error term ε is either bounded or derives from a distribution with at least 2 moments and sufficiently small tails.

To prove claim (a) of Proposition 2.1, we recall that the characteristic roots of M, which we call m_i for i = 1, ..., p here, are between 0 and 1. Let M_i be the eigenvectors of M and d_i = M_i^T µ_x. Then λ_j = Σ_{i=1}^p m_i^{j−1}(1 − m_i) d_i². Letting w_i = (1 − m_i) d_i²/λ_1, we have that the w_i define a probability measure. So λ_2² ≤ λ_1λ_3 if and only if

(Σ_{i=1}^p w_i m_i)² ≤ Σ_{i=1}^p w_i m_i²,

which is obviously true. Claims (b) and (c) follow similarly. Claim (d) follows from claims (a) and (b) and the fact that λ_i ≥ 0.

B. Appendix to Section 3

The calculational and theoretical forms for J_Z (cf. (3.1)) are

J_Z = (‖P_X Y‖² − Y^T P_X P_Z Y) / {σ̂ (‖P_X Y‖² − ‖P_Z P_X Y‖²)^{1/2}} = n^{1/2} (‖W_X‖² − W_X^T A^T W_Z) / {σ̂ (‖W_X‖² − ‖A W_X‖²)^{1/2}}.

The theoretical form of JA_Z (cf. (3.2)) is

JA_Z = n^{1/2} (W_X^T A^T W_Z − ‖A^T W_Z‖²) / {σ̂ (‖A^T W_Z‖² − ‖A A^T W_Z‖²)^{1/2}}.

Similarly, the theoretical form of G_Z (cf. (3.3)) is

G_Z = n^{1/2} (‖W_X‖² − ‖A^T W_Z‖²) / {4σ̂² (‖A^T W_Z‖² − ‖A A^T W_Z‖²)}^{1/2}.

For F G_Z, F J_Z, and F JA_Z (cf. (3.6), (3.7), (3.8)), the theoretical forms are

F G_Z = n^{1/2} (‖W_X‖² − ‖A^T W_Z‖²) / {2σ̂ (‖W_X‖² − ‖A W_X‖²)^{1/2}};

F J_Z = n^{1/2} (‖W_X‖² − W_X^T A^T W_Z) / {σ̂ (‖A^T W_Z‖² − ‖A A^T W_Z‖²)^{1/2}};

and

F JA_Z = n^{1/2} (W_X^T A^T W_Z − ‖A^T W_Z‖²) / {σ̂ (‖W_X‖² − ‖A W_X‖²)^{1/2}}.

We now prove Proposition 3.2. First, note that the numerator of each of J_Z, JA_Z, G_Z, F J_Z, F JA_Z, and F G_Z may be written as a linear combination of the three statistics ‖W_X‖², W_Z^T A W_X, and ‖A^T W_Z‖². Let b_1, b_2, b_3 represent the coefficients associated with the three statistics, respectively. That is, for J_Z, b_1 = 1, b_2 = −1, and b_3 = 0. Let b_4 be an indicator variable for using τ̂_JAZ in the denominator of the test statistic. This leads us to the realization that, when calculated under H_Z, the numerators of all of our test statistics may be written in the form

u_n ≡ b_1 ‖W_X‖² + b_2 W_Z^T A W_X + b_3 ‖A^T W_Z‖²,

where b_1, b_2, b_3 ∈ {−1, 0, 1}. Under H_X, the numerators of all of our test statistics satisfy u_n →p b_1 ξ_0 + b_2 ξ_1 + b_3 ξ_2. Therefore, each statistic may be written as n^{1/2} S_n, where

S_n = u_n / [c {(1 − b_4) τ̂_JZ + b_4 τ̂_JAZ}],

where c = 2 if we are using G or F G and c = 1 otherwise, and b_4 ∈ {0, 1}.

21 linear model selection tests 129 We can now obtain the partial derivatives of S n with respect to W X and W Z. S n ( u n / W X = W X c {(1 b 4 ˆτ JZ + b 4ˆτ JAZ } + u { 1 n ˆτ JZ ˆτ 1 } JAZ (1 b 4 + b 4 c W X W X S n ( u n / W Z = W Z c {(1 b 4 ˆτ JZ + b 4ˆτ JAZ } + u { 1 n ˆτ JZ ˆτ 1 } JAZ (1 b 4 + b 4 c W Z W Z (B.1 We note that, as alluded to in section., when X Z, these derivatives are infinite when b 4 = 1, because ˆτ JAZ = 0. So, these results do not hold for JA Z, G Z and F J Z, if X Z. Recalling the notational definitions in equations (2.4, (2.5, (2.6 and (2.7, we note that µ z = Rµ x. Using all of these, we obtain: u n W X = 2b 1 W X + b 2 A T W Z p 2b1 µ x + b 2 R T µ z = 2b 1 µ x + b 2 R T Rµ x ; u n W Z = b 2 AW X + 2b AA T W Z p b 2 µ z + 2b RR T µ z = b 2 Rµ x + 2b RR T Rµ x ; (B.2 where the convergences are under H X. We now obtain the asymptotic norms: u n 2 W 4b 2 1ξ 0 + b 2 2ξ 2 + 4b 1 b 2 ξ 1 UX. X u n 2 W b 2 2ξ 1 + 4b 2 ξ + 4b 2 b ξ 2 UZ. Z ( T ( un A T un 2b 1 b 2 ξ 1 + 4b 1 b ξ 2 + b 2 W X W 2ξ 2 + 2b 2 b ξ UXZ. Z (B. From (.4, we derive the derivatives of ˆτ JZ 1 1 and ˆτ JAZ : ˆτ (W JZ 1 = ˆσ X A T AW X W X ˆτ JZ ˆτ 1 JZ W Z = 0 ˆτ 1 JAZ W X = 0 ˆτ 1 JAZ W Z (AA = ˆσ T W Z AA T AA T W Z. ˆτ JAZ (B.4 Now, noting that b 4 is an indicator, we calculate the partial derivatives

22 10 mitchell r. watnik and wesley o. johnson of S n using (B.1-(B.4: S n W X = S n W Z = ( u n / W X c {(1 b 4 ˆτ JZ + b 4ˆτ JAZ } (1 b } 4u nˆσ {(I cˆτ JZ A T AW X ( u n / W Z c {(1 b 4 ˆτ JZ + b 4ˆτ JAZ } b 4u nˆσ (AA T AA T AA T W Z. cˆτ JAZ (B.5 From (2.7, (.5, and (B.1-(B.5, we see that S n 2 UX W X c 2 σx 2 {(1 b 4 + b 4 λ } { } { } (1 b4 (b 1 ξ 0 + b 2 ξ 1 + b ξ 2 (b1 ξ 0 + b 2 ξ 1 + b ξ 2 ( λ 2 + c 2 σxλ { } ( 2(1 b4 (b 1 ξ 0 + b 2 ξ 1 + b ξ 2 2b1 + b 2 λ 2 c 2 σxλ 2. 1 S n 2 UZ W Z c 2 σx 2 {(1 b 4 + b 4 λ } { } [{ } ] b4 (b 1 ξ 0 + b 2 ξ 1 + b ξ 2 b1 ξ 0 + b 2 ξ 1 + b ξ 2 + c 2 σxλ 2 2 (λ 4 λ 5 λ { } ( 2b4 (b 1 ξ 0 + b 2 ξ 1 + b ξ 2 b2 λ + 2b λ 4 c 2 σxλ 2. λ ( T ( Sn A T Sn UXZ W X W Z c 2 σx 2 {(1 b 4 + b 4 λ } { (1 b4 (b 1 ξ 0 + b 2 ξ 1 + b ξ 2 c 2 σxλ { b4 (b 1 ξ 0 + b 2 ξ 1 + b ξ 2 c 2 σxλ 2 2 } (b 2 λ 2 + 2b λ } (2b 1 λ + b 2 λ 4. Using the delta method, the asymptotic variance of n 1/2 S n (2.1 is ( [ S W X µ ] T, { = σx 2 W X [ ] T ( S W Z µ σx 2 I R T ( S W X µ R I S W Z µ S 2 + S 2 ( S T ( S W + 2 R T Z W X W Z } (B.6 under

23 linear model selection tests 11 = U X + U Z + 2U XZ c 2 {(1 b 4 + b 4 λ } { } { } (1 b4 (b 1 ξ 0 + b 2 ξ 1 + b ξ 2 (b1 ξ 0 + b 2 ξ 1 + b ξ 2 ( λ 2 + c 2 λ 2 1 { } ( (1 b4 (b 1 ξ 0 + b 2 ξ 1 + b ξ 2 4b1 + 4b 2 λ 2 + 4b λ c 2 { } [{ } ] b4 (b 1 ξ 0 + b 2 ξ 1 + b ξ 2 b1 ξ 0 + b 2 ξ 1 + b ξ 2 + c 2 λ 2 (λ 4 λ 5 λ { } { } b4 (b 1 ξ 0 + b 2 ξ 1 + b ξ 2 (4b1 + 2b 2 λ + (2b 2 + 4b λ 4 c 2. λ λ (B.7 For J Z, we have b 1 = 1, b 2 = 1, b = 0, b 4 = 0, c = 1. Therefore, according to (B.7, the asymptotic variance of J Z under H X is: { } (4ξ 0 + ξ 2 4ξ 1 + (ξ 1 + 2(2ξ 1 + ξ 2 (ξ0 ξ 1 ( λ 2 + (ξ 0 ξ 1 λ 1 { } ( (ξ0 ξ 1 4λ1 4λ 2 = 4 λ 2 + (λ ( 1 λ 2 4λ1 4λ 2 = 1. (B.8 For JA Z, we have b 1 = 0, b 2 = 1, b = 1, b 4 = 1, c = 1. Therefore, according to (B.7, the asymptotic variance of JA Z under H X is: ( ( (ξ 2 + (ξ 1 + 4ξ 4ξ 2 + 2(ξ 2 2ξ ξ1 ξ 2 ξ1 ξ 2 + λ λ 2 (λ 4 λ 5 λ ( ( ξ1 ξ 2 2λ 2λ 4 λ λ = ξ ( 1 ξ 2 λ 2 ( ( + 2 2λ2 λ λ 4 λ λ (λ 4 λ 5 λ λ = λ {( ( } 2 λ2 λ4 λ λ 2 (λ 4 λ λ (B.9 For G Z, we have b 1 = 1, b 2 = 0, b = 1, b 4 = 1, c = 2. Therefore,

24 12 mitchell r. watnik and wesley o. johnson according to (B.7, the asymptotic variance of G Z under H X is: ( ( (4ξ 0 + (4ξ + 2(4ξ 2 ξ0 ξ 2 ξ0 ξ 2 + 4λ 4λ 2 λ ( ( ξ0 ξ 2 4λ 4λ 4 4λ λ = λ ( ( 1 + λ 2 λ λ1 + λ 2 λ1 + λ 2 + λ 4λ 2 λ ( ( λ1 + λ 2 4λ 4λ 4 4λ λ ( { ( λ1 + λ 2 λ1 + λ 2 = λ 4λ 2 (λ 4 λ 5 (λ 4 λ 5 (λ 4 λ 5 ( {( } λ1 + λ 2 λ1 + λ 2 = 1 + λ 2 (λ 4 λ 5 + λ 4. 4λ ( } λ λ 4 λ (B.10 For F J Z, we have b 1 = 1, b 2 = 1, b = 0, b 4 = 1, c = 1. Therefore, according to (B.7, the asymptotic variance of F J Z under H X is: ( ( (4ξ 0 + ξ 2 4ξ 1 + (ξ 1 + 2(2ξ 1 + ξ 2 ξ0 ξ 1 ξ0 ξ 1 + λ λ 2 (λ 4 λ 5 λ ( ( ξ0 ξ 1 2λ 2λ 4 λ λ = 4λ ( 1 λ 2 λ 2 ( ( + 1 2λ1 λ λ 4 λ λ (λ 4 λ 5 λ λ = 2λ ( 1 λ 2 λ λ λ (λ 4 λ 5 + 2λ 4. λ (B.11 For F JA Z, we have b 1 = 0, b 2 = 1, b = 1, b 4 = 0, c = 1. Therefore, according to (B.7, the asymptotic variance of F JA Z under H X is: ( { } (ξ 2 + (ξ 1 + 4ξ 4ξ 2 + 2(ξ 2 2ξ ξ1 ξ 2 (ξ1 ξ 2 ( λ 2 + λ 2 1 ( ( ξ1 ξ 2 4λ2 4λ = λ ( { } ( ( 2 λ2 λ2 ( λ 2 λ2 4λ2 4λ + λ 2. 1 (B.12 We note that, when X Z, as is considered in section., this asymptotic variance is 0.

25 linear model selection tests 1 For F G Z, we have b 1 = 1, b 2 = 0, b = 1, b 4 = 0, c = 2. Therefore, according to (B.7, the asymptotic variance of F G Z under H X is: ( { } (4ξ 0 + (4ξ + 2(4ξ 2 ξ0 ξ 2 (ξ0 ξ 2 ( λ λ 2 1 ( ( ξ0 ξ 2 4λ1 4λ 4 = 4λ ( { } 1 + 4λ 2 4λ λ1 + λ 2 (λ1 + λ 2 ( λ λ 2 1 ( ( λ1 + λ 2 4λ1 4λ 4 ( { } ( λ1 + λ 2 (λ1 + λ 2 ( λ 2 λ2 λ = +. 4λ 2 1 The proof of the Corollary to Proposition.2 follows. V ar(ja Z = λ { λ 4 + λ } 2(λ 4 λ 5 λ λ λ 2. This is greater than or equal to 1 if and only if λ λ 4 + λ 2(λ 4 λ 5 λ λ 2 λ λ λ 4 λ λ λ 2 λ 2λ 4 λ 2 The last statement is true by Proposition C. Appendix to Section 4 To show the Bahadur ordering, we have +λ 2 2 ( 2σ/2 λ1 + λ 2 2 e B (F J, G = e B (J, F G = = < 1. 2λ σ/2 1 e B (F G, F JA = λ 2 σ/2 +λ 2 2σ/2 2 ( 2 2λ2 = < 1. + λ 2 (B.1

26 14 mitchell r. watnik and wesley o. johnson Now, we compare J and G: e B (F J, J = e B (G, F G = λ < 1. e B (G, J = 4 { } λ1 λ ( + λ 2 2. (C.1 { } To show that e B (G, J < 1, we note that 4 λ1 λ 2 { } ( +λ λ1 λ ( +λ 2 < 1 because λ 2 < λ 2. But, { } λ1 λ 2 4 ( + λ ( λ 2 ( + λ implies 0 λ λ λ 2 = ( λ 2 2, Now, we show e HS (J, F G 1. Beginning with (4.4, we have e HS (J, F G = /2 1 +λ 2 2/ { ( λ1 +λ 2 2 (λ1 λ = + λ 2 { } (λ1 +λ 2 2 ( λ (λ 2 λ = = ( + λ 2 2 { } (λ1 +λ 2 2 ( λ (λ 2 λ 1 { } 1 ( λ λ2 λ ( +λ 2 2 This is greater than, or equal to, 1 if and only if 1 (λ { } 1 λ 2 λ2 λ + 4 ( + λ 2 2 λ { } 2 λ2 λ 4 ( + λ { λ1 λ ( + λ ( } λ2 λ But the right-hand side is equivalent to e B (G, J (see equation 4.A.1. So, this completes the proof. }. 2

27 linear model selection tests 15 We now show e HS (G, J 1. For HS efficiency, we consider the relative efficiencies between G and J, and between G and F J. e HS (G, J = [( { } ] λ λ1 +λ 2 1 λ λ (+λ 2 (λ 4 λ 5 4λ 1 ( + λ 2 2 /4λ = 4 [ ( + λ 2 This is greater than, or equal to, 1, if and only if { } ] λ 4 + (+λ 2 (λ 4 λ 5 4λ λ 2 ( + λ 2 2. λ 4 [( + λ 2 {4λ λ 4 + ( +λ 2 (λ 4 λ 5 } 4λ 4( +λ 2 2 λ 2 ] [( + λ 2 (4λ λ 4 + λ 4 + λ 2 λ 4 λ 5 λ 2 λ 5 4λ The left-hand side is equivalent to (λ λ 2 + λ 2 2λ 2. (4 λ λ 4 +4λ 2 λ λ 4 +λ 2 1λ 4 +2 λ 2 λ 4 +λ 2 2λ 4 λ 2 1λ 5 2 λ 2 λ 5 λ 2 2λ 5 4λ ( = 2 λ λ 4 +λ 2 1λ 4 +2 λ 2 λ 4 +λ 2 2λ 4 λ 2 1λ 5 λ 2 2λ 5, ] (C.2 where the underscores indicate use of assumptions (4.5. We will use underscores in the following to indicate use of (2.9 and assumptions (4.5. Now, we continue with (C.2: 2λ 2 1λ λ 4 +λ 1λ 4 +2λ 2 1λ 2 λ 4 + λ 2 2λ 4 λ 1λ 5 λ 2 2λ 5 (λ λ 2 + λ 2 2λ 2 2λ 2 1λ λ 4 + λ 1λ 4 + 2λ 2 1λ 2 λ 4 + λ 2 2λ 4 λ 1λ 5 + λ 2 2λ 5 + λ 2 1λ λ 2 λ 2 + λ 2 2λ 2 2λ 2 1λ λ 4 +λ 1λ 4 +2λ 2 1λ 2 λ 4 λ 1λ 5 + λ 2 2λ 5 +λ 2 1λ 2 + λ 2 λ 2 +λ 2 2λ 2 2λ 2 1λ λ 4 +2λ 2 1λ 2 λ 4 λ 2 2λ 5 +λ 2 1λ 2 λ 4 + λ 2 λ 2 +λ 2 2λ 2 2λ 2 1λ λ 4 +λ 2 1λ 2 λ 4 λ 2 λ λ 4 + λ 2 λ 2 +λ 2 2λ 2 λ 2 1λ λ 4 +λ 2 1λ 2 λ 4 λ 2 2λ 4 +λ 2 2λ 2 λ 2 1λ λ 4 λ 2 λ λ 4, which is a true statement.

28 16 mitchell r. watnik and wesley o. johnson Next, we compare G to F J according to the Hájek and Šidák criterion: [( { } ] λ 2 1 λ1 +λ 2 λ λ λ e HS (G, F J = (+λ 2 (λ 4 λ 5 4λ 1 { (λ1 +λ 2 2 /4λ 2 } { ( ( } 2 2 λ 2 +2λ λ1 4 λ + λ1 λ (λ4 λ 5 λ 2 [ 1 (λ1 + λ 2 {4λ λ 4 + ( + λ 2 (λ 4 λ 5 } 4λ = ] ( ( 2 }. λ 2 ( + λ 2 {2λ 2 1 λ 2 + 2λ λ1 4 λ + λ1 λ (λ4 λ 5 This is less than, or equal to, 1, if and only if λ 2 1 [ ] ( + λ 2 {4λ λ 4 + ( + λ 2 (λ 4 λ 5 } 4λ { } ( + λ λ 2 λ 2 λ λ λ 4 + λ 2 1(λ 4 λ 5. (C. The left-hand side is equivalent to: λ 2 1 ( + λ 2 {4λ λ 4 + ( + λ 2 (λ 4 λ 5 } 4λ 2 1λ = 4λ 1λ λ 4 + 4λ 2 1λ 2 λ λ 4 + (λ 4 1λ 4 + 2λ 1λ 2 λ 4 + λ 2 1λ 2 2λ 4 (λ 4 1λ 5 + 2λ 1λ 2 λ 5 + λ 2 1λ 2 2λ 5 4λ 2 1λ = 4λ 1λ λ 4 + λ 4 1λ 4 + 2λ 1λ 2 λ 4 + λ 2 1λ 2 2λ 4 λ 4 1λ 5 2λ 1λ 2 λ 5 λ 2 1λ 2 2λ 5 = 2λ 1λ λ 4 + λ 4 1λ 4 + 2λ 1λ 2 λ 4 + λ 2 1λ 2 2λ 4 λ 4 1λ 5 λ 2 1λ 2 2λ 5, where the underscores above and below indicate the use of (4.5. The right-hand side of (C. is equivalent to: (λ λ 2 + λ 2 2 (2 λ 2 λ 2 λ λ λ 4 + λ 2 1λ 4 λ 2 1λ 5 = (λ 4 1λ 4 λ 4 1λ 5 + 2λ 1λ 2 + 2λ 1λ λ 4 + 2λ 1λ 2 λ 4 2λ 1λ 2 λ 5 + (λ 2 1λ 2 λ 2 + 4λ 2 1λ 2 λ λ 4 + λ 2 1λ 2 2λ 4 λ 2 1λ 2 2λ 5 4 λ 2 2λ λ 2 2λ λ 4 λ 2λ 2 = (λ 4 1λ 4 λ 4 1λ 5 + 2λ 1λ 2 + 2λ 1λ 2 λ 4 + (λ 2 1λ 2 λ 2 + λ 2 1λ 2 λ λ 4 + λ 2 1λ 2 2λ 4 4 λ 2 2λ λ 2 2λ λ 4 λ 2λ 2. So, e HS (G, F J 1 if and only if (transformations using (2.9 or (4.5 are indicated beforehand using underscores 2λ 1λ λ 4 + λ 4 1λ 4 + 2λ 1λ 2 λ 4 + λ 2 1λ 2 2λ 4 λ 4 1λ 5 λ 2 1λ 2 2λ 5

29 linear model selection tests 17 λ 4 1λ 4 λ 4 1λ 5 + 4λ 1λ 2 + 2λ 2 1λ 2 2λ 4 +λ 2 1λ 2 λ λ 4 4 λ 2 2λ λ 2 2λ λ 4 λ 2λ 2 2λ 1λ λ 4 λ 2 1λ 2 2λ 5 + 2λ 1λ 2 + λ 2 1λ 2 2λ 4 +λ 2 1λ 2 λ λ 4 4 λ 2 2λ λ 2 2λ λ 4 λ 2λ 2 4 λ 2 2λ 2 + λ 2λ 2 λ 2 1λ 2 2λ 5 + λ 2 1λ 2 2λ 4 + λ 2 1λ 2 λ λ λ 2 2λ λ 4 λ 2 2λ 2 + λ 2λ 2 λ 2 1λ 2 2λ 5 + λ 2 1λ 2 2λ λ 2 2λ λ 4 λ 2 2λ 2 + λ 2λ 2 λ 2 1λ 2 2λ 5 + λ 2 1λ 2 2λ 4 λ 2λ 2 λ 2 1λ 2 2λ 4 λ 4 2λ 4 λ 2 1λ 2 2λ 4, which is a true statement. References Bahadur, R.R. (1960. Stochastic Comparison of Tests. Ann. Math. Statist. 1, Christensen, R. (1987. Plane Answers to Complex Questions: The Theory of Linear Models. New York: Springer-Verlag. Cox, D.R. (1961. Tests of separate families of hypotheses. Proceedings of the 4th Berkeley Symposium on Mathematics, Probability and Statistics, Vol. 1, Berkeley: University of California Press. Cox, D.R. (1962. Further results on tests of separate families of hypotheses. J. Roy. Stat. Soc., Series B, 24, Cox, D.R. and Hinkley, D.V. (1974. Theoretical Statistics. New York: Wiley. Dastoor, N.K. and McAleer, M. (1989. Some power comparisons of joint and paired tests for nonnested models under local hypotheses. Econometric Theory 5, Davidson, R. and MacKinnon, J.G. (1981. Several tests for model specification in the presence of alternative hypotheses. Econometrica 49, Efron, B. (1984. Comparing non-nested linear models. J. Amer. Statist. Assoc. 79, Efron, B. and Tibshirani, R.J. (199. An Introduction to the Bootstrap, Pacific Grove, CA: Chapman and Hall. Fisher, G.R. and McAleer, M. (1981. Alternative procedures and associated tests of significance for non-nested hypotheses. J. Econometrics 16, Fisher, G.R. (198. Tests for two separate regressions. J. Econometrics 21, Godfrey, L.G. (1998. Tests of non-nested regression models: some results on small sample behaviour and the bootstrap. J. Econometrics 84, Godfrey, L.G. and Pesaran, M.H. (198. Tests of non-nested regression models. J. Econometrics 21, Graybill, F.A. and Iyer, H.K. (1994. Regression Analysis: Concepts and Applications. Belmont, CA: Duxbury. Green, J.R. (1971. Testing departure from a regression without using replication. Technometrics 1,


More information

An Unbiased C p Criterion for Multivariate Ridge Regression

An Unbiased C p Criterion for Multivariate Ridge Regression An Unbiased C p Criterion for Multivariate Ridge Regression (Last Modified: March 7, 2008) Hirokazu Yanagihara 1 and Kenichi Satoh 2 1 Department of Mathematics, Graduate School of Science, Hiroshima University

More information

Inferences for the Ratio: Fieller s Interval, Log Ratio, and Large Sample Based Confidence Intervals

Inferences for the Ratio: Fieller s Interval, Log Ratio, and Large Sample Based Confidence Intervals Inferences for the Ratio: Fieller s Interval, Log Ratio, and Large Sample Based Confidence Intervals Michael Sherman Department of Statistics, 3143 TAMU, Texas A&M University, College Station, Texas 77843,

More information

On Least Squares Linear Regression Without Second Moment

On Least Squares Linear Regression Without Second Moment On Least Squares Linear Regression Without Second Moment BY RAJESHWARI MAJUMDAR University of Connecticut If \ and ] are real valued random variables such that the first moments of \, ], and \] exist and

More information

An Introduction to Multivariate Statistical Analysis

An Introduction to Multivariate Statistical Analysis An Introduction to Multivariate Statistical Analysis Third Edition T. W. ANDERSON Stanford University Department of Statistics Stanford, CA WILEY- INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents

More information

Resampling-Based Control of the FDR

Resampling-Based Control of the FDR Resampling-Based Control of the FDR Joseph P. Romano 1 Azeem S. Shaikh 2 and Michael Wolf 3 1 Departments of Economics and Statistics Stanford University 2 Department of Economics University of Chicago

More information

460 HOLGER DETTE AND WILLIAM J STUDDEN order to examine how a given design behaves in the model g` with respect to the D-optimality criterion one uses

460 HOLGER DETTE AND WILLIAM J STUDDEN order to examine how a given design behaves in the model g` with respect to the D-optimality criterion one uses Statistica Sinica 5(1995), 459-473 OPTIMAL DESIGNS FOR POLYNOMIAL REGRESSION WHEN THE DEGREE IS NOT KNOWN Holger Dette and William J Studden Technische Universitat Dresden and Purdue University Abstract:

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

A Note on Bootstraps and Robustness. Tony Lancaster, Brown University, December 2003.

A Note on Bootstraps and Robustness. Tony Lancaster, Brown University, December 2003. A Note on Bootstraps and Robustness Tony Lancaster, Brown University, December 2003. In this note we consider several versions of the bootstrap and argue that it is helpful in explaining and thinking about

More information

arxiv: v2 [stat.me] 8 Jun 2016

arxiv: v2 [stat.me] 8 Jun 2016 Orthogonality of the Mean and Error Distribution in Generalized Linear Models 1 BY ALAN HUANG 2 and PAUL J. RATHOUZ 3 University of Technology Sydney and University of Wisconsin Madison 4th August, 2013

More information

COMPARING TRANSFORMATIONS USING TESTS OF SEPARATE FAMILIES. Department of Biostatistics, University of North Carolina at Chapel Hill, NC.

COMPARING TRANSFORMATIONS USING TESTS OF SEPARATE FAMILIES. Department of Biostatistics, University of North Carolina at Chapel Hill, NC. COMPARING TRANSFORMATIONS USING TESTS OF SEPARATE FAMILIES by Lloyd J. Edwards and Ronald W. Helms Department of Biostatistics, University of North Carolina at Chapel Hill, NC. Institute of Statistics

More information

Inference For High Dimensional M-estimates: Fixed Design Results

Inference For High Dimensional M-estimates: Fixed Design Results Inference For High Dimensional M-estimates: Fixed Design Results Lihua Lei, Peter Bickel and Noureddine El Karoui Department of Statistics, UC Berkeley Berkeley-Stanford Econometrics Jamboree, 2017 1/49

More information

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis. 401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis

More information

Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems

Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems Jeremy S. Conner and Dale E. Seborg Department of Chemical Engineering University of California, Santa Barbara, CA

More information

Criteria for existence of semigroup homomorphisms and projective rank functions. George M. Bergman

Criteria for existence of semigroup homomorphisms and projective rank functions. George M. Bergman Criteria for existence of semigroup homomorphisms and projective rank functions George M. Bergman Suppose A, S, and T are semigroups, e: A S and f: A T semigroup homomorphisms, and X a generating set for

More information

Robust Nonnested Testing and the Demand for Money

Robust Nonnested Testing and the Demand for Money Robust Nonnested Testing and the Demand for Money Hwan-sik Choi Cornell University Nicholas M. Kiefer Cornell University October, 2006 Abstract Non-nested hypothesis testing procedures have been recently

More information

Linear Models Review

Linear Models Review Linear Models Review Vectors in IR n will be written as ordered n-tuples which are understood to be column vectors, or n 1 matrices. A vector variable will be indicted with bold face, and the prime sign

More information

Testing Some Covariance Structures under a Growth Curve Model in High Dimension

Testing Some Covariance Structures under a Growth Curve Model in High Dimension Department of Mathematics Testing Some Covariance Structures under a Growth Curve Model in High Dimension Muni S. Srivastava and Martin Singull LiTH-MAT-R--2015/03--SE Department of Mathematics Linköping

More information

An Encompassing Test for Non-Nested Quantile Regression Models

An Encompassing Test for Non-Nested Quantile Regression Models An Encompassing Test for Non-Nested Quantile Regression Models Chung-Ming Kuan Department of Finance National Taiwan University Hsin-Yi Lin Department of Economics National Chengchi University Abstract

More information

Mostly calibrated. Yossi Feinberg Nicolas S. Lambert

Mostly calibrated. Yossi Feinberg Nicolas S. Lambert Int J Game Theory (2015) 44:153 163 DOI 10.1007/s00182-014-0423-0 Mostly calibrated Yossi Feinberg Nicolas S. Lambert Accepted: 27 March 2014 / Published online: 16 April 2014 Springer-Verlag Berlin Heidelberg

More information

Bootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator

Bootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator Bootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator by Emmanuel Flachaire Eurequa, University Paris I Panthéon-Sorbonne December 2001 Abstract Recent results of Cribari-Neto and Zarkos

More information

Bayesian Interpretations of Heteroskedastic Consistent Covariance Estimators Using the Informed Bayesian Bootstrap

Bayesian Interpretations of Heteroskedastic Consistent Covariance Estimators Using the Informed Bayesian Bootstrap Bayesian Interpretations of Heteroskedastic Consistent Covariance Estimators Using the Informed Bayesian Bootstrap Dale J. Poirier University of California, Irvine September 1, 2008 Abstract This paper

More information

Inference about Clustering and Parametric. Assumptions in Covariance Matrix Estimation

Inference about Clustering and Parametric. Assumptions in Covariance Matrix Estimation Inference about Clustering and Parametric Assumptions in Covariance Matrix Estimation Mikko Packalen y Tony Wirjanto z 26 November 2010 Abstract Selecting an estimator for the variance covariance matrix

More information

ROBUSTNESS OF TWO-PHASE REGRESSION TESTS

ROBUSTNESS OF TWO-PHASE REGRESSION TESTS REVSTAT Statistical Journal Volume 3, Number 1, June 2005, 1 18 ROBUSTNESS OF TWO-PHASE REGRESSION TESTS Authors: Carlos A.R. Diniz Departamento de Estatística, Universidade Federal de São Carlos, São

More information

This model of the conditional expectation is linear in the parameters. A more practical and relaxed attitude towards linear regression is to say that

This model of the conditional expectation is linear in the parameters. A more practical and relaxed attitude towards linear regression is to say that Linear Regression For (X, Y ) a pair of random variables with values in R p R we assume that E(Y X) = β 0 + with β R p+1. p X j β j = (1, X T )β j=1 This model of the conditional expectation is linear

More information

LECTURE 5. Introduction to Econometrics. Hypothesis testing

LECTURE 5. Introduction to Econometrics. Hypothesis testing LECTURE 5 Introduction to Econometrics Hypothesis testing October 18, 2016 1 / 26 ON TODAY S LECTURE We are going to discuss how hypotheses about coefficients can be tested in regression models We will

More information

Tail dependence in bivariate skew-normal and skew-t distributions

Tail dependence in bivariate skew-normal and skew-t distributions Tail dependence in bivariate skew-normal and skew-t distributions Paola Bortot Department of Statistical Sciences - University of Bologna paola.bortot@unibo.it Abstract: Quantifying dependence between

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

BTRY 4830/6830: Quantitative Genomics and Genetics Fall 2014

BTRY 4830/6830: Quantitative Genomics and Genetics Fall 2014 BTRY 4830/6830: Quantitative Genomics and Genetics Fall 2014 Homework 4 (version 3) - posted October 3 Assigned October 2; Due 11:59PM October 9 Problem 1 (Easy) a. For the genetic regression model: Y

More information

QED. Queen s Economics Department Working Paper No Hypothesis Testing for Arbitrary Bounds. Jeffrey Penney Queen s University

QED. Queen s Economics Department Working Paper No Hypothesis Testing for Arbitrary Bounds. Jeffrey Penney Queen s University QED Queen s Economics Department Working Paper No. 1319 Hypothesis Testing for Arbitrary Bounds Jeffrey Penney Queen s University Department of Economics Queen s University 94 University Avenue Kingston,

More information

Variable Selection in Restricted Linear Regression Models. Y. Tuaç 1 and O. Arslan 1

Variable Selection in Restricted Linear Regression Models. Y. Tuaç 1 and O. Arslan 1 Variable Selection in Restricted Linear Regression Models Y. Tuaç 1 and O. Arslan 1 Ankara University, Faculty of Science, Department of Statistics, 06100 Ankara/Turkey ytuac@ankara.edu.tr, oarslan@ankara.edu.tr

More information

Holzmann, Min, Czado: Validating linear restrictions in linear regression models with general error structure

Holzmann, Min, Czado: Validating linear restrictions in linear regression models with general error structure Holzmann, Min, Czado: Validating linear restrictions in linear regression models with general error structure Sonderforschungsbereich 386, Paper 478 (2006) Online unter: http://epub.ub.uni-muenchen.de/

More information

[y i α βx i ] 2 (2) Q = i=1

[y i α βx i ] 2 (2) Q = i=1 Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

More information

Linear Algebra I. Ronald van Luijk, 2015

Linear Algebra I. Ronald van Luijk, 2015 Linear Algebra I Ronald van Luijk, 2015 With many parts from Linear Algebra I by Michael Stoll, 2007 Contents Dependencies among sections 3 Chapter 1. Euclidean space: lines and hyperplanes 5 1.1. Definition

More information

Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics

Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics A short review of the principles of mathematical statistics (or, what you should have learned in EC 151).

More information

Interpreting Regression Results

Interpreting Regression Results Interpreting Regression Results Carlo Favero Favero () Interpreting Regression Results 1 / 42 Interpreting Regression Results Interpreting regression results is not a simple exercise. We propose to split

More information

#A offered. #B offered. Firing rate. Value of X

#A offered. #B offered. Firing rate. Value of X 15 12 #A offered 9 6 3 2 1 0 0 1 2 3 4 5 6 8 10 15 20 25 30 35 40 45 #B offered Figure S1. The x-axis and y-axis represent, respectively, the quantities of juices B and A offered in any given trial, and

More information

Modified Simes Critical Values Under Positive Dependence

Modified Simes Critical Values Under Positive Dependence Modified Simes Critical Values Under Positive Dependence Gengqian Cai, Sanat K. Sarkar Clinical Pharmacology Statistics & Programming, BDS, GlaxoSmithKline Statistics Department, Temple University, Philadelphia

More information

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure A Robust Approach to Estimating Production Functions: Replication of the ACF procedure Kyoo il Kim Michigan State University Yao Luo University of Toronto Yingjun Su IESR, Jinan University August 2018

More information

Asymptotic Statistics-III. Changliang Zou

Asymptotic Statistics-III. Changliang Zou Asymptotic Statistics-III Changliang Zou The multivariate central limit theorem Theorem (Multivariate CLT for iid case) Let X i be iid random p-vectors with mean µ and and covariance matrix Σ. Then n (

More information

Bayesian Interpretations of Heteroskedastic Consistent Covariance. Estimators Using the Informed Bayesian Bootstrap

Bayesian Interpretations of Heteroskedastic Consistent Covariance. Estimators Using the Informed Bayesian Bootstrap Bayesian Interpretations of Heteroskedastic Consistent Covariance Estimators Using the Informed Bayesian Bootstrap Dale J. Poirier University of California, Irvine May 22, 2009 Abstract This paper provides

More information

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) 1 EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) Taisuke Otsu London School of Economics Summer 2018 A.1. Summation operator (Wooldridge, App. A.1) 2 3 Summation operator For

More information

Math 423/533: The Main Theoretical Topics

Math 423/533: The Main Theoretical Topics Math 423/533: The Main Theoretical Topics Notation sample size n, data index i number of predictors, p (p = 2 for simple linear regression) y i : response for individual i x i = (x i1,..., x ip ) (1 p)

More information

Asymptotic efficiency of simple decisions for the compound decision problem

Asymptotic efficiency of simple decisions for the compound decision problem Asymptotic efficiency of simple decisions for the compound decision problem Eitan Greenshtein and Ya acov Ritov Department of Statistical Sciences Duke University Durham, NC 27708-0251, USA e-mail: eitan.greenshtein@gmail.com

More information

The properties of L p -GMM estimators

The properties of L p -GMM estimators The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion

More information

Statistical inference

Statistical inference Statistical inference Contents 1. Main definitions 2. Estimation 3. Testing L. Trapani MSc Induction - Statistical inference 1 1 Introduction: definition and preliminary theory In this chapter, we shall

More information

Selection Criteria Based on Monte Carlo Simulation and Cross Validation in Mixed Models

Selection Criteria Based on Monte Carlo Simulation and Cross Validation in Mixed Models Selection Criteria Based on Monte Carlo Simulation and Cross Validation in Mixed Models Junfeng Shang Bowling Green State University, USA Abstract In the mixed modeling framework, Monte Carlo simulation

More information

Journal of Multivariate Analysis. Sphericity test in a GMANOVA MANOVA model with normal error

Journal of Multivariate Analysis. Sphericity test in a GMANOVA MANOVA model with normal error Journal of Multivariate Analysis 00 (009) 305 3 Contents lists available at ScienceDirect Journal of Multivariate Analysis journal homepage: www.elsevier.com/locate/jmva Sphericity test in a GMANOVA MANOVA

More information

Consistency of Distance-based Criterion for Selection of Variables in High-dimensional Two-Group Discriminant Analysis

Consistency of Distance-based Criterion for Selection of Variables in High-dimensional Two-Group Discriminant Analysis Consistency of Distance-based Criterion for Selection of Variables in High-dimensional Two-Group Discriminant Analysis Tetsuro Sakurai and Yasunori Fujikoshi Center of General Education, Tokyo University

More information

Robust model selection criteria for robust S and LT S estimators

Robust model selection criteria for robust S and LT S estimators Hacettepe Journal of Mathematics and Statistics Volume 45 (1) (2016), 153 164 Robust model selection criteria for robust S and LT S estimators Meral Çetin Abstract Outliers and multi-collinearity often

More information

3 The language of proof

3 The language of proof 3 The language of proof After working through this section, you should be able to: (a) understand what is asserted by various types of mathematical statements, in particular implications and equivalences;

More information

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS Communications in Statistics - Simulation and Computation 33 (2004) 431-446 COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS K. Krishnamoorthy and Yong Lu Department

More information

Political Science 236 Hypothesis Testing: Review and Bootstrapping

Political Science 236 Hypothesis Testing: Review and Bootstrapping Political Science 236 Hypothesis Testing: Review and Bootstrapping Rocío Titiunik Fall 2007 1 Hypothesis Testing Definition 1.1 Hypothesis. A hypothesis is a statement about a population parameter The

More information

A Note on UMPI F Tests

A Note on UMPI F Tests A Note on UMPI F Tests Ronald Christensen Professor of Statistics Department of Mathematics and Statistics University of New Mexico May 22, 2015 Abstract We examine the transformations necessary for establishing

More information

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory Part V 7 Introduction: What are measures and why measurable sets Lebesgue Integration Theory Definition 7. (Preliminary). A measure on a set is a function :2 [ ] such that. () = 2. If { } = is a finite

More information

Minimum distance tests and estimates based on ranks

Minimum distance tests and estimates based on ranks Minimum distance tests and estimates based on ranks Authors: Radim Navrátil Department of Mathematics and Statistics, Masaryk University Brno, Czech Republic (navratil@math.muni.cz) Abstract: It is well

More information

Thomas J. Fisher. Research Statement. Preliminary Results

Thomas J. Fisher. Research Statement. Preliminary Results Thomas J. Fisher Research Statement Preliminary Results Many applications of modern statistics involve a large number of measurements and can be considered in a linear algebra framework. In many of these

More information

Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions

Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions Vadim Marmer University of British Columbia Artyom Shneyerov CIRANO, CIREQ, and Concordia University August 30, 2010 Abstract

More information

A note on profile likelihood for exponential tilt mixture models

A note on profile likelihood for exponential tilt mixture models Biometrika (2009), 96, 1,pp. 229 236 C 2009 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asn059 Advance Access publication 22 January 2009 A note on profile likelihood for exponential

More information

Bayesian Estimation of Prediction Error and Variable Selection in Linear Regression

Bayesian Estimation of Prediction Error and Variable Selection in Linear Regression Bayesian Estimation of Prediction Error and Variable Selection in Linear Regression Andrew A. Neath Department of Mathematics and Statistics; Southern Illinois University Edwardsville; Edwardsville, IL,

More information

Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION. September 2017

Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION. September 2017 Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION By Degui Li, Peter C. B. Phillips, and Jiti Gao September 017 COWLES FOUNDATION DISCUSSION PAPER NO.

More information

WEIBULL RENEWAL PROCESSES

WEIBULL RENEWAL PROCESSES Ann. Inst. Statist. Math. Vol. 46, No. 4, 64-648 (994) WEIBULL RENEWAL PROCESSES NIKOS YANNAROS Department of Statistics, University of Stockholm, S-06 9 Stockholm, Sweden (Received April 9, 993; revised

More information

Lehrstuhl für Statistik und Ökonometrie. Diskussionspapier 87 / Some critical remarks on Zhang s gamma test for independence

Lehrstuhl für Statistik und Ökonometrie. Diskussionspapier 87 / Some critical remarks on Zhang s gamma test for independence Lehrstuhl für Statistik und Ökonometrie Diskussionspapier 87 / 2011 Some critical remarks on Zhang s gamma test for independence Ingo Klein Fabian Tinkl Lange Gasse 20 D-90403 Nürnberg Some critical remarks

More information

Probability and Statistics

Probability and Statistics Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be CHAPTER 4: IT IS ALL ABOUT DATA 4a - 1 CHAPTER 4: IT

More information

In Search of Desirable Compounds

In Search of Desirable Compounds In Search of Desirable Compounds Adrijo Chakraborty University of Georgia Email: adrijoc@uga.edu Abhyuday Mandal University of Georgia Email: amandal@stat.uga.edu Kjell Johnson Arbor Analytics, LLC Email:

More information

Testing for Regime Switching in Singaporean Business Cycles

Testing for Regime Switching in Singaporean Business Cycles Testing for Regime Switching in Singaporean Business Cycles Robert Breunig School of Economics Faculty of Economics and Commerce Australian National University and Alison Stegman Research School of Pacific

More information

Chapter 1. GMM: Basic Concepts

Chapter 1. GMM: Basic Concepts Chapter 1. GMM: Basic Concepts Contents 1 Motivating Examples 1 1.1 Instrumental variable estimator....................... 1 1.2 Estimating parameters in monetary policy rules.............. 2 1.3 Estimating

More information

Mean squared error matrix comparison of least aquares and Stein-rule estimators for regression coefficients under non-normal disturbances

Mean squared error matrix comparison of least aquares and Stein-rule estimators for regression coefficients under non-normal disturbances METRON - International Journal of Statistics 2008, vol. LXVI, n. 3, pp. 285-298 SHALABH HELGE TOUTENBURG CHRISTIAN HEUMANN Mean squared error matrix comparison of least aquares and Stein-rule estimators

More information

The Delta Method and Applications

The Delta Method and Applications Chapter 5 The Delta Method and Applications 5.1 Local linear approximations Suppose that a particular random sequence converges in distribution to a particular constant. The idea of using a first-order

More information

A Vector Space Justification of Householder Orthogonalization

A Vector Space Justification of Householder Orthogonalization A Vector Space Justification of Householder Orthogonalization Ronald Christensen Professor of Statistics Department of Mathematics and Statistics University of New Mexico August 28, 2015 Abstract We demonstrate

More information

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,

More information

Marcia Gumpertz and Sastry G. Pantula Department of Statistics North Carolina State University Raleigh, NC

Marcia Gumpertz and Sastry G. Pantula Department of Statistics North Carolina State University Raleigh, NC A Simple Approach to Inference in Random Coefficient Models March 8, 1988 Marcia Gumpertz and Sastry G. Pantula Department of Statistics North Carolina State University Raleigh, NC 27695-8203 Key Words

More information

Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments

Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments Tak Wai Chau February 20, 2014 Abstract This paper investigates the nite sample performance of a minimum distance estimator

More information

A Simplified Test for Preference Rationality of Two-Commodity Choice

A Simplified Test for Preference Rationality of Two-Commodity Choice A Simplified Test for Preference Rationality of Two-Commodity Choice Samiran Banerjee and James H. Murphy December 9, 2004 Abstract We provide a simplified test to determine if choice data from a two-commodity

More information

Multivariate Regression Generalized Likelihood Ratio Tests for FMRI Activation

Multivariate Regression Generalized Likelihood Ratio Tests for FMRI Activation Multivariate Regression Generalized Likelihood Ratio Tests for FMRI Activation Daniel B Rowe Division of Biostatistics Medical College of Wisconsin Technical Report 40 November 00 Division of Biostatistics

More information

Cointegrated VAR s. Eduardo Rossi University of Pavia. November Rossi Cointegrated VAR s Financial Econometrics / 56

Cointegrated VAR s. Eduardo Rossi University of Pavia. November Rossi Cointegrated VAR s Financial Econometrics / 56 Cointegrated VAR s Eduardo Rossi University of Pavia November 2013 Rossi Cointegrated VAR s Financial Econometrics - 2013 1 / 56 VAR y t = (y 1t,..., y nt ) is (n 1) vector. y t VAR(p): Φ(L)y t = ɛ t The

More information

Least Squares Estimation-Finite-Sample Properties

Least Squares Estimation-Finite-Sample Properties Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions

More information

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra. DS-GA 1002 Lecture notes 0 Fall 2016 Linear Algebra These notes provide a review of basic concepts in linear algebra. 1 Vector spaces You are no doubt familiar with vectors in R 2 or R 3, i.e. [ ] 1.1

More information

Analysis Methods for Supersaturated Design: Some Comparisons

Analysis Methods for Supersaturated Design: Some Comparisons Journal of Data Science 1(2003), 249-260 Analysis Methods for Supersaturated Design: Some Comparisons Runze Li 1 and Dennis K. J. Lin 2 The Pennsylvania State University Abstract: Supersaturated designs

More information