
High-Dimensional Properties of Information Criteria and Their Efficient Criteria for Multivariate Linear Regression Models with Covariance Structures

Tetsuro Sakurai and Yasunori Fujikoshi

Center of General Education, Tokyo University of Science, Suwa, Toyohira, Chino, Nagano, Japan
Department of Mathematics, Graduate School of Science, Hiroshima University, 1-3-1 Kagamiyama, Higashi Hiroshima, Hiroshima, Japan

Abstract

In this paper, we first consider high-dimensional consistency properties of information criteria IC_{g,d} and their efficient criteria EC_{g,d} for the selection of variables in a multivariate regression model with covariance structures. The covariance structures considered are (1) the independent covariance structure, (2) the uniform covariance structure and (3) the autoregressive covariance structure. Sufficient conditions for IC_{g,d} and EC_{g,d} to be consistent are derived under a high-dimensional asymptotic framework in which the sample size n and the number p of response variables are both large, with p/n → c ∈ (0, ∞). Our results are checked numerically by conducting a Monte Carlo simulation. Next we discuss high-dimensional properties of AIC and BIC for selecting (4) the independent covariance structure with different variances and (5) no covariance structure, in addition to the covariance structures (1)-(3). Some tendencies are pointed out through a numerical experiment.

AMS 2000 subject classification: primary 62H12; secondary 62H30

Key Words and Phrases: Consistency property, Information criteria, Covariance structures, Efficient criteria, High-dimensional asymptotic framework, Multivariate linear regression.

1. Introduction

We consider a multivariate linear regression of p response variables y_1, ..., y_p on a subset of k explanatory variables x_1, ..., x_k. Suppose that there are n observations on y = (y_1, ..., y_p)' and x = (x_1, ..., x_k)', and let Y : n × p and X : n × k be the observation matrices of y and x with the sample size n, respectively. The multivariate linear regression model including all the explanatory variables is written as

Y ~ N_{n×p}(XΘ, Σ ⊗ I_n),   (1.1)

where Θ is a k × p unknown matrix of regression coefficients and Σ is a p × p unknown covariance matrix. The notation N_{n×p}(·, ·) means the matrix normal distribution such that the mean of Y is XΘ and the covariance matrix of vec(Y) is Σ ⊗ I_n; equivalently, the rows of Y are independently normal with the same covariance matrix Σ. Here, vec(Y) is the np-dimensional column vector obtained by stacking the columns of Y on top of one another. We assume that rank(X) = k. We consider the problem of selecting the best model from a collection of candidate models specified by a linear regression of y on subvectors of x. Our interest is to examine consistency properties of information criteria and their efficient criteria when p/n → c ∈ (0, ∞). When Σ is an unknown positive definite matrix, it has been pointed out (see, e.g., Yanagihara et al. (2015), Fujikoshi et al. (2014), etc.) that AIC and Cp have consistency properties when p/n → c ∈ (0, 1), under some conditions, but BIC is not necessarily consistent. Related to high-dimensional data, it is important to consider selection of regression variables in the case that p is larger than n and k is also large. When p is large, it is natural to consider a covariance structure, since a covariance matrix with no covariance structure involves many unknown parameters. One way is to consider a sparse method or a joint regularization

of the regression parameters and the inverse covariance matrix; see, e.g., Rothman et al. (2010). As another approach, one may select the regression variables under an assumed simple covariance structure. Related to the latter approach, it is also natural to select an appropriate simple covariance structure from a set of candidate simple covariance structures. In this paper, we first consider the variable selection problem under some simple covariance structures, namely (1) the independent covariance structure, (2) the uniform covariance structure and (3) the autoregressive covariance structure, based on a model selection approach. We study consistency properties of information criteria including AIC and BIC. These information criteria have a computational problem when k becomes large. In order to avoid this problem, we consider their efficient criteria based on Zhao et al. (1986) and Nishii et al. (1988). It is shown that the efficient criteria are also consistent in a high-dimensional situation. Our results are checked numerically by conducting a Monte Carlo simulation. Next we discuss AIC and BIC for selecting (4) the independent covariance structure with different variances and (5) no covariance structure, in addition to the covariance structures (1)-(3). In a high-dimensional situation, it is noted through a simulation experiment that the covariance structures except for (1) are identified by AIC and BIC.

The present paper is organized as follows. In Section 2, we present notations and preliminaries. In Sections 3, 4 and 5 we show high-dimensional consistencies of information criteria under the covariance structures (1), (2) and (3), respectively. These are also studied numerically. In Section 6, their efficient criteria are shown to be consistent under the same conditions as in the case of information criteria. In Section 7, we discuss selection of covariance structures by AIC and BIC. In Section 8, our conclusions are discussed. The proofs of our results are given in the Appendix.

2. Notations and Preliminaries

This paper is concerned with selection of explanatory variables in the multivariate regression model (1.1). Suppose that j denotes a subset of ω = {1, ..., k} containing k_j elements, and X_j denotes the n × k_j matrix consisting of the columns of X indexed by the elements of j. Then X_ω = X. Further, we assume that the covariance matrix Σ has a covariance structure Σ_g. Then a generic candidate model can be expressed as

M_{g,j} : Y ~ N_{n×p}(X_j Θ_j, Σ_{g,j} ⊗ I_n),   (2.1)

where Θ_j is a k_j × p unknown matrix of regression coefficients. We assume that rank(X) = k (< n). As a model selection method, we use a generalized criterion of AIC (Akaike (1973)). When Σ_{g,j} is a p × p unknown covariance matrix, the AIC (see, e.g., Bedrick and Tsai (1994), Fujikoshi and Satoh (1997)) for M_{g,j} is given by

AIC_{g,j} = n log |Σ̂_{g,j}| + np(log 2π + 1) + 2{k_j p + p(p + 1)/2},   (2.2)

where nΣ̂_{g,j} = Y'(I_n − P_j)Y and P_j = X_j(X_j'X_j)^{-1}X_j'. The part n log |Σ̂_{g,j}| + np(log 2π + 1) is −2 log max_{M_{g,j}} f(Y; Θ_j, Σ_{g,j}), where f(Y; Θ_j, Σ_{g,j}) is the density function of Y under M_{g,j}. The AIC was introduced as an asymptotically unbiased estimator of the risk function defined as the expected log-predictive-likelihood or, equivalently, the Kullback-Leibler information, for a candidate model M_{g,j}; see, e.g., Fujikoshi and Satoh (1997). When j = ω, the model M_{g,ω} is called the full model. Note that Σ̂_{g,ω} and P_ω are defined from Σ̂_{g,j} and P_j with j = ω, k_ω = k and X_ω = X. In this paper, we first consider the case that the covariance matrix Σ belongs to one of the following three classes:

(1) Independent covariance structure (IND): Σ_v = σ_v² I_p,
(2) Uniform covariance structure (UNIF): Σ_u = σ_u²(ρ_u^{1−δ_ij}), 1 ≤ i, j ≤ p,
(3) Autoregressive covariance structure (AUTO): Σ_a = σ_a²(ρ_a^{|i−j|}), 1 ≤ i, j ≤ p.
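To make (2.2) concrete, the criterion can be computed directly from the residual matrix Y'(I_n − P_j)Y. The following minimal sketch (Python with NumPy; the function name and the toy setup are our own illustration, not from the paper) evaluates the criterion for a candidate subset j of columns of X, with d = 2 giving AIC and d = log n giving BIC.

```python
import numpy as np

def aic_unstructured(Y, X, cols, d=2.0):
    """Information criterion (2.2)/(2.4) for the candidate model using the
    columns of X listed in `cols`, with an unstructured covariance matrix.
    d = 2 gives AIC, d = log(n) gives BIC.  Requires n - k_j > p so that the
    residual covariance estimate is positive definite."""
    n, p = Y.shape
    Xj = X[:, cols]
    Pj = Xj @ np.linalg.solve(Xj.T @ Xj, Xj.T)    # projection onto span(Xj)
    Sigma_hat = Y.T @ (np.eye(n) - Pj) @ Y / n    # MLE of Sigma under M_{g,j}
    kj = len(cols)
    m = kj * p + p * (p + 1) / 2                  # free parameters
    sign, logdet = np.linalg.slogdet(Sigma_hat)
    return n * logdet + n * p * (np.log(2 * np.pi) + 1) + d * m
```

With d = 0 the function returns the −2·max-log-likelihood part alone, which is non-increasing as the candidate subset grows; the penalty d·m is what makes overspecified models unattractive.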

Our candidate model can be expressed as (2.1) with Σ_{v,j}, Σ_{u,j} or Σ_{a,j} for Σ_{g,j}. For deriving the maximum likelihood under M_{g,j}, we shall use the fact that for any positive definite Σ_{g,j},

−2 log max_{Θ_j} f(Y; Θ_j, Σ_{g,j}) = n log |Σ_{g,j}| + np log 2π + min_{Θ_j} tr Σ_{g,j}^{-1}(Y − X_j Θ_j)'(Y − X_j Θ_j)   (2.3)
= n log |Σ_{g,j}| + np log 2π + tr Σ_{g,j}^{-1} Y'(I_n − P_j)Y.

Let Σ̂_{g,j} be the quantity minimizing the right-hand side of (2.3). Then, in our problem, it satisfies tr Σ̂_{g,j}^{-1} Y'(I_n − P_j)Y = np. We consider a general information criterion defined by

IC_{g,d,j} = −2 log f(Y; Θ̂_j, Σ̂_{g,j}) + d m_{g,j} = n log |Σ̂_{g,j}| + np(log 2π + 1) + d m_{g,j},   (2.4)

where m_{g,j} is the number of independent unknown parameters under M_{g,j}, and d is a positive constant which may depend on n. When d = 2 and d = log n,

IC_{g,2,j} = AIC_{g,j},   IC_{g,log n,j} = BIC_{g,j}.

Such a general information criterion was considered in a univariate regression model by Nishii (1984). For each of the three covariance structures, we consider selecting the best model from all the candidate models or a subset of them. Let F be the set of all the candidate models, denoted by {{1}, ..., {k}, {1, 2}, ..., {1, ..., k}}, or its subfamily. Then, our model selection criterion is to select the model M_j, or the subset j, minimizing IC_{g,d,j}, which is written as

ĵ_{IC_{g,d}} = arg min_{j ∈ F} IC_{g,d,j}.   (2.5)

For studying consistency properties of ĵ_{IC_{g,d}}, it is assumed that the true model M_{g,∗} is included in the full model, i.e.,

M_{g,∗} : Y ~ N_{n×p}(XΘ_∗, Σ_{g,∗} ⊗ I_n).   (2.6)

Let us denote the minimum model including the true model by M_{g,j∗}. The true mean of Y is expressed as XΘ_∗ = X_{j∗}Θ_{j∗} for some k_{j∗} × p matrix Θ_{j∗}, so the notation X_{j∗}Θ_{j∗} is also used for the true mean of Y. Let F be separated into two sets: the set of overspecified models, F_+ = {j ∈ F | j∗ ⊆ j}, and the set of underspecified models, F_− = F_+^c ∩ F. Here we list some of our main assumptions:

A1 (The true model): M_{g,∗} ∈ F.
A2 (The asymptotic framework): p → ∞, n → ∞, p/n → c ∈ (0, ∞).

It is said that a general model selection criterion ĵ_{IC_{g,d}} is consistent if

lim_{p/n → c ∈ (0, ∞)} Pr(ĵ_{IC_{g,d}} = j∗) = 1.

In order to obtain ĵ_{IC_{g,d}}, we must calculate IC_{g,d} for all the subsets of j_ω = {1, 2, ..., k}, i.e., 2^k values of IC_{g,d}. This becomes an extensive computation as k becomes large. As a method of overcoming this weak point, we consider the EC criterion based on Zhao et al. (1986) and Nishii et al. (1988). Let j_(i) be the subset of j_ω omitting the element i (1 ≤ i ≤ k). Then EC_{g,d} is defined to select

ĵ_{EC_{g,d}} = {i ∈ j_ω | IC_{g,d,j_(i)} > IC_{g,d,j_ω}, i = 1, ..., k}.   (2.7)

In Section 6 we shall show that ĵ_{EC_{g,d}} has a consistency property.
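The computational appeal of (2.7) is that it needs only k + 1 criterion evaluations (the full model plus the k leave-one-out models) instead of 2^k. A minimal sketch of the selection rule is given below (Python; `ic` is assumed to be any callable that returns IC_{g,d,j} for a tuple of variable indices, such as the criteria of Sections 3-5 — this interface is our own, not from the paper).

```python
def efficient_selection(ic, k):
    """Efficient criterion (2.7): keep variable i iff dropping it from the
    full set j_omega increases the criterion.  Only k + 1 evaluations of
    `ic` are needed, versus 2^k for exhaustive minimization as in (2.5)."""
    full = tuple(range(k))
    ic_full = ic(full)
    selected = []
    for i in range(k):
        j_minus_i = tuple(c for c in full if c != i)
        if ic(j_minus_i) > ic_full:   # removing i hurts => i is needed
            selected.append(i)
    return selected
```

The consistency results of Section 6 say that, under the same conditions as for IC_{g,d}, this cheap rule recovers j∗ with probability tending to one.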

3. IC under the Independent Covariance Structure

In this section we consider the problem of selecting the regression variables in the multivariate regression model under the assumption that the covariance matrix has an independent covariance structure. A generic candidate model can be expressed as

M_{v,j} : Y ~ N_{n×p}(X_j Θ_j, Σ_{v,j} ⊗ I_n),   (3.1)

where Σ_{v,j} = σ_{v,j}² I_p and σ_{v,j}² > 0. Then, we have

−2 log f(Y; Θ_j, σ_{v,j}²) = np log(2π) + np log σ_{v,j}² + (1/σ_{v,j}²) tr(Y − X_j Θ_j)'(Y − X_j Θ_j).

Therefore, it is easily seen that the maximum likelihood estimators of Θ_j and σ_{v,j}² under M_{v,j} are given by

Θ̂_j = (X_j'X_j)^{-1}X_j'Y,   σ̂_{v,j}² = (1/(np)) tr Y'(I_n − P_j)Y.   (3.2)

The information criterion (2.4) is given by

IC_{v,d,j} = np log σ̂_{v,j}² + np(log 2π + 1) + d m_{v,j},   (3.3)

where d is a positive constant and m_{v,j} = k_j p + 1. Assume that the true model is expressed as

M_{v,∗} : Y ~ N_{n×p}(XΘ_∗, σ_{v,∗}² I_p ⊗ I_n),   (3.4)

and denote the minimum model including the true model M_{v,∗} by M_{v,j∗}. In general, Y'(I_n − P_j)Y is distributed as a noncentral Wishart distribution, more precisely

Y'(I_n − P_j)Y ~ W_p(n − k_j, Σ_{v,∗}; (X_{j∗}Θ_{j∗})'(I_n − P_j)X_{j∗}Θ_{j∗}),

which implies the following lemma.
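The estimators (3.2) and the criterion (3.3) are simple to compute. The sketch below (Python with NumPy; the function name is ours) implements IC_{v,d,j}; again d = 2 gives AIC_v and d = log n gives BIC_v.

```python
import numpy as np

def ic_v(Y, X, cols, d=2.0):
    """IC_{v,d,j} of (3.3) under the independent covariance structure
    Sigma = sigma^2 I_p, with sigma^2_hat = tr Y'(I-P_j)Y/(np) as in (3.2)
    and m_{v,j} = k_j p + 1 free parameters."""
    n, p = Y.shape
    Xj = X[:, cols]
    Pj = Xj @ np.linalg.solve(Xj.T @ Xj, Xj.T)
    sigma2 = np.trace(Y.T @ (np.eye(n) - Pj) @ Y) / (n * p)
    return n * p * np.log(sigma2) + n * p * (np.log(2 * np.pi) + 1) + d * (len(cols) * p + 1)
```

As in the unstructured case, σ̂²_{v,j} is non-increasing in the candidate subset, so with d = 0 the criterion always favors the largest model; d > 1 is exactly what Theorem 3.1 requires to rule out overfitting.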

Lemma 3.1. Under (3.4), np σ̂_{v,j}²/σ_{v,∗}² is distributed as a noncentral chi-square distribution χ²_{(n−k_j)p}(δ_{v,j}²), where

δ_{v,j}² = (1/σ_{v,∗}²) tr(X_{j∗}Θ_{j∗})'(I_n − P_j)X_{j∗}Θ_{j∗} = (1/σ_{v,∗}²) tr(X_{j∗}Θ_{j∗})'(P_ω − P_j)X_{j∗}Θ_{j∗}.   (3.5)

When j ∈ F_+, δ_{v,j}² = 0.

For a sufficient condition for consistency of IC_{v,d}, we assume

A3v: For any j ∈ F_−, δ_{v,j}² = O(np), and lim_{p/n→c} (1/(np)) δ_{v,j}² = η_{v,j}² > 0.   (3.6)

Theorem 3.1. Suppose that the assumptions A1, A2 and A3v are satisfied. Then, the information criterion IC_{v,d} defined by (3.3) is consistent if d > 1 and d/n → 0.

AIC and BIC satisfy the conditions d > 1 and d/n → 0, and we have the following result.

Corollary 3.1. Under the assumptions A1, A2 and A3v, AIC and BIC are consistent.

In the following we numerically examine the validity of our claims. The true model was assumed to be M_{v,∗} : Y ~ N_{n×p}(XΘ_∗, σ_{v,∗}² I_p ⊗ I_n). Here Θ_∗ : 3 × p was determined by random numbers from the uniform distribution on (1, 2), i.e., i.i.d. from U(1, 2). The first column of X_ω : n × 10 is 1_n, and the other elements are i.i.d. from U(−1, 1). The true variance was set as σ_{v,∗}² = 2. The five candidate models M_{j_α}, α = 1, 2, ..., 5 were considered, where j_α = {1, ..., α}. We studied selection percentages of the true model over 10⁴ replications under AIC and BIC for

(n, p) = (50, 15), (100, 30), (200, 60), (50, 100), (100, 200), (200, 400).

The results are given in Tables 3.1 and 3.2. It is seen that the true model is selected in all the cases except the case (n, p) = (50, 15) for AIC_v; in that case the selection percentage is not 100, but it is very high.

Table 3.1. Selection percentages of AIC_v and BIC_v for p/n = 0.3

j | AIC_v: (50,15) (100,30) (200,60) | BIC_v: (50,15) (100,30) (200,60)

Table 3.2. Selection percentages of AIC_v and BIC_v for p/n = 2

j | AIC_v: (50,100) (100,200) (200,400) | BIC_v: (50,100) (100,200) (200,400)

4. IC under the Uniform Covariance Structure

In this section we consider the model selection criterion when the covariance matrix has a uniform covariance structure

Σ_u = σ_u²(ρ_u^{1−δ_ij}) = σ_u²{(1 − ρ_u)I_p + ρ_u 1_p 1_p'}.   (4.1)

The covariance structure is expressed as

Σ_u = τ_1 (I_p − (1/p)G_p) + τ_2 (1/p)G_p,

where τ_1 = σ_u²(1 − ρ_u), τ_2 = σ_u²{1 + (p − 1)ρ_u}, G_p = 1_p 1_p', and 1_p = (1, ..., 1)'. Noting that the matrices I_p − (1/p)G_p and (1/p)G_p are orthogonal idempotent matrices, we have

|Σ_u| = τ_2 τ_1^{p−1},   Σ_u^{-1} = (1/τ_1)(I_p − (1/p)G_p) + (1/τ_2)(1/p)G_p.

Now we consider the model M_{u,j} given by

M_{u,j} : Y ~ N_{n×p}(X_j Θ_j, Σ_{u,j} ⊗ I_n),   (4.2)

where Σ_{u,j} = τ_{1j}(I_p − (1/p)G_p) + τ_{2j}(1/p)G_p. Let H = (h_1, H_2) be an orthogonal matrix where h_1 = p^{−1/2} 1_p, and let U_j = H'W_j H, W_j = Y'(I_n − P_j)Y. Let the density function of Y under M_{u,j} be denoted by f(Y; Θ_j, τ_{1j}, τ_{2j}). From (2.3) we have

g(τ_{1j}, τ_{2j}) = −2 log max_{Θ_j} f(Y; Θ_j, τ_{1j}, τ_{2j}) = np log(2π) + n log τ_{2j} + n(p − 1) log τ_{1j} + tr Ψ_j^{-1} U_j,

where Ψ_j = diag(τ_{2j}, τ_{1j}, ..., τ_{1j}). Then, it can be shown that the maximum likelihood estimators of τ_{1j} and τ_{2j} under M_{u,j} are given by

τ̂_{1j} = (1/(n(p − 1))) tr D_1 U_j = (1/(n(p − 1))) tr H_2'Y'(I_n − P_j)YH_2,
τ̂_{2j} = (1/n) tr D_2 U_j = (1/n) h_1'Y'(I_n − P_j)Yh_1,

where D_1 = diag(0, 1, ..., 1) and D_2 = diag(1, 0, ..., 0). The number of independent parameters under M_{u,j} is m_{u,j} = k_j p + 2. Noting that Ψ_j is diagonal, we can get the information criterion (2.4) given as

IC_{u,d,j} = n(p − 1) log τ̂_{1j} + n log τ̂_{2j} + np(log 2π + 1) + d(k_j p + 2).   (4.3)
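Since H_2 H_2' = I_p − h_1 h_1', the estimators can be computed without constructing H_2 explicitly. The sketch below (Python with NumPy; the function name is ours) computes τ̂_{1j} and τ̂_{2j}; a useful sanity check is the identity tr Σ̂_{u,j}^{-1} W_j = np, stated below (2.3).

```python
import numpy as np

def uniform_mles(Y, X, cols):
    """MLEs tau1_hat, tau2_hat of Section 4.  Uses
    tr H2' W H2 = tr (I_p - G_p/p) W  and  h1' W h1 = 1_p' W 1_p / p,
    so the orthogonal complement H2 is never formed."""
    n, p = Y.shape
    Xj = X[:, cols]
    Pj = Xj @ np.linalg.solve(Xj.T @ Xj, Xj.T)
    W = Y.T @ (np.eye(n) - Pj) @ Y                 # W_j = Y'(I-P_j)Y
    one = np.ones((p, 1))
    h1 = one / np.sqrt(p)
    tau2 = (h1.T @ W @ h1).item() / n              # (1/n) h1' W h1
    G = one @ one.T
    tau1 = np.trace((np.eye(p) - G / p) @ W) / (n * (p - 1))
    return tau1, tau2
```

Plugging Σ̂_{u,j}^{-1} = (1/τ̂_{1j})(I_p − G_p/p) + (1/τ̂_{2j})(G_p/p) into tr Σ̂_{u,j}^{-1} W_j gives n(p − 1) + n = np exactly, which the test below verifies numerically.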

Assume that the true model is expressed as

M_{u,∗} : Y ~ N_{n×p}(XΘ_∗, Σ_{u,∗} ⊗ I_n),   (4.4)

where Σ_{u,∗} = τ_{1∗}(I_p − (1/p)G_p) + τ_{2∗}(1/p)G_p, and denote the minimum model including the true model M_{u,∗} by M_{u,j∗}. In general, it holds that U_j ~ W_p(n − k_j, Ψ_∗; Δ_j), where

Δ_j = H'(X_{j∗}Θ_{j∗})'(I_n − P_j)X_{j∗}Θ_{j∗}H,   Ψ_∗ = diag(τ_{2∗}, τ_{1∗}, ..., τ_{1∗}).

Therefore, we have the following lemma (see, e.g., Fujikoshi et al. (2010)).

Lemma 4.1. Under the true model (4.4), it holds that:
(1) n(p − 1)τ̂_{1j}/τ_{1∗} is distributed as a noncentral chi-square distribution χ²_{(p−1)(n−k_j)}(δ_{1j}²), where δ_{1j}² = (1/τ_{1∗}) tr H_2'(X_{j∗}Θ_{j∗})'(I_n − P_j)(X_{j∗}Θ_{j∗})H_2.
(2) nτ̂_{2j}/τ_{2∗} is distributed as a noncentral chi-square distribution χ²_{n−k_j}(δ_{2j}²), where δ_{2j}² = (1/τ_{2∗}) h_1'(X_{j∗}Θ_{j∗})'(I_n − P_j)(X_{j∗}Θ_{j∗})h_1.
(3) If j ∈ F_+, then δ_{1j}² = 0 and δ_{2j}² = 0.

For a sufficient condition for consistency of IC_{u,d}, we assume

A3u: For any j ∈ F_−, δ_{1j}² = O(np), δ_{2j}² = O(n), and lim_{p/n→c} (1/(np)) δ_{1j}² = η_{1j}² > 0, lim_{p/n→c} (1/n) δ_{2j}² = η_{2j}² > 0.   (4.5)

Theorem 4.1. Suppose that the assumptions A1, A2 and A3u are satisfied. Then, the information criterion IC_{u,d} defined by (4.3) is consistent if d > 1 and d/n → 0.

Corollary 4.1. Under the assumptions A1, A2 and A3u, AIC and BIC are consistent.

We tried a numerical experiment under the same setting as for the independent covariance structure, except for the covariance structure itself. The true uniform covariance structure was set as the one with σ_{u,∗}² = 2, ρ_{u,∗} = 0.2. The results are given in Tables 4.1 and 4.2. In general, it seems that AIC selects the true model with high probability even in a finite setting. However, BIC does not always select the true model when (n, p) is small, though it has a consistency property.

Table 4.1. Selection percentages of AIC and BIC for p/n = 0.3

j | AIC: (50,15) (100,30) (200,60) | BIC: (50,15) (100,30) (200,60)

Table 4.2. Selection percentages of AIC and BIC for p/n = 2

j | AIC: (50,100) (100,200) (200,400) | BIC: (50,100) (100,200) (200,400)

5. IC under the Autoregressive Covariance Structure

In this section we consider the model selection criterion when the covariance matrix Σ has an autoregressive covariance structure

Σ_a = σ_a²(ρ_a^{|i−j|}), 1 ≤ i, j ≤ p.   (5.1)

Then, it is well known (see, e.g., Fujikoshi et al. (1990)) that

|Σ_a| = (σ_a²)^p (1 − ρ_a²)^{p−1},   Σ_a^{-1} = (1/(σ_a²(1 − ρ_a²)))(ρ_a² C_1 − 2ρ_a C_2 + C_0),

where C_0 = I_p, C_1 = diag(0, 1, ..., 1, 0), and C_2 is the p × p symmetric matrix whose (i, j) element is 1/2 if |i − j| = 1 and 0 otherwise. Now we consider the model M_{a,j} given by

M_{a,j} : Y ~ N_{n×p}(X_j Θ_j, Σ_{a,j} ⊗ I_n),   (5.2)

where Σ_{a,j} = σ_{a,j}²(ρ_{a,j}^{|i−j|}). Then, from (2.3) the maximum likelihood estimator of Θ_j is given by Θ̂_j = (X_j'X_j)^{-1}X_j'Y, and the maximum likelihood estimators of ρ_{a,j} and σ_{a,j}² can be obtained by minimizing

−2 log f(Y; Θ̂_j, σ_{a,j}², ρ_{a,j}) = np log(2π) + np log σ_{a,j}² + n(p − 1) log(1 − ρ_{a,j}²) + (1/(σ_{a,j}²(1 − ρ_{a,j}²))) tr(ρ_{a,j}² C_1 − 2ρ_{a,j} C_2 + C_0)Y'(I_n − P_j)Y

with respect to σ_{a,j}² and ρ_{a,j}. Therefore, the maximum likelihood estimators of σ_{a,j}² and ρ_{a,j} are given (see Fujikoshi et al. (1990)) through the following

two equations:

(1) σ̂_{a,j}² = ((n − k_j)/(np)) (1/(1 − ρ̂_{a,j}²)) (a_{1j} ρ̂_{a,j}² − 2a_{2j} ρ̂_{a,j} + a_{0j}),   (5.3)
(2) (p − 1)a_{1j} ρ̂_{a,j}³ − (p − 2)a_{2j} ρ̂_{a,j}² − (p a_{1j} + a_{0j}) ρ̂_{a,j} + p a_{2j} = 0,   (5.4)

where a_{ij} = tr C_i S_j, i = 0, 1, 2, and S_j = (n − k_j)^{-1} Y'(I_n − P_j)Y. Then, the information criterion IC_{a,d,j} can be written as

IC_{a,d,j} = np log σ̂_{a,j}² + n(p − 1) log(1 − ρ̂_{a,j}²) + np(log 2π + 1) + d(k_j p + 2).   (5.5)

Note that the maximum likelihood estimators ρ̂_{a,j} and σ̂_{a,j}² are expressed in terms of a_{0j}, a_{1j} and a_{2j}, or

b_{0j} = tr C_0 W_j,   b_{1j} = tr C_1 W_j,   b_{2j} = tr C_2 W_j,   (5.6)

where

W_j = (n − k_j)S_j = Y'(I_n − P_j)Y.   (5.7)

Assume that the true model is expressed as

M_{a,∗} : Y ~ N_{n×p}(XΘ_∗, Σ_{a,∗} ⊗ I_n),   (5.8)

where Σ_{a,∗} = σ_{a,∗}²(ρ_{a,∗}^{|i−j|}), and denote the minimum model including the true model M_{a,∗} by M_{a,j∗}. Then,

W_j = Y'(I_n − P_j)Y ~ W_p(n − k_j, Σ_{a,∗}; Ω_j),

where Ω_j = (X_{j∗}Θ_{j∗})'(I_n − P_j)X_{j∗}Θ_{j∗}. Relating to the noncentrality matrix Ω_j, we use the following three quantities:

δ_{ij} = tr C_i Ω_j,   i = 0, 1, 2.   (5.9)

As a sufficient condition for consistency, we assume

A3a: For any j ∈ F_−, the order of each element of Ω_j is O(n), δ_{ij} = O(np), and lim_{p/n→c} (1/(np)) δ_{ij} = η_{ij}² > 0, i = 0, 1, 2.   (5.10)
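The likelihood equations (5.3)-(5.4) are easy to solve numerically: (5.4) is a cubic in ρ̂_{a,j}, and (5.3) then gives σ̂_{a,j}². The sketch below (Python with NumPy; the function name and root-selection rule are ours — when several roots lie in (−1, 1), the one maximizing the likelihood should be taken) illustrates this.

```python
import numpy as np

def ar1_mles(S, n, kj):
    """Sketch of the ML equations (5.3)-(5.4) under the AR(1) structure.
    S is the p x p matrix S_j = Y'(I_n - P_j)Y/(n - k_j); C_0 = I_p,
    C_1 = diag(0,1,...,1,0), and C_2 has (i,j) element 1/2 when |i-j| = 1,
    as in Section 5."""
    p = S.shape[0]
    C1 = np.diag([0.0] + [1.0] * (p - 2) + [0.0])
    C2 = 0.5 * (np.diag(np.ones(p - 1), 1) + np.diag(np.ones(p - 1), -1))
    a0, a1, a2 = np.trace(S), np.trace(C1 @ S), np.trace(C2 @ S)
    # cubic (5.4): (p-1)a1 r^3 - (p-2)a2 r^2 - (p a1 + a0) r + p a2 = 0
    roots = np.roots([(p - 1) * a1, -(p - 2) * a2, -(p * a1 + a0), p * a2])
    real = roots[np.abs(roots.imag) < 1e-8].real
    rho = real[np.abs(real) < 1][0]        # admissible root in (-1, 1)
    # equation (5.3)
    sigma2 = (n - kj) / (n * p) * (a1 * rho**2 - 2 * a2 * rho + a0) / (1 - rho**2)
    return sigma2, rho
```

A quick check: if S is replaced by the population matrix Σ_a itself, then a_{0} = pσ², a_{1} = (p − 2)σ² and a_{2} = (p − 1)σ²ρ, and substituting ρ̂ = ρ into (5.4) gives exactly zero, so the cubic recovers ρ.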

Theorem 5.1. Suppose that the assumptions A1, A2 and A3a are satisfied. Then, the information criterion IC_{a,d} defined by (5.5) is consistent if d > 1 and d/n → 0.

Corollary 5.1. Under the assumptions A1, A2 and A3a, AIC and BIC are consistent.

We tried a numerical experiment under the same setting as in the independent and uniform covariance structure cases, except for the covariance structure. Here the true covariance structure was set as the autoregressive covariance structure Σ_{a,∗} = σ_{a,∗}²(ρ_{a,∗}^{|i−j|}) with σ_{a,∗}² = 2, ρ_{a,∗} = 0.2. The results are given in Tables 5.1 and 5.2. It is seen that AIC and BIC select the true model in all the cases.

Table 5.1. Selection percentages of AIC and BIC for p/n = 0.3

j | AIC: (50,15) (100,30) (200,60) | BIC: (50,15) (100,30) (200,60)

Table 5.2. Selection percentages of AIC and BIC for p/n = 2

j | AIC: (50,100) (100,200) (200,400) | BIC: (50,100) (100,200) (200,400)

6. Consistency Properties of EC

In this section we are interested in consistency properties of the three efficient criteria EC_{v,d}, EC_{u,d} and EC_{a,d} defined from IC_{v,d}, IC_{u,d} and IC_{a,d} through (2.7). Note that consistency properties of IC_{v,d}, IC_{u,d} and IC_{a,d} are given in Theorems 3.1, 4.1 and 5.1. Using these consistency properties, it is expected that EC_{v,d}, EC_{u,d} and EC_{a,d} have similar consistency properties. In fact, the following result holds.

Theorem 6.1. Suppose that the assumptions A1 and A2 are satisfied. Then, it holds that:
(1) the efficient criterion EC_{v,d} is consistent under the assumption A3v if d > 1 and d/n → 0;
(2) the efficient criterion EC_{u,d} is consistent under the assumption A3u if d > 1 and d/n → 0;
(3) the efficient criterion EC_{a,d} is consistent under the assumption A3a if d > 1 and d/n → 0.

In order to examine the validity of the results and the speed of convergence, we tried a numerical experiment. The simulation settings are similar to the cases of IC_{v,d}, IC_{u,d} and IC_{a,d} except for the following points. The total number of explanatory variables to be selected was changed to 15 from 10. The true covariance structures were set as follows for IND (independent covariance structure), UNIF (uniform covariance structure) and AUTO (autoregressive covariance structure):

IND: Σ_∗ = σ_{v,∗}² I_p, σ_{v,∗}² = 2.
UNIF: Σ_∗ = σ_{u,∗}²(ρ_{u,∗}^{1−δ_ij}), σ_{u,∗}² = 2, ρ_{u,∗} = 0.9.
AUTO: Σ_∗ = σ_{a,∗}²(ρ_{a,∗}^{|i−j|}), σ_{a,∗}² = 2, ρ_{a,∗} = 0.9.

Let EC_A and EC_B be the efficient criteria based on AIC and BIC, respectively. Selection rates of these criteria are given in Tables 6.1-6.4 for each of the three covariance structures. In the tables the column x_i denotes the selection rate for the i-th explanatory variable x_i. The 'Under',

'True' and 'Over' columns denote the underspecified models, the true model and the overspecified models, respectively.

Table 6.1. Selection percentages of EC_A and EC_B for (n, p) = (20, 10)

n = 20, p = 10 | Under | True | Over | x_1 | x_2 | x_3 | x_4 | x_5
EC_A: IND, UNIF, AUTO; EC_B: IND, UNIF, AUTO

Table 6.2. Selection percentages of EC_A and EC_B for (n, p) = (200, 100)

n = 200, p = 100 | Under | True | Over | x_1 | x_2 | x_3 | x_4 | x_5
EC_A: IND, UNIF, AUTO; EC_B: IND, UNIF, AUTO

Table 6.3. Selection percentages of EC_A and EC_B for (n, p) = (10, 20)

n = 10, p = 20 | Under | True | Over | j = 1 | j = 2 | j = 3 | j = 4 | j = 5
EC_A: IND, UNIF, AUTO; EC_B: IND, UNIF, AUTO

Table 6.4. Selection percentages of EC_A and EC_B for (n, p) = (100, 200)

n = 100, p = 200 | Under | True | Over | j = 1 | j = 2 | j = 3 | j = 4 | j = 5
EC_A: IND, UNIF, AUTO; EC_B: IND, UNIF, AUTO

7. Selection of Covariance Structures

In this section we consider high-dimensional properties of AIC and BIC for selection of covariance structures. The covariance structures considered are (1) IND (independent covariance structure), (2) UNIF (uniform covariance structure), (3) AUTO (autoregressive covariance structure), and (4) NOST (no covariance structure). For the first three covariance structures, we use the same notation as in Sections 3, 4 and 5. The covariance matrix under NOST is denoted by Σ_{m,j}. When E(Y) = X_jΘ_j, the AIC criteria for these four models are expressed as

A_{v,j} = np log σ̂_{vj}² + np(log 2π + 1) + 2(jp + 1),
A_{u,j} = n log τ̂_{2j} + n(p − 1) log τ̂_{1j} + np(log 2π + 1) + 2(jp + 2),
A_{a,j} = np log σ̂_{aj}² + n(p − 1) log(1 − ρ̂_{aj}²) + np(log 2π + 1) + 2(jp + 2),
A_{m,j} = n log |Σ̂_j| + np(log 2π + 1) + 2{jp + p(p + 1)/2}.

The expressions for σ̂_{vj}², τ̂_{1j}, τ̂_{2j}, σ̂_{aj}² and ρ̂_{aj}² are given in Sections 3, 4 and 5. The BIC criteria are defined from the AIC criteria by replacing the factor 2 multiplying the number of independent parameters with log n. In order to see the high-dimensional behaviors of AIC and BIC, simulation experiments were done. For the multivariate regression models, we selected k_{j∗} = 3, k_ω = 5.
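Since all these criteria share the constant np(log 2π + 1) and differ only in the fitted log-likelihood term and the parameter count, they can be compared directly on the same residual matrix W_j = Y'(I_n − P_j)Y. The sketch below (Python with NumPy; the function name is ours) computes the criteria for IND, for the diagonal structure with different variances (DIAG, introduced later in this section), and for NOST; the UNIF and AUTO criteria can be added using the estimators of Sections 4 and 5.

```python
import numpy as np

def structure_aics(Y, X, cols):
    """AIC values for three covariance structures of Section 7 computed
    from the same residual matrix: IND (sigma^2 I_p), DIAG
    (diag(sigma_1^2,...,sigma_p^2)) and NOST (unstructured Sigma)."""
    n, p = Y.shape
    j = len(cols)
    Xj = X[:, cols]
    Pj = Xj @ np.linalg.solve(Xj.T @ Xj, Xj.T)
    W = Y.T @ (np.eye(n) - Pj) @ Y
    const = n * p * (np.log(2 * np.pi) + 1)
    aic = {}
    aic["IND"] = n * p * np.log(np.trace(W) / (n * p)) + const + 2 * (j * p + 1)
    aic["DIAG"] = n * np.sum(np.log(np.diag(W) / n)) + const + 2 * (j * p + p)
    sign, logdet = np.linalg.slogdet(W / n)
    aic["NOST"] = n * logdet + const + 2 * (j * p + p * (p + 1) / 2)
    return aic
```

Because the three families are nested (IND ⊂ DIAG ⊂ NOST), the −2·max-log-likelihood parts are ordered NOST ≤ DIAG ≤ IND (Hadamard's inequality and the AM-GM inequality); only the penalties can reverse the ranking of the penalized criteria.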

The multivariate regression model was set in the same way as in Sections 3, 4 and 5. For the first covariance structure, σ² = 2. For the second and third covariance structures, σ² = 2, ρ = 0.9. For the no-structured case, the true covariance matrix Σ_∗ = (σ_ij) was set as follows: σ_ii = σ²(1 + i/p), i = 1, ..., p, and the other elements of Σ_∗ = (σ_ij) are i.i.d. from U(0.3, 0.9). The simulation results are given in Table 7.1.

Table 7.1. Selection rates of AIC for the four covariance structures

TRUE | IND UNIF AUTO NOST (n = 20, p = 10) | IND UNIF AUTO NOST (n = 200, p = 100)
Rows: IND, UNIF, AUTO, NOST

It is seen that though it is difficult to select the IND covariance structure correctly, the other three models are correctly selected. The IND covariance structure can be seen as a limit of the UNIF and AUTO covariance matrices, so it seems difficult to select the IND covariance structure correctly. On the other hand, we consider another independent covariance structure with different variances, given by Σ = diag(σ_1², ..., σ_p²), whose structure is denoted by DIAG. The AIC is given by

AIC_{5j} = n Σ_{i=1}^p log σ̂_{ii}² + np(log 2π + 1) + 2(jp + p),

where σ̂_{ii}² = n^{-1} e_i'W_j e_i, W_j = Y'(I_n − P_j)Y, and e_i is the p-component vector with i-th component 1 and the other components zero. In our simulation

experiment, the variances were defined by σ_i² = σ²(1 + i/p), i = 1, ..., p, where σ² = 2. The simulation result is given in Table 7.2.

Table 7.2. Selection rates of AIC for the five covariance structures

n = 20, p = 10: TRUE | IND UNIF AUTO NOST DIAG (rows: IND, UNIF, AUTO, NOST, DIAG)
n = 200, p = 100: TRUE | IND UNIF AUTO NOST DIAG (rows: IND, UNIF, AUTO, NOST, DIAG)

It is seen that AIC is consistent for selection of the five covariance structures in a high-dimensional situation. Similar experiments were done for BIC. The result for selection of the five covariance structures is given in Table 7.3. It seems that BIC converges to the true model more quickly than AIC, except for NOST: BIC chooses UNIF when the truth is NOST. This will come from the fact that our setting for NOST is near UNIF.

Table 7.3. Selection rates of BIC for the five covariance structures

n = 20, p = 10: TRUE | IND UNIF AUTO NOST DIAG (rows: IND, UNIF, AUTO, NOST, DIAG)
n = 200, p = 100: TRUE | IND UNIF AUTO NOST DIAG (rows: IND, UNIF, AUTO, NOST, DIAG)

8. Concluding Remarks

In this paper, we first considered the selection of regression variables in a p-variate regression model with one of three covariance structures: (1) IND (independent covariance structure), (2) UNIF (uniform covariance structure), and (3) AUTO (autoregressive covariance structure). As a selection method, a general information criterion IC_{g,d} was considered for each of the three covariance structures, where d is a positive constant which may depend on the sample size n. When d = 2 and d = log n, IC_{g,d} becomes AIC and BIC, respectively. Under a high-dimensional asymptotic framework p/n → c ∈ (0, ∞), it was shown that IC_{g,d} with g = v, u or a is consistent under the assumption A3g if d/n → 0 and d > 1. Further, in order to avoid the computational problem of IC_{g,d}, we studied EC_{g,d}. It was pointed out that EC_{g,d} has a consistency property similar to that of IC_{g,d}. The results were obtained by assuming normality; it is left to extend them to the case of non-normality.

Next, under the multivariate regression model, we examined the selection of covariance structures. The covariance structures picked up are (4) NOST (no covariance structure) and (5) DIAG (independent covariance structure with different variances), in addition to the three covariance structures (1), (2) and (3). The two criteria AIC and BIC were examined through a simulation experiment in a high-dimensional setting. It was seen that (i) the four covariance structures except IND can be selected correctly by using AIC, and (ii) BIC converges to the true model more quickly than AIC, except for NOST. The proofs of theoretical results on these properties will be given in a future work.

Appendix: The Proofs of Theorems 3.1, 4.1 and 5.1

A1. Outline of Our Proofs and a Preliminary Lemma

First we explain an outline of our proofs. In general, let F be a finite set of candidate models j (or M_j). Assume that j∗ is the minimum model including the true model and j∗ ∈ F. Let T_j(p, n) be a general criterion for model j, which depends on p and n. The best model chosen by minimizing T_j(p, n) is written as

ĵ_T(p, n) = arg min_{j ∈ F} T_j(p, n).

Suppose that we are interested in the asymptotic behavior of ĵ_T(p, n) when p/n tends to c > 0. In order to show consistency of T_j(p, n), we may check a sufficient condition such that for any j ∈ F with j ≠ j∗, there exists a sequence {a_{p,n}} with a_{p,n} > 0 such that

a_{p,n}{T_j(p, n) − T_{j∗}(p, n)} →_p b_j > 0.

In fact, the condition implies that for any j ∈ F with j ≠ j∗,

P(ĵ_T(p, n) = j) ≤ P(T_j(p, n) < T_{j∗}(p, n)) → 0,

and

P(ĵ_T(p, n) = j∗) = 1 − Σ_{j ≠ j∗, j ∈ F} P(ĵ_T(p, n) = j) → 1.

For the proofs of Theorems 3.1, 4.1 and 5.1, we use the following lemma frequently.

Lemma A.1. Suppose that a p × p symmetric random matrix W is distributed as a noncentral Wishart distribution W_p(n − k, Σ; Ω). Let A be a given p × p symmetric matrix. We consider the asymptotic behavior of tr AW when p and n are large with p/n → c ∈ (0, ∞), where k is fixed. Suppose that

lim (1/p) tr AΣ = a² > 0,   lim (1/(np)) tr AΩ = η² > 0.   (A.1)

Then, it holds that

T_{p,n} = (1/(np)) tr AW →_p a² + η².   (A.2)

Proof. Let m = n − k. We may write W as

W = Σ^{1/2}(z_1 z_1' + ... + z_m z_m')Σ^{1/2},

where z_i ~ N_p(ζ_i, I_p), i = 1, ..., m, and the z_i are independent. Here, Ω = μ_1 μ_1' + ... + μ_m μ_m' and ζ_i = Σ^{−1/2} μ_i, i = 1, ..., m. Note that tr AW is expressed as a quadratic form of z = (z_1', ..., z_m')' as follows:

tr AW = z'Bz,   where B = I_m ⊗ Σ^{1/2} A Σ^{1/2}.

Note that z ~ N_{mp}(ζ, I_{mp}), where ζ = (ζ_1', ..., ζ_m')'. Then, it is known (see, e.g., Gupta) that for any symmetric matrix B,

E[z'Bz] = tr B + ζ'Bζ,   Var(z'Bz) = 2 tr B² + 4 ζ'B²ζ.

Especially, when B = I_m ⊗ Σ^{1/2} A Σ^{1/2}, we have

E[z'Bz] = m tr AΣ + tr AΩ,   Var(z'Bz) = 2m tr(AΣ)² + 4 tr AΣAΩ.

Under the assumption (A.1), we have

E(T_{p,n}) = (m/(np)) tr AΣ + (1/(np)) tr AΩ → a² + η².

Further,

Var(T_{p,n}) = (2m/(np)²) tr(AΣ)² + (4/(np)²) tr AΣAΩ ≤ (2m/(np)²)(tr AΣ)² + (4/(np)²) tr AΣ tr AΩ → 0.

These imply our conclusion.

In the special case A = I_p and Σ = σ² I_p, the assumptions in (A.1) become

lim (1/p) tr AΣ = lim (1/p) pσ² = σ²,   lim (1/(np)) tr AΩ = lim (1/(np)) tr Ω = η².

The conclusion is that

T_{p,n} = (1/(np)) tr W →_p σ²(1 + δ²),   (A.3)

where δ² = (1/σ²)η². Note that (1/σ²) tr W ~ χ²_{(n−k)p}(δ²), where δ² now denotes the noncentrality parameter. Therefore, under a high-dimensional asymptotic framework p/n → c ∈ (0, ∞),

(1/(np)) χ²_{(n−k)p}(δ²) →_p 1 + η²,   (A.4)

if (1/(np))δ² → η². More generally, we use the following result:

(1/(np)) χ²_{(n−k)(p−h)}(δ²) →_p 1 + η²,   (A.5)

if (1/(np))δ² → η², where k and h are constants, or more generally constants satisfying k/n → 0 and h/n → 0. Further, we use the following property of the noncentral chi-square distribution:

(1/n) χ²_{n−k}(δ²) →_p 1 + η²,   (A.6)

if (1/n)δ² → η².
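The concentration in Lemma A.1 and (A.3)-(A.4) is easy to check empirically. The sketch below (Python with NumPy; the dimensions, seed and variable names are our own) draws a single central Wishart matrix W ~ W_p(n − k, σ²I_p) and verifies that tr W/(np) is close to its limit σ² when p and n are both large with p/n fixed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Empirical check of Lemma A.1 / (A.3) in the central case (Omega = 0):
# for W ~ W_p(n - k, sigma^2 I_p), T = tr(W)/(np) concentrates at sigma^2.
n, p, k, sigma2 = 200, 100, 3, 2.0
Z = rng.standard_normal((n - k, p)) * np.sqrt(sigma2)   # rows ~ N_p(0, sigma^2 I_p)
W = Z.T @ Z                                             # W ~ W_p(n - k, sigma^2 I_p)
T = np.trace(W) / (n * p)
# E[T] = ((n - k)/n) * sigma^2, and the fluctuation of T is O(1/sqrt(np)),
# matching the variance bound in the proof of Lemma A.1.
```

Here tr W/σ² is exactly a χ²_{(n−k)p} variable, so the standard deviation of T is of order σ²√(2/((n−k)p)), which is about 0.02 in this configuration.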

A2. The Proof of Theorem 3.1

Using (3.3) we can write

IC_{v,d,j} − IC_{v,d,j∗} = np log σ̂_{v,j}² − np log σ̂_{v,j∗}² + d(k_j − k_{j∗})p.

Lemma 3.1 shows that

(np/σ_∗²) σ̂_j² ~ χ²_{(n−k_j)p}(δ_j²),   δ_j² = (1/σ_∗²) tr(X_{j∗}Θ_{j∗})'(I_n − P_j)(X_{j∗}Θ_{j∗}).

In particular, (np/σ_∗²) σ̂_{j∗}² ~ χ²_{(n−k_{j∗})p}. First, consider the case j ∈ F_−. Under the assumption A3v,

(1/(np)) (np/σ_∗²) σ̂_j² = ((n − k_j)p/(np)) · (1/((n − k_j)p)) (np/σ_∗²) σ̂_j² →_p 1 + η_j²,
(1/(np)) (np/σ_∗²) σ̂_{j∗}² = ((n − k_{j∗})p/(np)) · (1/((n − k_{j∗})p)) (np/σ_∗²) σ̂_{j∗}² →_p 1.

Therefore

(1/(np)) (IC_{v,d,j} − IC_{v,d,j∗}) = log σ̂_j² − log σ̂_{j∗}² + (d/n)(k_j − k_{j∗})
= log((1/(np))(np/σ_∗²) σ̂_j²) − log((1/(np))(np/σ_∗²) σ̂_{j∗}²) + (d/n)(k_j − k_{j∗})
→_p log(1 + η_j²) + 0 = log(1 + η_j²) > 0,

when d/n → 0. Next, consider the case j ∈ F_+ with j ≠ j∗. Then

IC_{v,d,j} − IC_{v,d,j∗} = np log(σ̂_{v,j}²/σ̂_{v,j∗}²) + d(k_j − k_{j∗})p.

Further,

log(σ̂_{j∗}²/σ̂_j²) = log(tr Y'(I_n − P_{j∗})Y / tr Y'(I_n − P_j)Y)
= log(1 + tr Y'(P_j − P_{j∗})Y / tr Y'(I_n − P_j)Y)
= log(1 + χ²_{(k_j−k_{j∗})p} / χ²_{(n−k_j)p}).

Using the fact that χ²_m/m →_p 1 as m → ∞, we have

n log(σ̂_{j∗}²/σ̂_j²) = n log(1 + ((k_j − k_{j∗})/(n − k_j)) · [χ²_{(k_j−k_{j∗})p}/((k_j − k_{j∗})p)] / [χ²_{(n−k_j)p}/((n − k_j)p)]) →_p k_j − k_{j∗},

and hence

(1/p){IC_{v,d,j} − IC_{v,d,j∗}} = n log(σ̂_j²/σ̂_{j∗}²) + d(k_j − k_{j∗}) →_p −(k_j − k_{j∗}) + d(k_j − k_{j∗}) = (d − 1)(k_j − k_{j∗}) > 0,

if d > 1.

A3. The Proof of Theorem 4.1

Using (4.3) we can write

IC_{u,d,j} − IC_{u,d,j∗} = n(p − 1)(log τ̂_{1j} − log τ̂_{1j∗}) + n(log τ̂_{2j} − log τ̂_{2j∗}) + d(k_j − k_{j∗})p.

Lemma 4.1 shows that for a general j,

n(p − 1) τ̂_{1j}/τ_{1∗} ~ χ²_{(p−1)(n−k_j)}(δ_{1j}²),   (A.7)
n τ̂_{2j}/τ_{2∗} ~ χ²_{n−k_j}(δ_{2j}²),   (A.8)

where

δ_{1j}² = (1/τ_{1∗}) tr H_2'(X_{j∗}Θ_{j∗})'(I_n − P_j)(X_{j∗}Θ_{j∗})H_2,
δ_{2j}² = (1/τ_{2∗}) h_1'(X_{j∗}Θ_{j∗})'(I_n − P_j)(X_{j∗}Θ_{j∗})h_1.

Note that if j ∈ F_+, δ_{1j}² = 0 and δ_{2j}² = 0. First consider the case j ∈ F_−. Then, using (A.5) and (A.7), we have

τ̂_{1j}/τ̂_{1j∗} = χ²_{(p−1)(n−k_j)}(δ_{1j}²)/χ²_{(p−1)(n−k_{j∗})} →_p 1 + η_{1j}².

Similarly, using (A.6) and (A.8), we have

τ̂_{2j}/τ̂_{2j∗} = χ²_{n−k_j}(δ_{2j}²)/χ²_{n−k_{j∗}} →_p 1 + η_{2j}².

These results imply the following:

(1/(np))(IC_{u,d,j} − IC_{u,d,j∗}) = (n(p − 1)/(np)) log(τ̂_{1j}/τ̂_{1j∗}) + (1/p) log(τ̂_{2j}/τ̂_{2j∗}) + (d/n)(k_j − k_{j∗}) →_p log(1 + η_{1j}²) > 0,

if d/n → 0. Next, consider the case j ∈ F_+ with j ≠ j∗. Then

log(τ̂_{1j∗}/τ̂_{1j}) = log(tr H_2'Y'(I_n − P_{j∗})YH_2 / tr H_2'Y'(I_n − P_j)YH_2)
= log(1 + tr H_2'Y'(P_j − P_{j∗})YH_2 / tr H_2'Y'(I_n − P_j)YH_2)
= log(1 + χ²_{(k_j−k_{j∗})(p−1)} / χ²_{(n−k_j)(p−1)}) ≈ (k_j − k_{j∗})/(n − k_j).

Similarly,

log(τ̂_{2j∗}/τ̂_{2j}) = log(h_1'Y'(I_n − P_{j∗})Yh_1 / h_1'Y'(I_n − P_j)Yh_1)
= log(1 + h_1'Y'(P_j − P_{j∗})Yh_1 / h_1'Y'(I_n − P_j)Yh_1)
= log(1 + χ²_{k_j−k_{j∗}} / χ²_{n−k_j}) ≈ (k_j − k_{j∗})/(n − k_j).

These results imply the following:

(1/p)(IC_{u,d,j} − IC_{u,d,j∗}) →_p −(k_j − k_{j∗}) + d(k_j − k_{j∗}) = (d − 1)(k_j − k_{j∗}) > 0,

if d > 1. Combining these with the result in the case j ∈ F_−, we get Theorem 4.1.

A4. The Proof of Theorem 5

In this section, for notational simplicity, we write $\Sigma_a$, $\sigma^2_a$ and $\rho_a$ simply as $\Sigma$, $\sigma^2$ and $\rho$, respectively. Using (5.5) we can write
$$
\mathrm{IC}_{a,d,j} - \mathrm{IC}_{a,d,j_*}
= np \log(\hat\sigma^2_{a,j}/\hat\sigma^2_{a,j_*})
+ n(p-1)\log\left\{(1-\hat\rho^2_{a,j})/(1-\hat\rho^2_{a,j_*})\right\}
+ d(k_j - k_{j_*})p.
$$
Further, using (5.3), the above expression can be expressed as
$$
\mathrm{IC}_{a,d,j} - \mathrm{IC}_{a,d,j_*}
= np \log\{(n-k_j)/(n-k_{j_*})\}
- n \log\left\{(1-\hat\rho^2_{j})/(1-\hat\rho^2_{j_*})\right\}
+ np \log\left\{\frac{a_{1j}\hat\rho^2_{j} - 2a_{2j}\hat\rho_{j} + a_{0j}}{a_{1j_*}\hat\rho^2_{j_*} - 2a_{2j_*}\hat\rho_{j_*} + a_{0j_*}}\right\}
+ d(k_j - k_{j_*})p. \qquad (A.9)
$$
Here, the maximum likelihood estimators $\hat\sigma^2_j$ and $\hat\rho_j$ are defined in terms of
$$
a_{ij} = \mathrm{tr}\,C_iS_j, \quad i = 0, 1, 2, \qquad
(n-k_j)S_j = W_j = Y'(I_n-P_j)Y \sim W_p(n-k_j, \Sigma;\, \Omega_j).
$$
For the definition of $C_i$ and $\Omega_j$, see Section 5. We also use the notation
$$
b_{ij} = \mathrm{tr}\,C_iW_j = (n-k_j)a_{ij}, \quad i = 0, 1, 2.
$$
First, consider the case $j \not\supseteq j_*$. Under the assumption A3a, it is possible to obtain the asymptotic behavior of $a_{ij}$ and $b_{ij}$, which is given later. It is shown that
$$
\hat\rho_j = \frac{a_{2j}}{a_{1j}} + O_p(p^{-1}),
$$
and, more precisely,
$$
\hat\rho_j = \frac{a_{2j}}{a_{1j}}
+ \frac{1}{p}\left\{\frac{a_{2j}(a_{0j}a_{1j} - a_{2j}^2)}{a_{1j}(a_{1j}-a_{2j})(a_{1j}+a_{2j})}\right\}
+ O_p(p^{-2}).
$$
These imply that
$$
\log(1-\hat\rho^2_j) = \log\left(1 - \frac{a^2_{2j}}{a^2_{1j}}\right) + O_p(p^{-1}),
\qquad
\log(a_{1j}\hat\rho^2_j - 2a_{2j}\hat\rho_j + a_{0j}) = \log\left(a_{0j} - \frac{a^2_{2j}}{a_{1j}}\right) + O_p(p^{-2}).
$$
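At the population level, the approximation $\hat\rho_j \approx a_{2j}/a_{1j}$ can be checked directly. In the sketch below, the explicit forms of $C_0$, $C_1$, $C_2$ are assumptions chosen to be consistent with the trace limits used later in this proof ($\mathrm{tr}\,C_0\Sigma = p\sigma^2$, $\mathrm{tr}\,C_1\Sigma = (p-2)\sigma^2$, $\mathrm{tr}\,C_2\Sigma = (p-1)\sigma^2\rho$), not the paper's definitions:

```python
import numpy as np

# For the AR(1) structure Sigma = sigma^2 (rho^{|i-j|}), the population
# trace ratio tr(C_2 Sigma) / tr(C_1 Sigma) approaches rho as p grows,
# mirroring the expansion rho_hat = a_2 / a_1 + O_p(1/p).
sigma2, rho = 2.0, 0.6

def trace_ratios(p):
    idx = np.arange(p)
    Sigma = sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])
    C0 = np.eye(p)                                     # identity
    C1 = np.diag([0.0] + [1.0] * (p - 2) + [0.0])      # interior diagonal
    C2 = 0.5 * (np.eye(p, k=1) + np.eye(p, k=-1))      # first off-diagonals
    t0, t1, t2 = (np.trace(C @ Sigma) for C in (C0, C1, C2))
    return t0 / p, t1 / p, t2 / p  # -> sigma^2, sigma^2, sigma^2 * rho

for p in (10, 100, 1000):
    t0, t1, t2 = trace_ratios(p)
    print(p, t2 / t1)  # approximates rho for large p
```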

The above expansions hold also for $j = j_*$. Substituting these results into (A.9) and noting $b_{ij} = (n-k_j)a_{ij}$, we have
$$
\frac{1}{np}\left(\mathrm{IC}_{a,d,j} - \mathrm{IC}_{a,d,j_*}\right)
= -\frac{1}{p}\left\{\log\left(1-\frac{b^2_{2j}}{b^2_{1j}}\right) - \log\left(1-\frac{b^2_{2j_*}}{b^2_{1j_*}}\right)\right\}
+ \left\{\log\left(b_{0j}-\frac{b^2_{2j}}{b_{1j}}\right) - \log\left(b_{0j_*}-\frac{b^2_{2j_*}}{b_{1j_*}}\right)\right\}
+ \frac{d}{n}(k_j - k_{j_*}) + O_p(p^{-2}). \qquad (A.10)
$$
Using Lemma A.1, under the assumption A3a we have
$$
\frac{1}{np}b_{0j} = \frac{(n-k_j)p}{np}\cdot\frac{b_{0j}}{(n-k_j)p} \ \xrightarrow{p}\ \sigma^2 + \eta^2_{0j},
\qquad
\frac{1}{np}b_{1j} \ \xrightarrow{p}\ \sigma^2 + \eta^2_{1j},
\qquad
\frac{1}{np}b_{2j} \ \xrightarrow{p}\ \sigma^2\rho + \eta^2_{2j},
$$
since
$$
\frac{1}{p}\mathrm{tr}\,C_0\Sigma = \sigma^2,
\qquad
\frac{1}{p}\mathrm{tr}\,C_1\Sigma = \frac{p-2}{p}\sigma^2 \to \sigma^2,
\qquad
\frac{1}{p}\mathrm{tr}\,C_2\Sigma = \frac{p-1}{p}\sigma^2\rho \to \sigma^2\rho.
$$
The asymptotic behaviors of $b_{ij_*}$ are obtained from these by putting $\eta_{ij} = 0$. These imply that, if $d/n \to 0$,
$$
\frac{1}{np}\left(\mathrm{IC}_{a,d,j} - \mathrm{IC}_{a,d,j_*}\right)
\ \xrightarrow{p}\
\log\left(\sigma^2 + \eta^2_{0j} - \frac{(\sigma^2\rho + \eta^2_{2j})^2}{\sigma^2 + \eta^2_{1j}}\right) - \log \sigma^2(1-\rho^2).
$$
Now we show that
$$
\log\left(\sigma^2 + \eta^2_{0j} - \frac{(\sigma^2\rho + \eta^2_{2j})^2}{\sigma^2 + \eta^2_{1j}}\right) - \log \sigma^2(1-\rho^2) > 0. \qquad (A.11)
$$
Note that
$$
\frac{1}{np}\left(\mathrm{tr}\,C_0\Omega_j - \mathrm{tr}\,C_1\Omega_j\right)
= \frac{1}{np}\,\mathrm{tr}(C_0 - C_1)\Omega_j
= \frac{1}{np}\,(\text{the sum of the } (1,1) \text{ and } (p,p) \text{ elements of } \Omega_j) \to 0,
$$

which implies $\eta^2_{0j} = \eta^2_{1j}$. Using this equality, we can express
$$
\log\left(\sigma^2 + \eta^2_{0j} - \frac{(\sigma^2\rho + \eta^2_{2j})^2}{\sigma^2 + \eta^2_{0j}}\right) - \log \sigma^2(1-\rho^2)
= -\log\left(1 + \frac{\eta^2_{0j}}{\sigma^2}\right)
+ \log\left(1 + \frac{\eta^2_{0j} - \eta^2_{2j}}{\sigma^2(1-\rho)}\right)
+ \log\left(1 + \frac{\eta^2_{0j} + \eta^2_{2j}}{\sigma^2(1+\rho)}\right).
$$
The quantity $\eta^2_{0j} - \eta^2_{2j}$ in the second term can be expressed as the limit of
$$
\frac{1}{np}\left(\mathrm{tr}\,C_0\Omega_j - \mathrm{tr}\,C_2\Omega_j\right) = \frac{1}{np}\,\mathrm{tr}(C_0 - C_2)\Omega_j.
$$
Here, the matrix $C_0 - C_2$ is positive definite, since
$$
x'(C_0 - C_2)x = x_1^2 - x_1x_2 + x_2^2 - x_2x_3 + \cdots - x_{p-1}x_p + x_p^2
= \frac{1}{2}\left\{x_1^2 + (x_1 - x_2)^2 + \cdots + (x_{p-1} - x_p)^2 + x_p^2\right\} > 0,
$$
except for the case $x_1 = \cdots = x_p = 0$. Therefore $\eta^2_{0j} - \eta^2_{2j} \ge 0$, and similarly $\eta^2_{0j} + \eta^2_{2j} \ge 0$. These inequalities imply that the product of the last two factors is at least $1 + \eta^2_{0j}/\sigma^2$, since $\eta^2_{0j}(1+\rho^2) \ge 2\eta^2_{2j}\rho$; hence the right-hand side is positive, and $(\mathrm{IC}_{a,d,j} - \mathrm{IC}_{a,d,j_*})/(np)$ converges in probability to a positive constant if $d/n \to 0$.

Next, consider the case $j \supset j_*$. From (A.10) we have
$$
\frac{1}{p}\left(\mathrm{IC}_{a,d,j} - \mathrm{IC}_{a,d,j_*}\right)
= -\frac{n}{p}\left\{\log\left(1-\frac{b^2_{2j}}{b^2_{1j}}\right) - \log\left(1-\frac{b^2_{2j_*}}{b^2_{1j_*}}\right)\right\}
+ n\left\{\log\left(b_{0j}-\frac{b^2_{2j}}{b_{1j}}\right) - \log\left(b_{0j_*}-\frac{b^2_{2j_*}}{b_{1j_*}}\right)\right\}
+ d(k_j - k_{j_*}) + O_p(p^{-1}).
$$
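The quadratic-form identity used for (A.11) can be verified numerically. The sketch below uses the explicit forms $C_0 = I_p$ and $C_2$ with $1/2$ on the first off-diagonals, which are assumptions consistent with the identity above:

```python
import numpy as np

# Verify x'(C0 - C2)x = (1/2){x_1^2 + sum (x_i - x_{i+1})^2 + x_p^2},
# which shows C0 - C2 is positive definite.
rng = np.random.default_rng(1)
p = 8
C0 = np.eye(p)
C2 = 0.5 * (np.eye(p, k=1) + np.eye(p, k=-1))

for _ in range(5):
    x = rng.standard_normal(p)
    quad = x @ (C0 - C2) @ x
    telescoped = 0.5 * (x[0]**2 + np.sum(np.diff(x)**2) + x[-1]**2)
    print(quad, telescoped)  # the two expressions agree

mineig = np.linalg.eigvalsh(C0 - C2).min()
print(mineig > 0)  # smallest eigenvalue is positive
```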

Then it is easily seen that
$$
\frac{n}{p}\left\{\log\left(1-\frac{b^2_{2j}}{b^2_{1j}}\right) - \log\left(1-\frac{b^2_{2j_*}}{b^2_{1j_*}}\right)\right\}
\ \xrightarrow{p}\ \frac{1}{c}\left\{\log(1-\rho^2) - \log(1-\rho^2)\right\} = 0.
$$
Note that
$$
n\left\{\log\left(b_{0j}-\frac{b^2_{2j}}{b_{1j}}\right) - \log\left(b_{0j_*}-\frac{b^2_{2j_*}}{b_{1j_*}}\right)\right\}
= n\left\{-(\log b_{1j} - \log b_{1j_*}) + \log(b_{0j}b_{1j} - b^2_{2j}) - \log(b_{0j_*}b_{1j_*} - b^2_{2j_*})\right\}.
$$
Further, we use the following relation between $b_{ij}$ and $b_{ij_*}$:
$$
b_{ij} = \mathrm{tr}\,C_iY'(I_n-P_j)Y = \mathrm{tr}\,C_iY'(I_n-P_{j_*})Y - \mathrm{tr}\,C_iY'(P_j-P_{j_*})Y = b_{ij_*} - b_{ij+},
$$
where $b_{ij+} = \mathrm{tr}\,C_iY'(P_j-P_{j_*})Y$. Since $Y'(P_j-P_{j_*})Y \sim W_p(k_j-k_{j_*}, \Sigma)$, it holds that
$$
\frac{b_{0j+}}{(k_j-k_{j_*})p} \ \xrightarrow{p}\ \sigma^2,
\qquad
\frac{b_{1j+}}{(k_j-k_{j_*})p} \ \xrightarrow{p}\ \sigma^2,
\qquad
\frac{b_{2j+}}{(k_j-k_{j_*})p} \ \xrightarrow{p}\ \sigma^2\rho.
$$
Therefore, noting that $b_{ij_*} = O_p(np)$ and $b_{ij+} = O_p(p)$, we have
$$
n(\log b_{1j} - \log b_{1j_*}) = n\log\left(1 - \frac{b_{1j+}}{b_{1j_*}}\right) \approx -\frac{n b_{1j+}}{b_{1j_*}},
$$
and hence
$$
n(\log b_{1j} - \log b_{1j_*}) \ \xrightarrow{p}\ -(k_j - k_{j_*}).
$$
Consider
$$
f = n\left\{\log(b_{0j}b_{1j} - b^2_{2j}) - \log(b_{0j_*}b_{1j_*} - b^2_{2j_*})\right\}.
$$
Substituting $b_{ij} = b_{ij_*} - b_{ij+}$ into this expression, we have $f = n\log(1-h) \approx -nh$,

where
$$
h = \frac{b_{0j_*}b_{1j+} + b_{1j_*}b_{0j+} - 2b_{2j_*}b_{2j+} - b_{0j+}b_{1j+} + b^2_{2j+}}{b_{0j_*}b_{1j_*} - b^2_{2j_*}}.
$$
Considering the limiting value of each term in $h$, we have
$$
f \ \xrightarrow{p}\ -\frac{k_j - k_{j_*}}{1-\rho^2} - \frac{k_j - k_{j_*}}{1-\rho^2} + \frac{2(k_j - k_{j_*})\rho^2}{1-\rho^2} = -2(k_j - k_{j_*}).
$$
Summarizing the above results, for $j \supset j_*$ and $d > 1$,
$$
\frac{1}{p}\left(\mathrm{IC}_{a,d,j} - \mathrm{IC}_{a,d,j_*}\right)
\ \xrightarrow{p}\ (k_j - k_{j_*}) - 2(k_j - k_{j_*}) + d(k_j - k_{j_*}) = (d-1)(k_j - k_{j_*}) > 0.
$$
This completes the proof.

A5. The Proof of Theorem 6

For notational simplicity, we denote $\mathrm{IC}_{g,d,j}$ by $\mathrm{IC}(j)$. Note that $j_\omega = \{1, 2, \ldots, k\}$ and $k$ is finite. Without loss of generality, we may assume that the true model is $j_* = \{1, \ldots, b\}$ with $b = k_{j_*}$. Under the assumptions A1, A2 and A3, it was shown that our information criterion $\hat j_{\mathrm{IC}}$ has a high-dimensional consistency property. The consistency property was shown by proving that (1) for $j \in \mathcal{F}_-$,
$$
m_1\left\{\mathrm{IC}(j) - \mathrm{IC}(j_*)\right\} \ \xrightarrow{p}\ \gamma_j > 0, \qquad (A.12)
$$
and (2) for $j \in \mathcal{F}_+$ and $j \neq j_*$,
$$
m_2\left\{\mathrm{IC}(j) - \mathrm{IC}(j_*)\right\} \ \xrightarrow{p}\ (d-1)(k_j - k_{j_*}) > 0, \qquad (A.13)
$$
where $d > 1$, $m_1 = np$ and $m_2 = p$.

From the definition of our efficient criterion $\hat j_{\mathrm{EC}}$, we have
$$
P(\hat j_{\mathrm{EC}} = j_*)
= P\left(\mathrm{IC}(j_{(1)}) - \mathrm{IC}(j_\omega) > 0, \ldots, \mathrm{IC}(j_{(b)}) - \mathrm{IC}(j_\omega) > 0,\
\mathrm{IC}(j_{(b+1)}) - \mathrm{IC}(j_\omega) < 0, \ldots, \mathrm{IC}(j_{(k)}) - \mathrm{IC}(j_\omega) < 0\right)
$$
$$
= P\left(\bigcap_{i=1}^{b}\left\{\mathrm{IC}(j_{(i)}) - \mathrm{IC}(j_\omega) > 0\right\}
\cap \bigcap_{i=b+1}^{k}\left\{\mathrm{IC}(j_{(i)}) - \mathrm{IC}(j_\omega) < 0\right\}\right).
$$
Applying the Bonferroni inequality $P(\cap_i A_i) \ge 1 - \sum_i \{1 - P(A_i)\}$, we have
$$
P(\hat j_{\mathrm{EC}} = j_*)
\ge 1 - \sum_{i \in j_*}\left\{1 - P(\mathrm{IC}(j_{(i)}) - \mathrm{IC}(j_\omega) > 0)\right\}
- \sum_{i \in j_\omega \setminus j_*}\left\{1 - P(\mathrm{IC}(j_{(i)}) - \mathrm{IC}(j_\omega) < 0)\right\}. \qquad (A.14)
$$
Now, consider evaluating the following probabilities: $P(\mathrm{IC}(j_{(i)}) - \mathrm{IC}(j_\omega) > 0)$ for $i \in j_*$, and $P(\mathrm{IC}(j_{(i)}) - \mathrm{IC}(j_\omega) < 0)$ for $i \in j_\omega \setminus j_*$. When $i \in j_*$, $j_{(i)} \in \mathcal{F}_-$, and hence, using (A.12), it holds that
$$
m_1^{-1}\left(\mathrm{IC}(j_{(i)}) - \mathrm{IC}(j_\omega)\right)
= m_1^{-1}\left(\mathrm{IC}(j_{(i)}) - \mathrm{IC}(j_*)\right) - m_1^{-1}\left(\mathrm{IC}(j_\omega) - \mathrm{IC}(j_*)\right)
\ \xrightarrow{p}\ \gamma_i + 0 = \gamma_i > 0.
$$
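The bound used here is the elementary Bonferroni inequality $P(\cap_i A_i) \ge 1 - \sum_i\{1 - P(A_i)\}$: since $k$ is fixed, it suffices that each individual selection probability tends to one. A Monte Carlo illustration follows; the correlated events and all numeric values are arbitrary choices for demonstration:

```python
import numpy as np

# Empirically, the frequency of the intersection of events always dominates
# the Bonferroni lower bound 1 - sum_i (1 - freq_i), even for dependent events.
rng = np.random.default_rng(2)
k, reps = 5, 20000
# Correlated Gaussian coordinates with unit variances (equicorrelation 0.5):
L = np.linalg.cholesky(0.5 * np.eye(k) + 0.5 * np.ones((k, k)))
Z = rng.standard_normal((reps, k)) @ L.T
events = Z < 2.0                       # A_i = {Z_i < 2}

p_all = np.mean(events.all(axis=1))    # P(A_1 ∩ ... ∩ A_k), estimated
bound = 1 - np.sum(1 - events.mean(axis=0))  # Bonferroni lower bound
print(p_all, bound)
```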

When $i \in j_\omega \setminus j_*$, $j_{(i)} \in \mathcal{F}_+$, and hence, using (A.13), it holds that
$$
m_2^{-1}\left(\mathrm{IC}(j_{(i)}) - \mathrm{IC}(j_\omega)\right)
= m_2^{-1}\left(\mathrm{IC}(j_{(i)}) - \mathrm{IC}(j_*)\right) - m_2^{-1}\left(\mathrm{IC}(j_\omega) - \mathrm{IC}(j_*)\right)
\ \xrightarrow{p}\ (d-1)(k-1-b) - (d-1)(k-b) = -(d-1) < 0.
$$
These results imply that
$$
\lim P(\mathrm{IC}(j_{(i)}) - \mathrm{IC}(j_\omega) > 0) = 1 \ \ (i \in j_*),
\qquad
\lim P(\mathrm{IC}(j_{(i)}) - \mathrm{IC}(j_\omega) < 0) = 1 \ \ (i \in j_\omega \setminus j_*).
$$
Using these results, we can see that the right-hand side of (A.14) tends to
$$
1 - \sum_{i \in j_*}\{1 - 1\} - \sum_{i \in j_\omega \setminus j_*}\{1 - 1\} = 1,
$$
and hence $P(\hat j_{\mathrm{EC}} = j_*) \to 1$. This completes the proof of Theorem 6.

Acknowledgements

The second author's research is partially supported by the Ministry of Education, Science, Sports, and Culture, a Grant-in-Aid for Scientific Research (C), 16K00047.

References

[1] Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In 2nd International Symposium on Information Theory (eds. B. N. Petrov and F. Csáki), Akadémiai Kiadó, Budapest.

[2] Bedrick, E. J. and Tsai, C.-L. (1994). Model selection for multivariate regression in small samples. Biometrics, 50.

[3] Fujikoshi, Y., Kanda, T. and Tanimura, N. (1990). The growth curve model with an autoregressive covariance structure. Ann. Inst. Statist. Math., 42.

[4] Fujikoshi, Y. and Satoh, K. (1997). Modified AIC and Cp in multivariate linear regression. Biometrika, 84.

[5] Fujikoshi, Y., Sakurai, T. and Yanagihara, H. (2013). Consistency of high-dimensional AIC-type and Cp-type criteria in multivariate linear regression. J. Multivariate Anal., 149.

[6] Fujikoshi, Y., Ulyanov, V. V. and Shimizu, R. (2010). Multivariate Statistics: High-Dimensional and Large-Sample Approximations. Wiley, Hoboken, N.J.

[7] Nishii, R. (1984). Asymptotic properties of criteria for selection of variables in multiple regression. Ann. Statist., 12.

[8] Nishii, R., Bai, Z. D. and Krishnaiah, P. R. (1988). Strong consistency of the information criterion for model selection in multivariate analysis. Hiroshima Math. J., 18.

[9] Rothman, A., Levina, E. and Zhu, J. (2010). Sparse multivariate regression with covariance estimation. J. Comput. Graph. Statist., 19.

[10] Yanagihara, H., Wakaki, H. and Fujikoshi, Y. (2015). A consistency property of the AIC for multivariate linear models when the dimension and the sample size are large. Electron. J. Stat., 9.

[11] Zhao, L. C., Krishnaiah, P. R. and Bai, Z. D. (1986). On detection of the number of signals in presence of white noise. J. Multivariate Anal., 20.


MLES & Multivariate Normal Theory

MLES & Multivariate Normal Theory Merlise Clyde September 6, 2016 Outline Expectations of Quadratic Forms Distribution Linear Transformations Distribution of estimates under normality Properties of MLE s Recap Ŷ = ˆµ is an unbiased estimate

More information

AR-order estimation by testing sets using the Modified Information Criterion

AR-order estimation by testing sets using the Modified Information Criterion AR-order estimation by testing sets using the Modified Information Criterion Rudy Moddemeijer 14th March 2006 Abstract The Modified Information Criterion (MIC) is an Akaike-like criterion which allows

More information

Tests for Covariance Matrices, particularly for High-dimensional Data

Tests for Covariance Matrices, particularly for High-dimensional Data M. Rauf Ahmad Tests for Covariance Matrices, particularly for High-dimensional Data Technical Report Number 09, 200 Department of Statistics University of Munich http://www.stat.uni-muenchen.de Tests for

More information

Approaches for Multiple Disease Mapping: MCAR and SANOVA

Approaches for Multiple Disease Mapping: MCAR and SANOVA Approaches for Multiple Disease Mapping: MCAR and SANOVA Dipankar Bandyopadhyay Division of Biostatistics, University of Minnesota SPH April 22, 2015 1 Adapted from Sudipto Banerjee s notes SANOVA vs MCAR

More information

Quadratic forms in skew normal variates

Quadratic forms in skew normal variates J. Math. Anal. Appl. 73 (00) 558 564 www.academicpress.com Quadratic forms in skew normal variates Arjun K. Gupta a,,1 and Wen-Jang Huang b a Department of Mathematics and Statistics, Bowling Green State

More information

A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data

A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data Yujun Wu, Marc G. Genton, 1 and Leonard A. Stefanski 2 Department of Biostatistics, School of Public Health, University of Medicine

More information

Bootstrap prediction and Bayesian prediction under misspecified models

Bootstrap prediction and Bayesian prediction under misspecified models Bernoulli 11(4), 2005, 747 758 Bootstrap prediction and Bayesian prediction under misspecified models TADAYOSHI FUSHIKI Institute of Statistical Mathematics, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106-8569,

More information

Nonconcave Penalized Likelihood with A Diverging Number of Parameters

Nonconcave Penalized Likelihood with A Diverging Number of Parameters Nonconcave Penalized Likelihood with A Diverging Number of Parameters Jianqing Fan and Heng Peng Presenter: Jiale Xu March 12, 2010 Jianqing Fan and Heng Peng Presenter: JialeNonconcave Xu () Penalized

More information

Estimation of parametric functions in Downton s bivariate exponential distribution

Estimation of parametric functions in Downton s bivariate exponential distribution Estimation of parametric functions in Downton s bivariate exponential distribution George Iliopoulos Department of Mathematics University of the Aegean 83200 Karlovasi, Samos, Greece e-mail: geh@aegean.gr

More information

Vector Auto-Regressive Models

Vector Auto-Regressive Models Vector Auto-Regressive Models Laurent Ferrara 1 1 University of Paris Nanterre M2 Oct. 2018 Overview of the presentation 1. Vector Auto-Regressions Definition Estimation Testing 2. Impulse responses functions

More information

VAR Models and Applications

VAR Models and Applications VAR Models and Applications Laurent Ferrara 1 1 University of Paris West M2 EIPMC Oct. 2016 Overview of the presentation 1. Vector Auto-Regressions Definition Estimation Testing 2. Impulse responses functions

More information

Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices

Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices Takahiro Nishiyama a,, Masashi Hyodo b, Takashi Seo a, Tatjana Pavlenko c a Department of Mathematical

More information

1 Data Arrays and Decompositions

1 Data Arrays and Decompositions 1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is

More information

Estimating prediction error in mixed models

Estimating prediction error in mixed models Estimating prediction error in mixed models benjamin saefken, thomas kneib georg-august university goettingen sonja greven ludwig-maximilians-university munich 1 / 12 GLMM - Generalized linear mixed models

More information

On corrections of classical multivariate tests for high-dimensional data

On corrections of classical multivariate tests for high-dimensional data On corrections of classical multivariate tests for high-dimensional data Jian-feng Yao with Zhidong Bai, Dandan Jiang, Shurong Zheng Overview Introduction High-dimensional data and new challenge in statistics

More information

High-dimensional Covariance Estimation Based On Gaussian Graphical Models

High-dimensional Covariance Estimation Based On Gaussian Graphical Models High-dimensional Covariance Estimation Based On Gaussian Graphical Models Shuheng Zhou, Philipp Rutimann, Min Xu and Peter Buhlmann February 3, 2012 Problem definition Want to estimate the covariance matrix

More information

Part IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015

Part IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015 Part IB Statistics Theorems with proof Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly)

More information

On corrections of classical multivariate tests for high-dimensional data. Jian-feng. Yao Université de Rennes 1, IRMAR

On corrections of classical multivariate tests for high-dimensional data. Jian-feng. Yao Université de Rennes 1, IRMAR Introduction a two sample problem Marčenko-Pastur distributions and one-sample problems Random Fisher matrices and two-sample problems Testing cova On corrections of classical multivariate tests for high-dimensional

More information

Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems

Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems Jeremy S. Conner and Dale E. Seborg Department of Chemical Engineering University of California, Santa Barbara, CA

More information

Math 181B Homework 1 Solution

Math 181B Homework 1 Solution Math 181B Homework 1 Solution 1. Write down the likelihood: L(λ = n λ X i e λ X i! (a One-sided test: H 0 : λ = 1 vs H 1 : λ = 0.1 The likelihood ratio: where LR = L(1 L(0.1 = 1 X i e n 1 = λ n X i e nλ

More information

Variable selection in multiple regression with random design

Variable selection in multiple regression with random design Variable selection in multiple regression with random design arxiv:1506.08022v1 [math.st] 26 Jun 2015 Alban Mbina Mbina 1, Guy Martial Nkiet 2, Assi Nguessan 3 1 URMI, Université des Sciences et Techniques

More information

Variable Selection for Highly Correlated Predictors

Variable Selection for Highly Correlated Predictors Variable Selection for Highly Correlated Predictors Fei Xue and Annie Qu arxiv:1709.04840v1 [stat.me] 14 Sep 2017 Abstract Penalty-based variable selection methods are powerful in selecting relevant covariates

More information

Empirical Likelihood Tests for High-dimensional Data

Empirical Likelihood Tests for High-dimensional Data Empirical Likelihood Tests for High-dimensional Data Department of Statistics and Actuarial Science University of Waterloo, Canada ICSA - Canada Chapter 2013 Symposium Toronto, August 2-3, 2013 Based on

More information

Model selection using penalty function criteria

Model selection using penalty function criteria Model selection using penalty function criteria Laimonis Kavalieris University of Otago Dunedin, New Zealand Econometrics, Time Series Analysis, and Systems Theory Wien, June 18 20 Outline Classes of models.

More information

Estimation of Threshold Cointegration

Estimation of Threshold Cointegration Estimation of Myung Hwan London School of Economics December 2006 Outline Model Asymptotics Inference Conclusion 1 Model Estimation Methods Literature 2 Asymptotics Consistency Convergence Rates Asymptotic

More information

Order Selection for Vector Autoregressive Models

Order Selection for Vector Autoregressive Models IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 51, NO. 2, FEBRUARY 2003 427 Order Selection for Vector Autoregressive Models Stijn de Waele and Piet M. T. Broersen Abstract Order-selection criteria for vector

More information

Lecture 20: Linear model, the LSE, and UMVUE

Lecture 20: Linear model, the LSE, and UMVUE Lecture 20: Linear model, the LSE, and UMVUE Linear Models One of the most useful statistical models is X i = β τ Z i + ε i, i = 1,...,n, where X i is the ith observation and is often called the ith response;

More information

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone

More information

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Institute of Statistics and Econometrics Georg-August-University Göttingen Department of Statistics

More information

Cross-sectional space-time modeling using ARNN(p, n) processes

Cross-sectional space-time modeling using ARNN(p, n) processes Cross-sectional space-time modeling using ARNN(p, n) processes W. Polasek K. Kakamu September, 006 Abstract We suggest a new class of cross-sectional space-time models based on local AR models and nearest

More information

STAT 100C: Linear models

STAT 100C: Linear models STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 56 Table of Contents Multiple linear regression Linear model setup Estimation of β Geometric interpretation Estimation of σ 2 Hat matrix Gram matrix

More information

F & B Approaches to a simple model

F & B Approaches to a simple model A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 215 http://www.astro.cornell.edu/~cordes/a6523 Lecture 11 Applications: Model comparison Challenges in large-scale surveys

More information

Bayesian Estimation of Prediction Error and Variable Selection in Linear Regression

Bayesian Estimation of Prediction Error and Variable Selection in Linear Regression Bayesian Estimation of Prediction Error and Variable Selection in Linear Regression Andrew A. Neath Department of Mathematics and Statistics; Southern Illinois University Edwardsville; Edwardsville, IL,

More information

Permutation-invariant regularization of large covariance matrices. Liza Levina

Permutation-invariant regularization of large covariance matrices. Liza Levina Liza Levina Permutation-invariant covariance regularization 1/42 Permutation-invariant regularization of large covariance matrices Liza Levina Department of Statistics University of Michigan Joint work

More information

STAT 100C: Linear models

STAT 100C: Linear models STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 21 Model selection Choosing the best model among a collection of models {M 1, M 2..., M N }. What is a good model? 1. fits the data well (model

More information

Robust model selection criteria for robust S and LT S estimators

Robust model selection criteria for robust S and LT S estimators Hacettepe Journal of Mathematics and Statistics Volume 45 (1) (2016), 153 164 Robust model selection criteria for robust S and LT S estimators Meral Çetin Abstract Outliers and multi-collinearity often

More information