A high-dimensional bias-corrected AIC for selecting response variables in multivariate calibration
A high-dimensional bias-corrected AIC for selecting response variables in multivariate calibration

Ryoya Oda, Yoshie Mima, Hirokazu Yanagihara and Yasunori Fujikoshi
Department of Mathematics, Graduate School of Science, Hiroshima University,
1-3-1 Kagamiyama, Higashi-Hiroshima, Hiroshima, Japan

Last Modified: December 2, 2018

Abstract

In a multivariate linear regression with a p-dimensional response vector y and a q-dimensional explanatory vector x, we consider a multivariate calibration problem requiring the estimation of an unknown explanatory vector x_0 corresponding to a response vector y_0, based on y_0 and n samples of x and y. We propose a high-dimensional bias-corrected Akaike's information criterion (HAIC_C) for selecting response variables. To correct the bias between a risk function and its estimator, we use a hybrid-high-dimensional asymptotic framework in which n tends to infinity but p/n does not exceed 1. Through numerical experiments, we verify that the HAIC_C performs better than the formal AIC.

1 Introduction

Multivariate linear regression is widely used in many fields. Let y = (y_1, ..., y_p)' and x = (x_1, ..., x_q)' be a p-dimensional response vector and a q-dimensional nonstochastic explanatory vector, respectively. A multivariate linear regression is written as

  y = α + B'x + ε,

where α is a p-dimensional unknown vector of intercept coefficients, B is a q × p unknown matrix of regression coefficients, and ε is a p-dimensional error vector. We assume that ε is distributed according to the p-dimensional normal distribution with mean vector 0_p, the p-dimensional vector of zeros, and p × p covariance matrix Σ. Further, assume that there are n observations (y_1, x_1), ..., (y_n, x_n), which are expressed as Y = (y_1, ..., y_n)' : n × p and X = (x_1, ..., x_n)' : n × q in matrix notation, and that X is centralized, X'1_n = 0_q, where 1_n is the n-dimensional vector of ones. Our model selection criterion is defined for the case that rank(X) = q (≤ p) and n - p - q - 2 > 0.

(Corresponding author: oda.stat@gmail.com. Current address: Institute of educational foundation Fukuyama Akenohoshi, 3-4-1 Nishifukatsucho, Fukuyama-shi, Hiroshima, Japan.)
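For readers who want to experiment with this setup, the following is a minimal numerical sketch of the regression model above. The dimensions, coefficient values, covariance matrix, and the use of numpy are arbitrary illustrative choices and are not taken from the paper.

```python
import numpy as np

# Illustrative dimensions (arbitrary choices, not taken from the paper).
n, p, q = 50, 10, 2
rng = np.random.default_rng(0)

# Centralized explanatory matrix X (n x q): X'1_n = 0_q.
X = rng.uniform(-1.0, 1.0, size=(n, q))
X -= X.mean(axis=0)

# Illustrative parameters: intercept alpha (p), coefficients B (q x p), covariance Sigma (p x p).
alpha = np.ones(p)
B = rng.normal(size=(q, p))
Sigma = 0.8 * np.eye(p) + 0.2 * np.ones((p, p))

# Generate Y row-wise from y = alpha + B'x + eps, eps ~ N_p(0, Sigma).
E = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
Y = alpha + X @ B + E

print(Y.shape)  # (n, p)
```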
In actual empirical contexts, calibration is often needed and is widely applied (e.g., Alves and Poppi, 2016; Bro, 2003; Carvalho et al., 2015). A typical calibration problem can be described as follows. Suppose that a value y_0 of y has been observed, but the corresponding value x_0 of x is unknown. The problem is then to make an inference about the unknown vector x_0 based on y_0, Y, and X. Brown (1982) summarized various aspects of this problem, and there is considerable literature on point estimators of x_0 and confidence regions for x_0 in the controlled calibration model.

Another calibration problem involves removing redundant response variables when estimating x_0. In general, estimation based on a subset of the response variables leads to an inferior fit; this is certainly the case when all the parameters are known. However, when the parameters must be estimated and some response variables are redundant, better prediction can be expected by discarding these redundant response variables. In this vein, two approaches are salient: one based on test procedures for a redundancy hypothesis and the other based on model selection procedures for selecting response variables. Brown (1982) proposed a procedure based on a test for the redundancy of response variables. In the context of model selection, Fujikoshi and Nishii (1986) proposed a formal Akaike's information criterion (AIC) for the redundancy of response variables in estimating x_0. In the AIC, the goodness of fit of a model is measured by the Kullback-Leibler (KL) discrepancy function. The best model is defined as the one whose risk function, defined by the expected KL loss, is lowest among all the redundancy models. The AIC is defined by adding an estimator of the bias between the risk function and the expectation of the negative twofold maximum log-likelihood to the negative twofold maximum log-likelihood, and the AIC is thus regarded as an estimator of the risk function. Following Akaike (1973), Fujikoshi and Nishii (1986) formally regarded twice the number of parameters in the redundancy model as an estimator of this bias. However, it is known that when the sample size n is small or the number of parameters in the redundancy model is large, the formal AIC does not estimate the risk function well and tends to choose a model in which many response variables are not redundant. Therefore, it is important to correct the bias in order to estimate the risk function more accurately. For selecting explanatory variables in an ordinary linear regression, Bedrick and Tsai (1994), Hurvich and Tsai (1989), and Sugiura (1978) corrected the bias for overspecified models. However, these corrections did not consider the selection of response variables in multivariate calibration.

In recent years, it has become commonplace to analyze high-dimensional data in which not only the sample size n but also the dimension p is large. Bedrick and Tsai (1994), Hurvich and Tsai (1989) and Sugiura (1978) corrected the bias using a large-sample (LS) asymptotic framework in which only n tends to infinity. However, corrections based on the LS asymptotic framework become inadequate when the response dimension p is large. It is known that corrections based on the high-dimensional (HD) asymptotic framework, in which both n and p tend to infinity, are better than those based on the LS asymptotic framework. Fujikoshi et al. (2014) offered a bias correction using the HD asymptotic framework in the context of selecting explanatory variables in a multivariate linear regression.
It can be expected that such a correction performs well when both n and p are large but is inadequate when p is small. However, it can be difficult for analysts to decide whether p is large or small; thus, when p is moderate, it can be burdensome to determine whether an AIC corrected under the LS or the HD asymptotic framework should be applied. In this paper, we correct the bias under a hybrid-high-dimensional (HHD) asymptotic framework:
  n → ∞,  p/n → c ∈ [0, 1).

It should be noted that the LS and HD asymptotic frameworks are included in the HHD asymptotic framework as special cases. Thus, it is expected that the bias-corrected AIC obtained under the HHD asymptotic framework is a better estimator of the risk function regardless of the size of p. We refer to this bias-corrected AIC as the high-dimensional bias-corrected AIC (HAIC_C).

The remainder of the paper is organized as follows. In section 2, we introduce a framework for selecting response variables in multivariate calibration. In section 3, the HAIC_C is proposed and an asymptotic property of the HAIC_C is obtained. In section 4, we explore and verify the performance of the HAIC_C through numerical simulations. Technical details are relegated to the Appendix.

2 A framework for selecting response variables

2.1 A redundancy hypothesis of response variables

Let Z = (1_n, X) and Θ = (α, B')' be n × (q+1) and (q+1) × p matrices, respectively. Then, the multivariate linear regression for Y and X is written as

  Y = ZΘ + E,  (2)

where E = (ε_1, ..., ε_n)' is the n × p error matrix and ε_1, ..., ε_n are mutually independent and identically distributed according to N_p(0_p, Σ), assuming that Σ is positive definite. To introduce the redundancy hypothesis of response variables of Fujikoshi and Nishii (1986), we prepare a classical estimator of x_0. If the population parameters α, B, and Σ are known, then the classical estimator of x_0 is written as

  x̂_0 = arg min_{x_0} (y_0 - α - B'x_0)'Σ^{-1}(y_0 - α - B'x_0) = (BΣ^{-1}B')^{-1}BΣ^{-1}(y_0 - α).  (3)

We partition y = (y_1', y_2')'. Without loss of generality, let y_1 and y_2 be the p_1-dimensional and (p - p_1)-dimensional subvectors of candidate non-redundant variables and candidate redundant variables, respectively. To define the redundancy hypothesis, we consider only the case q ≤ p_1 < p. We express the partitions of y_0, α, B, and Σ corresponding to the division y = (y_1', y_2')' as follows:

  y_0 = (y_{0,1}', y_{0,2}')',  α = (α_1', α_2')',  B = (B_1, B_2),  Σ = ( Σ_11  Σ_12 ; Σ_21  Σ_22 ),  (4)

where y_{0,1} and y_{0,2} are p_1-dimensional and (p - p_1)-dimensional vectors, α_1 and α_2 are p_1-dimensional and (p - p_1)-dimensional vectors, B_1 and B_2 are q × p_1 and q × (p - p_1) matrices, Σ_11 and Σ_22 are the p_1 × p_1 and (p - p_1) × (p - p_1) covariance matrices of y_1 and y_2, and Σ_12 is the p_1 × (p - p_1) covariance matrix of y_1 and y_2. Let C = (BΣ^{-1}B')^{-1}BΣ^{-1}. We also partition the classical estimator x̂_0 in (3) as

  x̂_0 = C(y_0 - α) = C_1(y_{0,1} - α_1) + C_2(y_{0,2} - α_2),

where C_1 and C_2 are the q × p_1 and q × (p - p_1) submatrices of C = (C_1, C_2).
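As a concrete illustration of (3), the following sketch computes the classical estimator and its partitioned form when the population parameters are known. All numerical values, the partition size p1, and the variable names are hypothetical choices made only for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
p, q, p1 = 6, 2, 4            # illustrative sizes with q <= p1 < p

alpha = rng.normal(size=p)
B = rng.normal(size=(q, p))
A = rng.normal(size=(p, p))
Sigma = A @ A.T + p * np.eye(p)    # an arbitrary positive definite covariance

# A response y0 generated at some "unknown" x0.
x0_true = rng.normal(size=q)
y0 = alpha + B.T @ x0_true + rng.multivariate_normal(np.zeros(p), Sigma)

# Classical estimator (3): x0_hat = (B Sigma^{-1} B')^{-1} B Sigma^{-1} (y0 - alpha).
Si = np.linalg.inv(Sigma)
x0_hat = np.linalg.solve(B @ Si @ B.T, B @ Si @ (y0 - alpha))

# Partitioned form: x0_hat = C1 (y0_1 - alpha_1) + C2 (y0_2 - alpha_2).
C = np.linalg.solve(B @ Si @ B.T, B @ Si)
C1, C2 = C[:, :p1], C[:, p1:]
x0_split = C1 @ (y0[:p1] - alpha[:p1]) + C2 @ (y0[p1:] - alpha[p1:])

print(np.allclose(x0_hat, x0_split))   # True: both expressions agree
```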
From Fujikoshi and Nishii (1986), the hypothesis that y_2 is redundant for estimating x_0 is written as

  C_2 = O_{q,p-p_1}, which is equivalent to B_2 - B_1Γ = O_{q,p-p_1},  (5)

where Γ = Σ_11^{-1}Σ_12 and O_{q,p-p_1} is the q × (p - p_1) matrix of zeros.

2.2 Models, Risk, and Bias

Let Y_1 and Y_2 be the n × p_1 and n × (p - p_1) partitioned matrices of Y = (Y_1, Y_2), and let Θ_1 = (α_1, B_1')', Θ_2 = (α_2, B_2')', and Θ_{2·1} = Θ_2 - Θ_1Γ. From a property of the conditional distribution of a multivariate normal distribution (e.g., Srivastava and Khatri, 1979), we can express (2) as follows:

  Y_1 ~ N_{n×p_1}(ZΘ_1, Σ_11 ⊗ I_n),  Y_2 | Y_1 ~ N_{n×(p-p_1)}(ZΘ_{2·1} + Y_1Γ, Σ_{22·1} ⊗ I_n),

where Σ_{ab·c} = Σ_{ab} - Σ_{ac}Σ_{cc}^{-1}Σ_{cb}. Under hypothesis (5), it is straightforward to observe that ZΘ_{2·1} = 1_nδ', where δ = α_2 - Γ'α_1. Therefore, the candidate model M in which y_2 is redundant is expressed as

  Y_1 ~ N_{n×p_1}(ZΘ_1, Σ_11 ⊗ I_n),  Y_2 | Y_1 ~ N_{n×(p-p_1)}(1_nδ' + Y_1Γ, Σ_{22·1} ⊗ I_n).  (6)

Let S = (Y_1, Y_2, X)'(I_n - J_n)(Y_1, Y_2, X) be n times the sample covariance matrix of (y', x')', where J_n = n^{-1}1_n1_n'. We partition S as follows:

  S = ( S_11  S_12  S_1x ; S_21  S_22  S_2x ; S_x1  S_x2  S_xx ),

where the sizes of S_11, S_22, and S_xx are p_1 × p_1, (p - p_1) × (p - p_1), and q × q, respectively. Let f(Y; Θ, Σ) be the probability density function of N_{n×p}(ZΘ, Σ ⊗ I_n). Then, the log-likelihood function is given by

  l(Θ, Σ; Y, X) = log f(Y; Θ, Σ) = -(1/2)[ np log 2π + n log|Σ| + tr{Σ^{-1}(Y - ZΘ)'(Y - ZΘ)} ].

By maximizing l(Θ, Σ; Y, X) under model (6), we obtain the maximum likelihood estimators (MLEs) of Σ_11, Σ_{22·1}, α_1, α_2, B_1, B_2, Θ_1, Θ_2, Γ, and δ as follows (the proof is given in Appendix A):

  Σ̂_1 = n^{-1}Y_1'(I_n - P_Z)Y_1,  Σ̂_{22·1} = n^{-1}Y_2'(I_n - P_{(1_n,Y_1)})Y_2,
  α̂_1 = ȳ_1 - S_{1x}S_{xx}^{-1}x̄,  B̂_1 = S_{xx}^{-1}S_{x1},  Θ̂_1 = (Z'Z)^{-1}Z'Y_1 = (α̂_1, B̂_1')',
  α̂_2 = ȳ_2 - S_{21}S_{11}^{-1}S_{1x}S_{xx}^{-1}x̄,  B̂_2 = S_{xx}^{-1}S_{x1}S_{11}^{-1}S_{12},  Θ̂_2 = (α̂_2, B̂_2')',
  Γ̂ = S_{11}^{-1}S_{12},  δ̂ = ȳ_2 - Γ̂'ȳ_1,

where P_A = A(A'A)^{-1}A' for a matrix A, ȳ_1 = n^{-1}Y_1'1_n, ȳ_2 = n^{-1}Y_2'1_n, and x̄ = n^{-1}X'1_n.
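To make the estimators above concrete, here is a small numerical sketch that evaluates them from simulated data. The data-generating values are arbitrary stand-ins, and the closed forms follow the displayed formulas (note that x̄ = 0 because X is column-centered).

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, q, p1 = 60, 8, 2, 5      # illustrative sizes with q <= p1 < p

# Simulated data (arbitrary parameter values, only for illustration).
X = rng.uniform(-1, 1, size=(n, q)); X -= X.mean(axis=0)
Y = np.ones(p) + X @ rng.normal(size=(q, p)) + rng.normal(size=(n, p))
Y1, Y2 = Y[:, :p1], Y[:, p1:]

one = np.ones((n, 1))
Z = np.hstack([one, X])                      # Z = (1_n, X)

def proj(A):
    """Orthogonal projection matrix P_A = A (A'A)^{-1} A'."""
    return A @ np.linalg.solve(A.T @ A, A.T)

In = np.eye(n)
Sigma1_hat = Y1.T @ (In - proj(Z)) @ Y1 / n                         # MLE of Sigma_11
Sigma221_hat = Y2.T @ (In - proj(np.hstack([one, Y1]))) @ Y2 / n    # MLE of Sigma_{22.1}

# Blocks of S = (Y1, Y2, X)'(I_n - J_n)(Y1, Y2, X).
Jn = np.ones((n, n)) / n
W = np.hstack([Y1, Y2, X])
S = W.T @ (In - Jn) @ W
S11, S12 = S[:p1, :p1], S[:p1, p1:p]
S1x, Sxx, Sx1 = S[:p1, p:], S[p:, p:], S[p:, :p1]

Gamma_hat = np.linalg.solve(S11, S12)          # S_11^{-1} S_12
y1bar, y2bar = Y1.mean(axis=0), Y2.mean(axis=0)
delta_hat = y2bar - Gamma_hat.T @ y1bar
B1_hat = np.linalg.solve(Sxx, Sx1)             # S_xx^{-1} S_x1
alpha1_hat = y1bar                              # x-bar = 0 since X is centralized
B2_hat = B1_hat @ Gamma_hat                     # S_xx^{-1} S_x1 S_11^{-1} S_12

print(Sigma1_hat.shape, Sigma221_hat.shape, Gamma_hat.shape, B2_hat.shape)
```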
We assume that the data are generated from the following true model M_*:

  Y ~ N_{n×p}(H_*, Σ_* ⊗ I_n),  y_0 ~ N_p(ξ_*, Σ_*),

where H_* is an n × p true mean matrix, ξ_* is a p-dimensional true mean vector, and Σ_* is a p × p true covariance matrix, assuming that Σ_* is positive definite. Corresponding to (4), we partition the true covariance matrix Σ_* in the same way as Σ:

  Σ_* = ( Σ_*11  Σ_*12 ; Σ_*21  Σ_*22 ).

Let Γ_* = Σ_*11^{-1}Σ_*12. We say that the candidate model M is an overspecified model when M satisfies

  H_* = ZΘ_*,  B_*2 - B_*1Γ_* = O_{q,p-p_1},  (7)

where Θ_* = (α_*, B_*')' is a (q+1) × p true unknown matrix, α_* is the p-dimensional true unknown vector of intercept coefficients, B_* is the q × p true unknown matrix of regression coefficients, and B_*1 and B_*2 are the q × p_1 and q × (p - p_1) submatrices of B_* = (B_*1, B_*2). Let L(Θ, Σ) be the expected negative twofold log-likelihood function:

  L(Θ, Σ) = E_Y[-2l(Θ, Σ; Y, X)] = -2l(Θ, Σ; H_*, X) + n tr(Σ^{-1}Σ_*),  (8)

where E_Y denotes the expectation with respect to Y under the true model M_*. Then, we define the risk function R_KL as R_KL = E_Y[L(Θ̂, Σ̂)]. This type of risk function is essentially the same as the expected KL discrepancy function between the true model and a redundancy model. Although the best model is defined as the candidate model with the smallest risk function, the risk function must be estimated because it includes unknown parameters. A natural estimator of R_KL is -2l(Θ̂, Σ̂; Y, X), which under a candidate model M is expressed as

  -2l(Θ̂, Σ̂; Y, X) = np{log 2π + 1} + n(log|Σ̂_1| + log|Σ̂_{22·1}|)  (9)

(the proof of (9) is given in Appendix A). However, when R_KL is estimated by (9), there is the following bias between the risk function and -2l(Θ̂, Σ̂; Y, X):

  B_KL = R_KL - E_Y[-2l(Θ̂, Σ̂; Y, X)].  (10)

Although -2l(Θ̂, Σ̂; Y, X) + B_KL may be considered as a criterion, note that the bias B_KL also includes unknown parameters. Therefore, we consider estimating B_KL by an estimator.
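The definitions (8)-(10) can be explored by brute-force simulation. The sketch below repeatedly draws Y from a chosen "true" model, fits the candidate model that retains all responses, and compares Monte Carlo estimates of the risk R_KL and of E_Y[-2l(Θ̂, Σ̂; Y, X)]; their difference approximates B_KL. Every numerical choice here is an arbitrary illustration, not the paper's design.

```python
import numpy as np

rng = np.random.default_rng(7)
n, p, q, reps = 40, 3, 2, 2000

X = rng.uniform(-1, 1, size=(n, q)); X -= X.mean(axis=0)
Z = np.hstack([np.ones((n, 1)), X])
Theta_star = rng.normal(size=(q + 1, p))
Sigma_star = np.eye(p)
M_star = Z @ Theta_star                      # true mean matrix

def fit(Y):
    """MLEs of Theta and Sigma in the full (all responses retained) model."""
    Theta_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)
    R = Y - Z @ Theta_hat
    return Theta_hat, R.T @ R / n

risk, naive = 0.0, 0.0
for _ in range(reps):
    Y = M_star + rng.normal(size=(n, p))
    Theta_hat, Sigma_hat = fit(Y)
    Si = np.linalg.inv(Sigma_hat)
    # L(Theta_hat, Sigma_hat) as in (8): expected -2 log-likelihood under the true model.
    D = M_star - Z @ Theta_hat
    L = (n * p * np.log(2 * np.pi) + n * np.linalg.slogdet(Sigma_hat)[1]
         + np.trace(Si @ D.T @ D) + n * np.trace(Si @ Sigma_star))
    risk += L / reps
    # Naive estimator (9): -2 * maximized log-likelihood.
    naive += (n * p * (np.log(2 * np.pi) + 1) + n * np.linalg.slogdet(Sigma_hat)[1]) / reps

print(risk - naive)                                    # Monte Carlo estimate of B_KL
print(-n * p + n * p * (n + q + 1) / (n - p - q - 2))  # exact bias for p_1 = p (Bedrick-Tsai type correction quoted later)
```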
3 Main results

3.1 High-Dimensional Bias-Corrected AIC

Let the number of parameters in the candidate model M be h_M = p + qp_1 + p(p+1)/2. Following Akaike (1973), by approximating B_KL as 2h_M, the formal AIC is given by

  AIC = np{log 2π + 1} + n(log|Σ̂_1| + log|Σ̂_{22·1}|) + 2h_M   (q ≤ p_1 < p),
  AIC = np{log 2π + 1} + n log|Σ̂_ω| + 2h_M                      (p_1 = p),  (11)

where Σ̂_ω = n^{-1}Y'(I_n - P_Z)Y is the MLE of Σ in the model in which none of the response variables y are redundant. This formal AIC was given by Fujikoshi and Nishii (1986), and the model minimizing the AIC is selected. However, since the AIC cannot approximate the risk function well when n is small or when n and p are both large, the AIC may not perform adequately in general. Therefore, it is important to consider a new criterion that approximates the risk better in such cases. Correcting the bias B_KL under the HHD asymptotic framework, we propose the high-dimensional bias-corrected AIC (HAIC_C) as follows:

  HAIC_C = np{log 2π + 1} + n(log|Σ̂_1| + log|Σ̂_{22·1}|) + m(n, p)   (q ≤ p_1 < p),
  HAIC_C = np log 2π + n log|Σ̂_ω| + np(n + q + 1)/(n - p - q - 2)     (p_1 = p).  (12)

Here, m(n, p) = m_1(n, p) + m̂_2(n, p), where m_1(n, p) is an explicit function of n, p, p_1, and q obtained by expanding the exact bias of Proposition B.1 under the HHD asymptotic framework, and m̂_2(n, p) is obtained from the expanded term m_2(n, p) by replacing its unknown parameters with the estimators τ̂_1, τ̂_2 and τ̂_3, which are functions of tr{S_e^{-1}(S_e + S_h)} and tr[{S_e^{-1}(S_e + S_h)}^2], where S_e = nΣ̂_1 = Y_1'(I_n - P_Z)Y_1 and S_h = Y_1'P_XY_1.
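To illustrate how such a criterion is used for selection, the following sketch scans the candidate models M_{p_1} (p_1 = q, ..., p), evaluates the formal AIC (11) for each using the parameter count h_M stated above, and picks the minimizer. The HAIC_C would be used in exactly the same way, with 2h_M replaced by m(n, p); that replacement is not reproduced here, and all data-generating values are arbitrary.

```python
import numpy as np

def minus2_loglik(n, p, Y1, Y2, Z):
    """-2 * maximized log-likelihood (9) of the candidate model (6)."""
    In = np.eye(n)
    PZ = Z @ np.linalg.solve(Z.T @ Z, Z.T)
    Sigma1 = Y1.T @ (In - PZ) @ Y1 / n
    W1 = np.hstack([np.ones((n, 1)), Y1])
    PW = W1 @ np.linalg.solve(W1.T @ W1, W1.T)
    Sigma221 = Y2.T @ (In - PW) @ Y2 / n
    val = n * p * (np.log(2 * np.pi) + 1) + n * np.linalg.slogdet(Sigma1)[1]
    if Y2.shape[1] > 0:                       # empty block when p1 = p
        val += n * np.linalg.slogdet(Sigma221)[1]
    return val

def formal_aic(n, p, q, p1, Y, Z):
    h_M = p + q * p1 + p * (p + 1) / 2        # parameter count of the candidate model
    return minus2_loglik(n, p, Y[:, :p1], Y[:, p1:], Z) + 2 * h_M

rng = np.random.default_rng(6)
n, p, q = 80, 6, 2
X = rng.uniform(-1, 1, size=(n, q)); X -= X.mean(axis=0)
Z = np.hstack([np.ones((n, 1)), X])
Y = X @ rng.normal(size=(q, p)) + rng.normal(size=(n, p))   # arbitrary data

aics = {p1: formal_aic(n, p, q, p1, Y, Z) for p1 in range(q, p + 1)}
print(min(aics, key=aics.get))    # selected p1 (number of non-redundant responses)
```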
Practically, the HAIC_C for q ≤ p_1 < p was obtained by the following steps. We expanded the bias B_KL in overspecified models (the result is given in Appendix B, Proposition B.2). Since unknown parameters are included in the expanded term, we replaced these unknown parameters with τ̂_1, τ̂_2 and τ̂_3. Therefore, the HAIC_C is proposed using m(n, p) instead of B_KL. When p_1 = p, the redundancy model is the usual multivariate linear regression in which none of the response variables y are redundant. Therefore, from Bedrick and Tsai (1994), the bias for p_1 = p is exactly equal to -np + np(n + q + 1)/(n - p - q - 2).

3.2 Asymptotic property of HAIC_C

To present an asymptotic property of the HAIC_C, we start by offering some notation and delineating assumptions. Let Ξ_* be the p_1 × p_1 constant matrix given by

  Ξ_* = B_*1'X'XB_*1.  (16)

Note that Σ_*11^{-1/2}Ξ_*Σ_*11^{-1/2} is a p_1 × p_1 symmetric matrix and rank(Σ_*11^{-1/2}Ξ_*Σ_*11^{-1/2}) ≤ min{p_1, q} = q. Therefore, its spectral decomposition can be expressed as

  Σ_*11^{-1/2}Ξ_*Σ_*11^{-1/2} = H_1 ( Λ_*  O_{q,p_1-q} ; O_{p_1-q,q}  O_{p_1-q,p_1-q} ) H_1',  (17)

where H_1 is a p_1 × p_1 orthogonal matrix and Λ_* is a q × q diagonal matrix whose a-th diagonal element is the singular value λ_a, i.e., Λ_* = diag(λ_1, ..., λ_q) with λ_1 ≥ ... ≥ λ_q ≥ 0. Let G be a q × q random matrix distributed according to the non-central Wishart distribution with n degrees of freedom, covariance matrix Φ_*^{-1}, and non-centrality parameter matrix Λ_*Φ_*^{-1}, that is,

  G ~ W_q(n, Φ_*^{-1}; Λ_*Φ_*^{-1}),  (18)

where Φ_* is defined by

  Φ_* = nI_q + Λ_*.  (19)

We prepare the following assumptions on the moments of G:

Assumption A1. E[tr(G^{-1})^2] = O(1).

Assumption A2. E[tr(G^{-1})^4] = O(1), E[tr(G^{-2})^2] = O(n + λ_q), E[tr(G^{-8})] = O((n + λ_q)^3).

Note that Φ_*^{-1} = O((n + λ_q)^{-1}) and Λ_*Φ_*^{-1} = O(1) because

  ||Φ_*^{-1}||^2 = Σ_{i=1}^q (n + λ_i)^{-2} ≤ q(n + λ_q)^{-2} = O((n + λ_q)^{-2}),
  ||Λ_*Φ_*^{-1}||^2 = Σ_{i=1}^q λ_i^2 (n + λ_i)^{-2} ≤ qλ_1^2 (n + λ_1)^{-2} = O(1),

where ||·|| is the Frobenius norm. Therefore, Assumptions A1 and A2 are natural. We obtain the asymptotic unbiasedness of the HAIC_C for R_KL as Theorem 3.1, which is directly derived from (10), Proposition B.2 in Appendix B, and Proposition C.2 in Appendix C.
Theorem 3.1. Suppose that q ≤ p_1 < p and that Assumptions A1 and A2 hold. Then, under overspecified models (7), we have

  R_KL - E_Y[HAIC_C] = O(pn^{-1}(n + λ_q)^{-1/2}),

as n → ∞ and p/n → c ∈ [0, 1), where λ_q is the q-th singular value of Σ_*11^{-1/2}Ξ_*Σ_*11^{-1/2}.

From Theorem 3.1, the HAIC_C approximates the risk function well both when n is not large and when n and p are both large. Thus, it is expected that the HAIC_C will work well.

4 Numerical simulations

We conduct numerical simulations to show that the HAIC_C in (12) works better than the AIC in (11). The p redundancy models M_j (j = 1, ..., p) were prepared for Monte Carlo simulations with 10,000 iterations. Here, model M_j denotes the model in which y_1, ..., y_j are not redundant and y_{j+1}, ..., y_p are redundant. Suppose that the true model is the model in which the p_1 response variables y_1, ..., y_{p_1} are not redundant. The data Y were generated from the true model N_{n×p}(ZΘ_*, Σ_* ⊗ I_n), where Z = (1_n, X) and Θ_* = (α_*, B_*')'; we set α_* = 1_p, and Σ_* was a positive definite matrix constructed from 0.8I_p and 1_p1_p'. We constructed X and B_* as follows (see also the sketch following this section). First, we independently generated u_1, ..., u_n from U(-1, 1), where U(a, b) denotes the uniform distribution on the range (a, b). Using u_1, ..., u_n, we constructed X as X = (I_n - J_n)X_0, where the (i, j)-th element of X_0 is defined by u_i^j (1 ≤ i ≤ n, 1 ≤ j ≤ q). Second, let B_*1 and B_*2 be the submatrices of B_* in (4); similarly, Σ_*11 and Σ_*12 are submatrices of Σ_*. Using this notation, we constructed B_*2 as B_*2 = B_*1Σ_*11^{-1}Σ_*12, where the l-th (1 ≤ l ≤ p_1 - 1) column vectors and the p_1-th column vector of B_*1 are defined by (1, 2, ..., 2)' and (1, 2p_1, ..., 2p_1)', respectively. Next, we set q = 2, and data were generated for various combinations of n, p, and p_1.

In order to compare the performances of the HAIC_C and the AIC, the following three properties were considered:

1. The ratio of the average of the information criterion (IC) to the risk function R_KL: E_Y[IC]/R_KL.
2. The probability of selecting the true model: the frequency with which the true model is chosen as the best model.
3. The KL information of the predicted values of the best model chosen by the information criterion, defined by KL = (np)^{-1}{E_Y[L(Θ̂_best, Σ̂_best)] - L(Θ_*, Σ_*)}, where L(Θ, Σ) is given by (8) and Θ̂_best and Σ̂_best are the MLEs of Θ and Σ, respectively, under the best model.

A high-performance model selector is a model selection criterion for which the ratio of the criterion average to the risk function is close to 1, the probability of selecting the true model is high, and the KL information is small.

Figures 1-3 show the ratio of the average of each criterion, i.e., the HAIC_C and the AIC, to R_KL. From Figures 1-3, it can be observed that the values for the HAIC_C are closer to 1 than those for the AIC.
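The following sketch reproduces the design-matrix construction described above and builds a coefficient matrix whose last p - p_1 responses are redundant by construction. Since the exact coefficient and covariance values of the paper are only partially recoverable here, B_*1 and Σ_* below are illustrative stand-ins.

```python
import numpy as np

def make_design(n, q, rng):
    """Explanatory matrix as described in section 4: u_1,...,u_n ~ U(-1,1),
    X0[i, j] = u_i**(j+1), then X = (I_n - J_n) X0 (column-centering)."""
    u = rng.uniform(-1.0, 1.0, size=n)
    X0 = np.column_stack([u ** (j + 1) for j in range(q)])
    return X0 - X0.mean(axis=0)

rng = np.random.default_rng(4)
n, p, p1, q = 50, 10, 5, 2
X = make_design(n, q, rng)

alpha = np.ones(p)
Sigma = 0.8 * np.eye(p) + 0.2 * np.ones((p, p))      # illustrative covariance
B1 = rng.normal(size=(q, p1))                          # illustrative B_*1
Gamma = np.linalg.solve(Sigma[:p1, :p1], Sigma[:p1, p1:])
B2 = B1 @ Gamma                                        # makes y_{p1+1},...,y_p redundant
B = np.hstack([B1, B2])

Y = alpha + X @ B + rng.multivariate_normal(np.zeros(p), Sigma, size=n)
print(X.shape, Y.shape)
```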
Therefore, the HAIC_C approximates the risk accurately, whereas the AIC underestimates it. Moreover, as the dimension p increases, the bias of the AIC increases. Table 1 shows the probabilities of selecting the true model and the amount of KL information for the HAIC_C and the AIC. Therein, it is clear that the probability for the HAIC_C is higher than that for the AIC. On the other hand, the KL information for the HAIC_C nearly equals that for the AIC, except when p_1 = 80 and p = 160, where it is significantly smaller than that for the AIC.

Figure 1: Ratio of the average of each criterion to the risk function with n = 50 (panels: (p, p_1) = (10, 5), (20, 5), (20, 10), (40, 5), (40, 20)).
Figure 2: Ratio of the average of each criterion to the risk function with n = 100 (panels: (p, p_1) = (10, 5), (40, 5), (40, 20), (80, 5), (80, 40)).
Figure 3: Ratio of the average of each criterion to the risk function with n = 200 (panels: (p, p_1) = (10, 5), (80, 5), (80, 40), (160, 5), (160, 80)).
Table 1: Probabilities (%) of selecting the true model and the KL information of the predicted values of the best model, for the HAIC_C and the AIC over combinations of n, p, and p_1.

Acknowledgments

Ryoya Oda was supported by a Research Fellowship for Young Scientists from the Japan Society for the Promotion of Science. Hirokazu Yanagihara and Yasunori Fujikoshi were both partially supported by the Ministry of Education, Science, Sports, and Culture through Grants-in-Aid for Scientific Research (C), #18K03415 and #16K00047, respectively.

Appendix

A Deriving MLEs and proof of (9)

To derive the MLEs, we use the following lemma (e.g., Fujikoshi et al., 2010):

Lemma A.1. Let Y be an n × p known matrix and A be an n × k known matrix of rank k. Consider a function of a p × p positive definite matrix Σ and a k × p matrix Θ given by

  g(Θ, Σ) = m log|Σ| + tr{(Y - AΘ)Σ^{-1}(Y - AΘ)'},

where m > 0 and all elements of Θ are bounded. Then, g(Θ, Σ) attains its minimum at

  Θ = Θ̂ = (A'A)^{-1}A'Y,  Σ = Σ̂ = m^{-1}Y'(I_n - P_A)Y,

and the minimum value is given by m log|Σ̂| + mp.
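The closed-form minimizer in Lemma A.1 is easy to sanity-check numerically. The following sketch, with arbitrary simulated inputs, compares g at the stated minimizer with g at perturbed parameter values.

```python
import numpy as np

rng = np.random.default_rng(5)
n, p, k, m = 30, 4, 3, 30.0
Y = rng.normal(size=(n, p))
A = rng.normal(size=(n, k))

def g(Theta, Sigma):
    """Objective of Lemma A.1: m log|Sigma| + tr{(Y - A Theta) Sigma^{-1} (Y - A Theta)'}."""
    R = Y - A @ Theta
    _, logdet = np.linalg.slogdet(Sigma)
    return m * logdet + np.trace(R @ np.linalg.solve(Sigma, R.T))

# Stated minimizer: Theta_hat = (A'A)^{-1}A'Y, Sigma_hat = m^{-1} Y'(I_n - P_A)Y.
Theta_hat = np.linalg.solve(A.T @ A, A.T @ Y)
P_A = A @ np.linalg.solve(A.T @ A, A.T)
Sigma_hat = Y.T @ (np.eye(n) - P_A) @ Y / m

g_min = g(Theta_hat, Sigma_hat)
# Minimum value equals m log|Sigma_hat| + m p, as the lemma states.
print(np.isclose(g_min, m * np.linalg.slogdet(Sigma_hat)[1] + m * p))

# Perturbing Theta (with Sigma fixed at Sigma_hat) should never decrease g.
for _ in range(5):
    dTheta = 0.1 * rng.normal(size=Theta_hat.shape)
    print(g(Theta_hat + dTheta, Sigma_hat) >= g_min)   # expected: True
```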
First, we derive the MLEs of Σ_11, Σ_{22·1}, α_1, α_2, B_1, B_2, Θ_1, Θ_2, Γ, and δ. From (6), we have

  -2l(Θ, Σ; Y, X) = np log 2π + n log|Σ_11| + tr{(Y_1 - ZΘ_1)Σ_11^{-1}(Y_1 - ZΘ_1)'}
    + n log|Σ_{22·1}| + tr{(Y_2 - 1_nδ' - Y_1Γ)Σ_{22·1}^{-1}(Y_2 - 1_nδ' - Y_1Γ)'}.

Let g_1(Θ_1, Σ_11) = n log|Σ_11| + tr{(Y_1 - ZΘ_1)Σ_11^{-1}(Y_1 - ZΘ_1)'} and g_2(δ, Γ, Σ_{22·1}) = n log|Σ_{22·1}| + tr{(Y_2 - 1_nδ' - Y_1Γ)Σ_{22·1}^{-1}(Y_2 - 1_nδ' - Y_1Γ)'}. The set of unknown parameters {Θ, Σ} is in one-to-one correspondence with the set {α_1, B_1, C_2, δ, Γ, Σ_11, Σ_{22·1}}, and the parameter sets of g_1 and g_2 are separated. Thus, we need only minimize g_1 and g_2 individually. Note that rank(Z) = q + 1 and rank{(1_n, Y_1)} = p_1 + 1. Then, from Lemma A.1, we have

  min_{Θ_1, Σ_11} g_1(Θ_1, Σ_11) = n log|Σ̂_1| + tr{(Y_1 - ZΘ̂_1)Σ̂_1^{-1}(Y_1 - ZΘ̂_1)'} = n log|Σ̂_1| + np_1,
  min_{δ, Γ, Σ_{22·1}} g_2(δ, Γ, Σ_{22·1}) = n log|Σ̂_{22·1}| + tr{(Y_2 - 1_nδ̂' - Y_1Γ̂)Σ̂_{22·1}^{-1}(Y_2 - 1_nδ̂' - Y_1Γ̂)'} = n log|Σ̂_{22·1}| + n(p - p_1),

where Θ̂_1 = (Z'Z)^{-1}Z'Y_1, Σ̂_1 = n^{-1}Y_1'(I_n - P_Z)Y_1, (δ̂, Γ̂')' = {(1_n, Y_1)'(1_n, Y_1)}^{-1}(1_n, Y_1)'Y_2, and Σ̂_{22·1} = n^{-1}Y_2'(I_n - P_{(1_n,Y_1)})Y_2. Therefore, the MLEs of α_1, α_2, B_1, B_2, and Θ_2 can also be derived from Θ̂_1 and (δ̂, Γ̂')'.

Next, we show (9). From Lemma A.1, it is clear that

  -2l(Θ̂, Σ̂; Y, X) = np{log 2π + 1} + n(log|Σ̂_1| + log|Σ̂_{22·1}|).

B Calculating and expanding the bias

Here, we present propositions for calculating and expanding the bias B_KL in (10).

B.1 Calculating the bias and its proof

Proposition B.1. Suppose that q ≤ p_1 < p. In overspecified models (7), we obtain the exact expression of the bias as follows:

  B_KL = -np + np_1(n + q + 1)/(n - p_1 - q - 2) + n(n + 1)(p - p_1)/(n - p - 2)
    + {n(n + 1)(p - p_1)/(n - p - 2)} E_Y[tr(S_11^{-1}Σ_*11)] + {n(p - p_1)/(n - p - 2)} E_Y[tr(S_11^{-1}Ξ_*)],  (B.1)

where Ξ_* is defined in (16).

The proof of Proposition B.1 is presented as follows. To calculate the bias, we describe a lemma for the distributions of some statistics (the proof is given in Appendix D).
14 Lemma B.. Suppose that q p < p. Let Σ 22 be the MLE of Σ 22 under model 6, and let [ Γ ] T = {Y I n J n Y } /2 {Y I n J n Y } Y I n J n Ω Γ Σ /2 22, B.2 ] [ δ U = { n, Y n, Y } /2 { n, Y Γ n, Y } n, Y Ω + Y Γ Σ /2 22, B.3 where Ω = 2 Γ, and and 2 are the n p and n p p partitioned matrices of =, 2. Then, T and U are independent of Y, and n Σ 22 and δ, Γ are mutually conditional independent under Y, and n Σ 22 Y W p p n p, Σ 22 ; Ω, T N p p p O p,p p, I p p I p, U N p+ p p O p +,p p, I p p I p +, where Ω = Ω I n P n,y Ω. Moreover, if the model is overspecified model 7, then n Σ 22 and Y are mutually independent, n Σ 22 and δ, Γ are also mutually independent, and n Σ 22 W p p n p, Σ 22, T = {Y I n J n Y } /2 Γ Γ Σ /2 U = { n, Y n, Y } /2 δ δ Γ Γ 22, Σ /2 22, where δ = α 2 Γ α and α and α 2 are the p - and p p -dimensional subvectors of α = α, α 2. From the general formula of the determinant of a partitioned matrix e.g., Lütkepohl, 997, , log Σ = log Σ + log Σ 22 holds. Under overspecified models 7, L Θ, Σ can be expressed as L Θ, Σ = np log 2π + nlog Σ + log Σ 22 + ntrσ Σ + tr{ Σ Θ Θ Z Z Θ Θ }. Thus, from the above equation and 9, the bias B KL in 0 can be expressed as [ B KL = np + EY ntrσ Σ + tr{ Σ Θ Θ Z Z Θ ] Θ }. We separate the second and third terms in the above equation into four terms in order to calculate the bias. It follows from the general formula for the inverse of a block matrix e.g., Harville, 997, Theorem 8.5. that Σ Σ O p,p p = Γ + O p p,p O p p,p p I p p Σ 22 Γ, I p p. Then, we can derive another expression of the bias B KL as follows: [ { }] B KL = np + ney [trσ Σ ] + ne Y tr Σ 22 Γ Γ, I p p Σ I p p 4
15 [ + EY tr{ Σ Θ Θ Z Z Θ ] Θ } [ { + E Y tr Σ 22 Γ, I p p Θ Θ Z Z Θ Θ Γ I p p }] = np + i + ii + iii + iv say, B.4 where Θ and Θ 2 are the q + p and q + p p submatrices of Θ = Θ, Θ 2. Now, we calculate i, ii, iii, and iv in B.4. Starting with i in B.4, it can be stated that nσ /2 Σ Σ /2 is distributed according to W p n q, I p. Thus, by using the expectation of an inverted Wishart distribution see, Watamori, 990, we have i = ne Y [trσ Σ ] = n 2 p n p q 2. B.5 Now we calculate ii in B.4. From Lemma B., by using the expectation of an inverted Wishart distribution e.g., Fujikoshi et al., 200, Theorem 2.2.7, the expectation of EY [ Σ 22 ] can be calculated as EY [ Σ 22 ] = ne Y [n Σ 22 n ] = n p 2 Σ 22. From the above equation and the fact that Σ 22 is independent of Γ, ii can be expressed as [ }] n 2 Γ ii = n p 2 E Y tr {Σ Σ 22 Γ, I p p. By the definition of T in B.2, we have Γ Σ Σ 22 Γ, I p p I p p = Σ Γ I p p Σ 22 Γ, I p p I p p {Y I n J n Y } /2 T Σ / Σ Γ {Y I n J n Y } /2 T Σ /2 22 O p p,p O p p,p p {Y I n J n Y } /2 T Σ / Σ Γ {Y I n J n Y } /2 T Σ /2 22 O p p,p O p p,p p {Y I n J n Y } /2 T T {Y I n J n Y } /2 O p,p p + Σ O p p,p O p p,p p = ii.a + ii.b + ii.c + ii.d say. From the general formula for the inverse of a block matrix, it is straightforward to observe that tr{ii.a} = p p. Since T and Y are mutually independent, we can calculate the expectations of tr{ii.b} and tr{ii.c} as follows: E Y [tr{ii.b}] = 0, E Y [tr{ii.c}] = 0. 5
16 Note that E Y [T T ] = p p I p. Then, from the independence of T and Y, we can calculate tr{ii.d} as E Y [tr{ii.d}] = p p E Y [ tr { Σ {Y I n J n Y } }]. Hence, ii can be expressed as ii = n2 p p n p 2 { + E Y [ tr { Σ {Y I n J n Y } }]}. B.6 Now we move on to calculating iii in B.4. Recall that n Σ = Y I n P Z Y and Θ = Z Z Z Y. Since it is straightforward to observe that Z Z Z I n P Z = O q+,n, n Σ and Θ are mutually independent from Cochran s Theorem e.g., Fujikoshi et al., 200, Theorem Thus, by using a property of conditional expectations, we obtain n iii = n p q 2 E Y = n n p q 2 trp ZtrI p = np q + n p q 2. [ tr{σ P ZY ZΘ P Z Y ZΘ } ] B.7 Finally, we calculate iv in B.4. Recall that Θ is expressed by α, α 2, B, and B 2. Then, we can express Γ, I p p Θ Θ as Γ, I p p Θ Θ = Γ α α B B, I p p α 2 α 2 B 2 B 2 = α 2 α 2 Γ α α, B 2 B 2 Γ B B. From the definitions of δ, α, and α 2, we have δ = α 2 Γ α. Thus, α 2 α 2 Γ α α = α 2 Γ α α 2 Γ α = δ δ + α 2 Γ α α 2 Γ α B.8 = δ δ + Γ Γ α { } =, α δ δ. B.9 Γ Γ Since it is straightforward that B 2 B Γ = Oq,p p and it is prudent that B 2 B Γ = O q,p p under overspecified models, we obtain B 2 B 2 B B Γ = B 2 B Γ B2 B Γ + B Γ Γ δ δ = 0 q, B. B.0 Γ Γ By using B.9 and B.0, we can express B.8 as Γ, I p p Θ Θ δ δ α = Γ Γ 0 q B 6
17 = Σ /2 22 U { n, Y n, Y } /2 α, 0 q B where U is defined in B.3. Thus, by using the above equation and the independence of U and Σ 22, iv can be expressed as follows: [ }] iv = np p n p 2 E Y tr {{ n, Y n, Y } n nα nα nα α + B. X XB From the general formula for the inverse of a block matrix, { n, Y n, Y } is expressed as { n, Y n, Y } n + ȳ = S ȳ ȳ S S ȳ S. Note that S and ȳ are mutually independent and E Y [ȳ ] = α. Thus, we can calculate iv as follows: iv = np p n p 2 = np p n p 2 [ [ { + E Y tr {nȳ α ȳ α + B X XB }S }]] [ [ { + E Y tr {Σ + B X XB }S }]]. B. Therefore, from B.4, B.5, B.6, B.7, and B., Proposition B. can be derived. B.2 Results for expanding bias and its proof Proposition B.2. Suppose that q p < p and Assumption A both hold. Then, under overspecified models 7, we have B KL = mn, p + Op max {n 2, n + λ q 3/2 }, B.2 as n, p/n c [0,, where λ q is the q-th diagonal element of Λ defined in 7. Here, mn, p is the constant given by mn, p = m n, p + m 2 n, p, where m n, p is defined in 3, and m 2 n, p is given by m 2 n, p = n2 p p { 2q + trφ ntrφ 2 ntrφ 2 }, n pn p in which Φ is defined by 9. The proof of Proposition B.2 is presented as follows. We use the following lemma the proof is given in Appendix D. Lemma B.2. Suppose that n q 2 > 0. Let G be random matrices defined by G = n + λ q G I q, 7
18 where G is defined by 8, λ q are given by 7, and Φ = n I q + Λ is defined by 9. Then, using the HHD asymptotic framework, we have G = O p, and G can be expanded as G = I q G + n + λq n + λ G2 + R G, q B.3 where R G = n + λ q 3/2 G G3 = O p n + λ q 3/2. Further, when Assumption A holds, we have E[trD Φ R G ] = On n + λ q 3/2, B.4 where D = + n I q + n Λ and Λ is given in 7. We focus on the case that q < p < p because the proof is simpler where p = q. Let V = H Σ /2 S Σ /2 H, B.5 where H is the p p orthogonal matrix given by 7. It is straightforward that V is distributed according to W p n, I p ; Λ, where Λ is given by Λ O q,p q Λ =. B.6 O p q,q O p q,p q Further, we partition V as V V 2 V =, V 2 V 22 where the sizes of V and V 22 are q q and p q p q, respectively. Let Ṽ be the p p random matrix given by Ṽ = Φ /2 V Φ /2. B.7 Since Φ = E Y [V ], it is straightforward that Ṽ is distributed according to W q n, Φ ; ΛΦ. Now, we calculate the parts of the expectations in B.. Using the definitions in 7 and B.5, we have nn + p p EY [trs n p 2 Σ ] + np p n p 2 E Y [trs Ξ ] = n2 p p [ n p 2 E Y trdv ], B.8 where D and the partitioned expression are given by D = + n I p + n Λ = D O q,p q O p q,q D 22 By applying the general formula for the inverse of a block matrix to V, trdv can be expressed as trdv = trd V + trd V V 2V 22 V 2V + trd 22V 22. Recall that V W p n, I p ; Λ. By using the properties of a partitioned non-central Wishart matrix see Kabe, 964, we can state that V W q n, I q ; Λ, V 22 W p qn q. 8
19 , I p q, V 2 V /2 N p q p O p q,p, I p I p q, and V, V 22 and V 2 V /2 are mutually independent. Thus, we can calculate the expectations of trd V V 2V22 V 2V and trd 22 V22 as follows: E Y [trd V V 2V 22 V 2V ] = p 2 n p 2 E Y [trd V ], EY [trd 22 V22 ] = n p 2 trd 22. Hence, from the above equations, E Y [ trdv ] is expressed as E Y [ trdv ] = n q 2 n p 2 E Y [trd V ] + n p 2 trd 22. B.9 From the definition of Ṽ in B.7, EY [trd V ] is expressed as E Y [trd Φ Ṽ ]. Let L = n + λ q Ṽ I q. Then, by using B.3 and B.4 in Lemma B.2, we can expand EY [trd V ] as E Y [trd V ] = trd Φ + n + λ q E Y [trd Φ L 2 ] + On n + λ q 3/2. B.20 From the expectation of a non-central Wishart distribution, n + λ q E Y [trd Φ L 2 ] can be calculated as EY [trd Φ L 2 ] n + λ q = 2 trφ 3 n 5 trφ 2 + n From B.2, we can express B.20 as E Y [trd V ] = trφ 2 trφ trφ 2 2 trφ trφ 2 n n 2q + trφ + q n n trd Φ. B.2 2q + trφ + q n n + On n + λ q 3/2. B.22 Thus, by using B.8, B.9, B.22, and Proposition B., we can expand the bias as follows: B KL = np + np n + q + n p q 2 + nn + p p n p 2 + n2 n q 2p p n p 2n p 2 + n 2 p p n p 2n p 2 { trφ 2 trφ 2 + 2q + trφ + q } n n + p q + Opn + λ q 3/2. n Note that n p q 2 = n p n p 2 = n p { + q + 2 n p + } q + 22 n p 2 + On 3, }, { + 2 n p + 4 n p 2 + On 3 9
20 n p 2 = { + 2 n p n p 2n p 2 = n p + } 4 n p 2 + On 3, { + 2 n p + 2 n p + n pn p 4 + n pn p + 8 n p n p 3 8 n p 2 n p + 8 n pn p 2 + On 4 4 n p n p 2 }. Therefore, by using the above equations, we can derive B.2. C Asymptotic properties of τ, τ 2 and τ 3 Here we present the results for expanding τ, τ 2 and τ 3 defined by 4 and 5. C. Expanding τ, τ 2, and τ 3 and the proof Proposition C.. Suppose that q p < p. Then, we have τ = p p trφ + O p pn /2 n + λ q, τ 2 = np p trφ 2 + O p pn /2 n + λ q 2, τ 3 = np p trφ 2 + O p pn /2 n + λ q 2, as n, p/n c [0,, where Φ is defined by 9. The proof of Proposition C. is presented as follows. To expand τ, τ 2 and τ 3 by using a HHD asymptotic framework, we use the following lemma which was essentially obtained in Wakaki et al. 204: Lemma C.. Let F h = F F and F e be independently distributed according to W p q, I p ; M M and W p n, I p, respectively. Here, F is a q p random matrix distributed according to N q p M, I p I q. Put B = F F, W = B /2 F F e F B /2. Then, B and W are independently distributed according to W q p, I q ; MM and W q n p+q, I q, respectively. Further, the nonzero eigenvalues of F h Fe we have tr{f e F e + F h } = tr{w W + B } + p q, are the same as those of BW, and tr [ {F e F e + F h } 2] = tr [ {W W + B } 2] + p q. We consider the distributions of S e and S h defined in 5. Since Y is distributed according to N n p ZΘ, Σ I n, it is straightforward that S e and S h are mutually independent and S e W p n q, Σ, S h W p q, Σ ; Ξ, 20
21 where Ξ is given by 6. Let T e = H Σ /2 S eσ /2 H, T h = H Σ /2 S hσ /2 H, where H is the p p orthogonal matrix defined in 7. non-central Wishart matrices that It follows from the properties of T e W p n q, I p, T h W p q, I p ; Λ, where Λ is defined by B.6. Then, we can express tr{s e S e +S h } and tr[{s e S e +S h } 2 ] as tr{s e S e + S h } = tr{t e T e + T h }, tr[{s e S e + S h } 2 ] = tr [ {T e T e + T h } 2]. First, we express tr{s e S e + S h } and tr[{s e S e + S h } 2 ] as functions of q q matrices in order to examine their asymptotic behaviors. From the definition of the non-central Wishart distribution, a different expression for T h is given by T h = U h U h, where U h N q p Λ, I p I q and Λ is the q p partitioned matrix of Λ = Λ, O p,p q satisfying Λ Λ = Λ. Let W = U h U h, W 2 = W /2 U h T e U h W /2. C. Then, from Lemma C., W and W 2 are independently distributed according to W q p, I q ; Λ and W q n p, I q, respectively, and we have tr{s e S e + S h } = tr{w 2 W + W 2 } + p q, tr[{s e S e + S h } 2 ] = tr [ {W 2 W + W 2 } 2] + p q. Next, we expand tr{w 2 W + W 2 } and tr [ {W 2 W + W 2 } 2]. Let W 3 = Φ /2 W + W 2 Φ /2. C.2 Since W and W 2 are mutually independent, we can state that W 3 W q n, Φ ; ΛΦ. Therefore, using B.3 in Lemma B.2, W 3 can be expanded as W 3 = I q + O p n + λ q /2. C.3 By calculating the expectation and variance of W 2, we can also expand W 2 as Thus, Φ /2 W 2 Φ /2 is expressed as W 2 = n p I q + O p n /2. Φ /2 W 2 Φ /2 = n p Φ + O p n /2 n + λ q. C.4 From C.3 and C.4, the following equations can be derived: tr{w 2 W + W 2 } = trφ /2 W 2 Φ /2 W 3 = n p trφ + O p n /2 n + λ q, tr{w 2 W + W 2 } 2 = n p 2 trφ 2 + O p n 3/2 n + λ q 2, 2
22 tr [ {W 2 W + W 2 } 2] = tr{φ /2 W 2 Φ /2 W 3 2 } Thus, we can derive the following equations: = n p 2 trφ 2 + O p n 3/2 n + λ q 2. p p n p tr{w 2 W + W 2 } = p p trφ + O p pn /2 n + λ q, np p n p 2 tr{w 2W + W 2 } 2 = np p trφ 2 + O p pn /2 n + λ q 2, np p n p 2 tr [ {W 2 W + W 2 } 2] = np p trφ 2 + O p pn /2 n + λ q 2. Therefore, we can derive the result of Proposition C.. C.2 Expanding the expectations of τ, τ 2, and τ 3 Proposition C.2. Suppose that q p < p and Assumptions A and A2 hold. Then, we have E Y [ τ ] = p p trφ + Opn n + λ q /2, E Y [ τ 2 ] = np p trφ 2 + Opn n + λ q /2, E Y [ τ 3 ] = np p trφ 2 + Opn n + λ q /2, as n, p/n c [0,, where Φ is defined by 9. The proof of Proposition C.2 is presented as follows. Let W 3 = n + λ q W 3 I q, where W 3 is given by C.2. Then, from Lemma B.2, we can observe that W3 is expanded as W3 = I q R W3, where R W3 = n + λ q /2 W3 W 3. We consider EY [tr{w 2W + W 2 }], EY [tr{w 2W + W 2 } 2 ] and EY [tr[{w 2W + W 2 } 2 ]], where W and W 2 are given by C.. By using R W3, we have E Y [tr{w 2 W + W 2 }] = E Y [trφ /2 W 2 Φ /2 ] E Y [trφ /2 W 2 Φ /2 R W3 ], E Y [tr{w 2 W + W 2 } 2 ] = E Y [trφ /2 W 2 Φ /2 2 ] 2E Y [trφ /2 W 2 Φ /2 trφ /2 W 2 Φ /2 R W3 ] C.5 + E Y [trφ /2 W 2 Φ /2 R W3 2 ], C.6 E Y [tr[{w 2 W + W 2 } 2 ]] = E Y [tr{φ /2 W 2 Φ /2 2 }] 2E Y [tr{φ /2 W 2 Φ /2 2 R W3 }] + E Y [tr{φ /2 W 2 Φ /2 R W3 2 }]. C.7 Since Φ /2 W 2 Φ /2 is distributed according to W q n p, Φ, we can calculate the expectation as follows: E Y [trφ /2 W 2 Φ /2 ] = n p trφ + On + λ q, E Y [trφ /2 W 2 Φ /2 2 ] = n p 2 trφ 2 + Onn + λ q 2, E Y [tr{φ /2 W 2 Φ /2 2 }] = n p 2 trφ 2 + Onn + λ q 2. C.8 C.9 C.0 22
23 By using the Cauchy-Schwarz inequality, we have E Y [trφ /2 W 2 Φ /2 R W3 ] = EY [trφ /2 W 2 Φ /2 W W 3 ] n + λq 3 n + λq E Y [tr{φ /2 W 2 Φ /2 4 }] /2 E Y [tr W 4 3 ] /2 E Y [trw 2 3 ] /2 = On + λ q /2, C. E Y [trφ /2 W 2 Φ /2 trφ /2 W 2 Φ /2 R W3 ] n + λq E Y [tr{φ /2 W 2 Φ /2 8 }] /4 E Y [tr W 4 3 ] /4 E Y [trw 2 3 ] /2 = On + λ q /2, C.2 E Y [trφ /2 W 2 Φ /2 R W3 2 ] n + λ q E Y [tr{φ /2 W 2 Φ /2 4 } 2 ] /4 E Y [tr{ W 4 3 } 2 ] /4 E Y [tr{w 2 3 } 2 ] /2 = On + λ q /2, C.3 E Y [tr{φ /2 W 2 Φ /2 2 R W3 }] n + λq E Y [tr{φ /2 W 2 Φ /2 4 }] /2 E Y [tr W 8 3 ] /4 E Y [trw 4 3 ] /4 = On + λ q /2, C.4 E Y [tr{φ /2 W 2 Φ /2 R W3 2 }] n + λ q 2 E Y [tr{φ /2 W 2 Φ /2 4 }] /2 E Y [tr W 8 3 ] /2 E Y [trw 8 3 ] /2 = On + λ q /2. C.5 Therefore, from C.5-C.5, we have E Y [ τ ] = p p trφ + Opn n + λ q /2, E Y [ τ 2 ] = np p trφ 2 + Opn n + λ q /2, E Y [ τ 3 ] = np p trφ 2 + Opn n + λ q /2. D Proofs of Lemma B. and B.2 D. Proof of Lemma B. From a property of a conditional distribution of a multivariate normal distribution e.g., Srivastava and Khatri, 979, we have Y 2 Y N n p pω + Y Γ, Σ 22 I n. Then, another expression of the true model M is given as follows: Y N n p, Σ I n, Y 2 Y N n p pω + Y Γ, Σ 22 I n. 23
24 On the other hand, we can express Y 2 as Y 2 = Ω + Y Γ + AΣ /2 22, D. where A and Y are mutually independent, and A is distributed according to N n p p O n,p p, I p p I n. By using D., n Σ 22, Γ, and δ, Γ are expressed as n Σ 22 = Ω + AΣ /2 22 I n P n,y Ω + AΣ /2 22. D.2 Γ = {Y I n J n Y } Y I n J n Ω + AΣ / Γ, δ = { n, Y Γ n, Y } n, Y Ω + Y Γ + AΣ /2 22. Hence, the conditional distribution of n Σ 22, Γ, and δ, Γ under Y are n Σ 22 Y W p p n p, Σ 22 ; Ω, Γ Y N p p p {Y I n J n Y } Y I n J n Ω + Γ, Σ 22 {Y I n J n Y }, δ Γ Y N p+ p p { n, Y n, Y } n, Y Ω + Y Γ, Σ 22 { n, Y n, Y }. From the above equations, we can state that T and U are distributed according to N p p p O p,p p, I p p I p and N p + p p O p+,p p, I p p I p+, respectively, and are independent of Y. Next, we show that n Σ 22 and δ, Γ are mutually conditionally independent under Y. It is straightforward that { n, Y n, Y } n, Y I n P n,y = O p +,n. D.3 Hence, from Cochran s Theorem, the conditional independence of n Σ 22 and δ, Γ is obtained. Finally, we consider the case that the model is overspecified. By the definition of overspecified models in 7, the equation Ω = n δ holds. Thus, from D.2, we can state that n Σ 22 = Σ /2 22 A I n P n,y AΣ /2 22 W p p n p, Σ 22, and n Σ 22 and Y are mutually independent. Also, T and U are expressed as T = {Y I n J n Y } /2 Γ Γ Σ /2 U = { n, Y n, Y } /2 δ δ Γ Γ 22, Σ /2 22, and n Σ 22 and δ, ˆΓ are mutually independent from D.3. D.2 Proof of Lemma B.2 This proof is based on an idea in Hashiyama et al From the expectation of a noncentral Wishart distribution, we can calculate the expectations as E[G] = I q and E [ G I q 2] = n trφ 2 n trφ 2 + 2q + trφ 24
25 = On + λ q. The above equation implies that G = O p. By applying Taylor expansion to G, we derive G = I q + G = I q G + n + λq n + λq n + λ G2 + R G, q D.4 where R G is a remainder term in the Taylor expansion. First, we calculate an explicit form of R G. From D.4, the following equation can be derived: I q = GG = I q + = I q + GR G G I q n + λq n + λ q 3/2 G 3. From the above equation, R G can be expressed as R G = G + n + λq n + λ G2 + R G q n + λ q 3/2 G G3. Next, we show B.4. By using the Cauchy-Schwarz inequality, we have E[ trd Φ R G ] n + λ q 3/2 { E[trD 2 Φ 2 G6 ]E[trG 2 ]} /2. We can observe that D Φ = On because nd Φ 2 = q 2 n + + λi = O. i= n + λ i By using results for expectations of the multivariate normal random vector e.g., Gupta and Nagar, 2000, we can calculate E[trD 2 Φ 2 G6 ] = On 2. Therefore, we derive B.4. References [] Akaike, H Information theory and an extension of the maximum likelihood principle. In 2nd International Symposium on Information Theory eds. B. N. Petrov & F. Csáki, pp Akadémiai Kiadó, Budapest. [2] Alves, J. C. L. & Poppi, R. J Quantification of conventional and advanced biofuels contents in diesel fuel blends using near-infrared spectroscopy and multivariate calibration. Fuel, 65, [3] Bedrick, E. J. & Tsai, C.-L Model selection for multivariate regression in small samples. Biometrics, 50, [4] Bro, R Multivariate calibration: What is in chemometrics for the analytical chemist? Anal. Chim. Acta, 500, [5] Brown, P. J Multivariate calibration. J. R. Statist. Soc., B,
26 [6] Carvalho, B. M. A., Carvalho, L. M., Coimbra, J. S. R., Minim, L. A., Barcellos, E. S., Júnior, W. F. S., Detmann, E. & Carvalho, G. G. P Rapid detection of whey in milk powder samples by spectrophotometric and multivariate calibration. Food Chem., 74, 7. [7] Fujikoshi, Y. and Nishii, R Selection of variables in multivariate regression. Hiroshima Math. J., 3, [8] Fujikoshi, Y., Sakurai, T. & Yanagihara, H Consistency of high-dimensional AICtype and C p -type criteria in multivariate linear regression. J. Multivariate Anal., 23, [9] Fujikoshi, Y., Ulyanov, V. V. & Shimizu, R Multivariate Statistics: High- Dimensional and Large-Sample Approximations. John Wiley & Sons, Inc., Hoboken, New Jersey. [0] Gupta, A. & Nagar, D Matrix Variate Distributions. Chapman and Hall/CRC, Boca Raton, FL. [] Harville, D. A Matrix Algebra from a Statistician s Perspective. Springer-Verlag, New York. [2] Hashiyama, Y., Yanagihara, Y. & Fujikoshi, Y Jackknife bias correction of the AIC for selecting variables in canonical correlation analysis under model misspecification. Linear Algebra Appl., 455, [3] Hurvich, C. M. & Tsai, C.-L Regression and time series model selection in small samples. Biometrika, 76, [4] Kabe, D. G A note on the Bartlett decomposition of a Wishart matrix. J. Roy. Statist. Soc. Ser. B, 26, [5] Lütkepohl, H Handbook of Matrices. Wiley, Chichester. [6] Srivastava, M. S. & Khatri, C. G An Introduction to Multivariate Statistics. North- Holland, New York. [7] Sugiura, N Further analysis of the data by Akaike s information criterion and the finite corrections, Comm. Statist. Theory Methods, A7, [8] Wakaki, H., Fujikoshi, Y. & Ulyanov, V. V Asymptotic expansions of the distributions of MANOVA test statistics when the dimension is large. Hiroshima Math. J., 44, [9] Watamori, Y On the moments of traces of Wishart and inverted Wishart matrices. South African Statist. J., 24,
More informationABOUT PRINCIPAL COMPONENTS UNDER SINGULARITY
ABOUT PRINCIPAL COMPONENTS UNDER SINGULARITY José A. Díaz-García and Raúl Alberto Pérez-Agamez Comunicación Técnica No I-05-11/08-09-005 (PE/CIMAT) About principal components under singularity José A.
More informationThe Behaviour of the Akaike Information Criterion when Applied to Non-nested Sequences of Models
The Behaviour of the Akaike Information Criterion when Applied to Non-nested Sequences of Models Centre for Molecular, Environmental, Genetic & Analytic (MEGA) Epidemiology School of Population Health
More informationNonconcave Penalized Likelihood with A Diverging Number of Parameters
Nonconcave Penalized Likelihood with A Diverging Number of Parameters Jianqing Fan and Heng Peng Presenter: Jiale Xu March 12, 2010 Jianqing Fan and Heng Peng Presenter: JialeNonconcave Xu () Penalized
More informationModfiied Conditional AIC in Linear Mixed Models
CIRJE-F-895 Modfiied Conditional AIC in Linear Mixed Models Yuki Kawakubo Graduate School of Economics, University of Tokyo Tatsuya Kubokawa University of Tokyo July 2013 CIRJE Discussion Papers can be
More informationMultivariate Least Weighted Squares (MLWS)
() Stochastic Modelling in Economics and Finance 2 Supervisor : Prof. RNDr. Jan Ámos Víšek, CSc. Petr Jonáš 23 rd February 2015 Contents 1 2 3 4 5 6 7 1 1 Introduction 2 Algorithm 3 Proof of consistency
More informationConsistent Bivariate Distribution
A Characterization of the Normal Conditional Distributions MATSUNO 79 Therefore, the function ( ) = G( : a/(1 b2)) = N(0, a/(1 b2)) is a solu- tion for the integral equation (10). The constant times of
More informationLeast Squares Estimation-Finite-Sample Properties
Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions
More informationA Variant of AIC Using Bayesian Marginal Likelihood
CIRJE-F-971 A Variant of AIC Using Bayesian Marginal Likelihood Yuki Kawakubo Graduate School of Economics, The University of Tokyo Tatsuya Kubokawa The University of Tokyo Muni S. Srivastava University
More informationStatistical Inference with Monotone Incomplete Multivariate Normal Data
Statistical Inference with Monotone Incomplete Multivariate Normal Data p. 1/4 Statistical Inference with Monotone Incomplete Multivariate Normal Data This talk is based on joint work with my wonderful
More informationThomas J. Fisher. Research Statement. Preliminary Results
Thomas J. Fisher Research Statement Preliminary Results Many applications of modern statistics involve a large number of measurements and can be considered in a linear algebra framework. In many of these
More informationON EXACT INFERENCE IN LINEAR MODELS WITH TWO VARIANCE-COVARIANCE COMPONENTS
Ø Ñ Å Ø Ñ Ø Ð ÈÙ Ð Ø ÓÒ DOI: 10.2478/v10127-012-0017-9 Tatra Mt. Math. Publ. 51 (2012), 173 181 ON EXACT INFERENCE IN LINEAR MODELS WITH TWO VARIANCE-COVARIANCE COMPONENTS Júlia Volaufová Viktor Witkovský
More informationLinear Models 1. Isfahan University of Technology Fall Semester, 2014
Linear Models 1 Isfahan University of Technology Fall Semester, 2014 References: [1] G. A. F., Seber and A. J. Lee (2003). Linear Regression Analysis (2nd ed.). Hoboken, NJ: Wiley. [2] A. C. Rencher and
More informationCONDITIONS FOR ROBUSTNESS TO NONNORMALITY ON TEST STATISTICS IN A GMANOVA MODEL
J. Japan Statist. Soc. Vol. 37 No. 1 2007 135 155 CONDITIONS FOR ROBUSTNESS TO NONNORMALITY ON TEST STATISTICS IN A GMANOVA MODEL Hirokazu Yanagihara* This paper presents the conditions for robustness
More informationRegression and Statistical Inference
Regression and Statistical Inference Walid Mnif wmnif@uwo.ca Department of Applied Mathematics The University of Western Ontario, London, Canada 1 Elements of Probability 2 Elements of Probability CDF&PDF
More informationLasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices
Article Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Fei Jin 1,2 and Lung-fei Lee 3, * 1 School of Economics, Shanghai University of Finance and Economics,
More informationLecture 3. Inference about multivariate normal distribution
Lecture 3. Inference about multivariate normal distribution 3.1 Point and Interval Estimation Let X 1,..., X n be i.i.d. N p (µ, Σ). We are interested in evaluation of the maximum likelihood estimates
More informationModel Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model
Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model Centre for Molecular, Environmental, Genetic & Analytic (MEGA) Epidemiology School of Population
More informationA note on profile likelihood for exponential tilt mixture models
Biometrika (2009), 96, 1,pp. 229 236 C 2009 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asn059 Advance Access publication 22 January 2009 A note on profile likelihood for exponential
More informationPeter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8
Contents 1 Linear model 1 2 GLS for multivariate regression 5 3 Covariance estimation for the GLM 8 4 Testing the GLH 11 A reference for some of this material can be found somewhere. 1 Linear model Recall
More informationRandom Matrices and Multivariate Statistical Analysis
Random Matrices and Multivariate Statistical Analysis Iain Johnstone, Statistics, Stanford imj@stanford.edu SEA 06@MIT p.1 Agenda Classical multivariate techniques Principal Component Analysis Canonical
More informationarxiv: v1 [math.st] 2 Apr 2016
NON-ASYMPTOTIC RESULTS FOR CORNISH-FISHER EXPANSIONS V.V. ULYANOV, M. AOSHIMA, AND Y. FUJIKOSHI arxiv:1604.00539v1 [math.st] 2 Apr 2016 Abstract. We get the computable error bounds for generalized Cornish-Fisher
More informationLecture Stat Information Criterion
Lecture Stat 461-561 Information Criterion Arnaud Doucet February 2008 Arnaud Doucet () February 2008 1 / 34 Review of Maximum Likelihood Approach We have data X i i.i.d. g (x). We model the distribution
More informationHigh-dimensional two-sample tests under strongly spiked eigenvalue models
1 High-dimensional two-sample tests under strongly spiked eigenvalue models Makoto Aoshima and Kazuyoshi Yata University of Tsukuba Abstract: We consider a new two-sample test for high-dimensional data
More informationInferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data
Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone
More informationPart IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015
Part IB Statistics Theorems with proof Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly)
More informationAsymptotic Statistics-VI. Changliang Zou
Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous
More informationBootstrap Variants of the Akaike Information Criterion for Mixed Model Selection
Bootstrap Variants of the Akaike Information Criterion for Mixed Model Selection Junfeng Shang, and Joseph E. Cavanaugh 2, Bowling Green State University, USA 2 The University of Iowa, USA Abstract: Two
More information1 Data Arrays and Decompositions
1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7
MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 1 Random Vectors Let a 0 and y be n 1 vectors, and let A be an n n matrix. Here, a 0 and A are non-random, whereas y is
More informationSCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models
SCHOOL OF MATHEMATICS AND STATISTICS Linear and Generalised Linear Models Autumn Semester 2017 18 2 hours Attempt all the questions. The allocation of marks is shown in brackets. RESTRICTED OPEN BOOK EXAMINATION
More informationON VARIANCE COVARIANCE COMPONENTS ESTIMATION IN LINEAR MODELS WITH AR(1) DISTURBANCES. 1. Introduction
Acta Math. Univ. Comenianae Vol. LXV, 1(1996), pp. 129 139 129 ON VARIANCE COVARIANCE COMPONENTS ESTIMATION IN LINEAR MODELS WITH AR(1) DISTURBANCES V. WITKOVSKÝ Abstract. Estimation of the autoregressive
More informationSTRONG CONSISTENCY OF THE AIC, BIC, C P KOO METHODS IN HIGH-DIMENSIONAL MULTIVARIATE LINEAR REGRESSION
STRONG CONSISTENCY OF THE AIC, BIC, C P KOO METHODS IN HIGH-DIMENSIONAL MULTIVARIATE LINEAR REGRESSION AND By Zhidong Bai,, Yasunori Fujikoshi, and Jiang Hu, Northeast Normal University and Hiroshima University
More informationFinite Population Sampling and Inference
Finite Population Sampling and Inference A Prediction Approach RICHARD VALLIANT ALAN H. DORFMAN RICHARD M. ROYALL A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim Brisbane
More informationCovariance function estimation in Gaussian process regression
Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian
More informationOn testing the equality of mean vectors in high dimension
ACTA ET COMMENTATIONES UNIVERSITATIS TARTUENSIS DE MATHEMATICA Volume 17, Number 1, June 2013 Available online at www.math.ut.ee/acta/ On testing the equality of mean vectors in high dimension Muni S.
More informationEstimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk
Ann Inst Stat Math (0) 64:359 37 DOI 0.007/s0463-00-036-3 Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Paul Vos Qiang Wu Received: 3 June 009 / Revised:
More informationTAMS39 Lecture 2 Multivariate normal distribution
TAMS39 Lecture 2 Multivariate normal distribution Martin Singull Department of Mathematics Mathematical Statistics Linköping University, Sweden Content Lecture Random vectors Multivariate normal distribution
More informationObtaining simultaneous equation models from a set of variables through genetic algorithms
Obtaining simultaneous equation models from a set of variables through genetic algorithms José J. López Universidad Miguel Hernández (Elche, Spain) Domingo Giménez Universidad de Murcia (Murcia, Spain)
More informationOrthogonal decompositions in growth curve models
ACTA ET COMMENTATIONES UNIVERSITATIS TARTUENSIS DE MATHEMATICA Volume 4, Orthogonal decompositions in growth curve models Daniel Klein and Ivan Žežula Dedicated to Professor L. Kubáček on the occasion
More informationSTAT 730 Chapter 5: Hypothesis Testing
STAT 730 Chapter 5: Hypothesis Testing Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Analysis 1 / 28 Likelihood ratio test def n: Data X depend on θ. The
More information