Consistency of Distance-based Criterion for Selection of Variables in High-dimensional Two-Group Discriminant Analysis
Tetsuro Sakurai and Yasunori Fujikoshi

Center of General Education, Tokyo University of Science, Suwa, Toyohira, Chino, Nagano, Japan
Department of Mathematics, Graduate School of Science, Hiroshima University, 1-3-1 Kagamiyama, Higashi Hiroshima, Hiroshima 739-8626, Japan

Abstract

This paper is concerned with the selection of variables in two-group discriminant analysis with a common covariance matrix. We propose a distance-based criterion (DC) drawing on the distance contribution of each variable. The selection method can be applied to high-dimensional data. Sufficient conditions for the distance-based criterion to be consistent are provided when the dimension and the sample size are large. Our results, and tendencies therein, are explored numerically through a Monte Carlo simulation.

AMS 2000 subject classification: primary 62H12; secondary 62H30
Key Words and Phrases: Consistency, Discriminant analysis, High-dimensional framework, Selection of variables, Distance-based criterion.
1. Introduction

This paper is concerned with the variable selection problem in two-group discriminant analysis with a common covariance matrix. Several methods have been proposed for high-dimensional data as well as for large-sample data. One class of methods is based on model selection criteria such as AIC and BIC; see, e.g., Fujikoshi (1984) and Nishii et al. (1988). There are methods based on misclassification errors, by McLachlan (1976), Fujikoshi (1985), Hyodo and Kubokawa (2014), and Yamada et al. (2017). For high-dimensional data, there are the Lasso and other regularization methods, by Clemmensen et al. (2011), Witten and Tibshirani (2011), etc.

In this paper we propose a distance-based criterion based on the distance contribution of each variable, which is useful for high-dimensional data as well as large-sample data. Here, the distance contribution of each variable is measured after removing the effect of the other variables. The criterion involves a constant term which should be determined through some optimality consideration. We propose a class of constants for which the criterion is consistent when the dimension and the sample size are large. The class depends on the population parameters. For practical use, we also propose a class of estimators of the constant term which retains consistency. Our results, and tendencies therein, are explored numerically through a Monte Carlo simulation.

The remainder of the present paper is organized as follows. In Section 2, we present the relevant notation and the distance-based criterion. In Section 3 we derive sufficient conditions for the distance-based criterion to be consistent in a high-dimensional setting. In Section 4 we propose a class of estimators of the constant term. The proposed distance-based criterion is examined through a Monte Carlo simulation in Section 5. In Section 6, conclusions are offered. All proofs of our results are provided in the Appendix.
2. Distance-based Criterion

In two-group discriminant analysis, suppose that we have independent samples y_1^(i), ..., y_{n_i}^(i) from p-dimensional normal distributions N_p(μ^(i), Σ), i = 1, 2. Let Y be the total sample matrix defined by

  Y = (y_1^(1), ..., y_{n_1}^(1), y_1^(2), ..., y_{n_2}^(2))'.

Let Δ and D be the population and the sample Mahalanobis distances, defined by

  Δ = {(μ^(1) − μ^(2))' Σ^{-1} (μ^(1) − μ^(2))}^{1/2}

and

  D = {(x̄^(1) − x̄^(2))' S^{-1} (x̄^(1) − x̄^(2))}^{1/2},

respectively. Here x̄^(1) and x̄^(2) are the sample mean vectors, and S is the pooled sample covariance matrix based on n = n_1 + n_2 samples. Let β be the coefficient vector of the population discriminant function, given by

  β = Σ^{-1}(μ^(1) − μ^(2)) = (β_1, ..., β_p)'.   (2.1)

Suppose that j denotes a subset of ω = {1, ..., p} containing p_j elements, and y_j denotes the p_j-dimensional vector consisting of the elements of y indexed by the elements of j. We use the notation D_j and D_ω for D based on y_j and y_ω (= y), respectively.

One approach to variable selection is to consider a set of models as follows. For each subset j of ω = {1, ..., p}, consider a variable selection model M_j defined by

  M_j : β_i ≠ 0 if i ∈ j, and β_i = 0 if i ∉ j.   (2.2)

We identify the selection of M_j with the selection of y_j. Let AIC_j be the AIC for M_j. Then it is known (see, e.g., Fujikoshi 1985) that

  A_j = AIC_j − AIC_ω   (2.3)
      = n log{1 + g²(D_ω² − D_j²)/(n − 2 + g² D_j²)} − 2(p − p_j),
where g = (n_1 n_2 / n)^{1/2}. Similarly, let BIC_j be the BIC for M_j; then

  B_j = BIC_j − BIC_ω   (2.4)
      = n log{1 + g²(D_ω² − D_j²)/(n − 2 + g² D_j²)} − (log n)(p − p_j).

In a large-sample framework, it is known (see Fujikoshi 1985, Nishii et al. 1988) that AIC is not consistent, but BIC is consistent. The variable selection methods based on AIC and BIC are given by min_j AIC_j and min_j BIC_j, respectively. Such model selection methods, however, are computationally onerous when p is large, since they involve minimizing over the 2^p subsets of ω.

To circumvent this issue, we consider a distance-based criterion (DC) drawing on the distance contribution of each variable. The contribution of y_i to the distance between the two groups may be measured by

  D² − D²_(i),   (2.5)

where (i) denotes the subset of ω = {1, ..., p} obtained by omitting i from ω. If D² − D²_(i) is large, y_i may be regarded as making a large contribution to the discrimination; conversely, if D² − D²_(i) is small, y_i may be regarded as making a small contribution. Let us define

  D_{d,i} = D² − D²_(i) − d,  i = 1, ..., p,   (2.6)

where d is a positive constant. We propose a distance-based criterion for the selection of variables defined by selecting the set of suffixes, or the set of variables, given by

  DC_d = {i ∈ ω | D_{d,i} > 0},   (2.7)

or {y_i ∈ {y_1, ..., y_p} | D_{d,i} > 0}. The notation ĵ_{DC_d} is also used for DC_d.

For a distance-based criterion, an important problem is how to decide the constant term d. In Section 3 we derive a range of d for which the criterion has high-dimensional consistency. The range depends on the population parameters. For practical use, we also propose a class of estimators
for d given by

  d̂ = {(n − 2 + g²D²)/((n − p − 1)g²)} {a(np)^{1/3} + (n − p − 1)/(n − p − 3)},   (2.8)

where a is a constant satisfying a = O(1). It is shown that DC_d̂ is consistent under a high-dimensional framework.

3. High-dimensional Consistency

For studying consistency properties of the variable selection criterion DC_d, it is assumed that the true model M_* is specified through the values of μ^(1), μ^(2) and Σ. For simplicity, we write the variable selection models in terms of β, i.e., M_j. Assume that the true model M_* is included in the full model M_ω, and let the minimum model including M_* be M_{j_*}. For notational simplicity, we regard the true model M_* as M_{j_*}. Let F be the entire suite of candidate models, defined by

  F = {{1}, ..., {p}, {1, 2}, ..., {1, ..., p}}.

We separate F into two sets: the set of overspecified models, F_+ = {j ∈ F | j ⊇ j_*}, and the set of underspecified models, F_− = F_+^c ∩ F.

Here we list some of our main assumptions:

A1 (the true model): M_{j_*} ∈ F.
A2 (the high-dimensional asymptotic framework): p → ∞, n → ∞, p/n → c ∈ (0, 1).
A3: p_* (the dimension of the true model) is finite, and Δ² = O(1).
A4: n_i/n → k_i > 0, i = 1, 2.
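Before turning to the consistency argument, the selection rule (2.7) and the estimated constant term (2.8) can be made concrete with a short NumPy sketch. This is a minimal illustration under the setup of Section 2 (pooled covariance with divisor n − 2); the function names and data layout are ours, not the paper's.

```python
import numpy as np

def mahalanobis_sq(x1, x2):
    """Squared sample Mahalanobis distance D^2 between two samples.

    x1, x2: arrays of shape (n_i, p); S is the pooled sample covariance
    matrix with divisor n - 2, n = n1 + n2, as in Section 2.
    """
    n1, n2 = len(x1), len(x2)
    diff = x1.mean(axis=0) - x2.mean(axis=0)
    S = ((n1 - 1) * np.atleast_2d(np.cov(x1, rowvar=False))
         + (n2 - 1) * np.atleast_2d(np.cov(x2, rowvar=False))) / (n1 + n2 - 2)
    return float(diff @ np.linalg.solve(S, diff))

def d_hat(D2, n1, n2, p, a=1.0):
    """Estimated constant term of (2.8)/(4.1); a = 1 gives (4.2) and
    a = (1 - p/n)^2 gives (4.3)."""
    n = n1 + n2
    g2 = n1 * n2 / n
    return ((n - 2 + g2 * D2) / ((n - p - 1) * g2)
            * (a * (n * p) ** (1 / 3) + (n - p - 1) / (n - p - 3)))

def dc_select(x1, x2, d=None):
    """DC_d of (2.7): select variable i iff D^2 - D^2_{(i)} - d > 0,
    where (i) omits the i-th variable; d defaults to the estimate (4.2)."""
    p = x1.shape[1]
    D2 = mahalanobis_sq(x1, x2)
    if d is None:
        d = d_hat(D2, len(x1), len(x2), p)
    return [i for i in range(p)
            if D2 - mahalanobis_sq(np.delete(x1, i, axis=1),
                                   np.delete(x2, i, axis=1)) - d > 0]
```

Note that only p + 1 Mahalanobis distances are computed, in contrast with the 2^p subset fits that full AIC/BIC minimization would require.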
Relating to our proof of the consistency of DC_d in (2.7), note that

  DC_d = j_*  ⇔  D_{d,i} > 0 for all i ∈ j_*, and D_{d,i} ≤ 0 for all i ∉ j_*.

Therefore,

  P(DC_d = j_*) = P( ∩_{i ∈ j_*} {D_{d,i} > 0} ∩ ∩_{i ∉ j_*} {D_{d,i} ≤ 0} )
               = 1 − P( ∪_{i ∈ j_*} {D_{d,i} ≤ 0} ∪ ∪_{i ∉ j_*} {D_{d,i} > 0} )
               ≥ 1 − Σ_{i ∈ j_*} P(D_{d,i} ≤ 0) − Σ_{i ∉ j_*} P(D_{d,i} ≥ 0).

Our result will be obtained by proving the following:

  [F1]  Σ_{i ∈ j_*} P(D_{d,i} ≤ 0) → 0,   (3.1)
  [F2]  Σ_{i ∉ j_*} P(D_{d,i} ≥ 0) → 0.   (3.2)

[F1] controls the probability that DC_d fails to select a true variable, and [F2] controls the probability that DC_d selects a variable outside the true set. Our consistency result depends on

  τ_min = min_{i ∈ j_*} τ_i,   (3.3)

where τ_i = Δ² − Δ²_(i), i ∈ ω.

Theorem 3.1. Suppose that assumptions A1, A2, A3 and A4 are satisfied. Let b = τ_min/(1 − c), where τ_min is defined by (3.3). Then, if 0 < d < b, the distance-based criterion DC_d satisfies [F1], i.e.,

  Σ_{i ∈ j_*} P(D_{d,i} ≤ 0) → 0.   (3.4)
Related to a sufficient condition for (3.2), we use the following result: for i ∉ j_*,

  E{D² − D²_(i)} = {(n − 2)/((n − p − 3)(n − p − 2))} {(1/g²)(n − 3) + Δ²}   (3.5)
                ≡ q(p, n, Δ²).

Theorem 3.2. Suppose that assumptions A1, A2, A3 and A4 are satisfied. Then, if d = O(1) and d > q(p, n, Δ²), the distance-based criterion DC_d satisfies

  Σ_{i ∉ j_*} P(D_{d,i} ≥ 0) → 0.   (3.6)

Combining Theorems 3.1 and 3.2, a sufficient condition for DC_d to be consistent is given as follows.

Theorem 3.3. Suppose that assumptions A1, A2, A3 and A4 are satisfied. Then, if d = O(1) and

  q(p, n, Δ²) < d < b = τ_min/(1 − c),   (3.7)

the distance-based criterion DC_d is consistent.

For actual use, it is necessary to obtain an estimator d̂ of d. This problem is discussed in Section 4.

4. A Method of Determining the Constant Term

In the previous section we have seen that the distance-based criterion DC_d is consistent if we use a constant term d such that d = O(1) and q(p, n, Δ²) < d < τ_min/(1 − c), where q(p, n, Δ²) is given by (3.5), τ_min = min_{i ∈ j_*} τ_i, and τ_i =
Δ² − Δ²_(i). However, such a d involves unknown parameters. We need to estimate d by using estimators of Δ² and τ_min. Here, we construct an estimator by using the fact that DC_d is based on the test of τ_i = 0, or equivalently β_i = 0. The likelihood ratio test is based on (see, e.g., Rao 1973, Fujikoshi et al. 2010)

  g² D²_{{i}·(i)} / (n − 2 + g² D²_(i)),

whose null distribution is that of a ratio χ²_1/χ²_{n−p−1} of independent chi-squared variates χ²_1 and χ²_{n−p−1}, where D²_{{i}·(i)} = D² − D²_(i). Let us define d̂ as follows:

  d̂ = {(n − 2 + g²D²)/((n − p − 1)g²)} {a(np)^{1/3} + (n − p − 1)/(n − p − 3)}.   (4.1)

Here a is a constant satisfying a = O(1). Then it can be shown that DC_d̂ is consistent.

Theorem 4.1. Suppose that assumptions A1, A2, A3 and A4 are satisfied. Then the distance-based criterion DC_d̂ with d̂ in (4.1) is consistent.

As special cases of d̂, we consider the following:

  d̂_1 = {(n − 2 + g²D²)/((n − p − 1)g²)} {(np)^{1/3} + (n − p − 1)/(n − p − 3)},   (4.2)

  d̂_2 = {(n − 2 + g²D²)/((n − p − 1)g²)} {(1 − p/n)²(np)^{1/3} + (n − p − 1)/(n − p − 3)}.   (4.3)

The estimators d̂_1 and d̂_2 are obtained from d̂ by choosing a = 1 and a = (1 − p/n)², respectively.

In Theorem 3.3, we saw that DC_d is consistent under a high-dimensional framework if d satisfies q(p, n, Δ²) < d < τ_min/(1 − c). The expectation of d̂ is related to this range as follows:

  (1) E(d̂) > q(p, n, Δ²);  (2) lim_{p/n → c} E(d̂) = 0 < τ_min/(1 − c).
The above result follows from

  E[d̂] = {a(np)^{1/3} + (n − p − 1)/(n − p − 3)} {n − 2 + g²E(D²)}/((n − p − 1)g²)
       = a(np)^{1/3} {(n − 2)(n − 3)/(g²(n − p − 3)(n − p − 1)) + (n − 2)Δ²/((n − p − 3)(n − p − 1))}
         + (n − 2)(n − 3)/(g²(n − p − 3)²) + (n − 2)Δ²/(n − p − 3)²,

using n − 2 + g²E(D²) = (n − 2)(n − 3 + g²Δ²)/(n − p − 3).

5. Numerical Study

In this section we numerically explore the validity of our claims through three distance-based criteria, DC_{d_0}, DC_{d̂_1} and DC_{d̂_2}. Here, d_0 is the midpoint of the interval [q(p, n, Δ²), τ_min/(1 − c)] in (3.7). The true model was assumed as follows: the true dimension is p_* = 4, 8; the true mean vectors are μ^(1) = α(1, ..., 1, 0, ..., 0)' and μ^(2) = −α(1, ..., 1, 0, ..., 0)'; and the true covariance matrix is Σ = I_p. The selection rates associated with these criteria are given in Tables 5.1 to 5.4, where Under, True and Over denote the underspecified models, the true model and the overspecified models, respectively. The selection rates were computed over 10³ replications.

Table 5.1. Selection rates of DC_{d_0}, DC_{d̂_1} and DC_{d̂_2} for p_* = 4 and α = 1.
[Table 5.1 entries: Under/True/Over selection rates of d_0, d̂_1 and d̂_2 at various (n_1, n_2, p).]

Table 5.2. Selection rates of DC_{d_0}, DC_{d̂_1} and DC_{d̂_2} for p_* = 4 and α = 10.

[Table 5.2 entries: as in Table 5.1.]

Table 5.3. Selection rates of DC_{d_0}, DC_{d̂_1} and DC_{d̂_2} for p_* = 8 and α = 1.

[Table 5.3 entries: as in Table 5.1.]
Table 5.4. Selection rates of DC_{d_0}, DC_{d̂_1} and DC_{d̂_2} for p_* = 8 and α = 10.

[Table 5.4 entries: as in Table 5.1.]

From Tables 5.1-5.4, we can identify the following tendencies.

For all the DC criteria with d_0, d̂_1 and d̂_2, there are cases in which the selection rates are high; in particular, when the dimension p is small, as in p = 10, the selection rates are close to 1. However, when n and p are close, the selection rates are not close to one unless n and p are large. For the cases in which consistency cannot be observed in the tables, it is expected to appear as the values of n and p become larger.

For all the DC criteria with d_0, d̂_1 and d̂_2, the selection rates of the true model increase as n and p become large.

For all the DC criteria with d_0, d̂_1 and d̂_2, the selection rates of the true model decrease as n and p become closer.

For all the DC criteria with d_0, d̂_1 and d̂_2, the selection rates of the true model decrease as the dimension p increases.

For all the DC criteria with d_0, d̂_1 and d̂_2, the selection rates of the true model increase as the distance between the two groups, i.e., α, increases.

The DC criterion with d_0 tends not to select overspecified models when p is small in comparison with n, and tends to select overspecified models when p and n are close.
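A compact Monte Carlo in the spirit of this section can be sketched as follows. The design (Σ = I_p, mean difference ±α in the first p_* coordinates) follows the text, while the particular n_1, n_2, p values, the fixed threshold d, and the replication count in the example call are illustrative stand-ins for the settings of the tables.

```python
import numpy as np

rng = np.random.default_rng(0)

def mahalanobis_sq(x1, x2):
    # Squared sample Mahalanobis distance with pooled covariance (divisor n - 2).
    n1, n2 = len(x1), len(x2)
    diff = x1.mean(axis=0) - x2.mean(axis=0)
    S = ((n1 - 1) * np.cov(x1, rowvar=False)
         + (n2 - 1) * np.cov(x2, rowvar=False)) / (n1 + n2 - 2)
    return float(diff @ np.linalg.solve(np.atleast_2d(S), diff))

def dc_select(x1, x2, d):
    # DC_d of (2.7): keep variable i iff D^2 - D^2_{(i)} > d.
    p = x1.shape[1]
    D2 = mahalanobis_sq(x1, x2)
    return {i for i in range(p)
            if D2 - mahalanobis_sq(np.delete(x1, i, 1), np.delete(x2, i, 1)) > d}

def selection_rates(n1, n2, p, p_star, alpha, d, reps=100):
    """Rates at which DC_d returns an underspecified set (misses a true
    variable), exactly the true set {1, ..., p_star}, or an overspecified set."""
    true = set(range(p_star))
    shift = np.r_[alpha * np.ones(p_star), np.zeros(p - p_star)]
    under = exact = 0
    for _ in range(reps):
        x1 = rng.standard_normal((n1, p)) + shift
        x2 = rng.standard_normal((n2, p)) - shift
        j = dc_select(x1, x2, d)
        if not true <= j:
            under += 1
        elif j == true:
            exact += 1
    return under / reps, exact / reps, 1 - (under + exact) / reps
```

For example, `selection_rates(100, 100, 10, 4, alpha=1.0, d=0.5, reps=100)` returns the Under/True/Over rates for one cell of such a table.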
The DC criterion with d̂_1 does not select overspecified models, and tends to choose the variables strictly.

The DC criterion with d̂_2 tends not to select overspecified models when p is small, and tends to select overspecified models when n and p are close, as does the DC criterion with d_0.

6. Concluding Remarks

In this paper we proposed a distance-based criterion (DC) for the variable selection problem, drawing on the significance of each variable after removing the effect of the other variables. The criterion involves a constant term d and is denoted by DC_d. DC_d need not examine all the subsets of ω; it examines only the p subsets ω∖{i}, i = 1, ..., p. This circumvents the computational complexity associated with all-subsets selection procedures. Further, we identified a range of d, and a class of its estimators, such that DC_d has a high-dimensional consistency property. In particular, the criteria DC_d with d = d_0, d = d̂_1 and d = d̂_2 were examined numerically. However, the problem of choosing d optimally is left for future research. A study of high-dimensional consistency properties when p_* is infinite and Δ² = O(p) is also left for future work.

Appendix: Proofs of Theorems 3.1 and 3.2

A1. Preliminary Lemmas

First we summarize distributional results related to the Mahalanobis distance and the statistics D_{d,i} in (2.6). For notational simplicity, consider a decomposition of y = (y_1', y_2')', where y_1 is p_1 × 1 and y_2 is p_2 × 1. Let D and D_1 be the
sample Mahalanobis distances based on y and y_1, respectively, and let D²_{2·1} = D² − D_1². Similarly, the corresponding population quantities are denoted Δ, Δ_1 and Δ²_{2·1}. Let the coefficient vector β of the linear discriminant function in (2.1) be decomposed as β = (β_1', β_2')', with β_1 of dimension p_1 and β_2 of dimension p_2. The following lemmas (see, e.g., Fujikoshi et al. 2010) are used.

Lemma A.1. For the population Mahalanobis distances Δ and Δ_1, it holds that
(1) Δ² ≥ Δ_1²;
(2) Δ² − Δ_1² = 0 ⇔ β_2 = 0.

Note that

  Δ² − Δ²_(i) = 0 ⇔ β_i = 0.   (A.1)

Lemma A.2. For the sample Mahalanobis distances D and D_1, it holds that
(1) D² = (n − 2)g^{-2} χ²_p(g²Δ²) / χ²_{n−p−1}.
(2) If Δ² − Δ_1² = 0, then g²(D² − D_1²)/{n − 2 + g²D_1²} = χ²_{p_2} / χ²_{n−p−1}.
(3) If Δ² − Δ_1² = 0, then (D² − D_1²)/{n − 2 + g²D_1²} and D_1² are independent.
(4) D²_{2·1} = (n − 2)g^{-2} (1 + R) χ²_{p_2}(g²Δ²_{2·1}(1 + R)^{-1}) / χ²_{n−p−1}, where R = χ²_{p_1}(g²Δ_1²) / χ²_{n−p_1−1}.
Here, χ²_{p_1}(·), χ²_{n−p_1−1}, χ²_{p_2}(·) and χ²_{n−p−1} are independent chi-squared variates.

From Lemma A.2(1), D² and D²_(i) can be expressed as

  D² = (n − 2)g^{-2} χ²_p(g²Δ²)/χ²_{n−p−1},  D²_(i) = (n − 2)g^{-2} χ²_{p−1}(g²Δ²_(i))/χ²_{n−p}.   (A.2)

It is easy to see that

  (1/m) χ²_m(δ²) →_p 1 + η², if η² = lim_{m→∞} δ²/m.   (A.3)
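Lemma A.2(1), i.e., the representation (A.2), can be checked by simulating the chi-squared ratio directly; combined with (A.3) it yields the limit D² →_p c/(k_1 k_2(1 − c)) + Δ²/(1 − c) used in the proof of Theorem 3.1 below. The sketch draws D² from the representation rather than from data, and the parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def draw_D2(n1, n2, p, Delta2, size=1):
    """Draw D^2 via Lemma A.2(1):
    D^2 = (n - 2) g^{-2} chi2_p(g^2 Delta^2) / chi2_{n-p-1},
    with g^2 = n1 n2 / n and independent numerator and denominator."""
    n = n1 + n2
    g2 = n1 * n2 / n
    num = rng.noncentral_chisquare(p, g2 * Delta2, size=size)
    den = rng.chisquare(n - p - 1, size=size)
    return (n - 2) / g2 * num / den

# High-dimensional limit via (A.3): with p/n -> c and n_i/n -> k_i,
#   D^2 ->_p  c / (k1 k2 (1 - c)) + Delta^2 / (1 - c).
n1 = n2 = 2000; p = 2000; Delta2 = 1.0      # so c = 1/2, k1 = k2 = 1/2
limit = 0.5 / (0.25 * 0.5) + Delta2 / 0.5   # = 6.0
draws = draw_D2(n1, n2, p, Delta2, size=200)
```

With these values the empirical mean of the draws concentrates near the limit, matching the exact expectation E(D²) = (n − 2)(Δ² + p/g²)/(n − p − 3).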
A2. Proofs of Theorems 3.1 and 3.2

Without loss of generality, we may assume j_* = {1, ..., s} and p_* = s. Then [F1] and [F2] are expressed as

  [F1] = Σ_{i=1}^{s} P(D_{d,i} ≤ 0),  [F2] = Σ_{i=s+1}^{p} P(D_{d,i} ≥ 0).

Proof of Theorem 3.1. Using (A.2) and (A.3),

  D² = (n − 2)g^{-2} χ²_p(g²Δ²)/χ²_{n−p−1}
     = {(n − 2)n/(n_1 n_2)} {χ²_p(g²Δ²)/n}/{χ²_{n−p−1}/n}
     →_p c/(k_1 k_2(1 − c)) + Δ²/(1 − c).

Similarly,

  D²_(i) →_p c/(k_1 k_2(1 − c)) + Δ²_(i)/(1 − c).

Therefore,

  D² − D²_(i) →_p (Δ² − Δ²_(i))/(1 − c).

If i ∈ j_* = {1, ..., s}, then τ_i = Δ² − Δ²_(i) > 0, i = 1, ..., s, and

  D² − D²_(i) →_p τ_i/(1 − c).

Therefore, if d < τ_min/(1 − c), where τ_min = min_{i=1,...,s} τ_i, then D_{d,i} = D² − D²_(i) − d converges in probability to a positive limit, so P(D_{d,i} ≤ 0) → 0 and

  [F1] = Σ_{i=1}^{s} P(D_{d,i} ≤ 0) → 0,
since s is finite. □

Proof of Theorem 3.2. Next we show [F2] under the assumptions of Theorem 3.2. Assume that i ∈ {s + 1, ..., p}. Then Δ² = Δ²_(i). In general, note that D² − D²_(i) ≥ 0. Using Lemma A.2(1),

  E(D²) = (n − 2)np/(n_1 n_2(n − p − 3)) + (n − 2)Δ²/(n − p − 3),

and hence

  E{D² − D²_(i)} = q(p, n, Δ²),

where q(p, n, Δ²) is given by (3.5). Suppose that d > q(p, n, Δ²). Then we have

  P(D² − D²_(i) ≥ d) = P(D² − D²_(i) − E[D² − D²_(i)] ≥ d − E[D² − D²_(i)])
                     ≤ Var(D² − D²_(i)) / (d − E[D² − D²_(i)])².

The last inequality follows from Chebyshev's inequality. It is easy to see that E[D² − D²_(i)] = O(n^{-1}). Since d = O(1) by assumption, (d − E[D² − D²_(i)])² = O(1). Using Lemma A.2(4), the variance is expressed as

  Var(D² − D²_(i)) = E{(D² − D²_(i))²} − {E(D² − D²_(i))}²
                   = (n − 2)²g^{-4} E[{(1 + R)χ²_1/χ²_{n−p−1}}²]
                     − (n − 2)²g^{-4}(n − p − 3)^{-2} {1 + (p − 1 + g²Δ²)/(n − p − 2)}²,

where R = χ²_{p−1}(g²Δ²)/χ²_{n−p}.
The expectation of the first term can be computed as

  E[{(1 + R)χ²_1/χ²_{n−p−1}}²]
   = {3/((n − p − 3)(n − p − 5))} E[1 + 2R + R²]
   = {3/((n − p − 3)(n − p − 5))} {1 + 2(p − 1 + g²Δ²)/(n − p − 2)
       + ((p − 1 + g²Δ²)² + 2(p − 1) + 4g²Δ²)/((n − p − 2)(n − p − 4))}
   = {3/((n − p − 3)(n − p − 5))} {1 + O(1) + O(1)} = O(n^{-2}).

The second term is evaluated as

  (n − 2)²g^{-4}(n − p − 3)^{-2} {1 + (p − 1 + g²Δ²)/(n − p − 2)}² = O(1) · O(n^{-2}) · O(1) = O(n^{-2}).

Therefore, we have

  Var(D² − D²_(i)) = (n − 2)²g^{-4} O(n^{-2}) − O(n^{-2}) = O(n^{-2}),

since (n − 2)²g^{-4} = O(1). These imply the following:

  P(D² − D²_(i) ≥ d) ≤ O(n^{-2}),

and hence

  Σ_{i=s+1}^{p} P(D² − D²_(i) ≥ d) → 0,

which proves [F2] → 0. □
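The moment identity (3.5) underlying this proof can be verified numerically. For a variable i outside the true set (so Δ² = Δ²_(i)), Lemma A.2(4) represents D² − D²_(i) as (n − 2)g^{-2}(1 + R)χ²_1/χ²_{n−p−1} with R = χ²_{p−1}(g²Δ²)/χ²_{n−p}, whose mean is q(p, n, Δ²). A simulation sketch with illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(2)

def q(p, n, g2, Delta2):
    # q(p, n, Delta^2) of (3.5).
    return (n - 2) / ((n - p - 3) * (n - p - 2)) * ((n - 3) / g2 + Delta2)

def draw_gap(n1, n2, p, Delta2, size):
    """D^2 - D^2_{(i)} for a null variable i, via Lemma A.2(4) with
    Delta^2_{2.1} = 0:  (n - 2) g^{-2} (1 + R) chi2_1 / chi2_{n-p-1},
    where R = chi2_{p-1}(g^2 Delta^2) / chi2_{n-p}."""
    n = n1 + n2
    g2 = n1 * n2 / n
    R = (rng.noncentral_chisquare(p - 1, g2 * Delta2, size)
         / rng.chisquare(n - p, size))
    return (n - 2) / g2 * (1 + R) * rng.chisquare(1, size) / rng.chisquare(n - p - 1, size)

n1 = n2 = 100; p = 10; Delta2 = 4.0
gaps = draw_gap(n1, n2, p, Delta2, size=200_000)
q_val = q(p, n1 + n2, n1 * n2 / (n1 + n2), Delta2)
```

The empirical mean of `gaps` agrees with `q_val`, consistent with (3.5).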
A3. Proof of Theorem 4.1

As in the proofs of Theorems 3.1 and 3.2, assume that j_* = {1, ..., s}, and show that

  [F1] = Σ_{i=1}^{s} P(D² − D²_(i) ≤ d̂) → 0,
  [F2] = Σ_{i=s+1}^{p} P(D² − D²_(i) ≥ d̂) → 0.

First consider [F1], and assume that i ∈ {1, ..., s}. In the proof of Theorem 3.1, it was shown that

  D² →_p c/(k_1 k_2(1 − c)) + Δ²/(1 − c).

Using this result, we have d̂ →_p 0. These imply that

  D² − D²_(i) − d̂ →_p τ_i/(1 − c) > 0,

and hence

  lim P(D² − D²_(i) ≤ d̂) = 0,

which implies [F1] → 0.

Next, we show [F2] → 0. Assume that i ∈ {s + 1, ..., p}. Then Δ² = Δ²_(i). Write D²_{{i}·(i)} = D² − D²_(i). Using Lemma A.2(2) and (3), we can write

  P(D² − D²_(i) ≥ d̂) = P( χ²_1/χ²_{n−p−1} ≥ g²d̂/(n − 2 + g²D²_(i)) )
                     ≤ P( χ²_1/χ²_{n−p−1} ≥ g²d̂/(n − 2 + g²D²) )
                     = P( F_{1,n−p−1} ≥ (n − p − 1)/(n − p − 3) + a(np)^{1/3} ).
Noting that the mean of F_{1,n−p−1} is (n − p − 1)/(n − p − 3), we have

  P(D² − D²_(i) ≥ d̂) ≤ P( F_{1,n−p−1} − E[F_{1,n−p−1}] ≥ a(np)^{1/3} )
                     ≤ Var(F_{1,n−p−1}) / (a²(np)^{2/3})
                     = 2(n − p − 1)²(n − p − 2) / (a²(np)^{2/3}(n − p − 3)²(n − p − 5)),

using Chebyshev's inequality. This implies that

  [F2] ≤ Σ_{i=s+1}^{p} 2(n − p − 1)²(n − p − 2)/(a²(np)^{2/3}(n − p − 3)²(n − p − 5)) = O(n^{-1/3}),

which proves [F2] → 0. □

Acknowledgements

The first author's research was partially supported by the Ministry of Education, Science, Sports, and Culture, through a Grant-in-Aid for Scientific Research (C), 16K00047.

References

[1] Clemmensen, L., Hastie, T., Witten, D. M. and Ersbøll, B. (2011). Sparse discriminant analysis. Technometrics, 53.

[2] Fujikoshi, Y. (1985). Selection of variables in two-group discriminant analysis by error rate and Akaike's information criteria. J. Multivariate Anal., 17.

[3] Fujikoshi, Y., Ulyanov, V. V. and Shimizu, R. (2010). Multivariate Statistics: High-Dimensional and Large-Sample Approximations. Wiley, Hoboken, N.J.
[4] Hyodo, M. and Kubokawa, T. (2014). A variable selection criterion for linear discriminant rule and its optimality in high dimensional and large sample data. J. Multivariate Anal., 123.

[5] Nishii, R., Bai, Z. D. and Krishnaiah, P. R. (1988). Strong consistency of the information criterion for model selection in multivariate analysis. Hiroshima Math. J., 18.

[6] Rao, C. R. (1973). Linear Statistical Inference and Its Applications, 2nd ed. Wiley, New York.

[7] Yamada, T., Sakurai, T. and Fujikoshi, Y. (2017). High-dimensional asymptotic results for EPMCs of W- and Z-rules. Hiroshima Statistical Research Group, TR 17-.

[8] Witten, D. M. and Tibshirani, R. (2011). Penalized classification using Fisher's linear discriminant. J. R. Statist. Soc. B, 73.
More informationSmall sample size in high dimensional space - minimum distance based classification.
Small sample size in high dimensional space - minimum distance based classification. Ewa Skubalska-Rafaj lowicz Institute of Computer Engineering, Automatics and Robotics, Department of Electronics, Wroc
More informationExtended Bayesian Information Criteria for Gaussian Graphical Models
Extended Bayesian Information Criteria for Gaussian Graphical Models Rina Foygel University of Chicago rina@uchicago.edu Mathias Drton University of Chicago drton@uchicago.edu Abstract Gaussian graphical
More informationNonconcave Penalized Likelihood with A Diverging Number of Parameters
Nonconcave Penalized Likelihood with A Diverging Number of Parameters Jianqing Fan and Heng Peng Presenter: Jiale Xu March 12, 2010 Jianqing Fan and Heng Peng Presenter: JialeNonconcave Xu () Penalized
More informationExtended Bayesian Information Criteria for Model Selection with Large Model Spaces
Extended Bayesian Information Criteria for Model Selection with Large Model Spaces Jiahua Chen, University of British Columbia Zehua Chen, National University of Singapore (Biometrika, 2008) 1 / 18 Variable
More informationInferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data
Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone
More informationA Test for Order Restriction of Several Multivariate Normal Mean Vectors against all Alternatives when the Covariance Matrices are Unknown but Common
Journal of Statistical Theory and Applications Volume 11, Number 1, 2012, pp. 23-45 ISSN 1538-7887 A Test for Order Restriction of Several Multivariate Normal Mean Vectors against all Alternatives when
More informationEXPLICIT EXPRESSIONS OF PROJECTORS ON CANONICAL VARIABLES AND DISTANCES BETWEEN CENTROIDS OF GROUPS. Haruo Yanai*
J. Japan Statist. Soc. Vol. 11 No. 1 1981 43-53 EXPLICIT EXPRESSIONS OF PROJECTORS ON CANONICAL VARIABLES AND DISTANCES BETWEEN CENTROIDS OF GROUPS Haruo Yanai* Generalized expressions of canonical correlation
More informationModel Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model
Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model Centre for Molecular, Environmental, Genetic & Analytic (MEGA) Epidemiology School of Population
More informationHigh-dimensional regression modeling
High-dimensional regression modeling David Causeur Department of Statistics and Computer Science Agrocampus Ouest IRMAR CNRS UMR 6625 http://www.agrocampus-ouest.fr/math/causeur/ Course objectives Making
More informationEdgeworth Expansions of Functions of the Sample Covariance Matrix with an Unknown Population
Edgeworth Expansions of Functions of the Sample Covariance Matrix with an Unknown Population (Last Modified: April 24, 2008) Hirokazu Yanagihara 1 and Ke-Hai Yuan 2 1 Department of Mathematics, Graduate
More informationChris Fraley and Daniel Percival. August 22, 2008, revised May 14, 2010
Model-Averaged l 1 Regularization using Markov Chain Monte Carlo Model Composition Technical Report No. 541 Department of Statistics, University of Washington Chris Fraley and Daniel Percival August 22,
More informationMS-C1620 Statistical inference
MS-C1620 Statistical inference 10 Linear regression III Joni Virta Department of Mathematics and Systems Analysis School of Science Aalto University Academic year 2018 2019 Period III - IV 1 / 32 Contents
More informationLecture 32: Asymptotic confidence sets and likelihoods
Lecture 32: Asymptotic confidence sets and likelihoods Asymptotic criterion In some problems, especially in nonparametric problems, it is difficult to find a reasonable confidence set with a given confidence
More informationA Fast Algorithm for Optimizing Ridge Parameters in a Generalized Ridge Regression by Minimizing an Extended GCV Criterion
TR-No. 17-07, Hiroshima Statistical Research Group, 1 24 A Fast Algorithm for Optimizing Ridge Parameters in a Generalized Ridge Regression by Minimizing an Extended GCV Criterion Mineaki Ohishi, Hirokazu
More informationKen-ichi Kamo Department of Liberal Arts and Sciences, Sapporo Medical University, Hokkaido, Japan
A STUDY ON THE BIAS-CORRECTION EFFECT OF THE AIC FOR SELECTING VARIABLES IN NOR- MAL MULTIVARIATE LINEAR REGRESSION MOD- ELS UNDER MODEL MISSPECIFICATION Authors: Hirokazu Yanagihara Department of Mathematics,
More informationA NOTE ON ESTIMATION UNDER THE QUADRATIC LOSS IN MULTIVARIATE CALIBRATION
J. Japan Statist. Soc. Vol. 32 No. 2 2002 165 181 A NOTE ON ESTIMATION UNDER THE QUADRATIC LOSS IN MULTIVARIATE CALIBRATION Hisayuki Tsukuma* The problem of estimation in multivariate linear calibration
More informationJackknife Bias Correction of the AIC for Selecting Variables in Canonical Correlation Analysis under Model Misspecification
Jackknife Bias Correction of the AIC for Selecting Variables in Canonical Correlation Analysis under Model Misspecification (Last Modified: May 22, 2012) Yusuke Hashiyama, Hirokazu Yanagihara 1 and Yasunori
More informationQing-Ming Cheng and Young Jin Suh
J. Korean Math. Soc. 43 (2006), No. 1, pp. 147 157 MAXIMAL SPACE-LIKE HYPERSURFACES IN H 4 1 ( 1) WITH ZERO GAUSS-KRONECKER CURVATURE Qing-Ming Cheng and Young Jin Suh Abstract. In this paper, we study
More informationLinear Regression. September 27, Chapter 3. Chapter 3 September 27, / 77
Linear Regression Chapter 3 September 27, 2016 Chapter 3 September 27, 2016 1 / 77 1 3.1. Simple linear regression 2 3.2 Multiple linear regression 3 3.3. The least squares estimation 4 3.4. The statistical
More informationStepwise Searching for Feature Variables in High-Dimensional Linear Regression
Stepwise Searching for Feature Variables in High-Dimensional Linear Regression Qiwei Yao Department of Statistics, London School of Economics q.yao@lse.ac.uk Joint work with: Hongzhi An, Chinese Academy
More informationVariable selection in multiple regression with random design
Variable selection in multiple regression with random design arxiv:1506.08022v1 [math.st] 26 Jun 2015 Alban Mbina Mbina 1, Guy Martial Nkiet 2, Assi Nguessan 3 1 URMI, Université des Sciences et Techniques
More informationNonstationary spatial process modeling Part II Paul D. Sampson --- Catherine Calder Univ of Washington --- Ohio State University
Nonstationary spatial process modeling Part II Paul D. Sampson --- Catherine Calder Univ of Washington --- Ohio State University this presentation derived from that presented at the Pan-American Advanced
More informationInference After Variable Selection
Department of Mathematics, SIU Carbondale Inference After Variable Selection Lasanthi Pelawa Watagoda lasanthi@siu.edu June 12, 2017 Outline 1 Introduction 2 Inference For Ridge and Lasso 3 Variable Selection
More informationA Confidence Region Approach to Tuning for Variable Selection
A Confidence Region Approach to Tuning for Variable Selection Funda Gunes and Howard D. Bondell Department of Statistics North Carolina State University Abstract We develop an approach to tuning of penalized
More informationFeature selection with high-dimensional data: criteria and Proc. Procedures
Feature selection with high-dimensional data: criteria and Procedures Zehua Chen Department of Statistics & Applied Probability National University of Singapore Conference in Honour of Grace Wahba, June
More informationChapter 1 Statistical Inference
Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations
More informationRegression, Ridge Regression, Lasso
Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.
More informationHypothesis testing:power, test statistic CMS:
Hypothesis testing:power, test statistic The more sensitive the test, the better it can discriminate between the null and the alternative hypothesis, quantitatively, maximal power In order to achieve this
More informationGlobal Model Fit Test for Nonlinear SEM
Global Model Fit Test for Nonlinear SEM Rebecca Büchner, Andreas Klein, & Julien Irmer Goethe-University Frankfurt am Main Meeting of the SEM Working Group, 2018 Nonlinear SEM Measurement Models: Structural
More informationAR-order estimation by testing sets using the Modified Information Criterion
AR-order estimation by testing sets using the Modified Information Criterion Rudy Moddemeijer 14th March 2006 Abstract The Modified Information Criterion (MIC) is an Akaike-like criterion which allows
More informationA COMMENT ON FREE GROUP FACTORS
A COMMENT ON FREE GROUP FACTORS NARUTAKA OZAWA Abstract. Let M be a finite von Neumann algebra acting on the standard Hilbert space L 2 (M). We look at the space of those bounded operators on L 2 (M) that
More informationMATH 829: Introduction to Data Mining and Analysis Graphical Models III - Gaussian Graphical Models (cont.)
1/12 MATH 829: Introduction to Data Mining and Analysis Graphical Models III - Gaussian Graphical Models (cont.) Dominique Guillot Departments of Mathematical Sciences University of Delaware May 6, 2016
More informationComment about AR spectral estimation Usually an estimate is produced by computing the AR theoretical spectrum at (ˆφ, ˆσ 2 ). With our Monte Carlo
Comment aout AR spectral estimation Usually an estimate is produced y computing the AR theoretical spectrum at (ˆφ, ˆσ 2 ). With our Monte Carlo simulation approach, for every draw (φ,σ 2 ), we can compute
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationKRUSKAL-WALLIS ONE-WAY ANALYSIS OF VARIANCE BASED ON LINEAR PLACEMENTS
Bull. Korean Math. Soc. 5 (24), No. 3, pp. 7 76 http://dx.doi.org/34/bkms.24.5.3.7 KRUSKAL-WALLIS ONE-WAY ANALYSIS OF VARIANCE BASED ON LINEAR PLACEMENTS Yicheng Hong and Sungchul Lee Abstract. The limiting
More informationExtreme inference in stationary time series
Extreme inference in stationary time series Moritz Jirak FOR 1735 February 8, 2013 1 / 30 Outline 1 Outline 2 Motivation The multivariate CLT Measuring discrepancies 3 Some theory and problems The problem
More informationLecture 7: Model Building Bus 41910, Time Series Analysis, Mr. R. Tsay
Lecture 7: Model Building Bus 41910, Time Series Analysis, Mr R Tsay An effective procedure for building empirical time series models is the Box-Jenkins approach, which consists of three stages: model
More informationarxiv: v1 [stat.co] 26 May 2009
MAXIMUM LIKELIHOOD ESTIMATION FOR MARKOV CHAINS arxiv:0905.4131v1 [stat.co] 6 May 009 IULIANA TEODORESCU Abstract. A new approach for optimal estimation of Markov chains with sparse transition matrices
More information1. Introduction Over the last three decades a number of model selection criteria have been proposed, including AIC (Akaike, 1973), AICC (Hurvich & Tsa
On the Use of Marginal Likelihood in Model Selection Peide Shi Department of Probability and Statistics Peking University, Beijing 100871 P. R. China Chih-Ling Tsai Graduate School of Management University
More information1 Introduction. 2 AIC versus SBIC. Erik Swanson Cori Saviano Li Zha Final Project
Erik Swanson Cori Saviano Li Zha Final Project 1 Introduction In analyzing time series data, we are posed with the question of how past events influences the current situation. In order to determine this,
More informationThe Third International Workshop in Sequential Methodologies
Area C.6.1: Wednesday, June 15, 4:00pm Kazuyoshi Yata Institute of Mathematics, University of Tsukuba, Japan Effective PCA for large p, small n context with sample size determination In recent years, substantial
More informationSIMULTANEOUS CONFIDENCE INTERVALS AMONG k MEAN VECTORS IN REPEATED MEASURES WITH MISSING DATA
SIMULTANEOUS CONFIDENCE INTERVALS AMONG k MEAN VECTORS IN REPEATED MEASURES WITH MISSING DATA Kazuyuki Koizumi Department of Mathematics, Graduate School of Science Tokyo University of Science 1-3, Kagurazaka,
More informationOutline. Mixed models in R using the lme4 package Part 3: Longitudinal data. Sleep deprivation data. Simple longitudinal data
Outline Mixed models in R using the lme4 package Part 3: Longitudinal data Douglas Bates Longitudinal data: sleepstudy A model with random effects for intercept and slope University of Wisconsin - Madison
More informationHigh-dimensional two-sample tests under strongly spiked eigenvalue models
1 High-dimensional two-sample tests under strongly spiked eigenvalue models Makoto Aoshima and Kazuyoshi Yata University of Tsukuba Abstract: We consider a new two-sample test for high-dimensional data
More informationShrinkage Tuning Parameter Selection in Precision Matrices Estimation
arxiv:0909.1123v1 [stat.me] 7 Sep 2009 Shrinkage Tuning Parameter Selection in Precision Matrices Estimation Heng Lian Division of Mathematical Sciences School of Physical and Mathematical Sciences Nanyang
More informationMixture Modeling. Identifying the Correct Number of Classes in a Growth Mixture Model. Davood Tofighi Craig Enders Arizona State University
Identifying the Correct Number of Classes in a Growth Mixture Model Davood Tofighi Craig Enders Arizona State University Mixture Modeling Heterogeneity exists such that the data are comprised of two or
More informationA Method to Estimate the True Mahalanobis Distance from Eigenvectors of Sample Covariance Matrix
A Method to Estimate the True Mahalanobis Distance from Eigenvectors of Sample Covariance Matrix Masakazu Iwamura, Shinichiro Omachi, and Hirotomo Aso Graduate School of Engineering, Tohoku University
More informationBIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation
BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)
More informationASYMPTOTICALLY NONEXPANSIVE MAPPINGS IN MODULAR FUNCTION SPACES ABSTRACT
ASYMPTOTICALLY NONEXPANSIVE MAPPINGS IN MODULAR FUNCTION SPACES T. DOMINGUEZ-BENAVIDES, M.A. KHAMSI AND S. SAMADI ABSTRACT In this paper, we prove that if ρ is a convex, σ-finite modular function satisfying
More informationACCURATE ASYMPTOTIC ANALYSIS FOR JOHN S TEST IN MULTICHANNEL SIGNAL DETECTION
ACCURATE ASYMPTOTIC ANALYSIS FOR JOHN S TEST IN MULTICHANNEL SIGNAL DETECTION Yu-Hang Xiao, Lei Huang, Junhao Xie and H.C. So Department of Electronic and Information Engineering, Harbin Institute of Technology,
More informationMinimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals. Satoshi AOKI and Akimichi TAKEMURA
Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals Satoshi AOKI and Akimichi TAKEMURA Graduate School of Information Science and Technology University
More informationSUPPLEMENT TO TESTING UNIFORMITY ON HIGH-DIMENSIONAL SPHERES AGAINST MONOTONE ROTATIONALLY SYMMETRIC ALTERNATIVES
Submitted to the Annals of Statistics SUPPLEMENT TO TESTING UNIFORMITY ON HIGH-DIMENSIONAL SPHERES AGAINST MONOTONE ROTATIONALLY SYMMETRIC ALTERNATIVES By Christine Cutting, Davy Paindaveine Thomas Verdebout
More informationModel comparison and selection
BS2 Statistical Inference, Lectures 9 and 10, Hilary Term 2008 March 2, 2008 Hypothesis testing Consider two alternative models M 1 = {f (x; θ), θ Θ 1 } and M 2 = {f (x; θ), θ Θ 2 } for a sample (X = x)
More informationVariable Selection for Highly Correlated Predictors
Variable Selection for Highly Correlated Predictors Fei Xue and Annie Qu arxiv:1709.04840v1 [stat.me] 14 Sep 2017 Abstract Penalty-based variable selection methods are powerful in selecting relevant covariates
More informationStructure estimation for Gaussian graphical models
Faculty of Science Structure estimation for Gaussian graphical models Steffen Lauritzen, University of Copenhagen Department of Mathematical Sciences Minikurs TUM 2016 Lecture 3 Slide 1/48 Overview of
More information