Approximate interval estimation for EPMC for improved linear discriminant rule under high dimensional frame work

Size: px
Start display at page:

Download "Approximate interval estimation for EPMC for improved linear discriminant rule under high dimensional frame work"

Transcription

1 Hiroshima Statistical Research Group: Technical Report Approximate interval estimation for PMC for improved linear discriminant rule under high dimensional frame work Masashi Hyodo, Tomohiro Mitani, Tetsuto Himeno and Takashi Seo Last Modified: March 9, 5 Abstract. An observation is to be classified into one of two multivariate normal populations with equal covariance matrix. In this paper, we consider the confidence intervals for expected probability of misclassification PMC for improved linear discriminant rule in two types of data:namely, large sample data and high dimensional data. Our approximate confidence interval is based on the asymptotic normality of consistent estimator of PMC. We obtain new results of stochastic expression for two bilinear forms and two quadratic forms which are important for our asymptotic evaluation of PMC. We prove asymptotic normality under two different frameworks which could be convenient in different situations based on these results. Through simulation study, it is observed that our approximate confidence interval has a good performance not only in high dimensional and large sample settings, but also in large sample settings. AMS Mathematics Subject Classification. 6H, 6. Key words and phrases. asymptotic approximations, expected probability of misclassification, high dimensional data, linear discriminant function.. Introduction We consider the problem of classifying a future observation vector into one of the two population groups Π and Π. For each i,, Π i denotes a population from a multivariate normal distribution N p µ i, Σ, and it is supposed that x ij, j,..., N i, are observed from the population Π i. Here, µ i i, and Σ are unknown parameters, and they are estimated by the sample mean vectors x i N S n i Ni i j x ij i, and the pooled sample covariance matrix Ni j x ij x i x ij x i for n N N.

2 M. HYODO, T. MITANI, T. HIMNO AND T. SO The linear discriminant function is defined as T x x x S {x x x }. Observe however that the linear discriminant function T x has a bias. In fact, T x x Π i ] n i n p nn N p n p N N, i,, where µ µ Σ µ µ. For this reason, we use the bias-corrected discriminant function defined as. T x x x S {x x x } nn N p n p N N, where the subtraction of nn N p/{n p N N } in. is to guarantee that T x x Π i ] n/{n p } i, i,. Now using T x, a new observation x is to be assigned to Π if T x >, and to Π otherwise. The performance of this discriminant rule is evaluated by its probabilities of misclassification. The probabilities of misclassification have been obtained with respect to the distribution of the linear discriminant function T x. There are different types of misclassification probability associated with T x. These are the conditional probabilities of misclassification CPMC and expected probabilities of misclassification PMC. The CPMC is defined by. L P T x x Π, X], L P T x > x Π, X], where X x,..., x N, x,..., x N. We note that the CPMC is the conditional probability of misclassifying an observation x from Π i into Π j, i, j,, i j. On the other hand, the PMC is defined by. R L ], R L ]. We note that the PMC is the unconditional probability of misclassifying an observation x from Π i into Π j, i, j,, i j. Since the exact expression for the PMC is very complicated, there are much works for the approximation of PMC. The asymptotic approximation of PMC under a framework such that N and N are large with p is fixed has been studied. This approximation is called large sample approximation. For a review of these results, see, e.g., Okamoto 96, 968 and Siotani 98. Further, asymptotic approximation of PMC under a framework that N, N and p are all large have also been studied see, e.g., Lachenbruch 968 and Fujikoshi and Seo 998.

3 APPROXIMAT INTRVAL STIMATION FOR PMC This approximation is called high dimensional and large sample approximation. In addition, Fujikoshi gave an explicit formula of error bounds for a high dimensional and large sample approximation of PMC proposed by Lachenbruch 968. However, as their approximations are functions of unknown parameters, it must be estimated in practice. Based on the large sample approximation, Lachenbruch and Mickey 968 proposed the asymptotic unbiased estimator of PMC. On the other hand, Kubokawa, Hyodo and Srivastava proposed the second order asymptotic unbiased estimator of the PMC in high dimensional and large sample framework. In this paper, we consider the interval estimations for the PMC. Since the exact interval estimations for the PMC are very difficult problem, there are some works for the approximate confidence interval. McLachlan 975 proposed an approximate confidence interval for the CPMC based on the large sample approximation. Recently, Chung and Han 9 proposed the jackknife confidence interval and the bootstrap confidence interval for the CPMC. The problems with these methods are listed below. A Since CPMC is conditional probability, it is more desirable to derive interval estimation of PMC. B Since these methods are based on large sample asymptotic results, these methods do not perform well in high dimensional settings. For the problems A and B, we derive the asymptotic distribution of the estimator of PMC under the high dimensional and large sample frame works, and propose the approximate confidence interval for the PMC. For that purpose, we derive explicit expression of stochastic for two bilinear forms and two quadratic forms. The method used in this paper is to express based on eight primitive random variables, namely four random variables having the standard normal distribution and four random variables having chi-square distributions. This approach not only makes it easier to derive asymptotic distribution of estimator of PMC, but also enables us to show the asymptotic normality of CPMC. As a by product, we show asymptotic normality of CPMC. The organization of this paper is as follows. In Section, we propose consistent estimator of PMC. In Section, we propose new approximate confidence interval of PMC and show the asymptotic normality of CPMC. In Section 4, we investigate the performances of our approximate confidence intervals through the numerical studies. The conclusion of our study is summarized in Section 5. Some preliminary results are given in Appendix.. The consistent estimator of PMC In this section, we propose the consistent estimator of the PMC. Since R can be obtained from R simply by interchanging N and N, we only deal

4 4 M. HYODO, T. MITANI, T. HIMNO AND T. SO with R. Let c p/n, γ N /n, γ N /n. We assume the following asymptotic frameworks, in order to derive limiting value of R. A n, p with n c c for some c,, A n, N, N with n γ γ, n γ γ for some γ, γ,, A n with n for some,. Suppose that x Π. Under these conditions, a conditional distribution of T x given x, x, S is distributed as N U, V, where U x x S x µ x x S x x nn N p n p N N, V x x S ΣS x x. Then, R can be expressed as R Φ UV /], where Φ denotes the cumulative distribution function of N,. We rewrite U and V by using τ N N /n Σ / N N µ µ, u n Σ / x x, u Σ / N x N x N µ N µ, W nσ / SΣ /. n It is seen that u, u and W are mutually independently and distributed as u N p τ, I p, u N p, I p and W W p n, I p, respectively. Using these variables, we can rewrite U and V as.. U N N n u N N W n u u N N W u n τ W u N nn N p n p N N, V n n N N u W u. Applying Lemma A. to. and., we obtain the constants U and V as U lim U] n,p c, V lim V ] n,p c c γ γ.

5 APPROXIMAT INTRVAL STIMATION FOR PMC 5 Also, the expectations U U ] and V V ] can be evaluated as U U ] { n c 4 c. γ γ cγ γ } γ on, γ V V ] c n c { c c }.4 γ γ c { c c } ] γ on. γ under the asymptotic frameworks A-A. See details in Appendix B and C. Thus, using.,.4 and Chebyshev s inequality, we have that U p U and V p V. Furthermore, using continuous mapping theorem, we obtain that.5 Φ UV / Φ U V / p under the asymptotic frameworks A-A. On the other hand, it holds that Φ UV / Φ U V /.6 < a.s. Combining.5,.6 and dominated convergence theorem, we obtain the following lemma. Lemma.. Under the asymptotic frameworks A-A, it holds that R Φ c/. c/γ γ Since the limiting value of R is a function of, we begin by obtaining its consistent estimator. Lemma.. The estimator of is defined by n p x x S n p x x. n N N Under the asymptotic frameworks A-A, it holds that p. Proof We can rewrite the estimator.7 n p n N N u W u n p N N.

6 6 M. HYODO, T. MITANI, T. HIMNO AND T. SO Applying Lemma A. to.7, we have.8 ] n c 4 4 γ γ c γ γ on under the asymptotic frameworks A-A. See details in Appendix D. Thus, using.8 and Chebyshev s inequality, we have p under the asymptotic frameworks A-A. Substituting the consistent estimator into the limiting term ΦU V /, the consistent estimator of R is obtained by R Φ Û V, where Û c and V c { c/γ γ }. The following corollary is obtained from continuous mapping theorem and consistency of estimator. Corollary.. Under the asymptotic frameworks A-A, it holds that R p R.. Approximate interval estimation for PMC and asymptotic normality of CPMC In Section., we show the asymptotic normality of the estimator of PMC under two different frameworks, and propose the approximate confidence interval. In Section., we also show the asymptotic normality of CPMC... The asymptotic normality of the estimator of PMC At first, we derive the asymptotic distribution of the studentized statistics under the high dimensional frameworks A-A. We consider the following random variable n R Φ U V. To show the asymptotic normality of the above random variable, we consider the stochastic expansions of Û and V. Since the statistics Û and V are the functions of, it is essential to derive the stochastic expansion of. By using u and W, we rewrite as n p n N N u W u n p N N.

7 APPROXIMAT INTRVAL STIMATION FOR PMC 7 Define the variables where v ṽ p p, v ṽ n p n p, ṽ χ p, ṽ χ n p. Here, χ a a N means chi-square distribution with a degrees of freedom. The estimator is expanded as. D n o p n /, where D g v g v g u. Here, c c γ γ u N,, g, g, g γ γ cγ γ γ γ and v, v and u are mutually independent. From., it is noted that. Û U c D n o p n /, V V c D n o p n /, for c { c} and c c. expansion, it follows that where Here Û V U V V U V n Q o p n /, Using. and Taylor series c D n U V c D n o p n / Q q v q v q u. c c c γ γ q γ γ {cγ γ } /, q c c γ γ q c γ γ /. From the stochastic expansion of Û V, we have. R Φ c/ c/γ γ ϕ c γ γ γ γ cγ γ, c/ c/γ γ Q n o p n /,

8 8 M. HYODO, T. MITANI, T. HIMNO AND T. SO where ϕ is the p.d.f. of the standard normal distribution. Note that u is distributed as N,, v and v are asymptotically distributed as N, under the asymptotic framework A, and these variables are mutually independent. Hence, under the asymptotic frameworks A-A, it holds that.4 n R Φ c/ c/γ γ σ e where σ e ϕ c/ c/γ γ ϕ c/ c/γ γ q q q d N,, c γ γ c γ γ γ γ γ γ c γ γ /. Now turn to evaluate the difference of the limiting value of R and R. The remainder after using first term of the Taylor series of Φ at UV / U V / is given by Φ d! U V / U V / for some value d between UV / and U V /, and Φ d is equal or smaller than / πe uniformly in d,. Here, Φ is second derivative function of Φ. Hence, we have that ] R Φ U V / ϕ U V / U V / U.5 V / πe U V / U V /. We note that.6 U V / U U V / V / V U U U V / V V V V /V V V /V V V V V V / V U U V V. V

9 APPROXIMAT INTRVAL STIMATION FOR PMC 9 From A.8 and A. ].7 U U V ].8 U V / V V V c n on { U 4 n c V / c c } 4 γ γ c on. Since V /V > and V.9 V /V > V, U V V / V /V V V /V V V V < U V V ] V 4V / By using Lemma A., we obtain that V V ]. On V under the asymptotic frameworks A-A. From.9 and., See details in Appendix.. U V / V /V V V V /V V V On. V By using V V / V > V > and Cauchy Schwarz inequality, ] V V / U U V V. V V ] U U < V V U U ] V V ]. V V V V V By using Lemma A., we obtain that U U ]. On V

10 M. HYODO, T. MITANI, T. HIMNO AND T. SO under the asymptotic frameworks A-A. See details in Appendix F. From. and., ] V V / U U V V.4 On. V V Combining.7,.8,. and.4, under the asymptotic frameworks A-A, it holds that ] U V / U.5 V / On. Since V V V V >, U V / U V / U U V U U V U V V V V V V U V V V V V V U U U V V V V V V U U V U V V V U U U V V V V V By using Cauchy Schwarz inequality, we obtain that U V / U U U ] U.6 V V U V V / U U ] / V V ] /. V V From.,. and.6, we obtain that U V / U.7 V / On V V ] V under the asymptotic frameworks A-A. Combining.5,.5 and.7, under the asymptotic frameworks A-A, it holds that R Φ c/.8 c/γ γ On. By using.4 and.8, we obtain the following theorem..

11 APPROXIMAT INTRVAL STIMATION FOR PMC Theorem.. Under the asymptotic frameworks A-A, it holds that n R R T e σ e d N,. To propose the interval estimation of the PMC, we need to estimate σ e. We use truncated estimator ˆ max,, so that the estimator of σ e may be negative. Then it holds that.9 max, a.s. By using Markov s inequality,.8 and.9, we obtain ˆ p under the asymptotic frameworks A-A. Hence, ˆ is a consistent estimator of. Assigning the truncated estimator to the portion of σ e which may be negative, we propose σ e ˆ ϕ c/ ˆ ˆ c/γ γ c ˆ γ γ c ˆ γ γ ˆ γ γ γ γ c ˆ γ γ /. By using the consistent estimator σ e ˆ, we obtain the following statistics of T e n R R T e σ e ˆ Therefore we can obtain the following corollary. Corollary.. Under the asymptotic frameworks A-A, it holds that T e d N,.. Next, we show that asymptotic normality of Te the large sample framework is also established under A : p is fixed and n or A : n, p with p/ n.

12 M. HYODO, T. MITANI, T. HIMNO AND T. SO Under the frameworks A and A or the frameworks A, A and A, it holds that R Φ on / U, Φ Φ. on /, V σ e ϕ /γ γ. o, σ e ˆ p ϕ /γ γ,. T e ϕ σ e From.-., we have that Te 8 4γ γ v γ γ / u v Therefore we can obtain the following corollary. o p. γ γ / u o p. Corollary.. Assume the conditions A and A or the conditions A, A and A. Then, it holds that T e d N,. Remark.. From Corollary. and., T e has a asymptotic normality not only under high dimensional and large sample frame work, but also under the large sample framework. Based on Corollary. and., we propose an approximate α percentile confidence interval for PMC as following: C T R σ e y α, R σ e ]. y α, n n where y α denotes upper α percentile of standard normal distribution... Asymptotic normality of CPMC In this section, we show asymptotic normality of CPMC. The CPMC can be expressed as L Φ UV.

13 APPROXIMAT INTRVAL STIMATION FOR PMC Applying Lemma A. to. and., we obtain where U { n npn N ṽ N N n p nn N nu N N ṽ N N n u u ṽ nu 4 ṽ N N ṽ N N ṽ 4 n u V ṽ } u N N ṽ u ṽ u ṽ n ṽ N u N u, ṽ n N N ṽ 4 { n ṽ n n ṽ ṽ 4 ṽ 4 n n ṽ ṽ N N ṽ 4 N N ṽ u, u u ṽ } u i N, i,,, 4, ṽ χ p, ṽ χ n p, ṽ χ p, ṽ 4 χ n p, and these variables are mutually independent. Define the variables v ṽ p p, v ṽ n p n p, v ṽ p p, v 4 ṽ n p n p. Note that ṽ p p v, ṽ n p n p v, ṽ p p v, ṽ 4 n p n p v 4, and v, v, v and v 4 are asymptotically distributed as N, under the asymptotic framework A. By using Taylor series expansion based on these variables, we can expand U stochastically,.4 U U n U o p n /,

14 4 M. HYODO, T. MITANI, T. HIMNO AND T. SO where U c, cγ γ cγ γ γ γ U v v cγ γ c / γ γ cγ c / u γ γ c γ u c γ γ c c c u γ γ γ γ c / u 4. γ γ Using similar arguments, we can expand V stochastically,.5 V V V n o p n /, where V V c c, γ γ c c v c γ γ c c γ γ γ γ c 7/ v γ γ c v γ γ c c γ γ c 7/ v 4 γ γ c u. γ γ By using.4,.5 and Taylor series expansion, it follows that UV U V { U U } V / op n / nv V where Here, U V W n o p n /, W w v w v w v w 4 v 4 w 5 u w 6 u w 7 u w 8 u 4. w w c c c γ γ {cγ γ } / cγ γ γ γ cγ γ, γ c γ γ cγ γ, c c w cγ γ, w c 4 cγ γ, c w 5 γ γ {cγ γ } / c γ γ γ cγ γ, c γ w 6 γ γ cγ γ, w 7 c, w 8 c.

15 APPROXIMAT INTRVAL STIMATION FOR PMC 5 Using the Taylor series expansion, L is expressed as L ΦU V ϕu V W n o p n /. Since the random variables v, v, v, v 4, u, u, u and u 4 in W are mutually independent and asymptotically or exactly distributed as N,, we obtain the following theorem. Theorem.. Under the asymptotic frameworks A-A, it holds that nl R d N, σ, where σ {ϕu V } 8 i w i. Next, we evaluate asymptotic property of L under the large sample framework. We assume the conditions A and A or the conditions A, A and A. Then it holds that L Φ ϕ Thus, we obtain the following corollary. γ γ n u u o p n /. γ γ Corollary.. Assume the conditions A and A or the conditions A, A and A. Then, it holds that n L Φ d N, ϕ. 4γ γ Remark.. We consider the relation between the optimal rule.6 T opt x >resp. x Π resp.π, and our suggested rule.7 T x>resp. x Π resp.π, where T opt x µ µ Σ {x µ µ }, T x x x S {x x x }. From Corollary., we note that the distribution of the CPMC of the rule.7 under the condition A or A approaches a normal distribution with standard deviation shrinking in proportion to / n around the error rate of the optimal rule.6.

16 6 M. HYODO, T. MITANI, T. HIMNO AND T. SO 4. Simulation study In this section, we investigate the performance of proposed approximate confidence intervals.. In order to evaluate coverage probabilities of the approximate confidence intervals and the expected lengths, a Monte Carlo study is conducted. Without loss of generality, multivariate normal random samples are generated from Π : N p, I p and Π : N p 5, p, I p. The values of N, N and p are chosen as follows: n CaseA p,,,, 4, N : N :, :, :, p CaseB p 5, n,, 5, N : N :, :, :. In above configuration, we calculate the following coverage probabilities CP { R, R R n / σ e y α/, R n / σ e y α/ ]}, sim and the following expected lengths of approximate confidence interval L n / σ e y α/ y α/ ], where { } denotes number of element of set { }, sim denotes replication number of simulation. We also estimate the exact expected length by using Monte Carlo simulation as follows: L R α/ sim R α/ sim, where R i denotes i-th largest value among the sim values. Tables - give the coverage probabilities when p, and 5, respectively. Tables 4-6 give the expected lengths of approximate confidence interval and exact expected length when p, and 5, respectively. As can be seen from the Tables -, when the sample size or dimension is increased, probability for approximate confidence interval is close to confidence level. In addition, we observe that our approximations have a high level of accuracy in different situations: large sample settings Table, high dimensional and large sample settings Table -. From Tables 4-6, when the sample sizes increase, the expected lengths become narrower for each case. Through these simulation results, we can see that our approximate confidence interval has a good performance not only in high dimensional and large sample settings, but also in large sample settings. The asymptotic normality obtained by Corollary. is also demonstrated. Let B N,N N N /nl Φ / ϕ /, H p,n,n nl R σ.

17 APPROXIMAT INTRVAL STIMATION FOR PMC 7 Then Corollary. Theorem. show that B N,N H p,n,n converges in distribution to standard normal distribution as n n, p. To check for asymptotic normality make B N,N H p,n,n vs standard normal Q-Q plot in Case A. The straight line y x represents where asymptotic normality holds. Figure display the Q-Q plots of B N,N in Case B, and Figure, display the Q-Q plots of H p,n,n in Case A. From figures, it is confirmed that CPMC has normality when sample size is large enough compared with the dimension. 5. Conclusion The performance of classification procedure is evaluated by its error probability which usually depends on unknown parameters. In practice, we considered the interval estimation for PMC of improved linear discriminant rule. To derive an approximate confidence interval, we obtained the explicit expression of stochastic for two bilinear forms and two quadratic forms, and derived the asymptotic distribution for the studentized statistics of estimator of PMC under the high dimensional and large sample frame work. Our approximate confidence interval not only has been established in high dimensional and large sample settings, but also has been established in large sample settings. Also, we confirmed that the superiority of our approximate confidence intervals have been verified in the sense of the coverage probability and expected length by using Monte Carlo simulation. Appendix A. Stochastic expression quadratic form We present here the preliminary results about the quadratic form. Lemma A.. Let z N p ν, I p, g N p, I p, W W p n, I p and ν ν ν. Assume that n p > and p >. Then, it holds that i z W z u ν u ṽ, ṽ ii z W z u ν u ṽ ṽ { iii ν W z ν ṽ iv z W g ν u u ṽ ṽ 4 u ν u ṽ ṽ ṽ } ṽ 4,, { } ṽ u u 4, ṽ 4

18 8 M. HYODO, T. MITANI, T. HIMNO AND T. SO where u i N, i,,, 4, ṽ χ p, ṽ χ n p, ṽ χ p, ṽ 4 χ n p, and these variables are mutually independent. Here, χ p means chi-square distribution with p degrees of freedom. Proof The proof of assertions i-iv follows directly by applying the technique derived in Lemma, in Yamada et al. 5. B. Derivation of. By using Lemma A., U can be rewritten as A. { npn N N N n p nn N nu N N ṽ N N n u u ṽ nu 4 ṽ N N ṽ N N ṽ 4 n u u ṽ n ṽ N u N u. ṽ n N N ṽ 4 U n ṽ } u N N ṽ u ṽ By using above expression, we calculate the expectation of U as A. U] n n p c c n c on. n c The expectation of U is obtained by calculating the second moment of each term in A.. The second moment of each term in A. is calculated as

19 APPROXIMAT INTRVAL STIMATION FOR PMC 9 follows: { n ṽ npn N N N n p nn N N N ṽ u u ṽ n 4n p n p 4 n pn N N N n p n p n n pn N N N n p n p c c γ γ n c n c cγ γ γ γ n c γ γ nu N N ṽ N N n u u ṽ n n n p n p n c c n c γ γ on, } on, n p N N n p n p nu 4 ṽ N N ṽ N N ṽ 4 n u u ṽ n p n n p n p n p n p p N N n p n p n p c c n c n c on, γ γ { } n ṽ N u N u ṽ n N N ṽ 4 n { N n p N p } n N N n p n p n p γ γ c γ c nγ γ c on.

20 M. HYODO, T. MITANI, T. HIMNO AND T. SO Summarizing these results, we obtain that A. U ] From A. and A., we obtain that c n c n c γ c γ γ c n c γ γ n c on. γ γ U U ] U ] U U] U { n c 4 c γ γ cγ γ γ γ } on. C. Derivation of.4 By using Lemma A., V can be rewritten as A. 4 V { n ṽ ṽ n n ṽ 4 N N ṽ n n ṽ ṽ N N ṽ 4 u. ṽ u u } ṽ ṽ 4 By using above expression, we calculate the expectation of V as A. 5 V ] n ṽ ṽ n n ṽ 4 N N ṽ ṽ u u ] ṽ ṽ 4 n n n p n p n p n n n p N N n p n p n p { c } 4 n c { c c } 4 γ γ n c on. The expectation of V is obtained by calculating the second moment of each term in A.4. The second moment of each term in A.4 is calculated as

21 APPROXIMAT INTRVAL STIMATION FOR PMC follows: A. 6 { n ṽ ṽ n n ṽ 4 N N ṽ ṽ u ṽ 4 } { n 4 N N n p n p N N A. 7 { n } n ṽ ṽ u N N ṽ 4 p 4 n 4 n N N We note that Thus we obtain that ṽ 4 ṽ 4 ] ṽ 4 ] ṽ4 ] ṽ 4 p ṽ 4 ṽ4 ṽ 4 ]. u } ṽ c 4 n 4 6 c 5 n 5 on 5, c n c n on, cn. p ṽ ṽ 4 p 4 ṽṽ4 4 ] ṽ 4, A. 8 p ṽ ṽ 4 p 4 ṽṽ4 4 ] ṽ 4 c 7 c 6 n4 c 7 n 5 on 5. Substitute A.8 into A.6 and A.7, we obtain that A. 9 V ] c c 6 4 c 6 γ γ c 6 γ γ c 7 4 4{cc 7 } n c 7 c 7 γ γ c8c c 7 γ on. γ c

22 M. HYODO, T. MITANI, T. HIMNO AND T. SO From A.5 and A.9, we obtain that V V ] V ] V V ] V c n c { c c } γ γ c { c c } ] γ on. γ D. Derivation of.8 By using Lemma A., ˆ can be rewritten as A. ˆ n p n u τ u ṽ n p N N ṽ N N By using above expression, we calculate the expectation of ˆ as A. ˆ ] n p n u τ u ṽ ] n p N N ṽ N N n p n N N n p n p N N n n p N N. Also, we calculate the second moment of ˆ as A. ˆ 4 ] n p u n τ u N N ṽ ] ṽ n n p p u τ u N N ṽ ] n p ṽ N N.

23 APPROXIMAT INTRVAL STIMATION FOR PMC The expected term in A. can be calculated as A. u τ u ṽ ] ṽ A. 4 u τ u ṽ ] ṽ N N n p n n p, n p n p n {N N 4 n p N N n pp }. Substitute A. and A.4 into A., we obtain that A. 5 ˆ 4 ] 4 4n n n p n p N N n pn N N. n p From A. and A.5, we obtain that ] 4 ] ] n c γ γ c γ γ on.. Derivation of. From Lemma A., we note that < V V V < N N ṽ n n ṽ V V a.s. So, we consider to evaluate A. 6 N N ṽ ] n V V n ṽ N N ṽ n V n ṽ N N ṽ V n n ṽ ] N N ṽ V ]. ] n V n ṽ

24 4 M. HYODO, T. MITANI, T. HIMNO AND T. SO The each term on right hand side in A.6 is evaluated as A. 7 N N ṽ ] n V n ṽ { nṽ ṽ 4 N N n u u ṽ } n N N ṽ ṽ ṽ 4 n u ṽ ṽ 4 ṽ ṽ ṽ 4 n n n N N 4 n n p n p n p n pp 4 n n n p n p n p n p n pp 4 n n n n p p N N n p n p n p n pp 4 { } γ γ c 4 c c c γ γ c 5 c 4 n { } c 4 4c c c 5 cn c c 4 c c γ γ c 5 on, nγ γ and N N ṽ ] A. 8 n V n ṽ { ṽ ṽ 4 N N n u u ṽ } n ṽ ṽ 4 ] N N u ṽ ṽ 4 n ṽ ṽ 4 n N N n n pp 4 n p n pp 4 γ γ cc 4 cγ γ cc n c c ccn on.

25 APPROXIMAT INTRVAL STIMATION FOR PMC 5 Combining A.6-A.8, we obtain that N N ṽ ] n V V c 4γ γ {cc } n ṽ n c 5 c 4 c 5 c cc c 5 on. γ γ F. Derivation of. From Lemma A., it holds that < U U V So, we consider to evaluate < N N ṽ n n ṽ U U a.s. A. 9 N N ṽ ] n U U n ṽ N N ṽ ] n U N N ṽ ] U n ṽ n U n ṽ U N N ṽ ] n. n ṽ We evaluate the first term on right hand side of A.9. The random variable N N ṽ /{n n ṽ }U can be rewritten as A. N N ṽ n n ṽ / U { N N ṽ p n N N ṽ n p u u ṽ ṽ } N N n / ṽ u N N n ṽ n u u ṽ u 4 ṽ N N n ṽ ṽ 4 n u u ṽ ṽ n N u N u. ṽ ṽ 4 The expectation of N N ṽ /{n ṽ }U is obtained by calculating the second moment of each term on right hand side of A.. These second

26 6 M. HYODO, T. MITANI, T. HIMNO AND T. SO moments can be calculated as follows: N N ṽ p n p u u ṽ ṽ N N n N N ṽ n / A. ṽ A. N N 4n p 4 4 n N N n p 4n p n n p N N p n n p N N p 4, u N N n ṽ n u u ṽ N N n p 4 p n p 4 n N N n p 4 p n p 4, u 4ṽ N N A. n ṽ ṽ 4 n u u ṽ N N p n p 4n p p p n p 4n p {n N N }p p p n p 4n p n p 4n p, ṽ n N u N u N n p N p A. 4 ṽ ṽ 4 n p 4n p. From A.-A.4, we can obtain that N N ṽ ] A. 5 n U n ṽ N N 4n p 4 4 N { pp n N n p p } n p 4n p n p { n pn p N N } p n p 4 N N n p n p 4 γ γ c 4 γ γ 4c n c {cγ c γ } cc γ ] γ on. cγ γ

27 APPROXIMAT INTRVAL STIMATION FOR PMC 7 Also, we have that N N ṽ ] A. 6 n U n ṽ N N n p nn p 4 N N n p n p nn n p p 4 cγ γ {5 cc 4} γ γ c n c c γ ] γ on, c and A. 7 N N ṽ ] n n ṽ N N n p n p n n p 4 c γ γ c { cc} cγ γ c n on. Combining A.5-A.7, we obtain that N N ṽ ] n U U { 4 γ n ṽ n ccγ γ γ γγ cγ γ } on. Acknowledgments The authors would like to express their gratitude to Professor Yasunori Fujikoshi for many valuable comments and discussions. References ] Chung, H.-C. and Han, C.-P. 9. Conditional confidence intervals for classification error rate. Computational Statistics and Data Analysis, 5, ] Fujikoshi, Y.. rror bounds for asymptotic approximations of the linear discriminant function when the sample sizes and dimensionality are large. Journal of Multivariate Analysis, 7, -7. ] Fujikoshi, Y. and Seo, T Asymptotic approximations for PMC s of the linear and the quadratic discriminant functions when the sample sizes and the dimension are large. Random Operators and Stochastic quations, 6, 69-8.

28 8 M. HYODO, T. MITANI, T. HIMNO AND T. SO 4] Kubokawa, T., Hyodo, M. and Srivastava, M. S.. Asymptotic expansion and estimation of PMC for linear classification rules in high dimension. Journal of Multivariate Analysis, 5, ] Lachenbruch, P. A On expected probabilities of misclassification in discriminant analysis, necessary sample size, and a relation with the multiple correlation coefficient. Biometrics., 4, ] Lachenbruch, P. A. and Mickey, M. R stimation of error rates in discriminant analysis. Technometrics,, -. 7] McLachlan, G. J Confidence intervals for the conditional probability of misallocation in discriminant analysis. Biometrics,, ] Okamoto, M. 96. An asymptotic expansion for the distribution of the linear discriminant function. Annals of Mathematical Statistics, 4, ] Okamoto, M Correction to An asymptotic expansion for the distribution of the linear discriminant function. Annals of Mathematical Statistics, 9, ] Siotani, M. 98. Large sample approximations and asymptotic expansions of classification statistic. Handbook of Statistics P. R. Krishnaiah and L. N. Kanal, ds., North-Holland Publishing Company, 6-. ] Yamada, T., Himeno, T. and Sakurai, T. 5. Cut-off point of linear discriminant rule for large dimension. Technical Report, No.5-4, Hiroshima statistical research group, Hiroshima University.

29 APPROXIMAT INTRVAL STIMATION FOR PMC 9 Table. The coverage probabilities p α \ n N : N 4 : : : : : : : : : Table. The coverage probabilities p α \ n N : N : : : : : : : : : Table. The coverage probabilities p 5 α \ n N : N 5 : : : : : : : : :

30 M. HYODO, T. MITANI, T. HIMNO AND T. SO Table 4. The expected lengths p n 4 α N : N L L L L L L : : : : : : : : : Table 5. The expected lengths p n α N : N L L L L L L : : : : : : : : : Table 6. The expected lengths p 5 n 5 α N : N L L L L L L : : : : : : : : :

31 APPROXIMAT INTRVAL STIMATION FOR PMC N, N 5, 75 N, N 5, 5 N, N 75, 5 4 N, N 75, 5 N, N 5, 5 N, N 5, 75 4 N, N 5, 75 N, N 5, 5 N, N 75, 5 Figure. Q-Q plots of B N,N for Case B

32 M. HYODO, T. MITANI, T. HIMNO AND T. SO N, N 5, 5 N, N, N, N 5, 5 N, N 75, 5 N, N 5, 5 N, N 5, 75 N, N, N, N, N, N, Figure. Q-Q plots of H p,n,n in Case A p

33 APPROXIMAT INTRVAL STIMATION FOR PMC 4 4 N, N, N, N, N, N, N, N 5, 45 N, N, N, N 45, 5 N, N, 6 N, N 4, 4 N, N 6, Figure. Q-Q plots of H p,n,n in Case A p Masashi Hyodo Department of Mathematical Information Science, Tokyo University of Science - Kagurazaka, Shinjuku-ku, Tokyo 6-86, Japan -mail: caicmhy@gmail.com Tomohiro Mitani Graduate School of Science, Tokyo University of Science - Kagurazaka, Shinjuku-ku, Tokyo 6-86, Japan Tetsuto Himeno Faculty of Science and Technology, Seikei University -- Kichijoji-Kitamachi, Musashino-shi, Tokyo 8-86, Japan

34 4 M. HYODO, T. MITANI, T. HIMNO AND T. SO Takashi Seo Department of Mathematical Information Science Tokyo University of Science, - Kagurazaka, Shinjuku-ku, Tokyo 6-86, Japan

Expected probabilities of misclassification in linear discriminant analysis based on 2-Step monotone missing samples

Expected probabilities of misclassification in linear discriminant analysis based on 2-Step monotone missing samples Expected probabilities of misclassification in linear discriminant analysis based on 2-Step monotone missing samples Nobumichi Shutoh, Masashi Hyodo and Takashi Seo 2 Department of Mathematics, Graduate

More information

EPMC Estimation in Discriminant Analysis when the Dimension and Sample Sizes are Large

EPMC Estimation in Discriminant Analysis when the Dimension and Sample Sizes are Large EPMC Estimation in Discriminant Analysis when the Dimension and Sample Sizes are Large Tetsuji Tonda 1 Tomoyuki Nakagawa and Hirofumi Wakaki Last modified: March 30 016 1 Faculty of Management and Information

More information

Consistency of Test-based Criterion for Selection of Variables in High-dimensional Two Group-Discriminant Analysis

Consistency of Test-based Criterion for Selection of Variables in High-dimensional Two Group-Discriminant Analysis Consistency of Test-based Criterion for Selection of Variables in High-dimensional Two Group-Discriminant Analysis Yasunori Fujikoshi and Tetsuro Sakurai Department of Mathematics, Graduate School of Science,

More information

Consistency of test based method for selection of variables in high dimensional two group discriminant analysis

Consistency of test based method for selection of variables in high dimensional two group discriminant analysis https://doi.org/10.1007/s42081-019-00032-4 ORIGINAL PAPER Consistency of test based method for selection of variables in high dimensional two group discriminant analysis Yasunori Fujikoshi 1 Tetsuro Sakurai

More information

SIMULTANEOUS CONFIDENCE INTERVALS AMONG k MEAN VECTORS IN REPEATED MEASURES WITH MISSING DATA

SIMULTANEOUS CONFIDENCE INTERVALS AMONG k MEAN VECTORS IN REPEATED MEASURES WITH MISSING DATA SIMULTANEOUS CONFIDENCE INTERVALS AMONG k MEAN VECTORS IN REPEATED MEASURES WITH MISSING DATA Kazuyuki Koizumi Department of Mathematics, Graduate School of Science Tokyo University of Science 1-3, Kagurazaka,

More information

On the conservative multivariate Tukey-Kramer type procedures for multiple comparisons among mean vectors

On the conservative multivariate Tukey-Kramer type procedures for multiple comparisons among mean vectors On the conservative multivariate Tukey-Kramer type procedures for multiple comparisons among mean vectors Takashi Seo a, Takahiro Nishiyama b a Department of Mathematical Information Science, Tokyo University

More information

Consistency of Distance-based Criterion for Selection of Variables in High-dimensional Two-Group Discriminant Analysis

Consistency of Distance-based Criterion for Selection of Variables in High-dimensional Two-Group Discriminant Analysis Consistency of Distance-based Criterion for Selection of Variables in High-dimensional Two-Group Discriminant Analysis Tetsuro Sakurai and Yasunori Fujikoshi Center of General Education, Tokyo University

More information

On the conservative multivariate multiple comparison procedure of correlated mean vectors with a control

On the conservative multivariate multiple comparison procedure of correlated mean vectors with a control On the conservative multivariate multiple comparison procedure of correlated mean vectors with a control Takahiro Nishiyama a a Department of Mathematical Information Science, Tokyo University of Science,

More information

Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices

Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices Takahiro Nishiyama a,, Masashi Hyodo b, Takashi Seo a, Tatjana Pavlenko c a Department of Mathematical

More information

T 2 Type Test Statistic and Simultaneous Confidence Intervals for Sub-mean Vectors in k-sample Problem

T 2 Type Test Statistic and Simultaneous Confidence Intervals for Sub-mean Vectors in k-sample Problem T Type Test Statistic and Simultaneous Confidence Intervals for Sub-mean Vectors in k-sample Problem Toshiki aito a, Tamae Kawasaki b and Takashi Seo b a Department of Applied Mathematics, Graduate School

More information

High-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis. Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi

High-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis. Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi High-Dimensional AICs for Selection of Redundancy Models in Discriminant Analysis Tetsuro Sakurai, Takeshi Nakada and Yasunori Fujikoshi Faculty of Science and Engineering, Chuo University, Kasuga, Bunkyo-ku,

More information

High-dimensional asymptotic expansions for the distributions of canonical correlations

High-dimensional asymptotic expansions for the distributions of canonical correlations Journal of Multivariate Analysis 100 2009) 231 242 Contents lists available at ScienceDirect Journal of Multivariate Analysis journal homepage: www.elsevier.com/locate/jmva High-dimensional asymptotic

More information

Testing Some Covariance Structures under a Growth Curve Model in High Dimension

Testing Some Covariance Structures under a Growth Curve Model in High Dimension Department of Mathematics Testing Some Covariance Structures under a Growth Curve Model in High Dimension Muni S. Srivastava and Martin Singull LiTH-MAT-R--2015/03--SE Department of Mathematics Linköping

More information

Bias Correction of Cross-Validation Criterion Based on Kullback-Leibler Information under a General Condition

Bias Correction of Cross-Validation Criterion Based on Kullback-Leibler Information under a General Condition Bias Correction of Cross-Validation Criterion Based on Kullback-Leibler Information under a General Condition Hirokazu Yanagihara 1, Tetsuji Tonda 2 and Chieko Matsumoto 3 1 Department of Social Systems

More information

A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data

A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data Yujun Wu, Marc G. Genton, 1 and Leonard A. Stefanski 2 Department of Biostatistics, School of Public Health, University of Medicine

More information

Testing equality of two mean vectors with unequal sample sizes for populations with correlation

Testing equality of two mean vectors with unequal sample sizes for populations with correlation Testing equality of two mean vectors with unequal sample sizes for populations with correlation Aya Shinozaki Naoya Okamoto 2 and Takashi Seo Department of Mathematical Information Science Tokyo University

More information

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone

More information

Nonconcave Penalized Likelihood with A Diverging Number of Parameters

Nonconcave Penalized Likelihood with A Diverging Number of Parameters Nonconcave Penalized Likelihood with A Diverging Number of Parameters Jianqing Fan and Heng Peng Presenter: Jiale Xu March 12, 2010 Jianqing Fan and Heng Peng Presenter: JialeNonconcave Xu () Penalized

More information

The comparative studies on reliability for Rayleigh models

The comparative studies on reliability for Rayleigh models Journal of the Korean Data & Information Science Society 018, 9, 533 545 http://dx.doi.org/10.7465/jkdi.018.9..533 한국데이터정보과학회지 The comparative studies on reliability for Rayleigh models Ji Eun Oh 1 Joong

More information

An Unbiased C p Criterion for Multivariate Ridge Regression

An Unbiased C p Criterion for Multivariate Ridge Regression An Unbiased C p Criterion for Multivariate Ridge Regression (Last Modified: March 7, 2008) Hirokazu Yanagihara 1 and Kenichi Satoh 2 1 Department of Mathematics, Graduate School of Science, Hiroshima University

More information

SHOTA KATAYAMA AND YUTAKA KANO. Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka , Japan

SHOTA KATAYAMA AND YUTAKA KANO. Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka , Japan A New Test on High-Dimensional Mean Vector Without Any Assumption on Population Covariance Matrix SHOTA KATAYAMA AND YUTAKA KANO Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama,

More information

Economics 583: Econometric Theory I A Primer on Asymptotics

Economics 583: Econometric Theory I A Primer on Asymptotics Economics 583: Econometric Theory I A Primer on Asymptotics Eric Zivot January 14, 2013 The two main concepts in asymptotic theory that we will use are Consistency Asymptotic Normality Intuition consistency:

More information

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS Communications in Statistics - Simulation and Computation 33 (2004) 431-446 COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS K. Krishnamoorthy and Yong Lu Department

More information

Math 180B, Winter Notes on covariance and the bivariate normal distribution

Math 180B, Winter Notes on covariance and the bivariate normal distribution Math 180B Winter 015 Notes on covariance and the bivariate normal distribution 1 Covariance If and are random variables with finite variances then their covariance is the quantity 11 Cov := E[ µ ] where

More information

Statistical Inference On the High-dimensional Gaussian Covarianc

Statistical Inference On the High-dimensional Gaussian Covarianc Statistical Inference On the High-dimensional Gaussian Covariance Matrix Department of Mathematical Sciences, Clemson University June 6, 2011 Outline Introduction Problem Setup Statistical Inference High-Dimensional

More information

Testing Block-Diagonal Covariance Structure for High-Dimensional Data

Testing Block-Diagonal Covariance Structure for High-Dimensional Data Testing Block-Diagonal Covariance Structure for High-Dimensional Data MASASHI HYODO 1, NOBUMICHI SHUTOH 2, TAKAHIRO NISHIYAMA 3, AND TATJANA PAVLENKO 4 1 Department of Mathematical Sciences, Graduate School

More information

Introduction to Statistics and Error Analysis II

Introduction to Statistics and Error Analysis II Introduction to Statistics and Error Analysis II Physics116C, 4/14/06 D. Pellett References: Data Reduction and Error Analysis for the Physical Sciences by Bevington and Robinson Particle Data Group notes

More information

On testing the equality of mean vectors in high dimension

On testing the equality of mean vectors in high dimension ACTA ET COMMENTATIONES UNIVERSITATIS TARTUENSIS DE MATHEMATICA Volume 17, Number 1, June 2013 Available online at www.math.ut.ee/acta/ On testing the equality of mean vectors in high dimension Muni S.

More information

Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data

Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data Sri Lankan Journal of Applied Statistics (Special Issue) Modern Statistical Methodologies in the Cutting Edge of Science Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations

More information

Department of Econometrics and Business Statistics

Department of Econometrics and Business Statistics ISSN 440-77X Australia Department of Econometrics and Business Statistics http://wwwbusecomonasheduau/depts/ebs/pubs/wpapers/ The Asymptotic Distribution of the LIML Estimator in a artially Identified

More information

Estimation of the large covariance matrix with two-step monotone missing data

Estimation of the large covariance matrix with two-step monotone missing data Estimation of the large covariance matrix with two-ste monotone missing data Masashi Hyodo, Nobumichi Shutoh 2, Takashi Seo, and Tatjana Pavlenko 3 Deartment of Mathematical Information Science, Tokyo

More information

ON CONSTRUCTING T CONTROL CHARTS FOR RETROSPECTIVE EXAMINATION. Gunabushanam Nedumaran Oracle Corporation 1133 Esters Road #602 Irving, TX 75061

ON CONSTRUCTING T CONTROL CHARTS FOR RETROSPECTIVE EXAMINATION. Gunabushanam Nedumaran Oracle Corporation 1133 Esters Road #602 Irving, TX 75061 ON CONSTRUCTING T CONTROL CHARTS FOR RETROSPECTIVE EXAMINATION Gunabushanam Nedumaran Oracle Corporation 33 Esters Road #60 Irving, TX 7506 Joseph J. Pignatiello, Jr. FAMU-FSU College of Engineering Florida

More information

Expectation. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Expectation. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Expectation DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Aim Describe random variables with a few numbers: mean, variance,

More information

Bootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator

Bootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator Bootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator by Emmanuel Flachaire Eurequa, University Paris I Panthéon-Sorbonne December 2001 Abstract Recent results of Cribari-Neto and Zarkos

More information

Asymptotic Normality under Two-Phase Sampling Designs

Asymptotic Normality under Two-Phase Sampling Designs Asymptotic Normality under Two-Phase Sampling Designs Jiahua Chen and J. N. K. Rao University of Waterloo and University of Carleton Abstract Large sample properties of statistical inferences in the context

More information

Point and Interval Estimation for Gaussian Distribution, Based on Progressively Type-II Censored Samples

Point and Interval Estimation for Gaussian Distribution, Based on Progressively Type-II Censored Samples 90 IEEE TRANSACTIONS ON RELIABILITY, VOL. 52, NO. 1, MARCH 2003 Point and Interval Estimation for Gaussian Distribution, Based on Progressively Type-II Censored Samples N. Balakrishnan, N. Kannan, C. T.

More information

A note on profile likelihood for exponential tilt mixture models

A note on profile likelihood for exponential tilt mixture models Biometrika (2009), 96, 1,pp. 229 236 C 2009 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asn059 Advance Access publication 22 January 2009 A note on profile likelihood for exponential

More information

Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions

Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions Vadim Marmer University of British Columbia Artyom Shneyerov CIRANO, CIREQ, and Concordia University August 30, 2010 Abstract

More information

Testing Block-Diagonal Covariance Structure for High-Dimensional Data Under Non-normality

Testing Block-Diagonal Covariance Structure for High-Dimensional Data Under Non-normality Testing Block-Diagonal Covariance Structure for High-Dimensional Data Under Non-normality Yuki Yamada, Masashi Hyodo and Takahiro Nishiyama Department of Mathematical Information Science, Tokyo University

More information

Expectation. DS GA 1002 Probability and Statistics for Data Science. Carlos Fernandez-Granda

Expectation. DS GA 1002 Probability and Statistics for Data Science.   Carlos Fernandez-Granda Expectation DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Aim Describe random variables with a few numbers: mean,

More information

Statistical Inference with Monotone Incomplete Multivariate Normal Data

Statistical Inference with Monotone Incomplete Multivariate Normal Data Statistical Inference with Monotone Incomplete Multivariate Normal Data p. 1/4 Statistical Inference with Monotone Incomplete Multivariate Normal Data This talk is based on joint work with my wonderful

More information

Inference For High Dimensional M-estimates. Fixed Design Results

Inference For High Dimensional M-estimates. Fixed Design Results : Fixed Design Results Lihua Lei Advisors: Peter J. Bickel, Michael I. Jordan joint work with Peter J. Bickel and Noureddine El Karoui Dec. 8, 2016 1/57 Table of Contents 1 Background 2 Main Results and

More information

Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring

Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring A. Ganguly, S. Mitra, D. Samanta, D. Kundu,2 Abstract Epstein [9] introduced the Type-I hybrid censoring scheme

More information

A Simulation Comparison Study for Estimating the Process Capability Index C pm with Asymmetric Tolerances

A Simulation Comparison Study for Estimating the Process Capability Index C pm with Asymmetric Tolerances Available online at ijims.ms.tku.edu.tw/list.asp International Journal of Information and Management Sciences 20 (2009), 243-253 A Simulation Comparison Study for Estimating the Process Capability Index

More information

MULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA

MULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA J. Japan Statist. Soc. Vol. 37 No. 1 2007 53 86 MULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA M. S. Srivastava* In this article, we develop a multivariate theory for analyzing multivariate datasets

More information

Estimation of two ordered normal means under modified Pitman nearness criterion

Estimation of two ordered normal means under modified Pitman nearness criterion Ann Inst Stat Math (25) 67:863 883 DOI.7/s463-4-479-4 Estimation of two ordered normal means under modified Pitman nearness criterion Yuan-Tsung Chang Nobuo Shinozaki Received: 27 August 23 / Revised:

More information

Estimation of multivariate 3rd moment for high-dimensional data and its application for testing multivariate normality

Estimation of multivariate 3rd moment for high-dimensional data and its application for testing multivariate normality Estimation of multivariate rd moment for high-dimensional data and its application for testing multivariate normality Takayuki Yamada, Tetsuto Himeno Institute for Comprehensive Education, Center of General

More information

A Primer on Asymptotics

A Primer on Asymptotics A Primer on Asymptotics Eric Zivot Department of Economics University of Washington September 30, 2003 Revised: October 7, 2009 Introduction The two main concepts in asymptotic theory covered in these

More information

Jackknife Euclidean Likelihood-Based Inference for Spearman s Rho

Jackknife Euclidean Likelihood-Based Inference for Spearman s Rho Jackknife Euclidean Likelihood-Based Inference for Spearman s Rho M. de Carvalho and F. J. Marques Abstract We discuss jackknife Euclidean likelihood-based inference methods, with a special focus on the

More information

arxiv: v1 [stat.co] 26 May 2009

arxiv: v1 [stat.co] 26 May 2009 MAXIMUM LIKELIHOOD ESTIMATION FOR MARKOV CHAINS arxiv:0905.4131v1 [stat.co] 6 May 009 IULIANA TEODORESCU Abstract. A new approach for optimal estimation of Markov chains with sparse transition matrices

More information

Specification Test for Instrumental Variables Regression with Many Instruments

Specification Test for Instrumental Variables Regression with Many Instruments Specification Test for Instrumental Variables Regression with Many Instruments Yoonseok Lee and Ryo Okui April 009 Preliminary; comments are welcome Abstract This paper considers specification testing

More information

Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals. Satoshi AOKI and Akimichi TAKEMURA

Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals. Satoshi AOKI and Akimichi TAKEMURA Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals Satoshi AOKI and Akimichi TAKEMURA Graduate School of Information Science and Technology University

More information

Smooth simultaneous confidence bands for cumulative distribution functions

Smooth simultaneous confidence bands for cumulative distribution functions Journal of Nonparametric Statistics, 2013 Vol. 25, No. 2, 395 407, http://dx.doi.org/10.1080/10485252.2012.759219 Smooth simultaneous confidence bands for cumulative distribution functions Jiangyan Wang

More information

The Restricted Likelihood Ratio Test at the Boundary in Autoregressive Series

The Restricted Likelihood Ratio Test at the Boundary in Autoregressive Series The Restricted Likelihood Ratio Test at the Boundary in Autoregressive Series Willa W. Chen Rohit S. Deo July 6, 009 Abstract. The restricted likelihood ratio test, RLRT, for the autoregressive coefficient

More information

CALCULATION METHOD FOR NONLINEAR DYNAMIC LEAST-ABSOLUTE DEVIATIONS ESTIMATOR

CALCULATION METHOD FOR NONLINEAR DYNAMIC LEAST-ABSOLUTE DEVIATIONS ESTIMATOR J. Japan Statist. Soc. Vol. 3 No. 200 39 5 CALCULAION MEHOD FOR NONLINEAR DYNAMIC LEAS-ABSOLUE DEVIAIONS ESIMAOR Kohtaro Hitomi * and Masato Kagihara ** In a nonlinear dynamic model, the consistency and

More information

Supplementary Materials: Martingale difference correlation and its use in high dimensional variable screening by Xiaofeng Shao and Jingsi Zhang

Supplementary Materials: Martingale difference correlation and its use in high dimensional variable screening by Xiaofeng Shao and Jingsi Zhang Supplementary Materials: Martingale difference correlation and its use in high dimensional variable screening by Xiaofeng Shao and Jingsi Zhang The supplementary material contains some additional simulation

More information

Estimation for generalized half logistic distribution based on records

Estimation for generalized half logistic distribution based on records Journal of the Korean Data & Information Science Society 202, 236, 249 257 http://dx.doi.org/0.7465/jkdi.202.23.6.249 한국데이터정보과학회지 Estimation for generalized half logistic distribution based on records

More information

A NOTE ON ESTIMATION UNDER THE QUADRATIC LOSS IN MULTIVARIATE CALIBRATION

A NOTE ON ESTIMATION UNDER THE QUADRATIC LOSS IN MULTIVARIATE CALIBRATION J. Japan Statist. Soc. Vol. 32 No. 2 2002 165 181 A NOTE ON ESTIMATION UNDER THE QUADRATIC LOSS IN MULTIVARIATE CALIBRATION Hisayuki Tsukuma* The problem of estimation in multivariate linear calibration

More information

Bootstrapping the Confidence Intervals of R 2 MAD for Samples from Contaminated Standard Logistic Distribution

Bootstrapping the Confidence Intervals of R 2 MAD for Samples from Contaminated Standard Logistic Distribution Pertanika J. Sci. & Technol. 18 (1): 209 221 (2010) ISSN: 0128-7680 Universiti Putra Malaysia Press Bootstrapping the Confidence Intervals of R 2 MAD for Samples from Contaminated Standard Logistic Distribution

More information

arxiv: v1 [math.st] 2 Apr 2016

arxiv: v1 [math.st] 2 Apr 2016 NON-ASYMPTOTIC RESULTS FOR CORNISH-FISHER EXPANSIONS V.V. ULYANOV, M. AOSHIMA, AND Y. FUJIKOSHI arxiv:1604.00539v1 [math.st] 2 Apr 2016 Abstract. We get the computable error bounds for generalized Cornish-Fisher

More information

AN ANALYSIS OF SHEWHART QUALITY CONTROL CHARTS TO MONITOR BOTH THE MEAN AND VARIABILITY KEITH JACOB BARRS. A Research Project Submitted to the Faculty

AN ANALYSIS OF SHEWHART QUALITY CONTROL CHARTS TO MONITOR BOTH THE MEAN AND VARIABILITY KEITH JACOB BARRS. A Research Project Submitted to the Faculty AN ANALYSIS OF SHEWHART QUALITY CONTROL CHARTS TO MONITOR BOTH THE MEAN AND VARIABILITY by KEITH JACOB BARRS A Research Project Submitted to the Faculty of the College of Graduate Studies at Georgia Southern

More information

Inferences about Parameters of Trivariate Normal Distribution with Missing Data

Inferences about Parameters of Trivariate Normal Distribution with Missing Data Florida International University FIU Digital Commons FIU Electronic Theses and Dissertations University Graduate School 7-5-3 Inferences about Parameters of Trivariate Normal Distribution with Missing

More information

Bias-corrected AIC for selecting variables in Poisson regression models

Bias-corrected AIC for selecting variables in Poisson regression models Bias-corrected AIC for selecting variables in Poisson regression models Ken-ichi Kamo (a), Hirokazu Yanagihara (b) and Kenichi Satoh (c) (a) Corresponding author: Department of Liberal Arts and Sciences,

More information

Testing Overidentifying Restrictions with Many Instruments and Heteroskedasticity

Testing Overidentifying Restrictions with Many Instruments and Heteroskedasticity Testing Overidentifying Restrictions with Many Instruments and Heteroskedasticity John C. Chao, Department of Economics, University of Maryland, chao@econ.umd.edu. Jerry A. Hausman, Department of Economics,

More information

Estimation of the Conditional Variance in Paired Experiments

Estimation of the Conditional Variance in Paired Experiments Estimation of the Conditional Variance in Paired Experiments Alberto Abadie & Guido W. Imbens Harvard University and BER June 008 Abstract In paired randomized experiments units are grouped in pairs, often

More information

Monte Carlo Simulation. CWR 6536 Stochastic Subsurface Hydrology

Monte Carlo Simulation. CWR 6536 Stochastic Subsurface Hydrology Monte Carlo Simulation CWR 6536 Stochastic Subsurface Hydrology Steps in Monte Carlo Simulation Create input sample space with known distribution, e.g. ensemble of all possible combinations of v, D, q,

More information

The largest eigenvalues of the sample covariance matrix. in the heavy-tail case

The largest eigenvalues of the sample covariance matrix. in the heavy-tail case The largest eigenvalues of the sample covariance matrix 1 in the heavy-tail case Thomas Mikosch University of Copenhagen Joint work with Richard A. Davis (Columbia NY), Johannes Heiny (Aarhus University)

More information

Complexity of two and multi-stage stochastic programming problems

Complexity of two and multi-stage stochastic programming problems Complexity of two and multi-stage stochastic programming problems A. Shapiro School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0205, USA The concept

More information

Some SDEs with distributional drift Part I : General calculus. Flandoli, Franco; Russo, Francesco; Wolf, Jochen

Some SDEs with distributional drift Part I : General calculus. Flandoli, Franco; Russo, Francesco; Wolf, Jochen Title Author(s) Some SDEs with distributional drift Part I : General calculus Flandoli, Franco; Russo, Francesco; Wolf, Jochen Citation Osaka Journal of Mathematics. 4() P.493-P.54 Issue Date 3-6 Text

More information

Citation 数理解析研究所講究録 (2003), 1334:

Citation 数理解析研究所講究録 (2003), 1334: On the conservatm of multiple com Titleamong mean vectors (Approximations Dtributions) Author(s) Seo, Takashi Citation 数理解析研究所講究録 (2003), 1334: 87-94 Issue Date 2003-07 URL http://hdl.handle.net/2433/43322

More information

Sobolev Spaces. Chapter 10

Sobolev Spaces. Chapter 10 Chapter 1 Sobolev Spaces We now define spaces H 1,p (R n ), known as Sobolev spaces. For u to belong to H 1,p (R n ), we require that u L p (R n ) and that u have weak derivatives of first order in L p

More information

Inference on reliability in two-parameter exponential stress strength model

Inference on reliability in two-parameter exponential stress strength model Metrika DOI 10.1007/s00184-006-0074-7 Inference on reliability in two-parameter exponential stress strength model K. Krishnamoorthy Shubhabrata Mukherjee Huizhen Guo Received: 19 January 2005 Springer-Verlag

More information

Section 8.1: Interval Estimation

Section 8.1: Interval Estimation Section 8.1: Interval Estimation Discrete-Event Simulation: A First Course c 2006 Pearson Ed., Inc. 0-13-142917-5 Discrete-Event Simulation: A First Course Section 8.1: Interval Estimation 1/ 35 Section

More information

A high-dimensional bias-corrected AIC for selecting response variables in multivariate calibration

A high-dimensional bias-corrected AIC for selecting response variables in multivariate calibration A high-dimensional bias-corrected AIC for selecting response variables in multivariate calibration Ryoya Oda, Yoshie Mima, Hirokazu Yanagihara and Yasunori Fujikoshi Department of Mathematics, Graduate

More information

A Few Special Distributions and Their Properties

A Few Special Distributions and Their Properties A Few Special Distributions and Their Properties Econ 690 Purdue University Justin L. Tobias (Purdue) Distributional Catalog 1 / 20 Special Distributions and Their Associated Properties 1 Uniform Distribution

More information

Efficient Robbins-Monro Procedure for Binary Data

Efficient Robbins-Monro Procedure for Binary Data Efficient Robbins-Monro Procedure for Binary Data V. Roshan Joseph School of Industrial and Systems Engineering Georgia Institute of Technology Atlanta, GA 30332-0205, USA roshan@isye.gatech.edu SUMMARY

More information

More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order Restriction

More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order Restriction Sankhyā : The Indian Journal of Statistics 2007, Volume 69, Part 4, pp. 700-716 c 2007, Indian Statistical Institute More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order

More information

Analysis of Type-II Progressively Hybrid Censored Data

Analysis of Type-II Progressively Hybrid Censored Data Analysis of Type-II Progressively Hybrid Censored Data Debasis Kundu & Avijit Joarder Abstract The mixture of Type-I and Type-II censoring schemes, called the hybrid censoring scheme is quite common in

More information

THE EFFECT OF SAMPLING DESIGN ON ANDERSON S EXPANSION OF THE DISTRIBUTION OF FISHER S SAMPLE DISCRIMINANT FUNCTION

THE EFFECT OF SAMPLING DESIGN ON ANDERSON S EXPANSION OF THE DISTRIBUTION OF FISHER S SAMPLE DISCRIMINANT FUNCTION Statistica Sinica 8(1998), 1115-1130 THE EFFECT OF SAMPLING ESIGN ON ANERSON S EXPANSION OF THE ISTRIBUTION OF FISHER S SAMPLE ISCRIMINANT FUNCTION Kam-Wah Tsui and Ching-Ho Leu University of Wisconsin-Madison

More information

Some Approximations of the Logistic Distribution with Application to the Covariance Matrix of Logistic Regression

Some Approximations of the Logistic Distribution with Application to the Covariance Matrix of Logistic Regression Working Paper 2013:9 Department of Statistics Some Approximations of the Logistic Distribution with Application to the Covariance Matrix of Logistic Regression Ronnie Pingel Working Paper 2013:9 June

More information

P i [B k ] = lim. n=1 p(n) ii <. n=1. V i :=

P i [B k ] = lim. n=1 p(n) ii <. n=1. V i := 2.7. Recurrence and transience Consider a Markov chain {X n : n N 0 } on state space E with transition matrix P. Definition 2.7.1. A state i E is called recurrent if P i [X n = i for infinitely many n]

More information

Regression and Statistical Inference

Regression and Statistical Inference Regression and Statistical Inference Walid Mnif wmnif@uwo.ca Department of Applied Mathematics The University of Western Ontario, London, Canada 1 Elements of Probability 2 Elements of Probability CDF&PDF

More information

Modfiied Conditional AIC in Linear Mixed Models

Modfiied Conditional AIC in Linear Mixed Models CIRJE-F-895 Modfiied Conditional AIC in Linear Mixed Models Yuki Kawakubo Graduate School of Economics, University of Tokyo Tatsuya Kubokawa University of Tokyo July 2013 CIRJE Discussion Papers can be

More information

Bootstrap inference for the finite population total under complex sampling designs

Bootstrap inference for the finite population total under complex sampling designs Bootstrap inference for the finite population total under complex sampling designs Zhonglei Wang (Joint work with Dr. Jae Kwang Kim) Center for Survey Statistics and Methodology Iowa State University Jan.

More information

Finite Population Sampling and Inference

Finite Population Sampling and Inference Finite Population Sampling and Inference A Prediction Approach RICHARD VALLIANT ALAN H. DORFMAN RICHARD M. ROYALL A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim Brisbane

More information

1 Appendix A: Matrix Algebra

1 Appendix A: Matrix Algebra Appendix A: Matrix Algebra. Definitions Matrix A =[ ]=[A] Symmetric matrix: = for all and Diagonal matrix: 6=0if = but =0if 6= Scalar matrix: the diagonal matrix of = Identity matrix: the scalar matrix

More information

Citation Osaka Journal of Mathematics. 40(3)

Citation Osaka Journal of Mathematics. 40(3) Title An elementary proof of Small's form PSL(,C and an analogue for Legend Author(s Kokubu, Masatoshi; Umehara, Masaaki Citation Osaka Journal of Mathematics. 40(3 Issue 003-09 Date Text Version publisher

More information

MATH4427 Notebook 2 Fall Semester 2017/2018

MATH4427 Notebook 2 Fall Semester 2017/2018 MATH4427 Notebook 2 Fall Semester 2017/2018 prepared by Professor Jenny Baglivo c Copyright 2009-2018 by Jenny A. Baglivo. All Rights Reserved. 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................

More information

EXPLICIT NONPARAMETRIC CONFIDENCE INTERVALS FOR THE VARIANCE WITH GUARANTEED COVERAGE

EXPLICIT NONPARAMETRIC CONFIDENCE INTERVALS FOR THE VARIANCE WITH GUARANTEED COVERAGE EXPLICIT NONPARAMETRIC CONFIDENCE INTERVALS FOR THE VARIANCE WITH GUARANTEED COVERAGE Joseph P. Romano Department of Statistics Stanford University Stanford, California 94305 romano@stat.stanford.edu Michael

More information

Analysis of variance and linear contrasts in experimental design with generalized secant hyperbolic distribution

Analysis of variance and linear contrasts in experimental design with generalized secant hyperbolic distribution Journal of Computational and Applied Mathematics 216 (2008) 545 553 www.elsevier.com/locate/cam Analysis of variance and linear contrasts in experimental design with generalized secant hyperbolic distribution

More information

STAT 536: Genetic Statistics

STAT 536: Genetic Statistics STAT 536: Genetic Statistics Tests for Hardy Weinberg Equilibrium Karin S. Dorman Department of Statistics Iowa State University September 7, 2006 Statistical Hypothesis Testing Identify a hypothesis,

More information

Exercises and Answers to Chapter 1

Exercises and Answers to Chapter 1 Exercises and Answers to Chapter The continuous type of random variable X has the following density function: a x, if < x < a, f (x), otherwise. Answer the following questions. () Find a. () Obtain mean

More information

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3 Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

More information

Model Assisted Survey Sampling

Model Assisted Survey Sampling Carl-Erik Sarndal Jan Wretman Bengt Swensson Model Assisted Survey Sampling Springer Preface v PARTI Principles of Estimation for Finite Populations and Important Sampling Designs CHAPTER 1 Survey Sampling

More information

I forgot to mention last time: in the Ito formula for two standard processes, putting

I forgot to mention last time: in the Ito formula for two standard processes, putting I forgot to mention last time: in the Ito formula for two standard processes, putting dx t = a t dt + b t db t dy t = α t dt + β t db t, and taking f(x, y = xy, one has f x = y, f y = x, and f xx = f yy

More information

The Australian National University and The University of Sydney. Supplementary Material

The Australian National University and The University of Sydney. Supplementary Material Statistica Sinica: Supplement HIERARCHICAL SELECTION OF FIXED AND RANDOM EFFECTS IN GENERALIZED LINEAR MIXED MODELS The Australian National University and The University of Sydney Supplementary Material

More information

STAT 7032 Probability. Wlodek Bryc

STAT 7032 Probability. Wlodek Bryc STAT 7032 Probability Wlodek Bryc Revised for Spring 2019 Printed: January 14, 2019 File: Grad-Prob-2019.TEX Department of Mathematical Sciences, University of Cincinnati, Cincinnati, OH 45221 E-mail address:

More information

TORIC WEAK FANO VARIETIES ASSOCIATED TO BUILDING SETS

TORIC WEAK FANO VARIETIES ASSOCIATED TO BUILDING SETS TORIC WEAK FANO VARIETIES ASSOCIATED TO BUILDING SETS YUSUKE SUYAMA Abstract. We give a necessary and sufficient condition for the nonsingular projective toric variety associated to a building set to be

More information

Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama

Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama Instructions This exam has 7 pages in total, numbered 1 to 7. Make sure your exam has all the pages. This exam will be 2 hours

More information

Lecture 3. Inference about multivariate normal distribution

Lecture 3. Inference about multivariate normal distribution Lecture 3. Inference about multivariate normal distribution 3.1 Point and Interval Estimation Let X 1,..., X n be i.i.d. N p (µ, Σ). We are interested in evaluation of the maximum likelihood estimates

More information