High-dimensional asymptotic results for EPMCs of W- and Z-rules


Takayuki Yamada (1), Tetsuro Sakurai (2) and Yasunori Fujikoshi (3)

(1) Institute for Comprehensive Education, Center of General Education, Kagoshima University, Korimoto, Kagoshima, Japan
(2) Center of General Education, Tokyo University of Science, Suwa, Toyohira, Chino-shi, Nagano, Japan
(3) Department of Mathematics, Graduate School of Science, Hiroshima University, Kagamiyama, Higashi Hiroshima, Hiroshima, Japan

Abstract

This paper is concerned with high-dimensional asymptotic results for W- and Z-rules when the sample size $N$ and the dimension $p$ are large. First, we give a unified location and scale mixture expression of the standard normal distribution for the W and Z statistics. Then the EPMCs (Expected Probabilities of Misclassification) of W- and Z-rules are obtained in expanded forms with errors of $O(N^{-2})$. It is pointed out that the Z-rule has a smaller EER (Expected Error Rate) than the W-rule when the prior probabilities are the same, neglecting the terms of $O(N^{-2})$. Further, asymptotic unbiased estimators are proposed for the EPMCs and the EERs of W- and Z-rules. Variable selection criteria are also proposed based on asymptotic unbiased estimators of the EERs of W- and Z-rules. It is pointed out that the no additional information model based on the coefficients of the linear discriminant function is closely related to the subset of variables with the minimized EER in a high-dimensional situation. Accuracies of our asymptotic results are checked numerically by conducting a Monte Carlo simulation.

AMS 2000 subject classification: primary 62H30; secondary 62H12

Key Words and Phrases: Discriminant analysis, EPMC, High-dimensional asymptotic results, Method by differential operator, Selection of variable, Unified location and scale mixtured expression, W-rule, Z-rule.

(1) E-mail address: yamada@gm.kagoshima-u.ac.jp
(2) E-mail address: sakurai@rs.tus.ac.jp
(3) E-mail address: fujikoshi y@yahoo.co.jp

1 Introduction

This paper is concerned with the problem of classifying an observation vector $x$ as coming from one of two populations $\Pi_1$ and $\Pi_2$. Let $\Pi_i$ be a $p$-dimensional normal population with mean vector $\mu_i$ and common $p \times p$ positive definite covariance matrix $\Sigma$, denoted by $N_p(\mu_i, \Sigma)$. Consider the case in which all parameters are unknown. Suppose that the training data $x_{i1}, \ldots, x_{iN_i}$ are independently and identically distributed (i.i.d.) as $N_p(\mu_i, \Sigma)$, $i = 1, 2$. Let $W$ be the linear discriminant function

$$W(x) = (\bar{x}_1 - \bar{x}_2)' S^{-1} \left\{ x - \tfrac{1}{2}(\bar{x}_1 + \bar{x}_2) \right\},$$

where $\bar{x}_1$, $\bar{x}_2$ and $S$ are the sample mean vectors and the pooled sample covariance matrix defined by

$$\bar{x}_i = \frac{1}{N_i} \sum_{j=1}^{N_i} x_{ij} \ (i = 1, 2), \qquad S = \frac{1}{n} \sum_{i=1}^{2} \sum_{j=1}^{N_i} (x_{ij} - \bar{x}_i)(x_{ij} - \bar{x}_i)', \qquad n = N - 2, \quad N = N_1 + N_2.$$

Then the linear discriminant rule with a cutoff point $c$, which is also called the W-rule, classifies $x$ as $\Pi_1$ if $W(x) > c$ for a constant $c$ and as $\Pi_2$ if $W(x) < c$. Furthermore, Anderson [1] (see also Anderson [3]; Chapter 6) introduced another discriminant rule, which is based on the likelihood ratio criterion for testing the composite null hypothesis that $x, x_{11}, \ldots, x_{1N_1} \in \Pi_1$ against the composite alternative hypothesis that $x, x_{21}, \ldots, x_{2N_2} \in \Pi_2$, and which is called the maximum likelihood rule or Z-rule. Let

$$Z(x) = \frac{1}{2} \left\{ (1 + N_2^{-1})^{-1} (x - \bar{x}_2)' S^{-1} (x - \bar{x}_2) - (1 + N_1^{-1})^{-1} (x - \bar{x}_1)' S^{-1} (x - \bar{x}_1) \right\}.$$

Then the Z-rule with a cutoff point $c$ classifies $x$ as $\Pi_1$ if $Z(x) > c$ and as $\Pi_2$ if $Z(x) < c$.

There are two types of probability of misclassification. One is the probability of allocating $x$ to $\Pi_2$ even though it actually belongs to $\Pi_1$. The other is the probability that $x$ is classified as $\Pi_1$ although it actually belongs to $\Pi_2$. These two types of expected probabilities of misclassification (EPMCs) for W- and Z-rules are expressed as

$$e_w(2|1) = P(W(x) < c \mid x \in \Pi_1), \qquad e_w(1|2) = P(W(x) > c \mid x \in \Pi_2),$$
$$e_z(2|1) = P(Z(x) < c \mid x \in \Pi_1), \qquad e_z(1|2) = P(Z(x) > c \mid x \in \Pi_2).$$

We also express $e_w(2|1)$ and $e_z(2|1)$ as

$$g_w(c; N_1, N_2) = e_w(2|1), \qquad g_z(c; N_1, N_2) = e_z(2|1).$$

As is well known, the distribution of $W$ when $x \in \Pi_1$ is the same as that of $-W$ when $x \in \Pi_2$ with $N_1$ and $N_2$ interchanged. Similarly, the distribution of $Z$ when $x \in \Pi_1$ is the same as that of $-Z$ when $x \in \Pi_2$ with $N_1$ and $N_2$ interchanged. These indicate that $e_w(1|2)$ or $e_z(1|2)$ is obtained from $e_w(2|1)$ or $e_z(2|1)$ by replacing $(c, N_1, N_2)$ with $(-c, N_2, N_1)$, and hence

$$e_w(1|2) = g_w(-c; N_2, N_1), \qquad e_z(1|2) = g_z(-c; N_2, N_1).$$
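The definitions of the W and Z statistics above can be illustrated with a minimal NumPy sketch. The function names (`pooled_stats`, `w_statistic`, `z_statistic`, `classify`) are hypothetical and chosen only for this illustration; the code assumes the training samples are stored as arrays of shape $(N_i, p)$ with $n = N_1 + N_2 - 2 \ge p$, so that $S$ is invertible.

```python
import numpy as np

def pooled_stats(X1, X2):
    """Sample mean vectors and pooled covariance S with divisor n = N1 + N2 - 2."""
    n = len(X1) + len(X2) - 2
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    R1, R2 = X1 - m1, X2 - m2
    S = (R1.T @ R1 + R2.T @ R2) / n
    return m1, m2, S

def w_statistic(x, X1, X2):
    """W(x) = (m1 - m2)' S^{-1} {x - (m1 + m2)/2}."""
    m1, m2, S = pooled_stats(X1, X2)
    return float((m1 - m2) @ np.linalg.solve(S, x - 0.5 * (m1 + m2)))

def z_statistic(x, X1, X2):
    """Z(x) = (1/2){N2/(N2+1) q2 - N1/(N1+1) q1}, q_i = (x - m_i)' S^{-1} (x - m_i)."""
    N1, N2 = len(X1), len(X2)
    m1, m2, S = pooled_stats(X1, X2)
    d1, d2 = x - m1, x - m2
    q1 = d1 @ np.linalg.solve(S, d1)
    q2 = d2 @ np.linalg.solve(S, d2)
    return float(0.5 * (N2 / (N2 + 1) * q2 - N1 / (N1 + 1) * q1))

def classify(stat, c=0.0):
    """Allocate to population 1 if the statistic exceeds the cutoff c, else to 2."""
    return 1 if stat > c else 2
```

When $N_1 = N_2 = N_0$, expanding the two quadratic forms gives $Z(x) = \{N_0/(N_0+1)\} W(x)$, so the two rules with cutoff 0 make identical allocations; they differ only for unbalanced samples or nonzero cutoffs.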

Thus, in this paper we only deal with $e_w(2|1)$ and $e_z(2|1)$. To obtain a unified expression for the W-rule and the Z-rule, we consider the Z-rule that classifies $x$ as $\Pi_1$ if $Z(x) > c^*$ and as $\Pi_2$ if $Z(x) < c^*$. That is, we consider the Z-rule with cutoff point $c^*$, where $c^*$ is a positive constant multiple of $c$ whose factor depends only on $N_1$ and $N_2$.

Note that the EPMCs of W- and Z-rules are obtained from the distribution functions of $W$ and $Z$. In general, it is hard to evaluate these expected probabilities of misclassification (EPMCs) explicitly, but some asymptotic results, including asymptotic expansions, have been obtained. It is well known that the discriminant functions $W(x)$ and $Z(x)$ converge in distribution to normal distributions, i.e.,

$$W(x) \xrightarrow{D} N\left((-1)^{i-1} \tfrac{1}{2}\Delta^2, \Delta^2\right) \quad \text{and} \quad Z(x) \xrightarrow{D} N\left((-1)^{i-1} \tfrac{1}{2}\Delta^2, \Delta^2\right) \quad \text{if } x \in \Pi_i,$$

under the asymptotic framework A0:

A0: $N_1 \to \infty$, $N_2 \to \infty$, $N_1/N_2 \to \gamma \in (0, \infty)$, $p$ is fixed.

Here $\Delta^2 = (\mu_1 - \mu_2)' \Sigma^{-1} (\mu_1 - \mu_2)$. Okamoto [19] derived an asymptotic expansion of the distribution of $W(x)$ up to terms of the second order, and Siotani and Wang [21], [22] extended it up to terms of the third order. Furthermore, Memon and Okamoto [15] expanded the distribution of $Z(x)$ up to terms of the second order, and Siotani and Wang [21], [22] extended it up to terms of the third order. Anderson [2] derived an asymptotic expansion of the Studentized $W(x)$, and an asymptotic expansion of the Studentized $Z(x)$ was derived by Fujikoshi and Kanazawa [8]. These and some other asymptotic results were reviewed by Siotani [20] and by McLachlan [18].

Generally, the precision of asymptotic approximations under A0 gets worse as the dimension $p$ becomes large. As an alternative approach to overcome this shortcoming, asymptotic distributions of discriminant functions have been derived in a high-dimensional situation where $n$ and $p$ tend to infinity together. Fujikoshi and Seo [9] derived the limiting distribution of a general discriminant function for a class of discriminant rules, which includes both the W-rule and the Z-rule, under the asymptotic framework A1:

A1: $p \to \infty$, $N_1 \to \infty$, $N_2 \to \infty$, $p/n \to \gamma_0 \in [0, 1)$ and $N_1/N_2 \to \gamma \in (0, \infty)$.

Note that $m = n - p \to \infty$ under A1. Matsumoto [14] generalized the limiting result of Fujikoshi and Seo [9] to an asymptotic expansion up to terms of order $O_{3/2}$, where $O_{j/2}$ denotes a term of the $j$-th order with respect to $(p^{-1/2}, N_1^{-1/2}, N_2^{-1/2}, m^{-1/2})$. Fujikoshi [6] gave a general approximation of a location and scale mixture of the standard normal distribution and gave its explicit error bound. He applied his result to the approximation of $e_w(2|1)$ by Lachenbruch [13] and gave an error bound which is $O_1$ under A1. These and some other asymptotic results are also reviewed in Fujikoshi et al. [10]. High-dimensional asymptotic expansions for $W$ have also been given by Hyodo and Kubokawa [11].

In this paper we give asymptotic expansions of the EPMCs for W- and Z-rules with errors of order $O_2$ under the asymptotic framework A1. Our derivation is based on a unified location and scale mixture expression of the standard normal distribution for $W$ and $Z$. It is well known (see, e.g., Fujikoshi [6]) that $W$ can be expressed as a location and scale mixture of the standard normal distribution. We note that $Z$ can also be expressed as a location and scale mixture of the standard normal distribution. Based on our asymptotic expansion formulas for the EPMCs, it is shown that the Z-rule has a smaller EER (Expected Error Rate) than the W-rule when the prior probabilities are the same, neglecting the terms of $O(N^{-2})$. Further, asymptotic unbiased estimators are proposed for the EPMCs of W- and Z-rules. Similarly, we propose asymptotic unbiased estimators for the EERs. It is pointed out that the no additional information model based on the coefficients of the linear discriminant function is closely related to the subset of variables with the minimized EER in a high-dimensional situation. We propose variable selection criteria based on unbiased estimators of the EERs. Our results are checked numerically by conducting a Monte Carlo simulation.

The present paper is organized as follows. In Section 2 we give a unified location and scale mixture expression for the distributions of $W$ and $Z$. Further, the expressions are written in terms of three standard normal variables and four chi-square variables, which are independent. In Section 3, applying this expression to the method of differential operator, we obtain asymptotic expansions for the EPMCs of W- and Z-rules with errors of $O_2$. In Section 4 it is shown that the EER of the Z-rule is smaller than that of the W-rule when the prior probabilities are the same, neglecting the terms of $O_2$. The result is proved in Appendix A. In Sections 5 and 6, asymptotic unbiased estimators for the EPMCs and the EERs of W- and Z-rules are derived. In Section 7, simulation results are given to see the accuracies of our asymptotic results. In Section 8 we propose variable selection criteria based on asymptotic unbiased estimators of the EERs of W- and Z-rules. Concluding remarks are given in Section 9. Hereafter, the symbol $\overset{D}{=}$ denotes equality in distribution. Throughout this paper we assume that $\Delta^2$ converges to a positive constant as $p \to \infty$.

2 A unified expression of W and Z as location and scale mixtures of N(0, 1)

Following Lachenbruch [13], for $x \in \Pi_1$, $W$ can be expressed as

$$W = (\bar{x}_1 - \bar{x}_2)' S^{-1} \left\{ x - \tfrac{1}{2}(\bar{x}_1 + \bar{x}_2) \right\} = V_w^{1/2} Z_w - U_w,$$

where

$$V_w = (\bar{x}_1 - \bar{x}_2)' S^{-1} \Sigma S^{-1} (\bar{x}_1 - \bar{x}_2),$$
$$Z_w = V_w^{-1/2} (\bar{x}_1 - \bar{x}_2)' S^{-1} (x - \mu_1),$$
$$U_w = (\bar{x}_1 - \bar{x}_2)' S^{-1} (\bar{x}_1 - \mu_1) - \tfrac{1}{2} D^2,$$

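and $D^2$ is the squared sample Mahalanobis distance defined by $D^2 = (\bar{x}_1 - \bar{x}_2)' S^{-1} (\bar{x}_1 - \bar{x}_2)$. Then it is checked that $V_w$ is a positive random variable and $(U_w, V_w)$ is independent of $Z_w$. Further, $Z_w$ is distributed as $N(0, 1)$. This normality follows by considering the conditional distribution of $Z_w$ when $\bar{x}_1$, $\bar{x}_2$ and $S$ are given. In this case $W$ is called a location and scale mixture of the standard normal distribution.

Now we express $U_w$ and $V_w$ in terms of simple variables. Let

$$u_w = \left( \frac{1}{N_1} + \frac{1}{N_2} \right)^{-1/2} \Sigma^{-1/2} (\bar{x}_1 - \bar{x}_2), \qquad v_w = \frac{1}{\sqrt{N}} \Sigma^{-1/2} (N_1 \bar{x}_1 + N_2 \bar{x}_2 - N_1 \mu_1 - N_2 \mu_2), \qquad B = n \Sigma^{-1/2} S \Sigma^{-1/2}.$$

Then $u_w$, $v_w$ and $B$ are independent. In addition, $u_w \sim N_p((1/N_1 + 1/N_2)^{-1/2} \delta, I_p)$ and $v_w \sim N_p(0, I_p)$, where $\delta = \Sigma^{-1/2} (\mu_1 - \mu_2)$. It also holds that $B$ is distributed as a Wishart distribution with $n$ degrees of freedom and covariance matrix $I_p$, which is denoted as $W_p(n, I_p)$. Substituting them, we have

$$U_w = n \left\{ \frac{N_2 - N_1}{2 N_1 N_2} u_w' B^{-1} u_w + \frac{1}{\sqrt{N_1 N_2}} u_w' B^{-1} v_w - \sqrt{\frac{N_2}{N N_1}} \, \delta' B^{-1} u_w \right\}, \qquad V_w = \frac{n^2 N}{N_1 N_2} u_w' B^{-2} u_w.$$

On the other hand, for $x \in \Pi_1$, we can express $Z(x)$ as

$$Z(x) = \frac{1}{2} (1 + N_2^{-1})^{-1} \left\{ -a^{1/2} (x - \bar{x}_1) + (x - \bar{x}_2) \right\}' S^{-1} \left\{ a^{1/2} (x - \bar{x}_1) + (x - \bar{x}_2) \right\} = \frac{n}{2} (1 + N_2^{-1})^{-1} \omega_1 \omega_2 \, u_z' B^{-1} t,$$

where

$$u_z = \omega_1^{-1} \Sigma^{-1/2} \left\{ -a^{1/2} (x - \bar{x}_1) + (x - \bar{x}_2) \right\}, \qquad t = \omega_2^{-1} \Sigma^{-1/2} \left\{ a^{1/2} (x - \bar{x}_1) + (x - \bar{x}_2) \right\},$$
$$\omega_1 = \left[ 2 \left\{ 1 + N_2^{-1} - a^{1/2} \right\} \right]^{1/2}, \qquad \omega_2 = \left[ 2 \left\{ 1 + N_2^{-1} + a^{1/2} \right\} \right]^{1/2}, \qquad a = (1 + N_2^{-1})(1 + N_1^{-1})^{-1}.$$

Note that $\omega_1^2 = O(p^{-1})$ and $\omega_2^2 \to 4$ under A1. The independence of $u_z$ and $t$ and their distributions can be derived by using the following general result (Lemma 1) for linear combinations of independent normal random vectors (see, e.g., Anderson [3]).

Lemma 1. Suppose that $x_1, \ldots, x_N$ are independent and $x_i$ is distributed as $N_p(\mu_i, \Sigma)$. Let $H = (h_{ij})$ be an $N \times N$ orthogonal matrix. Then $y_i = \sum_{j=1}^{N} h_{ij} x_j$ is distributed as $N_p(\nu_i, \Sigma)$, where $\nu_i = \sum_{j=1}^{N} h_{ij} \mu_j$, $i = 1, \ldots, N$, and $y_1, \ldots, y_N$ are independent.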

6 6 From Lemma 1 we have that u z ad t are idepedet ad u z N p ω 1 1 δ I p t N p ω 1 δ I p δ = Σ 1/ µ 1 µ. Let v z = t ω 1 δ which is distributed as N p 0 I p. Now we shall see that Zx ca be expressed as a locatio of scale mixture of the stadard ormal distributio. Note that u zb 1 t = u zb 1 v z + u zb 1 ω 1 δ = u zb u z Z 0 + ω 1 δ B 1 u z Z 0 = 1 u zb u z u zb 1 v z. The coditioal distributio of Z 0 whe u z ad S are give is the stadard ormal distributio. Sice it does ot deped o u z ad S Z 0 is distributed as the stadard ormal distributio ad is idepedet of u z ad S. Therefore Zx is a locatio ad scale mixture of the stadard ormal distributio. Modifyig the sale ad the locatio we use the followig locatio ad scale mixture expressio for Zx: Note that So Zx = 1 1/ 1 N 1 + N 1 ω 1 ω Vz 1/ Z z U z N 1 ω 1 ω = 1 Zx = 1/ N U z = ω 1 δ B 1 u z V z = N u zb u z 1/ N Z z = Vz 1/ u zb 1 v z. 1 + N N N N = 1/ N V N N 1 + N 1 1 N N N 1. z Z z U z = c Vz 1/ Z z U z. Here the variable Vz 1/ Z z U z is a locatio ad scale mixture of the stadard ormal distributio. Further Zx > c V 1/ z Z z U z > c.

7 7 We have see that the discrimiat fuctio W x based o a cutoff poit c ad the discrimiat fuctio Zx based o a cutoff poit c ca be expressed as a locatio U ad scale V 1/ mixture of the stadard ormal distributio ad so these misclassificatio probabilities whe x Π 1 ca be expressed as } E[Φ V 1/ U + c ] 3 Φ is the distributio fuctio of the stadard ormal distributio. I order to treat for W ad Z i a uified way we defie two radom variables U ad V as follows: for x Π 1 U = ρ 1 u B 1 u + ρ v B 1 u δ B 1 u ρ 3 V = τ u B u 4 u N p ωδ I p v N p 0 I p B W p I p δ = Σ 1/ µ 1 µ ad u v ad B are idepedet. Here ρ i = ρ i N 1 N i = 1 3 τ = τn 1 N 0 ad ω = ωn 1 N 0 are costats which are O1 uder A1. I additio ω = N/ + O 1. The above results are summarized as i the followig Lemma. Lemma. Assume that x Π 1. Let U V be the radom variables as i 4 ad let Z be the stadard ormal radom variable which is idepedet of U V. The W x D = V 1/ Z U U V i 4 is defied with the followig ρ 1 ρ ρ 3 τ ad ω: ρ 1 = 1 N ρ = ρ 3 = N1 N NN 1 N τ = ω = + 1/ N1 N = N. 5 Similarly 1/c Zx D = V 1/ Z U c = [ 1 + N 1 /1 + N N }] 1/ U V i 4 is defied with the followig ρ1 ρ ρ 3 τ ad ω: ρ 1 = 0 ρ = 0 ρ 3 = ω1 = 1 + N 1 a1/} ω = N ω 1 τ = N 1 + N 1 ω = ω 1 1/ + a1/} a = 1 + N N I order to evaluate the expectatio as i 3 it is importat to express u B 1 u v B 1 u δ B 1 u ad u B u i terms of simple variables whose momets are computable. We use the followig lemma give by Yamada et al. [3] which expresses them as fuctios of the idepedet stadard ormal ad chi-squared variables.

8 8 Lemma 3. Let v 1 N p δ I p v N p 0 I p A W p I p ad v 1 v ad A are idepedet. The δ A 1 v 1 v A 1 v 1 v 1A 1 v 1 = D v 1A v 1 1 Y 1 1 Y 1 Y Z 1 + Z Y 1 Y Y Z 1 + Y + Z + Y 4}Z Z Z + Y 4 } Y Y Z Z + Y 4 } Y 3 = δ δ; Z i N0 1 i = 1 3; Y i χ f i i = 1 3 4; all the seve variables Z 1 Z Z 3 Y 1 Y Y 3 Y 4 are idepedet; f 1 = p + 1 f = p 1 f 3 = p + f 4 = p. Results which are similar to Lemma 3 have be give i the followig papers. Fujikoshi ad Seo [9]; Lemma. gave stochastic expressio for triplet of v 1A 1 v 1 v 1A 1 v ad v A 1 v ad Fujikoshi [7]; Lemma 4.1 gave for triplet of v 1A 1 v 1 v 1A v 1 ad δ A 1 v 1. Hyodo ad Kubokawa [11] has also give a differet expressio for the four statistics i Lemma 3. However their expressio does ot hold simultaeously. I fact they have used the same expressio as Lemma 4.1 i Fujikoshi [7] for the triplet of v 1A 1 v 1 v 1A v 1 ad δ A 1 v 1 ad added a expressio for v A 1 v 1 which was derived separately from the triplet. From Lemma 3 we ca write the U ad V as i 4 as follows: U V D = uy1 /f 1 Y /f Y 3 /f 3 Y 4 /f 4 Z 1 / Z / Z 3 / vy 1 /f 1 Y /f Y 3 /f 3 Y 4 /f 4 Z 1 / Z / Z 3 / uy 1 y y 3 y 4 z 1 z z 3 = u 1 y 1 y 4 z 1 z + u y 1 y y 3 y 4 z 1 z z 3 u 3 y 1 y y 3 y 4 z 1 z 7 u 1 y 1 y 4 z 1 z = a 1 z 1 + ω + z + a 4 y 4 } y a y u y 1 y y 3 y 4 z 1 z z 3 = a y 1 u 3 y 1 y y 3 y 4 z 1 z = a 3 vy 1 y y 3 y 4 z 1 z z 3 = a 6 y 1 y 1 5 y 3 z 1 + ω a 5 y 1 + a 5 y y 3 z 1 + ω + z + a 4y 4 }z 3 z y 3 z 1 + ω + z + a 4 y 4 }. 8 Here a i = f 1 ρ i i = 1 3 a 4 = f 4 a 5 = f f 3 a 6 = f 1 τ. Note that a i = O1 uder A1.

9 9 3 Asymptotic expasios for the EPMCs of W x ad Zx uder A1 I order to obtai asymptotic expasios for the EPMCs of W x ad Zx we may derive a asymptotic expasio of P V Z U < c uder A1. Further istead of the distributio of V Z U we cosider its stadardized versio defied by T = V v 0 Z U u0 v0 u 0 = uey 1 /f 1 EY /f EY 3 /f 3 EY 4 /f 4 EZ 1 / Ez / EZ 3 / = u v 0 = vey 1 /f 1 EY /f EY 3 /f 3 EY 4 /f 4 EZ 1 / Ez / EZ 3 / = v The it holds that P V Z U x = P T v 1/ 0 x + u 0. Now we cosider a asymptotic expasio of the distributio of T by expadig its characteristic fuctio Ct = E expitt }. Based o the fact that T is coditioally ormal whe U V is give the coditioal characteristic fuctio ca be expressed as Therefor we have Ψy 1 y y 3 y 4 z 1 z z 3 = expitµ t σ / µ = µy 1 y y 3 y 4 z 1 z z 3 = uy 1 y y 3 y 4 z 1 z z 3 u 0 v0 σ = σ y 1 y y 3 y 4 z 1 z z 3 = vy 1 y y 3 y 4 z 1 z z 3 v 0. [ Y1 Ct = E Ψ Y Y 3 Y 4 Z 1 Z Z ] 3. f 1 f f 3 f 4 To get a asymptotic expasio of Ct we use a powerful method kow as the method by the differetial operator which was used by James [1] Okamoto [19] etc. Sice the fuctio Ψ is aalytic about the poit Y 1 /f 1 Y /f Y 3 /f 3 Y 4 /f 4 Z 1 / Z / Z 3 / = we ca expad it a Taylor series as follows. Y1 Ψ Y Y 3 Y 4 Z 1 Z Z 3 f 1 f f 3 f 4 = expaψy 1 y y 3 y 4 z 1 z z 3 0 9

10 10 A = i=1 Yi 1 + f i y i 3 i=1 Z i z i ad the otatio 0 stads for the value at the poit that y 1 y y 3 y 4 z 1 z z 3 = The Ct = ΘΨy 1 y y 3 y 4 z 1 z z Θ = E[expA]. Note that Y 1 Y Y 3 Y 4 Z 1 Z ad Z 3 are idepedet ad Y i χ f i i = Z i N0 1 i = 1 3. Cosiderig the expectatio with respect to Y i s ad Z i s we have Θ = exp 1 f i log y i=1 i f i=1 i y i i=1 1 = exp f i yi zi + R 1 = 1 + i=1 i=1 1 f i yi + 1 i=1 3 z i=1 i z i + R 11 R 1 ad R are remaider terms which are O uder A1. Substitutig 11 ito 10 we have Ct = e t / b k it k + R 3 R 3 is a remaider term which has the same property as R 1. Ivertig the above expasio of the characteristic fuctio we ca obtai a asymptotic expasio of the distributio of T up to the order O 1 uder A1 which is give as the followig theorem. Theorem 1. Assume that x Π 1. Let U V be the radom variables defied as 4 ad let Z be a stadard ormal radom variable which is idepedet of U V. Let y = v 1/ 0 x + u 0 u 0 = ρ 1 ω ρ 3 ω + p } m + 1 ρ 1 The it holds that + 1 N v 0 = m + 1 m + P V 1/ Z U x = Φy 1 ω + p. b k H k 1 yϕy + O uder A1 Φ deotes the cumulative distributio fuctio of the stadard ormal distributio ϕ is the derivative of Φ ad H k x deotes the Hermite polyomial of degree k especially H 0 x = }

11 11 1 H 1 x = x H x = x 1 H 3 x = x 3 3x H 4 x = x 4 6x + 3. Here b 1 = 1 u 0 + ρ 1 m + 1 v0 b = 3 m p 1 p + m + 1 ρ 1 + b 3 = 1 v0 b 4 = + 1m p m p 1 4m ω + ρ τ + 1 v 0 m + 1 ρ 1ω ρ 3 + m + 1 u 0 + u 0 + m+1 ρ } 1ω p + ω 1 p ω m + 1 u 0 p 1 m + 1 m + ρ 3 ω p +. ω Corollary. Let g w c; N 1 N be the expected probability of misclassificatio of W - rule with cutoff poit c whe x Π 1. Let y w = v 1/ w The it holds that uder A1 c + u w u w = u w N 1 N = 1 m + 1 v w = v w N 1 N = g w c; N 1 N = Φy w m + 1 m + p p + Np }. l k H k 1 y w ϕy w + O l k = l k N 1 N for k = are give as follows. l 1 = u w + 1 } vw m + 1 l = 3 m p 1 N + 1m + + N 1N + Np N 1N + p + m + 1 N 1 m + 1 NN ] p 1 N + m + 1 m + NN 1 l 3 = 1 vw m + 1 u w + N u w + 1 N N 1 m+1 N + Np N 1N l 4 = m p 1 4m N N 1N + Np N 1N + N + 1 [ v w m + 1 u w 1 + } + Np N 1N Corollary 3. Let g z c ; N 1 N be the expected probability of misclassificatio of Z- rule with a cutoff poit c whe x Π 1 c = 1 + N 1 /1 + N N }c. Let y z = vz 1/ c+u z u z = u z N 1 N = N ω1 1 m + 1 N 1 N ω 1 v z = v z N 1 N + 1 N = m + 1 ω1 m + + p ω1 } }.

12 1 ad The it holds that uder A1 ω 1 = ω 1 N 1 N = 1 + N 1 a1/ } ω = ω N 1 N = 1 + N 1 + a1/ } a = an 1 N = 1 + N N1 1. g z c ; N 1 N = Φy z 1 ζ k H k 1 y z ϕy z + O ζ k = ζ k N 1 N for k = are give as follows. ζ 1 = u z m + 1 vz ζ = 3 m p 1 + 1m + + ω1 + p ω1 + 1 N v z m + 1 u z + m + 1 ω N 1 N + ζ 3 = 1 vz m + 1 u ω } z + 1u z + p ω1 ζ 4 = m p 1 4m ω1 4 + p ω1 p 1 N m + 1 ω m p ω 1 }. } There are some results o asymptotic results o the EPMC of W - ad Z- rules uder A1 see e.g. Seo ad Fujikoshi [9] Fujikoshi [6] Matsumoto [14] Hyodo ad Kubokawa [11] etc. It may be oted that our results have bee give a uified way for W - ad Z- rules ad so that their compariso becomes more easy. I fact i the ext sectio usig Corollaries ad 3 we show that Z-rule has a optimality i the compariso with W -rule. 4 Compariso of EERs Let π i be the prior probabilities of x drow from Π i for i = 1. The the expected error rate EER for W -rule with a cutoff poit c w is expressed as EER w c w = π 1 P W x < c w x Π 1 + π P W x > c w x Π. From Corollary the limit uder A1 is give as lim EER w c w A1 = π 1 Φ 1 c w γ γ 1 γ 0 1 γ0 + π Φ 1 c w γ 1 γγ γ + γ 1 + γ 0 1 γ0 0 + γ 1 + γ + γ 0 c w = 1 γ 0 c w 0 = lim p lim N 1 /N = γ ad lim p/ = γ 0. The miimum value with respect to c w or equivaletly c w is attaied at c w = c w0 = 1 1 γ 0 [ γ γ 1 γ γ γ + } γ 1 + γ 0 log π ]. 0 π 1

13 13 This result was poited by Hyodo ad Kubokawa [11] ad they studied asymptotic ubiased estimator for EER w c w0. From the above result we ca see that the limitig EER w c w takes the miimum value at c wm = c wm = 1 [ N p p + N 1 + Np 1 N p N N 1 N p log π ]. 1 π 1 For the case whe the prior probabilities are equal c wm = 1 N p p N p N N 1 ad the [ 1 lim A1 P W x < c wm x Π ] P W x > c wm x Π = Φ γ γ + γ 1 + γ 0 O the other had the EER for Z-rule with a cutoff poit c z is expressed as EER z c z = π 1 P Zx < c z x Π 1 + π P Zx > c z x Π. Let Usig Corollary 3 c 1 + N z = N N c z. lim EER z c A1 z = π 1 Φ 1 c z γ0 + π Φ 1 c z γ + γ 1 + γ 0 1 γ0 0 + γ 1 + γ + γ 0 c z = 1 γ 0 lim A1 c z = 1 γ 0 c z. The miimum value with respect to c z is attaied at c z = c z0 = 1 + γ + } γ 1 + γ 0 1 γ 0 log π. 0 π 1 This implies that the limitig EER for Z-rule i.e. lim A1 EER z c z takes the miimum value at c z = c zm = 1 + N N N } 1/ N 1 + Np 1 N p log π. 14 π 1 Whe the prior probabilities are equal c zm = 0 ad the the limitig error rate uder A1 is the same as 13. These imply that whe π 1 = π lim EER w c wm = lim EER z 0 A1 A1 which is equal to the right-had side of the equality i 13. I order to see the differece whe π 1 = π we eed to compare the ext terms of these asymptotic expasios. The fial result is give i the ext theorem.

14 14 Theorem 4. Let EER w c w ad EER z c z be the expected error rates of W - rule with a cutoff poit c w ad Z- rule with a cutoff poit c z respectively. 1 The miimums of EER w c w ad EER z c z are attaied at c w = c wm ad c w = c zm give i 1 ad 14 respectively. Whe π 1 = π it holds that EER w c wm EER z c zm = 1 1p 4v w m H 1 y c ϕy c + O y c = 1/ }. + Np +1 N 1N m+ Further sice H 1 y c = y c < 0 we have that EER z 0 is less tha or equal to EER w c wm eglectig the term of O. Whe N 1 = N the differece becomes O. The proof of Theorem 4 is give i Appedix A. We will show a asymptotic expasio for each of EER w 0 ad EER z 0 up to the terms of O 1 uder A0. Asymptotic expasio of EER w 0 is obtaied by usig Corollary i Okamoto [19] which is cited i Fujikoshi et al. [10] Corollary which is as follows. EER w 0 = Φ 1 + ϕ 1 [ } 1 4p ] p 1 + O. 4 Sice c wm = O 1 uder A0 we ca show that asymptotic expasio of EER w c wm up to the term of O 1 is the same as the oe of EER w 0. By virtue of COROLLARY i Memo ad Okamoto [15] which is cited i Fujikoshi et al. [10] Corollary 9.3. we have EER z 0 = Φ 1 + ϕ 1 [ } 1 4p ] p 1 + O. 4 These results imply that EER w 0 ad EER z 0 are the same up to the terms of O 1 uder A0. Hyodo ad Kubokawa [11] proposed a variable selectio procedure for W -rule i two-group discrimiat aalysis for high-dimesioal data. Their criteria is based o a estimator of EER w c wm which is ubiased up to the term of order O p 1 uder A1. From the above result we thik that Hyodo ad Kubokawa [11] s criteria with beig recosidered for Z-rule outperforms their criteria i terms of selectig the true set of variables. I Sectio 6 we give a asymptotic estimator of EER w c wm with errors of ON. 5 Estimatio for EPMCs I this sectio we cosider to estimate the expected probabilities of misclassificatio; e w 1 e w 1 e z 1 ad e z 1. These deped o ukow parameter. It is importat to estimate. A well

15 15 kow ubiased estimator is give by = p 1 D pn. Here we deote by. Such covetioal otatio is used hereafter. Firstly we give a stochastic expressio of D which is essetially stated i Lemma 3. Lemma 4. The followig equality holds i distributio: D D = + N z1 y 1 N z 1 + N y y 1 χ p + 1 y χ p 1 z 1 N0 1 ad y 1 y z 1 are idepedet. From Lemma 4 it ca be see that has cosistecy uder A1. Therefore from Corollary a cosistet estimator of g w c; N 1 N uder A1 is obtaied as Φ c + u wn 1 N v w N 1 N. Similarly a cosistet estimator of g z c ; N 1 N is obtaied as Φ c + u zn 1 N v z N 1 N. 15 However v z N 1 N does ot always take o-egative values ad so it is importat to modify the estimator such that it takes always o-egative values. For the purpose istead of usig we use The it ca be see that A = p + 1 D p ω. v w N 1 N + 1 A = m + 1m + D v z N 1 N + 1 A = m + 1m + D which take o-egative values. So we propose a cosistet estimator of misclassificatio probability give as i the followig theorem. Theorem 5. Assume that lim p > 0. The uder A1 P W x < c x Π 1 Φ c + u wn 1 N A v w N 1 N p 0 A P Zx < c x Π 1 Φ c + u zn 1 N A v z N 1 N p 0. A
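As an illustration of the estimator of $\Delta^2$ discussed in this section, the following is a minimal sketch of the classical bias-corrected form $\hat{\Delta}^2 = \{(n - p - 1)/n\} D^2 - pN/(N_1 N_2)$ with $n = N_1 + N_2 - 2$ and $N = N_1 + N_2$, which is unbiased for $\Delta^2$ under normality when $n > p + 1$. The function names are hypothetical, and the sketch does not implement the non-negative modification proposed in the text.

```python
import numpy as np

def mahalanobis_d2(X1, X2):
    """Squared sample Mahalanobis distance D^2 based on the pooled covariance S."""
    n = len(X1) + len(X2) - 2
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    R1, R2 = X1 - m1, X2 - m2
    S = (R1.T @ R1 + R2.T @ R2) / n
    d = m1 - m2
    return float(d @ np.linalg.solve(S, d))

def delta2_unbiased(D2, N1, N2, p):
    """Bias-corrected estimator {(n - p - 1)/n} D^2 - p N / (N1 N2)."""
    n = N1 + N2 - 2
    N = N1 + N2
    return (n - p - 1) / n * D2 - p * N / (N1 * N2)
```

Because of the subtracted term $pN/(N_1 N_2)$, this estimator can be negative when $D^2$ is small, which is why a modification that always takes non-negative values is needed before the estimate is plugged into expressions involving square roots.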

16 16 We exted the result give i Theorem 5 to the oe based o asymptotic expasios. From Theorem 1 we ca see that y u 0 v 0 b 1... b 4 are fuctios of. Defie ŷ û 0 v 0 b 1... b 4 be the oes obtaied from y u 0 v 0 b 1... b 4 by replacig with A. Theorem 6. Assume that x Π 1 ad let U V ad Z be the same oes as i Theorem 1. The uder A1 ε k = ε k A with P V 1/ Z U x = E [ Φŷ + 1 ] ε k b k H k 1 ŷϕŷ + O ε 1 = 1 v0 m + 1 ρ 1ω ρ 3 ωd + r 1 ε = λ d + r ρ 1 ω ρ 3 ω d 1 v 0 m + 1 ε 3 = λ 1 v0 m + 1 ρ 1ω ρ 3 ωd 1 ε 4 = λ 8 d 1. Here d 1 = f 1 d = f } Np 1 + 4N Np 1 + f } + N λ = N v 0 m + 1 ω m + Np 1 r 1 = p ω. } Np 1 The proof of Theorem 6 is give i Appedix B. From Theorem 6 we ca obtai estimators of the misclassificatio probabilities for W -rule ad Z-rule which are ubiased up to the term with the order O 1 uder A1. These results are summarized as the followig corollaries. Corollary 7. Let ŷ w = v w 1/ c + û w û w = u w N 1 N w ad v w = v w N 1 N w with The g w c; N 1 N = E w = A = p + 1 D Np. [ Φŷ w + 1 ] ε wk l k H k 1 ŷ w ϕŷ w + O l k = l k N 1 N w ad ε wk = ε wk w. Here ε wk is the same as ε k give i

17 17 Theorem 6 i.e. ε w1 = 1 v w m + 1 d + r w1 ε w = λ w d + r w v w ε w3 = λ w 1 4 vw m + 1 d 1 ε w4 = λ w 8 d 1 d 1 m + 1 with λ w = 1 v w + 1 m + 1 m + ad r w1 = N. Corollary 8. Let ŷ z = v z 1/ c + û z û z = u z N 1 N z ad v z = v z N 1 N z with The g z c ; N 1 N = E z = A = p + 1 D p ω 1. [ Φŷ z + 1 ] ĝ zk ζ k H k 1 ŷ z ϕŷ z + O ζ k = ζ k N 1 N z ad ε zk = ε zk z. Here ε zk is the same as ε k give i Theorem 6 i.e. ε z1 = 1 vz m + 1 ε z = λ z d + r v z ε z3 = λ z 1 vz m + 1 ε z4 = λ z 8 d 1 N ω 1 1 ω 1 d + r z1 N ω1 m + 1 N 1 N ω d 1 N ω 1 1 ω 1 d 1 with λ z = N v z m + 1 ω1 m + N 1 N ad r z1 = Np 1 p ω 1. 6 Asymptotic ubiased estimator of EER I this sectio we propose asymptotically ubiased estimators of EERs for W -rule ad Z-rule uder A1. Our estimators are costructed usig the followig result. Lemma 5. Let ŷ c = 1/ w }. w + Np +1 N 1N m+

18 18 The the followig equalities hold uder A1: [ ] E Φ ŷ c + 1 ε wk w H k 1 ŷ c ϕ ŷ c = Φ y c + O ] E [ lk w = l k + O 1 k = Sice Lemma 5 ca be similarly proved with Theorem 6 we omit the proof. From 5 ad Lemma 5 we ca give a asymptotically ubiased estimator of EER w c wm uder A1 as i the followig theorem. Theorem 9. A estimator of EER w c wm whose bias is of O uder A1 is give by η wk = η wk w ad ÊER w c wm = Φ ŷ c 1 η wk H k 1 ŷ c ϕ ŷ c 16 η wk = ε wk l k. To costruct asymptotically ubiased estimator for EER z 0 we use the followig result. Lemma 6. For ŷ c defied i Lemma 6 [ ] 1 E H 1 ŷ c ϕ ŷ c v w uder A1. = 1 v w H 1 y c ϕy c + O 1 Lemma 6 ca be similarly proved with Theorem 6 ad so we omit the proof. From 37 Theorem 9 ad Lemma 6 we ca get a asymptotically ubiased estimator of EER z 0 uder A1 which is give i the followig theorem. Theorem 10. A estimator of EER z 0 whose bias is of O uder A1 is give by ÊER z 0 = ÊER w c wm + 1 1p 4 v w m H 1 ŷ c ϕ ŷ c. 17 We ote that ÊER z 0 is the same as ÊER w c wm for the case i which N 1 = N. I Sectio 4 we metioed that EER w 0 ad EER z 0 are the same up to the term of order O 1 uder A0. McLachla [16] gave a asymptotically ubiased estimator of EER w 0 uder A0 which is also a asymptotically ubiased estimator of EER z 0 which are stated as follows. Theorem 11. A estimator of EER w 0 whose bias is of O uder A0 is give by ÊER w 0 = Φ 1 D + ϕ 1 [ 1 1 D + 1 p 1 D + D ] 3 44p 1 D }. 18 This is also a estimator of EER z 0 whose bias is of O. Sice a asymptotic expasio of EER w c wm up to the term of O 1 is the same as the oe of EER w 0 uder A0 the right-had side of the equality i 18 is also a estimator of EER w c wm whose bias is of O uder A0.

7 Numerical comparisons for asymptotic approximations

To compare the accuracies of the derived asymptotic expansion approximations for the misclassification probabilities with those of the limiting approximation of Fujikoshi and Seo [9], we calculated these values for $p = 8, 32$ and several settings of $(N_1, N_2)$ and $\Delta$; the setting of $\Delta$ follows Wyman et al. [24]. The settings for $N$ and $p$ were treated as the case in which $p : N = 1 : 5$ when $p = 8$ and the case in which $p : N = 1 : 5$ when $p = 32$. We considered the case in which the cut-off point $c$ is zero. Table 1 gives the values of $e_W(2|1)$ and $e_Z(2|1)$ when $p = 8$, and Table 2 gives the values of $e_W(2|1)$ and $e_Z(2|1)$ when $p = 32$. In these tables, the value of the limiting approximation of Fujikoshi and Seo [9] is given in column "FS's Approx.", and the value of the asymptotic expansion based on Corollary 2 for the W-rule and Corollary 3 for the Z-rule is given in column "YSF's AE".

To compare the accuracy, we need the values of the misclassification probabilities calculated by simulation. When we treat the distributions of the W-rule and the Z-rule, without loss of generality, from the invariance of the distribution under orthogonal transformations of the observation vector, we may assume that the two normal populations with the same covariance matrix are

$$\Pi_1 : N_p\left( (\delta/2) e_1, I_p \right), \qquad \Pi_2 : N_p\left( -(\delta/2) e_1, I_p \right), \qquad e_1 = (1, 0, \ldots, 0)'.$$

To compute the misclassification probability, we generate $10^4$ training samples. For each training sample we generate $10^4$ test samples in which the observation vectors are i.i.d. $N_p((\delta/2) e_1, I_p)$. The value of the conditional misclassification probability is calculated by

$$\frac{\text{number of misclassifications}}{10^4} \qquad (19)$$

in each training sample. We took the average of these $10^4$ values of the conditional misclassification probability and wrote it as the value of the misclassification probability in column "Sim" in Tables 1 and 2.

From Tables 1 and 2 we can see that our proposed asymptotic expansion approximations have good accuracy compared to the limiting approximation of Fujikoshi and Seo [9] when $N_1 \neq N_2$. In addition, for the case in which $N_1 = N_2$, our proposed asymptotic expansion has almost the same precision as the limiting approximation of Fujikoshi and Seo [9].

In Table 3 we give the values of $EER_w(0)$, $EER_w(c_{wm})$ and $EER_z(0)$ obtained by simulation for the case $\pi_1 = \pi_2$. The simulation setting and computation method are the same as those in Table 1. We can see a tendency from Table 3 that the magnitudes of these error rates have the order $EER_z(0) < EER_w(c_{wm}) < EER_w(0)$ for almost all simulation settings. There is little difference between the magnitudes of $EER_z(0)$ and $EER_w(c_{wm})$ when $N_1 = N_2$ (the difference appears only beyond the 4th decimal place).

We also checked the precision of the proposed asymptotically unbiased estimators of $e_W(2|1)$ and $e_Z(2|1)$ by simulation. The simulation setting and the procedure of calculation are the same as those in Table 1.
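The Monte Carlo procedure of this section can be sketched as follows for the W-rule. This is only an illustrative sketch under stated assumptions: the function name `simulate_epmc_w` and its parameters are hypothetical, `delta` is the scalar separation so that the population means are $\pm(\delta/2) e_1$ with identity covariance, and the default numbers of replications are much smaller than the $10^4$ used in the paper so that the example runs quickly.

```python
import numpy as np

def simulate_epmc_w(p, N1, N2, delta, n_train=100, n_test=1000, c=0.0, seed=0):
    """Monte Carlo estimate of P(W(x) < c | x in population 1).

    For each training sample, the conditional misclassification probability
    is estimated by (number of misclassified test points) / n_test, and the
    returned value is the average over the n_train training samples.
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros(p)
    mu[0] = delta / 2.0          # population 1 mean; population 2 mean is -mu
    n = N1 + N2 - 2              # requires n >= p for S to be invertible
    rates = np.empty(n_train)
    for r in range(n_train):
        X1 = rng.normal(size=(N1, p)) + mu
        X2 = rng.normal(size=(N2, p)) - mu
        m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
        R1, R2 = X1 - m1, X2 - m2
        S = (R1.T @ R1 + R2.T @ R2) / n
        coef = np.linalg.solve(S, m1 - m2)        # S^{-1}(m1 - m2)
        X_test = rng.normal(size=(n_test, p)) + mu  # test points from population 1
        W = (X_test - 0.5 * (m1 + m2)) @ coef
        rates[r] = np.mean(W < c)
    return float(rates.mean())
```

A call such as `simulate_epmc_w(p=8, N1=10, N2=30, delta=2.0)` gives a rough estimate of $e_W(2|1)$ for one setting; the corresponding Z-rule estimate would replace the line computing `W` by the Z statistic.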

Table 1: Comparison of approximations of $e_w(2|1)$ and $e_z(2|1)$ for cut-off point 0 when $p = 8$
Columns: $(N_1, N_2)$; for $e_w(2|1)$: FS's Approx., YSF's AE, Sim; for $e_z(2|1)$: FS's Approx., YSF's AE, Sim

Table 2: Comparison of approximations of $e_w(2|1)$ and $e_z(2|1)$ for cut-off point 0 when $p = 32$
Columns: $(N_1, N_2)$; for $e_w(2|1)$: FS's Approx., YSF's AE, Sim; for $e_z(2|1)$: FS's Approx., YSF's AE, Sim

Table 3: Comparison of EER
Columns: $(N_1, N_2)$; for $p = 8$: $EER_w(0)$, $EER_w(c_{wm})$, $EER_z(0)$; for $p = 32$: $EER_w(0)$, $EER_w(c_{wm})$, $EER_z(0)$

As a competitor, we used the estimator obtained from the limiting approximation of Fujikoshi and Seo [9] by replacing $\Delta^2$ with $\max\{\hat{\Delta}^2, 0\}$, and wrote the value in column "FS's Est" in Table 4 for $p = 8$ and in Table 5 for $p = 32$. The value in parentheses in column "FS's Est" is obtained by computing the mean squared error

$$\mathrm{MSE} = \frac{1}{10^4} \sum_{i=1}^{10^4} \left( \text{FS's Est}_i - \text{Sim} \right)^2, \qquad (20)$$

where FS's Est$_i$ stands for the value calculated from the $i$-th training sample and Sim stands for the misclassification probability computed by simulation. From Tables 4 and 5 we can see that the proposed estimators have good accuracy compared to the method based on the approximation of Fujikoshi and Seo [9]. The MSE for the proposed estimator is smaller than that for the method based on the approximation of Fujikoshi and Seo [9] for a particular value of $\Delta$.

We also compared the precision of the asymptotically unbiased estimators of the expected error rate by simulation for several values of $p$ with $N_1 = N_2 = 20$. The setting of $\Delta$ is the same as in Table 5. We computed the values of the two types of conditional misclassification probabilities by (19) and took the average of these values. This calculation is repeated $10^3$ times. We wrote the average of these $10^3$ values in column "Sim" in Table 6. The column "A0" in Table 6 gives the averages of $10^3$ replicated values of (18), and the column "A1" gives the averages of $10^3$ replicated values of (16). We computed the mean squared error by a calculation similar to (20) and wrote it in parentheses in each case. Table 6 shows that the precision of approximation (18) becomes worse as the dimensionality gets large. We can check that the precision of approximation (16) is good for each value of $p$.

Table 4: Estimated values of $e_w(2|1)$ and $e_z(2|1)$ for cut-off point 0 when $p = 8$
Columns: $N_1$, $N_2$; for $e_w(2|1)$: FS's Est (MSE), YSF's Est (MSE), sim; for $e_z(2|1)$: FS's Est (MSE), YSF's Est (MSE), sim

Table 5: Estimated values of $e_w(2|1)$ and $e_z(2|1)$ for cut-off point 0 when $p = 3$
Columns: $N_1$, $N_2$; for $e_w(2|1)$: FS's Est (MSE), YSF's Est (MSE), sim; for $e_z(2|1)$: FS's Est (MSE), YSF's Est (MSE), sim

Table 6: Comparison of asymptotically unbiased estimates of EER for the W-rule when $N_1 = N_2 = 20$
Panels: $p = 1$, $p = 2$, $p = 5$, $p = 8$, $p = 16$, $p = 3$; columns in each panel: A0, A1, Sim

8 Criteria for selection of variables

We consider the problem of selecting variables in two-group discriminant analysis. McLachlan [16, 17] proposed a criterion based on an asymptotically unbiased estimator of the expected error rate under A0. In this section we derive such a criterion under A1. The problem of selection of variables is to identify a sub-vector $x(j) = (x_{j_1}, \ldots, x_{j_k})'$ of $x$ which corresponds to a subset $j$ of the subscripts $\{1, \ldots, p\}$, where $k = k(j)$ is the cardinal number of $j$, i.e. $k(j) = \#(j)$. Let $J$ be the family of all possible subsets of $\{1, \ldots, p\}$. Then the problem may be regarded as how to select the best subset $j$ from $J$. If we only use $x(j)$ in discriminant analysis, the corresponding W and Z discriminant functions become

$$W(x(j)) = \{\bar x_1(j) - \bar x_2(j)\}' S(j)^{-1} \left[ x(j) - \tfrac{1}{2}\{\bar x_1(j) + \bar x_2(j)\} \right],$$

$$Z(x(j)) = \tfrac{1}{2}\left[ \frac{N_2}{N_2 + 1}\{x(j) - \bar x_2(j)\}' S(j)^{-1}\{x(j) - \bar x_2(j)\} - \frac{N_1}{N_1 + 1}\{x(j) - \bar x_1(j)\}' S(j)^{-1}\{x(j) - \bar x_1(j)\} \right],$$

where $\bar x_g(j)$ and $S(j)$ denote $\bar x_g$ and $S$ corresponding to $x(j)$, respectively. The expected error rate for the W-rule on the model $j$ is expressed as

$$\mathrm{EER}_w(c_w; j) = \pi_1 P(W(x(j)) < c_w \mid x \in \Pi_1) + \pi_2 P(W(x(j)) > c_w \mid x \in \Pi_2),$$

and the one for the Z-rule is expressed as

$$\mathrm{EER}_z(c_z; j) = \pi_1 P(Z(x(j)) < c_z \mid x \in \Pi_1) + \pi_2 P(Z(x(j)) > c_z \mid x \in \Pi_2).$$
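The two discriminant functions restricted to a subset of variables can be sketched numerically. The Python sketch below assumes the usual forms: W is the linear discriminant function, and Z is half the difference of the two sample Mahalanobis-type distances weighted by $N_g/(N_g+1)$, signed so that large values favour $\Pi_1$. The factor 1/2, the weights, and the helper names `w_statistic`, `z_statistic` and `restrict` are assumptions made for this illustration and should be checked against the paper's definitions.

```python
import numpy as np


def w_statistic(x, xbar1, xbar2, S):
    """W(x) = (xbar1 - xbar2)' S^{-1} {x - (xbar1 + xbar2) / 2}."""
    d = xbar1 - xbar2
    return float(d @ np.linalg.solve(S, x - 0.5 * (xbar1 + xbar2)))


def z_statistic(x, xbar1, xbar2, S, n1, n2):
    """Z(x), assumed form: half the difference of weighted squared distances."""
    r1 = x - xbar1
    r2 = x - xbar2
    q1 = float(r1 @ np.linalg.solve(S, r1))  # squared distance to group 1 mean
    q2 = float(r2 @ np.linalg.solve(S, r2))  # squared distance to group 2 mean
    return 0.5 * (n2 / (n2 + 1) * q2 - n1 / (n1 + 1) * q1)


def restrict(j, x, xbar1, xbar2, S):
    """Keep only the coordinates in subset j (0-based indices)."""
    idx = np.asarray(sorted(j))
    return x[idx], xbar1[idx], xbar2[idx], S[np.ix_(idx, idx)]
```

A subset $j$ is evaluated by first calling `restrict` and then passing the restricted quantities to either statistic, so the same two functions serve every candidate model.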

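8.1 Criterion for selection of variables

For ease of explanation we consider the case that $\pi_1 = \pi_2$. Set the cut-off point for the W-rule corresponding to $x(j)$ as $c_{wm}(j)$, that is, the cut-off point $c_{wm}$ with $p$ replaced by $k(j)$. From the statement in Section 4, the limiting value of $\mathrm{EER}_w(c_{wm}(j); j)$ takes the minimum value under the high-dimensional asymptotic framework A1(j):

A1(j): $k(j), N_1, N_2 \to \infty$, $k(j)/n \to \gamma(j) \in [0, 1)$ and $N_1/N_2 \to \gamma \in (0, \infty)$.

We see that $\lim_{A1(j)} \mathrm{EER}_z(0; j)$ takes the minimum value, and that

$$\lim_{A1(j)} \mathrm{EER}_z(0; j) = \lim_{A1(j)} \mathrm{EER}_w(c_{wm}(j); j). \qquad (21)$$

To obtain the criterion for variable selection, we first derive asymptotically unbiased estimators of $\mathrm{EER}_w(c_{wm}(j); j)$ and $\mathrm{EER}_z(0; j)$. Since the framework A1(j) is obtained from A1 by replacing $p$ in A1 with $k(j)$, it follows from the expansion (25) that

$$\mathrm{EER}_w(c_{wm}(j); j) = \Phi(y_c(j)) - \frac{1}{n}\sum_{i=1}^{4} \bar l_i(j) H_{i-1}(y_c(j))\,\phi(y_c(j)) + O_2,$$

where

$$y_c(j) = -\frac{1}{2}\Delta^2(j)\left[\left\{\Delta^2(j) + \frac{N k(j)}{N_1 N_2}\right\}\frac{n+1}{m(j)+2}\right]^{-1/2},$$

$m(j) = n - k(j)$, and $\Delta(j)$ and $\bar l_i(j)$ denote $\Delta$ and $\bar l_i$, which is given in (26), corresponding to $x(j)$. By the same reasoning, we find from Theorem 4 that $\mathrm{EER}_z(0; j)$ equals $\mathrm{EER}_w(c_{wm}(j); j)$ plus an $O_1$ correction term that is a multiple of $H_1(y_c(j))\,\phi(y_c(j))$, with a coefficient involving $k(j)$, $v_w(j)$, $m(j)$, $N_1$ and $N_2$, up to an error of $O_2$, where $v_w(j)$ is the $v_w$ corresponding to $x(j)$. The following theorem gives asymptotically unbiased estimates of $\mathrm{EER}_w(c_{wm}(j); j)$ and $\mathrm{EER}_z(0; j)$.

Theorem 12. Let

$$\hat y_c(j) = -\frac{1}{2}\hat\Delta_w^2(j)\left[\left\{\hat\Delta_w^2(j) + \frac{N k(j)}{N_1 N_2}\right\}\frac{n+1}{m(j)+2}\right]^{-1/2},$$

where $\hat\Delta_w^2(j)$ is the $\hat\Delta_w^2$ given in Corollary 7 corresponding to the model $j$. Let

$$G_{wh}(j) = \hat y_c(j) + \frac{1}{n}\sum_{i=1}^{4} \eta_{wi}(j) H_{i-1}(\hat y_c(j)),$$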

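where $\eta_{wi}(j) = \eta_{wi}(\hat\Delta_w^2(j); j)$ and $\eta_{wi}(\Delta^2(j); j) = \varepsilon_{wi}(j) - \bar l_i(j)$. Then, under the high-dimensional asymptotic framework A1(j),

$$\mathrm{EER}_w(c_{wm}(j); j) = E[\Phi(G_{wh}(j))] + O_2.$$

Let $G_{zh}(j)$ be $G_{wh}(j)$ plus an $O_1$ correction term that is a multiple of $H_1(\hat y_c(j))\,\phi(\hat y_c(j))$, with a coefficient involving $k(j)$, $\hat v_w(j)$ and $m(j)$, where $\hat v_w(j)$ is computed from $\hat\Delta_w^2(j)$, $m(j)$, $N$ and $k(j)$. Then, under the high-dimensional asymptotic framework A1(j),

$$\mathrm{EER}_z(0; j) = E[\Phi(G_{zh}(j))] + O_2.$$

For the W-rule with the cut-off point $c_{wm}(j)$, we propose the selection method based on $M_{wh}(j) = \Phi(G_{wh}(j))$, and for the Z-rule with the cut-off point 0, the one based on $M_{zh}(j) = \Phi(G_{zh}(j))$. The selected model $\hat j_{M_{wh}}$ satisfies $M_{wh}(\hat j_{M_{wh}}) = \min_{j \in J} M_{wh}(j)$, and $\hat j_{M_{zh}}$ satisfies $M_{zh}(\hat j_{M_{zh}}) = \min_{j \in J} M_{zh}(j)$. Since $\Phi$ is a monotone increasing function, $\hat j_{M_{wh}}$ and $\hat j_{M_{zh}}$ minimize $G_{wh}(j)$ and $G_{zh}(j)$, respectively.

8.2 Relationship between expected error rate and no additional information model

First we consider the no additional information model $\Omega(j)$, which leads $x(j)$ to be the best subset of variables. Define $\Omega(j)$ as

$$\Omega(j): a_k \neq 0 \text{ for any } k \in j \text{ and } a_k = 0 \text{ for any } k \in j^c \cap \{1, \ldots, p\}, \qquad (23)$$

where $a = (a_1, \ldots, a_p)' = \Sigma^{-1}(\mu_1 - \mu_2)$. Let $j_0$ be a fixed subset in $J$. We may assume without loss of generality that $j_0 = \{1, \ldots, k_0\}$. We call $\Omega(j_0)$ the true model if $\{\mu_1, \mu_2, \Sigma\}$ satisfies the condition in (23). Let $J_1 = \{j \in J : j \supseteq j_0\}$ and $J_2 = J_1^c \cap J$. It is known (see e.g. Fujikoshi [4]) that $\Omega(j_0)$ is true if and only if

$$\Delta(j) = \Delta \text{ for any } j \in J_1 \text{ and } \Delta(j) < \Delta \text{ for any } j \in J_2. \qquad (24)$$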
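The selection rule of Section 8.1 picks the subset that minimizes a criterion over the whole family $J$. A minimal Python sketch of that exhaustive minimization follows, assuming the criterion is supplied as a callable that maps a subset to a number. The names `all_subsets`, `select_by_criterion` and `criterion` are hypothetical, and the sketch does not implement $M_{wh}$ or $M_{zh}$ themselves.

```python
from itertools import combinations


def all_subsets(p):
    """Yield every non-empty subset of {0, ..., p-1} as a sorted tuple."""
    for k in range(1, p + 1):
        yield from combinations(range(p), k)


def select_by_criterion(p, criterion):
    """Return the subset minimizing `criterion` over all 2**p - 1 candidates."""
    return min(all_subsets(p), key=criterion)
```

Counting the candidates with `sum(1 for _ in all_subsets(10))` gives 1023, which illustrates why an exhaustive search quickly becomes impractical as $p$ grows.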

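Theorem 13. Assume that $\Omega(j_0)$ is true and $k_0$ is fixed. Then it holds that

(i) $\lim_{A1}\{\mathrm{EER}_w(c_{wm}(j); j) - \mathrm{EER}_w(c_{wm}(j_0); j_0)\} \ge 0$ for $j \in J_1 \setminus \{j_0\}$,
(ii) $\lim_{A1}\{\mathrm{EER}_w(c_{wm}(j); j) - \mathrm{EER}_w(c_{wm}(j_0); j_0)\} > 0$ for $j \in J_2$.

In addition, it holds that

(iii) $\lim_{A1}\{\mathrm{EER}_z(0; j) - \mathrm{EER}_z(0; j_0)\} \ge 0$ for $j \in J_1 \setminus \{j_0\}$,
(iv) $\lim_{A1}\{\mathrm{EER}_z(0; j) - \mathrm{EER}_z(0; j_0)\} > 0$ for $j \in J_2$.

Proof. From (21) it is sufficient to prove (i) and (ii) only. It follows from (13) that

$$\lim_{A1(j)} \mathrm{EER}_w(c_{wm}(j); j) = \Phi\left(-\frac{1}{2}\Delta_0^2(j)\left[\frac{1 - \gamma(j)}{\Delta_0^2(j) + (2 + \gamma + \gamma^{-1})\gamma(j)}\right]^{1/2}\right),$$

where $\Delta_0^2(j)$ is a positive value defined as $\Delta_0^2(j) = \lim_{k(j) \to \infty} \Delta^2(j)$. From the assumption we have

$$\lim_{A1} \mathrm{EER}_w(c_{wm}(j_0); j_0) = \lim_{A0} \mathrm{EER}_w(c_{wm}(j_0); j_0) = \Phi\left(-\tfrac{1}{2}\Delta(j_0)\right).$$

Since $0 \le \gamma(j) < 1$, it holds that

$$-\frac{1}{2}\Delta_0^2(j)\left[\frac{1 - \gamma(j)}{\Delta_0^2(j) + (2 + \gamma + \gamma^{-1})\gamma(j)}\right]^{1/2} \ge -\frac{1}{2}\Delta_0(j),$$

where the equality holds for the case in which $\gamma(j) = 0$. The condition (24) implies that

$$-\tfrac{1}{2}\Delta_0(j) = -\tfrac{1}{2}\Delta(j_0) \text{ for } j \in J_1, \qquad -\tfrac{1}{2}\Delta_0(j) > -\tfrac{1}{2}\Delta(j_0) \text{ for } j \in J_2,$$

which proves (i) and (ii) for the case in which $k(j) \to \infty$ as $p \to \infty$. Assume next that $k(j)$ is fixed. Then

$$\lim_{A1} \mathrm{EER}_w(c_{wm}(j); j) = \lim_{A0} \mathrm{EER}_w(c_{wm}(j); j) = \Phi\left(-\tfrac{1}{2}\Delta(j)\right),$$

which leads to (i) and (ii).

From Theorem 13 we can regard $\Omega(j)$ as a minimal realization of the parametric model such that $\mathrm{EER}_w(c_{wm}(j); j)$ and $\mathrm{EER}_z(0; j)$ are minimum in the sense of (i)-(iv). Note that, because (i) and (iii) are not strict inequalities, minimizing $\mathrm{EER}_w(c_{wm}(j); j)$ or $\mathrm{EER}_z(0; j)$ may lead to a model $j$ in which the selected variables are overspecified.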
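The limiting error rate used in the proof can be evaluated numerically to see how it reacts to the dimension ratio. The Python sketch below assumes the limiting form $\Phi\big(-\tfrac{1}{2}\Delta_0^2\,[(1-\gamma_j)/\{\Delta_0^2 + (2+\gamma+\gamma^{-1})\gamma_j\}]^{1/2}\big)$, which reduces to $\Phi(-\Delta_0/2)$ when $\gamma_j = 0$. The function name `limiting_eer` and its argument names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def limiting_eer(delta0, gamma_j, gamma=1.0):
    """Limiting expected error rate under the assumed high-dimensional form.

    delta0  : limiting Mahalanobis distance for the subset (positive)
    gamma_j : limit of k(j)/n, in [0, 1)
    gamma   : limit of N1/N2, positive
    """
    d2 = delta0 ** 2
    c = 2.0 + gamma + 1.0 / gamma
    arg = -0.5 * d2 * sqrt(1.0 - gamma_j) / sqrt(d2 + c * gamma_j)
    return NormalDist().cdf(arg)
```

With `delta0 = 2.0` and `gamma_j = 0.0` the value is $\Phi(-1) \approx 0.1587$, and increasing `gamma_j` with the distance held fixed raises the error rate, which is the mechanism behind the non-strict inequalities (i) and (iii).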

8.3 Simulation

From Theorem 13, $\mathrm{EER}_w(c_{wm}(j_0); j_0)$ and $\mathrm{EER}_z(0; j_0)$ become minimum over $j \in J$ in the limiting sense when the model $\Omega(j_0)$ is true. So in this simulation we set the parameters $\{\mu_1, \mu_2, \Sigma\}$ so that they satisfy (23). The covariance matrix $\Sigma$ is assumed to be $I_p$. We set $\mu_2 = -\mu_1$, where each of the first $k$ elements of $\mu_1$ is $\alpha/2$ and the remaining elements are 0. Simulation experiments were carried out for the case in which $N_1 = N_2 = 80$, $k = p/10, p/5, p/2$, $p = \ldots$ and $\alpha = 1, 2, 4$. When $\#(j)$ is small, the precision of the approximation of $M_{wh}(j)$ is not good. So we use a switching procedure in which the model is selected by $M(j)$ for $\#(j) < 5$ and by $M_{wh}(j)$ for $\#(j) \ge 5$, where $M(j)$ is the base of the selection criterion given in Fujikoshi [5]. Since the number of candidate models is too large for large $p$ (e.g. the number of candidate models is 1023 when $p = 10$), we use the forward stepwise selection method. The details of the selection method are given in Algorithm 1, in which the minima at step $l$ are taken over the subsets with $\#(j) = l$ obtained by adding one variable to $j_{temp}$.

Algorithm 1 The algorithm for forward stepwise selection method
1: $j_{can} \leftarrow \emptyset$
2: $j_{temp} \leftarrow \arg\min_{\#(j)=1} M(j)$
3: $t \leftarrow \min_{\#(j)=1} M(j)$
4: $l \leftarrow 2$
5: while $j_{can} = \emptyset$ and $l < 5$ do
6:   if $t > \min_{\#(j)=l} M(j)$ then
7:     $j_{temp} \leftarrow j_{temp} \cup \arg\min_{\#(j)=l} M(j)$
8:     $t \leftarrow \min_{\#(j)=l} M(j)$
9:     $l \leftarrow l + 1$
10:   else
11:     $j_{can} \leftarrow j_{temp}$
12:   end if
13: end while
14: while $j_{can} = \emptyset$ and $5 \le l \le p$ do
15:   if $t > \min_{\#(j)=l} M_{wh}(j)$ then
16:     $j_{temp} \leftarrow j_{temp} \cup \arg\min_{\#(j)=l} M_{wh}(j)$
17:     $t \leftarrow \min_{\#(j)=l} M_{wh}(j)$
18:     $l \leftarrow l + 1$
19:   else
20:     $j_{can} \leftarrow j_{temp}$
21:   end if
22: end while
23: return $j_{can}$ if $j_{can} \neq \emptyset$, otherwise $j_{temp}$

Table 7 gives the frequencies of selected variables for 100 trials, where $j_{0,k+l} = \{j \in J_1 : \#(j) = k + l\}$. We can see from the table that there are few models selected in $J_2$ when $(p, k) = (30, 3)$, $(p, k) = (30, 6)$ and $(p, k) = \ldots$. On the other hand, the proposed method rarely selects a model in $J_2$ when $(p, k) = \ldots$. These results are considered to be guaranteed by assertions (i) and (ii) in Theorem 13. However, it seems that the proposed method rarely selects the true model. It is expected

to construct a variable selection criterion which is consistent.

Table 7: Frequencies of proposed variable selection for 100 trials
Columns: $(p, k)$, $\alpha$, $j_0$, $j_{0,k+1}$, $j_{0,k+2}$, $\bigcup_{l \ge 3} j_{0,k+l}$, $J_2$

9 Concluding remarks

In this paper, asymptotic expansions of the expected probabilities of misclassification for the W- and Z-rules were derived up to the term of $O_1$ under the high-dimensional asymptotic framework A1. It may be noted that the orders of their errors are $O_2$. We compared the expected error rates of these two rules asymptotically for the case in which the prior probabilities are equal. It is known that, under the large-sample asymptotic framework A0, the expected error rates of these two rules are asymptotically equal. However, from our high-dimensional asymptotic results we have shown that the expected error rate of the Z-rule is lower than or equal to that of the W-rule. We proposed asymptotically unbiased estimators of the expected probability of misclassification for the W- and Z-rules under the high-dimensional asymptotic framework A1. In addition, asymptotically unbiased estimators of the expected error rate for these two rules were derived. Based on these unbiased estimators, we proposed variable selection criteria. Our asymptotic approximations were examined numerically.

A Proof of Theorem 4

In this section we give a proof of Theorem 4.

Proof of Theorem 4. From Corollary 2,

$$g_w(c_{wm}; N_1, N_2) = \Phi(y_{wm}) - \frac{1}{n}\sum_{k=1}^{4} l_k H_{k-1}(y_{wm})\,\phi(y_{wm}) + O_2$$

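under A1, where $y_{wm} = y_c + n^{-1} R_4$ with

$$y_c = -\frac{1}{2}\Delta^2\left[\left\{\Delta^2 + \frac{Np}{N_1 N_2}\right\}\frac{n+1}{m+2}\right]^{-1/2}$$

and $R_4 = R_4(N_1, N_2)$ = N N 1 p p m+ } + Np +1 N 1N m+ } p m+ }. Then we have

$$g_w(c_{wm}; N_1, N_2) = \Phi(y_c) + \frac{1}{n}\left\{R_4(N_1, N_2) - \sum_{k=1}^{4} l_k(N_1, N_2) H_{k-1}(y_c)\right\}\phi(y_c) + O_2.$$

It follows from the duality that

$$g_w(c_{wm}; N_2, N_1) = \Phi(y_c) + \frac{1}{n}\left\{R_4(N_2, N_1) - \sum_{k=1}^{4} l_k(N_2, N_1) H_{k-1}(y_c)\right\}\phi(y_c) + O_2.$$

From these expansions we obtain

$$\mathrm{EER}_w(c_{wm}) = \Phi(y_c) - \frac{1}{n}\sum_{k=1}^{4} \bar l_k H_{k-1}(y_c)\,\phi(y_c) + O_2, \qquad (25)$$

where $\bar l_k = \tfrac{1}{2}\{l_k(N_1, N_2) + l_k(N_2, N_1)\}$ for $k = 1, \ldots, 4$. Here

l 1 = m + 1 y c
l = 3 N m p 1 + 1m + + N 1N + + Np N + 1 [ y v w m + 1 c v w N 1N + 1 p p } p + 4 m + 1 m N1 + m N ] m + NN NN 1
l 3 = y c m N 1 + Np N 1N
l 4 = l 4 N 1 N. (26)

On the other hand, it follows from Corollary 3 that

$$g_z(0; N_1, N_2) = \Phi(y_{zm}) - \frac{1}{n}\sum_{k=1}^{4} \zeta_k H_{k-1}(y_c)\,\phi(y_c) + O_2, \qquad (27)$$

where

ω N 1 N } 1 y zm = y zm N 1 N =. [ + p ω 1 N 1 N } ] +1 m+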

Write a(N 1, N) = 1 + N N1 1 = 1 + N 1. From the Maclaurin expansion for 1 + x 1 1/ we have

a 1/ = N N1 1 N N = 1 + x N N N N1 1 + O 3. (28)

It also holds that

N1 1 = 1 N1 1 + N N. (29)

Substituting (29) into (28) we obtain

a 1/ = N N N1 1 1 N N O 3,

and so

ω 1 N 1 N } = N + 1 N 1 + O, (30)
4 N 1 ω N 1 N } = N 1 + O = 41 + x + O N 1.

From the Maclaurin expansion for 1 x 1/ we have

ω N 1 N } 1 = N O. (32) 8 N 1

Substituting (30) and (32) into $y_{zm}(N_1, N_2)$ and expanding it asymptotically under A1, we have

y zm N 1 N = y c + 1 AN 1 N + 1 BN 1 N H 1 y c + O (33) v w

where

AN 1 N = 1 3 N 4 N 1 } + 1p 1 N
BN 1 N = m + 1. m + 4 N 1

Substituting (33) into (27) and expanding it asymptotically under A1, we obtain

g z 0; N 1 N = Φy c + 1 [ AN 1 N + 1 } BN 1 N H 1 y c v w } ζ k N 1 N H k 1 y c ϕy c + O. (34)

It follows from the duality that

g z 0; N N 1 = Φy c + 1 [ AN N } BN N 1 H 1 y c v w } ζ k N N 1 H k 1 y c ϕy c + O. (35)

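From (34) and (35) we can obtain the following asymptotic expansion of $\mathrm{EER}_z(0)$:

$$\mathrm{EER}_z(0) = \Phi(y_c) + \frac{1}{n}\left[\left\{\bar A + \frac{1}{v_w}\bar B\right\} H_1(y_c) - \sum_{k=1}^{4} \bar\zeta_k H_{k-1}(y_c)\right]\phi(y_c) + O_2, \qquad (36)$$

where

$\bar A = \tfrac{1}{2}\{A(N_1, N_2) + A(N_2, N_1)\}$ = 1 N 8,
$\bar B = \tfrac{1}{2}\{B(N_1, N_2) + B(N_2, N_1)\}$ = + 1p 4m + 1 m + N 1 N,
$\bar\zeta_k = \tfrac{1}{2}\{\zeta_k(N_1, N_2) + \zeta_k(N_2, N_1)\}$ for $k = 1, \ldots, 4$.

Substituting (30) and (32) into $\bar\zeta_k$ and expanding it asymptotically, we have

$\bar\zeta_1 = \bar l_1 + O_1$,
ζ = 3 m p 1 + 1m + + N N 1N + Np N 1N N v w m + 1 y c v w + 4m + 1 m + } + O 1,
$\bar\zeta_3 = \bar l_3 + O_1$,
$\bar\zeta_4 = \bar l_4 + O_1$.

From (25) and (36),

$$\mathrm{EER}_w(c_{wm}) - \mathrm{EER}_z(0) = -\frac{1}{n}\left\{\bar l_2 - \bar\zeta_2 + \bar A + \frac{1}{v_w}\bar B\right\} H_1(y_c)\,\phi(y_c) + O_2.$$

It can be computed that

l ζ = N + 1 1p v w m m + 1 m + N } + O 1.

Since it can be expressed that

1 N + 8 N = N 1N 8N,

we find that

l ζ + Ā + 1 B = p v w 4v w m O 1.

This gives

EER w c wm EER z 0 = 1 1p 4v w m H 1 y c ϕy c + O. (37)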

B Proof of Theorem 6

In this section we give a proof of Theorem 6.

Proof of Theorem 6. For dealing with the expectation of A, it is no loss of generality from Lemma 4 that A may be

A = f 1 y 1 + N z1 N z 1 + N y p ω (38)

where $y_1$, $y_2$ and $z_1$ are defined in Lemma 4, $f_1 = m + 1$ and $f_2 = p - 1$. Let $w_i = (f_i/2)^{1/2}(y_i/f_i - 1)$ for $i = 1, 2$. It can be expressed that

f 1 y 1 = 1 + k 5 1 w 1 + w 1 f 1 f /f 1 w 1

Replacing $f_1/y_1$ in (38) with this expression and expanding the resultant expression, we have

A = + r / D D + R (39)

where

Np 1 r 1 = p ω
D 1 = D 1 w 1 w z 1 = + = f 1 f 1 } Np 1 N w 1 z 1 + Np 1 Np 1 w f }
D = D w 1 w z 1 + } N w1 Np 1 + w 1 z 1 w + N z f 1 f N 1 N 1

and $R$ is a remainder term consisting of $n^{-3/2}$ times a homogeneous polynomial of degree 3 in $w_1$, $w_2$ and $z_1$ whose coefficients are $O(1)$ under A1, plus $n^{-2}$ times a homogeneous polynomial of degree 4, plus a remainder term that is $O_{5/2}$ under A1 for fixed $w_1$, $w_2$ and $z_1$. From the assumption that ω = N/ + O 1, we find that r 1 = N + Op, which yields $r_1 = O_1$ under A1. It follows from (39) that

û 0 = u 0 + m + 1 ρ 1ω ρ 3 ω r 1 + 1/ D D + R N
v 0 = v 0 + m + 1 ω r 1 + 1/ D D + R m +

The Taylor series expansion of v 1/ 0 at v 0 = v 0 up to the term with the order O v 0 v 0 5 gives

v 1/ 0 = v 1/ 0 1 λ r 1 λ D 1/ λ D + 3 } 8 λ D1 + R 3 (41)

Constrained linear discriminant rule for 2-groups via the Studentized classification statistic W for large dimension

Constrained linear discriminant rule for 2-groups via the Studentized classification statistic W for large dimension Costraied liear discrimiat rule for -groups via the Studetized classificatio statistic W for large dimesio Takayuki Yamada Istitute for Comprehesive Educatio Ceter of Geeral Educatio Kagoshima Uiversity

More information

A statistical method to determine sample size to estimate characteristic value of soil parameters

A statistical method to determine sample size to estimate characteristic value of soil parameters A statistical method to determie sample size to estimate characteristic value of soil parameters Y. Hojo, B. Setiawa 2 ad M. Suzuki 3 Abstract Sample size is a importat factor to be cosidered i determiig

More information

Asymptotic Results for the Linear Regression Model

Asymptotic Results for the Linear Regression Model Asymptotic Results for the Liear Regressio Model C. Fli November 29, 2000 1. Asymptotic Results uder Classical Assumptios The followig results apply to the liear regressio model y = Xβ + ε, where X is

More information

Efficient GMM LECTURE 12 GMM II

Efficient GMM LECTURE 12 GMM II DECEMBER 1 010 LECTURE 1 II Efficiet The estimator depeds o the choice of the weight matrix A. The efficiet estimator is the oe that has the smallest asymptotic variace amog all estimators defied by differet

More information

Lecture 33: Bootstrap

Lecture 33: Bootstrap Lecture 33: ootstrap Motivatio To evaluate ad compare differet estimators, we eed cosistet estimators of variaces or asymptotic variaces of estimators. This is also importat for hypothesis testig ad cofidece

More information

Random Variables, Sampling and Estimation

Random Variables, Sampling and Estimation Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

Lecture 19: Convergence

Lecture 19: Convergence Lecture 19: Covergece Asymptotic approach I statistical aalysis or iferece, a key to the success of fidig a good procedure is beig able to fid some momets ad/or distributios of various statistics. I may

More information

Properties and Hypothesis Testing

Properties and Hypothesis Testing Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.

More information

ECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors

ECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors ECONOMETRIC THEORY MODULE XIII Lecture - 34 Asymptotic Theory ad Stochastic Regressors Dr. Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Asymptotic theory The asymptotic

More information

Rank tests and regression rank scores tests in measurement error models

Rank tests and regression rank scores tests in measurement error models Rak tests ad regressio rak scores tests i measuremet error models J. Jurečková ad A.K.Md.E. Saleh Charles Uiversity i Prague ad Carleto Uiversity i Ottawa Abstract The rak ad regressio rak score tests

More information

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1. Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chi-square Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio

More information

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y

More information

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4 MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.

More information

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Convergence of random variables. (telegram style notes) P.J.C. Spreij Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space

More information

Strong consistency of log-likelihood-based information criterion in high-dimensional canonical correlation analysis

Strong consistency of log-likelihood-based information criterion in high-dimensional canonical correlation analysis Strog cosistecy of log-likelihood-based iformatio criterio i high-dimesioal caoical correlatio aalysis Ryoya Oda, Hirokazu Yaagihara ad Yasuori Fujikoshi Departmet of Mathematics, Graduate School of Sciece,

More information

The variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2.

The variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2. SAMPLE STATISTICS A radom sample x 1,x,,x from a distributio f(x) is a set of idepedetly ad idetically variables with x i f(x) for all i Their joit pdf is f(x 1,x,,x )=f(x 1 )f(x ) f(x )= f(x i ) The sample

More information

Statistical Inference Based on Extremum Estimators

Statistical Inference Based on Extremum Estimators T. Rotheberg Fall, 2007 Statistical Iferece Based o Extremum Estimators Itroductio Suppose 0, the true value of a p-dimesioal parameter, is kow to lie i some subset S R p : Ofte we choose to estimate 0

More information

Lecture 3. Properties of Summary Statistics: Sampling Distribution

Lecture 3. Properties of Summary Statistics: Sampling Distribution Lecture 3 Properties of Summary Statistics: Samplig Distributio Mai Theme How ca we use math to justify that our umerical summaries from the sample are good summaries of the populatio? Lecture Summary

More information

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen) Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................

More information

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample. Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized

More information

Slide Set 13 Linear Model with Endogenous Regressors and the GMM estimator

Slide Set 13 Linear Model with Endogenous Regressors and the GMM estimator Slide Set 13 Liear Model with Edogeous Regressors ad the GMM estimator Pietro Coretto pcoretto@uisa.it Ecoometrics Master i Ecoomics ad Fiace (MEF) Uiversità degli Studi di Napoli Federico II Versio: Friday

More information

Estimation for Complete Data

Estimation for Complete Data Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of

More information

Algebra of Least Squares

Algebra of Least Squares October 19, 2018 Algebra of Least Squares Geometry of Least Squares Recall that out data is like a table [Y X] where Y collects observatios o the depedet variable Y ad X collects observatios o the k-dimesioal

More information

Since X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain

Since X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain Assigmet 9 Exercise 5.5 Let X biomial, p, where p 0, 1 is ukow. Obtai cofidece itervals for p i two differet ways: a Sice X / p d N0, p1 p], the variace of the limitig distributio depeds oly o p. Use the

More information

Introductory statistics

Introductory statistics CM9S: Machie Learig for Bioiformatics Lecture - 03/3/06 Itroductory statistics Lecturer: Sriram Sakararama Scribe: Sriram Sakararama We will provide a overview of statistical iferece focussig o the key

More information

Stochastic Simulation

Stochastic Simulation Stochastic Simulatio 1 Itroductio Readig Assigmet: Read Chapter 1 of text. We shall itroduce may of the key issues to be discussed i this course via a couple of model problems. Model Problem 1 (Jackso

More information

Supplemental Material: Proofs

Supplemental Material: Proofs Proof to Theorem Supplemetal Material: Proofs Proof. Let be the miimal umber of traiig items to esure a uique solutio θ. First cosider the case. It happes if ad oly if θ ad Rak(A) d, which is a special

More information

Problem Set 4 Due Oct, 12

Problem Set 4 Due Oct, 12 EE226: Radom Processes i Systems Lecturer: Jea C. Walrad Problem Set 4 Due Oct, 12 Fall 06 GSI: Assae Gueye This problem set essetially reviews detectio theory ad hypothesis testig ad some basic otios

More information

First Year Quantitative Comp Exam Spring, Part I - 203A. f X (x) = 0 otherwise

First Year Quantitative Comp Exam Spring, Part I - 203A. f X (x) = 0 otherwise First Year Quatitative Comp Exam Sprig, 2012 Istructio: There are three parts. Aswer every questio i every part. Questio I-1 Part I - 203A A radom variable X is distributed with the margial desity: >

More information

The Method of Least Squares. To understand least squares fitting of data.

The Method of Least Squares. To understand least squares fitting of data. The Method of Least Squares KEY WORDS Curve fittig, least square GOAL To uderstad least squares fittig of data To uderstad the least squares solutio of icosistet systems of liear equatios 1 Motivatio Curve

More information

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara Poit Estimator Eco 325 Notes o Poit Estimator ad Cofidece Iterval 1 By Hiro Kasahara Parameter, Estimator, ad Estimate The ormal probability desity fuctio is fully characterized by two costats: populatio

More information

Basics of Probability Theory (for Theory of Computation courses)

Basics of Probability Theory (for Theory of Computation courses) Basics of Probability Theory (for Theory of Computatio courses) Oded Goldreich Departmet of Computer Sciece Weizma Istitute of Sciece Rehovot, Israel. oded.goldreich@weizma.ac.il November 24, 2008 Preface.

More information

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015 ECE 8527: Itroductio to Machie Learig ad Patter Recogitio Midterm # 1 Vaishali Ami Fall, 2015 tue39624@temple.edu Problem No. 1: Cosider a two-class discrete distributio problem: ω 1 :{[0,0], [2,0], [2,2],

More information

A RANK STATISTIC FOR NON-PARAMETRIC K-SAMPLE AND CHANGE POINT PROBLEMS

A RANK STATISTIC FOR NON-PARAMETRIC K-SAMPLE AND CHANGE POINT PROBLEMS J. Japa Statist. Soc. Vol. 41 No. 1 2011 67 73 A RANK STATISTIC FOR NON-PARAMETRIC K-SAMPLE AND CHANGE POINT PROBLEMS Yoichi Nishiyama* We cosider k-sample ad chage poit problems for idepedet data i a

More information

CSE 527, Additional notes on MLE & EM

CSE 527, Additional notes on MLE & EM CSE 57 Lecture Notes: MLE & EM CSE 57, Additioal otes o MLE & EM Based o earlier otes by C. Grat & M. Narasimha Itroductio Last lecture we bega a examiatio of model based clusterig. This lecture will be

More information

Performance Accuracy of Linear classifiers for Two-level Multivariate Observations in High-dimensional Framework

Performance Accuracy of Linear classifiers for Two-level Multivariate Observations in High-dimensional Framework THE UNIVERSITY OF TEXAS AT SAN ANTONIO, COLLEGE OF BUSINESS Workig Paper SERIES Date February, 6 WP # MGST-53-6 Performace Accuracy of Liear classifiers for Two-level Multivariate Observatios i High-dimesioal

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Problem Set 2 Solutions

Problem Set 2 Solutions CS271 Radomess & Computatio, Sprig 2018 Problem Set 2 Solutios Poit totals are i the margi; the maximum total umber of poits was 52. 1. Probabilistic method for domiatig sets 6pts Pick a radom subset S

More information

Geometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT

Geometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT OCTOBER 7, 2016 LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT Geometry of LS We ca thik of y ad the colums of X as members of the -dimesioal Euclidea space R Oe ca

More information

4. Partial Sums and the Central Limit Theorem

4. Partial Sums and the Central Limit Theorem 1 of 10 7/16/2009 6:05 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 4. Partial Sums ad the Cetral Limit Theorem The cetral limit theorem ad the law of large umbers are the two fudametal theorems

More information

Summary and Discussion on Simultaneous Analysis of Lasso and Dantzig Selector

Summary and Discussion on Simultaneous Analysis of Lasso and Dantzig Selector Summary ad Discussio o Simultaeous Aalysis of Lasso ad Datzig Selector STAT732, Sprig 28 Duzhe Wag May 4, 28 Abstract This is a discussio o the work i Bickel, Ritov ad Tsybakov (29). We begi with a short

More information

Lecture 2: Monte Carlo Simulation

Lecture 2: Monte Carlo Simulation STAT/Q SCI 43: Itroductio to Resamplig ethods Sprig 27 Istructor: Ye-Chi Che Lecture 2: ote Carlo Simulatio 2 ote Carlo Itegratio Assume we wat to evaluate the followig itegratio: e x3 dx What ca we do?

More information

Study the bias (due to the nite dimensional approximation) and variance of the estimators

Study the bias (due to the nite dimensional approximation) and variance of the estimators 2 Series Methods 2. Geeral Approach A model has parameters (; ) where is ite-dimesioal ad is oparametric. (Sometimes, there is o :) We will focus o regressio. The fuctio is approximated by a series a ite

More information

Lecture 7: Properties of Random Samples

Lecture 7: Properties of Random Samples Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ

More information

[412] A TEST FOR HOMOGENEITY OF THE MARGINAL DISTRIBUTIONS IN A TWO-WAY CLASSIFICATION

[412] A TEST FOR HOMOGENEITY OF THE MARGINAL DISTRIBUTIONS IN A TWO-WAY CLASSIFICATION [412] A TEST FOR HOMOGENEITY OF THE MARGINAL DISTRIBUTIONS IN A TWO-WAY CLASSIFICATION BY ALAN STUART Divisio of Research Techiques, Lodo School of Ecoomics 1. INTRODUCTION There are several circumstaces

More information

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality

More information

Statistics 511 Additional Materials

Statistics 511 Additional Materials Cofidece Itervals o mu Statistics 511 Additioal Materials This topic officially moves us from probability to statistics. We begi to discuss makig ifereces about the populatio. Oe way to differetiate probability

More information

Lecture 22: Review for Exam 2. 1 Basic Model Assumptions (without Gaussian Noise)

Lecture 22: Review for Exam 2. 1 Basic Model Assumptions (without Gaussian Noise) Lecture 22: Review for Exam 2 Basic Model Assumptios (without Gaussia Noise) We model oe cotiuous respose variable Y, as a liear fuctio of p umerical predictors, plus oise: Y = β 0 + β X +... β p X p +

More information

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence
