Simultaneous critical values for t-tests in very high dimensions


Bernoulli 17(1), 2011, DOI: /10-BEJ272

HONGYUAN CAO (1) and MICHAEL R. KOSOROK (2)

(1) Department of Health Studies, 5841 South Maryland Avenue MC 2007, University of Chicago, Chicago, IL, 60637, USA. E-mail: hycao@uchicago.edu
(2) Department of Biostatistics and Department of Statistics and Operations Research, 3101 McGavran-Greenberg Hall, CB 7420, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA. E-mail: kosorok@unc.edu

This article considers the problem of multiple hypothesis testing using t-tests. The observed data are assumed to be independently generated conditional on an underlying and unknown two-state hidden model. We propose an asymptotically valid data-driven procedure to find critical values for rejection regions controlling the k-familywise error rate (k-FWER), false discovery rate (FDR) and the tail probability of false discovery proportion (FDTP) by using one-sample and two-sample t-statistics. We only require a finite fourth moment plus some very general conditions on the mean and variance of the population by virtue of the moderate deviations properties of t-statistics. A new consistent estimator for the proportion of alternative hypotheses is developed. Simulation studies support our theoretical results and demonstrate that the power of a multiple testing procedure can be substantially improved by using critical values directly, as opposed to the conventional p-value approach. Our method is applied in an analysis of the microarray data from a leukemia cancer study that involves testing a large number of hypotheses simultaneously.

Keywords: empirical processes; FDR; high dimension; microarrays; multiple hypothesis testing; one-sample t-statistics; self-normalized moderate deviation; two-sample t-statistics

1. Introduction

Among the many challenges raised by the analysis of large data sets is the problem of multiple testing.
Examples include functional magnetic resonance imaging, source detection in astronomy and microarray analysis in genetics and molecular biology. It is now common practice to simultaneously measure thousands of variables or features in a variety of biological studies. Many of these high-dimensional biological studies are aimed at identifying features showing a biological signal of interest, usually through the application of large-scale significance testing. The possible outcomes are summarized in Table 1.

Table 1. Outcomes when testing $m$ hypotheses

Hypothesis          Accept   Reject   Total
Null true           U        V        m_0
Alternative true    F        S        m_1
Total               W        R        m

Traditional methods that provide strong control of the familywise error rate (FWER $= P(V \ge 1)$) often have low power and can be unduly conservative in many applications. One way around this is to increase the number $k$ of false rejections one is willing to tolerate. This results in a relaxed version of FWER, $k$-FWER $= P(V \ge k)$. Benjamini and Hochberg [1] (hereafter referred to as BH) pioneered an alternative. Define the false discovery proportion (FDP) to be the number of false rejections divided by the number of rejections (FDP $= V/(R \vee 1)$). The only effect of the $R \vee 1$ in the denominator is that the ratio $V/R$ is set to zero when $R = 0$. Without loss of generality, we treat FDP $= V/R$ and define the false discovery tail probability FDTP $= P(V \ge \alpha R)$, where $\alpha$ is pre-specified, based on the application.

Several papers have developed procedures for FDTP control. We shall not attempt a complete review here, but mention the following: van der Laan, Dudoit and Pollard [26] proposed an augmentation-based procedure, Lehmann and Romano [18] derived a step-down procedure and Genovese and Wasserman [13] suggested an inversion-based procedure, which is equivalent to the procedure of [26] under mild conditions [13].

The false discovery rate (FDR) is the expected FDP. BH provided a distribution-free, finite-sample method for choosing a p-value threshold that guarantees that the FDR is less than a target level $\gamma$. Since this publication, there has been a considerable amount of research on both the theory and application of FDR control. Benjamini and Hochberg [2] and Benjamini and Yekutieli [3] extended the BH method to a class of dependent tests. A Bayesian mixture model approach to obtain multiple testing procedures controlling the FDR is considered in [11,21–24]. Wu [29] considered the conditional dependence model under the assumption of Donsker properties of the indicator function of the true state for each hypothesis and derived asymptotic properties of false discovery proportions and numbers of rejected hypotheses. A systematic study of multiple testing procedures is given in the book [9]. Other related work can be found in [6,7].

One challenge in multiple hypothesis testing is that many procedures depend on the proportion of null hypotheses, which is not known in reality. Estimating this proportion has long been known as a difficult problem. There have been some interesting developments recently, for example, the approach of [20] (see also [11,13,17,19]).
Roughly speaking, these approaches are only successful under a condition which [13] calls the purity condition. Unfortunately, the purity condition depends on p-values and is hard to check in practice.

The general framework for k-FWER, FDTP, FDR control and the estimation of the proportion of alternative hypotheses is based on p-values which are assumed to be known in advance or can be accurately approximated. However, the assumption that p-values are always available is not realistic. In some special settings, approximate p-values have been shown to be asymptotically equivalent to exact p-values for controlling FDR [12,16]. However, these approximations are only helpful in certain simultaneous error control settings and are not universally applicable. Moreover, if the p-values are not reliable, any procedures derived later are problematic.

This motivates us to propose a method to find critical values directly for rejection regions to control k-FWER, FDTP and FDR by using one-sample and two-sample t-statistics. The advantage of using t-tests is that they require minimum conditions on the population, only existence of the fourth moment, which is relatively easily satisfied by most statistical distributions, rather than other stringent conditions such as the existence of the moment generating function. In addition, we approximate tail probabilities of both null and alternative hypotheses accurately, unlike p-value approaches, which only consider the case under null hypotheses. Thus, a better ranking of hypotheses is obtained. Furthermore, we propose a consistent estimate of the proportion of alternative hypotheses which only depends on test statistics. As long as the asymptotic distribution of the test statistic is known under the null hypothesis, we can apply our method to estimate this proportion, resulting in more precise cut-offs.

The BH procedure controls the FDR conservatively at $\pi_0 \gamma$, where $\pi_0$ is the proportion of null hypotheses and $\gamma$ is the targeted significance level. If $\pi_0$ is much smaller than 1, then the statistical power is greatly compromised. The power we use in this paper is NDR $= E[S]/m_1$, as defined in [8]. In the situation that t-statistics can be used, our procedure gives a better approximation and more accurate critical values can be obtained by plugging in the estimate of $\pi_0$. The validity of our approach is guaranteed by empirical process methods and recent theoretical advances on self-normalized moderate deviations, in combination with Berry–Esseen-type bounds for central and non-central t-statistics.

To illustrate, we simulate a Markov chain, as in [25], of Bernoulli variables $(H_i)$, $i = 1,\ldots,5000$, to indicate the true state of each hypothesis test ($H_i = 1$ if the alternative is true; $H_i = 0$ if the null is true). Conditional on the indicator, observations $x_{ij}$, $i = 1,\ldots,5000$, $j = 1,\ldots,80$, are generated according to the model $x_{ij} = \mu_i + \epsilon_{ij}$. The one-sample t-statistic is used to perform simultaneous hypothesis testing. Figure 1 shows the plot of MCMC results of the realized and nominal FDR control based on the BH method for different control levels. From this plot, we can see that as the control level increases, the BH procedure becomes more and more conservative. For instance, the FDR actually obtained is when the nominal level is set at 0.2, reflecting a significant loss in power.
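For orientation, the BH step-up rule discussed above can be sketched in a few lines of Python. This is only an illustrative sketch under the assumption that p-values are already available; the function name `bh_reject` is hypothetical and is not taken from the paper or from any library.

```python
import numpy as np

def bh_reject(pvalues, gamma):
    """Benjamini-Hochberg step-up rule: boolean mask of rejected hypotheses.

    Illustrative helper (hypothetical name); assumes valid p-values are given.
    """
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)            # indices that sort the p-values
    sorted_p = p[order]
    # Find the largest k (1-based) with p_(k) <= k * gamma / m.
    below = sorted_p <= gamma * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        reject[order[:k + 1]] = True  # reject the k smallest p-values
    return reject
```

Because the rule is calibrated as if every hypothesis were null, its realized FDR is roughly $\pi_0 \gamma$, which is the conservativeness the paragraph above describes.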
The three methods of multiple testing control we utilize are k-FWER, FDTP and FDR. The criterion for using k-FWER is, asymptotically,

$$P(V \ge k) \le \gamma. \tag{1.1}$$

Since we only apply our method when there are discoveries ($R > 0$), we need the FDTP, with a given proportion $0 < \alpha < 1$ and significance level $0 < \gamma < 1$, to satisfy, asymptotically,

$$P(V \ge \alpha R) \le \gamma. \tag{1.2}$$

Similarly, the criterion for using FDR is, asymptotically,

$$\mathrm{FDR} \le \gamma \quad\text{or}\quad \int_0^1 P(V \ge \alpha R)\,d\alpha \le \gamma. \tag{1.3}$$

The main contributions of this paper are as follows: (1) Moderate deviation results from probability theory, which only require finiteness of the fourth moment of the population from which the statistic is computed, are applied in multiple testing. Thus, the applicability of this procedure is dramatically expanded: it can deal with non-normal populations and even highly skewed populations. (2) The critical values for rejection regions are computed directly, which circumvents the intermediate p-value step. (3) An asymptotically consistent estimation of the proportion of alternative hypotheses is developed for multiple testing procedures under very general conditions.

The remainder of the paper is organized as follows. In Section 2, we present the basic data structure, our goals, the procedures and theoretical results for the one-sample t-test. Two-sample t-test results are discussed in Section 3. Section 4 is devoted to numerical investigations using simulation and Section 5 applies our procedure to detect significantly expressed genes in a microarray study of leukemia cancer. Some concluding remarks and a discussion are given in Section 6. Proofs of results from Sections 2 and 3 are given in the Appendix.

Figure 1. Claimed and obtained FDR control using the BH procedure.

2. One-sample t-test

In this section, we first introduce the basic framework for simultaneous hypothesis testing, followed by our main results. Estimation of the unknown proportion of alternative hypotheses $\pi_1$ is presented next. We conclude the section by presenting theoretical results for the special case of completely independent observations. This special setting is the basis for the more general main results and is also of independent interest since fairly precise rates of convergence can be obtained.

2.1. Basic framework

As a specific application of multiple hypothesis testing in very high dimensions, we use gene expression microarray data. At the level of single genes, researchers seek to establish whether each gene in isolation behaves differently in a control versus a treatment situation. If the transcripts are pairwise under two conditions, then we can use a one-sample t-statistic to test for differential expression. The mathematical model is

$$X_{ij} = \mu_i + \epsilon_{ij}, \qquad 1 \le j \le n,\ 1 \le i \le m. \tag{2.1}$$

It should be noted that the following discussion is under this model and does not hold in general. Here, $X_{ij}$ represents the expression level in the $i$th gene and $j$th array. Since the subjects are independent, for each $i$, $\epsilon_{i1}, \epsilon_{i2}, \ldots, \epsilon_{in}$ are independent random variables with mean zero and variance $\sigma_i^2$. The null hypothesis is $\mu_i = 0$ and the alternative hypothesis is $\mu_i \ne 0$.

For the relationship between different genes, we propose the conditional independence model, as follows. Let $(H_i)$ be a $\{0, 1\}$-valued stationary process and, given $(H_i)$, $X_{ij}$, $i = 1,\ldots,m$, are independently generated. The dependence is imposed on the hypothesis $(H_i)$, where $H_i = 0$ if the null hypothesis is true and $H_i = 1$ if the alternative is true. From Table 1, we can see that $\sum_{i=1}^m H_i = m_1$ and $\sum_{i=1}^m (1 - H_i) = m_0$. It is assumed that $(H_i)$ satisfy a strong law of large numbers:

$$\frac{1}{m}\sum_{i=1}^m H_i \to \pi_1 \in (0, 1) \quad\text{a.s.} \tag{2.2}$$

This condition is satisfied in a variety of scenarios, for example, the independent case, Markov models and stationary models. Consider the one-sample t-statistic

$$T_i = \sqrt{n}\,\bar X_i / S_i,$$

where

$$\bar X_i = \frac{1}{n}\sum_{j=1}^n X_{ij}, \qquad S_i^2 = \frac{1}{n-1}\sum_{j=1}^n (X_{ij} - \bar X_i)^2.$$

If we use $t$ as a cut-off, then the number of rejected hypotheses and the number of false discoveries are, respectively,

$$R = \sum_{i=1}^m 1_{\{|T_i| \ge t\}}, \qquad V = \sum_{i=1}^m (1 - H_i)\,1_{\{|T_i| \ge t\}}. \tag{2.3}$$

Under the null hypothesis, it is well known that $T_i$ follows a Student t-distribution with $n - 1$ degrees of freedom if the sample is from a normal distribution. Asymptotic convergence to a standard normal distribution holds when the population is completely unknown, provided that it has a finite fourth moment under the null hypothesis.
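As a concrete illustration of the one-sample t-statistic and the counts in (2.3), the following Python sketch computes $T_i$ row by row from an $m \times n$ data matrix and then counts rejections and false discoveries for a two-sided cut-off $t$. The function names are hypothetical, and the indicator vector `h` of true states is of course only available in simulations.

```python
import numpy as np

def one_sample_t(x):
    """Row-wise T_i = sqrt(n) * mean_i / S_i for an (m, n) array x."""
    x = np.asarray(x, dtype=float)
    n = x.shape[1]
    xbar = x.mean(axis=1)
    s = x.std(axis=1, ddof=1)        # sample standard deviation, divisor n - 1
    return np.sqrt(n) * xbar / s

def count_rejections(t_stats, h, t):
    """Return (R, V) as in (2.3) for the two-sided rejection region |T_i| >= t.

    h holds the true states (0 = null, 1 = alternative), known only in simulation.
    """
    rejected = np.abs(np.asarray(t_stats, dtype=float)) >= t
    r = int(rejected.sum())
    v = int((rejected & (np.asarray(h) == 0)).sum())
    return r, v
```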
Moreover, under the alternative hypothesis, $T_i$ can also be approximated by a normal distribution, but with a shift in location. We will show that

$$F_0(t) := P(|T_i| \ge t \mid H_i = 0) = P(|Z| \ge t)\bigl(1 + o(1)\bigr) = 2\bar\Phi(t)\bigl(1 + o(1)\bigr), \tag{2.4}$$

$$F_1(t) := P(|T_i| \ge t \mid H_i = 1) = E\bigl[P\bigl(|Z + \sqrt{n}\mu_i/\sigma_i| \ge t \mid \mu_i, \sigma_i\bigr)\bigr]\bigl(1 + o(1)\bigr), \tag{2.5}$$

uniformly for $t = o(n^{1/6})$ under some regularity conditions, where $Z$ denotes the standard normal random variable, $\bar\Phi$ is the tail probability of the standard normal distribution and the critical values $t_{n,m}$ that control the FDTP and FDR asymptotically at prescribed level $\gamma$ are bounded. These assumptions are fairly realistic in practice. We do not require the critical value for k-FWER to be bounded. Although we do not typically know $m_1$, $F_0(t)$ or $F_1(t)$ in practice, we need the following theorem, the proof of which is given in the Appendix, as the first step. We will shortly extend this result, in Theorem 2.2 below, to permit estimation of the unknown quantities.

Theorem 2.1. Assume that $E(\epsilon_{ij} \mid \mu_i, \sigma_i^2) = 0$, $\mathrm{Var}(\epsilon_{ij} \mid \mu_i, \sigma_i^2) = \sigma_i^2$, $\limsup E\epsilon_{ij}^4 < \infty$, $0 < \pi_1 < 1 - \alpha$ and (2.2) is satisfied. Also, assume that there exist $\epsilon_0 > 0$ and $c_0 > 0$ such that

$$P\bigl(|\sqrt{n}\mu_i/\sigma_i| \ge \epsilon_0 \mid H_i = 1\bigr) \ge c_0 \qquad \text{for all } n \ge 1. \tag{2.6}$$

Let

$$\mu_m(t) = \alpha m_1 F_1(t) - (1 - \alpha) m_0 F_0(t) \tag{2.7}$$

and

$$\sigma_m^2(t) = \alpha^2 m_1 F_1(t)\bigl(1 - F_1(t)\bigr) + (1 - \alpha)^2 m_0 F_0(t)\bigl(1 - F_0(t)\bigr). \tag{2.8}$$

(i) If $t^{\mathrm{fdtp}}_{n,m}$ is chosen such that

$$t^{\mathrm{fdtp}}_{n,m} = \inf\{t : \mu_m(t)/\sigma_m(t) \ge z_\gamma\}, \tag{2.9}$$

where $z_\gamma$ is the $\gamma$th quantile of the standard normal distribution, then

$$\lim P(\mathrm{FDP} \ge \alpha) = \lim P(V \ge \alpha R) \le \gamma \tag{2.10}$$

holds.

(ii) If $t^{\mathrm{fdr}}_{n,m}$ is chosen such that

$$t^{\mathrm{fdr}}_{n,m} = \inf\Bigl\{t : \frac{m_0 F_0(t)}{m_0 F_0(t) + m_1 F_1(t)} \le \gamma\Bigr\}, \tag{2.11}$$

then

$$\lim \mathrm{FDR} = \lim E(V/R) \le \gamma \tag{2.12}$$

holds.

(iii) If $t^{k\text{-FWER}}_{n,m}$ is chosen such that

$$t^{k\text{-FWER}}_{n,m} = \inf\{t : P(\eta(t) \ge k) \le \gamma\}, \tag{2.13}$$

where $\eta(t) \sim \mathrm{Poisson}(\theta(t))$ and $\theta(t) = m_0 F_0(t)$, then

$$\lim k\text{-FWER} = \lim P(V \ge k) \le \gamma \tag{2.14}$$

holds.

Remark 2.1. In the next section, we use a Gaussian approximation for $F_0(t)$ and $F_1(t)$ for both FDTP and FDR, for which the critical values are shown to be bounded. In this case, $m$ can be arbitrarily large, while the critical value remains bounded. Due to sparsity, we use a Poisson approximation for k-FWER, for which the critical value is no longer bounded as $m \to \infty$, and we require $\log m = o(n^{1/3})$.

2.2. Main results

Note that in Theorem 2.1, there are an unknown parameter $m_1$ and unknown functions $F_0(t)$ and $F_1(t)$ involved in $\mu_m(t)$ and $\sigma_m(t)$. For practical settings, we need to estimate these quantities. We will begin by assuming that we have a strongly consistent estimate of $\pi_1$ and will then provide one such estimate in the next section. Given $H$, note that

$$p(t) = P(|T_i| \ge t) = (1 - H_i) P(|T_i| \ge t \mid H_i = 0) + H_i P(|T_i| \ge t \mid H_i = 1)$$

can be estimated from the empirical distribution $\hat p_m(t)$ of $\{|T_i|\}$, where

$$\hat p_m(t) = \frac{1}{m}\sum_{i=1}^m I_{\{|T_i| \ge t\}}, \tag{2.15}$$

and that $P(|T_i| \ge t \mid H_i = 0)$ is close to $P(|Z| \ge t)$ when $n$ is large, by (2.4). The next theorem, proved in the Appendix, provides a consistent estimate of the critical value $t_{n,m}$.

Theorem 2.2. Let

$$\nu_m(t) = \alpha \hat p_m(t) - 2(1 - \hat\pi_1)\bar\Phi(t) \tag{2.16}$$

and

$$\tau_m^2(t) = \alpha^2\bigl(\hat p_m(t) - 2(1 - \hat\pi_1)\bar\Phi(t)\bigr)\Bigl(1 - \frac{1}{\hat\pi_1}\bigl(\hat p_m(t) - 2(1 - \hat\pi_1)\bar\Phi(t)\bigr)\Bigr) + 2(1 - \alpha)^2(1 - \hat\pi_1)\bar\Phi(t)\bigl(1 - 2\bar\Phi(t)\bigr), \tag{2.17}$$

where $\hat\pi_1$ is a strongly consistent estimate of $\pi_1$. Assume that the conditions of Theorem 2.1 are satisfied.

(i) If $\hat t^{\mathrm{fdtp}}_{n,m}$ is chosen such that

$$\hat t^{\mathrm{fdtp}}_{n,m} = \inf\Bigl\{t : \frac{\sqrt{m}\,\nu_m(t)}{\tau_m(t)} \ge z_\gamma\Bigr\}, \tag{2.18}$$

then

$$|\hat t^{\mathrm{fdtp}}_{n,m} - t^{\mathrm{fdtp}}_{n,m}| = o(1) \quad\text{a.s.} \tag{2.19}$$

(ii) If $\hat t^{\mathrm{fdr}}_{n,m}$ is chosen such that

$$\hat t^{\mathrm{fdr}}_{n,m} = \inf\Bigl\{t : \frac{2(1 - \hat\pi_1)\bar\Phi(t)}{\hat p_m(t)} \le \gamma\Bigr\}, \tag{2.20}$$

then

$$|\hat t^{\mathrm{fdr}}_{n,m} - t^{\mathrm{fdr}}_{n,m}| = o(1) \quad\text{a.s.} \tag{2.21}$$

(iii) If $\hat t^{k\text{-FWER}}_{n,m}$ is chosen such that

$$\hat t^{k\text{-FWER}}_{n,m} = \inf\{t : P(\zeta(t) \ge k) \le \gamma\}, \tag{2.22}$$

where $\zeta(t) \sim \mathrm{Poisson}(\theta(t))$ and $\theta(t) = 2m(1 - \hat\pi_1)\bar\Phi(t)$, then, as long as $\log m = o(n^{1/3})$, we have

$$|\hat t^{k\text{-FWER}}_{n,m} - t^{k\text{-FWER}}_{n,m}| = o(1) \quad\text{a.s.} \tag{2.23}$$

Remark 2.2. This theorem deals with the general dependence case, where $(H_i)_{i \ge 1}$ is assumed to follow a two-state hidden model and the data are generated independently conditional on $(H_i)_{i \ge 1}$. The proof is mainly based on the independence case, which we present in Section 2.4 below, plus a conditioning argument.

2.3. Estimating $\pi_1$

In the previous section, we assumed that $\hat\pi_1$ was a consistent estimator of $\pi_1$. We now develop one such estimator. By the two-group nature of multiple testing, the test statistic is essentially a mixture of null and alternative hypotheses with proportion as a parameter. By virtue of moderate deviations, the distribution of t-statistics can be accurately approximated under both null and alternative hypotheses. However, for the alternative approximation, an unknown mean and variance are involved. So, we think of a functional transformation of the t-statistics which has a ceiling at 1 to first get a conservative estimate of $\pi_1$ which is consistent under certain conditions.

Let $c > 0$ and define $g_c(x) = \min(|x|, c)/c$. It is easy to see that $g_c$ is a decreasing function of $c$, bounded by 1, and that the derivative $dg_c/dc$ is bounded by $1/c$. Hence, the function class $\{g_c\}$ indexed by $c$ is a Donsker class and thus also Glivenko–Cantelli. Let

$$\hat g_c = \frac{1}{m}\sum_{i=1}^m g_c(T_i). \tag{2.24}$$

Theorem 2.3. We have

$$\pi_1 \ge \lim_{m,n}\ \sup_{c>0} \frac{\hat g_c - E(g_c(Z))}{1 - E(g_c(Z))} \quad\text{a.s.}$$

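If, in addition, we assume that

$$|\sqrt{n}\mu_i/\sigma_i| \to \infty \quad\text{for all } i \text{ with } H_i = 1,\ i = 1,\ldots,m, \quad\text{a.s. as } n \to \infty, \tag{2.25}$$

then

$$\pi_1 = \lim_{m,n}\ \sup_{c>0} \frac{\hat g_c - E(g_c(Z))}{1 - E(g_c(Z))} \quad\text{a.s.},$$

where

$$E(g_c(Z)) = \frac{2}{c\sqrt{2\pi}}\bigl(1 - e^{-c^2/2}\bigr) + 2\bar\Phi(c).$$

Proof. We can write

$$\hat g_c = \frac{\sum_{i=1}^m 1_{\{H_i=0\}}}{m}\cdot\frac{\sum_{i=1}^m g_c(T_i) 1_{\{H_i=0\}}}{\sum_{i=1}^m 1_{\{H_i=0\}}} + \frac{\sum_{i=1}^m 1_{\{H_i=1\}}}{m}\cdot\frac{\sum_{i=1}^m g_c(T_i) 1_{\{H_i=1\}}}{\sum_{i=1}^m 1_{\{H_i=1\}}} := \frac{m_0}{m}\,\mathrm{I} + \frac{m_1}{m}\,\mathrm{II}.$$

Let $H = \{H_i, 1 \le i \le m\}$. Conditional on $H$, $T_i$, $1 \le i \le m$, are independent random variables. We consider I first. Let

$$A_m(c) = \frac{\sum_{i=1}^m g_c(T_i \mid H) 1_{\{H_i=0\}}}{\sum_{i=1}^m 1_{\{H_i=0\}}} - \frac{\sum_{i=1}^m E(g_c(T_i \mid H)) 1_{\{H_i=0\}}}{\sum_{i=1}^m 1_{\{H_i=0\}}},$$

let $E$ be the infinite sequence $1_{\{H_1=0\}}, 1_{\{H_2=0\}}, \ldots$ and let $F$ be the event that $\sum_{i=1}^m 1_{\{H_i=0\}} \to \infty$ as $m \to \infty$. By the assumption (2.2), we know that $P(F) = 1$. Thus,

$$P\Bigl(\lim_{m\to\infty}\sup_{c>0}|A_m(c)| = 0\Bigr) = E\Bigl[P\Bigl(\lim_{m\to\infty}\sup_{c>0}|A_m(c)| = 0 \Bigm| E\Bigr)\Bigr] = 1,$$

where the second equality follows from the fact that, conditional on $E$, the terms in the sum are i.i.d. and thus the standard Glivenko–Cantelli theorem applies. Arguing similarly, based on conditioning on the sequence $1_{\{H_1=1\}}, 1_{\{H_2=1\}}, \ldots$, we can also establish that

$$\sup_{c>0}\Bigl|\frac{\sum_{i=1}^m g_c(T_i \mid H) 1_{\{H_i=1\}}}{\sum_{i=1}^m 1_{\{H_i=1\}}} - \frac{\sum_{i=1}^m E(g_c(T_i \mid H)) 1_{\{H_i=1\}}}{\sum_{i=1}^m 1_{\{H_i=1\}}}\Bigr| \to 0 \quad\text{a.s.}$$

Now, note that $\mathrm{II} \le 1$. Thus, since $m_0/m \to (1 - \pi_1)$ a.s. and $m_1/m \to \pi_1$ a.s., we have that when $m, n \to \infty$,

$$\hat g_c \le (1 - \pi_1) E(g_c(Z)) + \pi_1 = E(g_c(Z)) + \bigl(1 - E(g_c(Z))\bigr)\pi_1 \quad\text{a.s.}$$

We now have the following lower bound for $\pi_1$:

$$\pi_1 \ge \lim_{m,n}\ \sup_{c>0} \frac{\hat g_c - E(g_c(Z))}{1 - E(g_c(Z))} \quad\text{a.s.} \tag{2.26}$$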

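Define

$$\Delta_1 := (1 - \pi_1) E(g_c(Z)) + \pi_1 \frac{\sum_{i=1}^m E(g_c(T_i \mid H)) 1_{\{H_i=1\}}}{\sum_{i=1}^m 1_{\{H_i=1\}}},$$

$$\Delta_2 := (1 - \pi_1) E(g_c(Z)) + \pi_1 \frac{\sum_{i=1}^m E\bigl(g_c(Z + \sqrt{n}\mu_i/\sigma_i)\bigr) 1_{\{H_i=1\}}}{\sum_{i=1}^m 1_{\{H_i=1\}}}.$$

Letting $n \to \infty$, we have $\sup_{c>0}|\Delta_1 - \Delta_2| \to 0$ a.s. Also,

$$\Delta_2 = (1 - \pi_1) E(g_c(Z)) + \pi_1 \frac{\sum_{i=1}^m E\bigl(g_c(Z + \sqrt{n}\mu_i/\sigma_i)\bigl(I_{\{|Z + \sqrt{n}\mu_i/\sigma_i| \ge c\}} + I_{\{|Z + \sqrt{n}\mu_i/\sigma_i| < c\}}\bigr) \bigm| H_i\bigr) 1_{\{H_i=1\}}}{\sum_{i=1}^m 1_{\{H_i=1\}}}$$

$$\ge (1 - \pi_1) E(g_c(Z)) + \pi_1 \frac{\sum_{i=1}^m P\bigl(|Z + \sqrt{n}\mu_i/\sigma_i| \ge c \mid H_i\bigr) 1_{\{H_i=1\}}}{\sum_{i=1}^m 1_{\{H_i=1\}}}$$

$$\to (1 - \pi_1) E(g_c(Z)) + \pi_1 = E(g_c(Z)) + \pi_1\bigl(1 - E(g_c(Z))\bigr).$$

Note that

$$\sup_c |\hat g_c - \Delta_1| \to 0 \quad\text{a.s. as } m, n \to \infty.$$

Therefore,

$$\hat g_c \ge E(g_c(Z)) + \pi_1\bigl(1 - E(g_c(Z))\bigr) \quad\text{a.s. as } m, n \to \infty.$$

Thus, we obtain

$$\pi_1 \le \lim_{m,n}\ \sup_{c>0} \frac{\hat g_c - E(g_c(Z))}{1 - E(g_c(Z))} \quad\text{a.s.} \tag{2.27}$$

As a consequence of this theorem, we propose the following estimate of $\pi_1$:

$$\hat\pi_1 := \sup_{c>0} \frac{\hat g_c - E(g_c(Z))}{1 - E(g_c(Z))}, \tag{2.28}$$

where $E(g_c(Z)) = \frac{2}{c\sqrt{2\pi}}\bigl(1 - e^{-c^2/2}\bigr) + 2\bar\Phi(c)$.

Remark 2.3. If we use $\hat\pi_1$, as given in (2.28), then Theorem 2.2 yields a fully automated procedure to carry out multiple hypothesis testing in very high dimensions in practical data settings.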
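The estimator in (2.28) can be sketched numerically as follows. This is a hedged illustration rather than the authors' implementation: the supremum over $c > 0$ is replaced by a maximum over a user-supplied finite grid `c_grid` (an assumption of this sketch), and the function names `expected_g` and `estimate_pi1` are hypothetical.

```python
import math
import numpy as np

def expected_g(c):
    """Closed form of E g_c(Z) for standard normal Z, as stated with (2.28)."""
    tail = 0.5 * math.erfc(c / math.sqrt(2.0))   # upper tail of N(0, 1) at c
    density_part = 2.0 / (c * math.sqrt(2.0 * math.pi)) * (1.0 - math.exp(-c * c / 2.0))
    return density_part + 2.0 * tail

def estimate_pi1(t_stats, c_grid):
    """Grid approximation of (2.28): max over c of (g_hat_c - E g_c(Z)) / (1 - E g_c(Z)).

    Assumption of this sketch: the sup over c > 0 is taken over c_grid only.
    """
    abs_t = np.abs(np.asarray(t_stats, dtype=float))
    best = -math.inf
    for c in c_grid:
        g_hat = np.minimum(abs_t, c).mean() / c   # empirical mean of g_c(T_i), as in (2.24)
        e_g = expected_g(c)
        best = max(best, (g_hat - e_g) / (1.0 - e_g))
    return best
```

Small values of $c$ make the denominator $1 - E(g_c(Z))$ small and amplify sampling noise, so in practice the grid would probably be kept away from zero; the text does not prescribe a particular grid.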

2.4. Consistency and rate of convergence under independence

In order to prove the main results in the general, possibly dependent, t-test setting, we need results under the assumption of independence between t-tests. Specifically, we assume in this section that $(T_i, H_i)$, $i = 1,\ldots,m$, are independent, identically distributed random variables with $\pi_1 = P(H_i = 1)$. This independence assumption can also yield stronger results than the more general setting and is of independent interest. The next theorem, proved in the Appendix, provides a strongly consistent estimate of the critical value $t_{n,m}$, as well as its rate of convergence.

Theorem 2.4. Let

$$\nu_m(t) = \alpha \hat p_m(t) - 2(1 - \pi_1)\bar\Phi(t) \tag{2.29}$$

and

$$\tau_m^2(t) = \alpha^2 \hat p_m(t)\bigl(1 - \hat p_m(t)\bigr) + 4\alpha(1 - \pi_1)\hat p_m(t)\bar\Phi(t) + 2(1 - \pi_1)\bar\Phi(t)\bigl(1 - 2\alpha - 2(1 - \pi_1)\bar\Phi(t)\bigr).$$

Assume the conditions of Theorem 2.1 with (2.2) replaced by the assumption that $(T_i, H_i)$, $i = 1,\ldots,m$, are i.i.d. and $\pi_1 = P(H_i = 1)$. Let $J = \{i : H_i = 1\}$ be the set that contains the indices of alternative hypotheses. Also, assume that $\mu_i, \sigma_i$ are i.i.d. for $i \in J$.

(i) If $\hat t^{\mathrm{fdtp}}_{n,m}$ is chosen such that

$$\hat t^{\mathrm{fdtp}}_{n,m} = \inf\Bigl\{t : \frac{\sqrt{m}\,\nu_m(t)}{\tau_m(t)} \ge z_\gamma\Bigr\}, \tag{2.30}$$

then

$$|\hat t^{\mathrm{fdtp}}_{n,m} - t^{\mathrm{fdtp}}_{n,m}| = O\bigl(n^{-1/2} + m^{-1/2}(\log\log m)^{1/2}\bigr) \quad\text{a.s.} \tag{2.31}$$

and

$$|\hat t^{\mathrm{fdtp}}_{n,m} - t^{\mathrm{fdtp}}_{n,m}| = O(n^{-1/2} + m^{-1/2}) \quad\text{in probability}. \tag{2.32}$$

Here, $t^{\mathrm{fdtp}}_{n,m}$ is the critical value defined in (A.26).

(ii) If $\hat t^{\mathrm{fdr}}_{n,m}$ is chosen such that

$$\hat t^{\mathrm{fdr}}_{n,m} = \inf\Bigl\{t : \frac{2(1 - \pi_1)\bar\Phi(t)}{\hat p_m(t)} \le \gamma\Bigr\}, \tag{2.33}$$

then

$$|\hat t^{\mathrm{fdr}}_{n,m} - t^{\mathrm{fdr}}_{n,m}| = O\bigl(n^{-1/2} + m^{-1/2}(\log\log m)^{1/2}\bigr) \quad\text{a.s.} \tag{2.34}$$

and

$$|\hat t^{\mathrm{fdr}}_{n,m} - t^{\mathrm{fdr}}_{n,m}| = O(n^{-1/2} + m^{-1/2}) \quad\text{in probability}. \tag{2.35}$$

Here, $t^{\mathrm{fdr}}_{n,m}$ is the critical value defined in (A.28).

(iii) If $\hat t^{k\text{-FWER}}_{n,m}$ is chosen such that

$$\hat t^{k\text{-FWER}}_{n,m} = \inf\{t : P(\zeta(t) \ge k) \le \gamma\}, \tag{2.36}$$

where $\zeta(t) \sim \mathrm{Poisson}(\theta(t))$ and $\theta(t) = 2m(1 - \hat\pi_1)\bar\Phi(t)$, then

$$|\hat t^{k\text{-FWER}}_{n,m} - t^{k\text{-FWER}}_{n,m}| = O\bigl((\log m)^{-1/2}\bigr) \quad\text{a.s.} \tag{2.37}$$

Here, $t^{k\text{-FWER}}_{n,m}$ is the critical value defined in (A.30).

Remark 2.4. If $\alpha = \gamma$ in Theorem 2.4, then it is not difficult to see that $|\hat t^{\mathrm{fdtp}}_{n,m} - \hat t^{\mathrm{fdr}}_{n,m}| = O(m^{-1/2})$ a.s. Therefore, (2.31) and (2.32) remain valid with $\hat t^{\mathrm{fdtp}}_{n,m}$ replaced by $\hat t^{\mathrm{fdr}}_{n,m}$. This shows that controlling FDTP is asymptotically equivalent to controlling FDR. This is also true in the more general dependence case. Thus, we will focus primarily on FDR in our numerical studies.

Remark 2.5. Note that $\pi_1$ is assumed to be known in order to get a precise rate of convergence for FDTP and FDR. If $\hat\pi_1$ is estimated with rate of convergence $r_n$, then the correct convergence rate for the in probability result for FDR and FDTP would involve an additional term $O(r_n)$ added in (2.32) and (2.35). It is unclear what the correction would be for the almost sure rate in (2.31) and (2.34). These corrections are beyond the scope of this paper and will not be pursued further here. Note that the rate of $\hat\pi_1$ is not needed in the main results presented in Sections 2.2 and 3.2.

3. Two-sample t-test

In this section, the results of the previous section are extended to the two-sample t-test setting. The estimator of the unknown parameter $\pi_1$ remains the same as in the one-sample case, but with $T_i$ in (2.24) being the two-sample, rather than one-sample, t-statistic. Theoretical results for the rates of convergence under independence are also presented, as in the previous section.

3.1. Basic set-up and results

When two groups, such as a control and an experimental group, are independent, which we assume here, a natural statistic to use is the two-sample t-statistic. As far as possible, we adopt the same notation as used in the one-sample case, and we assume that (2.2) holds.
We observe the random variables

$$X_{ij} = \mu_i + \epsilon_{ij}, \qquad 1 \le j \le n_1,\ 1 \le i \le m,$$
$$Y_{ij} = \nu_i + \omega_{ij}, \qquad 1 \le j \le n_2,\ 1 \le i \le m,$$

with the index $i$ denoting the $i$th gene, $j$ indicating the $j$th array, $\mu_i$ representing the mean effect for the $i$th gene from the first group and $\nu_i$ representing the mean effect for the $i$th gene from the second group. The sampling processes for the two groups are assumed to be independent of each other. The sample sizes $n_1$ and $n_2$ are assumed to be of the same order, that is, $0 < b_1 \le n_1/n_2 \le b_2 < \infty$. We will also assume that for each $i$, $\epsilon_{i1}, \epsilon_{i2}, \ldots, \epsilon_{in_1}$ are independent random variables with mean zero and variance $\sigma_i^2$; $\omega_{i1}, \omega_{i2}, \ldots, \omega_{in_2}$ are independent random variables with mean zero and variance $\tau_i^2$. The null hypothesis is $\mu_i = \nu_i$, the alternative hypothesis is $\mu_i \ne \nu_i$ and the dependence is assumed to be generated in the same manner as the dependence in the one-sample setting. Consider the two-sample t-statistic

$$T_i^* = \frac{\bar X_i - \bar Y_i}{\sqrt{S_{1i}^2/n_1 + S_{2i}^2/n_2}},$$

where

$$\bar X_i = \frac{1}{n_1}\sum_{j=1}^{n_1} X_{ij}, \qquad \bar Y_i = \frac{1}{n_2}\sum_{j=1}^{n_2} Y_{ij},$$
$$S_{1i}^2 = \frac{1}{n_1 - 1}\sum_{j=1}^{n_1}(X_{ij} - \bar X_i)^2, \qquad S_{2i}^2 = \frac{1}{n_2 - 1}\sum_{j=1}^{n_2}(Y_{ij} - \bar Y_i)^2.$$

Then

$$R = \sum_{i=1}^m 1_{\{|T_i^*| \ge t\}}, \qquad V = \sum_{i=1}^m (1 - H_i)\,1_{\{|T_i^*| \ge t\}}. \tag{3.1}$$

The two-sample t-statistic is one of the most commonly used statistics to construct confidence intervals and carry out hypothesis testing for the difference between two means. There are several premises underlying the use of two-sample t-tests. It is assumed that the data have been derived from populations with normal distributions. Based on the fact that $S_{1i} \to \sigma_i$, $S_{2i} \to \tau_i$ a.s., with moderate violation of the assumption, statisticians quite often recommend using the two-sample t-test, provided the samples are not too small and the samples are of equal or nearly equal size. When the populations are not normally distributed, it is a consequence of the central limit theorem that two-sample t-tests remain valid. A more refined confirmation of this validity under non-normality based on moderate deviations is shown in [4].
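The two-sample statistic above uses separate variance estimates for the two groups (the unpooled, Welch-type form). A minimal Python sketch, with the hypothetical function name `two_sample_t` and the assumption that each group is stored as an array with one row per gene:

```python
import numpy as np

def two_sample_t(x, y):
    """Row-wise unpooled two-sample t-statistic for arrays of shape (m, n1) and (m, n2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.shape[1], y.shape[1]
    diff = x.mean(axis=1) - y.mean(axis=1)
    # Sample variances with divisors n1 - 1 and n2 - 1, as in the definitions above.
    se = np.sqrt(x.var(axis=1, ddof=1) / n1 + y.var(axis=1, ddof=1) / n2)
    return diff / se
```

The counts $R$ and $V$ in (3.1) are then obtained exactly as in the one-sample case, by thresholding the absolute values of these statistics at $t$.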
Furthermore, under the alternative hypothesis, the asymptotic results still hold, but with a shift in location similar to the one-sample case under certain conditions, that is,

$$P(|T_i^*| \ge t \mid H_i = 0) = P(|Z| \ge t)\bigl(1 + o(1)\bigr),$$
$$P(|T_i^*| \ge t \mid H_i = 1) = P\Bigl(\Bigl|Z + \frac{\mu_i - \nu_i}{B_{n_1,n_2}}\Bigr| \ge t\Bigr)\bigl(1 + o(1)\bigr),$$

uniformly in $t = o(n^{1/6})$, where $B_{n_1,n_2}^2 = \sigma_i^2/n_1 + \tau_i^2/n_2$. Under the assumption of (2.2), asymptotic critical values to control FDTP, FDR and k-FWER are very similar to the one-sample t-test case with the one-sample t-statistic $T_i$ replaced by the two-sample t-statistic $T_i^*$. The following theorem, proved in the Appendix, is analogous to Theorem 2.1 and is a necessary first step.

Theorem 3.1. Assume that $E(\epsilon_{ij} \mid \mu_i, \sigma_i^2) = 0$, $E(\omega_{ij} \mid \nu_i, \tau_i^2) = 0$, $\mathrm{Var}(\epsilon_{ij} \mid \mu_i, \sigma_i^2) = \sigma_i^2$, $\mathrm{Var}(\omega_{ij} \mid \nu_i, \tau_i^2) = \tau_i^2$, $\limsup E\epsilon_{ij}^4 < \infty$, $\limsup E\omega_{ij}^4 < \infty$, $0 < \pi_1 < 1 - \alpha$ and that (2.2) is satisfied. Assume that there exist $\epsilon_0$ and $c_0$ such that

$$P\Bigl(\Bigl|\frac{\mu_i - \nu_i}{B_{n_1,n_2}}\Bigr| \ge \epsilon_0 \Bigm| H_i = 1\Bigr) \ge c_0 \qquad \text{for all } n_1, n_2. \tag{3.2}$$

The conclusions of Theorem 2.1 then hold with the one-sample t-statistic $T_i$ replaced by the two-sample t-statistic $T_i^*$.

3.2. Main results

The unknown parameter $m_1$ and functions $F_0(t)$ and $F_1(t)$ in Theorem 3.1 are estimated similarly as in the one-sample case with the one-sample t-statistic replaced by its two-sample counterpart. The following theorem, the proof of which is given in the Appendix, gives our main results for two-sample t-tests.

Theorem 3.2. Assume that the conditions in Theorem 3.1 are satisfied. Replace the one-sample t-statistic $T_i$ by the two-sample t-statistic $T_i^*$ in Theorem 2.2. Let $\hat\pi_1$ be a strongly consistent estimate of $\pi_1$, as in (2.28), using the two-sample t-statistic $T_i^*$.

(i) If $\hat t^{\mathrm{fdtp}}_{n,m}$ is chosen such that

$$\hat t^{\mathrm{fdtp}}_{n,m} = \inf\Bigl\{t : \frac{\sqrt{m}\,\nu_m(t)}{\tau_m(t)} \ge z_\gamma\Bigr\}, \tag{3.3}$$

then

$$|\hat t^{\mathrm{fdtp}}_{n,m} - t^{\mathrm{fdtp}}_{n,m}| = o(1) \quad\text{a.s.} \tag{3.4}$$

(ii) If $\hat t^{\mathrm{fdr}}_{n,m}$ is chosen such that

$$\hat t^{\mathrm{fdr}}_{n,m} = \inf\Bigl\{t : \frac{2(1 - \hat\pi_1)\bar\Phi(t)}{\hat p_m(t)} \le \gamma\Bigr\}, \tag{3.5}$$

then

$$|\hat t^{\mathrm{fdr}}_{n,m} - t^{\mathrm{fdr}}_{n,m}| = o(1) \quad\text{a.s.} \tag{3.6}$$

(iii) If $\hat t^{k\text{-FWER}}_{n,m}$ is chosen such that

$$\hat t^{k\text{-FWER}}_{n,m} = \inf\{t : P(\zeta(t) \ge k) \le \gamma\}, \tag{3.7}$$

where $\zeta(t) \sim \mathrm{Poisson}(\theta(t))$ and $\theta(t) = 2m(1 - \hat\pi_1)\bar\Phi(t)$, then, provided $\log m = o(n^{1/3})$, we have

$$|\hat t^{k\text{-FWER}}_{n,m} - t^{k\text{-FWER}}_{n,m}| = o(1) \quad\text{a.s.} \tag{3.8}$$

Remark 3.1. $\hat\pi_1$ can be estimated via (2.28) by using two-sample t-statistics. Theorem 2.3 is applicable in the two-sample setting, as well as in the one-sample case, and consistency follows. Thus, Theorem 3.2 gives a fully automated procedure to conduct multiple hypothesis testing using two-sample t-statistics after we plug in the $\hat\pi_1$ given in (2.28).

3.3. Consistency and rate of convergence under independence

Results for the independence setting are needed for the proofs of the main results, as was the case for one-sample t-tests. We can, once again, obtain more precise estimation compared with the general dependence case. The following theorem, proved in the Appendix, gives us conditions and conclusions using two-sample t-statistics for controlling FDTP and FDR asymptotically, as well as rates of convergence under the assumption that $(T_i^*, H_i)$ are independent of each other for $1 \le i \le m$. Assume that $\pi_1$ is the proportion of the alternative hypotheses among $m$ hypothesis tests, that is, $\pi_1 = P(H_i = 1)$. Let $J = \{i : H_i = 1\}$.

Theorem 3.3. Assume the conditions of Theorem 3.1 are satisfied. Rather than (2.2), we assume that $(T_i^*, H_i)$ are independent and identically distributed. In addition, $\pi_1 = P(H_1 = 1)$ and $\mu_i, \sigma_i$ are i.i.d. for $i \in J$. Let

$$p(t) = P(|T_1^*| \ge t), \tag{3.9}$$

$$a_1(t) = \alpha p(t) - (1 - \pi_1) P(|T_1^*| \ge t \mid H_1 = 0), \tag{3.10}$$

$$b_1^2(t) = \alpha^2 p(t)\bigl(1 - p(t)\bigr) + 2\alpha(1 - \pi_1) p(t) P(|T_1^*| \ge t \mid H_1 = 0) + (1 - \pi_1) P(|T_1^*| \ge t \mid H_1 = 0)\bigl(1 - 2\alpha - (1 - \pi_1) P(|T_1^*| \ge t \mid H_1 = 0)\bigr),$$

$$\hat p_m(t) = \frac{1}{m}\sum_{i=1}^m I_{\{|T_i^*| \ge t\}}, \tag{3.11}$$

$$\nu_m(t) = \alpha \hat p_m(t) - 2(1 - \pi_1)\bar\Phi(t), \tag{3.12}$$

and

$$\tau_m^2(t) = \alpha^2 \hat p_m(t)\bigl(1 - \hat p_m(t)\bigr) + 4\alpha(1 - \pi_1)\hat p_m(t)\bar\Phi(t) + 2(1 - \pi_1)\bar\Phi(t)\bigl(1 - 2\alpha - 2(1 - \pi_1)\bar\Phi(t)\bigr).$$

The conclusions of Theorem 2.4 then hold with the one-sample t-statistics $T_i$ replaced by the two-sample t-statistics $T_i^*$.

Remark 3.2. In the above sections, we developed our theorems based on two-sided tests. The results for the case of one-sided tests are very similar, but with the rejection region $\{T_i \ge t\}$ for each test. We omit the details.

4. Numerical studies

In this section, we present numerical studies based on simulated data and compare the power of our approach with [1] (BH) and [23] (ST) approaches using one-sample t-statistics. The results for using two-sample t-statistics are very similar and so we omit the details here.

4.1. Simulation study 1

We investigate the results for the i.i.d. case first. Recall the model

$$X_{ij} = \mu_i + \epsilon_{ij}, \qquad 1 \le i \le m,\ 1 \le j \le n.$$

We set the signal using $\mu_i \sim \mathrm{Unif}(0.5, 1)$ or $\mu_i \sim \mathrm{Unif}(-1, -0.5)$, which is of the correct order for the standardized error term. Here, the number of hypothesis tests is $m$ = , which is the same for all following simulation studies, unless otherwise noted. The proportion of alternatives $\pi_1 = 0.2$ and the error term $t(4)$ are used just to illustrate the asymptotic results. We vary the number of arrays $n$ from 20 to 50 to 300 to evaluate our asymptotic approximation. Empirical distributions of FDTP, FDR and k-FWER based on repetitions are treated as the gold standard since they have almost negligible Monte Carlo error. The samples are generated to evaluate our proposed method based on asymptotic theory. Specifically, for each sample, we calculate the sample paths of the following quantities indexed by $t$: $\sqrt{m}\,\nu_m(t)/\tau_m(t)$ for studying FDTP, $2(1 - \hat\pi_1)\bar\Phi(t)/\hat p_m(t)$ for studying FDR and $P\bigl(\mathrm{Poisson}(2m(1 - \hat\pi_1)\bar\Phi(t)) \ge 10\bigr)$ for studying 10-FWER (here, we choose $k = 10$ just for the purposes of illustration). $\hat\pi_1$ is defined as in (2.28). Figure 2 shows the overlay of the true path and 100 random estimated paths for FDTP, FDR and k-FWER, respectively. As $n$ increases, we see that the true path and estimated paths are fairly close to each other, which, in turn, validates our asymptotic theory.
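The FDR and k-FWER sample paths described above, and the corresponding data-driven cut-offs from (2.20) and (2.22), can be sketched as below. Everything here is an assumption-laden illustration: the infimum over $t$ is approximated by scanning an increasing grid, the ratio is set to infinity when no statistic exceeds $t$ (a convention of this sketch, not of the paper), and all function names are hypothetical.

```python
import math
import numpy as np

def normal_tail(t):
    """Upper tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def fdr_path(t_stats, pi1_hat, t):
    """Estimated FDR at cut-off t: 2 (1 - pi1_hat) * tail(t) / p_hat(t)."""
    p_hat = np.mean(np.abs(np.asarray(t_stats, dtype=float)) >= t)
    if p_hat == 0.0:
        return math.inf          # convention of this sketch when nothing is rejected
    return 2.0 * (1.0 - pi1_hat) * normal_tail(t) / p_hat

def poisson_tail(theta, k):
    """P(Poisson(theta) >= k) for an integer k >= 1."""
    term = math.exp(-theta)      # P(Poisson = 0); underflows harmlessly to 0 for large theta
    cdf = term
    for j in range(1, k):
        term *= theta / j
        cdf += term
    return max(0.0, 1.0 - cdf)

def kfwer_path(m, pi1_hat, t, k):
    """Estimated k-FWER at cut-off t using the Poisson approximation."""
    return poisson_tail(2.0 * m * (1.0 - pi1_hat) * normal_tail(t), k)

def critical_value(path, t_grid, gamma):
    """Smallest grid point t with path(t) <= gamma, or None if there is none."""
    for t in t_grid:
        if path(t) <= gamma:
            return t
    return None
```

A usage example under these assumptions: `critical_value(lambda t: fdr_path(T, pi1_hat, t), np.linspace(0.0, 6.0, 601), 0.05)` would return a grid approximation of the FDR cut-off for statistics `T` and a plug-in estimate `pi1_hat`.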
We can see that the slopes of FDTP and 10-FWER are very steep, which means a small change in the critical value results in a large change in the level of control, while the FDR has a flatter trend.

Figure 2. Overlay of true and 100 random estimated sample paths with respect to cut-off $t$ for the three procedures under differing sample sizes.

4.2. Simulation study 2

Under the same set-up as in the previous section, we simulate data with different error terms: standard normal ($N(0, 1)$), Student t with one degree of freedom (Cauchy), Student t with four degrees of freedom ($t(4)$), Student t with ten degrees of freedom ($t(10)$), Laplace and exponential. Note that, except for the Cauchy error term, all of the error terms satisfy the condition of finite fourth moment. Empirical distributions of FDTP, FDR and k-FWER based on repetitions are treated as the gold standard for obtaining true critical values. Each scenario is repeated 1000 times to evaluate our proposed method for estimating the critical value based on asymptotic theory. We control FDR at different levels (from 0.01 to 0.2) to get true and estimated critical values. Asymptotically, the estimated critical value $\hat t$ based on our theory should be very close to the true critical value $t$ and lie on a diagonal line of the square.

From Figure 3, the estimated critical values $\hat t$ do not match the true critical value $t$ under the Cauchy error since the Cauchy distribution does not have finite fourth moment. For the Cauchy distribution, even the central limit theorem does not hold since it does not have finite mean. As the number of arrays $n$ increases, the estimated critical values $\hat t$ match the true critical values $t$ better under symmetric error terms ($N(0, 1)$, $t(4)$, $t(10)$ and Laplace), but not quite so well under asymmetric errors (e.g., exponential errors). The difficulty with the exponential error terms suggests the value of conducting research to derive higher order approximations. We plan to undertake this in the near future.

Figure 3. Comparison of true and estimated critical values using FDR for different error terms and numbers of arrays n.

Simulation study 3

The above results are from the independent test setting. We carried out similar simulation studies for the dependent setting and found that the corresponding plots are quite similar to the above results and the same conclusions can be drawn. To see whether our proposed method obtains the claimed level of control, we use a hidden Markov chain to generate dependent indicators $H_i$, $i = 1, \dots, m$. Conditional on $H_i$, $i = 1, \dots, m$, the data are generated independently. The transition probability matrix of the hidden Markov chain is set to
\[
\begin{pmatrix} 1 - p_1 & p_1 \\ p_0 & 1 - p_0 \end{pmatrix},
\]
where $p_1$ is the transition probability from 0 to 1 and $p_0$ is the transition probability from 1 to 0. In the simulation, $p_0 = 0.8$ and $p_1 = 0.2$. Based on the limiting stationary distribution, the alternative proportion should be $\pi_1 = p_1/(p_0 + p_1) = 0.2$. Under the null hypothesis, we simulate data from four error terms (N(0, 1), t(4), Laplace and exponential) and, under the alternative hypothesis, we simulate data with mean effects half from Unif(0.1, 0.5) and half from Unif(−0.5, −0.1), plus the same four error terms. Figure 4 uses FDR as the control criterion. For different control levels γ, we compare the claimed level of control and the actually obtained level of control based on our method for different numbers of arrays: small (n = 20), medium (n = 50) and large (n = 300). From Figure 4, we can see that when the number of arrays n is small (n = 20), we do not, in general, achieve the claimed level of control. If we have a medium sample size (n = 50), the obtained level of control is very close to the nominal level of control and the results are almost perfect if we have a large number of arrays (n = 300), even for the asymmetric exponential error term. This strongly supports our theoretical predictions but suggests that higher order approximations would be useful in some settings.

Figure 4. Comparison of nominal and obtained control level for different error terms and numbers of arrays n.

To see the performance of our method using 10-FWER, Table 2 summarizes the control level actually obtained for different error terms and numbers of arrays n when the nominal control level is 0.05. The obtained control level is incorrect when the number of arrays n is small, which can be deduced from the sample paths of 10-FWER given in Figure 1. It has a very steep slope, so when n is small, the approximation is crude and there is a noticeable difference between the estimated critical value and the true critical value, yielding a big difference in the control level. For large sample sizes, the obtained control level is reasonably good because our asymptotic theory begins to take effect. The exponential error setting appears not to perform as well as the other error settings.

Table 2. Obtained control level using 10-FWER with nominal control level 0.05. Columns: n, N(0, 1), t(4), Laplace, Exponential.

Simulation study 4

All previous numerical studies involve the alternative proportion estimate $\hat\pi_1$ defined in (2.28). In this section, we investigate numerically how this estimate is affected by the number of arrays n and compare it with the alternative estimate proposed by [23]. The first simulation set-up is similar to the one in the previous section. We drew N = 1000 sets of data as follows. Dependent indicators $H_i$, $i = 1, \dots, m$, are generated from a hidden Markov chain with the limiting alternative proportion $\pi_1 = 0.2$. Conditional on these, a vector of expected values, $\mu = (\mu_1, \dots, \mu_m)$, was constructed. The expected values for the true null hypotheses were set to 0 with standard normal noise, whereas the expected values for the alternative hypotheses were drawn from Unif(0.1, 0.5) plus standard normal noise. Correspondingly, 1000 replications of the proportion estimate $\hat\pi_1$ were calculated using (2.28). The root mean square error (RMSE) is given as
\[
\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} \bigl(\hat\pi_1^{(n)} - \pi_1^{(n)}\bigr)^2},
\]
where $\hat\pi_1^{(n)}$ is the estimate of $\pi_1$ for the nth simulated data set and $\pi_1^{(n)}$ is the truth. Table 3 summarizes the effect of n. As the number of arrays n increases, the RMSE gets smaller, which validates our asymptotic prediction.

In the second simulation, we compare our proportion estimate with the one using spline smoothing proposed by [23]. Recall the proportion estimate
\[
\hat\pi_0(\lambda) = \frac{\#\{p_i > \lambda;\ i = 1, \dots, m\}}{m(1 - \lambda)}.
\]
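The p-value-based estimate just recalled is simple to evaluate over a grid of λ. The following minimal Python sketch uses hypothetical function names and a made-up list of p-values, and it deliberately leaves out the spline-smoothing step.

```python
def pi0_hat(p_values, lam):
    """Estimate #{p_i > lam} / (m * (1 - lam)) of the null proportion."""
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    m = len(p_values)
    return sum(p > lam for p in p_values) / (m * (1.0 - lam))


def pi0_path(p_values, grid):
    """Evaluate pi0_hat on each lambda in grid; returns (lambda, estimate) pairs."""
    return [(lam, pi0_hat(p_values, lam)) for lam in grid]


# Made-up p-values, for illustration only.
p_values = [0.01, 0.02, 0.03, 0.04, 0.3, 0.6, 0.7, 0.8, 0.9, 0.2]
print(pi0_hat(p_values, 0.5))  # 4 of the 10 p-values exceed 0.5, so 4 / (10 * 0.5)
print(pi0_path(p_values, [0.0, 0.25, 0.5, 0.75]))
```

The estimate is not truncated at 1 here, so for some inputs it can exceed 1; whether and how to truncate is a choice left open in this sketch.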
The smoothing approach proceeds as follows: first, $\hat\pi_0(\lambda)$ is calculated over a (fine) grid of λ; then, a natural cubic spline y with three degrees of freedom is fitted to $(\lambda, \hat\pi_0(\lambda))$; finally, $\pi_0$ is estimated by $\hat\pi_0 = y(1)$. The simulation set-up is similar to the previous one, except that we have two groups here with $n_1 = 70$ and $n_2 = 80$. We change the alternative proportion to compare the performance of our approach ($\hat\pi_1^{ck}$) with the spline smoothing approach ($\hat\pi_1^{st}$) in Table 4. They produce very similar results; both are conservative, with less bias using our approach and less variance using the spline smoothing approach. The advantage of our approach is that it is computationally very fast, while the spline smoothing approach requires that p-values are first obtained using permutation, which is computationally much more intensive than our approach (which can be computed directly from the t-statistics).

Table 3. RMSE for N = 1000 estimated values of $\pi_1$. Columns: n, RMSE.

Table 4. Proportion estimate comparison. Columns: $\pi_1$, $\hat\pi_1^{ck}$, $\hat\pi_1^{st}$, $\mathrm{sd}(\hat\pi_1^{ck})$, $\mathrm{sd}(\hat\pi_1^{st})$.

Comparison with BH and ST procedures

In this section, we compare our approach with the BH and ST procedures under the dependence structure described in [29]. We also use a hidden Markov model to simulate the indicator function $H_i$, $i = 1, \dots, m$. Conditional on $H_i$, $i = 1, \dots, m$, the data are generated independently. The number of hypotheses tested is m = 5000 and the number of arrays is n = 80. The data generating mechanism is otherwise the same as in the independence case. First, we construct a one-sample t-statistic and apply our procedure to obtain the critical value for the rejection region. We then obtain p-values and q-values, and apply the BH and ST procedures to decide which genes are significantly expressed.

We now briefly describe the BH procedure. Let $p_i$ be the marginal p-value of the ith test, $1 \le i \le m$, and let $p_{(1)} \le \cdots \le p_{(m)}$ be the order statistics of $p_1, \dots, p_m$. Given a control level $\gamma \in (0, 1)$, let
\[
r = \max\bigl\{ i \in \{0, 1, \dots, m + 1\} : p_{(i)} \le \gamma i/m \bigr\},
\]
where $p_{(0)} = 0$ and $p_{(m+1)} = 1$. The BH procedure rejects all hypotheses for which $p_{(i)} \le p_{(r)}$. If r = 0, then all hypotheses are accepted. The q-value in [23] is similar to the well-known p-value, except that it is a measure of significance in terms of FDR, rather than type I error, and an estimate of the alternative proportion is plugged in, based on available p-values, as described in the previous section. We revisit the motivating example and give a plot of the claimed FDR and the actually obtained FDR using the proposed critical value method.
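The BH step-up rule described in this section can be sketched as follows. This is a minimal illustration with a hypothetical function name and made-up p-values; for simplicity it searches only the ranks 1, ..., m and returns no rejections when no rank qualifies (the case r = 0).

```python
def benjamini_hochberg(p_values, gamma):
    """Return the sorted indices of the hypotheses rejected by the BH rule."""
    m = len(p_values)
    # Indices of the hypotheses, ordered by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    r = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up: keep the largest rank whose ordered p-value is under its threshold.
        if p_values[idx] <= gamma * rank / m:
            r = rank
    return sorted(order[:r])


# Made-up p-values, for illustration only.
print(benjamini_hochberg([0.01, 0.039, 0.029, 0.005, 0.5], gamma=0.05))
print(benjamini_hochberg([0.03, 0.032, 0.9], gamma=0.05))
```

In the second call the smallest p-value misses its own threshold, yet both small p-values are rejected because the second ordered p-value meets its threshold; this is the step-up behaviour that the maximum in the definition of r encodes.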
From Figure 5, we can see that our procedure controls the FDR at the claimed level asymptotically, although somewhat liberally for finite samples, and has better power at the same target FDR level compared with the BH and ST procedures.

Figure 5. FDR control and power comparison.

5. Applications to microarray analysis

We now apply the proposed procedure to the analysis of a leukemia cancer data set [14] in order to identify differentially expressed genes between AML and ALL. For the original data, see

In this analysis, we use the methodology developed for the dependence case. The raw data consist of m = 7129 genes and 72 samples coming from two classes: 47 in class ALL (acute lymphoblastic leukemia) and 25 in class AML (acute myeloid leukemia). Our simulation results showed reasonable performance of the procedure for a moderate sample size in this range. For each gene location, the two-sample t-statistic comparing the 47 ALL responses with the 25 AML responses was computed. Using our proposed approach for the dependent case, we find the critical value for controlling FDR at level γ,
\[
\hat t_{n,m}^{\mathrm{fdr}} = \inf\Bigl\{ t : \frac{2(1 - \hat\pi_1)\bar\Phi(t)}{\hat p_m(t)} \le \gamma \Bigr\},
\]
where $\hat p_m(t) = \sum_{i=1}^{m} 1\{|T_i| \ge t\}/m$, $\bar\Phi = 1 - \Phi$ and $\hat\pi_1$ is estimated by (2.28). In Figure 6, we plot the FDR level and the number of significantly expressed genes for our (CK) procedure, the BH procedure and the q-value based Storey-Tibshirani (ST) procedure. From the plot, we can see that our procedure detects the largest number of significant genes, followed by the ST procedure and then the BH procedure, which is the most conservative one. At FDR level 0.01, we detected 870 genes, the ST procedure detected 778 genes and the BH procedure detected 614 genes. Using the two-sample t-test, similarly to the higher power of our approach in simulation studies, we detected all of the genes that the other two approaches detected. The BH procedure is very conservative at the expense of power loss. The ST procedure requires permutation to obtain p-values, while our procedure gets the critical value directly and is thus faster in terms of computation. The estimation of $\pi_1$ is by our procedure and by the ST procedure. These results can serve as a first exploratory step for more refined analyses concerning these significant genes. Another issue may be that the critical value approach based on asymptotic FDR control may not be conservative enough in some settings.

Figure 6. Comparison between our (CK) procedure, the ST procedure and the BH procedure using real data.

6. Concluding remarks and discussion

We have presented a new approach for the significance analysis of thousands of features in high-dimensional biological studies. The approach is based on estimating the critical values of the rejection regions for high-dimensional multiple hypothesis testing, rather than the conventional p-value approaches in the literature. We developed a detailed method that can be used to identify differentially expressed genes in microarray experiments. The proposed procedure performs well for large samples, reasonably well for intermediate samples and not quite as well for small samples, and appears to perform better than existing alternatives under realistic sample sizes. Our method is also computationally faster than the competing approaches. The potential for improvement in small-sample performance motivates the need for a second-order expansion of our theoretical work. In addition, we have proposed a new consistent estimate of the proportion of alternative hypotheses under certain conditions. Numerical studies demonstrate that our methodology fits the truth well and improves the statistical power in multiple testing.

Extensions of the current work can be pursued in several directions. First, as stated above, the precision of the asymptotic approximations has room for improvement in small-to-moderately-small sample sizes, suggesting that a second-order expansion would be valuable.
Second, in the dependence case, it would be of interest to see how the rate of convergence could be derived under various assumptions on the form of the dependence. Thirdly, the plug-in estimator of $\pi_1$ is consistent, but somewhat ad hoc. Complete theoretical properties of this estimator remain to be explored. Last, but not least, we only considered a fixed proportion $\pi_1$ of alternative hypotheses. It is of great interest also to consider the sparsity setting, in which $\pi_1 \to 0$ as $m \to \infty$, and to see what patterns emerge.

Appendix: Proofs of main results

Our main tools are limit theorems of empirical processes, Berry-Esseen bounds and self-normalized moderate deviations for one- and two-sample t-statistics.

A.1. Preliminary lemmas

We first state a non-uniform Berry-Esseen inequality for nonlinear statistics.

Lemma A.1 ([5]). Let $\xi_1, \xi_2, \dots, \xi_n$ be independent random variables with $E\xi_i = 0$, $\sum_{i=1}^{n} E\xi_i^2 = 1$ and $E|\xi_i|^3 < \infty$. Let $W_n = \sum_{i=1}^{n} \xi_i$ and $\Delta = \Delta(\xi_1, \dots, \xi_n)$ be a measurable function of $\{\xi_i\}$. Then
\[
|P(W_n + \Delta \le z) - \Phi(z)| \le P\bigl(|\Delta| > (|z| + 1)/3\bigr) + C(|z| + 1)^{-3} \Bigl( \|\Delta\|_2 + \sum_{i=1}^{n} (E\xi_i^2)^{1/2} \bigl(E(\Delta - \Delta_i)^2\bigr)^{1/2} + \sum_{i=1}^{n} E|\xi_i|^3 \Bigr). \tag{A.1}
\]

This is [5], Theorem 2.2, and the proof can be found there. The next lemma provides a Berry-Esseen bound for non-central t-statistics.

Lemma A.2. Let $X, X_1, \dots, X_n$ be i.i.d. random variables with $E(X) = 0$, $\sigma^2 = EX^2$ and $EX^4 < \infty$. Let
\[
\bar X = \frac{1}{n} \sum_{i=1}^{n} X_i, \qquad s_n^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar X)^2.
\]
Then
\[
\Bigl| P\Bigl( \frac{\sqrt{n}(\bar X + c)}{s_n} \le x \Bigr) - \Phi(x - \sqrt{n}c/\sigma) \Bigr| \le \frac{K(1 + |x|)}{(1 + |x - \sqrt{n}c/\sigma|)\sqrt{n}} \tag{A.2}
\]
for any c and x, where K is a finite constant that may depend on σ and $EX^4$.

Proof. Without loss of generality, assume that $x \ge 0$ and $\sigma = 1$. Using the fact that
\[
1 - |t| \le (1 + t)^{1/2} \le 1 + |t| \quad \text{for } t \ge -1, \tag{A.3}
\]
we have
\[
x s_n = x(1 + s_n^2 - 1)^{1/2} \le x(1 + |s_n^2 - 1|) \tag{A.4}
\]
and
\[
x s_n \ge x(1 - |s_n^2 - 1|). \tag{A.5}
\]
Therefore,
\[
P\Bigl( \frac{\sqrt{n}(\bar X + c)}{s_n} \le x \Bigr) = P\bigl( \sqrt{n}(\bar X + c) \le x s_n \bigr) \le P\bigl( \sqrt{n}\bar X \le x - \sqrt{n}c + x|s_n^2 - 1| \bigr). \tag{A.6}
\]
We now apply (A.1) with $\xi_i = X_i/\sqrt{n}$, $W_n = \sqrt{n}\bar X$ and
\[
z = x - \sqrt{n}c, \qquad \Delta = -x|s_n^2 - 1|, \qquad \Delta_i = -x|s_{n,i}^2 - 1|,
\]
where $s_{n,i}^2$ is defined as $s_n^2$ with 0 replacing $X_i$.

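Noting that
\[
s_n^2 - 1 = \frac{1}{n - 1} \Bigl( \sum_{j=1}^{n} (X_j^2 - 1) - n\bar X^2 \Bigr), \qquad
s_{n,i}^2 - 1 = \frac{1}{n - 1} \Bigl( \sum_{j \ne i} (X_j^2 - 1) - n(\bar X - X_i/n)^2 \Bigr),
\]
we have
\[
E|s_n^2 - 1|^2 \le K EX^4/n \tag{A.7}
\]
and
\begin{align*}
E\bigl( (s_n^2 - 1) - (s_{n,i}^2 - 1) \bigr)^2
&= (n - 1)^{-2} E\bigl( (X_i^2 - 1) - n\bar X^2 + n(\bar X - X_i/n)^2 \bigr)^2 \\
&= (n - 1)^{-2} E\bigl( (X_i^2 - 1) - X_i \bigl( 2(\bar X - X_i/n) + X_i/n \bigr) \bigr)^2 \\
&\le (n - 1)^{-2} E\bigl( 2(X_i^2 - 1)^2 + 2X_i^2 \bigl( 2(\bar X - X_i/n) + X_i/n \bigr)^2 \bigr) \\
&\le (n - 1)^{-2} \bigl( 4EX^4 + 2E\bigl\{ X_i^2 \bigl( 8(\bar X - X_i/n)^2 + 2X_i^2/n^2 \bigr) \bigr\} \bigr) \\
&\le K EX^4/n^2. \tag{A.8}
\end{align*}
It follows from (A.7) and (A.8) that
\[
\|\Delta\|_2 \le \frac{K x (EX^4)^{1/2}}{\sqrt{n}}, \qquad
P\Bigl( |\Delta| > \frac{|z| + 1}{3} \Bigr) \le \frac{K x (EX^4)^{1/2}}{\sqrt{n}(1 + |z|)},
\]
\[
\sum_{i=1}^{n} (E\xi_i^2)^{1/2} \bigl( E(\Delta - \Delta_i)^2 \bigr)^{1/2} \le \frac{K x (EX^4)^{1/2}}{\sqrt{n}}
\]
and
\[
\sum_{i=1}^{n} E|\xi_i|^3 \le \frac{E|X|^3}{\sqrt{n}}.
\]
Therefore, by (A.1),
\[
\Bigl| P\bigl( \sqrt{n}\bar X \le x - \sqrt{n}c + x|s_n^2 - 1| \bigr) - \Phi(x - \sqrt{n}c) \Bigr| \le \frac{K(1 + x)}{(1 + |x - \sqrt{n}c|)\sqrt{n}}. \tag{A.9}
\]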

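Similarly,
\[
P\Bigl( \frac{\sqrt{n}(\bar X + c)}{s_n} \le x \Bigr) \ge P\bigl( \sqrt{n}\bar X \le x - \sqrt{n}c - x|s_n^2 - 1| \bigr)
\]
and
\[
\Bigl| P\bigl( \sqrt{n}\bar X \le x - \sqrt{n}c - x|s_n^2 - 1| \bigr) - \Phi(x - \sqrt{n}c) \Bigr| \le \frac{K(1 + x)}{(1 + |x - \sqrt{n}c|)\sqrt{n}}. \tag{A.10}
\]
This proves (A.2).

We also need a moderate deviation result for the non-central t-statistics, as given in the following lemma.

Lemma A.3. Suppose that $X, X_i$, $i = 1, \dots, n$, are independent identically distributed random variables. Let
\[
\bar X = \frac{\sum_{i=1}^{n} X_i}{n}, \qquad s_n^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar X)^2.
\]
If X satisfies $E|X|^4 < \infty$, $E(X^2) = \sigma^2 > 0$ and $E(X) = 0$, then
\[
P\Bigl( \Bigl| \frac{\sqrt{n}(\bar X + c)}{s_n} \Bigr| \ge t \Bigr) = P\bigl( |Z + c\sqrt{n}/\sigma| \ge t \bigr) \bigl( 1 + o(1) \bigr) \tag{A.11}
\]
uniformly in c and $t = o(n^{1/6})$. Here, and in the sequel, Z denotes a standard normal random variable.

Proof. When t is bounded, (A.11) follows from Lemma A.2. Consider large t with $t = o(n^{1/6})$. We need the following result of [27,28]:
\[
P\Bigl( \frac{\sqrt{n}(\bar X + c)}{s_n} \ge t \Bigr) = \bigl( 1 - \Phi(t - c\sqrt{n}/\sigma) \bigr) \bigl( 1 + o(1) \bigr) \tag{A.12}
\]
uniformly in $|c|\sqrt{n}/\sigma \le t/5$ and $t = o(n^{1/6})$. We note that, following the same lines as their proof, (A.12) remains valid for $t/5 \le |c|\sqrt{n}/\sigma \le t$. We write
\[
P\Bigl( \Bigl| \frac{\sqrt{n}(\bar X + c)}{s_n} \Bigr| \ge t \Bigr) = P\Bigl( \frac{\sqrt{n}(\bar X + c)}{s_n} \ge t \Bigr) + P\Bigl( \frac{\sqrt{n}(-\bar X - c)}{s_n} \ge t \Bigr).
\]
By (A.12), the remark above and the fact that $1 - \Phi(t + x) = o\bigl( 1 - \Phi(t - x) \bigr)$ for $x \ge 1$ (recall here that we assume t is large), (A.11) holds for $-t \le c\sqrt{n}/\sigma \le t$. Now, assume $|c|\sqrt{n}/\sigma > t$. Then, by (A.2),
\[
P\Bigl( \Bigl| \frac{\sqrt{n}(\bar X + c)}{s_n} \Bigr| \ge t \Bigr) - P\bigl( |Z + c\sqrt{n}/\sigma| \ge t \bigr) = o(1).
\]
Since $|c|\sqrt{n}/\sigma > t$, we have $P(|Z + c\sqrt{n}/\sigma| \ge t) \ge 1/2$ and hence
\[
P\Bigl( \Bigl| \frac{\sqrt{n}(\bar X + c)}{s_n} \Bigr| \ge t \Bigr) = P\bigl( |Z + c\sqrt{n}/\sigma| \ge t \bigr) \bigl( 1 + o(1) \bigr).
\]
This completes the proof of (A.11).

The lemma below shows that $t_{n,m}$, defined in (A.26), is bounded under independence.

Lemma A.4. Assume that there exist $\varepsilon_0 > 0$ and $c_0 > 0$ such that
\[
P\bigl( |\sqrt{n}\mu_1/\sigma_1| \ge \varepsilon_0 \bigr) \ge c_0. \tag{A.13}
\]
Let $t_{n,m}$ satisfy (A.37). Then
\[
t_{n,m} \le t_0, \tag{A.14}
\]
where $t_0$ is the solution of
\[
\alpha \pi_1 c_0 \exp\bigl( (t_0 - \varepsilon_0)\varepsilon_0 \bigr) = 12(1 + |t_0 - \varepsilon_0|). \tag{A.15}
\]

Proof. It suffices to show that
\[
\frac{\sqrt{m}\, E\xi_1(t_0)}{(\mathrm{var}(\xi_1(t_0)))^{1/2}} \ge z_\gamma. \tag{A.16}
\]
It is easy to see that $P(|Z + a| \ge t_0)$ is a monotone increasing function of $a > 0$. Hence,
\begin{align*}
P\bigl( |Z + \sqrt{n}\mu_1/\sigma_1| \ge t_0 \bigr)
&\ge P\bigl( |Z + \sqrt{n}\mu_1/\sigma_1| \ge t_0,\ |\sqrt{n}\mu_1/\sigma_1| \ge \varepsilon_0 \bigr) \\
&\ge P(|Z + \varepsilon_0| \ge t_0)\, P\bigl( |\sqrt{n}\mu_1/\sigma_1| \ge \varepsilon_0 \bigr) \\
&\ge c_0 P(|Z + \varepsilon_0| \ge t_0) \\
&\ge c_0 \bigl( 1 - \Phi(t_0 - \varepsilon_0) \bigr) \\
&\ge \frac{c_0}{3(1 + |t_0 - \varepsilon_0|)} \exp\bigl( -(t_0 - \varepsilon_0)^2/2 \bigr) \\
&\ge \frac{c_0}{3(1 + |t_0 - \varepsilon_0|)} \exp\bigl( -t_0^2/2 + (t_0 - \varepsilon_0)\varepsilon_0 \bigr). \tag{A.17}
\end{align*}
Here, we use the fact that
\[
\frac{1}{2} e^{-x^2/2} \ge 1 - \Phi(x) \ge \frac{1}{\sqrt{2\pi}(1 + x)} e^{-x^2/2} \quad \text{for } x \ge 0.
\]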
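As a reading aid, the last two inequalities in (A.17) can be checked directly from the tail bound just quoted. The lines below are a sketch of that arithmetic rather than part of the original proof, and they assume $t_0 \ge \varepsilon_0$ so that the bound applies with $x = t_0 - \varepsilon_0 \ge 0$.

```latex
\[
1 - \Phi(t_0 - \varepsilon_0)
\ge \frac{e^{-(t_0 - \varepsilon_0)^2/2}}{\sqrt{2\pi}\,(1 + t_0 - \varepsilon_0)}
\ge \frac{e^{-(t_0 - \varepsilon_0)^2/2}}{3(1 + t_0 - \varepsilon_0)},
\qquad \text{since } \sqrt{2\pi} < 3,
\]
\[
-\frac{(t_0 - \varepsilon_0)^2}{2}
= -\frac{t_0^2}{2} + t_0\varepsilon_0 - \frac{\varepsilon_0^2}{2}
\ge -\frac{t_0^2}{2} + t_0\varepsilon_0 - \varepsilon_0^2
= -\frac{t_0^2}{2} + (t_0 - \varepsilon_0)\varepsilon_0 .
\]
```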

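Under the null hypothesis $H_1 = 0$, which corresponds to $\mu_i = 0$, we apply Lemma A.3 and obtain
\[
P(|T_1| \ge t \mid H_1 = 0) = P(|Z| \ge t) \bigl( 1 + o(1) \bigr) \tag{A.18}
\]
uniformly in $t = o(n^{1/6})$. Under the alternative hypothesis $H_1 = 1$, we apply Lemma A.3 to $X_{ij} - \mu_i$ and obtain
\begin{align*}
P(|T_1| \ge t \mid H_1 = 1)
&= P\bigl( |\sqrt{n}(\bar X_1 - \mu_1 + \mu_1)/s_1| \ge t \mid H_1 = 1 \bigr) \\
&= E\bigl[ P\bigl( |Z + \sqrt{n}\mu_1/\sigma_1| \ge t \mid \mu_1, \sigma_1 \bigr) \bigr] \bigl( 1 + o(1) \bigr) \\
&= P\bigl( |Z + \sqrt{n}\mu_1/\sigma_1| \ge t \bigr) \bigl( 1 + o(1) \bigr) \tag{A.19}
\end{align*}
uniformly in $t = o(n^{1/6})$. Also, note that
\begin{align*}
P(|T_1| \ge t)
&= P(|T_1| \ge t, H_1 = 0) + P(|T_1| \ge t, H_1 = 1) \\
&= (1 - \pi_1) P(|T_1| \ge t \mid H_1 = 0) + \pi_1 P(|T_1| \ge t \mid H_1 = 1) \\
&= (1 - \pi_1) P(|Z| \ge t) \bigl( 1 + o(1) \bigr) + \pi_1 P\bigl( |Z + \sqrt{n}\mu_1/\sigma_1| \ge t \bigr) \bigl( 1 + o(1) \bigr). \tag{A.20}
\end{align*}
By (A.34), (A.18), (A.20) and (A.17),
\begin{align*}
E\xi_1(t_0)
&= \alpha(1 - \pi_1) P(|Z| \ge t_0) \bigl( 1 + o(1) \bigr) + \alpha\pi_1 P\bigl( |Z + \sqrt{n}\mu_1/\sigma_1| \ge t_0 \bigr) \bigl( 1 + o(1) \bigr) - (1 - \pi_1) P(|Z| \ge t_0) \bigl( 1 + o(1) \bigr) \\
&\ge \frac{\alpha\pi_1 c_0}{6(1 + |t_0 - \varepsilon_0|)} \exp\bigl( -t_0^2/2 + (t_0 - \varepsilon_0)\varepsilon_0 \bigr) - 2P(Z \ge t_0) \\
&\ge \frac{\alpha\pi_1 c_0}{6(1 + |t_0 - \varepsilon_0|)} \exp\bigl( -t_0^2/2 + (t_0 - \varepsilon_0)\varepsilon_0 \bigr) - e^{-t_0^2/2} \\
&= e^{-t_0^2/2} \Bigl( \frac{\alpha\pi_1 c_0}{6(1 + |t_0 - \varepsilon_0|)} \exp\bigl( (t_0 - \varepsilon_0)\varepsilon_0 \bigr) - 1 \Bigr) \tag{A.21} \\
&= e^{-t_0^2/2},
\end{align*}
by (A.15) and the definition of $t_0$. It is easy to see that $E\xi_1^2 \le 1$ and $\mathrm{var}(\xi_1(t_0)) \le 1$ in particular. Thus, by (A.21),
\[
\frac{\sqrt{m}\, E\xi_1(t_0)}{(\mathrm{var}(\xi_1(t_0)))^{1/2}} \ge \sqrt{m}\, e^{-t_0^2/2} \ge z_\gamma, \tag{A.22}
\]
provided that m is large enough. This proves (A.16).

The following i.i.d. results are essential for the general results.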
