Web-based Supplementary Materials for Controlling False Discoveries in Multidimensional Directional Decisions, with Applications to Gene Expression Data on Ordered Categories

Wenge Guo
Biostatistics Branch, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, U.S.A.
email: wenge.guo@gmail.com

and

Sanat K. Sarkar
Department of Statistics, Temple University, Philadelphia, PA 19122, U.S.A.
email: sanat@temple.edu

and

Shyamal D. Peddada
Biostatistics Branch, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, U.S.A.
email: peddada@niehs.nih.gov
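Web Appendix A: Proof of Theorem 1

We begin by calculating the pure directional FDR (dFDR), $E\{S/(R \vee 1)\}$. Let $I_1 = \{1 \le j \le m : \delta_j \ne 0\}$ be the set of indices of false null hypotheses, and let $m_1 = |I_1|$ and $m_0 = m - m_1$. Then $S$ can be expressed as

\[
S = \sum_{j \in I_1} I\left( \bigcup_{i=1}^{q} \left\{ P_{ij} \le \frac{R}{qm}\alpha,\ T_{ij}\delta_{ij} \le 0 \right\} \right),
\]

where $I(\cdot)$ is the indicator function. Thus

\[
\begin{aligned}
\mathrm{dFDR} &= E\left\{ \frac{S}{R \vee 1} \right\} = E\left\{ \frac{E(S \mid R)}{R \vee 1} \right\} \\
&= E\left\{ \sum_{j \in I_1} \frac{1}{R \vee 1}\, P\left( \bigcup_{i=1}^{q} \left\{ P_{ij} \le \frac{R}{qm}\alpha,\ T_{ij}\delta_{ij} \le 0 \right\} \,\Big|\, R \right) \right\} \\
&= \sum_{r=1}^{m} \sum_{j \in I_1} \frac{1}{r}\, P\left\{ \bigcup_{i=1}^{q} \left( P_{ij} \le \frac{r}{qm}\alpha,\ T_{ij}\delta_{ij} \le 0 \right),\ R = r \right\} \\
&\le \sum_{r=1}^{m} \sum_{j \in I_1} \sum_{i=1}^{q} \frac{1}{r}\, P\left\{ P_{ij} \le \frac{r}{qm}\alpha,\ T_{ij}\delta_{ij} \le 0,\ R = r \right\}.
\end{aligned}
\tag{A.1}
\]

The inequality follows from the Bonferroni inequality.

For any given $i$ and $j$, without loss of generality, we assume $\delta_{ij} \ge 0$. When $\delta_{ij} > 0$, we have

\[
\begin{aligned}
P\left\{ P_{ij} \le \frac{r}{qm}\alpha,\ T_{ij}\delta_{ij} \le 0,\ R = r \right\}
&= P\left\{ P_{ij} \le \frac{r}{qm}\alpha,\ T_{ij} \le 0,\ R = r \right\} \\
&\le P\left\{ F_{ij}(T_{ij}, 0) \le \frac{r}{2qm}\alpha,\ R = r \right\} \\
&= P\left\{ T_{ij} \le F_{ij}^{-1}\left( \frac{r}{2qm}\alpha, 0 \right),\ R = r \right\},
\end{aligned}
\tag{A.2}
\]

where $F_{ij}^{-1}(\cdot, 0)$ is the inverse function of $F_{ij}(\cdot, 0)$. The inequality in the above calculations follows from the definition of $P_{ij}$ and the assumption $F_{ij}(0, 0) = 1/2$.

Noting that $T_j = (T_{1j}, \ldots, T_{qj})$, $j = 1, \ldots, m$, are independent of each other, the last probability in (A.2) can be simplified to

\[
\begin{aligned}
P\left\{ T_{ij} \le F_{ij}^{-1}\left( \frac{r}{2qm}\alpha, 0 \right) \right\} P\left( R^{(-j)} = r - 1 \right)
&= F_{ij}\left( F_{ij}^{-1}\left( \frac{r}{2qm}\alpha, 0 \right), \delta_{ij} \right) P\left( R^{(-j)} = r - 1 \right) \\
&\le F_{ij}\left( F_{ij}^{-1}\left( \frac{r}{2qm}\alpha, 0 \right), 0 \right) P\left( R^{(-j)} = r - 1 \right) \\
&= \frac{r}{2qm}\alpha\, P\left( R^{(-j)} = r - 1 \right),
\end{aligned}
\tag{A.3}
\]

where $R^{(-j)}$ denotes the number of rejections in the stepup procedure with critical constants $\alpha_k = \frac{k+1}{m}\alpha$, $k = 1, \ldots, m - 1$, based on $\{P_1, \ldots, P_m\} \setminus \{P_j\}$. The above inequality follows from the assumption that $F_{ij}(\cdot, \delta_{ij})$ is stochastically increasing in $\delta_{ij} \ge 0$.

Similarly, when $\delta_{ij} = 0$, we have

\[
\begin{aligned}
P\left\{ P_{ij} \le \frac{r}{qm}\alpha,\ T_{ij}\delta_{ij} \le 0,\ R = r \right\}
&= P\left\{ P_{ij} \le \frac{r}{qm}\alpha,\ R = r \right\} \\
&= P\left\{ P_{ij} \le \frac{r}{qm}\alpha,\ R^{(-j)} = r - 1 \right\} \\
&\le \frac{r}{qm}\alpha\, P\left( R^{(-j)} = r - 1 \right).
\end{aligned}
\tag{A.4}
\]

The last inequality follows from the fact that the two-sided p-value $P_{ij}$ satisfies the condition (2) when $\delta_{ij} = 0$.

Using (A.2)–(A.4) in (A.1), and noting that $\sum_{r=1}^{m} P(R^{(-j)} = r - 1) = 1$, we have

\[
\mathrm{dFDR} \le \sum_{r=1}^{m} \sum_{j \in I_1} \sum_{i=1}^{q} \frac{\alpha}{qm}\, P\left( R^{(-j)} = r - 1 \right) = \frac{m_1}{m}\alpha.
\tag{A.5}
\]

Noting that the pooled p-values $P_j$, $j = 1, \ldots, m$, satisfy the condition (2), for independent p-values $P_j$ the usual FDR of the q-dimensional directional BH procedure satisfies the following inequality,

\[
\mathrm{FDR} \le \frac{m_0}{m}\alpha;
\tag{A.6}
\]

see Benjamini and Hochberg (1995), Benjamini and Yekutieli (2001) or Sarkar (2002). Combining (A.5) and (A.6), we have

\[
\mathrm{mdFDR} = \mathrm{FDR} + \mathrm{dFDR} \le \frac{m_0}{m}\alpha + \frac{m_1}{m}\alpha = \alpha,
\tag{A.7}
\]

and hence the proof is complete.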
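The decision rule that the proof analyses can be sketched in code. The following is a minimal illustration only, under stated assumptions: it takes the pooled p-value to be the Bonferroni-type quantity q·min_i P_ij (capped at 1), which this appendix does not spell out, and it reads the directional step off the events used in (A.1), namely that component i of a rejected hypothesis j receives a sign decision when P_ij ≤ Rα/(qm). The function names `bh_rejection_count` and `directional_bh` are hypothetical.

```python
def bh_rejection_count(pooled, alpha):
    """Benjamini-Hochberg step-up: largest k with p_(k) <= k * alpha / m."""
    m = len(pooled)
    r = 0
    for k, p in enumerate(sorted(pooled), start=1):
        if p <= k * alpha / m:
            r = k
    return r


def directional_bh(pvals, tstats, alpha):
    """Sketch of a q-dimensional directional BH rule.

    pvals[j][i] and tstats[j][i] hold the two-sided p-value and the test
    statistic for component i of hypothesis j. Returns (R, decisions),
    where decisions maps each rejected j to {i: +1 or -1}.
    """
    m, q = len(pvals), len(pvals[0])
    # Assumption: Bonferroni-type pooling, q * min_i P_ij, capped at 1.
    pooled = [min(1.0, q * min(row)) for row in pvals]
    r = bh_rejection_count(pooled, alpha)
    decisions = {}
    for j in range(m):
        # Hypothesis j is rejected when its pooled p-value is <= R * alpha / m.
        if r > 0 and pooled[j] <= r * alpha / m:
            cut = r * alpha / (q * m)  # component-level threshold from (A.1)
            decisions[j] = {
                i: (1 if tstats[j][i] > 0 else -1)
                for i in range(q)
                if pvals[j][i] <= cut
            }
    return r, decisions
```

A directional error in the sense of the proof occurs for component i of hypothesis j when a sign is declared and the product of that statistic with the true δ_ij is not positive.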
Web Appendix B: Some Additional Simulation Results

In addition to evaluating the performance of Procedure 1, we also evaluated the performance of Procedure 2 in the same simulation study. Web Figure 7 presents the simulated FDR, dFDR and mdFDR, and Web Figure 8 presents the average power of Procedure 2 plotted against the number of false null hypotheses for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.2, 0.5 and 0.8. Comparing Figure 1 with Web Figure 7 and Web Figure 1 with Web Figure 8, we find that both procedures, Procedure 1 and Procedure 2, perform similarly.

We also used a simulation study to evaluate the performance of Procedure 2 under dependence within genes. We generated m independently distributed (q + 1)-dimensional random normal vectors $Z_1, \ldots, Z_m$, where the components $Z_{ij}$, $j = 1, \ldots, q + 1$, of each $Z_i$, with $Z_{ij} \sim N(\mu_{ij}, 1)$, are dependent with a compound symmetry structure or an autoregressive order one (AR(1)) structure, respectively, and have correlation parameter ρ. Let $\delta_{ij} = (\mu_{i,j+1} - \mu_{ij})/\sqrt{2}$, $i = 1, \ldots, m$; $j = 1, \ldots, q$. Out of the m parameter vectors $\delta_i = (\delta_{i1}, \ldots, \delta_{iq})$, $i = 1, \ldots, m$, $m_0$ were each set to a null vector, and all the $\delta_{ij}$ in 50%, 25% and 25% of the remaining $m - m_0$ vectors $\delta_i$ were selected randomly from the intervals (−0.75, 0.75), (−4.25, −2.75) and (2.75, 4.25), respectively. For each $i = 1, \ldots, m$ and $j = 1, \ldots, q$, the statistic $T_{ij} = (Z_{i,j+1} - Z_{ij})/\sqrt{2}$ for testing $H^{j}_{0i} : \delta_{ij} = 0$ vs. $H^{j}_{1i} : \delta_{ij} \ne 0$ and the corresponding two-sided p-value $P_{ij} = 2\{1 - \Phi(|T_{ij}|)\}$ were then computed, where $\Phi(\cdot)$ is the standard normal cdf. The pooled p-values were calculated according to the Simes test for each $i = 1, \ldots, m$, and Procedure 2 was applied to the resulting list of pooled p-values for testing the m null hypotheses described in (1). As in the above simulation study, the simulated values of the FDR, dFDR and mdFDR were obtained by repeating the simulation steps 10,000 times.
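One replicate of the data-generating step described above can be sketched as follows. This is an illustration under assumptions, not the authors' simulation code: the text does not say how the means are anchored, so the sketch sets µ_i1 = 0 and builds the remaining means from δ_ij; "selected randomly" is read as uniform sampling; the correlated normals are produced by a shared-factor construction (compound symmetry, ρ in [0, 1)) or a first-order recursion (AR(1)); and the function names are hypothetical. The identity 2{1 − Φ(x)} = erfc(x/√2) is used for the two-sided p-value.

```python
import math
import random


def draw_deltas(m, m0, q, rng):
    """m0 null vectors, then 50%/25%/25% of the rest from the three intervals.

    Assumption: "selected randomly" means uniform on each interval.
    """
    intervals = [(-0.75, 0.75), (-4.25, -2.75), (2.75, 4.25)]
    m1 = m - m0
    counts = [m1 // 2, m1 // 4, m1 - m1 // 2 - m1 // 4]
    deltas = [[0.0] * q for _ in range(m0)]
    for (lo, hi), n in zip(intervals, counts):
        deltas += [[rng.uniform(lo, hi) for _ in range(q)] for _ in range(n)]
    return deltas


def correlated_noise(size, rho, structure, rng):
    """Unit-variance normals with compound symmetry ("cs") or AR(1) ("ar1")."""
    e = [rng.gauss(0.0, 1.0) for _ in range(size)]
    if structure == "cs":
        w = rng.gauss(0.0, 1.0)  # factor shared by all components
        return [math.sqrt(rho) * w + math.sqrt(1.0 - rho) * x for x in e]
    noise = [e[0]]
    for x in e[1:]:
        noise.append(rho * noise[-1] + math.sqrt(1.0 - rho * rho) * x)
    return noise


def simes(pvals):
    """Simes pooled p-value: min over k of q * p_(k) / k."""
    q = len(pvals)
    return min(q * p / k for k, p in enumerate(sorted(pvals), start=1))


def simulate_gene(delta, rho, structure, rng):
    """Return (T_i, P_i, pooled p-value) for one gene with parameter vector delta."""
    q = len(delta)
    mu = [0.0]  # assumption: first mean fixed at 0
    for d in delta:
        mu.append(mu[-1] + math.sqrt(2.0) * d)
    noise = correlated_noise(q + 1, rho, structure, rng)
    z = [a + b for a, b in zip(mu, noise)]
    t = [(z[j + 1] - z[j]) / math.sqrt(2.0) for j in range(q)]
    p = [math.erfc(abs(x) / math.sqrt(2.0)) for x in t]  # 2 * (1 - Phi(|t|))
    return t, p, simes(p)


rng = random.Random(0)
deltas = draw_deltas(m=1000, m0=600, q=5, rng=rng)
pooled = [simulate_gene(d, 0.2, "ar1", rng)[2] for d in deltas]
```

The list `pooled` would then be passed to the multiple testing procedure, and the whole replicate repeated to estimate the error rates.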
Web Figures 9 and 11 provide the simulated FDR, dFDR and mdFDR of Procedure 2 plotted against the number of false null hypotheses for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.1, 0.2 and 0.3 under dependence within genes according to compound symmetry structure and AR(1) structure, respectively. The average power of Procedure 2 under these dependence structures is provided in Web Figures 10 and 12, respectively. As seen from Web Figures 9 and 11, the simulated mdFDR of Procedure 2 is severely affected by dependence within genes, but it is still below the pre-specified level. In addition, as seen from Web Figures 10 and 12, there is no monotone relationship between the average power of Procedure 2 and the correlation parameter ρ.

[Figure 1 about here.]
[Figure 2 about here.]
[Figure 3 about here.]
[Figure 4 about here.]
[Figure 5 about here.]
[Figure 6 about here.]
[Figure 7 about here.]
[Figure 8 about here.]
[Figure 9 about here.]
[Figure 10 about here.]
[Figure 11 about here.]
[Figure 12 about here.]
LIST OF FIGURES

1. Power of Procedure 1 under dependence across genes for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.2, 0.5 and 0.8.
2. Performance of Procedure 1 under dependence across genes in terms of its control of the FDR (solid), dFDR (dashed) and mdFDR (dotted) for m = 1000, q = 5, α = 0.05, ρ = 0, 0.2, 0.5 and 0.8, and δ = (100, 0, ..., 0).
3. Standard deviation of the mdFDR of Procedure 1 under dependence across genes for m = 1000, q = 5, α = 0.05, ρ = 0, 0.2, 0.5 and 0.8, and δ = (100, 0, ..., 0).
4. A numerical comparison of Procedures 1 and 2 and the no-adjustment procedure in terms of the control of the FDR, dFDR and mdFDR and also power for m = 1200, q = 4, ρ = 0, and α = 0.05.
5. Performance of Procedure 1 with respect to the dimension q in terms of its control of the FDR, dFDR and mdFDR for m = 1000, $m_0$ = 600, α = 0.05 and ρ = 0.
6. Power performance of Procedure 1 with respect to the dimension q for m = 1000, $m_0$ = 600, α = 0.05 and ρ = 0 when testing $H_{0i} : \delta_i = 0$ vs. $H_{1i} : \delta_i \ne 0$, where $i = 1, \ldots, 1000$, $\delta_i = (\delta_{i1}, \ldots, \delta_{iq})$ and all the $\delta_{ij}$ in 200, 100 and 100 of the 400 non-null $\delta_i$ were selected randomly from the intervals (−0.75, 0.75), (−4.25, −2.75) and (2.75, 4.25), respectively.
7. Performance of Procedure 2 under dependence across genes in terms of its control of the FDR, dFDR and mdFDR for m = 1000, q = 5, α = 0.05, and ρ = 0, 0.2, 0.5 and 0.8.
8. Power of Procedure 2 under dependence across genes for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.2, 0.5 and 0.8.
9. Performance of Procedure 2 under dependence within genes with compound symmetry structure in terms of its control of the FDR, dFDR and mdFDR for m = 1000, q = 5, α = 0.05, and ρ = 0, 0.1, 0.2 and 0.3.
10. Power of Procedure 2 under dependence within genes with compound symmetry structure for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.1, 0.2 and 0.3.
11. Performance of Procedure 2 under dependence within genes with AR(1) structure in terms of its control of the FDR, dFDR and mdFDR for m = 1000, q = 5, α = 0.05, and ρ = 0, 0.1, 0.2 and 0.3.
12. Power of Procedure 2 under dependence within genes with AR(1) structure for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.1, 0.2 and 0.3.
[Plot: y-axis "Average Power"; one curve each for rho = 0, 0.2, 0.5 and 0.8.]
Figure 1. Power of Procedure 1 under dependence across genes for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.2, 0.5 and 0.8.
[Plot: four panels titled rho = 0, rho = 0.2, rho = 0.5 and rho = 0.8.]
Figure 2. Performance of Procedure 1 under dependence across genes in terms of its control of the FDR (solid), dFDR (dashed) and mdFDR (dotted) for m = 1000, q = 5, α = 0.05, ρ = 0, 0.2, 0.5 and 0.8, and δ = (100, 0, ..., 0).
[Plot: four panels titled rho = 0, rho = 0.2, rho = 0.5 and rho = 0.8; y-axis "Standard deviation".]
Figure 3. Standard deviation of the mdFDR of Procedure 1 under dependence across genes for m = 1000, q = 5, α = 0.05, ρ = 0, 0.2, 0.5 and 0.8, and δ = (100, 0, ..., 0).
[Plot: four panels titled "Simes adjustment", "Bonferroni adjustment", "No adjustment" and "Power comparison"; the power panel has y-axis "Average Power" and legend SA, BA, NA.]
Figure 4. A numerical comparison of Procedures 1 and 2 and the no-adjustment procedure in terms of the control of the FDR, dFDR and mdFDR and also power for m = 1200, q = 4, ρ = 0, and α = 0.05.
[Plot: x-axis "dimension (q)".]
Figure 5. Performance of Procedure 1 with respect to the dimension q in terms of its control of the FDR, dFDR and mdFDR for m = 1000, $m_0$ = 600, α = 0.05 and ρ = 0.
[Plot: y-axis "Average Power"; x-axis "dimension (q)".]
Figure 6. Power performance of Procedure 1 with respect to the dimension q for m = 1000, $m_0$ = 600, α = 0.05 and ρ = 0 when testing $H_{0i} : \delta_i = 0$ vs. $H_{1i} : \delta_i \ne 0$, where $i = 1, \ldots, 1000$, $\delta_i = (\delta_{i1}, \ldots, \delta_{iq})$ and all the $\delta_{ij}$ in 200, 100 and 100 of the 400 non-null $\delta_i$ were selected randomly from the intervals (−0.75, 0.75), (−4.25, −2.75) and (2.75, 4.25), respectively.
[Plot: four panels titled rho = 0, rho = 0.2, rho = 0.5 and rho = 0.8.]
Figure 7. Performance of Procedure 2 under dependence across genes in terms of its control of the FDR, dFDR and mdFDR for m = 1000, q = 5, α = 0.05, and ρ = 0, 0.2, 0.5 and 0.8.
[Plot: y-axis "Average Power"; one curve each for rho = 0, 0.2, 0.5 and 0.8.]
Figure 8. Power of Procedure 2 under dependence across genes for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.2, 0.5 and 0.8.
[Plot: four panels titled rho = 0, rho = 0.1, rho = 0.2 and rho = 0.3.]
Figure 9. Performance of Procedure 2 under dependence within genes with compound symmetry structure in terms of its control of the FDR, dFDR and mdFDR for m = 1000, q = 5, α = 0.05, and ρ = 0, 0.1, 0.2 and 0.3.
[Plot: y-axis "Average Power"; one curve each for rho = 0, 0.1, 0.2 and 0.3.]
Figure 10. Power of Procedure 2 under dependence within genes with compound symmetry structure for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.1, 0.2 and 0.3.
[Plot: four panels titled rho = 0, rho = 0.1, rho = 0.2 and rho = 0.3.]
Figure 11. Performance of Procedure 2 under dependence within genes with AR(1) structure in terms of its control of the FDR, dFDR and mdFDR for m = 1000, q = 5, α = 0.05, and ρ = 0, 0.1, 0.2 and 0.3.
[Plot: y-axis "Average Power"; one curve each for rho = 0, 0.1, 0.2 and 0.3.]
Figure 12. Power of Procedure 2 under dependence within genes with AR(1) structure for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.1, 0.2 and 0.3.