Multiple Testing
Test Hypotheses in Microarray Studies
Microarray studies aim to discover genes in biological samples that are differentially expressed under different experimental conditions. They aim at
- a high probability of declaring genes to be significantly expressed if they are truly expressed (high power, i.e. low type II error risk),
- while keeping the probability of making false declarations of expression acceptably low (controlling type I error risk).
[Figure: expression (y-axis) by gene (x-axis)]
Lee & Whitmore (2002) Statistics in Medicine 21, 3543-3570
Multiple Testing
Microarray studies typically involve the simultaneous study of thousands of genes, so the probability of producing incorrect test conclusions (false positives and false negatives) must be controlled for the whole gene set.
For each gene there are two possible situations:
- the gene is not differentially expressed, i.e. hypothesis $H_0$ is true
- the gene is differentially expressed at the level described by the alternative hypothesis $H_A$
Test declaration (decision):
- the gene is differentially expressed ($H_0$ rejected)
- the gene is unexpressed ($H_0$ not rejected)

test declaration                   true hypothesis: unexpressed ($H_0$)    true hypothesis: expressed ($H_A$)
unexpressed ($H_0$ not rejected)   true negative                           false negative (type II error, $\beta$)
expressed ($H_0$ rejected)         false positive (type I error, $\alpha$) true positive

(Source: Lee & Whitmore, 2002)
Multiple Testing
Testing simultaneously $m$ hypotheses $H_1, \dots, H_m$, of which $m_0$ are true.

                                        # not rejected   # rejected   total
# true hypotheses (unexpressed genes)   U                V            m_0
# false hypotheses (expressed genes)    T                S            m - m_0
total                                   m - R            R            m

- The counts U, V, S, T are random variables in advance of the analysis of the study data.
- R = number of rejected hypotheses is an observed random variable.
- U, V, S, T are not observable random variables.
- V = number of type I errors (false positives)
- T = number of type II errors (false negatives)
Dudoit et al. (2002) Multiple Hypothesis Testing in Microarray Experiments, Technical Report
Type I and II Error Rates

                                        # not rejected   # rejected   total
# true hypotheses (unexpressed genes)   U                V            m_0
# false hypotheses (expressed genes)    T                S            m - m_0
total                                   m - R            R            m

- $\alpha_0 = E(V)/m_0$: probability of a type I error for any gene
- $\beta_1 = E(T)/(m - m_0)$: probability of a type II error for any gene
- $\alpha_F = P(V > 0)$: family-wise error rate (FWER), the probability of at least one type I error
- False discovery rate (FDR) (Benjamini & Hochberg, 1995): expected proportion of false positives among the rejected hypotheses,
$$\mathrm{FDR} = E(Q), \qquad Q = \begin{cases} V/R & \text{if } R > 0 \\ 0 & \text{if } R = 0 \end{cases}$$
(Source: Dudoit et al., 2002)
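The FWER and FDR definitions can be made concrete with a few lines of Python. This is a minimal sketch, assuming that V and R are known for a set of simulated studies (in real data V is not observable); the function names and the toy outcomes are illustrative only.

```python
def false_discovery_proportion(v, r):
    """Q = V/R if R > 0, and Q = 0 if R = 0."""
    return v / r if r > 0 else 0.0


def empirical_error_rates(outcomes):
    """Estimate FWER = P(V > 0) and FDR = E(Q) from (V, R) pairs.

    outcomes: one (V, R) pair per simulated study, where V is the number
    of false positives and R the number of rejected hypotheses.
    """
    n = len(outcomes)
    fwer = sum(1 for v, _ in outcomes if v > 0) / n
    fdr = sum(false_discovery_proportion(v, r) for v, r in outcomes) / n
    return fwer, fdr


# Hypothetical outcomes of four simulated studies.
outcomes = [(0, 0), (1, 4), (0, 5), (2, 10)]
fwer, fdr = empirical_error_rates(outcomes)
```

Two of the four toy studies contain at least one false positive, so the FWER estimate is larger than the FDR estimate, which averages the proportions 0, 1/4, 0 and 2/10.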
Strong vs. weak control
Expectations and probabilities are conditional on which hypotheses are true.
- strong control: control of the Type I error rate under any combination of true and false hypotheses, i.e., under $\bigcap_{j \in \Lambda_0} H_j$ for all $\Lambda_0 \subseteq \{1, \dots, m\}$, $|\Lambda_0| = m_0$
- weak control: control of the Type I error rate only when all hypotheses are true, i.e. under the complete null hypothesis $H_0^C = \bigcap_{j=1}^{m} H_j$, with $m_0 = m$
(Source: Dudoit et al., 2002)
Notation
For hypothesis $H_j$, $j = 1, \dots, m$:
- observed test statistics $t_j$
- observed unadjusted p-values $p_j$
Ordered p-values and test statistics:
$$p_{r_1} \le p_{r_2} \le \dots \le p_{r_m}, \qquad |t_{s_1}| \ge |t_{s_2}| \ge \dots \ge |t_{s_m}|$$
(Source: Dudoit et al., 2002)
Control of the family-wise error rate (FWER)
Rejection thresholds for the ordered p-values:

observed p-value   Bonferroni   Holm (step-down)   Hochberg (step-up)
p_{r_1}            α/m          α/m                α/m
p_{r_2}            α/m          α/(m-1)            α/(m-1)
:                  :            :                  :
p_{r_j}            α/m          α/(m-j+1)          α/(m-j+1)
:                  :            :                  :
p_{r_{m-1}}        α/m          α/2                α/2
p_{r_m}            α/m          α                  α
Control of the family-wise error rate (FWER)
1. Single-step Bonferroni procedure
   reject $H_j$ with $p_j \le \alpha/m$; adjusted p-value $\tilde p_j = \min(m\,p_j, 1)$
2. Holm (1979) step-down procedure
   $$j^* = \min\{ j : p_{r_j} > \alpha/(m - j + 1) \}, \quad \text{reject } H_{r_j} \text{ for } j = 1, \dots, j^* - 1$$
   adjusted p-value $$\tilde p_{r_j} = \max_{k = 1, \dots, j} \{ \min((m - k + 1)\,p_{r_k}, 1) \}$$
3. Hochberg (1988) step-up procedure
   $$j^* = \max\{ j : p_{r_j} \le \alpha/(m - j + 1) \}, \quad \text{reject } H_{r_j} \text{ for } j = 1, \dots, j^*$$
   adjusted p-value $$\tilde p_{r_j} = \min_{k = j, \dots, m} \{ \min((m - k + 1)\,p_{r_k}, 1) \}$$
4. Single-step Šidák procedure
   adjusted p-value $\tilde p_j = 1 - (1 - p_j)^m$
(Source: Dudoit et al., 2002)
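The four adjustments can be sketched in plain Python as follows. This is an illustrative implementation of the adjusted p-value formulas, not the code of the multtest package; the function names and the example p-values are assumptions made for this sketch.

```python
def bonferroni(p):
    """Single-step Bonferroni: min(m * p_j, 1)."""
    m = len(p)
    return [min(m * pj, 1.0) for pj in p]


def sidak(p):
    """Single-step Sidak: 1 - (1 - p_j)^m."""
    m = len(p)
    return [1.0 - (1.0 - pj) ** m for pj in p]


def holm(p):
    """Holm step-down: running maximum of (m - k + 1) * p_(k), k = 1..j."""
    m = len(p)
    order = sorted(range(m), key=lambda j: p[j])  # indices by increasing p
    adjusted = [0.0] * m
    running_max = 0.0
    for k, j in enumerate(order):  # k is the 0-based rank, so m - k = m - rank + 1
        running_max = max(running_max, min((m - k) * p[j], 1.0))
        adjusted[j] = running_max
    return adjusted


def hochberg(p):
    """Hochberg step-up: running minimum of (m - k + 1) * p_(k), k = j..m."""
    m = len(p)
    order = sorted(range(m), key=lambda j: p[j])
    adjusted = [0.0] * m
    running_min = 1.0
    for k in range(m - 1, -1, -1):  # start at the largest p-value
        j = order[k]
        running_min = min(running_min, min((m - k) * p[j], 1.0))
        adjusted[j] = running_min
    return adjusted


# Hypothetical unadjusted p-values for four genes.
p = [0.01, 0.04, 0.03, 0.005]
adj_bonferroni, adj_sidak = bonferroni(p), sidak(p)
adj_holm, adj_hochberg = holm(p), hochberg(p)
```

A hypothesis is rejected at level α whenever its adjusted p-value is at most α, so the four lists can be compared directly; for every gene the Hochberg value is no larger than the Holm value, which in turn is no larger than the Bonferroni value.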
Resampling
Estimate the joint distribution of the test statistics $T_1, \dots, T_m$ under the complete null hypothesis $H_0^C$ by permuting the columns of the gene expression data matrix $X$.
Permutation algorithm for non-adjusted p-values
For the b-th permutation, $b = 1, \dots, B$:
1. Permute the n columns of the data matrix $X$.
2. Compute test statistics $t_{1,b}, \dots, t_{m,b}$ for each hypothesis.
The permutation distribution of the test statistic $T_j$ for hypothesis $H_j$, $j = 1, \dots, m$, is given by the empirical distribution of $t_{j,1}, \dots, t_{j,B}$. For two-sided alternative hypotheses, the permutation p-value for hypothesis $H_j$ is
$$p_j^* = \frac{1}{B} \sum_{b=1}^{B} I(|t_{j,b}| \ge |t_j|)$$
where I(.) is the indicator function, equal to 1 if the condition in parentheses is true and 0 otherwise.
(Source: Dudoit et al., 2002)
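A minimal sketch of the permutation algorithm in Python follows. Two choices are assumptions of this sketch and not taken from the slides: the test statistic is a two-sample Welch t-statistic, and the group labels are shuffled instead of the columns of X (the two are equivalent). The data are made up for illustration.

```python
import random
from statistics import mean, variance


def t_stat(values, labels):
    """Two-sample Welch t-statistic for one gene (assumed test statistic)."""
    g0 = [v for v, lab in zip(values, labels) if lab == 0]
    g1 = [v for v, lab in zip(values, labels) if lab == 1]
    se = (variance(g0) / len(g0) + variance(g1) / len(g1)) ** 0.5
    return (mean(g1) - mean(g0)) / se


def permutation_p_values(X, labels, B=1000, seed=0):
    """Unadjusted two-sided permutation p-values, one per row (gene) of X."""
    rng = random.Random(seed)
    observed = [abs(t_stat(row, labels)) for row in X]
    counts = [0] * len(X)
    perm = list(labels)
    for _ in range(B):
        rng.shuffle(perm)  # same effect as permuting the n columns of X
        for j, row in enumerate(X):
            if abs(t_stat(row, perm)) >= observed[j]:
                counts[j] += 1
    return [c / B for c in counts]


# Hypothetical data: 2 genes, 5 samples per group.
labels = [0] * 5 + [1] * 5
X = [
    [1.0, 1.1, 0.9, 1.2, 0.8, 3.0, 3.1, 2.9, 3.2, 2.8],  # clear group difference
    [1.0, 2.0, 3.0, 4.0, 5.0, 1.5, 2.5, 3.5, 4.5, 5.5],  # small group difference
]
p_perm = permutation_p_values(X, labels, B=1000, seed=0)
```

With only 10 samples there are 252 distinct splits into two groups of five, so B random permutations approximate the exact permutation distribution; for designs this small one could enumerate all splits instead.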
Control of the family-wise error rate (FWER)
Permutation algorithm of Westfall & Young (1993)
- step-down procedure without assuming a t distribution of the test statistics for each gene's differential expression
- adjusted p-values directly estimated by permutation
- strong control of the FWER
- takes the dependency structure of the hypotheses into account
Control of the family-wise error rate (FWER)
Permutation algorithm of Westfall & Young (maxT)
- Order the observed test statistics: $|t_{s_1}| \ge |t_{s_2}| \ge \dots \ge |t_{s_m}|$
- For the b-th permutation of the data ($b = 1, \dots, B$):
  - divide the data into artificial control and treatment groups
  - compute test statistics $t_{1,b}, \dots, t_{m,b}$
  - compute successive maxima of the test statistics:
    $$u_{m,b} = |t_{s_m,b}|, \qquad u_{j,b} = \max\{ u_{j+1,b}, |t_{s_j,b}| \} \quad \text{for } j = m-1, \dots, 1$$
- Compute adjusted p-values:
  $$\tilde p^*_{s_j} = \frac{1}{B} \sum_{b=1}^{B} I(u_{j,b} \ge |t_{s_j}|)$$
(Source: Dudoit et al., 2002)
Control of the family-wise error rate (FWER)
Permutation algorithm of Westfall & Young: Example
Observed values, sorted by |t| (listed here from smallest to largest):

gene   1     4     5     2     3
|t|    0.1   0.2   2.8   3.4   7.1

One permutation b and the resulting adjusted p-values (B = 1000 permutations):

gene   |t_b|   u_b   I(u_b > |t|)   Σ     p~ = Σ/B
1      1.3     1.3   1              935   0.935
4      0.8     1.3   1              876   0.876
5      3.0     3.0   1              138   0.138
2      2.1     3.0   0              145   0.145
3      1.8     3.0   0              48    0.048

O. Hartmann - NFN Symposium, 19.11.2002 Berlin
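The maxT steps can be sketched in Python as below. The sketch assumes that the permutation statistics have already been computed and are passed in as lists, so it contains no resampling itself; the function names are illustrative. It also adds a final step that enforces monotonicity of the adjusted p-values, which is part of the algorithm as described by Westfall & Young and by Dudoit et al. but is not shown on the slides. The check at the end uses the observed values and the single permutation from the example.

```python
def successive_maxima(stats_desc):
    """Successive maxima u_j = max(u_{j+1}, |t_j|), starting from the last entry.

    stats_desc: permutation statistics |t_{s_1,b}|, ..., |t_{s_m,b}|, arranged
    in the order of decreasing observed |t|.
    """
    u = list(stats_desc)
    for j in range(len(u) - 2, -1, -1):
        u[j] = max(u[j + 1], u[j])
    return u


def maxt_adjusted_p(observed, perm_stats):
    """Step-down maxT adjusted p-values, returned in the original gene order.

    observed:   list of m observed test statistics
    perm_stats: list of B lists, each with m permutation statistics
                in the same gene order as `observed`
    """
    m, B = len(observed), len(perm_stats)
    order = sorted(range(m), key=lambda j: -abs(observed[j]))  # decreasing |t|
    counts = [0] * m
    for stats_b in perm_stats:
        u = successive_maxima([abs(stats_b[j]) for j in order])
        for rank, j in enumerate(order):
            if u[rank] >= abs(observed[j]):
                counts[rank] += 1
    # Enforce monotonicity: a smaller |t| never gets a smaller adjusted p-value.
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, j in enumerate(order):
        running_max = max(running_max, counts[rank] / B)
        adjusted[j] = running_max
    return adjusted


# Genes 1..5 from the example slide (observed |t| and one permutation).
observed = [0.1, 3.4, 7.1, 0.2, 2.8]
one_perm = [1.3, 2.1, 1.8, 0.8, 3.0]
order = sorted(range(5), key=lambda j: -abs(observed[j]))
u_b = successive_maxima([one_perm[j] for j in order])  # genes 3, 2, 5, 4, 1
adj_one_perm = maxt_adjusted_p(observed, [one_perm])
```

For this permutation the successive maxima are 3.0 for genes 3, 2 and 5 and 1.3 for genes 4 and 1, which matches the u_b column of the example table.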
Example: Leukemia study, Golub et al. (1999)
Patients with
- ALL (acute lymphoblastic leukemia): $n_1 = 27$
- AML (acute myeloid leukemia): $n_2 = 11$
Affymetrix chip: 6817 genes, reduced to 3051 genes according to certain exclusion criteria for expression values
Example: Leukemia study, Golub et al. (1999)
[Figure from Dudoit et al. (2002)]
Control of the False Discovery Rate (FDR)
While in some cases FWER control is needed, the multiplicity problem in microarray data does not require protection against even a single type I error, so the severe loss of power involved in such protection is not justified. Instead, it may be more appropriate to emphasize the proportion of errors among the identified differentially expressed genes. The expectation of this proportion is the False Discovery Rate (FDR).
$$\mathrm{FDR} = E(Q), \qquad Q = \begin{cases} V/R & \text{if } R > 0 \\ 0 & \text{if } R = 0 \end{cases}$$
R = number of rejected hypotheses
V = number of type I errors (false positives)
Reiner, Yekutieli & Benjamini (2003) Bioinformatics 19, 368-375
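Control of the False Discovery Rate (FDR)
1. Linear step-up procedure (Benjamini & Hochberg, 1995)
   $$j^* = \max\{ j : p_{r_j} \le \tfrac{j}{m}\,q \}, \quad \text{reject } H_{r_j} \text{ for } j = 1, \dots, j^*$$
   adjusted p-value $$\tilde p_{r_j} = \min_{k = j, \dots, m} \{ \min(\tfrac{m}{k}\,p_{r_k}, 1) \}$$
   - controls the FDR at level q for independent test statistics: $\mathrm{FDR} \le \frac{m_0}{m}\,q \le q$
2. Benjamini & Yekutieli (2001)
   - procedure 1 controls the FDR under certain dependency structures (positive regression dependency)
   - step-up procedure for more general cases (replace $q$ by $q / \sum_{i=1}^{m} 1/i$):
   $$j^* = \max\Big\{ j : p_{r_j} \le \tfrac{j}{m}\,q \Big/ \sum_{i=1}^{m} \tfrac{1}{i} \Big\}, \quad \text{reject } H_{r_j} \text{ for } j = 1, \dots, j^*$$
   adjusted p-value $$\tilde p_{r_j} = \min_{k = j, \dots, m} \Big\{ \min\Big(\tfrac{m}{k} \sum_{i=1}^{m} \tfrac{1}{i}\; p_{r_k}, 1\Big) \Big\}$$
   - this modification may be too conservative for the microarray problem
(Source: Reiner, Yekutieli & Benjamini, 2003)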
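Both step-up procedures differ only in a constant factor, which the following Python sketch makes explicit. It is an illustrative implementation of the adjusted p-value formulas with made-up p-values; the helper name `_step_up` is an assumption of this sketch.

```python
def _step_up(p, factor):
    """Adjusted p-values min_{k >= j} min(factor * (m/k) * p_(k), 1)."""
    m = len(p)
    order = sorted(range(m), key=lambda j: p[j])  # indices by increasing p
    adjusted = [0.0] * m
    running_min = 1.0
    for k in range(m, 0, -1):  # rank k = m, ..., 1
        j = order[k - 1]
        running_min = min(running_min, min(factor * m / k * p[j], 1.0))
        adjusted[j] = running_min
    return adjusted


def benjamini_hochberg(p):
    """Linear step-up procedure (factor 1)."""
    return _step_up(p, 1.0)


def benjamini_yekutieli(p):
    """Step-up procedure for general dependency (factor sum_{i=1}^m 1/i)."""
    m = len(p)
    return _step_up(p, sum(1.0 / i for i in range(1, m + 1)))


# Hypothetical unadjusted p-values for four genes.
p = [0.01, 0.04, 0.03, 0.005]
adj_bh = benjamini_hochberg(p)
adj_by = benjamini_yekutieli(p)
```

For m = 4 the Benjamini-Yekutieli factor is 1 + 1/2 + 1/3 + 1/4 = 25/12, so each adjusted value is about twice its Benjamini-Hochberg counterpart; for thousands of genes the factor grows roughly like ln(m), which illustrates why the modification can be conservative.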
Control of the False Discovery Rate (FDR)
3. Adaptive procedures (Benjamini & Hochberg, 2000)
   - try to estimate $m_0$ and use $q^* = q\,m/m_0$ instead of $q$ in procedure 1 to gain more power
   - Storey (2001) suggests a similar version to estimate $m_0$, which is implemented in SAM (Storey & Tibshirani, 2003)
   - adaptive methods offer better performance only by utilizing the difference between $m_0/m$ and 1; if the difference is small, i.e. when the potential proportion of differentially expressed genes is small, they offer little advantage in power, while their properties are not well established
4. Resampling FDR adjustments
   - Yekutieli & Benjamini (1999) J. Statist. Plan. Inference 82, 171-196
   - Reiner, Yekutieli & Benjamini (2003) Bioinformatics 19, 368-375
(Source: Reiner, Yekutieli & Benjamini, 2003)
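The idea of the adaptive procedures can be illustrated with a short Python sketch. The slides do not state which estimator of $m_0$ is used; the sketch assumes a Storey-type estimator with a fixed tuning value λ = 0.5, namely the number of p-values above λ divided by (1 - λ), and clamps the estimate to the range [1, m] as an ad hoc guard. The p-values are made up.

```python
def estimate_m0(p, lam=0.5):
    """Storey-type estimate of the number of true null hypotheses (assumed form)."""
    m = len(p)
    m0_hat = sum(1 for pj in p if pj > lam) / (1.0 - lam)
    return min(max(m0_hat, 1.0), float(m))  # ad hoc guard: keep within [1, m]


def step_up_rejections(p, q):
    """Indices rejected by the linear step-up procedure at level q."""
    m = len(p)
    order = sorted(range(m), key=lambda j: p[j])
    j_star = 0
    for k, j in enumerate(order, start=1):
        if p[j] <= k / m * q:
            j_star = k  # largest rank whose p-value is below its threshold
    return sorted(order[:j_star])


def adaptive_rejections(p, q=0.05, lam=0.5):
    """Step-up procedure run at q* = q * m / m0_hat instead of q."""
    q_star = q * len(p) / estimate_m0(p, lam)
    return step_up_rejections(p, q_star)


# Hypothetical p-values for eight genes.
p = [0.001, 0.012, 0.02, 0.035, 0.3, 0.6, 0.7, 0.9]
plain = step_up_rejections(p, 0.05)
adaptive = adaptive_rejections(p, 0.05)
```

In this toy example three p-values exceed 0.5, giving an estimate of 6 true null hypotheses out of 8, and the adaptive version rejects one hypothesis more than the plain step-up procedure at the same q.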
Example: Leukemia study, Golub et al. (1999)
[Figure from Dudoit et al. (2002)]
Example: Apo AI experiment, Callow et al. (2000)
Apolipoprotein AI (Apo AI) experiment in mice
- aim: identification of differentially expressed genes in liver tissues
- experimental group: 8 mice with the apo AI gene knocked out (apo AI KO)
- control group: 8 C57BL/6 mice
- experimental sample: cDNA for each of the 16 mice, labeled with red (Cy5)
- reference sample: pooled cDNA of the 8 control mice, labeled with green (Cy3)
- cDNA arrays with 6384 cDNA probes, 200 related to lipid metabolism
- 16 hybridizations overall
Example: Apo AI experiment, Callow et al. (2000)
[Figure from Dudoit et al. (2002)]
Example 2: Apo AI experiment, Callow et al. (2000)
[Figure from Dudoit et al. (2002)]
Multiple Testing -- Summary
- For multiple testing problems there are several methods to control the family-wise error rate (FWER).
- FDR controlling procedures are promising alternatives to the more conservative FWER controlling procedures.
- Strong control of the type I error rate is essential in the microarray context.
- Adjusted p-values provide flexible summaries of the results from a multiple testing procedure and allow for a comparison of different methods.
- Substantial gain in power can be obtained by taking into account the joint distribution of the test statistics (e.g. Westfall & Young, 1993; Reiner, Yekutieli & Benjamini, 2003).
- Recommended software: Bioconductor R multtest package (http://www.bioconductor.org/)
Adapted from S. Dudoit, Bioconductor short course 2002
Multiple Testing -- Literature
Benjamini, Y. & Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing, J. R. Statist. Soc. B 57: 289-300.
Benjamini, Y. & Hochberg, Y. (2000). On the adaptive control of the false discovery rate in multiple testing with independent statistics, J. Educ. Behav. Stat. 25: 60-83.
Benjamini, Y. & Yekutieli, D. (2001). The control of the false discovery rate under dependency, Ann. Stat. 29: 1165-1188.
Callow, M. J., Dudoit, S., Gong, E. L., Speed, T. P. & Rubin, E. M. (2000). Microarray expression profiling identifies genes with altered expression in HDL deficient mice, Genome Research 10(12): 2022-2029.
Dudoit, S., Shaffer, J. P. & Boldrick, J. C. (Submitted). Multiple hypothesis testing in microarray experiments, Technical Report #110 (http://stat-www.berkeley.edu/users/sandrine/publications.html).
Golub, T. R., Slonim, D. K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J. P., Coller, H., Loh, M., Downing, J. R., Caligiuri, M. A., Bloomfield, C. D. & Lander, E. S. (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, Science 286: 531-537.
Multiple Testing -- Literature (continued)
Hochberg, Y. (1988). A sharper Bonferroni procedure for multiple tests of significance, Biometrika 75: 800-802.
Holm, S. (1979). A simple sequentially rejective multiple test procedure, Scand. J. Statist. 6: 65-70.
Lee, M.-L. T. & Whitmore, G. A. (2002). Power and sample size for DNA microarray studies, Statistics in Medicine 21: 3543-3570.
Reiner, A., Yekutieli, D. & Benjamini, Y. (2003). Identifying differentially expressed genes using false discovery rate controlling procedures, Bioinformatics 19: 368-375.
Westfall, P. H. & Young, S. S. (1993). Resampling-based multiple testing: Examples and methods for p-value adjustment, John Wiley & Sons.
Yekutieli, D. & Benjamini, Y. (1999). Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics, J. Stat. Plan. Infer. 82: 171-196.
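2 x 2 Factorial Experiments
Two experimental factors, e.g.
- treatment (untreated T-, treated T+)
- strain (knock-out KN, wild-type WT)
Linear model
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2)$$
$$x_1 = \begin{cases} 0 & \text{strain = KN} \\ 1 & \text{strain = WT} \end{cases} \qquad x_2 = \begin{cases} 0 & \text{treatment = T-} \\ 1 & \text{treatment = T+} \end{cases}$$
- $\beta_1$: strain effect
- $\beta_2$: treatment effect
- $\beta_3$: interaction effect of strain and treatment
Expected expression per cell:

strain   treatment T-   treatment T+
KN       β_0            β_0 + β_2
WT       β_0 + β_1      β_0 + β_1 + β_2 + β_3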
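Because the model has four parameters for four cells, its least-squares fit reproduces the four cell means, and the coefficient estimates are simple contrasts of those means. The following Python sketch shows this; the function name, the dictionary layout and the expression values are assumptions made for illustration.

```python
from statistics import mean


def fit_2x2(cells):
    """Least-squares estimates (b0, b1, b2, b3) for the 2 x 2 model.

    cells: dict mapping (x1, x2) to a list of expression values, where
    x1 = 0/1 codes strain KN/WT and x2 = 0/1 codes treatment T-/T+.
    """
    m = {key: mean(values) for key, values in cells.items()}
    b0 = m[(0, 0)]                                      # KN, T-
    b1 = m[(1, 0)] - m[(0, 0)]                          # strain effect at T-
    b2 = m[(0, 1)] - m[(0, 0)]                          # treatment effect in KN
    b3 = m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)]  # interaction
    return b0, b1, b2, b3


# Hypothetical log-expression values, two replicates per cell.
cells = {
    (0, 0): [8.0, 8.2],    # KN, T-
    (0, 1): [9.0, 9.2],    # KN, T+
    (1, 0): [8.5, 8.7],    # WT, T-
    (1, 1): [10.5, 10.7],  # WT, T+
}
b0, b1, b2, b3 = fit_2x2(cells)
```

In the toy data the treatment raises expression by 1.0 in KN but by 2.0 in WT, so the interaction estimate is 1.0.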
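2 x 2 Factorial Experiments: $\beta_3 > 0$
[Figure: the four cell means $\beta_0$, $\beta_0 + \beta_1$, $\beta_0 + \beta_2$ and $\beta_0 + \beta_1 + \beta_2 + \beta_3$, plotted against treatment (T-, T+) with one line per strain (KN, WT) and against strain (KN, WT) with one line per treatment (T-, T+); $\beta_1$, $\beta_2$ and $\beta_3$ are marked]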
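2 x 2 Factorial Experiments: $\beta_3 < 0$
[Figure: the four cell means $\beta_0$, $\beta_0 + \beta_1$, $\beta_0 + \beta_2$ and $\beta_0 + \beta_1 + \beta_2 + \beta_3$, plotted against treatment (T-, T+) with one line per strain (KN, WT) and against strain (KN, WT) with one line per treatment (T-, T+); $\beta_1$, $\beta_2$ and $\beta_3$ are marked]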
2 x 2 Factorial Experiments
$H_0: \beta_3 = 0$
- effect of strain is independent of treatment, or
- effect of treatment is independent of strain, or
- strain and treatment are additive
$H_A: \beta_3 \ne 0$
- treatment interacts with strain
- treatment modifies the effect of strain
- strain modifies the effect of treatment
- treatment and strain are nonadditive

$H_0: \beta_1 = \beta_3 = 0$
- strain is not associated with expression Y
$H_A: \beta_1 \ne 0$ or $\beta_3 \ne 0$
- strain is associated with expression Y
- strain is associated with expression Y for either T- or T+

$H_0: \beta_2 = \beta_3 = 0$
- treatment is not associated with expression Y
$H_A: \beta_2 \ne 0$ or $\beta_3 \ne 0$
- treatment is associated with expression Y
- treatment is associated with expression Y for either KN or WT
F.E. Harrell, Jr. (2001) Regression Modeling Strategies, Springer
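The two joint hypotheses can be tested by comparing the full model with a reduced model that drops one factor entirely. The Python sketch below computes the corresponding F statistics; it is an illustration under the normal-error model, with hypothetical function names and made-up data. Under $H_0: \beta_1 = \beta_3 = 0$ the reduced model contains only the treatment, so its fitted values are the two treatment-group means (and analogously for $H_0: \beta_2 = \beta_3 = 0$). Computing a p-value would additionally require the F distribution with 2 and n - 4 degrees of freedom, which is not included here.

```python
from statistics import mean


def rss(groups):
    """Residual sum of squares when each group is fitted by its own mean."""
    total = 0.0
    for g in groups:
        g_mean = mean(g)
        total += sum((y - g_mean) ** 2 for y in g)
    return total


def f_statistic_joint(cells, factor):
    """F statistic for dropping one factor and the interaction.

    factor = "strain":    H0: beta_1 = beta_3 = 0
    factor = "treatment": H0: beta_2 = beta_3 = 0
    cells: dict mapping (x1, x2) = (strain, treatment) to lists of values.
    """
    n = sum(len(values) for values in cells.values())
    rss_full = rss(cells.values())        # full model fits the 4 cell means
    keep = 1 if factor == "strain" else 0  # key index of the factor that stays
    pooled = {}
    for key, values in cells.items():
        pooled.setdefault(key[keep], []).extend(values)
    rss_reduced = rss(pooled.values())    # reduced model fits 2 group means
    return ((rss_reduced - rss_full) / 2) / (rss_full / (n - 4))


# Hypothetical log-expression values, two replicates per cell.
cells = {
    (0, 0): [8.0, 8.2],    # KN, T-
    (0, 1): [9.0, 9.2],    # KN, T+
    (1, 0): [8.5, 8.7],    # WT, T-
    (1, 1): [10.5, 10.7],  # WT, T+
}
f_strain = f_statistic_joint(cells, "strain")
f_treatment = f_statistic_joint(cells, "treatment")
```

Large values of the statistic speak against the respective null hypothesis; in the toy data both factors are clearly associated with expression.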
2 x 2 Factorial Experiments -- Treatment effect
[Figure: expression (axis range 7 to 13) for T-, T+ and KN, WT]
2 x 2 Factorial Experiments -- Strain effect
[Figure: expression (axis range 7 to 12) for T-, T+ and KN, WT]
2 x 2 Factorial Experiments -- Strain effect
[Figure: expression (axis range 6.0 to 7.0) for T-, T+ and KN, WT]
2 x 2 Factorial Experiments -- Interaction effect
[Figure: expression (axis range 8.0 to 9.0) for T-, T+ and KN, WT]