Stats 300C: Theory of Statistics, Spring 2018
Lecture 8, April 18, 2018
Prof. Emmanuel Candes
Scribe: Emmanuel Candes

Outline

Agenda: Multiple Testing Problems
1. Empirical Process Viewpoint of BHq
2. Empirical Process Viewpoint of FDR Control
3. Improving on BHq?

The material for this lecture is taken from Storey, Siegmund and Taylor (2004) [1].

1 The Empirical Process Viewpoint of BH(q)

In the previous lecture, we introduced the BHq step-up procedure by relating it to Simes. We plot the sorted p-values and the critical line as shown in Figure 1(a) and look for the first time (going from large to small p-values) a p-value falls below the critical line. An equivalent representation can be obtained by swapping the x- and y-axes, as in Figure 1(b). The latter is best suited to describing the BH procedure in terms of an empirical process. In particular, the coordinates on the y-axis of Figure 1(b) are the values of the empirical CDF $\hat F_n(t)$ of the p-values, which is defined as
\[ \hat F_n(t) = \frac{\#\{i : p_i \le t\}}{n}. \]
Assuming that the p-values and the hypotheses are ordered according to $p_{(1)} \le \dots \le p_{(n)}$ and $H_{(1)}, \dots, H_{(n)}$, we defined BH(q) as rejecting $H_{(1)}, \dots, H_{(i_0)}$, where
\[ i_0 = \max\{i : p_{(i)} \le q\, i/n\}. \]
Using the definition of the empirical CDF, the critical p-value $p^* = p_{(i_0)}$ can be rewritten as
\[ p^* = \max\{p_{(i)} : p_{(i)} \le q\, i/n\} = \max\{p_{(i)} : p_{(i)} \le q\, \hat F_n(p_{(i)})\}, \]
with the convention that $p^* = q/n$ if the above set is empty.

[Figure 1: Sorted p-values and BH(q) threshold line. (a) P-values on the y-axis, indices on the x-axis. (b) P-values on the x-axis, indices on the y-axis.]

Hence,
\[ p^* = \max\{t \in \{p_1, \dots, p_n\} : t \le q\, \hat F_n(t)\}, \]
with the same convention as above if the set is empty. Now set
\[ \tau_{BH} = \max\left\{t : \frac{t}{\hat F_n(t) \vee 1/n} \le q\right\} \qquad (1) \]
and note that $\tau_{BH} \ge q/n$. By construction, the BH procedure rejects all hypotheses with $p_i \le \tau_{BH}$. That is, when $k$ rejections are made, all hypotheses with p-values less than or equal to $qk/n$ are rejected.

The BH procedure can be justified with the following simple argument due to [1]. Take a fixed value of $t \in (0, 1)$ and consider the rule that rejects $H_{0,i}$ if and only if $p_i \le t$. The entries in the table of outcomes (shown below) will depend on the value of $t$.

                 $H_0$ accepted   $H_0$ rejected   Total
    $H_0$ true   $U(t)$           $V(t)$           $n_0$
    $H_0$ false  $T(t)$           $S(t)$           $n - n_0$
    Total        $n - R(t)$       $R(t)$           $n$

The false discovery proportion and false discovery rate, defined as before, will also depend on $t$:
\[ \mathrm{Fdp}(t) = \frac{V(t)}{\max(R(t), 1)}. \]
The threshold $t$ should then be chosen as large as possible, while controlling the FDR at level $q$. Observe that by definition, we have
\[ \mathrm{FDR}(t) = E[\mathrm{Fdp}(t)] = E\left[\frac{V(t)}{\max(R(t), 1)}\right]. \]
Estimates of the FDR process can be inverted to give an FDR-controlling thresholding procedure. If we have an estimate $\widehat{\mathrm{FDR}}(t)$ of $\mathrm{FDR}(t)$, then we can take the threshold
\[ \tau = \sup\{t \le 1 : \widehat{\mathrm{FDR}}(t) \le q\}. \]
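Returning to the threshold (1), the claim that rejecting all hypotheses with p-values at or below $\tau_{BH}$ reproduces the step-up rule can be checked numerically. The following is a minimal sketch in plain Python, not code from the lecture: the function names `bh_step_up` and `bh_threshold` and the example p-values are hypothetical. It relies on the observation that, on each flat piece of the empirical CDF, the constraint in (1) reads $t \le qk/n$, so the maximum is attained on the grid $\{qk/n : k = 1, \dots, n\}$.

```python
from bisect import bisect_right


def bh_step_up(pvals, q):
    """Return i0 = max{i : p_(i) <= q*i/n}, or 0 if no index qualifies."""
    p = sorted(pvals)
    n = len(p)
    i0 = 0
    for i in range(1, n + 1):
        if p[i - 1] <= q * i / n:
            i0 = i  # keep the largest qualifying index (step-up)
    return i0


def bh_threshold(pvals, q):
    """Return tau_BH = max{t : t / max(F_n(t), 1/n) <= q}, searched on the grid q*k/n."""
    p = sorted(pvals)
    n = len(p)
    k_star = 1  # t = q/n always satisfies the constraint, so tau_BH >= q/n
    for k in range(2, n + 1):
        count = bisect_right(p, q * k / n)  # n * F_n(q*k/n)
        if count >= k:  # same as q*k/n <= q * F_n(q*k/n), compared on integers
            k_star = k
    return q * k_star / n


# Hypothetical example: the second p-value lies above the critical line,
# yet it is rejected because later p-values fall below the line.
pvals = [0.01, 0.05, 0.055, 0.07, 0.9]
q = 0.1
i0 = bh_step_up(pvals, q)
tau = bh_threshold(pvals, q)
rejected = [p for p in pvals if p <= tau]
print(i0, len(rejected))
```

In this example both descriptions give the same rejections: the number of p-values at or below `tau` equals `i0`. The comparison `count >= k` is done on integers to avoid floating-point ties in the inequality $t \le q \hat F_n(t)$.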
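The threshold $\tau$ defined above is the largest (most liberal) thresholding cut-off such that our estimated FDR is controlled. Note that, since we choose $\tau$ by looking at all the p-values, this is a data-dependent thresholding procedure.

How can we estimate $\mathrm{FDR}(t)$? Recall that
\[ \mathrm{FDR}(t) = E\left[\frac{V(t)}{\max(R(t), 1)}\right]. \]
The number of rejections $R(t)$ is known. The number of false rejections $V(t)$ is not known. A solution starts by noting that $E\,V(t) = n_0 t$. Still, $n_0$ is not known. However, one can use the conservative estimate $n t$, obtained by replacing $n_0$ with $n$. This leads to
\[ \widehat{\mathrm{FDR}}(t) = \frac{n t}{\max(R(t), 1)}. \]
This choice leads us to the BH(q) procedure since
\[ \tau_{BH} = \sup\left\{t \le 1 : \frac{n t}{\max(R(t), 1)} \le q\right\} = \sup\left\{t \le 1 : \frac{t}{\hat F_n(t) \vee 1/n} \le q\right\}. \qquad (2) \]
[1] proved that this FDR estimate is biased upward.

Theorem 1. Under independence, $E[\widehat{\mathrm{FDR}}(t)] \ge \mathrm{FDR}(t)$.

A good exercise would be to verify this inequality.

2 Martingale Proof of FDR (BH(q))

The estimate of FDR can also be inverted to yield FDR control, giving another proof of the result of Benjamini and Hochberg [1].

Theorem 2 (BH (1995)). $\mathrm{FDR}(\tau_{BH}) = E[\mathrm{Fdp}(\tau_{BH})] = q\, n_0/n$; i.e., the procedure rejecting all hypotheses with $p_i \le \tau_{BH}$ controls the FDR.

Proof. We use martingale theory.

1. Filtration: $\mathcal{F}_t = \sigma\{V(s), R(s) : t \le s \le 1\}$.

2. $\tau_{BH}$ is a stopping time with respect to the backward filtration $\{\mathcal{F}_t\}$ since $\{\tau_{BH} \ge t\} \in \mathcal{F}_t$. Indeed, knowledge of $R(s) = n \hat F_n(s)$ for $s \ge t$ determines whether $\tau_{BH} \ge t$ or not.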
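3. $V(t)/t$ is a martingale running backward in time: for $s \ge t$,
\[ E\left[\frac{V(t)}{t} \,\Big|\, \mathcal{F}_s\right] = \frac{1}{t} E[V(t) \mid \mathcal{F}_s] = \frac{1}{t}\, V(s)\, \frac{t}{s} = \frac{V(s)}{s}, \]
where we used that, conditional on $\mathcal{F}_s$, $V(s) = \#\{p_i^0 : p_i^0 \le s\}$ and these null p-values $p_i^0$ are independent and uniformly distributed on $[0, s]$.

4. Optional Stopping Time Theorem: since equality holds in (1) at the maximizer,
\[ \max(R(\tau_{BH}), 1) = n \max(\hat F_n(\tau_{BH}), 1/n) = n\, \tau_{BH}/q. \]
Therefore,
\[ \mathrm{FDR}(\tau_{BH}) = E\left(\frac{V(\tau_{BH})}{\max(R(\tau_{BH}), 1)}\right) = \frac{q}{n}\, E\left(\frac{V(\tau_{BH})}{\tau_{BH}}\right) = \frac{q}{n}\, E\,V(1) = q\, \frac{n_0}{n}, \]
where the third equality is optional stopping applied to the backward martingale $V(t)/t$, and $V(1) = n_0$.

3 Improving on BHq?

Consider the interpretation of BHq introduced in Section 1. Can the distribution of the p-values be used to improve on the simple conservative estimate of $\pi_0 = n_0/n$? Fix $\lambda \in [0, 1)$ and define
\[ \hat\pi_0^\lambda = \frac{n - R(\lambda)}{(1 - \lambda)\, n}. \]
Based on this estimate of $\pi_0 = n_0/n$, one obtains the following estimate of the FDR:
\[ \widehat{\mathrm{FDR}}_\lambda(t) = \frac{\hat\pi_0^\lambda\, n t}{\max(R(t), 1)}. \qquad (3) \]
For $\lambda = 0$, BHq is recovered. For a general $\lambda$, writing $n_1 = n - n_0$,
\[ \hat\pi_0^\lambda = \frac{n_0 - V(\lambda) + n_1 - S(\lambda)}{(1 - \lambda)\, n} \ge \frac{n_0 - V(\lambda)}{(1 - \lambda)\, n}, \]
and hence
\[ E\left[\hat\pi_0^\lambda\right] \ge \frac{n_0}{n} = \pi_0. \]
The idea is that if the non-null p-values are small, then $n_1 - S(\lambda) \approx 0$ and $\hat\pi_0^\lambda$ gives an accurate estimate of $\pi_0$. The goal would be to show that the threshold
\[ \tau = \sup\left\{t \le 1 : \widehat{\mathrm{FDR}}_\lambda(t) = \frac{\hat\pi_0^\lambda\, n t}{\max(R(t), 1)} \le q\right\} \]
provides FDR control. While this may actually not be the case, [1] proves such a result for a modified version of (3).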
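The estimates in this section are straightforward to compute. The following is a minimal sketch in plain Python, under stated assumptions: the function names `rejections`, `pi0_hat` and `fdr_hat` and the example p-values are hypothetical and not part of the lecture, and the code implements the unmodified estimate (3), not the modified version analyzed in [1].

```python
def rejections(pvals, t):
    """R(t): number of p-values at or below the threshold t."""
    return sum(1 for p in pvals if p <= t)


def pi0_hat(pvals, lam):
    """Estimate of pi_0 = n_0/n: (n - R(lam)) / ((1 - lam) * n), for lam in [0, 1)."""
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    n = len(pvals)
    return (n - rejections(pvals, lam)) / ((1.0 - lam) * n)


def fdr_hat(pvals, t, lam=0.0):
    """Estimate (3): pi0_hat * n * t / max(R(t), 1)."""
    n = len(pvals)
    return pi0_hat(pvals, lam) * n * t / max(rejections(pvals, t), 1)


# Hypothetical p-values: four small ones and six spread over (0, 1).
pvals = [0.001, 0.002, 0.01, 0.03, 0.2, 0.4, 0.6, 0.7, 0.8, 0.95]

print(pi0_hat(pvals, 0.0))        # 1.0: with lam = 0 (and no p-value equal to 0) the BHq estimate is recovered
print(pi0_hat(pvals, 0.5))        # (10 - 6) / (0.5 * 10) = 0.8
print(fdr_hat(pvals, 0.03, 0.0))  # BHq estimate n*t / max(R(t), 1) at t = 0.03
print(fdr_hat(pvals, 0.03, 0.5))  # same estimate scaled by pi0_hat = 0.8
```

With $\lambda = 0.5$ the estimated FDR at a given threshold is smaller than the BHq estimate by the factor $\hat\pi_0^\lambda$, which is why inverting (3) gives a more liberal threshold whenever $\hat\pi_0^\lambda < 1$.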
References

[1] Storey, J., Taylor, J., & Siegmund, D. (2004). Strong Control, Conservative Point Estimation and Simultaneous Conservative Consistency of False Discovery Rates: A Unified Approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 66(1), 187-205.