Web-based Supplementary Materials for Controlling False Discoveries in Multidimensional Directional Decisions, with


Web-based Supplementary Materials for Controlling False Discoveries in Multidimensional Directional Decisions, with Applications to Gene Expression Data on Ordered Categories

Wenge Guo
Biostatistics Branch, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, U.S.A.
email: wenge.guo@gmail.com

and

Sanat K. Sarkar
Department of Statistics, Temple University, Philadelphia, PA 19122, U.S.A.
email: sanat@temple.edu

and

Shyamal D. Peddada
Biostatistics Branch, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, U.S.A.
email: peddada@niehs.nih.gov

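Web Appendix A: Proof of Theorem 1

We begin by calculating the pure directional FDR (dFDR), $E\{S/(R \vee 1)\}$. Let $I_1 = \{1 \le j \le m : \delta_j \ne 0\}$ be the set of indices of false null hypotheses, then $S$ can be expressed as

\[
S = \sum_{j \in I_1} I\left( \bigcup_{i=1}^{q} \left\{ P_{ij} \le \frac{R}{qm}\alpha,\ T_{ij}\delta_{ij} \le 0 \right\} \right),
\]

where $I(\cdot)$ is the indicator function. Thus

\[
\begin{aligned}
\mathrm{dFDR} &= E\left\{ \frac{S}{R \vee 1} \right\} = E\left\{ \frac{E(S \mid R)}{R \vee 1} \right\} \\
&= E\left\{ \frac{1}{R \vee 1} \sum_{j \in I_1} P\left( \bigcup_{i=1}^{q} \left\{ P_{ij} \le \frac{R}{qm}\alpha,\ T_{ij}\delta_{ij} \le 0 \right\} \,\Big|\, R \right) \right\} \\
&= \sum_{r=1}^{m} \sum_{j \in I_1} \frac{1}{r}\, P\left\{ \bigcup_{i=1}^{q} \left\{ P_{ij} \le \frac{r}{qm}\alpha,\ T_{ij}\delta_{ij} \le 0 \right\},\ R = r \right\} \\
&\le \sum_{r=1}^{m} \sum_{j \in I_1} \sum_{i=1}^{q} \frac{1}{r}\, P\left\{ P_{ij} \le \frac{r}{qm}\alpha,\ T_{ij}\delta_{ij} \le 0,\ R = r \right\}. 
\end{aligned}
\tag{A.1}
\]

The inequality follows from the Bonferroni inequality.

For any given $i$ and $j$, without loss of generality, we assume $\delta_{ij} \ge 0$. When $\delta_{ij} > 0$, we have

\[
\begin{aligned}
P\left\{ P_{ij} \le \frac{r}{qm}\alpha,\ T_{ij}\delta_{ij} \le 0,\ R = r \right\}
&= P\left\{ P_{ij} \le \frac{r}{qm}\alpha,\ T_{ij} \le 0,\ R = r \right\} \\
&\le P\left\{ F_{ij}(T_{ij}, 0) \le \frac{r}{2qm}\alpha,\ R = r \right\} \\
&= P\left\{ T_{ij} \le F_{ij}^{-1}\left( \frac{r}{2qm}\alpha, 0 \right),\ R = r \right\},
\end{aligned}
\tag{A.2}
\]

where $F_{ij}^{-1}(\cdot, 0)$ is the inverse function of $F_{ij}(\cdot, 0)$. The inequality in the above calculations follows from the definition of $P_{ij}$ and the assumption $F_{ij}(0, 0) = \frac{1}{2}$. Noting that $T_j = (T_{1j}, \ldots, T_{qj})$, $j = 1, \ldots, m$, are independent of each other, the last probability in (A.2) can be simplified to

\[
\begin{aligned}
P\left\{ T_{ij} \le F_{ij}^{-1}\left( \frac{r}{2qm}\alpha, 0 \right) \right\} P\left( R^{(-j)} = r - 1 \right)
&= F_{ij}\left( F_{ij}^{-1}\left( \frac{r}{2qm}\alpha, 0 \right), \delta_{ij} \right) P\left( R^{(-j)} = r - 1 \right) \\
&\le F_{ij}\left( F_{ij}^{-1}\left( \frac{r}{2qm}\alpha, 0 \right), 0 \right) P\left( R^{(-j)} = r - 1 \right) \\
&= \frac{r}{2qm}\alpha\, P\left( R^{(-j)} = r - 1 \right),
\end{aligned}
\tag{A.3}
\]

where $R^{(-j)}$ denotes the number of rejections in the stepup procedure with critical constants $\alpha_k = \frac{k+1}{m}\alpha$, $k = 1, \ldots, m - 1$, based on $\{P_1, \ldots, P_m\} \setminus \{P_j\}$. The above inequality follows from the assumption that $F_{ij}(\cdot, \delta_{ij})$ is stochastically increasing in $\delta_{ij} \ge 0$.

Similarly, when $\delta_{ij} = 0$, we have

\[
\begin{aligned}
P\left\{ P_{ij} \le \frac{r}{qm}\alpha,\ T_{ij}\delta_{ij} \le 0,\ R = r \right\}
&= P\left\{ P_{ij} \le \frac{r}{qm}\alpha,\ R = r \right\} \\
&= P\left\{ P_{ij} \le \frac{r}{qm}\alpha,\ R^{(-j)} = r - 1 \right\} \\
&\le \frac{r}{qm}\alpha\, P\left( R^{(-j)} = r - 1 \right).
\end{aligned}
\tag{A.4}
\]

The last inequality follows from the fact that the two-sided p-value $P_{ij}$ satisfies the condition (2) when $\delta_{ij} = 0$. Using (A.2)-(A.4) in (A.1), we have

\[
\mathrm{dFDR} \le \sum_{r=1}^{m} \sum_{j \in I_1} \sum_{i=1}^{q} \frac{\alpha}{qm}\, P\left( R^{(-j)} = r - 1 \right) = \frac{m_1}{m}\alpha.
\tag{A.5}
\]

Noting that the pooled p-values $P_j$, $j = 1, \ldots, m$, satisfy the condition (2), then for independent p-values $P_j$'s, the usual FDR of the $q$-dimensional directional BH procedure satisfies the following inequality,

\[
\mathrm{FDR} \le \frac{m_0}{m}\alpha;
\tag{A.6}
\]

see Benjamini and Hochberg (1995), Benjamini and Yekutieli (2001) or Sarkar (2002). Combining (A.5) and (A.6), we have

\[
\mathrm{mdFDR} = \mathrm{FDR} + \mathrm{dFDR} \le \frac{m_0}{m}\alpha + \frac{m_1}{m}\alpha = \alpha,
\tag{A.7}
\]

and hence the proof is complete.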
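The two thresholds that drive the proof, $R\alpha/m$ for the pooled p-values and $R\alpha/(qm)$ for the component p-values, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the pooled p-value is the Bonferroni-type quantity $\min(1, q \min_i P_{ij})$ and that a direction is declared from the sign of $T_{ij}$; the function names `bh_num_rejections` and `directional_bh` and the array layout (rows are genes, index $j$ in the proof; columns are dimensions, index $i$) are hypothetical.

```python
import numpy as np

def bh_num_rejections(pooled, alpha):
    """Number of rejections R of the BH step-up procedure at level alpha."""
    m = len(pooled)
    sorted_p = np.sort(pooled)
    # Largest k with p_(k) <= k * alpha / m (0 if there is none).
    below = np.nonzero(sorted_p <= alpha * np.arange(1, m + 1) / m)[0]
    return 0 if below.size == 0 else int(below[-1]) + 1

def directional_bh(p, t, alpha=0.05):
    """Two-stage directional decisions (illustrative sketch).

    p, t: arrays of shape (m, q) with two-sided p-values and test statistics.
    Returns (rejected, direction): rejected has shape (m,), direction has
    shape (m, q) with entries -1, 0, +1 (0 means no direction declared).
    """
    p = np.asarray(p, dtype=float)
    t = np.asarray(t, dtype=float)
    m, q = p.shape
    # Assumption: Bonferroni-type pooling of the q component p-values.
    pooled = np.minimum(1.0, q * p.min(axis=1))
    r = bh_num_rejections(pooled, alpha)
    rejected = pooled <= r * alpha / m            # first-stage BH rejections
    component_cut = r * alpha / (q * m)           # threshold R * alpha / (q * m)
    decide = rejected[:, None] & (p <= component_cut)
    direction = np.where(decide, np.sign(t), 0).astype(int)
    return rejected, direction
```

With four genes and two dimensions, for example, `directional_bh` rejects exactly the genes whose pooled p-value is at most $R\alpha/m$ and declares a direction only for the components that also pass the $R\alpha/(qm)$ cut.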

Web Appendix B: Some Additional Simulation Results

In addition to evaluating the performance of Procedure 1, we also evaluated the performance of Procedure 2 in the same simulation study. Web Figure 7 presents the simulated FDR, dFDR and mdFDR, and Web Figure 8 presents the average power, of Procedure 2 plotted against the number of false null hypotheses for $m = 1000$, $q = 5$, $\alpha = 0.05$ and $\rho = 0, 0.2, 0.5$ and $0.8$. Comparing Figure 1 with Web Figure 7 and Web Figure 1 with Web Figure 8, we find that both procedures, Procedure 1 and Procedure 2, perform similarly.

We also used a simulation study to evaluate the performance of Procedure 2 under dependence within genes. We generated $m$ independently distributed $(q + 1)$-dimensional random normal vectors $Z_1, \ldots, Z_m$, where the components $Z_{ij}$, $j = 1, \ldots, q + 1$, in each $Z_i$, with $Z_{ij} \sim N(\mu_{ij}, 1)$, are dependent with compound symmetry structure or autoregressive order one structure (AR(1)), respectively, and have a correlation parameter $\rho$. Let $\delta_{ij} = (\mu_{i,j+1} - \mu_{ij})/\sqrt{2}$, $i = 1, \ldots, m$; $j = 1, \ldots, q$. Out of the $m$ parameter vectors $\delta_i = (\delta_{i1}, \ldots, \delta_{iq})$, $i = 1, \ldots, m$, $m_0$ were set to a null vector each, and all the $\delta_{ij}$'s in 50%, 25% and 25% of the remaining $m - m_0$ $\delta_i$'s were selected randomly from the intervals $(-0.75, 0.75)$, $(-4.25, -2.75)$ and $(2.75, 4.25)$ respectively. For each $i = 1, \ldots, m$ and $j = 1, \ldots, q$, the statistic $T_{ij} = (Z_{i,j+1} - Z_{ij})/\sqrt{2}$ for testing $H_{0i}^{j}: \delta_{ij} = 0$ vs. $H_{1i}^{j}: \delta_{ij} \ne 0$ and the corresponding two-sided p-value $P_{ij} = 2\{1 - \Phi(|T_{ij}|)\}$ were then computed, where $\Phi(\cdot)$ is the standard normal cdf. The pooled p-values were calculated according to the Simes test for each $i = 1, \ldots, m$, and Procedure 2 was applied to the resulting list of pooled p-values for testing the $m$ null hypotheses described in (1). Similar to the above simulation study, the simulated values of the FDR, dFDR and mdFDR were obtained by repeating the simulation steps 10,000 times.
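One replicate of the data-generating step described above could be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the first mean $\mu_{i1}$ is set to 0 (the text only fixes the differences $\mu_{i,j+1} - \mu_{ij}$), the Simes pooled p-value is taken to be $\min_k (q/k) P_{i(k)}$ over the ordered component p-values, and the function names `correlation_matrix`, `simes_pvalue` and `simulate_gene_pvalues` are hypothetical.

```python
import math
import numpy as np

def correlation_matrix(dim, rho, structure):
    """Compound-symmetry ("cs") or AR(1) ("ar1") correlation matrix."""
    if structure == "cs":
        return (1.0 - rho) * np.eye(dim) + rho * np.ones((dim, dim))
    if structure == "ar1":
        idx = np.arange(dim)
        return float(rho) ** np.abs(idx[:, None] - idx[None, :])
    raise ValueError("structure must be 'cs' or 'ar1'")

def simes_pvalue(p):
    """Simes pooled p-value: min over k of (q / k) * p_(k)."""
    q = len(p)
    return float(np.min(q * np.sort(p) / np.arange(1, q + 1)))

def simulate_gene_pvalues(delta, rho, structure, rng):
    """Draw Z, then return (T, P, pooled) for an (m, q) array delta."""
    delta = np.asarray(delta, dtype=float)
    m, q = delta.shape
    # Assumption: mu_{i1} = 0; later means follow from
    # mu_{i,j+1} = mu_{ij} + sqrt(2) * delta_{ij}.
    mu = np.hstack([np.zeros((m, 1)),
                    math.sqrt(2.0) * np.cumsum(delta, axis=1)])
    cov = correlation_matrix(q + 1, rho, structure)
    z = mu + rng.multivariate_normal(np.zeros(q + 1), cov, size=m)
    t = (z[:, 1:] - z[:, :-1]) / math.sqrt(2.0)
    # 2 * {1 - Phi(|t|)} equals erfc(|t| / sqrt(2)).
    p = np.vectorize(math.erfc)(np.abs(t) / math.sqrt(2.0))
    pooled = np.array([simes_pvalue(row) for row in p])
    return t, p, pooled
```

A call such as `simulate_gene_pvalues(delta, 0.2, "ar1", np.random.default_rng(1))` returns the statistics, the two-sided p-values and one pooled p-value per gene; repeating it many times gives the replicates over which the error rates are averaged.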
Web Figures 9 and 11 provide the simulated FDR, dFDR and mdFDR of Procedure 2 plotted against the number of false null hypotheses for $m = 1000$, $q = 5$, $\alpha = 0.05$ and $\rho = 0, 0.1, 0.2$ and $0.3$ under dependence within genes according to compound symmetry structure and AR(1) structure, respectively. The average power of Procedure 2 under the above dependence structures is provided in Web Figures 10 and 12, respectively. As seen from Web Figures 9 and 11, the simulated mdFDR of Procedure 2 is severely affected by dependence within genes, but it is still below the pre-specified level. In addition, as seen from Web Figures 10 and 12, there is no monotone relationship between the average power of Procedure 2 and the correlation parameter $\rho$.

[Figure 1 about here.]
[Figure 2 about here.]
[Figure 3 about here.]
[Figure 4 about here.]
[Figure 5 about here.]
[Figure 6 about here.]
[Figure 7 about here.]
[Figure 8 about here.]
[Figure 9 about here.]
[Figure 10 about here.]
[Figure 11 about here.]
[Figure 12 about here.]
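The simulated error rates discussed above are averages over replicates of per-replicate proportions. A minimal Python sketch of one such proportion calculation follows. It assumes the definitions used in Web Appendix A: $V$ counts rejected true nulls, $S$ counts rejected false nulls with at least one declared direction for which the declared sign times $\delta_{ij}$ is at most 0, and the three proportions are $V/(R \vee 1)$, $S/(R \vee 1)$ and their sum. The function name `error_proportions` and the input layout (rows are genes, columns are dimensions) are hypothetical.

```python
import numpy as np

def error_proportions(rejected, direction, delta):
    """Per-replicate (FDP, dFDP, mdFDP) for one set of decisions.

    rejected:  boolean array of shape (m,), first-stage rejections.
    direction: integer array of shape (m, q) with entries -1, 0, +1.
    delta:     true parameter array of shape (m, q).
    """
    rejected = np.asarray(rejected, dtype=bool)
    direction = np.asarray(direction)
    delta = np.asarray(delta, dtype=float)
    false_null = np.any(delta != 0, axis=1)
    r = int(rejected.sum())
    # V: true null hypotheses that were rejected.
    v = int(np.sum(rejected & ~false_null))
    # Directional error: a direction was declared and sign * delta <= 0.
    wrong = (direction != 0) & (direction * delta <= 0)
    # S: rejected false nulls with at least one directional error.
    s = int(np.sum(rejected & false_null & wrong.any(axis=1)))
    denom = max(r, 1)
    return v / denom, s / denom, (v + s) / denom
```

Averaging the three returned values over the simulation replicates gives Monte Carlo estimates of the FDR, dFDR and mdFDR, respectively.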

LIST OF FIGURES

1 Power of Procedure 1 under dependence across genes for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.2, 0.5 and 0.8.

2 Performance of Procedure 1 under dependence across genes in terms of its control of the FDR (solid), dFDR (dashed) and mdFDR (dotted) for m = 1000, q = 5, α = 0.05, ρ = 0, 0.2, 0.5 and 0.8, and δ = (100, 0, ..., 0).

3 Standard deviation of the mdFDR of Procedure 1 under dependence across genes for m = 1000, q = 5, α = 0.05, ρ = 0, 0.2, 0.5 and 0.8, and δ = (100, 0, ..., 0).

4 A numerical comparison of Procedures 1 and 2 and the no-adjustment procedure in terms of the control of the FDR, dFDR and mdFDR and also power for m = 1200, q = 4, ρ = 0, and α = 0.05.

5 Performance of Procedure 1 with respect to the dimension q in terms of its control of the FDR, dFDR and mdFDR for m = 1000, $m_0 = 600$, α = 0.05 and ρ = 0.

6 Power performance of Procedure 1 with respect to the dimension q for m = 1000, $m_0 = 600$, α = 0.05 and ρ = 0 when testing $H_{0i}: \delta_i = 0$ vs. $H_{1i}: \delta_i \ne 0$, where $i = 1, \ldots, 1000$, $\delta_i = (\delta_{i1}, \ldots, \delta_{iq})$ and all the $\delta_{ij}$'s in 200, 100 and 100 of the 400 non-null $\delta_i$'s were selected randomly from the intervals (−0.75, 0.75), (−4.25, −2.75) and (2.75, 4.25) respectively.

7 Performance of Procedure 2 under dependence across genes in terms of its control of the FDR, dFDR and mdFDR for m = 1000, q = 5, α = 0.05, and ρ = 0, 0.2, 0.5 and 0.8.

8 Power of Procedure 2 under dependence across genes for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.2, 0.5 and 0.8.

9 Performance of Procedure 2 under dependence within genes with compound symmetry structure in terms of its control of the FDR, dFDR and mdFDR for m = 1000, q = 5, α = 0.05, and ρ = 0, 0.1, 0.2 and 0.3.

10 Power of Procedure 2 under dependence within genes with compound symmetry structure for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.1, 0.2 and 0.3.

11 Performance of Procedure 2 under dependence within genes with AR(1) structure in terms of its control of the FDR, dFDR and mdFDR for m = 1000, q = 5, α = 0.05, and ρ = 0, 0.1, 0.2 and 0.3.

12 Power of Procedure 2 under dependence within genes with AR(1) structure for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.1, 0.2 and 0.3.

[Plot not reproduced. Y-axis: Average Power. Legend: rho = 0, rho = 0.2, rho = 0.5, rho = 0.8.]
Figure 1. Power of Procedure 1 under dependence across genes for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.2, 0.5 and 0.8.

[Plot not reproduced. Panels: rho = 0, rho = 0.2, rho = 0.5, rho = 0.8.]
Figure 2. Performance of Procedure 1 under dependence across genes in terms of its control of the FDR (solid), dFDR (dashed) and mdFDR (dotted) for m = 1000, q = 5, α = 0.05, ρ = 0, 0.2, 0.5 and 0.8, and δ = (100, 0, ..., 0).

[Plot not reproduced. Panels: rho = 0, rho = 0.2, rho = 0.5, rho = 0.8. Y-axis: Standard deviation.]
Figure 3. Standard deviation of the mdFDR of Procedure 1 under dependence across genes for m = 1000, q = 5, α = 0.05, ρ = 0, 0.2, 0.5 and 0.8, and δ = (100, 0, ..., 0).

[Plot not reproduced. Panels: Simes adjustment, Bonferroni adjustment, No adjustment, Power comparison (y-axis: Average Power; legend: SA, BA, NA).]
Figure 4. A numerical comparison of Procedures 1 and 2 and the no-adjustment procedure in terms of the control of the FDR, dFDR and mdFDR and also power for m = 1200, q = 4, ρ = 0, and α = 0.05.

[Plot not reproduced. X-axis: dimension (q).]
Figure 5. Performance of Procedure 1 with respect to the dimension q in terms of its control of the FDR, dFDR and mdFDR for m = 1000, $m_0 = 600$, α = 0.05 and ρ = 0.

[Plot not reproduced. X-axis: dimension (q). Y-axis: Average Power.]
Figure 6. Power performance of Procedure 1 with respect to the dimension q for m = 1000, $m_0 = 600$, α = 0.05 and ρ = 0 when testing $H_{0i}: \delta_i = 0$ vs. $H_{1i}: \delta_i \ne 0$, where $i = 1, \ldots, 1000$, $\delta_i = (\delta_{i1}, \ldots, \delta_{iq})$ and all the $\delta_{ij}$'s in 200, 100 and 100 of the 400 non-null $\delta_i$'s were selected randomly from the intervals (−0.75, 0.75), (−4.25, −2.75) and (2.75, 4.25) respectively.

[Plot not reproduced. Panels: rho = 0, rho = 0.2, rho = 0.5, rho = 0.8.]
Figure 7. Performance of Procedure 2 under dependence across genes in terms of its control of the FDR, dFDR and mdFDR for m = 1000, q = 5, α = 0.05, and ρ = 0, 0.2, 0.5 and 0.8.

[Plot not reproduced. Y-axis: Average Power. Legend: rho = 0, rho = 0.2, rho = 0.5, rho = 0.8.]
Figure 8. Power of Procedure 2 under dependence across genes for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.2, 0.5 and 0.8.

[Plot not reproduced. Panels: rho = 0, rho = 0.1, rho = 0.2, rho = 0.3.]
Figure 9. Performance of Procedure 2 under dependence within genes with compound symmetry structure in terms of its control of the FDR, dFDR and mdFDR for m = 1000, q = 5, α = 0.05, and ρ = 0, 0.1, 0.2 and 0.3.

[Plot not reproduced. Y-axis: Average Power. Legend: rho = 0, rho = 0.1, rho = 0.2, rho = 0.3.]
Figure 10. Power of Procedure 2 under dependence within genes with compound symmetry structure for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.1, 0.2 and 0.3.

[Plot not reproduced. Panels: rho = 0, rho = 0.1, rho = 0.2, rho = 0.3.]
Figure 11. Performance of Procedure 2 under dependence within genes with AR(1) structure in terms of its control of the FDR, dFDR and mdFDR for m = 1000, q = 5, α = 0.05, and ρ = 0, 0.1, 0.2 and 0.3.

[Plot not reproduced. Y-axis: Average Power. Legend: rho = 0, rho = 0.1, rho = 0.2, rho = 0.3.]
Figure 12. Power of Procedure 2 under dependence within genes with AR(1) structure for m = 1000, q = 5, α = 0.05 and ρ = 0, 0.1, 0.2 and 0.3.