Tel Aviv University
The Raymond and Beverly Sackler Faculty of Exact Sciences
Department of Statistics and Operations Research, School of Mathematical Sciences

Improved Multiple Test Procedures for Discrete Distributions: New Ideas and Analytical Review

Thesis submitted in partial fulfillment of graduation requirements for the degree of M.Sc. in Applied Statistics

By Roee Gutman

Prepared under the supervision of Prof. Yosef Hochberg

September

Acknowledgment

Thanks to Prof. Yosef Hochberg of the Raymond and Beverly Sackler Faculty of Exact Sciences, Department of Statistics and Operations Research, School of Mathematical Sciences, Tel Aviv University, for his long-lasting mentorship and for the supportive and original ideas with which he guided me through the world of statistics. Special thanks to Prof. Yoav Benjamini and to Dr. Felix Abramovich, of the same department, for their willingness to help with this project. Their meritorious comments and advice are deeply appreciated.

Abstract

Hypothesis testing is one of the basic problems in statistics. The solution usually used for this problem is the likelihood ratio test (LRT). The LRT has several well-known advantages; however, when the significance level is predefined and the power is to be maximized, it is not always the most powerful test. Furthermore, when discrete distributions are involved, other methods should be considered as well. This study explores some of the options for single hypothesis testing when the underlying distribution is discrete. The power issue in multiple hypothesis testing is of utmost importance: procedures dedicated to multiple comparisons with discrete test statistics enable a gain in power. We review and compare most of the existing non-randomized multiple hypothesis testing procedures for discrete distributions. In addition, three new procedures are proposed: TWW_k, stepwise TWW_k, and an expansion of Paroush's (1969) integer programming (IP) procedure to multiple tests. All procedures, the new and the existing ones, are compared to each other by mathematical analysis. Where the mathematical analysis did not identify a universally most powerful test, a simulation analysis was performed using several test cases in which procedures for multiple hypotheses with discrete distributions should be used. This work makes it clear that discrete procedures should be used in multiple hypothesis testing in order to gain power, and it couples the most appropriate discrete procedures with specific test cases.

Contents:
1. Introduction.
2. New thoughts about discrete single hypothesis testing.
3. A comparative review of old and new ideas in multiple testing of discrete distributions.
   3.1 Single step methods.
   3.2 A newly proposed single step method.
   3.3 Comparison between single step multiple hypothesis methods.
   3.4 Stepwise methods.
   3.5 A newly proposed stepwise method.
   3.6 Comparison between stepwise multiple hypothesis methods.
   3.7 Global hypothesis testing.
   3.8 A newly proposed global hypothesis test.
4. Applications of the multiple testing procedures.
   4.1 Example 1: cDNA transcripts.
   4.2 Example 2: Animal carcinogenicity test.
   4.3 Example 3: Efficacy of a respiratory therapy.
   4.4 Example 4: Relationship between DVT and 3 genetic factors.
5. Applications of the multiple testing procedures to simulated data sets.
   5.1 Independent test statistics.
   5.2 Independent trend simulation.
   5.3 The dependent case of multinomial sampling.
   5.4 The dependent case of the extended multinomial hyper-geometric distribution.
6. Discussion.
7. References.
8. Appendices.

Introduction:

Hypothesis testing is one of the most thoroughly explored problems in statistics. In testing a hypothesis, one wishes to decide, based on observations X, whether or not a hypothesis that has been formulated prior to observing X is correct. The choice is dichotomized between accepting or rejecting the hypothesis. The procedure that solves such a problem is called a test of the hypothesis in question. A non-randomized test procedure assigns to each possible value x ∈ Range(X) one of the decisions: either accept or reject H_0. The values may then be classified into two regions, S_0 and S_1. If x falls into S_0 the hypothesis H_0 is accepted, otherwise it is rejected. This thesis deals exclusively with non-randomized tests.

When performing a test, one may arrive at the correct decision, or may commit one of two errors: rejecting the hypothesis when it is true, or accepting it when it is false. It is desirable to carry out a test in a manner that keeps the probabilities of the two types of errors to a minimum. Unfortunately, for a given number of observations, it is impossible to minimize both probabilities simultaneously. It is customary, therefore, to assign a bound α to the probability of incorrectly rejecting H_0 when it is true, and to attempt to minimize the other probability subject to this condition. Neyman & Pearson proposed the likelihood ratio test (LRT). This test has a number of desired properties: it is easy to apply, it leads to definite and reasonable conclusions, and it possesses various pleasant large sample properties. In view of these properties, the test seems to be universally satisfactory. There are, however, scenarios in which the LRT is unsatisfactory, and may even be useless. This can be demonstrated by the following example given by E. L. Lehmann (1950), in which X takes one of five values:

x                 -2        -1                   0               1                    2
Null hypothesis   α/2       1/2 - α              α               1/2 - α              α/2
Alternatives      (1-p)c    (1-c)(1/2-α)/(1-α)   (1-c)α/(1-α)    (1-c)(1/2-α)/(1-α)   pc

α and c are constants, 0 < α ≤ 1/2, α/(2-α) < c < α, and p ranges over the interval [0, 1]. It is desired to test the null hypothesis at significance level α. The LRT rejects H_0 when x = +2 or x = -2, hence its power against each alternative is c. Since c < α, this test is, literally, worse than useless, because a test with power α can be obtained without observing X at all, simply by the use of random numbers. The test is significantly improved if instead it rejects H_0 when x = 0, thereby acquiring the power (1-c)α/(1-α) > α, so that a reasonable test for the hypothesis does exist.

This kind of example made it necessary to improve the existing methods for discrete distributions in a way that does not rely on the LRT. In the following sections some existing procedures are reviewed and some original new multiple comparison methods are presented. A power comparison between the new and the existing methods is performed and utilized in a critical review of all new and existing methods. Section two raises some thoughts about single-test methods for discrete distributions. Section three presents some of the existing multiple comparison methods and suggests new ones; in addition, a mathematical comparison between the presented methods is carried out. Section four displays several test cases where discrete methods for multiple comparisons should be considered. Section five includes a power comparison, using simulation analysis, between the methods for which the mathematical analysis was inconclusive. Section six summarizes the results and makes suggestions about the methods that should be used for discrete distributions.

2. New Thoughts about Discrete Single Hypothesis:

Discrete single testing involves a finite sample space with N outcomes (sample points). To each point two numbers are attached, P_i and Q_i, which are the probabilities of point i under the two alternative hypotheses H_0 and H_1 (P_i, Q_i ≥ 0; Σ_{i=1..N} P_i = Σ_{i=1..N} Q_i = 1). The testing also includes a test statistic t and a decision rule, such as t ≥ t(α), for accepting or rejecting H_0. The problem at this stage is to construct a non-randomized most powerful test of H_0 against H_1, where the probability of rejecting H_0 when it is true is at most α. A simple solution to this problem is the likelihood ratio test (LRT), which can be found in many elementary textbooks (e.g. Lehmann 1986, Ferguson 1967). The points are ranked according to R_i = Q_i/P_i; a point is then assigned to the rejection region if and only if its rank (i) is smaller than, or equal to, a given constant c, where c is the maximal number such that Σ_{i=1..c} P_[i] ≤ α (P_[1], ..., P_[N] being ordered by decreasing R_i). The Neyman-Pearson basic lemma states that the LRT of H_0 against H_1 with a given level of significance is more powerful than any other test with the same or smaller level of significance (for a given number C, the LRT is the uniformly most powerful among all tests of level Σ_{i=1..C} P_[i]). However, the lemma does not imply that the LRT is the most powerful test when the significance level is predefined. This point is illustrated by the following example:

Table 1: Sample points ranked by R_i (columns: sample point, P_i, Q_i, R_i, with column totals).

From this table, the points within the rejection region can easily be derived. The differences between the LRT and the non-randomized most powerful test are depicted in the next table.

Size of the test    Points in the LRT rejection region    Points in the non-randomized most powerful test
5%                  1, 2, 3                               1, 2, 4
10%                 1, 2, 3, 4                            1, 4, 5

The power of the non-randomized most powerful test is 3% higher than that of the LRT when α = 5%, and the difference increases to 5% when α = 10%. The difference between these two tests stems from the fact that α is selected a-priori and that the sample space is discrete with a finite number of points. To obtain the most powerful test one has to compare no more than 2^N potential rejection regions; however, this method is inefficient and cumbersome. Paroush (1969) devised a method that uses integer programming (IP) to solve this problem. He transformed the statistical test into a linear programming form:

max Σ_i Q_i X_i   subject to   Σ_i P_i X_i ≤ α   and   X_i ∈ {0, 1} for all i = 1, ..., N,

where P_i is the probability of the value associated with X_i under the null hypothesis and Q_i is the probability of that value under the alternative. Sample point k is in the rejection region of the test if and only if the associated X_k is set to 1 in the optimal solution.

Example: H_0: Y ~ B(10, 0.5), H_1: Y ~ B(10, 0.3), with the probability of a type I error bounded by α = 0.05. Using the Neyman-Pearson LRT, the points included in the rejection region are {0, 1}; the power of this test is 0.149 and its probability of committing a type I error is 0.011. Using IP, the rejection region consists of the points {0, 2, 10}; the power of the test is 0.262 and its type I error is 0.046. The latter result improves the power of the test while nearly exhausting the level α. However, it contains an illogical point (point 10): we would expect that a sample result of 10 implies that the probability of success is higher than or equal to 0.5, not smaller as claimed by H_1. This outcome occurs because IP does not take the direction of the rejection region into account and tries to exhaust α completely. In order to make more sense of the results received from IP, one may change the target function, minimizing a function of the two types of errors instead of minimizing only the type II error (maximizing the power). This can be done by maximizing Σ_i [Q_i X_i + (1 - X_i) P_i]. Maximizing this function minimizes both types of error while controlling the type I error at a level not larger than α; the rejection region then includes only the points {0, 2}, a result that seems more logical.
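The binomial example above is small enough to check by exhaustive search. The following sketch is only an illustration (Paroush's formulation would normally be passed to an integer programming solver rather than enumerated); it searches all candidate rejection regions under both objectives discussed above:

```python
from itertools import product
from math import comb

n, alpha = 10, 0.05
p0 = [comb(n, k) * 0.5**n for k in range(n + 1)]                  # H0: B(10, 0.5)
p1 = [comb(n, k) * 0.3**k * 0.7**(n - k) for k in range(n + 1)]   # H1: B(10, 0.3)

best_power, best_mixed = None, None
for x in product([0, 1], repeat=n + 1):        # x[k] = 1 iff point k is rejected
    size = sum(p0[k] for k in range(n + 1) if x[k])
    if size > alpha:                            # keep the type I error at most alpha
        continue
    region = [k for k in range(n + 1) if x[k]]
    power = sum(p1[k] for k in region)
    mixed = sum(p1[k] if x[k] else p0[k] for k in range(n + 1))   # Q_i*X_i + (1-X_i)*P_i
    if best_power is None or power > best_power[0]:
        best_power = (power, size, region)
    if best_mixed is None or mixed > best_mixed[0]:
        best_mixed = (mixed, size, region)

print("max power   :", best_power)   # rejection region {0, 2, 10}
print("mixed target:", best_mixed)   # rejection region {0, 2}
```

For larger sample spaces the enumeration becomes infeasible and a proper IP solver is needed.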

Another example comparing the LRT to IP is Fisher's tea-drinking lady problem. A British woman claimed to be able to distinguish whether milk or tea was added to the cup first. To test her claim, she was given eight cups of tea, in four of which milk was added first, and she was told that there were four cups of each type. The results can be recorded in the following table:

Lady says \ Truth     Before       After        Sum
Before                X_1          4 - X_1      4
After                 4 - X_2      X_2          4
Sum                   4            4            8

In this problem the alternative hypothesis is a compound hypothesis, whereas the formulation suggested by Paroush requires the exact distribution under the alternative. This can be overcome quite easily by replacing the Q_i by P_i, thus achieving a test with type I error ≤ α while maximizing the size of the test (P{X ∈ S_1} under the null hypothesis, S_1 being the rejection region). Fisher's tea-drinking lady problem can be solved using several types of probabilistic models. One model is the binomial distribution for the total number of successes (X_1 + X_2), the probability that the lady classifies a cup correctly being equal for both groups; under the null hypothesis X_1 + X_2 ~ B(8, 0.5), so that Pr(X_1 + X_2 = k) = C(8, k)/2^8 for k = 0, ..., 8. The test of interest to the researchers was whether the lady has the power to decide if the milk was poured before or after the tea, which leads to a one-sided test. However, it might also be of interest whether the lady could classify the cups in the opposite direction (separate the cups, but into the wrong groups). Under these circumstances the test is two-sided, but with a smaller interest in one side. One can suggest the following rejection region: reject the null hypothesis if X_1 + X_2 falls into the region {0, 1, 8}, when the significance level is 0.05. This gives a stronger probability of rejection on one side of the test, while still keeping the option of rejecting the null hypothesis on an opposite, surprising result. This test controls the type I error below α and attains higher power than the LRT.
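As a small numerical check of the binomial model (an illustration only, not part of the original analysis), the exact null probabilities and the size of the suggested region {0, 1, 8} can be computed directly:

```python
from math import comb

# Null distribution of the total number of correct classifications, B(8, 0.5)
pmf = [comb(8, k) / 2**8 for k in range(9)]

region = [0, 1, 8]                              # the asymmetric region suggested in the text
size = sum(pmf[k] for k in region)
print("Pr(X1+X2 = k):", [round(p, 4) for p in pmf])
print("size of {0, 1, 8}:", round(size, 4))     # 10/256 = 0.0391 <= 0.05
```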

Another way to solve the problem is to use the Fisher exact procedure (Agresti 1990), which relies on the hypergeometric probability. The one-sided rejection region obtained in this case by IP is {0, 4}, and by the LRT it is {0}, when α = 0.05. In this case Pr(X_1 = 0) = Pr(X_1 = 4), and for a two-sided test the rejection regions of IP and the LRT coincide. However, if we change the problem a bit and take 13 cups of each kind instead of 4, keeping the same level of significance, the IP rejection region consists of {0, 1, 2, 4, 12, 13}, attaining a type I error of 0.0498, while the LRT rejection region for a one-sided test consists only of {0, 1, 2, 3}, attaining a type I error of 0.0085, and for a two-sided test it consists of {0, 1, 2, 3, 11, 12, 13}, with a type I error of 0.0091; IP produces a considerably larger rejection region than the LRT. It is of note that IP alone may yield several rejection regions with an equal type I error (e.g., in the last example the region {0, 1, 2, 9, 12, 13} has the same type I error probability). The researcher needs to decide up front which hypothesis is in question. If it is "the lady has no power" (H_0) vs. "the lady has the power" (H_1) to claim the identity of the cups, the researcher should choose the one-sided test, whose rejection region consists of {0, 1, 2, 4, 12, 13}. On the other hand, if the researcher wants to test whether the lady can distinguish between the cups at all, even if the group identity is wrong, the rejection region should consist of {0, 1, 2, 9, 12, 13}. Another disadvantage that may arise when using the IP method for hypothesis testing is the lack of alpha consistency (AC): a hypothesis that is accepted at a given level may be rejected at a lower level. This can be seen in Table 1, where point 2 is included in the rejection region when α = 0.05 but is excluded from it when α = 0.1.
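The hypergeometric calculations behind these regions can be reproduced in the same spirit. The sketch below (illustrative only) tabulates the null distribution of X_1 for the 4-cup and 13-cup designs and the sizes of the regions mentioned above:

```python
from math import comb

def hyper_pmf(n_each):
    """Null pmf of X1 when n_each cups of each kind are served and the lady
    must label exactly n_each of them as 'before' (Fisher exact setting)."""
    total = comb(2 * n_each, n_each)
    return [comb(n_each, k) * comb(n_each, n_each - k) / total
            for k in range(n_each + 1)]

def region_size(pmf, region):
    return sum(pmf[k] for k in region)

pmf4 = hyper_pmf(4)
print("4 cups each, {0, 4}:", round(region_size(pmf4, [0, 4]), 4))            # ~0.0286
print("4 cups each, {0}   :", round(region_size(pmf4, [0]), 4))               # ~0.0143

pmf13 = hyper_pmf(13)
print("13 cups, IP {0,1,2,4,12,13}      :",
      round(region_size(pmf13, [0, 1, 2, 4, 12, 13]), 4))                     # ~0.0498
print("13 cups, LRT one-sided {0,1,2,3} :",
      round(region_size(pmf13, [0, 1, 2, 3]), 4))                             # ~0.0085
print("13 cups, two-sided {0,..,3,11,12,13}:",
      round(region_size(pmf13, [0, 1, 2, 3, 11, 12, 13]), 4))                 # ~0.0091
print("13 cups, alternative IP region {0,1,2,9,12,13}:",
      round(region_size(pmf13, [0, 1, 2, 9, 12, 13]), 4))                     # same size as the IP region
```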

3. A Comparative Review of Old and New Ideas in Multiple Testing of Discrete Distributions

The development and use of procedures for multiple testing with discrete distributions are even more imperative than for single testing, since in multiple testing the gain in power is much more crucial. The next sections elaborate on multiple comparison methods for which the sampling distribution is discrete.

3.1 Single Step Methods

Consider an animal carcinogenicity experiment that includes J + 1 groups, j = 0, 1, ..., J, where group j = 0 is the control group. In each group there are n_j animals. We define n_ji to be the number of animals in group j whose i-th organ (i = 1, ..., I) was available for histopathological examination, and x_ji to be the number of these n_ji animals in which a tumor was discovered in the i-th organ. The purpose of the experiment is to determine whether each of the J experimental groups differs from the control group in the rates of tumor occurrence at one or more of the I sites. This problem is described in the literature as a multiple testing problem. In general, the problem involves a family of hypotheses H_01, ..., H_0N (with alternatives H_11, ..., H_1N). The hypotheses are tested simultaneously and a multiple-level error rate has to be controlled. A valid procedure for this problem maintains strong control of the familywise error rate (FWE) at its nominal level α, i.e., the probability of rejecting at least one true H_0i (i = 1, ..., N) is at most α, no matter which and how many of the H_0i are true (Hochberg and Tamhane, 1987).

A simple way to address the problem is the Bonferroni method, which rejects all hypotheses with p-values less than or equal to α/n. If the underlying distribution is continuous, the p-values are uniformly distributed on [0, 1] under the null hypotheses. For discrete test statistics, however, there exists a smallest attainable p-value α_i* for each hypothesis. Gart, Chu and Tarone (1979) noted that the number of significance tests can be reduced by eliminating those tests for which the smallest attainable p-value is higher than α (α_i* > α). Tarone (1990) improved this idea by noting that even for hypotheses with α/n < α_i* < α, rejection may never be possible. At each of the I sites a significance test can be performed (the sites are indexed by i). For each integer k, define R_k = {H_i : α_i* ≤ α/k} (the set of sites satisfying k·α_i* ≤ α) and m(k) = |R_k|, where α is the nominal significance level and α_i* is the minimum achievable level at site i. Thus m(1) is the number of sites that can be rejected at the nominal level. If m(1) > 1, a correction for multiple comparisons must be considered. Gart et al. (1979) and Mantel (1980) noted that the denominator in the Bonferroni test can be reduced from I to m(1). In many cases the correction factor can be reduced further.

Claim 0: For any integer k < m(1), m(k+1) ≤ m(k), and m[m(1)] ≤ m(1).
This can be seen quite easily, since if H_i ∈ R_{k+1} then H_i ∈ R_k (H_i ∈ R_{k+1} ⟺ α_i* ≤ α/(k+1) ⟹ α_i* ≤ α/k ⟺ H_i ∈ R_k); thus R_{k+1} ⊆ R_k. By the same reasoning, m[m(1)] ≤ m(1).

From Claim 0 it follows that if the correction factor is m(1), there may still exist some H_i with α_i* > α/m(1); such hypotheses can never be rejected, whatever their p-values. By excluding those hypotheses, the correction factor can be reduced further, until we reach the smallest number k such that m(k) ≤ k.

Define K to be the smallest value of k such that m(k) ≤ k. This reduction only has an effect when dealing with discrete data, since with continuous data m(1) = m(2) = ... = m(I) = I, so that K = I and the usual Bonferroni method is applied. The values of K and R_k can be determined using only the information in the marginal totals. Tarone's procedure rejects H_0i if and only if H_0i is contained in R_K and p_i ≤ α/K, where p_i is the observed significance level at site i. From this it follows that Pr(reject at any true site) ≤ Σ_{R_K} Pr(reject at site i) ≤ m(K)·α/K ≤ α. Define α_i as the largest achievable significance level at site i such that α_i ≤ α/K, for i = 1, ..., m(1). Using the above modified Bonferroni, we see that Σ_{R_K} Pr(reject at site i) = Σ_{R_K} α_i < α, except when m(K) = K and α_i = α/K for all i in R_K. When Σ_{R_K} α_i is considerably less than α, Tarone suggested expanding the critical region of each significance test, using marginal information, until the largest possible rejection region is obtained (for instance, by adding the tail outcome of smallest probability not yet included in the rejection region).

Unfortunately, Tarone's procedure (T) lacks AC (Roth 1998). The following example demonstrates the lack of AC in T. Suppose n = 5, α_i* = 0.002, 0.024, 0.029, 0.029, 0.07, and p_1 = p_2 = 0.024. At level α = 0.09, K = 4, so that the critical value is 0.09/4 = 0.0225 and none of the hypotheses is rejected; but at the α = 0.05 level, K = 2 and the critical value is 0.025, so we are able to reject hypotheses 1 and 2.

Roth (1998) developed procedure T*, which modifies T, achieving AC while simultaneously increasing the power; T* maintains strong control of the FWE at level α. The procedure rejects all H_0i such that p_i ≤ α/K*, where M = {x ≥ 1 : m(x) ≤ x} and K* = inf{x ∈ M}. A simple way to construct T* in practice is to arrange the smallest attainable p-values in increasing order, α*_(1) ≤ ... ≤ α*_(n); if m(K) = K then K* = K, and otherwise K* = α/α*_(K). Roth showed that T* has AC and that it is a universal improvement over T. When one tries to redefine this procedure for the FWE, by simply defining R_j = {H_i : α_i* ≤ α/j} and redefining m(j), K, M and K* in terms of the new R_j's, the resulting T* procedure is no longer valid. This can be demonstrated using the previous example with α = 0.058: K = 3 and K* = 0.058/0.029 = 2, but T* rejects {H_0i : p_i ≤ 0.029}. Since T* is based on Bonferroni, and since there are four hypotheses that might be rejected by this rule (i.e. m(K*) = 4 > K*), the FWE can potentially be as large as 0.029·4 = 0.116 > 0.058, so that the validity disappears.
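A small sketch of Tarone's correction factor K and Roth's K*, using the definitions as described above (a minimal illustration, with the comparisons taken as non-strict), reproduces the lack of alpha consistency of T on the minimal attainable p-values of this example: the per-test critical value α/K equals 0.0225 at α = 0.09 but 0.025 at α = 0.05.

```python
def tarone_K(alpha_star, alpha):
    """Tarone's correction factor: the smallest k with m(k) <= k,
    where m(k) = #{i : alpha_star[i] <= alpha/k}."""
    n = len(alpha_star)
    for k in range(1, n + 1):
        if sum(a <= alpha / k for a in alpha_star) <= k:
            return k
    return n

def roth_K_star(alpha_star, alpha):
    """Roth's alpha-consistent correction factor K*."""
    K = tarone_K(alpha_star, alpha)
    m_K = sum(a <= alpha / K for a in alpha_star)
    if m_K == K:
        return K
    return alpha / sorted(alpha_star)[K - 1]        # K* = alpha / alpha*_(K)

# minimum attainable p-values from the example in the text
alpha_star = [0.002, 0.024, 0.029, 0.029, 0.07]
for a in (0.09, 0.05):
    K = tarone_K(alpha_star, a)
    print("alpha =", a, "| K =", K, "| alpha/K =", round(a / K, 4),
          "| K* =", round(roth_K_star(alpha_star, a), 3))
```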

Therefore, the procedure has to be modified and adapted to meet the FWE criterion. The major problem of T* under the FWE criterion arises in those cases where m(K*) > K*. Hommel and Krummenauer (1998) and Roth (1999) solved this obstacle by redefining T* as follows: when m(K*) ≤ K*, reject {H_0i : p_i ≤ α/K*}, and when m(K*) > K*, reject {H_0i : p_i ≤ α/K* and α_i* < α/K*}. Roth also suggested another procedure, named T_k, which rejects {H_0i ∈ R_K : p_i ≤ α/m(K)}. T_k does improve T and is more powerful than it (since m(K) ≤ K), but it still lacks AC. Furthermore, it is more powerful than T* in special cases with m(K) < K, namely when its critical value α/m(K) exceeds T*'s critical value α/K*. Nevertheless, T_k is not universally more powerful than T*, since T* may reject hypotheses that are outside of R_K.

Westfall and Wolfinger (W & W, 1997) suggested a different approach, based on the full set of possible values of each P_i rather than only on the minimum attainable p-value α_i*. They defined the adjusted p-value of hypothesis j as p̃_j = Pr(min_i P_i ≤ p_j), where the P_i are the random p-values under their null hypotheses. This test is widely used in the analysis of toxicology data (Heyse & Rom 1988). The justification for using min(P_i) is that it measures the degree of surprise that an analyst should experience after isolating the smallest p-value from a long list of p-values calculated from a given data set; an additional justification is that the adjusted p-values are always on the same scale. If we define p_i (i = 1, ..., k) as the observed p-values of the given tests, then, the test statistics being discrete, the possible values of the random p-value P_i are {p_it : t = 1, ..., m_i} (m_i is the number of attainable values of the i-th test statistic), where Pr(P_i ≤ p_it) = p_it. The adjusted value p̃_j is the probability of observing a p-value as small as p_j anywhere in the study when all null hypotheses are true. Using the discreteness,

p̃_j = 1 - Π_{i=1..k} (1 - p_it(j)),  where p_it(j) = max_t {p_it : p_it ≤ p_j} if min_t {p_it} ≤ p_j, and p_it(j) = 0 otherwise.

For each hypothesis, the procedure computes its adjusted p-value and compares it to the FWE level α. The procedure assumes independence between the tests, thus making the method rather conservative, although less so than the Bonferroni method. The simplest way to bound the true values of p̃_j is to use the Bonferroni inequality; the discrete Bonferroni adjusted p-values are p̃_j = min{Σ_{i=1..k} p_it(j), 1}.
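A minimal sketch of the single step W & W adjustment under the independence assumption, together with its discrete Bonferroni bound, is given below; the attainable p-value sets used in the demonstration are hypothetical:

```python
def ww_adjusted(attainable, observed):
    """Single-step Westfall-Wolfinger discrete adjusted p-values (sketch).

    attainable[i]: attainable p-values of test i, with Pr(P_i <= p) = p for each value.
    observed[j]:   observed p-value of test j.
    Returns (independence-based adjustments, discrete Bonferroni adjustments)."""
    adj_indep, adj_bonf = [], []
    for p_j in observed:
        # largest attainable value of each P_i not exceeding p_j (0 if none exists)
        caps = [max((p for p in vals if p <= p_j), default=0.0) for vals in attainable]
        prod = 1.0
        for c in caps:
            prod *= (1.0 - c)
        adj_indep.append(1.0 - prod)             # Pr(min_i P_i <= p_j) under independence
        adj_bonf.append(min(sum(caps), 1.0))     # discrete Bonferroni bound
    return adj_indep, adj_bonf

# hypothetical attainable p-value sets for three independent discrete tests
attainable = [[0.01, 0.05, 0.20, 1.0],
              [0.03, 0.12, 0.55, 1.0],
              [0.02, 0.08, 0.40, 1.0]]
observed = [0.05, 0.03, 0.08]
print(ww_adjusted(attainable, observed))
```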

3.2 A Newly Proposed Single Step Method

We propose a new method, TWW_k, that controls the FWE and incorporates the discreteness of the distribution. The method applies the W & W adjustment to the set defined by T_k: TWW_k rejects {H_0i ∈ R_K : p̃_i ≤ α}, where p̃_i = Pr(min{P_j : H_0j ∈ R_K} ≤ p_i). This method controls the FWE:

Pr(reject at least one true H_0i)
= Pr(min_{1≤j≤m(K)} {1 - (1 - P_j)^m(K)} ≤ α)
= 1 - Pr(min_{1≤j≤m(K)} {1 - (1 - P_j)^m(K)} > α)
= 1 - Pr(P_j > 1 - (1 - α)^{1/m(K)} for all true null j)
= 1 - Π_{j=1..m(K)} Pr(P_j > 1 - (1 - α)^{1/m(K)})   (under independence)
≤ 1 - {(1 - α)^{1/m(K)}}^{m(K)} = α   (with equality when P_j ~ U[0, 1]).

This method is tested against the existing methods in the following chapters.
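Combining the two previous sketches gives a minimal illustration of the proposed TWW_k procedure: Tarone's K determines R_K, and the W & W adjustment is then applied within R_K only (independence of the tests is assumed, as in the argument above).

```python
def tarone_K(alpha_star, alpha):
    """Smallest k with m(k) <= k, where m(k) = #{i : alpha_star[i] <= alpha/k}."""
    n = len(alpha_star)
    for k in range(1, n + 1):
        if sum(a <= alpha / k for a in alpha_star) <= k:
            return k
    return n

def tww_k(attainable, observed, alpha):
    """Sketch of the proposed TWW_k procedure for independent discrete tests:
    restrict attention to R_K and apply the W & W adjustment within that subset."""
    alpha_star = [min(vals) for vals in attainable]
    K = tarone_K(alpha_star, alpha)
    R_K = [i for i, a in enumerate(alpha_star) if a <= alpha / K]
    rejected = []
    for i in R_K:
        caps = [max((p for p in attainable[j] if p <= observed[i]), default=0.0)
                for j in R_K]
        surv = 1.0
        for c in caps:
            surv *= (1.0 - c)
        if 1.0 - surv <= alpha:    # adjusted p-value Pr(min over R_K <= p_i)
            rejected.append(i)
    return K, R_K, rejected
```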

3.3 Comparison between Single Step Multiple Hypothesis Procedures

This section is devoted to a comparison between the methods suggested so far. Some of the methods are universally more powerful than others; for other pairs of methods there exist situations in which each one outperforms the other.

Claim 1: T* is universally more powerful than T.
T rejects H_0i when p_i ≤ α/K. T* rejects H_0i when p_i ≤ α/K* if m(K*) ≤ K*, and rejects {H_0i : p_i ≤ α/K* and α_i* < α/K*} if m(K*) > K*.
1) If m(K) = K, then K* = K and T = T*.
2) If m(K) < K, then m(K) ≤ K - 1, so α/K < α*_(K) ≤ α/(K - 1) and K* = α/α*_(K); T* then rejects whenever p_i ≤ α*_(K), while T rejects only when p_i ≤ α/K < α*_(K). Hence T* is universally more powerful than T.

Claim 2: T_k is universally more powerful than T.
T can reject H_0i only when p_i ≤ α/K and H_0i ∈ R_K; T_k rejects H_0i when p_i ≤ α/m(K) and H_0i ∈ R_K. Since m(K) ≤ K, α/K ≤ α/m(K), so T_k is universally more powerful than T.

Claim 3: Neither of T_k and T* is universally more powerful than the other.
A simple example demonstrates this claim. Suppose we have four hypotheses whose minimal attainable p-values are (0.01, , 0.012, 0.015); K in this case equals 4, m(K) = 3, and K* = 0.05/0.015 ≈ 3.33. If the p-values attained under the H_0i are (0.016, 0.016, 0.016, 0.016), then T_k rejects {H_01, H_02, H_03} while T* rejects none of the hypotheses, when α = 0.05. If we change the attained p-values to (0.015, 0.012, 0.01, 0.015), keeping the same minimal attainable p-values and the same significance level, all of the hypotheses are rejected by T*, while T_k still rejects only {H_01, H_02, H_03}.

Claim 4: The Westfall & Wolfinger (W & W) method is universally more powerful than T*.
Consider first the case of independence:
1) The adjusted p-values devised by H & K are p̃_i = min{1, q(p_i)·p_i}, where q = q(p_i) is defined by α*_(q) ≤ p_i < α*_(q+1).
2) The adjusted p-values devised by W & W satisfy p̃_i = 1 - Π_j (1 - p_jt(i)) ≤ 1 - (1 - p_i)^q (q has the same meaning as in 1). The inequality holds because at most q of the P_j can take values smaller than or equal to p_i.
3) If the H & K adjusted p-value equals 1, then since 1 - (1 - p_i)^q < 1, the W & W adjusted p-value is smaller.
4) If the H & K adjusted p-value equals q(p_i)·p_i = q·p_i, then the W & W adjusted p-value is at most 1 - (1 - p_i)^q. By a Taylor expansion, 1 - (1 - p_i)^q = q·p_i - 0.5·q(q - 1)(1 - ξ)^{q-2}·p_i^2 for some ξ between 0 and p_i, which is smaller than q·p_i. Hence the W & W adjusted p-values are smaller than those of H & K, so the W & W method is more powerful than the T* of H & K and Roth.
When the p-values are dependent, W & W uses p̃_i = Σ_{j=1..k} p_jt(i) ≤ q·p_i, again no larger than the H & K value, so one can conclude that W & W is universally more powerful than T*.

Claim 5: TWW_k is universally more powerful than T_k.
This is shown in a similar way to the proof of the previous claim. The adjusted p-value of the T_k method is p̃_i = m(K)·p_i for {i : H_0i ∈ R_K}. The adjusted p-values of TWW_k for {i : H_0i ∈ R_K} are p̃_i ≤ 1 - (1 - p_i)^m(K) under independence and p̃_i = min{Σ_{j ∈ R_K} p_jt(i), 1} otherwise. By the arguments above, p̃_i ≤ m(K)·p_i for all {i : H_0i ∈ R_K}, so TWW_k is universally more powerful than T_k.

Claim 6: None of W & W and TWW_k/T_k is universally more powerful than the others.
This can be demonstrated by the following example. There are hypotheses H_0i, i = 1, ..., 4, whose minimum achievable levels are 0.01, 0.03, 0.05, 0.05 respectively. Suppose the p-values achieved in the experiment were 0.05, 0.05, 0.05, 0.05. Using TWW_k/T_k with α = 0.05, K = 2 and m(K) = 1, so we reject H_01, since its p-value is smaller than or equal to α and H_01 ∈ R_K. Using the W & W method, the adjusted p-value of each hypothesis is 1 - (1 - 0.05)^4 = 0.185, so none of the hypotheses is rejected at the 0.05 level. If we change the p-values achieved in the experiment slightly, to p_1 = 0.01, p_2 = 0.03, p_3 = 0.05, p_4 = 0.05, and assume that the values attainable under H_01 are {0.01, 0.02, 0.04}, the results of TWW_k/T_k do not change and H_01 is still the only rejection; however, the W & W adjusted p-values are 0.01 for H_01 and 0.0494 for H_02, so both H_01 and H_02 are rejected at α = 0.05. Thus none of these procedures is universally more powerful than the other.

3.4 Stepwise Procedures:

Stepwise methods provide a further increase in the power of multiple testing methods. These techniques are not unique to discrete distributions, but they need to be mentioned since they improve the power of the multiple hypothesis tests. The procedure suggested by Westfall and Wolfinger can easily be adapted to stepwise analysis: the p-values are adjusted using the step-down technique, the smallest p-value being adjusted according to the distribution of min(P_i) over all tests, the second smallest according to the min(P_i) distribution of all the variables excluding the one whose unadjusted p-value was smallest, and so on (see the sketch below).
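A minimal sketch of this step-down W & W adjustment, again under the independence assumption, with monotonicity of the adjusted values enforced:

```python
def ww_stepdown(attainable, observed, alpha):
    """Step-down Westfall-Wolfinger sketch for independent discrete tests.

    The smallest observed p-value is adjusted by the distribution of the minimum
    over all tests; the next smallest over the remaining tests, and so on."""
    order = sorted(range(len(observed)), key=lambda i: observed[i])
    adjusted, prev = {}, 0.0
    remaining = list(order)
    for i in order:
        surv = 1.0
        for j in remaining:
            cap = max((p for p in attainable[j] if p <= observed[i]), default=0.0)
            surv *= (1.0 - cap)
        prev = max(prev, 1.0 - surv)       # enforce monotone step-down adjustments
        adjusted[i] = prev
        remaining.remove(i)                # drop the hypothesis just processed
    return adjusted, [i for i in order if adjusted[i] <= alpha]
```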

Hommel and Krummenauer (H & K, 1998) developed another step-down procedure, similar to Holm's (1979) Bonferroni test but incorporating Tarone's discrete methods (T, T*). This procedure is named TH*:
1) Set I = {1, ..., n}.
2) For j = 1, ..., #I define m_I(α, j) = #{i ∈ I : α_i* ≤ α/j}, the number of hypotheses with indices i ∈ I that can be rejected at level α/j; K_I(α) = min{j = 1, ..., #I : m_I(α, j) ≤ j}; and b_I(α) = α/K_I(α).
3) For i ∈ I, reject H_i iff p_i ≤ b_I(γ) for some 0 < γ ≤ α.
4) Let J be the index set of all hypotheses rejected in step 3.
5) If J is empty, stop; otherwise set I = I - J and return to step 2.
For the practical performance of the third step for a specific I = {i_1, ..., i_t}, one can apply the single step methods suggested by Roth (1999) and H & K (1998) described in Section 3.1. Both of the latter procedures use the step-down technique.

Roth (1998) described a step-up procedure R based on Hochberg's procedure (H). Procedure R is composed of two components: procedure L, which is closely related to H, and a component procedure C. R rejects H_i if it is rejected by either L or C.
Procedure L:
1) Accept all the hypotheses that are not in R_1 = {H_i : α_i* ≤ α}.
2) Order the p-values of the hypotheses in R_1 from highest to lowest, p_(1), ..., p_(t).
3) Let Q = {j : p_(j) < α/j, H_(j) ∈ R_1}; define q = min{j ∈ Q}.
4) Reject all H_i ∈ R_1 such that p_i < α/q.
Procedure C:
1) Consider only the {H_i ∈ R_K}; order their p-values from highest to lowest, q_(1), ..., q_(m(K)); if m(K) < K, set q_(i) = 0 for i = m(K), ..., K.
2) For j = 1, ..., K define p*_j = max{{q_(j)} ∪ {p_i : H_i ∈ R_j - R_K}}.
3) Let W = {j : p*_j < α/j}; define w = min{j ∈ W}.
4) Reject H_i if p_i < α/w.
Roth showed that procedure R is valid if H is valid for all subsets of R_1 of size q*, where q* is defined as the larger of m(K) and max({0} ∪ {i = 1, ..., K-1 : M_i is not empty}). The validity of R thus requires weaker assumptions than those required for H: pairwise independence suffices when q* = 2, and this can be extended to independence of subgroups of size q* for q* > 2; similarly, pairwise TP2 (simply, positive correlation) suffices for q* = 2, and it is a conjecture that this notion can be extended to subgroups of size q* when q* > 2. Roth suggested another variation of R (RMOD), which generalizes Rom's (1990) procedure instead of Hochberg's; the arguments for preferring either of the variations (Rom's or Hochberg's procedure) are analogous to the continuous case. Like T and T_k, R and RMOD also lack AC.

3.5 A Newly Proposed Stepwise Method

Using the mechanism described in Section 3.2, one can apply the W & W stepwise method to the p-values of the hypotheses that belong to R_K. This method has properties similar to those of TWW_k (lack of AC; universally more powerful than T_k and T), but it has higher power, since a stepwise method is used rather than a single step one.

3.6 Comparison between Stepwise Multiple Hypothesis Methods

In this section a comparison between stepwise methods is performed. A comparison between single step methods and stepwise methods is not performed, since each single step method has a matching stepwise method that is more powerful. However, there exist situations in which one type of single step method is more powerful than another type of stepwise method.

Claim 7: The stepwise Westfall & Wolfinger (W & W) method is universally more powerful than TH*.
Using the same technique described for the single step procedures, it can be shown at each stage of the stepwise method that the W & W stepwise method is universally more powerful than both the H & K and the Roth T* methods.

Claim 8: None of Roth's R/RMOD method, the W & W stepwise method and stepwise TWW_k is universally more powerful than the others.
This can be seen using the following simple examples. Suppose we have four independent hypotheses whose minimal attainable p-values are (0.03, 0.015, , 0.002), and the p-values attained under the H_0i are (0.06, 0.026, 0.02, 0.023). Using Roth's R/RMOD method, it is clear that the third and fourth hypotheses are rejected when α = 0.05. Suppose that all the hypotheses except the first can attain a p-value of 0.02; then, if stepwise W & W or stepwise TWW_k is used, none of the hypotheses is rejected. If we change the problem a bit and assume that the maximum attainable p-value smaller than 0.02 for the second and fourth hypotheses is 0.015, then the second, third and fourth hypotheses are rejected using either stepwise W & W or stepwise TWW_k, while Roth's R/RMOD method still rejects only the third and fourth hypotheses. A head-on comparison between stepwise W & W and stepwise TWW_k shows that neither method is universally more powerful than the other; this can be demonstrated by examples similar to those given for the single step methods.

Claim 9: None of Roth's R/RMOD method, the W & W stepwise method and T_k/TWW_k is universally more powerful than the others.
In Claim 8 it was demonstrated that Roth's R/RMOD is not universally more powerful than the W & W stepwise method and vice versa. We now demonstrate that Roth's R/RMOD method and T_k/TWW_k do not dominate each other. Suppose we have four independent hypotheses whose minimal attainable p-values are (0.01, , 0.012, 0.015); K in this case equals 4 and m(K) = 3. The p-values attained under the H_0i are (0.06, 0.026, 0.016, 0.023). When α = 0.05, Roth's R/RMOD method rejects none of the hypotheses while T_k/TWW_k rejects {H_03}. If we change the attained p-values under the H_0i to (0.045, 0.026, 0.016, 0.023), keeping the same minimal attainable p-values and the same significance level, Roth's R/RMOD method rejects all four hypotheses while T_k rejects only {H_03}. In order to compare the W & W stepwise method to T_k/TWW_k, we use the same example as in Claim 6: in its first part, single step W & W rejects none of the hypotheses, and since the first step of the step-down method coincides with the single step adjustment of the smallest p-value, stepwise W & W also rejects none of the hypotheses, whereas T_k/TWW_k rejects H_01. In the second part of the example, single step W & W rejects H_01 and H_02; since stepwise W & W is more powerful than W & W, it rejects at least H_01 and H_02, and is thus more powerful than T_k/TWW_k there.

Claim 10: Neither of T_k/TWW_k and TH* is universally more powerful than the other.
Using the same example as in Claim 3, it is easily seen that when the p-values attained under the H_0i are (0.016, 0.016, 0.016, 0.016), T_k/TWW_k rejects {H_01, H_02, H_03} while TH* rejects none of the hypotheses, at significance level 0.05. If we change the attained p-values to (0.015, 0.012, 0.01, 0.015), keeping the same minimal attainable p-values and the same significance level, all of the hypotheses are rejected by TH*, while T_k/TWW_k still rejects only {H_01, H_02, H_03}.

Claim 11: Stepwise TWW_k is not universally more powerful than either TH* or T*.
Using the example described in Claims 3 and 10: since stepwise TWW_k is universally more powerful than T_k/TWW_k, it rejects {H_01, H_02, H_03} in the first part of the example, while TH* and T* reject none of the hypotheses; in the second part of the example stepwise TWW_k still rejects only {H_01, H_02, H_03}, while TH* and T* reject all of the hypotheses.

The results of the statistical power comparisons of all the single step and stepwise methods suggested so far are schematically depicted in Fig. 1.

3.7 Global Hypothesis Testing:

In evaluating the results of several hypothesis tests, the first step is usually to test the global null hypothesis H_0 = ∩_{i=1..m} H_0i. Rejection of the global hypothesis leads to the conclusion that at least one of the individual hypotheses is false, and by using a stepwise method with global hypothesis testing one can reach conclusions on all of the individual hypotheses. Rom (1992) devised a procedure for rejecting the global null hypothesis with discrete distributions. He took advantage of the discreteness of the joint distribution by evaluating the probability

Pr[{P_(1) < p_(1)} or {P_(1) = p_(1), P_(2) < p_(2)} or ... or {P_(1) = p_(1), P_(2) = p_(2), ..., P_(n-1) = p_(n-1), P_(n) ≤ p_(n)}],

where p_(i) are the ordered p-values observed in the tests. This probability is the overall significance of the observed p-values and is compared to α for testing the global null hypothesis. It is smaller than or equal to the bound based on Σ Pr(P_(i) ≤ p_(i)) that is used in the Bonferroni procedure and some of its modifications. The multiple test procedure can also be put in the following simple form: reject the global hypothesis if

{p_(1) < c_1} or {p_(1) = c_1, p_(2) < c_2} or ... or {p_(1) = c_1, p_(2) = c_2, ..., p_(n) ≤ c_n}.

The critical points c_1, ..., c_n can be computed exactly if the underlying distribution is known, or via Monte Carlo. As can easily be seen, Rom actually calculates the exact probability that P_(1) ≤ p_(1) (the min(P_i) probability). In order to reach conclusions on the individual hypotheses, we can use the method suggested by W & W and thus obtain a shortcut of the full closure test.
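Rom's overall significance level can be approximated by simulation when the joint null distribution is not tractable in closed form. The sketch below is a rough Monte Carlo illustration that assumes independent test statistics whose attainable p-values (with the largest value equal to 1) are supplied by the user:

```python
import random

def rom_global_p(attainable, observed, n_sim=100_000, seed=0):
    """Monte Carlo estimate of Rom's (1992) overall significance level.

    attainable[i]: sorted attainable p-values of test i, Pr(P_i <= p) = p for each,
                   with the largest attainable value equal to 1.
    observed:      list of observed p-values (independent tests assumed)."""
    rng = random.Random(seed)
    obs = sorted(observed)
    n, hits = len(obs), 0
    for _ in range(n_sim):
        sim = []
        for vals in attainable:
            u = rng.random()
            # P_i equals the smallest attainable value v with u <= v
            sim.append(next(v for v in vals if u <= v))
        sim.sort()
        for k in range(n):               # lexicographic comparison of ordered vectors
            if sim[k] < obs[k]:
                hits += 1
                break
            if sim[k] > obs[k]:
                break
        else:
            hits += 1                    # all coordinates equal: last term uses <=
    return hits / n_sim
```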

H & K suggested a different method for testing the global hypothesis with discrete distributions. Their method is based on the Rüger test (Rüger 1978): choose an integer s, 1 ≤ s ≤ n, in advance, and reject H_0 iff p_(s) ≤ sα/n. The method applies the same principle as T* to the Rüger test. H & K defined m_s(α, j) = #{i : α_i* ≤ sα/j}, K_s(α) = min{j = s, ..., n : m_s(α, j) ≤ j} and b_s(α) = sα/K_s(α). The rejection rule is: reject H_0 iff p_(s) ≤ b_s(γ) for some 0 < γ ≤ α. The algorithm can be constructed in the following way:
1) Choose r, s ≤ r ≤ n, such that either r = s and 0 < α ≤ (s·α*_(s+1))/s, or s < r < n and ((r-1)·α*_(r))/s < α ≤ (r·α*_(r+1))/s, or r = n and ((n-1)·α*_(n))/s < α ≤ 1.
2) Then K_s(α) = r and b_s(α) = sα/r.
3) Reject H_0 iff p_(s) ≤ sα/r or p_(s) < α*_(r).
This procedure does not provide any means of making decisions on individual hypotheses; one can apply the test within the full closure test (Marcus et al. 1976) to all intersection hypotheses.

3.8 A Newly Proposed Global Hypothesis Test

To test the global null hypothesis, an expansion of Paroush's IP method can be applied. Define the vector P of all joint probabilities P_{i1,...,in} = Pr(i_1 = a_1, ..., i_n = a_n) under the null hypothesis, and similarly Q_{i1,...,in} = Pr(i_1 = a_1, ..., i_n = a_n) under the alternative hypothesis. Define a vector X of length N (N = the size of the joint sample space). The test is

max Σ_i Q_i X_i   subject to   Σ_i P_i X_i ≤ α   and   X_i ∈ {0, 1} for all i = 1, ..., N.

Sample point k is in the rejection region of the test if and only if the associated X_k is set to 1 in the optimal solution. As with the single hypothesis test, if there is no a-priori knowledge of Q and a composite (one-sided/two-sided) hypothesis is being tested, then by defining Q = P a level-α test is achieved. In order to make decisions on individual hypotheses, the full closure test must be applied to all intersection hypotheses. A small shortcut can be made using the following method:
1. Run the procedure on the global null hypothesis; the result is a group of points in the n-dimensional space. Define this group of points as T = {(t_1, ..., t_n) : (t_1, ..., t_n) was chosen by the IP method}.
2. When running the IP procedure on each intersection hypothesis H_0i1 ∩ ... ∩ H_0iK, apply it only to the points X = (x_i1, ..., x_iK) for which there is at least one point t = (t_1, ..., t_n) ∈ T such that x_i1 = t_i1, x_i2 = t_i2, ..., x_iK = t_iK.
When applying the IP algorithm, one may obtain several groups of points that control the same level; only one group of points is chosen for the test, and the question is which group should be used. One approach, when applying this algorithm to a composite hypothesis, is to choose the group with the larger number of points. The reason for this approach is that under the null hypothesis these groups have the same probability of being rejected, whereas under the alternative the larger group might carry a higher probability and thus yield higher power. Such a group can be obtained easily using a small change in the IP objective:

max Σ_i (Q_i + ε)·X_i   subject to   Σ_i P_i X_i ≤ α   and   X_i ∈ {0, 1} for all i = 1, ..., N,

where ε equals the smallest absolute value of the differences between all pairs of Q_i, divided by N. A major disadvantage of this procedure is the great amount of time and computing power needed to carry out the calculation at each step of the closure. The burden increases exponentially as the number of hypotheses increases, and when the hypotheses are dependent it grows much faster still (depending on the computer configuration and the software used) (Fig. 2, Fig. 3).
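The tie-breaking modification can be mimicked in the brute-force setting of Section 2; the sketch below (an illustration only, feasible just for very small sample spaces) maximizes Σ Q_i X_i plus ε times the number of selected points, with ε computed as described above from the smallest nonzero difference between pairs of Q_i:

```python
from itertools import product

def ip_region_eps(p0, p1, alpha):
    """Brute-force version of the epsilon-modified IP objective:
    among regions of size <= alpha, larger regions win ties in power."""
    n = len(p0)
    diffs = [abs(a - b) for i, a in enumerate(p1) for b in p1[i + 1:] if a != b]
    eps = min(diffs) / n if diffs else 0.0
    best = None
    for x in product([0, 1], repeat=n):          # exponential: only for tiny N
        size = sum(p for p, keep in zip(p0, x) if keep)
        if size > alpha:
            continue
        score = sum(q for q, keep in zip(p1, x) if keep) + eps * sum(x)
        if best is None or score > best[0]:
            best = (score, size, [i for i, keep in enumerate(x) if keep])
    return best
```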

4. Applications of the Multiple Testing Procedures

This section includes several experiments in which discrete multiple testing methods should be used. All of the discrete multiple testing methods are applied, excluding IP + closure and the H & K improvement of the Rüger test + closure. IP + closure cannot generally be used because of the great amount of time and computer memory it requires (it was applied only in Example 3). The Rüger test + closure was not applied because of the relatively arbitrary choice of s needed at each of the closure steps. In order to compute the exact distribution of the p-values for the TWW_k and W & W methods in Examples 3 and 4, a permutation method (described by Westfall and Young, 1993) was applied using 1,000 resampled data sets.

4.1 Example 1: cDNA Transcripts

This data set was reported in Tarone (1990). In this experiment, complementary DNA (cDNA) transcripts are produced from transcribed RNA obtained from cells grown under normal conditions and from cells grown under unusual conditions. The cDNA transcripts from a gene of interest are sequenced and compared to a known nucleotide sequence in order to determine the number of nucleotide changes in each transcript. The frequencies of the nucleotide changes are compared between transcripts from the control and the study cells, to determine whether the transcribed RNA in the study cells differs from that in the control cells. The known sequence may be many nucleotides in length, so that a multiple comparison problem has to be addressed. Let N_0i be the number of transcripts in the control group, N_1i the number of transcripts in the study group, X_0i the number of observed nucleotide changes at position i in the control group, and X_1i the corresponding number in the study group. The p-values are calculated using the one-sided Fisher exact test.

[Table: for each of the nine nucleotide positions, the observed proportions X_0i/N_0i and X_1i/N_1i, the minimal attainable p-value, and the attained p-value of the one-sided Fisher exact test.]

The hypotheses rejected by the different methods at the different significance levels are summarized in the table below:

Method              α = 0.01     At the two larger levels
Tarone's T          {}           {H_04}
T*                  {}           {H_04}
TH*                 {}           {H_04}
W & W               {H_04}       {H_04}
Stepwise W & W      {H_04}       {H_04}
T_k                 {}           {H_04}
Roth's R            {}           {H_04}
Roth's RMOD         {}           {H_04}
TWW_k               {H_04}       {H_04}
Stepwise TWW_k      {H_04}       {H_04}

{} indicates that none of the hypotheses was rejected. The newly proposed methods (TWW_k and stepwise TWW_k) are equal or superior to all the others at the 0.01 level of significance.

4.2 Example 2: Animal Carcinogenicity Test

This example was also reported in Tarone (1990); it describes an animal experiment that tested the carcinogenicity of a compound. Several organs and tissues were examined for the presence of tumors. The experiment included three groups, control (0), low dose (1) and high dose (2), with equally spaced doses. The number of observed tumors was recorded for each type group [animal (mouse, rat), gender (male, female), and tumor site]. A trend statistic of the form T_j = X_0j·0 + X_1j·1 + X_2j·2 was defined, where X_ij is the number of observed tumors at dose group i and type group j. Upper-tailed p-values were computed for each type group using Fisher's exact statistics.

[Table: tumor counts X_ji/N_ji in the control, low dose and high dose groups, together with the minimal attainable and attained p-values, for each type group. Male rat: subcutaneous tissue, liver, kidney, thyroid (follicular, C-cell, all), pituitary, pancreatic islets, hematopoietic. Female rat: liver, kidney & renal pelvis, pituitary, thyroid (follicular, C-cell, all), mammary, uterus. Male mouse: lung, liver, kidney, hematopoietic. Female mouse: liver, multiple organs.]

The results are as follows. All ten procedures (Tarone's T, T*, TH*, W & W, stepwise W & W, T_k, Roth's R, Roth's RMOD, TWW_k and stepwise TWW_k) gave the same results: at the smallest significance level considered, each rejects {Male mouse liver, Female mouse liver}, and at the two larger levels each rejects {Male rat kidney, Male mouse liver, Female mouse liver}. None of the methods tested, including the new ones, was more powerful than the others for the hypotheses in this example.

4.3 Example 3: Efficacy of a Respiratory Therapy

The data for this example are from W & W (1997); they are the results of an experiment on the efficacy of a respiratory therapy given by Koch, Carr, Amara, Stokes and Uryniak (1990). The analysis of the rating of respiratory health was based on the multinomial distribution, with each group (placebo, active) regarded as a sample from a multinomial distribution, and the objective was to compare the rating categories of the active and placebo groups.

[Table: number of patients in each response category (Very poor, Poor, Fair, Good, Excellent) and the totals, for the Placebo and Active groups.]

The categories found to differ significantly by the multiple comparison procedures are shown below (Roth's R/RMOD method was not used this time, since the hypotheses are dependent):

Method              Smallest level     At the two larger levels
Tarone's T          {}                 {Very poor}
T*                  {}                 {Very poor}
TH*                 {}                 {Very poor}
W & W               {Very poor}        {Very poor}
Stepwise W & W      {Very poor}        {Very poor}
T_k                 {}                 {Very poor}
TWW_k               {Very poor}        {Very poor}
Stepwise TWW_k      {Very poor}        {Very poor}
IP + Closure        {}                 {}

{} indicates that none of the hypotheses was rejected. Two of the newly proposed methods were equal to or more powerful than all the traditional methods; IP, in this example, was inferior (see the discussion).

4.4 Example 4: Relationship between DVT and 3 Genetic Factors

This example comes from an experiment that tested the relationship between deep vein thrombosis (DVT) and three genetic factors (Factor V, Factor II and MTHFR) (Salomon et al. 1999). The population was divided into healthy controls and subjects with DVT, and each subject was tested for the presence of each of the three genetic factors. The subjects were then classified into one of the eight possible genetic groups (a genetic group is defined by the combination of presence or absence of the three factors). The results of this study were published in Arteriosclerosis, Thrombosis and Vascular Biology (Salomon et al. 1999).

[Table: numbers of healthy controls and DVT patients in each of the eight genetic groups: None, Factor V, Factor II, MTHFR, Factor V + Factor II, Factor V + MTHFR, Factor II + MTHFR, and all three factors.]


More information

Adaptive Designs: Why, How and When?

Adaptive Designs: Why, How and When? Adaptive Designs: Why, How and When? Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj ISBS Conference Shanghai, July 2008 1 Adaptive designs:

More information

Non-specific filtering and control of false positives

Non-specific filtering and control of false positives Non-specific filtering and control of false positives Richard Bourgon 16 June 2009 bourgon@ebi.ac.uk EBI is an outstation of the European Molecular Biology Laboratory Outline Multiple testing I: overview

More information

Lecture Testing Hypotheses: The Neyman-Pearson Paradigm

Lecture Testing Hypotheses: The Neyman-Pearson Paradigm Math 408 - Mathematical Statistics Lecture 29-30. Testing Hypotheses: The Neyman-Pearson Paradigm April 12-15, 2013 Konstantin Zuev (USC) Math 408, Lecture 29-30 April 12-15, 2013 1 / 12 Agenda Example:

More information

Stepwise Gatekeeping Procedures in Clinical Trial Applications

Stepwise Gatekeeping Procedures in Clinical Trial Applications 984 Biometrical Journal 48 (2006) 6, 984 991 DOI: 10.1002/bimj.200610274 Stepwise Gatekeeping Procedures in Clinical Trial Applications Alex Dmitrienko *,1, Ajit C. Tamhane 2, Xin Wang 2, and Xun Chen

More information

A note on tree gatekeeping procedures in clinical trials

A note on tree gatekeeping procedures in clinical trials STATISTICS IN MEDICINE Statist. Med. 2008; 06:1 6 [Version: 2002/09/18 v1.11] A note on tree gatekeeping procedures in clinical trials Alex Dmitrienko 1, Ajit C. Tamhane 2, Lingyun Liu 2, Brian L. Wiens

More information

On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses

On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses Gavin Lynch Catchpoint Systems, Inc., 228 Park Ave S 28080 New York, NY 10003, U.S.A. Wenge Guo Department of Mathematical

More information

Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals. Satoshi AOKI and Akimichi TAKEMURA

Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals. Satoshi AOKI and Akimichi TAKEMURA Minimal basis for connected Markov chain over 3 3 K contingency tables with fixed two-dimensional marginals Satoshi AOKI and Akimichi TAKEMURA Graduate School of Information Science and Technology University

More information

Summary of Chapters 7-9

Summary of Chapters 7-9 Summary of Chapters 7-9 Chapter 7. Interval Estimation 7.2. Confidence Intervals for Difference of Two Means Let X 1,, X n and Y 1, Y 2,, Y m be two independent random samples of sizes n and m from two

More information

Modified Simes Critical Values Under Positive Dependence

Modified Simes Critical Values Under Positive Dependence Modified Simes Critical Values Under Positive Dependence Gengqian Cai, Sanat K. Sarkar Clinical Pharmacology Statistics & Programming, BDS, GlaxoSmithKline Statistics Department, Temple University, Philadelphia

More information

Hypothesis Testing. ECE 3530 Spring Antonio Paiva

Hypothesis Testing. ECE 3530 Spring Antonio Paiva Hypothesis Testing ECE 3530 Spring 2010 Antonio Paiva What is hypothesis testing? A statistical hypothesis is an assertion or conjecture concerning one or more populations. To prove that a hypothesis is

More information

Control of Directional Errors in Fixed Sequence Multiple Testing

Control of Directional Errors in Fixed Sequence Multiple Testing Control of Directional Errors in Fixed Sequence Multiple Testing Anjana Grandhi Department of Mathematical Sciences New Jersey Institute of Technology Newark, NJ 07102-1982 Wenge Guo Department of Mathematical

More information

Mathematical Statistics

Mathematical Statistics Mathematical Statistics MAS 713 Chapter 8 Previous lecture: 1 Bayesian Inference 2 Decision theory 3 Bayesian Vs. Frequentist 4 Loss functions 5 Conjugate priors Any questions? Mathematical Statistics

More information

Adaptive, graph based multiple testing procedures and a uniform improvement of Bonferroni type tests.

Adaptive, graph based multiple testing procedures and a uniform improvement of Bonferroni type tests. 1/35 Adaptive, graph based multiple testing procedures and a uniform improvement of Bonferroni type tests. Martin Posch Center for Medical Statistics, Informatics and Intelligent Systems Medical University

More information

exp{ (x i) 2 i=1 n i=1 (x i a) 2 (x i ) 2 = exp{ i=1 n i=1 n 2ax i a 2 i=1

exp{ (x i) 2 i=1 n i=1 (x i a) 2 (x i ) 2 = exp{ i=1 n i=1 n 2ax i a 2 i=1 4 Hypothesis testing 4. Simple hypotheses A computer tries to distinguish between two sources of signals. Both sources emit independent signals with normally distributed intensity, the signals of the first

More information

Family-wise Error Rate Control in QTL Mapping and Gene Ontology Graphs

Family-wise Error Rate Control in QTL Mapping and Gene Ontology Graphs Family-wise Error Rate Control in QTL Mapping and Gene Ontology Graphs with Remarks on Family Selection Dissertation Defense April 5, 204 Contents Dissertation Defense Introduction 2 FWER Control within

More information

The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80

The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80 The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80 71. Decide in each case whether the hypothesis is simple

More information

Glossary for the Triola Statistics Series

Glossary for the Triola Statistics Series Glossary for the Triola Statistics Series Absolute deviation The measure of variation equal to the sum of the deviations of each value from the mean, divided by the number of values Acceptance sampling

More information

Basic counting techniques. Periklis A. Papakonstantinou Rutgers Business School

Basic counting techniques. Periklis A. Papakonstantinou Rutgers Business School Basic counting techniques Periklis A. Papakonstantinou Rutgers Business School i LECTURE NOTES IN Elementary counting methods Periklis A. Papakonstantinou MSIS, Rutgers Business School ALL RIGHTS RESERVED

More information

Consonance and the Closure Method in Multiple Testing. Institute for Empirical Research in Economics University of Zurich

Consonance and the Closure Method in Multiple Testing. Institute for Empirical Research in Economics University of Zurich Institute for Empirical Research in Economics University of Zurich Working Paper Series ISSN 1424-0459 Working Paper No. 446 Consonance and the Closure Method in Multiple Testing Joseph P. Romano, Azeem

More information

Optimal rejection regions for testing multiple binary endpoints in small samples

Optimal rejection regions for testing multiple binary endpoints in small samples Optimal rejection regions for testing multiple binary endpoints in small samples Robin Ristl and Martin Posch Section for Medical Statistics, Center of Medical Statistics, Informatics and Intelligent Systems,

More information

Decision Making Beyond Arrow s Impossibility Theorem, with the Analysis of Effects of Collusion and Mutual Attraction

Decision Making Beyond Arrow s Impossibility Theorem, with the Analysis of Effects of Collusion and Mutual Attraction Decision Making Beyond Arrow s Impossibility Theorem, with the Analysis of Effects of Collusion and Mutual Attraction Hung T. Nguyen New Mexico State University hunguyen@nmsu.edu Olga Kosheleva and Vladik

More information

4 Hypothesis testing. 4.1 Types of hypothesis and types of error 4 HYPOTHESIS TESTING 49

4 Hypothesis testing. 4.1 Types of hypothesis and types of error 4 HYPOTHESIS TESTING 49 4 HYPOTHESIS TESTING 49 4 Hypothesis testing In sections 2 and 3 we considered the problem of estimating a single parameter of interest, θ. In this section we consider the related problem of testing whether

More information

The Impossibility of Certain Types of Carmichael Numbers

The Impossibility of Certain Types of Carmichael Numbers The Impossibility of Certain Types of Carmichael Numbers Thomas Wright Abstract This paper proves that if a Carmichael number is composed of primes p i, then the LCM of the p i 1 s can never be of the

More information

6 Single Sample Methods for a Location Parameter

6 Single Sample Methods for a Location Parameter 6 Single Sample Methods for a Location Parameter If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of central tendency (usually

More information

Exam: high-dimensional data analysis January 20, 2014

Exam: high-dimensional data analysis January 20, 2014 Exam: high-dimensional data analysis January 20, 204 Instructions: - Write clearly. Scribbles will not be deciphered. - Answer each main question not the subquestions on a separate piece of paper. - Finish

More information

Multiple comparisons of slopes of regression lines. Jolanta Wojnar, Wojciech Zieliński

Multiple comparisons of slopes of regression lines. Jolanta Wojnar, Wojciech Zieliński Multiple comparisons of slopes of regression lines Jolanta Wojnar, Wojciech Zieliński Institute of Statistics and Econometrics University of Rzeszów ul Ćwiklińskiej 2, 35-61 Rzeszów e-mail: jwojnar@univrzeszowpl

More information

Sample Size Estimation for Studies of High-Dimensional Data

Sample Size Estimation for Studies of High-Dimensional Data Sample Size Estimation for Studies of High-Dimensional Data James J. Chen, Ph.D. National Center for Toxicological Research Food and Drug Administration June 3, 2009 China Medical University Taichung,

More information

Lecture 2. G. Cowan Lectures on Statistical Data Analysis Lecture 2 page 1

Lecture 2. G. Cowan Lectures on Statistical Data Analysis Lecture 2 page 1 Lecture 2 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information

Reports of the Institute of Biostatistics

Reports of the Institute of Biostatistics Reports of the Institute of Biostatistics No 02 / 2008 Leibniz University of Hannover Natural Sciences Faculty Title: Properties of confidence intervals for the comparison of small binomial proportions

More information

Bipartite Subgraphs of Integer Weighted Graphs

Bipartite Subgraphs of Integer Weighted Graphs Bipartite Subgraphs of Integer Weighted Graphs Noga Alon Eran Halperin February, 00 Abstract For every integer p > 0 let f(p be the minimum possible value of the maximum weight of a cut in an integer weighted

More information

Lesson 1: Successive Differences in Polynomials

Lesson 1: Successive Differences in Polynomials Lesson 1 Lesson 1: Successive Differences in Polynomials Classwork Opening Exercise John noticed patterns in the arrangement of numbers in the table below. 2.4 3.4 4.4 5.4 6.4 5.76 11.56 19.36 29.16 40.96

More information

Introduction 1. STA442/2101 Fall See last slide for copyright information. 1 / 33

Introduction 1. STA442/2101 Fall See last slide for copyright information. 1 / 33 Introduction 1 STA442/2101 Fall 2016 1 See last slide for copyright information. 1 / 33 Background Reading Optional Chapter 1 of Linear models with R Chapter 1 of Davison s Statistical models: Data, and

More information

Closure properties of classes of multiple testing procedures

Closure properties of classes of multiple testing procedures AStA Adv Stat Anal (2018) 102:167 178 https://doi.org/10.1007/s10182-017-0297-0 ORIGINAL PAPER Closure properties of classes of multiple testing procedures Georg Hahn 1 Received: 28 June 2016 / Accepted:

More information

Testing Simple Hypotheses R.L. Wolpert Institute of Statistics and Decision Sciences Duke University, Box Durham, NC 27708, USA

Testing Simple Hypotheses R.L. Wolpert Institute of Statistics and Decision Sciences Duke University, Box Durham, NC 27708, USA Testing Simple Hypotheses R.L. Wolpert Institute of Statistics and Decision Sciences Duke University, Box 90251 Durham, NC 27708, USA Summary: Pre-experimental Frequentist error probabilities do not summarize

More information

PROCEDURES CONTROLLING THE k-fdr USING. BIVARIATE DISTRIBUTIONS OF THE NULL p-values. Sanat K. Sarkar and Wenge Guo

PROCEDURES CONTROLLING THE k-fdr USING. BIVARIATE DISTRIBUTIONS OF THE NULL p-values. Sanat K. Sarkar and Wenge Guo PROCEDURES CONTROLLING THE k-fdr USING BIVARIATE DISTRIBUTIONS OF THE NULL p-values Sanat K. Sarkar and Wenge Guo Temple University and National Institute of Environmental Health Sciences Abstract: Procedures

More information

CHL 5225H Advanced Statistical Methods for Clinical Trials: Multiplicity

CHL 5225H Advanced Statistical Methods for Clinical Trials: Multiplicity CHL 5225H Advanced Statistical Methods for Clinical Trials: Multiplicity Prof. Kevin E. Thorpe Dept. of Public Health Sciences University of Toronto Objectives 1. Be able to distinguish among the various

More information

Multiple Endpoints: A Review and New. Developments. Ajit C. Tamhane. (Joint work with Brent R. Logan) Department of IE/MS and Statistics

Multiple Endpoints: A Review and New. Developments. Ajit C. Tamhane. (Joint work with Brent R. Logan) Department of IE/MS and Statistics 1 Multiple Endpoints: A Review and New Developments Ajit C. Tamhane (Joint work with Brent R. Logan) Department of IE/MS and Statistics Northwestern University Evanston, IL 60208 ajit@iems.northwestern.edu

More information

Significance Testing with Incompletely Randomised Cases Cannot Possibly Work

Significance Testing with Incompletely Randomised Cases Cannot Possibly Work Human Journals Short Communication December 2018 Vol.:11, Issue:2 All rights are reserved by Stephen Gorard FRSA FAcSS Significance Testing with Incompletely Randomised Cases Cannot Possibly Work Keywords:

More information

Dose-response modeling with bivariate binary data under model uncertainty

Dose-response modeling with bivariate binary data under model uncertainty Dose-response modeling with bivariate binary data under model uncertainty Bernhard Klingenberg 1 1 Department of Mathematics and Statistics, Williams College, Williamstown, MA, 01267 and Institute of Statistics,

More information

Notes on statistical tests

Notes on statistical tests Notes on statistical tests Daniel Osherson Princeton University Scott Weinstein University of Pennsylvania September 20, 2005 We attempt to provide simple proofs for some facts that ought to be more widely

More information

QUANTITATIVE TECHNIQUES

QUANTITATIVE TECHNIQUES UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION (For B Com. IV Semester & BBA III Semester) COMPLEMENTARY COURSE QUANTITATIVE TECHNIQUES QUESTION BANK 1. The techniques which provide the decision maker

More information

The optimal discovery procedure: a new approach to simultaneous significance testing

The optimal discovery procedure: a new approach to simultaneous significance testing J. R. Statist. Soc. B (2007) 69, Part 3, pp. 347 368 The optimal discovery procedure: a new approach to simultaneous significance testing John D. Storey University of Washington, Seattle, USA [Received

More information

COMPLETION OF PARTIAL LATIN SQUARES

COMPLETION OF PARTIAL LATIN SQUARES COMPLETION OF PARTIAL LATIN SQUARES Benjamin Andrew Burton Honours Thesis Department of Mathematics The University of Queensland Supervisor: Dr Diane Donovan Submitted in 1996 Author s archive version

More information

Multiple testing: Intro & FWER 1

Multiple testing: Intro & FWER 1 Multiple testing: Intro & FWER 1 Mark van de Wiel mark.vdwiel@vumc.nl Dep of Epidemiology & Biostatistics,VUmc, Amsterdam Dep of Mathematics, VU 1 Some slides courtesy of Jelle Goeman 1 Practical notes

More information

Bipartite Perfect Matching

Bipartite Perfect Matching Bipartite Perfect Matching We are given a bipartite graph G = (U, V, E). U = {u 1, u 2,..., u n }. V = {v 1, v 2,..., v n }. E U V. We are asked if there is a perfect matching. A permutation π of {1, 2,...,

More information

arxiv: v1 [math.st] 31 Mar 2009

arxiv: v1 [math.st] 31 Mar 2009 The Annals of Statistics 2009, Vol. 37, No. 2, 619 629 DOI: 10.1214/07-AOS586 c Institute of Mathematical Statistics, 2009 arxiv:0903.5373v1 [math.st] 31 Mar 2009 AN ADAPTIVE STEP-DOWN PROCEDURE WITH PROVEN

More information

2011 Pearson Education, Inc

2011 Pearson Education, Inc Statistics for Business and Economics Chapter 3 Probability Contents 1. Events, Sample Spaces, and Probability 2. Unions and Intersections 3. Complementary Events 4. The Additive Rule and Mutually Exclusive

More information

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

One-Way ANOVA. Some examples of when ANOVA would be appropriate include: One-Way ANOVA 1. Purpose Analysis of variance (ANOVA) is used when one wishes to determine whether two or more groups (e.g., classes A, B, and C) differ on some outcome of interest (e.g., an achievement

More information

Review Basic Probability Concept

Review Basic Probability Concept Economic Risk and Decision Analysis for Oil and Gas Industry CE81.9008 School of Engineering and Technology Asian Institute of Technology January Semester Presented by Dr. Thitisak Boonpramote Department

More information

Political Science 236 Hypothesis Testing: Review and Bootstrapping

Political Science 236 Hypothesis Testing: Review and Bootstrapping Political Science 236 Hypothesis Testing: Review and Bootstrapping Rocío Titiunik Fall 2007 1 Hypothesis Testing Definition 1.1 Hypothesis. A hypothesis is a statement about a population parameter The

More information

Estimates for probabilities of independent events and infinite series

Estimates for probabilities of independent events and infinite series Estimates for probabilities of independent events and infinite series Jürgen Grahl and Shahar evo September 9, 06 arxiv:609.0894v [math.pr] 8 Sep 06 Abstract This paper deals with finite or infinite sequences

More information

Chapters 10. Hypothesis Testing

Chapters 10. Hypothesis Testing Chapters 10. Hypothesis Testing Some examples of hypothesis testing 1. Toss a coin 100 times and get 62 heads. Is this coin a fair coin? 2. Is the new treatment more effective than the old one? 3. Quality

More information

Permutation Tests. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods

Permutation Tests. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods Permutation Tests Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods The Two-Sample Problem We observe two independent random samples: F z = z 1, z 2,, z n independently of

More information

Lines With Many Points On Both Sides

Lines With Many Points On Both Sides Lines With Many Points On Both Sides Rom Pinchasi Hebrew University of Jerusalem and Massachusetts Institute of Technology September 13, 2002 Abstract Let G be a finite set of points in the plane. A line

More information

Lecture 21: October 19

Lecture 21: October 19 36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 21: October 19 21.1 Likelihood Ratio Test (LRT) To test composite versus composite hypotheses the general method is to use

More information

Data Mining. CS57300 Purdue University. March 22, 2018

Data Mining. CS57300 Purdue University. March 22, 2018 Data Mining CS57300 Purdue University March 22, 2018 1 Hypothesis Testing Select 50% users to see headline A Unlimited Clean Energy: Cold Fusion has Arrived Select 50% users to see headline B Wedding War

More information

Stat 5421 Lecture Notes Fuzzy P-Values and Confidence Intervals Charles J. Geyer March 12, Discreteness versus Hypothesis Tests

Stat 5421 Lecture Notes Fuzzy P-Values and Confidence Intervals Charles J. Geyer March 12, Discreteness versus Hypothesis Tests Stat 5421 Lecture Notes Fuzzy P-Values and Confidence Intervals Charles J. Geyer March 12, 2016 1 Discreteness versus Hypothesis Tests You cannot do an exact level α test for any α when the data are discrete.

More information

Statistical Theory 1

Statistical Theory 1 Statistical Theory 1 Set Theory and Probability Paolo Bautista September 12, 2017 Set Theory We start by defining terms in Set Theory which will be used in the following sections. Definition 1 A set is

More information

Exact and Approximate Stepdown Methods For Multiple Hypothesis Testing

Exact and Approximate Stepdown Methods For Multiple Hypothesis Testing Exact and Approximate Stepdown Methods For Multiple Hypothesis Testing Joseph P. Romano Department of Statistics Stanford University Michael Wolf Department of Economics and Business Universitat Pompeu

More information

Statistical testing. Samantha Kleinberg. October 20, 2009

Statistical testing. Samantha Kleinberg. October 20, 2009 October 20, 2009 Intro to significance testing Significance testing and bioinformatics Gene expression: Frequently have microarray data for some group of subjects with/without the disease. Want to find

More information

STAT 302 Introduction to Probability Learning Outcomes. Textbook: A First Course in Probability by Sheldon Ross, 8 th ed.

STAT 302 Introduction to Probability Learning Outcomes. Textbook: A First Course in Probability by Sheldon Ross, 8 th ed. STAT 302 Introduction to Probability Learning Outcomes Textbook: A First Course in Probability by Sheldon Ross, 8 th ed. Chapter 1: Combinatorial Analysis Demonstrate the ability to solve combinatorial

More information

PERFECTLY secure key agreement has been studied recently

PERFECTLY secure key agreement has been studied recently IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 45, NO. 2, MARCH 1999 499 Unconditionally Secure Key Agreement the Intrinsic Conditional Information Ueli M. Maurer, Senior Member, IEEE, Stefan Wolf Abstract

More information

8 Nominal and Ordinal Logistic Regression

8 Nominal and Ordinal Logistic Regression 8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on

More information

Hypothesis Testing. ) the hypothesis that suggests no change from previous experience

Hypothesis Testing. ) the hypothesis that suggests no change from previous experience Hypothesis Testing Definitions Hypothesis a claim about something Null hypothesis ( H 0 ) the hypothesis that suggests no change from previous experience Alternative hypothesis ( H 1 ) the hypothesis that

More information

Review of Basic Probability

Review of Basic Probability Review of Basic Probability Erik G. Learned-Miller Department of Computer Science University of Massachusetts, Amherst Amherst, MA 01003 September 16, 2009 Abstract This document reviews basic discrete

More information

Hypothesis testing (cont d)

Hypothesis testing (cont d) Hypothesis testing (cont d) Ulrich Heintz Brown University 4/12/2016 Ulrich Heintz - PHYS 1560 Lecture 11 1 Hypothesis testing Is our hypothesis about the fundamental physics correct? We will not be able

More information

Statistical Significance of Ranking Paradoxes

Statistical Significance of Ranking Paradoxes Statistical Significance of Ranking Paradoxes Anna E. Bargagliotti and Raymond N. Greenwell 1 February 28, 2009 1 Anna E. Bargagliotti is an Assistant Professor in the Department of Mathematical Sciences

More information

The Difference in Proportions Test

The Difference in Proportions Test Overview The Difference in Proportions Test Dr Tom Ilvento Department of Food and Resource Economics A Difference of Proportions test is based on large sample only Same strategy as for the mean We calculate

More information

Chapters 10. Hypothesis Testing

Chapters 10. Hypothesis Testing Chapters 10. Hypothesis Testing Some examples of hypothesis testing 1. Toss a coin 100 times and get 62 heads. Is this coin a fair coin? 2. Is the new treatment on blood pressure more effective than the

More information

A variant of Namba Forcing

A variant of Namba Forcing A variant of Namba Forcing Moti Gitik School of Mathematical Sciences Raymond and Beverly Sackler Faculty of Exact Science Tel Aviv University Ramat Aviv 69978, Israel August 0, 009 Abstract Ronald Jensen

More information