EMERGING TECHNOLOGIES


A Remedy for the Overzealous Bonferroni Technique for Multiple Statistical Tests

Zvi Drezner
Steven G. Mihaylo College of Business and Economics, California State University-Fullerton, Fullerton, California USA

Taly Dawn Drezner
Department of Geography, N430 Ross, York University, 4700 Keele Street, Toronto, Ontario M3J 1P3 Canada

Abstract. Ecological research often involves multiple statistical tests. It is common practice to employ the Bonferroni technique or its more advanced sequential variant for such multiple tests. Indeed, Moran (Oikos, 100, 2003, 403) found that 13% of ecological papers apply this technique. The seminal paper by Rice (Evolution, 43, 1989, 223), which introduced this technique to the ecological community, has been cited over […] times to date. However, these techniques are conservative, and some null hypotheses that should be rejected are not. Using order statistics, we find that significant results are correlated even when the data consist of independent events. The Bonferroni methods assume that significant results are independent, which leads to Type II errors when they are applied. We propose a simple approach, which we term the correlated Bonferroni technique, to rectify this shortcoming and reduce the rejection of significant results. Ecologists may be able to confirm the significance of results that they cannot confirm using the original Bonferroni technique. Researchers may revisit their projects and find that significant results were mistakenly ignored. We provide an Excel file (see supplement) that researchers can easily use. We illustrate the correlated Bonferroni technique with an example.

Key words: Bonferroni; correlated events; multiple statistical tests; statistics

Introduction

The vast majority of ecological (and biological) papers incorporate multiple statistical tests and/or analyses throughout the process of hypothesis testing.
Statistically, this appears to introduce Type I error (false positives): when many tests are run and multiple significant results are obtained, the likelihood of obtaining P = 0.05 from among those tests is higher than 5%. Thus, if one runs enough tests, some of the significant results may be spurious. The original Bonferroni technique (BT; Bonferroni 1936, Miller 1981) and the sequential Bonferroni technique (SBT; Holm 1979, Rice 1989) aim to correct this issue, and their use has been generally adopted in many studies, including by one of us (e.g., Drezner 2003, 2008, 2011).

Recently, ecologists in particular have questioned whether such a correction is valid for continued use in ecological research. A wide variety of issues have been raised, including that the SBT is too conservative and that ecological research often relies on multiple tests (Cabin and Mitchell 2000). More analyses lead to a greater number of significant results. However, this in turn drives down the P-value cut-offs under the SBT, eventually reversing all or many of the originally significant test results. Ultimately, this process discourages in-depth and continued analysis. Greater analytical depth should be encouraged rather than discouraged (Moran 2003). Also, there is no consensus on when and how to apply the SBT. Rules for grouping tests for the application of the technique vary even among journal editors (Cabin and Mitchell 2000, Moran 2003).

As Moran (2003) argues, many significant results, even with P-values that are not far below 0.05, may be more telling of ecological conditions than one isolated but very small P-value. He also argues that the number of tests should be considered. For example, if 5 of 10 tests are significant at P = 0.05 but each P-value is not much lower than 0.05, then the BT or SBT will fail to reject any of the null hypotheses. However, the probability that 5 tests of 10 are significant at the P = 0.05 level is only […], which means that some of the results must be significant (Moran 2003).
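The 5-of-10 argument rests on a binomial calculation. A minimal sketch of that calculation follows, assuming the 10 tests are independent and every null hypothesis is true, so that each test is significant with probability 0.05. The function names are hypothetical, and the sketch does not claim to reproduce the exact figure reported by Moran (2003).

```python
from math import comb


def prob_exactly(n_tests, n_significant, alpha):
    """Binomial probability that exactly n_significant of n_tests
    independent tests fall below alpha when all nulls are true."""
    return (comb(n_tests, n_significant)
            * alpha ** n_significant
            * (1.0 - alpha) ** (n_tests - n_significant))


def prob_at_least(n_tests, n_significant, alpha):
    """Probability that n_significant or more tests fall below alpha
    (assumption: independence between tests)."""
    return sum(prob_exactly(n_tests, j, alpha)
               for j in range(n_significant, n_tests + 1))


if __name__ == "__main__":
    print(prob_exactly(10, 5, 0.05))
    print(prob_at_least(10, 5, 0.05))
```

Whether "exactly 5" or "at least 5" is the relevant event is a modelling choice; both are very small for 10 tests at the 0.05 level.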
It is well established (e.g., Cabin and Mitchell 2000, Moran 2003) that the BT and SBT are conservative, leading to Type II errors when they are applied. The BT and SBT assume that order statistics are independent, when in fact they are not. By order statistics theory (Arnold et al. 1992), the largest statistic and the second largest statistic are correlated, and this correlation can be calculated. The BT and SBT assume that the correlation coefficient is zero or nearly so. For example, by a simulation based on the methods in Arnold et al. (1992), for 10 significant results of a normal distribution, the expected correlation between the largest statistic and the second largest is […], even when the data are randomly generated from independent standardized normal distributions!

In this article, we propose the correlated Bonferroni technique (CBT), which accounts for the correlation that exists between ordered data. When it is used sequentially, we term it the sequential correlated Bonferroni technique (SCBT). When a correlation of zero is assumed, the CBT is equivalent to the BT and the SCBT is equivalent to the SBT.

The correlated Bonferroni technique

The CBT and its variant, the SCBT, are designed to incorporate the correlation between sorted events and thus enable us to reject some null hypotheses that are mistakenly not rejected by the BT or SBT. By the BT and SBT, the critical value when k events are more significant than α is approximately α/k. We derive the critical values for a given k and a given correlation coefficient between significant results (ρ) in the Appendix. We provide an easy-to-use Excel spreadsheet (see supplement) in which the critical P-values are calculated for any value of k when the number of significant results and α are entered.

Bulletin of the Ecological Society of America, 97(1)
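The claim that the two largest order statistics are correlated even for independent data can be checked with a short simulation. The sketch below is a simplified setup and an assumption on our part, not necessarily the simulation design used in the article: it draws n independent standard normal values per trial, records the largest and second largest, and computes their Pearson correlation across trials. All function names are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def top_two_correlation(n, trials=20000, seed=0):
    """Estimate the correlation between the largest and second-largest
    of n independent standard normal draws (assumed simulation design)."""
    rng = random.Random(seed)
    largest, second = [], []
    for _ in range(trials):
        sample = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
        largest.append(sample[-1])
        second.append(sample[-2])
    return pearson(largest, second)


if __name__ == "__main__":
    print(top_two_correlation(10))
```

The estimate is clearly positive for n = 10, which is the qualitative point of the order-statistics argument: sorting alone induces dependence between adjacent extremes.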

Table 1. Results for the number of seeds example. Nine significant test results are sorted in column 2. Column 3 gives the SBT critical values. The results from the Excel spreadsheet (see supplement) are shown in column 4. Note that the BT and SBT give the same values in column 3.
Columns: Sorted P-values; Critical values (SBT); Critical values (SCBT)

We define θ(ρ, k) as the critical value for the kth significant result. If the smallest P-value of k significant results is less than θ(ρ, k) (given in Eq. 1 below), the kth null hypothesis can be rejected with significance α.

$$\theta(\rho, k) = \frac{\alpha}{k} + \left(\alpha - \frac{\alpha}{k}\right)\rho^{\lambda(\rho, k)} \tag{1}$$

where λ(ρ, k) is given by Eq. 10 in the Appendix. These critical values for a given ρ and k are embedded in the Excel file (supplement). When ρ = 0 is used, the critical values by (1) are the same as the BT critical values (approximately α/k). The critical values for rejecting the null hypothesis increase as ρ increases.

Let s be the number of significant results. To estimate ρ for a given s, we found the correlation coefficient for 3 ≤ s ≤ 100 by simulation of randomly generated normal variates. By linear regression we found the relationship (significance […])

ρ ≈ [regression expression in s; coefficients not recoverable] (2)

This equation is also embedded in the Excel file (supplement).

An application example

Consider comparing the mean number of seeds of a species across 10 different populations to determine which populations differ significantly from one another. The data consist of 10 columns, each containing the number of seeds on each plant in that population. This sets up a one-way ANOVA.

A Tukey post hoc test is performed on the 10-column ANOVA. Suppose that s = 9 pairs (out of 45) were significant with P-values less than 0.05. A sorted list of these P-values for k = 1, …, 9 is given in Table 1. The third column gives 0.05/k, which are the BT or SBT critical values, and column 4 gives the results from our Excel file (supplement). When applying the SBT, only the smallest two P-values are significant at the α = 0.05 level, and the other seven null hypotheses cannot be rejected despite their original P-values being less than 0.05. However, using our SCBT, we conclude that all nine P-values are significant at the α = 0.05 level and thus all nine null hypotheses should be rejected.

Literature cited

Abramowitz, M., and I. Stegun. 1972. Handbook of mathematical functions. Dover Publications Inc., New York, NY, USA.
Arnold, B. C., N. Balakrishnan, and H. N. Nagaraja. 1992. A first course in order statistics. John Wiley & Sons, New York, NY, USA.
Bonferroni, C. E. 1936. Teoria statistica delle classi e calcolo delle probabilità. Libreria internazionale Seeber, Economiche e commerciali di Firenze 8.
Cabin, R. J., and R. J. Mitchell. 2000. To Bonferroni or not to Bonferroni: when and how are the questions. Bulletin of the Ecological Society of America 81.
Drezner, Z. 1992. Computation of the multivariate normal integral. ACM Transactions on Mathematical Software 18.
Drezner, T. D. 2003. A test of the relationship between seasonal rainfall and saguaro cacti branching patterns. Ecography 26.
Drezner, T. D. 2008. Variation in age and height of onset of reproduction in the saguaro cactus (Carnegiea gigantea) in the Sonoran Desert. Plant Ecology 194.
Drezner, T. D. 2011. Cactus surface temperatures are impacted by seasonality, spines and height on plant. Environmental and Experimental Botany 74.
Holm, S. 1979. A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics 6(2).
Johnson, N. L., and S. Kotz. 1972. Distributions in statistics: continuous multivariate distributions. Wiley, New York, NY, USA.
Miller, R. G. 1981. Simultaneous statistical inference. McGraw Hill, New York, NY, USA.
Moran, M. D. 2003. Arguments for rejecting the sequential Bonferroni in ecological studies. Oikos 100.
Rice, W. R. 1989. Analyzing tables of statistical tests. Evolution 43(1).

Appendix: Incorporating Association between Probabilities

Let the probability that a hypothesis is not true be $1 - P_j = \Pr(Z_j \le h_j)$ by some distribution. The probability $P$ that at least one hypothesis is true is determined by a multivariate distribution based on a correlation matrix $R$ between hypotheses:

$$P = 1 - \Pr(Z_1 \le h_1, Z_2 \le h_2, \ldots, Z_k \le h_k) \tag{3}$$

The multivariate normal case

It is reasonable to assume that the individual probabilities are generated by a normal distribution. If the probabilities are a result of a normal distribution, i.e., $P_j = \Pr(Z \ge h_j)$, correlated probabilities are a result of a multivariate normal distribution. For a multivariate normal distribution with a correlation matrix $R$ (Drezner 1992, Johnson and Kotz 1972):

$$\Pr(Z_1 \le h_1, Z_2 \le h_2, \ldots, Z_k \le h_k) = \frac{1}{\sqrt{(2\pi)^k |R|}} \int_{-\infty}^{h_1} \cdots \int_{-\infty}^{h_k} e^{-\frac{1}{2} z^T R^{-1} z} \, dz_k \cdots dz_1 \tag{4}$$

where $|R|$ is the determinant of $R$. We derive the expression for the probability $P$, when the correlation coefficient between all events is the same, by a multivariate normal distribution based on a correlation matrix $R$ with equal off-diagonal values.

Equal off-diagonal correlations

For the multivariate normal distribution, when the correlation matrix $R$ has equal off-diagonal correlations $\rho \ge 0$, by Johnson and Kotz (1972:48), the $k$-dimensional integral (4) can be reduced to a one-dimensional integral:

$$\Pr(Z_1 \le h_1, Z_2 \le h_2, \ldots, Z_k \le h_k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \prod_{j=1}^{k} \Phi\left(\frac{h_j - z\sqrt{\rho}}{\sqrt{1 - \rho}}\right) e^{-z^2/2} \, dz \tag{5}$$

where $\Phi(\cdot)$ is the cumulative standardized normal probability. The integral (5) can be calculated using either Simpson's integration formula or Gaussian quadrature formulas based on Hermite polynomials (Abramowitz and Stegun 1972). Let

$$\Pr(Z_j \le h_j) = \Phi(h_j) = 1 - P_j. \tag{6}$$

From (6) we get

$$h_j = \Phi^{-1}(1 - P_j) \tag{7}$$

and the probability $P$ that at least one hypothesis is true is:

$$P = 1 - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \prod_{j=1}^{k} \Phi\left(\frac{\Phi^{-1}(1 - P_j) - z\sqrt{\rho}}{\sqrt{1 - \rho}}\right) e^{-z^2/2} \, dz \tag{8}$$
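The one-dimensional integral (5) is straightforward to evaluate numerically. The sketch below uses composite Simpson integration, one of the two options the text mentions; the truncation of the infinite range to [-8, 8], the number of steps, and the function names are assumptions chosen for illustration, not details taken from the article or its spreadsheet.

```python
from math import erf, exp, pi, sqrt


def phi_cdf(x):
    """Cumulative standardized normal probability, Phi(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def equicorrelated_prob(h, rho, lo=-8.0, hi=8.0, steps=2000):
    """Approximate Pr(Z_1 <= h_1, ..., Z_k <= h_k) for equal off-diagonal
    correlation rho via integral (5), using composite Simpson integration.

    Assumptions: 0 <= rho < 1, and the Gaussian weight is negligible
    outside [lo, hi]."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must satisfy 0 <= rho < 1")
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")

    def integrand(z):
        product = 1.0
        for h_j in h:
            product *= phi_cdf((h_j - z * sqrt(rho)) / sqrt(1.0 - rho))
        return product * exp(-0.5 * z * z)

    width = (hi - lo) / steps
    total = integrand(lo) + integrand(hi)
    for i in range(1, steps):
        # Simpson weights: 4 on odd interior nodes, 2 on even ones.
        total += (4.0 if i % 2 else 2.0) * integrand(lo + i * width)
    return total * width / 3.0 / sqrt(2.0 * pi)


if __name__ == "__main__":
    print(equicorrelated_prob([0.0, 0.0], 0.5))
```

Two simple checks are available: with ρ = 0 the result reduces to the product of the Φ(h_j), and for two variables with h = (0, 0) the bivariate normal orthant probability is 1/4 + arcsin(ρ)/(2π).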

Table 2. The calculated critical values for α = 0.05.
Columns: ρ = 0; ρ = 0.1; ρ = 0.2; ρ = 0.3; ρ = 0.4; ρ = 0.5; ρ = 0.6; ρ = 0.7; ρ = 0.8; ρ = 0.9


When $P_j = P_0$ for all $j$, the probability that at least one of them occurs is, by (8):

$$P = 1 - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \left[\Phi\left(\frac{z_0 - z\sqrt{\rho}}{\sqrt{1 - \rho}}\right)\right]^{k} e^{-z^2/2} \, dz \tag{9}$$

where $\Phi(\cdot)$ is the standardized normal distribution and $\Phi(z_0) = 1 - P_0$. For $\rho = 0$, $P = 1 - (1 - P_0)^k$, which leads to the original Bonferroni. For $\rho = 1$, $P = P_0$ for any $k$.

Rather than using $P_0 \approx \alpha/k$, we use the value of $P_0$ that yields $P = \alpha$ in (9). For any $\alpha$, $\rho$ we can define $0 \le \theta \le 1$ such that the probability $P_0 = \theta$ yields $P = \alpha$ in (9). For the original Bonferroni ($\rho = 0$), $\theta = 1 - (1 - \alpha)^{1/k} \approx \alpha/k$. For $\rho = 1$, $\theta = \alpha$. For $0 < \rho < 1$, $\theta$ should be between these two values. We calculated $\theta$ for $\alpha = 0.05$, $k = 2, 3, \ldots, 30$, and $\rho = 0, 0.1, 0.2, \ldots, 0.9$; the values are listed in Table 2. The case $\rho = 0$ is the original Bonferroni. We fit $\theta$ by the expression

$$\theta(\rho, k) = 1 - (1 - \alpha)^{1/k} + \left(\alpha - 1 + (1 - \alpha)^{1/k}\right)\rho^{\lambda(\rho, k)}$$

which is exact for $\rho = 0$ and $\rho = 1$. The power $\lambda$ was fit by multiple regression (significance […]), yielding:

λ(ρ, k) ≈ [regression expression in ρ and k; coefficients not recoverable] (10)

Supporting Information

Additional supporting information may be found in the online version of this article at library.wiley.com/doi/ /bes2.1214/suppinfo

Table S1. Calculating the critical values for a given number of significant results.
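The critical value θ defined through (9) can also be computed directly, without the fitted power λ. The sketch below is an illustration under stated assumptions, not the procedure in the authors' Excel file: it evaluates (9) by composite Simpson integration on a truncated range and then bisects on $z_0$ until $P = \alpha$, returning $\theta = 1 - \Phi(z_0)$. The function names, integration range, step count, and tolerance are all hypothetical choices.

```python
from math import erf, erfc, exp, pi, sqrt


def phi_cdf(x):
    """Cumulative standardized normal probability, Phi(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def familywise_prob(z0, rho, k, lo=-8.0, hi=8.0, steps=1000):
    """P from Eq. 9 for a common threshold z0, correlation rho, and k events.
    Assumptions: 0 <= rho < 1, steps even, weight negligible outside [lo, hi]."""
    def integrand(z):
        inner = phi_cdf((z0 - z * sqrt(rho)) / sqrt(1.0 - rho))
        return inner ** k * exp(-0.5 * z * z)

    width = (hi - lo) / steps
    total = integrand(lo) + integrand(hi)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(lo + i * width)
    return 1.0 - total * width / 3.0 / sqrt(2.0 * pi)


def critical_value(alpha, rho, k, tol=1e-10):
    """Find theta such that P0 = theta yields P = alpha in Eq. 9.
    P decreases as z0 increases, so bisection on z0 applies."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must satisfy 0 <= rho < 1")
    z_lo, z_hi = -10.0, 10.0
    while z_hi - z_lo > tol:
        z_mid = 0.5 * (z_lo + z_hi)
        if familywise_prob(z_mid, rho, k) > alpha:
            z_lo = z_mid  # P still too large: raise the threshold
        else:
            z_hi = z_mid
    z0 = 0.5 * (z_lo + z_hi)
    return 0.5 * erfc(z0 / sqrt(2.0))  # theta = 1 - Phi(z0)


if __name__ == "__main__":
    for rho in (0.0, 0.3, 0.6, 0.9):
        print(rho, critical_value(0.05, rho, 10))
```

For ρ = 0 the result should agree with $1 - (1 - \alpha)^{1/k}$, and the value should increase with ρ toward α, matching the behavior described in the appendix.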
