Statistical comparison of univariate tests of homogeneity of variances
Submitted to the Journal of Statistical Computation and Simulation

Pierre Legendre* and Daniel Borcard
Département de sciences biologiques, Université de Montréal, C.P. 6128, succursale Centre-ville, Montréal, Québec H3C 3J7, Canada

Abstract

This paper compares the empirical type I error and power of different tests that have been proposed to assess the homogeneity of within-group variances prior to anova. The tests of homogeneity of variance (THV) compared in this study are Bartlett's test, the Scheffé-Box log-anova test, Cochran's C test and Box's M test, in their parametric and permutational forms. The main questions addressed in the paper are: (1) under what conditions is heterogeneity of variances really a problem in anova, and (2) under these conditions, is any of the THVs useful for the detection of heterogeneity?

A preliminary simulation study confirmed that anova is very sensitive to heterogeneity of the variances, even when the data are normally distributed. Any pattern of heteroscedasticity results in inflated type I error, the worst results occurring when one variance is larger than the others. A second study was conducted to find out which tests of homogeneity of variances should be used under extreme conditions (small sample sizes, non-normal distributions). The best overall methods are Bartlett's or Box's tests; even with normally distributed data, one should avoid Cochran's test, which is only sensitive to a single high variance, as well as the log-anova test, because it has low power with small to moderate sample sizes. With non-normal data, Bartlett's and Box's tests can be used if the samples are fairly large. Species abundance-like data should be log-transformed and subjected to parametric or permutational Bartlett's or Box's tests.

An Appendix presents a comparison of the Welch-corrected t-test with the parametric and permutational forms of the t-test. The test with Welch correction is useful when the data are normal, sample sizes are small, and the variances are heterogeneous. Otherwise, use the parametric t-test for normal data, or the permutational t-test for skewed data. For heteroscedastic data that cannot be normalized, a nonparametric test should be used.

Keywords: Anova; permutation test; power; simulation study; test of homogeneity of variances; t-test; type I error

Running head: Tests of homogeneity of variances

*Corresponding author. Tel.: (5) 33-75, Fax (5) , Pierre.Legendre@umontreal.ca
Tel.: (5) 33-75, Fax (5) , BorcardD@magellan.umontreal.ca
1. Introduction

Several tests of homogeneity of variances (THV) were proposed in the literature during the 1930s, 1940s and 1950s (Bartlett 1937a, 1937b; Cochran 1951; Box 1953). Natural selection has left us with only a few, which are presented in current textbooks. Authors usually present one or two, making various claims about their robustness (or lack thereof) to departures from normality. Many authors claim that a test of homogeneity of variances is a prerequisite to analysis of variance. Others, like Zar (1999), confide that the tests presently available have such poor performance that they are not really useful, anova being more robust to departures from homoscedasticity than can be detected using a test of homogeneity of variances, especially under conditions of non-normality. Underwood (1997) reminds us that the analysis of variance presents problems with heterogeneity in balanced samples only when one of the variances is markedly larger than the others; it is not especially sensitive to non-normality of the data, which badly affects most of the classical tests for homogeneity of variances.

In analysis of variance, the Behrens-Fisher problem is that of comparing means of samples drawn from normal populations without assuming equal variances; valid solutions to the Behrens-Fisher problem exist for two groups and one variable (t-test) but not, to our knowledge, for several groups nor for the multivariate case.

The main question addressed in this paper is thus a double interrogation: (1) under what conditions is heterogeneity of variances really a problem in anova, and (2) under these conditions, is any of the tests of homogeneity of variances useful for the detection of heterogeneity?

With the advent of microcomputers and the availability of ever more powerful machines, permutation tests have gained in popularity during the past 20 years. This type of test is known to alleviate the conditions of normality often associated with parametric statistical tests and to preserve correct type I error independently of the distributions of the variables under study. Is this also the case with the permutational forms of the tests of homogeneity of variances? A simulation study was undertaken (3) to verify this hypothesis, (4) to find out which tests of homogeneity should be used under extreme conditions (small sample sizes, non-normal distributions), and (5) to determine which one(s) should be described in textbooks and taught in introductory courses of statistics.

2. Methods

2.1. THV statistics

The tests under study are those found in various textbooks. We will use the following symbols in the descriptions of the statistics: k = number of groups, $n_j$ = number of observations within group j, n = total number of observations, $s_p^2$ = pooled within-group variance, SSW = sum of within-group sums-of-squares.
1. Bartlett's test is the one most often presented in textbooks and taught in introductory courses because of its ease of computation. The test statistic B involves a comparison of the separate within-group sums-of-squares to the pooled within-group sum-of-squares:

$$B = (n - k)\ln s_p^2 - \sum_{j=1}^{k} (n_j - 1)\ln s_j^2 \quad \text{where} \quad s_p^2 = \frac{SSW}{n - k}. \tag{1}$$

A correction factor $C_B$ is computed as:

$$C_B = 1 + \frac{1}{3(k - 1)}\left[\sum_{j=1}^{k} \frac{1}{n_j - 1} - \frac{1}{n - k}\right] \tag{2}$$

and applied to B to obtain the corrected $B_C$ statistic:

$$B_C = B / C_B. \tag{3}$$

$B_C$ has an asymptotic $\chi^2_{(k-1)}$ distribution. Bartlett's test is known to be powerful if the sampled populations are normal, but badly affected by non-normality (Box 1953, Zar 1999).

2. The Scheffé-Box log-anova test (Martin and Games 1977) is based on papers by Box (1953) and Scheffé (1959). In this test, one first divides the observations of each group at random among a number of subgroups and computes the log of the variance of each subgroup. The test uses an F-type statistic which compares the among-group mean-squares to the within-group mean-squares of the logarithms of the subgroup variances instead of the raw data:

$$F = \frac{SS_{among} / (k - 1)}{SS_{within} \Big/ \sum_{j=1}^{k} (m_j - 1)} \tag{4}$$

where $m_j$ is the number of subgroups within group j; $m_j$ is approximately equal to $\sqrt{n_j}$, where $n_j$ is the number of observations within group j. Computation of the test statistic is described in detail in Sokal and Rohlf (1995) and in Scherrer (1984), for example. The log-anova F-statistic is tested against a critical value of F with the degrees of freedom of the numerator and denominator of eq. 4. The log-anova test is said to be less sensitive to departures from normality than Bartlett's test (Sokal and Rohlf 1995).

When the number of observations is small, the results of the log-anova test may be quite unstable. The reason is the following: in this test, one first assigns at random the observations of each group to subgroups, as described above. The variance of each subgroup is computed and log-transformed, before being used to compute an F-statistic based on these log(variance) values. The number of subgroups is approximately equal to the square root of the number of observations in a group; this number, which determines the number of log(variance) values representing each group in the analysis, is very small if $n_j$, the number of observations in group j, is small. Moreover, if $n_j$ is small, a different random assignment of the objects of a group to the subgroups may result in a very different set of subgroup variances. This is why the results of the log-anova test may vary greatly for different assignments of the objects to the subgroups, especially for small n.

3. Cochran's C test statistic (Cochran 1951) is:

$$C = \frac{s^2_{largest}}{\sum_{j=1}^{k} s_j^2}. \tag{5}$$

Tables of critical values for some combinations of degrees of freedom (ν = 1 to 10, 16, 36 and 144) and number of groups (2 to 10, 15, 20) have been reproduced by different authors (e.g., Winer 1971) from the table originally published by Eisenhart et al. (1947). The degrees of freedom (ν) are $\nu = \max(n_j) - 1$, where $n_j$ is the number of observations in group j. Professor A. J. Underwood (pers. comm.) suggested that Cochran's test may be the best method to detect cases where the variance of one of the groups is much larger than that of the other groups. This is a situation where the analysis of variance of balanced samples is known to present problems (Underwood 1997).

4. Box's M statistic provides a test of multivariate dispersion which can be used with a single variable (p = 1, where p designates the number of variables):

$$M = (n - k)\ln|\mathbf{S}| - \sum_{j=1}^{k} (n_j - 1)\ln|\mathbf{S}_j| \tag{6}$$

where $\mathbf{S}$ is the pooled within-group dispersion matrix and $\mathbf{S}_j$ is the dispersion matrix for group j. For univariate data, the M and B statistics are identical. A correction factor, different from that of Bartlett's test, is used with the M statistic to turn it into a statistic distributed like chi-square:

$$C_M = 1 - \frac{2p^2 + 3p - 1}{6(p + 1)(k - 1)}\left[\sum_{j=1}^{k} \frac{1}{n_j - 1} - \frac{1}{n - k}\right] \tag{7}$$

and

$$M_C = M \, C_M. \tag{8}$$
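As an illustration of eqs. 1 to 5, the following minimal sketch computes the corrected Bartlett statistic, Cochran's C and the log-anova F-statistic for univariate data. It is not the authors' Fortran 77 program; the function names (`bartlett_corrected`, `cochran_c`, `log_anova_f`) are hypothetical, the subgroups are formed by a simple shuffle-and-deal rule chosen here for convenience, and each group is assumed to contain at least four observations with non-zero variance.

```python
import math
import random
from statistics import mean, variance  # variance() uses the n - 1 denominator


def bartlett_corrected(groups):
    """Return (B_C, degrees of freedom) following eqs. 1 to 3."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    s2 = [variance(g) for g in groups]
    ssw = sum((len(g) - 1) * v for g, v in zip(groups, s2))
    s2_pooled = ssw / (n - k)
    b = (n - k) * math.log(s2_pooled) - sum(
        (len(g) - 1) * math.log(v) for g, v in zip(groups, s2))
    c_b = 1 + (sum(1 / (len(g) - 1) for g in groups)
               - 1 / (n - k)) / (3 * (k - 1))
    return b / c_b, k - 1


def cochran_c(groups):
    """Largest within-group variance divided by the sum of variances (eq. 5)."""
    s2 = [variance(g) for g in groups]
    return max(s2) / sum(s2)


def log_anova_f(groups, rng):
    """F-type statistic of eq. 4 computed on log subgroup variances.

    Assumption of this sketch: m_j = round(sqrt(n_j)) subgroups per group,
    filled by dealing the shuffled observations in turn.
    """
    logs = []
    for g in groups:
        shuffled = list(g)
        rng.shuffle(shuffled)  # random assignment to subgroups
        m = max(2, round(math.sqrt(len(g))))
        subgroups = [shuffled[i::m] for i in range(m)]
        logs.append([math.log(variance(s)) for s in subgroups])
    k = len(logs)
    grand = mean(v for lv in logs for v in lv)
    ss_among = sum(len(lv) * (mean(lv) - grand) ** 2 for lv in logs)
    ss_within = sum((v - mean(lv)) ** 2 for lv in logs for v in lv)
    df_within = sum(len(lv) - 1 for lv in logs)
    return (ss_among / (k - 1)) / (ss_within / df_within)
```

Calling `log_anova_f` repeatedly on the same data with different random generators gives a direct view of the instability described above, since each call draws a new assignment of observations to subgroups. For univariate data, Box's M equals B, so only the correction factor of eq. 7 would differ.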
After applying the correction factor $C_M$, the corrected $M_C$ statistic has an asymptotic $\chi^2$ distribution with $p(p + 1)(k - 1)/2$ degrees of freedom.

Hartley's (1950) test, which uses the ratio of the largest to the smallest variances (and thus resembles Cochran's C test with a less optimal use of the information available), and Kullback's (1959) test of homogeneity of variance-covariance matrices, which is largely similar to Box's M, were not included in the present simulation study.

All these statistics can also be tested by the method of permutations. From the descriptions above, readers will appreciate the fact that these statistics use only the within-group dispersion portions of the data. This indicates that an appropriate permutation procedure would require the groups to be centered on a common mean, before permutations, in order for the test to produce realizations of a null hypothesis related to within-group dispersions only. Without this precaution, imagine what would happen in the case of groups differing in their means: permutation of the data across groups would create pseudo-groups with larger variances than the original groups. After centering the groups on a common mean (e.g. the origin), the null hypothesis under test is ($H_0$) the exchangeability of objects among the groups after the differences in positions of the means have been removed.

2.2. Simulation methods

Two computer programs were written in Fortran 77 to carry out the simulations: one for anova and one for the tests of homogeneity of variances. The programs were designed to read a file of parameters describing the characteristics of a stack of simulation problems, and run them in a sequence. Random data with the proper characteristics were generated within the program and transformed as required, and the various forms of tests (parametric and permutational) were computed. Permutations of the simulated data, where needed, were done using a uniform random generation algorithm sensu Furnas (1984).
For each problem, the program wrote one line of results to an output file. This line contained the simulation parameters of the problem, the rate of rejection of the null hypothesis for each test after the stated number of simulations, and the 95% confidence interval of this rejection rate. The output files were assembled into a data base which was used to produce summary tables as well as the figures presented in this paper.

The simulation program for anova computed a parametric and a permutational anova for each simulated data set. In the case of two groups only, a t-test with Welch correction was also computed. The simulation program for the tests of homogeneity of variances computed the four statistics described in Section 2.1 and tested each one parametrically and permutationally.

Data were generated with the following distributions: (1) random normal deviates with specified mean and variance, (2) random power deviates, i.e., $y' = \text{base}^{y}$ where base was chosen to have the value. and y was a random normal deviate with specified mean and variance, and (3) truncated random power deviates, i.e., power deviates as in (2) where the negative values were truncated to zero and the positive values were rounded to the nearest integer, to simulate species abundance data which are encountered in ecological data analysis, or other types of frequency data as found in other fields of application. With option (2), a log-transformation of random power deviates restores normality. In contrast, with option (3), log transformation of truncated random power deviates does not necessarily produce normally distributed data.

Random lognormal deviates were not used in the simulations because the distributions were too variable among simulated data sets, due to the appearance of extreme outlier values; we would have needed a very large number of simulations to obtain reasonable confidence intervals for the rejection rates of the tests. Random power data with base. were used instead. The problem of outlier values was minimal, so that we obtained reasonable confidence intervals after 5000 simulations.

In the study of type I error, data were generated in such a way that the null hypothesis was true ($H_0$ in anova: the population means are equal; $H_0$ in THV: the population variances are equal). The rate of type I error was computed as the proportion of the simulations where the null hypothesis was rejected at the α = 5% significance level. In the power study, data were simulated in such a way that the null hypothesis was false. Power was computed as the proportion of the simulations where the null hypothesis was rejected at the α = 5% significance level.

2.3. Simulation setup

In each simulation problem, the rate of rejection of the null hypothesis (at α = 5%) was computed after 5000 independent simulations involving the desired number of observations from the selected type of distribution.
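The three data-generating options and the rejection rate described above can be sketched as follows. This is an illustration only: the function names are hypothetical, `base` is left as a parameter that the caller must supply, and the normal-approximation confidence interval is an assumption of this sketch, since the text does not say how the interval was computed.

```python
import math
import random


def normal_deviates(n, mean, sd, rng):
    """Option (1): random normal deviates."""
    return [rng.gauss(mean, sd) for _ in range(n)]


def power_deviates(n, mean, sd, base, rng):
    """Option (2): y' = base ** y, with y a random normal deviate."""
    return [base ** y for y in normal_deviates(n, mean, sd, rng)]


def truncated_power_deviates(n, mean, sd, base, rng):
    """Option (3): power deviates clipped at zero and rounded to integers.

    The clip at zero follows the text; because base ** y is positive for
    base > 0, zeros arise here from the rounding step only.
    """
    return [max(0, round(v)) for v in power_deviates(n, mean, sd, base, rng)]


def rejection_rate(p_values, alpha=0.05):
    """Proportion of rejections and a normal-approximation 95% interval."""
    n_sim = len(p_values)
    rate = sum(p <= alpha for p in p_values) / n_sim
    half = 1.96 * math.sqrt(rate * (1 - rate) / n_sim)
    return rate, (max(0.0, rate - half), min(1.0, rate + half))
```

Under a true null hypothesis the returned rate estimates type I error; under a false one it estimates power. Taking logarithms to the same base of the output of `power_deviates` recovers the underlying normal deviates, which is the sense in which log-transformation restores normality for option (2).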
The permutation tests involved random permutations of the data; following Hope (1968), the reference value of the statistic obtained for the unpermuted data was added to the distribution of values obtained after permutation, before calculating the permutational probability associated with the statistic.

Since we wanted to test all the selected methods under the same conditions, we had to restrict the combinations of number of observations and numbers of groups to those for which tables of critical values exist for Cochran's test. This explains why the simulations have been run with 10, 17, 37 and 145 observations in each group. Within one run the number of observations was equal in all groups. We ran our tests with two or three groups, but only the results involving three groups will be presented here, since the two-group simulations gave essentially similar results and three-group situations are more relevant to anova questions. Some anova results for two groups are presented in the Appendix to assess the Welch correction for two groups with unequal variances.
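The permutation procedure described in Sections 2.1 and 2.3 (centering the groups, permuting the pooled values, and counting the reference value as part of the distribution) can be sketched as below. This is a minimal illustration under stated assumptions: `permutation_p_value` and `cochran_c` are hypothetical names, Python's `random.shuffle` stands in for the permutation algorithm used by the authors, and the number of permutations `n_perm` is left to the caller.

```python
import random
from statistics import mean, variance


def cochran_c(groups):
    """Example statistic: largest variance over the sum of variances."""
    s2 = [variance(g) for g in groups]
    return max(s2) / sum(s2)


def permutation_p_value(groups, statistic, n_perm, rng):
    """Upper-tail permutational probability of a dispersion statistic."""
    # Centre every group on a common mean (here the origin) so that
    # permutations reflect within-group dispersion only.
    centred = [[x - mean(g) for x in g] for g in groups]
    reference = statistic(centred)
    pooled = [x for g in centred for x in g]
    sizes = [len(g) for g in groups]
    # Following Hope (1968), the reference value is itself counted as one
    # member of the reference distribution.
    count = 1
    for _ in range(n_perm):
        rng.shuffle(pooled)
        start, permuted = 0, []
        for size in sizes:
            permuted.append(pooled[start:start + size])
            start += size
        if statistic(permuted) >= reference:
            count += 1
    return count / (n_perm + 1)
```

For example, `permutation_p_value(groups, cochran_c, 99, random.Random(1))` returns a probability that is a multiple of 1/100 and never smaller than 1/100; the value 99 is chosen here for illustration only. Any statistic for which large values indicate heterogeneity can be passed in place of `cochran_c`.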
3. Results

3.1. Anova: type I error

Since the simulation results of the anova tests were largely independent of sample size (within our limits: 10 to 145 observations per group), the results shown in Figure 1 are those based on $n_j$ = 10. The population means were equal in the anova simulations for type I error.

3.1.1. Homogeneous variances

Type I error of the parametric anova is correct for normally distributed data (Fig. 1a), but it is slightly too low for the power base. distribution (Fig. 1b). The apparent increase in type I error, in Figure 1a, from variance to 6 to, is a random effect which is not found in the other series of simulations that we have done using $n_j$ = 17, 37 and 145 (results not presented in detail here). Thus, application of a parametric anova requires at least that the distribution be symmetric, a property which can be obtained to a convenient degree by log-transforming the data simulated under the base-. distribution (including its truncated version).

As long as the variances are homogeneous, permutational anova yields correct type I error irrespective of the distribution (normal or power base., truncated power base., and log-transformed data computed from the latter; some of these results are not illustrated).

3.1.2. Heterogeneous variances

For normal data, type I error is inflated as soon as one of the variances is higher than the others, but less so when one of the variances is smaller than the others (Fig. 1c). Both parametric and permutational tests suffer from this problem. The problem is worse for the power base. distribution (Fig. 1d); using a permutation test does not correct the problem.

3.2. Anova: power

In the anova power simulations, means differed among groups but variances were the same (5 in the case of the normal distribution, in the case of the power base. distribution). Having shown above that heterogeneous variances alter type I error of anova in most cases, the question of power for data with heterogeneous variances is irrelevant. Our simulation results show that, with homogeneous variances, the power of anova is good for both the normal and power base. distributions, as soon as the contrast between the two extreme means is sufficient (in this study: 0 and 5 for normal data;.6 and 5 for power data; see Figs. 1e and 1f). Parametric and permutational anova have approximately the same power.
These results confirm that anova, in both its parametric and permutational forms, is very sensitive to heterogeneity of the variances, even when the data are normally distributed. Any pattern of heteroscedasticity results in inflated type I error, the worst results occurring when one variance is larger than the others. However, permutational anova retains correct type I error in the presence of skewed distributions as long as the variances are homogeneous. It has the same power as parametric anova.

3.3. Tests of homogeneity of variances (THV): type I error

For the THVs, we shall first present the simulation results for the normal and power base. continuous data, comparing the behavior of the different methods in the presence of symmetrical and skewed distributions. We postpone to Section 3.5 the presentation of the results using the truncated power base. distribution, which were devised to study the behavior of the THVs in the presence of species abundance-like data. The population variances were equal in the THV simulations for type I error.

3.3.1. Normal distribution

In this situation, all the methods tested, in their parametric (Fig. 2a) and permutational (Fig. 2b) forms, have correct type I error.

3.3.2. Power base. distribution

Under this distribution, the log-anova test (parametric and permutational) is the only one to maintain correct type I error (Figs. 2c and 2d). Type I error is inflated in all other tests in both forms, although less so for the permutational forms.

3.4. Tests of homogeneity of variances (THV): power

3.4.1. Normal distribution

Figures 2e and 2f show the power of the THVs with the same combination of variances as those used to assess type I error in the anova simulations (Fig. 1c). The log-anova test is clearly less powerful than the other tests, even in its permutational form. In most cases, Cochran's test is the most powerful when one of the variances is markedly higher than the others, but it loses most of its power when the variances are spread more evenly. The Bartlett, Cochran and Box parametric tests perform slightly better than their permutational counterparts; this is not the case for the log-anova test.

To summarize, a comparison between Figure 1c and Figures 2e and 2f shows that, as long as the data are normally distributed, the Bartlett and Box tests are powerful enough to detect heterogeneous variances when these induce inflated type I error in anova.
3.4.2. Power base. distribution

A cursory glance at Figures 2g and 2h may lead one to believe that all but the log-anova test have high power and are thus appropriate methods. However, the left-hand and right-hand groups of simulations in both graphs represent in fact cases with equal variances, i.e., measures of type I error. They are drawn here to remind us that the log-anova test is the only one that maintains correct type I error, and thus that its power, low as it may be, is that of the only reliable method among those investigated here. Power of the log-anova test reaches less than 5% in the best cases. We conclude that all the THVs under study are unusable for skewed distributions, such as the power base. data.

3.4.3. Effect of sample size

We stated above that, for anova, the number of objects per group did not influence the results significantly, at least within the range used in the simulations reported here ($n_j$ = 10 to 145). Does that hold for the THVs, or do the results change with sample size?

Normal data: the simulation results (not shown) do not vary with $n_j$. Type I error remains correct for all methods, using parametric or permutational tests, regardless of $n_j$.

Power base. data (Figs. 3a and 3b): type I error of the log-anova test is correct, as in Figures 2c and 2d, and is unaffected by $n_j$. The results for the other THVs are interesting: their parametric forms show worse results when $n_j$ gets larger, while their permutational forms improve markedly, the rejection rate reaching the α significance level with $n_j$ = 145.

Normal data: as expected, the power of all THVs improves greatly when the number of observations per group increases (Figs. 3c and 3d). At the maximum value subjected to simulations ($n_j$ = 145), the power of all methods, including that of the log-anova test, is 1. For intermediate sample sizes, Bartlett and Box tests are clearly the most powerful in the presence of a gradual distribution of the variances.

Power base. data: the log-anova test, which is the only one with an overall correct type I error under this distribution (Figs. 3a and 3b), has mediocre power, culminating at about 8% for $n_j$ = 145 for the variances used in these simulations (Figs. 3e and 3f). The other THVs, whose type I errors are correct only when $n_j$ is large, do hardly better, with powers between 3 and 5%. These results confirm that for skewed distributions the tests of homogeneity of variances are not usable.
3.5. Truncated power base. distribution: data simulating species abundances

This distribution deserves a special section because it simulates species abundance data, which are of great interest to ecologists. Contrary to the continuous power base. data used above, these data have been altered in such a way that log-transformation generally does not restore complete normality, much like with true species abundance data. These data being truncated at zero, the logarithmic transformation cannot restore the complete distribution.

3.5.1. Anova: type I error, homogeneous variances

Our simulations yielded the same results for truncated as for untruncated power base. data; the latter results are shown in Figure 1b. Type I error of the parametric anova was slightly too low (around ), while that of the permutational anova was correct.

3.5.2. Anova: power, homogeneous variances

Again, the results are similar to those obtained for untruncated power base. data, shown in Figure 1f. Power is good when the contrast between the two extreme variances is high.

3.5.3. THV: type I error

The performances of the Bartlett, Cochran and Box tests are approximately the same (Figs. 4a and 4b) as for untruncated power base. data (i.e., equally bad; compare with Figs. 2c and 2d). The log-anova test reacts surprisingly badly to this type of data, at least when $n_j$ = 10 and the within-group variances are small. Type I error of the parametric log-anova test is always inflated. Type I error of the permutational test is correct only in the presence of high within-group variances. It is too low in the other cases, and improves when the sample size increases (see below), so that the permutational log-anova test is valid for small or large values of within-group variances because type I error is smaller than or equal to α.

Transforming the truncated power base. data, using y' = ln(y+1), greatly improves type I error of all but the log-anova test (Figs. 4c and 4d).
Type I errors of Bartlett's, Cochran's and Box's tests are approximately correct in their parametric form, and correct in their permutational form.

3.5.4. THV: power

The type I error results reported in the preceding section on THV type I error indicate that the only meaningful use of the THVs for truncated power base. data is after log-transformation of the data, using y' = ln(y+1). The results are shown in Figures 4e and 4f. The log-anova test is unusable either because of its inflated type I error (parametric form) or because it has nearly no power (permutational form). Among the other tests, Bartlett's and Box's are the most powerful in general; as usual, Cochran's test is slightly better at detecting heterogeneity when one of the variances is markedly higher than the others.

3.5.5. Effect of sample size

For truncated power base. data, increasing the sample size tends to restore correct type I error for the log-anova test (Figs. 5a and 5b). This occurs at smaller sample size ($n_j$ = 17) in the permutational than in the parametric test ($n_j$ = 37). For the other tests, the results are the same as with untruncated power base. data (Figs. 3a and 3b): larger sample size means more inflated type I error for the parametric versions, but for permutation tests, type I error becomes correct at $n_j$ = 145.

Log-transforming the data (using y' = ln(y+1)) improved the results drastically, in particular when using permutations (Figs. 5c and 5d), for all but the log-anova test. The permutational versions of Bartlett's, Cochran's and Box's tests have a valid type I error over the range of sample sizes simulated. Type I error remains too low for the log-anova test with small $n_j$, but the test remains valid.

Despite the good performance of the log-anova test in terms of type I error (Figs. 5a and 5b) for truncated power base. data, power simulation results are not presented because power is too low for this test to be useful with this kind of data (between 56 and 2 for the same range of combinations as, for instance, those presented in Figures 4e and 4f). For the other THVs, a power study was run only at $n_j$ = 145, because these tests had incorrect type I error for truncated power base. data for lower $n_j$. The results (not illustrated) show that Bartlett's and Box's tests have a power above 0., while Cochran's test performs poorly (power around ) when one variance is markedly smaller than the others.

Power simulations were done to study the effect of sample size for log-transformed truncated power base. data. A marked improvement in power occurs at high sample sizes (Figs. 5e and 5f). The parametric log-anova power results are meaningless for $n_j$ = 10 and 17 because type I error is inflated in Figures 5c and 5d for these sample sizes. For larger sample sizes, where type I error is correct, the power of the parametric log-anova test is smaller than that of the other THVs. While the parametric tests seem to have better power overall, remember that they generally have slightly inflated type I error (Figs. 5c and 5d) whereas the permutation tests have correct type I error.
4. Discussion

We can now go back to the questions that motivated this study. First, we can state that heterogeneity of variances is always a problem in anova, and is troublesome even in the most benign cases, i.e., when one of the variances is smaller than the others. The problem is worst when one of the variances is markedly larger than the others. The effect of variance heterogeneity on anova is moderately to extremely inflated type I error.

Our other questions (usability of the THVs, differences between permutation and parametric tests, extreme conditions) need elaborate answers that will be presented below in the form of a table of recommendations (TABLE I). This is made necessary by the many characteristics of the data that influence the simulation results. For instance, our simulations have shown that anyone wanting to apply anova to non-normal data is caught between contradictory requirements. On the one hand, anova is not very sensitive to skewness but needs homogeneous variances; on the other hand, the available THVs often give fanciful results when the data are skewed. Thus, it is highly recommended to normalize the data as well as possible, even though anova itself does not require it. For multi-modal data, transformations should aim at reducing skewness.

To summarize, the best overall methods to test the homogeneity of variances are Bartlett's or Box's tests. Even with normally distributed data, one should avoid Cochran's test, which is only sensitive to a single high variance, as well as the log-anova test, because it has low power with small to moderate sample sizes. With non-normal data, Bartlett's and Box's tests can be used if the samples are fairly large. Species abundance-like data should be log-transformed and subjected to parametric or permutational Bartlett's or Box's tests.

Acknowledgments

We are most thankful to General Cambronne for assistance during the simulation work. This research was supported by NSERC grants OGP and EQP0608 to P. Legendre.
References

Bartlett, M. S. (1937a) Some examples of statistical methods of research in agriculture and applied biology. J. Roy. Statist. Soc. Suppl. 4.
Bartlett, M. S. (1937b) Properties of sufficiency and statistical tests. Proc. Roy. Statist. Soc. Ser. A 160.
Box, G. E. P. (1953) Non-normality and tests on variances. Biometrika 40.
Cochran, W. G. (1941) The distribution of the largest of a set of estimated variances as a fraction of their total. Annals of Eugenics (London) 11.
Cochran, W. G. (1951) Testing a linear relation among variances. Biometrics 7.
Edgington, E. S. (1995) Randomization Tests (Third Edition). New York: Marcel Dekker.
Eisenhart, C. (1947) Significance of the largest of a set of sample estimates of variance. In: Selected techniques of statistical analysis for scientific and industrial research and production and management engineering (Eds. Eisenhart, C., Hastay, M. W. and Wallis, W. A.). New York: McGraw-Hill.
Furnas, G. W. (1984) The generation of random, binary unordered trees. J. Classif. 1.
Hartley, H. O. (1950) The maximum F-ratio as a short-cut test for heterogeneity of variance. Biometrika 37.
Hope, A. C. A. (1968) A simplified Monte Carlo test procedure. J. Roy. Statist. Soc. B 50.
Kullback, S. (1959) Information theory and statistics. New York: Wiley.
Martin, C. G. and Games, P. A. (1977) Anova tests for homogeneity of variances: nonnormality and unequal samples. Journal of Educational Statistics 2.
Scheffé, H. (1959) The analysis of variance. New York: Wiley.
Scherrer, B. (1984) Biostatistique. Boucherville: Gaëtan Morin Ed.
Sokal, R. R. and Rohlf, F. J. (1995) Biometry: The Principles and Practice of Statistics in Biological Research (Third Edition). New York: W. H. Freeman.
Underwood, A. J. (1997) Experiments in ecology: Their logical design and interpretation using analysis of variance. Cambridge: Cambridge University Press.
Welch, B. L. (1936) Specification of rules for rejecting too variable a product, with particular reference to an electric lamp problem. J. Roy. Statist. Soc. Suppl. 3, 29-48.
Welch, B. L. (1938) The significance of the difference between two means when the population variances are unequal. Biometrika 29.
Winer, B. J. (1971) Statistical principles in experimental design (Second Edition). New York: McGraw-Hill.
Zar, J. H. (1999) Biostatistical analysis (Fourth Edition). Upper Saddle River, N.J.: Prentice Hall.
Appendix: t-test with Welch correction

This appendix presents additional simulation results in which the t-test with Welch (1936, 1938) correction was compared to parametric and permutational t-tests for two types of data distributions and for equal and unequal population variances. The result of a t-test is identical to that of an anova computed for two groups; the t-statistic is the square root of the F-statistic used in anova. The Welch correction, described in most textbooks of statistics (e.g., Scherrer 1984 and Zar 1999), is a widely used solution to the Behrens-Fisher problem of testing for the difference in the means of two populations when the variances are unequal. These simulations, which led us to formulate recommendations with respect to the use of this correction, should prove useful to application domains where unequal variances are commonly encountered.

The Welch correction was designed to provide a valid t-test in the presence of unequal population variances. It consists of using a corrected number of degrees of freedom ν to assess the significance of the t-statistic computed as usual. ν is the next smaller integer of the value obtained from the following equation:

$$\nu = \frac{\left[(s_1^2 / n_1) + (s_2^2 / n_2)\right]^2}{\dfrac{(s_1^2 / n_1)^2}{n_1 - 1} + \dfrac{(s_2^2 / n_2)^2}{n_2 - 1}} \tag{9}$$

where $s_1^2$ and $s_2^2$ are the sample variances of groups 1 and 2 respectively, whereas $n_1$ and $n_2$ are the number of observations in groups 1 and 2. When the variances are equal, equation 9 reduces to the usual formula $\nu = (n_1 + n_2 - 2)$ when the two groups have equal numbers of observations, but to a lower value when $n_1 \neq n_2$, making the test with Welch correction too conservative. We will see if the simulations can illustrate this bias, and what are its practical consequences, if any, for the users of the test.

1. Equal sample sizes

Simulations were carried out using two groups of data of equal sizes, with $n_1 = n_2$ = {10, , 50, 100}.
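The corrected degrees of freedom defined above can be checked numerically with a short sketch. The function names `welch_df` and `welch_df_integer` are hypothetical and introduced here for illustration; the arguments are the two sample variances and the two sample sizes.

```python
import math


def welch_df(s2_1, n1, s2_2, n2):
    """Real-valued Welch degrees of freedom from sample variances and sizes."""
    a = s2_1 / n1
    b = s2_2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


def welch_df_integer(s2_1, n1, s2_2, n2):
    """Next smaller integer, as used to assess the t-statistic.

    Floating-point rounding can place a value that is mathematically an
    exact integer slightly below it, so results at exact integers should
    be inspected before flooring.
    """
    return math.floor(welch_df(s2_1, n1, s2_2, n2))
```

With equal sample variances and $n_1 = n_2 = 10$, `welch_df` returns $n_1 + n_2 - 2 = 18$ up to rounding error; with equal variances but $n_1 = 10$ and $n_2 = 30$ it returns about 15.5, well below 38, which illustrates the conservative behaviour mentioned in the text.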
The first series of simulations used normal random deviates with mean 0 and standard deviations as specified in the graphs; the values were chosen in such a way that the standard deviations of the two reference populations added to 10. For the power study, the population mean of group 1 was 0 while that of group 2 differed from 0. In each case, simulations were run during which the following statistics were computed: a standard parametric t-test, a permutational t-test (using random permutations), and a t-test with Welch correction. The t-test with Welch correction is expected to produce correct type I error when the variances are not homogeneous, whereas the permutational t-test is expected to do the same for skewed data.

Figures 6a and 6e show that the t-test with Welch correction has correct type I error for all combinations of population variances and for all sample sizes. The parametric and permutational t-tests are affected by inequality of the variances. The effect is strong when sample size is small ($n_j = 10$ in Fig. 6a), but disappears gradually as sample size increases. For example, with $n_j = 50$, the tests have slightly inflated type I error only in the most extreme case of inequality of the population variances (Fig. 6e). No inflation of type I error was found at $n_j = 100$ (results not illustrated). The power of the three tests is comparable when they are valid, i.e., when type I error is not larger than α (Figs. 6b and 6f).

In the second series of simulations, power base. data were used, as described in Section 2.2 of the main paper. Otherwise, the design of the simulations was the same as for normal data. When the variances are equal or nearly so ($\sigma_1 = \sigma_2$ in Figs. 6c and 6g), all three tests are valid for all sample sizes. The permutational t-test presents the advantage of having correct type I error whereas the other two forms of the test are too conservative; the permutational t-test also has the highest power (Figs. 6d and 6h). When the variances are unequal ($\sigma_1 \neq \sigma_2$, e.g., in Fig. 6g), type I error of all three tests becomes inflated to various degrees, so that the tests are invalid and should not be used. This was the case with all sample sizes used in the simulations, except with $n_j = 10$ (Fig. 6c); power of the tests is irrelevant when type I error is larger than α. The parametric and Welch-corrected t-tests are valid when sample sizes are very small since type I error is not larger than α (Fig. 6c where $n_j = 10$), but power is so low that the tests are unusable (Fig. 6d).

2. Unequal sample sizes

Simulations were also conducted with different sample sizes chosen in such a way that $n_1 + n_2$ = {20, 50 or 100}. Otherwise, the design of the simulations was the same as for equal sample sizes; the standard deviations were made to vary as in Fig. 6, the values being chosen in such a way that the standard deviations of the two reference populations added to 10. Simulations were carried out to measure type I error (with the population means equal) and power (with the population means unequal).

The first series of simulations used normal random deviates with mean 0 and standard deviations summing to 10 for the two groups, as specified in the graphs. When the population variances are equal, the parametric and permutational t-tests have correct type I error for any combination of sample sizes (Fig. 7a), whereas the test with Welch correction becomes too conservative when sample sizes are strongly unequal.
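The permutational t-test compared in these simulations can be illustrated with a minimal Python sketch. It permutes the observations between the two groups and recomputes the pooled-variance t-statistic each time. The function names, the default of 999 permutations, the fixed seed and the two-tailed counting rule are assumptions made for this example; they are not taken from the study.

```python
import math
import random
from statistics import mean, variance


def pooled_t(x1, x2):
    """Classical two-sample t-statistic with pooled within-group variance."""
    n1, n2 = len(x1), len(x2)
    pooled_var = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    diff = mean(x1) - mean(x2)
    if se == 0.0:
        # Both groups are internally constant: the statistic is 0 or infinite.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def permutation_t_test(x1, x2, n_perm=999, seed=0):
    """Two-tailed permutational p-value for the difference between two means."""
    rng = random.Random(seed)
    t_obs = abs(pooled_t(x1, x2))
    values = list(x1) + list(x2)
    n1 = len(x1)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(values)  # reassign observations to groups at random
        t_perm = abs(pooled_t(values[:n1], values[n1:]))
        if t_perm >= t_obs - 1e-12:  # small tolerance for rounding error
            hits += 1
    return hits / (n_perm + 1)
```

Because the reference distribution is built from the data themselves, the p-value does not rely on the normality assumption, which is why this form of the test is expected to behave well with skewed data. It still assumes that observations are exchangeable between groups under the null hypothesis, which unequal variances violate.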
Power of all tests decreases as the sample sizes become more unequal (Fig. 7b); the t-test with Welch correction has lower power than the other forms in the most extreme cases of inequality of the sample sizes. Of course, power of all tests increases as $n_1 + n_2$ grows from 20 to 50 to 100 (not illustrated).

When the population variances are unequal, all tests have correct type I error for equal group sizes (e.g., Fig. 7c, except for a slight inflation of type I error in the most extreme case of inequality of the population variances, already shown in Fig. 6e), but the tests become increasingly conservative as the group sizes become more unequal. All tests remain valid with sample sizes $n_1 + n_2$ = 50 or 100, but for strongly unequal group sizes, power becomes too low for the tests to be useful (Fig. 7d). We already know from Fig. 6a that for very small sample sizes, such as $n_1 + n_2$ = 20, there is a small inflation of type I error of the parametric and permutational t-tests in the case of equal group sizes; this is quickly compensated by the conservativeness of these tests in the case of unequal group sizes.

The second series of simulations, based upon power base. random deviates (skewed distribution), gave the following results. When the population variances are equal, only the permutational t-test has correct type I error for any combination of sample sizes and for all $n_1 + n_2$ = {20, 50 or 100} subjected to simulations (Fig. 7e); it also has good power (Fig. 7f). The test with Welch correction is too conservative for all combinations of sample sizes, and its power is lower than that of the permutational t-test. The parametric t-test has conservative type I error in most cases, but when the samples become strongly unequal in size, it has inflated type I error. In the area where the parametric t-test is valid (sample sizes equal or moderately unequal), its power is less than that of the permutational t-test. With unequal variances, the behavior of the three tests becomes erratic (e.g., Fig. 7g). When the tests are valid, they have poor power so that they are useless (Fig. 7h).

3. Recommendations

The recommendations that can be derived from our simulation results are complex. They are presented in tabular form (TABLE II). To summarize, the test with Welch correction is useful when the data are normal, sample sizes are small, and the variances are heterogeneous. Otherwise, use the parametric t-test for normal data, or the permutational t-test for skewed data. For heteroscedastic data that cannot be normalized, a nonparametric test should be used.
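The behavior of the Welch correction described at the beginning of this appendix can be checked numerically. The short Python sketch below computes the corrected degrees of freedom from the two sample variances and group sizes, using the standard Welch-Satterthwaite expression. The function names and the example values (variance 4, group sizes 10/10 and 5/15) are assumptions chosen for illustration only.

```python
import math


def welch_df(s1_sq, n1, s2_sq, n2):
    """Real-valued Welch-Satterthwaite degrees of freedom for two groups."""
    a = s1_sq / n1
    b = s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


def welch_df_floor(s1_sq, n1, s2_sq, n2):
    """Degrees of freedom rounded down to the next smaller integer."""
    return math.floor(welch_df(s1_sq, n1, s2_sq, n2))


# Equal variances and equal group sizes: the value is close to n1 + n2 - 2 = 18.
print(welch_df(4.0, 10, 4.0, 10))
# Equal variances but strongly unequal group sizes: the value falls well below 18.
print(welch_df(4.0, 5, 4.0, 15))
```

With equal variances, the second call returns far fewer degrees of freedom than $n_1 + n_2 - 2$, which is the source of the conservativeness reported above for strongly unequal sample sizes.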
TABLE I Recommended strategy for THVs and anova. References to the appropriate sections of the Results are given in brackets.

1. Are the data normal?
   Yes -> run parametric Bartlett or Box THV; avoid Cochran (sensitive only to a single high variance) and log-anova (low power if $n_j$ < 5) [3..] -> go to 2
   No -> go to 4
2. THV result:
   Variances homogeneous -> run parametric or permutational anova [3..].
   Variances heterogeneous -> homogenize variances [3..2] -> go to 3
3. Variance homogenization:
   Successful -> run parametric or permutational anova [3..].
   Unsuccessful -> choose alternative method, e.g. nonparametric anova.
4. Normalizing transformation of the data:
   Successful -> go to 1
   Unsuccessful -> go to 5
5. Distribution of data:
   Real, continuous, positively skewed -> go to 6
   Species abundances: data are null or positive, discrete, positively skewed -> go to 7
   Other distributions: not simulated in this study.
6. Sample size:
   $n_j$ < 5 -> no THV is appropriate. Type I error is correct only for log-anova [3..3.], but this test has very low power [3..3.2]. *
   $n_j$ is large -> Bartlett or Box tests can be used, but power is low [3..3.2]. -> go to 2
7. Sample size:
   $n_j$ < 5 -> type I error is correct only for log-anova [3.5.5.], but this test is unusable because of its very low power [ ]. * -> go to 8
   $n_j$ is large -> Bartlett or Box tests can be used [ ]. -> go to 2
8. Log-transform the data using y' = ln(y + 1), then:
   Use permutational Bartlett or Box test (which have correct type I error but slightly lower power) or their parametric form (slightly higher power but also sometimes slightly inflated type I error) [3.5., ] -> go to 2

* If anova is computed on data with skewed distribution without the results of a prior THV, a significant result may only be found if at least one of the means differs sufficiently from the others (Fig. 1f). Otherwise, use nonparametric anova.
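Several branches of TABLE I call for a permutational form of Bartlett's test. The following Python sketch shows one way such a test could be assembled from the textbook Bartlett statistic. The permutation scheme used here (shuffling residuals centred on their group means) is an assumption made for illustration and is not necessarily the scheme used in this study; the function names, the default of 999 permutations and the seed are likewise hypothetical.

```python
import math
import random
from statistics import mean, variance


def bartlett_statistic(groups):
    """Bartlett's statistic for homogeneity of variances among k groups."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    variances = [variance(g) for g in groups]
    if min(variances) <= 0.0:
        return math.inf  # a zero variance makes the logarithmic term diverge
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1.0 + (
        sum(1.0 / (n - 1) for n in sizes) - 1.0 / (n_total - k)
    ) / (3.0 * (k - 1))
    return numerator / correction


def permutational_bartlett(groups, n_perm=999, seed=0):
    """Return (statistic, permutational p-value) for Bartlett's test."""
    rng = random.Random(seed)
    b_obs = bartlett_statistic(groups)
    # Assumption: residuals around group means are exchangeable under H0.
    residuals = [y - mean(g) for g in groups for y in g]
    sizes = [len(g) for g in groups]
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(residuals)
        start, permuted = 0, []
        for n in sizes:
            permuted.append(residuals[start:start + n])
            start += n
        if bartlett_statistic(permuted) >= b_obs - 1e-12:
            hits += 1
    return b_obs, hits / (n_perm + 1)
```

Centring on group means removes differences in location before permuting, so that only differences in dispersion contribute to the reference distribution.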
TABLE II Recommendations for t-test of equality of two means.

1. Sample sizes equal?
   Yes -> go to 2
   No -> go to 6
2. Equal sample sizes. Distribution:
   Normal -> go to 3
   Skewed -> go to 5
3. Normal distributions. THV result:
   Variances homogeneous -> use any one of the 3 tests (simplest: parametric t-test).
   Variances unequal -> go to 4
4. Variances unequal. Sample size:
   Small -> use the t-test with Welch correction.
   Large -> use any one of the 3 tests (simplest: parametric t-test).
5. Skewed distributions. THV result:
   Variances homogeneous -> all 3 tests are valid, but the permutational t-test is preferable because it has correct type I error and the highest power.
   Variances unequal -> normalize the data or use a nonparametric test (Wilcoxon-Mann-Whitney test, median test, Kolmogorov-Smirnov two-sample test, etc.).
6. Unequal sample sizes. Distribution:
   Normal -> go to 7
   Skewed -> go to 8
7. Normal distributions. THV result:
   Variances homogeneous -> use the parametric or permutational t-tests (simplest: parametric t-test).
   Variances unequal -> use any one of the 3 tests (simplest: parametric t-test). Power is low when the sample sizes are strongly unequal; avoid the Welch-corrected t-test in the most extreme cases of sample size inequality (lower power).
8. Skewed distributions. THV result:
   Variances homogeneous -> use the permutational t-test.
   Variances unequal -> normalize the data or use a nonparametric test (Wilcoxon-Mann-Whitney test, median test, Kolmogorov-Smirnov two-sample test, etc.).
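The decision key of TABLE II can be written as a small function, which may help when the recommendations have to be applied to many data sets. This is only a sketch: the function name, the argument names and the wording of the returned strings are hypothetical, and the judgments "normal", "homogeneous" and "small" must still be made by the user.

```python
def recommend_t_test(equal_sizes, normal, homogeneous, small_samples=False):
    """Return the recommendation of the two-means decision key as a string."""
    if not normal:
        # Skewed data (steps 5 and 8 of the key).
        if not homogeneous:
            return "normalize the data or use a nonparametric test"
        return "permutational t-test"
    if equal_sizes:
        # Normal data, equal sample sizes (steps 3 and 4).
        if homogeneous or not small_samples:
            return "any of the three tests (simplest: parametric t-test)"
        return "t-test with Welch correction"
    # Normal data, unequal sample sizes (step 7).
    if homogeneous:
        return "parametric or permutational t-test"
    return ("any of the three tests (simplest: parametric t-test); "
            "avoid the Welch correction when sizes are extremely unequal")


# Example: normal data, equal but small samples, heterogeneous variances.
print(recommend_t_test(True, True, False, small_samples=True))
```

The `small_samples` argument only matters for normal data with equal group sizes and unequal variances, which is the single branch of the key where the Welch correction is preferred.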
Figure captions

FIGURE 1 Results of the ANOVA simulation study for normal [(a), (c) and (e)] and power base. [(b), (d) and (f)] data. (a) and (b): type I error and 95% confidence intervals (error bars) at α = 0.05, in the presence of various amounts of within-group population variance (abscissa). (c) and (d): type I error and 95% confidence intervals (error bars) at α = 0.05, in the presence of heterogeneous within-group variances. Simulations were run with three groups; the population variances are shown under the abscissa. At both ends of the graphs, results for equal variances are also shown for comparison. (e) and (f): power of ANOVA in the presence of homogeneous variances and various combinations of means (abscissa). (e) Normal distribution, variance = 5; (f) power base. distribution. Overlapping symbols have been offset horizontally to improve clarity. The 95% confidence error bars, which are closer than the size of the symbols, have been omitted.

FIGURE 2 Results of the THV simulation study for parametric [(a), (c), (e) and (g)] and permutational [(b), (d), (f) and (h)] tests for normal and power base. data and $n_j$ = 10. When they were closer than the size of the symbols, the 95% confidence error bars have been omitted. (a) and (b): type I error and 95% confidence intervals (error bars) of the four THVs, tested at α = 0.05, for normally distributed data. The simulations involved three groups from populations with equal variances (abscissa) and zero means. (c) and (d): type I error of the four THVs for the power base. distribution. The simulations involved three groups with equal population variances (abscissa). (e) and (f): power of the four THVs for normal data. The simulations involved three groups drawn from populations with zero means and variances shown under the abscissa. Overlapping symbols have been offset horizontally to improve clarity. (g) and (h): power of the four THVs for power base. data. The simulations involved three groups with equal means and variances shown under the abscissa. At both ends of the graphs, results for equal variances have been added for comparison. Overlapping symbols have been offset horizontally to improve clarity.

FIGURE 3 Effect of sample size on the four THVs for normal and power base. data. (a), (c) and (e): parametric tests; (b), (d) and (f): permutation tests. When they were closer than the size of the symbols, the 95% confidence error bars have been omitted. (a) and (b): type I error for power base. data. The within-group variances were equal in these simulations. (c) and (d): power for normal data. Simulations were run with three groups with unequal variances. (e) and (f): power for power base. data. Simulations were run with three groups with unequal variances. For the parametric tests, in all but the log-anova tests the results are drawn only for $n_j$ = 5, because for smaller values of $n_j$ the type I error of the tests is inflated.

FIGURE 4 Results of the THV simulation study for parametric [(a), (c) and (e)] and permutational [(b), (d) and (f)] tests for truncated or truncated and log-transformed power base. data and $n_j$ = 10. The 95% confidence error bars, which are closer than the size of the symbols, have been omitted. (a) and (b): type I error for truncated power base. data. Simulations were run with three groups with equal variances (abscissa) and equal means. (c) and (d): type I error for truncated and log-transformed power base. data. Simulations were run with three groups with equal variances (abscissa) and equal means. (e) and (f): power for truncated and log-transformed power base. data. Simulations were run with three groups with equal means and variances shown under the abscissa. At both ends of the graphs, results for equal variances have been added for comparison. Overlapping symbols have been offset horizontally for better clarity. In the parametric results, the lines connecting the log-anova symbols are dashed as a reminder that this test is unusable because of its incorrect type I error.

FIGURE 5 Effect of sample size on the four THVs on truncated or truncated and log-transformed power base. data. (a), (c) and (e): parametric tests; (b), (d) and (f): permutation tests. When they were closer than the size of the symbols, the 95% confidence error bars have been omitted. (a) and (b): type I error for truncated power base. data. (c) and (d): type I error for truncated and log-transformed power base. data. (e) and (f): power for truncated and log-transformed power base. data. Simulations were run with three groups with unequal variances.

FIGURE 6 (a) Type I error and 95% confidence intervals (error bars) for t-tests of difference between two group means, at α = 0.05, for two groups of normal data ($n_1 = n_2 = 10$) with equal means, as a function of the two population standard deviations ($\sigma_k$, abscissa). (b) Power: simulation results for the same data. (c, d) Same as (a, b) for power base. data. (e, f, g, h) Same as (a, b, c, d) using $n_1 = n_2 = 50$. When they were closer than the size of the symbols, the 95% confidence error bars have been omitted.

FIGURE 7 (a) Type I error and 95% confidence intervals (error bars) for t-tests of difference between two group means, at α = 0.05, for normal data with equal population standard deviations ($\sigma_j$), as a function of the sample sizes ($n_j$, abscissa). (b) Power: simulation results for the same data. (c, d) Same as (a, b) for unequal population standard deviations ($\sigma_j$). (e, f, g, h) Same as (a, b, c, d) using power base. data. When they were closer than the size of the symbols, the 95% confidence error bars have been omitted.
[Legendre & Borcard, Fig. 1. Panels (a) to (f) in two columns: normal data and power base. data. Abscissas: within-group variance (a, b); variances of groups 1, 2 and 3 (c, d); means of groups 1, 2 and 3 (e, f). Legend: parametric ANOVA, permutational ANOVA.]

[Legendre & Borcard, Fig. 2. Panels (a) to (h) in two columns: parametric THV and permutational THV. Abscissas: within-group variance (a to d); variances of groups 1, 2 and 3 (e to h). Legend: Bartlett, log-anova, Cochran C, Box M.]
More informationBackground to Statistics
FACT SHEET Background to Statistics Introduction Statistics include a broad range of methods for manipulating, presenting and interpreting data. Professional scientists of all kinds need to be proficient
More informationTopic 8. Data Transformations [ST&D section 9.16]
Topic 8. Data Transformations [ST&D section 9.16] 8.1 The assumptions of ANOVA For ANOVA, the linear model for the RCBD is: Y ij = µ + τ i + β j + ε ij There are four key assumptions implicit in this model.
More informationLecture 26. December 19, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.
s Sign s Lecture 26 Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University December 19, 2007 s Sign s 1 2 3 s 4 Sign 5 6 7 8 9 10 s s Sign 1 Distribution-free
More informationANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS
ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS Ravinder Malhotra and Vipul Sharma National Dairy Research Institute, Karnal-132001 The most common use of statistics in dairy science is testing
More informationCommunity surveys through space and time: testing the space-time interaction in the absence of replication
Community surveys through space and time: testing the space-time interaction in the absence of replication Pierre Legendre, Miquel De Cáceres & Daniel Borcard Département de sciences biologiques, Université
More informationPartial regression and variation partitioning
Partial regression and variation partitioning Pierre Legendre Département de sciences biologiques Université de Montréal http://www.numericalecology.com/ Pierre Legendre 2017 Outline of the presentation
More informationNAG Library Chapter Introduction. g08 Nonparametric Statistics
g08 Nonparametric Statistics Introduction g08 NAG Library Chapter Introduction g08 Nonparametric Statistics Contents 1 Scope of the Chapter.... 2 2 Background to the Problems... 2 2.1 Parametric and Nonparametric
More informationOutline. Possible Reasons. Nature of Heteroscedasticity. Basic Econometrics in Transportation. Heteroscedasticity
1/25 Outline Basic Econometrics in Transportation Heteroscedasticity What is the nature of heteroscedasticity? What are its consequences? How does one detect it? What are the remedial measures? Amir Samimi
More informationChapter 14: Repeated-measures designs
Chapter 14: Repeated-measures designs Oliver Twisted Please, Sir, can I have some more sphericity? The following article is adapted from: Field, A. P. (1998). A bluffer s guide to sphericity. Newsletter
More informationExperimental Design and Data Analysis for Biologists
Experimental Design and Data Analysis for Biologists Gerry P. Quinn Monash University Michael J. Keough University of Melbourne CAMBRIDGE UNIVERSITY PRESS Contents Preface page xv I I Introduction 1 1.1
More informationACTEX CAS EXAM 3 STUDY GUIDE FOR MATHEMATICAL STATISTICS
ACTEX CAS EXAM 3 STUDY GUIDE FOR MATHEMATICAL STATISTICS TABLE OF CONTENTS INTRODUCTORY NOTE NOTES AND PROBLEM SETS Section 1 - Point Estimation 1 Problem Set 1 15 Section 2 - Confidence Intervals and
More informationNonparametric Statistics. Leah Wright, Tyler Ross, Taylor Brown
Nonparametric Statistics Leah Wright, Tyler Ross, Taylor Brown Before we get to nonparametric statistics, what are parametric statistics? These statistics estimate and test population means, while holding
More informationAssessing Congruence Among Ultrametric Distance Matrices
Journal of Classification 26:103-117 (2009) DOI: 10.1007/s00357-009-9028-x Assessing Congruence Among Ultrametric Distance Matrices Véronique Campbell Université de Montréal, Canada Pierre Legendre Université
More informationUnit 14: Nonparametric Statistical Methods
Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 8/8/2003 Unit 14 - Stat 571 - Ramón V. León 1 Introductory Remarks Most methods studied so far have been based
More informationOne-way ANOVA. Experimental Design. One-way ANOVA
Method to compare more than two samples simultaneously without inflating Type I Error rate (α) Simplicity Few assumptions Adequate for highly complex hypothesis testing 09/30/12 1 Outline of this class
More information1 One-way Analysis of Variance
1 One-way Analysis of Variance Suppose that a random sample of q individuals receives treatment T i, i = 1,,... p. Let Y ij be the response from the jth individual to be treated with the ith treatment
More informationPresented to the Graduate Council of the. North Texas State University. in Partial. Fulfillment of the Requirements. For the Degree of.
AN EMPIRICAL INVESTIGATION OF TUKEY'S HONESTLY SIGNIFICANT DIFFERENCE TEST WITH VARIANCE HETEROGENEITY AND UNEQUAL SAMPLE SIZES, UTILIZING KRAMER'S PROCEDURE AND THE HARMONIC MEAN DISSERTATION Presented
More informationIENG581 Design and Analysis of Experiments INTRODUCTION
Experimental Design IENG581 Design and Analysis of Experiments INTRODUCTION Experiments are performed by investigators in virtually all fields of inquiry, usually to discover something about a particular
More informationAnalysis of 2x2 Cross-Over Designs using T-Tests
Chapter 234 Analysis of 2x2 Cross-Over Designs using T-Tests Introduction This procedure analyzes data from a two-treatment, two-period (2x2) cross-over design. The response is assumed to be a continuous
More informationAn Overview of the Performance of Four Alternatives to Hotelling's T Square
fi~hjf~~ G 1992, m-t~, 11o-114 Educational Research Journal 1992, Vol.7, pp. 110-114 An Overview of the Performance of Four Alternatives to Hotelling's T Square LIN Wen-ying The Chinese University of Hong
More informationA comparison study of the nonparametric tests based on the empirical distributions
통계연구 (2015), 제 20 권제 3 호, 1-12 A comparison study of the nonparametric tests based on the empirical distributions Hyo-Il Park 1) Abstract In this study, we propose a nonparametric test based on the empirical
More informationResearch Note: A more powerful test statistic for reasoning about interference between units
Research Note: A more powerful test statistic for reasoning about interference between units Jake Bowers Mark Fredrickson Peter M. Aronow August 26, 2015 Abstract Bowers, Fredrickson and Panagopoulos (2012)
More informationOrthogonal, Planned and Unplanned Comparisons
This is a chapter excerpt from Guilford Publications. Data Analysis for Experimental Design, by Richard Gonzalez Copyright 2008. 8 Orthogonal, Planned and Unplanned Comparisons 8.1 Introduction In this
More informationComparison of Two Samples
2 Comparison of Two Samples 2.1 Introduction Problems of comparing two samples arise frequently in medicine, sociology, agriculture, engineering, and marketing. The data may have been generated by observation
More informationIntroduction to Statistical Inference Lecture 10: ANOVA, Kruskal-Wallis Test
Introduction to Statistical Inference Lecture 10: ANOVA, Kruskal-Wallis Test la Contents The two sample t-test generalizes into Analysis of Variance. In analysis of variance ANOVA the population consists
More informationAnalysis of Multivariate Ecological Data
Analysis of Multivariate Ecological Data School on Recent Advances in Analysis of Multivariate Ecological Data 24-28 October 2016 Prof. Pierre Legendre Dr. Daniel Borcard Département de sciences biologiques
More informationTHE PAIR CHART I. Dana Quade. University of North Carolina. Institute of Statistics Mimeo Series No ~.:. July 1967
. _ e THE PAR CHART by Dana Quade University of North Carolina nstitute of Statistics Mimeo Series No. 537., ~.:. July 1967 Supported by U. S. Public Health Service Grant No. 3-Tl-ES-6l-0l. DEPARTMENT
More informationAN IMPROVEMENT TO THE ALIGNED RANK STATISTIC
Journal of Applied Statistical Science ISSN 1067-5817 Volume 14, Number 3/4, pp. 225-235 2005 Nova Science Publishers, Inc. AN IMPROVEMENT TO THE ALIGNED RANK STATISTIC FOR TWO-FACTOR ANALYSIS OF VARIANCE
More informationNonparametric statistic methods. Waraphon Phimpraphai DVM, PhD Department of Veterinary Public Health
Nonparametric statistic methods Waraphon Phimpraphai DVM, PhD Department of Veterinary Public Health Measurement What are the 4 levels of measurement discussed? 1. Nominal or Classificatory Scale Gender,
More informationStatistical Inference: Estimation and Confidence Intervals Hypothesis Testing
Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire
More informationData are sometimes not compatible with the assumptions of parametric statistical tests (i.e. t-test, regression, ANOVA)
BSTT523 Pagano & Gauvreau Chapter 13 1 Nonparametric Statistics Data are sometimes not compatible with the assumptions of parametric statistical tests (i.e. t-test, regression, ANOVA) In particular, data
More informationPractical Solutions to Behrens-Fisher Problem: Bootstrapping, Permutation, Dudewicz-Ahmed Method
Practical Solutions to Behrens-Fisher Problem: Bootstrapping, Permutation, Dudewicz-Ahmed Method MAT653 Final Project Yanjun Yan Syracuse University Nov. 22, 2005 Outline Outline 1 Introduction 2 Problem
More informationNONPARAMETRICS. Statistical Methods Based on Ranks E. L. LEHMANN HOLDEN-DAY, INC. McGRAW-HILL INTERNATIONAL BOOK COMPANY
NONPARAMETRICS Statistical Methods Based on Ranks E. L. LEHMANN University of California, Berkeley With the special assistance of H. J. M. D'ABRERA University of California, Berkeley HOLDEN-DAY, INC. San
More informationM A N O V A. Multivariate ANOVA. Data
M A N O V A Multivariate ANOVA V. Čekanavičius, G. Murauskas 1 Data k groups; Each respondent has m measurements; Observations are from the multivariate normal distribution. No outliers. Covariance matrices
More informationOctober 1, Keywords: Conditional Testing Procedures, Non-normal Data, Nonparametric Statistics, Simulation study
A comparison of efficient permutation tests for unbalanced ANOVA in two by two designs and their behavior under heteroscedasticity arxiv:1309.7781v1 [stat.me] 30 Sep 2013 Sonja Hahn Department of Psychology,
More informationIntroduction to Statistical Analysis
Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive
More informationGENERAL PROBLEMS OF METROLOGY AND MEASUREMENT TECHNIQUE
DOI 10.1007/s11018-017-1141-3 Measurement Techniques, Vol. 60, No. 1, April, 2017 GENERAL PROBLEMS OF METROLOGY AND MEASUREMENT TECHNIQUE APPLICATION AND POWER OF PARAMETRIC CRITERIA FOR TESTING THE HOMOGENEITY
More information