AN EMPIRICAL INVESTIGATION OF TUKEY'S HONESTLY SIGNIFICANT DIFFERENCE TEST WITH VARIANCE HETEROGENEITY AND EQUAL SAMPLE SIZES, UTILIZING BOX'S COEFFICIENT OF VARIANCE VARIATION


AN EMPIRICAL INVESTIGATION OF TUKEY'S HONESTLY SIGNIFICANT DIFFERENCE TEST WITH VARIANCE HETEROGENEITY AND EQUAL SAMPLE SIZES, UTILIZING BOX'S COEFFICIENT OF VARIANCE VARIATION

DISSERTATION

Presented to the Graduate Council of the North Texas State University in Partial Fulfillment of the Requirements

For the Degree of

DOCTOR OF PHILOSOPHY

By

Michael W. Strozeski, B.S., M.Ed.

Denton, Texas

May, 1980

1980 MICHAEL WAYNE STROZESKI ALL RIGHTS RESERVED

Strozeski, Michael Wayne, An Empirical Investigation of Tukey's Honestly Significant Difference Test with Variance Heterogeneity and Equal Sample Sizes, Utilizing Box's Coefficient of Variance Variation. Doctor of Philosophy (Educational Research), May, 1980, 145 pp., 50 tables, bibliography, 50 titles.

This study sought to determine boundary conditions for robustness of the Tukey HSD statistic when the assumption of homogeneity of variance was violated. Box's coefficient of variance variation, C, was utilized to index the degree of variance heterogeneity. Selected numbers of comparison groups and equal sample sizes were evaluated. Tukey's HSD statistic was declared robust if the actual significance level fell within the 95 per cent confidence limits around the corresponding nominal significance level. A Monte Carlo computer simulation technique was employed to generate data under controlled violation of the homogeneity of variance assumption. For each sample size and number of treatment groups condition, an analysis of variance F-test was computed, and Tukey's multiple comparison technique was calculated. This procedure was repeated 4,000 times; the actual level of significance was determined and compared to the nominal significance level of 0.05. The index of variance variation was systematically adjusted, and this procedure

was repeated until the C value was reached such that any increase in its value would produce an FWI error rate that exceeded the upper limit of the 95 per cent confidence interval about the 0.05 level of significance, thereby establishing a boundary for C.

On the basis of the synthesis and analysis of the generated data, the following conclusions were drawn. First, the Tukey HSD statistic was found to be generally robust when the violations of homogeneity of variances were of small magnitude. In all cases, however, as the value of C was increased from zero, a point was reached at which the Tukey HSD statistic was no longer robust and too many FWI errors were produced. Second, when either the violation of the homogeneity of variance assumption was more pronounced (C values were larger) or the number of treatment groups increased, discrepancies between the actual and nominal significance levels occurred. With larger numbers of treatment groups, Tukey's HSD was less robust. The boundary value for C decreased as the number of treatment groups increased. As C values were increased, FWI errors increased. Third, Tukey's HSD was found to be more robust with larger sample sizes. This trend was generally supported in all of the sample size groups proposed for this study. This conclusion was further supported by the addition of forty-eight and seventy-two sample size groups to the five treatment groups experiment. In both of these additional sample size

cases, the boundary C value was greatly increased by the larger sample sizes.

A fourth and final conclusion was reached. When the two additional sample size cases were added to investigate the large sample sizes, the Tukey test was found to be conservative when C was set at zero. The actual significance level fell below the lower limit of the 95 per cent confidence interval around the 0.05 nominal significance level. Apparently, large sample sizes decrease the likelihood of an FWI error but may increase the likelihood of a Type II error.
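The Monte Carlo procedure summarized above can be sketched in a modern scripting language. This is an illustrative reconstruction, not the dissertation's original program: the function name, the NumPy/SciPy dependencies, and the reduced number of experiments are assumptions, and the sketch applies the HSD test directly to the range of the group means rather than reproducing every step of the study.

```python
import numpy as np
from scipy.stats import studentized_range

def fwi_error_rate(sigmas, n, alpha=0.05, n_experiments=2000, seed=1):
    """Estimate the familywise Type I (FWI) error rate of Tukey's HSD
    when all population means are equal but the variances differ."""
    rng = np.random.default_rng(seed)
    k = len(sigmas)
    df_error = k * (n - 1)
    q_crit = studentized_range.ppf(1 - alpha, k, df_error)
    errors = 0
    for _ in range(n_experiments):
        # k groups of size n, equal means (0), possibly unequal sigmas
        groups = [rng.normal(0.0, s, n) for s in sigmas]
        means = np.array([g.mean() for g in groups])
        ms_error = np.mean([g.var(ddof=1) for g in groups])  # pooled, equal n
        hsd = q_crit * np.sqrt(ms_error / n)
        # an FWI error occurs if any pair of means is falsely declared different
        if means.max() - means.min() > hsd:
            errors += 1
    return errors / n_experiments
```

With homogeneous variances the estimate should hover near the nominal 0.05; raising the spread of `sigmas` (a larger Box C) should push it upward, which is the boundary-seeking behavior the study describes.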

TABLE OF CONTENTS

LIST OF TABLES

Chapter

I. INTRODUCTION
   Statement of the Problem
   Purpose of the Study
   Hypothesis
   Mathematical Model of Tukey's HSD Statistic
   Definition of Terms
   Delimitations
   Chapter Bibliography

II. SURVEY OF RELATED RESEARCH
   Chapter Bibliography

III. PROCEDURE FOR DATA COLLECTION
   Procedures for Producing Data
   Model Validation
   Statistical Tests of Pseudorandom Numbers
   Experiment Simulation Procedure
   Summary of Procedures
   Chapter Bibliography

IV. ANALYSIS OF DATA AND FINDINGS
   Part 1. k = Three Treatment Groups
   Part 2. k = Four Treatment Groups
   Part 3. k = Five Treatment Groups
   Part 4. k = Six Treatment Groups
   Part 5. k = Seven Treatment Groups
   Part 6. Larger Samples

V. SUMMARY, CONCLUSIONS, IMPLICATIONS, AND RECOMMENDATIONS
   Summary
   Conclusions
   Implications
   Recommendations
   Chapter Bibliography

APPENDIX A
APPENDIX B
APPENDIX C
APPENDIX D
APPENDIX E
APPENDIX F
BIBLIOGRAPHY

LIST OF TABLES

1. Actual Levels of Significance Under Conditions of Non-Violation of the Assumptions Underlying the Use of Tukey's HSD Statistic
2. Number of Treatment Groups and Size of Sample Per Experiment Condition
3. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=3 Groups with n=3 Observations in Each Group for Varying Degrees of Variance Variation, C
4. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=3 Groups with n=6 Observations in Each Group for Varying Degrees of Variance Variation, C
5. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=3 Groups with n=12 Observations in Each Group for Varying Degrees of Variance Variation, C
6. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=3 Groups with n=24 Observations in Each Group for Varying Degrees of Variance Variation, C
7. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=4 Groups with n=3 Observations in Each Group for Varying Degrees of Variance Variation, C
8. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=4 Groups with n=6 Observations in Each Group for Varying Degrees of Variance Variation, C
9. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=4 Groups with n=12 Observations in Each Group for Varying Degrees of Variance Variation, C
10. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=4 Groups with n=24 Observations in Each Group for Varying Degrees of Variance Variation, C
11. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=5 Groups with n=3 Observations in Each Group for Varying Degrees of Variance Variation, C
12. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=5 Groups with n=6 Observations in Each Group for Varying Degrees of Variance Variation, C
13. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=5 Groups with n=12 Observations in Each Group for Varying Degrees of Variance Variation, C
14. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=5 Groups with n=24 Observations in Each Group for Varying Degrees of Variance Variation, C
15. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=6 Groups with n=3 Observations in Each Group for Varying Degrees of Variance Variation, C
16. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=6 Groups with n=6 Observations in Each Group for Varying Degrees of Variance Variation, C
17. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=6 Groups with n=12 Observations in Each Group for Varying Degrees of Variance Variation, C
18. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=6 Groups with n=24 Observations in Each Group for Varying Degrees of Variance Variation, C
19. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=7 Groups with n=3 Observations in Each Group for Varying Degrees of Variance Variation, C
20. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=7 Groups with n=6 Observations in Each Group for Varying Degrees of Variance Variation, C
21. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=7 Groups with n=12 Observations in Each Group for Varying Degrees of Variance Variation, C
22. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=7 Groups with n=24 Observations in Each Group for Varying Degrees of Variance Variation, C
23. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=5 Groups with n=48 Observations in Each Group for Varying Degrees of Variance Variation, C
24. Comparison of Actual Significance Levels for Familywise Type I Error Rates to the Nominal 0.05 Significance Level in Simulated Experiments on the Tukey HSD Test for k=5 Groups with n=72 Observations in Each Group for Varying Degrees of Variance Variation, C
25. Degree of Variance Variation, C, Above which the Actual Significance Level Significantly Differed from the Nominal 0.05 Significance Level
26. Ninety-Five Per Cent Confidence Limits for a Proportion Corresponding to a Nominal Significance Level
27. Obtained Versus Expected Means and Variances for k=3 Treatment Groups with n=3 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
28. Obtained Versus Expected Means and Variances for k=3 Treatment Groups with n=6 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
29. Obtained Versus Expected Means and Variances for k=3 Treatment Groups with n=12 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
30. Obtained Versus Expected Means and Variances for k=3 Treatment Groups with n=24 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
31. Obtained Versus Expected Means and Variances for k=4 Treatment Groups with n=3 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
32. Obtained Versus Expected Means and Variances for k=4 Treatment Groups with n=6 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
33. Obtained Versus Expected Means and Variances for k=4 Treatment Groups with n=12 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
34. Obtained Versus Expected Means and Variances for k=4 Treatment Groups with n=24 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
35. Obtained Versus Expected Means and Variances for k=5 Treatment Groups with n=3 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
36. Obtained Versus Expected Means and Variances for k=5 Treatment Groups with n=6 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
37. Obtained Versus Expected Means and Variances for k=5 Treatment Groups with n=12 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
38. Obtained Versus Expected Means and Variances for k=5 Treatment Groups with n=24 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
39. Obtained Versus Expected Means and Variances for k=6 Treatment Groups with n=3 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
40. Obtained Versus Expected Means and Variances for k=6 Treatment Groups with n=6 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
41. Obtained Versus Expected Means and Variances for k=6 Treatment Groups with n=12 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
42. Obtained Versus Expected Means and Variances for k=6 Treatment Groups with n=24 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
43. Obtained Versus Expected Means and Variances for k=7 Treatment Groups with n=3 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
44. Obtained Versus Expected Means and Variances for k=7 Treatment Groups with n=6 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
45. Obtained Versus Expected Means and Variances for k=7 Treatment Groups with n=12 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
46. Obtained Versus Expected Means and Variances for k=7 Treatment Groups with n=24 Observations in Each Group and the Expected Value of α for Each Computer Run of 4,000 Experiments
47. Ten Per Cent Intervals for the Normal Distribution with a Mean of Zero and a Standard Deviation of One
48. Expected and Observed Frequencies of One Hundred Numbers in Ten Per Cent Intervals Corresponding to a Normal Distribution
49. A Summary of Variance Heterogeneity as Indexed by C and the Corresponding Ratio of Variances which Resulted in Familywise Type I Error Rates in Excess of the Nominal 0.05 Significance Level
50. Critical F max Values for Corresponding Degrees of Variance Variation

CHAPTER I

INTRODUCTION

A very frequent concern of educational researchers is determining whether or not k group means differ from one another. The analysis of variance (ANOVA) is often used to test whether or not sample means are indicative of experimental treatment effects or of merely chance variation. Experimenters usually follow a significant F-test in analysis of variance with a multiple comparison statistic when k is greater than two, because ANOVA indicates only the presence of overall treatment effects. Multiple comparison statistics enable the researcher to locate the specific mean differences which have caused the ANOVA F-test to be significant. Tukey's multiple comparison test is a frequently cited procedure when the researcher's multiple comparison hypotheses are for pairwise differences (Games, 1971; Keselman and Toothaker, 1974; Marascuilo, 1971). Tukey's multiple comparison test specifies the familywise Type I error rate at α for a family of tests on all possible pairs of means, allowing the error rate per comparison to decrease as k increases. According to Petrinovich and Hardyck (1969), little has been published on the characteristics and properties of Tukey's Honestly Significant

Difference (HSD) procedure. Petrinovich and Hardyck provided more information about Tukey's HSD procedure, but Games (1971) indicated that they provided only limited evidence and that further study is needed. Glass, Peckham, and Sanders (1972) indicated that the role of unequal variances in combination with equal sample sizes appears to have boundary conditions which have not been sufficiently probed. Agreeing with Glass, Peckham, and Sanders, Rogan, Keselman, and Breen (1977) state that data from their investigations indicate that the degree of variance heterogeneity may play some part in determining those boundary conditions. This study was designed to provide further evidence about the robustness of Tukey's HSD procedure.

Whenever populations differ with respect to variances and the means are equal, statistical tests designed to detect the mean difference can be influenced by the difference in variances. The statistical test may yield more or fewer significant results by chance than would be expected. Evidence that the results are influenced by the unequal variances is obtained when significant departures from the expected familywise Type I (FWI) error rate α are found when means are equal. Any study of robustness of a statistical procedure involves creating differences in parameters other than the parameter for which the statistical procedure was designed

to test a difference. The variable manipulated in this study was the population variance. This research was performed to determine the robustness of Tukey's HSD procedure in the presence of variance heterogeneity. Variance heterogeneity was indexed by use of Box's (1954) coefficient of variance variation.

Statement of the Problem

The problem of this study was the effect of violating the assumption of homogeneity of variance with equal sample sizes upon Tukey's Honestly Significant Difference (HSD) multiple comparison procedure, utilizing Box's coefficient to index the degree of variance variation.

Purpose of the Study

The purpose of this study was to empirically evaluate the effects of varying degrees of heterogeneity and equal sample sizes when the degree of variance variation was within a range of 0.00 to k−1, where k equals the number of treatment groups.

Hypothesis

The following hypothesis was formulated to carry out the purpose of this study [C = coefficient of variance variation which indexes the degree of heterogeneity; n = sample size; k = number of samples]:

Using Tukey's HSD procedure, actual significance levels will not differ significantly from nominal significance levels at the 0.05 level of significance when C has a value from 0.00 to k−1 for experimental conditions of n = 3, 6, 12, and 24, and k = 3, 4, 5, 6, and 7.

Mathematical Model of Tukey's HSD Statistic

Tukey's HSD statistic was mathematically defined by Kirk (1968, p. 88) as

    HSD = q_{α,v} √(MS_error / n)    (1)

where

HSD = the value to be exceeded for a comparison involving two means to be declared significant;

q_{α,v} = the value determined by entering a table of the percentage points of the studentized range with v degrees of freedom corresponding to the MS_error term degrees of freedom, the α level of significance, and the number of treatment groups in the experiment or range of levels in the experiment;

MS_error = an estimate taken from the one-way analysis of variance mean square within groups of the experiment;

n = the sample size of each group.
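Equation (1) can be evaluated directly from the studentized range distribution rather than a printed table. The helper below is an illustrative sketch (the function name and the SciPy dependency are assumptions), using v = k(n − 1) within-group degrees of freedom for a balanced one-way design.

```python
import math
from scipy.stats import studentized_range

def hsd_critical_value(ms_error, n, k, alpha=0.05):
    """HSD = q_{alpha, v} * sqrt(MS_error / n), where q_{alpha, v} is the
    upper-alpha point of the studentized range for k groups and
    v = k * (n - 1) within-group degrees of freedom."""
    v = k * (n - 1)
    q = studentized_range.ppf(1 - alpha, k, v)
    return q * math.sqrt(ms_error / n)
```

Any pairwise difference between group means larger than this value is declared significant at the chosen α level.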

If the difference between two group means exceeded the HSD value, then the result was declared significant at the given α level.

Definition of Terms

Actual Significance Level. The percentage of computed statistical values which exceed the tabled value of the statistic in an empirical investigation.

Coefficient of Variance Variation, C. The degree of heterogeneity present in an experimental paradigm as indexed by a coefficient of variance variation, C, in the formula

    C = √[ (1/k) Σ_{t=1}^{k} (σ_t − σ̄)² ] / σ̄    (2)

Familywise Error Rate. An error rate that is the ratio of the number of families with at least one statement (comparison) falsely declared significant to the total number of families.

Monte Carlo Simulation. A procedure in which random samples are drawn from populations having specified parameters, and then a given statistic is calculated.

Nominal Significance Level. The percentage of computed statistical values which exceed the tabled value of the statistic for the theoretical distribution.
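Under the reconstruction of formula (2) above, with σ̄ taken as the mean of the k population standard deviations σ_t, the index can be computed as follows; the function name is an illustrative assumption.

```python
import math

def box_c(sigmas):
    """Box's coefficient of variance variation:
    C = sqrt( sum_t (sigma_t - sigma_bar)^2 / k ) / sigma_bar."""
    k = len(sigmas)
    sigma_bar = sum(sigmas) / k
    spread = sum((s - sigma_bar) ** 2 for s in sigmas) / k
    return math.sqrt(spread) / sigma_bar
```

C is 0.00 under perfect homogeneity and grows as the σ_t spread apart, which is how the study grades the severity of the violation.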

Pseudorandom Numbers. Pseudorandom numbers are "pseudo" since, once the generating sequence is begun, each number is precisely determined by the preceding number. Pseudorandom numbers have the basic properties of randomness, which makes them quite usable in simulation studies (Lehman and Bailey, 1968). Hereafter in this study, pseudorandom numbers are referred to as random numbers.

Robust. When a violation of an assumption underlying a statistical model does not seriously affect the result, then that statistical model is said to be robust.

Significant Difference between Nominal and Actual Significance Levels. An actual significance level which fails to fall within a 95 per cent confidence interval about the nominal significance level is said to be statistically different from the nominal significance level.

Limitations

This study was subject to experimental limitations due to experimental conditions simulated with the following conditions:

1. A selected number (3 to 7) of treatment groups was considered.
2. Selected equal sample sizes were employed, varying from three to twenty-four.
3. Degrees of variance heterogeneity were selected, ranging from 0.00 to a possible maximum of k−1.
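The robustness criterion defined above treats the actual significance level as a binomial proportion over the 4,000 simulated experiments. A normal-approximation version of the 95 per cent interval can be sketched as follows; the exact interval construction used in the dissertation may differ, and the function name is illustrative.

```python
import math

def robustness_limits(nominal=0.05, n_experiments=4000, z=1.96):
    """95 per cent confidence limits about the nominal significance level,
    using the normal approximation to the binomial proportion."""
    se = math.sqrt(nominal * (1.0 - nominal) / n_experiments)
    return nominal - z * se, nominal + z * se

lower, upper = robustness_limits()  # roughly (0.0432, 0.0568)
```

An empirical FWI rate above the upper limit marks the statistic as non-robust for that condition, while a rate below the lower limit marks it as conservative, as reported for the largest samples.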

CHAPTER BIBLIOGRAPHY

Box, G. E. P. Some theorems on quadratic forms applied in the study of analysis of variance problems. I. Effect of inequality of variance in the one-way classification. Annals of Mathematical Statistics, 1954, 25, 290-302.

Games, Paul A. Multiple comparisons of means. American Educational Research Journal, 1971, 8(3).

Glass, G. V., Peckham, P. D., and Sanders, J. R. Consequences of failure to meet assumptions underlying the fixed effects analysis of variance and covariance. Review of Educational Research, 1972, 42(3).

Keselman, H. J., and Toothaker, L. E. Comparison of Tukey's t-method and Scheffe's s-method for various numbers of all possible differences of averages contrasts under violation of assumptions. Educational and Psychological Measurement, 1974, 34.

Kirk, Roger E. Experimental design: procedures for the behavioral sciences. Belmont, California: Brooks/Cole Publishing Company, 1968.

Lehman, R., and Bailey, D. E. Digital computing: Fortran IV and its applications in the behavioral sciences. New York: John Wiley and Sons, 1968.

Marascuilo, L. A. Statistical methods for behavioral science research. New York: McGraw-Hill, 1971.

Petrinovich, L. R., and Hardyck, C. D. Error rates for multiple comparison methods: some evidence concerning the frequency of erroneous conclusions. Psychological Bulletin, 1969, 71.

Rogan, J. C., Keselman, H. J., and Breen, L. J. Assumption violations and rates of Type I error for the Tukey multiple comparison test: a review and empirical investigation via a coefficient of variance variation. Journal of Experimental Education, 1977, 46(1), 20-25.

CHAPTER II

SURVEY OF RELATED RESEARCH

The effects of violating the assumptions underlying the fixed-effects analysis of variance (ANOVA) on the Type I error rate have been of great concern to educational researchers and statisticians since before 1930 (Pearson, 1929). For the most part, the major effects of violation of assumptions underlying ANOVA are now quite well known. Concern about whether or not ANOVA assumptions are satisfied is not unfounded. Assumptions of most mathematical models are almost always false to some extent. The important question to be asked is not whether these assumptions have been exactly met, but whether violations of these assumptions have had any serious effects on the probability statements that have been formulated based on the standard assumptions.

Applied statistics in education and the social sciences experienced a largely unnecessary hegira to non-parametric statistics during the 1950s. Increasingly during the 1950s and early 1960s the fixed effects, normal theory ANOVA was replaced by such comparable nonparametric techniques as the Wilcoxon test, Mann-Whitney U-test, Kruskal-Wallis one-way ANOVA, and the Friedman two-way ANOVA for ranks

[Siegel, 1956]. The change to non-parametrics was unnecessary primarily because researchers asked, 'Are normal theory ANOVA assumptions met?' instead of 'How important are the inevitable violations of normal theory ANOVA assumptions?' (Glass, Peckham, and Sanders, 1972, p. 237).

The following assumptions were made for the simple one-way fixed effects model ANOVA in this study:

1. X_ij = μ + τ_j + e_ij   (3)
2. e_ij ~ NID(0, σ²)   (4)
3. Σ_j τ_j = 0   (5)

The first assumption was that of additivity. Any observation was taken to be the simple sum of three components: first, μ, the population mean; second, τ_j, the effect of treatment j on the dependent variable for all of the observations in group j; and third, e_ij, the error of the i,jth observation. The second assumption was that the e_ij's have a normal distribution with a population mean of zero and a variance of σ², and that they were independent. According to Glass, Peckham, and Sanders (1972), the third assumption need be of little concern; it is merely a consequence of choosing to express X_ij in three terms (μ, τ_j, e_ij) instead of two, for example, μ_j = μ + τ_j and e_ij.

Three different violations of assumptions have been considered in the past: (a) non-normality, (b) different variances for different groups, and (c) non-independence. The thrust of this study was to investigate the (b) violation, i.e., heterogeneity of variances. Hsu (1938) was one of the first to obtain concise mathematical results in the study of the effects of heterogeneous variances. Hsu determined the actual significance level of a result tested at the 0.05 level for different values of the ratio of σ₁² to σ₂² in a two-tailed t test. Scheffe (1959) and Pratt (1964) addressed the same problem. Box (1954) studied the effect on alpha level of heterogeneous variances in the one-way ANOVA. One example of Box's findings was that if three treatments were compared with n₁ = 9, n₂ = 5, and n₃ = 1, and the population variances were in the ratio of 1:1:3, the probability of a Type I error was actually 0.17, when the experimenter would expect it to be 0.05. Of particular interest for this study was that Box's results agreed quite closely with those of Hsu. When n's were equal, the actual and the nominal significance levels agreed quite closely. Also of special interest for this study was the finding that with seven groups of n=3 (equal n's) and a variance ratio of 1:1:1:1:1:1:7, Box found an actual significance level of 0.12 when the nominal significance level of 0.05 was expected.
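Findings of the kind Hsu and Box derived analytically are easy to probe empirically today. The sketch below is an illustrative assumption (names, libraries, and run counts are not from the dissertation); it estimates the one-way F-test's Type I error rate for a chosen pattern of group sizes and standard deviations.

```python
import numpy as np
from scipy.stats import f_oneway

def f_test_type1_rate(ns, sigmas, alpha=0.05, reps=2000, seed=2):
    """Empirical Type I error of the one-way ANOVA F-test with equal
    population means, heterogeneous variances, and arbitrary group sizes."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        groups = [rng.normal(0.0, s, n) for n, s in zip(ns, sigmas)]
        if f_oneway(*groups).pvalue < alpha:
            rejections += 1
    return rejections / reps

# variances in the ratio 1:1:3 correspond to sigmas of 1, 1, sqrt(3)
```

With equal n the rate stays near 0.05, as Hsu and Box reported; coupling unequal n with unequal variances moves it away from the nominal level.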

One of the most significant and comprehensive studies was made by Dee W. Norton at the State University of Iowa in 1952 (Lindquist, 1956). From Norton's investigations, it appeared that marked heterogeneity of variance has a small but real effect on the form of the F-distribution. Kohr and Games (1974) indicated that the F-test was robust with regard to heterogeneity of variance with equal sample sizes but more susceptible to error when sample sizes are unequal. The F-test and the analysis of variance have been investigated (Atiqullah, 1962; Norton, 1952 [found in Lindquist]; Pearson, 1931; and Scheffe, 1959), with the conclusion that they have a high degree of robustness. The result of robustness for the analysis of variance has precipitated similar questions concerning assumptions underlying multiple comparison procedures. Hypotheses about mean differences from a set of k means (k>2) may provide a situation that requires the use of some multiple comparison technique. According to Kirk (1968), the analysis of variance is equivalent to a simultaneous test of the hypothesis that all possible comparisons among means are equal to zero.... If an over-all test of significance using an F-ratio is significant, an experimenter can be certain that some set of orthogonal comparisons contains at least one significant comparison among means.... It remains for an experimenter to carry out follow-up tests [multiple comparisons] to determine what has happened (p. 73).

One solution for determining the location of a significant difference was to use multiple t tests; but according to Games (1971), although this procedure has been found to be powerful, it allowed the familywise (FWI) error rate to increase as the number of t tests increased, sometimes resulting in an unacceptable error rate. In order to locate significant differences in means without producing a high FWI rate, other multiple comparison procedures have been developed. Games (1971) reported twelve different multiple comparison procedures. Games discussed and compared the multiple t test, Scheffe's least significant difference test, Bonferroni's t statistic, Tukey's procedure, and Dunnett's test. Games also reviewed sequential multiple comparison techniques, including the Newman-Keuls test and Duncan's multiple range test. Evidence that multiple comparison procedures are controversial topics in statistics was presented by Petrinovich and Hardyck (1969) when they stated that Textbook authors, at least in the area of psychological statistics, have not been particularly helpful. Authors such as Edwards [1960], Federer [1955], Hays [1963], McNemar [1955], and Winer [1962] either offer no evaluation as to which method is preferable, or preface their remarks with a cautionary statement to the effect that mathematical statisticians are not entirely in agreement concerning the preferred

method. Similarly, disagreement exists as to when these methods may be used. Some discussions state that a significant F ratio over all conditions must be obtained before multiple comparison methods can be used; other discussions make no mention of such a requirement, or deny that it is necessary at all (p. 44).

Hopkins and Chadbourn (1967) [found in Games, 1971, p. 559] suggested that the overall F-test be routinely run first; then, if it is found to be significant, a multiple comparison procedure should follow. According to them, this second stage should be the Bonferroni t procedure, the Newman-Keuls, the Tukey wholly significant difference test (WSD), or the Scheffe, depending on certain factors. According to Games (1971), there seems to be little point in applying the overall F-test prior to running C contrasts by procedures that set P(EI>0) ≤ alpha (method 3 and the Bonferroni t's). If the C contrasts express the experimental interests directly, they are justified whether the overall F is significant or not, and P(EI>0) is still controlled. The Newman-Keuls and WSD also control P(EI>0), so do not need a significant F to justify them (p. 560). Here, Games used the symbols P(EI>0) to represent the familywise risk of Type I error. The familywise rate was the risk of making one or more Type I errors in the entire set of contrasts that comprise a family.
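The inflation Games describes for unchecked multiple t tests can be quantified with the usual independence approximation: for m comparisons each run at per-comparison level α, the familywise rate approaches 1 − (1 − α)^m. The helper below is only a guide (pairwise t tests on the same data are correlated, so independence is an assumption), but it shows why the familywise risk grows quickly with k.

```python
def fwi_independent(alpha, m):
    """Familywise Type I error for m independent tests, each at level alpha."""
    return 1.0 - (1.0 - alpha) ** m

# k groups imply m = k*(k-1)/2 pairwise comparisons
for k in (3, 5, 7):
    m = k * (k - 1) // 2
    print(f"k={k}: m={m}, FWI ~ {fwi_independent(0.05, m):.3f}")
```

For k = 3, 5, and 7 this yields roughly 0.14, 0.40, and 0.66, which is why procedures such as Tukey's fix the familywise rate at α instead of letting it drift upward per comparison.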

Tukey's multiple comparison test has been a frequently cited procedure when the researcher's multiple comparison hypotheses are for pairwise differences (Games, 1971; Keselman and Toothaker, 1974; Marascuilo, 1971). Evidence of interest in Tukey's HSD procedure has been presented in papers published in education, psychology, and statistics journals (Howell and Games, 1973a, 1973b; Keselman, Murray, and Rogan, 1976; Keselman, Toothaker, and Shooter, 1975; Petrinovich and Hardyck, 1969; Steel and Torrie, 1966). For the most part, these papers have investigated the effects of the violation of the assumptions under which the Tukey test was derived. The importance of these studies has been related to the validity of the use of the Tukey test in actual educational situations, because these actual educational situations seldom, if ever, meet the assumptions under which the Tukey test was developed. Just as in the case of the ANOVA F-test, Tukey's HSD test was derived under the assumptions that the observations of each of the populations under study are independently and normally distributed with equal variances. Further, Tukey's HSD method was derived under the restriction that the variances of the sample means be equal; hence, each sample mean must be based on an equal number of observations. When the requirement of equal sample sizes cannot be met, several unequal-n forms of the Tukey procedure have been suggested. Winer (1962, p. 101) suggested that the estimated variance, S²/n, should be replaced with the average of the

variances of the means when sample sizes do not differ a great deal. Steel and Torrie (1966, p. 114) suggested the use of the Kramer method with the Tukey test. Kramer's method employs only the sample sizes of the means actually involved in the simple contrast. Miller (1966, p. 48) suggested the use of an average or median value of the group sizes as an approximate value of n. Smith (1971) compared Kramer's method, Winer's harmonic mean, and Miller's unequal n forms of the Tukey test for unequal sample sizes under conditions of homogeneous population variances. Smith recommended the use of the Kramer method. Keselman, Murray, and Rogan (1976) reported that the Tukey test did not have to be restricted to comparisons having equal n's. They recommended Kramer's unequal n procedure. Howell and Games (1973) investigated the robustness of the harmonic mean form of the Tukey test under conditions of unequal sample sizes coupled with various patterns of population variance heterogeneity. They found that when the smallest sample size was selected from the population with the smallest variance, and the largest sample size was selected from the population with the largest variance, the Tukey test was conservative; i.e., the empirical significance level was less than the nominal significance level. When the smallest sample size was sampled from the population with the largest variance, and the largest sample size was sampled from the population with the smallest variance, the Tukey test was found to be liberal; i.e., the empirical significance level was found to

be greater than the nominal significance level. Petrinovich and Hardyck (1969) and Keselman and Toothaker (1974) examined the robustness of the harmonic mean form of the Tukey test and reported results similar to those of Howell and Games (1973). Also, the Tukey test was found to be robust to conditions of non-normality. Ramseyer and Tcheng (1973) investigated three multiple comparison procedures that make use of the studentized range statistic, q: the Tukey HSD test, the Newman-Keuls test, and the Duncan multiple range test. They studied the effect of assumption violations on the Type I error rates of these three procedures. In Ramseyer and Tcheng's investigation, homogeneity of variance was violated with variance ratios of (a) 1:1:2 [k=3], (b) 1:1:4 [k=3], (c) 1:1:1:2:2 [k=5], and (d) 1:1:1:4:4 [k=5]. Normality was violated with populations that were positively and negatively exponentially skewed and rectangularly distributed. A combination of violations of the normality assumption and the homogeneous variance assumption was also studied. Ramseyer and Tcheng concluded that q is robust to violation of homogeneity of variance and normality. They also reported that violation of normality produced Type I error rates lower than nominal levels. Carmer and Swanson (1978) used computer simulation techniques to study the Type I and Type III error rates for ten pairwise multiple comparison procedures, including the Tukey

statistic. Their results indicated that Scheffé's test, Tukey's test, and Newman-Keuls' test were less appropriate than a restricted least-significant-difference (LSD) test, some Bayesian modifications of the LSD, and Duncan's multiple range test. Carmer and Swanson (1978) stated that the inferiority of Scheffé's test, Tukey's test, and Student-Newman-Keuls' test was even more apparent with sets of ten and twenty treatments. This was, according to them, due to the critical values of these procedures being dependent on the number of treatments. Keselman, Toothaker, and Shooter (1975) studied the harmonic mean and the Kramer unequal n forms of the Tukey HSD statistic. In their study, unequal sample sizes and unequal variances were combined in varied patterns that included normal and skewed population shapes and population variances in the ratios of (a) 1:1:4:4, (b) 1:1:1:2, (c) 1:.5:.5:4, and (d) 1:2:3:4. Their findings indicated a close agreement between the two unequal n forms of the Tukey statistic. Both methods were adversely affected when unequal sample sizes were combined with unequal variances in the ways reported by Howell and Games (1973), Petrinovich and Hardyck (1969), and Keselman and Toothaker (1974). Keselman and Rogan (1978) investigated five modifications of Tukey's statistic and compared them with Scheffé's test in controlling Type I errors and sensitivity to unequal sample sizes, variance heterogeneity, and sampling from non-normal

populations. They utilized a coefficient of variance variation to index the degree of variance heterogeneity. All of their investigations used k=4 groups, and sample sizes varied from a low of sixteen to a high of eighty-nine. Keselman and Rogan reported that a Games and Howell (1976) modification of Tukey's test controlled the Type I error rate at or below the nominal level for all conditions they investigated. Keselman and Rogan selected values of 0.0, 0.40, 0.80, and 1.00 for values of C (Keselman and Rogan's index of variance variation was C, not C²), since they felt this selection of C values represented those likely to be encountered in actual research. Based on the results of their investigation, Keselman and Rogan recommended the Games and Howell modification of the Tukey multiple comparison test for pairwise comparisons of means. According to Winer (1971, p. 198), there are two popular versions of the Tukey multiple comparison procedure. Winer labeled the more popular of the two procedures Tukey A. The Tukey A procedure has frequently been labeled Tukey's Honestly Significant Difference test (Winer, 1971; Kirk, 1968; Games, 1971). Tukey A has also been known as the T-method (Glass and Stanley, 1970; Scheffé, 1959) and as the WSD test (Games, 1971). Apparently, Games and Kirk do not agree that the WSD test and the HSD test are one and the same, because Kirk states, "The WSD test merits consideration but is more complex than the HSD test" (1968, p. 90). Therefore, Kirk has indicated

that the HSD and the WSD are two different procedures. The form of the statistic utilized in this investigation is that found in Kirk (1968, p. 88):

HSD = q_{α;v} √(MS_error / n)    (6)

HSD is the value that must be exceeded in order for a comparison involving two means to be declared significant. The value of q_{α;v} is determined by entering a table of the studentized range distribution with the v degrees of freedom that correspond to the MS_error term's degrees of freedom and an α level of significance. Another factor that determines q is the number of treatment levels in the experiment. MS_error is an estimate taken from the one-way analysis of variance mean square within groups of the experiment. Group sample size is designated by n, and the number of treatment levels is designated by k. Tukey's Honestly Significant Difference (HSD) test was designed to make all pairwise comparisons among means (Kirk, 1968). According to Winer (1971), in 1953 Tukey extended an approach originally suggested by Fisher to control the familywise Type I (FWI) error rate. It was this procedure that has been called the HSD test. The basic assumptions of the HSD test are normality, homogeneity of variance, randomization, and equal sample sizes (Kirk, 1968, p. 88). Ryan (1959) introduced two general issues involving multiple comparisons. These were a priori versus a posteriori

comparisons and the concept of error rate. According to Ryan, an a priori test is one in which "the experimenter states in advance all possible conclusions and the rules by which these conclusions will be drawn" (p. 38). A posteriori tests are those which are suggested by the data. These types of tests have been known as data snooping or as post-mortem comparisons. Ryan indicated that there were several types of error rates, but Kirk (1968) has defined six kinds of error rates: (a) error rate per comparison, (b) error rate per hypothesis, (c) error rate per experiment, (d) error rate experimentwise, (e) error rate per family, and (f) error rate familywise. "It should be noted that the various error rates are all identical for an experiment involving a single comparison. The error rates become more divergent as the number of comparisons and hypotheses evaluated in an experiment are increased" (Kirk, 1968, p. 83). The error rate conceptualized for the HSD test was "familywise." In the one-dimensional case, "per family" and "per experiment," and "familywise" and "experimentwise," are equivalent terms (Ryan, 1959). Therefore, in the one-way analysis of variance, Tukey's terms "family" and "familywise" took on the simpler definitions of "experiment" and "experimentwise." According to Kirk (1968, p. 84), error rate per experiment (i.e., per family in the one-dimensional case) was defined as

(number of comparisons falsely declared significant) / (total number of experiments).
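Kirk's per-experiment and experimentwise error rates can be estimated directly by simulation. The sketch below is my own illustration, not code from the study; the group count, group size, and the use of uncorrected pairwise t tests are arbitrary choices made only to show how the two rates diverge when more than one comparison is run per experiment:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
k, n, reps, alpha = 4, 10, 2000, 0.05

false_significances = 0   # numerator of the per-experiment rate
flawed_experiments = 0    # numerator of the experimentwise rate

for _ in range(reps):
    # All k population means are equal, so every rejection is a Type I error.
    groups = [rng.normal(0.0, 1.0, n) for _ in range(k)]
    errors = sum(
        stats.ttest_ind(groups[i], groups[j]).pvalue < alpha
        for i in range(k) for j in range(i + 1, k)
    )
    false_significances += errors
    flawed_experiments += (errors > 0)

per_experiment = false_significances / reps   # comparisons falsely significant / experiments
experimentwise = flawed_experiments / reps    # experiments with >= 1 false significance / experiments
print(f"per experiment: {per_experiment:.3f}   experimentwise: {experimentwise:.3f}")
```

As Kirk observes, the two rates coincide for a single comparison but diverge as comparisons multiply: here the per-experiment rate runs near 6 × 0.05, and the experimentwise rate, though smaller, still far exceeds the per-comparison 0.05.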

Error rate experimentwise (familywise in the one-dimensional case) was defined as (p. 84)

(number of experiments with at least one statement falsely declared significant) / (total number of experiments).

Kirk concluded that

. . . it should be observed that once an experimenter has specified an error rate and has decided on an appropriate conceptual unit for error rate, he can compute the corresponding rate for any other conceptual unit. Basically, the problem facing an experimenter is that of choosing, prior to the conduct of an experiment, a test statistic that provides the kind of protection desired (p. 86).

Much research has been conducted on the robustness of the F-test and multiple comparison procedures under violation of the assumption of homogeneity of variance. For the most part, the research supports the conclusion that when sample sizes are equal, the F-test and multiple comparison procedures are robust. According to Box (1954), "It appears that if the groups are equal, moderate inequality of variance does not seriously affect the test" (p. 98). However, "moderate inequality of variance" was not specifically defined. Box's results under extreme conditions (k=7, equal n's, variance ratio = 1:1:1:1:1:1:7, nominal alpha = 0.05, empirical alpha = 0.12) indicated that the question has not been

fully investigated. Therefore, the focus of this study was to investigate the robustness of the Tukey HSD procedure under conditions of equal sample sizes and heterogeneous variances. In 1972, Glass, Peckham, and Sanders stated the following:

Whatever the cause, we find it significant to note that subsequent investigators have not extended Box's work in the direction of this curious finding. The conventional conclusion that heterogeneous variances are not important when n's are equal seems to have boundary conditions like all other conclusions in this area, and the boundary conditions may have not been sufficiently probed (p. 45).

In 1977, Rogan, Keselman, and Breen reported:

Of special interest was the finding that large degrees of variance heterogeneity produced liberal Type I error rates even in the presence of equal sample sizes. Although Box found serious distortions in the Type I error of the ANOVA F-test under similar conditions, this finding is contrary to the conventional conclusion that heterogeneous variances are not important when sample sizes are equal. The authors agree with Glass, Peckham, and Sanders in that this conclusion regarding the role of unequal variances in combination with equal sample sizes appears to have boundary conditions which have not been sufficiently probed. The data

from this investigation suggests that the degree of variance heterogeneity may play a role in determining these boundary conditions (p. 5).

Box (1954) developed a method for indexing the degree of heterogeneity by a coefficient of variance variation, symbolized by C, where

C = [Σ_{t=1}^{k} v_t(σ_t² − σ̄²)²] / [(N−k)(σ̄²)²]    (7)

and σ̄² = Σ v_t σ_t² / Σ v_t is the weighted mean of the k variances, v_t = n_t − 1 represents the degrees of freedom associated with each of the k variances, k represents the number of treatment groups, and N represents the total number of observations. Rogan, Keselman, and Breen (1977) demonstrated that very different ratios of unequal variances and unequal sample sizes may be identical with respect to their degree of heterogeneity, or C value. For example, consider the two sets of sample sizes and variances presented on the following page.

Case A: n_t = 24, 32, 36, 40, 48, 60; σ_t² = .05, .20, .35, .50, .60, 1.30; σ² ratios = 1:4:7:10:12:26.
Case B: n_t = 6, 12, 14, 16, 24, 48; σ_t² = .64, .64, .64, .64, .64, 2.79; σ² ratios = 1:1:1:1:1:4.35.

Both of the above cases involve very different ratios of unequal variances, yet they are similar with respect to their degree of heterogeneity as indexed by Box's coefficient of variance variation. Though different ratios of variances have been manipulated in other studies, the degree of variance heterogeneity may in some cases not have been varied. Also, a simpler form of the C equation for equal n's was derived by Box (1954):

C = (1/k) Σ_{t=1}^{k} (σ_t² − σ̄²)² / (σ̄²)²

C is the variance of the variances divided by the square of the mean variance. If the variances range from a lower value σ² to an upper value aσ² (where a is a coefficient of σ² and

a > 1), then the largest possible value for C is attained when k−1 of the variances are equal to σ² and the remaining variance is equal to aσ². In this case,

C = (k−1)(a−1)² / (a−1+k)²    (8)

Values of C greater than one, or at most two, probably would be extremely rare in reality (Box, 1954). This study was limited to values of C less than k−1. Tamhane (1979) used Box's coefficient of variation as a measure of unbalance in the values of

var(x̄_i) = τ_i² = σ_i²/n_i    (i = 1, ..., k).    (9)

Tamhane indicated that although Keselman and Rogan (1978) had used this index for measuring variance variation, he believed that τ was a more relevant parameter in his study than was σ. The purpose of this study was to further investigate the question regarding the effects of variance heterogeneity and equal sample sizes by utilizing Box's coefficient of variance variation to index heterogeneity, in order to determine whether boundary conditions existed where the Tukey HSD procedure was no longer robust. Results of this investigation should provide researchers in the behavioral sciences with additional information regarding the proper use of the Tukey HSD statistic.
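The procedure just described — indexing heterogeneity with Box's C and estimating the empirical significance level of the Tukey HSD test by Monte Carlo sampling — can be sketched as follows. This is an illustration written for this summary, not the study's actual program; the variance pattern shown (Box's 1:1:1:1:1:1:7 case with an assumed n = 3 per group) and the replication count are my own choices:

```python
import numpy as np
from scipy.stats import studentized_range

def box_C(ns, variances):
    """Box's coefficient of variance variation, equation (7)."""
    v = np.asarray(ns, float) - 1.0            # v_t = n_t - 1
    s2 = np.asarray(variances, float)
    mean_s2 = np.sum(v * s2) / np.sum(v)       # weighted mean variance
    return np.sum(v * (s2 - mean_s2) ** 2) / (np.sum(v) * mean_s2 ** 2)

def empirical_alpha(variances, n, reps=4000, alpha=0.05, seed=1):
    """Proportion of null experiments in which Tukey's HSD declares at
    least one pairwise difference significant (the familywise rate)."""
    rng = np.random.default_rng(seed)
    k = len(variances)
    q_crit = studentized_range.ppf(1 - alpha, k, k * (n - 1))
    sds = np.sqrt(np.asarray(variances, float))
    rejections = 0
    for _ in range(reps):
        data = rng.normal(0.0, sds, size=(n, k))    # equal means: H0 true
        means = data.mean(axis=0)
        ms_error = data.var(axis=0, ddof=1).mean()  # one-way ANOVA MS within
        hsd = q_crit * np.sqrt(ms_error / n)        # equation (6)
        rejections += (means.max() - means.min() > hsd)
    return rejections / reps

pattern = [1, 1, 1, 1, 1, 1, 7]                     # Box's extreme case, k = 7
print(f"C = {box_C([3] * 7, pattern):.2f}")
print(f"empirical alpha = {empirical_alpha(pattern, n=3):.3f}")
```

With homogeneous variances the same function recovers an empirical level near the nominal 0.05; systematically raising C while holding n fixed then traces out the boundary conditions the study set out to probe.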

CHAPTER BIBLIOGRAPHY

Atiqullah, M. The robustness of the covariance analysis of a one-way classification. Biometrika, 1964, 51.

Box, G. E. P. Some theorems on quadratic forms applied in the study of analysis of variance problems. I. Effect of inequality of variance in the one-way classification. Annals of Mathematical Statistics, 1954, 25, 290-302.

Carmer, S. G., and Swanson, M. R. An evaluation of ten pairwise multiple comparison procedures by Monte Carlo methods. Journal of the American Statistical Association, 1978, 6.

Games, P. A. Multiple comparisons of means. American Educational Research Journal, 1971, 8(3).

Games, P. A. Inverse relation between the risks of Type I and Type II errors and suggestions for the unequal n case in multiple comparisons. Psychological Bulletin, 1971, 75(2).

Games, P. A., and Howell, J. F. Pairwise multiple comparison procedures with unequal n's and/or variances: A Monte Carlo study. Journal of Educational Statistics, 1976, 1.

Glass, G. V., Peckham, P. D., and Sanders, J. R. Consequences of failure to meet assumptions underlying the fixed

effects analysis of variance and covariance. Review of Educational Research, 1972, 42(3), 237-288.

Glass, G. V., and Stanley, J. C. Statistical methods in education and psychology. Englewood Cliffs, N. J.: Prentice-Hall, 1970.

Howell, J. F., and Games, P. A. The effects of variance heterogeneity on simultaneous multiple comparison procedures with equal sample size. Paper presented at the American Educational Research Association Convention, February, 1973. (ERIC document ED .) (a)

Howell, J. F., and Games, P. A. The robustness of the analysis of variance and the Tukey WSD test under various patterns of heterogeneous variances. Journal of Experimental Education, 1973, 41(4). (b)

Hsu, P. L. Contributions to the theory of Student's t-test as applied to the problem of two samples. Statistical Research Memoirs, 1938, II, 1-24.

Keselman, H. J., Murray, R., and Rogan, J. Effect of very unequal group sizes on Tukey's multiple comparison test. Educational and Psychological Measurement, 1976, 36.

Keselman, H. J., and Rogan, J. C. A comparison of the modified Tukey and Scheffé methods of multiple comparisons for pairwise contrasts. Journal of the American Statistical Association, 1978, 73(361), 47-52.


HYPOTHESIS TESTING II TESTS ON MEANS. Sorana D. Bolboacă HYPOTHESIS TESTING II TESTS ON MEANS Sorana D. Bolboacă OBJECTIVES Significance value vs p value Parametric vs non parametric tests Tests on means: 1 Dec 14 2 SIGNIFICANCE LEVEL VS. p VALUE Materials and

More information

Introduction to Analysis of Variance (ANOVA) Part 2

Introduction to Analysis of Variance (ANOVA) Part 2 Introduction to Analysis of Variance (ANOVA) Part 2 Single factor Serpulid recruitment and biofilms Effect of biofilm type on number of recruiting serpulid worms in Port Phillip Bay Response variable:

More information

H0: Tested by k-grp ANOVA

H0: Tested by k-grp ANOVA Pairwise Comparisons ANOVA for multiple condition designs Pairwise comparisons and RH Testing Alpha inflation & Correction LSD & HSD procedures Alpha estimation reconsidered H0: Tested by k-grp ANOVA Regardless

More information

NONPARAMETRICS. Statistical Methods Based on Ranks E. L. LEHMANN HOLDEN-DAY, INC. McGRAW-HILL INTERNATIONAL BOOK COMPANY

NONPARAMETRICS. Statistical Methods Based on Ranks E. L. LEHMANN HOLDEN-DAY, INC. McGRAW-HILL INTERNATIONAL BOOK COMPANY NONPARAMETRICS Statistical Methods Based on Ranks E. L. LEHMANN University of California, Berkeley With the special assistance of H. J. M. D'ABRERA University of California, Berkeley HOLDEN-DAY, INC. San

More information

TWO-FACTOR AGRICULTURAL EXPERIMENT WITH REPEATED MEASURES ON ONE FACTOR IN A COMPLETE RANDOMIZED DESIGN

TWO-FACTOR AGRICULTURAL EXPERIMENT WITH REPEATED MEASURES ON ONE FACTOR IN A COMPLETE RANDOMIZED DESIGN Libraries Annual Conference on Applied Statistics in Agriculture 1995-7th Annual Conference Proceedings TWO-FACTOR AGRICULTURAL EXPERIMENT WITH REPEATED MEASURES ON ONE FACTOR IN A COMPLETE RANDOMIZED

More information

Inferences About the Difference Between Two Means

Inferences About the Difference Between Two Means 7 Inferences About the Difference Between Two Means Chapter Outline 7.1 New Concepts 7.1.1 Independent Versus Dependent Samples 7.1. Hypotheses 7. Inferences About Two Independent Means 7..1 Independent

More information

Examining Multiple Comparison Procedures According to Error Rate, Power Type and False Discovery Rate

Examining Multiple Comparison Procedures According to Error Rate, Power Type and False Discovery Rate Journal of Modern Applied Statistical Methods Volume 11 Issue 2 Article 7 11-1-2012 Examining Multiple Comparison Procedures According to Error Rate, Power Type and False Discovery Rate Guven Ozkaya Uludag

More information

BIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES

BIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES BIOL 458 - Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES PART 1: INTRODUCTION TO ANOVA Purpose of ANOVA Analysis of Variance (ANOVA) is an extremely useful statistical method

More information

Application of Variance Homogeneity Tests Under Violation of Normality Assumption

Application of Variance Homogeneity Tests Under Violation of Normality Assumption Application of Variance Homogeneity Tests Under Violation of Normality Assumption Alisa A. Gorbunova, Boris Yu. Lemeshko Novosibirsk State Technical University Novosibirsk, Russia e-mail: gorbunova.alisa@gmail.com

More information

On Selecting Tests for Equality of Two Normal Mean Vectors

On Selecting Tests for Equality of Two Normal Mean Vectors MULTIVARIATE BEHAVIORAL RESEARCH, 41(4), 533 548 Copyright 006, Lawrence Erlbaum Associates, Inc. On Selecting Tests for Equality of Two Normal Mean Vectors K. Krishnamoorthy and Yanping Xia Department

More information

Basic Statistical Analysis

Basic Statistical Analysis indexerrt.qxd 8/21/2002 9:47 AM Page 1 Corrected index pages for Sprinthall Basic Statistical Analysis Seventh Edition indexerrt.qxd 8/21/2002 9:47 AM Page 656 Index Abscissa, 24 AB-STAT, vii ADD-OR rule,

More information

STAT22200 Spring 2014 Chapter 5

STAT22200 Spring 2014 Chapter 5 STAT22200 Spring 2014 Chapter 5 Yibi Huang April 29, 2014 Chapter 5 Multiple Comparisons Chapter 5-1 Chapter 5 Multiple Comparisons Note the t-tests and C.I. s are constructed assuming we only do one test,

More information

Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami

Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric Assumptions The observations must be independent. Dependent variable should be continuous

More information

Introduction. Chapter 8

Introduction. Chapter 8 Chapter 8 Introduction In general, a researcher wants to compare one treatment against another. The analysis of variance (ANOVA) is a general test for comparing treatment means. When the null hypothesis

More information

Chapter 6 Planned Contrasts and Post-hoc Tests for one-way ANOVA

Chapter 6 Planned Contrasts and Post-hoc Tests for one-way ANOVA Chapter 6 Planned Contrasts and Post-hoc Tests for one-way NOV Page. The Problem of Multiple Comparisons 6-. Types of Type Error Rates 6-. Planned contrasts vs. Post hoc Contrasts 6-7 4. Planned Contrasts

More information

COMPARING SEVERAL MEANS: ANOVA

COMPARING SEVERAL MEANS: ANOVA LAST UPDATED: November 15, 2012 COMPARING SEVERAL MEANS: ANOVA Objectives 2 Basic principles of ANOVA Equations underlying one-way ANOVA Doing a one-way ANOVA in R Following up an ANOVA: Planned contrasts/comparisons

More information

INTRODUCTION TO INTERSECTION-UNION TESTS

INTRODUCTION TO INTERSECTION-UNION TESTS INTRODUCTION TO INTERSECTION-UNION TESTS Jimmy A. Doi, Cal Poly State University San Luis Obispo Department of Statistics (jdoi@calpoly.edu Key Words: Intersection-Union Tests; Multiple Comparisons; Acceptance

More information

3 Joint Distributions 71

3 Joint Distributions 71 2.2.3 The Normal Distribution 54 2.2.4 The Beta Density 58 2.3 Functions of a Random Variable 58 2.4 Concluding Remarks 64 2.5 Problems 64 3 Joint Distributions 71 3.1 Introduction 71 3.2 Discrete Random

More information

Contents. Acknowledgments. xix

Contents. Acknowledgments. xix Table of Preface Acknowledgments page xv xix 1 Introduction 1 The Role of the Computer in Data Analysis 1 Statistics: Descriptive and Inferential 2 Variables and Constants 3 The Measurement of Variables

More information

Multiple Comparison Procedures for Trimmed Means. H.J. Keselman, Lisa M. Lix and Rhonda K. Kowalchuk. University of Manitoba

Multiple Comparison Procedures for Trimmed Means. H.J. Keselman, Lisa M. Lix and Rhonda K. Kowalchuk. University of Manitoba 1 Multiple Comparison Procedures for Trimmed Means by H.J. Keselman, Lisa M. Lix and Rhonda K. Kowalchuk University of Manitoba Abstract Stepwise multiple comparison procedures (MCPs) based on least squares

More information

1 One-way Analysis of Variance

1 One-way Analysis of Variance 1 One-way Analysis of Variance Suppose that a random sample of q individuals receives treatment T i, i = 1,,... p. Let Y ij be the response from the jth individual to be treated with the ith treatment

More information

Analysis of variance (ANOVA) Comparing the means of more than two groups

Analysis of variance (ANOVA) Comparing the means of more than two groups Analysis of variance (ANOVA) Comparing the means of more than two groups Example: Cost of mating in male fruit flies Drosophila Treatments: place males with and without unmated (virgin) females Five treatments

More information

Unit 14: Nonparametric Statistical Methods

Unit 14: Nonparametric Statistical Methods Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 8/8/2003 Unit 14 - Stat 571 - Ramón V. León 1 Introductory Remarks Most methods studied so far have been based

More information

Increasing Power in Paired-Samples Designs. by Correcting the Student t Statistic for Correlation. Donald W. Zimmerman. Carleton University

Increasing Power in Paired-Samples Designs. by Correcting the Student t Statistic for Correlation. Donald W. Zimmerman. Carleton University Power in Paired-Samples Designs Running head: POWER IN PAIRED-SAMPLES DESIGNS Increasing Power in Paired-Samples Designs by Correcting the Student t Statistic for Correlation Donald W. Zimmerman Carleton

More information

Chapter 15: Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics

Chapter 15: Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics Understand Difference between Parametric and Nonparametric Statistical Procedures Parametric statistical procedures inferential procedures that rely

More information

4/6/16. Non-parametric Test. Overview. Stephen Opiyo. Distinguish Parametric and Nonparametric Test Procedures

4/6/16. Non-parametric Test. Overview. Stephen Opiyo. Distinguish Parametric and Nonparametric Test Procedures Non-parametric Test Stephen Opiyo Overview Distinguish Parametric and Nonparametric Test Procedures Explain commonly used Nonparametric Test Procedures Perform Hypothesis Tests Using Nonparametric Procedures

More information

Basic Business Statistics, 10/e

Basic Business Statistics, 10/e Chapter 1 1-1 Basic Business Statistics 11 th Edition Chapter 1 Chi-Square Tests and Nonparametric Tests Basic Business Statistics, 11e 009 Prentice-Hall, Inc. Chap 1-1 Learning Objectives In this chapter,

More information

An Overview of the Performance of Four Alternatives to Hotelling's T Square

An Overview of the Performance of Four Alternatives to Hotelling's T Square fi~hjf~~ G 1992, m-t~, 11o-114 Educational Research Journal 1992, Vol.7, pp. 110-114 An Overview of the Performance of Four Alternatives to Hotelling's T Square LIN Wen-ying The Chinese University of Hong

More information

Statistics for Managers Using Microsoft Excel Chapter 10 ANOVA and Other C-Sample Tests With Numerical Data

Statistics for Managers Using Microsoft Excel Chapter 10 ANOVA and Other C-Sample Tests With Numerical Data Statistics for Managers Using Microsoft Excel Chapter 10 ANOVA and Other C-Sample Tests With Numerical Data 1999 Prentice-Hall, Inc. Chap. 10-1 Chapter Topics The Completely Randomized Model: One-Factor

More information

One-Way Analysis of Covariance (ANCOVA)

One-Way Analysis of Covariance (ANCOVA) Chapter 225 One-Way Analysis of Covariance (ANCOVA) Introduction This procedure performs analysis of covariance (ANCOVA) with one group variable and one covariate. This procedure uses multiple regression

More information

3. Nonparametric methods

3. Nonparametric methods 3. Nonparametric methods If the probability distributions of the statistical variables are unknown or are not as required (e.g. normality assumption violated), then we may still apply nonparametric tests

More information

NAG Library Chapter Introduction. G08 Nonparametric Statistics

NAG Library Chapter Introduction. G08 Nonparametric Statistics NAG Library Chapter Introduction G08 Nonparametric Statistics Contents 1 Scope of the Chapter.... 2 2 Background to the Problems... 2 2.1 Parametric and Nonparametric Hypothesis Testing... 2 2.2 Types

More information

Statistics and Measurement Concepts with OpenStat

Statistics and Measurement Concepts with OpenStat Statistics and Measurement Concepts with OpenStat William Miller Statistics and Measurement Concepts with OpenStat William Miller Urbandale, Iowa USA ISBN 978-1-4614-5742-8 ISBN 978-1-4614-5743-5 (ebook)

More information

APPLICATION AND POWER OF PARAMETRIC CRITERIA FOR TESTING THE HOMOGENEITY OF VARIANCES. PART IV

APPLICATION AND POWER OF PARAMETRIC CRITERIA FOR TESTING THE HOMOGENEITY OF VARIANCES. PART IV DOI 10.1007/s11018-017-1213-4 Measurement Techniques, Vol. 60, No. 5, August, 2017 APPLICATION AND POWER OF PARAMETRIC CRITERIA FOR TESTING THE HOMOGENEITY OF VARIANCES. PART IV B. Yu. Lemeshko and T.

More information

http://www.statsoft.it/out.php?loc=http://www.statsoft.com/textbook/ Group comparison test for independent samples The purpose of the Analysis of Variance (ANOVA) is to test for significant differences

More information

Intuitive Biostatistics: Choosing a statistical test

Intuitive Biostatistics: Choosing a statistical test pagina 1 van 5 < BACK Intuitive Biostatistics: Choosing a statistical This is chapter 37 of Intuitive Biostatistics (ISBN 0-19-508607-4) by Harvey Motulsky. Copyright 1995 by Oxfd University Press Inc.

More information

MATH Notebook 3 Spring 2018

MATH Notebook 3 Spring 2018 MATH448001 Notebook 3 Spring 2018 prepared by Professor Jenny Baglivo c Copyright 2010 2018 by Jenny A. Baglivo. All Rights Reserved. 3 MATH448001 Notebook 3 3 3.1 One Way Layout........................................

More information

INFLUENCE OF USING ALTERNATIVE MEANS ON TYPE-I ERROR RATE IN THE COMPARISON OF INDEPENDENT GROUPS ABSTRACT

INFLUENCE OF USING ALTERNATIVE MEANS ON TYPE-I ERROR RATE IN THE COMPARISON OF INDEPENDENT GROUPS ABSTRACT Mirtagioğlu et al., The Journal of Animal & Plant Sciences, 4(): 04, Page: J. 344-349 Anim. Plant Sci. 4():04 ISSN: 08-708 INFLUENCE OF USING ALTERNATIVE MEANS ON TYPE-I ERROR RATE IN THE COMPARISON OF

More information

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

One-Way ANOVA. Some examples of when ANOVA would be appropriate include: One-Way ANOVA 1. Purpose Analysis of variance (ANOVA) is used when one wishes to determine whether two or more groups (e.g., classes A, B, and C) differ on some outcome of interest (e.g., an achievement

More information

Two-Sample Inferential Statistics

Two-Sample Inferential Statistics The t Test for Two Independent Samples 1 Two-Sample Inferential Statistics In an experiment there are two or more conditions One condition is often called the control condition in which the treatment is

More information

THE PRINCIPLES AND PRACTICE OF STATISTICS IN BIOLOGICAL RESEARCH. Robert R. SOKAL and F. James ROHLF. State University of New York at Stony Brook

THE PRINCIPLES AND PRACTICE OF STATISTICS IN BIOLOGICAL RESEARCH. Robert R. SOKAL and F. James ROHLF. State University of New York at Stony Brook BIOMETRY THE PRINCIPLES AND PRACTICE OF STATISTICS IN BIOLOGICAL RESEARCH THIRD E D I T I O N Robert R. SOKAL and F. James ROHLF State University of New York at Stony Brook W. H. FREEMAN AND COMPANY New

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Hypothesis Testing. Hypothesis: conjecture, proposition or statement based on published literature, data, or a theory that may or may not be true

Hypothesis Testing. Hypothesis: conjecture, proposition or statement based on published literature, data, or a theory that may or may not be true Hypothesis esting Hypothesis: conjecture, proposition or statement based on published literature, data, or a theory that may or may not be true Statistical Hypothesis: conjecture about a population parameter

More information

TEST POWER IN COMPARISON DIFFERENCE BETWEEN TWO INDEPENDENT PROPORTIONS

TEST POWER IN COMPARISON DIFFERENCE BETWEEN TWO INDEPENDENT PROPORTIONS TEST POWER IN COMPARISON DIFFERENCE BETWEEN TWO INDEPENDENT PROPORTIONS Mehmet MENDES PhD, Associate Professor, Canakkale Onsekiz Mart University, Agriculture Faculty, Animal Science Department, Biometry

More information

Chapter Fifteen. Frequency Distribution, Cross-Tabulation, and Hypothesis Testing

Chapter Fifteen. Frequency Distribution, Cross-Tabulation, and Hypothesis Testing Chapter Fifteen Frequency Distribution, Cross-Tabulation, and Hypothesis Testing Copyright 2010 Pearson Education, Inc. publishing as Prentice Hall 15-1 Internet Usage Data Table 15.1 Respondent Sex Familiarity

More information

Psicológica ISSN: Universitat de València España

Psicológica ISSN: Universitat de València España Psicológica ISSN: 0211-2159 psicologica@uv.es Universitat de València España Zimmerman, Donald W.; Zumbo, Bruno D. Hazards in Choosing Between Pooled and Separate- Variances t Tests Psicológica, vol. 30,

More information

Analysis of Variance (ANOVA)

Analysis of Variance (ANOVA) Analysis of Variance (ANOVA) Two types of ANOVA tests: Independent measures and Repeated measures Comparing 2 means: X 1 = 20 t - test X 2 = 30 How can we Compare 3 means?: X 1 = 20 X 2 = 30 X 3 = 35 ANOVA

More information

Preface Introduction to Statistics and Data Analysis Overview: Statistical Inference, Samples, Populations, and Experimental Design The Role of

Preface Introduction to Statistics and Data Analysis Overview: Statistical Inference, Samples, Populations, and Experimental Design The Role of Preface Introduction to Statistics and Data Analysis Overview: Statistical Inference, Samples, Populations, and Experimental Design The Role of Probability Sampling Procedures Collection of Data Measures

More information

Comparison of Two Samples

Comparison of Two Samples 2 Comparison of Two Samples 2.1 Introduction Problems of comparing two samples arise frequently in medicine, sociology, agriculture, engineering, and marketing. The data may have been generated by observation

More information

Non-parametric (Distribution-free) approaches p188 CN

Non-parametric (Distribution-free) approaches p188 CN Week 1: Introduction to some nonparametric and computer intensive (re-sampling) approaches: the sign test, Wilcoxon tests and multi-sample extensions, Spearman s rank correlation; the Bootstrap. (ch14

More information

Outline. Topic 19 - Inference. The Cell Means Model. Estimates. Inference for Means Differences in cell means Contrasts. STAT Fall 2013

Outline. Topic 19 - Inference. The Cell Means Model. Estimates. Inference for Means Differences in cell means Contrasts. STAT Fall 2013 Topic 19 - Inference - Fall 2013 Outline Inference for Means Differences in cell means Contrasts Multiplicity Topic 19 2 The Cell Means Model Expressed numerically Y ij = µ i + ε ij where µ i is the theoretical

More information

Contents Kruskal-Wallis Test Friedman s Two-way Analysis of Variance by Ranks... 47

Contents Kruskal-Wallis Test Friedman s Two-way Analysis of Variance by Ranks... 47 Contents 1 Non-parametric Tests 3 1.1 Introduction....................................... 3 1.2 Advantages of Non-parametric Tests......................... 4 1.3 Disadvantages of Non-parametric Tests........................

More information

DESIGN AND ANALYSIS OF EXPERIMENTS Third Edition

DESIGN AND ANALYSIS OF EXPERIMENTS Third Edition DESIGN AND ANALYSIS OF EXPERIMENTS Third Edition Douglas C. Montgomery ARIZONA STATE UNIVERSITY JOHN WILEY & SONS New York Chichester Brisbane Toronto Singapore Contents Chapter 1. Introduction 1-1 What

More information

Inferential Statistics

Inferential Statistics Inferential Statistics Eva Riccomagno, Maria Piera Rogantin DIMA Università di Genova riccomagno@dima.unige.it rogantin@dima.unige.it Part G Distribution free hypothesis tests 1. Classical and distribution-free

More information