TESTS FOR MEAN EQUALITY THAT DO NOT REQUIRE HOMOGENEITY OF VARIANCES: DO THEY REALLY WORK?


H. J. Keselman, University of Manitoba, Winnipeg, Manitoba, Canada R3T 2N2
Rand R. Wilcox, University of Southern California, Los Angeles, California
Jason Taylor, University of Manitoba, Winnipeg, Manitoba, Canada R3T 2N2
Rhonda K. Kowalchuk, University of Manitoba, Winnipeg, Manitoba, Canada R3T 2N2

Key Words: Tests for Mean Equality; Variance Heterogeneity; Nonnormality; Monte Carlo; Robust Estimators

ABSTRACT

Tests for mean equality proposed by Weerahandi (1995) and Chen and Chen (1998), tests that do not require equality of population variances, were examined when data were not only heterogeneous but also nonnormal in unbalanced completely randomized designs. Furthermore, these tests were compared to a test examined by Lix and Keselman (1998), a test that uses a heteroscedastic statistic (i.e., Welch, 1951) with robust estimators (20% trimmed means and Winsorized variances). Our findings confirmed previously published data that the tests are indeed robust to variance heterogeneity when the data are obtained from normal populations. However, the Weerahandi (1995) and Chen and Chen (1998) tests were not found to be robust when data were obtained from nonnormal populations. Indeed, rates of Type I error were typically in excess of 10% and, at times, exceeded 50%. On the other hand, the

statistic presented by Lix and Keselman (1998) was generally robust to variance heterogeneity and nonnormality.

1. INTRODUCTION

The Behrens-Fisher problem (see Fisher, 1935) refers to the problem of testing for mean equality in the presence of variance heterogeneity. This problem was originally discussed within the context of a two-group layout but has also been extended to the many-group layout. For example, Welch (1951), James (1951, 1954), and Brown and Forsythe (1974), among others (see Gamage & Weerahandi, 1998; Lix & Keselman, 1995), have presented approximate test statistics for testing mean equality when there are more than two groups and when population variances are not presumed to be equal. Popular methods (e.g., Welch) have been found to be generally robust to variance heterogeneity under normality, but not when data are nonnormal (see Lix & Keselman, 1995).

For completeness we note that, in addition to the popular approximate methods, other solutions to the problem have been presented. For example, transformations of the data, nonparametric tests, and tests based on robust estimators (e.g., trimmed means and Winsorized variances) have also been proposed. Unfortunately, these procedures have not proven to be uniformly successful in controlling test size ($\alpha$) when data are heterogeneous as well as nonnormal, particularly in unbalanced designs.

Two recent solutions to this problem have been presented by Weerahandi (1995) and Chen and Chen (1998). These authors have derived test statistics which test for mean equality without requiring that population variances be equal. Thus, researchers may be able to use either of these procedures to test for mean equality and be confident that the test size will not be distorted by heterogeneous variances, a condition believed to characterize applied data (see Wilcox, 1997). Unfortunately, the data presented regarding the operating characteristics of these two test statistics are extremely limited.
Gamage and Weerahandi (1998)

presented Type I error results indicating that the Weerahandi (1995) procedure is robust to nonnormality and variance heterogeneity in unbalanced designs. However, they only investigated a one-way design containing three treatment groups in which there were a limited number of unequal sample sizes and variances for one type of nonnormal distribution (gamma). Chen and Chen (1998) in their investigation only report power data for their test. Accordingly, the purpose of our investigation was to examine in detail the test statistics presented by these authors. In addition, we compared these procedures to a test examined by Lix and Keselman (1998); namely, a heteroscedastic statistic (i.e., Welch, 1951) that uses trimmed means and Winsorized variances, as suggested by Yuen (1974).

2. DEFINITION OF THE TEST STATISTICS

Suppose $n_j$ independent random observations $X_{1j}, X_{2j}, \ldots, X_{n_j j}$ are sampled from population $j$ ($j = 1, \ldots, J$). We assume that the $X_{ij}$s ($i = 1, \ldots, n_j$; $\sum_j n_j = N$) are obtained from a normal population with mean $\mu_j$ and unknown variance $\sigma_j^2$, with $\sigma_j^2 \neq \sigma_{j'}^2$ ($j \neq j'$). Then, let $\bar{X}_j = \sum_i X_{ij}/n_j$ and $s_j^2 = \sum_i (X_{ij} - \bar{X}_j)^2/n_j$ [Gamage and Weerahandi (1998) defined the sample variance with $n_j - 1$ in the denominator, while Weerahandi (1995) used $n_j$; to replicate the Gamage and Weerahandi findings, however, the denominator needed to be $n_j$]. The usual less-than-full-rank model

$$X_{ij} = \mu + \alpha_j + \epsilon_{ij}$$

can be applied to the problem at hand, where the $\epsilon_{ij}$s are assumed to be independent random variables with $\epsilon_{ij} \sim N(0, \sigma_j^2)$ and $\sum_{j=1}^{J} \alpha_j = 0$. Thus the null hypothesis can be expressed as either $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_J$ or $H_0: \mu_1 = \mu_2 = \cdots = \mu_J$.

2.1 Generalized F-Test (Weerahandi, 1995). According to Weerahandi his generalized F-test is carried out by determining a generalized p-value

which is then compared to the nominal significance level to determine whether the null hypothesis of mean equality can be rejected or not. To determine the generalized p-value one first computes a standardized between-group sum of squares, $\tilde{S}_b$, where

$$\tilde{S}_b = \tilde{S}_b(\sigma_1^2, \ldots, \sigma_J^2) = \sum_{j=1}^{J} \frac{n_j \bar{X}_j^2}{\sigma_j^2} - \frac{\left(\sum_{j=1}^{J} n_j \bar{X}_j/\sigma_j^2\right)^2}{\sum_{j=1}^{J} n_j/\sigma_j^2}. \qquad (2.1.1)$$

The generalized p-value is calculated to be $p = 1 - q$, where

$$q = E\left( H_{J-1,\,N-J}\left\{ \frac{N-J}{J-1}\, \tilde{s}_b\left[ \frac{n_1 s_1^2}{B_1 B_2 \cdots B_{J-1}},\ \frac{n_2 s_2^2}{(1-B_1) B_2 \cdots B_{J-1}},\ \frac{n_3 s_3^2}{(1-B_2) B_3 \cdots B_{J-1}},\ \ldots,\ \frac{n_J s_J^2}{1-B_{J-1}} \right] \right\} \right), \qquad (2.1.2)$$

and $H_{J-1,\,N-J}$ is the cdf of the F distribution with $J - 1$ and $N - J$ degrees of freedom and the expectation is taken with respect to independent Beta random variables

$$B_k \sim \mathrm{Beta}\left( \sum_{i=1}^{k} \frac{n_i - 1}{2},\ \frac{n_{k+1} - 1}{2} \right), \quad k = 1, 2, \ldots, J-1. \qquad (2.1.3)$$

According to Weerahandi (1995) the p-value can be computed by numerical integration with respect to the Beta random variables or through Monte Carlo methods. He points out that when the number of simulations is large the mean of the probabilities will approximate the expected value well. Interested readers can find a derivation of the method in Weerahandi.

2.2 The Chen and Chen (1998) Method. The statistic presented by Chen and Chen (1998) is an exact single-stage analysis of variance type procedure (as opposed to two-stage procedures--see Bishop and Dudewicz, 1978), which under the null hypothesis of the distribution of the

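test, is completely free of the unknown variances. (Chen and Chen, p. 644) Again assuming the previously defined model, this procedure uses the first $n_j - 1$ (where $n_j \geq 3$) observations to define the sample mean ($\bar{X}_j$) and variance ($\tilde{s}_j^2$), i.e.,

$$\bar{X}_j = \sum_{i=1}^{n_j - 1} X_{ij}/(n_j - 1), \quad \text{and} \quad \tilde{s}_j^2 = \sum_{i=1}^{n_j - 1} (X_{ij} - \bar{X}_j)^2/(n_j - 2).$$

Weights for the observations are defined as

$$U_j = \frac{1}{n_j} + \frac{1}{n_j}\sqrt{\frac{1}{n_j - 1}\left[\tilde{s}_{(m)}^2/\tilde{s}_j^2 - 1\right]}, \qquad V_j = \frac{1}{n_j} - \frac{1}{n_j}\sqrt{(n_j - 1)\left[\tilde{s}_{(m)}^2/\tilde{s}_j^2 - 1\right]}, \qquad (2.2.1)$$

where $\tilde{s}_{(m)}^2$ is the maximum of $\tilde{s}_1^2, \ldots, \tilde{s}_J^2$. Finally, a weighted sample mean is calculated as

$$\tilde{X}_j = \sum_{i=1}^{n_j} W_{ij} X_{ij}, \qquad (2.2.2)$$

where $W_{ij} = U_j$ for $1 \leq i \leq n_j - 1$ and $W_{ij} = V_j$ for $i = n_j$, and where $U_j$ and V_j satisfy the following equations

$$(n_j - 1)U_j + V_j = 1,$$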

$$(n_j - 1)U_j^2 + V_j^2 = \frac{\tilde{s}_{(m)}^2}{n_j \tilde{s}_j^2}. \qquad (2.2.3)$$

Chen and Chen (1998) indicate that the transformation

$$t_j = \frac{\tilde{X}_j - \mu_j}{\tilde{s}_j \sqrt{\sum_{i=1}^{n_j} W_{ij}^2}}$$

has a conditional normal distribution with mean zero and variance $\sigma_j^2/\tilde{s}_j^2$. They also show (p. 646) that the conditional normal distributions of the $t_j$s, given the $\tilde{s}_j$s, are unconditional and independent Student t variables with $n_j - 2$ degrees of freedom. An equivalent version of $t_j$, given Equation 2.2.3, is

$$t_j = \frac{\tilde{X}_j - \mu_j}{\tilde{s}_{(m)}/\sqrt{n_j}}.$$

To test $H_0$ Chen and Chen (1998) suggest the statistic

$$\tilde{F}^1 = \sum_{j=1}^{J} \left( \frac{\tilde{X}_j - \tilde{X}_{.}}{\tilde{s}_{(m)}/\sqrt{n_j}} \right)^2, \qquad (2.2.4)$$

where $\tilde{X}_{.} = \sum_{j=1}^{J} \tilde{X}_j/J$. According to Chen and Chen one would reject $H_0$ when $\tilde{F}^1 > \tilde{F}^1_{\alpha, J, n}$, the upper $\alpha$ percentage point of the null distribution of $\tilde{F}^1$ (based on a balanced design). A SAS (SAS, Version 6.1) computer program can be obtained from the authors to obtain critical values for both balanced and unbalanced designs.

2.3 Lix and Keselman's (1998) procedure. Lix and Keselman (1998) and Wilcox, Keselman and Kowalchuk (1998) have shown how to obtain a robust test of location equality in unbalanced one-way layouts when the

underlying data are neither normal in form nor possessing equal variability. The heteroscedastic statistic used by Lix and Keselman (1998) and Wilcox, Keselman, and Kowalchuk (1998) is due to Welch (1951). The statistic can be defined as

$$F = \frac{\sum_{j=1}^{J} w_j (\bar{X}_j - \tilde{X})^2/(J - 1)}{1 + \dfrac{2(J - 2)}{J^2 - 1} \sum_{j=1}^{J} \dfrac{(1 - w_j/W)^2}{n_j - 1}}, \qquad (2.3.1)$$

where $w_j = n_j/s_j^2$, $\tilde{X} = \sum_{j=1}^{J} w_j \bar{X}_j/W$, $W = \sum_{j=1}^{J} w_j$, $\bar{X}_j = \sum_i X_{ij}/n_j$ and $s_j^2 = \sum_i (X_{ij} - \bar{X}_j)^2/(n_j - 1)$, where $\bar{X}_j$ is the estimate of $\mu_j$ and $s_j^2$ is the usual unbiased estimate of the variance for population $j$. The test statistic is approximately distributed as an F variate and is referred to the critical value $F[(1 - \alpha); (J - 1), \nu]$, the $(1 - \alpha)$-centile of the F distribution, where error degrees of freedom are obtained from

$$\nu = \frac{J^2 - 1}{3 \sum_{j=1}^{J} \dfrac{(1 - w_j/W)^2}{n_j - 1}}. \qquad (2.3.2)$$

Yuen (1974) initially suggested that trimmed means and variances based on Winsorized sums of squares be used in conjunction with Welch's (1938) two-sample statistic. For heavy-tailed symmetric distributions, Yuen showed that the statistic based on these robust estimators could adequately control the rate of Type I errors and resulted in greater power than a statistic based on the usual mean and variance.

While a wide range of robust estimators have been proposed in the literature (see Gross, 1976), the trimmed mean and Winsorized variance are intuitively appealing because of their computational simplicity and good theoretical properties (Wilcox, 1995a). In particular, while the standard error of the usual mean can become seriously inflated when the

underlying distribution has heavy tails (Tukey, 1960), the standard error of the trimmed mean is less affected by departures from normality because extreme observations, that is, observations in the tails of a distribution, are removed. Furthermore, as Gross (1976) notes, "the Winsorized variance is a consistent estimator of the variance of the corresponding trimmed mean" (p. 410). In computing the Winsorized variance, the most extreme observations are replaced with less extreme values in the distribution of scores.

While the trimmed mean has been shown to be highly effective, we caution the reader that this measure should only be adopted if one is interested in testing for treatment effects across groups using a measure of location that more accurately reflects the typical score within a group when working with heavy-tailed distributions. As an illustration of how a trimmed mean may provide a better estimate of the typical score than the usual mean, consider the example given by Wilcox (1995a, p. 57) in which a single score in a chi-square distribution with four df (hence $\mu = 4$) is multiplied by 10 (with probability .1). This contaminated chi-square distribution has a population mean of 7.6, a value closer to the upper tail of the distribution. However, the 20% population trimmed mean is 4.2, a value that is closer to the bulk of scores, hence closer to the typical score in the distribution.

Lix and Keselman (1998) and Wilcox, Keselman, and Kowalchuk (1998) replace the hypothesis of equal means with $H_0: \mu_{t1} = \mu_{t2} = \cdots = \mu_{tJ}$, the hypothesis of equal trimmed means. Let $X_{(1)j} \leq X_{(2)j} \leq \cdots \leq X_{(n_j)j}$ represent the ordered observations associated with the $j$th group. Let $g_j = [\gamma n_j]$, where $\gamma$ represents the proportion of observations that are to be trimmed in each tail of the distribution. For reasons summarized by Wilcox (1995a,b), 20% trimming ($\gamma = .2$) is used here. The effective sample size for the $j$th group becomes $h_j = n_j - 2g_j$. The $j$th sample trimmed mean is

$$\bar{X}_{tj} = \frac{1}{h_j} \sum_{i=g_j+1}^{n_j - g_j} X_{(i)j}, \qquad (2.3.3)$$

and the $j$th sample Winsorized mean is

$$\bar{X}_{wj} = \frac{1}{n_j} \sum_{i=1}^{n_j} Y_{ij},$$

where

$$Y_{ij} = \begin{cases} X_{(g_j+1)j} & \text{if } X_{ij} \leq X_{(g_j+1)j}, \\ X_{ij} & \text{if } X_{(g_j+1)j} < X_{ij} < X_{(n_j-g_j)j}, \\ X_{(n_j-g_j)j} & \text{if } X_{ij} \geq X_{(n_j-g_j)j}. \end{cases}$$

The sample Winsorized variance is

$$s_{wj}^2 = \frac{1}{n_j - 1} \sum_{i=1}^{n_j} (Y_{ij} - \bar{X}_{wj})^2, \qquad (2.3.4)$$

and

$$\tilde{s}_{wj}^2 = \frac{(n_j - 1)\, s_{wj}^2}{h_j (h_j - 1)} \qquad (2.3.5)$$

estimates the squared standard error of the sample trimmed mean (see Wilcox, 1996). Thus, with robust estimation, the trimmed group means ($\bar{X}_{tj}$s) replace the least squares group means ($\bar{X}_j$s), the Winsorized group variance estimators ($\tilde{s}_{wj}^2$s) replace the least squares variances ($s_j^2$s), and $\sum_j h_j$ replaces $N$, in the statistics and their df. That is, Equations (2.3.1) and (2.3.2) become

$$F_t = \frac{\sum_{j=1}^{J} w_{tj} (\bar{X}_{tj} - \tilde{X}_t)^2/(J - 1)}{1 + \dfrac{2(J - 2)}{J^2 - 1} \sum_{j=1}^{J} \dfrac{(1 - w_{tj}/W_t)^2}{h_j - 1}}, \qquad (2.3.6)$$

where $w_{tj} = h_j/\tilde{s}_{wj}^2$, $\tilde{X}_t = \sum_{j=1}^{J} w_{tj} \bar{X}_{tj}/W_t$ and $W_t = \sum_{j=1}^{J} w_{tj}$, where $\nu$ is estimated by

$$\nu_t = \frac{J^2 - 1}{3 \sum_{j=1}^{J} \dfrac{(1 - w_{tj}/W_t)^2}{h_j - 1}}. \qquad (2.3.7)$$

3. METHOD

Four variables were manipulated in the study: (a) number of groups (4 and 6), (b) sample size (two cases), (c) population distribution (five distributions: one normal and four nonnormal distributions), and (d) degree/pattern of variance heterogeneity (moderate and large/all (mostly) unequal and all but one equal). Variances and group sizes were both positively and negatively paired. Table I contains the numerical values of the sample sizes and variances investigated in this study.

Table I
Sample Size and Variance Conditions

CON   Sample Sizes (Two Cases)                               Population Variances
A     10, 15, 20, 25; 15, 20, 25, 30                         1, 4, 9, 16
B     10, 15, 20, 25; 15, 20, 25, 30                         1, 1, 1, 36
C     10, 15, 20, 25; 15, 20, 25, 30                         16, 9, 4, 1
D     10, 15, 20, 25; 15, 20, 25, 30                         36, 1, 1, 1
E     10, 15(2), 20(2), 25; 15, 20(2), 25(2), 30             1(2), 4, 9(2), 16
F     10, 15(2), 20(2), 25; 15, 20(2), 25(2), 30             1(5), 36
G     10, 15(2), 20(2), 25; 15, 20(2), 25(2), 30             16, 9(2), 4, 1(2)
H     10, 15(2), 20(2), 25; 15, 20(2), 25(2), 30             36, 1(5)

As indicated, we investigated one-way designs having four and six groups. For each design size, two sample size cases were investigated. In our unbalanced designs, the smaller of the two cases investigated for each design had an average group size of less than 20, while the larger case in each design had an average group size of at least 20.

With respect to the effects of distributional shape on Type I error, we chose to investigate conditions in which the statistics were likely to be

prone to an excessive number of Type I errors as well as a normally distributed case. Thus, we generated data from four skewed distributions. Specifically, we sampled from a $\chi^2_6$ and a $\chi^2_3$ distribution and we also used the method described in Hoaglin (1985) to generate distributions with more extreme degrees of skewness and kurtosis. These particular types of nonnormal distributions were selected since data obtained in applied settings (e.g., behavioral science data) typically have skewed distributions (Micceri, 1989; Wilcox, 1994a, 1994b, 1995a,b). Furthermore, Sawilowsky and Blair (1992) investigated the effects of eight nonnormal distributions identified by Micceri on the robustness of Student's t test and found that only the distributions with the most extreme degree of skewness investigated (e.g., $\gamma_1 = 1.64$) affected the Type I error control of the independent sample t statistic. Thus, since the statistics we investigated have operating characteristics similar to those reported for the t statistic, we felt that our approach to modeling skewed data would adequately reflect conditions in which those statistics might not perform optimally. For the $\chi^2_3$ distribution, skewness and kurtosis values are $\gamma_1 = 1.63$ and $\gamma_2 = 4.00$, respectively (the corresponding values for the $\chi^2_6$ data are $\gamma_1 = 1.15$ and $\gamma_2 = 2.0$) (see Table II). Accordingly, our simulated $\chi^2_3$ distribution mirrors data found in behavioral science experiments with regard to skewness.

The other types of nonnormal distributions were generated from the g- and h-distribution (Hoaglin, 1985). Specifically, we chose to investigate two g- and h-distributions: (a) a $g = 1$ and $h = 0$ distribution and (b) a $g = 1$ and $h = .5$ distribution. To give meaning to these values it should be noted that for the standard normal distribution $g = h = 0$. When $g = 0$ a distribution is symmetric and the tails of a distribution will become heavier as $h$ increases.
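The roles of g and h described above can be illustrated with a small sketch of Hoaglin's (1985) transformation of a standard normal value. This is only an illustration, not the SAS/IML code used in the study; the function name `g_and_h` is hypothetical, and the g = 0 branch uses the usual limiting form of the transformation.

```python
import math


def g_and_h(z, g, h):
    """Map a standard normal value z to a g-and-h value (illustrative sketch).

    g controls skewness and h controls tail heaviness; g = h = 0 returns z.
    """
    tail = math.exp(h * z * z / 2.0)  # stretches the tails when h > 0
    if g == 0:
        # Limiting form as g -> 0: symmetric about zero.
        return z * tail
    return (math.exp(g * z) - 1.0) / g * tail


# Evaluate the transformation at a few normal quantiles for each (g, h) pair.
for g, h in [(0.0, 0.0), (0.0, 0.5), (1.0, 0.0), (1.0, 0.5)]:
    values = [round(g_and_h(z, g, h), 3) for z in (-2.0, -1.0, 0.0, 1.0, 2.0)]
    print(f"g={g}, h={h}: {values}")
```

With g = 0 the transformed values are mirror images around zero, while g = 1 compresses the lower tail and stretches the upper tail, which is the source of the skewness.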
Values of skewness and kurtosis corresponding to the investigated g- and h-distributions are (a) $\gamma_1 = 6.2$ and $\gamma_2 = 114$, respectively, and (b) $\gamma_1$ and $\gamma_2$ undefined (see Table II). Finally, it should be noted that though the selected combinations of g and h result in extremely skewed distributions, these values, according to Wilcox (1994a, 1994b, 1995a,b), are representative of measurements obtained in applied settings (e.g., psychometric measures). Moreover, as Wilcox (1995a) notes, if a procedure performs well over a wide range of

simulation conditions, including extreme conditions, this suggests that the positive operating characteristics of the procedure might hold over conditions not considered in the simulation and thus positively reflect on the procedure's versatility.

Table II
Distributions Investigated and Their Properties

Distribution         Skewness     Kurtosis
Chi Square (6)       1.15         2.0
Chi Square (3)       1.63         4.00
g = 1 & h = 0        6.2          114
g = 1 & h = .5       Undefined    Undefined

As indicated, we both positively and negatively paired the group sizes and variances. For positive (negative) pairings, the group having the smallest number of observations was associated with the population having the smallest (largest) variance, while the group having the greatest number of observations was associated with the population having the greatest (smallest) variance. These conditions were chosen since they typically produce distorted Type I error rates.

To generate pseudo-random normal variates, we used the SAS generator RANNOR (SAS Institute, 1989). If $Z_{ij}$ is a standard normal variate, then $X_{ij} = \mu_j + (\sigma_j Z_{ij})$ is a normal variate with mean equal to $\mu_j$ and variance equal to $\sigma_j^2$.

To generate pseudo-random variates having a $\chi^2$ distribution with six (three) degrees of freedom, six (three) standard normal variates were squared and summed. The variates were standardized, and then transformed to $\chi^2_6$ or $\chi^2_3$ variates having mean $\mu_{tj}$ and variance $\sigma_j^2$ [see Hastings & Peacock (1975) for further details on the generation of data from this distribution].

To generate data from a g- and h-distribution, standard unit normal variables ($Z_{ij}$) were converted to the random variable

$$X_{ij} = \frac{\exp(g Z_{ij}) - 1}{g} \exp\left( \frac{h Z_{ij}^2}{2} \right),$$

according to the values of g and h selected for investigation. To obtain a distribution with standard deviation $\sigma_j$, each $X_{ij}$ ($j = 1, \ldots, J$) was multiplied by a value of $\sigma_j$ obtainable from Table I. It is important to note that this does not affect the value of the null hypothesis when $g = 0$ (see Wilcox, 1994a, p. 297). However, when $g > 0$, the population mean for a g- and h-distributed variable is

$$\mu_{gh} = \frac{1}{g (1 - h)^{1/2}} \left( \exp\{ g^2/[2(1 - h)] \} - 1 \right)$$

(see Hoaglin, 1985, p. 503). Thus, for those conditions where $g > 0$, $\mu_{gh}$ was first subtracted from $X_{ij}$ before multiplying by $\sigma_j$. Lastly, it should be noted that the standard deviation of a g- and h-distribution is not equal to one, and thus the values enumerated in Table I reflect only the amount that each random variable is multiplied by and not the actual values of the standard deviations (see Wilcox, 1994a, p. 298). As Wilcox notes, the values for the variances (standard deviations) in Table I more aptly reflect the ratio of the variances (standard deviations) between the groups.

Our simulation program was written in SAS/IML (SAS, 1989). One thousand replications of each condition were performed using a .05 significance level; Beta values within each simulation were based on 5000 replications (simulations).

4. RESULTS

To evaluate the particular conditions under which a test was insensitive to assumption violations, Bradley's (1978) liberal criterion of robustness was employed. According to this criterion, in order for a test to be

considered robust, its empirical rate of Type I error ($\hat{\alpha}$) must be contained in the interval $0.5\alpha \leq \hat{\alpha} \leq 1.5\alpha$. Therefore, for the five percent level of significance used in this study, a test was considered robust in a particular condition if its empirical rate of Type I error fell within the interval $.025 \leq \hat{\alpha} \leq .075$. Correspondingly, a test was considered to be nonrobust if, for a particular condition, its Type I error rate was not contained in this interval. In the tables, boldfaced entries are used to denote these latter values. We chose this criterion since we feel that it provides a reasonable standard by which to judge robustness. That is, in our opinion, applied researchers should be comfortable working with a procedure that controls the rate of Type I error within these bounds, if the procedure limits the rate across a wide range of assumption violation conditions. Nonetheless, there is no one universal standard by which tests are judged to be robust, so different interpretations of the results are possible.

Tables III and IV contain empirical rates of Type I error for a completely randomized design containing four and six groups, respectively. The tabled data indicate that when the observations were obtained from normal distributions, rates of Type I error were controlled, as was reported by Gamage and Weerahandi (1998), Chen and Chen (1998) and Lix and Keselman (1998). However, our results also very clearly indicate that the procedures due to Gamage and Weerahandi (1998) (GW) and Chen and Chen (1998) (CC) cannot limit their rates of Type I error within Bradley's (1978) liberal limit when data are nonnormal. Indeed, even for our mildly skewed chi-square (6) distribution, rates were typically liberal, approaching 10%. As the nonnormality of the sampled distribution increased, rates of error became progressively larger, attaining values in excess of 50%.
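Bradley's liberal criterion, as applied above, amounts to a simple interval check on the empirical rejection rate. The following minimal sketch shows that check; the function names are hypothetical and the rejection counts in the usage lines are made-up illustrations, not values from Tables III or IV.

```python
def bradley_interval(alpha, lower=0.5, upper=1.5):
    """Return the (low, high) bounds 0.5*alpha and 1.5*alpha of the liberal criterion."""
    return lower * alpha, upper * alpha


def empirical_rate(rejections, replications):
    """Proportion of simulated data sets in which a true null was rejected."""
    return rejections / replications


def classify(rate, alpha=0.05):
    """Label an empirical Type I error rate as conservative, robust, or liberal."""
    low, high = bradley_interval(alpha)
    if rate < low:
        return "conservative"
    if rate > high:
        return "liberal"
    return "robust"


# Hypothetical counts out of 1000 replications, for illustration only.
for rejections in (12, 48, 103):
    rate = empirical_rate(rejections, 1000)
    print(rejections, rate, classify(rate))
```

With 1000 replications per condition, the bounds correspond to between 25 and 75 rejections at the .05 level.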

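TABLE III
Empirical Rates of Type I Error (J = 4)
[Tabled values are not available in this transcription. Columns: CON; $\sigma_j^2$s and $n_j$s; then GF, CC, and LK under each Population Type (Normal, $\chi^2_6$, $\chi^2_3$, g = 1 & h = 0, g = 1 & h = .5). Rows: conditions A, B, C, and D, two rows each.]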

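TABLE IV
Empirical Rates of Type I Error (J = 6)
[Tabled values are not available in this transcription. Columns: CON; $\sigma_j^2$s and $n_j$s; then GF, CC, and LK under each Population Type (Normal, $\chi^2_6$, $\chi^2_3$, g = 1 & h = 0, g = 1 & h = .5). Rows: conditions E, F, G, and H, two rows each.]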

The procedure presented by Lix and Keselman (1998) (LK), however, was in most instances able to limit its rate of Type I error within Bradley's (1978) interval. Indeed, out of the 80 investigated conditions, the test was liberal in just 7 cases (there was also one conservative value).

5. DISCUSSION

We were not surprised to find that the Weerahandi (1995) and Chen and Chen (1998) procedures were not robust to heterogeneity of variances when data were also nonnormal in unbalanced designs. To date, most test statistics that are intended to cope with the effects of variance heterogeneity have been found to lack robustness when heterogeneity of variances occurs with data that are also nonnormal, particularly when group sizes are unequal. As we indicated in our introduction, no procedure for testing mean equality has been found to be uniformly robust to assumption violations when they occur simultaneously. However, this unfortunate state of affairs relates only to test statistics that use least squares measures of central tendency and variability. On the other hand, our results, and those presented by others, indicate that researchers can generally, though not uniformly, obtain a robust test of treatment performance equality by substituting robust measures of central tendency and variability into heteroscedastic test statistics (see e.g., Keselman, Kowalchuk & Lix, 1998; Keselman, Lix & Kowalchuk, 1998; Keselman & Wilcox, 1999; Lix & Keselman, 1998). That is, by substituting 20% trimmed means and Winsorized variances into, say, the Welch (1951) test, one typically can achieve robustness to both nonnormality and variance heterogeneity, even in unbalanced designs.
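The substitution described above can be sketched in a few lines: compute each group's trimmed mean and the Winsorized estimate of its squared standard error (Equations 2.3.3 to 2.3.5), then insert them into the Welch-type statistic and its degrees of freedom (Equations 2.3.6 and 2.3.7). This is a hedged illustration rather than the authors' SAS/IML program. The function names are hypothetical, and the weights are taken here as the reciprocal of the squared standard error, an assumption of this sketch chosen so that with no trimming they reduce to $n_j/s_j^2$. A p-value would additionally require the F distribution, which is left out to keep the example dependency-free.

```python
import math


def trim_stats(x, gamma=0.2):
    """Return (trimmed mean, squared standard error, effective size h) for one group."""
    xs = sorted(x)
    n = len(xs)
    g = math.floor(gamma * n + 1e-9)   # observations trimmed from each tail
    h = n - 2 * g                      # effective sample size
    t_mean = sum(xs[g:n - g]) / h
    low, high = xs[g], xs[n - g - 1]   # X_(g+1) and X_(n-g)
    y = [min(max(v, low), high) for v in xs]   # Winsorized scores
    w_mean = sum(y) / n
    s2_w = sum((v - w_mean) ** 2 for v in y) / (n - 1)
    se2 = (n - 1) * s2_w / (h * (h - 1))
    return t_mean, se2, h


def welch_trimmed(groups, gamma=0.2):
    """Return (F_t, numerator df, nu_t) for a list of groups of observations."""
    stats = [trim_stats(grp, gamma) for grp in groups]
    J = len(stats)
    w = [1.0 / se2 for _, se2, _ in stats]   # assumed weights: 1 / squared SE
    W = sum(w)
    grand = sum(wj * m for wj, (m, _, _) in zip(w, stats)) / W
    numerator = sum(wj * (m - grand) ** 2 for wj, (m, _, _) in zip(w, stats)) / (J - 1)
    lam = sum((1.0 - wj / W) ** 2 / (h - 1) for wj, (_, _, h) in zip(w, stats))
    f_t = numerator / (1.0 + 2.0 * (J - 2) / (J * J - 1) * lam)
    nu_t = (J * J - 1) / (3.0 * lam)
    return f_t, J - 1, nu_t


# Made-up scores for three unequal-sized groups, for illustration only.
example = [
    [3, 5, 6, 7, 8, 9, 10, 12, 14, 40],
    [1, 2, 4, 4, 5, 6, 7, 7, 8, 9, 11, 30],
    [2, 3, 3, 5, 6, 8, 9, 10, 13, 15, 16, 18, 19, 22, 60],
]
print(welch_trimmed(example))
```

Setting `gamma=0` turns off trimming and Winsorizing, so the same function then computes the ordinary Welch (1951) statistic of Equation 2.3.1.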
The benefits of using robust estimators, that is, 20% trimmed means and Winsorized variances, instead of least squares estimators to combat the effects of nonnormality have been discussed extensively (see e.g., Keselman, Kowalchuk & Lix, 1998; Keselman, Lix & Kowalchuk, 1998; Keselman & Wilcox, 1999; Lix & Keselman, 1998; Wilcox, 1995a,b, 1997).

Finally, we note that we did not compare the power of the procedure presented by Lix and Keselman (1998) to those presented by Weerahandi

(1995) and Chen and Chen (1998) because the latter procedures were not able to control their rates of Type I error. That is, comparisons of power are only meaningful when the procedures being compared are capable of controlling their rates of Type I error. However, we should point out that the power characteristics of statistics based on robust estimators can be predicted from theory and prior work (Lix and Keselman, 1998). That is, as previously indicated, theory tells us that procedures based on sample means result in poor power because the standard error of the mean is inflated when distributions have heavy tails; however, this is less of a problem when working with trimmed means (see Tukey, 1960; Wilcox, 1995b). This phenomenon is illustrated in a number of sources. For example, Wilcox (1994b, 1995b) has presented results indicating that in the two sample and one-way problem, tests (i.e., t and F) based on the usual least squares estimators lose power when data contain outliers and/or are heavy tailed. Specifically, in the two sample problem, Wilcox (1994b) compared the Welch (1938) and Yuen (1974) procedures and found that when data were obtained from contaminated normal distributions (distributions that have thicker tails than the normal), Welch's test was considerably less powerful than it was under normality and was also less sensitive than Yuen's test. Indeed, the power of Welch's test to detect nonnull effects went from .931 when distributions were normally distributed to .78 and .16 for the two contaminated normal distributions that were investigated; the corresponding power values for Yuen's test were .890, .784, and .60, respectively. Wilcox (1995b) presented similar results for four independent groups. Readers should also refer to the data presented by Lix and Keselman (1998), which compared the power values of other independent group statistics based on robust estimators.
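The claim that heavy tails inflate the standard error of the mean far more than that of the trimmed mean can be checked with a small seeded simulation. The sketch below is illustrative only and does not reproduce any result cited above: the contaminated normal used (a standard normal with a 10% chance of standard deviation 10), the sample size, and the number of replications are all assumed values, and the function names are hypothetical.

```python
import math
import random
import statistics


def trimmed_mean(x, gamma=0.2):
    """Mean of the observations left after trimming a proportion gamma from each tail."""
    xs = sorted(x)
    g = math.floor(gamma * len(xs) + 1e-9)
    return statistics.fmean(xs[g:len(xs) - g])


def contaminated_normal(rng, eps=0.1, k=10.0):
    """Draw from N(0, 1) with probability 1 - eps, otherwise from N(0, k**2)."""
    sd = k if rng.random() < eps else 1.0
    return rng.gauss(0.0, sd)


def sampling_sd(estimator, n=20, reps=4000, seed=1):
    """Standard deviation of an estimator across simulated samples of size n."""
    rng = random.Random(seed)
    estimates = [
        estimator([contaminated_normal(rng) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.stdev(estimates)


sd_mean = sampling_sd(statistics.fmean)
sd_trim = sampling_sd(trimmed_mean)
print(f"simulated SE of the mean:             {sd_mean:.3f}")
print(f"simulated SE of the 20% trimmed mean: {sd_trim:.3f}")
```

Because both estimators are evaluated with the same seed, they see the same simulated samples, so the difference between the two standard errors reflects the estimators rather than sampling noise.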
ACKNOWLEDGEMENTS

This research was supported by a Natural Sciences and Engineering Research Council (Canada) grant.

BIBLIOGRAPHY

Bishop, T. A., and Dudewicz, E. J. (1978). Exact analysis of variance with unequal variances: Test procedures and tables. Technometrics, 20.

Bradley, J. V. (1978). Robustness? British Journal of Mathematical and Statistical Psychology, 31.

Brown, M. B., and Forsythe, A. B. (1974). The small sample behavior of some statistics which test the equality of several means. Technometrics, 16.

Chen, S., and Chen, H. J. (1998). Single-stage analysis of variance under heteroscedasticity. Communications in Statistics-Simulation and Computation, XX.

Fisher, R. A. (1935). The fiducial argument in statistical inference. Annals of Eugenics, 6.

Gamage, J., and Weerahandi, S. (1998). Size performance of some tests in one-way ANOVA. Communications in Statistics-Simulation and Computation, XX.

Gross, A. M. (1976). Confidence interval robustness with long-tailed symmetric distributions. Journal of the American Statistical Association, 71.

Hastings, N. A. J., and Peacock, J. B. (1975). Statistical distributions: A handbook for students and practitioners. New York: Wiley.

Hoaglin, D. C. (1985). Summarizing shape numerically: The g- and h-distributions. In D. Hoaglin, F. Mosteller, & J. Tukey (Eds.), Exploring data tables, trends, and shapes. New York: Wiley.

James, G. S. (1951). The comparison of several groups of observations when the ratios of the population variances are unknown. Biometrika, 38.

James, G. S. (1954). Tests of linear hypotheses in univariate and multivariate analysis when the ratios of the population variances are unknown. Biometrika, 41.

Keselman, H. J., Kowalchuk, R. K., and Lix, L. M. (1998). Robust nonorthogonal analyses revisited: An update based on trimmed means. Psychometrika, 63.

Keselman, H. J., Lix, L. M., and Kowalchuk, R. K. (1998). Multiple comparison procedures for trimmed means. Psychological Methods, 3.

Keselman, H. J., and Wilcox, R. R. (1999). The 'improved' Brown and Forsythe test for mean equality: Some things can't be fixed. Communications in Statistics-Simulation and Computation, 28(3).

Lix, L. M., and Keselman, H. J. (1995). Approximate degrees of freedom tests: A unified perspective on testing for mean equality. Psychological Bulletin, 117.

Lix, L. M., and Keselman, H. J. (1998). To trim or not to trim: Tests of mean equality under heteroscedasticity and nonnormality. Educational and Psychological Measurement, 58. (Errata: 58, 853).

Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures. Psychological Bulletin, 105.

SAS Institute Inc. (1989). SAS/IML software: Usage and reference, version 6 (1st ed.). Cary, NC: Author.

Sawilowsky, S. S., and Blair, R. C. (1992). A more realistic look at the robustness and Type II error probabilities of the t test to departures from population normality. Psychological Bulletin, 111.

Tukey, J. W. (1960). A survey of sampling from contaminated normal distributions. In I. Olkin et al. (Eds.), Contributions to probability and statistics. Stanford, CA: Stanford University Press.

Weerahandi, S. (1995). ANOVA under unequal error variances. Biometrics, 51.

Welch, B. L. (1938). The significance of the difference between two means when the population variances are unequal. Biometrika, 29.

Welch, B. L. (1951). On the comparison of several mean values: An alternative approach. Biometrika, 38.

Wilcox, R. R. (1994a). A one-way random effects model for trimmed means. Psychometrika, 59.

Wilcox, R. R. (1994b). Some results on the Tukey-McLaughlin and Yuen methods for trimmed means when distributions are skewed. Biometrical Journal, 36.

Wilcox, R. R. (1995a). ANOVA: A paradigm for low power and misleading measures of effect size? Review of Educational Research, 65(1).

Wilcox, R. R. (1995b). ANOVA: The practical importance of heteroscedastic methods, using trimmed means versus means, and designing simulation studies. British Journal of Mathematical and Statistical Psychology, 48.

Wilcox, R. R. (1996a). Statistics for the social sciences. New York: Academic Press.

Wilcox, R. R. (1997). Introduction to robust estimation and hypothesis testing. New York: Academic Press.

Wilcox, R. R., Keselman, H. J., and Kowalchuk, R. K. (1998). Can tests for treatment group equality be improved?: The bootstrap and trimmed means conjecture. British Journal of Mathematical and Statistical Psychology, 51.

Yuen, K. K. (1974). The two-sample trimmed t for unequal population variances. Biometrika, 61.


More information

A Test of Symmetry. Abdul R. Othman. Universiti Sains Malaysia. H. J. Keselman. University of Manitoba. Rand R. Wilcox

A Test of Symmetry. Abdul R. Othman. Universiti Sains Malaysia. H. J. Keselman. University of Manitoba. Rand R. Wilcox Symmetry A Test of Symmetry by Abdul R. Othman Universiti Sains Malaysia H. J. Keselman University of Manitoba Rand R. Wilcox University of Southern California Katherine Fradette University of Manitoba

More information

INFLUENCE OF USING ALTERNATIVE MEANS ON TYPE-I ERROR RATE IN THE COMPARISON OF INDEPENDENT GROUPS ABSTRACT

INFLUENCE OF USING ALTERNATIVE MEANS ON TYPE-I ERROR RATE IN THE COMPARISON OF INDEPENDENT GROUPS ABSTRACT Mirtagioğlu et al., The Journal of Animal & Plant Sciences, 4(): 04, Page: J. 344-349 Anim. Plant Sci. 4():04 ISSN: 08-708 INFLUENCE OF USING ALTERNATIVE MEANS ON TYPE-I ERROR RATE IN THE COMPARISON OF

More information

Trimming, Transforming Statistics, And Bootstrapping: Circumventing the Biasing Effects Of Heterescedasticity And Nonnormality

Trimming, Transforming Statistics, And Bootstrapping: Circumventing the Biasing Effects Of Heterescedasticity And Nonnormality Journal of Modern Applied Statistical Methods Volume Issue Article 38 --00 Trimming, Transforming Statistics, And Bootstrapping: Circumventing the Biasing Effects Of Heterescedasticity And Nonnormality

More information

Comparing the performance of modified F t statistic with ANOVA and Kruskal Wallis test

Comparing the performance of modified F t statistic with ANOVA and Kruskal Wallis test Appl. Math. Inf. Sci. 7, No. 2L, 403-408 (2013) 403 Applied Mathematics & Information Sciences An International ournal http://dx.doi.org/10.12785/amis/072l04 Comparing the performance of modified F t statistic

More information

A Comparison of Two Approaches For Selecting Covariance Structures in The Analysis of Repeated Measurements. H.J. Keselman University of Manitoba

A Comparison of Two Approaches For Selecting Covariance Structures in The Analysis of Repeated Measurements. H.J. Keselman University of Manitoba 1 A Comparison of Two Approaches For Selecting Covariance Structures in The Analysis of Repeated Measurements by H.J. Keselman University of Manitoba James Algina University of Florida Rhonda K. Kowalchuk

More information

An Examination of the Robustness of the Empirical Bayes and Other Approaches. for Testing Main and Interaction Effects in Repeated Measures Designs

An Examination of the Robustness of the Empirical Bayes and Other Approaches. for Testing Main and Interaction Effects in Repeated Measures Designs Empirical Bayes 1 An Examination of the Robustness of the Empirical Bayes and Other Approaches for Testing Main and Interaction Effects in Repeated Measures Designs by H.J. Keselman, Rhonda K. Kowalchuk

More information

THE ANALYSIS OF REPEATED MEASUREMENTS: A COMPARISON OF MIXED-MODEL SATTERTHWAITE F TESTS AND A NONPOOLED ADJUSTED DEGREES OF FREEDOM MULTIVARIATE TEST

THE ANALYSIS OF REPEATED MEASUREMENTS: A COMPARISON OF MIXED-MODEL SATTERTHWAITE F TESTS AND A NONPOOLED ADJUSTED DEGREES OF FREEDOM MULTIVARIATE TEST THE ANALYSIS OF REPEATED MEASUREMENTS: A COMPARISON OF MIXED-MODEL SATTERTHWAITE F TESTS AND A NONPOOLED ADJUSTED DEGREES OF FREEDOM MULTIVARIATE TEST H. J. Keselman James Algina University of Manitoba

More information

Robust Means Modeling vs Traditional Robust Tests 1

Robust Means Modeling vs Traditional Robust Tests 1 Robust Means Modeling vs Traditional Robust Tests 1 Comparing Means under Heteroscedasticity and Nonnormality: Further Exploring Robust Means Modeling Alyssa Counsell Department of Psychology Ryerson University

More information

Preliminary Testing for Normality: Is This a Good Practice?

Preliminary Testing for Normality: Is This a Good Practice? Journal of Modern Applied Statistical Methods Volume 12 Issue 2 Article 2 11-1-2013 Preliminary Testing for Normality: Is This a Good Practice? H. J. Keselman University of Manitoba, Winnipeg, Manitoba,

More information

Testing For Aptitude-Treatment Interactions In Analysis Of Covariance And Randomized Block Designs Under Assumption Violations

Testing For Aptitude-Treatment Interactions In Analysis Of Covariance And Randomized Block Designs Under Assumption Violations Journal of Modern Applied Statistical Methods Volume 4 Issue 2 Article 11 11-1-2005 Testing For Aptitude-Treatment Interactions In Analysis Of Covariance And Randomized Block Designs Under Assumption Violations

More information

On Selecting Tests for Equality of Two Normal Mean Vectors

On Selecting Tests for Equality of Two Normal Mean Vectors MULTIVARIATE BEHAVIORAL RESEARCH, 41(4), 533 548 Copyright 006, Lawrence Erlbaum Associates, Inc. On Selecting Tests for Equality of Two Normal Mean Vectors K. Krishnamoorthy and Yanping Xia Department

More information

A Generally Robust Approach To Hypothesis Testing in Independent and Correlated Groups Designs. H. J. Keselman. University of Manitoba. Rand R.

A Generally Robust Approach To Hypothesis Testing in Independent and Correlated Groups Designs. H. J. Keselman. University of Manitoba. Rand R. Robust Estimation and Testing 1 A Generally Robust Approach To Hypothesis Testing in Independent and Correlated Groups Designs by H. J. Keselman University of Manitoba Rand R. Wilcox University of Southern

More information

Trimming, Transforming Statistics, and Bootstrapping: Circumventing the Biasing Effects of Heterescedasticity and Nonnormality. H. J.

Trimming, Transforming Statistics, and Bootstrapping: Circumventing the Biasing Effects of Heterescedasticity and Nonnormality. H. J. Robust Testing Trimming, Transforming Statistics, and Bootstrapping: Circumventing the Biasing Effects of Heterescedasticity and Nonnormality by H. J. Keselman University of Manitoba Rand R. Wilcox University

More information

Extending the Robust Means Modeling Framework. Alyssa Counsell, Phil Chalmers, Matt Sigal, Rob Cribbie

Extending the Robust Means Modeling Framework. Alyssa Counsell, Phil Chalmers, Matt Sigal, Rob Cribbie Extending the Robust Means Modeling Framework Alyssa Counsell, Phil Chalmers, Matt Sigal, Rob Cribbie One-way Independent Subjects Design Model: Y ij = µ + τ j + ε ij, j = 1,, J Y ij = score of the ith

More information

A Monte Carlo Simulation of the Robust Rank- Order Test Under Various Population Symmetry Conditions

A Monte Carlo Simulation of the Robust Rank- Order Test Under Various Population Symmetry Conditions Journal of Modern Applied Statistical Methods Volume 12 Issue 1 Article 7 5-1-2013 A Monte Carlo Simulation of the Robust Rank- Order Test Under Various Population Symmetry Conditions William T. Mickelson

More information

A NEW ALTERNATIVE IN TESTING FOR HOMOGENEITY OF VARIANCES

A NEW ALTERNATIVE IN TESTING FOR HOMOGENEITY OF VARIANCES Journal of Statistical Research 0, Vol. 40, No. 2, pp. 5-3 Bangladesh ISSN 025-422 X A NEW ALTERNATIVE IN TESTING FOR HOMOGENEITY OF VARIANCES Mehmet MENDEŞ Departmanet of Animal Science, University of

More information

Assessing Normality: Applications in Multi-Group Designs

Assessing Normality: Applications in Multi-Group Designs Malaysian Journal of Mathematical Sciences 9(1): 53-65 (2015) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal homepage: http://einspem.upm.edu.my/journal 1* Abdul R. Othman, 2 H. J. Keselman and 3 Rand

More information

An Overview of the Performance of Four Alternatives to Hotelling's T Square

An Overview of the Performance of Four Alternatives to Hotelling's T Square fi~hjf~~ G 1992, m-t~, 11o-114 Educational Research Journal 1992, Vol.7, pp. 110-114 An Overview of the Performance of Four Alternatives to Hotelling's T Square LIN Wen-ying The Chinese University of Hong

More information

Application of Variance Homogeneity Tests Under Violation of Normality Assumption

Application of Variance Homogeneity Tests Under Violation of Normality Assumption Application of Variance Homogeneity Tests Under Violation of Normality Assumption Alisa A. Gorbunova, Boris Yu. Lemeshko Novosibirsk State Technical University Novosibirsk, Russia e-mail: gorbunova.alisa@gmail.com

More information

Parametric Probability Densities and Distribution Functions for Tukey g-and-h Transformations and their Use for Fitting Data

Parametric Probability Densities and Distribution Functions for Tukey g-and-h Transformations and their Use for Fitting Data Applied Mathematical Sciences, Vol. 2, 2008, no. 9, 449-462 Parametric Probability Densities and Distribution Functions for Tukey g-and-h Transformations and their Use for Fitting Data Todd C. Headrick,

More information

Modied tests for comparison of group means under heteroskedasticity and non-normality caused by outlier(s)

Modied tests for comparison of group means under heteroskedasticity and non-normality caused by outlier(s) Hacettepe Journal of Mathematics and Statistics Volume 46 (3) (2017), 493 510 Modied tests for comparison of group means under heteroskedasticity and non-normality caused by outlier(s) Mustafa Cavus, Berna

More information

Application of Parametric Homogeneity of Variances Tests under Violation of Classical Assumption

Application of Parametric Homogeneity of Variances Tests under Violation of Classical Assumption Application of Parametric Homogeneity of Variances Tests under Violation of Classical Assumption Alisa A. Gorbunova and Boris Yu. Lemeshko Novosibirsk State Technical University Department of Applied Mathematics,

More information

ABSTRACT. Between-Subjects Design under Variance. Heterogeneity and Nonnormality. Evaluation

ABSTRACT. Between-Subjects Design under Variance. Heterogeneity and Nonnormality. Evaluation ABSTRACT Title of dissertation: Robust Means Modeling: An Alternative to Hypothesis Testing Of Mean Equality in the Between-Subjects Design under Variance Heterogeneity and Nonnormality Weihua Fan, Doctor

More information

AN IMPROVEMENT TO THE ALIGNED RANK STATISTIC

AN IMPROVEMENT TO THE ALIGNED RANK STATISTIC Journal of Applied Statistical Science ISSN 1067-5817 Volume 14, Number 3/4, pp. 225-235 2005 Nova Science Publishers, Inc. AN IMPROVEMENT TO THE ALIGNED RANK STATISTIC FOR TWO-FACTOR ANALYSIS OF VARIANCE

More information

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007) FROM: PAGANO, R. R. (007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter

More information

The Analysis of Repeated Measures Designs: A Review. H.J. Keselman. University of Manitoba. James Algina. University of Florida.

The Analysis of Repeated Measures Designs: A Review. H.J. Keselman. University of Manitoba. James Algina. University of Florida. Repeated Measures Analyses 1 The Analysis of Repeated Measures Designs: A Review by H.J. Keselman University of Manitoba James Algina University of Florida and Rhonda K. Kowalchuk University of Manitoba

More information

Inferences About the Difference Between Two Means

Inferences About the Difference Between Two Means 7 Inferences About the Difference Between Two Means Chapter Outline 7.1 New Concepts 7.1.1 Independent Versus Dependent Samples 7.1. Hypotheses 7. Inferences About Two Independent Means 7..1 Independent

More information

Robustness. James H. Steiger. Department of Psychology and Human Development Vanderbilt University. James H. Steiger (Vanderbilt University) 1 / 37

Robustness. James H. Steiger. Department of Psychology and Human Development Vanderbilt University. James H. Steiger (Vanderbilt University) 1 / 37 Robustness James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) 1 / 37 Robustness 1 Introduction 2 Robust Parameters and Robust

More information

Comparing Two Dependent Groups: Dealing with Missing Values

Comparing Two Dependent Groups: Dealing with Missing Values Journal of Data Science 9(2011), 1-13 Comparing Two Dependent Groups: Dealing with Missing Values Rand R. Wilcox University of Southern California Abstract: The paper considers the problem of comparing

More information

An Alternative to Cronbach s Alpha: A L-Moment Based Measure of Internal-consistency Reliability

An Alternative to Cronbach s Alpha: A L-Moment Based Measure of Internal-consistency Reliability Southern Illinois University Carbondale OpenSIUC Book Chapters Educational Psychology and Special Education 013 An Alternative to Cronbach s Alpha: A L-Moment Based Measure of Internal-consistency Reliability

More information

Aligned Rank Tests As Robust Alternatives For Testing Interactions In Multiple Group Repeated Measures Designs With Heterogeneous Covariances

Aligned Rank Tests As Robust Alternatives For Testing Interactions In Multiple Group Repeated Measures Designs With Heterogeneous Covariances Journal of Modern Applied Statistical Methods Volume 3 Issue 2 Article 17 11-1-2004 Aligned Rank Tests As Robust Alternatives For Testing Interactions In Multiple Group Repeated Measures Designs With Heterogeneous

More information

A Monte-Carlo study of asymptotically robust tests for correlation coefficients

A Monte-Carlo study of asymptotically robust tests for correlation coefficients Biometrika (1973), 6, 3, p. 661 551 Printed in Great Britain A Monte-Carlo study of asymptotically robust tests for correlation coefficients BY G. T. DUNCAN AND M. W. J. LAYAKD University of California,

More information

ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS

ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS Ravinder Malhotra and Vipul Sharma National Dairy Research Institute, Karnal-132001 The most common use of statistics in dairy science is testing

More information

Testing homogeneity of variances with unequal sample sizes

Testing homogeneity of variances with unequal sample sizes Comput Stat (2013) 28:1269 1297 DOI 10.1007/s00180-012-0353-x ORIGINAL PAPER Testing homogeneity of variances with unequal sample sizes I. Parra-Frutos Received: 28 February 2011 / Accepted: 14 July 2012

More information

Two-by-two ANOVA: Global and Graphical Comparisons Based on an Extension of the Shift Function

Two-by-two ANOVA: Global and Graphical Comparisons Based on an Extension of the Shift Function Journal of Data Science 7(2009), 459-468 Two-by-two ANOVA: Global and Graphical Comparisons Based on an Extension of the Shift Function Rand R. Wilcox University of Southern California Abstract: When comparing

More information

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances Advances in Decision Sciences Volume 211, Article ID 74858, 8 pages doi:1.1155/211/74858 Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances David Allingham 1 andj.c.w.rayner

More information

A nonparametric two-sample wald test of equality of variances

A nonparametric two-sample wald test of equality of variances University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 211 A nonparametric two-sample wald test of equality of variances David

More information

COMPARISON OF THE ESTIMATORS OF THE LOCATION AND SCALE PARAMETERS UNDER THE MIXTURE AND OUTLIER MODELS VIA SIMULATION

COMPARISON OF THE ESTIMATORS OF THE LOCATION AND SCALE PARAMETERS UNDER THE MIXTURE AND OUTLIER MODELS VIA SIMULATION (REFEREED RESEARCH) COMPARISON OF THE ESTIMATORS OF THE LOCATION AND SCALE PARAMETERS UNDER THE MIXTURE AND OUTLIER MODELS VIA SIMULATION Hakan S. Sazak 1, *, Hülya Yılmaz 2 1 Ege University, Department

More information

Numerical Computing and Graphics for the Power Method Transformation Using Mathematica

Numerical Computing and Graphics for the Power Method Transformation Using Mathematica Southern Illinois University Carbondale OpenSIUC Publications Educational Psychology and Special Education 4-2007 Numerical Computing and Graphics for the Power Method Transformation Using Mathematica

More information

October 1, Keywords: Conditional Testing Procedures, Non-normal Data, Nonparametric Statistics, Simulation study

October 1, Keywords: Conditional Testing Procedures, Non-normal Data, Nonparametric Statistics, Simulation study A comparison of efficient permutation tests for unbalanced ANOVA in two by two designs and their behavior under heteroscedasticity arxiv:1309.7781v1 [stat.me] 30 Sep 2013 Sonja Hahn Department of Psychology,

More information

Type I Error Rates of the Kenward-Roger Adjusted Degree of Freedom F-test for a Split-Plot Design with Missing Values

Type I Error Rates of the Kenward-Roger Adjusted Degree of Freedom F-test for a Split-Plot Design with Missing Values Journal of Modern Applied Statistical Methods Volume 6 Issue 1 Article 8 5-1-2007 Type I Error Rates of the Kenward-Roger Adjusted Degree of Freedom F-test for a Split-Plot Design with Missing Values Miguel

More information

Comparison of Power between Adaptive Tests and Other Tests in the Field of Two Sample Scale Problem

Comparison of Power between Adaptive Tests and Other Tests in the Field of Two Sample Scale Problem Comparison of Power between Adaptive Tests and Other Tests in the Field of Two Sample Scale Problem Chikhla Jun Gogoi 1, Dr. Bipin Gogoi 2 1 Research Scholar, Department of Statistics, Dibrugarh University,

More information

THE EFFECTS OF NONNORMAL DISTRIBUTIONS ON CONFIDENCE INTERVALS AROUND THE STANDARDIZED MEAN DIFFERENCE: BOOTSTRAP AND PARAMETRIC CONFIDENCE INTERVALS

THE EFFECTS OF NONNORMAL DISTRIBUTIONS ON CONFIDENCE INTERVALS AROUND THE STANDARDIZED MEAN DIFFERENCE: BOOTSTRAP AND PARAMETRIC CONFIDENCE INTERVALS EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT 10.1177/0013164404264850 KELLEY THE EFFECTS OF NONNORMAL DISTRIBUTIONS ON CONFIDENCE INTERVALS AROUND THE STANDARDIZED MEAN DIFFERENCE: BOOTSTRAP AND PARAMETRIC

More information

Applications of Basu's TheorelTI. Dennis D. Boos and Jacqueline M. Hughes-Oliver I Department of Statistics, North Car-;'lina State University

Applications of Basu's TheorelTI. Dennis D. Boos and Jacqueline M. Hughes-Oliver I Department of Statistics, North Car-;'lina State University i Applications of Basu's TheorelTI by '. Dennis D. Boos and Jacqueline M. Hughes-Oliver I Department of Statistics, North Car-;'lina State University January 1997 Institute of Statistics ii-limeo Series

More information

Published: 26 April 2016

Published: 26 April 2016 Electronic Journal of Applied Statistical Analysis EJASA, Electron. J. App. Stat. Anal. http://siba-ese.unisalento.it/index.php/ejasa/index e-issn: 2070-5948 DOI: 10.1285/i20705948v9n1p111 A robust dispersion

More information

Increasing Power in Paired-Samples Designs. by Correcting the Student t Statistic for Correlation. Donald W. Zimmerman. Carleton University

Increasing Power in Paired-Samples Designs. by Correcting the Student t Statistic for Correlation. Donald W. Zimmerman. Carleton University Power in Paired-Samples Designs Running head: POWER IN PAIRED-SAMPLES DESIGNS Increasing Power in Paired-Samples Designs by Correcting the Student t Statistic for Correlation Donald W. Zimmerman Carleton

More information

Presented to the Graduate Council of the. North Texas State University. in Partial. Fulfillment of the Requirements. For the Degree of.

Presented to the Graduate Council of the. North Texas State University. in Partial. Fulfillment of the Requirements. For the Degree of. AN EMPIRICAL INVESTIGATION OF TUKEY'S HONESTLY SIGNIFICANT DIFFERENCE TEST WITH VARIANCE HETEROGENEITY AND UNEQUAL SAMPLE SIZES, UTILIZING KRAMER'S PROCEDURE AND THE HARMONIC MEAN DISSERTATION Presented

More information

An Equivalency Test for Model Fit. Craig S. Wells. University of Massachusetts Amherst. James. A. Wollack. Ronald C. Serlin

An Equivalency Test for Model Fit. Craig S. Wells. University of Massachusetts Amherst. James. A. Wollack. Ronald C. Serlin Equivalency Test for Model Fit 1 Running head: EQUIVALENCY TEST FOR MODEL FIT An Equivalency Test for Model Fit Craig S. Wells University of Massachusetts Amherst James. A. Wollack Ronald C. Serlin University

More information

The entire data set consists of n = 32 widgets, 8 of which were made from each of q = 4 different materials.

The entire data set consists of n = 32 widgets, 8 of which were made from each of q = 4 different materials. One-Way ANOVA Summary The One-Way ANOVA procedure is designed to construct a statistical model describing the impact of a single categorical factor X on a dependent variable Y. Tests are run to determine

More information

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics DETAILED CONTENTS About the Author Preface to the Instructor To the Student How to Use SPSS With This Book PART I INTRODUCTION AND DESCRIPTIVE STATISTICS 1. Introduction to Statistics 1.1 Descriptive and

More information

Journal of Modern Applied Statistical Methods May, 2007, Vol. 6, No. 1, /07/$95.00

Journal of Modern Applied Statistical Methods May, 2007, Vol. 6, No. 1, /07/$95.00 ournal of Modern Applied Statistical Methods Copyright 007 MASM, nc. May, 007, Vol. 6, o., 53-65 538 947/07/$95.00 Analyses of Unbalanced roups-versus-ndividual Research Designs Using Three Alternative

More information

R-functions for the analysis of variance

R-functions for the analysis of variance 1 R-functions for the analysis of variance The following R functions may be downloaded from the directory http://www.uni-koeln.de/~luepsen/r/ Usage advices: Variables used as factors have to declared as

More information

Practical Solutions to Behrens-Fisher Problem: Bootstrapping, Permutation, Dudewicz-Ahmed Method

Practical Solutions to Behrens-Fisher Problem: Bootstrapping, Permutation, Dudewicz-Ahmed Method Practical Solutions to Behrens-Fisher Problem: Bootstrapping, Permutation, Dudewicz-Ahmed Method MAT653 Final Project Yanjun Yan Syracuse University Nov. 22, 2005 Outline Outline 1 Introduction 2 Problem

More information

r(equivalent): A Simple Effect Size Indicator

r(equivalent): A Simple Effect Size Indicator r(equivalent): A Simple Effect Size Indicator The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters Citation Rosenthal, Robert, and

More information

ROBUSTNESS OF TWO-PHASE REGRESSION TESTS

ROBUSTNESS OF TWO-PHASE REGRESSION TESTS REVSTAT Statistical Journal Volume 3, Number 1, June 2005, 1 18 ROBUSTNESS OF TWO-PHASE REGRESSION TESTS Authors: Carlos A.R. Diniz Departamento de Estatística, Universidade Federal de São Carlos, São

More information

Comparison of nonparametric analysis of variance methods a Monte Carlo study Part A: Between subjects designs - A Vote for van der Waerden

Comparison of nonparametric analysis of variance methods a Monte Carlo study Part A: Between subjects designs - A Vote for van der Waerden Comparison of nonparametric analysis of variance methods a Monte Carlo study Part A: Between subjects designs - A Vote for van der Waerden Version 5 completely revised and extended (13.7.2017) Haiko Lüpsen

More information

SOME ASPECTS OF MULTIVARIATE BEHRENS-FISHER PROBLEM

SOME ASPECTS OF MULTIVARIATE BEHRENS-FISHER PROBLEM SOME ASPECTS OF MULTIVARIATE BEHRENS-FISHER PROBLEM Junyong Park Bimal Sinha Department of Mathematics/Statistics University of Maryland, Baltimore Abstract In this paper we discuss the well known multivariate

More information

Empirical likelihood-based methods for the difference of two trimmed means

Empirical likelihood-based methods for the difference of two trimmed means Empirical likelihood-based methods for the difference of two trimmed means 24.09.2012. Latvijas Universitate Contents 1 Introduction 2 Trimmed mean 3 Empirical likelihood 4 Empirical likelihood for the

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

Two Measurement Procedures

Two Measurement Procedures Test of the Hypothesis That the Intraclass Reliability Coefficient is the Same for Two Measurement Procedures Yousef M. Alsawalmeh, Yarmouk University Leonard S. Feldt, University of lowa An approximate

More information

APPLICATION AND POWER OF PARAMETRIC CRITERIA FOR TESTING THE HOMOGENEITY OF VARIANCES. PART IV

APPLICATION AND POWER OF PARAMETRIC CRITERIA FOR TESTING THE HOMOGENEITY OF VARIANCES. PART IV DOI 10.1007/s11018-017-1213-4 Measurement Techniques, Vol. 60, No. 5, August, 2017 APPLICATION AND POWER OF PARAMETRIC CRITERIA FOR TESTING THE HOMOGENEITY OF VARIANCES. PART IV B. Yu. Lemeshko and T.

More information

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions

Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error Distributions Journal of Modern Applied Statistical Methods Volume 8 Issue 1 Article 13 5-1-2009 Least Absolute Value vs. Least Squares Estimation and Inference Procedures in Regression Models with Asymmetric Error

More information

Analysis of variance and linear contrasts in experimental design with generalized secant hyperbolic distribution

Analysis of variance and linear contrasts in experimental design with generalized secant hyperbolic distribution Journal of Computational and Applied Mathematics 216 (2008) 545 553 www.elsevier.com/locate/cam Analysis of variance and linear contrasts in experimental design with generalized secant hyperbolic distribution

More information

GENERAL PROBLEMS OF METROLOGY AND MEASUREMENT TECHNIQUE

GENERAL PROBLEMS OF METROLOGY AND MEASUREMENT TECHNIQUE DOI 10.1007/s11018-017-1141-3 Measurement Techniques, Vol. 60, No. 1, April, 2017 GENERAL PROBLEMS OF METROLOGY AND MEASUREMENT TECHNIQUE APPLICATION AND POWER OF PARAMETRIC CRITERIA FOR TESTING THE HOMOGENEITY

More information

TESTING FOR HOMOGENEITY IN COMBINING OF TWO-ARMED TRIALS WITH NORMALLY DISTRIBUTED RESPONSES

TESTING FOR HOMOGENEITY IN COMBINING OF TWO-ARMED TRIALS WITH NORMALLY DISTRIBUTED RESPONSES Sankhyā : The Indian Journal of Statistics 2001, Volume 63, Series B, Pt. 3, pp 298-310 TESTING FOR HOMOGENEITY IN COMBINING OF TWO-ARMED TRIALS WITH NORMALLY DISTRIBUTED RESPONSES By JOACHIM HARTUNG and

More information

Methodology Review: Applications of Distribution Theory in Studies of. Population Validity and Cross Validity. James Algina. University of Florida

Methodology Review: Applications of Distribution Theory in Studies of. Population Validity and Cross Validity. James Algina. University of Florida Distribution Theory 1 Methodology eview: Applications of Distribution Theory in Studies of Population Validity and Cross Validity by James Algina University of Florida and H. J. Keselman University of

More information

10/31/2012. One-Way ANOVA F-test

10/31/2012. One-Way ANOVA F-test PSY 511: Advanced Statistics for Psychological and Behavioral Research 1 1. Situation/hypotheses 2. Test statistic 3.Distribution 4. Assumptions One-Way ANOVA F-test One factor J>2 independent samples

More information

One-way ANOVA. Experimental Design. One-way ANOVA

One-way ANOVA. Experimental Design. One-way ANOVA Method to compare more than two samples simultaneously without inflating Type I Error rate (α) Simplicity Few assumptions Adequate for highly complex hypothesis testing 09/30/12 1 Outline of this class

More information
