Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests


¹Steele, M., ²Chaseling, J. and ³Hurst, C.

¹School of Mathematical and Physical Sciences, James Cook University, ²Australian School of Environmental Studies, Griffith University, ³School of Information Technology and Mathematical Sciences, University of Ballarat. E-Mail: Mike.Steele@jcu.edu.au

Keywords: Goodness-of-fit; Power; Empirical distribution function.

EXTENDED ABSTRACT

The use of goodness-of-fit test statistics for discrete or categorical data is widespread throughout the research community, with the Chi-Square the most popular when a researcher aims to determine if observed categorical data differs from a hypothesized multinomial distribution. Even for ordinal categorical data, the use of empirical distribution function (EDF) test statistics such as the Kolmogorov-Smirnov, the three Cramér-von Mises (A², W² and U² as defined below) and various modifications of these is limited in the literature. Power studies of the EDF type test statistics are even more limited.

This paper compares the simulated power of the three Cramér-von Mises test statistics with that of the Chi-Square test statistic for a uniform null hypothesis against a variety of alternative distributions, which are summarized in Figure 2. Recommendations are made on which is the most powerful test statistic for the predefined alternative distributions.

The results of the simulated power studies in this paper lead to the following general recommendations:

For trend type alternatives, A² and W² appear much more powerful than U² and χ². (See Figure 1 for a uniform null against a decreasing trend alternative distribution.)

For all the other investigated alternative distributions, U² and χ² appear much more powerful than A² and W². (See Figure 3 for a uniform null against a leptokurtic type alternative distribution.)

Figure 1. Powers for a uniform null and a decreasing alternative distribution.

Figure 2. Type of alternative distributions used in the power studies. Panels: Decreasing trend, Step, Triangular, Platykurtic, Leptokurtic, Bimodal.

Figure 3. Powers for a uniform null and a leptokurtic alternative distribution.

1. INTRODUCTION

Although designed for ordinal categorical data, the empirical distribution function (EDF) type goodness-of-fit test statistics Cramér-von Mises (W²), Anderson-Darling (A²) and Watson (U²) as defined by Choulakian et al. (1994) are not widely used in the applied literature. These authors have used simulation studies to show that A² and W² are relatively more powerful than the Chi-Square (χ²) test statistic (Pearson 1900) when the null distribution is uniform and the alternative distribution follows a trend. The test statistics are specified in Table 1.

Table 1. Test statistics used in the power study.

Test statistic          Equation
Cramér-von Mises        $W^2 = N^{-1} \sum_{i=1}^{k} Z_i^2 p_i$    (1)
Anderson-Darling        $A^2 = N^{-1} \sum_{i=1}^{k} \frac{Z_i^2 p_i}{H_i (1 - H_i)}$    (2)
Watson                  $U^2 = N^{-1} \sum_{i=1}^{k} (Z_i - \bar{Z})^2 p_i$    (3)
Pearson's Chi-Square    $\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$    (4)

where k is the number of cells, N is the sample size, $p_i$ is the probability of an event occurring in cell i, $E_i$ is the expected frequency in cell i, $O_i$ is the observed frequency in cell i, $Z_i = \sum_{j=1}^{i} (O_j - E_j)$, $H_i = N^{-1} \sum_{j=1}^{i} E_j$ (the cumulative null probability up to cell i) and $\bar{Z} = \sum_{i=1}^{k} Z_i p_i$.

There have been limited investigations of the powers of these particular EDF type test statistics. This paper uses simulated powers to extend the studies of Choulakian et al. (1994) and From (1996) by comparing the powers of the three Cramér-von Mises type test statistics with the χ² test statistic for a uniform null distribution (A_0) against the fully specified alternative distributions summarized in Figure 2 (Decreasing A_1, Step A_2, Triangular or bath-tub type A_3, Platykurtic A_4, Leptokurtic A_5 and Bimodal A_6) and fully defined in Table 2. The uniform null distribution was used because most similar published power studies of discrete goodness-of-fit tests have used such a null distribution; however, further work on non-uniform null distributions has been undertaken by Steele (2002). For a small number of categories some of the alternative distributions do not clearly exhibit the shapes illustrated in Figure 2, and the distributions become quite similar to one another. For this reason a larger number of categories (k = 10) was used.

Table 2. Distributions used in the power study.
Cell probabilities

Cell   1     2     3     4     5     6     7     8     9     10
A_0    0.10  0.10  0.10  0.10  0.10  0.10  0.10  0.10  0.10  0.10
A_1    .3.3..8.7.7.6.6.5.5
A_2    0.15  0.15  0.15  0.15  0.15  0.05  0.05  0.05  0.05  0.05
A_3    0.17  0.13  0.10  0.07  0.03  0.03  0.07  0.10  0.13  0.17
A_4    .4.....4
A_5    0.05  0.05  0.05  0.05  0.30  0.30  0.05  0.05  0.05  0.05
A_6    .5..7..6.6..7..5

In Section 2 the simulation and linear interpolation techniques used to approximate power are discussed, together with sample size considerations. The results of the power studies are presented in Section 3, and a summary of the most powerful test statistic for each alternative distribution is presented in the concluding Section 4.

2. CALCULATION OF THE SIMULATED POWER

For a uniform null distribution over ten cells against the alternative distributions defined in Table 2, the powers of the test statistics are approximated for sample sizes of 10, 20, 30, 50, 100 and 200. These sample sizes represent expected frequencies of 1, 2, 3, 5, 10 and 20 per cell under the uniform null distribution. By selecting these expected frequencies, researchers who use goodness-of-fit tests with a minimum requirement of 5 observations per cell can make power comparisons for different minimum numbers of observations per cell. The results also show that, in most of the situations discussed below, the largest sample sizes considered produce power approximations very close to 1.

The powers are estimated using simulated random samples. The simulated null distribution of each test statistic is discrete, which means that a critical value and corresponding power at a significance level of exactly 5% may not be possible. To enable meaningful comparisons of the powers of each test statistic, the powers are obtained for the critical values either side of the 5% level and linearly interpolated to produce the approximate power at the 5% level.
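The simulation and interpolation procedure described in Section 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names (`edf_statistics`, `interpolated_power`, `simulated_power`), the number of replications (`reps=10_000`), the seed and the rounding step are all hypothetical choices, and the step alternative is assumed to put probability 0.15 on each of the first five cells and 0.05 on each of the last five. The final term of A², which is 0/0 because Z_k = 0 and H_k = 1, is omitted here.

```python
import numpy as np

def edf_statistics(counts, p):
    """W2, A2, U2 and chi2 for each row of counts, given null cell probabilities p."""
    counts = np.atleast_2d(np.asarray(counts, dtype=float))
    p = np.asarray(p, dtype=float)
    n = counts.sum(axis=1, keepdims=True)          # sample size of each row
    expected = n * p                               # E_i
    z = np.cumsum(counts - expected, axis=1)       # Z_i
    h = np.cumsum(p)                               # H_i, cumulative null probability
    zbar = (z * p).sum(axis=1, keepdims=True)      # Z-bar
    n = n[:, 0]
    w2 = (z**2 * p).sum(axis=1) / n
    # Assumption: the last term (0/0) is dropped from the A2 sum.
    a2 = (z[:, :-1]**2 * p[:-1] / (h[:-1] * (1 - h[:-1]))).sum(axis=1) / n
    u2 = ((z - zbar)**2 * p).sum(axis=1) / n
    chi2 = ((counts - expected)**2 / expected).sum(axis=1)
    return {"W2": w2, "A2": a2, "U2": u2, "chi2": chi2}

def interpolated_power(null_stats, alt_stats, alpha=0.05):
    """Power at level alpha, interpolated between the attainable sizes either side."""
    # Rounding merges values that differ only by floating-point noise (assumption).
    null_sorted = np.sort(np.round(null_stats, 10))
    alt_sorted = np.sort(np.round(alt_stats, 10))
    crit = np.unique(null_sorted)                  # candidate critical values
    # Reject when the statistic is >= the critical value.
    size = 1.0 - np.searchsorted(null_sorted, crit, side="left") / null_sorted.size
    power = 1.0 - np.searchsorted(alt_sorted, crit, side="left") / alt_sorted.size
    # size decreases with crit; np.interp needs increasing x, so reverse both.
    # If alpha lies below every attainable size, np.interp returns the end value.
    return float(np.interp(alpha, size[::-1], power[::-1]))

def simulated_power(p_null, p_alt, n, reps=10_000, alpha=0.05, seed=0):
    """Simulate null and alternative samples and return interpolated powers."""
    rng = np.random.default_rng(seed)
    null = edf_statistics(rng.multinomial(n, p_null, size=reps), p_null)
    alt = edf_statistics(rng.multinomial(n, p_alt, size=reps), p_null)
    return {name: interpolated_power(null[name], alt[name], alpha) for name in null}

p0 = np.full(10, 0.1)                              # uniform null over ten cells
p_step = np.array([0.15] * 5 + [0.05] * 5)         # assumed step alternative
powers = simulated_power(p0, p_step, n=30)
print(powers)
```

Interpolating on the simulated size rather than picking the nearest critical value keeps the comparison between statistics fair, because each statistic has a different set of attainable sizes near 5%.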

3. POWER STUDY RESULTS

3.1. Uniform Null with a Decreasing (A_1) Alternative

For small sample sizes Figure 4 shows that A² and W² have powers greater than χ² and U². The largest cumulative difference between the uniform null and the decreasing alternative distribution occurs at the second cell, and as A² and W² are affected by large cumulative differences at the earlier cells, this is one reason why they have larger power under these circumstances. Also, χ² generally has higher power than U². For sample sizes of at least 5 per cell (i.e. N ≥ 50 in this example), the powers of all the test statistics are very high.

Figure 4. Powers for uniform null and decreasing (A_1) alternative.

3.2. Uniform Null with a Step Type (A_2) Alternative

For the step type distribution the cumulative difference between a uniform null and the step type A_2 distribution increases up to the fifth cell. Because they are more able to detect larger cumulative differences in the earlier cells, the test statistics A² and W² are shown in Figure 5 to be more powerful. It should be noted that the power of U² is almost as good as that of A² and W², while the power of χ² is noticeably less than that of the three Cramér-von Mises type test statistics. For larger sample sizes of ten or more per cell (i.e. N ≥ 100 in this situation) the powers of all four test statistics are very high and approximately the same.

Figure 5. Powers for a uniform null and step type (A_2) alternative.

3.3. Uniform Null with a Triangular (A_3) Alternative

The major cumulative differences between a uniform distribution and the A_3 triangular alternative distribution do not occur in the earlier cells, as was the case in Sections 3.1 and 3.2. For this reason it is expected that A² and W² are less likely to detect a difference and hence have lower power. The U² statistic is circular in that, although it can be used on ordinal type data, calculation of the test statistic does not depend on which cell is defined as the first. This circular test statistic is shown in Figure 6 to be much more powerful than the other three test statistics. However, for larger sample sizes the powers of all the test statistics are approximately the same and high. This result also agrees with that for a similar triangular type alternative distribution considered by Choulakian et al. (1994).

3.4. Uniform Null with a Platykurtic (A_4) Alternative

As the cumulative differences between a uniform null and the A_4 platykurtic alternative distribution are not large, the A² and W² test statistics are expected to have lower power. The power of W² is shown in Figure 7 to be very poor for all sample sizes; however, for the smaller sample sizes of up to five per cell (that is, N ≤ 50) under the uniform null, all the test statistics have poor power. For larger sample sizes χ² and U² are shown to have much higher power.
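The cumulative-difference argument used in Sections 3.1 to 3.4 can be checked directly by accumulating the differences between the alternative and null cell probabilities. The sketch below is illustrative only: the helper name `cumulative_differences` is hypothetical, and the step and triangular cell probabilities are assumed values (0.15 and 0.05 for the step; a symmetric 0.17, 0.13, 0.10, 0.07, 0.03 pattern for the triangular).

```python
import numpy as np

def cumulative_differences(p_alt, p_null):
    """Cumulative alternative minus cumulative null probability, cell by cell."""
    return np.cumsum(np.asarray(p_alt, dtype=float) - np.asarray(p_null, dtype=float))

p_null = np.full(10, 0.1)                          # uniform null over ten cells

# Assumed cell probabilities for two of the alternatives.
alternatives = {
    "step": [0.15] * 5 + [0.05] * 5,
    "triangular": [0.17, 0.13, 0.10, 0.07, 0.03, 0.03, 0.07, 0.10, 0.13, 0.17],
}

for name, p_alt in alternatives.items():
    d = cumulative_differences(p_alt, p_null)
    # Report the differences and the largest absolute difference.
    print(name, np.round(d, 2), "max |difference| =", round(float(np.abs(d).max()), 2))
```

Under these assumed probabilities the step alternative's cumulative difference rises to 0.25 at the fifth cell, whereas the triangular alternative never exceeds 0.10 in absolute value, which is in line with the lower power expected of A² and W² in Section 3.3.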

Figure 6. Powers for uniform null and triangular (A_3) alternative.

Figure 7. Powers for uniform null and platykurtic (A_4) alternative.

3.5. Uniform Null with a Leptokurtic (A_5) Alternative

As was also the case in Section 3.4, the cumulative differences between the uniform null and the leptokurtic A_5 alternative are quite small for the earlier cells, and the low powers of A² and W² in Figure 8 reflect this for smaller sample sizes. The powers of U² and χ² are shown to be approximately equal for all sample sizes. It appears that, due to its circular nature, U² is more able to detect the large cumulative differences which occur at the middle cells.

Figure 8. Powers for uniform null and leptokurtic (A_5) alternative.

3.6. Uniform Null with a Bimodal (A_6) Alternative

The powers of the test statistics are shown in Figure 9 to be quite diverse. The power of χ² is shown to be approximately double those of the other test statistics for smaller sample sizes. Although the power of U² is quite low, it is still much larger than the very weak powers of A² and W².

Figure 9. Powers for uniform null and bimodal (A_6) alternative.
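The circular property of U² referred to in Sections 3.3 and 3.5 can be illustrated numerically: relabelling which cell is "first" shifts every Z_i by the same constant, which changes W² but leaves the weighted variance in U² untouched. The following is a small sketch with hypothetical observed counts and a hypothetical helper name (`w2_and_u2`); it is not taken from the paper.

```python
import numpy as np

def w2_and_u2(obs, p):
    """Return (W2, U2) for observed counts obs and null cell probabilities p."""
    obs = np.asarray(obs, dtype=float)
    p = np.asarray(p, dtype=float)
    n = obs.sum()
    z = np.cumsum(obs - n * p)                     # Z_i
    zbar = np.sum(z * p)                           # Z-bar
    return np.sum(z**2 * p) / n, np.sum((z - zbar)**2 * p) / n

obs = np.array([12, 9, 7, 5, 3, 3, 2, 4, 2, 3])    # hypothetical counts, N = 50
p = np.full(10, 0.1)                               # uniform null over ten cells

# Rotate the cell labels and recompute both statistics.
for shift in range(4):
    w2, u2 = w2_and_u2(np.roll(obs, shift), np.roll(p, shift))
    print(f"shift={shift}: W2={w2:.4f}, U2={u2:.4f}")
```

The printed U² values agree across rotations while the W² values do not, which is why U² is not tied to differences occurring in the earlier cells.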

4. CONCLUSIONS

Although it is not possible to recommend one of these test statistics as being the most powerful for all situations, a very broad summary of the simulated powers in this paper suggests that, particularly for smaller sample sizes:

For trend type alternatives, A² and W² appear much more powerful than U² and χ².

For all the other investigated alternative distributions, U² and χ² appear much more powerful than A² and W².

Importantly, when considering the power of the test statistic, the simulated results presented in this and other papers suggest that the applied researcher should not blindly use one particular test statistic. However, the broad summary above may assist an applied researcher to at least consider alternatives to the χ² test when testing whether their observed ordinal data differs from that expected under a multinomial null distribution.

5. REFERENCES

Choulakian, V., Lockhart, R.A. and Stephens, M.A. (1994), Cramér-von Mises statistics for discrete distributions, The Canadian Journal of Statistics, 22, 125-137.

From, S.G. (1996), A new goodness of fit test for the equality of multinomial cell probabilities versus trend alternatives, Communications in Statistics - Theory and Methods, 25, 3167-3183.

Pearson, K. (1900), On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling, Philosophical Magazine Series 5, 50, 157-175.

Steele, M., Chaseling, J. and Hurst, C. (2005), A power study of goodness-of-fit tests for categorical data, 55th Session of the International Statistical Institute, Proceedings, Sydney, Australia.

Steele, M. (2002), The power of categorical goodness-of-fit test statistics, PhD thesis, Griffith University, Brisbane, Australia.