Access to the published version may require journal subscription. Published with permission from: Elsevier.

This is an author produced version of a paper published in Statistics and Probability Letters. This paper has been peer-reviewed; it does not include the journal pagination.

Citation for the published paper: Forkman, Johannes and Verrill, Steve. (2008) The distribution of McKay's approximation for the coefficient of variation. Statistics and Probability Letters. 78: 1, 10-14. ISSN: 0167-7152 http://dx.doi.org/10.1016/j.spl.2007.04.018

Epsilon Open Archive http://epsilon.slu.se

The Distribution of McKay's Approximation for the Coefficient of Variation

Johannes Forkman a,1, Steve Verrill b

a Department of Biometry and Engineering, Swedish University of Agricultural Sciences, P.O. Box 7032, SE-750 07 Uppsala, Sweden
b U.S.D.A., Forest Products Laboratory, 1 Gifford Pinchot Drive, Madison, WI 53726, USA

Abstract

McKay's chi-square approximation for the coefficient of variation is type II noncentral beta distributed and asymptotically normal with mean $n - 1$ and variance smaller than $2(n - 1)$.

Key words: Coefficient of variation, McKay's approximation, Noncentral beta distribution.

1 Corresponding author. E-mail address: johannes.forkman@bt.slu.se

1 Introduction

The coefficient of variation is defined as the standard deviation divided by the mean. This measure, which is commonly expressed as a percentage, is widely used since it is often necessary to relate the size of the variation to the level of the observations. McKay (1932) introduced a $\chi^2$ approximation for the coefficient of variation calculated on normally distributed observations. It can be defined in the following way.

Definition 1. Let $y_j$, $j = 1, \ldots, n$, be independent observations from a normal distribution with expected value $\mu$ and variance $\sigma^2$. Let $\gamma$ denote the population coefficient of variation, i.e. $\gamma = \sigma/\mu$, and let $c$ denote the sample coefficient of variation, i.e.

$$c = \frac{1}{m} \sqrt{\frac{1}{n-1} \sum_{j=1}^{n} (y_j - m)^2}, \qquad m = \frac{1}{n} \sum_{j=1}^{n} y_j.$$

McKay's approximation $K_n$ is defined as

$$K_n = \left(1 + \frac{1}{\gamma^2}\right) \frac{(n-1)\,c^2}{1 + \dfrac{(n-1)\,c^2}{n}}. \qquad (1)$$

As pointed out by Umphrey (1983), formula (1) appears slightly different in the original paper by McKay (1932), since McKay used the maximum likelihood estimator of $\sigma^2$, with denominator $n$, instead of the unbiased estimator with denominator $n - 1$. McKay (1932) claimed that $K_n$ is approximately central $\chi^2$ distributed with $n - 1$ degrees of freedom provided that $\gamma$ is small ($\gamma < 1/3$). This result was established by expressing the probability density function of $c$ as a contour integral and making an approximation in the complex plane. McKay did not theoretically express the size of the error of the approximation. For this reason Fieller (1932), in immediate connection to McKay's paper, investigated McKay's approximation $K_n$ numerically and concluded that it is quite adequate for any practical purpose. Pearson (1932) also examined the new approximation and found it very satisfactory.

Later Iglewicz & Myers (1970) studied the usefulness of McKay's approximation for calculating quantiles of the distribution of the sample coefficient of variation $c$ when the underlying distribution is normal. They compared results according to the approximation with exact results and found that the approximation is accurate. Umphrey (1983) corrected a similar study made by Warren (1982) and concluded that McKay's approximation is adequate. Vangel (1996) analytically compared the cumulative distribution function of McKay's approximation with the cumulative distribution function of the naïve $\chi^2$ approximation

$$N_n = \frac{(n-1)\,c^2}{\gamma^2}$$

and showed that McKay's approximation is substantially more accurate. Vangel also proposed a small modification of McKay's approximation useful for calculating approximate confidence intervals for the coefficient of variation. Forkman (2006) suggested McKay's approximation for testing the hypothesis that two coefficients of variation are equal. Another test for the hypothesis of equal coefficients of variation, also based on McKay's approximation, was proposed by Bennett (1976).

It is thus well documented that McKay's approximation is approximately central $\chi^2$ distributed with $n - 1$ degrees of freedom, and useful applications have been suggested. In this paper it is shown that McKay's approximation is type II noncentral beta distributed, and its asymptotic behavior is investigated.

2 The distribution of McKay's approximation

If $U$ and $V$ are independent central $\chi^2$ distributed random variables with $u$ and $v$ degrees of freedom respectively, the ratio $R = V/(U + V)$ is beta distributed with $v/2$ and $u/2$ degrees of freedom respectively. If $V$ is instead a noncentral $\chi^2$ distributed random variable, the ratio $R$ is noncentral beta distributed (Johnson & Kotz, 1970). In this case Chattamvelli (1995) calls the distribution of $R$ the type I noncentral beta distribution and the distribution of $1 - R$ the type II noncentral beta distribution. We shall in agreement with Chattamvelli (1995) use the following definition.

Definition 2. Let $U$ be a central $\chi^2$ distributed random variable with $u$ degrees of freedom, and let $V$ be a noncentral $\chi^2$ distributed random variable, independent of $U$, with $v$ degrees of freedom and noncentrality parameter $\lambda$. The type II noncentral beta distribution with parameters $u/2$, $v/2$ and $\lambda$, denoted $\mathrm{Beta\,II}(u/2, v/2, \lambda)$, is defined as the distribution of $U/(U + V)$.

The following theorem states that the random variable $K_n$, claimed by McKay (1932) to be approximately $\chi^2$ distributed, is type II noncentral beta distributed.

Theorem 3. The distribution of McKay's approximation $K_n$, as defined in Definition 1, is

$$\left(1 + \frac{1}{\gamma^2}\right) n \,\mathrm{Beta\,II}\!\left(\frac{n-1}{2}, \frac{1}{2}, \frac{n}{\gamma^2}\right). \qquad (2)$$

Proof. Let $s$ denote the standard deviation, i.e. $s = cm$. Then the second factor in (1) can be written

$$\frac{(n-1)\,c^2}{1 + \dfrac{(n-1)\,c^2}{n}} = \frac{\sum_{j=1}^{n} (y_j - m)^2}{m^2 + \frac{1}{n} \sum_{j=1}^{n} (y_j - m)^2} = \frac{n \sum_{j=1}^{n} (y_j - m)^2}{\sum_{j=1}^{n} (y_j - m)^2 + \sum_{j=1}^{n} m^2} = \frac{n\,U_n}{U_n + V_n},$$

where $U_n = \sum_{j=1}^{n} (y_j - m)^2/\sigma^2$ and $V_n = \sum_{j=1}^{n} m^2/\sigma^2$. Here $U_n$ is central $\chi^2$ distributed with $n - 1$ degrees of freedom. The average $m$ is normally distributed with expected value $\mu$ and variance $\sigma^2/n$. Consequently $n m^2/\sigma^2$, i.e. $V_n$, is $\chi^2$ distributed with 1 degree of freedom and noncentrality parameter $n\mu^2/\sigma^2 = n/\gamma^2$. Since the sums of squares $\sum_{j=1}^{n} (y_j - m)^2$ and $\sum_{j=1}^{n} m^2$ are independent, the theorem follows.

It is well known that $\sqrt{n}/c$ is noncentral $t$ distributed with $n - 1$ degrees of freedom and noncentrality parameter $\sqrt{n}/\gamma$ (e.g. Owen, 1968). Theorem 3 is easily proven from this starting point as well. We also note that the factor $n(1 + 1/\gamma^2)$ in (2) is the expected value of $U_n + V_n$ as defined in the proof of Theorem 3. This observation suggests application of the law of large numbers when investigating the convergence of McKay's approximation.
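The algebraic step in the proof of Theorem 3 can be checked numerically on a single sample. The following is a minimal Python sketch, not part of the original paper: the function names `mckay_k` and `mckay_k_from_uv`, the seed, and the values of $\mu$, $\sigma$ and $n$ are illustrative assumptions. It computes $K_n$ once from Definition 1 and once as $n(1 + 1/\gamma^2)\,U_n/(U_n + V_n)$; the two values should agree up to floating-point rounding.

```python
import math
import random


def mckay_k(y, gamma):
    """McKay's approximation K_n from Definition 1 (gamma = population CV)."""
    n = len(y)
    m = sum(y) / n
    s2 = sum((v - m) ** 2 for v in y) / (n - 1)  # unbiased variance estimate
    c2 = s2 / m ** 2                             # squared sample CV
    return (1 + 1 / gamma ** 2) * (n - 1) * c2 / (1 + (n - 1) * c2 / n)


def mckay_k_from_uv(y, mu, sigma):
    """Same quantity via the proof of Theorem 3: n (1 + 1/gamma^2) U / (U + V)."""
    n = len(y)
    m = sum(y) / n
    u = sum((v - m) ** 2 for v in y) / sigma ** 2  # central chi-square, n - 1 df
    v = n * m ** 2 / sigma ** 2                    # noncentral chi-square, 1 df
    gamma = sigma / mu
    return n * (1 + 1 / gamma ** 2) * u / (u + v)


# Illustrative (assumed) parameter values; gamma = sigma / mu = 0.2 < 1/3.
rng = random.Random(0)
mu, sigma, n = 10.0, 2.0, 15
sample = [rng.gauss(mu, sigma) for _ in range(n)]

k_def = mckay_k(sample, sigma / mu)
k_uv = mckay_k_from_uv(sample, mu, sigma)
print(k_def, k_uv)  # the two numbers coincide up to rounding error
```

The check is an identity for every sample, so it does not depend on the seed; it illustrates only the rewriting of (1), not the distributional claims about $U_n$ and $V_n$.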

Theorem 4. The distribution of McKay's approximation $K_n$, as defined in Definition 1, equals the distribution of $U_n W_n$, where $U_n$ is a central $\chi^2$ distributed random variable with $n - 1$ degrees of freedom and $W_n$ is a random variable that converges in probability to 1.

Proof. Let $Z_k$, $k = 1, 2, \ldots, n - 1$, be independent standardized normal random variables. Then

$$\frac{U_n}{n-1} \overset{d}{=} \frac{1}{n-1} \sum_{k=1}^{n-1} Z_k^2,$$

which converges almost surely to 1. Let also $Z$ denote a standardized normal random variable, and let

$$V_n = \left(Z + \frac{\sqrt{n}}{\gamma}\right)^2 = Z^2 + \frac{2Z\sqrt{n}}{\gamma} + \frac{n}{\gamma^2}.$$

Then $V_n/n$ converges in probability to $1/\gamma^2$. Thus

$$\frac{U_n + V_n}{n} \overset{p}{\to} 1 + \frac{1}{\gamma^2}. \qquad (3)$$

By Theorem 3,

$$K_n \overset{d}{=} n\left(1 + \frac{1}{\gamma^2}\right) \frac{U_n}{U_n + V_n} = W_n U_n,$$

where $W_n = n(1 + 1/\gamma^2)/(U_n + V_n)$, by (3), converges in probability to 1.

Given Theorem 4 one might assume that McKay's approximation $K_n$ is asymptotically normal with mean $n - 1$ and variance $2(n - 1)$. Instead the following result holds.

Theorem 5. Let $K_n$ be McKay's approximation and $\gamma$ the coefficient of variation as defined in Definition 1. Then

$$\frac{K_n - (n-1)}{\sqrt{2(n-1)}} \overset{d}{\to} N\!\left(0, \frac{1 + 2\gamma^2}{1 + 2\gamma^2 + \gamma^4}\right). \qquad (4)$$

Proof. Let $Z$ denote a standardized normal random variable, and let $V_n = (Z + \sqrt{n}/\gamma)^2$. Let $U_n$ be a central $\chi^2$ distributed random variable with $n - 1$ degrees of freedom, independent of $V_n$. Then, by Theorem 3,

$$\frac{K_n - (n-1)}{\sqrt{2(n-1)}} \overset{d}{=} \frac{1}{\sqrt{2(n-1)}} \left(\frac{n(1 + 1/\gamma^2)\,U_n}{U_n + V_n} - (n-1)\right) = A_n B_n, \qquad (5)$$

where, by (3),

$$A_n = \frac{n}{U_n + V_n} \overset{p}{\to} \frac{\gamma^2}{1 + \gamma^2} \qquad (6)$$

and

$$B_n = \frac{1}{\sqrt{2(n-1)}} \left(\frac{U_n(\gamma^2 + 1)}{\gamma^2} - \frac{(n-1)(U_n + V_n)}{n}\right).$$

We obtain

$$B_n = C_n + D_n + E_n + F_n, \qquad (7)$$

where

$$C_n = \frac{U_n - (n-1)}{\gamma^2 \sqrt{2(n-1)}} \overset{d}{\to} N\!\left(0, \frac{1}{\gamma^4}\right), \qquad (8)$$

$$D_n = -\frac{2(n-1)Z}{\gamma \sqrt{2n(n-1)}} \overset{d}{\to} N\!\left(0, \frac{2}{\gamma^2}\right), \qquad (9)$$

$$E_n = \frac{U_n}{n\sqrt{2(n-1)}} \overset{p}{\to} 0, \qquad (10)$$

$$F_n = -\frac{(n-1)Z^2}{n\sqrt{2(n-1)}} \overset{p}{\to} 0. \qquad (11)$$

Since $C_n$ is independent of $D_n$, results (7)-(11) imply that

$$B_n \overset{d}{\to} N\!\left(0, \frac{1}{\gamma^4} + \frac{2}{\gamma^2}\right). \qquad (12)$$

Results (5), (6) and (12) yield the theorem.

3 Discussion

We have seen that McKay's $\chi^2$ approximation for the coefficient of variation is exactly type II noncentral beta distributed. This observation provides insight into the approximation, originally derived by complex analysis. We showed that McKay's $\chi^2$ approximation in distribution equals the product of a $\chi^2$ distributed random variable and a variable that converges in probability to 1. Nevertheless, according to Theorem 5, McKay's $\chi^2$ approximation is asymptotically normal with mean $n - 1$ and variance $2(n-1)(1 + 2\gamma^2)/(1 + \gamma^2)^2$, where $\gamma$ is the coefficient of variation. Since it has previously been assumed that McKay's approximation is asymptotically exact (Vangel, 1996), it is surprising that the variance does not equal $2(n - 1)$.

It should be noted, however, that McKay's $\chi^2$ approximation is intended for the cases in which the coefficient of variation $\gamma$ is smaller than 1/3. This requirement should be fulfilled when analyzing observations from a positive variable that is approximately normally distributed, since otherwise $\sigma > \mu/3$ and the probability of negative observations is not negligible. Provided that $\gamma < 1/3$, the standardized McKay's $\chi^2$ approximation (4) converges in distribution to a normal distribution with expected value 0 and variance larger than 0.99 but smaller than 1. McKay's $\chi^2$ approximation should consequently asymptotically be sufficiently accurate for most applications.

Though the inverse of the coefficient of variation is noncentral $t$ distributed and algorithms for calculating the cumulative distribution function of this distribution nowadays exist (Lenth, 1989), McKay's approximation is still adequate and may be useful for various composite inferential problems on the coefficient of variation in normally distributed data. Algorithms for computing the cumulative distribution function of the noncentral beta distribution were reviewed by Chattamvelli (1995). The open source software R makes use of algorithms given by Lenth (1987) and Frick (1990).

4 Acknowledgements

We thank Prof. Dietrich von Rosen for ideas and discussions. The Centre of Biostochastics, Swedish University of Agricultural Sciences and Pharmacia Diagnostics AB financed the research.

5 References

Bennett, B.M. (1976), On an Approximate Test for Homogeneity of Coefficients of Variation, in: W.J. Ziegler, ed., Contributions to Applied Statistics dedicated to A. Linder, Experientia Suppl. 22, 169-171.

Chattamvelli, R. (1995), A Note on the Noncentral Beta Distribution Function, Amer. Statist. 49, 231-234.

Fieller, E.C. (1932), A Numerical Test of the Adequacy of A. T. McKay's Approximation, J. Roy. Statist. Soc. 95, 699-702.

Forkman, J. (2006), Statistical Inference for the Coefficient of Variation in Normally Distributed Data, Centre of Biostochastics, Swedish University of Agricultural Sciences, Research Report 2006:2.

Frick, H. (1990), Algorithm AS R84: A Remark on Algorithm AS 226: Computing Non-central Beta Probabilities, Appl. Statist. 39, 311-312.

Iglewicz, B. and Myers, R.H. (1970), Comparison of Approximations to the Percentage Points of the Sample Coefficient of Variation, Technometrics 12, 166-169.

Johnson, N.L. and Kotz, S. (1970), Distributions in Statistics: Continuous Univariate Distributions 2 (Wiley, New York).

Lenth, R.V. (1987), Algorithm AS 226: Computing Noncentral Beta Probabilities, Appl. Statist. 36, 241-244.

Lenth, R.V. (1989), Algorithm AS 243: Cumulative Distribution Function of the Noncentral t Distribution, Appl. Statist. 38, 185-189.

McKay, A.T. (1932), Distribution of the Coefficient of Variation and the Extended t Distribution, J. Roy. Statist. Soc. 95, 695-698.

Owen, D.B. (1968), A Survey of Properties and Applications of the Noncentral t-Distribution, Technometrics 10, 445-478.

Pearson, E.S. (1932), Comparison of A. T. McKay's Approximation with Experimental Sampling Results, J. Roy. Statist. Soc. 95, 703-704.

Umphrey, G.J. (1983), A Comment on McKay's Approximation for the Coefficient of Variation, Commun. Stat. Simul. C. 12, 629-635.

Vangel, M.G. (1996), Confidence Intervals for a Normal Coefficient of Variation, Amer. Statist. 15, 21-26.

Warren, W.G. (1982), On the Adequacy of the Chi-Squared Approximation for the Coefficient of Variation, Commun. Stat. Simul. C. 11, 659-666.
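As a numerical companion to Theorem 5 and the Discussion, the limiting variance $(1 + 2\gamma^2)/(1 + \gamma^2)^2$ can be compared with a Monte Carlo estimate. The sketch below is an illustration added under stated assumptions, not the authors' code: the function names, seed, sample size $n = 400$, number of replicates and the choice $\gamma = 1$ are all hypothetical, and $\gamma = 1$ lies outside McKay's intended range $\gamma < 1/3$; it is used only because the gap to the naive value 1 is then easy to see. Draws of $K_n$ are generated through the representation $n(1 + 1/\gamma^2)\,U_n/(U_n + V_n)$ of Theorem 3 rather than from raw normal samples.

```python
import math
import random
import statistics


def limiting_variance(gamma):
    """Variance of the normal limit in Theorem 5: (1 + 2 g^2) / (1 + g^2)^2."""
    g2 = gamma ** 2
    return (1 + 2 * g2) / (1 + g2) ** 2


def simulate_standardized_k(n, gamma, reps, seed=0):
    """Draw (K_n - (n - 1)) / sqrt(2 (n - 1)) via the representation in Theorem 3."""
    rng = random.Random(seed)
    scale = n * (1 + 1 / gamma ** 2)
    shift = math.sqrt(n) / gamma
    draws = []
    for _ in range(reps):
        u = rng.gammavariate((n - 1) / 2, 2.0)  # chi-square with n - 1 df
        v = (rng.gauss(0.0, 1.0) + shift) ** 2  # noncentral chi-square, 1 df
        k = scale * u / (u + v)
        draws.append((k - (n - 1)) / math.sqrt(2 * (n - 1)))
    return draws


# Assumed illustration values; gamma = 1 makes the limit 0.75 instead of 1.
draws = simulate_standardized_k(n=400, gamma=1.0, reps=20000)
empirical = statistics.pvariance(draws)
theoretical = limiting_variance(1.0)

# At the boundary gamma = 1/3 the limiting variance is (11/9) / (100/81) = 0.99.
bound = limiting_variance(1 / 3)
print(empirical, theoretical, bound)
```

For $\gamma < 1/3$ the same function returns values between 0.99 and 1, which is the range quoted in the Discussion; a simulation would need far more replicates to separate such values from 1.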