This is an author-produced version of a paper published in Statistics and Probability Letters. This paper has been peer-reviewed; it does not include the journal pagination. Citation for the published paper: Forkman, Johannes and Verrill, Steve. (2008) The distribution of McKay's approximation for the coefficient of variation. Statistics and Probability Letters. 78: 1, 10-14. ISSN: 0167-7152 http://dx.doi.org/10.1016/j.spl.2007.04.018 Access to the published version may require journal subscription. Published with permission from: Elsevier. Epsilon Open Archive http://epsilon.slu.se
The Distribution of McKay's Approximation for the Coefficient of Variation

Johannes Forkman a,1, Steve Verrill b

a Department of Biometry and Engineering, Swedish University of Agricultural Sciences, P.O. Box 7032, SE-750 07 Uppsala, Sweden
b U.S.D.A., Forest Products Laboratory, 1 Gifford Pinchot Drive, Madison, WI 53726, USA

Abstract

McKay's chi-square approximation for the coefficient of variation is type II noncentral beta distributed and asymptotically normal with mean $n - 1$ and variance smaller than $2(n - 1)$.

Key words: Coefficient of variation, McKay's approximation, Noncentral beta distribution.

1 Corresponding author. E-mail address: johannes.forkman@bt.slu.se
1 Introduction

The coefficient of variation is defined as the standard deviation divided by the mean. This measure, which is commonly expressed as a percentage, is widely used since it is often necessary to relate the size of the variation to the level of the observations. McKay (1932) introduced a $\chi^2$ approximation for the coefficient of variation calculated on normally distributed observations. It can be defined in the following way.

Definition 1. Let $y_j$, $j = 1, \ldots, n$, be $n$ independent observations from a normal distribution with expected value $\mu$ and variance $\sigma^2$. Let $\gamma$ denote the population coefficient of variation, i.e. $\gamma = \sigma/\mu$, and let $c$ denote the sample coefficient of variation, i.e.

\[ c = \frac{1}{m}\sqrt{\frac{1}{n-1}\sum_{j=1}^{n}(y_j - m)^2}, \qquad m = \frac{1}{n}\sum_{j=1}^{n} y_j. \]

McKay's approximation $K_n$ is defined as

\[ K_n = \left(1 + \frac{1}{\gamma^2}\right)\frac{(n-1)c^2}{1 + (n-1)c^2/n}. \tag{1} \]

As pointed out by Umphrey (1983), formula (1) appears slightly different in the original paper by McKay (1932), since McKay used the maximum likelihood estimator of $\sigma^2$, with denominator $n$, instead of the unbiased estimator with denominator $n - 1$. McKay (1932) claimed that $K_n$ is approximately central $\chi^2$ distributed with $n - 1$ degrees of freedom provided that $\gamma$ is small ($\gamma < 1/3$). This result was established by expressing the probability density function of $c$ as a contour integral and making an approximation in the complex plane. McKay did not theoretically express the size of the error of the approximation. For this reason Fieller (1932), in immediate connection to McKay's paper, investigated McKay's approximation $K_n$ numerically and concluded that it is quite adequate for any practical purpose. Also Pearson (1932) examined the new approximation and found it very satisfactory.

Later Iglewicz & Myers (1970) studied the usefulness of McKay's approximation for calculating quantiles of the distribution of the sample coefficient of variation $c$ when the underlying distribution is normal. They
compared results according to the approximation with exact results and found that the approximation is accurate. Umphrey (1983) corrected a similar study made by Warren (1982) and concluded that McKay's approximation is adequate. Vangel (1996) analytically compared the cumulative distribution function of McKay's approximation with the cumulative distribution function of the naïve $\chi^2$ approximation

\[ N_n = \frac{(n-1)c^2}{\gamma^2} \]

and showed that McKay's approximation is substantially more accurate. Vangel also proposed a small modification of McKay's approximation useful for calculating approximate confidence intervals for the coefficient of variation.

Forkman (2006) suggested McKay's approximation for testing the hypothesis that two coefficients of variation are equal. Another test for the hypothesis of equal coefficients of variation, also based on McKay's approximation, was proposed by Bennett (1976).

It is thus well documented that McKay's approximation is approximately central $\chi^2$ distributed with $n - 1$ degrees of freedom, and useful applications have been suggested. In this paper it is shown that McKay's approximation is type II noncentral beta distributed, and its asymptotic behavior is investigated.

2 The distribution of McKay's approximation

If $U$ and $V$ are independent central $\chi^2$ distributed random variables with $u$ and $v$ degrees of freedom respectively, the ratio $R = V/(U + V)$ is beta distributed with parameters $v/2$ and $u/2$. If $V$ is instead a noncentral $\chi^2$ distributed random variable, the ratio $R$ is noncentral beta distributed (Johnson & Kotz, 1970). In this case Chattamvelli (1995) calls the distribution of $R$ the type I noncentral beta distribution and the distribution of $1 - R$ the type II noncentral beta distribution. We shall, in agreement with Chattamvelli (1995), use the following definition.
Definition 2. Let $U$ be a central $\chi^2$ distributed random variable with $u$ degrees of freedom, and let $V$ be a noncentral $\chi^2$ distributed random variable, independent of $U$, with $v$ degrees of freedom and noncentrality parameter $\lambda$. The type II noncentral beta distribution with parameters $u/2$, $v/2$ and $\lambda$, denoted $\mathrm{Beta}_{II}(u/2, v/2, \lambda)$, is defined as the distribution of $U/(U + V)$.

The following theorem states that the random variable $K_n$, claimed by McKay (1932) to be approximately $\chi^2$ distributed, is type II noncentral beta distributed.

Theorem 3. The distribution of McKay's approximation $K_n$, as defined in Definition 1, is

\[ n\left(1 + \frac{1}{\gamma^2}\right)\mathrm{Beta}_{II}\left(\frac{n-1}{2}, \frac{1}{2}, \frac{n}{\gamma^2}\right). \tag{2} \]

Proof. Let $s$ denote the standard deviation, i.e. $s = cm$, so that $(n-1)s^2 = \sum_{j=1}^{n}(y_j - m)^2$. Then the second factor in (1) can be written

\[ \frac{(n-1)c^2}{1 + (n-1)c^2/n} = \frac{\sum_{j=1}^{n}(y_j - m)^2}{m^2 + \frac{1}{n}\sum_{j=1}^{n}(y_j - m)^2} = \frac{n\sum_{j=1}^{n}(y_j - m)^2}{\sum_{j=1}^{n}(y_j - m)^2 + \sum_{j=1}^{n} m^2} = \frac{nU_n}{U_n + V_n}, \]

where $U_n = \sum_{j=1}^{n}(y_j - m)^2/\sigma^2$ and $V_n = \sum_{j=1}^{n} m^2/\sigma^2$. Here $U_n$ is central $\chi^2$ distributed with $n - 1$ degrees of freedom. The average $m$ is normally distributed with expected value $\mu$ and variance $\sigma^2/n$. Consequently $nm^2/\sigma^2$, i.e. $V_n$, is $\chi^2$ distributed with 1 degree of freedom and noncentrality parameter $n\mu^2/\sigma^2 = n/\gamma^2$. Since the sums of squares $\sum_{j=1}^{n}(y_j - m)^2$ and $\sum_{j=1}^{n} m^2$ are independent, the theorem follows.

It is well known that $\sqrt{n}/c$ is noncentral $t$ distributed with $n - 1$ degrees of freedom and noncentrality parameter $\sqrt{n}/\gamma$ (e.g. Owen, 1968). Theorem 3 is easily proven from this starting point as well. We also note that the factor $n(1 + 1/\gamma^2)$ in (2) is the expected value of $U_n + V_n$ as defined in the proof of Theorem 3. This observation suggests application of the law of large numbers when investigating the convergence of McKay's approximation.
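The representation in Theorem 3 can be checked informally by simulation. The following is a minimal sketch, not part of the original paper: it draws $K_n$ from simulated normal samples using formula (1), draws $n(1 + 1/\gamma^2)\,U/(U + V)$ directly from chi-square variables built from standard normals, and compares the first two sample moments. The function names (`mckay_k`, `scaled_beta_ii`, `simulate`, `mean_and_variance`) and the default values of `n`, `mu`, `gamma`, `reps` and `seed` are illustrative assumptions.

```python
import math
import random


def mckay_k(y, gamma):
    """McKay's approximation K_n of Definition 1 for one sample y."""
    n = len(y)
    m = sum(y) / n
    s2 = sum((v - m) ** 2 for v in y) / (n - 1)  # unbiased variance estimator
    c2 = s2 / m ** 2                             # squared sample coefficient of variation
    return (1 + 1 / gamma ** 2) * (n - 1) * c2 / (1 + (n - 1) * c2 / n)


def scaled_beta_ii(n, gamma, rng):
    """One draw of n(1 + 1/gamma^2) U/(U + V), the form stated in Theorem 3."""
    # U: central chi-square with n - 1 degrees of freedom.
    u = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n - 1))
    # V: noncentral chi-square, 1 degree of freedom, noncentrality n / gamma^2.
    v = (rng.gauss(0.0, 1.0) + math.sqrt(n) / gamma) ** 2
    return n * (1 + 1 / gamma ** 2) * u / (u + v)


def simulate(n=10, mu=5.0, gamma=0.2, reps=20000, seed=1):
    """Return (draws of K_n from normal data, draws from the Theorem 3 form)."""
    rng = random.Random(seed)
    sigma = gamma * mu  # gamma = sigma / mu by Definition 1
    from_data = [
        mckay_k([rng.gauss(mu, sigma) for _ in range(n)], gamma)
        for _ in range(reps)
    ]
    from_beta = [scaled_beta_ii(n, gamma, rng) for _ in range(reps)]
    return from_data, from_beta


def mean_and_variance(x):
    """Sample mean and unbiased sample variance of a list of numbers."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / (len(x) - 1)
    return mean, var


if __name__ == "__main__":
    data_draws, beta_draws = simulate()
    print("from normal data:", mean_and_variance(data_draws))
    print("from Theorem 3  :", mean_and_variance(beta_draws))
```

If Theorem 3 holds, the two pairs of moments should agree up to Monte Carlo error. Agreement of two moments is of course only a sanity check, not a verification of the full distribution.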
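Theorem 4. The distribution of McKay's approximation $K_n$, as defined in Definition 1, equals the distribution of $U_n W_n$, where $U_n$ is a central $\chi^2$ distributed random variable with $n - 1$ degrees of freedom and $W_n$ is a random variable that converges in probability to 1.

Proof. Let $Z_k$, $k = 1, 2, \ldots, n - 1$, be independent standardized normal random variables. Then

\[ \frac{U_n}{n-1} \overset{d}{=} \frac{1}{n-1}\sum_{k=1}^{n-1} Z_k^2, \]

which converges almost surely to 1. Let also $Z$ denote a standardized normal random variable, and let $V_n = (Z + \sqrt{n}/\gamma)^2$. Then

\[ \frac{V_n}{n} = \frac{1}{n}\left(Z + \frac{\sqrt{n}}{\gamma}\right)^2 = \frac{Z^2}{n} + \frac{2Z}{\sqrt{n}\,\gamma} + \frac{1}{\gamma^2} \]

converges in probability to $1/\gamma^2$. Thus

\[ \frac{U_n + V_n}{n} \xrightarrow{p} 1 + \frac{1}{\gamma^2}. \tag{3} \]

By Theorem 3

\[ K_n \overset{d}{=} n\left(1 + \frac{1}{\gamma^2}\right)\frac{U_n}{U_n + V_n} = W_n U_n, \]

where $W_n = n(1 + 1/\gamma^2)/(U_n + V_n)$, by (3), converges in probability to 1.

Given Theorem 4 one might assume that McKay's approximation $K_n$ is asymptotically normal with mean $n - 1$ and variance $2(n - 1)$. Instead the following result holds.

Theorem 5. Let $K_n$ be McKay's approximation and $\gamma$ the coefficient of variation as defined in Definition 1. Then

\[ \frac{K_n - (n-1)}{\sqrt{2(n-1)}} \xrightarrow{d} N\left(0, \frac{1 + 2\gamma^2}{1 + 2\gamma^2 + \gamma^4}\right). \tag{4} \]

Proof. Let $Z$ denote a standardized normal random variable, and let $V_n = (Z + \sqrt{n}/\gamma)^2$. Let $U_n$ be a central $\chi^2$ distributed random variable with $n - 1$ degrees of freedom, independent of $V_n$. Then, by Theorem 3,

\[ \frac{K_n - (n-1)}{\sqrt{2(n-1)}} \overset{d}{=} \frac{1}{\sqrt{2(n-1)}}\left(\frac{n(1 + 1/\gamma^2)\,U_n}{U_n + V_n} - (n-1)\right) = A_n B_n, \tag{5} \]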
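where, by (3),

\[ A_n = \frac{n}{U_n + V_n} \xrightarrow{p} \frac{\gamma^2}{1 + \gamma^2} \tag{6} \]

and

\[ B_n = \frac{1}{\sqrt{2(n-1)}}\left(\frac{U_n(\gamma^2 + 1)}{\gamma^2} - \frac{(n-1)(U_n + V_n)}{n}\right). \]

We obtain

\[ B_n = C_n + D_n + E_n + F_n \tag{7} \]

where

\[ C_n = \frac{U_n - (n-1)}{\gamma^2\sqrt{2(n-1)}} \xrightarrow{d} N\left(0, \frac{1}{\gamma^4}\right), \tag{8} \]

\[ D_n = -\frac{\sqrt{2}\,(n-1)Z}{\gamma\sqrt{n(n-1)}} \xrightarrow{d} N\left(0, \frac{2}{\gamma^2}\right), \tag{9} \]

\[ E_n = \frac{U_n}{n\sqrt{2(n-1)}} \xrightarrow{p} 0, \tag{10} \]

\[ F_n = -\frac{(n-1)Z^2}{n\sqrt{2(n-1)}} \xrightarrow{p} 0. \tag{11} \]

Since $C_n$ is independent of $D_n$, results (7)-(11) imply that

\[ B_n \xrightarrow{d} N\left(0, \frac{1}{\gamma^4} + \frac{2}{\gamma^2}\right). \tag{12} \]

Results (5), (6) and (12) yield the theorem.

3 Discussion

We have seen that McKay's $\chi^2$ approximation for the coefficient of variation is exactly type II noncentral beta distributed. This observation provides insight into the approximation, originally derived by complex analysis. We showed that McKay's $\chi^2$ approximation in distribution equals the product of a $\chi^2$ distributed random variable and a variable that converges in probability to 1. Nevertheless, according to Theorem 5, McKay's $\chi^2$ approximation is asymptotically normal with mean $n - 1$ and variance $2(n - 1)(1 + 2\gamma^2)/(1 + \gamma^2)^2$,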
where $\gamma$ is the coefficient of variation. Since it has previously been assumed that McKay's approximation is asymptotically exact (Vangel, 1996), it is surprising that the variance does not equal $2(n - 1)$.

It should be noted, however, that McKay's $\chi^2$ approximation is intended for the cases in which the coefficient of variation $\gamma$ is smaller than 1/3. This requirement should be fulfilled when analyzing observations from a positive variable that is approximately normally distributed, since otherwise $\sigma > \mu/3$ and the probability of negative observations is not negligible. Provided that $\gamma < 1/3$, the standardized McKay's $\chi^2$ approximation (4) converges in distribution to a normal distribution with expected value 0 and variance larger than 0.99 but smaller than 1. McKay's $\chi^2$ approximation should consequently asymptotically be sufficiently accurate for most applications.

Though the inverse of the coefficient of variation is noncentral $t$ distributed and algorithms for calculating the cumulative distribution function of this distribution nowadays exist (Lenth, 1989), McKay's approximation is still adequate and may be useful for various composite inferential problems on the coefficient of variation in normally distributed data.

Algorithms for computing the cumulative distribution function of the noncentral beta distribution were reviewed by Chattamvelli (1995). The open source software R makes use of algorithms given by Lenth (1987) and Frick (1990).

4 Acknowledgements

We thank Prof. Dietrich von Rosen for ideas and discussions. The Centre of Biostochastics, Swedish University of Agricultural Sciences and Pharmacia Diagnostics AB financed the research.

5 References

Bennett, B.M. (1976), On an Approximate Test for Homogeneity of Coefficients of Variation, in: W.J. Ziegler, ed., Contributions to Applied Statistics dedicated to A.
Linder, Experientia Suppl. 22, 169-171.
Chattamvelli, R. (1995), A Note on the Noncentral Beta Distribution Function, Amer. Statist. 49, 231-234.
Fieller, E.C. (1932), A Numerical Test of the Adequacy of A. T. McKay's Approximation, J. Roy. Statist. Soc. 95, 699-702.
Forkman, J. (2006), Statistical Inference for the Coefficient of Variation in Normally Distributed Data, Centre of Biostochastics, Swedish University of Agricultural Sciences, Research Report 2006:2.
Frick, H. (1990), Algorithm AS R84: A Remark on Algorithm AS 226: Computing Non-central Beta Probabilities, Appl. Statist. 39, 311-312.
Iglewicz, B. and Myers, R.H. (1970), Comparison of Approximations to the Percentage Points of the Sample Coefficient of Variation, Technometrics 12, 166-169.
Johnson, N.L. and Kotz, S. (1970), Distributions in Statistics: Continuous Univariate Distributions 2 (Wiley, New York).
Lenth, R.V. (1987), Algorithm AS 226: Computing Noncentral Beta Probabilities, Appl. Statist. 36, 241-244.
Lenth, R.V. (1989), Algorithm AS 243: Cumulative Distribution Function of the Noncentral t Distribution, Appl. Statist. 38, 185-189.
McKay, A.T. (1932), Distribution of the Coefficient of Variation and the Extended t Distribution, J. Roy. Statist. Soc. 95, 695-698.
Owen, D.B. (1968), A Survey of Properties and Applications of the Noncentral t-distribution, Technometrics 10, 445-478.
Pearson, E.S. (1932), Comparison of A. T. McKay's Approximation with Experimental Sampling Results, J. Roy. Statist. Soc. 95, 703-704.
Umphrey, G.J. (1983), A Comment on McKay's Approximation for the Coefficient of Variation, Commun. Stat. Simul. C. 12, 629-635.
Vangel, M.G. (1996), Confidence Intervals for a Normal Coefficient of Variation, Amer. Statist. 50, 21-26.
Warren, W.G. (1982), On the Adequacy of the Chi-Squared Approximation for the Coefficient of Variation, Commun. Stat. Simul. C. 11, 659-666.