This is an author produced version of a paper published in Communications in Statistics - Theory and Methods. This paper has been peer-reviewed and is proof-corrected, but does not include the journal pagination. Citation for the published paper: Forkman, J. (2009) Estimator and Tests for Common Coefficients of Variation in Normal Distributions. Communications in Statistics - Theory and Methods. Volume: 38 Number: 2, pp 233-251. http://dx.doi.org/10.1080/03610920802187448 Access to the published version may require journal subscription. Published with permission from: Taylor & Francis. Epsilon Open Archive http://epsilon.slu.se

Estimator and Tests for Common Coefficients of Variation in Normal Distributions

Johannes Forkman
Department of Energy and Technology, Swedish University of Agricultural Sciences, Box 7032, SE-750 07 Uppsala, Sweden
Johannes.Forkman@vpe.slu.se

Abstract

Inference for the coefficient of variation in normal distributions is considered. An explicit estimator of a coefficient of variation that is shared by several populations with normal distributions is proposed. Methods for making confidence intervals and statistical tests, based on McKay's approximation for the coefficient of variation, are provided. Exact expressions for the first two moments of McKay's approximation are given. An approximate F-test for equality of a coefficient of variation that is shared by several normal distributions and a coefficient of variation that is shared by several other normal distributions is introduced.

Key words: Coefficient of Variation, Confidence interval, Hypothesis test, McKay's approximation, Normal distribution, Statistical inference

1 Introduction

The coefficient of variation $c$ in a single sample with observations $y_1, y_2, \ldots, y_n$ is defined as $c = s/m$, where $m$ and $s$ are

$$m = \frac{1}{n}\sum_{j=1}^{n} y_j \quad\text{and}\quad s = \sqrt{\frac{1}{n-1}\sum_{j=1}^{n}(y_j - m)^2}, \tag{1}$$

respectively. In this paper we consider independently normally distributed observations with expected value $\mu > 0$, variance $\sigma^2$ and population coefficient of variation $\gamma = \sigma/\mu$. We discuss three problems:

(i) Estimation of a coefficient of variation $\gamma$ that is shared by $k$ populations

(ii) Confidence interval and test for a coefficient of variation $\gamma$ that is shared by $k$ populations

(iii) Test for equality of a coefficient of variation $\gamma_1$ that is shared by $k_1$ populations and a coefficient of variation $\gamma_2$ that is shared by $k_2$ populations

Given $k$ estimates $c_1, \ldots, c_k$ of a common coefficient of variation $\gamma$, a method is needed for pooling the estimates into one estimate. Explicitly, we want to know if the single estimates shall be weighted by the number of observations $n_i$, by the degrees of freedom $n_i - 1$ or by some other function of the sample size. Zeigler (1973) compared several estimators of a common coefficient of variation, but considered only the case of equally large sample sizes, and did not discuss hypothesis tests and confidence intervals. Tian (2005) studied the problem of making inference about a common $\gamma$ based on $k$ samples and suggested a repeated sampling method for calculating a generalized probability value as defined by Tsui and Weerahandi (1989). A drawback with resampling methods is that they do not give the same result whenever applied. There could be a need for a method based on explicit expressions. Verrill and Johnson (2007) proposed a likelihood ratio based confidence interval for a common coefficient of variation. However, the likelihood ratio test is known to be too liberal for small sample sizes (Doornbos and Dijkstra, 1983; Fung and Tsang, 1998; Nairy and Rao, 2003). This is confirmed in Section 4.2 of the present paper. The likelihood ratio test is computationally inconvenient when there are many populations. Verrill and Johnson (2007) provided, for the likelihood ratio test, a web-based program that simulates critical values for small samples.

We suggest that confidence intervals and tests for a common coefficient of variation $\gamma$ be based on the transformation for the coefficient of variation developed by McKay (1932). This transformation gives an approximately $\chi^2$ distributed random variable when $\gamma < 1/3$, as confirmed by Fieller (1932), Pearson (1932), Iglewicz and Myers (1970) and Umphrey (1983). Forkman and Verrill (2008) showed that McKay's $\chi^2$ approximation is asymptotically normal with mean $n - 1$ and variance slightly smaller than $2(n - 1)$. Vangel (1996) showed, by Taylor series expansion, that the error in McKay's approximation is small when the population coefficient of variation is small. We derive exact expressions for the first two moments of McKay's approximation.

A test is introduced, based on McKay's transformation, for the equality of a coefficient of variation that is shared by $k_1$ populations and a coefficient of variation that is shared by $k_2$ populations. Many tests have been proposed for the special case $k_1 = k_2 = 1$: the likelihood ratio test (Lohrding, 1975; Bennett, 1977; Doornbos and Dijkstra, 1983), the Wald test and the score test (Gupta and Ma, 1996), Bennett's test (Bennett, 1976; Shafer and Sullivan, 1986) and Miller's test (Miller, 1991a; Feltz and Miller, 1996; Miller and Feltz, 1997).

The three problems listed above are considered in Sections 2–4, respectively. In Section 4.2 we make a small Monte Carlo study of the performance of the new test for equality of coefficients of variation compared with the likelihood ratio test, Bennett's test and Miller's test.

2 Estimation of a Common Coefficient of Variation

Consider samples from $k$ normally distributed populations with a common population coefficient of variation $\gamma$, and define the sample coefficients of variation as in Definition 1.

Definition 1. Let $y_{ij} = \mu_i + e_{ij}$, where $e_{ij}$ are independently distributed $N(0, \sigma_i^2)$, $i = 1, 2, \ldots, k$ and $j = 1, 2, \ldots, n_i$, with positive expected values $\mu_i$ and a positive common population coefficient of variation $\gamma = \sigma_i/\mu_i$, $i = 1, 2, \ldots, k$. The coefficient of variation $c_i$ of sample $i$, $i = 1, 2, \ldots, k$, is defined as $c_i = s_i/m_i$, where $m_i$ and $s_i$ are

$$m_i = \frac{1}{n_i}\sum_{j=1}^{n_i} y_{ij} \quad\text{and}\quad s_i = \sqrt{\frac{1}{n_i-1}\sum_{j=1}^{n_i}(y_{ij} - m_i)^2}, \tag{2}$$

respectively.

We shall, throughout this paper, assume that the expected values $\mu_i$, $i = 1, 2, \ldots, k$, are positive, which implies $\gamma > 0$. Since we are focused on applications with positive and approximately normally distributed observations, we also assume that the probability of negative observations is small. We make this assumption by requiring $\mu_i - 3\sigma_i > 0$, $i = 1, 2, \ldots, k$. This implies $\gamma < 1/3$. The joint probability density function of the observations $\{y_{ij}\}$ can be written

$$\prod_{i=1}^{k} \frac{1}{(2\pi)^{n_i/2}} \exp\left(\frac{1}{\mu_i\gamma^2}\sum_{j=1}^{n_i} y_{ij} - \frac{1}{2\mu_i^2\gamma^2}\sum_{j=1}^{n_i} y_{ij}^2 - \frac{n_i}{2\gamma^2} - n_i\log \mu_i\gamma\right). \tag{3}$$

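Thus, by the factorization theorem, the $2k$-dimensional statistic

$$S = \left\{\sum_{j=1}^{n_i} y_{ij},\ \sum_{j=1}^{n_i} y_{ij}^2\right\}_{i=1}^{k}$$

is sufficient for $\eta = \{1/(\mu_i\gamma^2),\ 1/(\mu_i^2\gamma^2)\}_{i=1}^{k}$, and since there is a one-to-one correspondence between $\eta$ and $\beta = (\gamma, \mu_1, \mu_2, \ldots, \mu_k)$, $S$ is also sufficient for $\beta$. By writing (3) as

$$\exp\left(-\frac{1}{2\gamma^2}\sum_{i=1}^{k}\sum_{j=1}^{n_i}\frac{(y_{ij}-\mu_i)^2}{\mu_i^2} - \sum_{i=1}^{k} n_i\log \mu_i\gamma - \sum_{i=1}^{k}\frac{n_i}{2}\log(2\pi)\right),$$

we see that

$$\sum_{i=1}^{k}\sum_{j=1}^{n_i}\frac{(y_{ij}-\mu_i)^2}{\mu_i^2}$$

is complete and sufficient for $\gamma^2$, when $\mu_i$, $i = 1, 2, \ldots, k$, are known. When $\mu_i$, $i = 1, 2, \ldots, k$, are unknown, $\mu_i$ can be estimated by $m_i$. Thus, with notation according to Definition 1, consider

$$U = \frac{1}{\sum_{i=1}^{k}(n_i-1)}\sum_{i=1}^{k}\sum_{j=1}^{n_i}\frac{(y_{ij}-m_i)^2}{m_i^2} = \frac{\sum_{i=1}^{k}(n_i-1)\,c_i^2}{\sum_{i=1}^{k}(n_i-1)}, \tag{4}$$

as an estimator of $\gamma^2$.

Theorem 1. Let $\gamma$ be a common population coefficient of variation, as defined in Definition 1. Let $v = \sum_{i=1}^{k}(n_i-1)$, and let $U_v = U$ as defined by (4). Assume that $(n_i-1)/v \to \lambda_i > 0$ as $v \to \infty$. Then

$$\sqrt{v}\,(U_v - \gamma^2) \xrightarrow{d} N(0,\ 2\gamma^4 + 4\gamma^6), \quad\text{as } v \to \infty.$$

Proof. With notations according to Definition 1,

$$\sqrt{n_i}\,(m_i - \mu_i,\ s_i^2 - \sigma_i^2) \xrightarrow{d} N(0, V), \quad i = 1, 2, \ldots, k,$$

where

$$V = \begin{pmatrix} \sigma_i^2 & 0 \\ 0 & 2\sigma_i^4 \end{pmatrix}.$$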

Following Serfling (1980, p. 124),

$$\sqrt{n_i}\,(c_i^2 - \gamma^2) \xrightarrow{d} N(0,\ g'Vg), \quad i = 1, 2, \ldots, k,$$

where, evaluated at $\{m_i, s_i^2\} = \{\mu_i, \sigma_i^2\}$,

$$g = \left(\frac{\partial c_i^2}{\partial m_i},\ \frac{\partial c_i^2}{\partial s_i^2}\right)' = \left(-\frac{2\sigma_i^2}{\mu_i^3},\ \frac{1}{\mu_i^2}\right)'.$$

Thus $\sqrt{n_i}\,(c_i^2 - \gamma^2) \xrightarrow{d} N(0,\ 2\gamma^4 + 4\gamma^6)$, $i = 1, 2, \ldots, k$, and, because $\sum_{i=1}^{k}\lambda_i = 1$,

$$\sqrt{v}\,(U - \gamma^2) = \sum_{i=1}^{k}\sqrt{\frac{n_i-1}{v}}\,\sqrt{n_i-1}\,(c_i^2 - \gamma^2) \xrightarrow{d} N(0,\ 2\gamma^4 + 4\gamma^6).$$

We now consider

$$T = f(U) = \sqrt{U}, \tag{5}$$

with $U$ from (4), as an estimator of $\gamma$. According to Theorem 2 the estimator (5) is asymptotically normally distributed with mean $\gamma$ and variance $(\gamma^2/2 + \gamma^4)/\sum_{i=1}^{k}(n_i-1)$.

Theorem 2. Let $\gamma$ be a common population coefficient of variation, as defined by Definition 1, and let $v = \sum_{i=1}^{k}(n_i-1)$. Let $T_v = \sqrt{U_v}$, where $U_v = U$ as defined by (4). Then

$$\sqrt{v}\,(T_v - \gamma) \xrightarrow{d} N(0,\ \gamma^2/2 + \gamma^4), \quad\text{as } v \to \infty. \tag{6}$$

Proof. By Theorem 1, $\sqrt{v}\,(U_v - \gamma^2) \xrightarrow{d} N(0,\ 2\gamma^4 + 4\gamma^6)$ as $v \to \infty$. Then, by application of Theorem 3.1A in Serfling (1980),

$$\sqrt{v}\,(T_v - \gamma) = \sqrt{v}\,\bigl(f(U_v) - f(\gamma^2)\bigr) \xrightarrow{d} N\bigl(0,\ (f'(\gamma^2))^2(2\gamma^4 + 4\gamma^6)\bigr), \quad\text{as } v \to \infty,$$

and (6) follows since $f'(\gamma^2) = 1/(2\gamma)$, so that $(f'(\gamma^2))^2(2\gamma^4 + 4\gamma^6) = \gamma^2/2 + \gamma^4$.

The expected values of the coefficients of variation $c_i$ are not defined, because the densities of $m_i$, $i = 1, 2, \ldots, k$, are positive in neighborhoods of zero. As a consequence the expected value of the estimator $T$, as defined by (5), does not exist. Nevertheless, when the expected values $\mu_i$ are sufficiently large, say larger than $3\sigma_i$, the probability of averages $m_i$ close to zero is negligible in most applications, and we expect $c_i$, $i = 1, 2, \ldots, k$, to be close to $\gamma$. According to Theorem 2 the estimator $T$ is asymptotically unbiased. We shall now derive a bias correction for the case of small sample sizes. By a second order series expansion of $T$, as a function of $\{m_i, s_i^2\}$, $i = 1, 2, \ldots, k$,

$$E(T) \approx \gamma + \frac{1}{2}\sum_{i=1}^{k}\left(\frac{\partial^2 T}{\partial (s_i^2)^2}\,Var(s_i^2) + \frac{\partial^2 T}{\partial m_i^2}\,Var(m_i)\right), \tag{7}$$

where the partial second order derivatives should be evaluated at $\{m_i, s_i^2\}_{i=1}^{k} = \{\mu_i, \sigma_i^2\}_{i=1}^{k}$. By (7), since $Var(s_i^2) = 2\sigma_i^4/(n_i-1)$ and $Var(m_i) = \sigma_i^2/n_i$,

$$E(T) \approx \gamma + \frac{1}{2}\sum_{i=1}^{k}\left(-\frac{(n_i-1)^2}{4v^2\gamma^3\mu_i^4}\cdot\frac{2\sigma_i^4}{n_i-1} + \frac{(n_i-1)\gamma}{v\mu_i^2}\left(3 - \frac{n_i-1}{v}\right)\frac{\sigma_i^2}{n_i}\right)$$

$$= \gamma + \frac{1}{2}\sum_{i=1}^{k}\left(-\frac{(n_i-1)\gamma}{2v^2} + \frac{\gamma^3}{v}\cdot\frac{n_i-1}{n_i}\left(3 - \frac{n_i-1}{v}\right)\right)$$

$$= \gamma - \frac{\gamma}{4v} + \frac{\gamma^3}{2v}\sum_{i=1}^{k}\frac{n_i-1}{n_i}\left(3 - \frac{n_i-1}{v}\right),$$

where $v = \sum_{i=1}^{k}(n_i-1)$. Thus we expect $T$ to be close to $\gamma(1 - 1/(4v))$ when $\gamma$ is small, and a bias adjusted estimator of a common population coefficient of variation $\gamma$ is given by

$$\hat\gamma = \left(1 - \frac{1}{4\sum_{i=1}^{k}(n_i-1)}\right)^{-1}\sqrt{\frac{\sum_{i=1}^{k}(n_i-1)\,c_i^2}{\sum_{i=1}^{k}(n_i-1)}}, \tag{8}$$

with notations as in Definition 1.

The result of a simulation study is presented in Table 1. Three samples of $n_1$, $n_2$ and $n_3$ observations from normal distributions with expected values 1…, 1… and 1,…, respectively, and with a common population coefficient of variation $\gamma$, were generated 2,… times in MATLAB 6.5 (The Mathworks Inc., Natick, MA, USA). In total 18 combinations of $\gamma$, $n_1$, $n_2$ and $n_3$ were investigated. For each combination, the means of the unadjusted estimator (5), the bias adjusted estimator (8), and their standard deviation were calculated. The study indicates that the bias adjustment works well for coefficients of variation smaller than 20%. In many chemical applications the coefficient of variation is smaller than 20%, and the bias adjusted estimator (8) is then recommended. For example, in immunoassays the working range is often defined as the range of concentrations for which the coefficient of variation is smaller than 20% (Carroll, 2003).

Table 1: Averages and standard deviations of simulated estimators $T$ and $\hat\gamma$ of a common coefficient of variation $\gamma$

γ      n_1  n_2  n_3   Mean(T)  Mean(γ̂)  SD
0.05    2    2    2    0.0461   0.0503   0.0214
0.05    3    4    5    0.0486   0.0500   0.0119
0.05   10   10   10    0.0495   0.0500   0.0068
0.10    2    2    2    0.0927   0.1011   0.0430
0.10    3    4    5    0.0976   0.1004   0.0244
0.10   10   10   10    0.0994   0.1003   0.0138
0.15    2    2    2    0.1401   0.1528   0.0662
0.15    3    4    5    0.1470   0.1512   0.0369
0.15   10   10   10    0.1489   0.1503   0.0210
0.20    2    2    2    0.1887   0.2058   0.0897
0.20    3    4    5    0.1968   0.2024   0.05…
0.20   10   10   10    0.1990   0.2009   0.0286
0.25    2    2    2    0.2399   0.2617   0.1184
0.25    3    4    5    0.2482   0.2553   0.0655
0.25   10   10   10    0.2496   0.2519   0.0367
0.30    2    2    2    0.2954   0.3222   0.1542
0.30    3    4    5    0.3005   0.3091   0.0814
0.30   10   10   10    0.3011   0.3039   0.0458

3 Confidence Interval and Test for a Common Coefficient of Variation

In this section we consider the problems of making a confidence interval for a coefficient of variation $\gamma$ that is shared by $k$ populations and testing the statistical hypothesis that $\gamma$ equals some specific value $\gamma_0$. Confidence intervals and tests could be based on the estimator $\hat\gamma$ from (8). However, the percentiles of the distribution of $\hat\gamma$ are not easily obtained. The unadjusted estimator (5) is the square root of a weighted average of squared sample coefficients of variation, and the reciprocal of each sample coefficient of variation is noncentral t-distributed (Owen, 1968). We shall build the confidence interval and test on McKay's approximation, as defined by Definition 2.

Definition 2. McKay's approximation for the coefficient of variation of sample $i$, as defined by Definition 1, is given by

$$K_i = \left(1 + \frac{1}{\gamma^2}\right)\frac{(n_i-1)\,c_i^2}{1 + c_i^2\,(n_i-1)/n_i}. \tag{9}$$

The expected value of McKay's approximation (9) exists, and the distribution of (9) is well approximated by a central $\chi^2$ distribution with $n_i - 1$ degrees of freedom, provided that $\gamma < 1/3$. McKay's approximation is useful for approximately normally distributed measurements of variables that only take positive values. As noted in Section 2, if only positive values can be obtained and the distribution is approximately normal, the expected value $\mu$ cannot be smaller than 3 standard deviations $\sigma$. The requirement $\gamma = \sigma/\mu < 1/3$, needed for McKay's approximation to be approximately $\chi^2$ distributed, is then fulfilled. The distribution of

$$u_i = \frac{c_i^2}{1 + c_i^2\,(n_i-1)/n_i}, \quad i = 1, 2, \ldots, k, \tag{10}$$

is consequently well approximated by a distribution with expected value

$$\theta = \frac{\gamma^2}{1+\gamma^2} = \gamma^2 - \gamma^4 + \gamma^6 - \ldots \tag{11}$$

and variance inversely proportional to $n_i - 1$. Because the distribution of McKay's approximation $K_i$ is, approximately, central $\chi^2$ with $n_i - 1$ degrees of freedom, $\sum_{i=1}^{k} K_i$ is, approximately, central $\chi^2$ distributed with $\sum_{i=1}^{k}(n_i-1)$ degrees of freedom. Thus, $\sum_{i=1}^{k} K_i = \sum_{i=1}^{k}(n_i-1)u_i/\theta$ can be used as an approximate pivotal quantity for constructing a confidence interval for $\theta$ as defined by (11). This approximate $100(1-\alpha)\%$ confidence interval for $\theta$ can be written

$$\left[\frac{\sum_{i=1}^{k}(n_i-1)u_i}{\chi^2_{1-\alpha/2}},\ \frac{\sum_{i=1}^{k}(n_i-1)u_i}{\chi^2_{\alpha/2}}\right],$$

where $\chi^2_{\alpha}$ denotes the $100\alpha$-th percentile of a central $\chi^2$ distribution with $\sum_{i=1}^{k}(n_i-1)$ degrees of freedom, and $u_i$ is defined by (10). The corresponding approximate $100(1-\alpha)\%$ confidence interval for $\gamma$ is

$$\left[\sqrt{\frac{\sum_{i=1}^{k}(n_i-1)u_i}{\chi^2_{1-\alpha/2} - \sum_{i=1}^{k}(n_i-1)u_i}},\ \sqrt{\frac{\sum_{i=1}^{k}(n_i-1)u_i}{\chi^2_{\alpha/2} - \sum_{i=1}^{k}(n_i-1)u_i}}\right]. \tag{12}$$

Consider the statistical null hypothesis $H_0: \gamma = \gamma_0$. This hypothesis is equivalent to the hypothesis $H_0: \theta = \theta_0$, where $\theta_0 = \gamma_0^2/(1+\gamma_0^2)$. Thus we can use

$$\frac{\sum_{i=1}^{k}(n_i-1)u_i}{\theta_0} \tag{13}$$

as a test statistic of the hypothesis $H_0: \gamma = \gamma_0$; it is approximately central $\chi^2$ distributed with $\sum_{i=1}^{k}(n_i-1)$ degrees of freedom.

The proposed confidence interval (12) and test (13) rely on the adequacy of McKay's approximation. Since we are interested in the adequacy, we end this section with an investigation of the first two moments of the approximation, as functions of the population coefficient of variation $\gamma$ and the sample size $n$.

Theorem 4. Let $\gamma$ be the population coefficient of variation as defined by Definition 1, and let $K = K_1$ be McKay's approximation for the coefficient of variation in a sample with $n = n_1$ observations, as defined by Definition 2. The first and second moments of McKay's approximation are

$$E(K) = \frac{n(n-1)\,h(\gamma, n)}{2\theta},$$

$$E(K^2) = \frac{n(n^2-1)\bigl((1+\gamma^2)\,n\,h(\gamma, n) - 2\gamma^2\bigr)}{4\theta^2},$$

where $\theta = \gamma^2/(1+\gamma^2)$ and, for $t = 0, 1, 2, \ldots$,

$$h(\gamma, n) = \begin{cases}
\gamma^2\bigl(1 - \exp(-1/\gamma^2)\bigr), & n = 2, \\[1ex]
\displaystyle\sum_{r=0}^{t}(-1)^r\left(\frac{\gamma^2}{t+3/2}\right)^{r+1}\frac{\Gamma(t+3/2)}{\Gamma(t+3/2-r)} + (-1)^{t+1}\left(\frac{\gamma^2}{t+3/2}\right)^{t+3/2}\frac{2\,\Gamma(t+3/2)}{\sqrt{\pi}}\,d\!\left(\frac{\sqrt{t+3/2}}{\gamma}\right), & n = 3 + 2t, \\[1ex]
\displaystyle\sum_{r=0}^{t}(-1)^r\left(\frac{\gamma^2}{t+2}\right)^{r+1}\frac{\Gamma(t+2)}{\Gamma(t+2-r)} + (-1)^{t+1}\left(\frac{\gamma^2}{t+2}\right)^{t+2}\Gamma(t+2)\bigl(1 - \exp(-(t+2)/\gamma^2)\bigr), & n = 4 + 2t,
\end{cases}$$

with $d(x) = \exp(-x^2)\int_0^x \exp(z^2)\,dz$.

Proof. Forkman and Verrill (2008) showed that $K\theta/n$ is type II noncentral beta distributed with parameters $(n-1)/2$, $1/2$ and $n/\gamma^2$. Consequently $X = 1 - K\theta/n$ is type I noncentral beta distributed with parameters $1/2$, $(n-1)/2$ and $n/\gamma^2$. A type I noncentral beta distribution with parameters $a$, $b$ and $\delta$ is central Beta$(a+V, b)$, where $V$ is Poisson$(\delta/2)$. Marchand (1997) provided expressions for the first and second moments of the type I noncentral beta distribution with parameters $a$, $b$ and $\delta$ based on $g(a+b, \delta/2) = E(1/(a+b+V))$. We let $h(\gamma, n) = g(n/2, n/(2\gamma^2))$. Then, according to Marchand (1997),

$$E(X) = 1 - \frac{(n-1)\,h(\gamma, n)}{2}, \tag{14}$$

$$E(X^2) = 1 - \frac{(n^2-1)\,\gamma^2}{2n} + \frac{(n-1)\bigl(n - 3 + (n+1)\gamma^2\bigr)\,h(\gamma, n)}{4}, \tag{15}$$

and the theorem follows since $E(K) = n(1 - E(X))/\theta$ and $E(K^2) = n^2(1 - 2E(X) + E(X^2))/\theta^2$.

The function $d$, required for odd sample sizes in Theorem 4, is Dawson's integral, which has been tabulated by Abramowitz and Stegun (1972). Because $\exp(-(t+2)/\gamma^2) \approx 0$, $t = 0, 1, 2, \ldots$, Theorem 4 makes it easy to calculate approximate first and second moments, especially for even sample sizes. For example,

$$E(K) \approx \begin{cases}
1\,(1 + \gamma^2), & n = 2, \\[0.5ex]
3\left(1 + \dfrac{\gamma^2}{2} - \dfrac{\gamma^4}{2}\right), & n = 4, \\[1.5ex]
5\left(1 + \dfrac{\gamma^2}{3} - \dfrac{4\gamma^4}{9} + \dfrac{2\gamma^6}{9}\right), & n = 6,
\end{cases}$$

and

$$E(K^2) \approx \begin{cases}
3\,(1 + 2\gamma^2 + \gamma^4), & n = 2, \\[0.5ex]
15\,(1 + \gamma^2 - \gamma^4 - \gamma^6), & n = 4, \\[0.5ex]
35\left(1 + \dfrac{2\gamma^2}{3} - \gamma^4 + \dfrac{2\gamma^8}{3}\right), & n = 6.
\end{cases}$$

Notice that when $\gamma$ is small, $n = 2, 4, 6$, $E(K)$ approximately equals $n - 1$ and $E(K^2)$ approximately equals $(n-1)(n+1)$, which are the exact first and second moments, respectively, of a $\chi^2$ distributed random variable with $n - 1$ degrees of freedom.

4 Test for Equality of Two Common Coefficients of Variation

We now introduce a statistical test for the hypothesis that a coefficient of variation $\gamma_1$ that is shared by $k_1$ populations equals a coefficient of variation $\gamma_2$ that is shared by $k_2$ populations. Definition 3 makes clear the setting and what we mean by coefficients of variation in this case.

Definition 3. Let $y_{rij} = \mu_{ri} + e_{rij}$, where $e_{rij}$ are independently distributed $N(0, \sigma_{ri}^2)$, $r = 1, 2$, $i = 1, 2, \ldots, k_r$ and $j = 1, 2, \ldots, n_{ri}$, with positive expected values $\mu_{ri}$ and positive population coefficients of variation $\gamma_r = \sigma_{ri}/\mu_{ri}$. The coefficient of variation $c_{ri}$, $r = 1, 2$, $i = 1, 2, \ldots, k_r$, is defined as $c_{ri} = s_{ri}/m_{ri}$, where $m_{ri}$ and $s_{ri}$ are

$$m_{ri} = \frac{1}{n_{ri}}\sum_{j=1}^{n_{ri}} y_{rij} \quad\text{and}\quad s_{ri} = \sqrt{\frac{1}{n_{ri}-1}\sum_{j=1}^{n_{ri}}(y_{rij} - m_{ri})^2}, \tag{16}$$

respectively.

4.1 A Test for Equality of Coefficients of Variation

Consider the hypothesis $H_0: \gamma_1 = \gamma_2$. Let

$$u_{ri} = \frac{c_{ri}^2}{1 + c_{ri}^2\,(n_{ri}-1)/n_{ri}}, \quad r = 1, 2;\ i = 1, 2, \ldots, k_r,$$

and

$$\theta_r = \frac{\gamma_r^2}{1+\gamma_r^2}, \quad r = 1, 2,$$

with notation according to Definition 3. Because $\sum_{i=1}^{k_r}(n_{ri}-1)u_{ri}/\theta_r$ is approximately central $\chi^2$ distributed with $\sum_{i=1}^{k_r}(n_{ri}-1)$ degrees of freedom, and $\theta_1 = \theta_2$ when the hypothesis is true,

$$F = \frac{\sum_{i=1}^{k_1}(n_{1i}-1)u_{1i}\,\big/\,\sum_{i=1}^{k_1}(n_{1i}-1)}{\sum_{i=1}^{k_2}(n_{2i}-1)u_{2i}\,\big/\,\sum_{i=1}^{k_2}(n_{2i}-1)} \tag{17}$$

is approximately F distributed with $\sum_{i=1}^{k_1}(n_{1i}-1)$ and $\sum_{i=1}^{k_2}(n_{2i}-1)$ degrees of freedom. Thus $F$ can be used as an approximately F distributed test statistic for the hypothesis of equal coefficients of variation. When $k_1 = k_2 = 1$, the test statistic $F$, as defined by (17), simplifies to

$$F = \frac{c_1^2\,/\,\bigl(1 + c_1^2\,(n_1-1)/n_1\bigr)}{c_2^2\,/\,\bigl(1 + c_2^2\,(n_2-1)/n_2\bigr)}, \tag{18}$$

where $c_r = c_{r1}$ and $n_r = n_{r1}$, $r = 1, 2$, as defined by Definition 3.

According to Theorem 6 the distribution of the logarithm of $F$, as defined by (18), equals the distribution of the logarithm of an F distributed random variable plus some error variables that are in probability of small orders. Let $O_p$ denote order in probability, defined as in Azzalini (1996).

Theorem 6. Let $\gamma = \gamma_1 = \gamma_2$ as defined by Definition 3, with $k_1 = k_2 = 1$. Let $X$ be an F distributed random variable with $n_1 - 1$ and $n_2 - 1$ degrees of freedom, let $Z$ be a standardized normal random variable, and let $U_1$ and $U_2$ be $\chi^2$ distributed random variables with $n_1 - 1$ and $n_2 - 1$ degrees of freedom, respectively. Let $X$, $Z$, $U_1$ and $U_2$ be independent. Then

$$\log F \overset{d}{=} \log X + 2Z\gamma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} + \left(-\frac{U_1}{n_1} + \frac{U_2}{n_2}\right)\gamma^2 + R(n_1, n_2, \gamma), \tag{19}$$

where $F$ is defined by (18) and $R(n_1, n_2, \gamma) = O_p(\max\{n_1^{-1}\gamma^2,\ n_2^{-1}\gamma^2,\ \gamma^4\})$.

Proof. Write $\log F$ as

$$\log F = \log\left(c_1^2\left(1 + \frac{(n_1-1)\,c_1^2}{n_1}\right)^{-1}\right) - \log\left(c_2^2\left(1 + \frac{(n_2-1)\,c_2^2}{n_2}\right)^{-1}\right). \tag{20}$$

Let $W_r = U_r/(n_r - 1)$, $r = 1, 2$, and let $Z_1$ and $Z_2$ be independent standardized normal random variables. The distributions of the averages $m_{r1}$ and the standard deviations $s_{r1}$, $r = 1, 2$, as defined by Definition 3, equal the distributions of $\mu_{r1} + Z_r\sigma_{r1}/\sqrt{n_r}$ and $\mu_{r1}\gamma\sqrt{W_r}$, respectively. Thus $c_r^2$ equals $W_r\gamma^2/(1 + Z_r\gamma/\sqrt{n_r})^2$ in distribution. The distribution of the first term in (20) consequently equals the distribution of

$$\log W_1\gamma^2 - \log\left(1 + \frac{2Z_1\gamma}{\sqrt{n_1}} + \frac{Z_1^2\gamma^2}{n_1} + \frac{(n_1-1)W_1\gamma^2}{n_1}\right)$$

$$= \log W_1\gamma^2 - \left(\frac{2Z_1\gamma}{\sqrt{n_1}} + \frac{Z_1^2\gamma^2}{n_1} + \frac{(n_1-1)W_1\gamma^2}{n_1}\right) + \frac{1}{2}\left(\frac{2Z_1\gamma}{\sqrt{n_1}} + \frac{Z_1^2\gamma^2}{n_1} + \frac{(n_1-1)W_1\gamma^2}{n_1}\right)^2 - \ldots$$

$$= \log W_1\gamma^2 - \frac{2Z_1\gamma}{\sqrt{n_1}} - \frac{(n_1-1)W_1\gamma^2}{n_1} + O_p\left(\max\left(\frac{\gamma^2}{n_1},\ \gamma^4\right)\right). \tag{21}$$

The corresponding calculations can be made also for the second term in (20), and (19) follows.

4.2 Simulation Study

In this section we investigate, by Monte Carlo technique, the significance levels and powers of the introduced approximate F-test (18), for the hypothesis $H_0: \gamma_1 = \gamma_2$ when $k_1 = k_2 = 1$. We also study the likelihood ratio test, Bennett's test as modified by Shafer and Sullivan (1986), and Miller's test. These tests are, for quick reference, given in the Appendix. In each simulation two samples with $n_1$ and $n_2$ observations, respectively, were randomly generated 2,… times in MATLAB 6.5. The observations were normally distributed with expected values 1… and 1,…, and with coefficients of variation $\gamma_1$ and $\gamma_2$, respectively. The tests were performed with significance level 5% against the alternative hypothesis of unequal coefficients of variation, that is, the tests were two-sided.

Eight cases were studied. The type I errors of the tests were investigated in Cases 1–6, and the powers of the tests were investigated in Cases 7 and 8. In Cases 1–3, the common coefficient of variation $\gamma$ was 5%, 15% and 30%, respectively. The sample sizes were equal, i.e. $n_1 = n_2 = n$, and varied from 2 to 20. In Cases 4–6, $\gamma$ was 5%, 15% and 30%, respectively, but in these cases $n_1 = 4$ and $n_2$ varied from 2 to 20. In Case 7 one coefficient of variation was 5%, and the other 15%, and in Case 8 one coefficient of variation was 15%, and the other 30%. In Cases 7 and 8, the two sample sizes were equal and varied from 2 to 20. Thus 19 simulations were made per case.

The results of the simulation study are presented in Figures 1–8, with one figure per investigated case. The likelihood ratio test showed too large frequency of type I error (Figures 1–6), especially for sample sizes smaller than 10. When $n_2 = 2$ the frequency of rejected hypotheses was larger than 20% with the likelihood ratio test. Bennett's test was also too liberal, though not as liberal as the likelihood ratio test (Figures 1–6). Miller's test performed better with regard to type I error, except when only 2 observations were sampled per distribution (Figures 1–3). The approximate F-test, introduced in this paper, was the only test that produced almost correct frequency of rejected hypotheses in all cases (Figures 1–6). The likelihood ratio test and Bennett's test, which were too liberal, showed better power for small sample sizes than Miller's test and the approximate F-test (Figures 7 and 8).

5 Discussion

In applications with constant, or almost constant, coefficients of variation it is often appropriate to assume that the data follow lognormal distributions. One should therefore consider working on the log scale (Cole, 2000). After log transformation of the data the usual tests for equality of variances, such as Bartlett's test, could be applied. However, it is not always appropriate to assume that the data are lognormally distributed. In immunoassays, for example, normally distributed errors in the volumes of samples could result in normally distributed measurements of concentration with constant coefficient of variation. In this paper we have discussed inference for the coefficient of variation when there are reasons to believe that the data are normally, but not lognormally, distributed.

[Figure 1: Case 1. Frequency of type I error when $\gamma_1 = \gamma_2 = 5\%$ and $n = n_1 = n_2$. Panels: Approximate F test, Likelihood ratio test, Bennett's test, Miller's test.]

[Figure 2: Case 2. Frequency of type I error when $\gamma_1 = \gamma_2 = 15\%$ and $n = n_1 = n_2$. Panels: Approximate F test, Likelihood ratio test, Bennett's test, Miller's test.]

[Figure 3: Case 3. Frequency of type I error when $\gamma_1 = \gamma_2 = 30\%$ and $n = n_1 = n_2$. Panels: Approximate F test, Likelihood ratio test, Bennett's test, Miller's test.]

[Figure 4: Case 4. Frequency of type I error when $\gamma_1 = \gamma_2 = 5\%$ and $n_1 = 4$. Panels: Approximate F test, Likelihood ratio test, Bennett's test, Miller's test.]

[Figure 5: Case 5. Frequency of type I error when $\gamma_1 = \gamma_2 = 15\%$ and $n_1 = 4$. Panels: Approximate F test, Likelihood ratio test, Bennett's test, Miller's test.]

[Figure 6: Case 6. Frequency of type I error when $\gamma_1 = \gamma_2 = 30\%$ and $n_1 = 4$. Panels: Approximate F test, Likelihood ratio test, Bennett's test, Miller's test.]

[Figure 7: Case 7. Power when $\gamma_1 = 5\%$, $\gamma_2 = 15\%$ and $n = n_1 = n_2$. Panels: Approximate F test, Likelihood ratio test, Bennett's test, Miller's test.]

[Figure 8: Case 8. Power when $\gamma_1 = 15\%$, $\gamma_2 = 30\%$ and $n = n_1 = n_2$. Panels: Approximate F test, Likelihood ratio test, Bennett's test, Miller's test.]
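The type I error part of the simulation design in Section 4.2 can be sketched in a few lines of Python. This is only an illustrative sketch and not the MATLAB code used for the study: the function names, the expected values 100 and 1,000, the seed and the 5,000 replications are assumptions made here, and the critical value 4.026 is the tabulated upper 2.5% point of the F distribution with 9 and 9 degrees of freedom, so it applies only to $n_1 = n_2 = 10$.

```python
import math
import random


def sample_cv(values):
    """Sample coefficient of variation c = s / m, with s based on n - 1."""
    n = len(values)
    m = sum(values) / n
    s2 = sum((y - m) ** 2 for y in values) / (n - 1)
    return math.sqrt(s2) / m


def mckay_u(c, n):
    """u = c^2 / (1 + c^2 (n - 1) / n), as in (10)."""
    return c * c / (1.0 + c * c * (n - 1) / n)


def type_one_error(gamma, n, f_crit, reps=5000, mu1=100.0, mu2=1000.0, seed=1):
    """Estimate the rejection rate of the two-sided approximate F-test (18)
    when both samples share the coefficient of variation gamma.

    mu1, mu2, reps and seed are hypothetical choices for this sketch.
    f_crit is the upper 2.5% point of F(n - 1, n - 1), supplied by the caller.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        y1 = [rng.gauss(mu1, gamma * mu1) for _ in range(n)]
        y2 = [rng.gauss(mu2, gamma * mu2) for _ in range(n)]
        f = mckay_u(sample_cv(y1), n) / mckay_u(sample_cv(y2), n)
        # With equal sample sizes the lower critical value is 1 / f_crit.
        if f > f_crit or f < 1.0 / f_crit:
            rejections += 1
    return rejections / reps


rate = type_one_error(gamma=0.15, n=10, f_crit=4.026)
print(f"estimated type I error: {rate:.3f}")
```

Because the two samples have equal size here, the lower critical value is the reciprocal of the upper one. Other sample sizes, or unequal sizes as in Cases 4–6, would need their own F percentiles.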

Since it is often necessary to relate the standard deviation to the level of the measurements, the coefficient of variation is a widely used measure of dispersion. Coefficients of variation are often calculated on samples from several independent populations, and questions about how to compare them naturally arise. There is a need for an explicit estimator of a common coefficient of variation. Such an estimator has been given in this paper.

For making confidence intervals and tests we have considered McKay's approximation, which is valid only when the population coefficient of variation γ is smaller than 1/3. Coefficients of variation are usually calculated on positive data, such as measurements of concentration, weight or length. Given that the positive measurements are approximately normally distributed, γ is smaller than 1/3, because otherwise the expected value is smaller than 3 standard deviations, and the probability of negative observations is not negligible. Thus, McKay's approximation is applicable for positive variables, provided only that the distributions are approximately normal. Normality could be checked, e.g. by the Shapiro-Wilk test.

Over the years many tests have been proposed for equality of coefficients of variation. In this paper, an additional test has been introduced: the approximate F-test. Unlike many other tests, the new test can be applied not only when there is one estimate per population coefficient of variation, but also when there are several. The small simulation study reported in this paper indicated good performance of the approximate F-test, especially with regard to type I error. It would, however, be interesting to see results from a larger simulation study, including more cases, several tests and an investigation of robustness.

As already pointed out, the methods proposed in this paper are intended for normally distributed data. Miller (1991b) suggested a nonparametric test for equality of coefficients of variation. This nonparametric test was recommended for nonnormal distributions by Fung and Tsang (1998).
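The argument earlier in this section, that positive and approximately normal data imply γ < 1/3, can be checked numerically. For X normal with mean μ > 0 and standard deviation γμ, the probability of a negative observation is Φ(−1/γ). The short sketch below evaluates this for a few values of γ; the function name `prob_negative` and the chosen values are illustrative assumptions, not part of the paper.

```python
from statistics import NormalDist


def prob_negative(gamma):
    """P(X < 0) for X ~ N(mu, (gamma * mu)^2) with mu > 0, i.e. Phi(-1 / gamma)."""
    return NormalDist().cdf(-1.0 / gamma)


# Illustrative values: the three coefficients of variation used in the
# simulation study, the boundary value 1/3, and one value above it.
for gamma in (0.05, 0.15, 0.30, 1 / 3, 0.50):
    print(f"gamma = {gamma:.3f}: P(X < 0) = {prob_negative(gamma):.2e}")
```

At γ = 1/3 the mean lies exactly 3 standard deviations above zero, so the probability of a negative value is Φ(−3), a little over one in a thousand. At γ = 1/2 it is Φ(−2), roughly 2%, which would be hard to reconcile with strictly positive data.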
The $F_{\max}$-test (Hartley, 1950) is a natural extension of the approximate F-test to more than two common coefficients of variation (G.E. Miller, personal communication). For two coefficients of variation, the $F_{\max}$-test and the approximate F-test are identical for the two-sided hypothesis. The $F_{\max}$-test requires equal degrees of freedom. Tables for the $F_{\max}$ distribution were given by Nelson (1987).
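A larger simulation study of the kind called for in this discussion could start from a loop like the one below. It is a minimal sketch under stated assumptions, not the code used for the study reported here: it uses Miller's statistic as written in the Appendix, simply because that statistic has a closed form; the population mean of 100, the number of replicates, the two-sided 5% level and the seed are all hypothetical choices.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def miller_statistic(x, y):
    """Miller's (1991a) two-sample statistic, approximately N(0, 1) under H0."""
    n1, n2 = len(x), len(y)
    c1, c2 = stdev(x) / mean(x), stdev(y) / mean(y)
    c = ((n1 - 1) * c1 + (n2 - 1) * c2) / (n1 + n2 - 2)  # pooled coefficient
    var = (c ** 2 / (2 * (n1 - 1)) + c ** 4 / (n1 - 1)
           + c ** 2 / (2 * (n2 - 1)) + c ** 4 / (n2 - 1))
    return (c1 - c2) / math.sqrt(var)


def rejection_frequency(gamma1, gamma2, n1, n2, reps=2000, alpha=0.05, seed=1):
    """Fraction of simulated data sets in which the two-sided test rejects.

    With gamma1 == gamma2 this estimates the type I error, otherwise the power.
    The population mean (100) is an arbitrary assumption.
    """
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejected = 0
    for _ in range(reps):
        x = [rng.gauss(100.0, 100.0 * gamma1) for _ in range(n1)]
        y = [rng.gauss(100.0, 100.0 * gamma2) for _ in range(n2)]
        if abs(miller_statistic(x, y)) > crit:
            rejected += 1
    return rejected / reps


# Type I error for a common coefficient of variation of 15% and equal sample sizes.
for n in (2, 5, 10, 20):
    freq = rejection_frequency(0.15, 0.15, n, n, reps=500)
    print(f"n = {n:2d}: rejection frequency = {freq:.3f}")
```

Other tests can be plugged in by replacing `miller_statistic` and the critical value, and robustness can be explored by swapping `rng.gauss` for a non-normal generator.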

Acknowledgments

I thank the referees for useful comments. The Centre of Biostochastics, Swedish University of Agricultural Sciences, and Pharmacia Diagnostics AB, Uppsala, Sweden, supported the research.

Appendix. The Likelihood Ratio Test, Bennett's and Miller's Tests

Let $m_r = m_{r1}$ and $c_r = c_{r1}$ denote the average and the coefficient of variation, respectively, in the $r$:th sample, $r = 1, 2$, as defined by Definition 3. The likelihood ratio test statistic can be written

$$
-2\log\lambda = n_1 \log\frac{n_1(\hat\gamma\hat\mu_1)^2}{(n_1-1)\,c_1^2 m_1^2} + n_2 \log\frac{n_2(\hat\gamma\hat\mu_2)^2}{(n_2-1)\,c_2^2 m_2^2}, \tag{22}
$$

where $\lambda$ is the likelihood ratio and $\hat\gamma$, $\hat\mu_1$ and $\hat\mu_2$ are the maximum likelihood estimates of $\gamma$, $\mu_1$ and $\mu_2$, respectively. These are, according to Gerig and Sen (1980),

$$
\hat\mu_1 = \frac{n_1 m_1 \hat\mu_2}{(n_1+n_2)\hat\mu_2 - n_2 m_2}, \qquad
\hat\mu_2 = \frac{q}{2p} + \sqrt{\frac{q^2}{4p^2} - \frac{r}{p}}, \qquad
\hat\gamma = \frac{1}{\hat\mu_2}\sqrt{\frac{(n_2-1)\,c_2^2 m_2^2}{n_2} + m_2^2 - m_2\hat\mu_2}, \tag{23}
$$

where $p = (n_1+n_2)c_1^2 + n_2$, $q = (2n_2c_1^2 + 2n_2 - n_1)\,m_2$ and $r = \bigl((n_2^2(c_1^2+1) - n_1^2(c_2^2+1))\,m_2^2\bigr)/(n_1+n_2)$. Asymptotically (22) is $\chi^2$ distributed with 1 degree of freedom.

Bennett's test statistic, as modified by Shafer and Sullivan (1986), can be written

$$
(n_1+n_2-2)\log\left(\frac{1}{n_1+n_2-2}\left(\frac{(n_1-1)c_1^2}{1+c_1^2(n_1-1)/n_1} + \frac{(n_2-1)c_2^2}{1+c_2^2(n_2-1)/n_2}\right)\right)
- (n_1-1)\log\left(\frac{(n_1-1)c_1^2}{(n_1-1)\bigl(1+c_1^2(n_1-1)/n_1\bigr)}\right)
- (n_2-1)\log\left(\frac{(n_2-1)c_2^2}{(n_2-1)\bigl(1+c_2^2(n_2-1)/n_2\bigr)}\right),
$$

and is approximately $\chi^2$ distributed with 1 degree of freedom.
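As an illustration of how Bennett's statistic above can be evaluated, here is a minimal sketch. It assumes that $c_r$ is the sample standard deviation (divisor $n_r - 1$) divided by the sample mean; the function names `bennett_statistic` and `chi2_1_pvalue` and the example data are hypothetical, and the code is not taken from the paper.

```python
import math
from statistics import mean, stdev


def bennett_statistic(x, y):
    """Bennett's statistic (Shafer and Sullivan's modification) for two samples."""
    samples = (x, y)
    n = [len(s) for s in samples]
    # Squared sample coefficients of variation (assumed divisor n - 1).
    c2 = [(stdev(s) / mean(s)) ** 2 for s in samples]
    # Adjusted squared coefficients: c_r^2 / (1 + c_r^2 (n_r - 1) / n_r).
    # The factors (n_r - 1) in the last two log terms cancel, leaving u[r].
    u = [c2[r] / (1 + c2[r] * (n[r] - 1) / n[r]) for r in range(2)]
    df = n[0] + n[1] - 2
    pooled = sum((n[r] - 1) * u[r] for r in range(2)) / df
    stat = df * math.log(pooled) - sum((n[r] - 1) * math.log(u[r]) for r in range(2))
    return max(stat, 0.0)  # guard against tiny negative rounding error


def chi2_1_pvalue(stat):
    """Upper tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


# Hypothetical data: same mean, standard deviations 1 and 2.
stat = bennett_statistic([9, 10, 11], [8, 10, 12])
print(f"statistic = {stat:.3f}, p = {chi2_1_pvalue(stat):.3f}")
```

The statistic is zero when the two adjusted coefficients coincide and grows as they move apart, which is why only the upper tail of the χ² distribution is used.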

Miller's test statistic (Miller, 1991a), which is approximately N(0, 1), is

$$
(c_1 - c_2)\left(\frac{c^2}{2(n_1-1)} + \frac{c^4}{n_1-1} + \frac{c^2}{2(n_2-1)} + \frac{c^4}{n_2-1}\right)^{-1/2},
$$

where $c = ((n_1-1)c_1 + (n_2-1)c_2)/(n_1+n_2-2)$.

References

Abramowitz, M. and Stegun, I. A. (1972). Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. New York: Dover.

Azzalini, A. (1996). Statistical Inference. London: Chapman and Hall.

Bennett, B. M. (1976). On an approximate test for homogeneity of coefficients of variation. In W. J. Ziegler (Ed.), Contributions to Applied Statistics dedicated to A. Linder. Experientia Supplementum 22:169–171. Basel: Birkhäuser.

Bennett, B. M. (1977). LR tests for homogeneity of coefficients of variation in repeated samples. Sankhya, Series B, 39:400–405.

Carroll, R. J. (2003). Variances are not always nuisance parameters. Biometrics 59:211–220.

Cole, T. J. (2000). Sympercents: symmetric percentage differences on the 100 log_e scale simplify the presentation of log transformed data. Statistics in Medicine 19:3109–3125.

Doornbos, R. and Dijkstra, J. B. (1983). A multi sample test for the equality of coefficients of variation in normal populations. Communications in Statistics - Simulation and Computation 12:147–158.

Feltz, C. J. and Miller, G. E. (1996). An asymptotic test for the equality of coefficients of variation from k populations. Statistics in Medicine 15:647–658.

Fieller, E. C. (1932). A numerical test of the adequacy of A. T. McKay's approximation. Journal of the Royal Statistical Society 95:699–702.

Forkman, J. and Verrill, S. (2008). The distribution of McKay's approximation for the coefficient of variation. Statistics & Probability Letters 78:10–14.

Fung, W. K. and Tsang, T. S. (1998). A simulation study comparing tests for the equality of coefficients of variation. Statistics in Medicine 17:2003–2014.

Gerig, T. M. and Sen, A. R. (1980). MLE in two normal samples with equal but unknown population coefficients of variation. Journal of the American Statistical Association 75:704–708.

Gupta, R. C. and Ma, S. (1996). Testing the equality of coefficients of variation in k normal populations. Communications in Statistics - Theory and Methods 25:115–132.

Hartley, H. O. (1950). The maximum F-ratio as a short cut test for heterogeneity of variance. Biometrika 37:308–312.

Iglewicz, B. and Myers, R. H. (1970). Comparison of approximations to the percentage points of the sample coefficient of variation. Technometrics 12:166–169.

Lohrding, R. K. (1975). A two sample test of equality of coefficients of variation or relative errors. Journal of Statistical Computation and Simulation 4:31–36.

Marchand, E. (1997). On moments of beta mixtures, the noncentral beta distribution, and the coefficient of determination. Journal of Statistical Computation and Simulation 59:161–178.

McKay, A. T. (1932). Distribution of the coefficient of variation and the extended t distribution. Journal of the Royal Statistical Society 95:695–698.

Miller, G. E. (1991a). Asymptotic test statistics for coefficients of variation. Communications in Statistics - Theory and Methods 20:3351–3363.

Miller, G. E. (1991b). Use of the squared ranks test to test for the equality of the coefficients of variation. Communications in Statistics - Simulation and Computation 20:743–750.

Miller, G. E. and Feltz, C. J. (1997). Asymptotic inference for coefficients of variation. Communications in Statistics - Theory and Methods 26:715–726.

Nairy, K. S. and Rao, K. A. (2003). Tests of coefficients of variation of normal population. Communications in Statistics - Simulation and Computation 32:641–661.

Nelson, L. S. (1987). Upper 10%, 5% and 1% points of the maximum F-ratio. Journal of Quality Technology 19:165–167.

Owen, D. B. (1968). A survey of properties and applications of the noncentral t-distribution. Technometrics 10:445–478.

Pearson, E. S. (1932). Comparison of A. T. McKay's approximation with experimental sampling results. Journal of the Royal Statistical Society 95:703–704.

Serfling, R. J. (1980). Approximation Theorems of Mathematical Statistics. New York: Wiley.

Shafer, N. J. and Sullivan, J. A. (1986). A simulation study of a test for the equality of the coefficients of variation. Communications in Statistics - Simulation and Computation 15:681–695.

Tian, L. (2005). Inferences on the common coefficient of variation. Statistics in Medicine 24:2213–2220.

Tsui, K. W. and Weerahandi, S. (1989). Generalized P-values in significance testing of hypotheses in the presence of nuisance parameters. Journal of the American Statistical Association 84:602–607.

Umphrey, G. J. (1983). A comment on McKay's approximation for the coefficient of variation. Communications in Statistics - Simulation and Computation 12:629–635.

Vangel, M. G. (1996). Confidence intervals for a normal coefficient of variation. The American Statistician 15:21–26.

Verrill, S. and Johnson, R. A. (2007). Confidence bounds and hypothesis tests for normal distribution coefficients of variation. Communications in Statistics - Theory and Methods 36:2187–2206.

Zeigler, R. K. (1973). Estimators of coefficients of variation using k samples. Technometrics 15:409–414.