Some Simple Approximate Tests for Poisson Variates
Author(s): D. R. Cox
Source: Biometrika, Vol. 40, No. 3/4 (Dec., 1953), pp. 354-360
Published by: Oxford University Press on behalf of Biometrika Trust
Stable URL: http://www.jstor.org/stable/2333353
Accessed: 17-05-2017 09:34 UTC

SOME SIMPLE APPROXIMATE TESTS FOR POISSON VARIATES

BY D. R. COX
Statistical Laboratory, University of Cambridge

1. INTRODUCTION

If events, such as accidents or stoppages of a machine, occur randomly in time at a true rate $\lambda$, the number of events in a fixed time $t$ follows a Poisson distribution of mean $\lambda t$. In inverse sampling, with $n$ fixed, the time $t$ up to the $n$th event is distributed as $(2\lambda)^{-1}\chi^2_{2n}$, where $\chi^2_{2n}$ denotes a $\chi^2$ variate with $2n$ degrees of freedom. Barnard (1946) has pointed out that the latter result leads to convenient 'sequential' tests of hypotheses about $\lambda$. For example, if we have inverse samples $(n_1, t_1)$ and $(n_2, t_2)$ from two populations, the hypothesis that $\lambda_1 = \lambda_2$ is tested by referring $F = t_1 n_2/(t_2 n_1)$ to the variance-ratio tables with $(2n_1, 2n_2)$ degrees of freedom.

In the present note we show that, with a slight modification, Barnard's tests apply almost exactly to direct Poisson sampling in which $t$ is fixed and $n$ is a random variable. We obtain, in particular, a convenient test for the equality of Poisson means, although the main use of the method is likely to be in more complicated situations where $\lambda$ is the product of unknown parameters.

2. THE BASIC APPROXIMATION

In direct Poisson sampling in which the number of events occurring in a fixed time is recorded, we have

$$\text{prob (no. of events} \ge n) = \sum_{r=n}^{\infty} \frac{e^{-\lambda t}(\lambda t)^r}{r!} = \text{prob}\,(\tfrac12\chi^2_{2n} < \lambda t) \tag{1}$$

and

$$\text{prob (no. of events} \ge n + 1) = \text{prob}\,(\tfrac12\chi^2_{2n+2} < \lambda t). \tag{2}$$

If we wish to make an approximation to prob (no. of events $\ge n$) in which the number of events is treated as a continuous variate, it is reasonable to take a value between (1) and (2). A natural choice is

$$\text{prob (no. of events} \ge n) \simeq \text{prob}\,(\tfrac12\chi^2_{2n+1} < \lambda t),$$

i.e. we calculate probabilities as if

$$2\lambda t \text{ is distributed as } \chi^2_{2n+1}. \tag{3}$$

Thus the suggestion is that if we observe $n$ events in time $t$, we proceed approximately as if $n$ is fixed and $2\lambda t$ is distributed as $\chi^2_{2n+1}$. The remainder of the paper is concerned with the consequences of (3).

3. COMPARISON OF TWO POPULATIONS

If we sample two populations with rates of occurrence $\lambda_1$, $\lambda_2$ and in times $t_1$, $t_2$ observe $n_1$, $n_2$ events, then, according to (3),

$$\frac{t_1(n_2 + \tfrac12)\,\lambda_1}{t_2(n_1 + \tfrac12)\,\lambda_2} \tag{4}$$

is distributed approximately as $F$ with $(2n_1 + 1, 2n_2 + 1)$ degrees of freedom. We test the hypothesis that $\lambda_1 = \lambda_2$ by referring

$$F = \frac{t_1(n_2 + \tfrac12)}{t_2(n_1 + \tfrac12)} \tag{5}$$

to the $F$ tables with $(2n_1 + 1, 2n_2 + 1)$ degrees of freedom. Confidence intervals for $\lambda_1/\lambda_2$ may be obtained from

$$\frac{t_2(n_1 + \tfrac12)}{t_1(n_2 + \tfrac12)}\,F_- < \frac{\lambda_1}{\lambda_2} < \frac{t_2(n_1 + \tfrac12)}{t_1(n_2 + \tfrac12)}\,F_+, \tag{6}$$

where $F_-$, $F_+$ are the lower and upper […] points of $F$ with $(2n_1 + 1, 2n_2 + 1)$ degrees of freedom.

Example. Observations of the spinning of one batch of wool gave 5 ends down in 200 spindle hours and of a second batch 12 ends down in 180 spindle hours. Assuming that the occurrence of ends down is random for each batch, we may test for the difference between batches by

$$F = \frac{200}{180} \times \frac{12.5}{5.5} = 2.53,$$

with (11, 25) degrees of freedom. The 5 % point in the ordinary $F$ tables is […] and the 2½ % point 2.56. Thus in a two-sided test the difference between batches is very nearly significant at 5 %. The lower and upper 2½ % points of $F$ with (11, 25) degrees of freedom are 1/3.16 and 2.56, so that by (6) a 95 % confidence interval for the ratio of the stoppage rates is

$$\frac{180 \times 5.5}{200 \times 12.5} \times \frac{1}{3.16} < \frac{\lambda_1}{\lambda_2} < \frac{180 \times 5.5}{200 \times 12.5} \times 2.56,$$

i.e. $0.125 < \lambda_1/\lambda_2 < 1.01$.

The test of the difference between batches could equally well be done by the conventional $\chi^2$ method, but the provision of a simple confidence interval for the ratio of the stoppage rates is a useful additional feature of the present approach.*

4. ACCURACY OF THE TEST

The test (5) may be expected to be accurate for large samples; its accuracy for small samples may be investigated as follows. The distribution of $n_1$, $n_2$ corresponding to given $\lambda_1$, $\lambda_2$, $t_1$ and $t_2$ is exactly

$$\text{prob}\,(n_1, n_2) = \frac{e^{-\lambda_1 t_1}(\lambda_1 t_1)^{n_1}}{n_1!} \cdot \frac{e^{-\lambda_2 t_2}(\lambda_2 t_2)^{n_2}}{n_2!}. \tag{7}$$

If we first determine the critical region of the test (5), we can then find the exact probability associated with the test by adding (7) over all points in the critical region. Table 1 gives the results of such calculations. In all cases a one-sided test of the hypothesis $\lambda_2 = \lambda_1$ against alternatives $\lambda_2 > \lambda_1$ has been examined.
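The arithmetic of the wool example in § 3 can be retraced with a few lines of Python. This is only a sketch: the function names `approx_f_statistic` and `ratio_interval` are illustrative and not from the paper, and the lower and upper $F$ points are simply the tabulated values quoted in the text (1/3.16 and 2.56), not computed here.

```python
def approx_f_statistic(n1, t1, n2, t2):
    """Statistic (5) and its degrees of freedom (2*n1 + 1, 2*n2 + 1)."""
    f = (t1 * (n2 + 0.5)) / (t2 * (n1 + 0.5))
    return f, (2 * n1 + 1, 2 * n2 + 1)


def ratio_interval(n1, t1, n2, t2, f_lower, f_upper):
    """Interval (6) for lambda1/lambda2 from tabulated lower/upper F points."""
    scale = (t2 * (n1 + 0.5)) / (t1 * (n2 + 0.5))
    return scale * f_lower, scale * f_upper


# Wool example: 5 ends down in 200 spindle hours, 12 in 180 spindle hours.
f_stat, dof = approx_f_statistic(5, 200.0, 12, 180.0)
# Tabulated 2.5 % points of F with (11, 25) d.f. as quoted in the text.
low, high = ratio_interval(5, 200.0, 12, 180.0, f_lower=1 / 3.16, f_upper=2.56)

print(round(f_stat, 2), dof)           # 2.53 (11, 25)
print(round(low, 3), round(high, 2))   # 0.125 1.01
```

Keeping the tabulated points as explicit arguments leaves the sketch free of any assumption about how the $F$ quantiles are obtained.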
The general conclusion from Table 1 is that except when the population means are very small, the approximate $F$ test gives the probability of errors of the first kind sufficiently accurately for practical purposes. For samples of the same size, the test may be considered satisfactory at the 0.05 level if the true mean exceeds one, and satisfactory at the 0.01 level if the true mean exceeds two.

* Note added in proof. The derivation of confidence intervals for the ratio of Poisson means has recently been considered in detail by Chapman (1952).

It is natural to compare the approximate $F$ test with the exact test of Przyborowski & Wilenski (1940), which is based on the distribution of $n_1$, $n_2$ conditional on $n_1 + n_2$. The comparison is difficult because the latter test is discrete and the true size of the critical region at, say, the 5 % level is appreciably less than 5 %, when small numbers are involved. The critical region of the approximate $F$ test appears always to consist of the critical region of the exact test with some additional points. It is thus roughly equivalent to Barnard's c.s.m. test (Barnard, 1947).

Table 1. Exact probabilities associated with various nominal significance levels of the approximate $F$ test of the hypothesis $\lambda_2 = \lambda_1$ against alternatives $\lambda_2 > \lambda_1$

(a) Samples of equal size, $t_1 = t_2$

    Population mean,                 Nominal significance level
    lambda_1 t_1 = lambda_2 t_2      0.1       0.05      0.01      0.001
    1                                -         0.031     0.001     -
    1.5                              -         0.049     0.004     -
    2                                0.124     0.059     0.007     0.0001
    3                                -         0.065     0.011     -
    4                                -         0.062     0.012     -
    5                                -         0.058     0.012     -
    6                                0.101     0.054     0.012     0.0013

(b) First sample size the larger […]

    Smaller population mean,         Nominal significance level
    lambda t_2                       0.05      0.01
    1                                0.048     0.008
    1.5                              0.055     0.009
    2                                0.055     0.010
    3                                0.056     0.011
    4                                0.056     0.010
    5                                0.051     0.010

(c) Second sample size […]

    Smaller population mean,         Nominal significance level
    lambda t_1                       0.05      0.01
    1                                0.012     3 x 10^-6
    1.5                              0.038     5 x 10^-4
    2                                0.056     0.002
    3                                0.059     0.010
    4                                0.059     0.013
    5                                0.057     0.012
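The calculation behind Table 1 (a), adding (7) over the critical region of the test (5), can be sketched in standard-library Python. Everything below is an assumption made for illustration rather than the procedure actually used for the paper: the critical region is taken as the set of $(n_1, n_2)$ whose upper-tail $F$ probability with $(2n_1 + 1, 2n_2 + 1)$ degrees of freedom falls below the nominal level, the $F$ tail is evaluated through a continued fraction for the regularized incomplete beta function, and the Poisson sums are truncated at a hypothetical cut-off `n_max`.

```python
import math

TINY = 1e-300


def _nz(x):
    """Keep a continued-fraction denominator away from zero."""
    return x if abs(x) > TINY else TINY


def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction (modified Lentz) for the incomplete beta function."""
    c = 1.0
    d = 1.0 / _nz(1.0 - (a + b) * x / (a + 1.0))
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        aa = m * (b - m) * x / ((a - 1.0 + m2) * (a + m2))          # even step
        d = 1.0 / _nz(1.0 + aa * d)
        c = _nz(1.0 + aa / c)
        h *= d * c
        aa = -(a + m) * (a + b + m) * x / ((a + m2) * (a + 1.0 + m2))  # odd step
        d = 1.0 / _nz(1.0 + aa * d)
        c = _nz(1.0 + aa / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_upper_tail(f, d1, d2):
    """P(F > f) for an F variate with (d1, d2) degrees of freedom."""
    return reg_inc_beta(d2 / 2.0, d1 / 2.0, d2 / (d2 + d1 * f))


def poisson_pmf(k, mean):
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def exact_size(mean1, mean2, t1, t2, alpha, n_max=60):
    """Add (7) over the points (n1, n2) at which test (5) rejects one-sidedly."""
    total = 0.0
    for n1 in range(n_max + 1):
        p1 = poisson_pmf(n1, mean1)
        for n2 in range(n_max + 1):
            f = t1 * (n2 + 0.5) / (t2 * (n1 + 0.5))
            if f_upper_tail(f, 2 * n1 + 1, 2 * n2 + 1) < alpha:
                total += p1 * poisson_pmf(n2, mean2)
    return total


size_05 = exact_size(2.0, 2.0, 1.0, 1.0, alpha=0.05)
size_01 = exact_size(2.0, 2.0, 1.0, 1.0, alpha=0.01)
print(round(size_05, 3), round(size_01, 3))
```

The two printed values are meant to be compared with the row for population mean 2 in Table 1 (a), i.e. 0.059 and 0.007. Parts (b) and (c) are not attempted because the ratio of the sample sizes is not recoverable from the text.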

The approximate $F$ test is an interesting consequence of (3) and may sometimes be preferred to conventional approximate $\chi^2$ methods. The detailed discussion does, moreover, show that (3) may lead to accurate results even in very small samples; in § 7 we shall consider an application where a great simplification is achieved by the use of (3).

5. CONFIDENCE INTERVALS FOR A SINGLE MEAN

The approximation (3) may be used to test the hypothesis that $\lambda = \lambda_0$, or to obtain confidence intervals for $\lambda$. An interval of confidence coefficient $(100 - 2\alpha)$ % is thus given approximately by

$$\tfrac12\chi^2_{2n+1,-} < \lambda t < \tfrac12\chi^2_{2n+1,+}, \tag{8}$$

where $\chi^2_{\cdot,+}$, $\chi^2_{\cdot,-}$ are the upper and lower $\alpha$ % points of $\chi^2$ with the indicated degrees of freedom. This may be compared with Garwood's (1936) confidence interval, which in the present notation is

$$\tfrac12\chi^2_{2n,-} < \lambda t < \tfrac12\chi^2_{2n+2,+}. \tag{9}$$

The interval (9) has a confidence coefficient of at least $(100 - 2\alpha)$ % for all $\lambda t$; the true confidence coefficient is a serrated function of $\lambda t$, […] its values appreciably greater than $(100 - 2\alpha)$ %. The interval (8) is always narrower than (9), and the true confidence coefficient is sometimes less than $(100 - 2\alpha)$ %. In many practical applications it would be reasonable to assume that over a number of applications of the method the true means $\lambda t$ are distributed randomly with respect to the serrations of the graph of the confidence coefficient. In this case it would be justifiable to use a confidence interval for which the average confidence coefficient over any fairly small range of $\lambda t$ is at least $(100 - 2\alpha)$ %. The interval (8) appears to satisfy this condition provided that $n$ and $\alpha$ are not very small, and it might therefore be claimed that (8) is preferable to (9). The point is, however, of little practical importance because the reduction in width from (9) to (8) is only small.

Example. Seven stoppages of a machine are observed in a certain period. 90 % confidence intervals for the population number of stoppages are, according to (8), (3.63, 12.50) and according to (9), (3.29, 13.15).

6. RELATION WITH INVERSE SAMPLING

The close connexion between the tests given in §§ 3 and 5 and those based on inverse sampling has been mentioned briefly in §§ 1 and 2. This connexion will now be discussed in more detail.* Tests based on the measurement of the intervals between randomly occurring events have been described in full by Maguire, Pearson & Wynn (1952). In particular, their test for the significance of the difference between the rates of occurrence in two series is as follows. Let $n_1$, $n_2$ be fixed and let $t_1$, $t_2$ be random variables defined as the times in the two series up to the $n_1$th, $n_2$th events. Then $F = t_1 n_2/(t_2 n_1)$ may be tested exactly as a variance ratio with $(2n_1, 2n_2)$ degrees of freedom.

The test of § 3, on the other hand, is that if $t_1$, $t_2$ are fixed times and if random variables $n_1$, $n_2$ are defined as the numbers of events occurring in times $t_1$, $t_2$, then $F = t_1(n_2 + \tfrac12)/\{t_2(n_1 + \tfrac12)\}$ may be tested approximately as a variance ratio with $(2n_1 + 1, 2n_2 + 1)$ degrees of freedom.

* I am very grateful to Prof. E. S. Pearson for some helpful comments leading to the addition of this section.
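The $\chi^2$ points with $2n$, $2n + 1$ and $2n + 2$ degrees of freedom that distinguish (8) from (9), and the fixed-time case from inverse sampling, are easy to obtain numerically. The sketch below, in standard-library Python, recomputes the § 5 example for $n = 7$. It rests on assumptions that are not in the paper: the $\chi^2$ distribution function is evaluated by a power series for the regularized incomplete gamma function, quantiles are found by bisection, and `alpha` is a proportion (0.05) rather than the percentage $\alpha$ of the text. All function names are illustrative.

```python
import math


def reg_lower_gamma(a, x, max_terms=1000, eps=1e-15):
    """Regularized lower incomplete gamma function P(a, x) by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    ap = a
    for _ in range(max_terms):
        ap += 1.0
        term *= x / ap
        total += term
        if term < total * eps:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def chi2_cdf(q, df):
    return reg_lower_gamma(df / 2.0, q / 2.0)


def chi2_quantile(p, df):
    """Solve chi2_cdf(q, df) = p for q by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def interval_8(n, alpha):
    """Approximate interval (8) for lambda*t: both limits use 2n + 1 d.f."""
    df = 2 * n + 1
    return 0.5 * chi2_quantile(alpha, df), 0.5 * chi2_quantile(1.0 - alpha, df)


def interval_9(n, alpha):
    """Garwood's interval (9): 2n d.f. below, 2n + 2 d.f. above."""
    lower = 0.0 if n == 0 else 0.5 * chi2_quantile(alpha, 2 * n)
    upper = 0.5 * chi2_quantile(1.0 - alpha, 2 * n + 2)
    return lower, upper


cox_low, cox_high = interval_8(7, alpha=0.05)
gar_low, gar_high = interval_9(7, alpha=0.05)
print(round(cox_low, 2), round(cox_high, 2))   # compare with (3.63, 12.50)
print(round(gar_low, 2), round(gar_high, 2))   # compare with (3.29, 13.15)
```

Because (8) uses the intermediate $2n + 1$ degrees of freedom at both ends, its limits fall strictly inside those of (9), which is the narrowing described in § 5.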

In the first case the interval $t_1$ ends with the $n_1$th event, but in the second case this is not in general so. The extra degrees of freedom, and the ½'s in the definition of $F$, may be thought of as accounting for the periods between the last events and the close of the periods of observation. In many applications in which $t_1$, $t_2$ are fixed, the precise instants at which the $n_1$th, $n_2$th events occur would not be known.

It will frequently happen in applications that neither $t_1$, $t_2$ nor $n_1$, $n_2$ are fixed in advance of the observations. If, however, the periods $t_1$, $t_2$ are determined by some random process that is quite uninfluenced by the numbers of events occurring, then conditionally on $t_1$, $t_2$ the quantities $n_1$, $n_2$ follow Poisson distributions and the approximate $F$ test may be applied. Similarly, if the time intervals are measured up to the $n_1$th, $n_2$th events, where $n_1$, $n_2$ are determined by a random process independent of the observations, then the exact $F$ test with $(2n_1, 2n_2)$ degrees of freedom is applicable.

There are many other possible ways in which $t_1$, $t_2$ might be determined; for example, $t_1$, $t_2$ might be chosen by some random process correlated with, but not completely determined by, $n_1$, $n_2$. In such cases it is not possible to find the properties of the above tests without special investigation, although a reasonable general rule is to use the first $F$ test, with $(2n_1, 2n_2)$ degrees of freedom, whenever the intervals $t_1$, $t_2$ are ended by events, and the second test otherwise. This rule breaks down in extreme cases; for example, if $t_1$, $t_2$ are determined by a likelihood-ratio sequential test for comparing the rates of occurrence, it would clearly be entirely inappropriate to apply either $F$ test.

7. THE LOGARITHMIC TRANSFORMATION

The basic approximation (3) is very conveniently expressed in logarithmic form, and so may be expected to be useful whenever $\lambda$ is the product of unknown parameters.
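If $f_n$ is a function of $n$ we may rewrite (3) as

$$\log_e (f_n/t) = \log_e \lambda + \log_e f_n - \log_e \tfrac12\chi^2_{2n+1}.$$

The $\log \chi^2$ distribution has been studied in detail by Bartlett & Kendall (1946) and by Wishart (1947), who have, in particular, shown that

$$E \log_e \tfrac12\chi^2_\nu = \psi(\tfrac12\nu), \qquad \operatorname{var} \log_e \tfrac12\chi^2_\nu = \psi'(\tfrac12\nu),$$

where $\psi$, $\psi'$ are the digamma and trigamma functions. It is convenient to choose $f_n$ so that $\log_e (f_n/t)$ is an unbiased estimate of $\log_e \lambda$. Thus we take $f_n = \exp\{\psi(n + \tfrac12)\}$, and for the present purpose it is accurate enough to write

$$f_0 = 0.14, \qquad f_n = n \quad (n = 1, 2, \dots). \tag{10}$$

Also

$$\operatorname{var} \log_e (f_n/t) = \psi'(n + \tfrac12) \simeq \begin{cases} 4.93 & (n = 0), \\ 1/n & (n \ne 0). \end{cases} \tag{11}$$

Thus we define a transformed variate by

$$z = \begin{cases} \log_{10} (0.14/t) & \text{if } n = 0, \\ \log_{10} (n/t) & \text{if } n \ne 0. \end{cases} \tag{12}$$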
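The quality of the simplifications (10) and (11) can be checked numerically. The following sketch uses the standard recurrences for the digamma and trigamma functions at half-integer arguments, $\psi(n + \tfrac12) = -\gamma - 2\log_e 2 + \sum_{k=1}^{n} 2/(2k - 1)$ and $\psi'(n + \tfrac12) = \pi^2/2 - \sum_{k=1}^{n} 4/(2k - 1)^2$; these identities and the function names are supplied here for illustration and do not appear in the paper.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma_half(n):
    """psi(n + 1/2) for integer n >= 0."""
    return (-EULER_GAMMA - 2.0 * math.log(2.0)
            + sum(2.0 / (2 * k - 1) for k in range(1, n + 1)))


def trigamma_half(n):
    """psi'(n + 1/2) for integer n >= 0."""
    return (math.pi ** 2 / 2.0
            - sum(4.0 / (2 * k - 1) ** 2 for k in range(1, n + 1)))


def f_exact(n):
    """f_n = exp{psi(n + 1/2)}, which makes log(f_n / t) unbiased for log(lambda)."""
    return math.exp(digamma_half(n))


def f_simple(n):
    """The simplification (10)."""
    return 0.14 if n == 0 else float(n)


def z_transform(n, t):
    """The transformed variate (12), in logarithms to base 10."""
    return math.log10(f_simple(n) / t)


for n in (0, 1, 2, 5, 10):
    approx_var = 4.93 if n == 0 else 1.0 / n
    print(n, round(f_exact(n), 3), f_simple(n),
          round(trigamma_half(n), 3), round(approx_var, 3))
```

Each printed row sets the exact $f_n$ and variance beside the simplified values of (10) and (11). The rows for small $n \ge 1$ show where the simplification is roughest, for example $\psi'(1\tfrac12) = \pi^2/2 - 4 \approx 0.93$ against $1/n = 1$.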

Then approximately $E(z) = \log_{10} \lambda$ and $\operatorname{var} z = (\log_{10} e)^2 v$, where

$$v = \begin{cases} 4.93 & (n = 0), \\ 1/n & (n \ne 0). \end{cases} \tag{13}$$

One use of the transformation will now be described. Suppose observations of stoppage rates are made for $k$ machines on each of which stoppages occur at random, the rates for the different machines being different. Suppose that a change in the process is introduced and that fresh observations are then made. In some cases it would be reasonable to expect a constant proportional change in stoppage rate, i.e. to expect that if the initial stoppage rates are $\lambda_1, \dots, \lambda_k$ the final stoppage rates will be $\rho\lambda_1, \dots, \rho\lambda_k$. If the observations are, in the initial period $(n_1, t_1), \dots, (n_k, t_k)$, and in the final period $(n_1', t_1'), \dots, (n_k', t_k')$, we may define transformed variates $z_1, \dots, z_k$ and $z_1', \dots, z_k'$ as in (12). Then $u_i = z_i' - z_i$ has mean $\mu = \log_{10} \rho$ and known variance $(\log_{10} e)^2 (v_i + v_i') = (\log_{10} e)^2 w_i$ say. The best estimate of $\mu$ is thus

$$\hat\mu = \frac{\sum u_i w_i^{-1}}{\sum w_i^{-1}}, \qquad \operatorname{var} \hat\mu = \frac{(\log_{10} e)^2}{\sum w_i^{-1}}. \tag{14}$$

An almost unbiased estimate of the parameter $\rho$ is

$$\hat\rho = \left\{1 - \frac{1}{2\sum w_i^{-1}}\right\} 10^{\hat\mu}. \tag{15}$$

An approximate test of the hypothesis of a proportional change in stoppage rate may be obtained by calculating

$$\chi^2_{k-1} = \frac{\sum w_i^{-1}(u_i - \hat\mu)^2}{(\log_{10} e)^2}. \tag{16}$$

Example. The following data were the results of a sampling experiment with $\rho = 2$, the values of $T_i$, $T_i'$ being chosen arbitrarily.

              Initial               Final
    i     T_i (hr.)   n_i      T_i' (hr.)   n_i'      u_i      1/w_i
    1        30        2          28          5      0.428     1.429
    2        26        4          41         21      0.522     3.360
    3        21        2          24          3      0.118     1.200
    4        45       18          16         11      0.235     6.828
    5        30        4          32         12      0.449     3.000

$\sum w_i^{-1} = 15.817$, $\sum u_i w_i^{-1} = 5.4611$, $\sum u_i^2 w_i^{-1} = 2.1774$.

The last two columns give $u_i = z_i' - z_i = \log_{10}\{T_i n_i'/(T_i' n_i)\}$ and $w_i^{-1} = (v_i + v_i')^{-1} = n_i n_i'/(n_i + n_i')$, no frequency being zero.

$\hat\mu = \sum u_i w_i^{-1}/\sum w_i^{-1} = 0.345$ and $\operatorname{var} \hat\mu = (0.4343)^2/15.817$, so that standard error $(\hat\mu) = 0.109$. There is thus good agreement with the true value $\mu = \log_{10} 2 = 0.3010$. $\rho$ is estimated by

$$\hat\rho = \left\{1 - \frac{1}{2 \times 15.817}\right\} 10^{\hat\mu} = 2.145.$$

The hypothesis of a proportional change in stoppage rate is tested by

$$\chi^2_4 = (2.3026)^2 \sum w_i^{-1}(u_i - \hat\mu)^2 = (2.3026)^2 \left\{\sum w_i^{-1} u_i^2 - \frac{(\sum w_i^{-1} u_i)^2}{\sum w_i^{-1}}\right\} = 1.55,$$

indicating a good agreement.

This problem could be tackled without introducing the device (3) by the method of Dyke & Patterson (1952), using an iterative solution of maximum-likelihood equations. The present method is considerably quicker. The methods are probably asymptotically equivalent when the $n_i$ and $n_i'$ are all large. The transformation (12) can be applied in a similar way to more complicated problems involving Poisson variates.

I am grateful to Mr D. A. East and Miss Patricia Johnson who did most of the calculations.

REFERENCES

BARNARD, G. A. (1946). J.R. Statist. Soc. Suppl. 8, 1.
BARNARD, G. A. (1947). Biometrika, 34, 123.
BARTLETT, M. S. & KENDALL, D. G. (1946). J.R. Statist. Soc. Suppl. 8, 128.
CHAPMAN, D. G. (1952). Ann. Inst. Statist. Math. 4, 45.
DYKE, G. V. & PATTERSON, H. D. (1952). Biometrics, 8, 1.
GARWOOD, F. (1936). Biometrika, 28, 437.
MAGUIRE, B. A., PEARSON, E. S. & WYNN, A. H. A. (1952). Biometrika, 39, 168.
PRZYBOROWSKI, J. & WILENSKI, H. (1940). Biometrika, 31, 313.
WISHART, J. (1947). Biometrika, 34, 170.
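As a supplementary check, the arithmetic of the § 7 example (the transformation (12), the weights of (13), and the quantities (14) to (16)) can be retraced in a short Python sketch. The function and variable names are illustrative only; the data are the five pairs $(T_i, n_i)$, $(T_i', n_i')$ of the example.

```python
import math

LOG10_E = math.log10(math.e)  # 0.4343


def z_transform(n, t):
    """Transformed variate (12)."""
    return math.log10((0.14 if n == 0 else n) / t)


def v_factor(n):
    """Variance factor (13)."""
    return 4.93 if n == 0 else 1.0 / n


def proportional_change(initial, final):
    """Estimates (14), (15) and the test statistic (16) for paired (T, n) data."""
    u = [z_transform(n2, t2) - z_transform(n1, t1)
         for (t1, n1), (t2, n2) in zip(initial, final)]
    w_inv = [1.0 / (v_factor(n1) + v_factor(n2))
             for (_, n1), (_, n2) in zip(initial, final)]
    sum_w_inv = sum(w_inv)
    mu_hat = sum(ui * wi for ui, wi in zip(u, w_inv)) / sum_w_inv
    std_err = LOG10_E / math.sqrt(sum_w_inv)
    rho_hat = (1.0 - 1.0 / (2.0 * sum_w_inv)) * 10.0 ** mu_hat
    chi2 = sum(wi * (ui - mu_hat) ** 2 for ui, wi in zip(u, w_inv)) / LOG10_E ** 2
    return {"sum_w_inv": sum_w_inv, "mu_hat": mu_hat, "std_err": std_err,
            "rho_hat": rho_hat, "chi2": chi2, "dof": len(u) - 1}


# (T_i in hours, n_i) for the initial and final periods of the example.
initial = [(30, 2), (26, 4), (21, 2), (45, 18), (30, 4)]
final = [(28, 5), (41, 21), (24, 3), (16, 11), (32, 12)]

result = proportional_change(initial, final)
for key, value in result.items():
    print(key, round(value, 3))
```

The printed values are intended for comparison with those quoted in the example: $\sum w_i^{-1} = 15.817$, $\hat\mu = 0.345$, standard error 0.109, $\hat\rho = 2.145$ and $\chi^2_4 = 1.55$.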