Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

Similar documents
Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

Biometrika Trust. Biometrika Trust is collaborating with JSTOR to digitize, preserve and extend access to Biometrika.

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

International Biometric Society is collaborating with JSTOR to digitize, preserve and extend access to Biometrics.

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

Biometrika Trust. Biometrika Trust is collaborating with JSTOR to digitize, preserve and extend access to Biometrika.

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

Detection of Influential Observation in Linear Regression. R. Dennis Cook. Technometrics, Vol. 19, No. 1. (Feb., 1977), pp

Biometrika Trust. Biometrika Trust is collaborating with JSTOR to digitize, preserve and extend access to Biometrika.

Mind Association. Oxford University Press and Mind Association are collaborating with JSTOR to digitize, preserve and extend access to Mind.

The Review of Economic Studies, Ltd.

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

11] Index Number Which Shall Meet Certain of Fisher's Tests 397

The Econometric Society is collaborating with JSTOR to digitize, preserve and extend access to Econometrica.

Introduction to the Design and Analysis of Experiments

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

Math 423/533: The Main Theoretical Topics

The Econometric Society is collaborating with JSTOR to digitize, preserve and extend access to Econometrica.

The Periodogram and its Optical Analogy.

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

HANDBOOK OF APPLICABLE MATHEMATICS

THE ESTIMATION OF MISSING VALUES IN INCOMPLETE RANDOMIZED BLOCK EXPERIMENTS

A NOTE ON CONFIDENCE BOUNDS CONNECTED WITH ANOVA AND MANOVA FOR BALANCED AND PARTIALLY BALANCED INCOMPLETE BLOCK DESIGNS. v. P.

DESIGN AND ANALYSIS OF EXPERIMENTS Third Edition

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

Ecological Society of America is collaborating with JSTOR to digitize, preserve and extend access to Ecology.

Linear Models 1. Isfahan University of Technology Fall Semester, 2014

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

Design of Engineering Experiments Part 5 The 2 k Factorial Design

MEASUREMENTS DESIGN THE LATIN SQUARE AS A REPEATED. used). A method that has been used to eliminate this order effect from treatment 241

Interaction effects for continuous predictors in regression modeling

SEQUENTIAL TESTS FOR COMPOSITE HYPOTHESES

Mathematical Association of America

Confidence intervals and the Feldman-Cousins construction. Edoardo Milotti Advanced Statistics for Data Analysis A.Y

Prediction Intervals in the Presence of Outliers

Journal of the American Mathematical Society, Vol. 2, No. 2. (Apr., 1989), pp

r(j) -::::.- --X U.;,..;...-h_D_Vl_5_ :;;2.. Name: ~s'~o--=-i Class; Date: ID: A

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

Marcia Gumpertz and Sastry G. Pantula Department of Statistics North Carolina State University Raleigh, NC

arxiv: v1 [math.na] 5 May 2011

Kumaun University Nainital

Analysis of variance, multivariate (MANOVA)

INFORMS is collaborating with JSTOR to digitize, preserve and extend access to Mathematics of Operations Research.

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

Linearity in Calibration:

Chapter 1 Statistical Inference

STATISTICS 174: APPLIED STATISTICS TAKE-HOME FINAL EXAM POSTED ON WEBPAGE: 6:00 pm, DECEMBER 6, 2004 HAND IN BY: 6:00 pm, DECEMBER 7, 2004 This is a

Testing the homogeneity of variances in a two-way classification

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

Statistical Inference Using Maximum Likelihood Estimation and the Generalized Likelihood Ratio

When Good Confidence Intervals Go Bad: Predictor Sort Experiments and ANOVA

of, denoted Xij - N(~i,af). The usu~l unbiased

Regression. Oscar García

On the independence of quadratic forms in a noncentral. Osaka Mathematical Journal. 2(2) P.151-P.159

Mixture Designs Based On Hadamard Matrices

1. Find the missing input or output values for the following functions. If there is no value, explain why not. b. x=1-1

E. DROR, W. G. DWYER AND D. M. KAN

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

COMPARISON OF FIVE TESTS FOR THE COMMON MEAN OF SEVERAL MULTIVARIATE NORMAL POPULATIONS

3 Matrix Algebra. 3.1 Operations on matrices

P a g e 5 1 of R e p o r t P B 4 / 0 9

SOUTHWESTERN ELECTRIC POWER COMPANY SCHEDULE H-6.1b NUCLEAR UNIT OUTAGE DATA. For the Test Year Ended March 31, 2009

Question. Hypothesis testing. Example. Answer: hypothesis. Test: true or not? Question. Average is not the mean! μ average. Random deviation or not?

Biometrika Trust. Biometrika Trust is collaborating with JSTOR to digitize, preserve and extend access to Biometrika.

Linear Regression with Multiple Regressors

Experimental design. Matti Hotokka Department of Physical Chemistry Åbo Akademi University

Lecture 11. Multivariate Normal theory

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Unit 5: Fractional Factorial Experiments at Two Levels

Intrinsic Four-Point Properties

An Improved Approximate Formula for Calculating Sample Sizes for Comparing Two Binomial Distributions

Relating Graph to Matlab

Key Algebraic Results in Linear Regression

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

Defense Technical Information Center Compilation Part Notice

Applications of Basu's TheorelTI. Dennis D. Boos and Jacqueline M. Hughes-Oliver I Department of Statistics, North Car-;'lina State University

A MODEL-BASED EVALUATION OF SEVERAL WELL-KNOWN VARIANCE ESTIMATORS FOR THE COMBINED RATIO ESTIMATOR

On Moore Graphs with Diameters 2 and 3

COVARIANCE ANALYSIS. Rajender Parsad and V.K. Gupta I.A.S.R.I., Library Avenue, New Delhi

Likelihood and p-value functions in the composite likelihood context

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at

INFORMS is collaborating with JSTOR to digitize, preserve and extend access to Management Science.

Chapter 11. Correlation and Regression

Useful Numerical Statistics of Some Response Surface Methodology Designs

SIMPLIFIED CALCULATION OF PRINCIPAL COMPONENTS HAROLD HOTELLING

UNIT 3 CONCEPT OF DISPERSION

On the Triangle Test with Replications

Transcription:

Biometrika Trust The Use of a Concomitant Variable in Selecting an Experimental Design Author(s): D. R. Cox Source: Biometrika, Vol. 44, No. 1/2 (Jun., 1957), pp. 150-158 Published by: Oxford University Press on behalf of Biometrika Trust Stable URL: http://www.jstor.org/stable/2333247 Accessed: 01-03-2017 15:06 UTC REFERENCES Linked references are available on JSTOR for this article: http://www.jstor.org/stable/2333247?seq=1&cid=pdf-reference#references_tab_contents You may need to log in to JSTOR to access the linked references. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact support@jstor.org. Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at http://about.jstor.org/terms Biometrika Trust, Oxford University Press are collaborating with JSTOR to digitize, preserve and extend access to Biometrika

[ 150 ] THE USE OF A CONCOMITANT VARIABLE IN SELECTING AN EXPERIMENTAL DESIGN* BY D. R. COX Department of Biostati8ticm, School of Public Health, University of North Carolina and Department of Stati8tic8, University of Californiat 1. INTRODUCTION Suppose that we have t treatments for comparison using N experimental units, for exam animals, and that on each unit we have a preliminary observation, for example, the body weight, available before the treatments are allotted to the units. Suppose that it is expected that the final observation of interest, y, will, in the absence of treatment effects, be well correlated with the preliminary observation, x. Then a number of standard methods are available for exploiting this correlation in order to increase the precision of the estimated treatment effects. The object of this paper is to compare the methods in some simple situations with a small number of experimental units. Attention is restricted to experiments in which each of the alternative treatments appears the same number, k, of items, so that N = tk. 2. SOME STANDARD METHODS We list first the following methods for increasing the precision of the treatment comparison Method I. An index of response is used, for example, ylx, y/x?, y - x, etc. The index be suggested by analysis of previous data or by general consideration of a plausible form for the regression relation between y and x. Method II. The treatments are completely randomized over the experimental units and an adjustment made for regression on x by analysis of covariance. For simplicity only linear regression will be considered. Method III. The experimental units are ranked in order of increasing x and then grouped into k blocks of t units each, the first block, for example, consisting of the t units with lowest values of x. A randomized block arrangement is then constructed, based on this grouping of the units. Method I V. This is applicable when k is equal to, or is a multiple of, t. The units are again ranked in order of increasing x and a Latin square arrangement used to control not only differences between blocks but also variation associated with order within a block. Thus with t = k = 4, N = 16, the units are numbered from 1,..., 16, in order of increasing x and a Latin square set out, as in the following example: Order within block Block no. 1 2 3 4 1 1:T8 2: T1 3: T2 4: T4 2 5: T4 6: T2 7: T 8: Ti 3 9: T2 10: T4 11:Ti 12:7T3 4 13: T1 14:Ta 15: T4 16:7T2 * This paper was prepared with the partial suppo under contract with the U.S.A.F. School of Aviation Medicine. t Present address, Birkbeck College, University of London.

D. R. Cox 151 A similar method, employing Youden squares, could be employed if k is not a multiple of t, but would require a lengthier analysis. Method V. Methods II and III may be combined by using the randomized block design plus a covariance correction. Method VI. It often happens that the preliminary observation x is not the only thing associated with the units that can be used to increase precision. One of a number of possibilities is that a grouping into randomized blocks should be made on the basis of the other criteria and then a covariance adjustment made for x. Method VII. A systematic, or non-randomized, design may be used, chosen, for example, to give maximum precision under the hypothesis that the regression equation of y on x is a polynomial of low degree (Cox, 1951). The use of such a design has the disadvantages attendant on the lack of full randomization. An eighth method is that of Papadakis (see, for example, Bartlett (1938)). This is useful in large experiments in which there are expected to be trends or serial correlation between experimental units adjacent in space or time, but will not be considered here. 3. BASIS FOR COMPARISON Denote the t treatments by T1,..., T1 and the corresponding estimated treatment means, after adjustment by covariance where appropriate, by 91,..., Pt. We measure the random errors associated with the design by the variance of the estimated difference between a pair of treatments, averaged over all pairs of treatments, i.e. by V, = Ave V(Pi - gj).(1) i,j, itj This is a certain multiple of the population residual variance o-2 and is not affected by errors in estimating o-2. Call (1) the true (average) imprecision of the design; it is slightly different from the quantity suggested by Lucas and used by Greenberg (1953). Often, however, we are interested in the apparent imprecision, Va, making due allow for the effective loss of information that arises in estimating the residual variance. To obtain Va we use Fisher's factor (f + 3)/(f + 1), wheref is the number of degrees of freedom availab to estimate the residual variance (Cochran & Cox, 1950, p. 26); that is, we put f - 3 (2) This will be supplemented by the rather arbitrary rule that if f < 5 no effective estimate of the residual varianice is considered possible from the data alone. From the point of view of the length of confidence intervals and the sensitivity of significance tests, a situation with V, = 1 and with the variance estimated from the data is nearly equivalent to a situation in which the residual variance is known and V, = 1. The use of the average variance in (1) is a natural step but should be viewed critically. particularly in small experiments where there may be substantial variation in precision between different randomization patterns and between different comparisons within one randomization pattern. This is particularly so with Method II.

152 Use of a concomitant variable in experimental design 4. CALCULATION OF IMPRECISION To compare the different methods it will be assumed that, in the absence of treatment effects, y and x have a bivariate normal frequency distribution with correlation coefficient, p, and with the variance of y for fixed x denoted by o-2. If the effect of variation accounted for by x were completely removed, we should have V1 = 2o-2/k. Therefore we write Vt 2 _ (3) 10'2 Va-k a? (4) and use I1 and la as indices of t We now compute I, and Ia for Table 1. Loss of precision from using wrong index of response Range of 8/?!/ within which It < p I t~~ if x ignored 1.1 1-2 1.5 0.2 (-0.55, 2.55) (-1.19, 3.19) (-2.46, 4.46) 1.04 04 (0-28, 1.72) (-0.02, 2.02) (-0.62, 2*62) 1.19 06 (0.68, 1.32) (0.40, 1.60) (0.06, 1.94) 1.56 0.8 (0.76, 1.24) (0.67, 1.33) (0.47, 1.53) 2.78 0.9 (0-82, 118) (0.74, 1.26) (0.59, 1.41) 5.26 0.95 (0.90, 1.10) (0-85, 1.15) (0.77, 1.23) 10-26 Method 1. Suppose that the true regression coefficient of y on x is,1 and that the index of response is y-fiox. This situation has been considered by Gourlay (1953). It is easily shown that the index of true imprecision is (1) l- (12p2 f?+ p2 f? (I _p2)-_ (p*? r (5) 1 + 2Bo x2/c-2 (p = 0). If no attempt is made to use x, i.e. if f80 0, It =(I _ p2)-1. Thus the attempted correction is an advantage whenever 0 </,30// < 2. Table 1 shows the ranges of values of /?0/,8 for which It < 1 1, 1-2, 1*5 and so tells us how neard and,/0 need to be to avoid losing a specified amount of information. Fisher (1935, p. 163) criticized the use of indices of response based on inadequate assessments of the relation between y and x. It is no contradiction of these criticisms to conclude from Table 1 that, particularly if p is not near unity, flo does not need to be very near fi to give a worth-while increase in precision. The use of non-linear indices such as y/xx allows curvilinear regression to be accounted for. Note, however, that if the treatment effects are constant, independently of x, on the y scale, they will not be constant when the index is used; if the index and the original observation have equal physical significance there may

D. R. Cox 153 well be no reason for expec measuring treatment effects. However, if the object is the estimation of the effect on the long-run average of y of a change in treatments, there may occasionally be difficulties connected with a naive use of average of y/xx. Method II. Denote the terms of the analysis of covariance as follows: Treatments Tx Tx T,X Residual RXX Rx Ry Total SXX SX SvY where, for example, Rx denotes the residual sum of products of x for the ith treatment and Pi the mean of y adjusted for regre V(P -Pi) = ki(1 + 2R,)} - (6) Ave V(Pi- 'j) = (1 + 2{ i,j~ ~ ~ ( 1) Rx This is conditional on fixed x's; but the x's are in fact a random sample from a normal population and the second term in brackets is therefore proportional to an F variate. Hence if we take expectations and remove the factors 2o2/k, we obtain for the index of true imprecision 1 t (k-1)t-2 N-t-2 (7) Also 1(2) = 1(2)N-t+2 (8) since the residual degrees of freedom Table 2 gives the values of (7) and (8 The increase in variance above that eliminated can be described as due to or as arising from the non-orthogon the populations sampled. Method III. Consider first one block and a fixed set of x values xl,..., xt. In the absence of treatment effects, the corresponding observations on y are /3xl + e1,...,/xt + et, where,8 is the regression coefficient of y on x and e1,..., et are independent with constant mean and with variance o2, and determine the dispersion of y about its regression line on x. Hence if two positions are selected randomly to carry treatments Ti and Tj, the expected meansquare difference between the resulting observations is G'2+ - Vi-)2 = 0-2 1+ #-2)tl r Hence, averaging over the k blocks, and over the distribution of the x's, we have that I3 + 1_p2W, where W is the expected mean square of x within blocks, di

154 Use of a concomitant variable in experimental design O 0q ~O0~04 c00 o OO I I111 I 0 0 * O bd - -0 ~~~0 00 "qi 1 001-~ ~~~"-I -I4 O,-I 'L_z I O_e _4 _l4 { 4 e Q o~~~~~~qqa t- q rq q ko > i>c <IIIII D E O 8e - 0 o 0 o o -q u. aq~~. aq e o0 aq e c o O P- oo 00 "- 01 aq r- CO M r- tz 0C ox kt* 00-0 Co 1 -qx 000m~ m 0 000Ct- 0 01 Cot-01 - C ~~~~~~0 o C o o o - o q co Ci CO Cim~oX 1-- - - - o ------ - 6 AA-~-4----! -ta 0 Co - I C X I Il I II I > ; I?1 01o 0 0 oo. 01 0o o0000-m o o CO O Ci o 01 x A 0 OO O O 0-' ^ 401O 3H F001 ' 40 01 00 00001E0 01 COOCO '~J4 ~ ) CO - - 01000011o 0 c1 o " ce I I I I I I III I o O s d4 B~~~ 0~lH - I -1- ta o. 00 Cl CO - c Cs co cs o 00 0 o - O sp- 1* u:o c4 0 r- oq *0 (M r t- 00 o> oc o O o0 zo o o m r c "i O o ooo 0t oco Co 0*0* co 0. Z0 CO 0 O 0 0 I oo c 00 i O OC (O O~~~~ 00Ot' C0 01 _10'4 014,) 0; 0101 ~ -~r46 r: 011014 -~. I- C4 :4r~r:4 rq r: : ux ci O X I II I I I I I O o X c. 00 -m0o CO - s~~~~~ C) oo C) O rq ol ' -*O 0 O Ci m Cq m c 00 - X - :*b* 2) CO ~ 0cO0C ~ s - I I I I I I I I. ~o- 1 0t Q X 10 O O O CY)q O m IM I I I I I I I1 O: 0,- ~~~~C~~OCO 10E.. 4010 C4'IJItI I4I I0I I0~CO: * 10Ct * 0,COQ V co a0 x 01 o o-q oco - o,o- ci - 1 " W. O U UO O C) 0 "i CYD t- xo aq r- CO -. co5>m co O O xo O Oq Oq c OO O E *> o666 o666 66 ~~~ D. 141 4 114 ; e4 u, f > ew q o @' 0 0~~~~~~~~~~~~~~~~~ " 0 0 : S o_! 68? ^,? s?r. I E II Io

D. R. Cox 155 We need, in order to calculate W, to consider the following. Take an ordered sample x(,),..., x(f) from a unit normal population. Divide this into blocks as described above a find the mean square within blocks. Then W is the expected value of this mean square and can be calculated for N < 20 from recently published tables of the second moments of order statistics in a normal sample (Teichroew, 1956); Table 2 gives some numerical values derived in this way. The general conclusion from the values of IJ) and I(3) is that Method III is somewhat better than Method II if p < 0 6 and that Method II becomes appreciably better than Method III only when p is as large as 0 8 or more. It makes little difference if the comparison is based on Ia instead of on It. In larger experiments with moderate t and large k, both methods will be effective in reducing the value of I, to near unity, except when p is very near unity when the use of Method III will be inadvisable. However, Method III will remain reasonably efficient for any form of smooth regression between y and x, not just for linear regression. If the regression is linear, but the distribution of x is leptokurtic, the randomized block method is likely to be relatively less effective due to the end-blocks having units with widely discrepant values of x. Method IV. The argument is similar when a Latin square is used, except that W', equal to the expected residual mean square in the two-way array of x's formed from the rows and columns of the square, replaces W. Table 2 gives the value of W', and of 1(4) in certain cases. If we have r squares, each of size t x t, the residual degrees of freedom are (t - 1) (rt - r - 1), when the residual within squares and the treatment x squares terms are combined. This number is small in the cases examined. The additional precision gained by eliminating 'order within blocks' makes the critical value of p at which Methods II and IV are approximately equivalent equal to about 0*8. Method V. This is the use of x simultaneously for blocking and for covariance correction. There are two possibilities. We may analyse the design as a randomized block with covariance, estimating the regression coefficient from the residual line of the analysis of covariance. Equation (6) applies with Rx still defined as the residual sum of squares, this time in the randomized block analysis. We again require the expectation of TxxlRxx; over the randomization with the x's fixed with near equality in a large design. Hence E(TxxIRxx) > E(Tx)IE(Rx) - 1' (10) +(t- 1) (k- 1)' (1 and the right-hand side of (11) will be used as an approximation to 115); the error will be of the order of [l/(t- 1) (ki-1l)]2, as can be shown by the expansion methods of large-sample theory. The second possibility is to analyse the design as if it were completely randomized. This would be in order if the arrangement into blocks has no effect on the y-values other than that due to correlation with the x's, and if the assumptions of the least squares model can be postulated, i.e. if we use more than pure randomization theory. In this case the argument parallel to (10) leads to again with near-equality in a large experiment. a w n e i n1+t(kri1)m (12)

156 Use of a concomitant variable in experimental design The numerical values in Table 2 and a direct comparison of the formulae show that if k > 3 the lower limit for I(5) is greater than 1(2). Hence, under the conditions postulated, Method V is inferior to Method II. Even with the second method of analysis, i.e. with (12), it seems unlikely that there is appreciable gain in average precision over Method II, under the conditions assumed. Similar conclusions are reached from I(5) and 1(5)' Method VI. In this we consider a randomized block design with grouping based on a criterion separate from x. Quantitative investigation of this, based, for example, on the assumption that x, y and the property determining the grouping have a trivariate normal distribution, has not been attempted. We can, however, deal with two limiting cases. The system of blocking may be identical to that based on x. Equation (11) is then applicable. Or the criterion for grouping may be independent of x, in which case t (t-1) (k-1)-2 (13) For the smaller values of (t - 1) (k - 1) among the designs investigated, this is about 1l15 J(2), showing that in these cases the additional system of blocking should be included only if there is a reasonable prospect of a reduction of 20 % or more in residual variance. For the larger values of (t - 1) (k - 1), 1(6) and 1(2) are very nearly equal. Method VII. This method is theoretically the most efficient one when the observations on y are built up of a polynomial trend on x plus treatment effect plus random error of constant mean and dispersion. The method is best illustrated by an example: suppose that N = 9, t = k = 3, that a second degree curve is considered adequate to represent the regression of y on x, and that the values of x are in order - 1-3, - 0-9 - 0-8, - 0-5 - 0-2, 0-0) 0-4) 0-7, 1.1. The most systematic procedure is to start by forming first and second degree orthogonal polynomials, 6', 6', from these observations. If ma = Exi4, these are 61= Xi- Ml/MO,A =' 2_(moms - ml) M Ml3-is ).} (14) =4 - Xmm i / 2 The numerical values are 6:- 1-13, - 0-73, - 0-63, - 0-33, - 0-03, 0-17, 0-57, 0-87, 1-27 2: 0-89, 0'08, - 0-07, - 0-40, - 055, -056, - 032, 0-07, 086. These are then normalized by dividing by V(Zi2) to give 61 and 62, n 61: - 0-50, - 0-33, - 0-28, - 0-15, - 0-01, 0-08) 0-25) 0-39) 0-57, 62: 057, 0.05, -0-04, -0-26, -0-35, -0-36, -0-21, 0-04, 0-55. We have to select from the nine units, three to receive T1, etc. Le the sum over those units receiving T1 of 61, 62. This gives us six n arrangement that minimizes the sum of squares of these six numbers is very nearly, or exactly, the most precise arrangement. Trial and error shows this arrangement to be x -l13-0-9-0.8-0-5-0-2 0.0 0 4 0.7 1.1 T, T3 T2 T2 T3 Tl Tl T3 T2

D. R. Cox 157 (a non-randomized block design) with T1 T2 T3 S61-0417 0 14 0.05 62 0.00 0-25 - 0-26 The sum of squares of these numbers is Smin = 01811, and the value of I1 is, in general, approximately (Cox, 1951) It -(1 k(t - ) Srn. {1 Smin./k}1, (15) which in this case is 1-03. (The residual degrees of freedom would be 4 if both linear and quadratic trend were removed and 5 if only linear trend is removed.) In general the systematic search for the optimum arrangement is tedious although it is usually possible to find quickly an arrangement that is nearly the best. If the degree of the polynomial is small compared with k, it will usually be possible to find an arrangement with 1(7) negligibly greater than 1. The disadvantage of this method is the lack of randomization; the method is of most value in single small experiments. 5. DiscusSION In deciding what design to use in a particular case, we should consider (i) the values for imprecision given above; (ii) the extent to which departure from assumed conditions is likely to affect (i); (iii) the importance to be attached to simplicity of design, and analysis; (iv) the extent to which considerations other than precision are relevant. The general conclusion from the calculations in? 4 is that the methods based on covariance are preferable to the simpler methods based on blocking only if the correlation coefficient between y and x is at least 0-6, and that under the conditions postulated the systematic design, Method VII, is the most precise. For larger experiments all methods except the first are likely to have I near unity. The main assumptions in the calculations are the linearity of the regression and the normality of the distribution of x. Non-normality of x should have little effect on the efficiency of covariance analysis, while I for the blocking methods will usually be an increasing function of the kurtosis of the distribution of x. If the regression is non-linear but smooth, blocking methods will remain effective, while covariance methods will not, unless the linear component accounts for most of the regression, or multiple covariance used. The methods of design are all simple except Method VII. Details of analysis are, of course, simpler for methods not involving covariance. There are two further considerations. The form of the relation between y and x may be of intrinsic interest, either in helping to understand the experimental material or in giving information useful in the design of further experiments. Also we may suspect that the treatment effects are not independent of x, i.e. that there is a treatment x x interaction. Such an interaction may give useful insight into the mechanism underlying the treatment effects and may also change any practical recommendations to be made from the experiment. If these considerations are relevant we shall normally prefer to use x quantitatively. I am grateful to Dr B. G. Greenberg for very helpful discussion.

158 Use of a concomitant variable in experimental design REFERENCES BARTLEET, M. S. (1938). The approximate recovery of information from replicated experiments with large blocks. J. Agric. Sci. 28, 418-20. COCHRAN, W. G. & Cox, G. M. (1950). Experinental Designs. New York: John Wiley and Son. Cox, D. R. (1951). Some systematic experimental designs. Biometrika, 38, 312-23. FISHER, R. A. (1935). Design of Experiments. Edinburgh: Oliver and Boyd. GouRLAY, N. (1953). Covariance analysis and its applications in psychological research. Brit. J. Statist. Psychol. 6, 25-34. GREENBERG, B. G. (1953). Use of covariance and balancing in analytical surveys. Amer. J. Publ. Hlth, 43, 692-9. TEIcHRoEw, D. (1956). Tables of expected values of order statistics and products of order statistics for samples of size twenty and less from the normal distribution. Ann. Math. Statist. 27, 410-26.