Topic 10: ANOVA models for random and mixed effects Fixed and Random Models in One-way Classification Experiments

References: ST&D Topic 7.5 (15-153), Topic 9.9 (5-7), Topic 15.5 (379-384); the rules for expected MS on ST&D page 381 are replaced by Chapter 8 of Montgomery, 1991.

10.1 Introduction

The experiments discussed in previous chapters have dealt primarily with situations in which the researcher is concerned with making comparisons among specific factor levels. There are other types of experiments, however. For example, one may wish to determine the sources of variation in a system rather than make particular comparisons. One may also wish to draw conclusions that extend beyond the set of specific factor levels included in the experiment. The purpose of this chapter is to introduce analysis of variance models that are appropriate to these different kinds of experimental objectives.

10.2 Fixed and Random Models in One-way Classification Experiments

10.2.1 Fixed-effects model

A typical fixed-effect analysis of an experiment involves comparing treatment means to one another in an attempt to detect differences. For example, four different forms of nitrogen fertilizer are compared in a CRD with five replications. The analysis of variance takes the following familiar form:

    Source                           df
    Total                            19
    Treatment (i.e. among groups)     3
    Error (i.e. within groups)       16

And the linear model for this experiment is: $Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$.

This experiment involves four specific treatment levels (e.g. urea, ammonia, ammonium nitrate, and 30% urea:ammonium nitrate solution). The experimenter decided these were the levels of interest, selected them specifically for investigation, and has no thought for any other N-fertilizers in the analysis. In other words, the experimenter's attention is fixed upon these four fertilizer treatments and no others. If the experiment were to be repeated, these same four forms of nitrogen fertilizer would be used again.

Treatment (or factor) effects are called fixed effects when the $\tau_i$'s remain the same in each replication of the experiment. Even if the experiment is not replicated, one can think of a probabilistic model in which there are very many possible outcomes and the actual experiment selects one of them. In this context, the term fixed effects means that in each possible outcome, the $\tau_i$'s have the same value. Also, the $\tau_i$'s must add to 0 since the mean over all the treatments is $\mu$.

This ANOVA model is called the fixed-effects model or Model I ANOVA, and it is the one we have considered up to this point in the class. In such experiments, the $\varepsilon_{ij}$ (i.e. the residuals) are a random sample from a normally distributed population of errors with mean 0 and variance $\sigma^2$.

10.2.2 Random-effects model

In a random-effects model, the treatment levels themselves are a random sample drawn from a larger population of treatment levels. In this case, the linear model looks very similar:

$Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$

But now the treatment effect ($\tau_i$) is a random variable. While the population of these treatment effects has mean 0 and variance $\sigma^2_\tau$, for any given random sample of treatment levels, $\sum_i \tau_i \neq 0$.

The objective of the researcher is to extend the conclusions based on a sample of treatment levels to all treatment levels in the population. Because the treatment levels are a random sample from a larger population of treatment levels (i.e. the treatment effect is a random variable), conclusions about any one specific level are relatively useless. Said another way, if the experiment were to be repeated, an entirely new random sample of treatment levels would be investigated; so clearly the objective is not to compare specific treatment levels to one another. In fact, the null hypothesis of a random-effects ANOVA is quite different from its fixed-effects counterpart:

$H_0$ for a Fixed-effects ANOVA: $\tau_1 = \tau_2 = \dots = \tau_t = 0$
$H_0$ for a Random-effects ANOVA: $\sigma^2_\tau = 0$

When the null hypothesis is false, the treatment mean square contains an additional component of variance equal to $r\sigma^2_\tau$. The aim of the researcher is to test for the presence of this additional component of variance and to estimate its magnitude.

As the linear model for the one-way ANOVA suggests, the computation is the same for both the fixed and the random models. However, the objectives and the conclusions are quite different. Subsequent analysis following the initial test for significance is also different.
Example: In an animal breeding experiment conducted to estimate the breeding value (or combinability) of sires from a certain breed, several sires were randomly selected from a population and each sire was mated with several dams. The weights of all the newborn animals were recorded. The experimental model is $Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$, where $\tau_i$ is the effect of the $i$th sire (i.e. the difference between the $i$th sire and the overall mean). The other terms are the same as in the fixed-effects model.

Let us assume that four sires were chosen for the study and that each was mated with 6 dams. In this case, the four sires are of no specific interest; they are merely a random sample from a population of sires of the breed. They are interesting only to the extent that they represent their population. If the experiment were to be repeated, another set of sires very likely would be used.

The $\tau_i$'s are a random sample from a population of $\tau_i$'s with mean 0. In general, however, the $\tau_i$'s will be different in each replication of the experiment and will not sum to zero for any particular replication. This lies at the crux of a random-effects model (sometimes called Model II). If we let $\bar{Y}_{i.}$ represent the mean breeding value of the $i$th sire and $\mu$ represent the average breeding value of sires from that population, then for any given sample of sires:

$\sum_i (\bar{Y}_{i.} - \mu) \neq 0$

This sum will not equal zero unless the summation covers the entire population ($i = 1, \dots, N$).

Sometimes the determination of whether an effect is fixed or random is not obvious. Examples are laboratories or technicians in a comparative study, years in a multiple-year trial, locations in a multiple-location study, etc. These factors can be fixed or random depending on the objective of the study, the intended inferences to be made, and the process by which the levels of the factors were selected.

10.2.3 Differences between fixed- and random-effects models

Although the linear models for the above two types of single-classification experiments are similar, there are some fundamental differences worth noting:

1. The objectives are different. In the fertilizer experiment, each fertilizer is of specific interest. The four fertilizers are not a random sample from some larger population of fertilizers. The purpose of the study is to compare these particular treatment levels to one another. In the breeding study, however, the objective of the study is to estimate the combinability of a breed. The sires used in the study are merely a sample from which inferences are to be made concerning the population.
The purpose of a fixed model is to test the hypothesis that the treatment effects are the same; the purpose of a random model is to estimate the component of variance.

2. The sampling procedures are different. In the fixed-effects experiment, the treatment levels are selected purposefully (i.e. not randomly) by the investigator. In the random-effects experiment, the treatment levels are selected randomly, and the unknown variance of the population of treatment effects contributes to the total sum of squares. If the first experiment is repeated, the four fertilizers will be used again and only the random errors change from experiment to experiment (i.e. the $\tau_i$'s are assumed to be constants; only the $\varepsilon_{ij}$'s change). In the second experiment, the four sires most likely will differ each time the study is conducted. Thus not only the errors vary but the sire effects ($\tau_i$'s) vary as well.

3. The expected sums of the effects are different. Because the effects are constants, in the fixed-effects model:

$\sum_i (\bar{Y}_{i.} - \mu) = \sum_i \tau_i = 0$

But for any given sample in the random-effects model:

$\sum_i (\bar{Y}_{i.} - \mu) = \sum_i \tau_i \neq 0$

4. And, therefore, the expected variances are different.

For Model I: $\mathrm{Var}(Y_{ij}) = \mathrm{Var}(\mu) + \mathrm{Var}(\tau_i) + \mathrm{Var}(\varepsilon_{ij}) = \mathrm{Var}(\varepsilon_{ij}) = \sigma^2$

For Model II: $\mathrm{Var}(Y_{ij}) = \mathrm{Var}(\mu) + \mathrm{Var}(\tau_i) + \mathrm{Var}(\varepsilon_{ij}) = \mathrm{Var}(\tau_i) + \mathrm{Var}(\varepsilon_{ij}) = \sigma^2_\tau + \sigma^2$

($\sigma^2_\tau$ and $\sigma^2$ are called variance components.)

The expected mean squares (EMS) for these models:

    Source      df        EMS, Fixed Model                       EMS, Random Model
    Treatment   t-1       $\sigma^2 + r\sum_i \tau_i^2/(t-1)$    $\sigma^2 + r\sigma^2_\tau$
    Error       t(r-1)    $\sigma^2$                             $\sigma^2$

Example: Suppose five cuts of meat are taken from each of three pigs, all from the same breed, and the fat content is measured in each cut. If the goal were to compare the fat content among these five cuts of meat, this would be an example of a fixed-effects model with five treatment levels (cuts) and three replications (pigs). If, however, the goal is to assess animal-to-animal and within-animal variation, the particular three pigs selected are not of interest. This would be an example of a random-effects model with three treatment levels (pigs) and five subsamples (cuts).

Suppose the goal is indeed to assess components of variation. The analysis reveals a treatment mean square of 80 and an error mean square of 20. The resultant F ratio = 80/20 = 4, with 2 and 12 degrees of freedom. This number is larger than the critical value and hence is declared significant at the 5% level. We therefore conclude that $\sigma^2_\tau > 0$; but how big is it? The table above implies that:

$80 = r\sigma^2_\tau + \sigma^2$

Since $r = 5$ and $\sigma^2 = 20$, we can solve:

$80 = 5\sigma^2_\tau + 20 \;\Rightarrow\; \sigma^2_\tau = 12$

This number (12) is an estimate of $\sigma^2_\tau$, the variance component due to animal-to-animal differences. This study shows that fat content is nearly twice as variable among cuts within a pig of this breed (20) as it is among pigs of this breed (12).

10.3 Two-way classification experiments

In the single-factor case, the model must specify whether the factor is characterized by either fixed or random effects. In the multifactor case, these two possibilities are joined by a third, the mixed-effects model, in which some factors are fixed and some are random. An examination of this model is useful because it provides a good opportunity to contrast fixed and random effects.

Suppose an experimenter conducts a field study of several varieties tested across a number of locations. In each location, a completely randomized design is used, with each variety replicated over r plots. Let $Y_{ijk}$ represent the plot yield of the $k$th plot of the $i$th variety at the $j$th location. The linear model:

$Y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \varepsilon_{ijk}$

where $\mu$ is the overall mean yield, $\tau_i$ is the effect on yield of the $i$th variety, $\beta_j$ is the effect on yield of the $j$th location, $(\tau\beta)_{ij}$ is the effect on yield of the interaction between varieties and locations, and $\varepsilon_{ijk}$ is the usual random error term with mean 0 and variance $\sigma^2$. There are several possible models for this study.
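Before turning to the individual two-way models, the variance-component arithmetic from the pig fat-content example can be written out as a short sketch. It simply equates the observed mean squares to their expectations in the one-way random model; the function name `variance_components` is a hypothetical helper introduced here for illustration, not part of any package.

```python
def variance_components(ms_treatment, ms_error, r):
    """Method-of-moments estimates for the one-way random-effects model.

    Equates observed mean squares to their expectations:
        E(MS treatment) = sigma2_error + r * sigma2_treatment
        E(MS error)     = sigma2_error
    """
    sigma2_error = ms_error
    sigma2_treatment = (ms_treatment - ms_error) / r
    return sigma2_treatment, sigma2_error


# Values from the pig example: MST = 80, MSE = 20, r = 5 cuts per pig.
f_ratio = 80 / 20
among_pigs, within_pigs = variance_components(80, 20, r=5)
print(f_ratio, among_pigs, within_pigs)  # 4.0 12.0 20
```

Note that this estimator can return a negative value when the treatment mean square is smaller than the error mean square; the sketch does not handle that case.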

10.3.1 Fixed-effects model

If the experimenter is only interested in these particular selected varieties grown at these particular selected locations and conclusions are not to be generalized to other varieties or locations, then the model is a fixed-effects model. In this case:

1. $\mu$ is the overall mean yield in the study.
2. $\tau_i = \bar{Y}_{i..} - \mu$ is the true effect of the $i$th variety; and $\sum_i \tau_i = 0$.
3. $\beta_j = \bar{Y}_{.j.} - \mu$ is the true effect of the $j$th location; and $\sum_j \beta_j = 0$.
4. $(\tau\beta)_{ij} = \bar{Y}_{ij.} - \bar{Y}_{i..} - \bar{Y}_{.j.} + \mu$ is the specific interaction effect due to the $i$th variety and the $j$th location.

The variance of $Y_{ijk}$ is $\sigma^2$. In this model, the experimenter is interested in estimating and testing hypotheses about $\tau_i$, $\beta_j$, and $(\tau\beta)_{ij}$.

10.3.2 Random-effects model

Now suppose the varieties are randomly chosen from a population of varieties and the investigator is interested in characterizing the yield variation within a population of varieties (e.g. pre-Green Revolution wheat varieties). Similarly, the locations are randomly selected from numerous possible testing sites representing some larger region; therefore, the specific locations in the trial are of no particular interest. In this situation:

1. $\mu$ is the overall mean yield of all varieties at all possible locations (only a sample of those varieties and locations were included in the study).
2. $\tau_i = \bar{Y}_{i..} - \mu$ is a random effect from a population of effects with mean 0 and variance $\sigma^2_\tau$.
3. $\beta_j = \bar{Y}_{.j.} - \mu$ is a random effect from a population of effects with mean 0 and variance $\sigma^2_\beta$.
4. $(\tau\beta)_{ij}$ is the random interaction effect from a population of all possible interaction effects between varieties and locations, with mean 0 and variance $\sigma^2_{\tau\beta}$.

The variance of $Y_{ijk}$ in this model is $\sigma^2_\tau + \sigma^2_\beta + \sigma^2_{\tau\beta} + \sigma^2$. In this model, the experimenter is interested in estimating and testing hypotheses about $\sigma^2_\tau$, $\sigma^2_\beta$, and $\sigma^2_{\tau\beta}$.

10.3.3 Mixed-effects model

Now suppose the varieties are specifically chosen for comparison but the locations are randomly selected from many possible locations in order to examine the consistency of the varietal yields under various environmental conditions. In this case:

1. $\mu$ is the overall mean yield of these particular varieties at all possible locations.
2. $\tau_i = \bar{Y}_{i..} - \mu$ is the true effect of the $i$th variety; and $\sum_i \tau_i = 0$.
3. $\beta_j = \bar{Y}_{.j.} - \mu$ is the random effect of the $j$th location, drawn from a population of effects with mean 0 and variance $\sigma^2_\beta$.
4. $(\tau\beta)_{ij} = \bar{Y}_{ij.} - \bar{Y}_{i..} - \bar{Y}_{.j.} + \mu$ is the interaction effect of the $i$th variety in the $j$th location. Since locations are random, this interaction is usually considered to be a random effect with mean 0 and variance $\sigma^2_{\tau\beta}$.

Here the variance of $Y_{ijk}$ is $\sigma^2_\beta + \sigma^2_{\tau\beta} + \sigma^2$. In this model, the experimenter is interested in estimating and testing hypotheses about $\tau_i$, $\sigma^2_\beta$, and $\sigma^2_{\tau\beta}$.

10.3.4 Expected mean squares and F tests

In all of the examples of fixed models discussed in previous topics, the error term has always been the one on the last line of the ANOVA table: whatever unexplained variation is left over after partitioning the total sum of squares. The proper tool for determining the appropriate error variances for more complex situations is the set of expected mean squares. Expected mean squares are algebraic expressions which specify the underlying model parameters that are estimated by the calculated mean squares (i.e. the mean squares which result from partitioning the sum of squares for a particular dataset). Generally, these expected mean squares are linear functions of elements that represent:

1. The error variances
2. Functions of variances of random effects
3. Functions of sums of squares and products (quadratic forms) of fixed effects

Below is a table of the expected mean squares of three two-way classification experiments, featuring a varieties, b locations, and r replications:

    Source   df           MS     Fixed Model                                                 Random Model                                        Mixed Model (Fixed Var, Random Loc)
    Var      a-1          MSV    $\sigma^2 + br\sum_i \tau_i^2/(a-1)$                        $\sigma^2 + r\sigma^2_{\tau\beta} + br\sigma^2_\tau$   $\sigma^2 + r\sigma^2_{\tau\beta} + br\sum_i \tau_i^2/(a-1)$
    Loc      b-1          MSL    $\sigma^2 + ar\sum_j \beta_j^2/(b-1)$                       $\sigma^2 + r\sigma^2_{\tau\beta} + ar\sigma^2_\beta$  $\sigma^2 + r\sigma^2_{\tau\beta} + ar\sigma^2_\beta$
    V*L      (a-1)(b-1)   MSVL   $\sigma^2 + r\sum_i\sum_j (\tau\beta)_{ij}^2/((a-1)(b-1))$  $\sigma^2 + r\sigma^2_{\tau\beta}$                     $\sigma^2 + r\sigma^2_{\tau\beta}$
    Error    ab(r-1)      MSE    $\sigma^2$                                                  $\sigma^2$                                          $\sigma^2$

There is controversy among statisticians regarding the inclusion of interactions between fixed and random factors in the EMS of the random factors. Steel, Torrie, and Dickey, for example, exclude the $r\sigma^2_{\tau\beta}$ term from the EMS of the random variable. However, Hocking (Journal of the American Statistical Association 1975 70: 706-71) pointed out that exclusion of $r\sigma^2_{\tau\beta}$ is inconsistent with results commonly reported for the unbalanced case. We will follow Hocking's approach; that is, we will include the interactions between fixed and random factors in the EMS of random factors in a mixed model.

Based on the expected mean squares, correct denominators can be identified to perform the appropriate F tests on locations, varieties, and their interaction. The appropriate F tests for the three previous models are shown in the table below:

    Source     Fixed       Random      Mixed
    Variety    MSV/MSE     MSV/MSVL    MSV/MSVL
    Location   MSL/MSE     MSL/MSVL    MSL/MSVL
    V*L        MSVL/MSE    MSVL/MSE    MSVL/MSE

Note that the appropriate F tests change, depending on the model. The underlying principle of an F test on a set of fixed-effect parameters is that the expected mean square for the denominator contains a linear function of variances of random effects, whereas the expected mean square for the numerator contains the same function of these variances plus a component attributable to the parameter being tested.

In fixed models, each mean square other than the MSE estimates the residual error variance plus a component attributable to the parameter in question. Hence the proper denominator for all tests is the MSE. That is why determination of the expected mean squares is usually not required for fixed models. In random and mixed models, however, it is essential to determine the expected mean squares in order to select the correct denominators for all F tests.

In the random model, one common objective is to extend one's conclusions to the complete population of treatment levels (e.g. locations, as in the examples above). Therefore, it makes sense to divide by the MS of the interaction. Why? Because, for conclusions about varieties to be valid across all locations, the variety effects must be larger than any genotype x environment interaction.
In Topic 9.7, it was pointed out that a significant interaction in a fixed-effects model leads us to forfeit tests of hypotheses for main effects in favor of tests of hypotheses for simple effects. However, for a mixed-effects model, we may not be at all interested in simple effects for a fixed effect, since they are measured at randomly selected levels of another factor. Instead, we remain interested in the main effects, even when there is an interaction between the random and the fixed factors.
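The choice of denominator in the table of F tests can be sketched mechanically: for each source, look for another mean square whose EMS contains exactly the same components except the one being tested. The sketch below is illustrative only. The set-based encoding and the names `FIXED`, `MIXED`, `OWN`, and `f_denominator` are assumptions made here; coefficients are omitted on the assumption that, in these balanced two-way tables, a shared component carries the same coefficient in every EMS.

```python
# Expected mean squares written as sets of component labels (coefficients
# omitted; see the assumption stated above). "V", "L", "VL" stand for the
# variety, location, and interaction components; "s2_e" is the error variance.
FIXED = {
    "Var":   {"s2_e", "V"},
    "Loc":   {"s2_e", "L"},
    "VxL":   {"s2_e", "VL"},
    "Error": {"s2_e"},
}
MIXED = {  # varieties fixed, locations random (Hocking's convention)
    "Var":   {"s2_e", "VL", "V"},
    "Loc":   {"s2_e", "VL", "L"},
    "VxL":   {"s2_e", "VL"},
    "Error": {"s2_e"},
}
OWN = {"Var": "V", "Loc": "L", "VxL": "VL"}  # component tested by each source


def f_denominator(ems, source):
    """Return the source whose EMS equals EMS(source) minus the tested component."""
    target = ems[source] - {OWN[source]}
    for name, components in ems.items():
        if name != source and components == target:
            return name
    return None  # no exact F test exists for this source


for model_name, ems in (("fixed", FIXED), ("mixed", MIXED)):
    tests = {source: f_denominator(ems, source) for source in OWN}
    print(model_name, tests)
# fixed {'Var': 'Error', 'Loc': 'Error', 'VxL': 'Error'}
# mixed {'Var': 'VxL', 'Loc': 'VxL', 'VxL': 'Error'}
```

In this presence/absence encoding the random model has the same structure as the mixed model, which is why both share the same column of denominators.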

10.3.4.1 Rules for determining Expected Mean Squares (EMS)

To perform an ANOVA, one must first determine the sum of squares for each component in the model as well as the number of degrees of freedom associated with each sum of squares. Then, to construct appropriate test statistics, the expected mean squares must be determined. In complex design situations, particularly those involving random or mixed models, it is frequently helpful to have a formal procedure for this process.

The appropriate test statistic (F) is a ratio of mean squares that is chosen such that the expected value of the numerator differs from the expected value of the denominator only by the specific variance component or fixed factor being tested.

The following set of rules is appropriate for the manual calculation of the expected mean squares for any balanced factorial, nested, or nested factorial experiment. (Note that partially balanced arrangements, such as Latin squares and incomplete block designs, are specifically excluded.) To illustrate the application of these rules, a two-factor factorial model is used.

RULE 1
In addition to the error term, the model contains all the main effects and any interactions that the experimenter assumes exist. In other words, the model is the linear model; it contains all the effects that would appear in the lm() statement in R. Furthermore, the error term in the model, $\varepsilon_{ij \dots m}$, is written as $\varepsilon_{(ij \dots)m}$, where the subscript m denotes the replication subscript. For the two-factor model, this rule implies that $\varepsilon_{ijk}$ becomes $\varepsilon_{(ij)k}$. Subsamples, if they appear in the model, are similarly "nested" within their higher-level classification variables.
RULE 2
For each term in the model, the subscripts fall into three classes:

Main: those subscripts present in the term and not in parentheses
Contingent: those subscripts present in the term and in parentheses (usually subscripts of replications and subsamples/nested factors)
Absent: those subscripts present in the model but not in that particular term

For example, in $(\tau\beta)_{ij}$, i and j are main subscripts and k is absent; and in $\varepsilon_{(ij)k}$, k is a main subscript and both i and j are contingent.

Note that the number of degrees of freedom for any term in the model is the product of the number of levels associated with each contingent subscript and the number of levels minus 1 associated with each main subscript. For example:

$df_{(\tau\beta)} = (a-1)(b-1)$; $df_{\varepsilon} = ab(r-1)$
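The degrees-of-freedom statement in Rule 2 translates directly into a product over subscripts. The following is a minimal sketch; the function name `degrees_of_freedom` and the level counts (a = 4, b = 3, r = 5) are hypothetical values chosen only for illustration.

```python
from math import prod


def degrees_of_freedom(main, contingent, levels):
    """df = product of (levels - 1) over main subscripts
            times product of levels over contingent subscripts."""
    return (prod(levels[s] - 1 for s in main)
            * prod(levels[s] for s in contingent))


# Hypothetical two-factor layout: a = 4 (subscript i), b = 3 (j), r = 5 (k).
levels = {"i": 4, "j": 3, "k": 5}

df_interaction = degrees_of_freedom(main="ij", contingent="", levels=levels)
df_error = degrees_of_freedom(main="k", contingent="ij", levels=levels)
print(df_interaction, df_error)  # 6 48, i.e. (a-1)(b-1) and ab(r-1)
```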

RULE 3
Each effect has either a variance component (random effect) or a fixed factor effect associated with it. If an interaction contains at least one random effect, the entire interaction is considered to be a random effect. For example, in a two-factor mixed model with factor A fixed and factor B random, the variance component for B is $\sigma^2_\beta$ and the variance component for AB is $\sigma^2_{\tau\beta}$. A fixed effect is always represented by the sum of squares of the model components associated with that factor, divided by its degrees of freedom. In our example, the fixed effect of A is:

$\sum_i \tau_i^2 / (a-1)$

THE METHOD
To obtain the expected mean squares, prepare a table with a row for each model component and a column for each subscript. Over each subscript, write the number of levels of the factor associated with that subscript and whether the factor is fixed (F) or random (R). Blocks, replicates, and subsamples are always considered to be random.

(a) In each row, write 1 if a contingent subscript matches the subscript in the column:

    Fixed or Random           F    R    R
    Number of levels          a    b    r
    Factor                    i    j    k
    $\tau_i$
    $\beta_j$
    $(\tau\beta)_{ij}$
    $\varepsilon_{(ij)k}$     1    1

(b) In each row, if any of the subscripts match the subscript in the column, write 0 if the column is a fixed factor and 1 if the column is a random factor. For interactions involving at least one random factor, write 1 in the columns that match the row subscripts, independent of the nature (fixed or random) of that column:

    Fixed or Random           F    R    R
    Number of levels          a    b    r
    Factor                    i    j    k
    $\tau_i$                  0
    $\beta_j$                      1
    $(\tau\beta)_{ij}$        1    1
    $\varepsilon_{(ij)k}$     1    1    1

Note the inclusion of a 1 in the i column for the interaction term $(\tau\beta)_{ij}$. This will determine the presence of $r\sigma^2_{\tau\beta}$ in the EMS of the random factor B.

(c) In the remaining empty cells, write the number of levels shown above the column heading:

    Fixed or Random           F    R    R
    Number of levels          a    b    r
    Factor                    i    j    k
    $\tau_i$                  0    b    r
    $\beta_j$                 a    1    r
    $(\tau\beta)_{ij}$        1    1    r
    $\varepsilon_{(ij)k}$     1    1    1

(d) To obtain the EMS for any model component (i.e. row), cover all columns headed by main subscripts associated with that component. Then, in each row that contains at least the same subscripts as those of the component being considered, take the product of the visible numbers and multiply by the appropriate fixed or random factor from Rule 3. The sum of these quantities is the EMS of the model component being considered.

To find $EMS_A$, for example, cover column i (i is the index associated with factor A). The products of the visible numbers in the rows that contain at least subscript i are br (row 1), r (row 3), and 1 (row 4). Note that i is missing in row 2.

The expected mean squares derived using these rules for the two-way mixed model are:

    Fixed or Random           F    R    R
    Number of levels          a    b    r
    Factor                    i    j    k    Expected Mean Squares
    $\tau_i$                  0    b    r    $\sigma^2 + r\sigma^2_{\tau\beta} + br\sum_i \tau_i^2/(a-1)$
    $\beta_j$                 a    1    r    $\sigma^2 + r\sigma^2_{\tau\beta} + ar\sigma^2_\beta$
    $(\tau\beta)_{ij}$        1    1    r    $\sigma^2 + r\sigma^2_{\tau\beta}$
    $\varepsilon_{(ij)k}$     1    1    1    $\sigma^2$
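Steps (a) through (d) are mechanical enough to be sketched in code. The sketch below is one possible encoding, not a reference implementation: the function name `ems_table`, the input format, and the output notation are all assumptions made here. In the output, `var(X)` stands for the variance component of a random term X and `fix(X)` for the quadratic form of a fixed term X (for example $\sum_i \tau_i^2/(a-1)$ for A); terms are listed in model order rather than in the order used in the table above.

```python
def ems_table(factors, terms):
    """Symbolic expected mean squares for a balanced design.

    factors: {subscript: (levels_symbol, is_random)}, in column order.
    terms:   list of (label, main_subscripts, contingent_subscripts),
             with subscripts given as strings such as "ij".
    """
    def is_random(main):
        # Rule 3: a term is random if any of its main subscripts is random.
        return any(factors[s][1] for s in main)

    # Steps (a)-(c): fill each row with 1 (contingent), 0 or 1 (main),
    # or the number of levels (absent subscript).
    rows = {}
    for label, main, contingent in terms:
        row = {}
        for s, (n_levels, _) in factors.items():
            if s in contingent:
                row[s] = 1
            elif s in main:
                row[s] = 1 if is_random(main) else 0
            else:
                row[s] = n_levels
        rows[label] = row

    # Step (d): cover the columns of the term's main subscripts and sum the
    # products of visible entries over rows containing all of its subscripts.
    result = {}
    for label, main, contingent in terms:
        needed = set(main) | set(contingent)
        parts = []
        for other, o_main, o_cont in terms:
            if not needed <= set(o_main) | set(o_cont):
                continue
            visible = [rows[other][s] for s in factors if s not in main]
            if 0 in visible:
                continue  # a zero coefficient removes the component
            coef = "".join(v for v in visible if v != 1)
            kind = "var" if is_random(o_main) else "fix"
            parts.append(f"{coef}*{kind}({other})" if coef else f"{kind}({other})")
        result[label] = " + ".join(parts)
    return result


# Two-way mixed model: A fixed (i, a levels), B random (j, b levels), r reps (k).
factors = {"i": ("a", False), "j": ("b", True), "k": ("r", True)}
terms = [("A", "i", ""), ("B", "j", ""), ("AB", "ij", ""), ("E", "k", "ij")]
for label, ems in ems_table(factors, terms).items():
    print(label, "->", ems)
# A -> br*fix(A) + r*var(AB) + var(E)
# B -> ar*var(B) + r*var(AB) + var(E)
# AB -> r*var(AB) + var(E)
# E -> var(E)
```

Because a mixed interaction receives a 1 in the fixed column, the sketch follows Hocking's convention and keeps `r*var(AB)` in the EMS of the random factor B.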

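10.4 Expected Mean Squares for a three-way ANOVA

Now consider a three-factor factorial experiment with a levels of factor A, b levels of factor B, c levels of factor C, and r replicates. The analysis of this design, assuming that A is fixed and B and C are both random, is given below. The linear model is:

$Y_{ijkl} = \mu + \tau_i + \beta_j + \gamma_k + (\tau\beta)_{ij} + (\tau\gamma)_{ik} + (\beta\gamma)_{jk} + (\tau\beta\gamma)_{ijk} + \varepsilon_{(ijk)l}$

    Fixed or Random               F    R    R    R
    Number of levels              a    b    c    r
    Factor                        i    j    k    l    Expected Mean Squares
    $\tau_i$                      0    b    c    r    $\sigma^2 + r\sigma^2_{\tau\beta\gamma} + br\sigma^2_{\tau\gamma} + cr\sigma^2_{\tau\beta} + bcr\sum_i \tau_i^2/(a-1)$
    $\beta_j$                     a    1    c    r    $\sigma^2 + r\sigma^2_{\tau\beta\gamma} + ar\sigma^2_{\beta\gamma} + cr\sigma^2_{\tau\beta} + acr\sigma^2_\beta$
    $\gamma_k$                    a    b    1    r    $\sigma^2 + r\sigma^2_{\tau\beta\gamma} + ar\sigma^2_{\beta\gamma} + br\sigma^2_{\tau\gamma} + abr\sigma^2_\gamma$
    $(\tau\beta)_{ij}$            1    1    c    r    $\sigma^2 + r\sigma^2_{\tau\beta\gamma} + cr\sigma^2_{\tau\beta}$
    $(\tau\gamma)_{ik}$           1    b    1    r    $\sigma^2 + r\sigma^2_{\tau\beta\gamma} + br\sigma^2_{\tau\gamma}$
    $(\beta\gamma)_{jk}$          a    1    1    r    $\sigma^2 + r\sigma^2_{\tau\beta\gamma} + ar\sigma^2_{\beta\gamma}$
    $(\tau\beta\gamma)_{ijk}$     1    1    1    r    $\sigma^2 + r\sigma^2_{\tau\beta\gamma}$
    $\varepsilon_{(ijk)l}$        1    1    1    1    $\sigma^2$

Proper tests for the two-factor interactions and the three-factor interaction can be performed using different MS from this table. However, no exact tests exist for the main effects of A, B, or C. That is, if we wish to test the hypothesis $\sum_i \tau_i^2/(a-1) = 0$, we cannot form a ratio of two expected mean squares such that the only term in the numerator that is not in the denominator is $bcr\sum_i \tau_i^2/(a-1)$. However, it is likely that tests on the main effects are of central importance to the experimenter. This problem is considered in the next section.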

10.5 Approximate F tests: synthetic errors

In factorial experiments with three or more factors, at least one of which is a random factor, and in certain other, more complex designs, there are frequently no exact test statistics for certain effects in the models. One possible "solution" to this dilemma is to assume that certain interactions are negligible. Although this seems an attractive possibility, there must be something in the nature of the system (or some strong prior knowledge) to justify such an assumption. In general, this assumption is not easily made, nor should it be taken lightly. We should not eliminate certain interactions from the model without conclusive evidence that it is appropriate to do so.

A procedure advocated by some experimenters is to test the interactions first and then set to zero those interactions that are found to be insignificant. Once this is done, one can then assume that those non-significant interactions are zero when testing other effects in the same experiment. Although sometimes done in practice, this procedure can be dangerous because any decision regarding an interaction is subject to both Type I and Type II errors.

If we cannot assume that certain interactions are negligible but we still need to make inferences about those effects for which exact tests do not exist, a procedure attributed to Satterthwaite (1946) can be employed. Satterthwaite's method utilizes linear combinations of extant mean squares, for example:

$MS' = MS_A + MS_B + \dots + MS_X$ and $MS'' = MS_\alpha + MS_\beta + \dots + MS_\chi$

where:

a. the mean square components are chosen so that no MS appears simultaneously in $MS'$ and $MS''$; and
b. $E(MS') - E(MS'')$ is equal to the effect (the model parameter or variance component) considered in the null hypothesis.

Under these conditions, the test statistic becomes:

$F = MS' / MS''$

which is distributed approximately as $F_{p,q}$, where p and q are the effective degrees of freedom for the numerator and denominator, respectively:

$p = \dfrac{(MS_A + \dots + MS_X)^2}{MS_A^2/df_A + \dots + MS_X^2/df_X}$ and $q = \dfrac{(MS_\alpha + \dots + MS_\chi)^2}{MS_\alpha^2/df_\alpha + \dots + MS_\chi^2/df_\chi}$

In these expressions for p and q, $df_i$ is the number of degrees of freedom associated with the mean square $MS_i$. There is no assurance that p and q will be integers, so it will be necessary to interpolate in the tables of the F distribution. For example, in the three-factor mixed-effects model discussed above, it is relatively easy to see that an appropriate test statistic for $H_0$: $\tau_1 = \dots = \tau_a = 0$ would be:

$F = \dfrac{MS_A + MS_{ABC}}{MS_{AB} + MS_{AC}}$

Why? Let's re-express this ratio in terms of expected mean squares (refer back to the EMS table in Section 10.4):

$\dfrac{E(MS_A) + E(MS_{ABC})}{E(MS_{AB}) + E(MS_{AC})} = \dfrac{[\sigma^2 + r\sigma^2_{\tau\beta\gamma} + br\sigma^2_{\tau\gamma} + cr\sigma^2_{\tau\beta} + bcr\sum_i \tau_i^2/(a-1)] + [\sigma^2 + r\sigma^2_{\tau\beta\gamma}]}{[\sigma^2 + r\sigma^2_{\tau\beta\gamma} + cr\sigma^2_{\tau\beta}] + [\sigma^2 + r\sigma^2_{\tau\beta\gamma} + br\sigma^2_{\tau\gamma}]}$

Notice that this ratio equals 1 if and only if the effect of Factor A ($\sum_i \tau_i^2/(a-1)$) is 0. This ratio of linear combinations of mean squares is, therefore, an appropriate test for Factor A. The degrees of freedom for this test can be estimated using the above equations for p and q. Since no mean square appears in both the numerator and denominator, the numerator and denominator are independent of one another.

In general, it is better to construct linear combinations of mean squares in the numerator and denominator through addition rather than through subtraction [e.g. $F = MS_A / (MS_{AB} + MS_{AC} - MS_{ABC})$]. This is because negative signs in such linear functions can lead to difficulties (ST&D pg. 380). Nevertheless, some software packages (e.g. SAS) use the subtraction method.
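The effective degrees of freedom and the synthetic F ratio can be computed in a few lines. The sketch below applies the formulas for p and q to the test of Factor A; the function names and all mean squares and degrees of freedom are hypothetical values invented for illustration (they correspond to a = 3, b = 4, c = 5), not results from any dataset.

```python
def satterthwaite_df(components):
    """Effective df for a sum of mean squares.

    components: list of (mean_square, df) pairs.
    Returns (sum of MS)^2 / sum(MS^2 / df).
    """
    total = sum(ms for ms, _ in components)
    return total ** 2 / sum(ms ** 2 / df for ms, df in components)


def approximate_f(numerator, denominator):
    """Synthetic F ratio with its effective numerator and denominator df."""
    f = sum(ms for ms, _ in numerator) / sum(ms for ms, _ in denominator)
    return f, satterthwaite_df(numerator), satterthwaite_df(denominator)


# Hypothetical (MS, df) values for a = 3, b = 4, c = 5:
ms_a, ms_abc = (120.0, 2), (10.0, 24)   # numerator:   MS_A + MS_ABC
ms_ab, ms_ac = (30.0, 6), (20.0, 8)     # denominator: MS_AB + MS_AC

f, p, q = approximate_f([ms_a, ms_abc], [ms_ab, ms_ac])
print(round(f, 3), round(p, 2), round(q, 2))  # 2.6 2.35 12.5
```

With these made-up values, p is close to the 2 df of $MS_A$ because $MS_A$ dominates the numerator, while q falls between the df of the two denominator mean squares and their sum.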

10.5. An additional example with nested effects

The rules outlined in section 10.3.4.1 also apply to nested factorial experiments (i.e. factorial experiments with subsamples). The reason for this is that subsamples force the experimental units into the linear model, and experimental units (i.e. replications) are random, by definition. So even when all factors are fixed effects, nesting transforms the model into a mixed model.

To illustrate the use of the rules with a nested factorial experiment, consider a two-way factorial (fixed A, random B) with C (random) replications in each A x B combination and D subsamples measured on each replication. For all factors other than the error term, subscripts in parentheses give specific meaning to the subscript which precedes the parentheses (i.e. Rep(A*B) indicates that the ID variable "Rep" only has meaning once a specific combination of A and B levels is stated).

    Fixed or Random              F    R    R    R
    Number of levels             a    b    c    d
    Factor                       i    j    k    l    Expected Mean Squares
    $\tau_i$                     0    b    c    d    $\sigma^2 + d\sigma^2_{\gamma(\tau\beta)} + cd\sigma^2_{\tau\beta} + bcd\sum_i \tau_i^2/(a-1)$
    $\beta_j$                    a    1    c    d    $\sigma^2 + d\sigma^2_{\gamma(\tau\beta)} + cd\sigma^2_{\tau\beta} + acd\sigma^2_\beta$
    $(\tau\beta)_{ij}$           1    1    c    d    $\sigma^2 + d\sigma^2_{\gamma(\tau\beta)} + cd\sigma^2_{\tau\beta}$
    $\gamma_{k(ij)}$             1    1    1    d    $\sigma^2 + d\sigma^2_{\gamma(\tau\beta)}$
    $\varepsilon_{l(ijk)}$       1    1    1    1    $\sigma^2$

Note that the results are logical:

1. To test the interaction, we must use the MS of the factor (the EU) nested within the interaction, as with any previous nested design.
2. To test A or B, we must use the interaction ($MS_{AB}$) because our intention is to extend conclusions across the full population of treatment levels from which we drew the B sample of treatment levels. For effects A or B to be significant means they must be significantly larger than their interaction, that is, the differences in responses to A at the different levels of B.