The experimental unit of a study is the object on which measurements are taken.

Oliver Riley
5 years ago
Contents

4 Analysis of Variance (ANOVA)
  4.1 Introduction
    4.1.1 Terminology
    4.1.2 Data
  4.2 One-Way ANOVA
    4.2.1 The Model
    4.2.2 Identifiability
    4.2.3 Model Assumptions
    4.2.4 Inference
    4.2.5 Checking Model Assumptions
    4.2.6 Nonparametric Test
    4.2.7 Contrasts
  4.3 Two-Way ANOVA
    4.3.1 Model
    4.3.2 Parameter Estimation
    4.3.3 Hypothesis Testing
    4.3.4 Randomized Block Design
    4.3.5 Nonparametric Test

4 Analysis of Variance (ANOVA)

4.1 Introduction

The goal is to extend our previous results from two samples to more than two samples. In direct extension of the two-sample case, we can imagine that we collect samples from $T \geq 2$ populations. Another scenario that fits in this framework is the case where a single sample of size $n$ is randomly assigned to $T$ treatments. In this case, we imagine $T$ hypothetical populations. The $t$th population is hypothetically subjected to treatment $t$. The samples randomly assigned to treatment $t$ are like a random sample from the $t$th population.

4.1.1 Terminology

Definition: experimental unit
The experimental unit of a study is the object on which measurements are taken.

Experimental units can be people, computers, animals, classes, universities, etc. Let's consider some examples.

Rats are randomly fed different iron supplements in their food, and iron retention in their bodies is measured 3 days later. The experimental unit is the rat.

Rats live in cages, and all rats in the same cage eat from a communal food bowl. Food bowls were randomly dosed with different iron supplements, and all rats were measured for iron retention 3 days later. The experimental unit is the cage, because rats within cages cannot be assigned to different treatments.

A scientist wants to determine whether childhood exposure to TV commercials impacts obesity. They enroll random families in the study, recording TV usage of the household periodically over several years. During the study, they also record height and weight of children in the family. The experimental unit is the collection of all children in the household, since TV usage of individual children cannot be determined.

When testing chemicals for teratogenesis (the propensity to cause birth defects), pregnant rats are subjected to high doses and the pups are scored for birth defects. Because the individual pups cannot be assigned different treatments, the litter is an experimental unit.

A rabbit is sacrificed and the retinas are used for scientific experiments on neurons (horrible thing, but true). To avoid waste, both retinas are used in separate, independent experiments. The experimental units are the retinas, but one should worry about the forced blocking on rabbit (see the discussion of blocking later). If within each retina, individual amacrine cells are located and independently tested, then the experimental unit is the amacrine cell. There is blocking at two hierarchical levels: the rabbit, and the retina within the rabbit.

Definition: factor
A factor is a variable that is controlled in an experiment. Distinct values of the factor are called levels.

Some examples of factors are: (1) time exposed to a teratogenic chemical, (2) dose of a teratogenic chemical, (3) temperature of a greenhouse, (4) initial number of infectious agents in an agent-based model of an epidemic, etc.

Definition: treatment
A treatment is a specific combination of factor levels to which experimental units may be exposed in an experiment. An example is the dose and time of exposure to a teratogenic chemical.

ANOVA analyses can be classified in several ways.

A one-way ANOVA considers a set of treatments caused by varying one factor. A two-way ANOVA considers a set of treatments caused by varying two factors simultaneously. For example, if one designed a teratogenic study that varied the dose and exposure time to the chemical, then one would be varying two factors, dose and time. One can expand to multi-way ANOVA by including additional factors.

If the populations (or treatments) included in the study are selected by the experimenter and inferences are to be made only about those populations, then the model is called a fixed effects model. If instead the populations are representative of a large collection of populations, many of which are not sampled, and the experimenter wishes to infer properties of all populations, then the model is called a random effects model.

For example, if three antidepressant drugs and a control are applied to four random groups of patients in order to determine which of these drugs can reduce depression symptoms, then the mean treatment responses are fixed effects associated with these three treatments only. In contrast, if three representative antidepressant drugs are applied to three random groups of patients to determine the side effects of taking antidepressants in general, then the observed means for the three drugs are random variables representative of the kinds of effects caused by any antidepressant drug, even those not included in the study.

We will focus on fixed effects ANOVA in these notes.

4.1.2 Data

The data associated with ANOVA might be summarized in a table of the following form:

Treatment
1            2            3            ...   k
$y_{11}$     $y_{21}$     $y_{31}$     ...   $y_{k1}$
$y_{12}$     $y_{22}$     $y_{32}$     ...   $y_{k2}$
...          ...          ...                ...
$y_{1n_1}$   $y_{2n_2}$   $y_{3n_3}$   ...   $y_{kn_k}$

Example: heights of singers in a choir

Suppose, for example, that you are studying the heights of singers in a choir. Your data table is below. [Find the original data at the Data & Story Library.]

[Data table with columns Soprano, Alto, Tenor, Bass; the values did not survive in this copy.]

The treatments are actual populations in this case, i.e. the different singer types (soprano, alto, tenor, bass). In working with this data, you might be interested in determining whether the mean heights of all treatments are the same. Specifically, you might expect a significant difference in the mean heights of basses and sopranos, because most, if not all, of the former are male, and the latter are female.

First Step

The first step in any data analysis is to plot the data. Dr. Zhu introduced you to the R functions boxplot() and stripchart(), which are particularly useful in this context.

Statistics

We now introduce some notation and common statistics that are computed from ANOVA-type data. The sample treatment mean is

$$\bar{Y}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} Y_{ij}.$$
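As a rough illustration of the sample treatment mean, the sketch below computes one mean per group and the mean of all observations pooled together. It is written in Python rather than R, and the heights are made-up numbers chosen for illustration, not the choir data; the variable names are likewise only assumptions.

```python
from statistics import mean

# Hypothetical heights (inches), for illustration only; NOT the choir data.
heights = {
    "soprano": [62.0, 64.0, 65.0, 66.0],
    "alto": [63.0, 65.0, 66.0, 68.0, 70.0],
    "tenor": [68.0, 69.0, 71.0],
    "bass": [69.0, 71.0, 72.0, 74.0],
}

# Sample treatment mean: average of the n_i observations in treatment i.
treatment_means = {voice: mean(values) for voice, values in heights.items()}

# Mean of all N observations pooled together. With unequal n_i this is
# generally not the same as the plain average of the treatment means.
all_values = [y for values in heights.values() for y in values]
overall_mean = mean(all_values)

for voice, m in treatment_means.items():
    print(f"{voice:8s} n={len(heights[voice])} mean={m:.3f}")
print(f"overall  N={len(all_values)} mean={overall_mean:.4f}")
```

Because the group sizes differ here, the pooled mean weights each treatment mean by its sample size.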

The overall sample mean is obtained as

$$\bar{Y}_{\cdot\cdot} = \frac{1}{\sum_{i=1}^{k} n_i} \sum_{i=1}^{k} \sum_{j=1}^{n_i} Y_{ij}.$$

Intuitively, it should be clear that large differences in the sample treatment means may indicate that the treatments affect the quantity being measured. Thus, it should come as no surprise that variation in sample means is exactly the signal used to reject the null hypothesis of no treatment effects. The details are below.

4.2 One-Way ANOVA

4.2.1 The Model

Cell Means Model

$$Y_{ij} = \theta_i + \epsilon_{ij}, \qquad i = 1, 2, \ldots, k, \quad j = 1, 2, \ldots, n_i,$$

where $\theta_i$ are unknown population parameters, $\epsilon_{ij}$ are random errors, $k$ is the number of distinct populations, and $n_i$ is the sample size in the $i$th population. Note, the sample sizes may not be equal.

Note, if we assume $E[\epsilon_{ij}] = 0$, then the expected value of the data is

$$E[Y_{ij}] = \theta_i, \qquad j = 1, 2, \ldots, n_i.$$

Thus, the mean of the data depends only on the treatment. In particular, we conclude that the parameter $\theta_i$ is the population mean of population $i$. Since we focus on fixed effects models, these $k$ populations are the only ones of interest. Therefore, the $\theta_i$ are viewed as unknown constants.

Alternative Parameterization

Often, you will see another parameterization of the one-way ANOVA model,

$$Y_{ij} = \mu + \alpha_i + \epsilon_{ij}.$$

Then,

$$E[Y_{ij}] = \mu + \alpha_i,$$

where $\mu$ is the grand mean and $\alpha_i$ is the unique effect of treatment $i$. Notice the expected difference between two measurements is given as

$$E[Y_{ij} - Y_{kl}] = \mu + \alpha_i - (\mu + \alpha_k) = \alpha_i - \alpha_k,$$

a difference in effects. Note, there are $k + 1$ parameters in this model formulation, and this leads to identifiability problems, discussed in the next section.

4.2.2 Identifiability

Recall our overall framework. We have a population, and associated with it are unknown population parameter(s) $\theta$. We assume there is some probability model that describes data $X$ sampled from this population. The probability model defines the pdf $f_\theta(x)$ (or pmf for discrete outcomes) for the data.

Definition: identifiable
A population parameter $\theta$ is identifiable if distinct $\theta$ correspond to distinct pdfs (or pmfs for discrete random variables). That is, if $\theta \neq \theta'$, then the pdfs of the data, $f_\theta(x)$ and $f_{\theta'}(x)$, are distinct functions.

For example, if $\mu_1 \neq \mu_2$, then the corresponding normal pdfs are not the same:

$$f_{\mu_1}(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x - \mu_1)^2}{2\sigma^2}\right] \neq \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x - \mu_2)^2}{2\sigma^2}\right] = f_{\mu_2}(x),$$

indicating that the population mean of normally distributed random variables is identifiable.

1. Identifiability is a property of the model (not of the estimates of the population parameter), so solving identifiability problems involves changing the model.
2. If a model is not identifiable, then estimation of or inference on its population parameters is not possible.

Alternative Parameterization is Overparameterized

In the alternative formulation, there are $k + 1$ parameters but only $k$ sample means available from the data. This one extra parameter makes the model unidentifiable: more than one choice of $(\mu, \alpha_1, \ldots, \alpha_k)$ can lead to the same pdf. One restriction on the parameters must be added to make the model identifiable. There are multiple choices for that restriction, and they change the way the parameters are interpreted.

$\sum_i \alpha_i = 0$ means that we can interpret the $\alpha_i$ as deviations from the overall mean attributable to each population.

$\alpha_1 = 0$ might be useful if population 1 is the control group and we want to interpret the $\alpha_i$, $i > 1$, as deviations from no treatment.

4.2.3 Model Assumptions

Assumptions

1. $E[\epsilon_{ij}] = 0$, $\mathrm{Var}(\epsilon_{ij}) = \sigma_i^2 < \infty$ for all $i, j$, and $\mathrm{Cov}(\epsilon_{ij}, \epsilon_{kl}) = 0$ for all $i, j, k, l$ with $(i, j) \neq (k, l)$, i.e. $i \neq k$ or $j \neq l$.
2. $\epsilon_{ij} \sim N(0, \sigma_i^2)$, independent.
3. Homoscedasticity: $\sigma_i^2 = \sigma^2$ for all $i$.

Comments:

Assumption 2 is required for hypothesis testing and confidence intervals. Without assumption 2, we are limited to estimation. With assumption 1 about the variance, we can find the estimate with minimum variance.

Non-normality can lead to difficulties, but there are solutions for other kinds of distributions. We will not discuss them much here. We can use the CLT to get approximate normality of the sample means if $n_i$ is large enough and the real distribution is fairly symmetric.

The analysis is more robust to violations of 3 when the sample sizes are equal, $n_i = n$ for all treatments; unequal sample sizes lessen this robustness.

Robustness to violations of 2 depends on the extent to which 3 is true. For this reason, people will often transform the $Y$ random variables to achieve 3 so that they do not need to worry so much about normality of their data.
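The overparameterization discussed in Section 4.2.2 can be made concrete with a small Python sketch. The function names and the numbers are hypothetical, chosen only to show that two different parameter vectors $(\mu, \alpha_1, \ldots, \alpha_k)$ give the same cell means, and that each of the two restrictions picks out a single representation.

```python
def cell_means(mu, alphas):
    """Expected values E[Y_ij] = mu + alpha_i implied by the parameters."""
    return [mu + a for a in alphas]


def sum_to_zero(thetas):
    """Restriction sum(alpha_i) = 0: mu is the average of the cell means."""
    mu = sum(thetas) / len(thetas)
    return mu, [t - mu for t in thetas]


def baseline(thetas):
    """Restriction alpha_1 = 0: mu is the mean of population 1 (e.g. a control)."""
    mu = thetas[0]
    return mu, [t - mu for t in thetas]


# Two different parameter vectors (made-up values) with identical cell means:
# the data cannot distinguish them, so the unrestricted model is unidentifiable.
params_a = (10.0, [1.0, 2.0, 3.0])
params_b = (12.0, [-1.0, 0.0, 1.0])
same = cell_means(*params_a) == cell_means(*params_b)

thetas = cell_means(*params_a)
mu_s, alpha_s = sum_to_zero(thetas)   # effects are deviations from the overall mean
mu_b, alpha_b = baseline(thetas)      # effects are deviations from population 1
print(same, (mu_s, alpha_s), (mu_b, alpha_b))
```

Both restricted versions describe the same three cell means; only the interpretation of $\mu$ and the $\alpha_i$ changes.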

4.2.4 Inference

Estimating $\mu$ and $\alpha_i$

First, we address the problem of estimating the parameters $\mu$ and $\alpha_i$ (we address estimation of $\sigma^2$ in the context of hypothesis testing for ANOVA). It should be intuitively clear that

$$\hat{\mu} = \bar{Y}_{\cdot\cdot} \qquad \text{and} \qquad \hat{\alpha}_i = \bar{Y}_{i\cdot} - \hat{\mu}$$

are good estimators. You can also show these are the maximum likelihood estimates of the parameters given the model assumptions defined in the last section. We will discuss estimation of $\sigma^2$ when we discuss inference.

Classic ANOVA Hypothesis

$$H_0: \theta_1 = \theta_2 = \cdots = \theta_k \quad \text{or} \quad H_0: \alpha_1 = \cdots = \alpha_k = 0$$
$$H_A: \theta_i \neq \theta_j \text{ for some } i \neq j \quad \text{or} \quad H_A: \alpha_i \neq 0 \text{ for some } i$$

This hypothesis is often not very interesting. Take the example of comparing several treatments. One may often include a control as a treatment to make sure that the experiment runs as planned. One knows before even collecting data that the control should have a different outcome compared to the rest, which means this classic $H_0$ will always be rejected. We might still like to know if $\theta_2 \neq \theta_3$. We will come back to this problem later.

Partitioning Variance

Often, ANOVA is presented as a way of partitioning the variance. The total variability can be summarized as the total sum of squares

$$SS_{tot} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2.$$

Note, this is just $N - 1$ times the combined sample variance, where $N = \sum_{i=1}^{k} n_i$. By adding and subtracting the sample means $\bar{Y}_{i\cdot}$, we can partition the total variance into parts:

$$SS_{tot} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left[ (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}) + (Y_{ij} - \bar{Y}_{i\cdot}) \right]^2.$$

Expand the quadratic and recognize that the cross-term becomes 0 because

$$\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot}) = n_i \bar{Y}_{i\cdot} - n_i \bar{Y}_{i\cdot} = 0$$

to find

$$\sum_{i=1}^{k} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = \sum_{i=1}^{k} n_i (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 + \sum_{i=1}^{k} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2.$$

Interpreting each part of this sum, we have

$$SS_{tot} = SS_B + SS_W,$$

where $SS_B$ is the sum-of-squares due to variation between treatments and $SS_W$ is the sum-of-squares due to error within treatments.

There are $N$ observations, so there are $N - 1$ d.f. for $SS_{tot}$. There are $k$ treatments, so there are $k - 1$ d.f. for $SS_B$. Within the $i$th treatment, there are $n_i - 1$ d.f., for a total of $\sum_i (n_i - 1) = N - k$ d.f. within treatments for $SS_W$.
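The partition $SS_{tot} = SS_B + SS_W$ can be checked numerically. The following Python sketch uses a small made-up data set with unequal group sizes; the helper name `sums_of_squares` is an assumption of this example, not something defined in the notes.

```python
def sums_of_squares(groups):
    """Return (SS_tot, SS_B, SS_W) for a list of groups of observations."""
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total      # overall mean
    means = [sum(g) / len(g) for g in groups]          # treatment means
    ss_tot = sum((y - grand) ** 2 for g in groups for y in g)
    ss_b = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_w = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return ss_tot, ss_b, ss_w


# Hypothetical data: k = 3 treatments with n = (3, 2, 4) observations.
groups = [[4.0, 5.0, 6.0], [7.0, 9.0], [1.0, 2.0, 3.0, 6.0]]
ss_tot, ss_b, ss_w = sums_of_squares(groups)

k = len(groups)
n_total = sum(len(g) for g in groups)
print(f"SS_tot={ss_tot:.4f} (df {n_total - 1})")
print(f"SS_B  ={ss_b:.4f} (df {k - 1})")
print(f"SS_W  ={ss_w:.4f} (df {n_total - k})")
```

The degrees of freedom add up in the same way as the sums of squares: $(k - 1) + (N - k) = N - 1$.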

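Estimating $\sigma^2$

We will now show how all sums-of-squares, $SS_{tot}$, $SS_B$, and $SS_W$, are estimates of the population variance $\sigma^2$ under the ANOVA null hypothesis. This fact will also allow us to propose statistics with sampling distributions for testing the null hypothesis.

Lemma 13. Suppose $X_i$ are independent random variables with $E[X_i] = \mu_i$ and $\mathrm{Var}(X_i) = \sigma^2$, $i = 1, \ldots, n$. Then,

$$E\left[(X_i - \bar{X})^2\right] = (\mu_i - \bar{\mu})^2 + \frac{n - 1}{n} \sigma^2,$$

where $\bar{\mu} = \frac{1}{n} \sum_{i=1}^{n} \mu_i$.

Proof. Recall for any random variable $Z$ that $E[Z^2] = (E[Z])^2 + \mathrm{Var}(Z)$ by definition of variance, and $E[X_i - \bar{X}] = \mu_i - \bar{\mu}$ by linearity of expectation. The only part missing is

$$\mathrm{Var}(X_i - \bar{X}) = \mathrm{Var}(X_i) + \mathrm{Var}(\bar{X}) - 2\,\mathrm{Cov}(X_i, \bar{X}) = \sigma^2 + \frac{\sigma^2}{n} - 2\,\mathrm{Cov}\Big(X_i, \frac{1}{n} \sum_j X_j\Big) = \sigma^2 + \frac{\sigma^2}{n} - \frac{2\sigma^2}{n} = \frac{n - 1}{n} \sigma^2.$$

Putting the parts back into the formula for $E[Z^2]$, the lemma is proved.

Theorem 14. Given the ANOVA assumptions, and assuming $n_i = n$ for all $i$,

$$E[SS_W] = k(n - 1)\sigma^2,$$
$$E[SS_B] = n \sum_{i=1}^{k} \alpha_i^2 + (k - 1)\sigma^2.$$

Proof.

$$E[SS_W] = \sum_{i=1}^{k} E\Big[\sum_{j=1}^{n} (Y_{ij} - \bar{Y}_{i\cdot})^2\Big]$$
$$= \sum_{i=1}^{k} E[(n - 1) S_i^2] \qquad \text{(definition of sample variance)}$$
$$= \sum_{i=1}^{k} (n - 1)\sigma^2 \qquad \text{(constant variance assumption and sample variance unbiased)}$$
$$= k(n - 1)\sigma^2.$$

The second result uses the lemma.

$$E[SS_B] = n \sum_{i=1}^{k} E\left[(\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2\right] = n \sum_{i=1}^{k} \Big[\alpha_i^2 + \frac{k - 1}{kn} \sigma^2\Big] = n \sum_{i=1}^{k} \alpha_i^2 + (k - 1)\sigma^2,$$

where the second step results because $E[\bar{Y}_{i\cdot}] = E[Y_{ij}] = \mu + \alpha_i$, $\mathrm{Var}(\bar{Y}_{i\cdot}) = \sigma^2 / n$, $\bar{\mu} = \frac{1}{k} \sum_{i=1}^{k} (\mu + \alpha_i) = \mu$ (using the restriction $\sum_i \alpha_i = 0$), and $\mu_i - \bar{\mu} = \alpha_i$.

The expectations just derived suggest two ways to estimate the population variance $\sigma^2$. Most naturally, we define the pooled sample variance

$$S_p^2 := \frac{SS_W}{k(n - 1)}.$$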
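Theorem 14 can be checked by simulation. The sketch below is a Python illustration under assumed values ($k = 3$, $n = 5$, $\sigma = 1$, effects $-1, 0, 1$, and an arbitrary grand mean of 10); the function name and all numbers are hypothetical. With these choices both expectations equal 12.

```python
import random


def simulate_expected_ss(alphas, n, sigma, reps=5000, seed=0):
    """Monte Carlo averages of SS_B and SS_W for equal sample sizes n."""
    rng = random.Random(seed)
    k = len(alphas)
    mu = 10.0  # arbitrary grand mean; it cancels out of both sums of squares
    total_b = total_w = 0.0
    for _ in range(reps):
        groups = [[mu + a + rng.gauss(0.0, sigma) for _ in range(n)] for a in alphas]
        means = [sum(g) / n for g in groups]
        grand = sum(means) / k  # equal n, so the overall mean is the mean of means
        total_b += n * sum((m - grand) ** 2 for m in means)
        total_w += sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return total_b / reps, total_w / reps


alphas, n, sigma = [-1.0, 0.0, 1.0], 5, 1.0   # effects sum to zero
k = len(alphas)
avg_b, avg_w = simulate_expected_ss(alphas, n, sigma)

expected_w = k * (n - 1) * sigma ** 2                                  # k(n-1) sigma^2
expected_b = n * sum(a ** 2 for a in alphas) + (k - 1) * sigma ** 2    # n sum(alpha^2) + (k-1) sigma^2
print(f"SS_W: simulated {avg_w:.2f}, theory {expected_w:.2f}")
print(f"SS_B: simulated {avg_b:.2f}, theory {expected_b:.2f}")
```

The simulated averages should land close to the theoretical values, up to Monte Carlo error.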

Under the ANOVA assumption of constant variance, $S_p^2$ uses all the data to estimate the population variance $\sigma^2$. (The denominator becomes $N - k$ for unequal sample sizes.) This is the multi-sample extension of the pooled sample variance from the two-sample t-test with equal variances.

When the ANOVA hypothesis is true, so $\alpha_i = 0$ for all $i$, then $\frac{SS_B}{k - 1}$ also estimates the population variance. By the way, so does $\frac{SS_{tot}}{kn - 1}$.

It is the difference between $E\left[\frac{SS_B}{k - 1}\right]$ and $E\left[\frac{SS_W}{k(n - 1)}\right]$ that forms the basis of a statistical test of the null hypothesis. If $\alpha_i \neq 0$ for some $i$, then

$$E\left[\frac{SS_B}{k - 1}\right] > E\left[\frac{SS_W}{k(n - 1)}\right].$$

It is this signal we'll capture in a statistic. The only trick then will be to choose the statistic and determine its distribution under the null. To help us toward that goal, we consider the following theorem.

Theorem 15. If $\epsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2)$, then

$$\frac{SS_W}{\sigma^2} \sim \chi^2_{k(n - 1)},$$

and if $\alpha_i = 0$ for all $i$, then

$$\frac{SS_B}{\sigma^2} \sim \chi^2_{k - 1},$$

independent of $\frac{SS_W}{\sigma^2}$.

Proof. We will not prove this result, but the proof follows the same kind of reasoning that gave us the chi-squared distribution in the one- or two-sample case. This theorem is a special case of Cochran's Theorem, applied by recognizing that $\frac{SS_{tot}}{\sigma^2} \sim \chi^2_{kn - 1}$ when $H_0$ is true, by the single-sample result. (To be clear, under $H_0$, the multiple samples are all part of one big single sample from the same population.)

We state Cochran's theorem for its general applicability to partitioned sums-of-squares.

Theorem 16 (Cochran's Theorem). Let $Z_i \overset{iid}{\sim} N(0, 1)$ for $i = 1, \ldots, \nu$ and

$$\sum_{i=1}^{\nu} Z_i^2 = Q_1 + Q_2 + \cdots + Q_s$$

with $s \leq \nu$, where each $Q_i$ is a quadratic form in the $Z_i$ of rank $\nu_i$. Then, $Q_1, Q_2, \ldots, Q_s$ are independent $\chi^2$ random variables with $\nu_1, \nu_2, \ldots, \nu_s$ d.f., respectively, if and only if $\nu = \nu_1 + \nu_2 + \cdots + \nu_s$.

F Test for Testing Classical ANOVA Hypothesis

These estimates of $\sigma^2$ provide the basis of the F test for the classic hypothesis. Define the statistic

$$F = \frac{SS_B / (k - 1)}{SS_W / (k(n - 1))}.$$

As already argued, this statistic should be close to 1 if $H_0$ is correct. Otherwise, it should tend to exceed 1. We take this moment to define a new distribution called the F distribution.

Definition: F distribution

Given independent chi-square random variables $U \sim \chi^2_m$ and $V \sim \chi^2_n$, then

$$W = \frac{U / m}{V / n} \sim F(m, n)$$

is said to have an F distribution with $m$ and $n$ degrees of freedom. The pdf of the F distribution is given by

$$f(w) = \frac{\Gamma\left(\frac{m + n}{2}\right)}{\Gamma\left(\frac{m}{2}\right) \Gamma\left(\frac{n}{2}\right)} \left(\frac{m}{n}\right)^{m/2} w^{m/2 - 1} \left(1 + \frac{m w}{n}\right)^{-(m + n)/2}, \qquad w > 0.$$

Thus, if $H_0$ is correct, then Theorem 15 shows us $F \sim F(k - 1, k(n - 1))$, and in the case with variable sample sizes $F \sim F(k - 1, N - k)$. If the alternative hypothesis is correct, then we expect $SS_B / (k - 1)$ to overestimate the population variance, so large values of the statistic $F$ indicate problems with $H_0$. Rejection is therefore according to a one-tailed test, when

$$F > F_{k - 1, N - k}(1 - \alpha).$$

In R, the functions pf(), qf(), and friends are for the F distribution.

The ANOVA Table

The one-way ANOVA analysis is summarized in the ANOVA table.

Source of Variation    Sum of Squares   Degrees of Freedom   Mean Square                F
Between treatments     $SS_B$           $k - 1$              $MS_B = SS_B / (k - 1)$    $F = MS_B / MS_W$
Within treatments      $SS_W$           $N - k$              $MS_W = SS_W / (N - k)$
Total                  $SS_{tot}$       $N - 1$

Tukey Method

If the ANOVA null hypothesis is rejected, then there is some effect $\alpha_i \neq 0$ for some population $i$. It becomes important to figure out which effects are non-zero, or which population means differ significantly.

Recalling $\theta_i = \mu + \alpha_i$, our model assumptions yield

$$\bar{Y}_{i\cdot} \sim N\left(\theta_i, \frac{\sigma^2}{n_i}\right),$$

so we can construct CI using t statistics if we can produce an estimate of $\sigma^2$. Previously, we argued that $S_p^2$ always estimates $\sigma^2$, even when the null hypothesis is not satisfied, so it is natural to form a CI for $\theta_i$ of

$$\bar{Y}_{i\cdot} \pm t_{N - k}(1 - \alpha/2) \frac{S_p}{\sqrt{n_i}}.$$

The degrees of freedom used are the degrees of freedom of $SS_W$, which is used to compute $S_p^2$. A way to remember the degrees of freedom is to realize there are $N$ observations, but $k$ d.f. are lost to estimate the sample means in order to compute the sum of squared deviations in $SS_W$.

The above CI is directly relevant to testing $H_0: \alpha_i = 0$. You can figure out the test statistic and its sampling distribution. Notice, there are $k$ such tests we might need to run.

As for computing CI for mean differences, e.g. $\theta_i - \theta_j$, we recognize that testing $H_0: \alpha_i = \alpha_j$ is equivalent. For this case, the statistic

$$\frac{\bar{Y}_{i\cdot} - \bar{Y}_{j\cdot}}{S_p \sqrt{1/n_i + 1/n_j}} \sim t_{N - k}$$

is useful, but there are $\binom{k}{2}$ pairs of means we could test.

In both cases above, it is unwise to run that many tests without correcting the type I error rate $\alpha$. The objective of Tukey's method is to estimate CI for all pairwise mean differences $\theta_i - \theta_j$ that simultaneously have the desired coverage.

Recall $(\bar{Y}_{i\cdot} - \theta_i) \sim N(0, \sigma^2 / n)$ if $\sigma^2$ is constant across samples and the sample size is a constant $n$. Define the statistic

$$SR = \max_{i, j} \frac{\left|(\bar{Y}_{i\cdot} - \theta_i) - (\bar{Y}_{j\cdot} - \theta_j)\right|}{S_p / \sqrt{n}}.$$

Under the ANOVA model, $SR$ follows a studentized range distribution with parameters $k$ and $k(n - 1)$. Unusually large values of $SR$ suggest that the proposed population means $\theta_i$ are not the true population means. We will not write a formula for the studentized range distribution, but suppose $q_{k, k(n - 1)}(1 - \alpha)$ is its quantile. Then,

$$P\left[\left|(\bar{Y}_{i\cdot} - \theta_i) - (\bar{Y}_{j\cdot} - \theta_j)\right| \leq q_{k, k(n - 1)}(1 - \alpha) \frac{S_p}{\sqrt{n}} \text{ for all } i \neq j\right] = 1 - \alpha.$$

Since $\theta_i - \theta_j = \alpha_i - \alpha_j$, the confidence interval for $\alpha_i - \alpha_j$ is

$$\bar{Y}_{i\cdot} - \bar{Y}_{j\cdot} \pm q_{k, k(n - 1)}(1 - \alpha) \frac{S_p}{\sqrt{n}}.$$

If the CI does not contain 0, then we reject $H_0: \alpha_i = \alpha_j$ with p-value $< \alpha$. The key advantage of the Tukey method is that if $H_0: \alpha_i = \alpha_l$ is also rejected, then the p-value for that conclusion is also $< \alpha$. If separate t tests at the $\alpha$ level are used for these analyses, the CIs would be narrower, more nulls would be rejected, and the probability of at least one type I error among the tests would exceed $\alpha$. Another solution to this problem is to use Bonferroni corrected $\alpha$ values on the separate t tests.

Example: We consider the following data with sample means computed from 7 treatments, each based on 10 measurements. Suppose we are given the pooled sample standard deviation $S_p = 0.061$, which we could read off an ANOVA table as the square root of $MS_W$.

[Table of the seven lab means (columns: Lab, Mean); the values did not survive in this copy.]

The quantile for the Tukey statistic is given in R as qtukey(0.95, nmeans=7, df=63), so the threshold is $q_{7, 63}(0.95)\, S_p / \sqrt{10}$. We can examine all absolute pairwise differences, and any difference that exceeds this threshold allows us to reject the corresponding null that there is no difference, with $\alpha = 0.05$. We find populations 1 and 4, 1 and 5, 1 and 6, and 3 and 4 have significantly different treatment effects.

If we, incorrectly, performed uncorrected two-sample t-tests, a significant difference would be anything larger than

$$t_{63}(0.975)\, S_p \sqrt{\frac{2}{10}}.$$

If we perform the two-sample t-tests with Bonferroni correction over the $\binom{7}{2} = 21$ pairs, a significant difference is found for every pairwise distance exceeding

$$t_{63}\left(1 - \frac{0.025}{21}\right) S_p \sqrt{\frac{2}{10}}.$$

You might also be interested to see the TukeyHSD() function in R.
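When only the treatment means, the common sample size, and $S_p$ are reported (as in the example above), the F statistic and the pairwise screening step can still be computed. The Python sketch below assumes equal sample sizes; the means, $S_p$, and the threshold are hypothetical numbers (in practice the threshold would be $q_{k, k(n-1)}(1 - \alpha)\, S_p / \sqrt{n}$, obtained from a function such as R's qtukey()), and the function names are illustrative only.

```python
from itertools import combinations


def f_from_summary(means, n, s_p):
    """F statistic and its degrees of freedom from k treatment means,
    a common sample size n, and the pooled standard deviation s_p."""
    k = len(means)
    grand = sum(means) / k                                   # equal n: mean of means
    ms_b = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    ms_w = s_p ** 2                                          # MS_W is the pooled variance
    return ms_b / ms_w, (k - 1, k * (n - 1))


def pairs_exceeding(means, threshold):
    """1-based index pairs whose absolute mean difference exceeds threshold."""
    return [
        (i + 1, j + 1)
        for (i, a), (j, b) in combinations(enumerate(means), 2)
        if abs(a - b) > threshold
    ]


# Hypothetical summary statistics (not the lab data from the example).
means, n, s_p = [1.0, 1.2, 1.1], 10, 0.1
f_stat, df = f_from_summary(means, n, s_p)

# Hypothetical threshold standing in for q * S_p / sqrt(n).
flagged = pairs_exceeding(means, threshold=0.15)
print(f"F = {f_stat:.3f} on {df} d.f.; pairs flagged: {flagged}")
```

The same `pairs_exceeding` helper applies to any of the three thresholds discussed above (Tukey, uncorrected t, Bonferroni); only the threshold value changes.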

4.2.5 Checking Model Assumptions

Definition: residual
The residual is the difference between the observation and its model-estimated mean. In this case,

$$r_{ij} = Y_{ij} - \hat{\mu} - \hat{\alpha}_i = Y_{ij} - \bar{Y}_{i\cdot}.$$

By assumption of the model, the residuals should be normally distributed. One can check this assumption with probability plots for $r_{ij}$ or other tests of normality that we have discussed.

The ANOVA model additionally assumes constant variance, and we have not yet discussed methods for checking variance, though some problems can be identified from the boxplots.

Testing Common Variance

The F test itself suggests a way to test equal variance in two-sample tests. If we have two independent samples:

$$X_1, \ldots, X_{n_x} \overset{iid}{\sim} N(\mu_x, \sigma_x^2), \qquad Y_1, \ldots, Y_{n_y} \overset{iid}{\sim} N(\mu_y, \sigma_y^2),$$

then we know

$$\frac{(n_x - 1) S_x^2}{\sigma_x^2} \sim \chi^2_{n_x - 1} \qquad \text{and} \qquad \frac{(n_y - 1) S_y^2}{\sigma_y^2} \sim \chi^2_{n_y - 1}$$

are independent, so under $H_0: \sigma_x^2 = \sigma_y^2 = \sigma^2$ the statistic

$$\frac{S_x^2}{S_y^2} \sim F(n_x - 1, n_y - 1).$$

In this case, the statistic may be unusually small or unusually large, but should be around 1 if the hypothesis $H_0$ is correct. Thus, a two-tailed test can be used to find samples with significantly different variances.

Testing Common Variance: Multiple Samples

To test the null hypothesis that all population variances are equal across more than two samples, i.e.

$$H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2 = \sigma^2,$$

we can use Bartlett's test or Levene's test. We will not spend time deriving these tests, but only show you how to use them.

Bartlett's Test. See bartlett.test() in R. The downside of this test is that it relies on the normality assumption.

Levene's Test. Perform a second ANOVA on the absolute residuals, $|r_{ij}|$, testing the classic ANOVA hypothesis of constant means (or no effects).
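Levene's test as described above is simply a one-way ANOVA F statistic computed on $|r_{ij}|$. A minimal Python sketch follows, with hypothetical helper names and made-up data; it returns only the statistic, which would then be compared to an $F(k - 1, N - k)$ quantile (for example from R's qf()).

```python
def f_statistic(groups):
    """One-way ANOVA F = MS_B / MS_W for a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_b = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_w = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return (ss_b / (k - 1)) / (ss_w / (n_total - k))


def levene_statistic(groups):
    """F statistic of a second ANOVA run on the absolute residuals |r_ij|,
    where r_ij = Y_ij minus the mean of treatment i."""
    abs_residuals = []
    for g in groups:
        m = sum(g) / len(g)
        abs_residuals.append([abs(y - m) for y in g])
    return f_statistic(abs_residuals)


# Hypothetical data: same spread in both groups vs. very different spread.
equal_spread = [[1.0, 2.0, 3.0], [11.0, 12.0, 13.0]]
unequal_spread = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
w_equal = levene_statistic(equal_spread)
w_unequal = levene_statistic(unequal_spread)
print(f"equal spread: {w_equal:.4f}, unequal spread: {w_unequal:.4f}")
```

A shift in location alone (the first data set) leaves the absolute residuals unchanged, so the statistic is 0; a change in spread (the second data set) produces a positive value.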

4.2.6 Nonparametric Test

Kruskal-Wallis Test

If you find that your data do not satisfy the ANOVA assumptions, there is an alternative test that is related to the rank sum test. Let $R_{ij}$ be the rank of $Y_{ij}$ in the combined sample. Handle ties as for the rank sum test. Define

$$\bar{R}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} R_{ij} \qquad \text{and} \qquad \bar{R}_{\cdot\cdot} = \frac{1}{N} \sum_{i=1}^{k} \sum_{j=1}^{n_i} R_{ij} = \frac{N + 1}{2}$$

and

$$SS_B = \sum_{i=1}^{k} n_i \left(\bar{R}_{i\cdot} - \bar{R}_{\cdot\cdot}\right)^2.$$

Then, it should be clear that the larger $SS_B$, the more evidence there is against the hypothesis

$$H_0: \text{same probability distribution for all } k \text{ groups}.$$

As for the rank sum test, the statistic is most sensitive to changes in location of the distributions. For small samples, you can use R's function kruskal.test() to compute the p-value. For larger samples, it turns out that

$$K = \frac{12\, SS_B}{N(N + 1)} \sim \chi^2_{k - 1}$$

has an asymptotic chi-square distribution. The conditions for good asymptotics are $k = 3$ and $n_i \geq 5$, or $k > 3$ and $n_i \geq 4$.

4.2.7 Contrasts

The following is optional material. It was not discussed in class, but it covers a very common aspect of ANOVA.

Definition: contrast
Let $t = (t_1, \ldots, t_k)$ be a vector of random variables, their realizations, parameters, or statistics. Let $a = (a_1, \ldots, a_k)$ be constants; then

$$\sum_{i=1}^{k} a_i t_i$$

is a linear combination of the $t_i$'s. If $\sum_i a_i = 0$, then the linear combination is called a contrast.

We can write the classical ANOVA hypothesis in terms of contrasts.

Theorem 17. $\theta_1 = \cdots = \theta_k$ if and only if $\sum_i a_i \theta_i = 0$ for all $a \in A$, where $A = \{a = (a_1, \ldots, a_k) : \sum_i a_i = 0\}$.

Proof. The forward implication is obvious: if every $\theta_i$ equals a common value $\theta$, then

$$\sum_i a_i \theta_i = \theta \sum_i a_i = 0.$$

The reverse implication is also quite easy. Consider $a^{(1)} = (1, -1, 0, \ldots, 0) \in A$. This one shows $\theta_1 = \theta_2$. Similarly, $a^{(2)} = (0, 1, -1, 0, \ldots, 0)$ shows $\theta_2 = \theta_3$. In general, the set $a^{(1)}, a^{(2)}, \ldots, a^{(k - 1)}$ spans the space $A$. Therefore, all possible equalities encoded in $\theta_1 = \cdots = \theta_k$ are implied by combining these vectors appropriately.
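Returning to the Kruskal-Wallis statistic of Section 4.2.6, the computation of $K$ from ranks can be sketched in a few lines of Python. The helper names and the data are hypothetical. Ties receive average ranks (midranks); note that this sketch applies no further tie correction to $K$, whereas library routines such as R's kruskal.test() typically do, so results may differ slightly when ties are present.

```python
def midranks(values):
    """Ranks 1..N, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1   # positions are 0-based, ranks 1-based
        for idx in order[pos:end + 1]:
            ranks[idx] = average_rank
        pos = end + 1
    return ranks


def kruskal_wallis(groups):
    """K = 12 * SS_B / (N (N + 1)), with SS_B computed from rank means."""
    pooled = [y for g in groups for y in g]
    n_total = len(pooled)
    ranks = midranks(pooled)
    grand = (n_total + 1) / 2            # mean of the ranks 1..N
    ss_b, start = 0.0, 0
    for g in groups:
        group_ranks = ranks[start:start + len(g)]
        ss_b += len(g) * (sum(group_ranks) / len(g) - grand) ** 2
        start += len(g)
    return 12 * ss_b / (n_total * (n_total + 1))


# Hypothetical, completely separated groups: the most extreme ordering.
k_stat = kruskal_wallis([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
print(f"K = {k_stat:.3f}")
```

For large samples, $K$ would be compared to a $\chi^2_{k-1}$ quantile, as stated above.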

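Inference on Contrasts

Under the ANOVA assumptions, we have

$$Y_{ij} \sim N(\theta_i, \sigma^2) \qquad \text{and} \qquad \bar{Y}_{i\cdot} \sim N(\theta_i, \sigma^2 / n_i).$$

Also, for any $a$, the linear combination $\sum_{i=1}^{k} a_i \bar{Y}_{i\cdot}$ is normally distributed with mean and variance

$$E\Big[\sum_{i=1}^{k} a_i \bar{Y}_{i\cdot}\Big] = \sum_{i=1}^{k} a_i \theta_i, \qquad \mathrm{Var}\Big(\sum_{i=1}^{k} a_i \bar{Y}_{i\cdot}\Big) = \sigma^2 \sum_{i=1}^{k} \frac{a_i^2}{n_i}.$$

t-test for Generic Contrast

But of course, we don't usually know $\sigma^2$. Instead, we use

$$S_i^2 = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} \left(Y_{ij} - \bar{Y}_{i\cdot}\right)^2,$$

which is unbiased for $\sigma^2$ ($\sigma_i^2$ with heteroscedasticity) and also has distribution

$$\frac{(n_i - 1) S_i^2}{\sigma^2} \sim \chi^2_{n_i - 1}.$$

If assumption 3 of homoscedasticity applies, then we can pool sample variances to get a better estimate of $\sigma^2$. Namely, with $N = \sum_i n_i$, we use the pooled sample variance

$$S_p^2 = \frac{1}{N - k} \sum_{i=1}^{k} (n_i - 1) S_i^2 = \frac{1}{N - k} \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(Y_{ij} - \bar{Y}_{i\cdot}\right)^2.$$

Because the $S_i^2$ are independent, we also have

$$\frac{(N - k) S_p^2}{\sigma^2} \sim \chi^2_{N - k}.$$

Also, because $S_p^2$ is independent of the $\bar{Y}_{i\cdot}$, we have the statistic

$$\frac{\sum_i a_i \bar{Y}_{i\cdot} - \sum_i a_i \theta_i}{S_p \sqrt{\sum_i a_i^2 / n_i}} \sim t_{N - k},$$

which allows confidence intervals of the usual form

$$\sum_i a_i \bar{Y}_{i\cdot} - t_{N - k, \alpha/2}\, S_p \sqrt{\sum_i \frac{a_i^2}{n_i}} \;\leq\; \sum_i a_i \theta_i \;\leq\; \sum_i a_i \bar{Y}_{i\cdot} + t_{N - k, \alpha/2}\, S_p \sqrt{\sum_i \frac{a_i^2}{n_i}}.$$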
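The pieces of the contrast t statistic above can be assembled as in the following Python sketch. The function name, the data, and the contrast vector are hypothetical; the sketch returns the estimate, its standard error, and the degrees of freedom, leaving the $t_{N-k}$ quantile to a statistics package (for example qt() in R).

```python
import math


def contrast_estimate(groups, a):
    """Estimate sum(a_i * Ybar_i), its standard error S_p * sqrt(sum(a_i^2 / n_i)),
    and the degrees of freedom N - k, for a contrast vector a."""
    if abs(sum(a)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    ss_w = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    s_p = math.sqrt(ss_w / (n_total - k))            # pooled standard deviation
    estimate = sum(ai * m for ai, m in zip(a, means))
    std_err = s_p * math.sqrt(sum(ai ** 2 / len(g) for ai, g in zip(a, groups)))
    return estimate, std_err, n_total - k


# Hypothetical data; the contrast (1, -1, 0) compares treatments 1 and 2.
groups = [[4.0, 5.0, 6.0], [7.0, 9.0], [1.0, 2.0, 3.0, 6.0]]
estimate, std_err, df = contrast_estimate(groups, [1.0, -1.0, 0.0])
t_stat = estimate / std_err                          # tests H0: sum(a_i * theta_i) = 0
print(f"estimate={estimate:.3f}, se={std_err:.4f}, t={t_stat:.3f}, df={df}")
```

With the contrast $(1, -1, 0)$ the statistic reduces to the pairwise comparison of treatments 1 and 2, but with the pooled variance from all three groups.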

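4.3 Two-Way ANOVA

4.3.1 Model

Two-Way ANOVA Model

In the two-way ANOVA, the experimenter simultaneously controls two factors, e.g. dosage level and exposure time. Each combination of factor levels is a treatment, and forms a cell in the two-way layout. Suppose we observe a constant $K$ observations per cell, $I$ levels of factor one, and $J$ levels of factor two. Then the two-way ANOVA model is

$$Y_{ijk} = \mu + \alpha_i + \beta_j + \delta_{ij} + \epsilon_{ijk}, \qquad \epsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2),$$

where

$\alpha_i$ is the effect of factor one, level $i$, with $\sum_{i=1}^{I} \alpha_i = 0$,
$\beta_j$ is the effect of factor two, level $j$, with $\sum_{j=1}^{J} \beta_j = 0$,
$\delta_{ij}$ is the interaction effect of factor one, level $i$, and factor two, level $j$, with $\sum_{i=1}^{I} \delta_{ij} = \sum_{j=1}^{J} \delta_{ij} = 0$.

4.3.2 Parameter Estimation

As we motivated the estimates for the one-way ANOVA, we can use

$$\mu_{ijk} = E[Y_{ijk}] = \mu + \alpha_i + \beta_j + \delta_{ij}$$
$$\mu_{i \cdot k} = \frac{1}{J} \sum_j E[Y_{ijk}] = \frac{1}{J} \sum_j (\mu + \alpha_i + \beta_j + \delta_{ij}) = \mu + \alpha_i$$
$$\mu_{\cdot jk} = \frac{1}{I} \sum_i E[Y_{ijk}] = \frac{1}{I} \sum_i (\mu + \alpha_i + \beta_j + \delta_{ij}) = \mu + \beta_j$$

to justify

$$\hat{\mu} = \bar{Y}_{\cdot\cdot\cdot}$$
$$\hat{\alpha}_i = \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot}$$
$$\hat{\beta}_j = \bar{Y}_{\cdot j \cdot} - \bar{Y}_{\cdot\cdot\cdot}$$
$$\hat{\delta}_{ij} = \bar{Y}_{ij\cdot} - (\hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j) = \bar{Y}_{ij\cdot} - \bar{Y}_{\cdot\cdot\cdot} - (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot}) - (\bar{Y}_{\cdot j \cdot} - \bar{Y}_{\cdot\cdot\cdot}) = \bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j \cdot} + \bar{Y}_{\cdot\cdot\cdot},$$

and it is again possible to show that these are maximum likelihood estimates under the assumption of normally distributed errors and fixed effects. The likelihood of a single observation is

$$L(Y_{ijk}; \mu, \alpha_i, \beta_j, \delta_{ij}, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{1}{2\sigma^2} (Y_{ijk} - \mu - \alpha_i - \beta_j - \delta_{ij})^2\right].$$

Because of independence of observations, we have the log likelihood of all the data $Y$,

$$l(Y; \mu, \alpha_i, \beta_j, \delta_{ij}, \sigma^2) = -\frac{IJK}{2} \ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_i \sum_j \sum_k (Y_{ijk} - \mu - \alpha_i - \beta_j - \delta_{ij})^2.$$

Maximizing simultaneously for all the parameters yields the intuitive estimates above. Again, we leave estimation of $\sigma^2$ until later, as it is intimately related to hypothesis testing.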
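The two-way estimates above are all differences of cell, row, column, and overall means. The Python sketch below computes them for a balanced layout stored as a nested list `y[i][j]` of the $K$ observations in cell $(i, j)$; this storage layout, the function name, and the data are assumptions made for illustration.

```python
def two_way_estimates(y):
    """Estimates (mu, alpha, beta, delta) for a balanced two-way layout,
    where y[i][j] is the list of K observations in cell (i, j)."""
    n_i, n_j, n_k = len(y), len(y[0]), len(y[0][0])
    cell = [[sum(y[i][j]) / n_k for j in range(n_j)] for i in range(n_i)]
    row = [sum(cell[i]) / n_j for i in range(n_i)]                      # factor-one means
    col = [sum(cell[i][j] for i in range(n_i)) / n_i for j in range(n_j)]  # factor-two means
    mu = sum(row) / n_i                                                 # overall mean
    alpha = [r - mu for r in row]
    beta = [c - mu for c in col]
    delta = [[cell[i][j] - row[i] - col[j] + mu for j in range(n_j)] for i in range(n_i)]
    return mu, alpha, beta, delta


# Hypothetical data: I = 2 levels, J = 2 levels, K = 2 observations per cell.
y = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[2.0, 4.0], [10.0, 12.0]],
]
mu_hat, alpha_hat, beta_hat, delta_hat = two_way_estimates(y)
print(mu_hat, alpha_hat, beta_hat, delta_hat)
```

The estimates automatically satisfy the sum-to-zero restrictions of the model: each of `alpha_hat`, `beta_hat`, and every row and column of `delta_hat` sums to zero.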

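4.3.3 Hypothesis Testing

Sums-of-Squares

As we did for one-way ANOVA, we can break down the total sum-of-squares into components,

$$SS_{tot} = SS_A + SS_B + SS_{AB} + SS_E,$$

where $SS_A$ measures the variation in the factor one means, $SS_B$ measures the variation in the factor two means, $SS_{AB}$ quantifies the strength of the interaction effects, and $SS_E$ is analogous to the within sum-of-squares, measuring the measurement error (hence subscript E). In terms of the data, this partition of the sum-of-squares is

$$\sum_{i,j,k} (Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot})^2 = JK \sum_i (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 + IK \sum_j (\bar{Y}_{\cdot j \cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 + K \sum_{i,j} (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j \cdot} + \bar{Y}_{\cdot\cdot\cdot})^2 + \sum_{i,j,k} (Y_{ijk} - \bar{Y}_{ij\cdot})^2,$$

which can be proven by expanding

$$Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot} = (Y_{ijk} - \bar{Y}_{ij\cdot}) + (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{\cdot j \cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j \cdot} + \bar{Y}_{\cdot\cdot\cdot}),$$

squaring both sides, and dropping cross-terms (because they sum to 0).

As before, we first work out the expectations of each of these partitioned sums-of-squares.

Theorem 18. Under the two-way ANOVA model with $\epsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$,

$$E[SS_A] = (I - 1)\sigma^2 + JK \sum_i \alpha_i^2$$
$$E[SS_B] = (J - 1)\sigma^2 + IK \sum_j \beta_j^2$$
$$E[SS_{AB}] = (I - 1)(J - 1)\sigma^2 + K \sum_{i,j} \delta_{ij}^2$$
$$E[SS_E] = IJ(K - 1)\sigma^2$$

Proof. You can use Lemma 13 to prove the results for $SS_A$ and $SS_B$, and the result for $SS_E$ follows as for $SS_W$ in the one-way case. For $SS_{AB}$, we apply the lemma to

$$E[SS_{tot}] = E\Big[\sum_{i,j,k} (Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot})^2\Big] = \sum_{i,j,k} \Big[\frac{IJK - 1}{IJK} \sigma^2 + (\alpha_i + \beta_j + \delta_{ij})^2\Big] = (IJK - 1)\sigma^2 + JK \sum_i \alpha_i^2 + IK \sum_j \beta_j^2 + K \sum_{i,j} \delta_{ij}^2,$$

which uses all the identities like $\sum_i \alpha_i = 0$ to simplify the result. Then, because $E[SS_{AB}] = E[SS_{tot}] - E[SS_A] - E[SS_B] - E[SS_E]$, we are done.

Next comes distributional information for the sums-of-squares.

Theorem 19. Under the two-way ANOVA model with $\epsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$,

$SS_E / \sigma^2 \sim \chi^2_{IJ(K - 1)}$.
If $H_A: \alpha_i = 0 \;\forall i$, then $SS_A / \sigma^2 \sim \chi^2_{I - 1}$.
If $H_B: \beta_j = 0 \;\forall j$, then $SS_B / \sigma^2 \sim \chi^2_{J - 1}$.
If $H_{AB}: \delta_{ij} = 0 \;\forall i, j$, then $SS_{AB} / \sigma^2 \sim \chi^2_{(I - 1)(J - 1)}$.

And $SS_A$, $SS_B$, $SS_{AB}$, and $SS_E$ are all independent of each other.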

Estimating $\sigma^2$

We can see that $SS_E$ can be used to estimate the population variance $\sigma^2$, so

$$\hat{\sigma}^2 = \frac{SS_E}{IJ(K - 1)} := S_p^2.$$

Under appropriate null hypotheses, the other sums-of-squares also estimate $\sigma^2$.

Testing Hypotheses

As before, this realization motivates the F tests.

Theorem 20. Under the two-way ANOVA model with $\epsilon_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$, and with mean squares $MS = SS / \text{d.f.}$ as in the table below,

For testing $H_A$, $F = MS_A / MS_E \sim F(I - 1, IJ(K - 1))$.
For testing $H_B$, $F = MS_B / MS_E \sim F(J - 1, IJ(K - 1))$.
For testing $H_{AB}$, $F = MS_{AB} / MS_E \sim F((I - 1)(J - 1), IJ(K - 1))$.

ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square                              F
Factor one            $SS_A$           $I - 1$              $MS_A = SS_A / (I - 1)$                  $F = MS_A / MS_E$
Factor two            $SS_B$           $J - 1$              $MS_B = SS_B / (J - 1)$                  $F = MS_B / MS_E$
Interaction           $SS_{AB}$        $(I - 1)(J - 1)$     $MS_{AB} = SS_{AB} / ((I - 1)(J - 1))$   $F = MS_{AB} / MS_E$
Error                 $SS_E$           $IJ(K - 1)$          $MS_E = SS_E / (IJ(K - 1))$
Total                 $SS_{tot}$       $IJK - 1$

Reduced Model: No Interaction

Notice, if the interaction effects are assumed to be $\delta_{ij} = 0$, then the ANOVA table reduces and the degrees of freedom change.

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square                        F
Factor one            $SS_A$           $I - 1$              $MS_A = SS_A / (I - 1)$            $F = MS_A / MS_E$
Factor two            $SS_B$           $J - 1$              $MS_B = SS_B / (J - 1)$            $F = MS_B / MS_E$
Error                 $SS_E$           $IJK - I - J + 1$    $MS_E = SS_E / (IJK - I - J + 1)$
Total                 $SS_{tot}$       $IJK - 1$

Confidence Intervals

Tukey's method can be extended to the two-way ANOVA, but we will focus on uncorrected CI here. Suppose we want a CI for $\alpha_i - \alpha_{i'}$; then the relevant statistic is

$$\bar{Y}_{i\cdot\cdot} - \bar{Y}_{i'\cdot\cdot}.$$

Because the samples are independent, we have

$$\mathrm{Var}(\bar{Y}_{i\cdot\cdot}) = \mathrm{Var}(\bar{Y}_{i'\cdot\cdot}) = \frac{\sigma^2}{JK} \qquad \text{and} \qquad \mathrm{Var}(\bar{Y}_{i\cdot\cdot} - \bar{Y}_{i'\cdot\cdot}) = \frac{2\sigma^2}{JK},$$

and the CI are

$$\bar{Y}_{i\cdot\cdot} - \bar{Y}_{i'\cdot\cdot} \pm t_{IJ(K - 1)}(1 - \alpha/2) \sqrt{\frac{2\, SS_E}{IJ^2 K(K - 1)}}.$$
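The full two-way ANOVA table (with interaction) can be assembled from the sums-of-squares partition of Section 4.3.3. The Python sketch below assumes a balanced layout stored as `y[i][j]`, the list of $K$ observations in cell $(i, j)$; the storage layout, function name, and data are hypothetical. It returns the sums of squares and F ratios only; p-values would come from the F distribution (for example pf() in R).

```python
def two_way_anova(y):
    """Sums of squares and F ratios for a balanced two-way layout with
    interaction; y[i][j] is the list of K observations in cell (i, j)."""
    n_i, n_j, n_k = len(y), len(y[0]), len(y[0][0])
    cell = [[sum(y[i][j]) / n_k for j in range(n_j)] for i in range(n_i)]
    row = [sum(cell[i]) / n_j for i in range(n_i)]
    col = [sum(cell[i][j] for i in range(n_i)) / n_i for j in range(n_j)]
    grand = sum(row) / n_i

    ss_a = n_j * n_k * sum((r - grand) ** 2 for r in row)
    ss_b = n_i * n_k * sum((c - grand) ** 2 for c in col)
    ss_ab = n_k * sum(
        (cell[i][j] - row[i] - col[j] + grand) ** 2
        for i in range(n_i) for j in range(n_j)
    )
    ss_e = sum(
        (obs - cell[i][j]) ** 2
        for i in range(n_i) for j in range(n_j) for obs in y[i][j]
    )
    ms_e = ss_e / (n_i * n_j * (n_k - 1))          # estimate of sigma^2
    return {
        "SS_A": ss_a, "SS_B": ss_b, "SS_AB": ss_ab, "SS_E": ss_e,
        "F_A": (ss_a / (n_i - 1)) / ms_e,
        "F_B": (ss_b / (n_j - 1)) / ms_e,
        "F_AB": (ss_ab / ((n_i - 1) * (n_j - 1))) / ms_e,
    }


# Hypothetical data: I = 2, J = 2, K = 2.
y = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[2.0, 4.0], [10.0, 12.0]],
]
table = two_way_anova(y)
observations = [obs for rows in y for cell_obs in rows for obs in cell_obs]
overall = sum(observations) / len(observations)
ss_tot = sum((obs - overall) ** 2 for obs in observations)
print(table, ss_tot)
```

As a check on the partition, the four components add up to the total sum of squares computed directly from the observations.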

4.3.4 Randomized Block Design

Experimental Design

We now take a moment to discuss experimental design because one of the most common experimental designs produces a two-way ANOVA.

Definition: completely randomized design (CRD)
Given $T$ treatments and $n$ experimental units (EU), the completely randomized design results if the EU are randomly divided into $T$ groups with $n_1, \ldots, n_T$ EU in each, such that all EU in group $t$ receive treatment $t$.

As we have discussed, the randomization of the CRD is a good thing because it ensures that there are no confounding factors introduced by the experimenter in assigning treatments that might also affect the response.

Definition: randomized block design (RBD)
The RBD consists of $B$ blocks of $T$ EU each, with treatments randomly assigned such that each treatment appears exactly once in each block.

The RBD is an extension of the matched pair design. If the experimenter can identify a confounding factor, e.g. weight of subject, computer lab containing computer, etc., that might affect the measured response, then it is a good idea to use an RBD to block ("pair" in the context of more than 2 samples) on the confounding factor.

Example: Suppose four treatments are to be applied to 8 experimental units. In a CRD, we would probably randomly choose $n_1 = \cdots = n_4 = 2$ EU per treatment. The problem is that EU will often vary tremendously in their response to even the same treatment. Thus, $n_i = 2$ may not be enough EU to see small treatment effects amongst large subject effects.

Suppose the treatments can be applied sequentially to the same EU (i.e. there is nothing irreversible about the treatments, for example no surgeries). Then, a lot of power can be gained by blocking on EU. Each EU is subject to all four treatments, applied in random order (here is where the randomization enters). A random RBD for four EU is shown below, where T is treatment; each column is a subject and the rows give the order in which that subject receives the treatments.

Subject   1    2    3    4
          T2   T4   T1   T1
          T1   T2   T3   T4
          T4   T1   T2   T3
          T3   T3   T4   T2

The treatment order may have an effect as well. One can also block on timing of treatment, so that each treatment appears exactly once in each time position across subjects. Efficient designs for this multi-dimensional blocking are Latin square designs. An example is shown below. Notice the treatment orders are no longer random, but do vary from subject to subject.

Subject   1    2    3    4
          T1   T4   T3   T2
          T2   T1   T4   T3
          T3   T2   T1   T4
          T4   T3   T2   T1
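Both layouts above are easy to generate. The Python sketch below draws a random treatment order for each subject (an RBD with subjects as blocks) and builds a cyclic Latin square like the second table. The function names are hypothetical, and the cyclic construction is only one of many possible Latin squares.

```python
import random


def random_rbd(treatments, n_subjects, seed=None):
    """For each subject (block), a random order of all treatments."""
    rng = random.Random(seed)
    return [rng.sample(treatments, len(treatments)) for _ in range(n_subjects)]


def cyclic_latin_square(treatments):
    """Row s is the treatment order for subject s: each subject receives every
    treatment once, and each time position holds every treatment once."""
    t = len(treatments)
    return [
        [treatments[(period - subject) % t] for period in range(t)]
        for subject in range(t)
    ]


treatments = ["T1", "T2", "T3", "T4"]
rbd = random_rbd(treatments, n_subjects=4, seed=1)
square = cyclic_latin_square(treatments)

for subject, order in enumerate(square, start=1):
    print(f"subject {subject}: {' '.join(order)}")
```

Here each list is one subject's sequence (a column of the tables above). In practice one would usually also randomize which subject gets which row of the Latin square.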

RBD as Two-Way ANOVA

The RBD is very popular and it leads to the following model

$$Y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij},$$

where $\alpha_i$ is the treatment effect, $\beta_j$ is the block effect (of little interest), and $\epsilon_{ij}$ are the usual errors. Because $K = 1$, we drop the subscript $k$.

I should note that RBDs often are mixed effects models, that is, models where the $\alpha_i$ are fixed effects and the $\beta_j$ are random effects. Take our example. We probably don't want to make inference just for the blocks (subjects) in our study, but to extrapolate to the population of EU. In this case, the $\beta_j$ are random effects. Fortunately, the hypothesis tests for the $\alpha_i$ are unchanged from the fixed effects models we've been discussing.

4.3.5 Nonparametric Test

Friedman's Test

The one-way ANOVA assumptions about the errors also apply to the two-way ANOVA. If these assumptions are suspect for your dataset, then nonparametric methods may be warranted.

Friedman's test for the RBD is a generalization of the signed rank test for paired samples. Within each block $j = 1, \ldots, J$ (here $J = B$ blocks), rank the $I$ treatment measurements $Y_{1j}, \ldots, Y_{Ij}$ to obtain ranks $R_{1j}, \ldots, R_{Ij}$. Then compute

$$SS_A = J \sum_{i=1}^{I} \left(\bar{R}_{i\cdot} - \bar{R}_{\cdot\cdot}\right)^2,$$

a measure of the difference in ranks across treatments, where $\bar{R}_{i\cdot}$ is the mean rank of treatment $i$ over blocks and $\bar{R}_{\cdot\cdot} = (I + 1)/2$. R's function friedman.test() can be used to compute p-values using this statistic, and for large samples the statistic

$$Q = \frac{12\, SS_A}{I(I + 1)} \sim \chi^2_{I - 1}$$

approximately.
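Friedman's statistic can be sketched as follows in Python, ranking the $I$ treatments within each block. The function name, the storage layout `y[i][j]` (treatment $i$, block $j$), and the data are hypothetical, and the sketch assumes there are no ties within a block.

```python
def friedman_statistic(y):
    """Q = 12 * SS_A / (I (I + 1)), where y[i][j] is the response for
    treatment i in block j. Assumes no ties within a block."""
    n_treat, n_block = len(y), len(y[0])
    rank_sums = [0.0] * n_treat
    for j in range(n_block):
        block = [y[i][j] for i in range(n_treat)]
        order = sorted(range(n_treat), key=lambda i: block[i])
        for rank, i in enumerate(order, start=1):   # rank treatments within block j
            rank_sums[i] += rank
    mean_ranks = [s / n_block for s in rank_sums]
    grand = (n_treat + 1) / 2                       # mean of the ranks 1..I
    ss_a = n_block * sum((r - grand) ** 2 for r in mean_ranks)
    return 12 * ss_a / (n_treat * (n_treat + 1))


# Hypothetical data: I = 3 treatments (rows), J = 4 blocks (columns).
# Every block orders the treatments the same way, the most extreme case.
y = [
    [1.0, 2.0, 1.5, 0.5],
    [2.0, 3.0, 2.5, 1.5],
    [3.0, 4.0, 3.5, 2.5],
]
q_stat = friedman_statistic(y)
print(f"Q = {q_stat:.3f}")
```

Because ranks are computed within blocks, large differences between blocks (such as block 2 being higher overall than block 4 here) do not affect the statistic.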
