Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Analysis of Variance and Design of Experiment-I MODULE VII LECTURE - 3 ANALYSIS OF COVARIANCE

Any scientific experiment is performed to learn something unknown about a group of treatments and to test certain hypotheses about the corresponding treatment effects.

When the variability of the experimental units is small relative to the treatment differences and the experimenter does not wish to use an experimental design, it is enough to take a large number of observations on each treatment and compute its mean. The variation around the mean can be made as small as desired by taking more observations.

When there is considerable variation among observations on the same treatment and it is not possible to take an unlimited number of observations, the techniques used for reducing the variation are (i) use of a proper experimental design and (ii) use of concomitant variables.

The use of concomitant variables is accomplished through the technique of analysis of covariance. If both techniques fail to control the experimental variability, then the number of replications of the different treatments (in other words, the number of experimental units) needs to be increased to a point where adequate control of variability is attained.

Introduction to analysis of covariance model

In the linear model

$$Y = X_1\beta_1 + X_2\beta_2 + \cdots + X_p\beta_p + \varepsilon,$$

if the explanatory variables are quantitative variables as well as indicator variables, i.e., some of them are qualitative and some are quantitative, then the linear model is termed an analysis of covariance (ANCOVA) model.

Note that indicator variables do not provide as much information as quantitative variables. For example, quantitative observations on age can be converted into an indicator variable. Let the indicator variable be

$$D = \begin{cases} 1 & \text{if age} \ge 17 \text{ years,} \\ 0 & \text{if age} < 17 \text{ years.} \end{cases}$$

Now the following quantitative values of age can be changed into indicator variables:

Ages (in years)    Ages (in terms of indicator variable)
14                 0
15                 0
16                 0
17                 1
20                 1
21                 1
22                 1
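The conversion of age into an indicator variable can be sketched in a few lines of Python. This is only an illustration: the function name `age_indicator`, its `threshold` argument, and the list of ages are assumptions made for the example, not part of the lecture.

```python
def age_indicator(age, threshold=17):
    """Return 1 if age >= threshold (in years), else 0."""
    return 1 if age >= threshold else 0

# Illustrative ages in years (assumed values for this sketch).
ages = [14, 15, 16, 17, 20, 21, 22]

# Each quantitative age collapses to a single 0/1 value.
indicators = [age_indicator(a) for a in ages]

for a, d in zip(ages, indicators):
    print(a, d)
```

The ages 17 and 22 both map to 1, which illustrates why the indicator carries less information than the quantitative variable it replaces.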

In many real applications, some variables may be quantitative and others qualitative. In such cases, ANCOVA provides a way out. It helps in reducing the sum of squares due to error, which in turn is reflected in better model adequacy diagnostics.

See how this works:

In the one-way model $Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, we have $TSS_1 = SSA_1 + SSE_1$.

In the two-way model $Y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$, we have $TSS_2 = SSA_2 + SSB_2 + SSE_2$.

In the three-way model $Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + \varepsilon_{ijk}$, we have $TSS_3 = SSA_3 + SSB_3 + SS\gamma_3 + SSE_3$.

If we have a given data set, then ideally

$$TSS_1 = TSS_2 = TSS_3, \qquad SSA_1 = SSA_2 = SSA_3, \qquad SSB_2 = SSB_3.$$

So $SSE_1 \ge SSE_2 \ge SSE_3$.

Note that in the construction of F-statistics we use

$$\frac{SS(\text{effects})/df}{SSE/df}.$$

So the F-statistic essentially depends on the SSEs: a smaller SSE gives a larger F, and hence more chance of rejection.
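The ordering of the error sums of squares can be checked numerically. The following is a minimal sketch, assuming a balanced two-factor layout with one observation per cell; the data values and variable names are hypothetical and chosen only for illustration.

```python
# Hypothetical data: y[i][j] is the single observation at level i of
# factor A (rows) and level j of factor B (columns).
y = [
    [10.0, 12.0, 15.0],
    [11.0, 14.0, 18.0],
    [9.0, 13.0, 14.0],
    [12.0, 15.0, 19.0],
]
I, J = len(y), len(y[0])

grand = sum(sum(row) for row in y) / (I * J)
row_mean = [sum(row) / J for row in y]
col_mean = [sum(y[i][j] for i in range(I)) / I for j in range(J)]

# Total sum of squares and the sums of squares due to A and B.
TSS = sum((y[i][j] - grand) ** 2 for i in range(I) for j in range(J))
SSA = J * sum((m - grand) ** 2 for m in row_mean)
SSB = I * sum((m - grand) ** 2 for m in col_mean)

# Error sum of squares computed from the residuals of each model.
SSE1 = sum((y[i][j] - row_mean[i]) ** 2
           for i in range(I) for j in range(J))      # one-way: A only
SSE2 = sum((y[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
           for i in range(I) for j in range(J))      # two-way: A and B

print("TSS =", TSS)
print("SSA =", SSA, "SSB =", SSB)
print("SSE1 =", SSE1, "SSE2 =", SSE2)
```

In this balanced layout TSS and SSA are the same for both models, so adding factor B moves SSB out of the error sum of squares and SSE2 cannot exceed SSE1.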

Since SSA, SSB, etc. here are based on dummy variables, sums of squares based on quantitative variables will obviously provide more information. Such ideas are used in ANCOVA models, and we construct the model by incorporating the quantitative explanatory variables in ANOVA models.

In another example, suppose our interest is to compare several different kinds of feed for their ability to put weight on animals. If we use ANOVA, then we use the final weights at the end of the experiment. However, the final weights of the animals depend upon the initial weights of the animals at the beginning of the experiment as well as upon the difference in feeds. Use of ANCOVA models enables us to adjust or correct for these initial differences.

ANCOVA is useful for improving the precision of an experiment. Suppose the response Y is linearly related to a covariate X (or concomitant variable). Suppose the experimenter cannot control X but can observe it. ANCOVA involves adjusting Y for the effect of X. If such an adjustment is not made, then X can inflate the error mean square and make the true differences in Y due to treatments harder to detect.

If, for a given experimental material, the use of a proper experimental design cannot control the experimental variation, the use of concomitant variables (which are related to the experimental material) may be effective in reducing the variability.

Consider the one-way classification model as

$$E(Y_{ij}) = \beta_i, \quad i = 1, 2, \ldots, p; \; j = 1, 2, \ldots, N_i,$$
$$\mathrm{Var}(Y_{ij}) = \sigma^2.$$

If the usual analysis of variance for testing the hypothesis of equality of treatment effects shows a highly significant difference in the treatment effects due to some factors affecting the experiment, then consider the model which takes this effect into account:

$$E(Y_{ij}) = \beta_i + \gamma t_{ij}, \quad i = 1, 2, \ldots, p; \; j = 1, 2, \ldots, N_i,$$
$$\mathrm{Var}(Y_{ij}) = \sigma^2,$$

where $t_{ij}$ are the observations on concomitant variables (which are related to $X_{ij}$) and $\gamma$ is the regression coefficient associated with $t_{ij}$. With this model, the variability of treatment effects can be considerably reduced.

For example, in an agricultural experiment, if the experimental units are plots of land, then $t_{ij}$ can be a measure of the fertility characteristic of the $j$th plot receiving the $i$th treatment and $X_{ij}$ can be the yield.

In another example, suppose the experimental units are animals and the objective is to compare the growth rates of groups of animals receiving different diets. Note that the observed differences in growth rates can be attributed to diet only if all the animals are similar in some observable characteristics like weight, age, etc., which influence the growth rates.

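In the absence of similarity, use $t_{ij}$, which is the weight or age of the $j$th animal receiving the $i$th treatment.

If we consider the quadratic regression in $t_{ij}$, then

$$E(Y_{ij}) = \beta_i + \gamma_1 t_{ij} + \gamma_2 t_{ij}^2, \quad i = 1, 2, \ldots, p; \; j = 1, 2, \ldots, n_i,$$
$$\mathrm{Var}(Y_{ij}) = \sigma^2.$$

ANCOVA in this case is the same as ANCOVA with two concomitant variables $t_{ij}$ and $t_{ij}^2$.

In two-way classification with one observation per cell,

$$E(Y_{ij}) = \mu + \alpha_i + \beta_j + \gamma t_{ij}, \quad i = 1, \ldots, I; \; j = 1, \ldots, J$$

or

$$E(Y_{ij}) = \mu + \alpha_i + \beta_j + \gamma_1 t_{ij} + \gamma_2 w_{ij}$$

with $\sum_i \alpha_i = 0$, $\sum_j \beta_j = 0$. Then $(y_{ij}, t_{ij})$ or $(y_{ij}, t_{ij}, w_{ij})$ are the observations in the $(i, j)$th cell, and $t_{ij}$, $w_{ij}$ are the concomitant variables.

The concomitant variables can be fixed or random. We consider the case of fixed concomitant variables only.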

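One-way classification

Let $Y_{ij}$ $(j = 1, \ldots, n_i; \; i = 1, \ldots, p)$ be a random sample of size $n_i$ from the $i$th normal population with mean

$$\mu_{ij} = E(Y_{ij}) = \beta_i + \gamma t_{ij}, \qquad \mathrm{Var}(Y_{ij}) = \sigma^2,$$

where $\beta_i$, $\gamma$ and $\sigma^2$ are the unknown parameters and $t_{ij}$ are known constants which are the observations on a concomitant variable.

The null hypothesis is

$$H_0 : \beta_1 = \beta_2 = \cdots = \beta_p.$$

Let

$$\bar{y}_{io} = \frac{1}{n_i}\sum_j y_{ij}; \quad \bar{y}_{oj} = \frac{1}{p}\sum_i y_{ij}, \quad \bar{y}_{oo} = \frac{1}{n}\sum_i\sum_j y_{ij},$$
$$\bar{t}_{io} = \frac{1}{n_i}\sum_j t_{ij}; \quad \bar{t}_{oj} = \frac{1}{p}\sum_i t_{ij}, \quad \bar{t}_{oo} = \frac{1}{n}\sum_i\sum_j t_{ij},$$
$$n = \sum_i n_i.$$

Under the whole parametric space $(\pi_\Omega)$, use the likelihood ratio test, for which we obtain the $\hat{\beta}_i$'s and $\hat{\gamma}$ using the least squares principle (or maximum likelihood estimation) as follows:

Minimize

$$S = \sum_i\sum_j (y_{ij} - \mu_{ij})^2 = \sum_i\sum_j (y_{ij} - \beta_i - \gamma t_{ij})^2.$$

Setting $\dfrac{\partial S}{\partial \beta_i} = 0$ for fixed $\gamma$ gives

$$\beta_i = \bar{y}_{io} - \gamma \bar{t}_{io}.$$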

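Put this $\beta_i$ in $S$ and minimize the function by $\dfrac{\partial S}{\partial \gamma} = 0$, i.e., minimizing

$$\sum_i\sum_j \left[ y_{ij} - \bar{y}_{io} - \gamma (t_{ij} - \bar{t}_{io}) \right]^2$$

with respect to $\gamma$ gives

$$\hat{\gamma} = \frac{\sum_i\sum_j (y_{ij} - \bar{y}_{io})(t_{ij} - \bar{t}_{io})}{\sum_i\sum_j (t_{ij} - \bar{t}_{io})^2}.$$

Thus we have

$$\hat{\beta}_i = \bar{y}_{io} - \hat{\gamma}\,\bar{t}_{io}, \qquad \hat{\mu}_{ij} = \hat{\beta}_i + \hat{\gamma}\, t_{ij}.$$

Since

$$y_{ij} - \hat{\mu}_{ij} = y_{ij} - \hat{\beta}_i - \hat{\gamma}\, t_{ij} = (y_{ij} - \bar{y}_{io}) - \hat{\gamma}(t_{ij} - \bar{t}_{io}),$$

we have

$$\sum_i\sum_j (y_{ij} - \hat{\mu}_{ij})^2 = \sum_i\sum_j (y_{ij} - \bar{y}_{io})^2 - \frac{\left[ \sum_i\sum_j (y_{ij} - \bar{y}_{io})(t_{ij} - \bar{t}_{io}) \right]^2}{\sum_i\sum_j (t_{ij} - \bar{t}_{io})^2}.$$

Under $H_0 : \beta_1 = \cdots = \beta_p = \beta$ (say), consider

$$S_w = \sum_i\sum_j (y_{ij} - \beta - \gamma t_{ij})^2$$

and minimize $S_w$ under the sample space $(\pi_w)$ as

$$\frac{\partial S_w}{\partial \beta} = 0, \qquad \frac{\partial S_w}{\partial \gamma} = 0.$$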
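The least squares estimates under the whole parametric space can be sketched numerically as follows. This is a minimal illustration on hypothetical toy data, using only the Python standard library; the data values and all variable names are assumptions made for the example.

```python
# Hypothetical toy data: groups[i] holds (t_ij, y_ij) pairs for treatment i.
groups = [
    [(1.0, 2.1), (2.0, 2.9), (3.0, 4.2)],
    [(1.5, 3.8), (2.5, 5.1), (3.5, 5.9), (4.5, 7.2)],
    [(2.0, 6.0), (3.0, 7.1), (4.0, 7.9)],
]

def mean(values):
    return sum(values) / len(values)

# Treatment-wise means of t and y.
t_io = [mean([t for t, _ in g]) for g in groups]
y_io = [mean([y for _, y in g]) for g in groups]

# Within-treatment sums of squares and products.
E_yy = sum((y - y_io[i]) ** 2 for i, g in enumerate(groups) for t, y in g)
E_tt = sum((t - t_io[i]) ** 2 for i, g in enumerate(groups) for t, y in g)
E_yt = sum((y - y_io[i]) * (t - t_io[i])
           for i, g in enumerate(groups) for t, y in g)

# Common slope and treatment-specific intercepts.
gamma_hat = E_yt / E_tt
beta_hat = [y_io[i] - gamma_hat * t_io[i] for i in range(len(groups))]

# Residual sum of squares: from the fitted values, and from the closed form.
sse_direct = sum((y - beta_hat[i] - gamma_hat * t) ** 2
                 for i, g in enumerate(groups) for t, y in g)
sse_formula = E_yy - E_yt ** 2 / E_tt

print("gamma_hat =", gamma_hat)
print("beta_hat  =", beta_hat)
print("SSE direct =", sse_direct, " SSE formula =", sse_formula)
```

The two residual sums of squares agree up to floating point error, which is a quick check on the closed-form expression for the minimum of $S$.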

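Hence, writing a double hat for the estimators obtained under $H_0$,

$$\hat{\hat{\beta}} = \bar{y}_{oo} - \hat{\hat{\gamma}}\,\bar{t}_{oo}, \qquad \hat{\hat{\gamma}} = \frac{\sum_i\sum_j (y_{ij} - \bar{y}_{oo})(t_{ij} - \bar{t}_{oo})}{\sum_i\sum_j (t_{ij} - \bar{t}_{oo})^2}$$

and

$$\hat{\hat{\mu}}_{ij} = \hat{\hat{\beta}} + \hat{\hat{\gamma}}\, t_{ij}.$$

So

$$\sum_i\sum_j (y_{ij} - \hat{\hat{\mu}}_{ij})^2 = \sum_i\sum_j (y_{ij} - \bar{y}_{oo})^2 - \frac{\left[ \sum_i\sum_j (y_{ij} - \bar{y}_{oo})(t_{ij} - \bar{t}_{oo}) \right]^2}{\sum_i\sum_j (t_{ij} - \bar{t}_{oo})^2}$$

and

$$\hat{\mu}_{ij} - \hat{\hat{\mu}}_{ij} = (\bar{y}_{io} - \bar{y}_{oo}) + \hat{\gamma}(t_{ij} - \bar{t}_{io}) - \hat{\hat{\gamma}}(t_{ij} - \bar{t}_{oo}),$$

where $\hat{\mu}_{ij}$ and $\hat{\gamma}$ are the fitted value and the slope estimate under the whole parametric space.

The likelihood ratio

$$\frac{\max_{w} L(\beta, \gamma, \sigma^2)}{\max_{\Omega} L(\beta, \gamma, \sigma^2)}$$

is a monotone decreasing function of

$$\lambda = \frac{\sum_i\sum_j (\hat{\mu}_{ij} - \hat{\hat{\mu}}_{ij})^2}{\sum_i\sum_j (y_{ij} - \hat{\mu}_{ij})^2},$$

so the likelihood ratio test in this case is based on $\lambda$.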

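Now we use the following theorems:

Theorem 1: Let $Y = (Y_1, Y_2, \ldots, Y_n)'$ follow a multivariate normal distribution $N(\mu, \Sigma)$ with mean vector $\mu$ and positive definite covariance matrix $\Sigma$. Then $Y'AY$ follows a noncentral chi-square distribution with $p$ degrees of freedom and noncentrality parameter $\mu'A\mu$, i.e., $\chi^2(p, \mu'A\mu)$, if and only if $\Sigma A$ is an idempotent matrix of rank $p$.

Theorem 2: Let $Y = (Y_1, Y_2, \ldots, Y_n)'$ follow a multivariate normal distribution $N(\mu, \Sigma)$ with mean vector $\mu$ and positive definite covariance matrix $\Sigma$. Let $Y'A_1Y$ follow $\chi^2(p_1, \mu'A_1\mu)$ and $Y'A_2Y$ follow $\chi^2(p_2, \mu'A_2\mu)$. Then $Y'A_1Y$ and $Y'A_2Y$ are independently distributed if $A_1 \Sigma A_2 = 0$.

Theorem 3: Let $Y = (Y_1, Y_2, \ldots, Y_n)'$ follow a multivariate normal distribution $N(\mu, \sigma^2 I)$ with $\mu = X\beta$. Then the maximum likelihood (or least squares) estimator $L\hat{\beta}$ of an estimable linear parametric function $L\beta$ is distributed independently of $\hat{\sigma}^2$; $L\hat{\beta}$ follows $N\left[ L\beta, \; \sigma^2 L(X'X)^{-1}L' \right]$ and $\dfrac{n\hat{\sigma}^2}{\sigma^2}$ follows $\chi^2(n - p)$, where $\mathrm{rank}(X) = p$.

Using these theorems on the independence of quadratic forms and dividing the numerator and denominator by their respective degrees of freedom, we have

$$F = \frac{n - p - 1}{p - 1} \cdot \frac{\sum_i\sum_j (\hat{\mu}_{ij} - \hat{\hat{\mu}}_{ij})^2}{\sum_i\sum_j (y_{ij} - \hat{\mu}_{ij})^2} \sim F(p - 1, \, n - p - 1) \text{ under } H_0,$$

where $\hat{\mu}_{ij}$ and $\hat{\hat{\mu}}_{ij}$ denote the fitted values under the whole parametric space and under $H_0$, respectively. So reject $H_0$ whenever $F \ge F_{1-\alpha}(p - 1, \, n - p - 1)$ at $\alpha$ level of significance.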

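The terms involved in $\lambda$ can be simplified for computational convenience as follows. Here a single hat denotes an estimator under the whole parametric space and a double hat denotes an estimator under $H_0$.

We can write

$$\sum_i\sum_j (y_{ij} - \hat{\hat{\mu}}_{ij})^2 = \sum_i\sum_j (y_{ij} - \hat{\hat{\beta}} - \hat{\hat{\gamma}}\, t_{ij})^2$$
$$= \sum_i\sum_j \left[ (y_{ij} - \bar{y}_{oo}) - \hat{\hat{\gamma}}(t_{ij} - \bar{t}_{oo}) \right]^2$$
$$= \sum_i\sum_j \left[ (y_{ij} - \bar{y}_{oo}) - \hat{\hat{\gamma}}(t_{ij} - \bar{t}_{oo}) + \hat{\gamma}(t_{ij} - \bar{t}_{io}) - \hat{\gamma}(t_{ij} - \bar{t}_{io}) \right]^2$$
$$= \sum_i\sum_j \left[ (y_{ij} - \bar{y}_{io}) - \hat{\gamma}(t_{ij} - \bar{t}_{io}) \right]^2 + \sum_i\sum_j \left[ (\bar{y}_{io} - \bar{y}_{oo}) + \hat{\gamma}(t_{ij} - \bar{t}_{io}) - \hat{\hat{\gamma}}(t_{ij} - \bar{t}_{oo}) \right]^2$$
$$= \sum_i\sum_j (y_{ij} - \hat{\mu}_{ij})^2 + \sum_i\sum_j (\hat{\mu}_{ij} - \hat{\hat{\mu}}_{ij})^2.$$

For computational convenience,

$$\lambda = \frac{\sum_i\sum_j (\hat{\mu}_{ij} - \hat{\hat{\mu}}_{ij})^2}{\sum_i\sum_j (y_{ij} - \hat{\mu}_{ij})^2} = \frac{\left( T_{yy} - \dfrac{T_{yt}^2}{T_{tt}} \right) - \left( E_{yy} - \dfrac{E_{yt}^2}{E_{tt}} \right)}{E_{yy} - \dfrac{E_{yt}^2}{E_{tt}}},$$

where

$$T_{yy} = \sum_i\sum_j (y_{ij} - \bar{y}_{oo})^2, \quad T_{tt} = \sum_i\sum_j (t_{ij} - \bar{t}_{oo})^2, \quad T_{yt} = \sum_i\sum_j (y_{ij} - \bar{y}_{oo})(t_{ij} - \bar{t}_{oo}),$$
$$E_{yy} = \sum_i\sum_j (y_{ij} - \bar{y}_{io})^2, \quad E_{tt} = \sum_i\sum_j (t_{ij} - \bar{t}_{io})^2, \quad E_{yt} = \sum_i\sum_j (y_{ij} - \bar{y}_{io})(t_{ij} - \bar{t}_{io}).$$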
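The computational form of $\lambda$ can be illustrated with a short numerical sketch. It assumes hypothetical toy data and uses only the Python standard library; the data values and variable names are illustrative assumptions. The multiplier $(n - p - 1)/(p - 1)$ is the ratio of degrees of freedom used to turn $\lambda$ into an F statistic.

```python
# Hypothetical toy data: groups[i] holds (t_ij, y_ij) pairs for treatment i.
groups = [
    [(1.0, 2.1), (2.0, 2.9), (3.0, 4.2)],
    [(1.5, 3.8), (2.5, 5.1), (3.5, 5.9), (4.5, 7.2)],
    [(2.0, 6.0), (3.0, 7.1), (4.0, 7.9)],
]
p = len(groups)
n = sum(len(g) for g in groups)

def mean(values):
    return sum(values) / len(values)

# Overall means and treatment-wise means.
t_oo = mean([t for g in groups for t, _ in g])
y_oo = mean([y for g in groups for _, y in g])
t_io = [mean([t for t, _ in g]) for g in groups]
y_io = [mean([y for _, y in g]) for g in groups]

pairs = [(i, t, y) for i, g in enumerate(groups) for t, y in g]

# Total sums of squares and products (deviations from overall means).
T_yy = sum((y - y_oo) ** 2 for _, t, y in pairs)
T_tt = sum((t - t_oo) ** 2 for _, t, y in pairs)
T_yt = sum((y - y_oo) * (t - t_oo) for _, t, y in pairs)

# Error sums of squares and products (deviations from treatment means).
E_yy = sum((y - y_io[i]) ** 2 for i, t, y in pairs)
E_tt = sum((t - t_io[i]) ** 2 for i, t, y in pairs)
E_yt = sum((y - y_io[i]) * (t - t_io[i]) for i, t, y in pairs)

sse_full = E_yy - E_yt ** 2 / E_tt   # residual SS, whole parametric space
sse_null = T_yy - T_yt ** 2 / T_tt   # residual SS under H0

lam = (sse_null - sse_full) / sse_full
F = (n - p - 1) / (p - 1) * lam

# Cross-check: sum of squared differences between the two sets of fitted values.
gamma_full = E_yt / E_tt
beta_full = [y_io[i] - gamma_full * t_io[i] for i in range(p)]
gamma_null = T_yt / T_tt
beta_null = y_oo - gamma_null * t_oo
ss_between_fits = sum(
    (beta_full[i] + gamma_full * t - beta_null - gamma_null * t) ** 2
    for i, t, y in pairs
)

print("lambda =", lam)
print("F =", F, "with df", (p - 1, n - p - 1))
```

The cross-check quantity `ss_between_fits` should match `sse_null - sse_full` up to floating point error, which mirrors the decomposition derived above.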

The analysis of covariance table for one-way classification is as follows: