Analysis of Covariance
Using categorical and continuous predictor variables

Example
An experiment is set up to look at the effects of watering on oak seedling establishment.
Three levels of watering: (no additional), (three times a week), ( times a week)

Results
No significant effect of watering

[Chart: Mean() by TREAT. Each error bar is constructed using 1 standard error from the mean.]

Least Squares Means Table
Level   Least Sq Mean   Std Error   Mean
        64.             5.668994    64.
        6.1486          5.668994    6.14
        8.1486          5.668994    8.14

Analysis of Variance
Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Model           891.14           445.85        .49       .1118
Error      18   4.851            19.14
C. Total        416.514

Addition of a covariate: proximity to bushes
Perhaps a proxy for grazing pressure

There is a spread of distances for each treatment
[Figure: x-axis is Distance from bush (meters)]

Compare effect of treatments with and without accounting for distance to bushes
[Figure: two panels, x-axis is TREAT]

De-trend Data
Account for effect of covariate
[Figure: Adjusted Mean (SEM)]

Compare models with and without covariate
No covariate: SEM=5 [Figure: Seedlings (SEM)]
With covariate: SEM= [Figure: Adjusted Mean (SEM)]

Formally - ANCOVA
The objective of an analysis of covariance is to compare the treatment means after adjusting for differences among the treatments that are attributable to the covariate.
The analysis joins the regression model with the analysis of variance model.
Combinations of Categorical and Continuous Factors

Calculation of Adjusted Means in ANCOVA
[Figure: Y plotted against X for two groups, marking each group's x mean, y mean and adjusted y mean]
Adjusted Y means are based on the overall X mean, not the x means for each group.
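
The adjustment described above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a single categorical factor, a single covariate and a common (pooled) within-group slope; the function names and the tiny data set are hypothetical and are not the seedling data from the example.

```python
from statistics import mean


def pooled_slope(groups):
    """Pooled within-group slope of y on x.

    groups maps a group label to a list of (x, y) pairs.
    """
    sxy = sxx = 0.0
    for pairs in groups.values():
        xbar = mean(x for x, _ in pairs)
        ybar = mean(y for _, y in pairs)
        sxy += sum((x - xbar) * (y - ybar) for x, y in pairs)
        sxx += sum((x - xbar) ** 2 for x, _ in pairs)
    return sxy / sxx


def adjusted_means(groups):
    """Shift each group's y mean along the pooled slope to the overall x mean."""
    b = pooled_slope(groups)
    grand_x = mean(x for pairs in groups.values() for x, _ in pairs)
    result = {}
    for label, pairs in groups.items():
        xbar = mean(x for x, _ in pairs)
        ybar = mean(y for _, y in pairs)
        result[label] = ybar - b * (xbar - grand_x)
    return result


# Hypothetical data: both groups follow a slope of 2, but g2 was measured at larger x.
data = {
    "g1": [(1, 3), (2, 5), (3, 7)],
    "g2": [(3, 9), (4, 11), (5, 13)],
}
adjusted = adjusted_means(data)
```

In this made-up data the raw group means differ by 6, while the adjusted means differ by 2, because most of the raw difference comes from the groups sitting at different x values.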

Linear model
$$y_{ij} = \mu + \alpha_i + \beta (x_{ij} - \bar{x}) + \varepsilon_{ij}$$
where
$\mu$ = overall mean
$\alpha_i$ = effect of factor A ($\mu_i - \mu$)
$\beta$ = combined regression coefficient representing pooling of regression slopes of Y on X within each group
$\varepsilon_{ij}$ = unexplained variation

Assumptions: Linearity
The relationship between Y and X must be linear or transformed to linear.
If not, the $\beta$ (slope) term is meaningless and will lead to errors in analysis.
[Figure: Y plotted against X for two groups, marking each group's x mean, y mean and adjusted y mean]
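
As a worked companion to the linear model, the pooled slope and the adjusted group means can be written out explicitly. This is a sketch under the usual least-squares fit of the common-slope model; the hats and the bar notation ($\hat{\beta}$, $\bar{x}_i$, $\bar{y}_i$, $\bar{y}_{i,\mathrm{adj}}$) are notation introduced here, not taken from the slides.

```latex
% Pooled within-group slope: cross-products and squares are taken about each group's own means
\hat{\beta} = \frac{\sum_i \sum_j (x_{ij} - \bar{x}_i)(y_{ij} - \bar{y}_i)}
                   {\sum_i \sum_j (x_{ij} - \bar{x}_i)^2}

% Adjusted mean of group i: the group mean moved along the pooled slope to the overall x mean
\bar{y}_{i,\mathrm{adj}} = \bar{y}_i - \hat{\beta}\,(\bar{x}_i - \bar{x})
```

If the relationship between Y and X is not linear, $\hat{\beta}$ does not describe it, and the adjustment in the second line moves the group means by the wrong amount.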

Assumptions: Covariate values similar across groups
The assumption is that the distributions of the covariate are similar across groups.
This ensures that the covariate is independent of group (treatment).
It also allows for a logical assumption of linearity throughout the range of the covariate.
[Figure: Y plotted against X for two groups; annotation: "This could be possible"]

Assumptions: Covariate is fixed (without error)
Same assumption as for regression analysis
Almost never true in biological systems
If the assumptions of homogeneity of variances, similar ranges in covariate values and homogeneity of slopes are met, then there is no obvious increase in Type I or II error.

Assumptions: Homogeneity of Slopes
The assumption is that all slopes are the same.
Allows for pooling of groups to generate a common slope
Allows for logical partitioning of variance
Could groups be compared without this assumption?
[Figure: Y plotted against X for two groups]
Are groups simply comparable? NO!

Testing the Homogeneity of Slopes assumption (for 1 categorical variable and 1 covariate)
The interaction between the covariate and the categorical variable is the test of the homogeneity of slopes (HOS) assumption.
First test the HOS assumption as part of the FULL model.
If slopes are homogeneous (no significant interaction effect), then run the REDUCED model (leave out the interaction between the categorical variable and the covariate).
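
The full-versus-reduced comparison can be sketched as an F statistic built from the residual sums of squares of the two models. This is an illustrative sketch for one categorical variable and one covariate, not the JMP procedure used in the slides: the function names and data are hypothetical, and the residual degrees of freedom of the full model are taken as N - 2p (one intercept and one slope per group). The p-value is left to an F table or a statistics library.

```python
from statistics import mean


def within_sums(pairs):
    """Return (Sxx, Sxy, Syy) computed about the group's own x and y means."""
    xbar = mean(x for x, _ in pairs)
    ybar = mean(y for _, y in pairs)
    sxx = sum((x - xbar) ** 2 for x, _ in pairs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in pairs)
    syy = sum((y - ybar) ** 2 for _, y in pairs)
    return sxx, sxy, syy


def slopes_f_statistic(groups):
    """F statistic for the covariate-by-group interaction (homogeneity of slopes)."""
    sums = [within_sums(pairs) for pairs in groups.values()]
    p = len(sums)
    n = sum(len(pairs) for pairs in groups.values())

    # Full model: a separate slope is fitted within each group.
    sse_full = sum(syy - sxy ** 2 / sxx for sxx, sxy, syy in sums)

    # Reduced model: one pooled slope shared by all groups.
    exx = sum(s[0] for s in sums)
    exy = sum(s[1] for s in sums)
    eyy = sum(s[2] for s in sums)
    sse_reduced = eyy - exy ** 2 / exx

    df_num = p - 1          # df of the interaction term
    df_den = n - 2 * p      # residual df of the full model
    f_ratio = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f_ratio, df_num, df_den


# Hypothetical data: within-group slopes of 1.9 and 1.8, i.e. nearly parallel lines.
data = {
    "g1": [(1, 2), (2, 4), (3, 5), (4, 8)],
    "g2": [(1, 3), (2, 4), (3, 7), (4, 8)],
}
f_ratio, df_num, df_den = slopes_f_statistic(data)
```

A small F ratio here means the separate slopes explain little beyond the pooled slope, which is the situation in which the reduced model is run.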

General scheme for testing ANCOVA
A = Categorical, B = Covariate

Full model
Source   df                 df denominator   F
A        df A = p-1         N/A              N/A
B        df B = 1           N/A              N/A
AB       df AB = (p-1)      df Residual      MS AB / MS Residual

If the interaction between A and B is not significant, indicating homogeneity of slopes, then run the reduced model.

Reduced model
Source   df                 df denominator   F
A        df A = p-1         df Residual      MS A / MS Residual
B        df B = 1           df Residual      MS B / MS Residual

General scheme for testing ANCOVA
A = Categorical, B = Categorical, C = Covariate

Full model
Source   df                      df denominator   F
A        df A = p-1
B        df B = q-1
C        df C = 1
AB       df AB = (p-1)(q-1)
AC       df AC = (p-1)
BC       df BC = (q-1)
ABC      df ABC = (p-1)(q-1)     df Residual      MS ABC / MS Residual    (drop)

If ABC is not significant, then drop it and run the reduced model.

Source   df                      df denominator   F
A        df A = p-1
B        df B = q-1
C        df C = 1
AB       df AB = (p-1)(q-1)
AC       df AC = (p-1)           df Residual      MS AC / MS Residual     (drop AC or BC)
BC       df BC = (q-1)           df Residual      MS BC / MS Residual

If either AC or BC is not significant, then drop it and run the reduced model.
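
The degrees of freedom in the two-factor scheme follow directly from the number of levels of each categorical variable. The helper below is a small illustrative sketch (the function name is hypothetical) that tabulates them for A with p levels, B with q levels and a single covariate C, using the formulas listed in the tables.

```python
def ancova_term_df(p, q):
    """Degrees of freedom for each term of the full A x B x C model.

    p: number of levels of categorical factor A
    q: number of levels of categorical factor B
    C is a single continuous covariate, so it contributes 1 df.
    """
    return {
        "A": p - 1,
        "B": q - 1,
        "C": 1,
        "AB": (p - 1) * (q - 1),
        "AC": p - 1,
        "BC": q - 1,
        "ABC": (p - 1) * (q - 1),
    }


# Example with hypothetical level counts: 3 levels of A and 2 levels of B.
term_df = ancova_term_df(3, 2)
```

Each interaction that involves the covariate (ABC, then AC or BC) is tested against the residual and dropped when it is not significant, so the residual degrees of freedom grow by that term's df at each step.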

If the other interaction involving the covariate is not significant, then drop it and run the reduced model.
A = Categorical, B = Categorical, C = Covariate

Source   df                      df denominator   F
A        df A = p-1
B        df B = q-1
C        df C = 1
AB       df AB = (p-1)(q-1)
AC       df AC = (p-1)           df Residual      MS AC / MS Residual     (drop)

In the end, if all interactions involving the covariate can be dropped, then you are left with the fully reduced model.

Source   df                      df denominator   F
A        df A = p-1              df Residual      MS A / MS Residual
B        df B = q-1              df Residual      MS B / MS Residual
C        df C = 1                df Residual      MS C / MS Residual
AB       df AB = (p-1)(q-1)      df Residual      MS AB / MS Residual

Back to Example
An experiment is set up to look at the effects of watering on oak seedling establishment.
Three levels of watering: (no additional), (three times a week), ( times a week)
ANCOVA: seedlings, water and distance from bushes

There is a spread of distances for each treatment
[Figure: x-axis is Distance from bush (meters)]

Test Full model
Fitted lines, one per treatment level:
Y () = 4.1 + 9.95*X
Y () = 4.1 + 9.8*X
Y () = .9 + 1.9*X

Analysis of Variance (rows: Model, Error, C. Total)
  DF: 5, 15
  Sum of Squares: 1.6, 41.8688, 416.514
  Mean Square: 4.541
  F Ratio: 6.91
  Prob > F: <.1*

Effect Tests (the last row is the * interaction term)
  Nparm: 1
  DF: 1.591
  Sum of Squares: .9864, .644, .9818
  F Ratio: 4.95, 98.969, .18
  Prob > F: .55*, <.1*, .984

The tests of the two main effects in the full model are meaningless.
Homogeneity of slopes assumption met

Run Reduced Model

Analysis of Variance (rows: Model, Error, C. Total)
  DF: 1
  Sum of Squares: 11.8, 414.86, 416.514
  Mean Square: 1.4
  F Ratio: .4
  Prob > F: <.1*

Effect Tests
  Nparm: 1
  DF: 1
  Sum of Squares: 5.186, 8.65, 4.
  F Ratio: 5.68, 115.5599
  Prob > F: .1*, <.1*

Both treatment and distance to bushes are significant.

[Figure: Least Squares Means by TREAT, two panels: ANOVA and ANCOVA]

Test for Linear Trend (do seedling numbers increase linearly with watering regime?)
Contrast specification: -1 1

Test for Linear Trend
[Figure: Least Squares Means by TREAT]

Contrast
SS   NumDF   DenDF   F Ratio   Prob > F
     1       1       11.188    .8*
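
A contrast on the adjusted (least squares) means can be sketched as follows. This is a hedged illustration for the reduced model with one categorical variable and one covariate, not a reproduction of the JMP contrast dialog: the function name, the coefficients (-1, 0, 1) for a linear trend across three ordered levels, and the data are all assumptions made for the example. The variance of the contrast includes an extra term because the pooled slope is itself estimated.

```python
from statistics import mean


def contrast_on_adjusted_means(groups, coeffs):
    """Estimate and F ratio for a contrast among covariate-adjusted group means.

    groups maps a label to a list of (x, y) pairs.
    coeffs maps the same labels to contrast coefficients that sum to zero.
    """
    exx = exy = eyy = 0.0
    summary = {}
    for label, pairs in groups.items():
        xbar = mean(x for x, _ in pairs)
        ybar = mean(y for _, y in pairs)
        exx += sum((x - xbar) ** 2 for x, _ in pairs)
        exy += sum((x - xbar) * (y - ybar) for x, y in pairs)
        eyy += sum((y - ybar) ** 2 for _, y in pairs)
        summary[label] = (len(pairs), xbar, ybar)

    n_total = sum(n for n, _, _ in summary.values())
    slope = exy / exx                               # pooled within-group slope
    df_den = n_total - len(groups) - 1              # residual df of the reduced model
    mse = (eyy - exy ** 2 / exx) / df_den

    # The overall x mean cancels out of the contrast because the coefficients sum to zero.
    estimate = sum(coeffs[g] * (ybar - slope * xbar)
                   for g, (_, xbar, ybar) in summary.items())
    cx = sum(coeffs[g] * xbar for g, (_, xbar, _) in summary.items())
    var_factor = sum(coeffs[g] ** 2 / n for g, (n, _, _) in summary.items()) + cx ** 2 / exx
    f_ratio = estimate ** 2 / (mse * var_factor)
    return estimate, f_ratio, 1, df_den


# Hypothetical data for three ordered treatment levels.
data = {
    "low": [(1, 1), (2, 3), (3, 2)],
    "mid": [(1, 3), (2, 3), (3, 6)],
    "high": [(2, 5), (3, 8), (4, 8)],
}
linear = {"low": -1, "mid": 0, "high": 1}
estimate, f_ratio, num_df, den_df = contrast_on_adjusted_means(data, linear)
```

The F ratio has 1 numerator degree of freedom and the residual degrees of freedom of the reduced model in the denominator, matching the NumDF and DenDF columns of a contrast report.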