VIII. ANCOVA

A. Introduction

In most experiments and observational studies, additional information is available on each experimental unit (EU), beyond the factors under direct control or of interest. ANCOVA stands for Analysis of Covariance.
This additional information consists of covariates (continuous or categorical) thought to influence, or be associated with, the outcome. Controlling for these covariates can often increase the precision of the group mean comparisons, primarily by decreasing the... In blocking designs, additional information is already being used. How?
However, blocking on covariates known or likely to be important is sometimes not feasible. There may be too few EUs with a particular covariate value (or range of values) to form a block. In most medical studies, age, weight, gender, health history, and other variables are known to be associated with most health outcomes, but we can't block on all of those covariates unless the sample size is huge.
Sufficient information on the covariate may not be available in advance of the experiment to form blocks.

Example:
  outcome = wrist pain each day for 7 days
  experimental factors = desk type, keyboard type
  covariate =
Covariates are sometimes called control variables or concomitant variables. We will begin with a model that assumes the association of the covariate with the outcome is the same for all treatment groups.
B. One-Way ANOVA with One Covariate

EXAMPLE: Exercise & Oxygen Ventilation

Researchers in exercise physiology want to know whether 12 weeks of an outdoor jogging regimen is better preparation for a cardiovascular health treadmill test than 12 weeks of a step aerobics regimen.
Twelve subjects with a sedentary lifestyle were recruited and had their cardiovascular health tested. Six subjects were randomly assigned to each of the two training regimens. Age was also recorded. Twelve weeks later, cardiovascular health was tested again, and the change from baseline was computed.
Model:

$$Y_{ij} = \mu_i + \beta (X_{ij} - \bar{X}_{..}) + e_{ij}, \qquad i = 1, \ldots, t, \quad j = 1, \ldots, r$$

$$e_{ij} \overset{iid}{\sim} N(0, \sigma_e^2), \qquad \mu_i = \mu + \alpha_i = \text{treatment group means}$$

Note that $\beta$ has no subscripts and that treatment and the covariate $X_{ij}$ do not interact. This is sometimes called the parallel-lines model. Why? This is also called the separable intercepts model. Why?
Tests: Because we are now including a continuous covariate, there are no easy formulas for sums of squares. SSTreatment and SSCovariate are not orthogonal! Tests are constructed using the regression approach. Treatment and covariate are each adjusted for the other.
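The regression approach can be written as a general linear F-test that compares a full model with a reduced model. The sketch below uses $SSE_F$, $SSE_R$ for the error sums of squares and $df_F$, $df_R$ for the error degrees of freedom of the two fits; this notation is introduced here for illustration and does not come from the slides.

```latex
\[
F = \frac{(SSE_R - SSE_F)/(df_R - df_F)}{SSE_F / df_F}
\]

% Treatment adjusted for the covariate, H_0: \mu_1 = \cdots = \mu_t
%   Full:    Y_{ij} = \mu_i + \beta (X_{ij} - \bar{X}_{..}) + e_{ij},   df_F = tr - t - 1
%   Reduced: Y_{ij} = \mu   + \beta (X_{ij} - \bar{X}_{..}) + e_{ij},   df_R = tr - 2
%
% Covariate adjusted for treatment, H_0: \beta = 0
%   Full:    Y_{ij} = \mu_i + \beta (X_{ij} - \bar{X}_{..}) + e_{ij},   df_F = tr - t - 1
%   Reduced: Y_{ij} = \mu_i + e_{ij},                                   df_R = tr - t
```

Each test asks how much the error sum of squares grows when one term is dropped while the other stays in the model, which is what "adjusted for the other" means.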
Estimation: How do we now compute estimated treatment means? We might want to adjust them for the covariate effect. $\bar{Y}_{1.}$ is the estimated outcome in group 1, but group 1 has mean covariate $\bar{X}_{1.}$. $\bar{Y}_{2.}$ is the estimated outcome in group 2, but group 2 has mean covariate $\bar{X}_{2.}$.
To make $\bar{Y}_{1.}$ comparable to $\bar{Y}_{2.}$, we must adjust them so that they equal what we would have observed if both groups had had the same mean covariate value, i.e., adjust to $\bar{X}_{..}$.

Unadjusted: $\hat{\mu}_i = \bar{Y}_{i.}$

Adjusted: $\hat{\mu}_i^{adj} = \bar{Y}_{ia} = \bar{Y}_{i.} - \hat{\beta} (\bar{X}_{i.} - \bar{X}_{..})$
Where does this come from? Start with $E[Y_{ij}] = \mu_i + \beta (X_{ij} - \bar{X}_{..})$.
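One way to fill in the steps, treating the covariate values as fixed, is to average the model over the units in group $i$ and then solve for $\mu_i$. This is a sketch of the argument, not necessarily the derivation given in lecture.

```latex
\begin{align*}
E[Y_{ij}] &= \mu_i + \beta (X_{ij} - \bar{X}_{..}) \\
E[\bar{Y}_{i.}] &= \mu_i + \beta (\bar{X}_{i.} - \bar{X}_{..})
  && \text{(average over } j = 1, \ldots, r \text{)} \\
\mu_i &= E[\bar{Y}_{i.}] - \beta (\bar{X}_{i.} - \bar{X}_{..}) \\
\hat{\mu}_i^{adj} &= \bar{Y}_{i.} - \hat{\beta} (\bar{X}_{i.} - \bar{X}_{..})
  && \text{(plug in } \bar{Y}_{i.} \text{ and } \hat{\beta} \text{)}
\end{align*}
```

Read this way, $\mu_i$ is the expected outcome in group $i$ for a unit whose covariate equals $\bar{X}_{..}$.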
This computation is equivalent to what LSMEANS does in SAS. What is the interpretation of these estimated means?
Note that the adjusted means could be adjusted to any covariate value, not just to the average observed covariate value.
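The adjustment can be checked by hand on a small data set. The sketch below uses made-up numbers (t = 3 groups, r = 3 units each; these are not the exercise study data) and hypothetical helper names `pooled_slope` and `adjusted_means`. It estimates the common slope as the pooled within-group slope, which is the least-squares estimate of the slope in the parallel-lines model, and then adjusts each group mean either to the grand covariate mean or to a chosen value `x0`.

```python
# Hypothetical data for illustration only: covariate x and outcome y by group.
x = {"A": [1, 2, 3], "B": [3, 4, 5], "C": [2, 3, 4]}
y = {"A": [3, 5, 4], "B": [6, 7, 11], "C": [5, 5, 8]}


def mean(values):
    return sum(values) / len(values)


def pooled_slope(x, y):
    """Common slope: pooled within-group Sxy divided by pooled within-group Sxx."""
    sxy = 0.0
    sxx = 0.0
    for g in x:
        xb, yb = mean(x[g]), mean(y[g])
        sxy += sum((xi - xb) * (yi - yb) for xi, yi in zip(x[g], y[g]))
        sxx += sum((xi - xb) ** 2 for xi in x[g])
    return sxy / sxx


def adjusted_means(x, y, x0=None):
    """Adjust each group mean of y to covariate value x0.

    If x0 is None, adjust to the grand mean of the covariate.
    """
    if x0 is None:
        x0 = mean([xi for g in x for xi in x[g]])
    b = pooled_slope(x, y)
    return {g: mean(y[g]) - b * (mean(x[g]) - x0) for g in x}


beta_hat = pooled_slope(x, y)
unadjusted = {g: mean(y[g]) for g in y}
adjusted = adjusted_means(x, y)            # adjusted to the grand mean of x
adjusted_at_2 = adjusted_means(x, y, x0=2)  # adjusted to x = 2 instead

print("slope:", beta_hat)
print("unadjusted:", unadjusted)
print("adjusted to grand mean:", adjusted)
print("adjusted to x0 = 2:", adjusted_at_2)
```

In this toy data set the groups with the larger covariate means also have the larger raw means, so adjustment pulls the group means closer together. Changing `x0` shifts every adjusted mean by the same amount, so differences between groups do not depend on `x0` in the parallel-lines model.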
What is an advantage of using covariate adjusted means? What is a disadvantage of using covariate adjusted means?
For contrasts or confidence intervals,

$$Var[\hat{\mu}_i^{adj}] = MSE \left[ \frac{1}{r} + \frac{(\bar{X}_{i.} - \bar{X}_{..})^2}{\sum_{i=1}^{t} \sum_{j=1}^{r} (X_{ij} - \bar{X}_{i.})^2} \right]$$

Notice that it will change across i values!!
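As a rough numerical check of this variance formula, the sketch below computes it for made-up data (t = 3 groups, r = 3 units each, not the exercise study data). MSE is taken from the parallel-lines model with tr - t - 1 error degrees of freedom. All variable names are illustrative.

```python
import math

# Hypothetical data for illustration only: covariate x and outcome y by group.
x = {"A": [1, 2, 3], "B": [3, 4, 5], "C": [2, 3, 4]}
y = {"A": [3, 5, 4], "B": [6, 7, 11], "C": [5, 5, 8]}


def mean(values):
    return sum(values) / len(values)


groups = list(x)
t = len(groups)
r = len(x[groups[0]])  # equal replication r in every group, as in the model
n = t * r

# Pooled within-group sums of squares and cross-products.
exx = exy = eyy = 0.0
for g in groups:
    xb, yb = mean(x[g]), mean(y[g])
    exx += sum((xi - xb) ** 2 for xi in x[g])
    exy += sum((xi - xb) * (yi - yb) for xi, yi in zip(x[g], y[g]))
    eyy += sum((yi - yb) ** 2 for yi in y[g])

sse = eyy - exy ** 2 / exx  # error SS after fitting t means and one slope
mse = sse / (n - t - 1)
grand_x = mean([xi for g in groups for xi in x[g]])

# Variance and standard error of each adjusted mean.
var_adj = {g: mse * (1 / r + (mean(x[g]) - grand_x) ** 2 / exx) for g in groups}
se_adj = {g: math.sqrt(v) for g, v in var_adj.items()}

print("MSE:", mse)
print("variance of adjusted means:", var_adj)
print("standard errors:", se_adj)
```

Group C has a covariate mean equal to the grand mean, so its adjusted mean carries no extra variance from estimating the slope. Groups A and B sit one unit away from the grand mean and have larger variances.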
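Efficiency: Was using a covariate worth it?

$$RE = \frac{MSE(\text{Reduced model})}{MSE(\text{Full model}) \left( 1 + \dfrac{r \sum_{i=1}^{t} (\bar{X}_{i.} - \bar{X}_{..})^2}{\sum_{i=1}^{t} \sum_{j=1}^{r} (X_{ij} - \bar{X}_{i.})^2} \right)}$$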
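A small sketch of the relative efficiency calculation follows, using made-up data (t = 3 groups, r = 3 units each). It assumes that the reduced model is the one-way ANOVA without the covariate and that the full model is the parallel-lines ANCOVA, and it follows the formula as written in these notes. Some textbook presentations divide the between-group covariate term by t - 1; that variant is not coded here.

```python
# Hypothetical data for illustration only: covariate x and outcome y by group.
x = {"A": [1, 2, 3], "B": [3, 4, 5], "C": [2, 3, 4]}
y = {"A": [3, 5, 4], "B": [6, 7, 11], "C": [5, 5, 8]}


def mean(values):
    return sum(values) / len(values)


groups = list(x)
t = len(groups)
r = len(x[groups[0]])  # equal replication assumed
n = t * r

# Pooled within-group sums of squares and cross-products.
exx = exy = eyy = 0.0
for g in groups:
    xb, yb = mean(x[g]), mean(y[g])
    exx += sum((xi - xb) ** 2 for xi in x[g])
    exy += sum((xi - xb) * (yi - yb) for xi, yi in zip(x[g], y[g]))
    eyy += sum((yi - yb) ** 2 for yi in y[g])

# Assumed reduced model: one-way ANOVA, error df = n - t.
mse_reduced = eyy / (n - t)
# Assumed full model: parallel-lines ANCOVA, error df = n - t - 1.
mse_full = (eyy - exy ** 2 / exx) / (n - t - 1)

# Between-group sum of squares of the covariate: r * sum (Xbar_i. - Xbar_..)^2.
grand_x = mean([xi for g in groups for xi in x[g]])
txx = r * sum((mean(x[g]) - grand_x) ** 2 for g in groups)

re = mse_reduced / (mse_full * (1 + txx / exx))
print("MSE reduced:", mse_reduced)
print("MSE full:", mse_full)
print("relative efficiency:", re)
```

A value of RE above 1 suggests the covariate paid for itself. The penalty term in the denominator grows when the covariate means differ a lot across groups, which is one reason covariate balance matters.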
Diagnostics: As before, plus we need to check whether
(a) treatments affected the covariate,
(b) Y vs. X is really linear,
(c) $\beta$ is the same for all treatments.
(a) Verify how the study was carried out. Is it physically possible that a treatment could affect a covariate value? Was the covariate collected before the treatments were applied? If the covariate value was affected by a treatment, then ANCOVA is not appropriate.
(b) Plot $Y_{ij}$ vs. $X_{ij}$ and look for an approximately linear scatter of points. Also plot residuals vs. $X_{ij}$ and look for no systematic pattern in the scatter of points. If both plots look fine, then the linearity assumption is reasonable. If a linear trend does not seem appropriate, then consider transforming $X_{ij}$ and/or $Y_{ij}$, or consider a more complex ANCOVA model with, e.g., linear and quadratic effects for the covariate.
(c) Plot $Y_{ij}$ vs. $X_{ij}$ and superimpose a loess smooth separately for each treatment group. If they are approximately parallel, then the ANCOVA model is appropriate. If not, then we have an interaction between treatment and covariate.
C. ANCOVA with Interaction

If the covariate effect is not the same for all treatments, then we have a separate-slopes model. This allows us to, in effect, carry out a separate regression for each treatment group.
Model:

$$Y_{ij} = \mu_i + \beta_i (X_{ij} - \bar{X}_{..}) + e_{ij}, \qquad i = 1, \ldots, t, \quad j = 1, \ldots, r$$

$$e_{ij} \overset{iid}{\sim} N(0, \sigma_e^2)$$

This is identical to including a treatment by covariate interaction.

Tests: What is the first test to carry out?
$H_0: \beta_1 = \beta_2 = \cdots = \beta_t$

Full model:
Reduced model:
df Full model =
df Reduced model =

This is sometimes called a test of homogeneity of covariate effects or coefficients.
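One way to carry out this test by hand is a general linear F-test, taking the separate-slopes model as the full model and the parallel-lines model as the reduced model. The sketch below does this for made-up data (t = 3 groups, r = 3 units each). Variable names are illustrative, and the resulting F statistic would be compared with an F distribution on the stated degrees of freedom; no p-value is computed here.

```python
# Hypothetical data for illustration only: covariate x and outcome y by group.
x = {"A": [1, 2, 3], "B": [3, 4, 5], "C": [2, 3, 4]}
y = {"A": [3, 5, 4], "B": [6, 7, 11], "C": [5, 5, 8]}


def mean(values):
    return sum(values) / len(values)


groups = list(x)
t = len(groups)
r = len(x[groups[0]])  # equal replication assumed
n = t * r

slopes = {}
sse_full = 0.0             # separate-slopes model: one regression per group
exx = exy = eyy = 0.0      # pooled within-group quantities for the common slope
for g in groups:
    xb, yb = mean(x[g]), mean(y[g])
    sxx = sum((xi - xb) ** 2 for xi in x[g])
    sxy = sum((xi - xb) * (yi - yb) for xi, yi in zip(x[g], y[g]))
    syy = sum((yi - yb) ** 2 for yi in y[g])
    slopes[g] = sxy / sxx
    sse_full += syy - sxy ** 2 / sxx
    exx += sxx
    exy += sxy
    eyy += syy

sse_reduced = eyy - exy ** 2 / exx  # parallel-lines model: one common slope

df_full = n - 2 * t        # t intercepts + t slopes
df_reduced = n - t - 1     # t intercepts + 1 slope
df_num = df_reduced - df_full  # equals t - 1

f_stat = ((sse_reduced - sse_full) / df_num) / (sse_full / df_full)
print("group slopes:", slopes)
print("F =", f_stat, "on", df_num, "and", df_full, "df")
```

With so few observations the test has very little power, so the toy result should not be read as evidence that the slopes are equal.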
If this is significant, then we proceed directly to mean comparisons among treatment groups at specific values of the covariate.
Coding:
  LSMEANS trt / AT variable list = value list;
  EX: LSMEANS trt / AT age = 25;
  EX: LSMEANS trt / AT MEANS;   (SAS default)
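The idea behind comparing groups at a chosen covariate value can also be sketched outside SAS. The Python example below uses made-up data and hypothetical helper names `fit_line` and `group_means_at`. It fits a separate least-squares line in each group and evaluates each line at a value `x0`. It illustrates the calculation only and does not reproduce SAS output. The `in_range` flag is an addition for illustration that marks whether `x0` lies inside the observed covariate range of a group.

```python
# Hypothetical data for illustration only: covariate x and outcome y by group.
x = {"A": [1, 2, 3], "B": [3, 4, 5], "C": [2, 3, 4]}
y = {"A": [3, 5, 4], "B": [6, 7, 11], "C": [5, 5, 8]}


def mean(values):
    return sum(values) / len(values)


def fit_line(xs, ys):
    """Least-squares line for one group, returned as (slope, xbar, ybar)."""
    xb, yb = mean(xs), mean(ys)
    sxx = sum((xi - xb) ** 2 for xi in xs)
    sxy = sum((xi - xb) * (yi - yb) for xi, yi in zip(xs, ys))
    return sxy / sxx, xb, yb


def group_means_at(x, y, x0):
    """Estimated group means at covariate value x0 under separate slopes."""
    out = {}
    for g in x:
        slope, xb, yb = fit_line(x[g], y[g])
        out[g] = {
            "mean": yb + slope * (x0 - xb),
            # False means x0 is outside this group's observed covariate range.
            "in_range": min(x[g]) <= x0 <= max(x[g]),
        }
    return out


at_3 = group_means_at(x, y, 3)
at_5 = group_means_at(x, y, 5)
print("means at x0 = 3:", at_3)
print("means at x0 = 5:", at_5)
```

Because the slopes differ, the differences between groups change with `x0`, which is why the comparison value has to be stated. At `x0 = 5` the estimates for groups A and C are extrapolations beyond their observed covariate values.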
D. Generalizations

The analysis of any experimental design or observational study with factors can be turned into an ANCOVA by adding adjustment for one or more covariates. All testing should be done with Type III SS or general linear F-tests.
Before any analyses are carried out, careful consideration should be given to whether covariate by factor interactions should be included. By default, LSMEANS adjusts treatment group means to the average observed value of each covariate, unless you specify otherwise with the AT option.
E. Caveats

Is ANCOVA better than blocking on the covariate? If a large number of covariates affect the outcome, then blocking on all of them is not feasible. Blocking requires dividing the covariate values into groups, and each group then forms one block. This grouping discards information about where each unit falls within its group, and the boundaries between groups may be arbitrary.
Is blocking on a covariate better than ANCOVA? Randomization can result in unbalanced covariate values across the treatment groups, especially in small studies. Blocking imposes balance. Blocking is advantageous when the variability due to blocks is large. Blocking maintains the orthogonality of covariate and treatment effects.
In either analysis, it can be tempting to extrapolate beyond the range of observed covariate values or blocking groups. This is never a good idea!!