Analysis of Covariance

In some experiments, the experimental units (subjects) are nonhomogeneous, or there is variation in the experimental conditions that is not due to the treatments. For example, a study is designed to evaluate different methods of teaching reading to 8-year-old children. The response variable is each child's final score after participating in the reading program. However, the children participating in the study will have different reading abilities prior to entering the program. Also, many factors outside the school may influence a child's reading score, such as socioeconomic variables associated with the child's family. The variables that describe the differences in experimental units or experimental conditions are called covariates.

The analysis of covariance is a method by which the influence of the covariates on the treatment means is reduced. This will often result in increased power for tests of hypotheses. In an analysis of covariance, we estimate factor effects over and above the effect of the covariate. Thus, we obtain estimates of the differences among factor level means that would occur if all units had the same value of the covariate. The resulting means are called adjusted treatment means (or least squares means) and are calculated at the mean of the covariate over all observations.

For a clear interpretation of the results in an analysis of covariance, the covariate should be measured before the study; or, if measured during the study, it should not be influenced by the treatments in any way. The following example illustrates a case where the covariate is affected by the treatments.

Example of treatments affecting covariate

A company was conducting a training school for engineers to teach them accounting and budgeting principles. Two teaching methods were used, and engineers were assigned at random to one of the two.
At the end of the program, a score was obtained for each engineer reflecting the amount of learning. The analyst decided to use as a covariate the amount of time devoted to study (which the engineers were required to record). After conducting the analysis of covariance, the analyst found that training method had virtually no effect. The analyst was baffled by this finding until it was pointed out that the amount of study time was also affected by the treatments, and analysis indeed confirmed this. One of the training methods involved computer-assisted learning, which appealed to the engineers, so that they spent more time studying and also learned more. In other words, both the learning score and the amount of study time were influenced by the treatment in this case. Because study time and learning score were highly correlated, adjusting for study time removed most of the treatment effect: the marginal effect of teaching method on amount of learning was small, and the test for treatment effects showed no significant difference between the two teaching methods.

Covariates commonly used with human subjects include prestudy attitudes, age, socioeconomic status, and aptitude.
The hypothesis test of interest in analysis of covariance is:

H0: μ_Adj,1 = μ_Adj,2 = ... = μ_Adj,t
Ha: At least two of the adjusted population means are unequal

Assumptions made for analysis of covariance

1. For individuals with the same value of the covariate (x) and the same value of the categorical predictor variable, the dependent variable has a normal distribution.
2. Homogeneity of variance.
3. Observations are independent.
4. The relationship between the response and the covariate is linear.
5. The slopes of the different treatment regression lines are equal.
6. The treatments do not affect the covariate.

Analysis of Covariance In Minitab

Example

An experiment has been set up to determine the effectiveness of three new ergonomic designs for airplane control panels. Twenty-four pilots have been randomly selected for the experiment and assigned to training in a flight simulator that contains one of the control panels (eight pilots per panel). After completion of training on their respective control panels, the pilots are presented with eight emergency situations in the flight simulator. The emergency situations are presented in random order, and the total time (in seconds) required to make all emergency responses is recorded for each pilot. These data are found in the table below.

The only factor of interest in this experiment is panel configuration. The response variable is reaction time. The table also gives the number of years of experience each pilot has. The latter variable is not controlled by the experimenter, but this uncontrolled variable (or covariate) may influence reaction time. How can the effect of the covariate be accounted for in the analysis of the data?
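The assumptions listed earlier in this section correspond to a model that can be written compactly. The following is a sketch in notation that does not appear in the handout and is assumed here for illustration: i indexes the t treatments, j indexes units within a treatment, x is the covariate, and x̄ is the overall covariate mean.

```latex
y_{ij} = \mu_{\mathrm{Adj},i} + \beta\,(x_{ij} - \bar{x}) + \varepsilon_{ij},
\qquad \varepsilon_{ij} \overset{\text{iid}}{\sim} N(0, \sigma^2),
\qquad i = 1, \dots, t
```

The single slope β expresses assumptions 4 and 5, the common variance σ² expresses assumption 2, and setting x_ij = x̄ shows why μ_Adj,i is interpreted as the treatment mean at the overall mean of the covariate.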
Inputting Data:

Reaction Time   Years Experience   Panel   Indicator 1   Indicator 2
6.7             7.7                1.0     0.0           0.0
6.              7.4                1.0     0.0           0.0
6.0             0.7                1.0     0.0           0.0
5.9             22.                1.0     0.0           0.0
7.              6.                 1.0     0.0           0.0
7.7             4.                 1.0     0.0           0.0
6.0             6.5                1.0     0.0           0.0
6.4             8.8                1.0     0.0           0.0
5.8             .2                 2.0     1.0           0.0
6.5             2.6                2.0     1.0           0.0
6.8             4.                 2.0     1.0           0.0
6.              5.                 2.0     1.0           0.0
6.0             4.7                2.0     1.0           0.0
5.4             .                  2.0     1.0           0.0
5.7             4.6                2.0     1.0           0.0
5.4             8.                 2.0     1.0           0.0
6.0             .8                 3.0     0.0           1.0
6.5             8.2                3.0     0.0           1.0
7.0             7.0                3.0     0.0           1.0
7.0             6.0                3.0     0.0           1.0
7.2             0.9                3.0     0.0           1.0
6.8             .2                 3.0     0.0           1.0
6.6             8.9                3.0     0.0           1.0
7.4             .0                 3.0     0.0           1.0

Minitab Commands for Scatterplot:
GRAPH > SCATTERPLOT > WITH GROUPS > OK > Y-VARIABLE Reaction Time > X-VARIABLE Years of Experience > CATEGORICAL VARIABLE Panel > OK

[Figure: Reaction Time for the Three Panels with Covariate Years of Experience. Scatterplot of Reaction Time (vertical axis) versus Years of Experience (horizontal axis), with points grouped by Panel.]
Results from Performing One-Way Analysis of Variance

Note: These results can be used for comparison purposes later to learn whether any benefits are achieved from performing an analysis of covariance.

One-way ANOVA: Reaction Time versus Panel

Source   DF      SS      MS      F      P
Panel     2   2.790   1.395   4.62  0.022
Error    21   6.346   0.302
Total    23   9.136

H0: μ1 = μ2 = μ3
Ha: At least two of the treatment means are unequal
α = .05   F = 4.62   P = .022

Reject H0 in favor of Ha. The data provide sufficient evidence to conclude that at least two of the treatment means are unequal.

Results from Performing an Analysis of Covariance

MODEL Years of Experience Panel > COVARIATE Years of Experience > OK > OK

Analysis of Variance for Reaction Time, using Adjusted SS for Tests

Source                DF   Seq SS   Adj SS   Adj MS       F      P
Panel                  2   2.7900   3.8574   1.9287   15.88  0.000
Years of Experience    1   3.9175   3.9175   3.9175   32.26  0.000
Error                 20   2.4288   2.4288   0.1214
Total                 23   9.1363

H0: μ_Adj,1 = μ_Adj,2 = μ_Adj,3
Ha: At least two of the adjusted treatment means are unequal
α = .05   F = 15.88   P = .000

Reject H0 in favor of Ha. The data provide sufficient evidence to conclude that at least two of the adjusted treatment means are unequal.

Note: Compared with the one-way analysis of variance, the analysis of covariance has provided stronger evidence against the null hypothesis. The covariate accounts for much of the variation that the one-way analysis left in the error term (error mean square 0.1214 versus 0.302), so the F statistic for Panel rises from 4.62 to 15.88.
How Estimated Adjusted Treatment Means are Calculated

COMMANDS: STAT > REGRESSION > REGRESSION > RESPONSE VARIABLE Reaction Time > PREDICTOR VARIABLES Years of Experience Indicator 1 Indicator 2 > OK

Regression Analysis: Reaction Tim versus Years of Exp, Indicator 1, ...

The regression equation is
Reaction Time = 7.55 - 0.0885 Years of Experience - 0.853 Indicator 1 + 0.030 Indicator 2

Predictor                  Coef   SE Coef       T      P
Constant                  7.545    0.2197   34.35  0.000
Years of Experience    -0.08846   0.01558   -5.68  0.000
Indicator 1             -0.8534    0.1836   -4.65  0.000
Indicator 2              0.0302    0.1806    0.17  0.869

S = 0.348480   R-Sq = 73.4%   R-Sq(adj) = 69.4%

From the output above,
ŷ = 7.545 - .088 YrsExp - .853 Indicator1 + .030 Indicator2

Indicator 1 equals 1 for Control Panel 2 and Indicator 2 equals 1 for Control Panel 3; Control Panel 1 is the baseline, with both indicators equal to 0. The estimated adjusted treatment means can be obtained using the following regression equations:

Control Panel 1: ŷ = 7.545 - .088 YrsExp - .853(0) + .030(0) = 7.545 - .088 YrsExp
Control Panel 2: ŷ = 7.545 - .088 YrsExp - .853(1) + .030(0) = 6.692 - .088 YrsExp
Control Panel 3: ŷ = 7.545 - .088 YrsExp - .853(0) + .030(1) = 7.575 - .088 YrsExp

Estimated adjusted treatment means are calculated at the overall mean for YrsExp, which is 9.42. Thus the estimated adjusted treatment mean for Control Panel 1 is

Control Panel 1: ŷ = 7.545 - .088 YrsExp = 7.545 - .088(9.42) ≈ 6.7

Pairwise Comparisons Using Bonferroni Procedure

MODEL Years of Experience Panel > COVARIATE Years of Experience > OK > COMPARISONS > TERMS Panel > Bonferroni > OK > OK
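The adjusted-mean arithmetic described above can be repeated for all three panels with a short script. This is only a sketch in Python, which the handout itself does not use. It takes the three panel intercepts (7.545, 6.692, 7.575), the common slope (-.088), and the overall mean of YrsExp (9.42) from the text; the variable and function names are illustrative assumptions.

```python
# Rounded values quoted in the text; names are illustrative, not Minitab output.
PANEL_INTERCEPTS = {1: 7.545, 2: 6.692, 3: 7.575}  # intercept of each panel's line
COMMON_SLOPE = -0.088                               # shared slope for YrsExp
MEAN_YRS_EXP = 9.42                                 # overall mean of the covariate


def adjusted_mean(panel: int, yrs_exp: float = MEAN_YRS_EXP) -> float:
    """Predicted reaction time for a panel at the given covariate value."""
    return PANEL_INTERCEPTS[panel] + COMMON_SLOPE * yrs_exp


# Evaluate every panel's regression line at the overall covariate mean.
adjusted = {panel: adjusted_mean(panel) for panel in PANEL_INTERCEPTS}

for panel, mean in adjusted.items():
    print(f"Control Panel {panel}: adjusted mean = {mean:.1f}")
```

To one decimal place this gives 6.7, 5.9, and 6.7 for Control Panels 1, 2, and 3. Because the slope is common to all panels, the difference between any two adjusted means equals the difference between the corresponding intercepts, and these differences are what the pairwise comparisons estimate.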
Grouping Information Using Bonferroni Method and 95.0% Confidence

Panel   N   Mean   Grouping
3       8    6.7   A
1       8    6.7   A
2       8    5.9   B

Means that do not share a letter are significantly different.

Bonferroni 95.0% Simultaneous Confidence Intervals
Response Variable Reaction Time
All Pairwise Comparisons among Levels of Panel

Panel = 1 subtracted from:

Panel    Lower    Center     Upper
2       -1.333   -0.8534   -0.3738
3       -0.442    0.0302    0.5020

Panel = 2 subtracted from:

Panel    Lower    Center     Upper
3       0.4276    0.8836     1.340

Assessing Reasonableness of Normality and Equal Variance Assumption

MODEL Years of Experience Panel > COVARIATE Years of Experience > GRAPHS > Histogram of Residuals, Normal Plot of Residuals, Residuals versus Fits > OK > OK

Assessing Reasonableness of Normality Assumption

[Figures: Normal Probability Plot of the residuals (response is Reaction Time; Percent versus Residual) and Histogram of the residuals (response is Reaction Time; Frequency versus Residual).]

From the above plots, the normality assumption is reasonable.
Assessing Reasonableness of Equal Variance Assumption

[Figure: Residuals Versus Fits (response is Reaction Time; Residual versus Fitted Value).]

From the above plot, the equal variance assumption is reasonable.

Assessing Reasonableness of Equal Slopes Assumption

MODEL Years of Experience Panel Years of Experience*Panel > COVARIATE Years of Experience > OK > OK

Source                      DF   Seq SS   Adj SS   Adj MS       F      P
Panel                        2   2.7900   0.8226   0.4113    3.05  0.072
Years of Experience          1   3.9175   3.0918   3.0918   22.93  0.000
Panel*Years of Experience    2   0.0015   0.0015   0.0008    0.01  0.994
Error                       18   2.4272   2.4272   0.1348
Total                       23   9.1363

H0: Panel and Years of Experience Do Not Interact (the slopes of the different treatment regression lines are equal)
Ha: Panel and Years of Experience Do Interact
α = .05   F = 0.01   p-value = .994

Fail to reject H0. The assumption of equal slopes for the different treatment regression lines is reasonable.
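The F statistic for the interaction term can also be read as a comparison of two fits: the equal-slopes model (error SS 2.4288 on 20 degrees of freedom) and the separate-slopes model (error SS 2.4272 on 18 degrees of freedom). The sketch below, written in Python rather than Minitab and using illustrative variable names, redoes that arithmetic from the error sums of squares reported in the two analyses.

```python
# Error sums of squares and degrees of freedom reported by the two Minitab fits.
sse_equal_slopes, df_equal_slopes = 2.4288, 20        # model without the interaction
sse_separate_slopes, df_separate_slopes = 2.4272, 18  # model with Panel*Years of Experience

# Extra variation explained by letting each panel have its own slope.
extra_ss = sse_equal_slopes - sse_separate_slopes
extra_df = df_equal_slopes - df_separate_slopes

# Error mean square of the larger (separate-slopes) model.
mse_separate = sse_separate_slopes / df_separate_slopes

# Partial F statistic for the equal-slopes hypothesis.
f_stat = (extra_ss / extra_df) / mse_separate

print(f"F = {f_stat:.2f} on {extra_df} and {df_separate_slopes} df")
```

The result rounds to F = 0.01 on 2 and 18 degrees of freedom, in line with the interaction test above: allowing separate slopes removes almost no additional error variation.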