Data Set 8: Laysan Finch Beak Widths

Statistical Setting

This handout describes an analysis of covariance (ANCOVA) involving one categorical independent variable (with only two levels) and one quantitative covariate.

Background and Data

The data are from Sheila Conant and Marie Morin's study of Laysan Finch beak morphology, described in handouts in ZOOL 631. In this handout the beak widths of adult female birds will be compared between two island populations, controlling for overall size. The data are for adult female finches captured and measured in 1987. There were 62 birds on one island and 10 on the other. The two variables used in the following analysis are beak width (the width of the upper mandible, in cm) and, as a measure of body size, sternum length (in cm).

Preliminary Data Exploration

Scatter Plot

The most useful preliminary description of the data is a scatter plot of the response variable (beak width) plotted against the covariate (sternum length), with the different levels of the categorical variable (island) shown with different symbols. This plot shows that there is a slight relationship between beak width and sternum length, and that birds from the island with the smaller sample (n = 10) tend both to have wider beaks and to have longer sterna.

[Figure: scatter plot of beak width (cm) against sternum length (cm), by island]

Descriptive Statistics

These statistics support the impressions drawn from the scatter plot: both beak widths and sternum lengths tend to be greater in birds from the island with the smaller sample (n = 10), and there is at least some relationship between beak width and sternum length. The standard deviations of both variables are quite similar between islands.

Beak width (cm):

           N   MEAN      MEDIAN    STDEV     MIN     MAX       Q1        Q3
          62   0.76031   …150      0.02264   …200    0.81600   …550      0.772…
          10   0.79950   0.79350   0.02401   …600    0.83600   0.77950   …3
  both    72   0.76575   …600      0.02645   …200    0.83600   0.75225   0.777…

Sternum length (cm):

           N   MEAN     MEDIAN   STDEV    MIN      MAX      Q1       Q3
          62   1.6699   1.6715   0.0614   …00      1.8120   1.6415   1.70…
          10   1.7045   1.7080   0.1041   1.5760   1.8410   1.5870
  both    72   1.6747   1.6735   0.0690   …00      1.8410   1.6400   1.70…

Correlations between beak width and sternum length, by island: 0.228, 0.569

Box Plots

[Figure: box plots of beak width and sternum length, by island]

These boxplots again show the greater beak widths and (to a lesser extent) sternum lengths of the birds from the smaller sample. They also show that all four distributions are fairly symmetrical, with no major outliers. (A number of observations in the larger (n = 62) population exceed Minitab's 1.5 × IQR rule for identifying possible outliers, but none really seem exceptional, given the fairly large size of this sample.)

Analysis of Covariance

For the following analyses the covariate, sternum length, was centered by subtracting off the overall mean value (1.6747).

Test of Parallelism

Prior to conducting the ANCOVA it is necessary to determine whether it is reasonable to fit parallel regressions to the two samples. This is done by testing for an island × sternum interaction, in a general linear model:

Source            DF   Seq SS      Adj SS      Adj MS      F       P
island             1   0.0132278   0.0097071   0.0097071   19.91   0.000
sternum            1   0.0031429   0.0031669   0.0031669    6.49   0.013
island*sternum     1   0.0001529   0.0001529   0.0001529    0.31   0.577
Error             68   0.0331598   0.0331598   0.0004876
Total             71   0.0496835

This model fits separate regressions to the two groups, as in the following scatterplot. The slopes of these regressions are not very different, and the analysis above indicates that this difference is not at all significant statistically (F = 0.31, P = 0.577).

[Figure: scatter plot of beak width against sternum length, with separate fitted regression lines for each island]

There is no evidence from this analysis that the parallelism assumption is unreasonable. We therefore can proceed with the ANCOVA.
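The F ratios in the parallelism test can be re-derived from the sums of squares in the table. The short Python sketch below does only that arithmetic; the variable names are illustrative, and the numbers are copied from the Minitab output above rather than recomputed from the raw data (which are not reproduced in this handout). It also checks that pooling the interaction sum of squares into the error term gives the error sum of squares (0.0333127) reported for the model without the interaction.

```python
# Adjusted sums of squares from the general linear model with interaction.
# Every term has 1 degree of freedom, so Adj MS equals Adj SS.
adj_ss = {
    "island": 0.0097071,
    "sternum": 0.0031669,
    "island*sternum": 0.0001529,
}
error_ss, error_df = 0.0331598, 68

# Error mean square: the denominator of every F ratio in the table.
mse = error_ss / error_df

# F ratio for each term = (term mean square) / (error mean square).
f_stats = {term: ss / mse for term, ss in adj_ss.items()}
for term, f in f_stats.items():
    print(f"{term:15s} F = {f:.2f}")

# Dropping the interaction moves its sum of squares (and its 1 df)
# into the error term of the parallel-lines model.
reduced_error_ss = error_ss + adj_ss["island*sternum"]
reduced_error_df = error_df + 1
print(f"parallel-lines model: error SS = {reduced_error_ss:.7f}, df = {reduced_error_df}")
```

The interaction F of about 0.31 on 1 and 68 degrees of freedom is the quantity that the P-value of 0.577 refers to.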

ANCOVA

The ANCOVA can be conducted as a general linear model as above, but without the interaction term:

Source     DF   Seq SS      Adj SS      Adj MS      F       P
island      1   0.0132278   0.0107006   0.0107006   22.16   0.000
sternum     1   0.0031429   0.0031429   0.0031429    6.51   0.013
Error      69   0.0333127   0.0333127   0.0004828
Total      71   0.0496835

These results show that there is a highly significant difference between beak widths on the two islands, after adjusting for sternum length. The relationship of beak width to sternum length also is statistically significant (P = 0.013), though this is not really interesting to us (what matters is the effect on the ANOVA conclusions of including the covariate).

Analysis of Island Effect

Estimation of Adjusted Means

To determine the magnitude of the (adjusted) difference between the islands we need the parameter estimates. (Notice that the t-value for the island coefficient is, apart from its sign, the square root of the F for the island effect: these are equivalent tests.)

Term       Coef        SE Coef    T        P
Constant    0.778679   0.003775   206.30   0.000
island     -0.017901   0.003802    -4.71   0.000
sternum     0.09793    0.03838      2.55   0.013

Minitab's GLM uses an indicator-variable coding for the island variable in which the island with the larger sample (n = 62) is +1 and the island with the smaller sample (n = 10) is -1. Using the preceding coefficients, and writing the relationships in terms of the uncentered sternum length x_i (so that the intercept is 0.778679 - 0.09793 × 1.6747 = 0.61468), the fitted regression relationships are:

Island coded +1 (n = 62): Ŷ_i = (0.61468 - 0.017901) + 0.09793 x_i = 0.596779 + 0.09793 x_i
Island coded -1 (n = 10): Ŷ_i = (0.61468 + 0.017901) + 0.09793 x_i = 0.632581 + 0.09793 x_i

With this coding, as these equations show, the coefficient for the island variable is half the difference between the islands. Thus mean beak widths, for a given sternum length, are 2 × 0.017901 = 0.035802 cm larger on the island with the smaller sample.

The adjusted means for the islands (means at x_i = x̄) are the LS means, since the covariate was centered for these analyses. They are:

Least Squares Means for beak width

island             Mean     SE Mean
+1 (n = 62)        0.7608   0.002797
-1 (n = 10)        0.7966   0.007042
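The fitted lines and adjusted means follow directly from the three coefficients and the +1/-1 coding. The Python sketch below reproduces that arithmetic; the function and variable names are illustrative only, and the coefficients are copied from the Minitab output above. It assumes, as stated earlier, that the covariate was centered at the overall mean sternum length of 1.6747 cm.

```python
# Coefficients from the ANCOVA parameter estimates (covariate centered).
CONSTANT = 0.778679
ISLAND_COEF = -0.017901      # multiplies the +1 / -1 island code
STERNUM_SLOPE = 0.09793
X_BAR = 1.6747               # overall mean sternum length (cm)


def predicted_beak_width(code, sternum_cm):
    """Fitted beak width (cm) for an island code (+1 or -1) and sternum length."""
    return CONSTANT + ISLAND_COEF * code + STERNUM_SLOPE * (sternum_cm - X_BAR)


# Intercepts of the two parallel lines in terms of uncentered sternum length.
intercepts = {code: predicted_beak_width(code, 0.0) for code in (+1, -1)}

# Adjusted (least squares) means: fitted values at the overall mean sternum length.
adjusted_means = {code: predicted_beak_width(code, X_BAR) for code in (+1, -1)}

# The gap between the lines is twice the island coefficient at every sternum length.
adjusted_difference = adjusted_means[-1] - adjusted_means[+1]

for code in (+1, -1):
    print(f"code {code:+d}: intercept = {intercepts[code]:.6f}, "
          f"adjusted mean = {adjusted_means[code]:.4f}")
print(f"adjusted difference = {adjusted_difference:.6f} cm")
```

The intercepts differ from the handout's 0.596779 and 0.632581 only in the sixth decimal place, because the handout rounds the common intercept to 0.61468 before adding and subtracting the island coefficient.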

To obtain a confidence interval for the difference in adjusted means, note that this difference is twice the coefficient for island, given above. Using the standard error of this coefficient (0.003802), a 99% CI for the difference would be

2τ̂₁ ± t(0.995, 69) × 2 se(τ̂₁) = 2(-0.017901) ± (2.649 × 2 × 0.003802)
                               = -0.035802 ± 0.020143
                               = (-0.055945, -0.015659)

The interval is negative because the difference is taken as the island coded +1 (n = 62) minus the island coded -1 (n = 10).

Aptness of Covariance Model

It was noted in the preliminary exploration that the variances are similar in the two populations and there are no severe outliers. The very non-significant test for a difference in regression slopes justifies the parallelism assumption. The remaining assumptions to be considered are normality and linearity.

Normality

[Figure: normal probability plot, normal score against residual]

This plot is very slightly sigmoid rather than perfectly straight: the tails of the distribution are slightly longer than those of a normal distribution. The correlation between residuals and normal scores, however, is 0.995, which is more than good enough given the sample size.

Linearity

Linearity is best assessed by the usual plot of residuals vs. fitted values, with different levels of the categorical variable indicated by different symbols. This plot (below) shows the few large positive and negative residuals, but there is no clear indication of nonlinearity in either population.

[Figure: residuals plotted against fitted values, by island]

Conclusion

Birds on the island with the smaller sample (n = 10) have wider beaks than do birds on the island with the larger sample (n = 62). Since the birds in the smaller sample also are larger (as measured by sternum length), and larger birds tend to have wider beaks, some of the difference in beak widths between the islands could be attributed to the difference in overall size. The analysis of covariance, however, shows that there is a statistically significant difference in beak widths even after adjusting for the difference in sternum lengths (P < 0.001); the 99% confidence interval for the adjusted difference in mean beak widths is (0.0156, 0.0559) (in cm; island with n = 10 minus island with n = 62). The analysis also indicates that the relationship between sternum length and beak width is roughly linear, is statistically significant (P = 0.013), and does not differ significantly between the two populations (P = 0.577).

These results are shown in the plot below, with the fitted regression relationships superimposed on the data, and the adjusted mean scores shown by the (green) diamond and (blue) triangle at x = 1.675.
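The confidence interval quoted in the conclusion can be re-derived from the island coefficient and its standard error. The Python sketch below repeats that calculation; the names are illustrative, and the critical value 2.649 for t(0.995, 69) is taken from the handout's own calculation rather than computed here. It also checks the remark that the squared t-value for the island coefficient matches the F for the island effect.

```python
# Island coefficient and its standard error from the ANCOVA parameter estimates.
tau_hat = -0.017901
se_tau = 0.003802
t_crit = 2.649               # t(0.995, 69), value quoted in the handout

# The adjusted difference (island coded +1 minus island coded -1) is 2 * tau_hat,
# so its standard error is 2 * se_tau.
difference = 2 * tau_hat
half_width = t_crit * 2 * se_tau
ci = (difference - half_width, difference + half_width)
print(f"difference = {difference:.6f}, half-width = {half_width:.6f}")
print(f"99% CI = ({ci[0]:.6f}, {ci[1]:.6f})")

# Reversing the direction of the difference flips the signs and the order.
ci_reversed = (-ci[1], -ci[0])
print(f"99% CI, reversed direction = ({ci_reversed[0]:.6f}, {ci_reversed[1]:.6f})")

# Equivalence of the t test for the coefficient and the F test for the effect.
t_stat = tau_hat / se_tau
print(f"t = {t_stat:.2f}, t squared = {t_stat ** 2:.2f}")
```

The squared t-value comes out near 22.17, against the tabled F of 22.16; the small gap reflects rounding in the printed coefficient and standard error.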

[Figure: scatter plot of beak width against sternum length, with the fitted parallel regression lines and the adjusted means]

Comparison with ANOVA

To understand the effect (if any) of including the covariate in this analysis, it is interesting to examine the results of a simple ANOVA (or equivalently, a two-sample t-test with pooled standard error).

Source    DF   Seq SS     Adj SS     Adj MS     F       P
island     1   0.013228   0.013228   0.013228   25.40   0.000
Error     70   0.036456   0.036456   0.000521
Total     71   0.049684

Term       Coeff       Stdev      t-value   P
Constant    0.779903   0.003888   200.57    0.000
island     -0.019597   0.003888    -5.04    0.000

The unadjusted difference in mean beak widths is somewhat (about 9%) larger than the adjusted difference: since birds on the island with the smaller sample had larger sternum lengths as well as beak widths, and the two variables were positively related, the adjustment to a standard sternum length reduced the difference between islands. On the other hand, removing the proportion of within-island variability explainable by sternum length increased the precision of the analysis: the ANCOVA MSE was smaller than that for the ANOVA (0.0004828 vs. 0.000521, a 7% decrease). As a result, the standard error of the adjusted difference is slightly (about 2%) smaller than that of the unadjusted difference. The net result of these somewhat offsetting differences is that the island effect is more significant in the ANOVA (the F is larger), though it is still highly significant in the ANCOVA.
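The percentages in this comparison can be checked from the two sets of Minitab output. The Python sketch below is only that arithmetic, with illustrative variable names and with all inputs copied from the ANOVA and ANCOVA tables in this handout.

```python
# Island coefficient, its standard error, and the error mean square
# from the simple ANOVA and from the ANCOVA.
anova = {"coef": -0.019597, "se": 0.003888, "mse": 0.000521}
ancova = {"coef": -0.017901, "se": 0.003802, "mse": 0.0004828}

# The difference between islands is twice the coefficient in both models,
# so ratios of coefficients equal ratios of differences.
pct_larger_unadjusted = 100 * (anova["coef"] / ancova["coef"] - 1)

# Gain in precision from removing variability explained by sternum length.
pct_mse_decrease = 100 * (1 - ancova["mse"] / anova["mse"])
pct_se_decrease = 100 * (1 - ancova["se"] / anova["se"])

# Squared t-values correspond to the F ratios for the island effect.
f_anova = (anova["coef"] / anova["se"]) ** 2
f_ancova = (ancova["coef"] / ancova["se"]) ** 2

print(f"unadjusted difference larger by {pct_larger_unadjusted:.1f}%")
print(f"MSE smaller by {pct_mse_decrease:.1f}%")
print(f"standard error smaller by {pct_se_decrease:.1f}%")
print(f"F (ANOVA) = {f_anova:.2f}, F (ANCOVA) = {f_ancova:.2f}")
```

The shrinkage in the estimated difference (about 9%) outweighs the gain in precision (about 2% in the standard error), which is why the F ratio is smaller in the ANCOVA.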