Unit 8: A Mixed Two-Factor Design

Minitab Notes for STAT 6305, Dept. of Statistics, CSU East Bay

8.1. The Data

We use data quoted in Brownlee: Statistical Theory and Methodology in Science and Engineering, 2nd ed., page 502, on the rupture strength of concrete beams. Six beams were cast from each of five batches of concrete. Of the six beams from each batch, two were treated with each of three different "capping" formulas (a process intended to add strength). The relevant measurement was the breaking strength of each beam in psi (lbs. per square inch). The results are shown below.

Breaking Strength (psi) of Concrete Beams
(Results for 3 capping formulas with each of 5 batches of concrete)

                                  Batch (B)
Formula (A)       1          2          3          4          5
     1         613 631    656 637    648 638    637 637    602 585
     2         591 591    618 613    575 608    614 591    545 534
     3         583 609    641 617    641 634    625 639    597 566

Source: George Werner, "The effect of capping material on the compressive strength of concrete cylinders," Proceedings of the American Society for Testing and Materials, 58 (1958), 1166-1186. (Somewhat abridged here.)

Problems:

8.1.1. For your convenience the breaking strength measurements in the table above have been repeated below, reading across each row of the table. Put them into c1 of a Minitab worksheet; label it PSI. Then use the patterned data feature to make subscript columns c2 and c3 for A and B, respectively. What "pattern codes" can be used to generate these columns?

    613 631 656 637 648 638 637 637 602 585
    591 591 618 613 575 608 614 591 545 534
    583 609 641 617 641 634 625 639 597 566

Finally, look in the worksheet or print it out to verify that its rows contain the information shown in the worksheet listing that follows these problems.

8.1.2. Use Minitab to make a table similar to the one shown above. Also make a table that shows the means for each cell, row, and column.

8.1.3. Make box plots or dot plots of these data broken out by Formula. Even though these plots do not account for the fact that the data come from five different batches of concrete, do they suggest that one capping formula may result in beams that are significantly stronger (or weaker) than the others? Is there any evidence of outliers or skewness among the observations for any of the Formulas?

8.1.4. Based on graphical displays similar to those in the previous problem, does it seem that there is significant variability among the batches?
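As a cross-check on the worksheet layout described in Problem 8.1.1, the following is a minimal sketch in plain Python (not Minitab; the variable names are illustrative assumptions, not part of the course materials). It builds the PSI column together with the two subscript columns and then computes cell, row, column, and grand means. It does not show the Minitab pattern codes themselves.

```python
# PSI values read across each row of the data table
# (Formula 1 first, then Formula 2, then Formula 3).
psi = [613, 631, 656, 637, 648, 638, 637, 637, 602, 585,
       591, 591, 618, 613, 575, 608, 614, 591, 545, 534,
       583, 609, 641, 617, 641, 634, 625, 639, 597, 566]

a_levels, b_levels, n_reps = 3, 5, 2

# Subscript columns: Formula changes slowest, Batch next, replicate fastest.
formula = [i for i in range(1, a_levels + 1)
           for _ in range(b_levels * n_reps)]
batch = [j for _ in range(a_levels)
         for j in range(1, b_levels + 1)
         for _ in range(n_reps)]

# One (PSI, A, B) triple per worksheet row.
worksheet = list(zip(psi, formula, batch))


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


cell_means = {(i, j): mean([y for y, a, b in worksheet if a == i and b == j])
              for i in range(1, a_levels + 1)
              for j in range(1, b_levels + 1)}
formula_means = {i: mean([y for y, a, _ in worksheet if a == i])
                 for i in range(1, a_levels + 1)}
batch_means = {j: mean([y for y, _, b in worksheet if b == j])
               for j in range(1, b_levels + 1)}
grand_mean = mean(psi)

for row_number, (y, a, b) in enumerate(worksheet, start=1):
    print(row_number, y, a, b)
print("Formula means:", formula_means)
print("Batch means:", batch_means)
print("Grand mean:", round(grand_mean, 2))
```

The printed rows can be compared line by line with the worksheet listing in these notes.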

       c1  c2  c3           c1  c2  c3           c1  c2  c3
ROW   PSI   A   B    ROW   PSI   A   B    ROW   PSI   A   B
  1   613   1   1     11   591   2   1     21   583   3   1
  2   631   1   1     12   591   2   1     22   609   3   1
  3   656   1   2     13   618   2   2     23   641   3   2
  4   637   1   2     14   613   2   2     24   617   3   2
  5   648   1   3     15   575   2   3     25   641   3   3
  6   638   1   3     16   608   2   3     26   634   3   3
  7   637   1   4     17   614   2   4     27   625   3   4
  8   637   1   4     18   591   2   4     28   639   3   4
  9   602   1   5     19   545   2   5     29   597   3   5
 10   585   1   5     20   534   2   5     30   566   3   5

8.2. The Model

In this experiment there are two factors that may influence the strength of a particular concrete beam: the capping Formula used to enhance its strength and the Batch of concrete from which it was made.

We consider Formula to be a fixed effect. Only these three capping formulas are currently under study. We would like to know if they differ significantly as to the strength they impart to the beams to which they are applied. If so, we would like to establish the pattern of significant differences. In particular, we would like to know if one capping formula is better than the others.

We consider Batch to be a random effect. The five batches are of interest only because variability among them may contribute to variability in strength of the beams. We would like to know if the variability among batches is noticeable over and above the inevitable variability in strength from beam to beam, even ones made in the same way from the same batch.

In addition, there might be interaction between these two factors. (As a purely speculative scenario, some batches may be mixed in such a way that voids from air bubbles show at the surface of beams cast from them. Unfilled, these voids may weaken a beam. One capping formula may not be as good as the others overall, but it may be particularly effective in filling such voids.) We are able to test for interaction here because there is more than one observation per cell, so that interaction and error effects are not confounded.
Formally, this is a two-way mixed-model analysis of variance (mixed because one effect is fixed and the other is random). There are a = 3 levels of the fixed factor, b = 5 levels of the random factor, and n = 2 replications within each cell. The model is:

    $Y_{ijk} = \mu + \alpha_i + B_j + (\alpha b)_{ij} + e_{ijk}$, for i = 1, 2, 3, j = 1, ..., 5, and k = 1, 2.

The parameters $\alpha_i$ correspond to the fixed effects of the 3 levels of Formula, the random variables $B_j$ correspond to the random effects of the 5 levels of Batch, and the random variables $(\alpha b)_{ij}$ correspond to 15 random "non-additive" adjustments, one for each cell. (Non-additive means that for a given k, the value of $Y_{ijk}$ cannot be expressed as the sum of mean and effects $\mu + \alpha_i + B_j$.)

The distributions of the random variables in this model are as follows:

    $B_j$ are identically distributed according to $N(0, \sigma_B^2)$,
    $(\alpha b)_{ij}$ are identically distributed according to $N(0, \sigma_{\alpha b}^2)$, and
    $e_{ijk}$ are identically distributed according to $N(0, \sigma^2)$.

Moreover, in the unrestricted version of the model all of these random variables are mutually independent.

In our ANOVA we shall perform tests of three null hypotheses: no Formula effect [$H_0: \sum_i \alpha_i^2 = 0$], no Batch effect [$H_0: \sigma_B^2 = 0$], and no Interaction [$H_0: \sigma_{\alpha b}^2 = 0$]. For each test the alternative is that the relevant sum of squared parameters or variance component is positive.

About restricted and unrestricted models. Some authors place restrictions on the parameters $\alpha_i$ and the random variables $(\alpha b)_{ij}$ and some authors do not. The restriction $\sum_i \alpha_i = 0$ is a matter of notational and computational convenience, but it does not affect the way the data are analyzed.

Another possible restriction is that $\sum_i (\alpha b)_{ij} = 0$ for each value of j, where each sum is taken over the full range of the subscript i. If this restriction is made, then $\mathrm{Cov}[(\alpha b)_{ij}, (\alpha b)_{i'j}] = -\sigma_{\alpha b}^2/(a - 1)$ for $i \ne i'$. (The sum has variance 0, so $a\sigma_{\alpha b}^2 + a(a - 1)\,\mathrm{Cov} = 0$.) Thus, for a given j and different levels i, the random variables $(\alpha b)_{ij}$ cannot be independent, because they are required to sum to 0 over the index i.

Some authors do not put restrictions on the random variables $(\alpha b)_{ij}$ that represent the interaction effect. One speaks of the distinction between restricted ANOVA models and unrestricted ones. The distinction can make a difference in how data are analyzed in some cases. This is because the decision to restrict or not makes a difference in how EMSs are computed, and thus in how F-ratios are formed.

Although there are specific experimental situations in which it seems clear that these restrictions on interaction terms may be appropriate, the tendency seems to be to use unrestricted models as a general practice. In O/L 5e all computations of mixed two-factor models seem to have used the restricted model, even though this was not explicitly stated in specifying the models.
In Minitab, a subcommand is required if restrictions are to be made. It is difficult to use restricted models in SAS. It is simpler to derive EMSs for restricted models than for unrestricted ones.

In O/L 6e it is explicitly stated that unrestricted models are used. (This required large-scale revisions in the text, a few of which do not seem to me to have been made correctly. I will try to point out the errors that matter for purposes of this course.) In Minitab, unrestricted models are the default (requiring no subcommand).

Problems:

8.2.1. Look at the interaction plot that has Batches on the horizontal axis, so that there are three broken-line traces representing the three capping formulas. (The plot follows this problem.) If you had to guess just from looking at this plot, would you say that there is significant interaction? If interaction were significant, would it be disorderly with respect to the Formula effect?

[Figure: Interaction Plot (data means) for PSI, with factors A (levels 1, 2, 3) and B (levels 1, 2, 3, 4, 5); the vertical axis shows mean PSI on a scale from 550 to 650.]

8.2.2. (a) Because we are interested in the randomly chosen batches only as a possible source of variability, we will not be interested in determining specific patterns of differences among the batches, even if the Batch effect turns out to be significant. But use the profile plot with Formulas on the horizontal axis to speculate whether some of our randomly chosen Batches happened to produce stronger (weaker) beams than others. (b) If the Formula effect turns out to be significant, is there a formula that seems better (or worse) than the others?

8.2.3. Suppose that you were given the 15 cell means rather than the 30 individual observations. Write the model you would use to analyze the results.

8.3. Analysis and Interpretation

Here we use Minitab to make the ANOVA table and to interpret the results of this mixed two-factor model. The model for this ANOVA is more complex than for those that have gone before. It is very important to include every key feature of the formal model (Section 8.2) in specifying the model in Minitab. Notice the following requirements for using Minitab correctly:

You must include the interaction term in specifying the model on the command line. There are two ways to do this: either include the term A * B in the model or put a vertical bar between the designations of the two main effects: A | B.

Minitab assumes that effects are fixed unless declared as random. You must declare Batch as a random effect, either using the subcommand random Batch (random B in our worksheet, where the Batch subscripts are in the column named B) or, if using menus, by putting Batch in the "text box" for random factors.

A proper interpretation of the resulting ANOVA table is based on an understanding of the expected mean squares (EMS table). Request the EMS table either by including the subcommand ems or by marking the check box for EMS under Results.

As always, we will want to look at a normal probability plot of the residuals. This requires the subcommand residuals cx, where cx denotes an empty worksheet column. Alternatively, in the Balanced ANOVA menu, check the box for residuals under Storage, in which case Minitab selects an empty column automatically. (In menus, you can also get a normal probability plot of the residuals by marking the appropriate check box under Graphs, but this plot will not have confidence bands.)

STAT > ANOVA > Balanced, Declare B random, Results to show EMS, Storage of residuals [or Graph normal probability plot of residuals].

    MTB > anova PSI = A | B;
    SUBC> random B;
    SUBC> ems;
    SUBC> resid c4.

    ANOVA: PSI versus A, B

    Factor  Type    Levels  Values
    A       fixed        3  1, 2, 3
    B       random       5  1, 2, 3, 4, 5

    Analysis of Variance for PSI

    Source  DF       SS      MS      F      P
    A        2   8487.5  4243.7  24.39  0.000
    B        4  13983.8  3495.9  20.09  0.000
    A*B      8   1392.2   174.0   0.99  0.484
    Error   15   2648.0   176.5
    Total   29  26511.5

    S = 13.2866   R-Sq = 90.01%   R-Sq(adj) = 80.69%

                                     Expected Mean Square
                Variance   Error     for Each Term (using
       Source   component  term      unrestricted model)
    1  A                   3         (4) + 2 (3) + Q[1]
    2  B         553.654   3         (4) + 2 (3) + 6 (2)
    3  A*B        -1.254   4         (4) + 2 (3)
    4  Error     176.533             (4)

In all of the models we have seen to date, MS(Error) has been used in the denominator of the F-statistic for each test of significance. The situation is now somewhat more complex. The following table interprets Minitab's EMS table in terms of our symbols above, where we define $\theta_\alpha = nb\left[\sum_i \alpha_i^2\right]/(a - 1)$ and where the column "Error Term" indicates which MS is used in the denominator of the F-statistic for the corresponding row.
Two-Way Mixed Model (Unrestricted)

Source                     Error Term     Expected Mean Square
Formula (A Fixed)          Interaction    $\sigma^2 + 2\sigma_{\alpha b}^2 + \theta_\alpha$
Batch (B Random)           Interaction    $\sigma^2 + 2\sigma_{\alpha b}^2 + 6\sigma_B^2$
Interaction (A*B Mixed)    Error          $\sigma^2 + 2\sigma_{\alpha b}^2$
Error                                     $\sigma^2$

We begin our interpretation of the ANOVA table by noticing that the Interaction effect is not significant, P = 0.484. Thus we can make straightforward interpretations of the main effects.
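The sums of squares, the F-ratios, and the variance components for the unrestricted mixed model can be reproduced from the 30 observations. The following is a minimal sketch in plain Python rather than Minitab; the variable names are our own, and the choice of error terms is taken from the "Error term" column of Minitab's EMS table, not derived by the code.

```python
# Breaking strengths (psi), read across the rows of the data table:
# Formula 1 (Batches 1-5, two beams each), then Formula 2, then Formula 3.
psi = [613, 631, 656, 637, 648, 638, 637, 637, 602, 585,
       591, 591, 618, 613, 575, 608, 614, 591, 545, 534,
       583, 609, 641, 617, 641, 634, 625, 639, 597, 566]
a, b, n = 3, 5, 2  # Formula levels, Batch levels, beams per cell

# cell[i][j] holds the n observations for Formula i+1 and Batch j+1.
cell = [[psi[(i * b + j) * n:(i * b + j + 1) * n] for j in range(b)]
        for i in range(a)]

cf = sum(psi) ** 2 / (a * b * n)  # correction for the mean
a_tot = [sum(sum(c) for c in cell[i]) for i in range(a)]
b_tot = [sum(sum(cell[i][j]) for i in range(a)) for j in range(b)]

ss_a = sum(t ** 2 for t in a_tot) / (b * n) - cf
ss_b = sum(t ** 2 for t in b_tot) / (a * n) - cf
ss_cells = sum(sum(c) ** 2 for row in cell for c in row) / n - cf
ss_ab = ss_cells - ss_a - ss_b
ss_err = sum(y ** 2 for y in psi) - cf - ss_cells

ms_a = ss_a / (a - 1)
ms_b = ss_b / (b - 1)
ms_ab = ss_ab / ((a - 1) * (b - 1))
ms_err = ss_err / (a * b * (n - 1))

# Error terms as in the unrestricted EMS table:
# A and B are tested against A*B; A*B is tested against Error.
f_a, f_b, f_ab = ms_a / ms_ab, ms_b / ms_ab, ms_ab / ms_err

# Variance components obtained by solving the EMS equations.
var_b = (ms_b - ms_ab) / (a * n)
var_ab = (ms_ab - ms_err) / n

print(f"SS: A={ss_a:.1f}  B={ss_b:.1f}  A*B={ss_ab:.1f}  Error={ss_err:.1f}")
print(f"F:  A={f_a:.2f}  B={f_b:.2f}  A*B={f_ab:.2f}")
print(f"Variance components: B={var_b:.3f}  A*B={var_ab:.3f}")
```

The negative estimate for the A*B component arises because MS(A*B) is slightly smaller than MS(Error), which agrees with the finding that Interaction is not significant.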

Both of the main effects are very highly significant (with very small P-values).

A final interpretation as to which capping formula is best or worst awaits an analysis of multiple comparisons (see Problem 8.3.2).

As for the Batch, we now know that batch-to-batch variation is a significant source of variability in the strengths of the beams. Of course there is also variability within each cell of the data table because no two beams will ever be exactly alike and because of measurement errors. But the component of variance due to Batches significantly stands out above this within-cell variability. There is no point in determining the exact pattern of differences among the Batches because we will never see these particular batches again. It may, however, be worth pondering whether something can be done to decrease this component of variance in order to get a more uniform product.

The important thing to notice about the tests of the main effects is that the F-ratio for testing the Formula effect is MS(A)/MS(A*B) and the F-ratio for testing the Batch effect is MS(B)/MS(A*B). That is, the interaction mean square is used as the denominator for both F-ratios.

The rationale for using MS(A*B) in the denominator of F(A) is as follows. If there is no Formula effect, then all of the $\alpha_i$ are 0 so that $\theta_\alpha = 0$. In that case, we have EMS(A) = EMS(A*B) = $\sigma^2 + 2\sigma_{\alpha b}^2$, so that MS(A)/MS(A*B) has an F-distribution with 2 and 8 degrees of freedom (the degrees of freedom for A and for A*B in the ANOVA table). Similar arguments based on the EMS table lead us to conclude that the Batch effect is also "tested against Interaction" and that Interaction should be "tested against Error."

Problems:

8.3.1. Look at the normal probability plot of the residuals. What is your interpretation? What is the P-value of the Anderson-Darling test for normality?

8.3.2. Because Interaction is not significant one could remove the interaction term from the ANOVA model before testing the main effects.
(a) What happens to the df for Interaction when you "pool" in this fashion? What is the correct "error term" for testing the main effects in this model?
(b) Use a calculator to perform Tukey's HSD procedure to determine the pattern of differences among Formulas. Use the "pooled" MS(Error) and df(Error) in your computations. Recall that each Formula mean is based on 10 observations. Verify your results using Tukey comparisons in Minitab's general linear models (GLM) procedure.

8.3.3. Perform the ANOVA including the interaction term, but specifying the restricted model. What important difference in the formation of F-ratios is implied by Minitab's EMS table?

Note: For several decades, the difference in error terms (between the restricted and unrestricted model) that you see here has been the cause of controversy and debate among statisticians in the interpretation of mixed-model ANOVAs. The question has been: Which model is correct, restricted or unrestricted? Here are comments on recent discussions:

Some texts suggest that the restricted model is appropriate in some applications and the unrestricted model in others. (An example is the text by Oehlert.)

The texts by Brownlee and by Snedecor and Cochran discuss the restricted model exclusively. O/L 5e also used the restricted model, but O/L 6e uses the unrestricted model. Thus, sensible or not, using the restricted model in Minitab for every ANOVA with any fixed effect will give the same results as in the books by Brownlee, Snedecor / Cochran, and earlier editions of Ott's book.

More advanced books on ANOVA generally prefer the unrestricted model. Some software packages also prefer the unrestricted model: SAS seems to do the restricted model only with special programming, and Minitab uses the unrestricted model by default, doing restricted analyses only if requested by a subcommand.

In practice, situations in which the distinction between restricted and unrestricted models will lead to different practical interpretations seem to be rare. For example, in the current two-way mixed model ANOVA, the importance of the distinction evaporates if interaction is far from significant and the interaction term is removed from the model (as in Problem 8.3.2). Even when the interaction term remains, for the practical interpretation of main effects it may not make much difference whether MS(Error) or MS(Interaction) is put in the denominator of F.

8.3.4. Suppose that the 30 observations given at the beginning of this unit were collected exactly in order reading across the rows of the original data table, with Formula 1 tests being run on Monday, Formula 2 on Tuesday, and Formula 3 on Wednesday. Furthermore, suppose that on each of these days the Day shift ran Batch 1, the Swing shift ran Batches 2, 3, and 4, and the Night shift ran Batch 5. What effect would this knowledge have on the report you would write about your conclusions from the experiment?

8.3.5. Suppose that the beams that produced the values 613 and 656 in the first two cells with Formula 1 had been damaged in handling so that these two values are missing. Further suppose that for each of the cells in Batch 3 a third beam was tested with results: 649, 612, 614 for Formulas 1, 2, 3, respectively. How would you analyze the resulting unbalanced design? Do the missing and extra observations change the overall interpretation of the results?

8.4. Comparison with Fixed and Random Models

In this section we compare the mixed model just analyzed with a model in which both factors are fixed and one in which both factors are random. We do this by "changing the story" behind the data on strengths of concrete beams. These fantasy models are incorrect for the data given, but they are instructive from a computational point of view.

If Both Factors Were Fixed.
First, suppose that we have access to five major suppliers of concrete, and that Factor B represents "brands" of concrete rather than batches from the same supplier. Suppose that Factor A remains capping formulas as in the true story. In this case both factors would be fixed effects.

HYPOTHETICAL FIXED-EFFECTS MODEL

    MTB > anova PSI = A | B;
    SUBC> ems.

    ANOVA: PSI versus A, B

    Factor  Type   Levels  Values
    A       fixed       3  1, 2, 3
    B       fixed       5  1, 2, 3, 4, 5

    Analysis of Variance for PSI

    Source  DF       SS      MS      F      P
    A        2   8487.5  4243.7  24.04  0.000
    B        4  13983.8  3495.9  19.80  0.000
    A*B      8   1392.2   174.0   0.99  0.484
    Error   15   2648.0   176.5
    Total   29  26511.5

    S = 13.2866   R-Sq = 90.01%   R-Sq(adj) = 80.69%

                                     Expected Mean Square
                Variance   Error     for Each Term (using
       Source   component  term      unrestricted model)
    1  A                   4         (4) + Q[1,3]
    2  B                   4         (4) + Q[2,3]
    3  A*B                 4         (4) + Q[3]
    4  Error     176.5               (4)

Notice that MS(Error) is used in the denominator for F(A*B), F(A) and F(B) when both effects are fixed. There is not much point in giving interpretations of the F-ratios here because our phony model was not used to collect the data. A legitimate example of a fixed two-factor ANOVA with interaction was discussed in a prior unit.

If Both Factors Were Random.

Second, pretend that Factor B refers to randomly chosen batches as in the true story. However, imagine that Factor A is also a random factor, perhaps corresponding to randomly chosen batches of a single capping formula. The issue would be how much of the variability in measured strength of the beams is due to random concrete batch variability, how much to random variability in batches of the capping formula, how much to any interaction between these two random effects, and how much to unexplained random error.

To use Minitab for this fanciful scenario, it is necessary to declare both factors as random. We use the default (unrestricted) model here just for consistency with the analyses above. Whether to "restrict" is not an issue unless there is at least one fixed effect; the only result of the restrict subcommand is in the heading of the EMS table. (The residuals are the same as for the previous two models.)

HYPOTHETICAL RANDOM-EFFECTS MODEL

    MTB > anova PSI = A | B;
    SUBC> random A B;
    SUBC> ems.
    ANOVA: PSI versus A, B

    Factor  Type    Levels  Values
    A       random       3  1, 2, 3
    B       random       5  1, 2, 3, 4, 5

    Analysis of Variance for PSI

    Source  DF       SS      MS      F      P
    A        2   8487.5  4243.7  24.39  0.000
    B        4  13983.8  3495.9  20.09  0.000
    A*B      8   1392.2   174.0   0.99  0.484
    Error   15   2648.0   176.5
    Total   29  26511.5

    S = 13.2866   R-Sq = 90.01%   R-Sq(adj) = 80.69%

                                     Expected Mean Square
                Variance   Error     for Each Term (using
       Source   component  term      unrestricted model)
    1  A         406.971   3         (4) + 2 (3) + 10 (1)
    2  B         553.654   3         (4) + 2 (3) + 6 (2)
    3  A*B        -1.254   4         (4) + 2 (3)
    4  Error     176.533             (4)

In this case, the EMS information tells us to use the same F-ratios as in the model where A is fixed and B is random.

EMS Tables in O/L 6e for Fixed, Mixed, and Random Models. Table 17.17 in O/L 6e gives general formulas for EMSs in the two-factor fixed, mixed, and random models. The coefficients of the terms in each EMS depend on the numbers of levels a and b of the two factors and the number n of replications in each cell. These are similar to the EMS tables from Minitab, except that the components for fixed effects ($\theta_A$, $\theta_B$, $\theta_{AB}$ in O/L and Q[ ] in Minitab, as appropriate) are defined differently.
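To see how the choice of error term alone changes the F-ratios across the three models of this section, the following minimal Python sketch (not Minitab; the dictionary names and model labels are illustrative assumptions) starts from the SS and DF columns printed in the ANOVA tables and applies the error terms listed in each EMS table.

```python
# SS and DF columns as printed in the ANOVA tables of this section.
ss = {"A": 8487.5, "B": 13983.8, "A*B": 1392.2, "Error": 2648.0}
df = {"A": 2, "B": 4, "A*B": 8, "Error": 15}
ms = {term: ss[term] / df[term] for term in ss}

# Denominator for each F-ratio, read from the "Error term" column
# of the corresponding Minitab EMS table (unrestricted models).
error_term = {
    "both fixed": {"A": "Error", "B": "Error", "A*B": "Error"},
    "mixed (A fixed, B random)": {"A": "A*B", "B": "A*B", "A*B": "Error"},
    "both random": {"A": "A*B", "B": "A*B", "A*B": "Error"},
}

f_ratios = {
    model: {term: ms[term] / ms[denominator]
            for term, denominator in terms.items()}
    for model, terms in error_term.items()
}

for model, ratios in f_ratios.items():
    summary = "  ".join(f"F({term}) = {value:.2f}"
                        for term, value in ratios.items())
    print(f"{model}: {summary}")
```

Because the mean squares are computed here from the rounded SS values in the printed tables, agreement with Minitab's F column should be expected only to about two decimal places.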

O/L 6e Sect. 17.5 presents an algorithm claimed to produce EMS tables for a wide variety of experimental designs. This section is not required for our course because we do not feel any important insights are gained by memorizing this rather complicated procedure, and because the algorithm programmed into Minitab (and most other serious software packages) can provide the EMS tables we need.

If interaction is not significant and the reduced model (leaving out interaction) is used, then all two-way models (fixed/mixed/random, restricted/unrestricted) have the same F-ratios: both main effects are tested against error.

Problems:

8.4.1. Compare two-way ANOVA tables for fixed, mixed, and random models. Which of the following columns are unchanged, regardless of which factors are declared as random: DF, SS, MS, F?

8.4.2. In Problem 8.3.3 we saw that, for the mixed model, Minitab's F-ratios are different depending on whether the restricted model is declared or whether the default unrestricted model is used. Is there a similar difference in F-ratios for the two-way model with both effects random?

8.4.3. Same as Problem 8.4.2, but consider both effects to be fixed. First, use the unrestricted model and then the restricted model, showing EMS tables in both cases. Comment on similarities and differences. Which EMS table agrees with what is shown in O/L 6e, Table 17.17, p. 1057?

8.4.4. For the data in this unit, the most fundamental interpretation has been the same whether with the true story (mixed model), or with the two imaginary scenarios (both fixed, both random): Interaction is not significant, A-effect highly significant, and B-effect highly significant. Thus, with these data an untrained "statistician" who does not know the difference between fixed and random effects would get the same results for the three F-tests no matter which model he or she happened to choose.
Under what circumstances might the results of these F-tests depend on getting the correct model? (Of course, the interpretation of profile plots, use of multiple comparison procedures for significant main effects, and explanation of the findings would depend on knowing the difference between fixed and random effects in any case.)

8.4.5. Carefully write the models with both factors fixed (restricted) and both factors random (where the issue of restriction does not arise). Show the equation (using Greek letters for parameters and Latin letters for random variables), ranges of subscripts, restrictions (as appropriate), and distributional assumptions.

Minitab Notes for Statistics 6305 by Bruce E. Trumbo, Department of Statistics, CSU East Bay, Hayward CA, 94542, Email: bruce.trumbo@csueastbay.edu. Comments and corrections welcome. Copyright 1991, 2010 by Bruce E. Trumbo. All rights reserved. These notes are intended primarily for use at California State University, East Bay. For other uses, please contact the author. Preparation of earlier versions of this document was partially supported by NSF grant USE-9150433. Modified: 2/10