Unit 8: $2^k$ Factorial Designs, Single or Unequal Replications in Factorial Designs, and Incomplete Block Designs

1 Unit 8: $2^k$ Factorial Designs, Single or Unequal Replications in Factorial Designs, and Incomplete Block Designs

STA 643: Advanced Experimental Design
Derek S. Young

2 Learning Objectives

- Revisit your understanding of the basic factorial design
- Understand the benefits of full factorial experiments
- Become familiar with $2^k$ factorial designs
- Know how to interpret the different effects in $2^k$ factorial designs
- Know how to estimate error variance with only one replication
- Know how to estimate and conduct testing when treatments have unequal replications
- Understand why an experimenter might be faced with an incomplete block design
- Know how to construct an incidence matrix

3 Outline of Topics

1. $2^k$ Factorial Designs
2. Single Replications or Unequal Replications in Factorial Designs
3. Incomplete Block Designs

5 Factorial Designs

We have already discussed (implicitly and explicitly) factorial designs. To reiterate, a (full) factorial design (also called a fully crossed design) is an experiment with two or more factors, each with at least two levels, whose EUs take on all possible combinations of the levels across all such factors.

By far the most common type of factorial design is an experimental study involving $k$ factors, each at two levels. Such designs are called $2^k$ factorial designs.

As we have seen, the factors can be quantitative or qualitative. In our discussion, the factor effects will all be treated as fixed effects.

6 Benefits of $2^k$ Factorial Designs

$2^k$ factorial designs have a number of benefits, including:

1. They require relatively few runs per factor studied.
2. The interpretation of the observations produced by the designs can proceed largely by using common sense, elementary arithmetic, and computer visualizations.
3. When the factors are quantitative, these designs cannot fully explore a wide region of the factor space, but they often indicate a promising direction for further experimentation.
4. Designs can be suitably augmented when a more thorough local exploration is needed, a process called sequential assembly.
5. When cost and feasibility become an issue, $2^k$ factorial designs provide the framework for constructing two-level fractional factorial designs, which we discuss in a later lecture.

7 One-Factor-at-a-Time Method

The one-factor-at-a-time method of experimentation, in which factors are varied one at a time with the remaining factors held constant, was once considered the correct way to conduct experiments. But this method only provides an estimate of the effect of a single factor at selected and fixed conditions of the other factors.

For such an estimate to have more general relevance, it would be necessary to assume that the effect was the same at all the other settings of the remaining factors; that is, the factors would have to affect the response additively.

However, (1) if the factors do act additively, the factorial does the job with much more precision, and (2) if the factors do not act additively, the factorial, unlike the one-factor-at-a-time design, can detect and estimate interactions that measure this nonadditivity.

8 $2^2$ Factorial Design

For a $2^2$ factorial design, let A and B represent the two factors, each measured at a low (-) and high (+) level. Below is the standard way to code the design matrix (considering only main effects), together with a geometric view of this design. Note that the units for each factor will almost certainly be different, so the actual geometric area is not an issue. The point of the graphic is to compactly visualize the low/high levels of the factors under consideration.

Design Matrix (Coded)

Run   A   B
1     -   -
2     +   -
3     -   +
4     +   +

[Figure: square whose corners are runs 1 to 4, with factor A (- to +) on the horizontal axis and factor B (- to +) on the vertical axis.]

9 $2^2$ Factorial Design

Below is the coded design matrix for the full model with the two-way interaction. The interaction column is simply the product of the main effect columns.

Design Matrix (Coded)

Run   A   B   AB
1     -   -   +
2     +   -   -
3     -   +   -
4     +   +   +

10 Deconstructing the $2^2$ Design

                 B = 1         B = 2         Factor A Means
A = 1            $\mu_{11}$    $\mu_{12}$    $\bar{\mu}_{1\cdot} = \mu_{1\cdot}/2$
A = 2            $\mu_{21}$    $\mu_{22}$    $\bar{\mu}_{2\cdot} = \mu_{2\cdot}/2$
Factor B Means   $\bar{\mu}_{\cdot 1} = \mu_{\cdot 1}/2$    $\bar{\mu}_{\cdot 2} = \mu_{\cdot 2}/2$

The effect of a factor is a change in the measured response caused by a change in the level of that factor. We are, again, interested in the three effects: simple, main, and interaction.

The population means for a $2^2$ factorial experiment can be represented with cell means $\mu_{ij}$, where $i = 1, 2$ and $j = 1, 2$ index the levels of factors A and B, respectively. Recall that $\mu_{i\cdot} = \sum_j \mu_{ij}$ and $\mu_{\cdot j} = \sum_i \mu_{ij}$ are the marginal sums at a given level $i$ of A and a given level $j$ of B, respectively. The cell and marginal means are given in the table above.

Note that the grand mean is the average of all the cell means:

$$\bar{\mu}_{\cdot\cdot} = (\mu_{11} + \mu_{12} + \mu_{21} + \mu_{22})/4$$

11 Simple Effects

The simple effects of a factor are contrasts between levels of one factor at a single level of another factor.

The simple effect ($l_1$) of the two levels of factor A on the response at the first level of factor B is calculated from the cell means table by

$$l_1 = \mu_{21} - \mu_{11}$$

Similarly, the simple effect ($l_2$) of the two levels of factor A on the response at the second level of factor B is calculated from the cell means table by

$$l_2 = \mu_{22} - \mu_{12}$$

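12 Main Effects

The main effects of a factor are contrasts between levels of one factor averaged over all levels of another factor.

The main effect ($l_3$) of factor A on the response is the difference between the marginal means for factor A:

$$l_3 = \bar{\mu}_{2\cdot} - \bar{\mu}_{1\cdot}$$

Moreover, this main effect can also be expressed as the average of the two simple effects from the previous slide:

$$l_3 = \bar{\mu}_{2\cdot} - \bar{\mu}_{1\cdot} = \tfrac{1}{2}(\mu_{21} + \mu_{22}) - \tfrac{1}{2}(\mu_{11} + \mu_{12}) = \tfrac{1}{2}(\mu_{21} - \mu_{11}) + \tfrac{1}{2}(\mu_{22} - \mu_{12}) = \tfrac{1}{2}(l_1 + l_2)$$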

13 Interaction Effects

The interaction effect measures differences between the simple effects of one factor at different levels of the other factor.

The interaction effect ($l_4$) measures the interaction between factor A and factor B as they affect the response:

$$l_4 = \tfrac{1}{2}(l_2 - l_1)$$

We can, again, use the treatment means plot to help us identify the importance (or existence) of any interaction.

14 No Interaction

The figure is a treatment means plot showing no interaction in a $2^2$ factorial arrangement. Factor A is on the x-axis, the response is on the y-axis, and the profiles are broken out by factor B.

Again, no interaction is present since the lines are parallel, which means the factors act independently and the separate main effects can be used to interpret the effects of the respective factor. For this particular scenario, we can also break down the simple effects that we defined.

[Figure: "No Interaction" treatment means plot; response against factor A (levels $A_1$, $A_2$), with parallel profiles for $B_1$ and $B_2$.]

15 Simple Effects (A on B)

On the treatment means plot, we have two vertical dashed lines. The bottom vertical green dashed line is the simple effect of A at the first level of B; i.e., contrast $l_1$. The top vertical green dashed line is the simple effect of A at the second level of B; i.e., contrast $l_2$.

In each case, a horizontal green dotted line is included to show that these simple effects are being measured with respect to a horizontal configuration of the treatment profile.

[Figure: "Simple Effects of A at B Levels" treatment means plot; response against factor A (levels $A_1$, $A_2$), with profiles for $B_1$ and $B_2$.]

16 Simple Effects (B on A)

On the treatment means plot, we have two vertical dashed lines. The left vertical green dashed line is the simple effect of B at the first level of A. The right vertical green dashed line is the simple effect of B at the second level of A.

Though we constructed the earlier contrasts for the simple effect of A at each level of B, you can work through the same logic to specify the contrasts for the simple effect of B at each level of A, as shown on the plot.

[Figure: "Simple Effects of B at A Levels" treatment means plot; response against factor A (levels $A_1$, $A_2$), with profiles for $B_1$ and $B_2$.]

17 Interaction (Magnitude)

The figure is a treatment means plot showing an interaction due to differences in the magnitude of the responses. In such a scenario the treatment profiles do not cross; even so, we cannot interpret the effects of each factor independently.

[Figure: "Interaction (Magnitude)" treatment means plot; response against factor A (levels $A_1$, $A_2$), with non-parallel, non-crossing profiles for $B_1$ and $B_2$.]

18 Interaction (Direction)

The figure is a treatment means plot showing an interaction due to differences in the direction of the responses. In such a scenario the treatment profiles do cross, which is a stronger indication of an interaction effect than differences in the magnitude of the responses alone.

[Figure: "Interaction (Direction)" treatment means plot; response against factor A (levels $A_1$, $A_2$), with crossing profiles for $B_1$ and $B_2$.]

19 Example: Asphalt Study

A researcher wants to study the bond strength of asphalt mix. Two factors are of interest: the aggregate type used in the asphalt (factor A) and the compaction method (factor B). A $2^2$ factorial treatment design is used to assess whether the two factors act independently on the strength of the test specimens.

The two levels of aggregate are silicious and basalt, while the two levels of compaction method are static and kneading. The response is tensile strength (psi).

The data are reported in the table below. Note that the value in the lower-right corner of the table is the grand mean $\bar{\mu}_{\cdot\cdot}$.

                 B: Static    B: Kneading    Factor A Means
A: Silicious
A: Basalt
Factor B Means

20 Example: Asphalt Study

The simple effect of aggregate type on tensile strength with static compaction is

$$l_1 = \mu_{21} - \mu_{11} = -3,$$

which means that the average tensile strength of silicious rock specimens was greater than that for basalt specimens by 3 psi when using static compaction.

The simple effect of aggregate type on tensile strength with kneading compaction is

$$l_2 = \mu_{22} - \mu_{12} = 37,$$

which means that the average tensile strength of basalt specimens was greater than that for silicious rock specimens by 37 psi when using kneading compaction.

The main effect of aggregate type on tensile strength is

$$l_3 = \bar{\mu}_{2\cdot} - \bar{\mu}_{1\cdot} = 17,$$

which means the difference in tensile strength between basalt and silicious rock specimens is 17 psi in favor of the basalt when averaged over both compaction methods.

The interaction effect of aggregate type and compaction method on tensile strength is

$$l_4 = (l_2 - l_1)/2 = (37 - (-3))/2 = 20,$$

which means the difference between basalt and silicious rock was 20 psi greater with kneading compaction than it was with static compaction.
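The four contrasts can be sketched in a few lines of Python. The cell means below are assumed values, chosen only so that they reproduce the effects reported in this example (-3, 37, 17, and 20 psi); they are not taken from the study table. The function name `effects_2x2` is hypothetical.

```python
def effects_2x2(mu):
    """mu[i][j] is the cell mean at level i of A and level j of B (0-based)."""
    l1 = mu[1][0] - mu[0][0]   # simple effect of A at the first level of B
    l2 = mu[1][1] - mu[0][1]   # simple effect of A at the second level of B
    l3 = (l1 + l2) / 2         # main effect of A: average of the simple effects
    l4 = (l2 - l1) / 2         # AB interaction: half the difference of the simple effects
    return l1, l2, l3, l4

# ASSUMED cell means (psi). Rows: silicious, basalt. Columns: static, kneading.
cell_means = [[68, 60],
              [65, 97]]

l1, l2, l3, l4 = effects_2x2(cell_means)
print(f"l1 = {l1}, l2 = {l2}, l3 = {l3}, l4 = {l4}")
```

Because $l_3$ and $l_4$ are built from $l_1$ and $l_2$, any set of cell means with the same two simple effects gives the same main effect and interaction.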

21 $2^3$ Factorial Design

For a $2^3$ factorial design, let A, B, and C represent the three factors, each measured at a low (-) and high (+) level. Below is the coded design matrix (considering only main effects), together with a geometric view of this design (also called a cube plot).

Design Matrix (Coded)

Run   A   B   C
1     -   -   -
2     +   -   -
3     -   +   -
4     +   +   -
5     -   -   +
6     +   -   +
7     -   +   +
8     +   +   +

[Figure: cube plot whose corners are runs 1 to 8, with one axis each for factors A, B, and C.]

22 $2^3$ Factorial Design

Below is the coded design matrix for the full model with two-way and three-way interactions. Again, the interaction columns are simply the products of the respective main effect columns.

Design Matrix (Coded)

Run   A   B   C   AB   AC   BC   ABC
1     -   -   -   +    +    +    -
2     +   -   -   -    -    +    +
3     -   +   -   -    +    -    +
4     +   +   -   +    -    -    -
5     -   -   +   +    -    -    +
6     +   -   +   -    +    -    -
7     -   +   +   -    -    +    -
8     +   +   +   +    +    +    +
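The product rule for the interaction columns is easy to automate. The following is a minimal Python sketch, not part of the original slides; the helper name `full_factorial` is hypothetical, and levels are coded as -1 and +1.

```python
from itertools import combinations, product
from math import prod

def full_factorial(factors):
    """Return (column names, rows) of a coded 2^k design in standard order."""
    k = len(factors)
    # product() varies its last position fastest, so each tuple is reversed
    # to make the first factor alternate fastest (standard order).
    runs = [levels[::-1] for levels in product((-1, 1), repeat=k)]
    names = list(factors)
    index_sets = [(i,) for i in range(k)]
    for order in range(2, k + 1):
        for subset in combinations(range(k), order):
            names.append("".join(factors[i] for i in subset))
            index_sets.append(subset)
    # Every column, main effect or interaction, is a product of factor columns.
    rows = [[prod(run[i] for i in subset) for subset in index_sets] for run in runs]
    return names, rows

names, rows = full_factorial("ABC")
print("Run " + " ".join(f"{name:>3}" for name in names))
for number, row in enumerate(rows, start=1):
    print(f"{number:>3} " + " ".join(f"{'+' if v > 0 else '-':>3}" for v in row))
```

Calling `full_factorial("AB")` gives the $2^2$ matrix with its single AB column, and longer factor strings extend the same construction to larger $k$.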

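23 Deconstructing the $2^3$ Design

Let $\mu_{ijk}$ denote the cell mean at level $i$ of A, level $j$ of B, and level $k$ of C. The cell means are:

B   C   A = 1         A = 2
1   1   $\mu_{111}$   $\mu_{211}$
2   1   $\mu_{121}$   $\mu_{221}$
1   2   $\mu_{112}$   $\mu_{212}$
2   2   $\mu_{122}$   $\mu_{222}$

The marginal means are:

Factor A:
$$\bar{\mu}_{1\cdot\cdot} = \tfrac{1}{4}(\mu_{111} + \mu_{121} + \mu_{112} + \mu_{122}), \qquad \bar{\mu}_{2\cdot\cdot} = \tfrac{1}{4}(\mu_{211} + \mu_{221} + \mu_{212} + \mu_{222})$$

Factor B:
$$\bar{\mu}_{\cdot 1\cdot} = \tfrac{1}{4}(\mu_{111} + \mu_{211} + \mu_{112} + \mu_{212}), \qquad \bar{\mu}_{\cdot 2\cdot} = \tfrac{1}{4}(\mu_{121} + \mu_{221} + \mu_{122} + \mu_{222})$$

Factor C:
$$\bar{\mu}_{\cdot\cdot 1} = \tfrac{1}{4}(\mu_{111} + \mu_{211} + \mu_{121} + \mu_{221}), \qquad \bar{\mu}_{\cdot\cdot 2} = \tfrac{1}{4}(\mu_{112} + \mu_{212} + \mu_{122} + \mu_{222})$$

For ease of notation on the subsequent slides, let the levels of each factor be written as $A_1$, $A_2$, $B_1$, $B_2$, $C_1$, and $C_2$.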

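24 Simple Effects

B       C       Simple Effect of $A_1$ to $A_2$
$B_1$   $C_1$   $\mu_{211} - \mu_{111}$
$B_2$   $C_1$   $\mu_{221} - \mu_{121}$
$B_1$   $C_2$   $\mu_{212} - \mu_{112}$
$B_2$   $C_2$   $\mu_{222} - \mu_{122}$

A       C       Simple Effect of $B_1$ to $B_2$
$A_1$   $C_1$   $\mu_{121} - \mu_{111}$
$A_2$   $C_1$   $\mu_{221} - \mu_{211}$
$A_1$   $C_2$   $\mu_{122} - \mu_{112}$
$A_2$   $C_2$   $\mu_{222} - \mu_{212}$

A       B       Simple Effect of $C_1$ to $C_2$
$A_1$   $B_1$   $\mu_{112} - \mu_{111}$
$A_2$   $B_1$   $\mu_{212} - \mu_{211}$
$A_1$   $B_2$   $\mu_{122} - \mu_{121}$
$A_2$   $B_2$   $\mu_{222} - \mu_{221}$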

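25 Main Effects and Interactions

Main Effects

Factor A: $\bar{\mu}_{2\cdot\cdot} - \bar{\mu}_{1\cdot\cdot} = \tfrac{1}{4}\mu_{2\cdot\cdot} - \tfrac{1}{4}\mu_{1\cdot\cdot}$
Factor B: $\bar{\mu}_{\cdot 2\cdot} - \bar{\mu}_{\cdot 1\cdot} = \tfrac{1}{4}\mu_{\cdot 2\cdot} - \tfrac{1}{4}\mu_{\cdot 1\cdot}$
Factor C: $\bar{\mu}_{\cdot\cdot 2} - \bar{\mu}_{\cdot\cdot 1} = \tfrac{1}{4}\mu_{\cdot\cdot 2} - \tfrac{1}{4}\mu_{\cdot\cdot 1}$

Two-Factor Interactions

AB: $\tfrac{1}{4}(\mu_{111} + \mu_{221} + \mu_{112} + \mu_{222}) - \tfrac{1}{4}(\mu_{211} + \mu_{121} + \mu_{212} + \mu_{122})$
AC: $\tfrac{1}{4}(\mu_{111} + \mu_{121} + \mu_{212} + \mu_{222}) - \tfrac{1}{4}(\mu_{211} + \mu_{221} + \mu_{112} + \mu_{122})$
BC: $\tfrac{1}{4}(\mu_{111} + \mu_{211} + \mu_{122} + \mu_{222}) - \tfrac{1}{4}(\mu_{121} + \mu_{221} + \mu_{112} + \mu_{212})$

Three-Factor Interaction

ABC: $\tfrac{1}{4}\big((\mu_{222} - \mu_{122}) - (\mu_{212} - \mu_{112})\big) - \tfrac{1}{4}\big((\mu_{221} - \mu_{121}) - (\mu_{211} - \mu_{111})\big)$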

26 Example: Pilot Plant Investigation

An experimenter employed a $2^3$ factorial experimental design with two quantitative factors, temperature (A) and concentration (B), and a single qualitative factor, type of catalyst (C).

- Temperature is measured at 160 °C (-) and 180 °C (+).
- Concentration is measured at 20% (-) and 40% (+).
- Catalyst is measured at type A (-) and type B (+).

The response measured is percentage yield. The data are reported in the table below.

A   B   C   Yield
-   -   -   $\mu_{111} = 60$
+   -   -   $\mu_{211} = 72$
-   +   -   $\mu_{121} = 54$
+   +   -   $\mu_{221} = 68$
-   -   +   $\mu_{112} = 52$
+   -   +   $\mu_{212} = 83$
-   +   +   $\mu_{122} = 45$
+   +   +   $\mu_{222} = 80$

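27 Example: Pilot Plant Investigation

Concentration   Catalyst   Changing from 160 °C to 180 °C
20%             A          $\mu_{211} - \mu_{111} = 72 - 60 = 12$
40%             A          $\mu_{221} - \mu_{121} = 68 - 54 = 14$
20%             B          $\mu_{212} - \mu_{112} = 83 - 52 = 31$
40%             B          $\mu_{222} - \mu_{122} = 80 - 45 = 35$

Temperature   Catalyst   Changing from 20% to 40%
160 °C        A          $\mu_{121} - \mu_{111} = 54 - 60 = -6$
180 °C        A          $\mu_{221} - \mu_{211} = 68 - 72 = -4$
160 °C        B          $\mu_{122} - \mu_{112} = 45 - 52 = -7$
180 °C        B          $\mu_{222} - \mu_{212} = 80 - 83 = -3$

Temperature   Concentration   Changing from catalyst type A to type B
160 °C        20%             $\mu_{112} - \mu_{111} = 52 - 60 = -8$
180 °C        20%             $\mu_{212} - \mu_{211} = 83 - 72 = 11$
160 °C        40%             $\mu_{122} - \mu_{121} = 45 - 54 = -9$
180 °C        40%             $\mu_{222} - \mu_{221} = 80 - 68 = 12$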

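28 Example: Pilot Plant Investigation

Main Effects

Factor A: $\tfrac{1}{4}(72 + 68 + 83 + 80) - \tfrac{1}{4}(60 + 54 + 52 + 45) = \tfrac{303}{4} - \tfrac{211}{4} = 23$
Factor B: $\tfrac{1}{4}(54 + 68 + 45 + 80) - \tfrac{1}{4}(60 + 72 + 52 + 83) = \tfrac{247}{4} - \tfrac{267}{4} = -5$
Factor C: $\tfrac{1}{4}(52 + 83 + 45 + 80) - \tfrac{1}{4}(60 + 72 + 54 + 68) = \tfrac{260}{4} - \tfrac{254}{4} = 1.5$

Two-Factor Interactions

AB: $\tfrac{1}{4}(60 + 68 + 52 + 80) - \tfrac{1}{4}(72 + 54 + 83 + 45) = \tfrac{260}{4} - \tfrac{254}{4} = 1.5$
AC: $\tfrac{1}{4}(60 + 54 + 83 + 80) - \tfrac{1}{4}(72 + 68 + 52 + 45) = \tfrac{277}{4} - \tfrac{237}{4} = 10$
BC: $\tfrac{1}{4}(60 + 72 + 45 + 80) - \tfrac{1}{4}(54 + 68 + 52 + 83) = \tfrac{257}{4} - \tfrac{257}{4} = 0$

Three-Factor Interaction

ABC: $\tfrac{1}{4}\big((80 - 45) - (83 - 52)\big) - \tfrac{1}{4}\big((68 - 54) - (72 - 60)\big) = \tfrac{4}{4} - \tfrac{2}{4} = 0.5$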
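All seven effects can also be obtained from the coded contrast columns: multiply each column by the responses, sum, and divide by $2^{k-1} = 4$. The Python sketch below illustrates this. The yields are an assumption: they are listed in standard order and chosen to be consistent with the simple effects and main effects quoted in this example. The names `runs` and `effect` are hypothetical.

```python
from itertools import combinations
from math import prod

# ASSUMED yields (%), in standard order: A changes fastest, then B, then C.
yields = [60, 72, 54, 68, 52, 83, 45, 80]
factors = "ABC"
k = len(factors)

# Coded levels (-1 or +1) of A, B, C for each run in standard order.
runs = [tuple(1 if (r >> f) & 1 else -1 for f in range(k)) for r in range(2 ** k)]

def effect(subset):
    """Mean response where the contrast column is + minus the mean where it is -."""
    contrast = sum(prod(run[i] for i in subset) * y for run, y in zip(runs, yields))
    return contrast / 2 ** (k - 1)

effects = {
    "".join(factors[i] for i in subset): effect(subset)
    for order in range(1, k + 1)
    for subset in combinations(range(k), order)
}
for name, value in effects.items():
    print(f"{name:>3}: {value:g}")
```

With these assumed yields, the temperature effect (A) and the temperature-by-catalyst interaction (AC) dominate, which matches the pattern in the simple effects: raising the temperature helps far more with catalyst type B than with type A.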

30 Preliminary Comments

We have already discussed the statistical models used when we have two factors or three factors, with and without interactions. In each case, we focused on balanced designs; i.e., the same number of replications occurs for each treatment.

However, sometimes problems with the experiment or the data collection process will result in losing some of the data. Because of the cost, it will almost never be appropriate to simply rerun the experiment, so you want to analyze the good data that you did collect.

We will discuss the statistical models when we no longer have the same number of replications per treatment; i.e., when the design is unbalanced. To transition to this paradigm, we will first revisit, and develop further, some quantities of the two-factor model.

31 Two Treatments (Cell Means Model)

Consider a factorial treatment design with two factors, A and B, measured at $a$ and $b$ levels, respectively. The cell means model, with $n$ replicates, in a CRD is

$$Y_{ijk} = \mu_{ij} + \epsilon_{ijk}, \qquad (1)$$

where

- $i = 1, \ldots, a$, $j = 1, \ldots, b$, and $k = 1, \ldots, n$;
- $\mu_{ij}$ is the mean of the treatment combination of factor A at level $i$ and factor B at level $j$;
- $\epsilon_{ijk}$ is our experimental error, which is normally distributed with mean 0 and variance $\sigma^2$; and
- $N = abn$ is the total number of EUs in the experiment.

32 Least Squares Estimates of Cell Means

The SSE for the cell means model is

$$SSE = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(y_{ijk} - \hat{\mu}_{ij})^2$$

The least squares estimators for $\mu_{ij}$ are the observed cell means of the treatment combinations:

$$\hat{\mu}_{ij} = y_{ij\cdot}/n = \bar{y}_{ij\cdot}$$

The observed marginal means are unbiased estimates of the factor marginal means: $\hat{\bar{\mu}}_{i\cdot} = \bar{y}_{i\cdot\cdot}$ and $\hat{\bar{\mu}}_{\cdot j} = \bar{y}_{\cdot j\cdot}$.

The grand mean $\bar{\mu}_{\cdot\cdot}$ is estimated by the observed grand mean $\bar{y}_{\cdot\cdot\cdot}$.

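33 Additivity and (Fixed) Factor Effects

The cell means $\mu_{ij}$ represent the true response for the treatment combination of level $i$ for factor A and level $j$ for factor B.

In the absence of interaction (i.e., under additivity), the cell mean can be expressed as the sum of a general mean, $\mu$, and the treatment effects for factor A and factor B, $\alpha_i$ and $\beta_j$, respectively. This gives the factor effects model:

$$Y_{ijk} = \mu + \alpha_i + \beta_j + \epsilon_{ijk}$$

The factor effects can be defined as deviations of the respective marginal means from the grand mean, $\mu = \bar{\mu}_{\cdot\cdot}$:

$$\alpha_i = \bar{\mu}_{i\cdot} - \bar{\mu}_{\cdot\cdot} \quad \text{and} \quad \beta_j = \bar{\mu}_{\cdot j} - \bar{\mu}_{\cdot\cdot},$$

so that we have

$$\mu_{ij} = \bar{\mu}_{\cdot\cdot} + (\bar{\mu}_{i\cdot} - \bar{\mu}_{\cdot\cdot}) + (\bar{\mu}_{\cdot j} - \bar{\mu}_{\cdot\cdot}) \;\Longrightarrow\; (\mu_{ij} - \bar{\mu}_{\cdot\cdot}) = (\bar{\mu}_{i\cdot} - \bar{\mu}_{\cdot\cdot}) + (\bar{\mu}_{\cdot j} - \bar{\mu}_{\cdot\cdot})$$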

34 Interactions

In the presence of interaction, the treatment effect will not be equal to the sum of the main effects. The interaction term can be defined as the following difference:

$$(\alpha\beta)_{ij} = (\mu_{ij} - \bar{\mu}_{\cdot\cdot}) - (\bar{\mu}_{i\cdot} - \bar{\mu}_{\cdot\cdot}) - (\bar{\mu}_{\cdot j} - \bar{\mu}_{\cdot\cdot}) = \mu_{ij} - \bar{\mu}_{i\cdot} - \bar{\mu}_{\cdot j} + \bar{\mu}_{\cdot\cdot}$$

Thus, we have our two-factor model with interactions:

$$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk},$$

where

- $\mu$ is the overall mean (a constant);
- $i = 1, \ldots, a$, $j = 1, \ldots, b$, and $k = 1, \ldots, n$;
- $\alpha_i$ is the effect of factor A at level $i$ and is subject to the constraint $\sum_{i=1}^{a}\alpha_i = 0$;
- $\beta_j$ is the effect of factor B at level $j$ and is subject to the constraint $\sum_{j=1}^{b}\beta_j = 0$;
- $(\alpha\beta)_{ij}$ are the interaction effects and are subject to the constraints $\sum_{i=1}^{a}(\alpha\beta)_{ij} = \sum_{j=1}^{b}(\alpha\beta)_{ij} = 0$; and
- $\epsilon_{ijk}$ are the experimental errors, which are iid normal with mean 0 and variance $\sigma^2$.

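35 SS for Factorial Effects

We use the following identity (which we've written in terms of the observed $y$ values) to break out the different effects:

$$(\bar{y}_{ij\cdot} - \bar{y}_{\cdot\cdot\cdot}) = (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})$$

Squaring and summing the above (the sums of cross-products drop out), we have

A main effect: $SSA = nb\sum_{i=1}^{a}(\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2$
B main effect: $SSB = na\sum_{j=1}^{b}(\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2$
AB interaction: $SSAB = n\sum_{i=1}^{a}\sum_{j=1}^{b}(\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})^2$
Treatment: $SSTr = n\sum_{i=1}^{a}\sum_{j=1}^{b}(\bar{y}_{ij\cdot} - \bar{y}_{\cdot\cdot\cdot})^2$,

which gives us SSTr = SSA + SSB + SSAB for the ANOVA table.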
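The partition SSTr = SSA + SSB + SSAB can be checked numerically. The sketch below uses a small hypothetical balanced data set (two levels of A, three levels of B, two replicates per cell); the numbers are made up for illustration only.

```python
from statistics import mean

# HYPOTHETICAL balanced data: y[i][j] is the list of n replicates in cell (i, j).
y = [
    [[10, 12], [14, 15], [20, 19]],
    [[11, 13], [18, 17], [30, 28]],
]
a, b, n = len(y), len(y[0]), len(y[0][0])

cell = [[mean(y[i][j]) for j in range(b)] for i in range(a)]   # cell means
row = [mean(cell[i]) for i in range(a)]                        # factor A marginal means
col = [mean(cell[i][j] for i in range(a)) for j in range(b)]   # factor B marginal means
grand = mean(row)                                              # grand mean (balanced case)

ssa = n * b * sum((row[i] - grand) ** 2 for i in range(a))
ssb = n * a * sum((col[j] - grand) ** 2 for j in range(b))
ssab = n * sum((cell[i][j] - row[i] - col[j] + grand) ** 2
               for i in range(a) for j in range(b))
sstr = n * sum((cell[i][j] - grand) ** 2 for i in range(a) for j in range(b))

print(f"SSA = {ssa}, SSB = {ssb}, SSAB = {ssab}")
print(f"SSTr = {sstr}, SSA + SSB + SSAB = {ssa + ssb + ssab}")
```

The identity relies on every cell having the same $n$; with unequal cell sizes the cross-products no longer vanish, and the three pieces do not add up to SSTr.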

36 Deviations from the Preceding Design

We just reviewed the two-factor effects model with interaction. The basic construction for more than two factors is straightforward (as we have done before with three factors). Most of our discussion has focused on the balanced case with multiple replications.

We now turn to two questions about variations on the factorial model we just presented:

1. What happens when we have only one replication per treatment; i.e., $n = 1$?
2. How do we handle unequal replication of treatments?

The remainder of our discussion will be in the context of two factors, but most of it applies to settings with more than two factors. Note that we do not yet discuss the setting where a particular treatment (or subset of treatments) is unobserved.

37 One Replication and Error Variability

If only one replication per treatment is available, then the experimental error variance cannot be estimated. The SS partitions for the factor main effects and interaction sum to SSTot for the observations, leaving SSE equal to 0.

Under additivity of the factors, the mean square for the interaction (MSAB) can be used as an estimate of experimental error. However, additivity of the main effects (i.e., absence of interaction) is not guaranteed, so some means of evaluating the presence of interaction is required.

38 Quantitative Factors

The additivity of two quantitative factors can be investigated with the interaction components for linear (L) and sometimes quadratic (Q) regression partitions.

For example, the SS for the L × L interaction can be partitioned out of the interaction SS, under the assumption that the remaining SS for deviations from the L × L interaction is experimental error. These SS for deviations from the L × L interaction would include all higher-order polynomial interactions, such as L × Q. The mean square for deviations from the L × L interaction can then be used as the MSE. The number of 1 df interaction terms that are partitioned from the interaction is a matter of judgment based on the number of available df.

The same approach can be used when one factor is quantitative and one factor is qualitative. In this case, the SS for the interaction between the qualitative factor and the L effect of the quantitative factor can be partitioned out of the interaction SS, leaving the remainder as an estimate of experimental error.

39 Two Qualitative Factors - Tukey's Method

If both factors are qualitative, the problem of estimating error variability is a bit more difficult, but there are satisfactory solutions available.

Tukey's method (not to be confused with the Tukey procedure discussed for multiple comparisons) isolates a 1 df SS to test for nonadditivity in a two-factor classification with one replication per cell.

Nonadditivity in the linear model is characterized by $\lambda\alpha_i\beta_j$, where $\lambda$ is a parameter representing nonadditivity. The product of main effects is a multiplicative form of interaction, and if there is nonadditivity from this specific type of interaction between the main effects ($\alpha_i$ and $\beta_j$), then $\lambda \neq 0$.

Under this paradigm, the cell means are a sum of the grand mean, the factor effects, and the product term:

$$\mu_{ij} = \bar{\mu}_{\cdot\cdot} + (\bar{\mu}_{i\cdot} - \bar{\mu}_{\cdot\cdot}) + (\bar{\mu}_{\cdot j} - \bar{\mu}_{\cdot\cdot}) + \lambda(\bar{\mu}_{i\cdot} - \bar{\mu}_{\cdot\cdot})(\bar{\mu}_{\cdot j} - \bar{\mu}_{\cdot\cdot}) \qquad (2)$$

The SS for nonadditivity requires a computation involving the deviations of the A and B means from the grand mean.

40 Tukey's Method Calculation

For Tukey's method, we need the following quantities:

$$P_j = \sum_{i=1}^{a} y_{ij}(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}), \qquad P = \sum_{j=1}^{b} P_j(\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})$$

After estimating the two-factor main effects model with one replication per treatment, we can then partition the SSE into a nonadditivity SS and a residual SS:

$$SSNA = \frac{abP^2}{(SSA)(SSB)}, \qquad SSRes = SSE - SSNA$$

Here $df_{NA} = 1$, which implies that $df_{Res} = df_E - 1$.

The test of nonadditivity has the test statistic $F = MSNA/MSRes$, which follows an $F$-distribution with $df_{NA}$ and $df_{Res}$ df when the factors are additive. The ANOVA table is below.

Source           df           SS      MS
A                $df_A$       SSA     MSA
B                $df_B$       SSB     MSB
Error            $df_E$       SSE     MSE
  Nonadditivity  $df_{NA}$    SSNA    MSNA
  Residual       $df_{Res}$   SSRes   MSRes
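The calculation above translates directly into code. The following Python sketch is a plain transcription of the formulas for $P_j$, $P$, SSNA, and SSRes; the function name `tukey_nonadditivity` and the 3 × 3 data table are hypothetical and serve only as an illustration.

```python
def tukey_nonadditivity(y):
    """Tukey's 1 df partition for a two-way table y[i][j] with one observation per cell.

    Returns (SSNA, SSRes, df_Res).
    """
    a, b = len(y), len(y[0])
    grand = sum(map(sum, y)) / (a * b)
    row_dev = [sum(y[i]) / b - grand for i in range(a)]                       # ybar_i. - ybar..
    col_dev = [sum(y[i][j] for i in range(a)) / a - grand for j in range(b)]  # ybar_.j - ybar..
    ssa = b * sum(d * d for d in row_dev)
    ssb = a * sum(d * d for d in col_dev)
    # SSE of the additive (main effects only) model.
    sse = sum((y[i][j] - grand - row_dev[i] - col_dev[j]) ** 2
              for i in range(a) for j in range(b))
    p_j = [sum(y[i][j] * row_dev[i] for i in range(a)) for j in range(b)]
    p = sum(p_j[j] * col_dev[j] for j in range(b))
    ssna = a * b * p * p / (ssa * ssb)
    return ssna, sse - ssna, (a - 1) * (b - 1) - 1

# HYPOTHETICAL 3 x 3 table, one observation per treatment combination.
table = [[12, 15, 19],
         [14, 18, 25],
         [15, 22, 33]]
ssna, ssres, df_res = tukey_nonadditivity(table)
f_stat = ssna / (ssres / df_res)   # MSNA / MSRes, with df_NA = 1
print(f"SSNA = {ssna:.3f}, SSRes = {ssres:.3f}, F = {f_stat:.2f} on 1 and {df_res} df")
```

As a sanity check on the formula, a table generated exactly as $\mu + \alpha_i + \beta_j + \lambda\alpha_i\beta_j$ puts all of the additive-model SSE into SSNA and leaves SSRes at zero, while an exactly additive table gives SSNA = 0.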

41 Example: Hearing Levels

Below is a dataset where N = 49 men aged 55 to 64 with hearing levels 16 decibels above the audiometric zero were studied. The first factor (A) is sound levels in cycles per second (hertz) and the second factor (B) is occupational category. The study is a 7 × 7 factorial arrangement.

Note that this is taken from Example 6.6 of Kuehl's textbook. However, there are numerical errors in that example, so it is illustrative to redo the analysis as an example.

The values of $P_j$ are given in the last row of the table, while P = . Therefore,

$$SSNA = \frac{abP^2}{(SSA)(SSB)} = \frac{(7)(7)(\quad)^2}{(\quad)(1141.5)} = $$

[Table: hearing levels by sound level (A, rows) and occupational category (B, columns), with margins $\bar{y}_{i\cdot}$, $(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})$, $\bar{y}_{\cdot j}$, $(\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})$, and $P_j$.]

42 Example: Hearing Levels

Analysis of Variance Table

Response: hearing
           Df  Sum Sq  Mean Sq  F value  Pr(>F)
sound                                    < 2.2e-16 ***
occ                                                **
Residuals

Tukey's test of nonadditivity: hearing
P :
Q :

Analysis of Variance Table

Response: residual
               Df  Sum Sq  Mean Sq  F value  Pr(>F)
Nonadditivity                                        *
Residuals
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Above is the output for the ANOVA table and the test of nonadditivity. We see that nonadditivity is significant, meaning that a significant interaction is present.

43 Unequal Replication of Treatments

When missing data occur in properly designed research studies, the design is no longer balanced with a complete dataset. As a result, the standard computing formulae we presented no longer apply.

An unbalanced design in an ANOVA means that the number of EUs is not the same for every treatment.

In a single-factor ANOVA, this is not much of a problem, but it does make constructing orthogonal contrasts difficult since different treatment means must be weighted accordingly.

In a multi-factor ANOVA, unbalancedness complicates things substantially: the main effects and interactions are no longer orthogonal, so the treatment SS no longer splits cleanly into separate SS for each effect.

44 SS for Unbalanced Designs

A problem in an unbalanced design is that the SS for a treatment effect depends on the order in which that term is fit in the model.

In general, the SS for a treatment effect is the difference between the SSE of the model that excludes that effect and the SSE of the model that includes it. In a balanced design, this difference is the same regardless of which other terms are in the reduced model, but this is not the case with unbalanced designs.

Thus, it is important in the analysis of unbalanced designs to have a precise manner of specifying the order in which the models are fit. The order in which we fit the models is why we introduced Type I, II, and III SS.

45 Establish Estimators with the Cell Means Model

In the unbalanced setting, the cell means model can be used to establish the appropriate estimators for population parameters and the hypotheses that we want to test. The cell means model is

$$Y_{ijk} = \mu_{ij} + \epsilon_{ijk},$$

where

- $i = 1, \ldots, a$, $j = 1, \ldots, b$, and $k = 1, \ldots, n_{ij}$;
- $\mu_{ij}$ is the cell mean for level $i$ of factor A and level $j$ of factor B;
- $\epsilon_{ijk}$ are the experimental errors, which are iid normal with mean 0 and variance $\sigma^2$; and
- $N = \sum_{i=1}^{a}\sum_{j=1}^{b} n_{ij}$, where $n_{ij} > 0$ is the number of observations in cell $(ij)$.

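46 Least Squares Estimators

The cell means can be estimated by least squares. The estimators for the cell means are the observed cell means:

$$\hat{\mu}_{ij} = \frac{1}{n_{ij}}\sum_{k=1}^{n_{ij}} y_{ijk} = \bar{y}_{ij\cdot}$$

The experimental error variance estimator is

$$\hat{\sigma}^2 = s^2 = \frac{1}{N - ab}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n_{ij}}(y_{ijk} - \bar{y}_{ij\cdot})^2$$

The unbiased least squares estimators of the marginal means are

$$\hat{\bar{\mu}}_{i\cdot} = \frac{1}{b}\sum_{j=1}^{b}\hat{\mu}_{ij} \quad \text{and} \quad \hat{\bar{\mu}}_{\cdot j} = \frac{1}{a}\sum_{i=1}^{a}\hat{\mu}_{ij},$$

with standard error estimators

$$s_{\hat{\bar{\mu}}_{i\cdot}} = \sqrt{\frac{s^2}{b^2}\sum_{j=1}^{b}\frac{1}{n_{ij}}} \quad \text{and} \quad s_{\hat{\bar{\mu}}_{\cdot j}} = \sqrt{\frac{s^2}{a^2}\sum_{i=1}^{a}\frac{1}{n_{ij}}},$$

respectively.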

47 Observed Marginal Means

The observed marginal means, $\bar{y}_{i\cdot\cdot}$ and $\bar{y}_{\cdot j\cdot}$, do not have the same value as the least squares estimates of the marginal means.

The observed marginal means estimate weighted functions of the population means, where the weights are proportional to the number of replications in the respective cells. The expected values of the observed marginal means are

$$E(\bar{y}_{i\cdot\cdot}) = \frac{1}{n_{i\cdot}}\sum_{j=1}^{b} n_{ij}\mu_{ij} \quad \text{and} \quad E(\bar{y}_{\cdot j\cdot}) = \frac{1}{n_{\cdot j}}\sum_{i=1}^{a} n_{ij}\mu_{ij}$$

If the number of observations in the treatment cells of the study is proportional to the frequency with which those treatment combinations occur in the population, then the observed marginal means provide the appropriate estimators for the population marginal means.

While proportional relationships of observations to population frequencies are common in sample surveys, such proportions are not expected to hold for designed experiments and, thus, the least squares estimates should be used.
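The difference between the two kinds of marginal mean is easiest to see on a small example. The Python sketch below uses hypothetical unbalanced data (2 levels of A, 3 levels of B, cell sizes between 1 and 3) and computes, for each level of A, the least squares (unweighted) marginal mean, its standard error, and the observed (weighted) marginal mean.

```python
from math import sqrt

# HYPOTHETICAL unbalanced data: cells[i][j] holds the observations for
# level i of factor A and level j of factor B.
cells = [
    [[10, 12, 11], [20, 22], [30]],
    [[14, 16], [18, 20, 19], [25, 27, 26]],
]
a, b = len(cells), len(cells[0])
N = sum(len(c) for r in cells for c in r)

# Least squares estimates of the cell means are the observed cell means.
cell_mean = [[sum(c) / len(c) for c in r] for r in cells]

# Pooled within-cell variance on N - ab df.
s2 = sum((obs - cell_mean[i][j]) ** 2
         for i in range(a) for j in range(b) for obs in cells[i][j]) / (N - a * b)

ls_means, std_errors, observed_means = [], [], []
for i in range(a):
    ls_means.append(sum(cell_mean[i]) / b)                      # unweighted average of cell means
    std_errors.append(sqrt(s2 / b ** 2 * sum(1 / len(c) for c in cells[i])))
    observed_means.append(sum(map(sum, cells[i])) / sum(map(len, cells[i])))  # weighted by n_ij

for i in range(a):
    print(f"A level {i + 1}: LS mean = {ls_means[i]:.3f} (SE {std_errors[i]:.3f}), "
          f"observed mean = {observed_means[i]:.3f}")
```

In this made-up data set the first level of A has only one observation in its highest-response cell, so its observed mean is pulled well below its least squares mean, and the two estimators even disagree about which level of A has the larger marginal mean.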

48 Hypothesis Testing

The hypotheses of interest in factorial treatment designs with unequal replication numbers are unchanged from those with equal replication numbers.

The general form of the interaction (i.e., the difference between levels $i$ and $k$ of factor A, compared across levels $j$ and $m$ of factor B) is

$$(\mu_{ij} - \mu_{kj}) - (\mu_{im} - \mu_{km}) = \mu_{ij} - \mu_{kj} - \mu_{im} + \mu_{km}$$

The hypothesis of no interaction can be expressed in terms of the cell means as

$$H_0: \mu_{ij} - \mu_{kj} - \mu_{im} + \mu_{km} = 0 \text{ for all } i, j, k, \text{ and } m$$
$$H_A: \mu_{ij} - \mu_{kj} - \mu_{im} + \mu_{km} \neq 0 \text{ for some } i, j, k, \text{ and } m$$

In the absence of an interaction, the null hypotheses for factor A and factor B, respectively, are

$$H_0: \bar{\mu}_{1\cdot} = \bar{\mu}_{2\cdot} = \cdots = \bar{\mu}_{a\cdot} \quad \text{and} \quad H_0: \bar{\mu}_{\cdot 1} = \bar{\mu}_{\cdot 2} = \cdots = \bar{\mu}_{\cdot b}$$

In each case, the alternative hypothesis is that at least one of the marginal means differs from the rest.

We use Type III SS for analyzing factorial experiments with unequal replication.

49 SS for Interaction from Full and Reduced Models

The full model (with interaction) and reduced model, in terms of factorial effects, are, respectively,

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}$$

and

$$y_{ijk} = \mu + \alpha_i + \beta_j + \epsilon_{ijk}$$

The respective SSEs under these models are

$$SSE_f = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n_{ij}}\big[y_{ijk} - \hat{\mu} - \hat{\alpha}_i - \hat{\beta}_j - \widehat{(\alpha\beta)}_{ij}\big]^2$$

and

$$SSE_r = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n_{ij}}\big[y_{ijk} - \hat{\mu} - \hat{\alpha}_i - \hat{\beta}_j\big]^2$$

Then, the SS for the interaction can be computed as

$$SSAB = SSE_r - SSE_f$$

50 Testing the Interaction

The mean squares for testing the interaction are

$$MSE = \frac{SSE_f}{N - ab} \quad \text{and} \quad MSAB = \frac{SSAB}{(a - 1)(b - 1)}$$

Then, the usual $F$-statistic is

$$F = \frac{MSAB}{MSE},$$

which, under the null hypothesis of no interaction, follows an $F$-distribution with $(a - 1)(b - 1)$ numerator df and $(N - ab)$ denominator df.

It is fairly straightforward to extend the above testing paradigm to three or more factors in an unbalanced setting.
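The full-versus-reduced comparison can be sketched without any statistics library. In the Python example below, the full model's fitted values are the cell means, and the reduced (additive) model is fit by solving the normal equations with indicator (dummy) coding; this coding is an implementation choice for the sketch, not something the slides prescribe. The data and the helper names `solve` and `interaction_ss` are hypothetical.

```python
def solve(A, rhs):
    """Solve the square system A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def interaction_ss(cells):
    """Return (SSAB, SSE_f, df_AB, df_E) for a two-factor layout with n_ij > 0."""
    a, b = len(cells), len(cells[0])
    obs = [(i, j, y) for i in range(a) for j in range(b) for y in cells[i][j]]
    N = len(obs)
    # Full model: fitted values are the cell means.
    means = [[sum(c) / len(c) for c in row] for row in cells]
    sse_f = sum((y - means[i][j]) ** 2 for i, j, y in obs)
    # Reduced model: intercept, (a - 1) indicators for A, (b - 1) indicators for B.
    X = [[1.0] + [float(i == u) for u in range(1, a)] + [float(j == v) for v in range(1, b)]
         for i, j, _ in obs]
    yv = [y for _, _, y in obs]
    p = len(X[0])
    xtx = [[sum(r[u] * r[v] for r in X) for v in range(p)] for u in range(p)]
    xty = [sum(r[u] * y for r, y in zip(X, yv)) for u in range(p)]
    beta = solve(xtx, xty)
    sse_r = sum((y - sum(c * x for c, x in zip(beta, r))) ** 2 for r, y in zip(X, yv))
    return sse_r - sse_f, sse_f, (a - 1) * (b - 1), N - a * b

# HYPOTHETICAL unbalanced data: 2 levels of A, 3 levels of B.
cells = [
    [[10, 12, 11], [20, 22], [30]],
    [[14, 16], [18, 20, 19], [25, 27, 26]],
]
ssab, sse_f, df_ab, df_e = interaction_ss(cells)
f_stat = (ssab / df_ab) / (sse_f / df_e)
print(f"SSAB = {ssab:.3f}, MSE = {sse_f / df_e:.3f}, F = {f_stat:.2f} on {df_ab} and {df_e} df")
```

For the interaction term, which is the last term entered, this difference in SSE is the quantity tested regardless of the type of SS; for balanced data it reduces to the usual closed-form SSAB.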

51 Example: Asphalt Study

Let us return to the experiment regarding tensile strength of asphaltic concrete specimens.

For illustration, suppose that specimens were, again, constructed with one of two types of aggregate: basalt or silicious. This is factor A with two levels. The specimens were now constructed with one of three types of kneading compaction method: regular, low, or very low. This is factor B with three levels.

Three replications were originally made at each treatment level, resulting in 18 EUs being assigned to one of the 6 treatments. However, some of the specimens were damaged prior to or during testing, resulting in an unequal number of specimens being available among the treatments. The new data are presented in the table below.

Aggregate    Kneading Method (Regular | Low | Very Low)    $\bar{y}_{i\cdot\cdot}$
Basalt       106, , 101, 98,
Silicious    107, 110, , 60 40, 41,
$\bar{y}_{\cdot j\cdot}$

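52 Example: Asphalt Study

The estimate of the experimental error variance is the pooled within-cell sum of squares divided by $N - ab = 8$:

$$s^2 = \frac{1}{N - ab}\sum_{i}\sum_{j}\sum_{k}(y_{ijk} - \bar{y}_{ij\cdot})^2 = \frac{89.83}{8} = 11.23$$

The least squares estimates of the marginal means for the basalt and silicious aggregate types are, respectively:

$$\hat{\bar{\mu}}_{1\cdot} = (\hat{\mu}_{11} + \hat{\mu}_{12} + \hat{\mu}_{13})/3 = 86.8$$
$$\hat{\bar{\mu}}_{2\cdot} = (\hat{\mu}_{21} + \hat{\mu}_{22} + \hat{\mu}_{23})/3 = 71.4$$

The above estimates have standard error estimates of:

$$s_{\hat{\bar{\mu}}_{1\cdot}} = \sqrt{\frac{s^2}{b^2}\sum_{j=1}^{3}\frac{1}{n_{1j}}} = 1.51$$
$$s_{\hat{\bar{\mu}}_{2\cdot}} = \sqrt{\frac{s^2}{b^2}\sum_{j=1}^{3}\frac{1}{n_{2j}}} = $$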

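53 Example: Asphalt Study

The least squares estimates of the marginal means for the regular, low, and very low kneading compaction methods are, respectively:

$$\hat{\bar{\mu}}_{\cdot 1} = (\hat{\mu}_{11} + \hat{\mu}_{21})/2 = $$
$$\hat{\bar{\mu}}_{\cdot 2} = (\hat{\mu}_{12} + \hat{\mu}_{22})/2 = 79.4$$
$$\hat{\bar{\mu}}_{\cdot 3} = (\hat{\mu}_{13} + \hat{\mu}_{23})/2 = 48.8$$

The above estimates have standard error estimates of:

$$s_{\hat{\bar{\mu}}_{\cdot j}} = \sqrt{\frac{s^2}{a^2}\sum_{i=1}^{2}\frac{1}{n_{ij}}}, \qquad j = 1, 2, 3$$

$s_{\hat{\bar{\mu}}_{\cdot 1}} = $
$s_{\hat{\bar{\mu}}_{\cdot 2}} = $
$s_{\hat{\bar{\mu}}_{\cdot 3}} = $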

54 Example: Asphalt Study

Anova Table (Type III tests)

Response: yield
                      Sum Sq  Df  F value  Pr(>F)
(Intercept)                                    e-13 ***
aggregate                         63.26    4.551e-05 ***
compaction                                 2.879e-08 ***
aggregate:compaction              42.45    5.497e-05 ***
Residuals
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Above are the Type III SS results for this analysis. Again, ignore the top row that says (Intercept).

The null hypothesis of no interaction between aggregate type and compaction method is:

$$H_0: \mu_{ij} - \mu_{kj} - \mu_{im} + \mu_{km} = 0 \text{ for all } i, j, k, \text{ and } m$$

The $F$-statistic for this test is $F = MSAB/MSE = \ /11.23 = 42.45$, which is compared with an $F$-distribution with 2 and 8 df. The p-value for this test is 5.497e-05, which means that the interaction is statistically significant.

55 Example: Asphalt Study

The tests of hypotheses for equality of the marginal means for A (aggregate type) and B (compaction method) ordinarily are not considered when the interaction is significant, due to the hierarchy principle. But for illustration purposes, we will report the results of these tests.

The null hypothesis of no differences among the marginal means for aggregate type is

$$H_0: \bar{\mu}_{1\cdot} = \bar{\mu}_{2\cdot}$$

The $F$-statistic for this test is $F = MSA/MSE = \ /11.23 = 63.26$, which is compared with an $F$-distribution with 1 and 8 df. The p-value for this test is 4.551e-05, which means that the aggregate types have significantly different marginal means.

The null hypothesis of no differences among the marginal means for compaction method is

$$H_0: \bar{\mu}_{\cdot 1} = \bar{\mu}_{\cdot 2} = \bar{\mu}_{\cdot 3}$$

The $F$-statistic for this test is $F = MSB/MSE = \ /11.23 = \ $, which is compared with an $F$-distribution with 2 and 8 df. The p-value for this test is 2.879e-08, which means that the compaction methods have significantly different marginal means.

57 Why Incomplete Block Designs?

It is sometimes necessary to block EUs into groups smaller than a complete replication of all treatments, as would be used with a RCBD or Latin square design. Such designs are called incomplete block designs.

Incomplete block designs can be used to decrease experimental error variance ($\sigma^2$) and provide more precise comparisons among treatments than is possible with a complete block design.

Experiments can require a reduction in block size for one of several reasons:

- Complete block designs can reduce the estimate of $\sigma^2$, but sometimes the reduction is insufficient.
- The number of treatments may be so large as to render a complete block design impractical for reducing $\sigma^2$.
- The natural grouping of EUs into blocks can result in fewer units per block than required by the number of treatments for a complete block design.

58 General Notation

For incomplete block designs, we let t denote the number of treatments and k the block size, where k < t; i.e., you cannot assign all of the treatments in each block. In short, we have:
- t = the number of treatments;
- k = the block size;
- b = the number of blocks; and
- $r_i$ = the number of replicates for treatment i in the entire design.

Remember that an equal number of replications is the best way to ensure minimum variance if you are looking at all possible pairwise comparisons. If $r_i = r$ for all treatments, the total number of observations in the experiment is N, where N = tr = bk.

59 Incidence Matrix

A helpful representation of an incomplete block design is the incidence matrix, which defines the experimental design by giving the number of observations $n_{ij}$ for the i-th treatment in the j-th block. Assuming we have t treatments and b blocks, we write the incidence matrix as follows:
$$
\begin{array}{c|cccc}
 & \multicolumn{4}{c}{\text{Block}} \\
\text{Treatment} & 1 & 2 & \cdots & b \\ \hline
1 & n_{11} & n_{12} & \cdots & n_{1b} \\
2 & n_{21} & n_{22} & \cdots & n_{2b} \\
\vdots & \vdots & \vdots & & \vdots \\
t & n_{t1} & n_{t2} & \cdots & n_{tb}
\end{array}
$$
Note that for an RCBD, each treatment occurs once within each block, so all entries in the above matrix would be 1. For an incomplete block design, the entries in the incidence matrix are a mixture of 0s and 1s, which simply indicate whether or not that treatment occurs in that block. In the row and column margins, we often report the sums.
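The construction of an incidence matrix can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `incidence_matrix` and the input format (one tuple of treatment numbers per block) are assumptions made for the example. The layout used is the set of four treatment groups (1,2,3), (1,2,4), (1,3,4), (2,3,4) that appears in the randomization slides of this unit.

```python
def incidence_matrix(blocks, t):
    """Return the t x b incidence matrix as a list of rows.

    blocks: list of b tuples of treatment numbers 1..t (assumed input format).
    Entry [i][j] counts how often treatment i+1 appears in block j+1.
    """
    matrix = [[0] * len(blocks) for _ in range(t)]
    for j, block in enumerate(blocks):
        for treatment in block:
            matrix[treatment - 1][j] += 1
    return matrix


# t = 4 treatments in b = 4 blocks of size k = 3
blocks = [(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]
N = incidence_matrix(blocks, t=4)

row_sums = [sum(row) for row in N]        # replications r_i per treatment
col_sums = [sum(col) for col in zip(*N)]  # block sizes k per block

for row, r_i in zip(N, row_sums):
    print(row, "| r_i =", r_i)
print("block sizes:", col_sums)
```

The row sums recover the replications $r_i$ and the column sums recover the block size k, which is how the margins of the matrix are usually reported.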

60 Incidence Matrix - Example 1

[Table: incidence matrix, treatments (rows) by blocks (columns)]

Suppose that we have t = 4 and b = 4 (four rows and four columns) and k = 3 (so in each block we can only put three of the four treatments, leaving one treatment out of each block). The incidence matrix for this setting is given above. In this case, the row sums ($r_i$) and the column sums (k) are all equal to 3; i.e., $r_i = r$ for all i.

In general, we are faced with a situation where the number of treatments is specified and the block size, or number of EUs per block, is given. This is usually a constraint imposed by the experimental situation. The researcher must then decide how many blocks to run and how many replicates that provides in order to achieve the precision or the power for the desired test.

61 Incidence Matrix - Example 2

[Table: incidence matrix, treatments (rows) by blocks (columns)]

Suppose that we have t = 4 and b = 4 (four rows and four columns) and k = 2 (so in each block we can only put two of the four treatments, leaving two treatments out of each block). The incidence matrix for this setting is given above. In this case, the row sums ($r_i$) and the column sums (k) are all equal to 2; i.e., $r_i = r$ for all i.

62 Incidence Matrix - Example 3

[Table: incidence matrix, treatments (rows) by blocks (columns)]

Suppose that we have t = 5 and b = 10 (five rows and ten columns) and k = 3 (so in each block we can only put three of the five treatments, leaving two treatments out of each block). The incidence matrix for this setting is given above. In this case, the row sums (r) are equal to 6, while the column sums (k) are equal to 3, so that N = tr = bk = 30.

63 Example: Tomato Seed Germination Experiment

Tomatoes often are produced during the winter months in arid and tropical regions. Winter production requires seeding during late summer, when soil temperatures can exceed 40 °C. A plant scientist wants to determine the temperature range in which inhibition of tomato seed germination can be expected for a group of tomato cultivars.

t = 4 treatments were chosen to represent a temperature range common for the cultivation area under consideration: 25 °C, 30 °C, 35 °C, and 40 °C. The tomato seed was subjected to a constant assigned temperature inside a control chamber. A single control chamber is an EU, since true replication of any temperature treatment required a separate run of the temperature in a chamber.

Any number of factors could contribute to variation in response between runs, since the entire experimental setup had to be repeated for a replicate run. Thus, blocking on runs was considered essential. One complete block and replication of the experiment required four chambers; however, only three chambers were available to the scientist. The natural block of one run had fewer chambers (EUs) than treatments, so an incomplete block design was constructed.

64 Example: Tomato Seed Germination Experiment

The incidence matrix for this experiment is given below. Three different temperatures were tested in each of the four runs. The runs represent incomplete blocks of three temperature treatments. The treatments were randomly assigned to the chambers in each run. For completeness, we have reported which of the three chambers (A, B, or C) was used for which treatment in each run. These are given in parentheses.

                    Run
Treatment   1       2       3       4       Total
25 °C       1 (A)   1 (C)   1 (B)   0       3
30 °C       1 (B)   1 (B)   0       1 (B)   3
35 °C       0       1 (A)   1 (C)   1 (C)   3
40 °C       1 (C)   0       1 (A)   1 (A)   3

65 Balanced Incomplete Block Designs

In all of the examples given thus far, you should have noticed that the marginal totals in the incidence matrices are all equal; thus, there is some notion of balancedness.

A balanced incomplete block design (or BIBD) is arranged such that all treatments are equally replicated and each treatment pair occurs in the same block an equal number of times somewhere in the design. The balance obtained from equal occurrence of all treatment pairs in the same block results in equal precision for all comparisons between pairs of treatment means.

The number of blocks in which each pair of treatments occurs together is $\lambda = r(k-1)/(t-1)$, where $\lambda < r < b$. The integer value $\lambda$ derives from the fact that each treatment is paired with each of the other $(t-1)$ treatments somewhere in the design $\lambda$ times, so there are $\lambda(t-1)$ pairs for a particular treatment in the experiment. The same treatment appears in r blocks with $(k-1)$ other treatments, so each treatment appears in $r(k-1)$ pairs. Therefore,
$$\lambda(t-1) = r(k-1) \quad\text{or}\quad \lambda = \frac{r(k-1)}{t-1}.$$
For the tomato seed germination study, $\lambda = 3(3-1)/(4-1) = 2$. For example, the pair of treatments 25 °C and 35 °C occurs in runs 2 and 3.
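As a quick check of the pair-counting argument, the following Python sketch computes λ from (t, k, r) and then counts, for the tomato design, how many runs contain each pair of temperatures. The run contents are taken from the randomized design shown for this example; the function name `bibd_lambda` and the dictionary layout are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def bibd_lambda(t, k, r):
    """lambda = r(k - 1)/(t - 1); a BIBD requires this to be an integer."""
    numerator, denominator = r * (k - 1), t - 1
    if numerator % denominator:
        raise ValueError("r(k-1)/(t-1) is not an integer, so no BIBD exists")
    return numerator // denominator


# Temperatures (degrees C) tested in each run of the tomato experiment.
runs = {1: (25, 30, 40), 2: (35, 30, 25), 3: (40, 25, 35), 4: (40, 30, 35)}

pair_counts = Counter()
pair_runs = {}
for run, temps in runs.items():
    for pair in combinations(sorted(temps), 2):
        pair_counts[pair] += 1
        pair_runs.setdefault(pair, []).append(run)

lam = bibd_lambda(t=4, k=3, r=3)
print("lambda =", lam)
print("runs containing 25 and 35:", pair_runs[(25, 35)])
print("all pairs occur lambda times:", set(pair_counts.values()) == {lam})
```

Every one of the six temperature pairs appears together in the same number of runs, which is exactly the balance property that gives equal precision for all pairwise comparisons.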

66 More on Balancing in BIBDs

A BIBD can be constructed by assigning the appropriate combinations of treatments to each of $b = \binom{t}{k}$ blocks to achieve a balanced design. Frequently, balance is possible with fewer than $\binom{t}{k}$ blocks. When t = b, the BIBD is said to be symmetric.

There is no single strategy for constructing all classes of BIBDs; however, there is a vast array of balanced and partially balanced incomplete block designs at our disposal. We will discuss partially balanced incomplete block designs later.

While the design is balanced, the treatments and blocks are clearly nonorthogonal. For estimation, we will want to use the Type III SS.
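The "all combinations" construction can be sketched directly with the standard library. The example below is an illustration under assumed values (t = 5, k = 3, chosen here because they give b = 10 blocks); the helper name `all_subsets_design` is hypothetical.

```python
from itertools import combinations
from math import comb


def all_subsets_design(t, k):
    """Blocks of the BIBD formed by every k-subset of treatments 1..t."""
    return list(combinations(range(1, t + 1), k))


t, k = 5, 3
blocks = all_subsets_design(t, k)

b = len(blocks)                                      # number of blocks
r = sum(1 in block for block in blocks)              # replications of treatment 1
lam = sum(1 in block and 2 in block for block in blocks)  # blocks with pair (1, 2)

print("b =", b, " r =", r, " lambda =", lam)
print("N = tr = bk:", t * r == b * k)
print("lambda = r(k-1)/(t-1):", lam * (t - 1) == r * (k - 1))
print("symmetric (t == b):", t == b)
```

By symmetry of the construction, every treatment has the same r and every pair the same λ, so checking treatment 1 and the pair (1, 2) is enough here. The design is balanced but not symmetric, since b exceeds t.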

67 How to Randomize BIBDs

After the basic design has been constructed with the treatments coded, the steps for randomizing a BIBD are as follows:
1 Randomize the arrangement of the blocks of treatment code number groups.
2 Randomize the arrangement of the treatment code numbers within each block.
3 Randomize the assignment of treatments to the treatment code numbers in the plan.

We will illustrate each step using the tomato seed germination experiment; the basic design plan therefore has t = 4 treatments (temperatures) and b = 4 blocks (runs) of k = 3 EUs (chambers) each. Prior to randomization, the plan is

Block   Treatment codes
1       1  2  3
2       1  2  4
3       1  3  4
4       2  3  4

68 How to Randomize BIBDs

Step 1: The four treatment groups on the previous slide are (1,2,3), (1,2,4), (1,3,4), and (2,3,4). These must be randomly assigned to the runs (blocks). We choose a random permutation of the numbers 1 to 4 and assign the four blocks to the four runs. With the random permutation 2, 4, 1, and 3, the assignment is

Run   Original Block   Treatment codes
1     2                (1,2,4)
2     4                (2,3,4)
3     1                (1,2,3)
4     3                (1,3,4)

69 How to Randomize BIBDs

Step 2: Randomly assign the treatment code numbers to the three chambers in each run. Choose a random permutation of the numbers 1 to 4 for each run and omit the treatment number absent from that run. Four such random permutations, along with the resulting assignment to each chamber (A, B, C) in each run, are as follows:

      Chamber
Run   A   B   C   Permutation
1     2   4   1   …
2     3   4   2   …
3     1   2   3   …
4     1   4   3   …

70 How to Randomize BIBDs

Step 3: We assign our treatment levels of 25 °C, 30 °C, 35 °C, and 40 °C to a random permutation of the numbers 1 to 4. In our random permutation, we get 2 → 25 °C, 4 → 30 °C, 3 → 35 °C, and 1 → 40 °C. Replacing the numbers 1 to 4 on the previous slide with the corresponding treatment level, we get the following (which matches the design presented earlier for this example):

      Chamber
Run   A       B       C
1     25 °C   30 °C   40 °C
2     35 °C   30 °C   25 °C
3     40 °C   25 °C   35 °C
4     40 °C   30 °C   35 °C
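The three randomization steps can be sketched in Python as follows. This is a hedged illustration rather than the procedure used to produce the slides: the function name `randomize_bibd`, the `seed` argument, and the label strings are assumptions. Step 2 here shuffles the codes within each block directly, which plays the same role as drawing a permutation of 1 to 4 and dropping the absent code.

```python
import random


def randomize_bibd(plan, labels, seed=None):
    """Randomize a coded BIBD plan; returns one list of labels per run."""
    rng = random.Random(seed)

    # Step 1: randomize the order in which the coded blocks are assigned to runs.
    blocks = [list(block) for block in plan]
    rng.shuffle(blocks)

    # Step 2: randomize the order of the codes within each block
    # (position 0, 1, 2 corresponds to chamber A, B, C).
    for block in blocks:
        rng.shuffle(block)

    # Step 3: randomly assign the actual treatments to the code numbers.
    codes = sorted({code for block in plan for code in block})
    shuffled_labels = list(labels)
    rng.shuffle(shuffled_labels)
    code_to_label = dict(zip(codes, shuffled_labels))

    return [[code_to_label[code] for code in block] for block in blocks]


plan = [(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]
temps = ["25 C", "30 C", "35 C", "40 C"]

layout = randomize_bibd(plan, temps, seed=643)  # seed chosen arbitrarily
for run, chambers in enumerate(layout, start=1):
    print("Run", run, dict(zip("ABC", chambers)))
```

Whatever the random draws, the result remains a BIBD: each temperature still appears in three runs and each pair of temperatures still shares two runs, because randomization only relabels and reorders the plan.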

71 BIBD Model

The linear statistical model for a BIBD is as follows:
$$Y_{ij} = \mu + \tau_i + \rho_j + \varepsilon_{ij},$$
where
- $Y_{ij}$ is the observation on the EU in the i-th treatment of the j-th block; i = 1, ..., t and j = 1, ..., b;
- $\mu$ is the grand mean;
- $\tau_i$ is the fixed effect of the i-th treatment;
- $\rho_j$ is the fixed effect of the j-th block; and
- the $\varepsilon_{ij}$ are the experimental errors, which are iid normal with mean 0 and variance $\sigma^2$.

Recall that there are r replications of the t treatments in b incomplete blocks of k EUs, such that the total number of observations is N = rt = bk, with each treatment pair appearing together in $\lambda = r(k-1)/(t-1)$ blocks in the experiment.

72 Nonorthogonal Effects

As noted earlier, the treatments and blocks are not orthogonal in the BIBD, which occurs because not all treatments appear in each block. Therefore, the SS partition for treatments computed in the manner of complete block designs will not be correct for incomplete block designs, nor will the observed treatment means provide unbiased estimates of $\mu_i = \mu + \tau_i$. Regardless, the parameter estimates and SSTr for the BIBD can be computed with relatively straightforward formulas.

73 SS Partition

We partition SSTot in the usual way, into SS due to the treatments, blocks, and error; however, SSTr needs to be adjusted for incompleteness. Thus, we have the following partition:
$$\text{SSTot} = \text{SSTr}_{adj} + \text{SSBlk}_{unadj} + \text{SSE},$$
where
$$\text{SSTot} = \sum_{i=1}^{t}\sum_{j=1}^{b} n_{ij}\,(y_{ij} - \bar{y}_{\cdot\cdot})^2,$$
$$\text{SSBlk}_{unadj} = k\sum_{j=1}^{b} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2,$$
$$\text{SSTr}_{adj} = \frac{k\sum_{i=1}^{t} Q_i^2}{t\lambda},$$
$$\text{SSE} = \text{SSTot} - \text{SSTr}_{adj} - \text{SSBlk}_{unadj}.$$
Here $Q_i = y_{i\cdot} - B_i/k$ is the adjusted treatment total, $B_i = \sum_{j=1}^{b} n_{ij}\,y_{\cdot j}$ is the sum of all block totals that include the i-th treatment, and $n_{ij} = I\{\text{treatment } i \text{ appears in block } j\}$. Note that this correction to the treatment total has the net effect of removing the block effects from the treatment total.

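74 ANOVA Table and Estimates

$$
\begin{array}{llll}
\text{Source} & \text{df} & \text{SS} & \text{MS} \\ \hline
\text{Blocks (Unadjusted)} & b-1 & k\sum_{j=1}^{b}(\bar{y}_{\cdot j}-\bar{y}_{\cdot\cdot})^2 & \text{MSBlk}_{unadj} \\
\text{Treatments (Adjusted)} & t-1 & k\sum_{i=1}^{t}Q_i^2/(t\lambda) & \text{MSTr}_{adj} \\
\text{Error} & N-t-b+1 & \text{SSE} & \text{MSE} \\ \hline
\text{Total} & N-1 & \sum_{i=1}^{t}\sum_{j=1}^{b} n_{ij}(y_{ij}-\bar{y}_{\cdot\cdot})^2 &
\end{array}
$$
Above is the ANOVA table for a BIBD. Note that we have not explicitly written out SSE since (a) it is a long expression and (b) it is obtained by subtracting the other SS quantities from SSTot.

Below are the least squares estimates for the linear model:
$$\hat{\mu} = \bar{y}_{\cdot\cdot}, \qquad \hat{\tau}_i = \frac{kQ_i}{t\lambda}, \qquad \hat{\rho}_j = \frac{rQ'_j}{b\lambda},$$
where
$$Q'_j = y_{\cdot j} - \frac{1}{r}\sum_{i=1}^{t} n_{ij}\,y_{i\cdot}.$$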

75 Example: Catalyst Experiment

A chemical engineer thinks that the time of reaction for a chemical process is a function of the type of catalyst employed. Four catalysts are being investigated. The experimental procedure consists of selecting a batch of raw material, loading the pilot plant, applying each catalyst in a separate run of the pilot plant, and observing the reaction time.

Since variations in the batches of raw material may affect the performance of the catalysts, the engineer decides to use batches of raw material as blocks. However, each batch is only large enough to permit three catalysts to be run. Thus, t = 4, b = 4, r = 3, k = 3, and λ = 2. The following table summarizes the results.

[Table: reaction times by treatment (catalyst, rows) and block (batch, columns), with margins $y_{i\cdot}$, $y_{\cdot j}$, $B_i$, and $Q_i$]

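76 Example: Catalyst Experiment

On the previous slide, note that
$$
\begin{aligned}
B_1 &= y_{\cdot 1} + y_{\cdot 2} + y_{\cdot 4} = 221 + 224 + 218 = 663 \\
B_2 &= y_{\cdot 2} + y_{\cdot 3} + y_{\cdot 4} = 224 + 207 + 218 = 649 \\
B_3 &= y_{\cdot 1} + y_{\cdot 2} + y_{\cdot 3} = 221 + 224 + 207 = 652 \\
B_4 &= y_{\cdot 1} + y_{\cdot 3} + y_{\cdot 4} = 221 + 207 + 218 = 646
\end{aligned}
$$
and that
$$
\begin{aligned}
Q_1 &= y_{1\cdot} - \tfrac{1}{3}B_1 = 218 - 663/3 = -3.0 &&\Rightarrow\quad \hat{\tau}_1 = \tfrac{3}{(4)(2)}Q_1 = -1.125 \\
Q_2 &= y_{2\cdot} - \tfrac{1}{3}B_2 = 214 - 649/3 = -2.3 &&\Rightarrow\quad \hat{\tau}_2 = \tfrac{3}{(4)(2)}Q_2 = -0.875 \\
Q_3 &= y_{3\cdot} - \tfrac{1}{3}B_3 = 216 - 652/3 = -1.3 &&\Rightarrow\quad \hat{\tau}_3 = \tfrac{3}{(4)(2)}Q_3 = -0.500 \\
Q_4 &= y_{4\cdot} - \tfrac{1}{3}B_4 = 222 - 646/3 = 6.7 &&\Rightarrow\quad \hat{\tau}_4 = \tfrac{3}{(4)(2)}Q_4 = 2.500
\end{aligned}
$$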
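The calculation on this slide can be sketched in Python using only the block and treatment totals. The totals below are assumptions in the sense that they are reconstructed from the $B_i$ and $Q_i$ values on the slide (solving the four $B_i$ equations for the block totals, then using $y_{i\cdot} = Q_i + B_i/k$); the incidence pattern is read off the $B_i$ sums. Variable names are illustrative.

```python
t, b, k, r, lam = 4, 4, 3, 3, 2

# Batches (blocks) in which each catalyst was run, read off the B_i sums.
batches_of = {1: (1, 2, 4), 2: (2, 3, 4), 3: (1, 2, 3), 4: (1, 3, 4)}

# Totals reconstructed from the slide's B_i and Q_i values (an assumption).
block_totals = {1: 221, 2: 224, 3: 207, 4: 218}      # y_.j
treatment_totals = {1: 218, 2: 214, 3: 216, 4: 222}  # y_i.

# B_i: sum of the totals of all blocks that contain treatment i.
B = {i: sum(block_totals[j] for j in js) for i, js in batches_of.items()}

# Q_i: treatment total adjusted for the blocks it appeared in.
Q = {i: treatment_totals[i] - B[i] / k for i in B}

# Least squares treatment effects and the adjusted treatment SS.
tau_hat = {i: k * Q[i] / (lam * t) for i in Q}
ss_tr_adj = k * sum(q ** 2 for q in Q.values()) / (lam * t)

for i in sorted(Q):
    print(f"catalyst {i}: B = {B[i]}, Q = {Q[i]:.3f}, tau_hat = {tau_hat[i]:.3f}")
print(f"SSTr_adj = {ss_tr_adj:.2f}")
```

The adjusted totals $Q_i$ sum to zero, as do the estimated effects, which is a useful arithmetic check when working such an example by hand.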

77 Example: Catalyst Experiment

Below is an alternative to boxplots: plots of the partial residuals versus the individual effects (i.e., the blocking factor and the treatment). Clearly there is variability within each level of the blocking factor and of the treatment. Moreover, note that the batches all tend to differ, while the fourth treatment looks noticeably different from the other three treatments.

[Figure: partial residual plots; left panel "Partial for batch" versus batch, right panel "Partial for catalyst" versus catalyst]

78 Inferences for Treatment Means in BIBDs

Recall that the least squares estimate for a treatment mean $\mu_i$ is $\hat{\mu}_i = \hat{\mu} + \hat{\tau}_i$. The standard error for a treatment mean estimate is
$$s_{\hat{\mu}_i} = \sqrt{\frac{MSE}{rt}\left(1 + \frac{kr(t-1)}{t\lambda}\right)}.$$
The standard error of the estimated difference between two treatment means, $\hat{\mu}_i - \hat{\mu}_j$, is
$$s_{\hat{\mu}_i - \hat{\mu}_j} = \sqrt{\frac{2k\,MSE}{t\lambda}}.$$
For testing the null hypothesis of no differences among treatment means,
$$H_0: \mu_1 = \cdots = \mu_t \quad\text{vs.}\quad H_A: \text{not all } \mu_i \text{ are equal},$$
the test statistic is
$$F = \frac{MSTr_{adj}}{MSE} \sim F_{(t-1),\,(N-t-b+1)}.$$
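The two standard errors on this slide, $s_{\hat\mu_i} = \sqrt{(MSE/rt)\,(1 + kr(t-1)/(t\lambda))}$ and $s_{\hat\mu_i - \hat\mu_j} = \sqrt{2k\,MSE/(t\lambda)}$, are easy to wrap as small Python functions. This is a minimal sketch; the function names are hypothetical, and the value MSE = 0.65 used in the call is an assumed illustrative input, not a number taken from the slides.

```python
from math import sqrt


def se_treatment_mean(mse, t, k, r, lam):
    """Standard error of a least squares treatment mean in a BIBD."""
    return sqrt(mse / (r * t) * (1 + k * r * (t - 1) / (t * lam)))


def se_difference(mse, t, k, lam):
    """Standard error of the difference between two treatment means."""
    return sqrt(2 * k * mse / (t * lam))


# Design constants of a BIBD with t = 4, k = 3, r = 3, lambda = 2.
# MSE = 0.65 is an assumed value for illustration only.
mse = 0.65
se_mean = se_treatment_mean(mse, t=4, k=3, r=3, lam=2)
se_diff = se_difference(mse, t=4, k=3, lam=2)

print(f"s(mu_hat_i)            = {se_mean:.4f}")
print(f"s(mu_hat_i - mu_hat_j) = {se_diff:.4f}")
```

Because the design is balanced, both quantities are the same for every treatment and every pair, so a single standard error serves all pairwise comparisons.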

79 Contrasts Among Treatment Means

Treatment contrasts $C = \sum_{i=1}^{t} d_i\mu_i$ are estimated with the least squares estimates of the treatment means, $\hat{\mu}_i$, as
$$c = \sum_{i=1}^{t} d_i\hat{\mu}_i$$
with standard errors
$$s_c = \sqrt{\frac{k\,MSE}{t\lambda}\sum_{i=1}^{t} d_i^2}.$$
For testing
$$H_0: C = 0 \quad\text{vs.}\quad H_A: C \neq 0$$
we can either use the test statistic
$$t = c/s_c \sim t_{N-t-b+1}$$
or we can compute the 1 df SS for the contrast
$$SSC = \frac{t\lambda\left(\sum_{i=1}^{t} d_i\hat{\mu}_i\right)^2}{k\sum_{i=1}^{t} d_i^2}$$
and, subsequently, use the test statistic
$$F = SSC/MSE \sim F_{1,\,(N-t-b+1)}.$$
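The contrast formulas can be sketched as one Python helper that returns the estimate c, its standard error $s_c = \sqrt{(k\,MSE/(t\lambda))\sum d_i^2}$, the t statistic, and the 1 df sum of squares $SSC = t\lambda c^2/(k\sum d_i^2)$. The function name, the treatment means, and the MSE in the example call are all illustrative assumptions, not values asserted by the slides.

```python
from math import sqrt


def contrast_summary(d, mu_hat, mse, t, k, lam):
    """Return (c, s_c, t statistic, SSC) for a treatment contrast in a BIBD."""
    if abs(sum(d)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    c = sum(di * mi for di, mi in zip(d, mu_hat))
    sum_d2 = sum(di ** 2 for di in d)
    s_c = sqrt(k * mse / (t * lam) * sum_d2)
    ssc = t * lam * c ** 2 / (k * sum_d2)
    return c, s_c, c / s_c, ssc


# Hypothetical inputs: four least squares treatment means and an assumed MSE.
mu_hat = [71.375, 71.625, 72.0, 75.0]
mse = 0.65
d = [-1, -1, -1, 3]  # treatment 4 versus the average of the other three

c, s_c, t_stat, ssc = contrast_summary(d, mu_hat, mse, t=4, k=3, lam=2)
print(f"c = {c:.3f}, s_c = {s_c:.4f}, t = {t_stat:.3f}")
print(f"SSC = {ssc:.4f}, F = SSC/MSE = {ssc / mse:.3f}")
```

The two test statistics are equivalent: the square of the t statistic equals SSC/MSE, so either route leads to the same p-value.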

80 Example: Catalyst Experiment

            Df  Sum Sq  Mean Sq  F value  Pr(>F)
batch        3       …        …        …       … **
catalyst     3       …        …        …       … *
Residuals    5       …        …
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Above is the ANOVA table for the catalyst experiment. Remember that the SS for the batch (block) are unadjusted, while the SS for the catalyst (treatment) are adjusted. Namely, we use Type III SS for getting the adjusted SS.

The standard error for a treatment mean estimate is
$$s_{\hat{\mu}_i} = \sqrt{\frac{MSE}{rt}\left(1 + \frac{kr(t-1)}{t\lambda}\right)} = \sqrt{\frac{MSE}{3(4)}\left(1 + \frac{3(3)(4-1)}{4(2)}\right)}.$$
The test of no differences among treatment means is given by the catalyst row of the table, whose single * indicates a p-value between 0.01 and 0.05. Therefore, we reject the null hypothesis at the 0.05 level and declare that at least one catalyst has a mean response that is significantly different.

81 Example: Catalyst Experiment

The estimate of the grand mean is $\hat{\mu} = \bar{y}_{\cdot\cdot} = 72.5$. Thus, we can calculate the least squares estimates of the catalyst means as follows:

Catalyst   τ̂_i      µ̂_i = µ̂ + τ̂_i   95% Confidence Interval
1          −1.125   71.375           (70.124, 72.626)
2          −0.875   71.625           (70.374, 72.876)
3          −0.500   72.000           (70.749, 73.251)
4           2.500   75.000           (73.749, 76.251)

Using the standard error estimate $s_{\hat{\mu}_i}$ obtained on the previous slide, we can then compute a confidence interval for each treatment mean. 95% confidence intervals are provided in the above table.

82 Row-Column Designs for Two Blocking Criteria

Latin squares allow one to control variation with two blocking criteria. However, Latin squares may be impractical in some situations because the number of EUs required by the design, $N = t^2$, can exceed the constraints of the experimental material, or the number of treatments can exceed the available block sizes.

Row-column designs can be used with either rows or columns or both rows and columns as incomplete blocks when two blocking criteria are used. The designs are arranged in p rows and q columns of EUs.

83 Example: Car Tire Treatments

Recall the car tire experiment where we originally used a Latin square design. The original experiment tested t = 4 car tire treatments (A, B, C, and D) on four tire positions of each car. Thus, the row and column blocking criteria are the cars and tire positions, respectively.

Now suppose that the researcher wants to evaluate t = 7 treatments, where control of the tire position and car is still necessary. However, the cars obviously only have four positions on which to test the seven treatments. The cars can be used as incomplete blocks, with four treatments evaluated on the four tire positions. A row-column design with an incomplete set of treatments (where we have now made tire position the row block and car the column block for convenience) can be given as follows:

Tire       Car
Position   1   2   3   4   5   6   7
1          C   D   E   F   G   A   B
2          E   F   G   A   B   C   D
3          F   G   A   B   C   D   E
4          G   A   B   C   D   E   F
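The layout above has a cyclic structure: each row is the treatment sequence A to G started at a different letter. The Python sketch below rebuilds the design from the four treatments on the first car and checks its properties. The helper name `cyclic_row_column` and the idea of generating the design from its first column are illustrative assumptions about how such a layout can be produced, not a description of how the slide's design was constructed.

```python
from collections import Counter
from itertools import combinations

treatments = "ABCDEFG"
first_car = "CEFG"  # treatments on car 1 at tire positions 1 to 4


def cyclic_row_column(first_column, treatments):
    """Each row cycles through the treatments, starting at its first-column entry."""
    t = len(treatments)
    rows = []
    for start in first_column:
        s = treatments.index(start)
        rows.append([treatments[(s + c) % t] for c in range(t)])
    return rows


design = cyclic_row_column(first_car, treatments)
for position, row in enumerate(design, start=1):
    print(position, " ".join(row))

# Rows (tire positions) are complete blocks: every treatment appears once.
rows_complete = all(sorted(row) == list(treatments) for row in design)

# Columns (cars) are incomplete blocks: count how often each pair shares a car.
pair_counts = Counter(
    frozenset(pair) for car in zip(*design) for pair in combinations(car, 2)
)
print("rows complete:", rows_complete)
print("pairs per car:", set(pair_counts.values()))
```

Every pair of treatments shares the same number of cars, consistent with $\lambda = r(k-1)/(t-1) = 4(3)/6 = 2$, so the cars form a BIBD while the tire positions remain complete blocks.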
