TWO-LEVEL FACTORIAL EXPERIMENTS: IRREGULAR FRACTIONS

STAT 512 2-Level Factorial Experiments: Irregular Fractions

A major practical weakness of regular fractional factorial designs is that N must be a power of 2:

    8   16   32   64   128   (large gaps)

A broader class of 2-level designs for which low-order effects are orthogonally estimable when higher-order effects are assumed zero:

Orthogonal Array of strength t, OA(t): a design for which every subset of t factors (ignoring the rest) forms a full $2^t$ factorial design, possibly replicated.

- Less rigid structure requirement than for regular fractional factorials: we don't require that every pair of higher-order effects be either orthogonal or completely confounded.
- Allows for more flexibility in the value of N, e.g. an OA(2) could be of size 12 (every pair of factors forms a $2^2$ factorial in 3 replicates).
- Generally not so easy to construct as regular fractional factorials.

A useful class of OAs of strength 2:

Plackett-Burman Designs (Biometrika 1946)

- Resolution III designs (estimable main effects in the absence of interactions)
- Available for $N \equiv 0 \pmod 4$, $N \ge f + 1$
- Characterized by first row of design matrix:

    N     first row
    8     + + + - + - -
    12    + + - + + + - - - + -
    16    ...
    (through N = 100 in the P-B paper)

Construction of N-run design matrix D:

1. Use the given row for the first run.
2. Use cyclic permutations of the given row for runs 2 through N - 1.
3. Use (- - ... -) for the Nth run.
4. Can (and usually should) randomize:
   - the assignment of columns in the resulting design matrix to the physical experimental factors
   - the order of run execution
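The construction steps above can be sketched in Python with NumPy. This is a minimal sketch rather than a reference implementation: the function name `plackett_burman`, the lookup table `PB_FIRST_ROWS`, the use of right cyclic shifts via `np.roll`, and the first rows themselves (taken to be the standard Plackett-Burman rows for N = 8 and N = 12) are assumptions made for illustration.

```python
import numpy as np

# Assumed standard P-B first rows (+1 / -1 coding); only N = 8 and N = 12 are listed.
PB_FIRST_ROWS = {
    8: [1, 1, 1, -1, 1, -1, -1],
    12: [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1],
}

def plackett_burman(N, rng=None):
    """Build an N-run P-B design matrix D (N x (N-1)) from its first row."""
    g = np.array(PB_FIRST_ROWS[N])
    shifts = [np.roll(g, i) for i in range(N - 1)]   # steps 1-2: first row and its cyclic shifts
    last = -np.ones(N - 1, dtype=int)                # step 3: final run is all minus
    D = np.vstack(shifts + [last])
    if rng is not None:                              # step 4: optional randomization
        D = D[:, rng.permutation(N - 1)]             # columns -> physical factors
        D = D[rng.permutation(N), :]                 # run order
    return D

D8 = plackett_burman(8)
X1 = np.column_stack([np.ones(8, dtype=int), D8])
print(X1.T @ X1)   # a multiple of the identity when all columns are orthogonal
```

Passing `rng=np.random.default_rng()` applies the randomization of step 4; orthogonality is unaffected because only the order of rows and columns changes.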

Example, N = 8, f = 7 (first row + + + - + - -, shifted cyclically to the right):

M = (1, D) =

    + + + + - + - -
    + - + + + - + -
    + - - + + + - +
    + + - - + + + -
    + - + - - + + +
    + + - + - - + +
    + + + - + - - +
    + - - - - - - -

- For f = 4, 5, 6, use a subset of these columns.
- For $N = 2^c$, these are equivalent to $2^{f-s}_{III}$ regular fractions.
- For $N \ne 2^c$, these are nongeometric, still providing orthogonal estimates of main effects, but the alias structure is more complex.

Review of Bias in Linear Models

Suppose $y = X_1\theta_1 + X_2\theta_2 + \epsilon$, usual assumptions... But we fit parameters as though the model is $y = X_1\theta_1 + \epsilon$:

$$\hat\theta_1 = (X_1'X_1)^{-1}X_1'y$$

$$E[\hat\theta_1] = (X_1'X_1)^{-1}X_1'(X_1\theta_1 + X_2\theta_2) = \theta_1 + (X_1'X_1)^{-1}X_1'X_2\theta_2 = \theta_1 + A\theta_2$$

When:
- $\theta_1$ contains the intercept and main effects
- $\theta_2$ contains two-factor interactions
- columns of $X_1$ are +1/-1 and orthogonal

$A = \frac{1}{N}X_1'X_2$, the alias matrix.

Look at this for P-B designs in 8 and 12 runs.
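Before turning to those designs, the bias formula can be checked numerically on the smallest possible case. The sketch below is illustrative only: the helper names `alias_matrix` and `two_factor_interactions` are hypothetical, and the example design (a half fraction of a $2^3$ with C = AB, i.e. I = ABC) is chosen here because its aliasing is known in advance.

```python
import numpy as np
from itertools import combinations

def alias_matrix(X1, X2):
    """A = (X1'X1)^{-1} X1'X2: bias in the fitted coefficients per unit of each omitted term."""
    return np.linalg.solve(X1.T @ X1, X1.T @ X2)

def two_factor_interactions(D):
    """Columns of elementwise products d_j * d_k for all pairs j < k, plus the pair labels."""
    pairs = list(combinations(range(D.shape[1]), 2))
    X2 = np.column_stack([D[:, j] * D[:, k] for j, k in pairs])
    return X2, pairs

# Half fraction of a 2^3 with C = AB (defining relation I = ABC).
a = np.array([-1, 1, -1, 1])
b = np.array([-1, -1, 1, 1])
D = np.column_stack([a, b, a * b])

X1 = np.column_stack([np.ones(4), D])       # intercept and main effects
X2, pairs = two_factor_interactions(D)      # AB, AC, BC
A = alias_matrix(X1, X2)

print(pairs)            # column labels of A as (factor, factor) index pairs
print(np.round(A, 3))   # rows: I, A, B, C
```

Each column of `A` has a single entry of 1: AB falls on C, AC on B and BC on A, which is the complete aliasing expected from I = ABC. Because the columns of `X1` are orthogonal here, `A` also equals `X1.T @ X2 / 4`.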

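Example, N = 8, elements of A (dots are zeroes)

          I   A   B   C   D   E   F   G
    AB    .   .   .   .   .   .  -1   .
    AC    .   .   .   .  -1   .   .   .
    AD    .   .   .  -1   .   .   .   .
    AE    .   .   .   .   .   .   .  -1
    AF    .   .  -1   .   .   .   .   .
    AG    .   .   .   .   .  -1   .   .
    BC    .   .   .   .   .   .   .  -1
    BD    .   .   .   .   .  -1   .   .
    BE    .   .   .   .  -1   .   .   .
    BF    .  -1   .   .   .   .   .   .
    BG    .   .   .  -1   .   .   .   .
    CD    .  -1   .   .   .   .   .   .
    CE    .   .   .   .   .   .  -1   .
    CF    .   .   .   .   .  -1   .   .
    CG    .   .  -1   .   .   .   .   .
    DE    .   .  -1   .   .   .   .   .
    DF    .   .   .   .   .   .   .  -1
    DG    .   .   .   .   .   .  -1   .
    EF    .   .   .  -1   .   .   .   .
    EG    .  -1   .   .   .   .   .   .
    FG    .   .   .   .  -1   .   .   .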

Look at column for A...

    A = -BF = -CD = -EG

... or,

    I = -ABF = -ACD = -AEG

But that's not all...

Continuing this idea:

    A = -BF = -CD = -EG        I = -ABF = -ACD = -AEG
    B = -AF = -CG = -DE        I = -ABF = -BCG = -BDE
    C = -AD = -BG = -EF        I = -ACD = -BCG = -CEF
    D = -AC = -BE = -FG        I = -ACD = -BDE = -DFG
    E = -AG = -BD = -CF        I = -AEG = -BDE = -CEF
    F = -AB = -CE = -DG        I = -ABF = -CEF = -DFG
    G = -AE = -BC = -DF        I = -AEG = -BCG = -DFG

Generalized interactions of these are:

    +BCDF, +BEFG, +ACFG, +ADEF, +ABCE, +CDEG, +ABDG, -ABCDEFG

15 words $= 2^4 - 1$ ... this is a $2^{7-4}_{III}$ regular fractional factorial.
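The alias strings above can be reproduced numerically. The following is a sketch under stated assumptions: the 8-run design is rebuilt from the standard P-B first row + + + - + - - with right cyclic shifts (an assumption about the slide's construction), and the dictionary `aliases` is a name introduced here.

```python
import numpy as np
from itertools import combinations

g = np.array([1, 1, 1, -1, 1, -1, -1])   # assumed standard 8-run P-B first row
D = np.vstack([np.roll(g, i) for i in range(7)] + [-np.ones(7, dtype=int)])
names = "ABCDEFG"

aliases = {}
for j, k in combinations(range(7), 2):
    # Main-effect rows of the alias-matrix column for interaction jk: (1/N) D'(d_j * d_k)
    a = D.T @ (D[:, j] * D[:, k]) / 8
    m = int(np.flatnonzero(a)[0])        # the single main effect this interaction falls on
    aliases[names[j] + names[k]] = ("-" if a[m] < 0 else "+") + names[m]

for interaction, effect in aliases.items():
    print(interaction, "->", effect)
```

Under these assumptions every interaction maps to exactly one main effect with coefficient -1, for example BF, CD and EG all map to -A, which matches the string A = -BF = -CD = -EG.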

Example, N = 12, f = 11, elements of 3A (dots are zeroes)

          I  A  B  C  D  E  F  G  H  J  K  L
    AB    .  .  .  -  -  -  +  -  -  +  +  -
    AC    .  .  -  .  +  -  -  +  -  +  -  -
    AD    .  .  -  +  .  +  +  -  -  -  -  -
    AE    .  .  -  -  +  .  -  -  -  -  +  +
    AF    .  .  +  -  +  -  .  -  +  -  -  -
    AG    .  .  -  +  -  -  -  .  +  -  +  -
    AH    .  .  -  -  -  -  +  +  .  -  -  +
    AJ    .  .  +  +  -  -  -  -  -  .  -  +
    AK    .  .  +  -  -  +  -  +  -  -  .  -
    AL    .  .  -  -  -  +  -  -  +  +  -  .
    BC    .  -  .  .  -  -  -  +  -  -  +  +
    BD    .  -  .  -  .  +  -  -  +  -  +  -
    BE    .  -  .  -  +  .  +  +  -  -  -  -
    BF    .  +  .  -  -  +  .  -  -  -  -  +
    ...

- Each main effect is partially aliased with all two-factor interactions of other factors (only).
- The weight of each aliased term is 1/3 (rather than 1).
- So, for example:

$$E(\hat\beta) = \beta + \tfrac{1}{3}\left[-(\alpha\gamma) - (\alpha\delta) \cdots\right]$$

There are more two-factor interactions that could bias any main effect, but the bias associated with any one of them is scaled by a factor of 1/3.
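These statements about the 12-run design can be checked directly. The sketch below assumes the standard 12-run P-B first row + + - + + + - - - + - with right cyclic shifts; the names `A3` and `nonzero_per_effect` are introduced here for illustration.

```python
import numpy as np
from itertools import combinations

g = np.array([1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1])   # assumed standard 12-run P-B first row
D = np.vstack([np.roll(g, i) for i in range(11)] + [-np.ones(11, dtype=int)])

pairs = list(combinations(range(11), 2))                # 55 two-factor interactions
X2 = np.column_stack([D[:, j] * D[:, k] for j, k in pairs])

# 3A restricted to the main-effect rows (11 x 55); the intercept row is zero by orthogonality.
A3 = 3 * (D.T @ X2) / 12

nonzero_per_effect = (A3 != 0).sum(axis=1)
print(nonzero_per_effect)                  # interactions that can bias each main effect
print(np.unique(np.abs(A3[A3 != 0])))      # distinct magnitudes of the nonzero entries of 3A

# Signs with which AC and AD enter the bias of main effect B (factor index 1)
print(A3[1, pairs.index((0, 2))], A3[1, pairs.index((0, 3))])
```

Each main effect has 45 nonzero entries, one for every interaction among the other ten factors, and every nonzero entry of 3A is +1 or -1, so every weight in A is +1/3 or -1/3.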

Irregular Resolution IV Designs

P-B designs are Resolution III because main effects, while orthogonally estimable, would be biased by 2-factor interactions:

$$A = \frac{1}{N}X_1'X_2 \ne 0$$

Think about the (complete) foldover of a P-B design (i.e. all runs repeated with all factors reversed). Model matrices for the doubled design, written in terms of the matrices of the original N-run design:

$$X_1 \rightarrow \begin{pmatrix} 1 & D \\ 1 & -D \end{pmatrix}, \qquad X_2 \rightarrow \begin{pmatrix} X_2 \\ X_2 \end{pmatrix}$$

Alias matrix:

$$A = \frac{1}{2N}\begin{pmatrix} 1' & 1' \\ D' & -D' \end{pmatrix}\begin{pmatrix} X_2 \\ X_2 \end{pmatrix} = \frac{1}{2N}\begin{pmatrix} 2\,(1'X_2) \\ D'X_2 - D'X_2 \end{pmatrix}$$

- $1'X_2 = 0$ because the P-B design is an OA(2)
- $D'X_2 - D'X_2 = 0$

therefore $A = 0$.

So this produces Resolution IV designs whose total run size is the smallest multiple of 8 that is $\ge 2(f + 1)$; these are irregular/nongeometric if the run size is not a power of 2. E.g. f = 9 needs 24 runs, whereas a regular $2^{9-4}_{IV}$ fraction would require 32 runs.
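The cancellation in the foldover can be confirmed numerically for the f = 9, 24-run case mentioned above. This is an illustrative sketch: the helper `alias_matrix` is a hypothetical name, the 12-run base design is rebuilt from the assumed standard P-B first row, and taking the first nine columns is an arbitrary choice.

```python
import numpy as np
from itertools import combinations

g = np.array([1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1])   # assumed standard 12-run P-B first row
pb12 = np.vstack([np.roll(g, i) for i in range(11)] + [-np.ones(11, dtype=int)])

def alias_matrix(D):
    """(1/N) X1'X2 for X1 = (1, D); valid because the columns of X1 are orthogonal here."""
    N, f = D.shape
    X1 = np.column_stack([np.ones(N, dtype=int), D])
    X2 = np.column_stack([D[:, j] * D[:, k] for j, k in combinations(range(f), 2)])
    return X1.T @ X2 / N

D = pb12[:, :9]                  # f = 9 factors in 12 runs (Resolution III)
D_fold = np.vstack([D, -D])      # complete foldover: 24 runs

A_single = alias_matrix(D)
A_fold = alias_matrix(D_fold)
print(np.abs(A_single).max())    # largest alias weight before folding over
print(np.abs(A_fold).max())      # largest alias weight after folding over
```

The first value is 1/3 and the second is exactly 0, since every product of a main-effect column with an interaction column changes sign between the two halves.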

EXERCISE

Suppose you begin a study with an N-run OA(2) main-effects design (not necessarily a regular fractional factorial). You decide to augment this initial design with its complete fold-over, i.e. N more runs selected by reversing all factors in all the original runs. What are the statistical properties of the main effect estimates (sampling variances and possible bias due to two-factor interactions) if the two halves must be treated as blocks?

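One-Factor-at-a-Time (OAT) Designs

For any f, N = f + 1.

Version 1 (each run after the first changes a single factor from the all-minus baseline):

$$D = \begin{pmatrix} - & - & \cdots & - \\ + & - & \cdots & - \\ - & + & \cdots & - \\ \vdots & \vdots & \ddots & \vdots \\ - & - & \cdots & + \end{pmatrix}, \qquad X_1'X_1 = \begin{pmatrix} f+1 & -(f-1) & -(f-1) & \cdots & -(f-1) \\ -(f-1) & f+1 & f-3 & \cdots & f-3 \\ -(f-1) & f-3 & f+1 & \cdots & f-3 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -(f-1) & f-3 & f-3 & \cdots & f+1 \end{pmatrix}$$

Version 2 (each run changes one more factor, keeping the earlier changes):

$$D = \begin{pmatrix} - & - & \cdots & - \\ + & - & \cdots & - \\ + & + & \cdots & - \\ \vdots & \vdots & \ddots & \vdots \\ + & + & \cdots & + \end{pmatrix}, \qquad X_1'X_1 = \begin{pmatrix} f+1 & f-1 & f-3 & \cdots & -(f-1) \\ f-1 & f+1 & f-1 & \cdots & -(f-3) \\ f-3 & f-1 & f+1 & \cdots & -(f-5) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -(f-1) & -(f-3) & -(f-5) & \cdots & f+1 \end{pmatrix}$$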

Regardless of the value of f, for version 1 (other than the first row/column):

- $\{(X_1'X_1)^{-1}\}_{i,i} = \frac{1}{2}$ (compared to $\frac{1}{N}$ for orthogonal designs)
- $\{(X_1'X_1)^{-1}\}_{i,j} = \frac{1}{4}$ (compared to 0 for orthogonal designs)

For version 2:

- $\{(X_1'X_1)^{-1}\}_{i,i} = \frac{1}{2}$
- $\{(X_1'X_1)^{-1}\}_{i,i+1} = -\frac{1}{4}$, other $\{(X_1'X_1)^{-1}\}_{i,j} = 0$

Why? In each case $\hat\beta_i$ = half the difference of two data values...

- So these are much less efficient than orthogonal Resolution III designs.
- Perhaps reasonable only if $\sigma^2$ is known to be very small relative to the size of important effects.
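The inverse elements quoted above can be checked for a particular f. The sketch below assumes the two OAT layouts are an all-minus baseline run followed by either single-factor changes (version 1) or cumulative changes (version 2); the function names `oat_designs` and `xtx_and_inverse` and the choice f = 5 are illustrative.

```python
import numpy as np

def oat_designs(f):
    """Return the two assumed (f+1)-run one-factor-at-a-time designs as +1/-1 arrays."""
    runs, cols = np.indices((f + 1, f))
    v1 = np.where(runs == cols + 1, 1, -1)   # run i (i >= 1) raises factor i only
    v2 = np.where(runs > cols, 1, -1)        # run i raises factors 1..i together
    return v1, v2

def xtx_and_inverse(D):
    """X1'X1 and its inverse for X1 = (1, D)."""
    X1 = np.column_stack([np.ones(len(D)), D])
    G = X1.T @ X1
    return G, np.linalg.inv(G)

f = 5
v1, v2 = oat_designs(f)
G1, V1 = xtx_and_inverse(v1)
G2, V2 = xtx_and_inverse(v2)

# Main-effect blocks of (X1'X1)^{-1}, i.e. everything except the first row/column
print(np.round(V1[1:, 1:], 2))
print(np.round(V2[1:, 1:], 2))
```

For version 1 the main-effect block has 1/2 on the diagonal and 1/4 elsewhere; for version 2 it is tridiagonal with 1/2 on the diagonal and -1/4 next to it. The negative sign arises because consecutive estimates share one data value with opposite signs.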

Supersaturated Designs

- N < f + 1 ... the full main effects model isn't estimable (i.e. designs aren't Res III)
- Used in preliminary factor screening, or where operational requirements demand a very small N
- Effect sparsity must be strongly assumed

A popular example: Lin (Technometrics 1993)

- For f factors, begin by constructing a P-B design in f + 1 factors.
- Select one column as the branching column, and include only the runs associated with one level of this factor (i.e. half the number of runs in the P-B plan).
- Remove the branching column (which is now confounded with the intercept).

Example: f = 10

Construct the 11-factor P-B design with 12 runs; include only the runs with + in the 11th column. With first row + + - + + + - - - + - shifted cyclically to the right, the six retained runs are:

D =

    - + + - + + + - - -
    - - - + - + + - + +
    + - - - + - + + - +
    + + - - - + - + + -
    - + + + - - - + - +
    + - + + + - - - + -

$X_1'X_1$ has:
- diagonal elements = 6
- other row/column 1 elements = 0
- all other elements = ±2

Can fit models including the mean and any 4 main effects. Stepwise regression is often used for analysis.
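The branching construction and the stated structure of $X_1'X_1$ can be reproduced as follows. This is a sketch, not Lin's published design listing: it assumes the standard 12-run P-B first row with right cyclic shifts, and the variable names (`branch`, `kept`, `off_diag`) are introduced here.

```python
import numpy as np

g = np.array([1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1])   # assumed standard 12-run P-B first row
pb12 = np.vstack([np.roll(g, i) for i in range(11)] + [-np.ones(11, dtype=int)])

branch = 10                                  # use the 11th column as the branching column
kept = pb12[pb12[:, branch] == 1]            # runs with + in the branching column
D = np.delete(kept, branch, axis=1)          # drop the branching column: 6 runs, 10 factors

X1 = np.column_stack([np.ones(len(D), dtype=int), D])
G = X1.T @ X1
off_diag = G[1:, 1:][~np.eye(10, dtype=bool)]   # main-effect by main-effect, off the diagonal

print(D.shape)
print(np.diag(G))                    # diagonal elements
print(G[0, 1:])                      # intercept against each main effect
print(np.unique(np.abs(off_diag)))   # magnitudes of the remaining elements
```

The intercept stays orthogonal to every factor because each remaining column is balanced within the retained half, while every pair of factor columns has inner product +2 or -2.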

Can you do this with regular fractions of Resolution III (including P-B designs with N a power of 2)? Yes, but...

- Suppose you start with I = ABC = ... and use A as a branching column.
- This is the same as splitting again in the regular fractional factorial framework, adding A to the list of factorial effects in the generating relation.
- Then I = BC results, so B = C.
- Columns such as B and C are then completely confounded, so these designs do not allow simultaneous fitting of an arbitrary small subset of main effects.
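The same effect can be seen numerically in the 8-run P-B design, which is a regular fraction. The sketch below assumes the standard 8-run first row with right cyclic shifts and branches on column A; with the negative defining words of that design the confounded pairs appear with opposite signs (for example B = -F rather than B = C).

```python
import numpy as np

g = np.array([1, 1, 1, -1, 1, -1, -1])      # assumed standard 8-run P-B first row
pb8 = np.vstack([np.roll(g, i) for i in range(7)] + [-np.ones(7, dtype=int)])

half = pb8[pb8[:, 0] == 1][:, 1:]   # branch on A: keep runs with A = +, drop column A
names = "BCDEFG"
gram = half.T @ half                # |entry| equal to the run count flags identical columns (up to sign)

for j in range(6):
    for k in range(j + 1, 6):
        if abs(gram[j, k]) == len(half):
            sign = "" if gram[j, k] > 0 else "-"
            print(f"{names[j]} = {sign}{names[k]}")
```

Under these assumptions the loop reports B = -F, C = -D and E = -G, so a model containing both members of any of these pairs cannot be fitted from the four retained runs.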

Other Irregular Fractions

Full $2^f$ plus a regular fraction
- All factorial effects are estimable, some are correlated
- Variance of effect estimates is $< \sigma^2/2^f$, but $> \sigma^2/N$
- Replication without doubling everything

$2^{f-1}$ plus an included smaller fraction (e.g. $2^{f-2}$)
- Same sub-comments as above, but for alias strings and $2^{f-1}$

3 distinct $2^{f-3}$'s from the same generating relation ("3/8 rep")
- More like P-B designs; complex aliasing, more estimable functions

Augmented Plackett-Burman Designs with Replication and Improved Bias Properties

Lu Shen, American Research Institutes, Washington, D.C.
Max D. Morris, Department of Statistics, Iowa State University, Ames, IA

Wednesday, May 25, 2016

Setting: Small Two-Level Factorial Designs

- Resolution III (focus on estimating main effects)
- No centerpoints (for qualitative factors, or operational restriction to 2 levels)

Some Desirable Properties:

- Orthogonal (maximum precision)
- Small number of runs
- Minimal bias associated with higher-order effects
- Replication

These are generally competing properties... can't have it all. A common starting point is orthogonal designs of smallest size.

Plackett-Burman Designs (Biometrika, 1946)

- Number of runs is $n \equiv 0 \pmod 4$, $n \ge f + 1$, where f is the number of design factors
- Characterized by first row of design matrix:

    n     first row
    8     + + + - + - -
    12    + + - + + + - - - + -
    16    ...

Construction of n-run design matrix D:

- the given row is the first run
- cyclic permutations of the given row produce runs 2 through n - 1
- (- - ... -) is the nth run

Plackett-Burman Designs

Example, n = 8, f = 7 (first row + + + - + - -, shifted cyclically to the right):

X_1 = (1, D) =

    + + + + - + - -
    + - + + + - + -
    + - - + + + - +
    + + - - + + + -
    + - + - - + + +
    + + - + - - + +
    + + + - + - - +
    + - - - - - - -

- For f = 4, 5, 6 problem factors, use a subset of these columns.
- For $n = 2^c$, these are equivalent to $2^{f-s}_{III}$ regular fractions.
- For $n \ne 2^c$, these are nongeometric, still providing orthogonal estimates of main effects, but with a more complex alias structure.

Weaknesses & Doubling Strategies to Address Them

D contains no replicates, and main effect estimates can be substantially biased by two-factor interactions.

1. Replicate each run in the P-B design:

   $$D^+ = \begin{pmatrix} D \\ D \end{pmatrix}$$

   $df_{error} = n$. But bias properties are the same as for D.

2. Add the fold-over (negative) of each run in the P-B design:

   $$D^+ = \begin{pmatrix} D \\ -D \end{pmatrix}$$

   Main effect estimates are unbiased by two-factor interactions. But $df_{error} = 0$.

Each yields an orthogonal design in N = 2n runs.

A Compromise Doubling Strategy

Augment D with the same P-B design, but with the columns permuted (the permuted copy is written $\tilde D$ here):

$$D = (c_1, c_2, \ldots, c_f), \qquad D^+ = \begin{pmatrix} D \\ \tilde D \end{pmatrix} = \begin{pmatrix} c_1 & c_2 & \cdots & c_f \\ c_{i_1} & c_{i_2} & \cdots & c_{i_f} \end{pmatrix}$$

where $(i_1, i_2, \ldots, i_f)$ is a permutation of $(1, 2, \ldots, f)$.

Do this in such a way as to jointly:

- maximize the number of replicate run pairs, $df_{error}$
- minimize a measure of potential estimation bias, $\phi$

Pareto optimality: find designs that

- for a given $df_{error}$, have the smallest $\phi$, AND
- for a given $\phi$, have the greatest $df_{error}$

Notes

- All such designs are orthogonal.
- All such designs have $df_{error} \ge 1$, since both halves of the design contain (-, -, ..., -).
- Because D is formed by cyclic rotation of rows, any column of D or of its column-permuted copy (written $\tilde D$ here) can be transformed to $c_1$ by cyclic rotation of rows 1 through n - 1. As a result, it is sufficient to look at:

  For f = n - 1 (all columns of the P-B design are used):
    $D = (c_1\ c_2\ \ldots\ c_f)$
    $\tilde D = (c_1\ c_{i_2}\ \ldots\ c_{i_f})$, with $(i_2, \ldots, i_f)$ ranging over permutations

  For f < n - 1:
    $D = (c_1\ c_{j_2}\ \ldots\ c_{j_f})$, with $(j_2, \ldots, j_f)$ ranging over combinations from $c_2 : c_{n-1}$
    $\tilde D = (c_{k_1}\ c_{k_2}\ \ldots\ c_{k_f})$, with $(k_1, \ldots, k_f)$ ranging over permutations from $c_1 : c_{n-1}$

φ

Mitchell (Technometrics, 1974) suggested using the sum of squared elements of the alias matrix as a design measure of bias.

If a first-order model is fitted to data from $D^+$ by least squares, and two-factor interactions, $\beta_2$, are present, the expectation of the vector of main effects estimates, $\hat\beta_1$, is

$$E(\hat\beta_1) = \beta_1 + A^+\beta_2, \qquad \text{where } A^+ = \frac{1}{N}D^{+\prime}X_2^+$$

and $X_2^+$ is the model matrix associated with two-factor interactions. Following Mitchell, we take

$$\phi = \mathrm{trace}(A^{+\prime}A^+)$$
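Both criteria are straightforward to compute for any pair of halves. The sketch below is one possible implementation, not the authors' search code: `pb8` and `phi_and_df` are hypothetical names, the 8-run design uses the assumed standard first row with right cyclic shifts, column indices are 0-based, and $df_{error}$ is computed as the number of runs minus the number of distinct runs (pure-error degrees of freedom).

```python
import numpy as np
from itertools import combinations

def pb8():
    """8-run P-B design from the assumed standard first row + + + - + - -."""
    g = np.array([1, 1, 1, -1, 1, -1, -1])
    return np.vstack([np.roll(g, i) for i in range(7)] + [-np.ones(7, dtype=int)])

def phi_and_df(D, D_tilde):
    """Return (phi, df_error) for the stacked design D+ = (D; D_tilde)."""
    D_plus = np.vstack([D, D_tilde])
    N, f = D_plus.shape
    X2 = np.column_stack([D_plus[:, j] * D_plus[:, k]
                          for j, k in combinations(range(f), 2)])
    A_plus = D_plus.T @ X2 / N                        # alias matrix for the main effects
    phi = float(np.trace(A_plus.T @ A_plus))          # sum of squared alias-matrix elements
    df_error = N - len(np.unique(D_plus, axis=0))     # replicate run pairs
    return phi, df_error

P = pb8()
# Six-factor case: D = (c1..c6), second half with c4 and c6 exchanged
print(phi_and_df(P[:, [0, 1, 2, 3, 4, 5]], P[:, [0, 1, 2, 5, 4, 3]]))   # (6.0, 4)
print(phi_and_df(P, P))     # plain replication of all seven columns
print(phi_and_df(P, -P))    # complete foldover
```

Plain replication keeps the bias measure of the single design while maximizing $df_{error}$, and the foldover drives $\phi$ to zero with no replicates; column permutations fall between these two extremes.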

Best Values of φ for n = 8 (7 available columns)

                                 df_error
    f        1       2       3       4       5       6       7       8
    4        -       -       -     0.8       -       -       -     0.0*
    5        -     3.0       -     3.0*      -       -       -     6.0*
    6      6.0     6.0       -     6.0*      -       -       -    12.0*
    7     10.5*   12.0*      -    15.0*      -       -       -    21.0*

Values marked * are Pareto-optimal (underlined on the original slide).
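The Pareto rule (no other design has at least as many error degrees of freedom and at most as much bias, with one of the two strictly better) can be applied mechanically to a row of the table. The function name `pareto_optimal` is hypothetical, and the input below is the f = 6 row as tabulated, entered as (df_error, φ) pairs.

```python
def pareto_optimal(points):
    """Keep the (df_error, phi) pairs that no other pair dominates.

    A pair is dominated if another pair has df_error at least as large and
    phi at least as small, with strict improvement in at least one of the two.
    """
    keep = []
    for df, phi in points:
        dominated = any(
            (d >= df and p <= phi) and (d > df or p < phi)
            for d, p in points
        )
        if not dominated:
            keep.append((df, phi))
    return keep

# f = 6 row of the n = 8 table: (df_error, best phi)
row_f6 = [(1, 6.0), (2, 6.0), (4, 6.0), (8, 12.0)]
print(pareto_optimal(row_f6))   # [(4, 6.0), (8, 12.0)]
```

With equal φ at 1, 2 and 4 degrees of freedom, only the design with 4 survives, alongside the fully replicated design at 8.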

Example: n = 8 (7 available columns), f = 6

$D = (c_1, c_2, c_3, c_4, c_5, c_6)$

    I = -126 = -134 = -245 = -356 = +1235 = +1456 = +2346

    Alias structure:
        1 = -26 = -34
        2 = -16 = -45
        3 = -14 = -56
        4 = -13 = -25
        5 = -24 = -36
        6 = -12 = -35

    φ = 12, df_error = 0

Permuted copy $\tilde D = (c_1, c_2, c_3, c_6, c_5, c_4)$

    I = -124 = -136 = -256 = -345 = +1235 = +1456 = +2346

$D^+$:

    Alias structure:
        1 = -1/2 24 = -1/2 26 = -1/2 34 = -1/2 36
        2 = -1/2 14 = -1/2 16 = -1/2 45 = -1/2 56
        3 = -1/2 14 = -1/2 16 = -1/2 45 = -1/2 56
        4 = -1/2 12 = -1/2 13 = -1/2 25 = -1/2 35
        5 = -1/2 24 = -1/2 26 = -1/2 34 = -1/2 36
        6 = -1/2 12 = -1/2 13 = -1/2 25 = -1/2 35

    φ = 6, df_error = 4

Example: n = 8 (7 available columns), f = 7

$D = (c_1, c_2, c_3, c_4, c_5, c_6, c_7)$

- Regular $2^{7-4}$
- Alias structure: each main effect is aliased with 3 two-factor interactions
- φ = 21, df_error = 0

Permuted copy $\tilde D = (c_1, c_2, c_3, c_4, c_6, c_7, c_5)$

- Regular $2^{7-4}$
- Alias structure: each main effect is aliased with 3 two-factor interactions:
  - for 3 factors, one in common with those in D
  - for 4 factors, no overlap with aliased two-factor interactions in D

$D^+$:

- Alias structure:
  - for 3 factors, complete aliasing with 1 two-factor interaction, partial ($-\frac{1}{2}$) aliasing with 4
  - for 4 factors, partial ($-\frac{1}{2}$) aliasing with 6 two-factor interactions
- φ = 12, df_error = 2

Best Values of φ for n = 12 (11 available columns)

                                                 df_error
    f        1       2       3       4       5       6       7       8       9      10      11      12
    8      6.0     6.0*    8.7*   10.7       -     9.3*      -       -       -       -       -    18.7*
    9      9.3*   10.7*   14.0    16.0       -    14.0*      -       -       -       -       -    24.0*
    10    13.3*   16.0*   20.0    24.0       -    20.0*      -       -       -       -       -    40.0*
    11    18.3*   26.3*   29.0*   36.3       -    35.0*      -       -       -       -       -    55.0*

Values marked * are Pareto-optimal (underlined on the original slide).

Example: n = 12 (11 available columns), f = 10

$D = (c_1, c_2, c_3, c_4, c_5, c_6, c_7, c_8, c_9, c_{10})$

- Alias structure: each main effect is partially aliased with 36 two-factor interactions, each with weight $\pm\frac{1}{3}$
- φ = 40, df_error = 0

Permuted copy $\tilde D = (c_1, c_2, c_6, c_3, c_{10}, c_7, c_8, c_4, c_{11}, c_9)$

$D^+$:

- Alias structure: each main effect is partially aliased with 18 two-factor interactions (half of the previous set), each with weight $\pm\frac{1}{3}$
- φ = 20, df_error = 6

Example: n = 12 (11 available columns), f = 11

$D = (c_1, c_2, c_3, c_4, c_5, c_6, c_7, c_8, c_9, c_{10}, c_{11})$

- Alias structure: each main effect is partially aliased with 45 two-factor interactions, each with weight $\pm\frac{1}{3}$
- φ = 55, df_error = 0

Permuted copy $\tilde D = (c_1, c_2, c_3, c_4, c_7, c_{10}, c_5, c_9, c_8, c_{11}, c_6)$

$D^+$:

- Alias structure: 9 of the main effects are each partially aliased with 23 two-factor interactions, and the other 2 with 27 (subsets of the previous set), each with weight $\pm\frac{1}{3}$, so that φ = (9 × 23 + 2 × 27)/9
- φ = 29, df_error = 3

Conclusion

Proposed process:

- is a compromise between
  - doubling a P-B design with all replicate runs
  - doubling a P-B design with all foldover runs
- yields two-level orthogonal designs that, compared to the original P-B design,
  - provide degrees of freedom for estimating error variance
  - reduce potential bias of main effect estimates

References

Mitchell, T. J. (1974), Computer Construction of D-Optimal First-Order Designs, Technometrics 16, pp. 211-220.

Plackett, R. L. and J. P. Burman (1946), The Design of Optimum Multifactorial Experiments, Biometrika 33 (4), pp. 305-325.