Simulation Factor Screening With Controlled Sequential Bifurcation in the Presence of Interactions


Hong Wan
School of Industrial Engineering, Purdue University, West Lafayette, IN, U.S.A.

Bruce E. Ankenman and Barry L. Nelson
Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, IL, U.S.A.

March 28, 2006

Abstract

Controlled Sequential Bifurcation (CSB) is a factor-screening method for discrete-event simulations. CSB controls the power for detecting important effects at each bifurcation step and the Type I Error for each unimportant factor under heterogeneous variance conditions when a main-effects model applies. Unfortunately, when factors interact, as they nearly always do in computer simulation models, then CSB can fail to identify all of the important factors. This paper proposes CSB-X, an improved CSB procedure that has the same error control for screening main effects that CSB does even when two-factor interactions (X) are present. The performance of the new method is proven and compared with the original CSB

procedure. In addition, a new fully sequential hypothesis-testing procedure is introduced that greatly improves the efficiency of both CSB and CSB-X.

1 Introduction

Screening experiments are designed to investigate the controllable factors in an experiment with a view toward eliminating the unimportant ones. Controlled Sequential Bifurcation (CSB) has been proposed as an effective and efficient factor-screening method for computer simulation experiments (Wan, Ankenman and Nelson, 2006). CSB emphasizes situations that often occur in simulation experiments, but occur less frequently in physical experiments, including: a large number of factors, factor settings that can be chosen so that all factor effects are positive, easy switching between systems, and the use of common random numbers (CRN) to reduce the variances of estimated effects (Law and Kelton, 2000). CSB extends the basic Sequential Bifurcation (SB) procedure (Bettonvil, 1995; Bettonvil and Kleijnen, 1997; Cheng, 1997) to provide error control for random responses. By incorporating a multi-stage hypothesis-testing approach into sequential bifurcation, CSB is the first screening strategy to simultaneously control the Type I Error for each factor and the power for each bifurcation step under heterogeneous variance conditions. The methodology is easy to implement and is more efficient than traditional designs in many scenarios, making it attractive for many simulation applications. However, Wan, et al. (2006) assume that the simulation output can be represented by a main-effects model

Y = β0 + Σ_{i=1}^K βi xi + ε   (1)
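As a concrete illustration of model (1) (not from the paper; the coefficients and the variance function below are hypothetical), the response can be simulated as:

```python
import random

def simulate_main_effects(x, beta0, beta, sigma):
    """One draw of Y = beta0 + sum_i beta[i]*x[i] + eps, where
    eps ~ Nor(0, sigma(x)^2) may depend on x (heterogeneous variance)."""
    mean = beta0 + sum(b * xi for b, xi in zip(beta, x))
    return random.gauss(mean, sigma(x))

# Hypothetical 4-factor example: all main effects are nonnegative and the
# noise standard deviation grows with the number of factors at the high setting.
beta0, beta = 10.0, [0.0, 2.0, 0.0, 5.0]
sigma = lambda x: 1.0 + 0.5 * sum(x)
y = simulate_main_effects([1, 1, 0, 0], beta0, beta, sigma)
```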

with all βi ≥ 0, 0 < i ≤ K. The error term, ε, is assumed to be a Nor(0, σ²(x)) random variable, where σ²(x) is unknown and may depend on x = (x1, x2,..., xK), and Y is the simulation output response of interest. Unfortunately, it is not always realistic to assume that there are no interactions (and it is very unusual to know that there are none). Further, even when subject-matter experts can specify the directions of the main effects, the signs of the interactions are rarely known. As we will demonstrate later, if interactions are present and we ignore them, then the results given by CSB can be misleading. Kleijnen, Bettonvil and Persson (2003) proposed a fold-over design for SB in which a more general model (2) that includes all two-factor interactions and quadratic terms is assumed:

Y = β0 + Σ_{i=1}^K βi xi + Σ_{i=1}^K Σ_{j=i}^K βij xi xj + ε   (2)

where βi ≥ 0, 0 < i ≤ K, is the main effect of factor i, and βij is the interaction between factors i and j. When i = j, βii represents the quadratic effect of factor i. The error term, ε, has the same properties as in the main-effects model. The goal of screening is still to identify the factors with important main effects, so the strategy is to include new design points that allow us to eliminate the bias of the higher-order terms on the main-effect estimators. The signs of the interactions are irrelevant and do not influence the final result, so no prior knowledge of interactions is needed. Since the method does not estimate the interactions themselves, it will miss those factors with small main effects but large interactions with other factors. For the purpose of screening, however, it is often reasonable to assume that only the main effects are of interest.
For example, if one accepts the heredity property (Wu and Hamada 2000), then for an interaction to be important at least one of its parent factors should be important; therefore, the pursuit of important interactions can focus on interactions that involve at least one of the important main effects.

In this paper, we propose a modified fold-over design for CSB. At each step, the design gives an unbiased estimator of the main effect of the group, which is defined as the summation of the main effects of all factors in the group, even when two-factor interactions and quadratic terms are present. The new method is called CSB-X, where X represents interactions. It can be proved that CSB-X has the same error control property as CSB for main effects. Moreover, even though CSB-X doubles the design points of CSB, in most cases it is at least as efficient as CSB. We also improve the efficiency of CSB by substantially generalizing the fully sequential testing procedure introduced in Wan, et al. (2006). The original fully sequential testing procedure applies to only a very specific setting, rarely encountered in practice; the new test presented here is general. When variances are large, the more general two-stage hypothesis-testing procedure in Wan, et al. (2006) may require many observations to achieve the desired error control. The extended fully sequential procedure introduced here can save as much as two-thirds of the computational effort relative to the two-stage hypothesis-testing procedure.

The paper is organized as follows: In Section 2, we give a thorough review of the CSB method and its performance guarantees. In Section 3, we discuss the new CSB-X procedure based on fold-over designs. The performance is proven, and the screening effectiveness of the original CSB and CSB-X is compared. The extended fully sequential testing procedure is discussed in Section 4. An empirical evaluation of the performance of CSB and CSB-X with the fully sequential testing procedure is reported in Section 5. A realistic example, first presented in Wan, et al. (2006), is reevaluated with CSB-X in Section 6.

2 Review of CSB

In this section, we review the CSB method proposed by Wan, et al. (2006). Let α, γ, Δ0 and Δ1 be user-specified parameters with Δ1 > Δ0.
The primary objective of CSB is to divide

the factors into two groups: those that are unimportant, which means βi ≤ Δ0; and those that are important, meaning βi > Δ0. For those factors with effects ≤ Δ0, we want to control the Type I Error of declaring them important to be ≤ α; and for effects that are ≥ Δ1, we want the power for identifying them as important to be ≥ γ. For those factors with effects between Δ0 and Δ1, the procedure should have reasonable power to identify them. Wan, et al. (2006) proposed a cost model that connects both the thresholds of importance (Δ0 and Δ1) and factor levels with the cost to change them to guarantee a fair comparison. More generally, these parameters should be determined by the goal and properties of the specific problem. Both the CSB procedure and the CSB-X procedure described in this paper are independent of how the factor levels and thresholds of importance are determined.

A formal description of CSB appears in Figure 1. In each step of this sequential procedure, the cumulative effect of a group of factors is tested for importance. The first step begins with all K factors of interest in a single group and tests that group's effect. If the group's effect is important, indicating that at least one factor in the group may have an important effect, then the group is split into two subgroups. The cumulative effects of these two subgroups are then tested in subsequent steps and each subgroup is either classified as unimportant or split into two subgroups for further testing. As the experiment proceeds, the groups become smaller until eventually all factors that have not been classified as unimportant are tested individually. The method assumes a main-effects model with normal errors and all factor effects βi ≥ 0. More specifically, suppose the group to be tested contains factors {k1 + 1, k1 + 2,..., k2} with k1 < k2. The Test step in Figure 1 tests the following hypothesis to determine if the group might contain important factors:

H0: Σ_{i=k1+1}^{k2} βi ≤ Δ0  vs.  H1: Σ_{i=k1+1}^{k2} βi > Δ0.

Let xi represent the setting of factor i. An experiment at level k is defined as the following

Initialization: Create an empty LIFO queue for groups. Add the group {1, 2,..., K} to the LIFO queue.
While the queue is not empty, do
  Remove: Remove a group from the queue.
  Test:
    Unimportant: If the group is unimportant, then classify all factors in the group as unimportant.
    Important (size = 1): If the group is important and of size 1, then classify the factor as important.
    Important (size > 1): If the group is important and its size is greater than 1, then split it into two subgroups such that all factors in the first subgroup have smaller indices than those in the second subgroup. Add each subgroup to the LIFO queue.
  End Test
End While

Figure 1: Structure of CSB

factor settings:

x_i(k) = 1, for i = 1, 2,..., k; and x_i(k) = 0, for i = k + 1, k + 2,..., K.

Thus, level k indicates an experiment at which factors 1, 2,..., k are at their high settings, and factors k+1, k+2,..., K are at their low settings. We use Z(k) to represent the response at level k, so that Z(k) = β0 + Σ_{i=1}^k βi + ε(k) given model (1). Let Z_l(k) denote the lth simulation replication of an experiment at level k, 0 ≤ k ≤ K. Therefore, D_l(k1, k2) = Z_l(k2) − Z_l(k1) is an unbiased estimator of the group effect Σ_{i=k1+1}^{k2} βi. Define a qualified hypothesis test as a test that guarantees:

Pr{Declare group effect important | Σ_{i=k1+1}^{k2} βi ≤ Δ0} ≤ α, and
Pr{Declare group effect important | Σ_{i=k1+1}^{k2} βi ≥ Δ1} ≥ γ.
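The queue-based control flow of Figure 1 can be sketched as follows; `test_group` stands in for a qualified hypothesis test (here a noiseless oracle on known effects, purely for illustration):

```python
def csb(K, test_group):
    """Controlled Sequential Bifurcation skeleton (Figure 1).
    test_group(k1, k2) must return True if the group effect
    beta_{k1+1} + ... + beta_{k2} is declared important."""
    unimportant, important = [], []
    queue = [(0, K)]                 # LIFO queue; a group is stored as (k1, k2)
    while queue:
        k1, k2 = queue.pop()
        if not test_group(k1, k2):
            unimportant.extend(range(k1 + 1, k2 + 1))
        elif k2 - k1 == 1:
            important.append(k2)
        else:
            mid = (k1 + k2) // 2     # split: lower-index subgroup first
            queue.append((mid, k2))
            queue.append((k1, mid))
    return sorted(important), sorted(unimportant)

# Illustration with hypothetical effects and a noiseless test against Delta0 = 2:
beta = [0, 0, 3, 0, 0, 0, 5, 0, 0, 0, 0]   # beta[i] is factor i's effect; beta[0] unused
oracle = lambda k1, k2: sum(beta[k1 + 1:k2 + 1]) > 2
imp, unimp = csb(10, oracle)               # factors 2 and 6 are declared important
```

With a noisy qualified test in place of the oracle, the same control flow yields the error guarantees of Theorems 1 and 2 below.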

Given a qualified procedure and assuming model (1) holds, we can prove the following two theorems for CSB:

Theorem 1 CSB guarantees that Pr{Declare factor i important | βi ≤ Δ0} ≤ α for each factor i individually.

Theorem 2 CSB guarantees that Pr{Declare a group effect important | the effect ≥ Δ1} ≥ γ for each group tested.

The Type I Error control, described in Theorem 1, is for each factor individually. We will briefly discuss the experiment-wise Type I Error control of CSB by evaluating the expected number of factors that are falsely classified as important, denoted E[F_K], for two extreme cases. To simplify the analysis, we assume that there are K = 2^L factors, where L is an integer, and all tests are independent. Then we have the following two theorems:

Theorem 3 If model (1) holds and Σ_{i=1}^K βi ≤ Δ0, then CSB with a qualified testing procedure guarantees that E[F_K] ≤ α.

Theorem 4 If model (1) holds and βi ≤ Δ0, i = 1, 2,..., K, but βi + βj ≥ Δ1 for all i ≠ j, then CSB with a qualified testing procedure guarantees that E[F_K] ≤ Kα.

Realistic problems should fall between these two extreme cases, but closer to Theorem 3. Therefore, CSB provides strong control of the false-positive rate, regardless of the number of factors. Wan, et al. (2006) proposed two qualified tests: a two-stage testing procedure, and a fully sequential testing procedure for a very special case (α = 1 − γ). The two-stage testing

procedure takes a small number of observations in the first stage. If no conclusion can be made, more samples are collected in the second stage. The fully sequential testing procedure also has a first stage with a small number of observations, but then it takes one observation at a time until a conclusion can be made. The fully sequential testing procedure is usually more efficient than the two-stage one. In Section 4, we will extend it to general cases (α ≠ 1 − γ).

In summary, CSB is a new sequential screening method for stochastic simulation experiments. Given an appropriate hypothesis-testing procedure, the method controls the Type I Error for each factor and the power for each bifurcation step. The sequential property of the method makes it well suited for simulation experiments. Examples show that CSB is highly efficient for large-scale problems when there are few important factors and they are clustered, since CSB can eliminate unimportant factors in groups.

3 Controlled Sequential Bifurcation with Fold-Over Designs (CSB-X)

In this section, we introduce an improved CSB methodology called CSB-X. The new procedure can handle two-factor interactions and gives the same error control for main-effect screening as the original CSB. Moreover, under typical conditions, the new procedure is as efficient, if not more efficient, than CSB. This work exploits the fold-over design for SB of Kleijnen, Bettonvil and Persson (2003). When two-factor interactions exist (model (2)),

Z(k) = β0 + Σ_{i=1}^k βi + Σ_{i=1}^k Σ_{j=1}^i βij + ε(k)

and

E[D_l(k1, k2)] = Σ_{i=k1+1}^{k2} βi + Σ_{i=k1+1}^{k2} Σ_{j=1}^{k1} βij + Σ_{i=k1+1}^{k2} Σ_{j=i}^{k2} βij.

Therefore, the group-effect estimator is biased by the two-factor interactions. Specifically, when large negative interactions exist, important main effects may get cancelled and the screening result is misleading. To eliminate the influence of the interactions, new design points called mirror levels will be run. For each level k ≥ 0, the mirror level of the experiment at level k, denoted by level −k, is

defined by the following factor settings:

x_i(−k) = −1, for i = 1, 2,..., k; and x_i(−k) = 0, for i = k + 1, k + 2,..., K.

Therefore, Z(−k) = β0 − Σ_{i=1}^k βi + Σ_{i=1}^k Σ_{j=1}^i βij + ε(−k). The initial experiment at any level consists of N0 replications, but more generally n_k denotes the number of replications that have been taken at level k. If we define Y_l(k) = (Z_l(k) − Z_l(−k))/2, 0 ≤ k ≤ K, and redefine the difference D_l(k1, k2) as D_l(k1, k2) = Y_l(k2) − Y_l(k1), l = 1, 2,..., min{n_{k1}, n_{k2}}, then it is easy to see that, given model (2), E[Y_l(k)] = Σ_{i=1}^k βi and E[D_l(k1, k2)] = Σ_{i=k1+1}^{k2} βi, which is the sum of the main effects of the group. The effects of interactions and quadratic terms are eliminated. This is the main improvement of CSB-X over CSB, and it is accomplished through a simple data transformation from the Z's to the Y's. With this data transformation, CSB-X is exactly the same as CSB. In CSB-X, whenever data are collected at any level k, the same amount of data is also collected at level −k. All of the definitions and notation from Wan, et al. (2006) carry over to the new Y_l(k). Since both Y_l(k) in CSB and Y_l(k) in CSB-X are normally distributed estimators of Σ_{i=1}^k βi, the data transformation introduced in CSB-X does not influence the validity of the statistical test, so both the two-stage testing procedure and the fully sequential testing procedure introduced in Wan, et al. (2006) are qualified and can be implemented directly. The error control of CSB-X is the same as in CSB (Theorems 1 to 4). More generally, CSB-X with a qualified hypothesis-testing procedure controls the Type I Error for each factor individually and guarantees the power for each step. The screening procedure does not require an equal-variance assumption, and is valid with or without CRN when using our tests.
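To see how the transformation Y_l(k) = (Z_l(k) − Z_l(−k))/2 removes the interaction bias, consider a noiseless version of model (2) (the coefficients below are hypothetical; with noise, the same cancellation holds in expectation):

```python
def z(level, K, beta0, beta, beta2):
    """Noiseless model-(2) response at level k > 0 or its mirror level -k:
    x_i = +1 (level) or -1 (mirror) for i <= k, and 0 otherwise."""
    k, sign = abs(level), (1 if level >= 0 else -1)
    x = [sign if i < k else 0 for i in range(K)]
    y = beta0 + sum(beta[i] * x[i] for i in range(K))
    y += sum(beta2[i][j] * x[i] * x[j] for i in range(K) for j in range(i, K))
    return y

K, beta0 = 4, 5.0
beta = [1.0, 2.0, 0.0, 4.0]            # main effects of factors 1..4
beta2 = [[0.5] * K for _ in range(K)]  # interaction and quadratic terms (i <= j used)

k1, k2 = 1, 3                          # test group {2, 3}
d_csb = z(k2, K, beta0, beta, beta2) - z(k1, K, beta0, beta, beta2)
d_csbx = (z(k2, K, beta0, beta, beta2) - z(-k2, K, beta0, beta, beta2)) / 2 \
       - (z(k1, K, beta0, beta, beta2) - z(-k1, K, beta0, beta, beta2)) / 2
# d_csbx recovers the group main effect beta_2 + beta_3 = 2.0 exactly,
# while d_csb is biased upward by the (positive) interaction terms.
```

Because x_i x_j is the same at a level and at its mirror, the interaction and quadratic terms cancel in the half-difference, leaving only the main effects.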

3.1 Efficiency of CSB-X

Because of the introduction of mirror levels, CSB-X doubles the number of design points relative to CSB. However, unlike the traditional fold-over design, the overall computational effort of CSB-X is not necessarily doubled. For any specific group {k1 + 1, k1 + 2,..., k2} tested, define

D_CSB(k1, k2) = Z(k2) − Z(k1), and
D_CSB-X(k1, k2) = (Z(k2) − Z(−k2))/2 − (Z(k1) − Z(−k1))/2

for CSB and CSB-X, respectively. Then:

Var[D_CSB(k1, k2)] = Var(Z(k2)) + Var(Z(k1)) − 2Cov(Z(k1), Z(k2)),

while

Var[D_CSB-X(k1, k2)] = (1/4){Var(Z(k2)) + Var(Z(k1)) + Var(Z(−k2)) + Var(Z(−k1)) − 2Cov(Z(k1), Z(k2)) − 2Cov(Z(−k1), Z(−k2)) − 2Cov(Z(k2), Z(−k2)) − 2Cov(Z(k1), Z(−k1)) + 2Cov(Z(−k1), Z(k2)) + 2Cov(Z(k1), Z(−k2))}.

If Var(D_CSB(k1, k2)) ≥ 2Var(D_CSB-X(k1, k2)), then the expected number of runs required in each Test step for the fold-over design may be smaller than for the original design, for both the two-stage and fully sequential testing procedures in Wan, et al. (2006) and any other test for which the sample size is proportional to the variance of the tested variables. In other words, the expected number of runs in the worst-case scenario for each test is not increased when we use the fold-over design. The intuition is that the fold-over design decreases the variance of the test statistic, so even though the number of design points is doubled, the number of replications taken at each design point decreases. The amount of the decrease depends on the variance and covariance structure between different levels. Here are two special cases:

If the responses at all levels are independent of each other, then

Var[D_CSB(k1, k2)] = Var(Z(k2)) + Var(Z(k1)), and
Var[D_CSB-X(k1, k2)] = (1/4){Var(Z(k2)) + Var(Z(k1)) + Var(Z(−k2)) + Var(Z(−k1))}.

If, in addition, the responses from different levels have the same variance, then Var[D_CSB(k1, k2)] = 2Var[D_CSB-X(k1, k2)], so CSB-X should be as efficient as CSB at each step.

If we implement CRN across all positive levels and all mirror levels separately, leaving the responses between positive levels and mirror levels independent, then when CRN is effective,

Var[D_CSB(k1, k2)] = Var(Z(k2)) + Var(Z(k1)) − 2Cov(Z(k1), Z(k2)) < Var(Z(k2)) + Var(Z(k1)), and
Var[D_CSB-X(k1, k2)] = (1/4){Var(Z(k2)) + Var(Z(k1)) + Var(Z(−k2)) + Var(Z(−k1)) − 2Cov(Z(k1), Z(k2)) − 2Cov(Z(−k1), Z(−k2))} < (1/4){Var(Z(k2)) + Var(Z(k1)) + Var(Z(−k2)) + Var(Z(−k1))}.

In this case, both CSB and CSB-X work more efficiently with CRN than without CRN. If Cov(Z(k1), Z(k2)) = Cov(Z(−k1), Z(−k2)) and Var(Z(k1)) + Var(Z(k2)) = Var(Z(−k1)) + Var(Z(−k2)), then CSB-X is as efficient as CSB in each Test step.

Other strategies can be analyzed similarly; they may increase or decrease the efficiency of CSB and CSB-X, depending on the specific covariance structure. But our overall conclusion is that we expect CSB-X to be roughly as efficient as, and sometimes more efficient than, CSB, while also identifying important factors more effectively than CSB.
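A quick numeric check of the CRN case above, with hypothetical values: a common variance σ² = 4 at every level, CRN correlation ρ = 0.6 within the positive levels and within the mirror levels, and independence between the two sets:

```python
def var_d_csb(v1, v2, c12):
    """Var[Z(k2) - Z(k1)]."""
    return v2 + v1 - 2 * c12

def var_d_csbx(v1, v2, vm1, vm2, c12, cm12,
               c2m2=0.0, c1m1=0.0, cm1_2=0.0, c1_m2=0.0):
    """Var[(Z(k2)-Z(-k2))/2 - (Z(k1)-Z(-k1))/2] for a general covariance structure."""
    return (v2 + v1 + vm2 + vm1
            - 2 * c12 - 2 * cm12 - 2 * c2m2 - 2 * c1m1
            + 2 * cm1_2 + 2 * c1_m2) / 4.0

sigma2, rho = 4.0, 0.6
cov = rho * sigma2
v_csb = var_d_csb(sigma2, sigma2, cov)                        # 2*sigma2*(1-rho) = 3.2
v_csbx = var_d_csbx(sigma2, sigma2, sigma2, sigma2, cov, cov) # 1.6
# v_csb == 2 * v_csbx: the equality case described in the text, so under this
# covariance structure the fold-over design costs nothing extra per Test step.
```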

4 Extended Fully Sequential Testing Procedure

A key feature of CSB-X is that it provides strong statistical inference (Theorems 1 to 4) relative to other factor-screening methods. There is, however, a price to be paid for such strong inference (particularly the power guarantee) in terms of the simulation effort required to attain it. Although the structure of CSB-X makes it efficient when there are a small number of important factors, the efficiency of the testing procedure used in each step is just as critical to the overall computational cost of running CSB-X. Wan, et al. (2006) described a fully sequential testing procedure for the special case when the experimenter is willing to set α = 1 − γ, where α is the required probability of Type I Error and γ is the required power. The procedure is based on a ranking-and-selection procedure due to Kim (2005). The procedure adds one replication at a time and terminates as soon as a conclusion can be made. In most cases, it is much more efficient than their two-stage testing procedure, which has no such limitation. However, the additional requirement that α = 1 − γ severely limits the fully sequential test's usefulness. In typical applications α is chosen quite small, say 0.05, while γ = 0.8, 0.85 or 0.9 is often a sufficient level of power. In this section, we generalize the fully sequential testing procedure so that it can be used when α ≠ 1 − γ. This is a nontrivial extension that requires a significantly different development. The extended fully sequential testing procedure is a qualified test for CSB-X and CSB, and is invoked each time the Test step is encountered in Figure 1.

4.1 Procedure

Unlike the two-stage testing procedure, for which the second-stage sample size is based on the worst-case scenario, the extended fully sequential testing procedure adds one replication at a time to both the upper and lower levels of the group being tested. It will reach a conclusion as soon as the information is sufficient.
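In outline, a single Test step of the extended procedure behaves as sketched below (a hypothetical, simplified sketch: `draw_pair` stands in for taking one more paired replication at levels k1 and k2, and the boundary intercept `a`, slope `lam`, and drift `r0` are the quantities defined in the remainder of this section, here given arbitrary values):

```python
import random

def fully_sequential_test(draw_pair, d_init, a, lam, r0):
    """Triangular-region sequential test for one bifurcation step.
    d_init holds the first-stage paired differences D_l(k1, k2);
    draw_pair() returns one additional paired difference.
    Returns True if the group is declared important."""
    d = list(d_init)
    M = a / lam  # the two boundaries meet after about M paired observations
    while True:
        r = len(d)
        stat = sum(x - r0 for x in d)
        if r > M:
            return stat > 0          # truncation rule
        if stat <= -a + lam * r:     # Termination Boundary 1
            return False             # unimportant
        if stat >= a - lam * r:      # Termination Boundary 2
            return True              # important
        d.append(draw_pair())        # one more replication at both levels

# Hypothetical illustration: true group effect 5 with Delta0 = 2, Delta1 = 4,
# so lam = (4 - 2)/4 = 0.5 and r0 = 3 (the symmetric choice); a is arbitrary here.
random.seed(7)
draw = lambda: random.gauss(5.0, 1.0)
d0 = [draw() for _ in range(5)]      # N0 = 5 first-stage differences
decision = fully_sequential_test(draw, d0, a=10.0, lam=0.5, r0=3.0)
```

With a strongly important group, the statistic drifts upward and typically exits through Termination Boundary 2 well before the truncation point, which is the source of the procedure's efficiency.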
We use r to represent the current number of replications

at levels k1 and k2 (the replications are always paired for the upper and lower levels of the testing group). The test also utilizes

S²(k1, k2) = (1/(N0 − 1)) Σ_{l=1}^{N0} (D_l(k1, k2) − D̄(k1, k2))²,

the first-stage (initial N0 observations) sample variance of the paired differences. If CSB-X is implemented, then whenever replications are taken at level k, the same number of replications are made at level −k. The extended fully sequential test takes paired observations from each level, one pair at a time, and checks whether Σ_{l=1}^r (D_l(k1, k2) − r0) crosses one of two termination boundaries, where r0 is a drift parameter. If Termination Boundary 1 is crossed, the group is declared unimportant. If Termination Boundary 2 is crossed, the group is declared important. The maximum number of paired observations that will be taken is one more than M(k1, k2) = a(k1, k2)/λ, where a(k1, k2) and λ are parameters described below. The procedure is illustrated in Figure 2, where the dots represent the value of the test statistic as a function of the number of paired observations. After the initial N0 replications, the test adds one replication at a time to both the upper and lower levels of the group being tested until a decision is made.

The critical region for the extended fully sequential test is a function of the following quantities: ±a(k1, k2) = ±a0 S²(k1, k2), the intercepts of the triangular region, and ±λ = ±(Δ1 − Δ0)/4, the slopes of the sides of the triangular region. The constants a0 and r0 are the solutions of the following equations:

∫_0^∞ ∫_0^∞ ((ν(y)^θ(x) − ν(y)) / (ν(y)^θ(x) − 1)) φ(y − (r0 − Δ0)) f(x, N0 − 1) dx dy = α,   (3)

and

∫_0^∞ ∫_0^∞ ((ν(y)^θ(x) − ν(y)) / (ν(y)^θ(x) − 1)) φ(y − (Δ1 − r0)) f(x, N0 − 1) dx dy = 1 − γ,   (4)

where ν(y) = e^{−2λy}, φ(x) = e^{−x²/2}/√(2π), θ(x) = (N0 − 1)/(x a0/λ), and

f(x, n) = x^{n/2 − 1} e^{−x/2} / (2^{n/2} Γ(n/2)), for x ≥ 0, and f(x, n) = 0, for x < 0,

which is the density of the χ² distribution with n degrees of freedom, and Γ(n/2) = ∫_0^∞ x^{n/2 − 1} e^{−x} dx.

If α = 1 − γ (the symmetric case), then a0 and r0 are given as a0 = 2η(N0 − 1)/(Δ1 − Δ0) and r0 = (Δ0 + Δ1)/2, where η = (e^{−2 ln(2α)/(N0 − 1)} − 1)/2 (Kim 2005). Our contribution is to extend the procedure to the asymmetric case (α ≠ 1 − γ). Achieving high power often requires a very large number of replications, so we feel it is a significant practical contribution to decouple the choice of α and γ.

As in Wan, et al. (2006), data (replications) are obtained whenever new groups are formed: When forming a new group containing factors {k1 + 1, k1 + 2,..., k2} with k1 < k2, the numbers of observations at level k1 and level k2 are evened up in the following way before beginning the test:

If n_{k1} = 0, then collect N0 observations at level k1 and set n_{k1} = N0.
If n_{k2} = 0, then collect N0 observations at level k2 and set n_{k2} = N0.
If n_{k1} < n_{k2}, then make n_{k2} − n_{k1} additional replications at level k1 and set n_{k1} = n_{k2}.
If n_{k2} < n_{k1}, then make n_{k1} − n_{k2} additional replications at level k2 and set n_{k2} = n_{k1}.

Extended Fully Sequential Test

0. Set r = n_{k1} = n_{k2}.
1. If r > M(k1, k2), then
   (a) If Σ_{l=1}^r (D_l(k1, k2) − r0) ≤ 0, then stop and classify the group as unimportant.
   (b) Else stop and classify the group as important.

Figure 2: Fully Sequential Test. [The figure plots the test statistic Σ_{l=1}^r (D_l(k1, k2) − r0) against the number of paired samples r, from N0 to M(k1, k2); Termination Boundary 1 (TB1, below, leading to "unimportant") and Termination Boundary 2 (TB2, above, leading to "important") form a triangular continuation region.]

2. Else (i.e., r ≤ M(k1, k2)),
   (a) If Σ_{l=1}^r (D_l(k1, k2) − r0) ≤ −a(k1, k2) + λr (Termination Boundary 1), then classify the group as unimportant.
   (b) Else if Σ_{l=1}^r (D_l(k1, k2) − r0) ≥ a(k1, k2) − λr (Termination Boundary 2), then classify the group as important.
   (c) Else take one more replication at both levels k1 and k2, set r = r + 1, and go to Step 1.

When the group effect is significantly larger than Δ1 or smaller than Δ0, the test can be much more efficient than a two-stage testing procedure because of the possibility of early

termination. The procedure proposed here is applicable to both symmetric (i.e., α = 1 − γ) and asymmetric (i.e., α ≠ 1 − γ) cases. When α = 1 − γ, the parameters a0 and r0 can be calculated directly; otherwise closed forms are not available and numerical methods must be used to solve Equations (3) and (4). See Section 4.3.

4.2 Performance of the Extended Fully Sequential Testing Procedure

The extended fully sequential testing procedure controls the Type I Error and power for each bifurcation step; i.e., it is a qualified testing procedure. More specifically, we can prove the following theorem for every single bifurcation step. The proof of the performance of the entire CSB and CSB-X procedures then follows as in Wan, et al. (2006, Theorems 1 and 2).

Theorem 5 Suppose the responses D_l(k1, k2) are i.i.d. normal random variables. If E[D_l(k1, k2)] ≤ Δ0, then Pr{Declare the group important} ≤ α. If E[D_l(k1, k2)] ≥ Δ1, then Pr{Declare the group important} ≥ γ.

The proof is given in Appendix A.

4.3 Determining Critical Values

The challenge of implementing the extended fully sequential testing procedure lies in solving Equations (3) and (4) to get a0 and r0. Numerical methods are used to evaluate the double integrals, and a two-dimensional line search is then required to find a solution. We obtained a0 and r0 by using standard library functions of MATLAB (The MathWorks, Inc.). The following lemmas make the search relatively easy:

Lemma 1 For a0 fixed, both the power and the Type I Error are decreasing in r0.

Lemma 2 If the required α ≤ 1/2 and γ ≥ 1/2, then Δ0 ≤ r0 ≤ Δ1.
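Lemmas 1 and 2 suggest a simple numerical strategy: for a fixed a0, the Type I Error is monotone in r0, so r0 can be found by bisection on [Δ0, Δ1]. A sketch (the `err` curve below is a hypothetical stand-in for numerically evaluating the double integral in Equation (3)):

```python
import math

def solve_r0(type1_error, alpha, delta0, delta1, tol=1e-8):
    """Bisection for r0 on [delta0, delta1], using Lemma 1 (for fixed a0 the
    Type I Error is decreasing in r0) and Lemma 2 (the root lies in the interval)."""
    lo, hi = delta0, delta1
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if type1_error(mid) > alpha:
            lo = mid   # error still too large: push r0 up
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical decreasing error curve on [2, 4] standing in for Equation (3):
err = lambda r0: 0.5 * math.exp(-2.0 * (r0 - 2.0))
r0 = solve_r0(err, alpha=0.05, delta0=2.0, delta1=4.0)
# err(r0) is approximately 0.05, i.e., r0 is near 2 + ln(10)/2
```

In a full implementation this inner search on r0 would be nested inside an outer search on a0, with Equation (4) supplying the second condition.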

The proofs of the lemmas are given in Appendix A. Lemma 2 gives initial upper and lower bounds for r0, and based on the monotonicity property of Lemma 1, the bounds can be updated during the search process to close in on the solution.

5 Evaluation of CSB and CSB-X in the Presence of Interactions

To illustrate the influence of interactions on the screening results of CSB and CSB-X, three sets of factors (K = 10) were explored with Δ0 = 2 and Δ1 = 4. The response follows model (2):

Y = β0 + Σ_{i=1}^K βi xi + Σ_{i=1}^K Σ_{j=i}^K βij xi xj + ε

where the error ε is normally distributed with mean 0 and standard deviation 1 + the expected response. We also set α = 0.05 and γ =

The main effects βi, i = 1, 2,..., K, are fixed, while the interactions βij are randomly generated from a normal distribution with mean 0 and variance 4. For each case considered, CSB and CSB-X are applied 1000 times using the fully sequential testing procedure described in Section 4. On each trial a new interaction set (βij) is generated. The fraction of trials on which each factor is declared important, Pr{DI}, is recorded, which is an unbiased estimator of Pr{factor i is declared important}.

In Case 1, we set (β1, β2,..., β10) = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0); the observed frequency that each factor is declared important should be close to 0. In Case 2, (β1, β2,..., β10) = (2, 2, 2, 2, 2, 2, 2, 2, 2, 2); the observed frequency that each factor is declared important should be no larger than 0.05. In Case 3, (β1, β2,..., β10) = (2, 2.44, 2.88, 3.32, 3.76, 4.2, 4.64, 5.08, 5.52, 6); the observed frequency that β1 is declared important should be no larger than 0.05, but for β6, β7,..., β10 it should be near

The results of the experiments are given in Table 1. We can see that CSB-X gives the desired screening results with appropriate Type I Error and power control. CSB, on the other hand, loses control of both Type I Error and power

Table 1: Screening Results (Pr{DI}) for CSB and CSB-X with Interactions Present. [Columns: Factor #; then, for each of Cases 1 to 3: Effect, CSB, CSB-X; last row: # of replications. Table entries not recovered.]

when the interactions are present. For example, in Case 3 the observed Pr{DI} of β1 is 0.18 under CSB, significantly larger than the desired Type I Error of α = 0.05. Under CSB-X, the Pr{DI} is 0. For a large factor effect such as β10 = 6, the power of identifying the factor as important under CSB is only 0.49, which means that half of the time the factor is misclassified. On the other hand, factor 10 is always identified as important by CSB-X. Because CSB gives misleading results in the above cases, it is meaningless to compare the number of replications required by the two methods. The numbers of simulation replications required for screening in Cases 1 to 3 are large because CSB-X guarantees small error when the variances of the responses are large. If we relax the error-control requirement, fewer replications are needed, but the results are also compromised. Specifically, if no power control is required, an ordinary one-sided t-test can be used at each step and the number of replications required will be bounded above by 2N0(K + 1) for CSB-X. The extra replications are the price for more accurate results.

5.1 Comparison of CSB-X and Fractional Factorial Designs for Large-Scale Problems

In this section we study screening problems with 200 factors and 500 factors and compare CSB-X to a standard unreplicated fractional factorial design (i.e., an orthogonal array). For each case, only 2% of the factors are important. The important factors have effects equal to 5 and the unimportant factors have effects equal to 0. Normal errors are assumed with mean 0 and standard deviation 1 (equal variances across different levels). The threshold of importance, Δ0, is set to 2; and the critical threshold, Δ1, is set to 4. The initial number of runs at each level, N0, is equal to 5. The Type I Error is set to α = 0.05 and the power requirement is γ =

For each case, there are two scenarios. The first scenario has all important factors clustered together with the smallest indices, so that the number of important groups is as small as possible at each step. The second scenario has the important factors evenly spread, so that the maximum number of important groups remains at each step. CSB-X is more efficient in the first scenario than in the second. For each case and scenario considered, CSB-X with the fully sequential testing procedure is applied 1000 times and the average number of replications required for screening is recorded. The number of replications required for the fractional factorial design is the number of design points required to estimate 200 or 500 main effects with a Resolution III design, which is not influenced by the scenario. The comparison is shown in Table 2, where FFD represents the fractional factorial design. We can see that for both the 200- and 500-factor cases, CSB-X takes only approximately 1/3 to 1/2 of the replications required by the fractional factorial design in the clustered scenario. In the other scenario, the fractional factorial design requires fewer replications. On the other hand, if the conditions of Theorem 3 or Theorem 4 are satisfied, the expected number of factors that will be falsely classified as important (E[F_K]) for the fractional factorial

Table 2: Comparison of CSB-X and Fractional Factorial Design. [Rows: 200 factors with important factors {1,2,3,4}; 200 factors with important factors {1,51,101,151}; 500 factors with important factors {1,2,...,10}; 500 factors with important factors {1,51,101,...,451}. Columns: number of runs required by CSB-X and by FFD. Table entries not recovered.]

design always equals αK, which is greater than or equal to that of CSB, especially under the conditions of Theorem 3. Further, the fractional factorial design has no power control. Therefore, from the error-control point of view, CSB-X is superior. Moreover, the typical test implemented with a fractional factorial design assumes equal variances across design points, which CSB does not require. However, the fractional factorial design does not need βi ≥ 0 (i > 0).

6 Case Study Revisited

Wan, et al. (2006) implemented CSB to study a simplified semiconductor manufacturing system to identify the important machines and transporters that are worthy of investment. In this section we use CSB-X with the extended fully sequential test to study the case again. The results from CSB-X and CSB are compared. The detailed description of the semiconductor manufacturing system will not be repeated here. In summary, there are two main processes, diffusion and lithography, each of which contains sub-steps. The raw material is released at the rate of 1 cassette/hour and processed in single-cassette loads, beginning with the CLEAN step of diffusion and, after diffusion, proceeding to the lithography process. Diffusion and lithography then alternate until the product completes processing (see Figure 3). The numbers of passes required differ among the products. The time for movement between the steps within each process is negligible

21 Table 3: Mean Processing Time per Cassette for Each Step (Hours) and Cost of Machines ($Millions) Stations Fast Machine Cost per Unit Slow Machine Cost per Unit CLEAN LOAD QUARTZ OXIDIZE UNLOAD QUARTZ TEST COAT STEPPER DEVELOP TEST AGV NA NA CONVEYOR NA NA and the movement between diffusion and lithography will be handled by a transporter. The factors we are interested in are the number of fast and slow machines at each sub-step and numbers of each kind of transporter, which are listed in Table 3. In addition to the high and low settings in CSB, CSB-X requires a mirror setting for each factor. To accommodate the requirement, the original low setting for each factor is used as the mirror setting; the original high setting for each factor is used as the low setting; and a new high setting is determined by the cost model in Wan, et al. (2006). The new experimental design is given in Table 4. The simulation programming of the manufacturing system was done in simlib, a collection of ANSI-standard C support functions for simulation (Law and Kelton, 2000). CRN was implemented by assigning each station a separate stream of random numbers. CSB and CSB-X were implemented in C++ with the extended fully sequential testing procedure. For each replication, 365 days of operation were simulated with a 300 hour warm-up period. The performance measure was the long-run average cycle time (hours) weighted by percentage of different products. Here 0 was the minimum acceptable decrease in long-run cycle time, which we set to be 1 hour in this test; and 1 was the decrease in long-run cycle time that 21

22 Raw Material CLEAN COAT LOAD QUARTZ DIFFUSION OXIDIZE UNLOAD QUARTZ TRANSPORTER LITHOGRAPHY STEPPER DEVELOP TRANSPORTER TEST 1 TEST 2 Complete? No Yes Out Figure 3: Production Process of the Semiconductor Manufacturing System 22

23 Table 4: Factor Description and Settings (Unit Number) Factor id Factor Description Mirror Low High 1 Number of slow machines in OXIDIZE Number of fast machines in STEPPER Number of fast machines in COAT Number of slow machines in CLEAN Number of fast machines in TEST Number of fast machines in TEST Number of slow machines in STEPPER Number of slow machines in COAT Number of fast machines in CLEAN Number of slow machines in TEST Number of slow machines in TEST Number of slow machines in LOAD QUARTZ Number of slow machines in UNLOAD QUARTZ Number of fast machines in LOAD QUARTZ Number of fast machines in UNLOAD QUARTZ Number of AGVs Number of slow machines in DEVELOP Number of CONVEYORS Number of fast machines in OXIDIZE Number of fast machines in DEVELOP

24 Table 5: Screening Result of CSB and CSB-X (α = 0.05, γ = 0.90) Methodology Important Factors Number of Replications Required CSB {3,5,6} 225 CSB-X {3,5,6,14,15,16,18,20 } 950 we do not want to miss, which we set to be 2 hours in this test. As shown in Table 5, both CSB and CSB-X concluded that factors 3, 5 and 6 are important. CSB-X also claimed factors 14, 15, 16, 18, and 20 are important. When some factors are identified as important by CSB-X but not by CSB, then negative interactions or quadratic terms may exist so that the important main effects are masked by these negative effects. When factors are identified as important by CSB but not by CSB-X (which did not happen here), the factors identified only by CSB could be explored for significant quadratic effects. The reason is that if a factor has not been identified as important by CSB-X, then with 100(1 α)% confidence its main effect is unimportant. But if it is classified as important by CSB, then the summation of the main and quadratic effects has to be important. So the quadratic term is possibly important. CSB-X consumed 950 replications identifying 8 factors as important while CSB consumed 225 replications identifying 3 factors as important. The more factors that are tested individually, the more replications are required for screening. This was the case for CSB-X. It is possible that factors 3, 5 and 6 have very large main effects so they are easy to recognize for both methods. On the other hand, factors 14, 15, 16, 18 and 20 were identified only by CSB-X, so it may be that these main effects are on the boundary of importance and were easily masked by negative interactions, which caused their misclassification by CSB. 24
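The masking just described is exactly what the fold-over in CSB-X removes: differencing a design point against its mirror (all coded levels flipped) cancels every two-factor interaction and pure quadratic term, leaving only main effects. The toy response below is our own illustration of that cancellation, not the case-study model:

```python
# Toy second-order response surface in coded units x_i in {-1, +1}:
# main effects beta, two-factor interactions gamma2, quadratic terms delta.
def response(x, beta, gamma2, delta):
    k = len(x)
    y = sum(beta[i] * x[i] for i in range(k))
    y += sum(gamma2[i][j] * x[i] * x[j]
             for i in range(k) for j in range(i + 1, k))
    y += sum(delta[i] * x[i] ** 2 for i in range(k))
    return y

beta = [5.0, 0.0, 2.0]                               # main effects
gamma2 = [[0, -3.0, 1.5], [0, 0, 4.0], [0, 0, 0]]    # interactions (upper triangle)
delta = [2.0, -1.0, 0.5]                             # quadratic terms

x = [1, 1, -1]
mirror = [-xi for xi in x]
# Fold-over difference: interaction and quadratic terms are even in x,
# so they cancel; only the odd (main-effect) part survives.
half_diff = (response(x, beta, gamma2, delta)
             - response(mirror, beta, gamma2, delta)) / 2
print(half_diff)  # 3.0, i.e. sum(beta[i]*x[i]) = 5 + 0 - 2
```

Whatever interaction and quadratic coefficients are chosen, the half-difference reproduces the linear part exactly, which is why the CSB-X test statistic is unbiased for the group main effect.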

7 Conclusion

CSB-X is a greatly improved CSB procedure that incorporates a fold-over design into the hypothesis test to identify important main effects even when two-factor interactions and quadratic terms are present. Although CSB-X doubles the number of design points compared to CSB, the resulting decrease in the variance of the test statistic typically compensates for this when a main-effects model applies. When interactions exist, CSB is not appropriate, so sample-size comparisons are irrelevant. Thus, CSB-X provides all the benefits of CSB with more generality and without a loss of efficiency. The fully sequential testing procedure proposed in Wan, et al. (2006) is extended to general cases with all combinations of Type I Error and power. It is significantly more efficient than the two-stage testing procedure in most cases, and can be used as a standalone test with controlled size and power in applications other than factor screening.

References

Bettonvil, B. 1995. Factor screening by sequential bifurcation. Communications in Statistics: Simulation and Computation 24(1).

Bettonvil, B., and J. P. C. Kleijnen. 1997. Searching for important factors in simulation models with many factors: Sequential bifurcation. European Journal of Operational Research 96(1).

Cheng, R. C. H. 1997. Searching for important factors: Sequential bifurcation under uncertainty. In Proceedings of the 1997 Winter Simulation Conference, ed. S. Andradóttir, K. J. Healy, D. H. Withers, and B. L. Nelson. Piscataway, New Jersey: Institute of Electrical and Electronics Engineers. Available online via org/wsc97papers/0275.pdf [accessed July 6, 2003].

Hartmann, M. 1991. An improvement of Paulson's procedure for selecting the population with the largest mean from k normal populations with a common unknown variance. Sequential Analysis 10(1-2).

Kim, S.-H. 2005. Comparison with a standard via fully sequential procedures. ACM Transactions on Modeling and Computer Simulation 15(2).

Kleijnen, J. P. C., B. Bettonvil, and F. Persson. Finding the important factors in large discrete-event simulation: Sequential bifurcation and its applications. In Screening, ed. A. M. Dean and S. M. Lewis. New York: Springer-Verlag, in press. Available online via <bin&id=&mime=application/pdf&file=/staff/kleijnen/sb-springer-06.pdf> [accessed July 6, 2003].

Law, A. M., and W. D. Kelton. 2000. Simulation Modeling and Analysis. 3rd ed. New York: McGraw-Hill.

Wan, H., B. E. Ankenman, and B. L. Nelson. 2006. Controlled sequential bifurcation: A new factor-screening method for discrete-event simulation. Operations Research, forthcoming.

Wu, C. F. J., and M. Hamada. 2000. Experiments: Planning, Analysis, and Parameter Design Optimization. New York: John Wiley & Sons.

A Appendix: Proofs

A.1 Proof of Lemma 1

For a group {k_1, ..., k_2} with group effect ζ = Σ_{i=k_1}^{k_2} β_i, an arbitrary sample path is

    SP_{r_0}(r) = Σ_{l=1}^{r} (D_l(k_1, k_2) − r_0),   r = N_0, N_0 + 1, ....

If r_0' < r_0, then for every sample path, SP_{r_0'}(r) > SP_{r_0}(r) for all r. Thus, every sample path that is classified as unimportant (exits down) under r_0' will also be classified as unimportant under r_0. However, a path that is classified as important (exits up) under r_0' may or may not be classified as important under r_0. Therefore, the probability of declaring a group important decreases in r_0, while the probability of declaring it unimportant increases in r_0.

A.2 Proof of Lemma 2

Suppose the group effect is ζ and the stopping time of the test is T. Then

    Pr{Declare the group important | ζ ≤ Δ_0, r_0 = Δ_0}
      = Pr{ Σ_{l=1}^{T} (D_l(k_1, k_2) − Δ_0) ≥ a(k_1, k_2) − Tλ | ζ ≤ Δ_0 }
      ≤ Pr{ Σ_{l=1}^{T} (D_l(k_1, k_2) − Δ_0) ≥ a(k_1, k_2) − Tλ | ζ = Δ_0 } = 1/2.

The last equality holds because E[ Σ_{l=1}^{T} (D_l(k_1, k_2) − Δ_0) | ζ = Δ_0 ] = 0 and the region [−a(k_1, k_2) + rλ, a(k_1, k_2) − rλ] is symmetric. Similarly,

    Pr{Declare the group important | ζ ≥ Δ_1, r_0 = Δ_1}
      = Pr{ Σ_{l=1}^{T} (D_l(k_1, k_2) − Δ_1) ≥ a(k_1, k_2) − Tλ | ζ ≥ Δ_1 }
      ≥ Pr{ Σ_{l=1}^{T} (D_l(k_1, k_2) − Δ_1) ≥ a(k_1, k_2) − Tλ | ζ = Δ_1 } = 1/2.

Combined with Lemma 1, we can conclude that Δ_0 ≤ r_0 ≤ Δ_1.

A.3 Proof of Theorem 5

To prove the theorem, we need the following lemma:

Lemma 3 (Hartmann 1991) Let X_1, X_2, ... be independent and identically distributed Nor(Δ, 1) random variables with Δ > 0. Let S(n) = Σ_{j=1}^{n} X_j, L(n) = −a + λn, and U(n) = a − λn for some a > 0 and λ > 0. Let R(n) denote the interval [L(n), U(n)] (with R(n) = ∅ when L(n) > U(n)), and let T = min{n : S(n) ∉ R(n)} be the first time the partial sum S(n) does not fall in the triangular region defined by R(n). Let E be the event that S(T) ≤ L(T) and R(T) ≠ ∅, or S(T) ≤ 0 and R(T) = ∅. Then

    Pr{E} ≤ ∫ [ e^{−2λξ} / (1 + e^{−2λξ}) ] φ( (ξ − Δ) / √(a/λ) ) dξ / √(a/λ).

The proof of the theorem is a direct application of Lemma 3. Let σ² = Var(D_l(k_1, k_2)). When the group effect ζ ≤ Δ_0,

    Pr{ Σ_{l=1}^{T} (D_l(k_1, k_2) − r_0) ≤ min{0, −a(k_1, k_2) + Tλ} }
      = 1 − Pr{ Σ_{l=1}^{T} (r_0 − D_l(k_1, k_2)) ≤ min{0, −a(k_1, k_2) + Tλ} }
      = 1 − Pr{ Σ_{l=1}^{T} (r_0 − D_l(k_1, k_2))/σ ≤ min{0, −a(k_1, k_2)/σ + Tλ/σ} }
      = 1 − E[ Pr{ Σ_{l=1}^{T} (r_0 − D_l(k_1, k_2))/σ ≤ min{0, −a(k_1, k_2)/σ + Tλ/σ} | S²(k_1, k_2) } ]
      ≥ 1 − E[ ∫ [ e^{−2(λ/σ)ξ} / (1 + e^{−2(λ/σ)ξ}) ] φ( (ξ − (r_0 − ζ)/σ) / √(a/λ) ) dξ / √(a/λ) ]      (by Lemma 3)
      ≥ 1 − E[ ∫ [ e^{−2(λ/σ)ξ} / (1 + e^{−2(λ/σ)ξ}) ] φ( (ξ − (r_0 − Δ_0)/σ) / √(a/λ) ) dξ / √(a/λ) ]    (since ζ ≤ Δ_0)
      = 1 − E[ ∫ [ e^{−2λy} / (1 + e^{−2λy}) ] φ( (yσ − (r_0 − Δ_0)/σ) / √(a/λ) ) σ dy / √(a/λ) ]          (substituting ξ = yσ)
      = 1 − E[ ∫ [ e^{−2λy} / (1 + e^{−2λy}) ] φ( (yσ − (r_0 − Δ_0)/σ) / (S(k_1, k_2)√(a_0/λ)) ) σ dy / (S(k_1, k_2)√(a_0/λ)) ]
      = 1 − ∫_0^∞ ∫ θ(x) [ ν(y) / (1 + ν(y)) ] φ( yθ(x) − (r_0 − Δ_0)/θ(x) ) f(x, N_0 − 1) dy dx    (5)
      = 1 − α.

Since θ(x) → ∞ when x → 0, Equation (5) relies on the fact that

    lim_{x→0} θ(x) [ ν(y) / (1 + ν(y)) ] φ( yθ(x) − (r_0 − Δ_0)/θ(x) ) f(x, N_0 − 1) = 0.

This is because

    lim_{x→0} θ(x) [ ν(y) / (1 + ν(y)) ] φ( yθ(x) − (r_0 − Δ_0)/θ(x) ) f(x, N_0 − 1) = lim_{x→0} c_1 x^{(N_0 − 4)/2} e^{−c_2/x},

where c_1 and c_2 are constants. When N_0 ≥ 4, the limit clearly equals 0. When N_0 = 2 or 3, using L'Hopital's rule, it is easy to show that the limit still equals 0. The same result is also used in the following proof.

Assume the group effect ζ ≥ Δ_1. Then

    Pr{ Σ_{l=1}^{T} (D_l(k_1, k_2) − r_0) ≤ min{0, −a(k_1, k_2) + Tλ} }
      = Pr{ Σ_{l=1}^{T} (D_l(k_1, k_2) − r_0)/σ ≤ min{0, −a(k_1, k_2)/σ + Tλ/σ} }
      = E[ Pr{ Σ_{l=1}^{T} (D_l(k_1, k_2) − r_0)/σ ≤ min{0, −a(k_1, k_2)/σ + Tλ/σ} | S²(k_1, k_2) } ]
      ≤ E[ ∫ [ e^{−2(λ/σ)ξ} / (1 + e^{−2(λ/σ)ξ}) ] φ( (ξ − (ζ − r_0)/σ) / √(a/λ) ) dξ / √(a/λ) ]      (by Lemma 3)
      ≤ E[ ∫ [ e^{−2(λ/σ)ξ} / (1 + e^{−2(λ/σ)ξ}) ] φ( (ξ − (Δ_1 − r_0)/σ) / √(a/λ) ) dξ / √(a/λ) ]    (since ζ ≥ Δ_1)
      = E[ ∫ [ e^{−2λy} / (1 + e^{−2λy}) ] φ( (yσ − (Δ_1 − r_0)/σ) / (S(k_1, k_2)√(a_0/λ)) ) σ dy / (S(k_1, k_2)√(a_0/λ)) ]
      = ∫_0^∞ ∫ θ(x) [ ν(y) / (1 + ν(y)) ] φ( yθ(x) − (Δ_1 − r_0)/θ(x) ) f(x, N_0 − 1) dy dx
      = 1 − γ.
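The boundary case of Lemma 2 is easy to probe numerically: when the true group effect equals the drift (ζ = r_0 = Δ_0), the recentred partial sum is a driftless random walk inside the symmetric triangular region, so it exits up about half the time. The sketch below is our own illustration; the region parameters a, λ and σ are arbitrary choices, not values from the paper:

```python
import random

def exits_up(a=10.0, lam=0.5, sigma=1.0, rng=None):
    """Run one sample path SP(r) = sum_l (D_l - r0) with D_l ~ Nor(r0, sigma^2)
    inside the triangular continuation region [-a + r*lam, a - r*lam];
    return True if the path exits through the upper boundary."""
    sp, r = 0.0, 0
    while True:
        r += 1
        sp += rng.gauss(0.0, sigma)  # D_l - r0 has mean 0 when zeta = r0
        lower, upper = -a + r * lam, a - r * lam
        if sp >= upper:              # exit up: declare important
            return True
        if sp <= lower:              # exit down: declare unimportant
            return False             # region closes by r = a/lam, so this ends

rng = random.Random(1)
n = 4000
up = sum(exits_up(rng=rng) for _ in range(n))
print(up / n)  # close to 1/2, matching the symmetry argument in Lemma 2
```

Every path terminates by step a/λ = 20, where the region closes, and the estimated exit-up probability sits near 1/2, consistent with the bound Pr{declare important | ζ = Δ_0, r_0 = Δ_0} ≤ 1/2.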


More information

Linear Classification. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington

Linear Classification. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington Linear Classification CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 Example of Linear Classification Red points: patterns belonging

More information

Sliced Minimum Aberration Designs for Four-platform Experiments

Sliced Minimum Aberration Designs for Four-platform Experiments Sliced Minimum Aberration Designs for Four-platform Experiments Soheil Sadeghi Department of Statistics at University of Wisconsin-Madison, sadeghi2@wisc.edu Peter Z. G. Qian Department of Statistics at

More information

2 Introduction to Response Surface Methodology

2 Introduction to Response Surface Methodology 2 Introduction to Response Surface Methodology 2.1 Goals of Response Surface Methods The experimenter is often interested in 1. Finding a suitable approximating function for the purpose of predicting a

More information

Design of Experiments SUTD - 21/4/2015 1

Design of Experiments SUTD - 21/4/2015 1 Design of Experiments SUTD - 21/4/2015 1 Outline 1. Introduction 2. 2 k Factorial Design Exercise 3. Choice of Sample Size Exercise 4. 2 k p Fractional Factorial Design Exercise 5. Follow-up experimentation

More information

Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha

Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha January 18, 2010 A2 This appendix has six parts: 1. Proof that ab = c d

More information

Lecture 7 Introduction to Statistical Decision Theory

Lecture 7 Introduction to Statistical Decision Theory Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7

More information

Definitive Screening Designs with Added Two-Level Categorical Factors *

Definitive Screening Designs with Added Two-Level Categorical Factors * Definitive Screening Designs with Added Two-Level Categorical Factors * BRADLEY JONES SAS Institute, Cary, NC 27513 CHRISTOPHER J NACHTSHEIM Carlson School of Management, University of Minnesota, Minneapolis,

More information

Week 1 Quantitative Analysis of Financial Markets Distributions A

Week 1 Quantitative Analysis of Financial Markets Distributions A Week 1 Quantitative Analysis of Financial Markets Distributions A Christopher Ting http://www.mysmu.edu/faculty/christophert/ Christopher Ting : christopherting@smu.edu.sg : 6828 0364 : LKCSB 5036 October

More information

Machine Learning. Lecture 9: Learning Theory. Feng Li.

Machine Learning. Lecture 9: Learning Theory. Feng Li. Machine Learning Lecture 9: Learning Theory Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 2018 Why Learning Theory How can we tell

More information

Evaluation requires to define performance measures to be optimized

Evaluation requires to define performance measures to be optimized Evaluation Basic concepts Evaluation requires to define performance measures to be optimized Performance of learning algorithms cannot be evaluated on entire domain (generalization error) approximation

More information

2WB05 Simulation Lecture 7: Output analysis

2WB05 Simulation Lecture 7: Output analysis 2WB05 Simulation Lecture 7: Output analysis Marko Boon http://www.win.tue.nl/courses/2wb05 December 17, 2012 Outline 2/33 Output analysis of a simulation Confidence intervals Warm-up interval Common random

More information

E-Companion to Fully Sequential Procedures for Large-Scale Ranking-and-Selection Problems in Parallel Computing Environments

E-Companion to Fully Sequential Procedures for Large-Scale Ranking-and-Selection Problems in Parallel Computing Environments E-Companion to Fully Sequential Procedures for Large-Scale Ranking-and-Selection Problems in Parallel Computing Environments Jun Luo Antai College of Economics and Management Shanghai Jiao Tong University

More information

Modeling and Performance Analysis with Discrete-Event Simulation

Modeling and Performance Analysis with Discrete-Event Simulation Simulation Modeling and Performance Analysis with Discrete-Event Simulation Chapter 9 Input Modeling Contents Data Collection Identifying the Distribution with Data Parameter Estimation Goodness-of-Fit

More information

EE/PEP 345. Modeling and Simulation. Spring Class 11

EE/PEP 345. Modeling and Simulation. Spring Class 11 EE/PEP 345 Modeling and Simulation Class 11 11-1 Output Analysis for a Single Model Performance measures System being simulated Output Output analysis Stochastic character Types of simulations Output analysis

More information

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr.

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr. Simulation Discrete-Event System Simulation Chapter 8 Input Modeling Purpose & Overview Input models provide the driving force for a simulation model. The quality of the output is no better than the quality

More information

Heuristics for The Whitehead Minimization Problem

Heuristics for The Whitehead Minimization Problem Heuristics for The Whitehead Minimization Problem R.M. Haralick, A.D. Miasnikov and A.G. Myasnikov November 11, 2004 Abstract In this paper we discuss several heuristic strategies which allow one to solve

More information

Non-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines

Non-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines Non-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2018 CS 551, Fall

More information

A Practitioner s Guide to Generalized Linear Models

A Practitioner s Guide to Generalized Linear Models A Practitioners Guide to Generalized Linear Models Background The classical linear models and most of the minimum bias procedures are special cases of generalized linear models (GLMs). GLMs are more technically

More information

Statistical Methods in Particle Physics

Statistical Methods in Particle Physics Statistical Methods in Particle Physics Lecture 11 January 7, 2013 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline How to communicate the statistical uncertainty

More information

Application: Bucket Sort

Application: Bucket Sort 5.2.2. Application: Bucket Sort Bucket sort breaks the log) lower bound for standard comparison-based sorting, under certain assumptions on the input We want to sort a set of =2 integers chosen I+U@R from

More information

ORF 245 Fundamentals of Statistics Chapter 9 Hypothesis Testing

ORF 245 Fundamentals of Statistics Chapter 9 Hypothesis Testing ORF 245 Fundamentals of Statistics Chapter 9 Hypothesis Testing Robert Vanderbei Fall 2014 Slides last edited on November 24, 2014 http://www.princeton.edu/ rvdb Coin Tossing Example Consider two coins.

More information

Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities

Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities Conservative variance estimation for sampling designs with zero pairwise inclusion probabilities Peter M. Aronow and Cyrus Samii Forthcoming at Survey Methodology Abstract We consider conservative variance

More information

Ad Placement Strategies

Ad Placement Strategies Case Study 1: Estimating Click Probabilities Tackling an Unknown Number of Features with Sketching Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox 2014 Emily Fox January

More information

Output Data Analysis for a Single System

Output Data Analysis for a Single System Output Data Analysis for a Single System Chapter 9 Based on the slides provided with the textbook 2 9.1 Introduction Output data analysis is often not conducted appropriately Treating output of a single

More information

Chap 4. Software Reliability

Chap 4. Software Reliability Chap 4. Software Reliability 4.2 Reliability Growth 1. Introduction 2. Reliability Growth Models 3. The Basic Execution Model 4. Calendar Time Computation 5. Reliability Demonstration Testing 1. Introduction

More information

Chapter 10. Hypothesis Testing (I)

Chapter 10. Hypothesis Testing (I) Chapter 10. Hypothesis Testing (I) Hypothesis Testing, together with statistical estimation, are the two most frequently used statistical inference methods. It addresses a different type of practical problems

More information

5 ProbabilisticAnalysisandRandomized Algorithms

5 ProbabilisticAnalysisandRandomized Algorithms 5 ProbabilisticAnalysisandRandomized Algorithms This chapter introduces probabilistic analysis and randomized algorithms. If you are unfamiliar with the basics of probability theory, you should read Appendix

More information

Construction and analysis of Es 2 efficient supersaturated designs

Construction and analysis of Es 2 efficient supersaturated designs Construction and analysis of Es 2 efficient supersaturated designs Yufeng Liu a Shiling Ruan b Angela M. Dean b, a Department of Statistics and Operations Research, Carolina Center for Genome Sciences,

More information

ECNS 561 Multiple Regression Analysis

ECNS 561 Multiple Regression Analysis ECNS 561 Multiple Regression Analysis Model with Two Independent Variables Consider the following model Crime i = β 0 + β 1 Educ i + β 2 [what else would we like to control for?] + ε i Here, we are taking

More information

Linear Regression (continued)

Linear Regression (continued) Linear Regression (continued) Professor Ameet Talwalkar Professor Ameet Talwalkar CS260 Machine Learning Algorithms February 6, 2017 1 / 39 Outline 1 Administration 2 Review of last lecture 3 Linear regression

More information

arxiv: v1 [cs.ds] 3 Feb 2018

arxiv: v1 [cs.ds] 3 Feb 2018 A Model for Learned Bloom Filters and Related Structures Michael Mitzenmacher 1 arxiv:1802.00884v1 [cs.ds] 3 Feb 2018 Abstract Recent work has suggested enhancing Bloom filters by using a pre-filter, based

More information

Qualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf

Qualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part : Sample Problems for the Elementary Section of Qualifying Exam in Probability and Statistics https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part 2: Sample Problems for the Advanced Section

More information

Lecture 14 : Online Learning, Stochastic Gradient Descent, Perceptron

Lecture 14 : Online Learning, Stochastic Gradient Descent, Perceptron CS446: Machine Learning, Fall 2017 Lecture 14 : Online Learning, Stochastic Gradient Descent, Perceptron Lecturer: Sanmi Koyejo Scribe: Ke Wang, Oct. 24th, 2017 Agenda Recap: SVM and Hinge loss, Representer

More information

The perceptron learning algorithm is one of the first procedures proposed for learning in neural network models and is mostly credited to Rosenblatt.

The perceptron learning algorithm is one of the first procedures proposed for learning in neural network models and is mostly credited to Rosenblatt. 1 The perceptron learning algorithm is one of the first procedures proposed for learning in neural network models and is mostly credited to Rosenblatt. The algorithm applies only to single layer models

More information

Econometrics Summary Algebraic and Statistical Preliminaries

Econometrics Summary Algebraic and Statistical Preliminaries Econometrics Summary Algebraic and Statistical Preliminaries Elasticity: The point elasticity of Y with respect to L is given by α = ( Y/ L)/(Y/L). The arc elasticity is given by ( Y/ L)/(Y/L), when L

More information

Solution Set for Homework #1

Solution Set for Homework #1 CS 683 Spring 07 Learning, Games, and Electronic Markets Solution Set for Homework #1 1. Suppose x and y are real numbers and x > y. Prove that e x > ex e y x y > e y. Solution: Let f(s = e s. By the mean

More information