Two-Stage Controlled Fractional Factorial Screening for Simulation Experiments

Hong Wan
Assistant Professor
School of Industrial Engineering
Purdue University
315 N. Grant Street
West Lafayette, IN
(765) Fax: (765)

Bruce E. Ankenman
Associate Professor
Department of Industrial Engineering and Management Sciences
Northwestern University
2145 Sheridan Rd.
Evanston, IL 60208
(847) Fax: (847)

June 23, 2006

ABSTRACT

Factor screening with statistical control makes sense in the context of simulation experiments that have random error, but can be run automatically on a computer and thus can accommodate a large number of replications. The discrete-event simulations common in the operations research field are well suited to controlled screening. In this paper, two methods of factor screening with control of Type I error and power are compared. Both screening methods are robust with respect to two-factor interactions and non-constant variance. The first method is the sequential method introduced by Wan, Ankenman, and Nelson (2005a) called CSB-X. The second method uses a fractional factorial design in combination with a two-stage procedure for controlling power that has been adapted from Bishop and Dudewicz (1978). The two-stage controlled fractional factorial (TCFF) method requires less prior information and is more efficient when the percentage of important factors is 5% or higher.

Key Words: Computer experiments, Discrete-event simulation, Design of Experiments, Factor Screening, Sequential Bifurcation

Bruce E. Ankenman is an Associate Professor in the Department of Industrial Engineering and Management Sciences at the McCormick School of Engineering at Northwestern University. His current research interests include response surface methodology, design of experiments, robust design, experiments involving variance components and dispersion effects, and design for simulation experiments. He is a past chair of the Quality Statistics and Reliability Section of INFORMS, is an Associate Editor for Naval Research Logistics, and is a Department Editor for IIE Transactions: Quality and Reliability Engineering. He is a member of ASA and ASQ. His email address is ankenman@northwestern.edu.

Hong Wan is an Assistant Professor in the School of Industrial Engineering at Purdue University. Her research interests include design and analysis of simulation experiments; simulation optimization; simulation of manufacturing, healthcare and financial systems; quality control; and applied statistics. She has taught a variety of courses and is a member of INFORMS and ASA. Her email address is hwan@purdue.edu.

Introduction

The term computer experiment is common in the statistical literature (see Santner, Williams, and Notz, 2003) and refers to an experiment on a computer simulation of a physical system such as a material or a process. Typically, computer experiments are thought to be deterministic (having no random error other than rounding error). This is because many engineering simulations use finite element or other differential equation-based simulation methods that have no random component. However, in the operations research community, the term simulation is most commonly used to refer to discrete-event simulations of manufacturing or service systems (see Banks, Carson, Nelson and Nicol, 2005). These simulations are typically complex queueing systems that simulate products or customers flowing through a system. The simulation is driven by random number generators that draw random arrival times for the products or customers and random service and failure times for the machines or stations. The input distributions for these random number generators are based on the distributions observed for each type of machine or station in the system. Typical software packages for this type of discrete-event simulation include Witness, Arena, Simul8, and Factory Explorer. Many other software packages have been developed for more specific scenarios, for example, MedModel for healthcare simulation.

Although they do have random noise like physical experiments, discrete-event simulation experiments share some attributes with computer experiments:

1) In many cases, the whole experiment can be programmed to run automatically.

2) Compared to physical experiments, it is usually cheaper and easier to switch factor settings in a simulation experiment. Therefore, the design of the simulation experiment is often sequential, using a design criterion to choose the next run or to stop the experiment.

3) In many cases, replications are not as expensive as they are in physical experiments. This fact allows the experiment size to be determined by a desired precision (or correctness) instead of a preset experimental budget.

In this article, we compare two approaches for factor screening in the context of discrete-event simulation experiments. Factor screening refers to an experiment that is designed to sort a large set of factors into those that have an important effect on the system in question and those that do not substantially affect the system response (see Dean and Lewis, 2005). Factor screening is used for many different purposes. It is often used as a first stage of an optimization or investigation procedure, intended to reduce the number of factors that must be included in more detailed investigation. However, there are many other reasons for performing factor screening, including (1) gaining a better understanding of a system by finding the factors that affect the output and, equally important, finding those factors which have a negligible effect on the output over the range of interest, (2) preparing for future adjustment or control of a system by finding a set of factors that can affect the output, and (3) simplifying the simulation model by removing portions of the model that have little effect on the output of interest.

Typically in factor screening, a hierarchical argument is used, so factors with important main effects are identified first and then additional experiments are done to determine whether those important factors interact with each other or have higher order effects. Because replication is so costly in physical experiments, the emphasis in factor screening for physical experiments has been to allow for the estimation of as many main effects as possible in as few runs as possible. This has led to approaches such as supersaturated designs (see Lu and Xu, 2004) and group screening (see Lewis and Dean, 2001).

In view of the three characteristics of discrete-event simulation experiments listed above, the screening methods that we compare in this article are driven not by a specific number of experimental runs, but instead by a desired level of Type I
error (the probability of declaring a factor important when it is not) and power (the probability of correctly declaring a factor important). Due to the statistical control that we impose on this screening experiment, we will call this controlled screening. Of course, given that the desired level of statistical control (power and level of Type I error) is achieved, it is desirable to reduce the number of replications and thereby reduce the total amount of computer time needed for the screening experiment. Thus, the number of replications needed to achieve the desired level of control will be the criterion for comparing the two screening methods.

In discrete-event simulations, the responses of interest are often system performance measures like average cycle time for a product to pass through the system. For complicated systems, there are often hundreds of factors. For example, there may be many stations or machines in the manufacturing system and each station or machine may have several factors associated with it. In many cases, these factors have a known direction of effect on the response. For example, increasing the speed of a transporter will likely decrease the average cycle time if it has any effect at all.

Using the assumption that the directions of the effects of the factors are known and the hierarchical assumption that linear effects are a good indicator of factor importance, Wan, Ankenman, and Nelson (2005a) proposed Controlled Sequential Bifurcation (CSB), a controlled factor screening method for simulation experiments that uses a form of binary search called sequential bifurcation. Sequential bifurcation was originally proposed for deterministic screening by Bettonvil and Kleijnen (1997). CSB was extended to provide error control for screening main effects even in the presence of two-factor interactions and second order effects of the factors (Wan, Ankenman, and Nelson, 2005b). The new method is called CSB-X. An interesting finding in Wan, et al. (2005b) is that if interactions are not present, CSB-X is usually as efficient as CSB in identifying main effects (this depends on the variance structure at different levels of the factors). When two-factor interactions are present, CSB is unable to correctly
control the identification of the main effects due to confounding with the interactions. Thus, CSB-X is recommended for controlled screening in almost all cases.

The primary purpose of this paper is to compare CSB-X with another method of screening, namely a fractional factorial design that is used as part of a two-stage method for controlling Type I error and power. In the next section, we briefly describe the objectives of controlled screening. This structure was first introduced in Wan, et al. (2005a). Section 3 briefly describes CSB-X and then introduces a version of fractional factorial screening that controls Type I error and power using a two-stage sampling procedure similar to the one used in Bishop and Dudewicz (1978) for a two-way ANOVA layout. Section 4 compares the two methods under a variety of scenarios and shows the circumstances under which each method is preferred. Conclusions are then drawn.

Objective of the Controlled Screening

The purpose of this paper is to compare the performance of two methods of controlled screening. In order to compare the methods fairly, some structure on the screening problem is now established. We begin with the model. Suppose there are $K$ factors and a simulation output response represented by $Y$; then we assume the following full second order model:

$$Y = \beta_0 + \sum_{k=1}^{K} \beta_k x_k + \sum_{k=1}^{K} \sum_{m=k}^{K} \beta_{km} x_k x_m + \varepsilon, \tag{1}$$

where $x_k$ represents the level of factor $k$ and $\varepsilon$ is a normally distributed error term whose variance is unknown and may depend on the levels of the factors, i.e., $\varepsilon \sim \mathrm{Nor}(0, \sigma^2(\mathbf{x}))$, where $\mathbf{x} = (x_1, x_2, \ldots, x_K)$. In this paper, we assume that the factor levels are coded on a scale from $-1$ to $1$. Later, we will focus on Resolution IV, two-level designs, which are often used in main effects screening with non-negligible interactions. In these designs, the linear main effects estimators
are not confounded by any other terms; however, many of the second order interactions may be confounded together and the pure quadratic terms will be confounded with the estimate of $\beta_0$. Other designs could in principle be adapted to the proposed method, but we leave this for future research.

To ease the discussion, we also assume that larger values of the response, $Y$, are more desirable. The case of a smaller-is-better response is easily handled by negating the response.

The goal of the controlled screening is to provide an experimental design and analysis strategy to identify which factors have large linear main effects, the $\beta_k$'s in (1). Two thresholds are established:

1. $\Delta_0$ is the threshold of importance used to establish the level of Type I error. Specifically, if $\beta_k < \Delta_0$, then the probability of declaring factor $k$ important should be less than $\alpha$, where $\alpha$ is selected by the experimenter.

2. $\Delta_1$ is the critical threshold used to control the power. Specifically, if $\beta_k \ge \Delta_1$, then the probability of declaring factor $k$ important should be greater than $\gamma$, where $\gamma$ is selected by the experimenter. Typically $\gamma \ge 0.8$.

The scaling of the factors can be done in any way that the experimenter finds useful; however, since all factor effects are to be compared to the same thresholds, they should be comparable in some sense. One method would be to have the response be dollars of profit produced, so that all costs for changing factor levels would be incorporated into the profit calculation. Another scaling method, introduced in Wan, et al. (2005a), assures that all the factor effects are on the same cost scale.

For purposes of discussion, we will follow the cost-based scaling method of Wan, et al. (2005a). The method assumes that the lower the factor setting, the
lower the cost. The scaling is such that the cost to change each factor from the level (0) to the highest cost level (1) is a fixed cost, say $c^*$. Wan, et al. (2005a) show how the levels of discrete factors (e.g., the number of machines or number of operators) can be incorporated into this cost-based scaling system by introducing a weighting for each factor. The benefit of this cost-based scaling is that each linear effect, $\beta_k$, can be compared directly to the thresholds and to the other linear effects, since $\beta_k$ represents the expected change in the response when spending $c^*$ to change factor $k$. The two thresholds now also become easily expressed concepts:

The threshold of importance, $\Delta_0$, is the minimum amount by which the response must rise in order to justify an expenditure of $c^*$ dollars.

The critical threshold, $\Delta_1$, is an increase in the response that should not be missed if it can be achieved for an expenditure of $c^*$ dollars.

Since the response is to be maximized, it is natural to assume that increasing the cost of any factor will either have no effect or increase the response. Thus in Wan, et al. (2005a), it is assumed that the directions of all main effects are known and the levels of the factors are set to make all the main effects positive, i.e., $\beta_k \ge 0$, $0 < k \le K$.

Methodology

In this section, we first review CSB-X, the method proposed in Wan, et al. (2005b) that provides controlled factor screening in the presence of two-factor interactions. We then present a method of controlled screening that uses a two-stage hypothesis testing procedure in a standard fractional factorial design. The second method will be called TCFF for Two-stage Controlled Fractional Factorial.
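
Before turning to the two procedures, the screening setup of this section can be made concrete with a minimal Python sketch of one observation from the second order model in (1) and of the role of the two thresholds. The function names, the dictionary layout for the second order coefficients, and the labels returned by `classify` are illustrative assumptions, not notation from Wan, et al. (2005a).

```python
import random

def simulate_response(x, beta0, beta, beta2, sigma, rng):
    """Draw one observation from the second order model in (1).

    x     : coded factor levels (each in [-1, 1])
    beta  : linear main effects beta_k
    beta2 : dict {(k, m): beta_km} with k <= m (interactions and pure quadratics)
    sigma : function returning the error standard deviation at x
            (the variance is allowed to depend on the factor levels)
    """
    y = beta0 + sum(b * xk for b, xk in zip(beta, x))
    y += sum(c * x[k] * x[m] for (k, m), c in beta2.items())
    return y + rng.gauss(0.0, sigma(x))

def classify(beta_k, delta0, delta1):
    """Label a true main effect relative to the two thresholds (labels are hypothetical)."""
    if beta_k >= delta1:
        return "critical"      # should be declared important with probability > gamma
    if beta_k < delta0:
        return "unimportant"   # should be declared important with probability < alpha
    return "between thresholds"

# Illustrative call: two factors, one interaction, one pure quadratic, no noise.
rng = random.Random(0)
y = simulate_response([1, -1], 1.0, [2.0, 3.0],
                      {(0, 1): 0.5, (0, 0): 1.0}, lambda x: 0.0, rng)
```

With the noise switched off, the call above evaluates only the deterministic part of (1), which makes it easy to check the coefficients by hand.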

CSB-X, Controlled Sequential Bifurcation with Interactions

CSB-X is a controlled version of sequential bifurcation. Sequential bifurcation is a series of steps, where within each step a group of factors is tested. Within the group, all factors are completely confounded together (as in group screening, see Lewis and Dean, 2001). Therefore, it is critical to CSB-X that the direction of all the factor effects be positive (i.e., $\beta_k \ge 0$, $0 < k \le K$) to prevent cancellation of effects. If a group of factors is tested and the group effect (the sum of all the main effects of the factors in the group) is not found to be significantly greater than $\Delta_0$ (the threshold of importance), then all the factors in the group are declared unimportant and are dropped from further screening. If the group effect is found to be greater than $\Delta_0$ and there is only one factor in the group, then that factor is declared to be important. If the group effect is found to be greater than $\Delta_0$ and there is more than one factor in the group, the group is split into two subgroups that are to be tested later. This procedure is followed until all factors are classified as either important or unimportant.

Kleijnen and Bettonvil (1997) suggest various practical methods for splitting and testing the groups. Specifically, splitting groups into subgroups whose size is a power of 2 can improve the efficiency of the procedure. Also, if one selects for testing groups that have large group effects before groups that have small group effects, the procedure can sequentially reduce the upper bound on the greatest effect that has not yet been identified. This upper bound is very useful if the procedure is stopped early. Since we are comparing the statistical control of the procedures, we plan to complete the screening procedure in all cases, and thus the sequential upper bound is not of interest. Also, we found in empirical testing that splitting groups into subgroups whose size is a power of 2 did not always improve efficiency and never led to increases in efficiency that substantially changed our conclusions. Thus, for better comparison
to Wan, et al. (2005b), we will use the procedure that simply splits each important group into two equally sized subgroups. In the case of an odd group size, the subgroup sizes differ by one.

We will also follow the testing procedure of Wan, et al. (2005b). To define the design points of their experiment, the design point at level $k$ is defined as

$$x_i(k) = \begin{cases} 1, & i = 1, 2, \ldots, k \\ 0, & i = k+1, k+2, \ldots, K. \end{cases}$$

Thus, level $k$ has all factors with indices greater than $k$ set to 0 on the $-1$ to $1$ scale and all factors with indices less than or equal to $k$ set to 1. Similarly, a design point at level $-k$ is defined as

$$x_i(-k) = \begin{cases} -1, & i = 1, 2, \ldots, k \\ 0, & i = k+1, k+2, \ldots, K. \end{cases}$$

To test a group of factors, say factors with indices in the set $\{k_1+1, k_1+2, \ldots, k_2\}$, four design points are used: level $k_1$, level $-k_1$, level $k_2$, and level $-k_2$. To follow the notation in Wan, et al. (2005b), let $Z_\ell(k)$ represent the observed simulation output for the $\ell$th replication at level $k$. The procedure ensures that all four design points have the same number of replications, and the test statistic, $D_\ell(k_1, k_2)$, is calculated as

$$D_\ell(k_1, k_2) = \left\{ \left[ Z_\ell(k_2) - Z_\ell(-k_2) \right] - \left[ Z_\ell(k_1) - Z_\ell(-k_1) \right] \right\} / 2.$$

They show that given (1), the expected value of $D_\ell(k_1, k_2)$ is the sum of the main effects of the factors in the group $\{k_1+1, k_1+2, \ldots, k_2\}$, i.e.,

$$E\left[ D_\ell(k_1, k_2) \right] = \sum_{k=k_1+1}^{k_2} \beta_k.$$
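
The design points and the group statistic just defined can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation of Wan, et al. (2005b): the function names are hypothetical, factors are indexed from 0 inside the code, and `noise_free` is a made-up deterministic response used only to show that second order terms cancel in the statistic.

```python
def design_point(level, K):
    """Level k > 0: factors 1..k at +1, the rest at 0.
    Level -k: factors 1..k at -1, the rest at 0. Level 0: all factors at 0."""
    k = abs(level)
    sign = 1 if level > 0 else -1
    return [sign if i < k else 0 for i in range(K)]

def group_statistic(simulate, k1, k2, K):
    """D(k1, k2) = {[Z(k2) - Z(-k2)] - [Z(k1) - Z(-k1)]} / 2 for one replication."""
    def z(level):
        return simulate(design_point(level, K))
    return ((z(k2) - z(-k2)) - (z(k1) - z(-k1))) / 2.0

# Hypothetical noise-free response with an interaction and a pure quadratic term.
BETA = [5.0, 0.0, 5.0, 0.0, 0.0, 2.0]

def noise_free(x):
    linear = sum(b * xi for b, xi in zip(BETA, x))
    return 10.0 + linear + 1.5 * x[0] * x[2] + 0.7 * x[1] ** 2

# Group {3, 4, 5}: the statistic recovers beta_3 + beta_4 + beta_5.
d = group_statistic(noise_free, 2, 5, 6)
```

Because each pair of design points is a mirror image, every second order term contributes equally to $Z(k)$ and $Z(-k)$ and drops out of the difference, which is why only the main effects of the group remain.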

The group is declared important if the null hypothesis, $H_0: \sum_{k=k_1+1}^{k_2} \beta_k \le \Delta_0$, is rejected at level $\alpha$. The group is declared unimportant if the null hypothesis is not rejected, provided that the test used has power $\gamma$ for the alternative hypothesis, $H_A: \sum_{k=k_1+1}^{k_2} \beta_k \ge \Delta_1$. To guarantee this statistical control, Wan, et al. (2005b) provide a fully sequential test that first takes an initial number, $n_0$, of replications at each of the four design points and then adds replications one at a time (to all four design points) until the group effect can be declared either important or unimportant. Even under non-constant variance conditions as in (1), CSB-X with the fully sequential test guarantees that the probability of Type I error is $\alpha$ for each factor individually and the power is at least $\gamma$ for each step of the algorithm.

TCFF, Two-stage Controlled Fractional Factorial

The TCFF method begins by selecting a fractional factorial design that is capable of achieving the desired resolution for the number of factors being screened. Recall that a resolution III design can estimate all main effects independently of each other, but does confound main effects with certain two-factor interactions (this is similar to CSB). A resolution IV design also allows for independent estimation of main effects, and in addition removes any bias due to two-factor interactions between the factors. In order to match the capabilities of CSB-X, we will use resolution IV designs for the comparison in this paper. Many books (e.g., Wu and Hamada, 2000) and software packages provide recommended fractional factorial designs for various resolutions and various values of $K$, the number of factors to be screened. If a recommended design cannot be easily found for a particular value of $K$, a resolution IV design can always be easily constructed by taking an orthogonal array large enough to accommodate $K$ two-level factors and then folding it over according to the procedure
described on page 72 of Wu and Hamada (2000). Since the foldover procedure doubles the number of runs in the orthogonal array, this procedure produces a resolution IV design for $K$ factors that has at least $2(K+1)$ design points. A very extensive list of orthogonal arrays for different values of $K$ has recently been published by Kuhfield and Tobias (2005). The method described below for controlling the power does not require any particular resolution design and in fact can be used with higher resolution designs if one wants to screen for factors with important two-factor interactions or higher order effects. Of course, higher resolution designs will require more design points and potentially many more observations.

Once the fractional factorial design (with $N$ design points) is selected, the next step is to control the power and Type I error of the main effect estimates. To accomplish this, we adapt a two-stage method developed to control power for the two-way ANOVA layout by Bishop and Dudewicz (1978). The first stage of the method requires an initial number, say $n_0$, of replications of the fractional factorial design, i.e., $n_0$ observations at each design point. Since the fractional factorial is likely to be quite large, we recommend that $n_0$ be relatively small, such as 3 or 5. The second stage of the method adds replications to certain rows of the factorial design to get better estimates of the mean response of those rows for which the responses have high variance. We denote the total number of replications for row $i$ after both stages as $n_i$. Thus, the number of additional replications taken for row $i$ in the second stage is $n_i - n_0$. The TCFF procedure is represented graphically in Figure 1. The keys to the control of Type I error and power for this method are: (1) the calculation for $n_i$ and (2) careful selection of the estimators for each factor effect.
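
The foldover construction mentioned above can be illustrated with a short Python sketch. It assumes one particular choice of orthogonal array, a Sylvester-type Hadamard matrix whose order is a power of two; this is a convenient stand-in, not the specific arrays tabulated by Wu and Hamada (2000) or Kuhfield and Tobias (2005), and the function names are hypothetical.

```python
from itertools import combinations

def sylvester_hadamard(order):
    """Hadamard matrix of a power-of-two order (Sylvester construction)."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h

def foldover_resolution_iv(K):
    """Two-level design for K factors: K non-constant Hadamard columns,
    stacked on top of their mirror image (every sign reversed)."""
    order = 1
    while order < K + 1:          # need at least K + 1 runs before folding over
        order *= 2
    h = sylvester_hadamard(order)
    base = [row[1:K + 1] for row in h]            # drop the all-ones column
    return base + [[-v for v in row] for row in base]

design = foldover_resolution_iv(6)                # 16 runs for 6 factors

def column_product_sum(cols):
    """Sum over runs of the product of the chosen columns (0 means orthogonal)."""
    total = 0
    for row in design:
        p = 1
        for c in cols:
            p *= row[c]
        total += p
    return total
```

In the folded design every main effect column is orthogonal to the mean, to the other main effect columns, and to every two-factor interaction column, which is the property the TCFF estimators rely on. For six factors the sketch returns 16 runs, the same size as the design used in the numerical example.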

[Figure 1: the design matrix ($N$ rows, $K$ factors) shown beside the first-stage block of $n_0$ replications per row and the second-stage block of $n_i - n_0$ replications per row.]

Figure 1. A graphical view of the TCFF Method.

Before describing the calculations for $n_i$, a brief overview of the analysis will be helpful. Once all the replications for stage two are made for each row, a single pseudo-observation, called $Y_i$, will be calculated for each row. Each $Y_i$ will be a weighted average of all the observations from row $i$. Given the normal error structure shown in (1), careful selection of the number of runs in each row and the weights in the weighted averages will result in each of these $Y_i$'s having a distribution which is proportional to a non-central t-distribution with $n_0 - 1$ degrees of freedom. Once the pseudo-observations are calculated for each row, the design is treated like an unreplicated fractional factorial design and the main effects of each factor are calculated in the standard way (either with contrasts as described on page 228 of Montgomery, 2001, or through regression analysis, see page 3 of Wu and Hamada, 2000). Since each of the $Y_i$'s follows a scaled t-distribution, the null distribution of each effect estimate can be found through Monte Carlo simulation or through a normal approximation. Thus, an $\alpha$-level test can be used to test the
hypothesis $H_0: \beta_k \le \Delta_0$ versus $H_A: \beta_k > \Delta_0$. The calculation of the number of replications in the second stage of the method ensures that if the effect of factor $k$ is greater than $\Delta_1$, the probability that factor $k$ will be declared important is greater than $\gamma$.

The steps for determining $n_i$ are now given (the formulas are adapted from Bishop and Dudewicz, 1978):

1. Let the cdf of the distribution of the average of $N$ independent standard t-distributed random variables, each with $n_0 - 1$ degrees of freedom, be called $t(N, n_0 - 1, x)$, where $x$ is the argument of the cdf. Let $c_0$ be the $1-\alpha$ quantile of this distribution and $c_1$ be the $1-\gamma$ quantile of this distribution. These quantiles can be easily obtained through simulation or through a normal approximation (see the Appendix). Values for $c_0$ and $c_1$ for some common values of $n_0$ and $N$ are given in the tables in the Appendix.

2. Let $z = \left( \dfrac{\Delta_1 - \Delta_0}{c_0 - c_1} \right)^2$ and let $s_i$ be the sample standard deviation from the $n_0$ replicates of row $i$ from the first stage. The total number of replications needed for row $i$ after the second stage of the method is then

$$n_i = \max\left\{ n_0 + 1, \left\lfloor \frac{s_i^2}{z} \right\rfloor + 1 \right\},$$

where $\lfloor x \rfloor$ denotes the greatest integer less than $x$. Thus, in the second stage, $n_i - n_0$ new observations are taken for each row $i$.

After all observations are taken, each factor effect, $\beta_k$, is tested as follows:

1. For each row in the fractional factorial design, a weight for each replication must be calculated. The $j$th replication in the $i$th row will be called $y_{ij}$ and the associated weight will be called $a_{ij}$.

2. A preliminary value called $b_i$ is first calculated for each row as:

$$b_i = \frac{1}{n_i} \left[ 1 + \sqrt{ \frac{n_0 \left( n_i z - s_i^2 \right)}{\left( n_i - n_0 \right) s_i^2} } \right],$$

where $n_0$ is the number of initial replications in the first stage, $n_i$ is the total number of replications of row $i$ after the second stage, $z = \left( \dfrac{\Delta_1 - \Delta_0}{c_0 - c_1} \right)^2$, and $s_i$ is the sample standard deviation from the initial $n_0$ replicates of row $i$ from the first stage.

3. The weight for the $j$th replication in the $i$th row is

$$a_{ij} = \frac{1 - (n_i - n_0) b_i}{n_0} \text{ for } j = 1, 2, \ldots, n_0, \quad \text{and} \quad a_{ij} = b_i \text{ for } j = n_0 + 1, n_0 + 2, \ldots, n_i.$$

4. These weights are then used to calculate a single weighted average response for each row, called $Y_i$, where $Y_i = \sum_{j=1}^{n_i} a_{ij} y_{ij}$. Given (1), each $Y_i$ has a scaled noncentral t-distribution such that

$$Y_i \sim \sqrt{z}\, t_i + \beta_0 + \sum_{k=1}^{K} \beta_k x_{ik} + \sum_{k=1}^{K} \sum_{m=k}^{K} \beta_{km} x_{ik} x_{im}, \tag{2}$$

where each $t_i$ is an independent random variable that has a central t-distribution with $n_0 - 1$ degrees of freedom.

5. The design can now be viewed as an unreplicated fractional factorial design with observations $Y_i$. Let the level of the $k$th factor in the $i$th row of the design matrix be $x_{ik}$. The estimator for the coefficient of the main effect for the $k$th factor is:

$$\hat{\beta}_k = \frac{1}{N} \sum_{i=1}^{N} x_{ik} Y_i. \tag{3}$$

6. Each $x_{ik}$ in the fractional factorial design will be either $-1$ or $1$, and each column, $x_k$, is orthogonal to the mean, all the other main effects, and all two-factor interaction columns. Since the t-distribution is symmetric (i.e., $t_i$ has the same distribution as $-t_i$), it follows from (2) and (3) that

$$\hat{\beta}_k \sim \beta_k + \sqrt{z} \sum_{i=1}^{N} \frac{t_i}{N}, \tag{4}$$

where each $t_i$ is an independent random variable that has a central t-distribution with $n_0 - 1$ degrees of freedom.

7. We now test the hypothesis $H_0: \beta_k \le \Delta_0$ versus $H_A: \beta_k > \Delta_0$. The null hypothesis is rejected only if $\hat{\beta}_k > \Delta_0 + c_0 \sqrt{z}$, since it follows from (4) that

$$\Pr\left\{ \hat{\beta}_k > \Delta_0 + c_0 \sqrt{z} \mid \beta_k < \Delta_0 \right\} < \alpha,$$

where $c_0$ is the $1-\alpha$ quantile of the distribution with cdf $t(N, n_0 - 1, x)$.

Bishop and Dudewicz (1978) use a theorem from Stein (1945) to show that for a two-way ANOVA layout, this procedure provides exact control of Type I error even if the variance changes across treatment combinations. They also extend the results to a multi-way ANOVA in Bishop and Dudewicz (1981). Since the fractional factorial design can be viewed as a multi-way ANOVA with various main effects fully confounded with high order interactions whose effects
are assumed to be zero, the procedure guarantees exact control of Type I error given the orthogonal design and the model in (1).

Power control is achieved by choosing the value for $z$. The null hypothesis is rejected only if $\hat{\beta}_k > \Delta_0 + c_0 \sqrt{z}$. By (4),

$$\Pr\left\{ \hat{\beta}_k > \Delta_1 + c_1 \sqrt{z} \mid \beta_k \ge \Delta_1 \right\} \ge \gamma.$$

To control power, $z$ must then be set such that $\Delta_0 + c_0 \sqrt{z} = \Delta_1 + c_1 \sqrt{z}$, so that if $\beta_k \ge \Delta_1$, the probability of rejecting the null hypothesis is greater than $\gamma$ (i.e., $\Pr\{ \hat{\beta}_k > \Delta_0 + c_0 \sqrt{z} \mid \beta_k \ge \Delta_1 \} \ge \gamma$). Therefore

$$z = \left( \frac{\Delta_1 - \Delta_0}{c_0 - c_1} \right)^2.$$

This concludes the methodology for TCFF. In the next section, a simple numerical example with a small number of factors is provided to help clarify the computations.

Numerical Example

Suppose that a small manufacturing system has 2 stations and at each station there are three factors: the number of machines (coded M1 and M2), the number of operators (coded O1 and O2), and the frequency of preventative maintenance (coded F1 and F2). The goal is to use a 16-run resolution IV fractional factorial to screen for factors that have a large effect on the daily throughput of the system. Table 1 shows the design matrix and the first stage observations. We have chosen $n_0$ = 3, $\Delta_0$ = , $\alpha$ = .05, $\gamma$ = .95, and $\Delta_1$ = 4.
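
The two-stage computations of the TCFF methodology (the quantiles, $z$, $n_i$, the weights $a_{ij}$, the pseudo-observation $Y_i$, and the estimator in (3)) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the quantiles come from the normal approximation rather than the Appendix tables (so the approximation needs more than 2 degrees of freedom), and the numbers in the usage lines are illustrative inputs, not the data of Table 1.

```python
import math
from statistics import NormalDist

def normal_approx_quantiles(N, df, alpha, gamma):
    """Approximate c0 (1 - alpha quantile) and c1 (1 - gamma quantile) of the
    average of N independent t(df) variables; the variance of that average
    is df / ((df - 2) * N), which requires df > 2."""
    sd = math.sqrt(df / ((df - 2) * N))
    nd = NormalDist()
    return nd.inv_cdf(1 - alpha) * sd, nd.inv_cdf(1 - gamma) * sd

def z_value(delta0, delta1, c0, c1):
    """z = ((Delta1 - Delta0) / (c0 - c1))^2."""
    return ((delta1 - delta0) / (c0 - c1)) ** 2

def total_replications(s_i, z, n0):
    """n_i = max(n0 + 1, floor(s_i^2 / z) + 1); math.floor differs from
    'greatest integer less than x' only when s_i^2 / z is an exact integer."""
    return max(n0 + 1, math.floor(s_i ** 2 / z) + 1)

def weights(s_i, z, n0, n_i):
    """Weights a_ij for row i: one value for the n0 first-stage replications
    and b_i for the n_i - n0 second-stage replications (s_i must be > 0)."""
    m = n_i - n0
    b = (1 + math.sqrt(n0 * (n_i * z - s_i ** 2) / (m * s_i ** 2))) / n_i
    return [(1 - m * b) / n0] * n0 + [b] * m

def pseudo_observation(y, w):
    """Y_i = sum_j a_ij * y_ij."""
    return sum(a * v for a, v in zip(w, y))

def main_effect(column, pseudo_obs):
    """Estimator (3): beta_hat_k = (1 / N) * sum_i x_ik * Y_i."""
    return sum(x * Y for x, Y in zip(column, pseudo_obs)) / len(pseudo_obs)

# Illustrative inputs (assumed for this sketch only).
c0, c1 = normal_approx_quantiles(N=16, df=3, alpha=0.05, gamma=0.95)
z = z_value(delta0=2.0, delta1=4.0, c0=c0, c1=c1)
n_i = total_replications(s_i=3.0, z=z, n0=3)
w = weights(s_i=3.0, z=z, n0=3, n_i=n_i)
```

The weights sum to one, and the sum of their squares times $s_i^2$ equals $z$; that second property is what gives every pseudo-observation the same scale $\sqrt{z}$ regardless of the row variance.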

[Table 1: the design matrix (row $i$; columns M1, M2, O1, O2, F1, F2), the first stage replications, and the sample standard deviation $s_i$ for each row.]

Table 1. Data from the first stage of the numerical example.

In order to calculate $n_i$ for each row, we begin by calculating $z$. We can approximate $c_0$ and $c_1$, the $1-\alpha$ and $1-\gamma$ quantiles of the t distribution, by applying the central limit theorem and using a normal approximation. Since the variance of a t-distribution with 3 degrees of freedom is 3, the average of 16 independent t-distributed random variables should be approximately normal with variance 3/16. Thus,

$$c_0 \approx \sqrt{\tfrac{3}{16}}\, \Phi^{-1}(.95) = 0.7122 \quad \text{and} \quad c_1 \approx \sqrt{\tfrac{3}{16}}\, \Phi^{-1}(.05) = -0.7122,$$

where $\Phi^{-1}(x)$ is the inverse of the standard normal cdf. Using Monte Carlo simulations, we found values of $c_0$ = .675 and $c_1$ = $-$.675 (see Appendix Table A2). We will use the Monte Carlo simulation results to
calculate $z$ as follows:

$$z = \left( \frac{\Delta_1 - \Delta_0}{c_0 - c_1} \right)^2 = 35,66.$$

The calculated values for $n_i$ are (5, 5, 5, 5, 5, 5, 5, 7, 9, 5, 5, 5, 5, 5, 5, 12) for rows 1-16 respectively. The $n_i - n_0$ observations for the second stage, the $b_i$'s, and the $Y_i$'s are given in Table 2.

[Table 2: for each row $i$, the second stage replications $y_{i5}$ through $y_{i12}$, the value $b_i$, and the pseudo-observation $Y_i$.]

Table 2. Data from the second stage of the numerical example.

The estimates for the factor effect coefficients are shown in Table 3. Since $\Delta_0 + c_0 \sqrt{z}$ = 7, we see that M1 and F2 have important effects.

[Table 3: estimated coefficients for the mean and for M1, M2, O1, O2, F1, and F2.]

Table 3. The estimates for the coefficients for the main effect of each factor.

Comparison of Methods

With normally distributed error, both CSB-X and TCFF have guaranteed performance for Type I error and power, even when there is unequal variance and two-factor interactions are present in the model. Thus, the primary measure for comparison between the two methods is the number of replications that it takes to gain the required performance. We set up 11 scenarios under which to test the two methods. Each of these scenarios is used for 200 and 500 factors with 1%,
5%, and 10% of the factors important. The TCFF method uses a 512-run, resolution IV design for the 200 factor cases and a 1024-run, resolution IV design for the 500 factor cases. TCFF uses $n_0$ = 3 replications for the first stage. CSB-X uses $n_0$ = 5. In each scenario, the size of the important effects is 5 and the size of unimportant effects is 0. We set $\Delta_0$ to 2 and $\Delta_1$ to 4 for all scenarios, and the error was normally distributed.

The first 3 scenarios have equal variance. In the next 5 scenarios, each of the important factors also has a dispersion effect, which means that when one of those factors changes level, the variability of the error either increases or decreases. Half the important factors increase the variability and the other half decrease the variability. Each dispersion effect increases or decreases the standard deviation of the error by 2% in the cases with 200 factors and by 8% in the cases with 500 factors. In the final 3 scenarios, the standard deviation of the error is proportional to the expected value of the response. The constant of proportionality was . and .4 for the 200 and 500 factor cases, respectively. These are very important scenarios since, for both simulation experiments and physical experiments, it is very common that variability and average response value are related. Each of the scenarios is described below.

1. The important effects are clustered together at the beginning of the factor set and the standard deviation for the random error is set to 3 for all observations.

2. The important effects are distributed at regular intervals throughout the factor set and the standard deviation for the random error is set to 3 for all observations.

3. The important effects are randomly distributed throughout the factor set and the standard deviation for the random error is set to 3 for all observations.

4. The important effects are clustered together at the beginning of the factor set. Each of the important factors also has a dispersion effect. In this scenario, all the positive dispersion effects are clustered at the beginning of the factor set.

5. The important effects are clustered together at the beginning of the factor set. Each of the important factors also has a dispersion effect. In this scenario, the positive dispersion effects are distributed throughout the set of important factors.

6. The important effects are distributed throughout the factor set. Each of the important factors also has a dispersion effect. In this scenario, all the positive dispersion effects are clustered at the beginning of the set of important factors.

7. The important effects are distributed throughout the factor set. Each of the important factors also has a dispersion effect. In this scenario, the positive dispersion effects are distributed throughout the set of important factors.

8. The important effects are randomly distributed throughout the factor set. Each of the important factors also has a dispersion effect which is randomly chosen to be positive or negative.

9. The important effects are clustered together at the beginning of the factor set and the standard deviation for the random error is proportional to the average response.

10. The important effects are distributed at regular intervals throughout the factor set and the standard deviation for the random error is proportional to the average response.

11. The important effects are randomly distributed throughout the factor set and the standard deviation for the random error is proportional to the average response.

Macro-replications of each scenario were run for 200 and 500 factors and with 1%, 5%, and 10% of the factors being important. Since both methods eliminate the effects of two-factor interactions, interactions were randomly generated for each simulation and the two methods were given the same interaction matrix for each of the macro-replications under each scenario. The interactions were generated from a N(0, 2) distribution and randomly assigned
22 betwee factors. Factors with importat effects had a 64% chace of iteractio with other importat factors ad a 6% chace of iteractio with factors with uimportat effects. Each pair of uimportat factors had oly a 4% chace of iteractio. Surprisigly, usig the same iteractio matrix did ot seem to iduce much correlatio betwee the macro-replicatios. The average ad stadard deviatio of the umber of replicatios across the macro-replicatios for each method for each sceario are reported i Tables 4 ad 5. The average umber of replicatios is plotted across the scearios i Figures 2-7. It is clear from Figures 2 ad 5 that CSB-X requires fewer replicatios whe oly % of the factors are importat. I most of these cases, the TCFF method takes the miimum umber of rus 4 24 = 496 rus. Figures 3, 4, 6, ad 7 show that the TCFF method is better o average tha CSB-X whe the percetage of importat factors is greater tha 5%. Tables 4 ad 5 show that the stadard deviatio across the macro-replicatios is always lower for TCFF ad that it ca be quite high for CSB-X especially i the scearios where the stadard deviatio of the error is proportioal to the average respose (Scearios 9,, ad ). Also, CSB-X is sesitive to the order i which the importat effects appear, where TCFF is ot. Naturally CSB-X does very well whe the importat effects are clustered, but performs worse whe they are radomly placed or distributed regularly. Fially, recall that CSB-X requires the directio of the factor effects to be kow, where TCFF does ot. Coclusio Two methods, CSB-X ad TCFF, for screeig of a large umber of factors with statistical cotrol of the Type I error ad the power have bee reviewed ad compared. If there are very few importat factors (less tha 5%) ad there is substatial prior iformatio (kow directio of factor effects ad some prior iformatio that allows the importat effects to be clustered), CSB-X ca have substatial gais i efficiecy over the TCFF approach, however, it 22

23 ca have highly variable results depedig o the quality of the prior iformatio ad the extet to which the error variace chages across observatios. The TCFF method requires little prior iformatio, but does ivolve a substatial iitial ivestmet of replicatios. Oce this iitial ivestmet is made, the method is relatively stable ad performs better tha CSB-X whe the percetage of importat factors icreases above 5%. 23
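The random two-factor interaction matrix used in the empirical evaluation can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `interaction_matrix`, its parameters, and the reading of the second argument of N(0, 2) as a standard deviation are all assumptions. The three interaction probabilities are passed in rather than hard-coded, so the values quoted in the text can be supplied by the caller.

```python
import random


def interaction_matrix(K, important, p_ii, p_iu, p_uu, sd, seed=0):
    """Return a symmetric K x K matrix of two-factor interaction effects.

    important : indices of the factors with important main effects
    p_ii      : chance that two important factors interact
    p_iu      : chance that an important and an unimportant factor interact
    p_uu      : chance that two unimportant factors interact
    sd        : standard deviation of a nonzero interaction (assumed reading)
    """
    rng = random.Random(seed)
    imp = set(important)
    B = [[0.0] * K for _ in range(K)]
    for i in range(K):
        for j in range(i + 1, K):
            # Count how many factors of the pair are important (0, 1, or 2).
            n_imp = (i in imp) + (j in imp)
            p = (p_uu, p_iu, p_ii)[n_imp]
            if rng.random() < p:
                B[i][j] = B[j][i] = rng.gauss(0.0, sd)
    return B


# Hypothetical usage: 200 factors, the first two important (clustered case),
# with the interaction probabilities as printed in the text.
B = interaction_matrix(200, [0, 1], p_ii=0.64, p_iu=0.06, p_uu=0.04, sd=2.0)
```

Generating the matrix once per scenario and reusing it for both methods mirrors the statement that CSB-X and TCFF were given the same interaction matrix.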

Table 4. Simulation results for CSB-X and TCFF with 200 factors.

Columns: Scenario; # of Factors; # of Important Factors; Significant Effects Order; Variance Model; CSB-X Ave # of Reps; CSB-X Stdev. of Reps; TCFF Ave # of Reps; TCFF Stdev. of Reps.

Rows: Scenarios 1-11, repeated for 1%, 5%, and 10% of the factors important. The numeric entries are not recoverable from the extracted text.

  Scenario   Significant Effects Order   Variance Model
  1          Clustered                   Equal
  2          Distributed                 Equal
  3          Random                      Equal
  4          Clustered                   Clust. Disp. Eff.
  5          Clustered                   Dist. Disp. Eff.
  6          Distributed                 Clust. Disp. Eff.
  7          Distributed                 Dist. Disp. Eff.
  8          Random                      Disp. Eff.
  9          Clustered                   Prop. to mean
  10         Distributed                 Prop. to mean
  11         Random                      Prop. to mean

Table 5. Simulation results for CSB-X and TCFF with 500 factors.

Columns: Scenario; # of Factors; # of Important Factors; Significant Effects Order; Variance Model; CSB-X Ave # of Reps; CSB-X Stdev. of Reps; TCFF Ave # of Reps; TCFF Stdev. of Reps.

Rows: Scenarios 1-11, repeated for 1%, 5%, and 10% of the factors important. The numeric entries are not recoverable from the extracted text.

  Scenario   Significant Effects Order   Variance Model
  1          Clustered                   Equal
  2          Distributed                 Equal
  3          Random                      Equal
  4          Clustered                   Clust. Disp. Eff.
  5          Clustered                   Dist. Disp. Eff.
  6          Distributed                 Clust. Disp. Eff.
  7          Distributed                 Dist. Disp. Eff.
  8          Random                      Disp. Eff.
  9          Clustered                   Prop. to mean
  10         Distributed                 Prop. to mean
  11         Random                      Prop. to mean

Figure 2. Average number of runs with 1% of 200 factors important. [Chart: CSB-X vs. Two-Stage FFD; y-axis: Average # of Reps; x-axis: Scenarios]

Figure 3. Average number of runs with 5% of 200 factors important. [Chart: CSB-X vs. Two-Stage FFD; y-axis: Average # of Reps; x-axis: Scenarios]

Figure 4. Average number of runs with 10% of 200 factors important. [Chart: CSB-X vs. Two-Stage FFD; y-axis: Average # of Reps; x-axis: Scenarios]

Figure 5. Average number of runs with 1% of 500 factors important. [Chart: CSB-X vs. Two-Stage FFD; y-axis: Average # of Reps; x-axis: Scenarios]

Figure 6. Average number of runs with 5% of 500 factors important. [Chart: CSB-X vs. Two-Stage FFD; y-axis: Average # of Reps; x-axis: Scenarios]

Figure 7. Average number of runs with 10% of 500 factors important. [Chart: CSB-X vs. Two-Stage FFD; y-axis: Average # of Reps; x-axis: Scenarios]

Appendix: Critical Values for the t Distribution

In this appendix, we show how to compute values for $c_0$ and $c_1$ through Monte Carlo simulation or with a normal approximation. Let the cdf of the distribution of the average of $N$ independent standard t-distributed random variables, each with $\nu$ degrees of freedom, be called $t(N, \nu, x)$, where $x$ is the argument of the cdf. Let $c_0$ be the $1-\alpha$ quantile of this distribution and $c_1$ be the $1-\gamma$ quantile of this distribution. Values for $c_0$ and $c_1$ for some common values of $\alpha$, $\gamma$, and $N$ are given in Tables A1, A2, and A3.

Monte Carlo Simulation to determine $c_0$ and $c_1$ for $t(N, \nu, x)$.

1. Generate a set of $N$ random variables, $t_i$, $i \in \{1, \ldots, N\}$, from a central t-distribution with $\nu$ degrees of freedom and compute the average, called $\bar{t}_j = \frac{1}{N} \sum_{i=1}^{N} t_i$.
2. Choose a large number, $M$, and repeat Step 1 $M$ times to obtain the average of each set, $\bar{t}_j$, $j \in \{1, \ldots, M\}$.
3. Rank order the $\bar{t}_j$'s so that $\bar{t}_{(j)} > \bar{t}_{(k)}$ for all $j > k$ to obtain $\bar{t}_{(j)}$, $j \in \{1, \ldots, M\}$.
4. Let $j_0 = \mathrm{Round}[\alpha M]$ and $j_1 = \mathrm{Round}[\gamma M]$; then $c_0 = -\bar{t}_{(j_0)}$ and $c_1 = -\bar{t}_{(j_1)}$, where $\mathrm{Round}[x]$ rounds $x$ to the nearest integer. Because the distribution is symmetric about zero, the negative of the empirical $\alpha$ quantile estimates the $1-\alpha$ quantile.

Normal Approximation to determine $c_0$ and $c_1$ for $t(N, \nu, x)$.

The variance of a t-distribution with $\nu$ degrees of freedom is $\frac{\nu}{\nu - 2}$ for $\nu > 2$. If $N$ is relatively large, then by the central limit theorem, the average of $N$ independent t-distributed random variables is approximately normal with variance $\frac{\nu}{N(\nu - 2)}$. Thus,

$c_0 \approx \sqrt{\frac{\nu}{N(\nu - 2)}} \, \Phi^{-1}(1 - \alpha)$ and $c_1 \approx \sqrt{\frac{\nu}{N(\nu - 2)}} \, \Phi^{-1}(1 - \gamma)$,

where $\Phi^{-1}(x)$ is the inverse cdf of the standard normal distribution. In Tables A1, A2, and A3, critical values $c_0$ and $c_1$ calculated by Monte Carlo simulation (and the normal approximation) are presented for cases where the normal approximation is not very accurate. For $N > 32$, the normal approximation is reasonably accurate and thus tables are not necessary. Notice that if $(1 - \gamma) = \alpha$, then $c_0 = -c_1$. The normal approximation cannot be used when $n_0 = 3$, since the t-distribution with $\nu = n_0 - 1 = 2$ degrees of freedom does not have a finite variance.

Table A1. The critical values, $c_0$, for $\alpha = .01$. These are also the values, $-c_1$, for $\gamma = .99$. (Normal approximation results are in parentheses; -- marks a Monte Carlo value that is missing or truncated in the extracted text.)

  n0    N = 8            N = 16           N = 32
  3     --    (n/a)      --    (n/a)      1.426 (n/a)
  4     --    (1.425)    --    (1.007)    .758  (.712)
  5     --    (1.163)    .848  (.822)     --    (.582)
  6     --    (1.062)    .737  (.751)     .536  (.531)
  7     --    (1.007)    .726  (.712)     --    (.504)
  8     --    (.973)     --    (.688)     --    (.487)
  9     --    (.950)     .668  (.672)     .483  (.475)
  10    .965  (.933)     .676  (.659)     .473  (.466)
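The normal approximation can be checked numerically with a few lines of Python. This is a minimal sketch: the function name `normal_approx_critical_value` is hypothetical, and the relation $\nu = n_0 - 1$ is inferred from the remark that the approximation fails for $n_0 = 3$ (2 degrees of freedom).

```python
from math import sqrt
from statistics import NormalDist


def normal_approx_critical_value(N, nu, p):
    """Approximate the p quantile of the mean of N independent t(nu) variates.

    Uses Var[t(nu)] = nu / (nu - 2), which is finite only for nu > 2.
    """
    if nu <= 2:
        raise ValueError("the variance of t(nu) is not finite for nu <= 2")
    return sqrt(nu / (N * (nu - 2))) * NormalDist().inv_cdf(p)


# c_0 uses p = 1 - alpha; c_1 uses p = 1 - gamma.
alpha, n0, N = 0.01, 4, 8
c0 = normal_approx_critical_value(N, n0 - 1, 1 - alpha)
print(f"n0={n0}, N={N}, alpha={alpha}: c0 ~ {c0:.3f}")
```

Evaluating the function over $n_0 = 4, \ldots, 10$ and $N \in \{8, 16, 32\}$ gives the values shown in parentheses in the tables.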

Table A2. The critical values, $c_0$, for $\alpha = .05$. These are also the values, $-c_1$, for $\gamma = .95$. (Normal approximation results are in parentheses; -- marks a Monte Carlo value that is missing or truncated in the extracted text.)

  n0    N = 8            N = 16           N = 32
  3     --    (n/a)      --    (n/a)      .742  (n/a)
  4     --    (1.007)    .675  (.712)     .494  (.504)
  5     --    (.822)     --    (.582)     --    (.411)
  6     --    (.751)     .523  (.531)     .378  (.375)
  7     --    (.712)     --    (.504)     .365  (.356)
  8     --    (.688)     --    (.487)     .346  (.344)
  9     --    (.672)     .473  (.475)     .336  (.336)
  10    .665  (.659)     .465  (.466)     --    (.330)

Table A3. The critical values, $c_0$, for $\alpha = .10$. These are also the values, $-c_1$, for $\gamma = .90$. (Normal approximation results are in parentheses; -- marks a Monte Carlo value that is missing or truncated in the extracted text.)

  n0    N = 8            N = 16           N = 32
  3     --    (n/a)      --    (n/a)      .543  (n/a)
  4     --    (.785)     --    (.555)     --    (.392)
  5     --    (.641)     .444  (.453)     --    (.320)
  6     --    (.585)     --    (.414)     .292  (.292)
  7     --    (.555)     .396  (.392)     .279  (.277)
  8     --    (.536)     --    (.379)     .273  (.268)
  9     --    (.523)     --    (.370)     .263  (.262)
  10    --    (.514)     --    (.363)     .259  (.257)
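The four Monte Carlo steps of this appendix can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the names `t_variate` and `critical_values` are hypothetical, the t variates are built from a standard normal and a chi-square draw using only the standard library, and the values of `M` and `seed` are arbitrary. Python's built-in `round` rounds exact halves to the nearest even integer, which differs from some other rounding conventions only when $\alpha M$ or $\gamma M$ ends in .5.

```python
import math
import random


def t_variate(rng, nu):
    """Draw one central t variate with nu degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu d.f.
    return z / math.sqrt(chi2 / nu)


def critical_values(N, nu, alpha, gamma, M=20000, seed=0):
    """Estimate (c0, c1) for the mean of N independent t(nu) variates."""
    rng = random.Random(seed)
    # Steps 1-2: M averages, each over N independent t variates.
    means = [sum(t_variate(rng, nu) for _ in range(N)) / N for _ in range(M)]
    # Step 3: ascending order, so means[j - 1] is the j-th smallest average.
    means.sort()
    # Step 4: 1-based ranks, clamped to the valid range [1, M].
    j0 = min(max(round(alpha * M), 1), M)
    j1 = min(max(round(gamma * M), 1), M)
    # Negating the empirical alpha quantile estimates the 1 - alpha quantile.
    return -means[j0 - 1], -means[j1 - 1]


# Hypothetical usage: n0 = 10 (nu = 9), N = 8, alpha = .01, gamma = .99.
c0, c1 = critical_values(N=8, nu=9, alpha=0.01, gamma=0.99)
print(f"c0 ~ {c0:.3f}, c1 ~ {c1:.3f}")
```

With $1 - \gamma = \alpha$, the two estimates should be close to negatives of each other, up to Monte Carlo error that shrinks as `M` grows.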

Acknowledgement

Dr. Ankenman's work was partially supported by EPSRC Grant GR/S9673 and was undertaken while Dr. Ankenman was visiting Southampton Statistical Sciences Research Institute at the University of Southampton in the UK. Dr. Wan's research was partially supported by Purdue Research Foundation (Award No. ). We would also like to thank Russell Cheng, Sue Lewis, Barry Nelson, and an anonymous referee for many helpful discussions and comments.

References

Banks, J.; Carson II, J. S.; Nelson, B. L.; and Nicol, D. M. (2005). Discrete-Event System Simulation. Fourth Edition, Prentice Hall, New Jersey.

Bettonvil, B. and Kleijnen, J. P. C. (1997). Searching for Important Factors in Simulation Models with Many Factors: Sequential Bifurcation. European Journal of Operational Research 96.

Bishop, T. A. and Dudewicz, E. J. (1978). Exact Analysis of Variance with Unequal Variances: Test Procedures and Tables. Technometrics 20.

Bishop, T. A. and Dudewicz, E. J. (1981). Heteroscedastic ANOVA. Sankhya 2.

Dean, A. M. and Lewis, S. M., eds. (2005). Screening: Methods for Experimentation in Industry. Springer-Verlag, New York.

Kuhfeld, W. F. and Tobias, R. D. (2005). Large Factorial Designs for Product Engineering and Marketing Research Applications. Technometrics 47.

Lewis, S. M. and Dean, A. M. (2001). Detection of Interactions in Experiments with Large Numbers of Factors (with discussion). J. R. Statist. Soc. B 63.

Lu, X. and Wu, X. (2004). A Strategy of Searching for Active Factors in Supersaturated Screening Experiments. Journal of Quality Technology 36.

Montgomery, D. C. (2001). The Design and Analysis of Experiments, Fifth Edition. Wiley.

Santner, T. J.; Williams, B.; and Notz, W. (2003). The Design and Analysis of Computer Experiments. Springer-Verlag, New York.

Stein, C. (1945). A Two-Sample Test for a Linear Hypothesis whose Power is Independent of the Variance. Annals of Mathematical Statistics 16.

Wan, H.; Ankenman, B. E.; and Nelson, B. L. (2005a). Controlled Sequential Bifurcation: A New Factor-Screening Method for Discrete-Event Simulation. Operations Research, to appear.

Wan, H.; Ankenman, B. E.; and Nelson, B. L. (2005b). Simulation Factor Screening With Controlled Sequential Bifurcation in the Presence of Interactions. Working Paper, Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, IL.

Wu, J. C. F. and Hamada, M. (2000). Experiments: Planning, Analysis and Parameter Design Optimization. Wiley, New York.
