DETC2003/DAC AN EFFICIENT ALGORITHM FOR CONSTRUCTING OPTIMAL DESIGN OF COMPUTER EXPERIMENTS

Proceedings of DETC 03 ASME 003 Design Engineering Technical Conferences and Comuters and Information in Engineering Conference Chicago, Illinois USA, Setember -6, 003 DETC003/DAC-48760 AN EFFICIENT ALGORITHM FOR CONSTRUCTING OPTIMAL DESIGN OF COMPUTER EXPERIMENTS Ruichen Jin, Wei Chen* Integrated DEsign Automation Laboratory (IDEAL) University of Illinois at Chicago Agus Sudianto V-Engine Engineering Analytical Powertrain Ford Motor Comany ABSTRACT Metamodeling aroach has been widely used due to the high comutational cost of using high-fidelity simulations in engineering design. The accuracy of metamodels is directly related to the exerimental designs used. Otimal exerimental designs have been shown to have good sace filling and roective roerties. However, the high cost in constructing them limits their use. In this aer, a new algorithm for constructing otimal exerimental designs is develoed. There are two maor develoments involved in this work. One is on develoing an efficient global otimal search algorithm, named as enhanced stochastic evolutionary (ESE) algorithm. The other is on develoing efficient algorithms for evaluating otimality criteria. The roosed algorithm is comared to two existing algorithms and is found to be much more efficient in terms of the comutation time, the number of exchanges needed for generating new designs, and the achieved otimality criteria. The algorithm is also very flexible to construct various classes of otimal designs to retain certain structural roerties. Key words: metamodeling, otimal design, comuter exeriments, stochastic evolutionary algorithm 1. INTRODUCTION Metamodeling aroach has been widely used due to the high comutational cost of using high-fidelity simulations in engineering design. While the accuracy of an aroximation is directly related to the metamodeling aroach itself, design of comuter exeriments, or called samling (for simulations) also has a considerable effect on the accuracy of a metamodel. As more details will be rovided later, design of comuter exeriments could be formulated as an otimization roblem for which finding the globally otimal design (locations of multile samles) involves combinatorial exhaustive search and is comutationally rohibitive even for a small dimensional Corresonding Author, weichen1@uic.edu. After June 1, 003, Mechanical Engineering, Northwestern University, Evanston, IL 6008-3111, weichen@northwestern.edu. roblem. Develoing an efficient samle construction algorithm for otimizing design of comuter exeriments is the focus of this aer. It is generally believed that a good design for comuter exeriments should be sace-filling which means that the samle oints should sread out over the entire design sace as evenly as ossible to cature the design behavior. Because in most of roblems only a small grou of factors are virtually significant, it is also desired that there are no relicates or significant oint-clustering in the roection of the design onto the subsace of significant (or called effective) factors. Tradeoff often has to be made between the aforementioned sace filling roerty and the roective roerty in lowdimensional subsaces. Various designs (or samling techniques) have been used for comuter exeriments. Koehler and Owen (1996) rovided a good review on design and analysis of comuter exeriments. Simson, et al. (001) comared five different exerimental designs and four metamodeling aroaches in terms of their caability to generate accurate aroximations. Existing methods can be roughly ut into two categories. One category of designs are constructed by combinatorial, geometrical or algebraic aroaches, such as Latin hyercube designs (LHD) (McKay, et al., 1979), orthogonal arrays (OA) (Owen, 199), orthogonal array-based Latin hyercube designs (Tang, 1993), etc. Those designs often have good roective roerty in lowdimensional subsaces; however, their samle oints in high or full-dimensional sace are scattered randomly. The other category of designs are constructed by algorithmic aroaches under certain otimality criteria, such as minimax and maximin designs (Johnson, et. al., 1990), maximum entroy designs (Currin, et al., 1991), integrated mean squared-error (IMSE) designs (Sacks, et. al., 1989), and uniform designs (Fang and Wang, 1994). Those designs usually have good sace-filling roerties. However, obtaining those designs can be either difficult or comutationally intractable. Some otimal designs may not have good roective roerties in low-dimensional subsaces. For instance, Morris and Mitchell (1995) found that 1 Coyright 003 by ASME

maximin distance designs are often concentrated in the corners of the design sace when the number of oints is relatively small comared to the number of variables. To imrove the low-dimensional roective roerty as well as to maintain a good comutational efficiency in samling, some researchers roose to search an otimal design within the class of LHDs, which have good roective roerties in each dimension and still have much freedom in generating distinct candidate designs. Morris and Mitchell (1995) introduced otimal LHDs based the φ criterion (a variant of the maximin distance criterion); Parks (1994) introduced otimal LHDs based on either the maximum entroy criterion or the IMSE criterion; Fang, et al (00) introduced otimal LHDs based on the Centered L discreancy criterion. Other classes of designs that have good roective roerties in two-dimensional (or higher) subsace, e.g., OAbased LHDs, are also romising. Searching the otimal design of exeriments within a class of designs (e.g., LHD), even though more tractable than searching in the entire samle sace without any restrictions, is still difficult to solve exactly. Exhaustive search method is comutationally rohibitive even for a small roblem. For examle, for otimizing 10 4 LHDs (10 runs, 4 factors), the number of distinct designs is more than 10. It is more ractical to solve otimal design (of exeriments) roblems aroximately. Toward this effort, Morris and Mitchell (1995) adated a version of simulated annealing (SA) algorithm for constructing otimal LHDs; Park (1994) develoed a rowwise element exchange algorithm for constructing otimal LHDs; Ye, et al (000) used the columnwise-airwise (CP) algorithm (Li and Wu, 1997) for constructing otimal symmetrical LHDs; Fang, et al (00) adated the threshold acceting (TA) algorithm (essentially a variant of SA) in constructing otimal LHD. The otimal designs constructed by these algorithms have been shown to have a good sace filling roerty. However, the comutational cost of these existing algorithms is generally high. For examle, Ye, et al (000) reorted that generating an otimal 5 4 LHDs using CP could take several hours on a Sun SPARC 0 workstation. For a design as large as 100 10, the comutational cost could be formidable; thus, search rocesses often stoed before finding a good design. In this aer, we roose an algorithm that is able to quickly construct a good design of exeriments given a limited comutational resource but also has the caability of moving away from a locally otimal design. There are two maor develoments involved. One is on develoing an efficient global otimal search algorithm, named as the enhanced stochastic evolutionary (ESE) algorithm. The other is on develoing efficient algorithms for evaluating otimality criteria (such as the entroy criterion, the φ criterion, and the CL criterion) to facilitate the search of otimal exerimental designs. The roosed method is esecially useful for constructing median to large-sized design of exeriments. For examle, for a 100 10 LHD, the roosed algorithm is able to find a good design within minutes, if not within seconds. Furthermore, the algorithm is able to work on different classes of designs and retain certain secial structural roerties, e.g., the balance roerty of LHDs and the orthogonality of OA and OA-based LHDs. Due to the limited sace, in this aer, we show only how it is used to otimize LHDs.. THE TECHNOLOGICAL BASE An exerimental design with n runs and m factors is usually written as an n m matrix X = [x 1, x,..., x n ] T, where each row x T i =[x i1,x i,...,x im ] stands for an exerimental run and each column stands for a factor or a variable. The otimal exerimental design roblem we are interested is to search a design X * in a given design class Z, which otimizes (for simlicity, minimization is considered) a given otimality criterion f, i.e, min f ( X). (1) X Z In Sections.1 and., descritions of exerimental designs with secial structural roerties and different otimality criteria will be given, resectively. Section.3 is an introduction of the existing samle construction algorithms..1. Designs With Secial Structural Proerties.1.1. Balanced Design A column in a design is balanced if the number of runs assigned to each level of the column is the same. A design is called a balanced design if all columns of the design are balanced (c.f., Li, 1997). The concet of balanced designs covers a wide range of designs of interest. For examle, OAs and LHDs are secial cases of balanced designs. A balanced design with n runs and m factors is denoted as: U (... n q1q qm ), where q i is the number of levels for the i th column. If all the q i s in a balanced design are equal, the design is said to be symmetrical and denoted as U ( m n q ) ; otherwise, it is said to be mixed-level or asymmetrical..1.. Latin Hyercube Design A LHD (McKay et al., 1979) is an n m matrix in which each column is a random ermutation of {1,,, n}. It has good roective roerties on any single dimension. LHDs are secial cases of symmetrical balanced designs with its level numbers equal to run numbers. LHDs have been alied in many comuter exeriments where all the factors (variables) are continuous. However, in the case that some factors are discrete or have to be fixed at certain given values, asymmetrical balanced designs are more aroriate: for continuous factors, the number of levels could be set to be equal to the number of runs; for other factors, the number of levels could be set based on the discrete levels of the factors..1.3. Orthogonal Array (OA) and OA-based LHD (OL) A design is called a strength-r orthogonal array and denoted as OA (... n q1q qm; r), if all ossible level combinations for any r factors aear equally often (Owen, 199; Hedayat et al., 1999). OAs are secial cases of balanced designs with orthogonality between columns. Geometrically, the roection of a strength-r OA onto a r-dimensional subsace of factors i 1, i,...,i r will be a q i1 q i..q ir grid. Strength- OAs have been extensively used for lanning exeriments in industry. However, when a large number of factors are studied but only a few of them are virtually significant, OA roected onto the subsace of the significant factors can result in relication of oints. For examle, each one-dimensional roection of an OA 16 (4 5 ;) has only 4 distinct oints. To avoid relications in roections, Tang (1993) roosed to use OA to construct an Coyright 003 by ASME

imroved LHD, called OA-based LHD (OL), which to some degree inherits both the r-dimensional uniformity of a strengthr OA and one-dimensional uniformity of LHD... Otimality Criteria Otimal criteria are used to achieve the sace-filling roerty in design of comuter exeriments. Three widely used otimality criteria are considered in this work...1. Maximin Distance Criterion and φ Criterion A design is called a maximin distance design (Johnson, et al, 1990) if it maximizes the minimum inter-site distance: min d( xi, x ), () 1 i, n, i where d(x i, x ) is the distance between two samle oints x i and x : m t d( x i, x ) = di = xik x k, t = 1or. (3) k = 1 Morris and Mitchell (1995) roosed an intuitively aealing extension of the maximin distance criterion. For a given design, by sorting all the inter-sited distance d i (1 i, n, i ), a distance list (d 1, d,..., d s ) and an index list (J 1, J,..., J s ) can be obtained, where d i s are distinct distance values with d 1 <d <...<d s, J i is the number of airs of sites in the design searated by d i, s is the number of distinct distance values. A design is called a φ -otimal design if it minimizes: 1/ i s φ = J id, (4) i= 1 where is a ositive integer. With a very large, the minimum distance d 1 will dominate all subsequent items. In that case, the φ criterion is equivalent to the maximin distance criterion.... Entroy Criterion Shannon (1948) used entroy to quantify the "amount of information": the lower the entroy, the more recise the knowledge is. From the Bayesian viewoint, the lower the osterior entroy, the smaller is the uncertainty in the rediction of the resonse at unobserved sites. Minimizing the osterior entroy is equivalent to finding a set of design oints on which we have the least knowledge. It has been further shown that the entroy criterion is equivalent to minimizing the following (see, e.g., Koehler and Owen, 1996): log R, (5) where R is the correlation matrix of the exerimental design T matrix X = x, x,...,, whose elements are: [ 1 x n ] m t Ri = ex θ k xik x k,1 i, n;1 t, (6) k = 1 where θ k (k=1,..,m) are correlation coefficients...3. Centered L Discreancy Criterion The L discreancy is a measure of the difference between the emirical cumulative distribution function of an exerimental design and the uniform cumulative distribution function. In other words, the L discreancy is a measure of 1/ t non-uniformity of a design. Among L discreancy, L discreancy is used most frequently since it can be exressed analytically and is much easier to comute. Hickernell (1998) roosed three formulas of L discreancy, among which the centered L -discreancy (CL ) seems the most interesting. n m 13 1 1 CL ( X) = (1 + xik 0.5 xik 0.5 ) 1 n i= 1 k = 1 (7) n n m 1 1 1 1 + (1 0.5 0.5 ). + xik + xk xik xk n i= 1 = 1 k= 1 A design is called uniform design if it minimizes the centered L discreancy (Fang, et al, 000)..3. Existing Algorithms for Constructing Exeriments A tyical exeriment-constructing algorithm searches a good design of exeriments, reresented by X, reeatedly in the following rocedure: 1. Start from a randomly chosen starting design X 0 ;. Construct a new design (or a set of new designs) by some kinds of udating oerations on the current design; 3. Comute the criterion value of the new design and decide whether to relace the current design with the new one. Udating oerations are critical in samle construction algorithms. There are two maor tyes of oerations, i.e., rowwise oerations and columnwise oerations. A rowwise oeration changes a row of a design X, while a columnwise oeration changes a column. A review on columnwise and rowwise algorithms can be found in Li and Wu s (1997). We are interested in columnwise oerations since they are articularly easier to kee the structure roerties of a design in relation to columns, such as the balance and orthogonality roerties introduced earlier in Section.1. For LHDs or balanced designs in general, ermuting (changing the values of) individual elements in a column may still retain the balance structure of the design but could largely change the current design and therefore some good features of the current design will not be inherited and the otimization rocess may dramatically slow down. In this study, we focus on a articular tye of columnwise oeration, called elementexchange, which interchanges two distinct elements in a column and guarantee to retain the balance roerty. Take a 5 4 LHD for examle: 1 4 0 3 4 0 3 1 3 4 4 0 1 0 3 1 Exchange two elements in the second column 1 4 0 3 0 0 3 1 3 4 4 4 1 0 3 1 Figure 1. Element-exchange in a 5 4 LHD Obviously, after the element-exchange, the balance roerty of nd column is retained, i.e., exactly one run is assigned to each of the five levels in the column. Therefore, the design induced by exchanging two elements in the second column of the LHD is still a LHD. Another advantage of using element-exchange, as to be shown in Section 3., is that the evaluation of an otimal criterion of a new design induced by an element-exchange can be very efficient. In the rest of this section, we review three existing otimization algorithms. 3 Coyright 003 by ASME

.3.1. CP algorithm The CP algorithm (Li and Wu, 1997) starts from an n m randomly chosen design X. Each iteration in the algorithm is divided into m stes. At the i th ste, the CP algorithm comares all ossible distinct designs induced by exchanges in the i th column of the current design X, and selects the best design X try from all those designs. If after an iteration, X try is better than X, i.e., f(x try ) < f(x), the rocedure will be reeated; if no imrovement is achieved at an iteration, the search will be terminated. The CP algorithm could quickly find a locally otimal design. However, deending on the starting design, the otimal design obtained could be of low quality. In ractice, with the CP algorithm the otimization rocess needs to reeat for N s cycles from different starting designs and the best design among the otimal designs from all cycles is selected..3.. SA algorithm The SA algorithm (Morris and Mitchell, 1995) begins with a randomly chosen design, and roceeds through examination of a sequence of new designs, each generated by a randomly chosen element-exchange within a randomly chosen column of the current design X. A new design X try relaces X if it leads to an imrovement. Otherwise, it will relace X with robability of ex{ [ f ( Xtry ) f ( X)] T}, where T is a arameter called temerature in the analogous hysical rocess of annealing of solids. Initial set to T 0, T will be monotonically reduced by some cooling schedule. Morris and Mitchell used T = αt as the cooling schedule, where α is a constant called cooling factor here. SA usually converges slowly to a high quality design..3.3. TA algorithm The TA algorithm (Winker and Fang, 1998) is essentially a variant of the SA. Instead of acceting a new design with some robability, TA determines whether to accet a new design X try by using a simle deterministic accetance criterion: f ( X try ) f ( X) T h, where T h is called threshold. T h is monotonically reduced based on some cooling schedule. TA has been used for constructing uniform designs (c.f., Fang 000; Fang, et al, 00). 3. PROPOSED ALGORITHM FOR CONSTRUCTING OPTIMAL EXPERIMENTAL DESIGN In this section, a new algorithm for constructing otimal exerimental design is resented. Using the columnwise element-exchange as the basic oeration, our roosed algorithm can be used to find efficiently an otimal design that maintains the secial structural roerty of a articular class of design, e.g., to obtain an otimal LHD when randomly choosing LHD as the starting design. To overcome the difficulties associated with the existing methods and to achieve much imroved efficiency, our roosed method adats and enhances a global search algorithm, i.e., the stochastic evolutionary algorithm (Section 3.1), and utilizes efficient algorithms for evaluating different otimality criteria (Section 3.) that significantly reduces the comutational burden. 3.1. Enhanced Stochastic Evolutionary (ESE) Algorithm The enhanced stochastic evolutionary (ESE) algorithm is used in this work to control the entire rocess of searching otimal designs. The method is adated and enhanced from the stochastic evolutionary (SE) algorithm, which was develoed by Saab and Rao (1991) for general combinatorial otimization alications. Similar to TA, SE decides whether to accet a new design by a threshold-based accetance criterion. However, the strategy (or schedule) for SE to change the value of threshold is different from TA or SA. The threshold T h is initially set to a small value T h0. The value is incremented based on certain warming schedules only if it seems that the algorithm is stuck at a local otimum; whenever a better solution is found in the rocess of warming u, T h is set back to T h0. It is shown (Saab and Rao, 1991) that SE can converge much faster than SA and be caable of moving away from low quality local otimum to find a high quality solution. However, ractically, it is often difficult to decide the value of T h0 and the warming schedule for different roblems. The ESE algorithm develoed in this work uses a sohisticated combination of warming schedule and cooling schedule to control T h so that the algorithm can be self-adusted to suit different exerimental design roblems (i.e., different classes of designs, different otimality criteria and different sizes of designs). The ESE algorithm, as shown in Fig, consists of double loos, i.e., the inner loo and the outer loo. While the inner loo constructs new designs by element-exchanges and decides whether to accet them based on an accetance criterion, the outer loo controls the entire otimization rocess by adusting the threshold T h in the accetance criterion. In the entire rocess, X best is used to kee track of the udated best design. No Yes Inner Loo Randomly ick J distinct element-exchanges within column (i mod m) X = X try n act =n act +1 f(x) < f(x bst ) X bst = X n im =n im +1 i = 0, n act = 0, n im = 0 Choose the best design X try from J design induced by exchanges Yes i=i+1 f(x try )-f(x) T random(0,1) No i < M No Yes Outer Loo Initialize X = X 0, X best = X T h = T h0 X old_best = X best i = 0, n act = 0, n im = 0 Inner Loo f(x old_best )- f(x best ) > tol Yes No flag im = 1 flag im =0 Udate T h Stoing criterion is satisfied? Sto Yes Figure. Flowchart of the ESE Algorithm No 4 Coyright 003 by ASME

3.1.1. Inner Loo The inner loo has M iterations. Generally, at iteration i, the algorithm randomly icks J distinct element-exchanges in i (mod m) column of the current design X and chooses the best design X try based on the values of otimal criterion. If X try is better than the current design X, it will be acceted to relace X; otherwise, X try will be acceted to relace X if it satisfies the following accetance criterion: f Th random(0,1), (8) where f = f ( X try ) f ( X), random(0,1) is a function that generates uniform random numbers between 0 and 1 and T h > 0 is a control arameter, which is called threshold here. If f T h, X try will never be acceted and if 0 < f < Th, let S = random(0,1), then X try will be acceted with robability: P( S f Th ) = 1 f T h. (9) With this accetance criterion, a temorarily worse design could be acceted and a slightly worse design (i.e., a small f ) is more likely to relace the current design than a significantly worse design (i.e., a large f ). In addition, a given increase in criterion value is more likely to be acceted if T h has a relatively high value. The setting of T h will be discussed later. The values of arameters involved in the inner loo, i.e., J and M, are re-secified. Choice of J: This arameter is the number of distinct element exchanges generated at each iteration. The algorithm will comare the designs induced by those exchanges and find the best design. This treatment, as CP, could enable the algorithm raidly find a locally otimal design. However, unlike CP, which comares all ossible distinct designs induced by exchanges, our algorithm only randomly icks J distinct designs resulted from exchanges. This randomized behavior together with the accetance criterion is intended to allow the search to escae from locally otimal designs. Based on our testing exerience, too large of J may make it more ossible to be stuck in a locally otimal design for small-sized designs and lead to low efficiency for large-sized designs. In our test, we set J to be n e /5 but no large than 50, where n e is the number of all n ossible distinct element-exchanges in a column ( for a q i LHD and ( n / qi ) for a balanced design). For mixedlevel balanced designs, the values of J will be different for different columns. Choice of M: The arameter is the number of iterations in the inner loo, i.e., the number of tries the algorithm will make before going on to the next threshold T h. It seems reasonable that M should be larger for larger roblems. In our test, we set M to be n e m / J but no larger than 100. 3.1.. Outer Loo Deending on whether any imrovement in criterion is made in a cycle (a run of the inner loo), the search rocess of ESE (and similarly that of the original SE) can be divided into two rocesses: the imroving rocess and the exloration rocess. Once the criterion is imroved after a cycle, i.e., flag im = 1, the search rocess will be turned to the imroving rocess; on the other hand, if no imrovement is made in a criterion after a cycle, i.e., flag im = 0, the search rocess will be turned to the exloration rocess. The imroving rocess is intended to raidly find a locally otimal design, while the exloration rocess is intended to hel the algorithm escae from a locally otimal design. The maximum number of cycles is used as the stoing criterion. The outer loo controls the otimization rocess by udating the value of the threshold T h. At the beginning of the otimization rocess, T h is set to be a small value, i.e., T h0 = 0.005 criterion value of the initial design. Unlike the original SE, in the imroving rocess of ESE, the value of T h will not be fixed to T h0, rather it will be adusted and maintained on a small value that is suitable to a secific roblem based on n ac, the number of acceted designs in the inner loo, and n im, the number of better designs found in the inner loo. In the exloration rocess, T h is increased and decreased in a relatively large range based on n ac. Based on our tests, the following roosed schedules for controlling T h is found to work very well for different exerimental design roblems: 1. In the imroving rocess, T h is maintained on a small value so that only better design or slightly worse design will be acceted. Secifically, T h will be decreased if the accetance ratio n act is larger than a small ercentage (e.g., 0.1) of the number of total designs J and n im is less than n act ; T h will be increased otherwise. The following equations are used in our algorithm to decrease and increase T h, resectively, T' = α1t and T ' = T / α1, where 0 <α 1 < 1. The setting of α 1 = 0.8 aears to work well in all tests... In the exloration rocess, T h will fluctuate within a range based on the value of n act. If n act is less than a small ercentage (e.g., 0.1) of J, T h will be raidly increased until n act is larger than a large ercentage (e.g. 0.8) of J. If this haens, T h will be slowly decreased until n act is less than the small ercentage. This rocess will be reeated until an imroved design is found. The following equations are used to decrease and increase T h, resectively, T' = α T and T ' = T / α 3, where 0 < α 3 < α < 1. Based on our exerience, we set α = 0.9 and α 3 = 0.7. T h is increased raidly (so that more worse designs could be acceted) to hel moving away from a locally otimal design. T h is decreased slowly for searching better designs after moving away from the local otimal design. 3.. Efficient Algorithms for Evaluating Otimality Criteria As an otimality criterion is reeatedly evaluated whenever a new design of exeriments is constructed, the efficiency of this evaluation becomes critical for otimizing the design of exeriment within a reasonable time frame. In this work, we roose efficient evaluation methods that take into account the feature of our udating oeration, i.e., when using columnwise element-exchanges for generating new designs, only two elements in the design matrix are involved each time. The evaluations of otimal criteria, such as φ criterion, the entroy criterion, and the CL criterion, involve different tyes of matrices (e.g., the inter-distance matrix D, the correlation matrix R, and the discreancy matrix C, resectively). Reevaluating all the elements in the matrices each time is not affordable, esecially if the matrix size is large (determined by the number of exeriments and number of factors). Comlete descritions of our algorithms for the aforementioned criteria 5 Coyright 003 by ASME

can be found in Jin (003). Illustration is only rovided here for the algorithm associated with the φ criterion. The comutational savings for all algorithms will be summarized. The re-evaluation of φ based on Eq. 4 includes three arts, i.e., the evaluation of all the inter-site distances, the sorting of those inter-site distances to obtain a distance list and index list, and the evaluation of φ. The evaluation of all the inter-site distances will take O(mn ), the sorting will take O(n log (n)) (c.f. Press, et al, 1997), and the evaluation of φ will take O(s log ()) (since is an integer, -owers can be comuted by reeated multilications). In total, the comutational comlexity will be O(mn )+O(n log (n))+o(s log ()). Therefore, re-evaluating φ will be very time-consuming. Before introducing the new algorithm, a new equation of φ is first rovided, which hels develo an efficient evaluation algorithm by avoiding the sorting required by Eq. 4. Let D = [ d i ] n n be a symmetric matrix, whose elements are the inter-site distances of the current design X, the new equation, called -norm form here, is exressed by: 1/ = ( 1/ di ) = di 1 i< n 1 i< n 1/ φ. (10) The equivalence between this form and Eq. 4 can be easily roved, which is omitted here. Our new algorithm takes into account the fact that after an exchange ( x i k x 1 ik ), only elements in rows i 1 and i and columns i 1 and i are changed in D matrix. For any 1 nand i1, i, let: then: and t = xi k x k x i1k t k s( i1, i, k, ) x, (11) d i i = 1 1 i + t 1/ [ d s( i, i, k, ] t ' = d ' 1 ) (1) 1 t 1/ [ d s( i, i, k, ] t di ' = d i ' = i 1 ). (13) With the above reresentation, the comutational comlexity of udating the elements in D matrix is O(n). The new φ can be comuted by: [( di ') di ] + [( di ' di ] ' φ = φ + 1 1 ), (14) 1 n, i1, i 1 n, i1, i of which the comutational comlexity is O(n log ()). The total comutational comlexity of the new algorithm is O(n)+O(n log ()). This results in significant reduction of comutation comared to re-evaluating φ. The new algorithm for evaluating the entroy criterion involves a new Cholesky decomosition algorithm, while that for evaluating the CL criterion emloys a similar idea as that for the φ criterion. A comarison of the comutational comlexity of totally re-evaluating all elements in matrices and those of our new algorithms are summarized in Table 1. From the table, we find that for the φ criterion and the CL criterion, with the new algorithms, the efficiency can be significantly imroved. The new comutational comlexity is close to O(n) in both cases. However, for the entroy criterion, because of the 1/ involvement of matrix determinant calculation, the efficiency is not imroved dramatically (comlexity larger than O(n )). Table 1. Comutational Comlexity of Criterion Evaluation φ CL Entroy Re-evaluating O(mn )+O(n log (n)) Algorithms +O(s O(mn log ) O(n 3 )+O(mn ) ()) New O(n)+O(n log Algorithms ()) O(n) O(n )+O(n) ~ O(n 3 )+O(n) Table rovides illustrative results on the efficiency. The ratio between the time (T r ) needed to totally re-evaluating all matrix elements and the time (T n ) needed by our new algorithm shows the imrovement. The emirical results match with our analytical examinations earlier. We also found that the larger the size of an exerimental design, the more savings the algorithm will make. Comared to other two algorithms, the entroy criterion is much less efficient. It is also observed that with the new algorithms, the comuting time for the φ criterion is.3~3.0 times as much as that for the CL criterion. Table. Comuting Time (secs) of Criteria for 500,000 LHDs T r stands for the time needed to totally re-evaluating the criterion value of a LHD for 500,000 times. T n stands for the time needed to construct 500,000 different LHDs by element-exchanges and comute their criterion values by the roosed criterion-evaluation algorithm in ESE. φ ( = 50, t =1) CL Entroy (θ = 5, t =) T r T n T rt/t n T r T n T rt/t n T r T n T rt/t n 1 4 LHDs 1. 5.5. 10.7.4 4.5 16.6 14. 1. 5 4 LHDs 53.0 10.1 5. 41.5 3.4 1.1 75.3 39.8 1.9 50 5 LHDs 39 19.8 1.1 197 6.5 30.3 347 167.1 100 10 LHDs 1378 45. 30.5 1305 15.9 8.1 116 101.1 While the construction algorithm roosed is suitable for otimizing designs with the balance roerty, it can also be extended to otimizing designs with other secial structural roerties. For instance, it can be extended to obtain otimal OAs or OLs and maintain the orthogonality roerty. In that case, new udating oerations will be used to relace simle element-exchanges in a column so as to retain the orthogonality roerty required by OAs or OLs. The descrition of this extension can be found in Jin, 003. 4. TEST RESULTS AND COMPARATIVE STUDY Our roosed algorithm can be used for otimizing various classes of designs of exeriments, including but not limited to LHDs, general balanced designs, OAs, and OLs. Here we rovide two examles of otimal designs based on the CL criterion. The first one, as shown in Fig. 3, is an otimal balanced design, in which the factors 1~4 have 16 levels (equal to number of runs) and factors 5 and 6 have 4 levels. Fig. 5 shows its roection onto 4 th (16 levels) and 5 th (4 levels) factors. From the figure, we can find that the balance roerty is retained, i.e., 4 th factor is exlored once in each of the 16 bins and 5 th factor is exlored 4 times in each of the 4 bins. The second examle, as shown in Fig. 4, is an otimal OL 16 (4 5 ; ). Fig. 6 shows the roection of the otimal OL 16 (4 5 ) onto the subsace of 4 th and 5 th factors. Factors 4 and 5 get exlored once in each of 4 4 squares and each of them individually gets exlored once in each of 16 equal bins (not shown in the figure). 6 Coyright 003 by ASME

4 1 11 11 0 1 0 9 8 15 6 13 5 1 3 3 14 6 8 3 0 5 3 0 7 3 1 0 5 9 3 9 5 14 1 0 14 7 1 13 3 1 11 10 9 6 0 0 7 14 3 0 10 15 15 10 1 15 4 7 0 6 1 3 1 0 3 1 8 4 0 1 1 8 11 10 1 3 3 13 13 1 4 1 Figure 3. Otimal U 16(16 4 4 ) Based on the CL Criterion (CL = 0.0365) 3.5.5 1.5 0.5 15 14 6 8 10 1 9 13 14 14 5 15 6 6 13 9 1 3 6 15 8 5 1 4 3 7 0 5 7 4 0 10 1 5 8 14 15 9 8 1 13 1 13 10 0 3 4 8 9 6 5 14 0 11 11 11 11 4 1 13 1 7 1 1 9 0 7 10 3 11 3 10 4 7 15 Figure 4. Otimal OL 16(4 5 ; ) Based on the CL Criterion (CL = 0.01364) -0.5-0.5-0.5 7.5 15.5-0.5 3.5 7.5 11.5 15.5 Figure 5. Proection of the Otimal U 16(16 4 4 ) onto the Subsace of 4 th and 5 th Factors 15.5 11.5 7.5 3.5 Figure 6. Proection of the Otimal OL 16(4 5 ; )onto the Subsace of 4 th and 5 th Factors In the rest of this section, we will demonstrate the erformance and efficiency of the roosed algorithm by comaring it with the existing algorithms. The comarative study below will focus on otimal LHDs, which have been widely alied and studied in the literature. In Section 4.1, we will comare the comuting time of our method with the CP algorithm based on the results resented in the literature. This is followed in Section 4. with a more comrehensive comarison of erformance and efficiency between our roosed ECE algorithm, and the existing CP and the SA algorithms. In Section 4.3, we verify the quality of the otimal designs of exeriments obtained by our method by comaring the achieved otimality criterion with that from random designs. 4.1. A Preliminary Comarison Savings in Comuting Time In the following, an illustrative comarison between our roosed ESE algorithm and the CP algorithm resented in Ye, et al, 000 is rovided to show the significant savings achieved by our method. It should be noted that besides using the ESE algorithm, our method also emloys efficient algorithms for evaluating otimality criteria; the savings in comuting time is a combination of both. The comarison is for otimal 5 4 LHDs constructed based on the φ criterion ( = 50 and t = 1). It should be noted that even though Ye, et al used the φ criterion (with the same arameter settings as in our tests) as the otimality criterion in constructing otimal LHDs, their results were reorted in the form of (maximizing) the minimum L 1 distance (the larger the better), which, as discussed before, is strongly related to but not totally in accord with the φ value (the smaller the better). To be consistent, the results of our roosed ESE are also in the form of minimum L 1 distance in Table 3. Table 3. ESE vs. CP for Constructing Otimal 5 4 LHDs Based on φ Criterion ( = 50 and t = 1) N e stands for number of exchanges (shown in thousands). The results for CP are from Ye, et al (000, based on Sun SPARC 0 Workstation). In the CP test, 100 cycles (shown in the arentheses following the numbers of exchanges) are used. ESE is tested on a PC with a Pentium III 650 MHZ CPU. N e in Thous Min L 1 Distance Comuting Time CP 4(100) 0.8750 10.63 hr ESE 10 0.9167.5 sec In the work of Ye, et al, the otimization rocess was reeated for 100 cycles starting from different random LHDs and the design with the largest minimum L 1 distance of the 100 constructed otimal designs was reorted as the final otimal design. The number of exchanges or comuting time of CP is the total number or time used in the 100 cycles. From the results, it is found that the designs constructed by our ESE with less than.5 seconds is better than those constructed by CP with around 10.63 hours. In fact, ESE is tested for many times and the minimum distances are consistently larger than or equal to 0.9167. The saving of comuting time is dramatic even if the difference between the comuting latforms is considered. As introduced earlier, such a good efficiency is achieved by: Imroving the efficiency of criterion evaluation (5 times faster than totally re-evaluating for the examle test case; more significant imrovement for larger size designs, see Table ); Using fewer exchanges with ESE to search an otimal design (10,000 with ESE Vs,41,900 with CP). The test results match with our theoretical examinations of the efficiency of the algorithms for criterion evaluations shown in Section 3.. The following comarative study is to further demonstrate the erformance of ESE algorithm in saving the numbers of exchanges. 4.. A Further Comarison of Performance and Efficiency We tested two oular criteria, i.e., the φ criterion ( = 50 and t = 1) and the CL criterion in our comarative study. For SA, the tests are limited to the φ criterion since the arameter settings of SA were originally rovided by Morris and Mitchell (1995) to suit the φ criterion. For CP, both criteria are tested since there are no secial arameters to be set. Due to the sace limitation, we only discuss the results from using the φ criterion here. Our imlementations of CP and SA in the test are based on the algorithms roosed in Li and Wu (1997) and in Morris and Mitchell (1995), resectively (see the descrition in Section.3). One maor change in our imlementations of CP and SA is that our roosed algorithms for criterion evaluation described in Section 3. are used instead of reevaluating all matrix elements as currently being done in the literature. Thus, our imlementations run much faster than the original imlementations in the literature. The imlementation of SA, including arameter settings, is the same as Morris and Mitchell s. Even though we attemt to reroduce the original imlementations of CP and SA in the literature, the results from the two imlementations may not be exactly the same as in the literature. The tests are conducted on two sets of LHDs of relatively small sizes, i.e., 1 4 and 5 4, and two sets of LHDs of relatively large sizes, i.e., 50 5 and 100 10. As randomness is involved in all constructing algorithms, we reeat the same test 7 Coyright 003 by ASME

for 100 times starting from different initial LHDs. On each set of LHDs, two tyes of comarison are made, i.e., Tye-I: Comaring the erformance of ESE with that of SA and CP in terms of the average of criterion values of otimal designs with nearly the same numbers of exchanges. Tye-II: Comaring the efficiency of ESE with that of SA and CP in terms of numbers of exchanges needed for ESE to achieve otimal designs with the average of criterion values slightly better than that of SA or CP. In both tyes of comarison, t-test is used to statistically comare the average criterion value of the otimal designs generated by ESE with those generated by SA or CP. The - value is used to measure the level at which the observed difference (< 0) between the average criterion values is statistically significant. The smaller the -values are, the more statistical significance it has. While the standard in scientific research is that the -value should be below 5%, here we use a much tighter standard that the -value should be smaller than 0.001%. For tye-i comarison, this standard is not that critical since virtually all the -values in the comarison are much smaller than 0.001%; for tye-ii comarison, however, this standard is used to udge whether otimal designs generated by ESE are close to but still statistically significantly better than those generated by SA or CP. Corresonding to two tyes of comarisons, two grous of tests for ESE are erformed. For tye-i comarison, ESE is terminated at a number of exchanges close to that of SA or CP and this grou of tests for ESE is denoted as ESE (I); for tye-ii comarison, ESE is terminated when -values are smaller than 0.001%, and this grou of tests for ESE is denoted as ESE (II). We have also used ermutation test (c.f., Good, 000) for the comarison, which is a hyothesis test based on re-samling from randomly ermuted samle data. Unlike t-test, ermutation test does not assume a normal distribution. The -values achieved from ermutation test (not resented here) are similar to those from t-test. 4..1. Results of Small Sizes of Designs For small-sized LHDs, relatively large number of exchanges is affordable. For examle, with,865,600 exchanges, it takes ESE about 57 seconds to construct an otimal 5 4 LHDs based on the φ criterion. The tests for small-sized roblems are therefore focused on the caability of moving away from locally otimal designs and finding better exerimental designs given a large number of exchanges. The results of using the φ criterion are shown in Table 4. For each algorithm, two sets of tests with different numbers of exchanges are conducted. For SA, the two sets of tests corresond to two different values for cooling factor α suggested by Morris and Mitchell (1995), i.e., α = 0.90 (faster cooling) and 0.95 (slower cooling), resectively. In a articular set of tests, the numbers of exchanges of SA for constructing otimal designs will differ test by test. For instance, for 1 4 LHD and α = 0.95, the numbers of exchanges could be anywhere between 36,384 and 1,19,48. The numbers of exchanges of SA shown in the table are the average numbers. CP is terminated at a cycle number N s, which is selected so that the average number of exchanges is close to that of SA. The numbers of exchanges shown are also the average of 100 tests. The results of SA are also used to determine when to sto ESE. Table 4. Test Results of otimal 1 4 LHDs and 5 4 LHDs based on φ criterion ( =50 and t =1) For SA, Sets 1 & corresond to α = 0.90 and α = 0.95, resectively. N e (shown in thousands) stands for the average numbers of exchanges of 100 tests in each set of tests. For CP, cycle numbers N s are given in the arentheses following the average numbers of exchanges. Set 1 Set φ criterion N e in Thous Mean(STD) N e in Thous Mean(STD) 1 4 LHDs 5 4 LHDs SA 89 0.8569(0.0131) 53 0.8505(0.0133) CP 9(154) 0.8581(0.008) 530(80) 0.8546(0.0096) ESE (I) 86 0.8384(0.0057) 50 0.836(0.0041) ESE (II) 96 0.8483(0.0114) 174 0.846(0.0084) SA 1416 1.105(0.0101) 74 1.1149(0.0103) CP 144(65) 1.1495(0.0078) 744(14) 1.1455(0.0070) ESE (I) 1416 1.1051(0.0060) 74 1.0989(0.0051) ESE (II) 470 1.1150(0.007) 840 1.107(0.007) For tye-i comarison, error-bar lots (Figs. 7 and 8) are used to dislay the mean and variability of achieved φ values from 100 tests. The error bars (thick vertical lines) are each drawn a distance of one STD above and below the mean value. For each algorithm, a mean-line links the middles (i.e., the means) of error-bars. The dash error-bars and mean-lines are for the results of SA. From the figures, it is found that with similar number of exchanges, on average the roosed ESE always achieves better designs than both SA and CP with resect to the φ criterion. This is also confirmed statistically by the -values in t-tests, which are all smaller than 1.0e-15. Furthermore, ESE is more efficient than both SA and CP. Table 4 shows that to obtain a statistically significantly better design for both 1 4 and 5 4 LHDs, ESE needs less than 1/3 of exchanges used in SA and in CP. hi 0.88 0.87 0.86 0.85 0.84 a = 0.9 CP SA ESE 1x4 LHD a = 0.95 0.83.5 3 3.5 4 4.5 5 5.5 Exchange Number x 10 5 Figure 7. Tye-I Comarison for 1 4 LHDs (φ criterion) hi 1.16 1.14 1.1 1.1 a = 0.9 CP SA ESE 5x4 LHD a = 0.95 1 1.5.5 3 Exchange Number x 10 6 Figure 8. Tye-I Comarison for 5 4 LHDs (φ criterion) When using the CL criterion, it is found that ESE uses around 1/3 ~ 1/ exchanges used in CP for 1 4 LHDs and around 1/6 ~ 1/ of exchanges used in CP for 5 4 LHDs to achieve statistically significantly better designs. 4... Results for Large Sizes of Designs The comutational cost of constructing an otimal design of large sizes is much larger than that of small sizes. For largesize designs, our goal is to find a good design using limited comutational resource. Our comarison focuses on how efficient our algorithm is comared to others by using the same amount of reasonable numbers of exchanges, which are considered as small in relative to the size of the LHDs. For large-sized designs, SA in general converges much more slowly than CP and ESE. Therefore with the numbers of exchanges that are small relative to the size of design, the SA search rocess will not be able converge before the maximum 8 Coyright 003 by ASME

number of exchanges is reached. As the result, the design generated by SA could be much inferior to those generated by CP and ESE. For instance, for 50 5 LHDs based on the φ criterion, with around 1,50,000 exchanges, the average criterion value of SA (α = 0.9) is 1.4658 in comarison with 0.9875 for ESE and 1.03 for CP. Therefore for large roblems, SA may not be suitable since it needs excessive numbers of exchanges. Our test for large-sized designs will only focus on CP and ESE. CP rovides baselines for determining when to sto ESE in both tyes of comarisons. For large-sized roblems, the comutational cost could be too high for CP to even finish a single cycle. For instance, a single cycle of CP for 100 10 LHD with φ criterion could take 31,48,000 exchanges (,758 seconds). Therefore, the tests of CP for large-sized LHDs have been restricted to at most several cycles for 50 5 LHDs and one cycle for 100 10 LHDs. Table 5 shows the maximum numbers of exchanges and the comuting time. From the table we can find that the comuting time has been limited to merely several minutes (if not seconds). Table 5. Maximum Number of exchanges and Comuting Time for Constructing Otimal LHDs N e stands for number of exchanges. T c stands for comuting time. The data shown are for ESE. The data for CP are similar. φ CL Max N e 1000 Max T c Max N e 1000 Max T c 50 5 LHDs 1945 77 sec 960 35 sec 100 10 LHDs 500 19 sec 7685 198 sec As shown in Table 6, for each algorithm, three sets of tests with different numbers of exchanges are erformed. For 50 5 LHDs, the numbers of exchanges of the first set of tests are not sufficient to finish one cycle; the second set of tests involves exactly one cycle and the numbers of exchanges are the average of the 100 tests; likewise, the third set of tests involves exactly 5 cycles. For 100 10 LHDs, even though large numbers of exchanges are used for CP in all three sets of tests, they are not sufficient to finish the first cycle. Table 6. Test Results of otimal 50 5 LHDs and 100 10 LHDs based on φ criterion ( =50, t =1) N e stands for the average numbers of exchanges. For CP, the cycle numbers are rovided in the arentheses following the numbers of exchanges. If there are no cycle numbers marked, it means that CP is stoed within the first cycle Set 1 Set Set 3 φ criterion N e in Mean N e in Mean N e in Mean Thous (STD) Thous (STD) Thous (STD) 50 5 LHDs 100 10 LHDs CP 61 ESE (I) 60 ESE (II) 10 CP 97 ESE (I) 80 ESE (II) 10 1.1564 (0.011) 1.0486 (0.007) 1.164 (0.0099) 0.5381 (0.0044) 0.456 (0.001) 0.514 (0.0031) 404 (1) 400 80 545 500 0 1.040 (0.0097) 1.0076 (0.0059) 1.0348 (0.0069) 0.5059 (0.004) 0.455 (0.0014) 0.4996 (0.005) 1948 (5) 1945 110 55 500 140 1.0311 (0.0068) 0.9850 (0.0038) 1.048 (0.0063) 0.4660 (0.0014) 0.4440 (0.0010) 0.4634 (0.0015) The means and variability of the achieved φ values for 50 5 LHDs and 100 10 are shown in Figs. 9 and 10 for tyes-i comarison. From the figures, it is found that ESE consistently outerforms CP, which is also confirmed by t-tests (-values are all smaller than 1.0e-15). From Table 6, it is observed that ESE is much more efficient than CP. To reach statistically significantly better designs than CP, ESE needs only around 1/17~ 1/5 of exchanges used in CP for 50 5 LHDs and 1/9 ~ 1/18 of exchanges used in CP for 100 10 LHDs. hi 1.5 1. 1.15 1.1 1.05 1 CP ESE 50x5 LHD 0.95 0 0.5 1 1.5 Exchange Number x 10 6 Figure 9. Tye-I Comarison for 50 5 LHDs (φ criterion) hi 0.58 0.56 0.54 0.5 0.5 0.48 0.46 100x10 LHD CP ESE 0.44 0 0.5 1 1.5.5 3 Exchange Number x 10 6 Figure 10. Tye-I Comarison for 100 10 LHDs(φ criterion) Similar tests to the above have been carried out for the CL criterion. It is found that ESE consistently outerforms CP, which is confirmed by t-tests (-values are all smaller than 1.0e-15). It is observed that ESE is much more efficient than CP. To reach statistically significantly better designs than CP, ESE needs only around 1/3 ~ 1/4 of exchanges used in CP for 50 5 LHDs and 1/33 ~ 1/10 of exchanges used in CP for 100 10 LHDs. 4.3 Verifications by Comaring with Random Designs For large-sized designs, we have used small number of exchanges in relative to the size of designs. As the global otimal design is never known, one way to assess the quality of these otimal designs are to estimate the robability of a random design being better than an otimal design, i.e., F( tot ) = P( X tot ), (15) where X, a random variable, stands for criterion values of random designs and t ot is the criterion value of an otimal design (here the mean of the criterion values of a set of otimal designs are used for t ot ). If the robability is trivial, we could consider the otimal design has a significantly low criterion value. This evaluation deends on the cumulative distribution function (CDF) F(x) of the criterion values of random designs, which generally can only be estimated by Monte-Carlo methods. For 50 5 LHDs, 0,000,000 random designs are generated and their φ and CL values are comuted. It is found that the average criterion value (e.g., 0.9850 for φ, as shown in set 3 in Table 6) of the otimal designs by ESE is far beyond the left tails of the emirical cumulative distribution curves. To estimate the robability in Eq. 15, an aroximation curve has been used to extend the left tail of the emirical cumulative distribution curve. It is obtained that for the φ criterion, the robability P ( φ < tot ) is roughly in 10-15 magnitude; for the CL criterion, the robability P ( CL < tot ) is roughly in 10-19 magnitude. Similar observations can also be obtained for 100 10 LHDs. We then can conclude that for large-sized LHDs, those otimal designs constructed by ESE generally have significantly low φ values or CL values. 9 Coyright 003 by ASME