
Design and Analysis of Computer Experiments for Screening Input Variables

Dissertation Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By Hyejung Moon, M.S.
Graduate Program in Statistics
The Ohio State University
2010

Dissertation Committee: Thomas J. Santner, Co-Adviser; Angela M. Dean, Co-Adviser; William I. Notz

© Copyright by Hyejung Moon, 2010

ABSTRACT

A computer model is a computer code that implements a mathematical model of a physical process. A computer code is often complicated and can involve a large number of inputs, so it may take hours or days to produce a single response. Screening to determine the most active inputs is critical for reducing the number of future code runs required to understand the detailed input-output relationship, since the computer model is typically complex and the exact functional form of the input-output relationship is unknown. This dissertation proposes a new screening method that identifies active inputs in a computer experiment setting. It describes a Bayesian computation of sensitivity indices as screening measures. It provides algorithms for generating desirable designs for successful screening.

The proposed screening method is called GSinCE (Group Screening in Computer Experiments). The GSinCE procedure is based on a two-stage group screening approach, in which groups of inputs are investigated in the first stage, and then only the inputs within those groups identified as active at the first stage are investigated individually at the second stage. Two-stage designs with desirable properties are constructed to implement the procedure. Sensitivity indices are used to measure the effects of inputs on the response. Inputs with large sensitivity indices are identified by comparison with a benchmark null distribution constructed from user-specified, low-impact inputs. The use of low-impact inputs is helpful for screening out inputs having small effects as well as those that are totally inert. Simulated examples show that, compared with one-stage procedures, the GSinCE procedure provides accurate screening while reducing computational effort.

In this dissertation, the sensitivity indices used as screening measures are computed in a Gaussian process model framework. This approach is computationally efficient, requiring only small numbers of expensive computer code runs for the estimation of sensitivity indices. The existing approach for quantitative inputs is extended so that sensitivity indices can be computed when the inputs include a qualitative input in addition to quantitative inputs. An orthogonal design, in which the design matrix has uncorrelated columns, is important for estimating the effects of inputs. Moreover, a space-filling design, for which the design points are well spread out, is needed to explore the experimental region thoroughly. New algorithms for achieving such orthogonal space-filling designs are proposed in this dissertation. Software is provided for the proposed GSinCE procedure, the computation of sensitivity indices, and the design search algorithms.

This is dedicated to my daughter Moonyoung, son Nathan, husband Jungick, and parents.

ACKNOWLEDGMENTS

I would first like to express my gratitude to my co-advisors, Professor Thomas Santner and Professor Angela Dean. They have given me tremendous help in my professional development and great guidance in my life. They are very special teachers and mentors to me. I am truly grateful for the effort that they have put into my education and the time that they have shared with me. I would also like to thank Professor William Notz for his helpful comments and support as a member of my dissertation committee.

I want to give special thanks to my parents for their love and support. Without their help and sacrifices, my husband Jungick and I could not have finished our Ph.D. studies at the same time. I would also like to thank Jungick for his love and for every moment that we have shared during our Ph.D. studies. I am most thankful to my precious little ones, daughter Moonyoung and son Nathan. They have given me all the happiness, hope, and strength to do my best in my life.

VITA

October 1977: Born, Korea
2000: B.S. Statistics, Korea University
2000 to 2004: Statistician, The Bank of Korea
2006: M.S. Statistics, The Ohio State University
2005 to present: Graduate Research Associate and Graduate Teaching Associate, The Ohio State University

FIELDS OF STUDY

Major Field: Statistics

TABLE OF CONTENTS

Abstract
Dedication
Acknowledgments
Vita
List of Tables
List of Figures

Chapters:

1. Introduction
   1.1 Computer Experiments
   1.2 Gaussian Stochastic Process Model
   1.3 Screening Procedure
       1.3.1 Screening in Computer Experiments
       1.3.2 Group Screening in Physical Experiments
   1.4 Design of Computer Experiments
   1.5 Overview of Dissertation

2. Two-stage Sensitivity-based Group Screening in Computer Experiments
   2.1 Introduction
       2.1.1 Background
       2.1.2 Overview of the Proposed Procedure
   2.2 GSinCE Initialization Stage
   2.3 GSinCE Procedure Stage 1
       2.3.1 Stage 1 Sampling Phase
       2.3.2 Stage 1 Grouping Phase
       2.3.3 Stage 1 Analysis Phase
   2.4 GSinCE Procedure Stage 2
       2.4.1 Stage 2 Sampling Phase
       2.4.2 Stage 2 Analysis Phase

3. Performance of GSinCE
   3.1 Simulation Studies to Set τ
       3.1.1 Simulations for f = 20
       3.1.2 Simulations for f = 30
       3.1.3 Simulations for f = 10
       3.1.4 Summary of Simulation Studies
   3.2 Application of GSinCE in Least Favorable Cases
       3.2.1 Small Percentage of Active Inputs
       3.2.2 Non-linear Functions
       3.2.3 Detecting Large Effects
   3.3 Properties of Two-stage Designs
       3.3.1 Augmented Design
       3.3.2 Combined Design at Stage 2

4. Application of GSinCE
   4.1 Examples from the Literature
       4.1.1 Borehole Model
       4.1.2 A Model for the Weight of an Aircraft Wing
       4.1.3 OTL Circuit Model
       4.1.4 Piston Simulator Model
       4.1.5 Summary
   4.2 A Real Computer Experiment: FRAPCON Model
       4.2.1 Description of Code
       4.2.2 Use of GSinCE
       4.2.3 Implementations

5. Computation of Sensitivity Indices
   5.1 Sensitivity Indices of Quantitative Inputs
       5.1.1 Definition of Sensitivity Indices
       5.1.2 Estimation in Gaussian Process Framework
       5.1.3 The Integrals: sgint, dbint, mxint
       5.1.4 Example
   5.2 Sensitivity Indices of Mixed Inputs
       5.2.1 Setup
       5.2.2 Correlation Function for Mixed Inputs
       5.2.3 Estimation of Sensitivity Indices for Mixed Inputs
       5.2.4 Example

6. Algorithms for Generating Maximin Latin Hypercube and Orthogonal Designs
   6.1 Introduction
   6.2 Maximin Criteria for Space-filling Designs
   6.3 Algorithms for Space-filling Latin Hypercube Designs
       6.3.1 Complete Search and Random Generation
       6.3.2 Random Swap Methods for Maximin LHDs
       6.3.3 A Smart Swap Method for Maximin LHDs
   6.4 Algorithms for Orthogonal Maximin Designs
       6.4.1 Orthogonal Maximin LHDs
       6.4.2 Orthogonal Maximin Gram-Schmidt Designs
   6.5 Comparisons
       6.5.1 Maximin LHDs
       6.5.2 Orthogonal Maximin Designs
   6.6 Summary

7. Alternative Two-stage Designs
   7.1 Orthogonal Array-based Latin Hypercube Design
   7.2 Stage 1 Design for a Two-stage Group Screening Procedure
       7.2.1 Construction
       7.2.2 Secondary Criteria
   7.3 Stage 2 Design for a Two-stage Group Screening Procedure
   7.4 Limitations
       7.4.1 Availability of OA-based LHD
       7.4.2 Group Variable Defined by Averaging

8. Software
   8.1 GSinCE Code
   8.2 Sensitivity Code
   8.3 Maximin Code

Bibliography

LIST OF TABLES

3.1 Marginal probabilities and coefficient distributions for the simulation study
3.2 Six combinations used to recommend τ
3.3 Median and IQR values of the performance measures, and average number of groups and average total runs, over 200 test functions with about 25% of active inputs among f = 20 inputs, for each τ in each combination; the value in parentheses is the number of test functions generated with no active inputs
3.4 Modified values of q_L and q_NN; other probabilities are as in Table 3.1, to achieve about 25% of f = 30 inputs active
3.5 Median values of the performance measures, and median/average values of true/claimed active inputs, over 50 test functions with about 25% of active inputs among f = 30 inputs; the value in parentheses is the number of test functions generated with no active inputs
3.6 Modified values of q_L to achieve about 25% and 35% of f = 10 inputs active, while keeping other probabilities as in Table 3.1
3.7 Median values of the performance measures, and median/average values of true/claimed active inputs, over 100 test functions with about 25% of active inputs among f = 10 inputs; the value in parentheses is the number of test functions generated with no active inputs
3.8 Median values of the performance measures, and median/average values of true/claimed active inputs, over 100 test functions with about 35% of active inputs among f = 10 inputs; the value in parentheses is the number of test functions generated with no active inputs
3.9 Median values of the performance measures, and median/average values of true/claimed active inputs, over 100 test functions with about 20% of active inputs among f = 10 inputs; the value in parentheses is the number of test functions generated with no active inputs
3.10 Median values of FDR, FNDR, specificity, and sensitivity over 30 functions having small percentages of active inputs
3.11 All coefficients of test function (3.5)
3.12 Results of automatic grouping and applying GSinCE for test function (3.5)
3.13 Results of the original and modified procedures and the one-stage method
3.14 Comparisons of median values of FDR, FNDR, specificity, and sensitivity over 30 non-linear functions for the original and modified procedures
3.15 Results of automatic grouping and applying GSinCE in favorable situations
3.16 Minimum inter-point distance and computation time of two design methods, and the estimated TESIs based on each of these designs
4.1 Grouping and active effect selection by GSinCE for the borehole model
4.2 Computation times and active effect selection of the four procedures for the borehole model using 70 runs
4.3 Grouping and active effect selection by GSinCE for the aircraft wing weight model
4.4 Computation times and active effect selection of the four procedures for the aircraft wing weight model using 65 runs
4.5 Grouping and active effect selection by GSinCE for the OTL circuit model
4.6 Computation times and active effect selection of the four procedures for the OTL circuit model using 60 runs
4.7 Grouping and active effect selection by GSinCE for the piston model
4.8 Computation times and active effect selection of the four procedures for the piston model using 85 runs
4.9 Summary of screening for all outputs based on grouping by EDA
4.10 Stage 1 grouping by EDA and selection for y_1
4.11 Stage 1 grouping by EDA and selection for y_2
4.12 Stage 1 grouping by EDA and selection for y_3
4.13 Stage 1 grouping by EDA and selection for y_4
4.14 Grouping by expert
4.15 Summary of screening for all outputs based on grouping by expert
4.16 Construction of subgroups within a group made by expert
4.17 Summary of screening for all outputs based on grouping by expert and EDA
4.18 Summary of screening for all groupings
5.1 Estimated sensitivity indices for the example function in (5.46) using different correlation functions and approaches
5.2 Estimated sensitivity indices for the example function in (5.84)
6.1 Characteristics of the best (n, k) = (9, 4) designs formed using criterion d_min^(2): ϕ_15 is the Morris and Mitchell (1995) objective function with p = 15; ρ²_ave is the average squared correlation; d_min^(4) is the minimum 4-dimensional rectangular distance; ρ_max is the maximum absolute correlation; T is the number of starting designs
6.2 Characteristics of the best (n, k) = (40, 5) designs formed using criterion d_min^(2): ϕ_15 is the Morris and Mitchell (1995) objective function with p = 15; ρ²_ave is the average squared correlation; d_min^(5) is the minimum 5-dimensional rectangular distance; ρ_max is the maximum absolute correlation; T is the number of starting designs
6.3 Best orthogonal maximin 9 × 4 designs found by the OMLHD, OSGSD-ϕ_15, and OSGSD-d_min^(2) algorithms based on 7 minutes of computational time, and scatterplot matrices of these designs
6.4 Comparisons of the best designs found by the OMLHD, OSGSD-ϕ_15, and OSGSD-d_min^(2) algorithms based on 7 minutes of computational time: ϕ_15 is the Morris and Mitchell (1995) objective function with p = 15; ρ²_ave is the average squared correlation; d_min^(4) is the minimum 4-dimensional rectangular distance; ρ_max is the maximum absolute correlation; T is the number of starting designs
6.5 Distributions of ϕ_15 and d_min^(4) values in 100 9 × 4 designs produced by the OSGSD-ϕ_15 algorithm (4 seconds of computation), with corresponding values of the best scaled OMLHD design indicated by horizontal lines
6.6 Comparisons of the best designs found by the OMLHD, OSGSD-ϕ_15, and OSGSD-d_min^(2) algorithms based on 9 hours and 42 minutes of computational time: ϕ_15 is the Morris and Mitchell (1995) objective function with p = 15; ρ²_ave is the average squared correlation; d_min^(5) is the minimum 5-dimensional rectangular distance; ρ_max is the maximum absolute correlation; T is the number of starting designs
6.7 Distributions of ϕ_15 and d_min^(5) values in 100 40 × 5 designs produced by the OSGSD-ϕ_15 algorithm (349 seconds of computation), with corresponding values of the best scaled OMLHD design indicated by horizontal lines
7.1 Secondary design criteria for the Stage 1 design
7.2 The 40 × 4 Stage 1 design X^(1)
7.3 Ranges and estimated TESIs of 2 groups under 3 different groupings

LIST OF FIGURES

3.1 Median values of FDR (line with circle), FNDR (line with asterisk), specificity (line with cross), and sensitivity (line with diamond) over 200 test functions versus τ × 100%, for functions with about 25% of active inputs among f = 20 inputs
3.2 Median values of FDR (line with circle), FNDR (line with asterisk), specificity (line with cross), and sensitivity (line with diamond) over 50 test functions versus τ × 100%, for functions with about 25% of active inputs among f = 30 inputs
3.3 Median values of FDR (line with circle), FNDR (line with asterisk), specificity (line with cross), and sensitivity (line with diamond) over 100 test functions versus τ × 100%, for functions with about 25% of active inputs among f = 10 inputs
3.4 Median values of FDR (line with circle), FNDR (line with asterisk), specificity (line with cross), and sensitivity (line with diamond) over 100 test functions versus τ × 100%, for functions with about 35% of active inputs among f = 10 inputs
3.5 Median values of FDR (line with circle), FNDR (line with asterisk), specificity (line with cross), and sensitivity (line with diamond) over 100 test functions versus τ × 100%, for functions with about 20% of active inputs among f = 10 inputs
5.1 Description of dbint for the cubic correlation function
5.2 Description of R_Y(x, η_1; θ)R_Y(x, η_2; θ) for the cubic correlation function
7.1 Rotate Method
7.2 Shrink Method with h = 0.25

CHAPTER 1

INTRODUCTION

1.1 Computer Experiments

There are many complex physical phenomena that are impossible or too expensive to study using physical experiments. However, some of these physical processes can be described by means of a mathematical model which relates inputs to output. A computer model is the implementation of such a mathematical model in computer code. A computer experiment is the use of the computer code as an experimental tool, in which the experimenter seeks to determine the response of the code to the inputs. Computer experiments are prevalent in a wide range of studies, for example, in engineering (Fang, Li, and Sudjianto (2005)), in biomechanics (Ong, Lehman, Notz, Santner, and Bartel (2006)), in the physical sciences (Higdon, Kennedy, Cavendish, Cafeo, and Ryne (2004)), in the life sciences (Upton, Guilak, Laursen, and Setton (2006), Fogelson, Kuharsky, and Yu (2003)), in economics (Lempert, Williams, and Hendrickson (2002)), and in other areas of natural science.

The output from most computer codes is deterministic; that is, two runs of a computer code at the same set of input values give an identical output value. Hence the traditional principles of blocking, randomization, and replication of the physical experiment are not required for the design of a computer experiment. A computer code is often complicated and can involve a large number of inputs, so it may take hours or days to produce a single output. Thus there is a need for efficient screening methods for detecting inputs that have influential impacts on an input-output system. A flexible predictor (see Section 1.2) is often fitted to the outputs to provide a rapidly computable surrogate predictor (a metamodel for the code). The performance of the predictor depends upon the choice of the training design points used to develop it, so there is a need for careful design. In this dissertation, a new screening method is proposed for computer experiments. The computation of sensitivity indices as screening measures is discussed, together with the construction of designs for successful screening.

1.2 Gaussian Stochastic Process Model

Although the output from a computer code is deterministic, uncertainty arises because the exact functional form of the input-output relationship is unknown from a limited number of runs, so statistical models are needed to characterize this uncertainty. The Gaussian process (GP) model has been widely used to model the output from a computer experiment, because it provides a flexible framework, producing a large class of potential response surfaces, and easily adapts to the presence of nonlinearity and interactions. In the following, the GP model (see Sacks, Welch, Mitchell, and Wynn (1989) and Santner, Williams, and Notz (2003), chapters 2 and 3) is reviewed briefly.

Let y(x) be a scalar output which is a function of a k-dimensional vector of inputs, x = (x_1, x_2, ..., x_k). The GP model treats the deterministic output y(x) as a realization of a random function Y(x),

    Y(x) = f′(x)β + Z(x)    (1.1)

where f(x) = (f_1(x), ..., f_q(x))′ is a q × 1 vector of known regression functions at x and β = (β_1, ..., β_q)′ is a q × 1 vector of unknown regression coefficients. Z(·) is a stationary Gaussian process with mean zero, variance 1/λ_Z, and covariance function

    Cov(Z(x), Z(x̃)) = (1/λ_Z) R(x − x̃)    (1.2)

where x and x̃ are two input sites and R(x − x̃) is the correlation function of Z(·). Valid correlation functions must have R(0) = 1, must be symmetric about the origin, i.e., R(h) = R(−h), and must be positive definite. One popular choice is the product power exponential correlation function,

    R(x − x̃) = exp( − Σ_{j=1}^k θ_j |x_j − x̃_j|^{p_j} )    (1.3)

where θ_j > 0 and 0 < p_j ≤ 2. The Gaussian correlation function is the special case with p_j = 2. Cubic and Matérn correlation functions are also widely used (see Santner et al. (2003), chapter 2).
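To make (1.3) concrete, the short sketch below (added for illustration; the helper name power_exp_corr and its interface are this transcription's own, not the dissertation's software) evaluates the correlation matrix between two sets of input sites with NumPy:

    import numpy as np

    def power_exp_corr(X1, X2, theta, p):
        # R(x - x~) = exp(-sum_j theta_j |x_j - x~_j|^{p_j}), eq. (1.3).
        # X1: (n1, k) and X2: (n2, k) arrays of input sites;
        # theta, p: length-k arrays with theta_j > 0 and 0 < p_j <= 2.
        d = np.abs(X1[:, None, :] - X2[None, :, :])     # (n1, n2, k) differences
        return np.exp(-np.sum(theta * d ** p, axis=2))  # (n1, n2) correlations

    # The Gaussian correlation function is the special case p_j = 2:
    # R = power_exp_corr(X, X, theta, np.full(k, 2.0))

By construction the diagonal of power_exp_corr(X, X, theta, p) is 1, matching the requirement R(0) = 1 above.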

Suppose that the output y(x_0) is to be predicted at a new input site x_0, based on the training data y^n = (y(x_1), ..., y(x_n))′ at the n input sites x_1, ..., x_n. Then Y_0 = Y(x_0), Y_1 = Y(x_1), ..., Y_n = Y(x_n) also follow a GP from (1.1), and hence the joint distribution of Y_0 and Y^n = (Y_1, ..., Y_n)′ is multivariate normal,

    (Y_0; Y^n) ~ N_{1+n}[ (f_0′; F) β, (1/λ_Z) (1, r_0′; r_0, R) ]    (1.4)

(the semicolons separate the rows of the partitioned vector and matrix), where f_0 = f(x_0) is the q × 1 vector of regression functions at x_0, F is the n × q matrix of regression functions with (i, j)th element f_j(x_i) for 1 ≤ i ≤ n, 1 ≤ j ≤ q, r_0 = (R(x_0 − x_1), ..., R(x_0 − x_n))′, and R is the n × n matrix with (i, j)th element R(x_i − x_j).

Let ψ be the vector of parameters of the correlation function R(·). Given the training data y^n and the model parameters (β, λ_Z, ψ), it follows from (1.4) that Y_0 has the conditional normal distribution

    [Y_0 | y^n, β, λ_Z, ψ] ~ N[ f_0′β + r_0′R^{-1}(y^n − Fβ), (1/λ_Z)(1 − r_0′R^{-1}r_0) ].    (1.5)

The minimum MSPE linear unbiased predictor, or best linear unbiased predictor (BLUP), of Y_0 (see Sacks et al. (1989)) is

    Ŷ_0 = f_0′β̂ + r_0′R^{-1}(y^n − Fβ̂)    (1.6)

where β̂ = (F′R^{-1}F)^{-1}F′R^{-1}y^n is the generalized least squares estimator of β. In practice, the parameters ψ of the correlation function R(·) are unknown, so estimates R̂ and r̂_0 can be used instead of R and r_0 in (1.6). Such a predictor is called an empirical best linear unbiased predictor (EBLUP) of Y_0. Depending on the method used for estimating ψ, different EBLUPs can be obtained, such as the maximum likelihood EBLUP, restricted maximum likelihood EBLUP, cross-validation EBLUP, and posterior mode EBLUP. See Santner et al. (2003), chapter 3, for more details.

In a fully Bayesian approach, prior distributions for the model parameters (β, λ_Z, ψ) are specified, and the predictor of Y_0 is obtained as the mean of the predictive distribution [Y_0 | y^n], i.e.,

    E(Y_0 | y^n) = E_{β,λ_Z,ψ | y^n}[ E(Y_0 | y^n, β, λ_Z, ψ) ].    (1.7)

To compute the Bayesian predictor numerically, one can take draws of the model parameters (β, λ_Z, ψ) from the posterior distribution [β, λ_Z, ψ | y^n] using Markov chain Monte Carlo (MCMC) sampling, compute E(Y_0 | y^n, β, λ_Z, ψ) = f_0′β + r_0′R^{-1}(y^n − Fβ) for each draw of (β, λ_Z, ψ), and take the sample mean of these quantities over the draws.
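As a concrete illustration of (1.6), the following sketch (hypothetical code, not part of the dissertation; it reuses the power_exp_corr helper from the earlier sketch and assumes the correlation parameters are known) computes the BLUP at a new site x0:

    import numpy as np

    def blup(x0, X, y, f, theta, p):
        # Y_hat_0 = f_0'beta_hat + r_0'R^{-1}(y^n - F beta_hat), eq. (1.6),
        # with beta_hat = (F'R^{-1}F)^{-1} F'R^{-1} y^n (generalized least squares).
        n = X.shape[0]
        R = power_exp_corr(X, X, theta, p) + 1e-10 * np.eye(n)   # small jitter
        r0 = power_exp_corr(x0[None, :], X, theta, p).ravel()
        F = np.vstack([f(x) for x in X])          # n x q regression matrix
        f0 = f(x0)
        Ri_F = np.linalg.solve(R, F)
        Ri_y = np.linalg.solve(R, y)
        beta = np.linalg.solve(F.T @ Ri_F, F.T @ Ri_y)
        return float(f0 @ beta + r0 @ np.linalg.solve(R, y - F @ beta))

    # Constant-mean GP, i.e., f(x) = 1 for all x:
    # yhat = blup(x0, X, y, lambda x: np.array([1.0]), theta, p)

The jitter term on the diagonal of R is a standard numerical safeguard, not part of (1.6) itself.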

1.3 Screening Procedure

A computer code can involve a huge number of inputs and, because of its complexity, may take hours or days to produce a single output at a given set of input values. Thus there is a need for efficient methods for detecting inputs that have major impacts on an input-output system. In physical experiments, too, a conventional factorial experiment may not be economically feasible when numerous factors are considered, so it is necessary to identify influential inputs at an early stage of experimentation; these can then be investigated further at a later stage. Screening methods developed in the setting of computer experiments are reviewed in Section 1.3.1, and group screening in physical experiments in Section 1.3.2. A new two-stage group screening procedure that identifies active inputs in a computer experiment setting, and that borrows ideas from group screening, is proposed in Chapter 2.

1.3.1 Screening in Computer Experiments

Most screening methods in computer experiments are based on the GP model described in Section 1.2. For example, Sacks et al. (1989) used a decomposition of the output function y(x) into an average effect, main effects for each input, two-factor interactions, and higher-order interactions; estimated the effects by replacing y(x) by the predictor based on the GP model; and plotted the estimated effects to investigate the importance of the inputs. Welch, Buck, Sacks, Wynn, Mitchell, and Morris (1992) extended the method of Sacks et al. (1989) so as to build an accurate predictor and identify important inputs when there are up to 30-40 inputs. Oakley and O'Hagan (2004) presented a Bayesian approach to probabilistic sensitivity analysis, which formulates uncertainty in the model inputs by a joint probability distribution and then analyzes the induced uncertainty in the output. Schonlau and Welch (2006) described the implementation of visualizing the estimated effects and quantifying the importance of the inputs via an ANOVA-type decomposition. Linkletter, Bingham, Hengartner, Higdon, and Ye (2006) proposed a Bayesian method to select active inputs based on the posterior distribution of the parameters of the Gaussian correlation function. Campbell, McKay, and Williams (2006) suggested sensitivity analysis for functional computer model outputs by expanding the functional outputs in terms of an appropriate set of basis functions and performing sensitivity analysis on the coefficients of the expansion. Higdon, Gattiker, Williams, and Rightley (2008) performed sensitivity analysis for high-dimensional output using basis representations to reduce the dimensionality.

1.3.2 Group Screening in Physical Experiments

Group screening methodology was first described by Dorfman (1943) for blood screening and was later adapted to the setting of physical experiments by Watson (1961) for identifying active factors in cases where there are many potentially influential input factors. In two-stage group screening, the first stage of experimentation is done on groups of factors. The individual factors within the groups identified as active in the first stage are then investigated individually in a second-stage experiment. The early work in this area considered models in which only main effects were included, but the methodology has since been extended to handle interactions. Lewis and Dean (2001) investigated two-stage group screening strategies for detecting interactions. Vine, Lewis, and Dean (2005) developed methodology for handling groups of unequal sizes as well as unequal probabilities of factors being active. Morris (2006) gave a survey of group screening and its use in searching for active factors. Vine, Lewis, Dean, and Brunson (2008) discussed practical aspects involved in running a two-stage group screening experiment for investigating interactions.

1.4 Design of Computer Experiments

The output from most computer codes is deterministic, and hence no replications are required at any design point. Moreover, for thorough exploration of the experimental region, the design points should be spread evenly throughout the region. Designs whose points are unreplicated and well spread out are called space-filling. McKay, Beckman, and Conover (1979) introduced Latin hypercube designs for use in computer experiments. In its simplest form, an n × k Latin hypercube design (LHD) has hth column ξ_h = [ξ_{1h}, ξ_{2h}, ..., ξ_{nh}]′ obtained from a random permutation π_h = [π_{1h}, π_{2h}, ..., π_{nh}]′ of 1, ..., n, where ξ_{ih} is the midpoint of the interval [(π_{ih} − 1)/n, π_{ih}/n]. In a slightly more sophisticated approach, a random point in this interval may be taken; the latter procedure is used in this dissertation. All LHDs have the one-dimensional space-filling property that an observation is taken in every one of the n evenly spaced intervals over the [0, 1] range of each input. However, they need not have space-filling properties in higher dimensions.
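The LHD construction just described takes only a few lines of code. The sketch below (illustrative only; the function name latin_hypercube is this transcription's own) supports both the midpoint form and the random-point-within-cell form used in this dissertation:

    import numpy as np

    def latin_hypercube(n, k, rng=None, midpoint=False):
        # Column h is a random permutation pi_h of 1..n; entry i lies in
        # the interval [(pi_ih - 1)/n, pi_ih/n], either at its midpoint
        # (the simplest form) or drawn uniformly within the interval.
        rng = np.random.default_rng(rng)
        pi = np.column_stack([rng.permutation(n) + 1 for _ in range(k)])
        if midpoint:
            return (pi - 0.5) / n
        return (pi - rng.random((n, k))) / n

    X = latin_hypercube(8, 3, rng=1)  # one point in each of the 8 cells, per column

Projecting any column of X onto [0, 1] shows exactly one point per interval [(i − 1)/8, i/8], which is the one-dimensional space-filling property noted above.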

Several criteria for generating space-filling designs are described in Santner et al. (2003), chapter 5. In particular, the maximin distance criterion, which seeks a design that maximizes the minimum inter-point distance, was first introduced by Johnson, Moore, and Ylvisaker (1990) and extended by Morris and Mitchell (1995).

Another desirable property of a design for a computer experiment is orthogonality, where the design matrix has uncorrelated columns. If the values of two inputs are highly correlated, then it is difficult to distinguish their effects on the output. An orthogonal design allows one to assess the effects of the different inputs independently. Tang (1993) proposed a method of constructing orthogonal array-based LHDs, combining the desirable properties of both orthogonal arrays and LHDs. Owen (1994) proposed an algorithm for generating LHDs with small pairwise correlations between input variables. Tang (1998) developed an algorithm for reducing polynomial canonical correlations of LHDs by extending Owen's (1994) algorithm. Ye (1998) proposed a construction method for orthogonal LHDs with n = 2^m + 1 runs and k = 2m − 2 input variables, and used an improvement algorithm for selecting designs within this class under space-filling and other criteria. Butler (2001) presented a construction method for LHDs which are orthogonal with respect to models based on trigonometric functions. Steinberg and Lin (2006) constructed orthogonal LHDs with n = 2^k runs, where the number k of inputs is a power of 2; these can include more inputs than those proposed by Ye (1998). Cioppa and Lucas (2007) extended Ye's (1998) approach to construct orthogonal LHDs that can accommodate more inputs, and presented a method that improves the space-filling properties of the resulting LHD at the expense of inducing small correlations. Joseph and Hung (2008) proposed an exchange algorithm for efficient generation of LHDs under a weighted combination of orthogonality and space-filling criteria. Lin, Mukerjee, and Tang (2009) proposed a method for constructing large-dimensional orthogonal and nearly orthogonal LHDs by using an orthogonal array and a small LHD. Bingham, Sitter, and Tang (2009) constructed a class of orthogonal designs with various choices for the number of levels and flexible run sizes by relaxing the LHD constraint. In Chapter 6, a new method for achieving orthogonal and space-filling designs, based on Gram-Schmidt orthogonalization, is proposed.

1.5 Overview of Dissertation

The rest of the dissertation is organized as follows. Chapter 2 proposes a new two-stage group screening procedure that identifies active inputs in a computer experiment setting. The performance of the proposed method is discussed in Chapter 3, and the application of the new method is demonstrated in Chapter 4. The computation of sensitivity indices as screening measures is discussed in Chapter 5. Chapter 6 presents a new algorithm for generating maximin LHDs and a new algorithm for achieving orthogonal maximin designs using Gram-Schmidt orthogonalization. Alternative approaches for creating two-stage designs are given in Chapter 7. Chapter 8 provides software for the proposed screening procedure, the computation of sensitivity indices, and the design search algorithms.

CHAPTER 2

TWO-STAGE SENSITIVITY-BASED GROUP SCREENING IN COMPUTER EXPERIMENTS

This chapter proposes a new two-stage group screening procedure that identifies active inputs in computer experiments. The whole procedure is explained here in detail. Further discussion of the performance of the procedure appears in Chapter 3, and applications to various examples are shown in Chapter 4.

2.1 Introduction

2.1.1 Background

A computer model is a numerical implementation of a mathematical description of the input-output relationship of a physical process. Modelling through computer codes is prevalent in a wide range of applications, for example, in engineering (Fang et al. (2005)), in biomechanics (Ong et al. (2006)), in the physical sciences (Higdon et al. (2004)), in the life sciences (Upton et al. (2006), Fogelson et al. (2003)), in economics (Lempert et al. (2002)), and in other areas of natural science. Over the past 20 years, the use of computer codes as experimental tools has become increasingly sophisticated. In addition to inputs that describe different treatments, computer models can allow the user to vary environmental inputs, which describe the conditions in which the process operates, and calibration inputs, which are unknown physical constants in the underlying mathematical model and for which expert-based subjective distributions are available. For example, Ong, Santner, and Bartel (2008) presented an application in the biomechanical engineering design of a prosthetic acetabular cup, in which the hip socket of a prosthetic total hip replacement rotates. In addition to inputs defining the cup geometry, their study included inputs representing environmental conditions such as the patient's bone quality and loading patterns, inputs describing mis-alignments from nominal cup insertion values (which represent the level of surgeon skill), and inputs describing unknown aspects of the true physical setting, such as the interface friction between the bone and the prosthesis. The finite-element codes used in this and other complex applications can require up to 24 hours for a single run.

Consequently, there is a need for efficient methods for detecting inputs that have major impacts on an input-output system. These are called the active or influential inputs. Once identified, researchers can restrict attention to varying only the active inputs (while setting the other inputs to nominal values), thus reducing the number of future code runs needed to understand the detailed input-output relationship.

The literature contains several proposals for screening inputs in computer experiments where the deterministic output is modelled as a realization of a random function. An approach that decomposes a Gaussian random function approximator of a computer model into an average effect, main effects for each input, two-factor interactions, and higher-order interactions, and plots the estimated effects or quantifies the importance of the effects, has been applied by many authors (Section 1.3.1), for example, Sacks et al. (1989), Welch et al. (1992), Oakley and O'Hagan (2004), and Schonlau and Welch (2006). Linkletter et al. (2006) proposed a Bayesian method to select active inputs based on the posterior distribution of the parameters of the Gaussian correlation function. Campbell et al. (2006) and Higdon et al. (2008) performed sensitivity analysis for multiple outputs.

For complex computer codes that are expensive to run and must account for many inputs, the standard screening methods described in Section 1.3 can be time-consuming. A screening method is presented below that incorporates experimental design considerations and group screening, and allows the user to identify influential inputs in a computer experiment with computational efficiency. As mentioned in Section 1.3.2, group screening methodology was first described by Dorfman (1943) for blood screening and was later adapted to the setting of physical experiments by Watson (1961) for identifying active factors in cases where there are many potentially influential input factors. The early work in this area considered models in which only main effects were included, but the methodology has since been extended to handle interactions (see Morris and Mitchell (1983), Lewis and Dean (2001), and Vine et al. (2005); see also the reviews by Kleijnen (1987) and Morris (2006)). Group screening works well under effect sparsity, where the proportion of active main effects and interactions is small. In two-stage group screening, the first stage of experimentation is done on groups of factors; the individual factors within the groups identified as active in the first stage are then investigated individually in a second-stage experiment.

The method proposed in this dissertation is a two-stage group screening procedure that identifies active inputs in a computer experiment setting; it eliminates non-active inputs having small effects, not merely those having zero effects. This approach reduces the number of experimental runs needed to understand the input-output relationship, because groups of inputs with small effects are dropped at an early stage of the procedure.

2.1.2 Overview of the Proposed Procedure

For clarity of exposition, the description of each stage of the proposed procedure is divided into sampling, grouping (for Stage 1 only), and analysis phases. The proposed procedure is called the GSinCE (Group Screening in Computer Experiments) procedure. An outline is given below, with further details in Sections 2.2-2.4.

Initialization. Given that n runs of the computer code are to be made in Stage 1, a matrix X with n rows and (n − 1) columns satisfying certain desirable properties is generated, as described in Section 2.2. The choice of n is discussed in Section 2.3.1.

Stage 1. In the sampling phase, a set of columns from X is selected to produce a design matrix X^(1). The computer code is run at the design points (rows) of X^(1), and a Gaussian process (GP) model is fitted to the output. In the grouping phase, the output is used to place the inputs into disjoint sets (groups). All inputs in the same group are set equal to the same level, defined by a design matrix G (Section 2.3.2), and the fitted GP model is used to predict the output at the design points in G. The analysis phase (Section 2.3.3) uses total effect sensitivity indices to determine which groups of inputs are inactive and which potentially contain active inputs. To judge whether a group is active or non-active, an additional low-impact input is created to use as a benchmark (cf. Linkletter et al. (2006), Wu, Boos, and Stefanski (2007)).

Stage 2. The inputs in the groups selected as active in Stage 1 are investigated individually in Stage 2. In the Stage 2 sampling phase (Section 2.4.1), a new design matrix X^(2) is selected in such a way that the design points in the combined X^(1), X^(2) retain, as closely as possible, the desirable properties identified in Section 2.2. The computer code is run at the design points in X^(2). The Stage 2 analysis phase uses the outputs from both stages in a second sensitivity analysis to make the final selection of active inputs (Section 2.4.2).

2.2 GSinCE Initialization Stage

Suppose there are f experimental inputs, where the range of the jth input is [a_j, b_j] and a_j and b_j are known constants, for j = 1, ..., f. Assume that the domain for the vector of the f inputs is the entire hyper-rectangle ∏_{j=1}^f [a_j, b_j]. The design will be obtained from the scaled input space [0, 1]^f, and x_{tj} ∈ [0, 1] will denote the value of the jth scaled input on the tth run of the design. The computer code is then run to obtain the output using the unscaled inputs,

    z_{tj} = x_{tj}(b_j − a_j) + a_j,  for t = 1, ..., n and j = 1, ..., f.

In the Initialization Stage, a preliminary design matrix X with n rows and n − 1 columns is constructed as described below. Denote the jth column of X by ξ_j = (ξ_{1j}, ..., ξ_{nj})′, where ′ denotes transpose, j = 1, ..., n − 1, and the ith row by x_i = (x_{i1}, ..., x_{i(n−1)}), i = 1, ..., n. The design matrices for the Stage 1 sampling and grouping phases will be drawn from this matrix, as will those for the low-impact inputs.

There are three requirements for the design matrix X. First, the columns of X are required to be uncorrelated, to allow independent assessment of the effects of the different inputs. Second, the minimum and maximum values in each column must be 0 and 1, respectively; if this is not the case, then those variables whose scaled input values in the design have larger ranges will have a larger impact on the response, artificially induced by the design (see Section 3.3.2). Third, the design X should be space-filling at each stage, in the sense that the selected design maximizes the minimum inter-point distance in all 2-dimensional subspaces of the input space; this helps to ensure that all regions of the input space are explored (cf. Sacks et al. (1989), and Santner et al. (2003), chapter 5). These three properties are referred to, respectively, as (P.1), (P.2), and (P.3). An algorithm for generating X which satisfies (P.1) and (P.2), and approximately satisfies (P.3), follows.

Step 1. Randomly generate an n × (n − 1) Latin hypercube design matrix Λ = (λ_1, ..., λ_{n−1}) with rank n − 1 (see McKay et al. (1979)).

Step 2. Center each column of Λ: v_h = λ_h − (λ_h′1/n)1 for h = 1, ..., n − 1, where 1 is a vector of n unit elements.

Step 3. Apply the Gram-Schmidt algorithm to form orthogonal columns u_h = (u_{1h}, ..., u_{nh})′:

    u_h = v_1, for h = 1;
    u_h = v_h − Σ_{i=1}^{h−1} (u_i′v_h / ||u_i||²) u_i, for h = 2, ..., n − 1.

Step 4. Scale the values of u_h to [0, 1] to give ξ_h = (ξ_{1h}, ..., ξ_{nh})′, where

    ξ_{ih} = (u_{ih} − min{u_{1h}, ..., u_{nh}}) / (max{u_{1h}, ..., u_{nh}} − min{u_{1h}, ..., u_{nh}}),

for i = 1, ..., n and h = 1, ..., n − 1. Set X = (ξ_1, ..., ξ_{n−1}).

Step 5. Select the design matrix X = (ξ_1, ..., ξ_{n−1}) which maximizes the minimum inter-point distance over all projections of the design into 2-dimensional space, i.e., maximizes

    min_{i<j; i,j ∈ {1,...,n}}  min_{h<l; h,l ∈ {1,...,n−1}}  { (ξ_{ih} − ξ_{jh})² + (ξ_{il} − ξ_{jl})² }.

Step 5 can be carried out (approximately) in a brute-force manner by repeating Steps 1-4 many times and selecting the best maximin design among the candidate designs generated. Alternatively, some form of genetic exchange algorithm (see Bartz-Beielstein (2006)) could be used to find an approximate maximin design, for example, the evolutionary operation (EVOP) method used in Forrester, Sóbester, and Keane (2008), chapter 1.

Step 4 of the algorithm guarantees (P.2), while (P.1) can be verified as follows. Let ξ̄_h = ξ_h′1/n be the arithmetic mean of the elements in the hth column of X. Then, by construction, the correlation of ξ_h and ξ_l, h ≠ l, is

    r(ξ_h, ξ_l) = (ξ_h − ξ̄_h 1)′(ξ_l − ξ̄_l 1) / √[ (ξ_h − ξ̄_h 1)′(ξ_h − ξ̄_h 1) · (ξ_l − ξ̄_l 1)′(ξ_l − ξ̄_l 1) ]
                = u_h′u_l / √[ (u_h′u_h)(u_l′u_l) ] = 0,

where u_h and u_l are defined in Step 3 and satisfy u_h′u_l = 0 and u_h′1 = u_l′1 = 0 from Steps 2 and 3. Alternative distance criteria, such as the average distance criterion over all low-dimensional projections, could also be used (see, for example, Welch (1985)).
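Steps 1-5 are straightforward to prototype. The sketch below (illustrative code, not the dissertation's software; it reuses the latin_hypercube helper from the sketch in Section 1.4) generates candidate matrices by Steps 1-4 and screens them by the 2-dimensional maximin criterion of Step 5:

    import numpy as np

    def gram_schmidt_design(n, rng=None):
        # Steps 1-4: random LHD -> centered columns -> Gram-Schmidt ->
        # rescale each column to [0, 1]; columns are exactly uncorrelated.
        rng = np.random.default_rng(rng)
        L = latin_hypercube(n, n - 1, rng)            # Step 1
        V = L - L.mean(axis=0)                        # Step 2
        U = np.empty_like(V)
        for h in range(n - 1):                        # Step 3
            u = V[:, h].copy()
            for i in range(h):
                u -= (U[:, i] @ V[:, h]) / (U[:, i] @ U[:, i]) * U[:, i]
            U[:, h] = u
        return (U - U.min(axis=0)) / (U.max(axis=0) - U.min(axis=0))  # Step 4

    def min_2d_dist(X):
        # Step 5 criterion: smallest squared inter-point distance over
        # all 2-dimensional projections (pairs of columns).
        n, k = X.shape
        iu = np.triu_indices(n, 1)
        return min(
            np.sum((X[:, [h, l]][:, None, :] - X[:, [h, l]][None, :, :]) ** 2,
                   axis=2)[iu].min()
            for h in range(k) for l in range(h + 1, k))

    # Brute-force Step 5: keep the best of many random candidates.
    # X_star = max((gram_schmidt_design(20, s) for s in range(1000)), key=min_2d_dist)

Because the rescaling in Step 4 is affine within each column, the zero correlations produced by Steps 2-3 carry over to the returned matrix, exactly as in the verification of (P.1) above.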

2.3 GSinCE Procedure Stage 1

2.3.1 Stage 1 Sampling Phase

The GSinCE procedure is intended for screening situations in which it is reasonable to assume that only a small fraction (say 25% or less) of the inputs are active. Loeppky, Sacks, and Welch (2009) justified 10 times the number of inputs as a reasonable rule of thumb for the number of runs in an effective initial computer experiment. Using this base value, 5 runs for each active input at each stage is reasonable. As an example, with f = 20 inputs and a conservative assumption of a maximum of 40% active inputs, one may take n = 5 × (f × 0.4) = 2f runs in Stage 1.

The Stage 1 design matrix, X^(1), is taken to be the first f columns of the n × (n − 1) preliminary design matrix X; thus X^(1) = (ξ_1, ..., ξ_f). Denote the vector of outputs from the Stage 1 code runs by y(X^(1)). A Bayesian GP model (see Higdon et al. (2004) and Higdon et al. (2008)),

    Y(x) = Z(x) + ϵ(x),    (2.1)

is fitted to the data y(X^(1)). Z(·) is taken to be a stationary Gaussian process with zero mean, variance 1/λ_Z, and covariance function

    Cov(Z(x), Z(x̃)) = (1/λ_Z) R(x, x̃) = (1/λ_Z) ∏_{j=1}^f ρ_j^{4(x_j − x̃_j)²},    (2.2)

where x = (x_1, ..., x_f) and x̃ = (x̃_1, ..., x̃_f) are two design points. The GP model (2.1) is the special case of (1.1) with f′(x)β = 0. The term ϵ(x) in (2.1) is added to represent numerical or other small-scale noise and is modeled by a white noise process that is independent of Z(·) and has mean 0 and (small) prior variance 1/λ_ϵ. The output y(X^(1)) is centered to have sample mean 0 and unit variance, to conform to the prior specification when this model is fitted. The Bayesian model can be fitted using the GPM/SA (Gaussian Process Models for Simulation Analysis) software of Gattiker (2005). The posterior distributions of the model parameters will be used to predict the output, as in (1.7), for the group variables in the grouping phase in Section 2.3.2.

2.3.2 Stage 1 Grouping Phase

Initial grouping of the inputs into groups that have similar effects on the response is critical for efficient group screening. The individual inputs can be divided into groups using information from subject experts, or using exploratory data analysis of the Stage 1 data, or a combination of the two. Alternatively, an automatic grouping procedure can be used, as described below, where M is the user-selected maximum group size. The method uses the Fisher-transformed Pearson correlation coefficients

    r_j = tanh^{-1}(r(ξ_j, y(X^(1)))),  j = 1, ..., f,

(see Fisher (1921)), where the correlation coefficient r(ξ_j, y(X^(1))) measures the strength of the linear relationship between the jth input and the output.

Step 1. Set q = f.

Step 2. Compute the sample mean r̄ and the sample standard deviation s_r of r_1, ..., r_q. Take N(r̄, s_r²) as a reference distribution for r_1, ..., r_q.

Step 3. Divide the reference distribution into ν = ⌈q/M⌉ intervals, where the ith boundary is defined to be the (i/ν × 100)th percentile of N(r̄, s_r²), i.e., Φ^{-1}_{r̄,s_r²}(i/ν), i = 1, ..., ν − 1.

Step 4. Group r_1, ..., r_q into ν groups based on the boundaries of the reference distribution and count the number of elements observed in each group, h_1, ..., h_ν.

Step 5. If h_1 > M and h_ν > M, go to Step 6. Otherwise, go to Step 7.

Step 6. Subdivide each of groups 1 and ν, repeating Steps 1-4, first setting q = h_1 for the leftmost group and then q = h_ν for the rightmost group. Update ν to the total number of groups, so that the corresponding group sizes are h_1, ..., h_ν. Go to Step 5 with the updated groups.

Step 7. Sequentially examine h_1, h_2, ..., h_ν. Let i be the smallest index for which h_i > M. If there exists no i for which h_i > M, then stop and set m = ν. Otherwise, sequentially examine h_ν, h_{ν−1}, ..., h_{ν−(ν−i)}. Let j be the smallest index for which h_{ν−j} > M. Let q = h_i + h_{i+1} + ... + h_{ν−j} and go to Step 8.

Step 8. Relabel r_1, ..., r_q corresponding to the inputs to be re-grouped and go to Step 2.
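A single pass of Steps 2-4 is easy to express in code. The following sketch (illustrative only and not the dissertation's implementation; it performs one binning pass and omits the re-splitting logic of Steps 5-8) assigns provisional group labels from the Fisher-transformed correlations:

    import numpy as np
    from scipy.stats import norm

    def one_pass_groups(X1, y, M):
        # Fisher-transform the input-output correlations, then cut the
        # N(r_bar, s_r^2) reference distribution into nu = ceil(f/M)
        # equal-probability intervals (Steps 2-4 of the grouping phase).
        f = X1.shape[1]
        r = np.arctanh([np.corrcoef(X1[:, j], y)[0, 1] for j in range(f)])
        nu = int(np.ceil(f / M))
        cuts = norm.ppf(np.arange(1, nu) / nu, loc=r.mean(), scale=r.std(ddof=1))
        return np.digitize(r, cuts)   # provisional group label for each input

Inputs whose correlations with the output fall in the same interval of the reference distribution, and which therefore appear to affect the response similarly, receive the same label.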

After the f individual inputs have been divided into m groups, a design matrix G = (g_1, ..., g_m) is formed from a random selection of m columns of X for the group variables. From G, a design matrix X_P = (ξ_1^P, ..., ξ_f^P) is constructed in terms of the f individual inputs, where all the inputs in group i are set to the levels defined by g_i, i = 1, ..., m. For example, if inputs 1, 5, and 6 are assigned to group 1, then ξ_1^P = ξ_5^P = ξ_6^P = g_1. The design matrix X_P is used to predict the output based on the fitted GP model (Section 2.3.1). The resulting values, denoted by ŷ(X_P) (or, more simply, by ŷ(G)), are used in Section 2.3.3 to select the active groups. The training data, y(X^(1)), will be used again in Section 2.4.2 to select active individual inputs within the active groups.

2.3.3 Stage 1 Analysis Phase

Sensitivity Indices

In this section and Section 2.4.2, the total effect sensitivity index (TESI) is used to detect active effects. This subsection reviews the definition of sensitivity indices when the input region is [0, 1]^f; see Chapter 5 for more details. Sobol (1993) showed that the function y(x) can be uniquely decomposed as

    y(x) = y_0 + Σ_{j=1}^f y_j(x_j) + Σ_{1≤j<h≤f} y_{jh}(x_j, x_h) + ... + y_{1,2,...,f}(x_1, ..., x_f)    (2.3)

where the terms are recursively defined by

    y_0 = ∫_{[0,1]^f} y(x_1, ..., x_f) dx_1 ... dx_f,
    y_j(x_j) = ∫_{[0,1]^{f−1}} y(x_1, ..., x_f) dx_{−j} − y_0,
    y_{jh}(x_j, x_h) = ∫_{[0,1]^{f−2}} y(x_1, ..., x_f) dx_{−jh} − y_j(x_j) − y_h(x_h) − y_0,

and so on. Here dx_{−j} denotes integration over all inputs except x_j, and dx_{−jh} denotes integration over all inputs except x_j and x_h. The individual components of Sobol's decomposition are centered, that is, they satisfy

    ∫_0^1 y_{j_1,...,j_s}(x_{j_1}, ..., x_{j_s}) dx_{j_k} = 0, for any 1 ≤ k ≤ s,

and orthogonal, that is, they satisfy

    ∫_{[0,1]^f} y_{j_1,...,j_s}(x_{j_1}, ..., x_{j_s}) y_{h_1,...,h_t}(x_{h_1}, ..., x_{h_t}) dx_1 ... dx_f = 0,

for any (j_1, ..., j_s) ≠ (h_1, ..., h_t).

Variance-based indices are obtained by squaring both sides of the Sobol decomposition (2.3) and integrating over [0, 1]^f (Sobol (1993)). This leads to the variance decomposition

    V = Σ_{j=1}^f V_j + Σ_{1≤j<h≤f} V_{jh} + ... + V_{1,2,...,f}    (2.4)

where

    V = ∫_{[0,1]^f} y²(x_1, ..., x_f) dx_1 ... dx_f − y_0²,
    V_j = ∫_0^1 y_j²(x_j) dx_j,    V_{jh} = ∫_0^1 ∫_0^1 y_{jh}²(x_j, x_h) dx_j dx_h,

and additional terms are defined similarly. Sensitivity indices are obtained by dividing each component in the variance decomposition (2.4) by the total variance V. The main effect sensitivity index of the jth input is defined to be S_j = V_j / V. The two-factor sensitivity index of the jth and hth inputs is defined to be S_{jh} = V_{jh} / V. Higher-order sensitivity indices are defined similarly. The TESI of the jth input (Homma and Saltelli (1996)) is the sum of all sensitivity indices involving the jth input, i.e.,

    T_j = S_j + Σ_{h≠j} S_{jh} + ... + S_{1,2,...,f}.    (2.5)

The sensitivity indices are computed using the Bayesian method of Oakley and O'Hagan (2004) as implemented in GPM/SA; the sensitivity index is estimated by its posterior mean.
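As a quick check of these definitions, consider the toy function y(x_1, x_2) = x_1 + x_2² on [0, 1]² (an example added here for illustration; it is not from the dissertation). Direct integration gives

    y_0 = 1/2 + 1/3 = 5/6,    y_1(x_1) = x_1 − 1/2,    y_2(x_2) = x_2² − 1/3,    y_{12} ≡ 0,

    V_1 = ∫_0^1 (x_1 − 1/2)² dx_1 = 1/12,    V_2 = ∫_0^1 (x_2² − 1/3)² dx_2 = 4/45,

    V = 1/12 + 4/45 = 31/180,    S_1 = 15/31,    S_2 = 16/31.

Because the interaction component vanishes, the TESIs in (2.5) coincide with the main effect indices, T_1 = S_1 and T_2 = S_2, and the indices sum to one. For a non-additive function, T_j > S_j for each input involved in an interaction, which is why the TESI, rather than S_j alone, is used as the screening measure.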