Fast Statistical Surrogates for Dynamical 3D Computer Models of Brain Tumor


Dorin Drignei
Department of Mathematics and Statistics, Oakland University, Rochester, MI 48309, USA

Abstract. Understanding how malignant brain tumors are formed and evolve has direct consequences for the development of efficient methods for their early detection and treatment. Adequate mathematical models of brain tumor growth and invasion can be helpful in clarifying some aspects of the mechanism responsible for the tumor. These mathematical models are typically implemented in computer models, which can be used for computer experimentation to study how changes in inputs, such as growth and diffusion parameters, affect the evolution of the virtual brain tumor. The computer model considered in this paper is defined on a three-dimensional (3D) anatomically accurate digital representation of the human brain, which includes white and grey matter, and on a time interval of hundreds of days, to realistically simulate the tumor development. Consequently, this computer model is very computationally intensive, and only small-size computer experiments, corresponding to a small sample of inputs, can be conducted. This paper presents a computationally efficient multidimensional kriging method to predict the evolution of the virtual brain tumor at new inputs, conditioned on the virtual brain tumor data available from the small-size computer experiment. The analysis shows that this prediction can be more accurate than a computationally competing model.

Keywords: Astrocytomas; BrainWeb; Computer experiments; Kriging; Numerical models.

1 Introduction

Despite recent advances in computerized tomography (CT) and magnetic resonance imaging (MRI), the chances of early detection and subsequent successful treatment of malignant brain tumors are still low. In general, cancerous tumors develop from one or several mutating cells which sustain rapid uncontrolled growth and invade the normal tissue.
This paper is concerned with the most common type of primary brain tumors, called gliomas, which begin in glial cells (the supportive tissue of the brain). The most common gliomas are known as astrocytomas, originating in connective tissue cells called astrocytes. The astrocytomas, in turn, can be low-grade (the least malignant), mid-grade (moderately malignant) or high-grade (glioblastomas, the most malignant). In children, astrocytomas are most commonly located in the cerebellum, whereas in adults the most common location is in the cerebral hemispheres. The exact causes of these brain tumors and their mechanism of development are still under intense scientific investigation. Mathematical models of brain tumors are helpful in understanding some aspects of the mechanism responsible for brain tumors, with implications for prognosis and treatment. The dynamical three-dimensional (3D) mathematical model considered here includes growth and diffusion parameters. The growth parameter is an unknown constant appearing in the growth term of the mathematical model. As in Swanson et al (2000) and Murray (2003, chapter 11), distinct parameters for diffusion of a brain tumor are specified for white and grey brain matter, with a larger diffusion parameter in the white matter, since the tumor diffuses faster in such tissue. These mathematical models are typically implemented in computer models, or codes. Different combinations of parameters (or inputs) lead to simulations of virtual brain tumors with different degrees of malignancy (the outputs). Experimentation with these computer models, in which the inputs are systematically changed, represents a useful method for understanding their effects on the outputs. For each input, these computer models are solved numerically over time and three spatial dimensions, and they are very computationally intensive, with a single run taking hours of computational time.
Therefore, only small-size experiments (corresponding to a small number of sampled inputs) can be conducted and analyzed.

The main goal of this paper is to illustrate a computationally efficient method for predicting virtual brain tumors at any input in an input space, conditioned on the virtual brain tumor data available from the small number of computer model runs. More precisely, a small number of inputs are sampled in the input space according to a specific design, the corresponding dynamical 3D computer model runs are performed and the output data are recorded. An appropriate statistical model for the sampled output data is estimated, and kriging methods are used to predict the output data at new inputs. This prediction can then be used as a substitute (or surrogate) for the computationally intensive dynamical 3D computer model, at any input in the input space. An important aspect of the output data sets analyzed here is that they are deterministic: running the computer model twice for the same input yields the same output data set. In this paper we take the empirical Bayesian approach of Currin et al (1991), where, before the computer model runs are performed, the joint distribution of a Gaussian process is assumed as a prior distribution for the unknown output, over the entire input space. The general statistical methodology of kriging for the design and analysis of computer experiments is discussed, in the context of univariate output, by Sacks et al (1989), Currin et al (1991) and in the books by Santner et al (2003) and Fang et al (2006). More recently, a Bayesian model that relies on this methodology has been used by Bayarri et al (2007a) to calibrate computer models. The output data sets presented in this paper are high-dimensional. Straightforward generalizations of the univariate case to accommodate analysis of such large output data sets have computational limitations, as pointed out in Drignei and Morris (2006).
More precisely, the general methodology would require the specification of a high-dimensional dense covariance matrix, which in turn would be difficult to estimate and use in kriging formulas. Drignei (2006) proposed a two-stage method for the analysis of high-dimensional computer output, which was illustrated with a geophysical example. Drignei and Morris (2006) developed methodology specifically for the analysis of finite difference computer models, in which the partial derivatives of the original mathematical model are approximated by finite differences on a grid. The form of the finite difference dynamical 3D computer model of a brain tumor considered here permits the development of a more computationally efficient modification of the prediction method presented in Drignei and Morris (2006). Other recent work in the area of multidimensional computer output analysis has been done by Bayarri et al (2007b) and Higdon et al (2007), who used basis representations for multidimensional output in a Bayesian framework and in the context of computer model calibration. This paper is organized as follows. Section 2 discusses the computer experiment with the dynamical 3D computer model of brain tumor growth and invasion, the input parameters and the output data sets. Section 3 presents the statistical methodology, including the statistical model, kriging prediction and validation. Section 4 presents the results, and some conclusions are given in Section 5.

2 An experiment with the computer model of brain tumor

2.1 The computer model

The computer model is a deterministic, iterative relationship in time and three spatial dimensions. It originates in a continuous mathematical model (a partial differential equation), which includes diffusion and growth terms (Murray 2003, p. 545). The mathematical model is defined on a rectangular spatial domain and in a finite time interval.
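In the notation used below for the discretization (Y for the tumor cell density, D for the spatially varying diffusion coefficient, \rho for the growth rate), the continuous growth-diffusion model of Swanson et al (2000) and Murray (2003) can be written as:

```latex
% Reaction-diffusion model for the tumor cell density Y(x, t):
% spatially varying diffusion plus exponential growth
\frac{\partial Y}{\partial t}
  = \nabla \cdot \big( D(\mathbf{x}) \, \nabla Y \big) + \rho \, Y
```

with D(x) piecewise constant: D_w in white matter, D_g in grey matter, and zero outside the brain.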
It does not have a closed-form, analytical solution, and therefore a numerical, approximate solution is obtained. The partial derivatives with respect to time and space are approximated on a discrete grid of points, generating an iterative relationship commonly known as a finite difference scheme. Finer grids produce more accurate but more computationally intensive numerical solutions. The analysis presented here concerns a hypothetical adult patient, who has developed a brain tumor initially localized at a spatial point x_0. The untreated brain tumor is then followed for about 600 days. Consider the time interval [0, L_T] and the rectangular spatial domain [0, L_{S1}] x [0, L_{S2}] x [0, L_{S3}]. This four-dimensional volume is discretized on a fine grid with M_F + 1 equally distanced time points and (S_{1F} + 2)(S_{2F} + 2)(S_{3F} + 2) equally distanced space points. Adjacent points are separated by the time increment \Delta_{tF} = L_T / M_F and the space increment \Delta = L_{Si} / (S_{iF} + 1), i = 1, 2, 3. While theoretically the space increments need not be equal, the brain spatial grid available for this analysis requires equally spaced increments for all three spatial dimensions. The development of BrainWeb allows users to obtain anatomically accurate digital data sets of the human brain (Collins et al, 1998). In particular, here the interest is in data sets that define the white and grey matter, and the brain boundaries, in an

18 x 21.6 x 18 cm^3 rectangular box. The fine grid has 91 x 109 x 91 space points, 2mm apart in each dimension. Figure 1 presents a smoothed image of this data set, sectioned at the center x_0 (approximately the same as in Murray 2003, p. 577) of the future virtual tumor. While the tumor starts from a single cell, by the time of the first computerized tomography (CT) scan it has evolved into a cluster of malignant cells, assumed normally distributed about a center point x_0, with a spread b and maximum cell density a:

Y(0, i, j, k) = a exp{ -[ (x_i - x_{0i})^2 + (y_j - y_{0j})^2 + (z_k - z_{0k})^2 ] / b },

where Y is the brain tumor cell density, x = (x_i, y_j, z_k), x_0 = (x_{0i}, y_{0j}, z_{0k}), i = 1, ..., S_{1F} + 2, j = 1, ..., S_{2F} + 2, k = 1, ..., S_{3F} + 2, with S_{1F} = 89, S_{2F} = 107, S_{3F} = 89, and parameters a and b similar to those in Murray (2003). Here 0 is the initial time for the analysis presented. The computer model is iteratively expressed as

Y(t+1, i, j, k) = Y(t, i, j, k) + \Delta_{tF} {
  [ D(i, j, k)(Y(t, i+1, j, k) - Y(t, i, j, k)) - D(i-1, j, k)(Y(t, i, j, k) - Y(t, i-1, j, k)) ] / \Delta^2
+ [ D(i, j, k)(Y(t, i, j+1, k) - Y(t, i, j, k)) - D(i, j-1, k)(Y(t, i, j, k) - Y(t, i, j-1, k)) ] / \Delta^2
+ [ D(i, j, k)(Y(t, i, j, k+1) - Y(t, i, j, k)) - D(i, j, k-1)(Y(t, i, j, k) - Y(t, i, j, k-1)) ] / \Delta^2
+ \rho Y(t, i, j, k) },

t = 0, ..., M_F - 1, i = 2, ..., S_{1F} + 1, j = 2, ..., S_{2F} + 1 and k = 2, ..., S_{3F} + 1. This finite difference relationship is said to be explicit because the numerical solution at time t + 1 is obtained explicitly from the numerical solution at previous time points. However, implicit finite differences may also be considered (e.g. Swanson et al 2004). In the above relationship, D(i, j, k) represents the array of non-homogeneous parameters (or coefficients) of diffusion of a tumor inside the brain, which are piecewise constant: D_g and D_w are the diffusion parameters in the grey and white matter respectively, and zero anywhere else in the rectangular domain (e.g. Tracqui et al 1995). As in Murray (2003), the tumor diffuses about 5 times faster in the white than in the grey matter, therefore we choose D_w = 5 D_g.
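The piecewise-constant array D(i, j, k) can be assembled directly from tissue masks. A minimal sketch, assuming hypothetical 0/1 arrays `white` and `grey` standing in for the BrainWeb masks (only the rule D_w in white matter, D_g in grey matter, zero elsewhere, with D_w = 5 D_g, comes from the text):

```python
import numpy as np

def diffusion_array(white, grey, D_g):
    """Piecewise-constant diffusion coefficients: 5*D_g in white matter,
    D_g in grey matter, 0 elsewhere (outside the brain)."""
    D = np.zeros(white.shape)
    D[grey.astype(bool)] = D_g
    D[white.astype(bool)] = 5.0 * D_g  # tumor diffuses ~5x faster in white matter
    return D
```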
Also, ρ is the tumor growth parameter. The unknown parameters D_g and ρ will be the inputs in the computer model and will be discussed below. To maintain numerical stability, a very small time step needs to be chosen; here M_F = 15,003 time steps (15,004 equally spaced time points) are considered. The numerical solution computed on such a fine grid, in a loop with respect to time and in vector format with respect to each spatial dimension, takes about 5 hours and 45 min per run in R on a computer with a 2.6 GHz Intel Core Duo processor and 2GB RAM. Note that BrainWeb allows the user to choose even finer grids, of 1 mm^3 (as in Murray 2003) or 0.5 mm^3 resolution. Such finely spaced grids require a smaller time increment (to maintain numerical stability), which in turn gives more accurate, but even more computationally intensive, numerical solutions.

Figure 1. Anatomically accurate model of the human brain containing white and grey matter, embedded in a rectangular domain and sectioned at the center of the future virtual tumor.
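The explicit update above can be sketched in vectorized form. This is an illustrative stand-in (toy grid and made-up parameter values, not the paper's implementation); it applies the update only at interior points, with the flux differences matching the scheme term by term:

```python
import numpy as np

def step(Y, D, dt, dx, rho):
    """One explicit time step of the growth-diffusion scheme on interior points."""
    out = Y.copy()
    c = (slice(1, -1),) * 3                 # interior points
    rhs = rho * Y[c]                        # growth term
    for axis in range(3):
        up = [slice(1, -1)] * 3; up[axis] = slice(2, None)   # index + 1
        dn = [slice(1, -1)] * 3; dn[axis] = slice(0, -2)     # index - 1
        up, dn = tuple(up), tuple(dn)
        # D(i,j,k)(Y(i+1)-Y(i)) - D(i-1,j,k)(Y(i)-Y(i-1)), divided by dx^2
        rhs = rhs + (D[c] * (Y[up] - Y[c]) - D[dn] * (Y[c] - Y[dn])) / dx**2
    out[c] = Y[c] + dt * rhs
    return out

# toy example: Gaussian initial cell bundle, constant diffusion inside the box,
# dt chosen small enough for stability of the explicit scheme
n = 17
ax = np.arange(n) - n // 2
X, Yg, Z = np.meshgrid(ax, ax, ax, indexing="ij")
Y0 = np.exp(-(X**2 + Yg**2 + Z**2) / 8.0)
D = np.full((n, n, n), 0.05)
Y1 = step(Y0, D, dt=0.1, dx=1.0, rho=0.01)
```

With a positive growth rate and negligible boundary outflow, one step increases the total cell mass on this toy configuration.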

Figure 2. Low grade virtual brain tumor (black contours) corresponding to input (D_g = 0.0003, ρ = 0.002), at four different times separated by 200 days (columns), in transversal (upper row), coronal (middle row) and sagittal (lower row) brain sections.

Figure 3. High grade virtual brain tumor (black contours) corresponding to input (D_g = 0.003, ρ = 0.02), at four different times separated by 200 days (columns), in transversal (upper row), coronal (middle row) and sagittal (lower row) brain sections.

2.2 The input parameters

Computer experiments have been conducted in Murray (2003), especially in two-dimensional coronal sections of the brain, to assess the effect of diffusion (invasiveness) and growth parameters on the evolution of a brain tumor. The input parameter vectors (D_g, ρ) have been constrained to the rectangular input space [0.0003, 0.003] x [0.002, 0.02], as in Murray (2003) p. 572, since values in this region lead to virtual brain tumors in agreement with clinical data. In this paper we consider computer experiments in the same input

region, but the output data sets have three spatial dimensions. Figures 2 and 3 show snapshots of the brain tumor at intervals of 200 days, for two corners of the input space: a low grade tumor (D_g = 0.0003, ρ = 0.002) and a high grade tumor (D_g = 0.003, ρ = 0.02). One can clearly see wide differences among simulated tumors across the input space. Figure 4 shows the input space, along with 15 input vectors (D_g, ρ) (plotted as 'o') sampled according to a maximin Latin hypercube design (Morris et al 1993). This design spreads out the sampled inputs, in the sense that no two sampled inputs are too close. The dynamical 3D brain tumor computer model discussed above is run for each of these sampled inputs and output data sets are obtained. In addition, to validate the prediction method (to be discussed later), 12 more inputs, plotted as '+' in Figure 4, are sampled according to a Latin hypercube design that maximizes the minimum distance among the points themselves and from the 15 inputs of the first design.

2.3 The output data sets

For each of the 15 sampled inputs there are (M_F + 1)(S_{1F} + 2)(S_{2F} + 2)(S_{3F} + 2), i.e. about 13.5 billion, data values, which would give an overall sample size of more than 200 billion. It was necessary to use a fine grid in order to obtain an accurate numerical solution. However, due to the smoothness of the numerical solution (as one can see from Figures 2 and 3), the fine grid data will be retained only on a coarser grid, to facilitate further analysis. Each spatial dimension has been coarsened three times and, for numerical stability, the time dimension nine times, the fine grid data being retained on the resulting coarse grid. Therefore, the coarse grid on which the fine grid data have been retained has M = 1,667 (i.e. 1,668 equally distanced time points) in the time interval [0, L_T] and (S_1 + 2)(S_2 + 2)(S_3 + 2) = 35,557 equally distanced space points in the rectangular spatial domain [0, L_{S1}] x [0, L_{S2}] x [0, L_{S3}].
While an even coarser grid may be considered, one needs to retain enough grid structure to preserve key geometrical properties of the tumor output data set. Denote by Y the fine grid data set retained on the above coarse grid. Now, for each of the 15 sampled inputs, there are 59,309,076 data values, giving a total of 889,636,140 values as the size of the final data set retained for further analysis. Note that the fine grid data Y retained on the coarse grid is not equal to the coarse numerical solution obtained by running the computer model above with coarse grid increments.

Figure 4. The input space, along with D = 15 sampled inputs (D_g, ρ) for analysis ('o') and P = 12 sampled inputs (D_g, ρ) for prediction validation ('+').

The goal in this paper is to develop a computationally efficient prediction method for the fine grid virtual brain tumor data set retained on the coarse grid, at any new input in the input space, therefore avoiding new runs of the computationally intensive dynamical 3D brain tumor model. This will be accomplished by developing a statistical model for the output data set at the D = 15 sampled inputs and then using kriging-type methodology to obtain the prediction at new inputs.
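The maximin Latin hypercube sampling behind the designs of Figure 4 can be sketched by drawing many random Latin hypercubes and keeping the one with the largest minimum pairwise distance. A simple sketch (the candidate-search approach and all tuning values are illustrative, not the construction actually used in the paper):

```python
import numpy as np

def latin_hypercube(n, rng):
    """One random Latin hypercube on [0,1]^2: each coordinate stratified into
    n equal bins, one point per bin, bins matched by random permutation."""
    cols = []
    for _ in range(2):
        cols.append((rng.permutation(n) + rng.random(n)) / n)
    return np.column_stack(cols)

def maximin_lhs(n, lo, hi, n_try=200, seed=0):
    """Best of n_try random LHS designs by the maximin criterion
    (largest minimal pairwise distance), rescaled to the rectangle [lo, hi]."""
    rng = np.random.default_rng(seed)
    best, best_d = None, -np.inf
    for _ in range(n_try):
        X = latin_hypercube(n, rng)
        dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        d = np.min(dists[np.triu_indices(n, k=1)])
        if d > best_d:
            best, best_d = X, d
    return np.asarray(lo) + best * (np.asarray(hi) - np.asarray(lo))

# the paper's input rectangle for (D_g, rho)
design = maximin_lhs(15, lo=[0.0003, 0.002], hi=[0.003, 0.02])
```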

3 Methodology

3.1 The statistical model

Modeling the large output data sets directly by dense covariance matrices is impractical, as demonstrated in Drignei and Morris (2006). Instead, a statistical model that closely follows the iterative finite difference relationship will be used, hence incorporating the output data generating mechanism. For each of the 15 sampled inputs compute

T(t, i, j, k) = Y(t+1, i, j, k) - Y(t, i, j, k) - \Delta_t {
  [ D'(i, j, k)(Y(t, i+1, j, k) - Y(t, i, j, k)) - D'(i-1, j, k)(Y(t, i, j, k) - Y(t, i-1, j, k)) ] / \Delta^2
+ [ D'(i, j, k)(Y(t, i, j+1, k) - Y(t, i, j, k)) - D'(i, j-1, k)(Y(t, i, j, k) - Y(t, i, j-1, k)) ] / \Delta^2
+ [ D'(i, j, k)(Y(t, i, j, k+1) - Y(t, i, j, k)) - D'(i, j, k-1)(Y(t, i, j, k) - Y(t, i, j, k-1)) ] / \Delta^2
+ \rho Y(t, i, j, k) }

for t = 0, ..., M - 1, i = 2, ..., S_1 + 1, j = 2, ..., S_2 + 1, k = 2, ..., S_3 + 1, with \Delta_t = L_T / M, \Delta = L_{Si} / (S_i + 1), i = 1, 2, 3, and D' being D retained on the coarse space grid. These are approximate local truncation errors, similar to those defined for example in Thomas (1995), but rescaled by \Delta_t to simplify the computation of the statistical prediction. Second order stationary distributions are suitable as prior distributions for the approximate local truncation errors, due to theoretical properties showing that these quantities have roughly constant magnitude across space-time, and spatiotemporal averages of approximately zero. Moreover, Taylor series arguments show that local truncation errors are a combination of higher order derivatives of the output Y, and are in general rougher than the outputs themselves. This leads to the assumption that the spatiotemporal correlations of local truncation errors are in general weak, and therefore spatiotemporal independence will be assumed for the approximate local truncation errors. Let T denote the complete vector of the M S_1 S_2 S_3 D approximate local truncation errors. The statistical model assumed is T ~ N(0, \sigma^2 C_D \otimes I_{M S_1 S_2 S_3}), where \otimes represents the Kronecker product.
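Estimating the parameters of this model reduces to a two-parameter optimization. A sketch of evaluating such a profile-likelihood objective, log(sigma2_hat) + log det(C_D)/D, using a precomputed Gram matrix G[i, j] = T_r[:, i]' T_r[:, j] so that the large truncation-error matrix is touched only once (function names and test values are illustrative):

```python
import numpy as np

def corr_matrix(d, theta):
    """Gaussian correlation over the D sampled inputs; d has shape (D, 2),
    columns rescaled to [0, 1], theta = (theta_Dg, theta_rho)."""
    C = np.ones((len(d), len(d)))
    for q in range(d.shape[1]):
        diff = d[:, q][:, None] - d[:, q][None, :]
        C = C * np.exp(-theta[q] * diff**2)
    return C

def objective(theta, d, G, n_per_input):
    """log(sigma2_hat) + log(det(C_D))/D, with
    sigma2_hat = sum_{i,j} C_D^{-1}[i,j] * G[i,j] / (n_per_input * D)."""
    D_inputs = len(d)
    C = corr_matrix(d, theta)
    Cinv = np.linalg.inv(C)
    sigma2 = np.sum(Cinv * G) / (n_per_input * D_inputs)
    _, logdet = np.linalg.slogdet(C)
    return np.log(sigma2) + logdet / D_inputs
```

G is computed once as `T_r.T @ T_r`; each optimizer iteration then costs only O(D^3), independent of the huge per-input sample size.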
Here C_D is the D x D correlation matrix for inputs, with (m, n) element

C_D(m, n) = exp{ -\theta_{Dg} [ d_{Dg}(m) - d_{Dg}(n) ]^2 } exp{ -\theta_{\rho} [ d_{\rho}(m) - d_{\rho}(n) ]^2 },

a product of one-dimensional correlations, where d_{Dg} and d_{\rho} are the coordinates of D_g and ρ respectively, rescaled to the interval [0, 1] for better numerical stability. The maximum likelihood estimates of the parameters \theta_{Dg} and \theta_{\rho} are obtained by minimizing iteratively the function

log(\hat{\sigma}^2) + log(det(C_D)) / D,

where

\hat{\sigma}^2 = T' (C_D^{-1} \otimes I_{M S_1 S_2 S_3}) T / (M S_1 S_2 S_3 D) = (1 / (M S_1 S_2 S_3 D)) \sum_{i,j=1}^{D} (C_D^{-1}[i, j]) T_r[., i]' T_r[., j],

T_r being the vector T reorganized as an (M S_1 S_2 S_3) x D matrix. Since \hat{\sigma}^2 is used in the iterative likelihood optimization and changes its value at each iteration, it is more computationally convenient to use the last form, with the scalars T_r[., i]' T_r[., j], i, j = 1, ..., D, computed before the iterative likelihood optimization begins.

3.2 Prediction

The main goal in this paper is to predict the fine grid virtual brain tumor, retained on the coarse grid, at a new input (D_g^p, \rho^p). The prediction distribution of the approximate local truncation errors at the new input is the conditional multivariate normal distribution with mean

\hat{T}_{p|D} = [ (\hat{C}_p' \hat{C}_D^{-1}) \otimes I_{M S_1 S_2 S_3} ] T = \sum_{i=1}^{D} (\hat{C}_p' \hat{C}_D^{-1})_i T_r[., i],

a weighted sum of the approximate local truncation error vectors at the sampled inputs (the true but unknown statistical parameters have been replaced by their maximum likelihood estimates). The prediction error covariance is \hat{\sigma}^2_{p|D} I_{M S_1 S_2 S_3}, where

\hat{\sigma}^2_{p|D} = \hat{\sigma}^2 ( 1 - \hat{C}_p' \hat{C}_D^{-1} \hat{C}_p ).

Here, \hat{C}_p(n) = exp{ -\hat{\theta}_{Dg} [ d_{D_g^p} - d_{Dg}(n) ]^2 } exp{ -\hat{\theta}_{\rho} [ d_{\rho^p} - d_{\rho}(n) ]^2 }, with n = 1, ..., D. The univariate version of this prediction method is sometimes called simple kriging (e.g. Cressie 1993). Note that kriging prediction is done over the input space, not over the spatial component of the data sets. The output data Y is a linear function of T, which implies that Y also has a normal distribution. From the Appendix, it follows that the point prediction \hat{Y}_{p|D} for the output data Y_p at the new input can be computed iteratively as

\hat{Y}_{p|D}(t+1, i, j, k) = \hat{Y}_{p|D}(t, i, j, k) + \Delta_t {
  [ D'_p(i, j, k)(\hat{Y}_{p|D}(t, i+1, j, k) - \hat{Y}_{p|D}(t, i, j, k)) - D'_p(i-1, j, k)(\hat{Y}_{p|D}(t, i, j, k) - \hat{Y}_{p|D}(t, i-1, j, k)) ] / \Delta^2
+ [ D'_p(i, j, k)(\hat{Y}_{p|D}(t, i, j+1, k) - \hat{Y}_{p|D}(t, i, j, k)) - D'_p(i, j-1, k)(\hat{Y}_{p|D}(t, i, j, k) - \hat{Y}_{p|D}(t, i, j-1, k)) ] / \Delta^2
+ [ D'_p(i, j, k)(\hat{Y}_{p|D}(t, i, j, k+1) - \hat{Y}_{p|D}(t, i, j, k)) - D'_p(i, j, k-1)(\hat{Y}_{p|D}(t, i, j, k) - \hat{Y}_{p|D}(t, i, j, k-1)) ] / \Delta^2
+ \rho^p \hat{Y}_{p|D}(t, i, j, k) } + \hat{T}_{p|D}(t, i, j, k)

for t = 0, ..., M - 1, i = 2, ..., S_1 + 1, j = 2, ..., S_2 + 1 and k = 2, ..., S_3 + 1, starting with the same initial values presented in Section 2.1, retained on the coarsely spaced grid; here D'_p denotes the coarse-grid diffusion array evaluated at D_g^p. While it is algebraically possible to write down the theoretical prediction error covariance for \hat{Y}_{p|D}, it is computationally impractical to do so, as it requires the multiplication of a matrix of size (M S_1 S_2 S_3) x (M S_1 S_2 S_3) = 49,068,145 x 49,068,145 with its transpose (see the Appendix). In practice, it often suffices to obtain individual prediction intervals, which involve only the main diagonal elements of the prediction error covariance. While computationally less demanding than the entire prediction error covariance, this is still too computationally intensive to be practically useful.
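The kriging weights \hat{C}_p' \hat{C}_D^{-1} and the scalar variance factor above are cheap D-dimensional computations. A sketch (all names and values illustrative):

```python
import numpy as np

def corr_vec(d_new, d, theta):
    """Gaussian correlations between a new input and the D sampled inputs."""
    r = np.ones(len(d))
    for q in range(d.shape[1]):
        r = r * np.exp(-theta[q] * (d_new[q] - d[:, q]) ** 2)
    return r

def kriging_weights_and_var(d_new, d, C_D, sigma2, theta):
    c_p = corr_vec(d_new, d, theta)
    w = np.linalg.solve(C_D, c_p)   # C_D^{-1} c_p: weights for the stored T_r columns
    var = sigma2 * (1.0 - c_p @ w)  # pointwise prediction-error variance
    return w, var
```

The predicted truncation-error vector is then `T_hat = T_r @ w`, which drives the iterative point prediction; at a sampled input the weights reduce to a unit vector and the variance to zero, so the surrogate interpolates the stored runs.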
Instead, a more computationally feasible way to obtain individual prediction intervals is by simulation. Consider the following iterative relationship that simulates R realizations \tilde{Y}_{p|D} from the conditional distribution of Y_p given Y:

\tilde{Y}_{p|D}(t+1, i, j, k, r) = \tilde{Y}_{p|D}(t, i, j, k, r) + \Delta_t {
  [ D'_p(i, j, k)(\tilde{Y}_{p|D}(t, i+1, j, k, r) - \tilde{Y}_{p|D}(t, i, j, k, r)) - D'_p(i-1, j, k)(\tilde{Y}_{p|D}(t, i, j, k, r) - \tilde{Y}_{p|D}(t, i-1, j, k, r)) ] / \Delta^2
+ [ D'_p(i, j, k)(\tilde{Y}_{p|D}(t, i, j+1, k, r) - \tilde{Y}_{p|D}(t, i, j, k, r)) - D'_p(i, j-1, k)(\tilde{Y}_{p|D}(t, i, j, k, r) - \tilde{Y}_{p|D}(t, i, j-1, k, r)) ] / \Delta^2
+ [ D'_p(i, j, k)(\tilde{Y}_{p|D}(t, i, j, k+1, r) - \tilde{Y}_{p|D}(t, i, j, k, r)) - D'_p(i, j, k-1)(\tilde{Y}_{p|D}(t, i, j, k, r) - \tilde{Y}_{p|D}(t, i, j, k-1, r)) ] / \Delta^2
+ \rho^p \tilde{Y}_{p|D}(t, i, j, k, r) } + \hat{T}_{p|D}(t, i, j, k) + \hat{\sigma}_{p|D} \epsilon_{t,i,j,k,r}

for t = 0, ..., M - 1, i = 2, ..., S_1 + 1, j = 2, ..., S_2 + 1, k = 2, ..., S_3 + 1, r = 1, ..., R, with \epsilon independent standard normal random variables. The initial values are the same as above. If \tau^2_{0|D}(t, i, j, k) represents the conditional variance of Y_p(t, i, j, k) given Y at each point (t, i, j, k), then

( Y_p(t, i, j, k) - \hat{Y}_{p|D}(t, i, j, k) ) / \tau_{0|D}(t, i, j, k)

is univariate standard normal and conditionally independent of

(1 / \tau^2_{0|D}(t, i, j, k)) \sum_{r=1}^{R} ( \tilde{Y}_{p|D}(t, i, j, k, r) - \hat{Y}_{p|D}(t, i, j, k) )^2,

which has a \chi^2_R distribution. Therefore, conditioned on Y, the distribution of

( Y_p(t, i, j, k) - \hat{Y}_{p|D}(t, i, j, k) ) / \sqrt{ \sum_{r=1}^{R} ( \tilde{Y}_{p|D}(t, i, j, k, r) - \hat{Y}_{p|D}(t, i, j, k) )^2 / R }

is t_R. The individual prediction interval of Y_p(t, i, j, k) is defined as

( \hat{Y}_{p|D}(t, i, j, k) - t_{R, 1-\alpha/2} SE, \hat{Y}_{p|D}(t, i, j, k) + t_{R, 1-\alpha/2} SE ),

where SE = \sqrt{ \sum_{r=1}^{R} ( \tilde{Y}_{p|D}(t, i, j, k, r) - \hat{Y}_{p|D}(t, i, j, k) )^2 / R }. In practice R will typically be small (e.g. R = 2), since a larger R would require more computational effort. The key to the computational efficiency of the prediction method developed in this paper is the parametric model for the predicted output, which permits the use of a relatively small number of simulations \tilde{Y}_{p|D} in conjunction with the point prediction \hat{Y}_{p|D}. The above point prediction (along with the prediction intervals) defines the fast statistical surrogate.

3.3 Validation

The following validation measures for prediction are considered here: the root mean square error

RMSE = \sqrt{ (1 / (P M S_1 S_2 S_3)) \sum_{p,t,i,j,k=1}^{P,M,S_1,S_2,S_3} ( Y_p(t, i, j, k) - \hat{Y}_{p|D}(t, i, j, k) )^2 },

the maximum absolute error

MaxErr = max_{p,t,i,j,k} | Y_p(t, i, j, k) - \hat{Y}_{p|D}(t, i, j, k) |,

and the actual coverage of prediction intervals with a certain nominal coverage (e.g. 95%)

COVER = (1 / (P M S_1 S_2 S_3)) \sum_{p,t,i,j,k=1}^{P,M,S_1,S_2,S_3} \delta[ Y_p(t, i, j, k) \in I_{p,t,i,j,k} ],

where I_{p,t,i,j,k} is the individual prediction interval of Y_p(t, i, j, k) at each point (p, t, i, j, k), and \delta is the indicator function. The coarse numerical solution computed on the coarse grid, denoted by X below, and the prediction of the fine grid output data use comparable computational resources (the computational times are given in the next section). Therefore the coarse numerical solution will be used as a computational competitor for the prediction, in order to compare their accuracy with respect to the validation measures defined above. This coarse numerical solution is faster but less accurate than the fine grid numerical solution, when compared on the coarse grid.
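The three validation measures can be computed directly from flattened arrays of true values, predictions, and interval endpoints. A minimal sketch (argument names illustrative):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error over all (p, t, i, j, k) points."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def max_err(y_true, y_pred):
    """Maximum absolute prediction error."""
    return np.max(np.abs(y_true - y_pred))

def coverage(y_true, lower, upper):
    """Fraction of points whose individual prediction interval contains the truth."""
    return np.mean((y_true >= lower) & (y_true <= upper))
```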
The validation measures RMSE and MaxErr for the coarse numerical solution are defined similarly, with X instead of \hat{Y}_{p|D}. The coarse numerical solution on the coarse grid is computed as

X(t+1, i, j, k) = X(t, i, j, k) + \Delta_t {
  [ D'_p(i, j, k)(X(t, i+1, j, k) - X(t, i, j, k)) - D'_p(i-1, j, k)(X(t, i, j, k) - X(t, i-1, j, k)) ] / \Delta^2
+ [ D'_p(i, j, k)(X(t, i, j+1, k) - X(t, i, j, k)) - D'_p(i, j-1, k)(X(t, i, j, k) - X(t, i, j-1, k)) ] / \Delta^2
+ [ D'_p(i, j, k)(X(t, i, j, k+1) - X(t, i, j, k)) - D'_p(i, j, k-1)(X(t, i, j, k) - X(t, i, j, k-1)) ] / \Delta^2
+ \rho^p X(t, i, j, k) },

t = 0, ..., M - 1, i = 2, ..., S_1 + 1, j = 2, ..., S_2 + 1 and k = 2, ..., S_3 + 1.

4 Analysis of the brain tumor computer experiment

The computer model has been run for the D = 15 input vectors (D_g, ρ) shown in Figure 4 ('o') and the output data Y have been recorded. The approximate local truncation error data T have been obtained as in Section 3.1 and the statistical model presented there has been fitted. The variance parameters will be fixed at their maximum likelihood values \hat{\theta}_{Dg} = 0.77, \hat{\theta}_{\rho} = 6.76 and \hat{\sigma}^2 = 0.4 throughout the rest of the analysis. Figure 5 shows contour plots of the correlations for each input dimension, the correlation of the first input dimension decreasing much more slowly with distance than the correlation of the second input dimension.

To demonstrate the quality of the statistical prediction, the fine grid computer model has been run at the P = 12 new inputs (D_g^p, \rho^p) denoted by '+' in Figure 4, the output data Y retained on the coarse grid have been obtained, and their predicted values, along with the validation measures presented in Section 3.3, have been computed. (For applications where new runs cannot be obtained, one could use cross-validation methods.) To compare the results, the coarse numerical solution X along with the corresponding RMSE and MaxErr measures have also been computed. This coarse numerical solution takes about 2 minutes per run. The computation of \hat{T}_{p|D} takes about 3 minutes. The iterative computation of the point prediction \hat{Y}_{p|D} alone takes about 2 minutes, and the computation of the R = 2 statistical simulations \tilde{Y}_{p|D} takes about 4 minutes. The computational times for the coarse numerical solution X and for the prediction are comparable, given that the computational time of the fine grid numerical solution is measured in hours. However, as one can see from Table 1, the statistical prediction is much more accurate than the coarse numerical solution. (In Table 1, the index p of the accuracy measures stands for statistical prediction and c for coarse numerical solution.) As the number D of sampled input parameters increases, the accuracy of the statistical prediction is expected to increase even further; however, this requires more fine-grid computer model runs and therefore additional computational resources. The individual prediction intervals appear to over-cover the true output values for both the 95% and 99% nominal values, although the over-coverage seems to be smaller for the 99% nominal value. Increasing the number of simulations appears to improve slightly the actual coverage (e.g. for R = 10 the actual coverage was .972 and .920 for the 95% and 99% nominal coverage, respectively).
However, this requires increased computational effort, and in practice one needs to balance the accuracy and the computational efficiency of the statistical surrogate.

Figure 5. Contour plots of the correlation matrices for each input dimension (D_g left panel, ρ right panel).

Table 1. Results. RMSE_p/RMSE_c, MaxErr_p/MaxErr_c, COVER 95% (99%) (.947)

The above RMSE and MaxErr are overall measures with respect to all P = 12 new inputs. In order to assess their performance at individual new inputs, the RMSE and MaxErr measures at each of the P = 12 new inputs were also computed, by taking P = 1 in the formulas presented in Section 3.3. Figure 6 contains boxplots of the differences of the logs of these 12 values, showing that the statistical prediction validation measures RMSE_p and MaxErr_p are smaller than their coarse numerical solution counterparts RMSE_c and MaxErr_c at each of the 12 new inputs, although for one new input, (D_g, ρ) = (0.002, ·), the values of MaxErr_p and MaxErr_c are almost equal. The analysis shows that the statistical prediction has achieved greater accuracy than a computationally comparable numerical solution computed with the original computer model.

Figure 6. Boxplots of the differences of log RMSE (left) and of log MaxErr (right) at the individual P = 12 prediction inputs, for the point prediction and the coarse numerical solution.

5 Conclusions

This paper has presented a computationally efficient method for constructing fast statistical surrogates for a computationally intensive dynamical 3D computer model of brain tumor growth and invasion. The method consisted of sampling a small number of input parameters, running the computer code for each input and obtaining the output data sets. An appropriate statistical model for the output data sets was estimated, and a multidimensional kriging method was used to predict the output data sets at new inputs. The prediction defined the fast statistical surrogates. The statistical model followed closely the finite difference relationship underpinning the computer model. The method was tested at a new set of inputs, and the statistical surrogate was more accurate than a computationally comparable coarse numerical solution generated with the computer model. Ultimately, the interest is in developing efficient methods for early detection and treatment of brain tumors. Mathematical models for surgical resection and chemotherapy treatment of brain tumors have already been developed (Murray 2003, Sections 11.9-11.10). With these models, one can investigate the effects of surgical resection and/or chemotherapy on the subsequent evolution of the brain tumor. However, the resulting dynamical 3D computer models are even more complex and computationally intensive than the computer model discussed in this paper, therefore precluding scientists from conducting large-size computer experiments. The statistical methodology presented in this paper, with appropriate modifications, may be used to reduce the computational effort and increase the size of those experiments.
Appendix

This Appendix derives the prediction distribution for the virtual tumor data set, and outlines some computational issues and ways to overcome them. The output data in vector format at each of the sampled input vectors can be written as Y_i = \mu_i + A_i T_i, following from the iterative linear relationship between Y and T. The matrices A_i are lower triangular. In vector format, Y = \mu + A T, with Y = [Y_1', ..., Y_D']', \mu = [\mu_1', ..., \mu_D']', T = [T_1', ..., T_D']' and A = diag(A_1, ..., A_D) (A a block diagonal matrix). The covariance matrix of Y is cov(Y) = \sigma^2 A (C_D \otimes I_{M S_1 S_2 S_3}) A', with cov(Y_i, Y_j) = \sigma^2 C_D(i, j) A_i A_j'. Consider now a new input (D_g^p, \rho^p). For true but unknown statistical parameters, the covariance between the output data at the new input and the output data at the sampled inputs is

cov(Y_p, Y) = \sigma^2 [ C_p(1) A_p A_1', ..., C_p(D) A_p A_D' ] = \sigma^2 A_p (C_p' \otimes I_{M S_1 S_2 S_3}) A'.

Therefore, the point prediction of Y_p is the conditional mean

Y_{p|D} = \mu_p + cov(Y_p, Y) cov(Y)^{-1} (Y - \mu)
        = \mu_p + A_p (C_p' \otimes I_{M S_1 S_2 S_3}) A' [A']^{-1} (C_D^{-1} \otimes I_{M S_1 S_2 S_3}) A^{-1} A T
        = \mu_p + A_p (C_p' C_D^{-1} \otimes I_{M S_1 S_2 S_3}) T
        = \mu_p + A_p T_{p|D},

which justifies the use of

the iterative relationship to obtain Ŷ_{p|D} in Section 3.2. The prediction covariance matrix is the conditional covariance matrix

cov(Y_p, Y_p) - cov(Y_p, Y) cov(Y)^{-1} cov(Y_p, Y)'
  = σ^2 A_p A_p' - σ^2 A_p (C_p' ⊗ I_{M S_1 S_2 S_3}) A' [A'^{-1} (C_D^{-1} ⊗ I_{M S_1 S_2 S_3}) A^{-1}] A (C_p ⊗ I_{M S_1 S_2 S_3}) A_p'
  = σ^2 A_p A_p' - σ^2 A_p (C_p' C_D^{-1} C_p) A_p'
  = σ^2 (1 - C_p' C_D^{-1} C_p) A_p A_p'.

The computation of the prediction covariance matrix requires the product of two matrices of size (M S_1 S_2 S_3) × (M S_1 S_2 S_3) and is therefore impractical. The computation of theoretical individual prediction intervals, which involves only the main diagonal of the product A_p A_p', is also impractical. A more computationally practical way to obtain prediction intervals is through simulation from the prediction conditional distribution, which can be done iteratively as explained in Section 3.2.

Acknowledgments

This research has been supported in part by an Oakland University Provost's New Faculty Research Fellowship.

References

[1] Bayarri, M.J., Berger, J.O., Paulo, R., Sacks, J., Cafeo, J.A., Cavendish, J., Lin, C.H. and Tu, J. (2007), A Framework for Validation of Computer Models, Technometrics, 49.

[2] Bayarri, M.J., Walsh, D., Berger, J.O., Cafeo, J., Garcia-Donato, G., Liu, F., Palomo, J., Parthasarathy, R.J., Paulo, R. and Sacks, J. (2007), Computer Model Validation with Functional Output, Annals of Statistics, 35.

[3] Collins, D.L., Zijdenbos, A.P., Kollokian, V., Sled, J.G., Kabani, N.J., Holmes, C.J. and Evans, A.C. (1998), Design and Construction of a Realistic Digital Brain Phantom, IEEE Transactions on Medical Imaging, 17.

[4] Cressie, N.A. (1993), Statistics for Spatial Data, Wiley, New York.

[5] Currin, C., Mitchell, T., Morris, M. and Ylvisaker, D. (1991), Bayesian Prediction of Deterministic Functions, with Applications to the Design and Analysis of Computer Experiments, Journal of the American Statistical Association, 86.

[6] Drignei, D. (2006), Empirical Bayesian Analysis for High-Dimensional Computer Output, Technometrics, 48.

[7] Drignei, D. and Morris, M.D.
(2006), Empirical Bayesian Analysis for Computer Experiments Involving Finite-Difference Codes, Journal of the American Statistical Association, 101.

[8] Fang, K.T., Li, R. and Sudjianto, A. (2006), Design and Modeling for Computer Experiments, Chapman and Hall.

[9] Higdon, D., Gattiker, J., Williams, B. and Rightley, M. (2007), Computer Model Validation Using High-Dimensional Outputs, in Bayesian Statistics 8, eds. Bernardo, J., Bayarri, M.J., Dawid, A.P., Berger, J.O., Heckerman, D., Smith, A.F.M. and West, M., Oxford University Press, London.

[10] Johnson, M., Moore, L. and Ylvisaker, D. (1990), Minimax and Maximin Distance Designs, Journal of Statistical Planning and Inference, 26.

[11] Morris, M.D., Mitchell, T.J. and Ylvisaker, D. (1993), Bayesian Design and Analysis of Computer Experiments: Use of Derivatives in Surface Prediction, Technometrics, 35.

[12] Murray, J.D. (2003), Mathematical Biology II: Spatial Models and Biomedical Applications, Springer, New York.

[13] Sacks, J., Welch, W.J., Mitchell, T.J. and Wynn, H.P. (1989), Design and Analysis of Computer Experiments, Statistical Science, 4.

[14] Santner, T.J., Williams, B.J. and Notz, W.I. (2003), The Design and Analysis of Computer Experiments, Springer, New York.

[15] Swanson, K.R., Alvord, E.C. Jr. and Murray, J.D. (2000), A Quantitative Model for Differential Motility of Gliomas in Grey and White Matter, Cell Prolif, 33.

[16] Swanson, K.R., Alvord, E.C. Jr. and Murray, J.D. (2004), Dynamics of a Model for Brain Tumors Reveals a Small Window for Therapeutic Intervention, Discrete and Continuous Dynamical Systems - B, 4.

[17] Thomas, J.W. (1995), Numerical Partial Differential Equations: Finite Difference Methods, Springer, New York.

[18] Tracqui, P., Cruywagen, G.C., Woodward, D.E., Bartoo, G.T., Murray, J.D. and Alvord, E.C. Jr. (1995), A Mathematical Model of Glioma Growth: The Effect of Chemotherapy on Spatio-Temporal Growth, Cell Prolif, 28.


More information

Anale. Seria Informatică. Vol. XIII fasc Annals. Computer Science Series. 13 th Tome 1 st Fasc. 2015

Anale. Seria Informatică. Vol. XIII fasc Annals. Computer Science Series. 13 th Tome 1 st Fasc. 2015 24 CONSTRUCTION OF ORTHOGONAL ARRAY-BASED LATIN HYPERCUBE DESIGNS FOR DETERMINISTIC COMPUTER EXPERIMENTS Kazeem A. Osuolale, Waheed B. Yahya, Babatunde L. Adeleke Department of Statistics, University of

More information

Two Issues in Using Mixtures of Polynomials for Inference in Hybrid Bayesian Networks

Two Issues in Using Mixtures of Polynomials for Inference in Hybrid Bayesian Networks Accepted for publication in: International Journal of Approximate Reasoning, 2012, Two Issues in Using Mixtures of Polynomials for Inference in Hybrid Bayesian

More information

CHAPTER 3 Further properties of splines and B-splines

CHAPTER 3 Further properties of splines and B-splines CHAPTER 3 Further properties of splines and B-splines In Chapter 2 we established some of the most elementary properties of B-splines. In this chapter our focus is on the question What kind of functions

More information

Weighted Least Squares

Weighted Least Squares Weighted Least Squares The standard linear model assumes that Var(ε i ) = σ 2 for i = 1,..., n. As we have seen, however, there are instances where Var(Y X = x i ) = Var(ε i ) = σ2 w i. Here w 1,..., w

More information

Superiorized Inversion of the Radon Transform

Superiorized Inversion of the Radon Transform Superiorized Inversion of the Radon Transform Gabor T. Herman Graduate Center, City University of New York March 28, 2017 The Radon Transform in 2D For a function f of two real variables, a real number

More information

Parameter Estimation in the Spatio-Temporal Mixed Effects Model Analysis of Massive Spatio-Temporal Data Sets

Parameter Estimation in the Spatio-Temporal Mixed Effects Model Analysis of Massive Spatio-Temporal Data Sets Parameter Estimation in the Spatio-Temporal Mixed Effects Model Analysis of Massive Spatio-Temporal Data Sets Matthias Katzfuß Advisor: Dr. Noel Cressie Department of Statistics The Ohio State University

More information

Choosing the Summary Statistics and the Acceptance Rate in Approximate Bayesian Computation

Choosing the Summary Statistics and the Acceptance Rate in Approximate Bayesian Computation Choosing the Summary Statistics and the Acceptance Rate in Approximate Bayesian Computation COMPSTAT 2010 Revised version; August 13, 2010 Michael G.B. Blum 1 Laboratoire TIMC-IMAG, CNRS, UJF Grenoble

More information

1 Mixed effect models and longitudinal data analysis

1 Mixed effect models and longitudinal data analysis 1 Mixed effect models and longitudinal data analysis Mixed effects models provide a flexible approach to any situation where data have a grouping structure which introduces some kind of correlation between

More information

Calibrating Environmental Engineering Models and Uncertainty Analysis

Calibrating Environmental Engineering Models and Uncertainty Analysis Models and Cornell University Oct 14, 2008 Project Team Christine Shoemaker, co-pi, Professor of Civil and works in applied optimization, co-pi Nikolai Blizniouk, PhD student in Operations Research now

More information

Directed acyclic graphs and the use of linear mixed models

Directed acyclic graphs and the use of linear mixed models Directed acyclic graphs and the use of linear mixed models Siem H. Heisterkamp 1,2 1 Groningen Bioinformatics Centre, University of Groningen 2 Biostatistics and Research Decision Sciences (BARDS), MSD,

More information

Single Equation Linear GMM with Serially Correlated Moment Conditions

Single Equation Linear GMM with Serially Correlated Moment Conditions Single Equation Linear GMM with Serially Correlated Moment Conditions Eric Zivot October 28, 2009 Univariate Time Series Let {y t } be an ergodic-stationary time series with E[y t ]=μ and var(y t )

More information

Gaussian Processes 1. Schedule

Gaussian Processes 1. Schedule 1 Schedule 17 Jan: Gaussian processes (Jo Eidsvik) 24 Jan: Hands-on project on Gaussian processes (Team effort, work in groups) 31 Jan: Latent Gaussian models and INLA (Jo Eidsvik) 7 Feb: Hands-on project

More information

The Nature of Geographic Data

The Nature of Geographic Data 4 The Nature of Geographic Data OVERVIEW Elaborates on the spatial is special theme Focuses on how phenomena vary across space and the general nature of geographic variation Describes the main principles

More information

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a

More information

Minimum Error Rate Classification

Minimum Error Rate Classification Minimum Error Rate Classification Dr. K.Vijayarekha Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur-613 401 Table of Contents 1.Minimum Error Rate Classification...

More information

UNIFORMLY MOST POWERFUL CYCLIC PERMUTATION INVARIANT DETECTION FOR DISCRETE-TIME SIGNALS

UNIFORMLY MOST POWERFUL CYCLIC PERMUTATION INVARIANT DETECTION FOR DISCRETE-TIME SIGNALS UNIFORMLY MOST POWERFUL CYCLIC PERMUTATION INVARIANT DETECTION FOR DISCRETE-TIME SIGNALS F. C. Nicolls and G. de Jager Department of Electrical Engineering, University of Cape Town Rondebosch 77, South

More information

Frequentist-Bayesian Model Comparisons: A Simple Example

Frequentist-Bayesian Model Comparisons: A Simple Example Frequentist-Bayesian Model Comparisons: A Simple Example Consider data that consist of a signal y with additive noise: Data vector (N elements): D = y + n The additive noise n has zero mean and diagonal

More information