A Kriging Approach to the Analysis of Climate Model Experiments

Dorin Drignei
Department of Mathematics and Statistics, Oakland University, Rochester, MI 48309, USA

Abstract. A climate model is a computer implementation of a mathematical model for the physical processes underlying the climate. An immediate use of a climate model is performing climate model experiments, where uncertain input quantities, such as greenhouse gas and aerosol concentrations, are systematically varied in order to understand their effects on the climate system. Climate models, however, are computationally intensive, so only small experiments can be conducted. This paper presents a multidimensional kriging method to predict climate model variables at new inputs, based on the experimental data available. The method is particularly suitable for situations where the climate model data sets share a common pattern across the input space, such as surface temperatures that are lower at the Poles, higher at the Equator and possibly increasing over time. The results demonstrate the potential of the kriging methodology presented in this paper as an exploratory tool in climate science.

Keywords: Computer experiments; Equilibrium climate sensitivity; Nonstationary models; Surface temperature data.

1 Introduction

The study of climate has become increasingly important due to its effects on the planet's environmental and ecological systems. Most often, the climate is defined in terms of weather variables, such as temperature, precipitation and wind, averaged over a time span (e.g. 30 years, as defined by the World Meteorological Organization) and over the whole Earth or a region. Numerical models that simulate the climate based on physical processes are important tools used by scientists in understanding the climate system. A typical climate model, termed an atmosphere-ocean general circulation model (AOGCM), is a complex computer code including atmosphere, ocean, land and ice components.
Such a model may be used for experimentation, where uncertain input quantities (or inputs) are varied systematically in order to study their effects on the climate system. For example, the Third and Fourth Assessment Reports (TAR and AR4, respectively) of the Intergovernmental Panel on Climate Change (IPCC) discuss climate projections in terms of temperature, sea level change and precipitation over time and space, resulting from climate model experiments under various hypothetical future emission scenarios, driven by demographic, technological and economic factors. While this is perhaps one of the most widely publicized examples of climate model experiments in recent years, many other examples are discussed in the climate literature. Sokolov and Stone (1998), for instance, proposed a simplified climate model and varied climate model parameters such as climate sensitivity and the rate of heat uptake by the deep ocean in order to study their effect on modeled temperature and sea level change over space and/or time. Climate models are in general computationally intensive, with each run (corresponding to one input) taking hours or even days on high-performance computers. Therefore, only a small number of climate model runs, corresponding to a small set of inputs, can be performed in climate model experiments. Such computational constraints have limited, for example, the size of the experiments considered in TAR and AR4. If the possible effect of a new, hypothetical greenhouse gas emission scenario on the time series of future global mean surface temperatures needs to be investigated, new climate model run(s) must be performed.

This paper develops a computationally efficient statistical method to explore the experimental (or input) space by predicting variables of the climate system at any new input, thereby avoiding new runs with the computationally intensive climate model. The method is illustrated with a set of model runs whose inputs are parameters associated with a simplified climate model. More precisely, a limited number of inputs are carefully selected in the input space and the output from the corresponding slow climate model runs is obtained. Then an adequate statistical model for the output data set is postulated and kriging-type methods are used to interpolate statistically across the input space, thereby predicting the climate model output at new inputs.

This paper extends in several directions a univariate statistical methodology called design and analysis of computer experiments, described, among others, by Sacks et al (1989), Currin et al (1991), Santner et al (2003) and Fang et al (2006). Many computer models output time series or space-time data sets for each input. However, significant research on the analysis of computer experiments with multidimensional output has been advanced only recently (Fang et al 2006, Chapter 7). Bayarri et al (2007) and Higdon et al (2007) used basis representations for multidimensional output in a Bayesian framework and in the context of computer model calibration. Drignei (2006) proposed a computationally fast statistical model as a surrogate for a geophysical ocean model with high dimensional output.
Drignei and Morris (2006) pointed out possible computational difficulties with likelihood optimization for large, multidimensional output data sets and suggested a statistical model underpinned by the output data generating mechanism. This paper proposes computationally efficient statistical models for multidimensional output data sets. The statistical models are particularly suitable for situations where the output data sets share a common pattern across the input space, such as model surface temperatures that are low at the Poles, high at the Equator and possibly increasing over time.

The paper is organized as follows. Section 2 presents an experiment with the MIT 2D climate model and discusses the inputs and the output data set. Section 3 outlines the development of the statistical models and Section 4 describes the kriging prediction and model validation. This methodology is then applied in Section 5 to analyze the experiment with the MIT 2D climate model. Some concluding remarks are presented in Section 6.

2 A climate model experiment

There is a suite of climate models in use today. The most realistic include three spatial dimensions (3D) but these are also very computationally intensive. For example, the AOGCM developed at the National Center for Atmospheric Research, called the Community Climate System Model (CCSM), requires weeks of computational time on a massively parallel supercomputer to simulate 50 model years. Due to the tremendous computational resources required, these models are only suitable for very small experiments. Simpler climate models that reproduce the large scale behavior of 3D AOGCMs may be more appropriate for experimentation. One such climate model is used in this paper: a two-dimensional (latitude and vertical) atmospheric model coupled with a diffusive ocean model, developed at the Massachusetts Institute of Technology (MIT) Joint Program on the Science and Policy of Global Change (Sokolov and Stone, 1998).
This two-dimensional climate model, called the MIT 2D climate model, reproduces many of the nonlinear interactions occurring in simulations with 3D AOGCMs while requiring far fewer computational resources. The MIT 2D climate model currently runs at about 4 hours of computational time per 50 model years on a 3 GHz Pentium 4 Linux workstation. Technical details about the MIT 2D model can be found in Sokolov and Stone (1998), who showed, for example, that there is wide disagreement among more complex coupled AOGCMs on the rate of heat uptake by the ocean (with corresponding uncertainty in surface warming), and that the MIT 2D model can match their transient behavior if appropriate values for the deep ocean diffusion coefficients are chosen. Stone and Yao (1987, 1990) and Yao and Stone (1987) also provide technical details. For the purpose of this research, the MIT 2D climate model is considered a black-box model, in which the inputs are uncertain parameters and the resulting output data sets are recorded over space and time. Therefore, no information about the physics underlying the MIT 2D model is included in the statistical methodology.

2.1 The input parameters

Climate models involve a number of parameters that are a priori unknown, but here we focus on three of them. An important and yet uncertain parameter in climatology is the equilibrium climate sensitivity, $S$, defined as the equilibrium global mean temperature response to a doubling of CO$_2$ from preindustrial levels. To predict the climate, one must also know how quickly the oceans will equilibrate to additional warming. The rate of warming is governed by how quickly the oceans can mix excess heat into the deeper layers. In 3D AOGCMs, multiple processes affect the net mixing of heat into the deep ocean. In simpler models, such as the MIT 2D climate model, these processes can be set by a single parameter, the rate of heat uptake by the deep ocean, $K_v$. The third uncertain parameter considered here is the strength of the anthropogenic aerosol forcing, $F_{aer}$. These three parameters are collectively denoted by $\theta = [S, K_v, F_{aer}]$, which defines the input vector.
$D = 20$ inputs $\theta_i$ are sampled in the input space $[0.4, 10.5] \times [0.40, 12.65] \times [-1.55, 0]$ according to a maximin distance design (Johnson et al 1990) from a large list of inputs of interest from a climatological point of view. A maximin distance design ensures that the sampled inputs cover the input space and are spread out, in the sense that no two sampled inputs are too close. Additionally, $P = 5$ more inputs are chosen for prediction validation purposes. These inputs are given in Table 1. A similar input space has been considered by Forest et al (2002), who used observational records in combination with output data sets to calibrate the MIT 2D climate model.

Table 1. The design inputs (left four columns) and the validation inputs (fifth column); each column lists $(S, K_v, F_{aer})$ triples.

2.2 The output data sets

Among the output variables, probably the most popular is the Earth's surface temperature, which will be considered in this paper. For each input, the model surface temperature analyzed here is a matrix of size $24 \times 56$, corresponding to 24 latitudes and 56 years (the time interval ).
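Returning to the maximin distance design of Section 2.1, the following is a minimal sketch of how 20 well-separated inputs could be selected from a candidate list. It uses a greedy farthest-point heuristic, which is an assumption of this sketch and not necessarily the algorithm of Johnson et al (1990); the random candidate list and the function names `rescale`, `greedy_maximin` and `min_pairwise_distance` are likewise hypothetical.

```python
import math
import random


def rescale(points, lows, highs):
    """Map each coordinate to [0, 1] so that no parameter dominates the distance."""
    return [
        tuple((x - lo) / (hi - lo) for x, lo, hi in zip(p, lows, highs))
        for p in points
    ]


def greedy_maximin(candidates, n_design, start=0):
    """Select n_design candidate indices by farthest-point selection.

    This is a heuristic for maximin distance designs, not an exact optimizer.
    """
    chosen = [start]
    # Distance from every candidate to its nearest already-chosen point.
    nearest = [math.dist(c, candidates[start]) for c in candidates]
    while len(chosen) < n_design:
        nxt = max(range(len(candidates)), key=nearest.__getitem__)
        chosen.append(nxt)
        nearest = [min(d, math.dist(c, candidates[nxt]))
                   for d, c in zip(nearest, candidates)]
    return chosen


def min_pairwise_distance(points):
    """Smallest distance between any two points (the maximin criterion)."""
    return min(math.dist(p, q)
               for i, p in enumerate(points) for q in points[i + 1:])


# Hypothetical candidate list: 500 random inputs (S, K_v, F_aer) in the box
# [0.4, 10.5] x [0.40, 12.65] x [-1.55, 0] quoted in the text.
lows, highs = (0.4, 0.40, -1.55), (10.5, 12.65, 0.0)
rng = random.Random(0)
raw = [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
       for _ in range(500)]
unit = rescale(raw, lows, highs)
design_idx = greedy_maximin(unit, n_design=20)
design = [raw[i] for i in design_idx]
```

Distances are computed on inputs rescaled to [0, 1], so that the three parameters contribute on a comparable scale; this mirrors the rescaling used later for the input correlation.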

Also available are four replicates for each of these surface temperature data sets, called ensemble members, obtained by changing the initial conditions in the climate model. This is a standard method for generating realizations of the same climate system. This paper will follow the common approach in climatology of analyzing the means over the ensemble members, called ensemble means. The ensemble variability will also be accounted for in the statistical model. Unless otherwise specified in the rest of the paper, the climate model surface temperatures are the ensemble means.

Figure 1 shows key features of the model surface temperature data set for one of the $D = 20$ sampled inputs, $\theta = (10.5, 0.40, -0.26)$. The left panel shows the surface temperature across the 24 latitudes for the first of the 56 years, with lower temperatures at the Poles and higher temperatures at the Equator. While the MIT 2D model and the data sets considered here do not have a longitude component, it is instructive to draw the left plot with a fictitious longitude dimension in order to better see this temperature pattern across the Earth. The right panel shows the time series of surface temperature at the Equator, which may have an increasing trend. The patterns noted in the two plots (quadratic in latitude and linear in time) are representative of the temperature data sets at all other inputs, and this characteristic will be exploited in the next section in the development of the statistical models.

The goal in this paper is to predict the model surface temperature at any new input in the input space, thereby avoiding new runs with the computationally intensive climate model. This will be accomplished by developing a statistical model for the output data set at the $D = 20$ sampled inputs and then using kriging methodology to predict output data at new inputs.

Figure 1. Model surface temperatures at sampled input $\theta = (10.5, 0.40, -0.26)$. Left: Temperatures across latitude at year 1 (temperature is constant across longitude). Right: Time series of temperatures at the Equator.

3 Statistical models

For many computer models there is a common output pattern across the input space. For example, Bayarri et al (2007) discuss an engineering application in which the output time series of model load data at each input appear to be strikingly similar. In another engineering application, Fang et al (2006), Chapter 7, show plots of log(engine noise) curves for each sampled input, which again appear very similar in shape. In the current climate application, for each sampled input, the surface temperatures are higher at the Equator and lower at the Poles, and an increasing temporal trend appears to be present. Therefore, a statistical model with input-free mean seems reasonable and is intended to capture the general features common to all inputs, whereas the input-to-input variation is modeled by the covariance matrix, which will be assumed separable in the input, space and temporal dimensions.

As will be pointed out later in this section, the specification of a general statistical model with unstructured, input-dependent mean and/or non-separable covariance has computational drawbacks for larger data sets.

3.1 The mean

The climate model surface temperature data set can be organized as an array of dimension $N_L \times N_T \times D$ with $N_L = 24$, $N_T = 56$, $D = 20$. To better describe the statistical model, this three-dimensional data set is stacked as a vector of length $N_L N_T D$ and denoted by $Y$. A multivariate normal distribution is assumed for $Y$, with general mean vector $\mu(Y) = 1_D \otimes \nu$ and covariance matrix $\Gamma$. Under the assumption of separability, the covariance matrix can be written as a Kronecker product of smaller covariance matrices, $\Gamma = \Omega_\Theta \otimes C_T \otimes C_L$, reflecting the various dimensions of the data set (inputs, time and latitudes), so that $\Gamma$ has elements $\Omega_\Theta(i_1, i_2)\, C_T(j_1, j_2)\, C_L(k_1, k_2)$. Therefore, here separability refers to the multiplicative decomposition of the covariance into purely spatial, temporal and input components.

The maximum likelihood estimator of $\nu$ has the relatively simple analytical formula

$$\hat{\nu} = \frac{\sum_{i,j=1}^{D} (\Omega_\Theta^{-1})_{i,j}\, Y^{r}_{\cdot,i}}{\sum_{i,j=1}^{D} (\Omega_\Theta^{-1})_{i,j}},$$

a weighted average of output data over the sampled inputs, where $Y^{r}_{\cdot,\cdot}$ is the output data vector $Y$ reorganized as an $N_L N_T \times D$ matrix. The above statistical model with general mean, however, cannot answer all the climatologically important statistical questions. For example, is there a statistically significant increasing temporal trend in the modeled surface temperature data set? For applications where regression variables are available, an alternative model $\nu = X\beta$ may be considered, leading to $\mu(Y) = 1_D \otimes X\beta$. The maximum likelihood estimator of $\beta$ is

$$\hat{\beta} = [X'(C_T^{-1} \otimes C_L^{-1})X]^{-1} X'(C_T^{-1} \otimes C_L^{-1})\, \frac{\sum_{i,j=1}^{D} (\Omega_\Theta^{-1})_{i,j}\, Y^{r}_{\cdot,i}}{\sum_{i,j=1}^{D} (\Omega_\Theta^{-1})_{i,j}}.$$

The estimators $\hat{\beta}$, $\hat{\nu}$ and their variances are derived in Appendix A.
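The two estimators above only require the small matrices $\Omega_\Theta$, $C_T$ and $C_L$, never the full $\Gamma$. The sketch below illustrates this on toy dimensions; the function names, the toy sizes and the exponential correlations used to build the example matrices are assumptions made for illustration, not the paper's fitted model.

```python
import numpy as np


def input_weights(Omega):
    """Weights w_j = sum_i (Omega^-1)_{ij} / sum_{ij} (Omega^-1)_{ij}."""
    w = np.linalg.solve(Omega, np.ones(Omega.shape[0]))  # Omega^-1 1_D
    return w / w.sum()


def general_mean_mle(Y_r, Omega):
    """nu_hat: weighted average of the columns of Y_r (shape N_L*N_T x D)."""
    return Y_r @ input_weights(Omega)


def regression_mle(Y_r, Omega, C_T, C_L, X):
    """beta_hat: generalized least squares fit of the weighted average on X."""
    C = np.kron(C_T, C_L)
    y_bar = Y_r @ input_weights(Omega)
    CiX = np.linalg.solve(C, X)                    # C^-1 X
    return np.linalg.solve(X.T @ CiX, CiX.T @ y_bar)


def exp_corr(n, eta):
    """Toy exponential correlation on n equally spaced points in [0, 1]."""
    l = np.linspace(0.0, 1.0, n)
    return np.exp(-eta * np.abs(l[:, None] - l[None, :]))


# Toy sizes (the paper uses D = 20, N_T = 56, N_L = 24).
rng = np.random.default_rng(0)
D, N_T, N_L = 5, 4, 3
Omega = 2.0 * exp_corr(D, 3.0) + 0.1 * np.eye(D)
C_T, C_L = exp_corr(N_T, 1.0), exp_corr(N_L, 2.0)
Y = rng.normal(size=D * N_T * N_L)       # stacked: input slowest, latitude fastest
Y_r = Y.reshape(D, N_T * N_L).T          # one column per input
nu_hat = general_mean_mle(Y_r, Omega)
```

The stacking order (input index slowest, latitude fastest) is chosen to match the Kronecker ordering $\Omega_\Theta \otimes C_T \otimes C_L$.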
The general mean model may be more flexible than the regression model, but it could have a larger number of mean parameters. The two models will be compared by testing their prediction capabilities on new climate model output data. In this application a polynomial regression with second order latitude terms, a linear temporal term and their interactions will be considered, so that $X = [1, L, L^2, T, LT, L^2 T]$, with $L = 1_{N_T} \otimes (1{:}N_L)'$ and $T = (1{:}N_T)' \otimes 1_{N_L}$. In order to investigate whether an input-dependent mean leads to an improved fit, an extended regression has been considered, with the regression matrix having the structured form

$$X_e = [\, 1_D \otimes X \quad U \otimes 1_{N_T N_L} \,],$$

where the matrix $X$ is given in the polynomial regression above and the $D \times 3$ matrix $U$ is given by $U_{i,\cdot} = [\theta_{i,1}, \theta_{i,2}, \theta_{i,3}]$, $i = 1, \ldots, D$. Such a partitioned structure of the extended regression matrix leads to computationally efficient formulas for the generalized least squares estimates of the regression coefficients and their covariance matrix.

3.2 The covariance matrix

While the data analyzed in this paper are the ensemble means, their model actually originates in a model for the ensemble members. This relationship is explained in Appendix B, with emphasis on the covariance structure. The covariance matrix for the ensemble means is $\Gamma = \Omega_\Theta \otimes C_T \otimes C_L$. The covariance of inputs is given by $\Omega_\Theta = \sigma^2 C_\Theta + \tau^2 I$, reflecting the decomposition of the model surface temperatures into a climate signal and ensemble noise. An unbiased estimate of the parameter $\tau^2$ (see Appendix B) is $\hat{\tau}^2 =$ . In an effort to reduce the computational burden for likelihood optimization, $\tau^2$ will be fixed at its estimated value. The matrix $C_\Theta$ describes a smooth input correlation in the climate signal and is given by

$$C_\Theta(i, j) = \exp\big(-\eta_1 (l_S(\theta_i) - l_S(\theta_j))^2 - \eta_2 (l_K(\theta_i) - l_K(\theta_j))^2 - \eta_3 (l_F(\theta_i) - l_F(\theta_j))^2\big), \quad i, j = 1, \ldots, D,$$

where $l_S$, $l_K$, $l_F$ are the coordinates of the inputs $\theta$, rescaled to $[0, 1]$. This correlation is commonly used in the computer experiments literature to describe smooth dependence across the input space (e.g. Sacks et al 1989, Santner et al 2003, Fang et al 2006). The temporal correlation matrix $C_T$ considered here has elements

$$C_T(i, j) = \exp(-\eta_T |l_T(i) - l_T(j)|), \quad i, j = 1, \ldots, N_T,$$

and the latitude correlation matrix $C_L$ has elements

$$C_L(i, j) = \exp(-\eta_L |l_L(i) - l_L(j)|), \quad i, j = 1, \ldots, N_L,$$

where $l_T$ and $l_L$ are the time and latitude coordinates, rescaled to $[0, 1]$ for better numerical stability. The more general power exponential correlation (e.g. Sacks et al 1989) may be considered for all dimensions, but the likelihood optimization is then more computationally intensive since it would include five additional unknown statistical parameters.

The correlations, however, need not be stationary. There are examples in the climate literature (e.g. Rauthe et al 2004, among others) where a nonparametric and non-stationary covariance for the spatial dimension is estimated from control climate model runs. In order to study the sensitivity of the proposed statistical model to the covariance stationarity assumption, here a parametric non-stationary correlation (e.g.
Hughes-Olivier et al 1998, Schabenberger and Gotway 2005, p. 422) for latitudes is fitted from the data described in Section 2,

$$C_L(i, j) = \exp\big(-\eta_{s1} |l_L(i) - l_L(j)| \exp(\eta_{s2} |c_i - c_j| + \eta_{s3} \min(c_i, c_j))\big), \quad i, j = 1, \ldots, N_L,$$

including a point source at location $c$, where $c_i = |l_L(i) - c|$. Here the Equator will be considered a temperature point source (due to various circulation and transfer mechanisms) and therefore $c = 0.5$ is chosen.

3.3 Likelihood optimization

The estimated values of the parameters appearing in the covariance matrix $\Gamma$ for the general mean model are obtained by minimizing the function $-2 \log(\text{Likelihood}) / (D N_T N_L)$ given by (ignoring some constants)

$$\frac{\log\det(\Omega_\Theta)}{D} + \frac{\log\det(C_T)}{N_T} + \frac{\log\det(C_L)}{N_L} + \frac{(Y - 1_D \otimes \hat{\nu})' \Gamma^{-1} (Y - 1_D \otimes \hat{\nu})}{D N_T N_L}.$$

For the regression model, the covariance parameter estimates are obtained similarly, with $X\hat{\beta}$ instead of $\hat{\nu}$, whereas for the extended regression model $1_D \otimes \hat{\nu}$ is replaced with $X_e \hat{\beta}$. The nonlinear likelihood optimization is done iteratively after some starting values are chosen. At each iteration an updated value of the maximum likelihood estimator $\hat{\nu}$ (or $\hat{\beta}$) is used. The statistical parameters will then be fixed at their final values throughout the rest of the statistical analysis. Besides naturally occurring in many applications, the two models for the mean described above also have computational advantages, due mainly to the simplicity of the analytical formulas for $\hat{\nu}$ and $\hat{\beta}$. When comparing these two models, note that $\hat{\nu}$ does not include $C_T^{-1} \otimes C_L^{-1}$, whereas $\hat{\beta}$ does, which leads to further computational savings when using the general mean model. While the maximum likelihood approach is perhaps the most popular method for parameter estimation in computer experiments, other estimation methods could be used, such as penalized likelihood (Li and Sudjianto, 2005), cross-validation, REML or posterior mode (e.g. Santner et al 2003, section 3.3.2).

There are computational difficulties with general statistical models having unstructured input-dependent regression variables (Drignei and Morris 2006), such as coarser numerical solutions of the same climate model or output from simpler but faster climate models. For example, for the data set considered here, the likelihood of a statistical model with six unstructured, input-dependent regression variables would be about 25 times more computationally intensive than the likelihood of the regression model with six input-free variables. This computational time increases even further if, in addition, the covariance is unstructured and non-separable (e.g. Genton, 2007). The differences among the computational times could become substantial when the objective functions are optimized nonlinearly and a large number of function evaluations are required. These differences would be magnified even further for larger data sets including all three spatial dimensions, a possibly longer time interval and/or a larger set of statistical parameters.

4 Prediction at new inputs and validation

The statistical model described above is used to predict the climate model output at an arbitrary set $\Pi$ of $P$ new inputs in the input space. Let $Y_\Pi$ be the corresponding climate model output reorganized in a vector of length $N_L N_T P$.
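As an illustration of the likelihood criterion of Section 3.3, the following sketch evaluates it for the general mean model with stationary correlations, using only the small matrices and never forming $\Gamma$. The function names, the equally spaced time and latitude coordinates, and the toy data are assumptions of this sketch; explicit inverses are used for brevity, which is acceptable for matrices of the sizes considered here.

```python
import numpy as np


def gaussian_corr(U, eta):
    """Input correlation exp(-sum_k eta_k (u_ik - u_jk)^2); U rescaled to [0, 1]."""
    diff = U[:, None, :] - U[None, :, :]
    return np.exp(-np.sum(eta * diff ** 2, axis=-1))


def exp_corr(n, eta):
    """Exponential correlation exp(-eta |l_i - l_j|) on n equally spaced
    coordinates in [0, 1] (equal spacing is an assumption of this sketch)."""
    l = np.linspace(0.0, 1.0, n)
    return np.exp(-eta * np.abs(l[:, None] - l[None, :]))


def scaled_neg2_loglik(Y, U, eta_in, eta_T, eta_L, sigma2, tau2):
    """-2 log-likelihood / (D N_T N_L), up to constants, for the general
    mean model. Y has shape (D, N_T, N_L)."""
    D, N_T, N_L = Y.shape
    Omega = sigma2 * gaussian_corr(U, eta_in) + tau2 * np.eye(D)
    C_T, C_L = exp_corr(N_T, eta_T), exp_corr(N_L, eta_L)

    # Plug in the mean estimate: nu_hat is a weighted average over inputs.
    w = np.linalg.solve(Omega, np.ones(D))
    nu_hat = np.tensordot(w / w.sum(), Y, axes=(0, 0))
    R = Y - nu_hat

    # Quadratic form R' (Omega^-1 x C_T^-1 x C_L^-1) R, one axis at a time.
    Oi, Ti, Li = (np.linalg.inv(M) for M in (Omega, C_T, C_L))
    quad = np.einsum("ijk,ia,jb,kc,abc->", R, Oi, Ti, Li, R, optimize=True)

    logdet = lambda M: np.linalg.slogdet(M)[1]
    return (logdet(Omega) / D + logdet(C_T) / N_T + logdet(C_L) / N_L
            + quad / (D * N_T * N_L))


# Toy data standing in for the 20 x 56 x 24 array of ensemble means.
rng = np.random.default_rng(0)
D, N_T, N_L = 4, 5, 3
U = rng.random((D, 3))
Y = rng.normal(size=(D, N_T, N_L))
eta_in, eta_T, eta_L = np.array([2.0, 1.0, 0.5]), 1.5, 3.0
sigma2, tau2 = 1.0, 0.05
value = scaled_neg2_loglik(Y, U, eta_in, eta_T, eta_L, sigma2, tau2)
```

In practice a function of this kind would be handed to a general-purpose nonlinear optimizer over the covariance parameters, with $\tau^2$ held fixed at its estimate as described in Section 3.2.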
Two versions of kriging will be considered: simple and universal. These lead to the same computed point prediction, but the prediction covariance matrices differ, with universal kriging having an extra term depending on the regression variables. The prediction distribution is multivariate normal with mean vector

$$\tilde{Y}_\Pi = 1_P \otimes \hat{\nu} + \Gamma_{\Pi\Theta} \Gamma^{-1} (Y - 1_D \otimes \hat{\nu}) = 1_P \otimes \hat{\nu} + \{[(\sigma^2 C_{\Pi\Theta})(\sigma^2 C_\Theta + \tau^2 I_D)^{-1}] \otimes I_{N_L N_T}\}(Y - 1_D \otimes \hat{\nu})$$

(replace $\hat{\nu}$ with $X\hat{\beta}$ for the input-free regression model). The simple kriging prediction covariance matrix is

$$V_\Pi^{s} = \Gamma_\Pi - \Gamma_{\Pi\Theta} \Gamma^{-1} \Gamma_{\Pi\Theta}' = [(\sigma^2 C_\Pi + \tau^2 I_P) - (\sigma^2 C_{\Pi\Theta})(\sigma^2 C_\Theta + \tau^2 I_D)^{-1}(\sigma^2 C_{\Pi\Theta})'] \otimes C_T \otimes C_L,$$

where $\Gamma_\Pi = (\sigma^2 C_\Pi + \tau^2 I_P) \otimes C_T \otimes C_L$ and $\Gamma_{\Pi\Theta} = \sigma^2 (C_{\Pi\Theta} \otimes C_T \otimes C_L)$. Here

$$C_\Pi(i, j) = \exp\big(-\eta_1 (l_S(\pi_i) - l_S(\pi_j))^2 - \eta_2 (l_K(\pi_i) - l_K(\pi_j))^2 - \eta_3 (l_F(\pi_i) - l_F(\pi_j))^2\big), \quad i, j = 1, \ldots, P,$$

where $l_S$, $l_K$, $l_F$ are the coordinates of the new inputs $\pi$, rescaled to $[0, 1]$, and

$$C_{\Pi\Theta}(i, j) = \exp\big(-\eta_1 (l_S(\pi_i) - l_S(\theta_j))^2 - \eta_2 (l_K(\pi_i) - l_K(\theta_j))^2 - \eta_3 (l_F(\pi_i) - l_F(\theta_j))^2\big)$$

for $i = 1, \ldots, P$, $j = 1, \ldots, D$, so that $C_{\Pi\Theta}$ is a $P \times D$ matrix. The formula (5.34) in Schabenberger and Gotway (2005) for the universal kriging prediction covariance matrix becomes

$$V_\Pi^{u} = V_\Pi^{s} + [1_P \otimes X - \Gamma_{\Pi\Theta} \Gamma^{-1} (1_D \otimes X)]\,[(1_D \otimes X)' \Gamma^{-1} (1_D \otimes X)]^{-1}\,[1_P \otimes X - \Gamma_{\Pi\Theta} \Gamma^{-1} (1_D \otimes X)]'$$

$$= [(\sigma^2 C_\Pi + \tau^2 I_P) - (\sigma^2 C_{\Pi\Theta})(\sigma^2 C_\Theta + \tau^2 I_D)^{-1}(\sigma^2 C_{\Pi\Theta})'] \otimes C_T \otimes C_L + E \otimes \{X [X'(C_T^{-1} \otimes C_L^{-1}) X]^{-1} X'\}$$

for the input-free regression model, where

$$E = \frac{1}{\sum_{i,j=1}^{D} (\Omega_\Theta^{-1})_{i,j}} \Big\{1_P - \sum_{j=1}^{D} [(\sigma^2 C_{\Pi\Theta})(\sigma^2 C_\Theta + \tau^2 I_D)^{-1}]_{\cdot,j}\Big\} \Big\{1_P - \sum_{j=1}^{D} [(\sigma^2 C_{\Pi\Theta})(\sigma^2 C_\Theta + \tau^2 I_D)^{-1}]_{\cdot,j}\Big\}'.$$

Setting $X = I_{N_T N_L}$ in the previous formula, one obtains

$$V_\Pi^{u} = [(\sigma^2 C_\Pi + \tau^2 I_P) - (\sigma^2 C_{\Pi\Theta})(\sigma^2 C_\Theta + \tau^2 I_D)^{-1}(\sigma^2 C_{\Pi\Theta})' + E] \otimes C_T \otimes C_L,$$

the universal kriging covariance matrix for the input-free general mean model. To avoid possible confusion, one should note that while the data sets considered here have a spatial component (the latitudes), the kriging prediction is in fact done over the input space.

The measures of validation are based on the output data and their predictions at a set $\Pi$ of $P$ new inputs. These measures are the root mean square error

$$RMSE = \sqrt{\frac{1}{N_L N_T P} \sum_{i,t,p=1}^{N_L, N_T, P} (Y_\Pi - \tilde{Y}_\Pi)^2_{i,t,p}},$$

the maximum absolute value of the prediction error

$$MaxErr = \max_{i,t,p} |(Y_\Pi - \tilde{Y}_\Pi)_{i,t,p}|,$$

and the actual coverage of prediction intervals with a nominal coverage (e.g. 95%)

$$COVER = \frac{1}{N_L N_T P} \sum_{i,t,p=1}^{N_L, N_T, P} \delta[(Y_\Pi)_{i,t,p} \in (INT)_{i,t,p}],$$

where $(INT)_{i,t,p}$ is the prediction interval of $(Y_\Pi)_{i,t,p}$ at each point $(i, t, p)$, and $\delta$ is the indicator function. Here we distinguish between prediction intervals based on simple or universal kriging. In addition, residual analysis based on the standardized prediction residuals $V_\Pi^{-1/2} (Y_\Pi - \tilde{Y}_\Pi)$ can be performed separately for simple and universal kriging to check the normality of the prediction distribution. Ideally, output data from at least a few model runs should be used for validation purposes. However, if this is not possible, cross-validation methods may be used instead.

5 Analysis of MIT 2D climate model experiment

This section presents results for the MIT 2D climate model experiment discussed in Section 2, based on the methodology presented in Sections 3 and 4.
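The prediction and validation steps of Section 4 can be sketched as follows for the general mean model with simple kriging intervals. Because the space-time factor of the prediction weights is the identity, only $D \times D$ and $P \times D$ matrices are needed. The function names, the toy stand-in for the climate model (`toy_model`) and the covariance parameter values are hypothetical and chosen only to make the example runnable.

```python
import numpy as np


def gaussian_cross_corr(A, B, eta):
    """exp(-sum_k eta_k (a_ik - b_jk)^2) for rescaled inputs A (n, 3), B (m, 3)."""
    diff = A[:, None, :] - B[None, :, :]
    return np.exp(-np.sum(eta * diff ** 2, axis=-1))


def krige(Y, U, U_new, eta, sigma2, tau2):
    """Y: (D, N_T, N_L) outputs at design inputs U (D, 3); U_new: (P, 3).
    Returns the prediction mean (P, N_T, N_L) and the P x P input factor of
    the simple kriging covariance."""
    D, P = U.shape[0], U_new.shape[0]
    Omega = sigma2 * gaussian_cross_corr(U, U, eta) + tau2 * np.eye(D)
    cross = sigma2 * gaussian_cross_corr(U_new, U, eta)      # P x D
    w = np.linalg.solve(Omega, np.ones(D))
    nu_hat = np.tensordot(w / w.sum(), Y, axes=(0, 0))       # general mean
    K = np.linalg.solve(Omega, cross.T).T                    # cross @ Omega^-1
    mean = nu_hat + np.tensordot(K, Y - nu_hat, axes=(1, 0))
    V_in = (sigma2 * gaussian_cross_corr(U_new, U_new, eta)
            + tau2 * np.eye(P) - K @ cross.T)
    return mean, V_in


def validate(Y_new, mean, V_in, z=1.96):
    """RMSE, MaxErr and actual coverage of nominal 95% intervals."""
    err = Y_new - mean
    rmse = float(np.sqrt(np.mean(err ** 2)))
    max_err = float(np.max(np.abs(err)))
    # C_T and C_L have unit diagonals, so the pointwise variance is V_in[p, p].
    half = z * np.sqrt(np.clip(np.diag(V_in), 0.0, None))[:, None, None]
    cover = float(np.mean(np.abs(err) <= half))
    return rmse, max_err, cover


# Hypothetical stand-in for climate model runs on rescaled inputs.
rng = np.random.default_rng(1)
D, P, N_T, N_L = 12, 4, 6, 5
lat, yr = np.linspace(0, 1, N_L), np.linspace(0, 1, N_T)


def toy_model(u):
    # Quadratic in latitude, linear in time, with an input-dependent trend.
    return (25 - 100 * (lat - 0.5) ** 2)[None, :] + (u[0] + 0.5 * u[1] * u[2]) * yr[:, None]


U, U_new = rng.random((D, 3)), rng.random((P, 3))
Y = np.stack([toy_model(u) for u in U])
Y_new = np.stack([toy_model(u) for u in U_new])
eta, sigma2, tau2 = np.array([1.0, 1.0, 1.0]), 1.0, 1e-4
mean, V_in = krige(Y, U, U_new, eta, sigma2, tau2)
rmse, max_err, cover = validate(Y_new, mean, V_in)
```

The universal kriging intervals would add the input factor $E$ to `V_in` before taking the diagonal; this is omitted here to keep the sketch short.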
The estimates of the parameters $\beta$ and their standard errors for the input-free polynomial regression are shown in Table 2, whereas those for the extended, input-dependent regression are shown in Table 3, with the index $s$ referring to the model including the latitude stationary covariance and the index $n$ referring to the model including the latitude non-stationary covariance. There is perceptible evidence of an increasing time trend in the surface temperatures and of a positive association between the climate sensitivity parameter and the surface temperatures. However, there is not much difference between these estimated values when comparing the stationary versus non-stationary latitude covariances. A thorough visual inspection of Figure 1 (right panel) reveals that the linear trend is not perfect, with approximately the first and last thirds of the time series increasing whereas the middle third is roughly constant, which may be better fitted by the general mean model. Table 4 shows the estimates of the covariance parameters for the input-free general mean model (GenMean), the input-free polynomial regression model (PolyReg) and the extended, input-dependent regression model (ExtdReg).

Table 2. Coefficient estimates and their standard errors for the input-free polynomial regression. Columns: variables $1$, $L$, $L^2$, $T$, $LT$, $L^2T$; rows: $\hat{\beta}_s$, $SE(\hat{\beta}_s)$, $\hat{\beta}_n$, $SE(\hat{\beta}_n)$.

Table 3. Coefficient estimates and their standard errors for the extended regression. Columns: variables $1$, $L$, $L^2$, $T$, $LT$, $L^2T$, $S$, $K_v$, $F_{aer}$; rows: $\hat{\beta}_s$, $SE(\hat{\beta}_s)$, $\hat{\beta}_n$, $SE(\hat{\beta}_n)$.

Table 4. Variance parameter estimates. Columns: parameters $\eta_1$, $\eta_2$, $\eta_3$, $\eta_T$, $\eta_L$, $\eta_{s1}$, $\eta_{s2}$, $\eta_{s3}$, $\sigma^2$; rows: GenMean$_s$, GenMean$_n$, PolyReg$_s$, PolyReg$_n$, ExtdReg$_s$, ExtdReg$_n$.

Table 5. Results. Columns: RMSE (°C), MaxErr (°C), $COVER_{sp}$, $COVER_{uv}$; rows: GenMean$_s$, GenMean$_n$, PolyReg$_s$, PolyReg$_n$, ExtdReg$_s$, ExtdReg$_n$.

The set $\Pi$ of $P = 5$ new inputs shown in Table 1 is used for validation. The validation measures are given in Table 5 and they show that the general mean model is at least twice as accurate as any of the regression models, with respect to RMSE and MaxErr. The choice of stationary or non-stationary latitude correlation again does not seem to have an important effect. The actual coverage is close to the nominal 95% for all statistical models, except perhaps for the general mean model with non-stationary covariance, which appears to have a lower actual coverage rate. Here $COVER_{sp}$ refers to simple kriging and $COVER_{uv}$ refers to universal kriging, and one can notice only very little difference between their values. The results for the two regression models do not differ much, although the input-free regression model seems to have a larger RMSE but a lower MaxErr than the extended regression model.

Figure 2 shows results for the input-free general mean model and for the input-free polynomial regression, with stationary covariances only. The plots from the models with latitude non-stationary covariance, as well as from the input-dependent extended regression model (with both stationary and non-stationary covariances), are similar to those presented in Figure 2 and are therefore omitted. The left panels in Figure 2 show the true new output data plotted against the predicted new output data, and one can see that the general input-free mean model (upper left) is more accurate than the input-free polynomial regression model (lower left). In such plots, the closer the scatterplot is to the main diagonal, the more accurate the point prediction is. The right panels of Figure 2 show normal probability plots of standardized prediction residuals based on simple kriging covariance formulas for the two models (plots based on universal kriging are similar). These normal probability plots reveal that the normality assumption is not perfectly satisfied by either model, although it seems much more appropriate for the general mean model.

Figure 2. True versus predicted new output data (left panels) and normal probability plots of standardized prediction residuals (right panels). The upper row corresponds to the general input-free mean model and the lower row corresponds to the input-free polynomial regression.

6 Concluding remarks

This paper has presented a kriging approach to the analysis of climate model experiments. The basic strategy was to perform a relatively small number of runs with the computationally intensive climate model and then obtain the output data. An appropriate statistical model was considered for these data, which was then used to predict climate model output at new inputs. The statistical model has a mean vector suitable for the climate application discussed here, where there is a common pattern for the output data set at each input. Prediction results from statistical models with two different input-free means have been presented. While a polynomial regression model appears to be more intuitive for the climate application discussed, the general mean model is more flexible and appears to predict new data more accurately. Since it is not always possible to find good regression variables, the latter model is also expected to be more widely used in applications. An extended regression model which includes input-dependent variables does not seem to improve significantly on the previous fit. Similarly, results based on universal kriging and/or a non-stationary covariance model for latitudes do not differ significantly from those based on simple kriging and/or stationary covariance models.

Computationally intensive numerical models are not uncommon in environmental sciences. For example, the U.S. Environmental Protection Agency (EPA) recommends using CALPUFF, a complex dispersion model that simulates the effects of space-time meteorological conditions on pollution transport, transformation and removal. Its User's Guide (available as a link from the EPA's web page) points out that CALPUFF can take hours of computational time for some applications. While the focus in this paper was on a particular problem important to climate science, there are some useful general principles underlying this work which may be applied to other environmental problems involving computationally intensive numerical models and associated numerical experiments.

Appendix A

The maximum likelihood estimator of the parameter vector $\beta$ and its covariance matrix are derived in this Appendix.
The formulas for the maximum likelihood estimator of ν and its variance result as a particular case in the derivation below, with X = I and β = ν. The part of -2 Log (Likelihood) that contains β can be rewritten as (Y 1 D Xβ) Γ 1 (Y 1 D Xβ) = (Ω 1 Θ ) i,j (Y ṛ,i Xβ) (C 1 T C 1 L )(Y ṛ,j Xβ) = (Ω 1 Θ ) i,j Y ṛ,i (C 1 T C 1 L )Y ṛ,j 2β X (C 1 T C 1 L ) (Ω 1 Θ ) i,j Y ṛ,j +β X (C 1 T C 1 L )Xβ (Ω 1 Θ ) i,j. By taking its partial derivatives with respect to β and setting them equal to zero, one obtains X (C 1 T C 1 L )X (Ω 1 Θ ) i,j β = X (C 1 T C 1 L ) (Ω 1 Θ ) i,j Y ṛ,j and hence the formula for ˆβ. If one denotes w j = D i=1 (Ω Θ 1 ) i,j D (Ω Θ 1 ) i,j then ˆβ = [X (C 1 T C 1 L )X] 1 X (C 1 T C 1 L ) w j Y ṛ,j j=1 11

and therefore

$$
\begin{aligned}
\mathrm{var}(\hat\beta)
&= [X' (C_T^{-1} \otimes C_L^{-1}) X]^{-1} X' (C_T^{-1} \otimes C_L^{-1}) \Big[ \sum_{i,j} w_i w_j\, \mathrm{cov}(Y_{\cdot,i}, Y_{\cdot,j}) \Big] (C_T^{-1} \otimes C_L^{-1}) X [X' (C_T^{-1} \otimes C_L^{-1}) X]^{-1} \\
&= [X' (C_T^{-1} \otimes C_L^{-1}) X]^{-1} X' (C_T^{-1} \otimes C_L^{-1}) \Big[ \sum_{i,j} w_i w_j (\Omega_\Theta)_{i,j} (C_T \otimes C_L) \Big] (C_T^{-1} \otimes C_L^{-1}) X [X' (C_T^{-1} \otimes C_L^{-1}) X]^{-1} \\
&= \Big( \sum_{i,j} w_i w_j (\Omega_\Theta)_{i,j} \Big) [X' (C_T^{-1} \otimes C_L^{-1}) X]^{-1}.
\end{aligned}
$$

Appendix B

This Appendix details the covariance structure of the statistical models presented in this paper. Denote by $Y_{k,i}$ the data set of ensemble members, with $k = 1, \ldots, R$ ($R = 4$ is the number of ensemble members in this application) and $i = 1, \ldots, N_T N_L D$. A mixed model for $Y$ is

$$
Y_{k,i} = \mu(y)_i + Z_i + \epsilon_{k,i},
$$

where the climate signal $Z$ and the (partially colored) ensemble noise $\epsilon$ are independent of each other and have mean zero, with covariance matrices $\sigma^2 \Sigma_Z$ and $\gamma^2 \Sigma_\epsilon \otimes I_R$, respectively. In vector format, $Y$ has covariance matrix $\sigma^2 \Sigma_Z \otimes J_R + \gamma^2 \Sigma_\epsilon \otimes I_R$, where $J_R$ is the $R \times R$ matrix of ones. The model for ensemble means is

$$
Y_i = \bar Y_{\cdot,i} = \mu(y)_i + Z_i + \bar\epsilon_{\cdot,i},
$$

with covariance matrix $\Gamma = \sigma^2 \Sigma_Z + \tau^2 \Sigma_\epsilon$ and $\tau^2 = \gamma^2 / R$.

In practice it would be difficult to work with general matrices $\Sigma_Z$ and $\Sigma_\epsilon$, so some assumptions are needed. First, under the assumption of separability, each matrix is written as a Kronecker product of smaller matrices that reflect the input, time and space (latitude) dimensions. Second, each ensemble member is assumed to have a time-space correlation matrix $C_T \otimes C_L$, which is then inherited by the climate signal and the ensemble noise. However, the climate signal varies smoothly from input to input, whereas the ensemble noise does not depend on inputs. These assumptions lead to

$$
\Gamma = (\sigma^2 C_\Theta + \tau^2 I) \otimes C_T \otimes C_L.
$$

Finally, note that

$$
\hat\gamma^2 = \frac{1}{N_T N_L D} \sum_{i=1}^{N_T N_L D} \Big[ \frac{1}{R-1} \sum_{k=1}^{R} (Y_{k,i} - \bar Y_{\cdot,i})^2 \Big]
$$

is an unbiased estimator of $\gamma^2$, and $\hat\tau^2 = \hat\gamma^2 / R$ is an unbiased estimator of $\tau^2$.

References

[1] Bayarri, M.J., Walsh, D., Berger, J.O., Cafeo, J., Garcia-Donato, G., Liu, F., Palomo, J., Parthasarathy, R.J., Paulo, R. and Sacks, J. (2007), Computer Model Validation with Functional Output, Annals of Statistics, 35,

[2] Currin, C., Mitchell, T., Morris, M. and Ylvisaker, D. (1991), Bayesian Prediction of Deterministic Functions, with Applications to the Design and Analysis of Computer Experiments, Journal of the American Statistical Association, 86,
[3] Drignei, D. (2006), Empirical Bayesian Analysis for High-Dimensional Computer Output, Technometrics, 48,
[4] Drignei, D. and Morris, M.D. (2006), Empirical Bayesian Analysis for Computer Experiments Involving Finite-Difference Codes, Journal of the American Statistical Association, 101, p
[5] Fang, K.T., Li, R. and Sudjianto, A. (2006), Design and Modeling for Computer Experiments, Boca Raton, FL: Chapman and Hall/CRC Press.
[6] Forest, C.E., Stone, P.H., Sokolov, A.P., Allen, M.R. and Webster, M.D. (2002), Quantifying Uncertainties in Climate System Properties with the Use of Recent Climate Observations, Science, 295,
[7] IPCC, 2001: Climate Change 2001: The Scientific Basis. Contribution of Working Group I to the Third Assessment Report of the Intergovernmental Panel on Climate Change [Houghton, J.T., Y. Ding, D.J. Griggs, M. Noguer, P.J. van der Linden, X. Dai, K. Maskell and C.A. Johnson (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, 881 pp.
[8] IPCC, 2007: Climate Change 2007: The Physical Science Basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change [Solomon, S., D. Qin, M. Manning, Z. Chen, M. Marquis, K.B. Averyt, M. Tignor and H.L. Miller (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, 996 pp.
[9] Genton, M.G. (2007), Separable Approximations of Space-time Covariance Matrices, Environmetrics, 18,
[10] Higdon, D., Gattiker, J., Williams, B. and Rightley, M. (2007), Computer Model Validation Using High Dimensional Outputs, in Bayesian Statistics 8, eds. Bernardo, J., Bayarri, M.J., Dawid, A.P., Berger, J.O., Heckerman, D., Smith, A.F.M. and West, M., London: Oxford University Press.
[11] Hughes-Oliver, J.M., Gonzalez-Farias, G., Lu, J.C. and Chen, D. (1998), Parametric Nonstationary Correlation Models, Statistics and Probability Letters, 40,
[12] Johnson, M., Moore, L. and Ylvisaker, D. (1990), Minimax and Maximin Distance Designs, Journal of Statistical Planning and Inference, 26,
[13] Li, R. and Sudjianto, A. (2005), Analysis of Computer Experiments Using Penalized Likelihood in Gaussian Kriging Models, Technometrics, 47,
[14] Rauthe, H., Hense, A. and Paeth, H. (2004), A Model Intercomparison Study of Climate Change Signals in Extratropical Circulation, International Journal of Climatology, 24,
[15] Sacks, J., Welch, W.J., Mitchell, T.J. and Wynn, H.P. (1989), Design and Analysis of Computer Experiments, Statistical Science, 4,

[16] Santner, T.J., Williams, B.J. and Notz, W.I. (2003), The Design and Analysis of Computer Experiments, New York: Springer.
[17] Schabenberger, O. and Gotway, C.A. (2005), Statistical Methods for Spatial Data Analysis, Boca Raton, FL: Chapman and Hall/CRC Press.
[18] Sokolov, A.P. and Stone, P.H. (1998), A Flexible Climate Model for Use in Integrated Assessments, Climate Dynamics, 14,
[19] Stone, P.H. and Yao, M.S. (1987), Development of a Two-dimensional Zonally Averaged Statistical-dynamical Model. II: the Role of Eddy Momentum Fluxes in the General Circulation and their Parametrization, Journal of the Atmospheric Sciences, 44,
[20] Stone, P.H. and Yao, M.S. (1990), Development of a Two-dimensional Zonally Averaged Statistical-dynamical Model. III: Parametrization of the Eddy Fluxes of Heat and Moisture, Journal of Climate, 3,
[21] Yao, M.S. and Stone, P.H. (1987), Development of a Two-dimensional Zonally Averaged Statistical-dynamical Model. I: the Parameterization of Moist Convection and its Role in the General Circulation, Journal of the Atmospheric Sciences, 44,
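
Numerical illustration of Appendices A and B. The closed-form estimator of Appendix A avoids forming the full covariance matrix Γ, and this can be checked numerically. The sketch below is not the paper's code. It assumes small made-up dimensions, random data, and an illustrative correlation function `exp_corr`; the function names are hypothetical. It compares the weighted closed form for the estimator of β and its covariance with brute-force generalized least squares on the full Kronecker covariance, and it computes the Appendix B estimates of γ² and τ² from an ensemble array.

```python
import numpy as np


def exp_corr(n, rho):
    """Illustrative correlation matrix with entries rho**|i-j| (an assumption)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def beta_hat_kron(Y_blocks, X, C_T, C_L, Omega):
    """Closed-form estimator of Appendix A and its covariance.

    Y_blocks : (D, N_T*N_L) array whose j-th row is Y_{.,j}
    X        : (N_T*N_L, p) regression matrix shared by all inputs
    Omega    : (D, D) matrix playing the role of Omega_Theta
    """
    C_inv = np.kron(np.linalg.inv(C_T), np.linalg.inv(C_L))
    Omega_inv = np.linalg.inv(Omega)
    w = Omega_inv.sum(axis=0) / Omega_inv.sum()    # weights w_j
    y_w = w @ Y_blocks                             # sum_j w_j Y_{.,j}
    A = X.T @ C_inv @ X
    beta = np.linalg.solve(A, X.T @ C_inv @ y_w)
    cov = (w @ Omega @ w) * np.linalg.inv(A)       # scalar times [X' C^-1 X]^-1
    return beta, cov


def beta_hat_full(Y_blocks, X, C_T, C_L, Omega):
    """Brute-force generalized least squares using the full matrix Gamma."""
    D = Y_blocks.shape[0]
    Gamma = np.kron(Omega, np.kron(C_T, C_L))
    design = np.kron(np.ones((D, 1)), X)           # mean structure 1_D (x) X
    G_inv = np.linalg.inv(Gamma)
    A = design.T @ G_inv @ design
    beta = np.linalg.solve(A, design.T @ G_inv @ Y_blocks.reshape(-1))
    return beta, np.linalg.inv(A)


def noise_variance_estimates(Y_ens):
    """Appendix B estimates; Y_ens is (R, N) with one row per ensemble member."""
    R = Y_ens.shape[0]
    gamma2_hat = np.var(Y_ens, axis=0, ddof=1).mean()
    return gamma2_hat, gamma2_hat / R


rng = np.random.default_rng(0)
D, N_T, N_L, p = 4, 3, 5, 2                        # made-up sizes
C_T, C_L = exp_corr(N_T, 0.5), exp_corr(N_L, 0.7)
Omega = 2.0 * exp_corr(D, 0.6) + 0.3 * np.eye(D)   # sigma^2 C_Theta + tau^2 I
X = rng.normal(size=(N_T * N_L, p))
Y_blocks = rng.normal(size=(D, N_T * N_L))

b_kron, cov_kron = beta_hat_kron(Y_blocks, X, C_T, C_L, Omega)
b_full, cov_full = beta_hat_full(Y_blocks, X, C_T, C_L, Omega)
print(np.allclose(b_kron, b_full), np.allclose(cov_kron, cov_full))
```

Both comparisons should agree up to rounding error. The closed form only inverts matrices of sizes D, N_T and N_L (plus a p × p solve), whereas the brute-force version inverts a matrix of size D·N_T·N_L, which is the practical motivation for the separability assumption.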

Fast Statistical Surrogates for Dynamical 3D Computer Models of Brain Tumor

Fast Statistical Surrogates for Dynamical 3D Computer Models of Brain Tumor Fast Statistical Surrogates for Dynamical 3D omputer Models of Brain Tumor Dorin Drignei Department of Mathematics and Statistics Oakland University, Rochester, MI 48309, USA Email: drignei@oakland.edu

More information

Bayesian Dynamic Linear Modelling for. Complex Computer Models

Bayesian Dynamic Linear Modelling for. Complex Computer Models Bayesian Dynamic Linear Modelling for Complex Computer Models Fei Liu, Liang Zhang, Mike West Abstract Computer models may have functional outputs. With no loss of generality, we assume that a single computer

More information

Empirical Bayesian Analysis for Computer Experiments Involving Finite-Difference Codes

Empirical Bayesian Analysis for Computer Experiments Involving Finite-Difference Codes Empirical Bayesian Analysis for Computer Experiments Involving Finite-Difference Codes Dorin DRIGNEI and Max D. MORRIS Computer experiments are increasingly used in scientific investigations as substitutes

More information

Gaussian Processes for Computer Experiments

Gaussian Processes for Computer Experiments Gaussian Processes for Computer Experiments Jeremy Oakley School of Mathematics and Statistics, University of Sheffield www.jeremy-oakley.staff.shef.ac.uk 1 / 43 Computer models Computer model represented

More information

Monitoring Wafer Geometric Quality using Additive Gaussian Process

Monitoring Wafer Geometric Quality using Additive Gaussian Process Monitoring Wafer Geometric Quality using Additive Gaussian Process Linmiao Zhang 1 Kaibo Wang 2 Nan Chen 1 1 Department of Industrial and Systems Engineering, National University of Singapore 2 Department

More information

Comparing Non-informative Priors for Estimation and Prediction in Spatial Models

Comparing Non-informative Priors for Estimation and Prediction in Spatial Models Environmentrics 00, 1 12 DOI: 10.1002/env.XXXX Comparing Non-informative Priors for Estimation and Prediction in Spatial Models Regina Wu a and Cari G. Kaufman a Summary: Fitting a Bayesian model to spatial

More information

Introduction to Global Warming

Introduction to Global Warming Introduction to Global Warming Cryosphere (including sea level) and its modelling Ralf GREVE Institute of Low Temperature Science Hokkaido University Sapporo, 2010.09.14 http://wwwice.lowtem.hokudai.ac.jp/~greve/

More information

Flexible Spatio-temporal smoothing with array methods

Flexible Spatio-temporal smoothing with array methods Int. Statistical Inst.: Proc. 58th World Statistical Congress, 2011, Dublin (Session IPS046) p.849 Flexible Spatio-temporal smoothing with array methods Dae-Jin Lee CSIRO, Mathematics, Informatics and

More information

Bayesian spatial quantile regression

Bayesian spatial quantile regression Brian J. Reich and Montserrat Fuentes North Carolina State University and David B. Dunson Duke University E-mail:reich@stat.ncsu.edu Tropospheric ozone Tropospheric ozone has been linked with several adverse

More information

A Framework for Daily Spatio-Temporal Stochastic Weather Simulation

A Framework for Daily Spatio-Temporal Stochastic Weather Simulation A Framework for Daily Spatio-Temporal Stochastic Weather Simulation, Rick Katz, Balaji Rajagopalan Geophysical Statistics Project Institute for Mathematics Applied to Geosciences National Center for Atmospheric

More information

Climate Change: the Uncertainty of Certainty

Climate Change: the Uncertainty of Certainty Climate Change: the Uncertainty of Certainty Reinhard Furrer, UZH JSS, Geneva Oct. 30, 2009 Collaboration with: Stephan Sain - NCAR Reto Knutti - ETHZ Claudia Tebaldi - Climate Central Ryan Ford, Doug

More information

Mustafa H. Tongarlak Bruce E. Ankenman Barry L. Nelson

Mustafa H. Tongarlak Bruce E. Ankenman Barry L. Nelson Proceedings of the 0 Winter Simulation Conference S. Jain, R. R. Creasey, J. Himmelspach, K. P. White, and M. Fu, eds. RELATIVE ERROR STOCHASTIC KRIGING Mustafa H. Tongarlak Bruce E. Ankenman Barry L.

More information

An introduction to Bayesian statistics and model calibration and a host of related topics

An introduction to Bayesian statistics and model calibration and a host of related topics An introduction to Bayesian statistics and model calibration and a host of related topics Derek Bingham Statistics and Actuarial Science Simon Fraser University Cast of thousands have participated in the

More information

Spatial statistics, addition to Part I. Parameter estimation and kriging for Gaussian random fields

Spatial statistics, addition to Part I. Parameter estimation and kriging for Gaussian random fields Spatial statistics, addition to Part I. Parameter estimation and kriging for Gaussian random fields 1 Introduction Jo Eidsvik Department of Mathematical Sciences, NTNU, Norway. (joeid@math.ntnu.no) February

More information

Hierarchical Modeling for Univariate Spatial Data

Hierarchical Modeling for Univariate Spatial Data Hierarchical Modeling for Univariate Spatial Data Geography 890, Hierarchical Bayesian Models for Environmental Spatial Data Analysis February 15, 2011 1 Spatial Domain 2 Geography 890 Spatial Domain This

More information

What is the IPCC? Intergovernmental Panel on Climate Change

What is the IPCC? Intergovernmental Panel on Climate Change IPCC WG1 FAQ What is the IPCC? Intergovernmental Panel on Climate Change The IPCC is a scientific intergovernmental body set up by the World Meteorological Organization (WMO) and by the United Nations

More information

Models for spatial data (cont d) Types of spatial data. Types of spatial data (cont d) Hierarchical models for spatial data

Models for spatial data (cont d) Types of spatial data. Types of spatial data (cont d) Hierarchical models for spatial data Hierarchical models for spatial data Based on the book by Banerjee, Carlin and Gelfand Hierarchical Modeling and Analysis for Spatial Data, 2004. We focus on Chapters 1, 2 and 5. Geo-referenced data arise

More information

Introduction to emulators - the what, the when, the why

Introduction to emulators - the what, the when, the why School of Earth and Environment INSTITUTE FOR CLIMATE & ATMOSPHERIC SCIENCE Introduction to emulators - the what, the when, the why Dr Lindsay Lee 1 What is a simulator? A simulator is a computer code

More information

Limit Kriging. Abstract

Limit Kriging. Abstract Limit Kriging V. Roshan Joseph School of Industrial and Systems Engineering Georgia Institute of Technology Atlanta, GA 30332-0205, USA roshan@isye.gatech.edu Abstract A new kriging predictor is proposed.

More information

An Introduction to Coupled Models of the Atmosphere Ocean System

An Introduction to Coupled Models of the Atmosphere Ocean System An Introduction to Coupled Models of the Atmosphere Ocean System Jonathon S. Wright jswright@tsinghua.edu.cn Atmosphere Ocean Coupling 1. Important to climate on a wide range of time scales Diurnal to

More information

Better Simulation Metamodeling: The Why, What and How of Stochastic Kriging

Better Simulation Metamodeling: The Why, What and How of Stochastic Kriging Better Simulation Metamodeling: The Why, What and How of Stochastic Kriging Jeremy Staum Collaborators: Bruce Ankenman, Barry Nelson Evren Baysal, Ming Liu, Wei Xie supported by the NSF under Grant No.

More information

STAT 518 Intro Student Presentation

STAT 518 Intro Student Presentation STAT 518 Intro Student Presentation Wen Wei Loh April 11, 2013 Title of paper Radford M. Neal [1999] Bayesian Statistics, 6: 475-501, 1999 What the paper is about Regression and Classification Flexible

More information

X t = a t + r t, (7.1)

X t = a t + r t, (7.1) Chapter 7 State Space Models 71 Introduction State Space models, developed over the past 10 20 years, are alternative models for time series They include both the ARIMA models of Chapters 3 6 and the Classical

More information

Covariance function estimation in Gaussian process regression

Covariance function estimation in Gaussian process regression Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian

More information

Hierarchical Modelling for Univariate Spatial Data

Hierarchical Modelling for Univariate Spatial Data Hierarchical Modelling for Univariate Spatial Data Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department

More information

Models for models. Douglas Nychka Geophysical Statistics Project National Center for Atmospheric Research

Models for models. Douglas Nychka Geophysical Statistics Project National Center for Atmospheric Research Models for models Douglas Nychka Geophysical Statistics Project National Center for Atmospheric Research Outline Statistical models and tools Spatial fields (Wavelets) Climate regimes (Regression and clustering)

More information

Computer Model Calibration or Tuning in Practice

Computer Model Calibration or Tuning in Practice Computer Model Calibration or Tuning in Practice Jason L. Loeppky Department of Statistics University of British Columbia Vancouver, BC, V6T 1Z2, CANADA (jason@stat.ubc.ca) Derek Bingham Department of

More information

1. Gaussian process emulator for principal components

1. Gaussian process emulator for principal components Supplement of Geosci. Model Dev., 7, 1933 1943, 2014 http://www.geosci-model-dev.net/7/1933/2014/ doi:10.5194/gmd-7-1933-2014-supplement Author(s) 2014. CC Attribution 3.0 License. Supplement of Probabilistic

More information

Why build a climate model

Why build a climate model Climate Modeling Why build a climate model Atmosphere H2O vapor and Clouds Absorbing gases CO2 Aerosol Land/Biota Surface vegetation Ice Sea ice Ice sheets (glaciers) Ocean Box Model (0 D) E IN = E OUT

More information

Torben Königk Rossby Centre/ SMHI

Torben Königk Rossby Centre/ SMHI Fundamentals of Climate Modelling Torben Königk Rossby Centre/ SMHI Outline Introduction Why do we need models? Basic processes Radiation Atmospheric/Oceanic circulation Model basics Resolution Parameterizations

More information

Computer experiments with functional inputs and scalar outputs by a norm-based approach

Computer experiments with functional inputs and scalar outputs by a norm-based approach Computer experiments with functional inputs and scalar outputs by a norm-based approach arxiv:1410.0403v1 [stat.me] 1 Oct 2014 Thomas Muehlenstaedt W. L. Gore & Associates and Jana Fruth Faculty of Statistics,

More information

Conjugate Analysis for the Linear Model

Conjugate Analysis for the Linear Model Conjugate Analysis for the Linear Model If we have good prior knowledge that can help us specify priors for β and σ 2, we can use conjugate priors. Following the procedure in Christensen, Johnson, Branscum,

More information

mlegp: an R package for Gaussian process modeling and sensitivity analysis

mlegp: an R package for Gaussian process modeling and sensitivity analysis mlegp: an R package for Gaussian process modeling and sensitivity analysis Garrett Dancik January 30, 2018 1 mlegp: an overview Gaussian processes (GPs) are commonly used as surrogate statistical models

More information

ESTIMATING THE MEAN LEVEL OF FINE PARTICULATE MATTER: AN APPLICATION OF SPATIAL STATISTICS

ESTIMATING THE MEAN LEVEL OF FINE PARTICULATE MATTER: AN APPLICATION OF SPATIAL STATISTICS ESTIMATING THE MEAN LEVEL OF FINE PARTICULATE MATTER: AN APPLICATION OF SPATIAL STATISTICS Richard L. Smith Department of Statistics and Operations Research University of North Carolina Chapel Hill, N.C.,

More information

Climate Modeling Dr. Jehangir Ashraf Awan Pakistan Meteorological Department

Climate Modeling Dr. Jehangir Ashraf Awan Pakistan Meteorological Department Climate Modeling Dr. Jehangir Ashraf Awan Pakistan Meteorological Department Source: Slides partially taken from A. Pier Siebesma, KNMI & TU Delft Key Questions What is a climate model? What types of climate

More information

arxiv: v1 [stat.me] 10 Jul 2009

arxiv: v1 [stat.me] 10 Jul 2009 6th St.Petersburg Workshop on Simulation (2009) 1091-1096 Improvement of random LHD for high dimensions arxiv:0907.1823v1 [stat.me] 10 Jul 2009 Andrey Pepelyshev 1 Abstract Designs of experiments for multivariate

More information

Spatial Inference of Nitrate Concentrations in Groundwater

Spatial Inference of Nitrate Concentrations in Groundwater Spatial Inference of Nitrate Concentrations in Groundwater Dawn Woodard Operations Research & Information Engineering Cornell University joint work with Robert Wolpert, Duke Univ. Dept. of Statistical

More information

Bayesian Prediction of Code Output. ASA Albuquerque Chapter Short Course October 2014

Bayesian Prediction of Code Output. ASA Albuquerque Chapter Short Course October 2014 Bayesian Prediction of Code Output ASA Albuquerque Chapter Short Course October 2014 Abstract This presentation summarizes Bayesian prediction methodology for the Gaussian process (GP) surrogate representation

More information

Labor-Supply Shifts and Economic Fluctuations. Technical Appendix

Labor-Supply Shifts and Economic Fluctuations. Technical Appendix Labor-Supply Shifts and Economic Fluctuations Technical Appendix Yongsung Chang Department of Economics University of Pennsylvania Frank Schorfheide Department of Economics University of Pennsylvania January

More information

Chapter 6: Modeling the Atmosphere-Ocean System

Chapter 6: Modeling the Atmosphere-Ocean System Chapter 6: Modeling the Atmosphere-Ocean System -So far in this class, we ve mostly discussed conceptual models models that qualitatively describe the system example: Daisyworld examined stable and unstable

More information

Kneib, Fahrmeir: Supplement to "Structured additive regression for categorical space-time data: A mixed model approach"

Kneib, Fahrmeir: Supplement to Structured additive regression for categorical space-time data: A mixed model approach Kneib, Fahrmeir: Supplement to "Structured additive regression for categorical space-time data: A mixed model approach" Sonderforschungsbereich 386, Paper 43 (25) Online unter: http://epub.ub.uni-muenchen.de/

More information

Ensemble Data Assimilation and Uncertainty Quantification

Ensemble Data Assimilation and Uncertainty Quantification Ensemble Data Assimilation and Uncertainty Quantification Jeff Anderson National Center for Atmospheric Research pg 1 What is Data Assimilation? Observations combined with a Model forecast + to produce

More information

Mesoscale meteorological models. Claire L. Vincent, Caroline Draxl and Joakim R. Nielsen

Mesoscale meteorological models. Claire L. Vincent, Caroline Draxl and Joakim R. Nielsen Mesoscale meteorological models Claire L. Vincent, Caroline Draxl and Joakim R. Nielsen Outline Mesoscale and synoptic scale meteorology Meteorological models Dynamics Parametrizations and interactions

More information

Physician Performance Assessment / Spatial Inference of Pollutant Concentrations

Physician Performance Assessment / Spatial Inference of Pollutant Concentrations Physician Performance Assessment / Spatial Inference of Pollutant Concentrations Dawn Woodard Operations Research & Information Engineering Cornell University Johns Hopkins Dept. of Biostatistics, April

More information

Use of Design Sensitivity Information in Response Surface and Kriging Metamodels

Use of Design Sensitivity Information in Response Surface and Kriging Metamodels Optimization and Engineering, 2, 469 484, 2001 c 2002 Kluwer Academic Publishers. Manufactured in The Netherlands. Use of Design Sensitivity Information in Response Surface and Kriging Metamodels J. J.

More information

Multinomial Data. f(y θ) θ y i. where θ i is the probability that a given trial results in category i, i = 1,..., k. The parameter space is

Multinomial Data. f(y θ) θ y i. where θ i is the probability that a given trial results in category i, i = 1,..., k. The parameter space is Multinomial Data The multinomial distribution is a generalization of the binomial for the situation in which each trial results in one and only one of several categories, as opposed to just two, as in

More information

Projections of future climate change

Projections of future climate change Projections of future climate change Matthew Collins 1,2 and Catherine A. Senior 2 1 Centre for Global Atmospheric Modelling, Department of Meteorology, University of Reading 2 Met Office Hadley Centre,

More information

A HIERARCHICAL MODEL FOR REGRESSION-BASED CLIMATE CHANGE DETECTION AND ATTRIBUTION

A HIERARCHICAL MODEL FOR REGRESSION-BASED CLIMATE CHANGE DETECTION AND ATTRIBUTION A HIERARCHICAL MODEL FOR REGRESSION-BASED CLIMATE CHANGE DETECTION AND ATTRIBUTION Richard L Smith University of North Carolina and SAMSI Joint Statistical Meetings, Montreal, August 7, 2013 www.unc.edu/~rls

More information

Dynamic Matrix-Variate Graphical Models A Synopsis 1

Dynamic Matrix-Variate Graphical Models A Synopsis 1 Proc. Valencia / ISBA 8th World Meeting on Bayesian Statistics Benidorm (Alicante, Spain), June 1st 6th, 2006 Dynamic Matrix-Variate Graphical Models A Synopsis 1 Carlos M. Carvalho & Mike West ISDS, Duke

More information

Fast Bayesian Inference for Computer Simulation Inverse Problems

Fast Bayesian Inference for Computer Simulation Inverse Problems Fast Bayesian Inference for Computer Simulation Inverse Problems Matthew Taddy, Herbert K. H. Lee & Bruno Sansó University of California, Santa Cruz, Department of Applied Mathematics and Statistics January

More information

Statistical analysis of regional climate models. Douglas Nychka, National Center for Atmospheric Research

Statistical analysis of regional climate models. Douglas Nychka, National Center for Atmospheric Research Statistical analysis of regional climate models. Douglas Nychka, National Center for Atmospheric Research National Science Foundation Olso workshop, February 2010 Outline Regional models and the NARCCAP

More information

Effect of Ocean Warming on West Antarctic Ice Streams and Ice Shelves. Bryan Riel December 4, 2008

Effect of Ocean Warming on West Antarctic Ice Streams and Ice Shelves. Bryan Riel December 4, 2008 Effect of Ocean Warming on West Antarctic Ice Streams and Ice Shelves Bryan Riel December 4, 2008 Ice Sheet Mass Balance/WAIS Dynamics -Mass Balance = (Ice/Snow Accumulation) (Surface melting, ice outflux,

More information

Fast Dimension-Reduced Climate Model Calibration and the Effect of Data Aggregation

Fast Dimension-Reduced Climate Model Calibration and the Effect of Data Aggregation Fast Dimension-Reduced Climate Model Calibration and the Effect of Data Aggregation Won Chang Post Doctoral Scholar, Department of Statistics, University of Chicago Oct 15, 2014 Thesis Advisors: Murali

More information

P -spline ANOVA-type interaction models for spatio-temporal smoothing

P -spline ANOVA-type interaction models for spatio-temporal smoothing P -spline ANOVA-type interaction models for spatio-temporal smoothing Dae-Jin Lee 1 and María Durbán 1 1 Department of Statistics, Universidad Carlos III de Madrid, SPAIN. e-mail: dae-jin.lee@uc3m.es and

More information

Bayesian Linear Regression

Bayesian Linear Regression Bayesian Linear Regression Sudipto Banerjee 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. September 15, 2010 1 Linear regression models: a Bayesian perspective

More information

Training: Climate Change Scenarios for PEI. Training Session April Neil Comer Research Climatologist

Training: Climate Change Scenarios for PEI. Training Session April Neil Comer Research Climatologist Training: Climate Change Scenarios for PEI Training Session April 16 2012 Neil Comer Research Climatologist Considerations: Which Models? Which Scenarios?? How do I get information for my location? Uncertainty

More information

Introduction to Spatial Data and Models

Introduction to Spatial Data and Models Introduction to Spatial Data and Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry

More information

Multivariate spatial models and the multikrig class

Multivariate spatial models and the multikrig class Multivariate spatial models and the multikrig class Stephan R Sain, IMAGe, NCAR ENAR Spring Meetings March 15, 2009 Outline Overview of multivariate spatial regression models Case study: pedotransfer functions

More information

Spatial bias modeling with application to assessing remotely-sensed aerosol as a proxy for particulate matter

Spatial bias modeling with application to assessing remotely-sensed aerosol as a proxy for particulate matter Spatial bias modeling with application to assessing remotely-sensed aerosol as a proxy for particulate matter Chris Paciorek Department of Biostatistics Harvard School of Public Health application joint

More information

Course topics (tentative) The role of random effects

Course topics (tentative) The role of random effects Course topics (tentative) random effects linear mixed models analysis of variance frequentist likelihood-based inference (MLE and REML) prediction Bayesian inference The role of random effects Rasmus Waagepetersen

More information

arxiv: v1 [stat.me] 24 May 2010

arxiv: v1 [stat.me] 24 May 2010 The role of the nugget term in the Gaussian process method Andrey Pepelyshev arxiv:1005.4385v1 [stat.me] 24 May 2010 Abstract The maximum likelihood estimate of the correlation parameter of a Gaussian

More information

Applications of Tail Dependence II: Investigating the Pineapple Express. Dan Cooley Grant Weller Department of Statistics Colorado State University

Applications of Tail Dependence II: Investigating the Pineapple Express. Dan Cooley Grant Weller Department of Statistics Colorado State University Applications of Tail Dependence II: Investigating the Pineapple Express Dan Cooley Grant Weller Department of Statistics Colorado State University Joint work with: Steve Sain, Melissa Bukovsky, Linda Mearns,

More information

Equilibrium Climate Sensitivity: is it accurate to use a slab ocean model? Gokhan Danabasoglu and Peter R. Gent

Equilibrium Climate Sensitivity: is it accurate to use a slab ocean model? Gokhan Danabasoglu and Peter R. Gent Equilibrium Climate Sensitivity: is it accurate to use a slab ocean model? by Gokhan Danabasoglu and Peter R. Gent National Center for Atmospheric Research Boulder, Colorado 80307 Abstract The equilibrium

More information

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω ECO 513 Spring 2015 TAKEHOME FINAL EXAM (1) Suppose the univariate stochastic process y is ARMA(2,2) of the following form: y t = 1.6974y t 1.9604y t 2 + ε t 1.6628ε t 1 +.9216ε t 2, (1) where ε is i.i.d.

More information

Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models

Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models Tihomir Asparouhov 1, Bengt Muthen 2 Muthen & Muthen 1 UCLA 2 Abstract Multilevel analysis often leads to modeling

More information

Spatio-temporal prediction of site index based on forest inventories and climate change scenarios

Spatio-temporal prediction of site index based on forest inventories and climate change scenarios Forest Research Institute Spatio-temporal prediction of site index based on forest inventories and climate change scenarios Arne Nothdurft 1, Thilo Wolf 1, Andre Ringeler 2, Jürgen Böhner 2, Joachim Saborowski

More information

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations John R. Michael, Significance, Inc. and William R. Schucany, Southern Methodist University The mixture

More information
