
Bayesian sensitivity analysis of a cardiac cell model using a Gaussian process emulator: Supporting information

E T Y Chang 1,2, M Strong 3, R H Clayton 1,2

1 Insigneo Institute for in-silico Medicine, University of Sheffield, Sheffield, UK.
2 Department of Computer Science, University of Sheffield, Sheffield, UK.
3 School of Health and Related Research, University of Sheffield, Sheffield, UK.

E-mail: r.h.clayton@sheffield.ac.uk

Introduction

In this supporting information, we include details of the mathematics that underpin the approach taken in this study. A much more in-depth coverage of Gaussian process (GP) emulators is given in the MUCM webpages and the toolkit that has been developed there (http://mucm.aston.ac.uk/mucm/mucmtoolkit/index.php?page=metahomepage.html). Our aim here is to describe the pathway through this material that was followed for the present study. We have used notation that is consistent with the MUCM pages. Vector and matrix quantities are indicated by bold type, and posterior estimates given the design data are indicated with an asterisk.

Emulator construction

Each emulator was described by a GP, composed of a mean function and a covariance function,

GP(x) = h(x)^T \beta + \sigma^2 c(x, x').   (1)

We used a linear form for the mean function, with h(x) = (1, x)^T, so that

h(x)^T \beta = \beta_0 + \beta_1 x_1 + ... + \beta_P x_P,   (2)

and a Gaussian form for the covariance,

c(x_1, x_2) = \exp\left[ -\sum_{p=1}^{P} \left( \frac{x_{p,1} - x_{p,2}}{\delta_p} \right)^2 \right].   (3)
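The mean and correlation functions of equations (2) and (3) translate directly into code. The study's implementation was in Matlab; what follows is a minimal Python/NumPy sketch for illustration only, with made-up example inputs:

```python
import numpy as np

def h(x):
    """Linear mean basis h(x) = (1, x)^T (equation 2)."""
    return np.concatenate(([1.0], x))

def corr(x1, x2, delta):
    """Gaussian correlation function (equation 3) with
    correlation lengths delta (one per input)."""
    return np.exp(-np.sum(((x1 - x2) / delta) ** 2))

# Hypothetical example: two inputs with unit correlation lengths
delta = np.array([1.0, 1.0])
x_a = np.array([0.2, 0.4])
x_b = np.array([0.2, 0.4])
print(corr(x_a, x_b, delta))  # identical points -> 1.0
```

Note that c is a correlation (it equals 1 at zero separation); the scalar \sigma^2 in equation (1) carries the output scale.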

This choice enabled direct calculation of variance-based sensitivity indices. In these expressions x = (x_1, x_2, ..., x_P) are the P inputs (parameters), the hyperparameter \delta is a vector of length P, the hyperparameter \beta is a vector of length P + 1, and \sigma^2 is a scalar. These quantities were obtained by fitting to the design data, assuming weak prior information on \beta and \sigma^2 [2]. In this approach, we first obtained a best estimate for \delta, \hat{\delta}, from the maximum of the posterior log likelihood given the design data D, the outputs f(D), and a prior estimate of \delta, \delta_0. Best estimates of \beta and \sigma^2 were then obtained from \hat{\delta}.

The value for \hat{\delta} was chosen to be the value that optimised the posterior distribution \pi^*_{\delta}, given the design data D and f(D), and assuming a weak prior distribution of \delta,

\pi^*_{\delta}(\delta) \propto (\hat{\sigma}^2)^{-(N-Q)/2} |A|^{-1/2} |H^T A^{-1} H|^{-1/2},   (4)

where N was the number of design points (200), Q = P + 1, H an N \times Q matrix given by

H = [h(x_1), h(x_2), ..., h(x_N)]^T,   (5)

A an N \times N matrix

A = \begin{pmatrix} 1 & c(x_1, x_2) & \cdots & c(x_1, x_N) \\ c(x_2, x_1) & 1 & & \vdots \\ \vdots & & \ddots & \\ c(x_N, x_1) & \cdots & & 1 \end{pmatrix},   (6)

c the covariance function given by equation (3), x_1, x_2, ..., x_N the rows of the design data D, and

\hat{\sigma}^2 = \frac{1}{N - Q - 2} f(D)^T \left\{ A^{-1} - A^{-1} H (H^T A^{-1} H)^{-1} H^T A^{-1} \right\} f(D).   (7)

The value of \hat{\delta} was obtained by maximising equation (4) using the design data, as described in the implementation details below. This enabled \hat{\sigma}^2 to be calculated from equation (7), and \hat{\beta} was then calculated from

\hat{\beta} = (H^T A^{-1} H)^{-1} H^T A^{-1} f(D).   (8)

The set of hyperparameters \hat{\delta}, \hat{\sigma}^2, and \hat{\beta}, together with the mean function given by equation (2) and the covariance function of equation (3), specified each emulator.
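For a fixed \delta, equations (5)-(8) are a handful of linear-algebra operations. A Python/NumPy sketch (again, a translation for illustration, not the study's Matlab code; the toy 1-D design data below are hypothetical) might look like:

```python
import numpy as np

def fit_beta_sigma2(D, fD, delta):
    """Given design data D (N x P), outputs fD (N,) and fixed correlation
    lengths delta (P,), return beta_hat (eq 8) and sigma2_hat (eq 7)."""
    N, P = D.shape
    Q = P + 1
    H = np.hstack([np.ones((N, 1)), D])                  # eq 5
    diff = D[:, None, :] - D[None, :, :]
    A = np.exp(-np.sum((diff / delta) ** 2, axis=2))     # eqs 3 and 6
    Ainv = np.linalg.inv(A)
    K = np.linalg.inv(H.T @ Ainv @ H)
    beta_hat = K @ H.T @ Ainv @ fD                       # eq 8
    resid = fD - H @ beta_hat
    # residual form, algebraically equivalent to eq 7
    sigma2_hat = (resid @ Ainv @ resid) / (N - Q - 2)
    return beta_hat, sigma2_hat, H, A

# Toy 1-D example: outputs exactly linear in the single input
x = np.linspace(0.0, 1.0, 6)
D = x[:, None]
fD = 2.0 + 3.0 * x
beta_hat, sigma2_hat, H, A = fit_beta_sigma2(D, fD, np.array([0.3]))
```

For exactly linear outputs the generalised-least-squares step recovers \beta = (2, 3) and the residual variance vanishes, which is a useful check on an implementation.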

Emulator validation

Once the emulators were constructed, they could be used to estimate the output (e.g. APD_{90}) for an input x, using the posterior mean and variance of the emulator output given the design data. An asterisk is used to denote a posterior quantity. The posterior mean of each emulator output was given by a linear combination of the inputs, modified by a term describing the fit to the design data,

m^*(x) = h(x)^T \hat{\beta} + c(x)^T A^{-1} \left( f(D) - H \hat{\beta} \right),   (9)

and the posterior variance by

v^*(x, x') = \hat{\sigma}^2 \left\{ c(x, x') - c(x)^T A^{-1} c(x') + \left( h(x)^T - c(x)^T A^{-1} H \right) (H^T A^{-1} H)^{-1} \left( h(x')^T - c(x')^T A^{-1} H \right)^T \right\}.   (10)

In these equations the linear mean function h(x) was given by equation (2), the vector c(x) of covariances between x and the design points (which depends on \hat{\delta}) by equation (3), and the matrices H and A by equations (5) and (6) respectively.

These functions were used to validate the emulators against test data D_t and corresponding outputs f(D_t), obtained from N_t = 20 additional runs of the simulator. For each row j = 1, ..., N_t of D_t, with test inputs x^t_j and test output y^t_j = f(x^t_j), the standardised error between the predicted emulator output m^*(x^t_j) and the observed simulator output y^t_j was calculated from

E_j = \frac{y^t_j - m^*(x^t_j)}{\sqrt{v^*(x^t_j, x^t_j)}}.   (11)

The Mahalanobis distance for the complete set of test data was a measure of overall agreement between the predicted and test data, and was calculated from

MD = \left( f(D_t) - m^* \right)^T (V^*)^{-1} \left( f(D_t) - m^* \right),   (12)

where m^* was an N_t \times 1 vector (m^*(x^t_1), m^*(x^t_2), ..., m^*(x^t_{N_t})), and V^* an N_t \times N_t matrix in which each element V^*(i, j) is v^*(x^t_i, x^t_j). The theoretical distribution for MD has a mean of N_t and a variance of 2 N_t (N_t + N - Q - 2)/(N - Q - 4) [1].
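Given the posterior mean vector and posterior covariance matrix at the test inputs, the diagnostics of equations (11) and (12) are one-liners. A hedged Python/NumPy sketch (the function name and toy numbers are illustrative, not from the study):

```python
import numpy as np

def validation_diagnostics(y_test, m_star, V_star):
    """Standardised errors (eq 11) and Mahalanobis distance (eq 12),
    given test outputs y_test (Nt,), posterior means m_star (Nt,) and
    posterior covariance V_star (Nt x Nt) at the test inputs."""
    resid = y_test - m_star
    E = resid / np.sqrt(np.diag(V_star))           # eq 11, element-wise
    MD = resid @ np.linalg.solve(V_star, resid)    # eq 12
    return E, MD

# Toy check with an identity posterior covariance
E, MD = validation_diagnostics(np.array([1.0, 2.0]),
                               np.array([0.0, 0.0]),
                               np.eye(2))
```

With V^* = I the Mahalanobis distance reduces to the sum of squared standardised errors, here 1 + 4 = 5; in practice MD is compared against its theoretical mean N_t given in the text.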

Following validation, a new version of each emulator was built by combining the design data {D, f(D)} and test data {D_t, f(D_t)}, and using the N + N_t = 220 sets of input and output data to obtain updated values of the emulator hyperparameters \hat{\delta}, \hat{\beta}, and \hat{\sigma}^2. The updated emulator was used for all subsequent analysis.

Uncertainty in emulator output

If the probability density function of the uncertain inputs is given by \omega(x), then the posterior expectation of the emulator output (http://mucm.aston.ac.uk/mucm/mucmtoolkit/index.php?page=procuagp.html) with random inputs X is

E^*[E[f(X)]] = \int m^*(x) \, \omega(x) \, dx,   (13)

and the variance of this expectation was

Var^*[E[f(X)]] = \int \int v^*(x, x') \, \omega(x) \, \omega(x') \, dx \, dx'.   (14)

The expected variance of the emulator output was given by

E^*[Var[f(X)]] = \left( I_1 - Var^*[E[f(X)]] \right) + \left( I_2 - \left( E^*[E[f(X)]] \right)^2 \right),   (15)

where

I_1 = \int v^*(x, x) \, \omega(x) \, dx   (16)

and

I_2 = \int m^*(x)^2 \, \omega(x) \, dx.   (17)

Our choice of linear mean, weak prior, and Gaussian correlation function for the GP enabled the direct calculation of these integrals, provided that the probability density function of the inputs \omega(x) was multivariate Gaussian with specified mean m and variance var. The expectation of the emulator output was then

E^*[E[f(X)]] = R_h \hat{\beta} + R_t e,   (18)

where e was an N \times 1 vector

e = A^{-1} \left( f(D) - H \hat{\beta} \right),   (19)

R_h a 1 \times Q vector

R_h = (1, m),   (20)

where m was a 1 \times P vector of the mean values of each input, and R_t a 1 \times N vector, where the k-th element (of N) was given by

R_t(k) = |B|^{1/2} \, |2C + B|^{-1/2} \exp\left( -\frac{Q_k(m'_k)}{2} \right).   (21)

In this expression C was a diagonal P \times P prior correlation matrix, where C(i, j) = \hat{\delta}(i) for i = j and C(i, j) = 0 otherwise; B the precision matrix of the input distribution \omega(x), the inverse of a diagonal covariance matrix CV, where CV(i, j) = var(i) for i = j and CV(i, j) = 0 otherwise; and var(i) was the i-th element of a P \times 1 variance vector var corresponding to the variance in input i. The function Q_k(m'_k) yielded a single scalar

Q_k(m'_k) = 2 (m'_k - x_k)^T C (m'_k - x_k) + (m'_k - m)^T B (m'_k - m),   (22)

where m'_k was a P \times 1 vector given by

m'_k = (2C + B)^{-1} (2C x_k + B m),   (23)

and where x_k was the k-th row of inputs (with P elements) in the design data used to build the emulator.

The variance of the expectation of the emulator output Var^*[E[f(X)]] was given by

Var^*[E[f(X)]] = \hat{\sigma}^2 \left[ U - R_t A^{-1} R_t^T + (R_h - R_t G) \, W^{-1} \, (R_h - R_t G)^T \right].   (24)

In this expression

U = |B| \, |B2|^{-1/2},   (25)

where B2 was a 2P \times 2P matrix with blocks

B2(1...P, 1...P) = 2C + B,
B2(P+1...2P, P+1...2P) = 2C + B,
B2(1...P, P+1...2P) = -2C,
B2(P+1...2P, 1...P) = -2C,   (26)

G an N \times Q matrix,

G = A^{-1} H,   (27)

and W a Q \times Q matrix,

W = H^T A^{-1} H.   (28)

The expectation of the variance of the emulator output E^*[Var[f(X)]] could then be calculated using equation (15), where

I_1 = \hat{\sigma}^2 \left[ 1 - \mathrm{trace}\left( A^{-1} R_{tt} \right) + \mathrm{trace}\left( W^{-1} \left( R_{hh} - 2 R_{ht} G + G^T R_{tt} G \right) \right) \right]   (29)

and

I_2 = \hat{\beta}^T R_{hh} \hat{\beta} + 2 \hat{\beta}^T R_{ht} e + e^T R_{tt} e.   (30)

In these expressions R_{tt} was an N \times N matrix, where entry (k, l) was given by

R_{tt}(k, l) = |B|^{1/2} \, |4C + B|^{-1/2} \exp\left( -Q_{kl}(m'_{kl})/2 \right),   (31)

where m'_{kl} was a P \times 1 vector

m'_{kl} = (4C + B)^{-1} (2C x_k + 2C x_l + B m),   (32)

and Q_{kl}(m'_{kl}) a scalar given by

Q_{kl}(m'_{kl}) = 2 (m'_{kl} - x_k)^T C (m'_{kl} - x_k) + 2 (m'_{kl} - x_l)^T C (m'_{kl} - x_l) + (m'_{kl} - m)^T B (m'_{kl} - m).   (33)

R_{ht} was a Q \times N matrix, where the k-th column was given by

R_{ht}(k) = R_t(k) \, F,   (34)

where F was a Q \times 1 vector with entries (1, m'_k(1), ..., m'_k(P)), and m'_k was given by equation (23). Finally, R_{hh} was a Q \times Q matrix, R_{hh} = R_h^T R_h, where R_h was given by equation (20).

In the manuscript, Table 2 shows E^*[E[f(X)]], Var^*[E[f(X)]], and E^*[Var[f(X)]]/E^*[E[f(X)]] \times 100 for each of the eight emulators, obtained by setting the mean vector m for the inputs to (0.5, 0.5, 0.5, 0.5, 0.5, 0.5) and the variance vector var to (0.04, 0.04, 0.04, 0.04, 0.04, 0.04). The distributions of APD_{90} in Figure 5(a) were obtained from E^*[E[f(X)]] and E^*[Var[f(X)]] calculated using an identical mean vector with all entries set to 0.5, and a variance vector with all elements set to 0.0001 except for the element corresponding to G_K, which was set in turn to 0.01, 0.02, 0.05, and 0.1.

Calculation of mean effects

The mean effect M_w(x_w) shows how the emulator output, averaged over the uncertain inputs, changes when input x_w is given a fixed value. The posterior expectation of M_w(x_w) is a scalar, and is given by

M_w(x_w) = R_w \hat{\beta} + T_w e.   (35)

This equation has the same form as equation (18). R_w was a 1 \times Q vector with R_w(1) = 1, with the entry corresponding to input w set to x_w, and with the remaining entries set to the mean values of the corresponding inputs. T_w was a 1 \times N vector, where the k-th element was given by

T_w(k) = \left[ \prod_{i \neq w} B_{ii}^{1/2} (2 C_{ii} + B_{ii})^{-1/2} \exp\left( -\frac{1}{2} \frac{2 C_{ii} B_{ii}}{2 C_{ii} + B_{ii}} (x_{i,k} - m_i)^2 \right) \right] \exp\left( -\frac{1}{2} (x_w - x_{w,k})^T \, 2 C_{ww} \, (x_w - x_{w,k}) \right),   (36)

where x_{i,k} (respectively x_{w,k}) was the value of the i-th (respectively w-th) of the P inputs in the k-th (of N) row of the design data used to build the emulator, and all other quantities are defined above.

Calculation of sensitivity indices from the GP emulators

For each emulator, the sensitivity index of each input was calculated from E^*[V_w]/E^*[Var[f(X)]]. The total variance in emulator output E^*[Var[f(X)]] was calculated as described in equation (15), and the variance in the emulator output when input x_w was fixed, E^*[V_w], the sensitivity variance, was calculated from

E^*[V_w] = E^*\left[ E[E[f(x \mid x_w)]^2] \right] - E^*\left[ E[f(x)]^2 \right].   (37)

The first term in this equation was given by

E^*\left[ E[E[f(x \mid x_w)]^2] \right] = \hat{\sigma}^2 \left\{ U_w - \mathrm{trace}\left[ W^{-1} \left( Q_w - S_w A^{-1} H - H^T A^{-1} S_w^T + H^T A^{-1} P_w A^{-1} H \right) \right] \right\} + e^T P_w e + 2 \hat{\beta}^T S_w e + \hat{\beta}^T Q_w \hat{\beta},   (38)

where U_w was a scalar

U_w = \prod_{i \neq w} \left( \frac{B_{ii}}{B_{ii} + 4 C_{ii}} \right)^{1/2},   (39)

P_w was an N \times N matrix, where the (k, l)-th element was

P_w(k, l) = \left[ \prod_{i \neq w} \frac{B_{ii}}{2 C_{ii} + B_{ii}} \exp\left( -\frac{1}{2} \frac{2 C_{ii} B_{ii}}{2 C_{ii} + B_{ii}} \left[ (x_{i,k} - m_i)^2 + (x_{i,l} - m_i)^2 \right] \right) \right] \left\{ \left( \frac{B_{ww}}{4 C_{ww} + B_{ww}} \right)^{1/2} \exp\left( -\frac{1}{2} \frac{1}{4 C_{ww} + B_{ww}} \left[ 4 C_{ww}^2 (x_{w,k} - x_{w,l})^2 + 2 C_{ww} B_{ww} \left( (x_{w,k} - m_w)^2 + (x_{w,l} - m_w)^2 \right) \right] \right) \right\},   (40)

Q_w a Q \times Q matrix, which was assembled in the following steps:

Q_w(1, 1) = 1
Q_w(1, 2...Q) = m(1...P)
Q_w(2...Q, 1) = m(1...P)^T
Q_w(2...Q, 2...Q) = m^T m
Q_w(w+1, w+1) = Q_w(w+1, w+1) + B_{ww}^{-1},   (41)

and S_w a Q \times N matrix, where the (k, l)-th element was

S_w(k, l) = E[h_k(x)] \prod_{1 \le i \le P} B_{ii}^{1/2} (2 C_{ii} + B_{ii})^{-1/2} \exp\left( -\frac{1}{2} \frac{2 C_{ii} B_{ii}}{2 C_{ii} + B_{ii}} (x_{i,l} - m_i)^2 \right),   (42)

with

E[h_k(x)] = \begin{cases} 1 & \text{if } k = 1 \\ m_k & \text{if } k \neq w \\ \dfrac{2 C_{kk} x_{k,l} + B_{kk} m_k}{2 C_{kk} + B_{kk}} & \text{if } k = w. \end{cases}   (43)

Implementation

All of the code for this study was implemented in Matlab, using expressions detailed in the MUCM toolkit (http://mucm.aston.ac.uk/mucm/mucmtoolkit/). The code was tested against the numerical examples provided in the toolkit. Each emulator was fitted using design data and outputs obtained from N = 200 runs of the LR1991

simulator, so for each emulator D was an N \times P (200 \times 3) matrix, and f(D) an N \times 1 (200 \times 1) vector. The design data were used to construct H. An initial estimate of \sigma^2 was then obtained from equation (7).

Following the MUCM toolkit, we re-parameterised equation (4) with \tau = 2 \log_e(\delta). This removes a constraint on the optimisation, because \delta can only range over (0, \infty) whereas \tau can range over (-\infty, \infty). Matrix A was thus calculated from equations (6) and (3) with \delta = \exp(\tau/2), and the Nelder-Mead algorithm, as implemented in the Matlab function fminsearch, was then used to find the minimum of

\pi^*(\tau) = -\pi^*_{\delta}(\exp(\tau/2))   (44)

with a tolerance of 10^{-5} and the maximum number of iterations set to 2000.

A vector of initial values for the correlation length hyperparameters, \delta_0, was required for the Nelder-Mead minimisation. We found the fit of the emulator to the test data was sensitive to the choice of \delta_0. Good sets of initial estimates (found by trial and error) were (0.1, 1.0, 1.0, 1.0, 1.0, 1.0) for the emulator for maximum dV/dt, (0.1, 1.0, 1.0, 0.1, 1.0, 1.0) for peak V_m, the variance of each column of D for the emulators for dome voltage, APD_{90}, and resting voltage, and (1.0, 1.0, 1.0, 1.0, 1.0, 1.0) for the remaining emulators.

References

[1] Leonardo S. Bastos and Anthony O'Hagan. Diagnostics for Gaussian Process Emulators. Technometrics, 51(4):425-438, November 2009.

[2] M. C. Kennedy and A. O'Hagan. Predicting the output from a complex computer code when fast approximations are available. Biometrika, 87:1-13, 2000.
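As a final illustration of the implementation section above, the re-parameterised hyperparameter optimisation can be sketched in Python, with scipy's Nelder-Mead standing in for Matlab's fminsearch. Everything here is an assumption-laden sketch: the design data are random stand-ins for the LR1991 runs, a small diagonal nugget is added to A purely for numerical stability (an implementation detail, not part of the equations), and the objective is the negative log of equation (4) up to an additive constant:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_posterior(tau, D, fD):
    """-log of equation (4), up to a constant, with delta = exp(tau/2)
    so that tau is unconstrained (the MUCM re-parameterisation)."""
    delta = np.exp(tau / 2.0)
    N, P = D.shape
    Q = P + 1
    H = np.hstack([np.ones((N, 1)), D])
    diff = D[:, None, :] - D[None, :, :]
    # correlation matrix (eqs 3, 6) plus a tiny nugget for stability only
    A = np.exp(-np.sum((diff / delta) ** 2, axis=2)) + 1e-8 * np.eye(N)
    Ainv = np.linalg.inv(A)
    W = H.T @ Ainv @ H
    beta = np.linalg.solve(W, H.T @ Ainv @ fD)           # eq 8
    resid = fD - H @ beta
    sigma2 = (resid @ Ainv @ resid) / (N - Q - 2)        # eq 7
    _, logdet_A = np.linalg.slogdet(A)
    _, logdet_W = np.linalg.slogdet(W)
    return ((N - Q) / 2.0) * np.log(sigma2) + 0.5 * logdet_A + 0.5 * logdet_W

# Hypothetical design data standing in for the simulator runs
rng = np.random.default_rng(0)
D = rng.uniform(size=(30, 2))
fD = np.sin(3.0 * D[:, 0]) + D[:, 1]

tau0 = 2.0 * np.log(np.ones(2))   # i.e. delta_0 = (1.0, 1.0)
res = minimize(neg_log_posterior, tau0, args=(D, fD),
               method='Nelder-Mead',
               options={'xatol': 1e-5, 'fatol': 1e-5, 'maxiter': 2000})
delta_hat = np.exp(res.x / 2.0)
```

Because tau is unconstrained, the simplex search never proposes an invalid (negative) correlation length, which is exactly the point of the re-parameterisation described above.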