Introduction to emulators - the what, the when, the why


School of Earth and Environment, Institute for Climate & Atmospheric Science. Introduction to emulators - the what, the when, the why. Dr Lindsay Lee

What is a simulator? A simulator is a computer code used to represent some real-world process

Why simulate? Understand: which aerosols are most effective at cooling the climate? Predict: could aerosol be used to cool the climate for geoengineering?

Notation y = f(x) represents the simulation process, where y is the simulator output, x is the simulator input, and f is the simulator

Uncertainty in simulation Y = f(X). Computer codes are imperfect representations of the real world. How do these imperfections affect our understanding and predictions? What is the effect on Y?

Uncertainty in climate simulators

What is X? X are the model inputs/parameters. They may be spatial fields or time series, or single values used globally or regionally

Where X is a spatial field or series We will represent the uncertainty in X by a single value used to perturb the whole field or series (maybe regionally, but mostly globally)

Uncertainty in aerosol simulator inputs Diagram courtesy of PNNL

Uncertainty in simulator inputs X is uncertain, so Y is uncertain. x can be given an uncertainty distribution G; marginally, x_k has distribution G_k. Obtain y_i = f(x_i) for each x_i sampled from G
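The sampling step above can be sketched as a simple Monte Carlo loop. This is a toy illustration only: the simulator f and the distribution G below are hypothetical stand-ins, with G assumed to have independent marginals.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Hypothetical toy stand-in for an expensive simulator
    return np.sin(x[0]) + 0.5 * x[1] ** 2

def sample_G(n):
    # G with independent marginals G_1, G_2 (an illustrative assumption)
    x1 = rng.uniform(0.0, np.pi, n)   # G_1: uniform
    x2 = rng.normal(0.0, 1.0, n)      # G_2: standard Gaussian
    return np.column_stack([x1, x2])

# Obtain y_i = f(x_i) for x_i sampled from G
X = sample_G(1000)
y = np.array([f(x) for x in X])
print(y.mean(), y.std())  # Monte Carlo summary of the uncertainty in Y
```

Each of the 1000 samples requires one simulator run, which is exactly what becomes infeasible when f is expensive.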

Simulator complexity Simulators contain lots of equations (hundreds of lines of code), and simulator resolution increases with computer power. Credit: NOAA

Bottom line f is often too expensive to run for the many x_i from G needed to characterise the effect of uncertainty on y

Emulation Replace f with f̂. f̂ is the emulator; f̂ is much quicker to run but accurately represents f

More accurately Replace y_k = f(x) with ŷ_k = f̂(x). Define an output (or selection of outputs) to be emulated; the emulator doesn't replace the whole simulator for all research

What is y? From the simulator, y can be any or all model outputs/diagnostics. For emulation, y must be more restricted: y is often a scalar representing a single model output at a particular time and location, e.g. global mean temperature for 2000, or the total number of particulate matter aerosol > 2.5 µm in a model grid box. y could also be a spatial field, a time series, EOFs/PCAs, or a selection of model outputs

Emulator complexity The complexity of the emulator depends on the complexity of the model output to be emulated

Continuing... We will carry on assuming y is scalar and x is a collection of scalar values, x = (x_1, x_2, x_3, ..., x_p), for p uncertain parameters. Use a selection of (x_i, y_i) pairs from f to build f̂

When is an emulator useful? Uncertainty analysis: quantify the impact of uncertainty on model output. Sensitivity analysis: understand the model response to uncertain inputs. And when the model is too complex to do the sampling required

A statistical emulator An emulator that gives estimates of its own uncertainty. We tend to use the Gaussian process

The Gaussian process emulator (Rasmussen and Williams, 2006; O'Hagan, 2006) Each point in space has a Gaussian distribution, each collection of points has a multivariate Gaussian distribution, and the whole collection is a Gaussian process. In its basic form it requires smooth output, but does not require the output to be Gaussian

The GP prior Set up the Gaussian process prior over functions: f(x) | β, σ² ~ GP(h(x)^T β, σ²c(x, x')). h(x)^T β is the mean function, c(x, x') is the covariance function, and β, σ² are hyperparameters

Specifying the GP mean Use the mean function to specify any functional form known to exist. In the absence of prior information we tend to specify it as constant or as a linear regression formula: h(x) = 1, or h(x)^T β = β_0 + β_1 x_1 + β_2 x_2 + ... + β_p x_p

What does the prior mean function do? When there are very few simulator runs to give information, the function is weighted towards the prior mean. Outside the bounds of the simulator runs, the emulator tends towards the prior mean. With more (well-chosen) simulator runs, the hyperparameters β are better estimated

β hyperparameters Can be plugged in if a particular regression model is required. Usually we use maximum likelihood to estimate them and plug in. A fully Bayesian approach could be used but is time-consuming
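As a sketch of the plug-in approach, hyperparameters can be estimated by minimising the negative log marginal likelihood of the training data. For simplicity only the correlation length δ of a Gaussian covariance is optimised here, on a hypothetical 1-D training set with σ² fixed; a real emulator would estimate β and σ² too.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical 1-D training set
X = np.linspace(0.0, 1.0, 8)
y = np.sin(2 * np.pi * X)

def neg_log_marginal_likelihood(log_delta, sigma2=1.0, nugget=1e-6):
    delta = np.exp(log_delta)
    d = X[:, None] - X[None, :]
    K = sigma2 * np.exp(-d ** 2 / (2 * delta ** 2)) + nugget * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # 0.5 y' K^-1 y + 0.5 log|K| + (n/2) log(2 pi)
    return 0.5 * y @ alpha + np.log(np.diag(L)).sum() + 0.5 * len(X) * np.log(2 * np.pi)

res = minimize_scalar(neg_log_marginal_likelihood, bounds=(-4.0, 1.0), method="bounded")
delta_hat = np.exp(res.x)  # plug-in estimate of the correlation length
print(delta_hat)
```

Optimising log δ rather than δ keeps the search positive and better scaled.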

Specifying the GP covariance The covariance function should reflect that c(x, x) = 1 and that c(x, x') decreases as |x − x'| increases. It includes further hyperparameters δ that specify the speed of the covariance decrease with increasing |x − x'|: cov(f(x), f(x') | σ²) = σ²c(x, x')

Typical choices for c(x, x') Gaussian: c(x, x' | δ) = exp{−(x − x')² / (2δ²)}. Matérn 5/2: c(x, x' | δ) = (1 + √5|x − x'|/δ + 5(x − x')²/(3δ²)) exp(−√5|x − x'|/δ)
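These two correlation functions can be written directly. This is a one-dimensional sketch; `delta` is the correlation-length hyperparameter δ.

```python
import numpy as np

def gaussian_corr(x, xp, delta):
    # Gaussian (squared-exponential): exp{-(x - x')^2 / (2 delta^2)}
    r = np.abs(x - xp)
    return np.exp(-r ** 2 / (2 * delta ** 2))

def matern52_corr(x, xp, delta):
    # Matern 5/2: (1 + sqrt(5)r/delta + 5r^2/(3 delta^2)) exp(-sqrt(5)r/delta)
    s = np.sqrt(5) * np.abs(x - xp) / delta
    return (1 + s + s ** 2 / 3) * np.exp(-s)

# Both equal 1 at zero separation and decay as |x - x'| grows
print(gaussian_corr(0.0, 0.0, 1.0), matern52_corr(0.0, 0.0, 1.0))
print(gaussian_corr(0.0, 2.0, 1.0), matern52_corr(0.0, 2.0, 1.0))
```

The Gaussian kernel gives very smooth sample paths; Matérn 5/2 is a common, slightly rougher alternative.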

Effect of covariance The uncertainty at simulator runs is 0 unless a nugget is used. The uncertainty between points decreases as they get closer to simulator runs. The width of the bounds between points increases as δ decreases

The posterior distribution A Gaussian process, with parameters estimated from the training data. Any point can be estimated, and the uncertainty in the estimate can also be estimated
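For a zero-mean GP with a Gaussian covariance, the posterior mean and variance at new points follow from the standard Gaussian conditioning formulae. This is a minimal sketch with fixed, assumed hyperparameter values; a real emulator would also fit the mean function h(x)^T β and estimate σ² and δ.

```python
import numpy as np

def gp_posterior(X_train, y_train, X_new, sigma2=1.0, delta=0.2, nugget=1e-10):
    # Posterior of a zero-mean GP with Gaussian covariance at new points
    def k(a, b):
        return sigma2 * np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * delta ** 2))
    K = k(X_train, X_train) + nugget * np.eye(len(X_train))
    Ks = k(X_new, X_train)
    mean = Ks @ np.linalg.solve(K, y_train)                 # posterior mean
    cov = k(X_new, X_new) - Ks @ np.linalg.solve(K, Ks.T)   # posterior covariance
    return mean, np.diag(cov)

X_train = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
y_train = np.sin(2 * np.pi * X_train)
mean, var = gp_posterior(X_train, y_train, np.array([0.1, 0.6]))
```

Note that the posterior variance at the training points themselves is (numerically) zero, illustrating the "uncertainty at simulator runs is 0" behaviour described above.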

Emulator uniqueness The emulator is non-unique: the hyperparameters will be different if you re-run, unless you set the seed. Using prior information can help stability. Often, it's not actually a problem

Building an emulator From Jill Johnson, following O'Hagan (2006)

Aerosol model emulation with DiceKriging (Roustant et al., 2012; Lee et al., 2013) Emulation done gridbox by gridbox for sensitivity analysis

Training data We need to update the prior with simulator points, and require these points to provide good information about the function throughout the uncertain space. Often we assume no prior information regarding particular parts of the space

Design for training data (McKay et al., 1979) Latin hypercube sampling. Good for inactive dimensions. Use optimum or maximin designs. May want piecewise LHS

Latin hypercube sampling

The marginals
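A Latin hypercube design like the ones pictured can be generated with `scipy.stats.qmc`. The parameter ranges below are hypothetical, and the `optimization="random-cd"` option (SciPy ≥ 1.8) gives a space-filling variant in the spirit of the optimum/maximin designs mentioned above.

```python
from scipy.stats import qmc

# Latin hypercube for p = 3 uncertain parameters, n = 20 training runs
sampler = qmc.LatinHypercube(d=3, optimization="random-cd", seed=1)
unit = sampler.random(n=20)  # points in [0, 1)^3; one per 1/20 slice in each dimension

# Map to hypothetical physical parameter ranges
design = qmc.scale(unit, l_bounds=[0.1, 1e-3, 200.0], u_bounds=[10.0, 1.0, 400.0])
print(design.shape)
```

The defining property, visible in the marginals, is that every 1-D projection is stratified: each of the n equal-probability intervals contains exactly one design point.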

Sequential design When emulator validation suggests it, we can update the design sequentially. If there is no reason to search a particular region, augment the design; points can be added sequentially, perhaps using Sobol sequences, to particular regions

Factorial designs These are not recommended: the covariance function is better estimated using a variety of distances between points, and the uncertainty between training points will be very high

Validation of emulators Have to check the emulator can predict what the simulator would say. Usually we want to check out of sample; leave-one-out can be used when circumstances dictate

Aims of validation Prediction value: ensure emulator predictions are close to the simulator. Prediction uncertainty: ensure the uncertainty in the emulator is small enough for further inference

Validation design When validating out-of-sample we tend to design two groups, as per Bastos & O'Hagan (2009): 1/3 close to simulated points and 2/3 spaced out. This helps reveal particular difficulties

Validation statistics Individual prediction errors: treat like regression errors, approximately Gaussian and < 2 when standardised. Mahalanobis distance: a single summary of the individual prediction errors; extreme values indicate emulator/simulator mismatch. Pivoted Cholesky errors: follow a variance decomposition; particular patterns aid interpretation of the mismatch
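The first two statistics can be sketched as follows, given validation-run outputs and the emulator's predictive mean and covariance at those points. The inputs here are toy values for illustration only.

```python
import numpy as np

def validation_stats(y_val, mean_pred, cov_pred):
    # Individual standardised errors and the Mahalanobis distance
    err = y_val - mean_pred
    std_err = err / np.sqrt(np.diag(cov_pred))   # each ~ N(0, 1) if the emulator is valid
    maha = err @ np.linalg.solve(cov_pred, err)  # single summary of all prediction errors
    return std_err, maha

# Toy example: a well-calibrated emulator on 5 validation runs
rng = np.random.default_rng(0)
cov_pred = np.eye(5) * 0.04
y_val = rng.multivariate_normal(np.zeros(5), cov_pred)
std_err, maha = validation_stats(y_val, np.zeros(5), cov_pred)
print(std_err, maha)
```

With a diagonal predictive covariance the Mahalanobis distance reduces to the sum of squared standardised errors; with correlated predictions it also accounts for the covariance between validation points.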

Validation plots

Aerosol emulator validation

Leave one out When out-of-sample validation is not possible: leave a point out, build the emulator, and produce the validation statistics. Repeat for all points and compare the statistics; any points with extreme statistics indicate problems
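A minimal leave-one-out sketch for a zero-mean GP emulator with fixed, assumed hyperparameters follows. A real implementation would re-estimate the hyperparameters on each fold, or use the closed-form leave-one-out identities to avoid refitting.

```python
import numpy as np

def loo_standardised_errors(X, y, sigma2=1.0, delta=0.2, nugget=1e-8):
    # Leave each point out in turn and predict it from the rest
    def k(a, b):
        return sigma2 * np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * delta ** 2))
    errs = []
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        Xt, yt = X[mask], y[mask]
        K = k(Xt, Xt) + nugget * np.eye(len(Xt))
        ks = k(X[i:i + 1], Xt)
        mean = (ks @ np.linalg.solve(K, yt))[0]
        var = (sigma2 + nugget - ks @ np.linalg.solve(K, ks.T))[0, 0]
        errs.append((y[i] - mean) / np.sqrt(var))  # standardised prediction error
    return np.array(errs)

X = np.linspace(0.0, 1.0, 10)
y = np.sin(2 * np.pi * X)
errs = loo_standardised_errors(X, y)
print(errs)
```

Points with standardised errors well outside roughly ±2 flag regions where the emulator is struggling.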

Emulation for understanding

Emulation for uncertainty analysis

Emulation for sensitivity analysis (Saltelli et al., 2000), using the R package sensitivity (Pujol et al., 2008)

Aerosol model sensitivity analysis

Aerosol model sensitivity analysis II

Multiple model sensitivity analysis I

Multiple model sensitivity analysis II

Emulators for model constraint I

Emulators for model constraint II

Extensions to GP emulators Stochastic simulator: add an estimated nugget to the covariance. Multivariate emulator: must characterise the covariance between multiple outputs. Discontinuities: could build multiple emulators, perhaps multivariate

What an emulator can't do Extrapolate too far with confidence. Estimate a simulator output it's not trained on. Replace the simulator

When not to use an emulator When the simulator is very cheap to run. When it won't validate. When you haven't got a good set of simulator runs for training. When you aren't sure what you'll use it for

Methodology references
Rasmussen, C.E. and Williams, C.K.I., 2006. Gaussian Processes for Machine Learning. Cambridge, MA: MIT Press.
O'Hagan, A., 2006. Bayesian analysis of computer code outputs: a tutorial. Reliability Engineering & System Safety, 91(10), pp.1290-1300.
Roustant, O., Ginsbourger, D. and Deville, Y., 2012. DiceKriging, DiceOptim: two R packages for the analysis of computer experiments by kriging-based metamodeling and optimization. Journal of Statistical Software, 51(1), pp.1-55.
McKay, M.D., Beckman, R.J. and Conover, W.J., 1979. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics, 21(2), pp.239-245.
Bastos, L.S. and O'Hagan, A., 2009. Diagnostics for Gaussian process emulators. Technometrics, 51(4), pp.425-438.
Saltelli, A., Chan, K. and Scott, E.M., eds., 2000. Sensitivity Analysis. New York: Wiley.
Pujol, G. and Iooss, B., 2008. sensitivity: an R package for sensitivity analysis.

Selected Lee references