Modeling uncertainty in metric space. Jef Caers Stanford University Stanford, California, USA

Contributors: Celine Scheidt (Research Associate), Kwangwon Park (PhD student)

Motivation: modeling uncertainty and prediction in engineering
Effective? Which modeling efforts matter for critical decisions? How many of the present approaches are routinely used by geoscientists and engineers in real applications?
Efficient? Do our methodologies work at the scale and within the time frame of real projects and real-time decision making?

State of the art
Monte Carlo simulation: generating hundreds of alternative models is still infeasible, so practitioners mostly resort to generating a few models, and important uncertainties are often ignored.
Experimental design / response surface analysis: limited to simple variables and limited in scope of applicability.

Modeling, distances and metric space

Modeling in the Earth Sciences
[Diagram: data and input variables (geological, geophysical, engineering) feed a model of a complex system (geo-processes, waves, flow), whose output is used for response prediction, optimization and control variables.]
The input lives in a high-dimensional space, the output in a low-dimensional space.
Fit-for-purpose modeling calls for dimension reduction.

Dimension/complexity reduction
[Diagram: high-dimensional space reduced to a lower-dimensional space.]
Traditional dimension reduction methods (PCA) do not account for the purpose of the model.
One may risk discarding vital elements of the model.
The dimension may not be sufficiently reduced.

Distance: metric space
[Diagram: high-dimensional (Cartesian) space mapped to a metric space through a distance.]
A metric space has no axes defined; its maximum dimension equals the number of models.
Distance = some scalar difference between any two reservoir models.
What distance? Choose a distance that is correlated with the difference in response.

Distances: multi-dimensional scaling
1000 Gaussian realizations $x_1, x_2, \ldots, x_{1000}$.
Calculate the Euclidean distance $d_{ij} = \sqrt{(x_i - x_j)^T (x_i - x_j)}$.
2D map of realizations: explicit visualization of uncertainty.

MDS summary
Reservoir model matrix: $X = [x_1, x_2, \ldots, x_L]$
Euclidean distance matrix: $A$
Dot product: $B = XX^T = V \Lambda V^T$
Reconstruction: $X = V \Lambda^{1/2}$
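The MDS summary above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the double-centering step that turns squared distances into the dot-product matrix $B$ is the standard classical-MDS construction and is assumed here, and the names `classical_mds`, `models` and `coords` are hypothetical.

```python
import numpy as np

def classical_mds(D, n_dims=2):
    """Embed L items with pairwise distance matrix D (L x L) in n_dims coordinates."""
    L = D.shape[0]
    J = np.eye(L) - np.ones((L, L)) / L          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # dot-product matrix (assumed double centering)
    eigval, eigvec = np.linalg.eigh(B)           # B = V Lambda V^T, eigenvalues ascending
    order = np.argsort(eigval)[::-1][:n_dims]    # keep the largest eigenvalues
    lam = np.clip(eigval[order], 0.0, None)      # guard against tiny negative values
    return eigvec[:, order] * np.sqrt(lam)       # X = V Lambda^{1/2}

# Stand-in "models": 30 random points in 2D (illustrative data only).
rng = np.random.default_rng(0)
models = rng.normal(size=(30, 2))
D = np.linalg.norm(models[:, None, :] - models[None, :, :], axis=-1)

coords = classical_mds(D, n_dims=2)
D_mds = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
print("largest distance mismatch:", np.max(np.abs(D - D_mds)))
```

For a truly Euclidean distance in two dimensions the 2D map reproduces the distances up to rounding; for an application-tailored distance the map is only an approximation.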

Application-tailored distance
Calculate a connectivity distance between the 1000 Gaussian models, then build the 2D map of models.

Role of MDS
To transform the application-tailored distance into an approximating Euclidean distance (Ed); we know a lot of theory on Ed.
To visualize metric space by projection: an explicit visual and diagnostic tool for (prior) model uncertainty and for how model updating proceeds.

Proper choice of distance leads to purpose driven structuring of models in metric space

Importance of a purpose-driven metric
[Figure: maps of the models for the Euclidean distance and for the connectivity distance; color indicates the amount of water produced.]

Important about the distance
A distance or metric (a difference between any two models) is NOT the same as a proxy (a measure of a single model).
The better this distance is correlated with the difference in target output response, the more effective the distance approach will be (worst case = purely random).

Kernels: working with non-Euclidean metrics

[Diagram: the metric space on $x$ is mapped to a feature space through $\phi$ and back through $\phi^{-1}$; neither space has axes defined, and each is visualized with MDS.]
Kernels are a mathematical tool to map between metric spaces.

Kernel transformations
Transformation from one metric space to another does not require knowledge of $\phi$, only knowledge of the dot product $\phi^T \phi$:
$\phi^T(x)\,\phi(y) = k(x, y)$
Example: $k(x, y) = \exp\left(-\frac{(x - y)^T (x - y)}{\sigma}\right)$, where $x$ and $y$ are the approximating Euclidean coordinates obtained with MDS.
Role of the kernel: increase dimension, separability and linearity.
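As a small illustration of the point that only the dot product is needed, the Gram matrix of the example kernel can be built directly from a distance matrix, without ever forming $\phi$. This is a sketch under assumptions: the function name `rbf_kernel_from_distances` and the median-based choice of $\sigma$ are illustrative and not taken from the slides.

```python
import numpy as np

def rbf_kernel_from_distances(D, sigma):
    """Gram matrix K_ij = exp(-d_ij^2 / sigma) from a pairwise distance matrix D."""
    return np.exp(-(D ** 2) / sigma)

# Stand-in MDS coordinates of 20 models (illustrative data only).
rng = np.random.default_rng(1)
x = rng.normal(size=(20, 3))
D = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)

# Hypothetical bandwidth heuristic: squared median of the off-diagonal distances.
sigma = np.median(D[np.triu_indices(20, k=1)]) ** 2
K = rbf_kernel_from_distances(D, sigma)

print("diagonal is one:", np.allclose(np.diag(K), 1.0))
print("smallest eigenvalue:", np.linalg.eigvalsh(K).min())
```

Because the input coordinates are Euclidean, the resulting matrix is symmetric and positive semi-definite, which is what allows it to be treated as a matrix of dot products in feature space.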

Example
[Figure: 2D projection of models from metric space, and 2D projection of the same models in feature space (RBF kernel).]

Model expansion in metric space

Why model expansion?
[Figure label: area of interest.]
Common tasks in modeling: model screening, model updating, conditioning models, response uncertainty.
Model expansion allows generating new models WITHOUT needing the original methodology/algorithms that generated the initial models.

Karhunen-Loeve expansions
Gaussian realizations: $X = [x_1, x_2, \ldots, x_L]$
Euclidean distance matrix: $A$
Dot product: $B = XX^T = V \Lambda V^T$
KL expansion: $x_{new} = V \Lambda^{1/2} y$
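The expansion above can be tried on a toy ensemble. The sketch below makes two assumptions that the slide does not state: the realizations are centered and scaled by $1/\sqrt{L}$ before forming $B$, so that $B$ is the ensemble covariance, and the ensemble mean is added back to each new model. The stand-in ensemble and the name `kl_sample` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
N, L = 40, 500                                   # grid cells per model, number of models

# Stand-in ensemble: correlated 1D Gaussian realizations (illustrative data only).
grid = np.arange(N)
C_true = np.exp(-np.abs(grid[:, None] - grid[None, :]) / 10.0)
realizations = rng.multivariate_normal(np.zeros(N), C_true, size=L).T   # N x L

mean = realizations.mean(axis=1, keepdims=True)
X = (realizations - mean) / np.sqrt(L)           # assumed centering and scaling
B = X @ X.T                                      # dot-product matrix B = X X^T
eigval, V = np.linalg.eigh(B)                    # B = V Lambda V^T
eigval = np.clip(eigval, 0.0, None)              # remove rounding-level negatives
A = V * np.sqrt(eigval)                          # V Lambda^{1/2}

def kl_sample(n):
    """Draw n new models x_new = mean + V Lambda^{1/2} y with y standard Gaussian."""
    y = rng.standard_normal((N, n))
    return mean + A @ y

new_models = kl_sample(3)
print(new_models.shape)
```

By construction the new models have covariance $B$, so they reflect the same (second-order) uncertainty as the original ensemble without rerunning the algorithm that produced it.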

Euclidean-distance-based KL expansions
[Figure: 1000 models; calculate the Euclidean distance; 3 new models obtained by KL expansion.]

MDS of new models
Blue dots: 1000 new models. Red dots: 1000 original models. Both populations reflect the same uncertainty.

Non-Gaussian, non-Euclidean model expansions?
[Diagram: metric space mapped to feature space through $\phi$ and back through $\phi^{-1}$, each visualized with MDS; model expansion takes place in feature space.]

Model expansion in feature space
Mapping to feature space: $x_i \mapsto \phi(x_i)$, $X \mapsto \Phi$
Kernel or Gram matrix $K$ ($L \times L$): $K_{ij} = \phi^T(x_i)\,\phi(x_j) = k(x_i, x_j) = \exp\left(-\frac{(x_i - x_j)^T (x_i - x_j)}{\sigma}\right)$
Eigendecomposition: $K = V_K \Lambda_K V_K^T$
MDS using $K$: $\Phi = V_K \Lambda_K^{1/2}$
Karhunen-Loeve expansion: $\phi(x) = \Phi b$ with $b = \frac{1}{\sqrt{L}} V_K y$ ($y$ is a standard Gaussian vector)
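A minimal sketch of the feature-space expansion, assuming the Gaussian kernel of the earlier slide and ignoring centering in feature space for brevity. All variable names and the bandwidth choice are illustrative. The point of the sketch is that the coefficient vector $b$ is obtained from $K$ alone.

```python
import numpy as np

rng = np.random.default_rng(3)
L = 50
x = rng.normal(size=(L, 2))                      # stand-in MDS coordinates of L models

D2 = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
sigma = np.median(D2[np.triu_indices(L, k=1)])   # hypothetical bandwidth choice
K = np.exp(-D2 / sigma)                          # Gram matrix K_ij = k(x_i, x_j)

eigval, V_K = np.linalg.eigh(K)                  # K = V_K Lambda_K V_K^T
eigval = np.clip(eigval, 0.0, None)
Phi_mds = V_K * np.sqrt(eigval)                  # MDS using K: V_K Lambda_K^{1/2}

y_new = rng.standard_normal(L)                   # standard Gaussian vector
b_new = V_K @ y_new / np.sqrt(L)                 # b = (1 / sqrt(L)) V_K y

# Squared feature-space norm of the new point, computed from K only: b^T K b.
norm2 = b_new @ K @ b_new
print(b_new.shape, norm2)
```

The new point $\Phi b$ exists only in feature space; turning it into an actual model is the pre-image problem.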

Mapping back to metric space
[Diagram: model expansion in feature space (MDS with $K$); how to map back through $\phi^{-1}$ to metric space (MDS with $B$)?]

Mapping back: the pre-image problem
Pre-image problem: $\hat{x}_d^{new} = \arg\min_{x_d^{new}} \left\| \phi(x_d^{new}) - \Phi b^{new} \right\|_2^2$ with $b^{new} = \frac{1}{\sqrt{L}} V_K y^{new}$
Solution: $\hat{x}_d^{new} = \sum_{i=1}^{L} \beta_i^{opt} x_{d,i}$ with $\sum_{i=1}^{L} \beta_i^{opt} = 1$
$\beta_i^{opt}$ is only a function of $K$, $y^{new}$ and $X$.

Creating a new model?
$x^{new} = \sum_{i=1}^{L} \beta_i^{opt} x_i$ with $\sum_{i=1}^{L} \beta_i^{opt} = 1$ (hard data conditioning is maintained)
Three methods:
1. Unconstrained method
2. Transformation method
3. Stochastic optimization method
See the next presentation by Celine Scheidt.
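To illustrate how weights $\beta_i$ that sum to one can arise, here is a sketch of the well-known fixed-point iteration for Gaussian-kernel pre-images (the scheme of Mika et al. from the kernel PCA literature). It is swapped in for illustration and is not claimed to be any of the three methods listed above. The function names, the value of `sigma` and the test data are hypothetical, and the feature-space point is assumed to be given as $\sum_i b_i \phi(x_i)$.

```python
import numpy as np

def rbf(x, X, sigma):
    """k(x, x_i) = exp(-|x - x_i|^2 / sigma) for every row x_i of X."""
    return np.exp(-np.sum((X - x) ** 2, axis=1) / sigma)

def preimage_fixed_point(X, b, sigma, n_iter=100, tol=1e-10):
    """Approximate pre-image of sum_i b_i phi(x_i) as a weighted sum of the rows of X."""
    x = X.mean(axis=0)                           # start from the ensemble mean
    beta = np.full(len(X), 1.0 / len(X))
    for _ in range(n_iter):
        w = b * rbf(x, X, sigma)
        denom = w.sum()
        if abs(denom) < 1e-12:
            raise FloatingPointError("denominator vanished; restart from another point")
        beta = w / denom                         # weights sum to one by construction
        x_next = beta @ X                        # x = sum_i beta_i x_i
        converged = np.linalg.norm(x_next - x) < tol
        x = x_next
        if converged:
            break
    return x, beta

# Sanity check: the pre-image of phi(x_4) itself should be x_4.
rng = np.random.default_rng(4)
X = rng.normal(size=(30, 2))                     # stand-in model coordinates
b = np.zeros(30)
b[4] = 1.0
x_hat, beta = preimage_fixed_point(X, b, sigma=2.0)
print(np.allclose(x_hat, X[4]), beta.sum())
```

With coefficients of mixed sign the denominator can approach zero and the iteration may fail to converge, which is one reason constrained or stochastic alternatives are of interest.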

Non-Euclidean, non-Gaussian example
300 realizations generated using Boolean sampling.
Definition of a connectivity distance.
4 new realizations using the non-Euclidean, non-Gaussian KL expansion.

Data conditioning in metric space

A simple conditioning problem
[Figure: reference permeability field; fractional flow data, FWT (%) versus time (days).]

Formulating the conditioning problem in metric space
The data: $X = [x_1\ x_2\ \ldots\ x_L]$, $G = [g(x_1)\ g(x_2)\ \ldots\ g(x_L)]$
The problem: find $x$ such that $data = g(x)$
The metric: $d_{ij} = d(g(x_i), g(x_j))\ \forall i,j$ and $d_{i,data} = d(g(x_i), data)\ \forall i$
The augmented data: $X^+ = [x_1\ x_2\ \ldots\ x_L,\ x_{true}]$, $G^+ = [g(x_1)\ g(x_2)\ \ldots\ g(x_L),\ data]$
The problem in metric space: find $x$ such that $d(x_d, x_{d,true}) = 0$, with $x \xrightarrow{MDS} x_d$
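The augmented distance matrix can be assembled as follows. This is a sketch under the assumption that the metric $d$ is the Euclidean difference between response curves; the random arrays stand in for simulated responses $g(x_i)$ and for the field data, and `augmented_distance_matrix` is a hypothetical name.

```python
import numpy as np

def augmented_distance_matrix(G, data):
    """Pairwise distances among the L responses in G (L x n_t) plus the observed data."""
    G_plus = np.vstack([G, data])                # G+ = [g(x_1), ..., g(x_L), data]
    diff = G_plus[:, None, :] - G_plus[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))   # assumed Euclidean response distance

rng = np.random.default_rng(5)
L, n_t = 200, 25
G = rng.normal(size=(L, n_t))                    # stand-in responses, one row per model
data = rng.normal(size=n_t)                      # stand-in observed response

D_plus = augmented_distance_matrix(G, data)
d_to_data = D_plus[-1, :-1]                      # d_{i,data} for every prior model
print(D_plus.shape, d_to_data.min())
```

Applying MDS to `D_plus` places the true earth (through its data) in the same map as the prior models, even though the true model itself is unknown.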

Illustration
L = 200: 200 initial permeability models and the true earth, shown in metric space.
Find the collection of models that map onto the location of the true earth: model expansion in metric space.

Note: diagnosing a wrong prior
[Figure: 200 initial permeability models and the true earth in metric space.]
There is no need to even attempt to history match this data, because the data and the models are inconsistent.

Illustration
200 initial permeability models and the true earth, shown in metric space.
Find the collection of models that map onto the location of the true earth: model expansion in metric space.

The post-image problem
[Diagram: metric space (MDS on $B$) and feature space (MDS on $K$) linked by $\phi$ and $\phi^{-1}$; which expanded models map onto the target location?]

The post-image problem
Find $x$ such that $d(x_d, x_{d,true}) = 0$, with $x \xrightarrow{MDS} x_d$
$y_{opt} = \arg\min_{y_{new}} d(x_d^{new}, x_{d,true})$ with $\phi(x_d^{new}) = \Phi \frac{1}{\sqrt{L}} V_K y_{new}$
Use gradual deformation to find multiple Gaussian vector solutions.

Illustration
Four realizations obtained by solving the post-image problem with gradual deformation.
Solving the post-image problem does not require any new flow simulation.
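Gradual deformation combines two Gaussian vectors as $y(\theta) = y_1 \cos\theta + y_2 \sin\theta$, which stays standard Gaussian for any $\theta$, and searches over $\theta$. The sketch below is a toy version under stated assumptions: the map from $y$ to a location in metric space is replaced by a hypothetical linear map `M`, the target location is invented, and the grid search over $\theta$ is one of several possible one-dimensional searches.

```python
import numpy as np

def gradual_deformation(objective, dim, rng, n_outer=50, n_theta=41):
    """Minimize objective(y) over Gaussian vectors by repeated one-parameter deformations."""
    y = rng.standard_normal(dim)
    best = objective(y)
    history = [best]
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_theta)
    for _ in range(n_outer):
        y_other = rng.standard_normal(dim)       # fresh, independent Gaussian vector
        candidates = [np.cos(t) * y + np.sin(t) * y_other for t in thetas]
        values = [objective(c) for c in candidates]
        k = int(np.argmin(values))
        if values[k] < best:                     # accept only improvements
            y, best = candidates[k], values[k]
        history.append(best)
    return y, history

rng = np.random.default_rng(6)
dim = 20
M = rng.normal(size=(2, dim)) / np.sqrt(dim)     # hypothetical map from y to a 2D location
target = np.array([0.5, -0.3])                   # hypothetical location of the true earth

def distance_to_target(y):
    return float(np.linalg.norm(M @ y - target))

y_opt, history = gradual_deformation(distance_to_target, dim, rng)
print(history[0], history[-1])
```

Restarting from different random seeds yields different Gaussian vectors that map close to the same target, which is how multiple solutions are obtained. The objective involves no flow simulation, only distances in metric space.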

Channelized case
[Figure: reference model with an injector and a production well.]

Initial set: 200 models mapped from metric space

Solve the post-image problem: 4 history-matched models

Use of proxy distances
Proxy metric space (MDS): requires little CPU time.
Actual metric space (MDS): requires a lot of CPU time.

Use of proxy distance
Proxy metric space (MDS): cluster and select models.
Actual metric space (MDS): solve the post-image problem.
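One simple way to "cluster and select" from a proxy distance matrix is k-medoids, which needs only distances and returns actual models as cluster centres. The slides do not say which clustering algorithm is used, so this choice, the function name `k_medoids` and the random stand-in proxy responses are all assumptions.

```python
import numpy as np

def k_medoids(D, k, rng, n_iter=100):
    """Alternating k-medoids on a distance matrix D; returns medoid indices and labels."""
    L = D.shape[0]
    medoids = rng.choice(L, size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)        # assign to nearest medoid
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue                                 # keep old medoid if cluster is empty
            within = D[np.ix_(members, members)].sum(axis=0)
            new_medoids[c] = members[np.argmin(within)]  # most central member
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(D[:, medoids], axis=1)
    return medoids, labels

rng = np.random.default_rng(7)
proxy_response = rng.normal(size=(100, 3))               # stand-in proxy responses
D_proxy = np.linalg.norm(proxy_response[:, None, :] - proxy_response[None, :, :], axis=-1)

selected, labels = k_medoids(D_proxy, k=7, rng=rng)
print("models selected for flow simulation:", sorted(selected.tolist()))
```

Only the selected models would then be run through the full flow simulator to populate the actual metric space.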

Illustration
100 realizations of permeability (SGS), 100x100x1.
2 wells: 1 production well (P1), 1 observation well (O1).
Objective: history match pressure at O1.
Distance: difference in pressure at O1 for each time step.
Proxy distance: difference in pressure at O1 for the last three time steps.
[Figure: cross-plot of Distance - Proxy versus Distance - Eclipse, ρ = 0.89.]
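The quality of a proxy distance can be checked by correlating it with the actual distance over all model pairs, as in the cross-plot. The sketch below uses invented pressure curves, so the printed coefficient has nothing to do with the ρ = 0.89 reported on the slide; the name `pairwise_distance_correlation` is hypothetical.

```python
import numpy as np

def pairwise_distance_correlation(D_a, D_b):
    """Pearson correlation between two distance matrices over all distinct pairs."""
    iu = np.triu_indices_from(D_a, k=1)          # each pair counted once, no diagonal
    return float(np.corrcoef(D_a[iu], D_b[iu])[0, 1])

def euclidean_distances(R):
    return np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)

# Stand-in pressure curves at the observation well: 100 models, 20 time steps.
rng = np.random.default_rng(8)
pressure = np.cumsum(rng.normal(size=(100, 20)), axis=1)

D_full = euclidean_distances(pressure)           # distance over every time step
D_proxy = euclidean_distances(pressure[:, -3:])  # proxy: last three time steps only

rho = pairwise_distance_correlation(D_proxy, D_full)
print("correlation between proxy and full distance:", round(rho, 2))
```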

Illustration
MDS projection from the proxy metric space, with the truth marked.
7 models are selected for flow simulation.

Illustration
MDS projection from the actual metric space.
Solve the post-image problem through model expansion.
Construct 100 new models.

Illustration
[Figure: pressure at O1 versus time (days).]
100 history-matched models obtained by performing only 7 flow simulations.

What's next?
Celine: More on solving the pre-image/post-image problem
Kwangwon: Kalman filtering in metric space: a reservoir case study
Celine: Joint construction of high-resolution and coarse models in metric space
Mehrdad: Multi-point algorithms in metric space

Concluding remarks
MUMS: Modeling Uncertainty in Metric Space.
A powerful diagnostic tool for model uncertainty, data-model consistency and model updating.
Distances allow including the purpose of modeling.
Working with ensembles is more effective and efficient than working on a single model at a time.