Modeling uncertainty in metric space Jef Caers Stanford University Stanford, California, USA
Contributors Celine Scheidt (Research Associate) Kwangwon Park (PhD student)
Motivation Modeling uncertainty and prediction in engineering Effective? Which modeling efforts matter for critical decisions? How many of the present approaches are routinely used by geoscientists and engineers in real applications? Efficient? Do our methodologies work at the scale and within the time frame of real projects and real-time decision making?
State of the art Monte Carlo simulation Generating 100s of alternative models is still infeasible Mostly resort to generating a few models Important uncertainties often ignored Experimental design / response surface analysis Limited to simple variables Limited in scope of applicability
Modeling, distances and metric space Modeling uncertainty in metric space
Modeling in the Earth Sciences Data and input variables (geological, geophysical, engineering) feed a model of a complex system (geo processes, waves, flow), which produces an output response (prediction, optimization, control variables) Fit for purpose modeling: dimension reduction from a high dimensional space to a low dimensional space
Dimension/complexity reduction High Dimensional space Lower Dimensional space Traditional dimension reduction methods (e.g. PCA) do not account for purpose One may risk discarding vital elements of the model Dimension may not be sufficiently reduced
Distance: metric space High Dimensional space (Cartesian) Metric space (no axis defined, max dimension = # models) Distance = some difference (scalar) between any two reservoir models What distance? Choose a distance that is correlated with the difference in response
Distances: multi dimensional scaling 1000 Gaussian realizations $x_1, x_2, \ldots, x_{1000}$ Calculate Euclidean distance $d_{ij} = \sqrt{(x_i - x_j)^T (x_i - x_j)}$ 2D map of realizations: explicit visualization of uncertainty
MDS Summary Reservoir model matrix $X = [x_1, x_2, \ldots, x_L]$ Euclidean distance matrix $A$ Dot product $B = XX^T = V \Lambda V^T$ Reconstruction $X = V \Lambda^{1/2}$
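The MDS steps summarized on this slide can be sketched in a few lines of NumPy. This is a minimal sketch, not the authors' implementation: the double-centering step (A = -D²/2, B = HAH) is the standard classical-MDS recipe and is assumed here because the slide only names the matrices. The function name `classical_mds` and the toy data are hypothetical.

```python
import numpy as np

def classical_mds(D, n_dims=2):
    """Classical MDS: coordinates whose Euclidean distances approximate D."""
    L = D.shape[0]
    A = -0.5 * D**2                          # scaled squared distances
    H = np.eye(L) - np.ones((L, L)) / L      # centering matrix (assumed step)
    B = H @ A @ H                            # dot-product matrix
    eigval, eigvec = np.linalg.eigh(B)       # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:n_dims]
    lam = np.clip(eigval[order], 0.0, None)  # guard against round-off negatives
    return eigvec[:, order] * np.sqrt(lam)   # X_d = V_d Lambda_d^{1/2}

# Toy ensemble: 20 hypothetical "models" with 3 cells each.
rng = np.random.default_rng(0)
models = rng.normal(size=(20, 3))
diff = models[:, None, :] - models[None, :, :]
D = np.sqrt((diff**2).sum(axis=-1))          # Euclidean distance matrix

X3 = classical_mds(D, n_dims=3)
diff3 = X3[:, None, :] - X3[None, :, :]
D3 = np.sqrt((diff3**2).sum(axis=-1))        # distances between mapped points
```

With a Euclidean input distance and enough retained dimensions, the mapped points reproduce the input distances; with `n_dims=2` the result is the kind of 2D map shown on the slides.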
Application tailored distance 1000 Gaussian models 2D map of models Calculate a connectivity distance
Role of MDS To transform the application tailored distance into an approximating Euclidean distance (Ed) We know a lot of theory on Ed To visualize metric space by projection An explicit visual and diagnostic tool for (prior) model uncertainty, how model updating proceeds
Proper choice of distance leads to purpose driven structuring of models in metric space
Importance of purpose driven metric Euclidean distance Connectivity distance Color indicates amount of water produced
Important points about the distance A distance or metric (a difference between any two models) is NOT the same as a proxy (a measure of a single model) The better this distance is correlated with the difference in target output response, the more effective the distance approach will be (worst case = purely random)
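One way to check the point made on this slide is to correlate pairwise distances with pairwise differences in response. The sketch below assumes a scalar response per model. The helper `pairwise_correlation` and the toy data are hypothetical, chosen only to contrast an informative distance with a purely random one.

```python
import numpy as np

def pairwise_correlation(D, response):
    """Correlation between pairwise distances and pairwise response differences."""
    response = np.asarray(response, dtype=float)
    dR = np.abs(response[:, None] - response[None, :])
    iu = np.triu_indices(len(response), k=1)   # count each pair once
    return np.corrcoef(D[iu], dR[iu])[0, 1]

rng = np.random.default_rng(1)
feature = rng.uniform(size=50)                 # hypothetical model property
response = 3.0 * feature + 0.05 * rng.normal(size=50)   # toy response

# A distance built on the property that drives the response...
D_good = np.abs(feature[:, None] - feature[None, :])
# ...versus a distance built on something unrelated to the response.
unrelated = rng.uniform(size=50)
D_random = np.abs(unrelated[:, None] - unrelated[None, :])

rho_good = pairwise_correlation(D_good, response)
rho_random = pairwise_correlation(D_random, response)
```

Note that the distance is a property of a pair of models, so the correlation is taken over pairs, not over individual models.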
Kernels: working with non Euclidean metrics Modeling uncertainty in metric space
Metric space on x (no axis defined) maps through $\phi$ to feature space on x (no axis defined), and back through $\phi^{-1}$; each space is visualized with MDS Kernels are a mathematical tool to map between metric spaces
Kernel transformations Transformation from one metric space to another does not require knowledge of $\phi$, only knowledge of the dot product $\phi^T \phi$: $\phi^T(x)\,\phi(y) = k(x, y)$ Example: $k(x, y) = \exp\left(-\frac{(x - y)^T (x - y)}{\sigma}\right)$, using the approximating Euclidean distance obtained with MDS Role of the kernel: increase dimension, separability and linearity
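The kernel trick stated on this slide can be illustrated directly: both the Gram matrix and the distances in feature space follow from the distance matrix alone, without ever evaluating the mapping. This is a minimal sketch; dividing the squared distance by `sigma` is an assumed bandwidth convention, and the function names and the 3 x 3 distance matrix are hypothetical.

```python
import numpy as np

def rbf_kernel_from_distance(D, sigma):
    """k_ij = exp(-d_ij^2 / sigma), computed from the distance matrix only."""
    return np.exp(-(D**2) / sigma)

def feature_space_distance(K):
    """Distance between mapped models, from dot products only:
    ||phi_i - phi_j||^2 = k_ii + k_jj - 2 k_ij."""
    diag = np.diag(K)
    sq = diag[:, None] + diag[None, :] - 2.0 * K
    return np.sqrt(np.clip(sq, 0.0, None))

# Hypothetical distances between three models.
D = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.5],
              [2.0, 1.5, 0.0]])

K = rbf_kernel_from_distance(D, sigma=2.0)
D_feat = feature_space_distance(K)
```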
Example 2D projection of models From metric space 2D projection of models in feature space RBF Kernel
Model expansion in metric space Modeling uncertainty in metric space
Why model expansion? Area of interest Common tasks in modeling: Model screening Model updating Conditioning models Response uncertainty Model expansion allows generating new models WITHOUT needing the original methodology/algorithms that generated the initial models
Karhunen Loeve expansions Gaussian realizations $X = [x_1, x_2, \ldots, x_L]$ Euclidean distance matrix $A$ Dot-product $B = XX^T = V \Lambda V^T$ KL-expansion $x_{new} = V \Lambda^{1/2} y$
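A Karhunen-Loeve expansion of this kind might be sketched as follows. The assumptions are loud ones: the columns of `X` are realizations, the dot-product matrix is divided by L so that it acts as a sample covariance, and the toy ensemble is drawn from a hypothetical exponential covariance. None of this is taken from the slides beyond the eigendecomposition and the expansion formula.

```python
import numpy as np

rng = np.random.default_rng(2)
n_cells, L = 30, 200

# Toy ensemble (hypothetical): columns are realizations of a correlated field.
idx = np.arange(n_cells)
C_true = np.exp(-np.abs(idx[:, None] - idx[None, :]) / 5.0)
X = np.linalg.cholesky(C_true) @ rng.normal(size=(n_cells, L))

# Dot-product matrix, normalized by L (assumption) to act as a covariance.
B = X @ X.T / L
lam, V = np.linalg.eigh(B)          # B = V diag(lam) V^T
lam = np.clip(lam, 0.0, None)       # guard against round-off negatives

def kl_sample(rng):
    """Draw x_new = V Lambda^{1/2} y with y a standard Gaussian vector."""
    y = rng.normal(size=n_cells)
    return V @ (np.sqrt(lam) * y)

# Three new models, generated without the algorithm that produced X.
new_models = np.column_stack([kl_sample(rng) for _ in range(3)])
```

New realizations drawn this way have covariance B, so they reflect the same second-order uncertainty as the original ensemble.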
Euclidean distance based KL expansions 1000 models 3 new models KL expansion Calculate Euclidean distance
MDS of new models Blue dots: 1000 new models Red dots: 1000 original models both populations reflect the same uncertainty
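Non Gaussian, non Euclidean model expansions? Metric space maps through $\phi$ to feature space (each visualized with MDS) Model expansion is performed in feature space, then mapped back through $\phi^{-1}$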
Model expansion in feature space Mapping: $x_i \mapsto \phi(x_i)$, $X \mapsto \Phi$ Kernel or Gram matrix $K$ ($L \times L$): $K_{ij} = \phi^T(x_i)\,\phi(x_j) = k(x_i, x_j) = \exp\left(-\frac{(x_i - x_j)^T (x_i - x_j)}{\sigma}\right)$ Eigendecomposition: $K = V_K \Lambda_K V_K^T$ MDS using $K$: $\Phi = V_K \Lambda_K^{1/2}$ Karhunen-Loeve expansion: $\phi(x) = \Phi b$ with $b = \frac{1}{\sqrt{L}} V_K y$ ($y$ is a standard Gaussian vector)
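The feature-space expansion on this slide can be sketched numerically. The code below is an illustration under assumptions: random 2D points stand in for MDS coordinates, the bandwidth `sigma` is arbitrary, and the rows of `Phi` are taken to be the mapped models, so the slide's product of the feature matrix with b is written `Phi.T @ b`.

```python
import numpy as np

rng = np.random.default_rng(3)
L = 40
coords = rng.normal(size=(L, 2))      # stand-in for MDS coordinates
diff = coords[:, None, :] - coords[None, :, :]
D = np.sqrt((diff**2).sum(axis=-1))

sigma = 2.0                            # hypothetical kernel bandwidth
K = np.exp(-(D**2) / sigma)            # L x L kernel (Gram) matrix
lam_K, V_K = np.linalg.eigh(K)         # K = V_K diag(lam_K) V_K^T
lam_K = np.clip(lam_K, 0.0, None)      # guard against round-off negatives
Phi = V_K * np.sqrt(lam_K)             # rows are mapped models

y = rng.normal(size=L)                 # standard Gaussian vector
b = V_K @ y / np.sqrt(L)               # b = (1/sqrt(L)) V_K y
phi_new = Phi.T @ b                    # new point: sum_i b_i phi(x_i)
```

The new point lives in feature space only; turning it into a model requires the mapping back discussed on the following slides.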
Mapping back to metric space Model expansion? Metric space (MDS with B) maps through $\phi$ to feature space (MDS with K); how to map back through $\phi^{-1}$?
Mapping back: the pre image problem (from MDS with K back to MDS with B) Pre-image problem: $\hat{x}^d_{new} = \arg\min_{x^d_{new}} \| \phi(x^d_{new}) - \Phi b_{new} \|_2^2$ with $b_{new} = \frac{1}{\sqrt{L}} V_K y_{new}$ Solution: $\hat{x}^d_{new} = \sum_{i=1}^{L} \beta_i^{opt} x_{d,i}$ with $\sum_{i=1}^{L} \beta_i^{opt} = 1$, where $\beta_i^{opt}$ is a function of $K$, $y_{new}$ and $X$ only
Creating a new model? $x_{new} = \sum_{i=1}^{L} \beta_i^{opt} x_i$ with $\sum_{i=1}^{L} \beta_i^{opt} = 1$ in metric space (hard data conditioning is maintained) Three methods (MDS on B): 1. Unconstrained method 2. Transformation method 3. Stochastic optimization method See next presentation of Celine Scheidt
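The remark that hard data conditioning is maintained follows from the weights summing to one: if every model takes the same value at a hard data location, any such weighted sum takes that value too. A small numerical check, with hypothetical well locations, values and weights:

```python
import numpy as np

rng = np.random.default_rng(4)
n_cells, L = 25, 6
hard_idx = np.array([3, 17])           # hypothetical hard data locations
hard_val = np.array([1.5, -0.7])       # hypothetical measured values

X = rng.normal(size=(n_cells, L))      # columns are models
X[hard_idx, :] = hard_val[:, None]     # every model honors the hard data

# Hypothetical weights; they may be negative, but they sum to 1.
beta = np.array([0.5, 0.4, -0.2, 0.1, 0.3, -0.1])

x_new = X @ beta                       # x_new = sum_i beta_i x_i
```

At the hard data cells, `x_new` equals the measured values exactly, while the other cells are free to vary with the weights.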
Non-Euclidean, non-Gaussian example 300 realizations generated using Boolean sampling Definition of connectivity distance 4 new realizations using non-Euclidean, non-Gaussian KL expansion
Data conditioning in metric space Modeling uncertainty in metric space
A simple conditioning problem Reference permeability field Fractional flow data FWT (%) Time (days)
Formulating the conditioning problem in metric space The data: $X = [x_1, x_2, \ldots, x_L]$, $G = [g(x_1), g(x_2), \ldots, g(x_L)]$ The problem: find $x$ such that $data = g(x)$ The metric: $d_{ij} = d(g(x_i), g(x_j))$ for all $i, j$; $d_{i,data} = d(g(x_i), data)$ for all $i$ The augmented data: $X^+ = [x_1, x_2, \ldots, x_L, x_{true}]$, $G^+ = [g(x_1), g(x_2), \ldots, g(x_L), data]$ The problem in metric space: find $x$ such that $d(x_d, x_{d,true}) = 0$, with $x \mapsto x_d$ by MDS
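The augmented formulation can be sketched as follows: the field data are appended to the model responses as one extra "response", so the unknown true earth gets a row in the distance matrix even though the model itself is unknown. The response curves, the Euclidean choice of d, and all names below are hypothetical stand-ins for flow simulation output.

```python
import numpy as np

def response_distance_matrix(G):
    """d_ij = Euclidean distance between response curves (rows of G)."""
    diff = G[:, None, :] - G[None, :, :]
    return np.sqrt((diff**2).sum(axis=-1))

rng = np.random.default_rng(5)
L, n_times = 200, 12
t = np.linspace(0.0, 1.0, n_times)

# Toy responses g(x_i): hypothetical curves controlled by one parameter.
rate = rng.uniform(1.0, 5.0, size=L)
G = 1.0 - np.exp(-rate[:, None] * t[None, :])
data = 1.0 - np.exp(-3.0 * t)               # stand-in for the field data

G_plus = np.vstack([G, data])               # augmented responses
D_plus = response_distance_matrix(G_plus)   # (L+1) x (L+1) distances
d_to_data = D_plus[-1, :-1]                 # d_{i,data} for every model
closest = np.argsort(d_to_data)[:5]         # models mapped nearest the data
```

Applying MDS to `D_plus` places the data point among the models, which is what makes the location of the true earth visible on the 2D maps.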
Illustration L=200 200 initial permeability models True earth Find the collection of models that map onto this location: model expansion in metric space
Note: diagnosing a wrong prior 200 initial permeability models True earth No need to even attempt to history match this data due to data-model inconsistency
The post image problem Model expansion: metric space (MDS on B) and feature space (MDS on K?), linked by $\phi$ and $\phi^{-1}$
The post image problem (MDS on B, MDS on K) Find $x$ such that $d(x_d, x_{d,true}) = 0$, with $x \mapsto x_d$ by MDS $y_{opt} = \arg\min_{y_{new}} d(x^d_{new}, x_{d,true})$ with $\phi(x^d_{new}) = \Phi \frac{1}{\sqrt{L}} V_K y_{new}$ Use gradual deformation to find multiple Gaussian vector solutions
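Gradual deformation searches along combinations of two Gaussian vectors, y cos(theta) + z sin(theta), which remain Gaussian for any theta. The sketch below is a generic version of that idea with a grid search over theta. The objective is a hypothetical linear map to 2D followed by a distance to a target location, standing in for the metric-space distance on this slide.

```python
import numpy as np

def gradual_deformation(objective, n_dim, rng, n_iter=30, n_theta=41):
    """Minimize objective(y) over combinations y*cos(theta) + z*sin(theta)."""
    y = rng.normal(size=n_dim)
    thetas = np.linspace(-np.pi, np.pi, n_theta)
    history = [objective(y)]
    for _ in range(n_iter):
        z = rng.normal(size=n_dim)              # independent Gaussian vector
        candidates = [y * np.cos(th) + z * np.sin(th) for th in thetas]
        values = [objective(c) for c in candidates]
        best = int(np.argmin(values))
        if values[best] < history[-1]:          # accept only improvements
            y = candidates[best]
            history.append(values[best])
        else:
            history.append(history[-1])
    return y, history

rng = np.random.default_rng(6)
M = rng.normal(size=(2, 10)) / np.sqrt(10)      # hypothetical map to 2D
target = np.array([0.5, -0.3])                  # stand-in for the true location

def objective(y):
    return float(np.linalg.norm(M @ y - target))

y_opt, history = gradual_deformation(objective, n_dim=10, rng=rng)
```

Restarting from a different initial vector gives a different solution, which is how multiple Gaussian vector solutions could be collected. No flow simulation enters the objective in this setup.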
Illustration four realizations through solving the post image problem by gradual deformation Solving the post image problem does not require any new flow simulation
Channelized case Reference injector Production well
Initial set 200 models mapped from metric space
Solve the post image problem 4 history matched models
Use of proxy distances Requires small CPU Proxy metric space Requires large CPU Actual metric space MDS MDS
Use of proxy distance Cluster and select models Proxy metric space Solve post image problem Actual metric space MDS MDS
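The "cluster and select" step could be done in several ways; the slides do not name the algorithm. As one possibility, the sketch below runs a simple alternating k-medoids directly on a distance matrix and returns the medoid of each cluster as the model to simulate. The function `k_medoids` and the toy proxy distances are hypothetical.

```python
import numpy as np

def k_medoids(D, k, rng, n_iter=50):
    """Alternating k-medoids on a distance matrix; returns medoids and labels."""
    L = D.shape[0]
    medoids = rng.choice(L, size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)     # nearest medoid
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue                              # keep the old medoid
            within = D[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break                                     # converged
        medoids = new_medoids
    return medoids, np.argmin(D[:, medoids], axis=1)

rng = np.random.default_rng(7)
coords = rng.normal(size=(100, 2))        # stand-in for proxy MDS coordinates
diff = coords[:, None, :] - coords[None, :, :]
D_proxy = np.sqrt((diff**2).sum(axis=-1))

# Select 7 representative models for full flow simulation.
selected, labels = k_medoids(D_proxy, k=7, rng=rng)
```

Because the method needs only the distance matrix, it works equally well with a non-Euclidean proxy distance.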
Illustration 100 realizations of permeability (SGS), 100x100x1 2 wells: 1 production well (P1), 1 observation well (O1) Objective: history match pressure at O1 Distance: difference in pressure at O1 for each time step Proxy distance: difference in pressure at O1 for the last three time steps [Figure: cross-plot of proxy distance versus Eclipse distance, ρ = 0.89]
Illustration MDS projection from proxy metric space Truth 7 models are selected for flow simulation
Illustration MDS projection from actual metric space Solve the post image problem through model expansion Construct 100 new models
Illustration [Figure: pressure at O1 versus time (days)] 100 history matched models obtained by performing only 7 flow simulations
What's next? Celine: More on solving the pre image/post image problem Kwangwon: Kalman filtering in Metric space: a reservoir case study Celine: Joint construction of high resolution and coarse models in metric space Mehrdad: Multi point algorithms in metric space
Concluding remarks MUMS: Modeling Uncertainty in Metric Space: Powerful diagnostic tool on model uncertainty, data-model consistency, model updating Distances allow including the purpose of modeling Working with ensembles is more effective and efficient than working on a single model at a time