Prediction by conditional simulation


Prediction by conditional simulation C. Lantuéjoul MinesParisTech christian.lantuejoul@mines-paristech.fr

Introductory example

Problem: A submarine cable has to be laid on the sea floor between points a and b. We want to predict its length

l = ∫_a^b √(1 + [y′(x)]²) dx

starting from the sea-bed topography ( y(x), a ≤ x ≤ b ) along a trajectory that has been sampled every 100 m.

Submarine trajectory and samples

Introductory example (2)

A natural idea: predict the cable length as the length of the predicted trajectory ( ŷ(x), a ≤ x ≤ b ):

l̂ = ∫_a^b √(1 + [ŷ′(x)]²) dx

Unfortunately, the predicted value l̂ is much smaller than the actual one l.

Predicted submarine trajectory and samples

Introductory example (3)

Comments: The proposed prediction algorithm is inefficient because of
- the data: no local information is available;
- the predictor: it is not well adapted, as l does not depend linearly on y.

Alternative approach: The sea-bed topography ( y(x), a ≤ x ≤ b ) is considered as a realization of some stochastic process Y = ( Y(x), a ≤ x ≤ b ), whose statistical features have to be chosen in full compatibility with the available data. This makes it possible to predict l by a Monte Carlo approach using conditional simulations.

Simulation and conditional simulation

A simulation is a realization of the stochastic process. A conditional simulation is a simulation that honors the available data.

Three examples of conditional simulations

Using conditional simulations

Suppose that a probabilistic model has been chosen. Given n conditional simulations y_1, ..., y_n, we calculate

l(y_i) = ∫_a^b √(1 + [y_i′(x)]²) dx,  i = 1, ..., n

from which we can
- predict the cable length (average of the l(y_i)'s);
- assign a precision to the predicted length (e.g. variance of the l(y_i)'s);
- give confidence limits for the predicted length (based on the histogram of the l(y_i)'s);
- predict the probability that the length does not exceed a given critical value.

The results obtained depend entirely on the stochastic model that has been chosen.
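The Monte Carlo recipe above is easy to sketch in code. The fragment below is a minimal illustration, not part of the original slides: the conditional simulations are stood in for by toy random-walk profiles, and the grid, the number of simulations and the critical length are all made-up values.

```python
import numpy as np

def cable_length(x, y):
    """Length of the curve y(x): the integral of sqrt(1 + y'(x)^2) dx,
    approximated by the length of the polyline through the sample points."""
    return float(np.sum(np.hypot(np.diff(x), np.diff(y))))

# Toy stand-in for n conditional simulations of the sea-bed profile
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1000.0, 201)                       # abscissa from a=0 to b=1000 m
sims = [np.cumsum(rng.normal(0.0, 1.0, x.size)) for _ in range(200)]

lengths = np.array([cable_length(x, y) for y in sims])
print("predicted length  :", lengths.mean())             # average of the l(y_i)'s
print("precision (std)   :", lengths.std(ddof=1))        # spread of the l(y_i)'s
print("95% limits        :", np.percentile(lengths, [2.5, 97.5]))
print("P(length <= 1010) :", (lengths <= 1010.0).mean()) # hypothetical critical value
```

Each summary line corresponds to one of the four uses listed above; with a real model, `sims` would hold actual conditional simulations.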

An illustration (after Chilès, 1977) Yeu island

Samples

Prediction by linear regression (kriging)

Conditional simulations

Simulated profiles

Probability that a point belongs to the island

Results

Histograms of the simulated area, volume and height, with the following summary:

              Predicted  Simulated  Actual
Area (km²)      22.94      23.37     23.32
Volume (km³)               0.169     0.188
Height (m)      15.93      21.32     27.50

Scatter diagrams

Pairwise scatter diagrams of area, volume and height, with correlations 0.754 (area/volume), 0.422 (area/height) and 0.813 (volume/height).

Motivation in reservoir engineering

Case of a fluvial reservoir:
- the sandstone lenses or channels are porous and may contain oil;
- the clay facies acts as a barrier to the circulation of oil.

The geometry of the reservoir is required as an input to a fluid-flow simulation exercise. How can it be predicted from the available data?

Data integration
- Well-log (diagraphic) interpretation: provides facies information all along each well.
- Seismic interpretation: provides the facies proportions within the reservoir.
- Well tests: provide connectivity properties.

Outline
- Introductory example
- Gaussian random field: definition, conditional simulation algorithm
- Excursion set of a Gaussian random function: definition, conditional simulation algorithm
- Boolean model: definition, intersection property, conditional simulation algorithm
- Conclusions

Gaussian random field

Gaussian random field

Let Y = ( Y(x), x ∈ IR^d ) be a second-order stationary random field with mean m and covariance function C.

Definition: Y is said to be Gaussian if every linear combination of its variables is normally distributed:

Σ_{i=1}^n a_i Y(x_i) ∼ G( m Σ_{i=1}^n a_i , Σ_{i,j=1}^n a_i a_j C(x_i − x_j) )

Fundamental property: The spatial distribution of Y is completely characterized by its mean and its covariance function.
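The definition above has a direct computational consequence: the vector ( Y(x_1), ..., Y(x_n) ) is multivariate normal with covariance matrix C(x_i − x_j), so one realization can be obtained by multiplying a Cholesky factor of that matrix by standard normal draws. The sketch below uses the spherical covariance sph(1, 20) of the later illustration on a 1-D grid; the grid itself and the numerical jitter are illustrative choices.

```python
import numpy as np

def spherical_cov(h, sill=1.0, a=20.0):
    """Spherical covariance sph(sill, a)."""
    h = np.abs(h)
    c = sill * (1.0 - 1.5 * h / a + 0.5 * (h / a) ** 3)
    return np.where(h < a, c, 0.0)

# Covariance matrix C(x_i - x_j) on a regular grid, then one realization
# of the (zero-mean) Gaussian vector via its Cholesky factor.
x = np.arange(0.0, 100.0)
C = spherical_cov(x[:, None] - x[None, :])
L = np.linalg.cholesky(C + 1e-8 * np.eye(x.size))   # small jitter for stability
y = L @ np.random.default_rng(1).normal(size=x.size)
```

This direct method is exact but costs O(n³); the slides' later algorithms avoid building the full matrix.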

Examples (different covariance functions): spherical, exponential, hyperbolic, stable, Gaussian, cardinal sine

Presentation of the problem

How can we produce a realization y of a stationary Gaussian random field with mean 0 and covariance function C that satisfies

y(x_α) = y_α,  α ∈ A ?

Principle

Write

Y(x) = Y_A(x) + [ Y(x) − Y_A(x) ],  x ∈ IR^d

where Y_A(x) = Σ_{α∈A} λ_α(x) Y(x_α) is the kriging estimator (known) and Y(x) − Y_A(x) is the kriging residual (unknown).

Y_A and Y − Y_A are two independent Gaussian random fields. Accordingly, put

Y_CS(x) = Y_A(x) + S(x) − S_A(x),  x ∈ IR^d

where S is an independent copy of Y. S − S_A is known insofar as realizations of S can be produced (non-conditional simulations).

Conditional simulation: algorithm and verifications

Algorithm:
(i) generate a non-conditional simulation s; put s_α = s(x_α) for each α ∈ A;
(ii) for each x ∈ IR^d:
  (ii.i) compute the kriging weights ( λ_α(x), α ∈ A );
  (ii.ii) return y_CS(x) = y_A(x) + s(x) − s_A(x) = s(x) + Σ_{α∈A} λ_α(x) (y_α − s_α).

Verifications:
- if x = x_α, then y_CS(x_α) = y_α + s(x_α) − s_α = y_α (the conditioning data are honored);
- if C(x − x_α) ≈ 0 for each α ∈ A, then y_CS(x) ≈ 0 + s(x) − 0 = s(x) (far from the data points, the conditional simulation behaves like a non-conditional one).
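Steps (i)-(ii) can be sketched compactly in one dimension with simple kriging (the mean 0 is known). The exponential covariance, the grid and the conditioning values below are illustrative choices; only the identity y_CS(x) = s(x) + Σ_α λ_α(x)(y_α − s_α) comes from the slide.

```python
import numpy as np

rng = np.random.default_rng(2)

def cov(h, scale=20.0):
    # Illustrative choice: exponential covariance (the algorithm works for any C)
    return np.exp(-np.abs(h) / scale)

def simulate(x):
    """Non-conditional simulation via a Cholesky factor of the covariance matrix."""
    C = cov(x[:, None] - x[None, :])
    return np.linalg.cholesky(C + 1e-10 * np.eye(x.size)) @ rng.normal(size=x.size)

x = np.arange(0.0, 300.0)              # simulation grid
idx = np.array([30, 120, 250])         # conditioning locations x_alpha
y_data = np.array([1.0, -0.5, 2.0])    # conditioning values y_alpha

# (i) non-conditional simulation s, recorded at the data points
s = simulate(x)
s_data = s[idx]

# (ii.i) simple-kriging weights: solve C_AA lambda(x) = c_A(x) for every x
C_AA = cov(x[idx][:, None] - x[idx][None, :])
c_A = cov(x[idx][:, None] - x[None, :])
lam = np.linalg.solve(C_AA, c_A)       # lambda_alpha(x), shape (3, 300)

# (ii.ii) y_CS(x) = s(x) + sum_alpha lambda_alpha(x) (y_alpha - s_alpha)
y_cs = s + lam.T @ (y_data - s_data)
```

At the data points the kriging weights reduce to the identity, so `y_cs[idx]` reproduces `y_data` exactly, as in the first verification.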

Illustration

Model: m = 0, C = sph(1, 20). Simulation field: 300 × 200. Simulation (TL), conditioning data points (TR), kriging estimate (BL) and conditional simulation (BR).

Four conditional simulations

Excursion set of a Gaussian random field

Excursion set of a Gaussian random field

Basic ingredients:
- a second-order stationary, standardized Gaussian random field Y with covariance function C;
- a numerical value λ.

Definition: The excursion set of Y at level λ is the set of points where Y takes values greater than or equal to λ:

X_λ(x) = 1 if Y(x) ≥ λ,  0 if Y(x) < λ
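A quick numerical illustration of this definition (not from the slides): simulate a standardized Gaussian field on a 1-D grid and threshold it at a few levels. The exponential covariance and its scale are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)

# Standardized Gaussian field on a 1-D grid (exponential covariance, scale 15)
x = np.arange(0.0, 200.0)
C = np.exp(-np.abs(x[:, None] - x[None, :]) / 15.0)
y = np.linalg.cholesky(C + 1e-10 * np.eye(x.size)) @ rng.normal(size=x.size)

# Excursion sets X_lambda = 1{Y >= lambda} at various levels
for level in (-1.0, 0.0, 1.0):
    x_lam = (y >= level).astype(int)
    print(f"level {level:+.1f}: fraction of grid in the excursion set = {x_lam.mean():.2f}")
```

Raising the level shrinks the excursion set, which is what the next slide's pictures show in two dimensions.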

Examples at various levels

Gaussian random field (Gaussian covariance) and its excursion sets at levels −1, −0.5, 0, 0.5 and 1.

Examples for different covariance functions

Excursion sets at level 0 derived from Gaussian random fields with spherical (TL), exponential (TM), Gaussian (TR), stable (BL), hyperbolic (BM) and cardinal sine (BR) covariance functions.

Conditional simulation: presentation of the problem

How can we produce realizations of the excursion set at level λ of a standardized Gaussian random field with covariance function C, in such a way that each facies contains a finite number of prespecified points?

Conditional simulation algorithm

Conditioning data set (TL). Conditional simulation of Y at the data points only (Gibbs sampler) (TR). Conditional simulation of Y (BL). Threshold at level λ (BR).

Four conditional simulations

Boolean model

Constructing a Boolean model

Definition of a Boolean model

Basic ingredients:
- a Poisson point process P with intensity function θ = ( θ(x), x ∈ IR^d );
- a family ( A(x), x ∈ IR^d ) of independent nonempty compact random subsets (referred to as objects). The statistical properties of A(x) can be specified by its distribution function

T_x(K) = P{ A(x) ∩ K ≠ ∅ },  K ∈ K(IR^d)

Definition: A Boolean model is the union of the objects located at the Poisson points:

X = ∪_{x∈P} A(x)    (X = foreground, X^c = background)
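The two basic ingredients translate directly into a simulation sketch: draw a Poisson number of germs in a window, attach an independent random object to each germ (here a disk with exponential radius, as in the later example), and take the union. Edge effects (germs outside the window whose objects reach into it) are ignored in this toy version.

```python
import numpy as np

rng = np.random.default_rng(4)

# Stationary Boolean model of disks in the window [0, 100]^2:
# Poisson germs + i.i.d. exponential radii (values from the later example)
theta = 0.00766            # Poisson intensity per unit area
mean_radius = 5.0
win = 100.0

n = rng.poisson(theta * win * win)            # Poisson number of germs
germs = rng.uniform(0.0, win, size=(n, 2))
radii = rng.exponential(mean_radius, size=n)

def in_foreground(p, germs, radii):
    """p lies in X iff at least one object covers it."""
    if len(germs) == 0:
        return False
    return bool((np.linalg.norm(germs - p, axis=1) <= radii).any())
```

`in_foreground` realizes the foreground/background dichotomy of the definition: X is the union of the disks, X^c everything else.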

Examples of Boolean models

Boolean models of segments (TL), disks (TR), Poisson polygons (BL) and Voronoi polygons (BR).

Intersection of the model and a domain

If X is a Boolean model with parameters (θ, T) and D is a compact domain, then X ∩ D is also a Boolean model with parameters (θ^(D), T^(D)) given by

Poisson intensity:  θ^(D)(x) = θ(x) T_x(D),  x ∈ IR^d
distribution function:  T_x^(D)(K) = T_x(K) / T_x(D),  K ∈ K(D)

Typical objects

Assumption: The Poisson intensity θ^(D) has a finite integral ϑ(D).

Consequences:
- X ∩ D has a finite (Poisson) number of objects;
- θ^(D)(·)/ϑ(D) is a probability density function on IR^d.

Definition: The random compact set A(ẋ), given A(ẋ) ∩ D ≠ ∅ and with ẋ ∼ θ^(D)/ϑ(D), is called a typical object (hitting D). Its distribution function is (for K ∈ K(D))

T^(D)(K) = ∫_{IR^d} [ θ^(D)(x) / ϑ(D) ] [ T_x(K) / T_x(D) ] dx = ∫_{IR^d} θ(x) T_x(K) dx / ∫_{IR^d} θ(x) T_x(D) dx

Property: X ∩ D is the union of a Poisson number (with mean ϑ(D)) of independent typical objects.

Conditional simulation: presentation of the problem

How can we produce realizations of X in the domain D, subject to the conditions that one finite set of points, C_0, must be contained in X^c and another set, C_1, in X?

Compatibility between model and data

In contrast to a Gaussian random function, a Boolean model may not be compatible with a data set. Example: the objects are disks with fixed radius, and a foreground conditioning point is closely surrounded by background conditioning points.

We have not succeeded in finding a direct algorithm for simulating a Boolean model conditionally. In what follows, an iterative algorithm is presented.

Initialization

Principle: independent typical objects are generated sequentially. Any object that covers a conditioning point of the background is rejected. The initialization procedure stops when all foreground conditioning points have been covered by objects.

One step of conditional simulation

Starting from k objects, the transition probabilities are

P(k, k+1) = ϑ(D) / ( ϑ(D) + k + 1 )
P(k, k−1) = k / ( ϑ(D) + k )
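The initialization rule and the birth-and-death step above can be sketched together. Everything concrete here is hypothetical (fixed-radius disks, the window, ϑ(D) = 50, the conditioning points); only the acceptance rules and the transition probabilities P(k, k+1) = ϑ(D)/(ϑ(D)+k+1) and P(k, k−1) = k/(ϑ(D)+k) come from the slides.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical setup: Boolean model of disks with fixed radius R in [0, win]^2.
# C1: points that must lie in X (foreground); C0: points that must lie in X^c.
R, win, vartheta = 8.0, 100.0, 50.0      # vartheta stands for vartheta(D)
C1 = np.array([[20.0, 20.0], [70.0, 60.0]])
C0 = np.array([[50.0, 50.0]])

def covers(center, pts):
    """Mask of the points of pts covered by the disk at center."""
    return np.linalg.norm(pts - center, axis=1) <= R

def c1_covered(objects):
    if not objects:
        return np.zeros(len(C1), dtype=bool)
    return np.array([covers(c, C1) for c in objects]).any(axis=0)

# Initialization: draw typical objects, rejecting any that covers a point
# of C0, until every point of C1 is covered.
objects = []
while not c1_covered(objects).all():
    c = rng.uniform(0.0, win, size=2)
    if not covers(c, C0).any():
        objects.append(c)

def step(objects):
    """One birth-or-death iteration with the slide's transition probabilities."""
    k = len(objects)
    u = rng.random()
    if u < vartheta / (vartheta + k + 1):
        # birth: rejected if the new object covers a background point
        c = rng.uniform(0.0, win, size=2)
        if not covers(c, C0).any():
            objects.append(c)
    elif k > 0 and u < vartheta / (vartheta + k + 1) + k / (vartheta + k):
        # death: rejected if a foreground point would lose coverage
        i = int(rng.integers(k))
        removed = objects.pop(i)
        if not c1_covered(objects).all():
            objects.insert(i, removed)
    return objects

for _ in range(1600):
    objects = step(objects)
```

By construction, both conditioning constraints are invariant: every accepted state covers all of C_1 and leaves C_0 uncovered, so the chain explores only conditionally valid configurations.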

Example of conditional simulation

Stationary Boolean model of disks (Poisson intensity 0.00766; radii exponentially distributed with mean 5). The simulation field is 100 × 100. Left: a non-conditional simulation and 100 conditioning data points. Middle and right: two conditional simulations.

Display of the simulation at various steps

Display of the conditional simulation at steps 0, 100, 200, 400, 800 and 1600.

Conclusions

For a number of prediction problems in the earth sciences, conditional simulations are an attractive option:
- they automatically provide fully compatible estimates of all features of interest;
- they can also assign each of them a precision (variance or confidence limits).

Numerous difficulties remain:
- Model choice: the results obtained depend entirely on the stochastic model that has been chosen;
- Algorithms: design, rate of convergence (perfect simulations?);
- Data integration: accounting for complex data (non-linearity, different supports, ...).