Oil Field Production using Machine Learning. CS 229 Project Report

Sumeet Trehan, Energy Resources Engineering, Stanford University

1 Introduction

Effective management of reservoirs motivates oil and gas companies to perform uncertainty analysis, production optimization, and field-development planning. Any such analysis requires a large number of flow simulations: on the order of hundreds for gradient-based methods, or even thousands for direct-search or stochastic procedures such as genetic algorithms. Since flow simulations are computationally very expensive, performing a large number of them (say, hundreds) is difficult. While recent advances in parallel computing have reduced computing times [3], the simulation of realistic models with O(10^4-10^6) grid cells and multiple components remains challenging. As a result, reservoir simulation has found limited application as a tool in the development of smart fields. In this study, a combination of supervised and unsupervised machine learning (ML) techniques is explored to enable large speedups in reservoir simulation, thereby lowering its computational cost and enabling wider-scale application. These techniques enable fast computation of approximate solutions for the various scenarios that arise in optimization or uncertainty quantification.

2 Model setup and training data

The reservoir model, shown in Figure 2.1, is a two-dimensional horizontal synthetic permeability field and corresponds to a fluvial depositional system described by [1]. The field has three producers, designated P1, P2 and P3, and three injectors, designated I1, I2 and I3. Training data is obtained by solving the governing equations (conservation of mass and momentum) for reservoir flow. As described in the introduction, solving the governing equations is very expensive; thus, we perform only one training simulation. Since reservoir flow is a dynamic process, this yields time-series data for the amounts of oil and water produced.
The input variable that we vary between training and test is the bottom-hole pressure (BHP). This time-varying variable (represented by x) plays a critical role in optimizing production and field development.
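As an illustration, a time-varying control input like the BHP can be represented as a piecewise-constant schedule, one value per control step and per well. The well grouping and pressure ranges below are hypothetical stand-ins, not the values used in this study (the actual training and test profiles are shown in Figure 3.1):

```python
import numpy as np

# Hypothetical piecewise-constant BHP schedules (in psi) over 10 control
# steps for one producer and one injector. The numbers are illustrative only.
n_steps = 10
rng = np.random.default_rng(0)

producer_bhp = rng.uniform(2000.0, 3000.0, size=n_steps)  # low BHP draws fluid out
injector_bhp = rng.uniform(5000.0, 6000.0, size=n_steps)  # high BHP pushes water in

# Each row of x is the control vector x^n at one time level: [BHP_P, BHP_I].
x = np.column_stack([producer_bhp, injector_bhp])
assert x.shape == (n_steps, 2)
```

In an actual run, one such schedule defines the training case and perturbed schedules define the test cases.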

Figure 2.1: Permeability (log10 K) for the reservoir model

3 Model based on supervised and unsupervised learning techniques

In the current study, we develop a model that combines three techniques: 1) a modified version of linear regression, 2) closest-neighbor search, and 3) principal component analysis (PCA).

3.1 Modified linear regression

As described above, the training data is obtained by solving the governing equations, which obey the laws of physics. The use of simple linear or weighted linear regression with multiple features was observed to lead to large errors. Therefore, a separate mathematical procedure based on a Taylor series expansion was adopted [2, 4]. Since the Taylor series is truncated at the linear term, the procedure is equivalent to performing a linear regression in which we directly obtain the product θ^T x without explicitly constructing all the features or coefficients. If at any time level n+1 we want to estimate the output y^{n+1} for some new query point x^{n+1} (i.e., a different BHP value), the procedure is demonstrated below. An underlying assumption of a Taylor series is the availability of a point in the neighborhood at which the function value is known and around which the linearization holds. In this problem, we pick the point (around which we expand) by exploiting the concept of pairwise distance

and determining the closest point in the neighborhood. This is done by comparing the function value y^n, a known quantity, against the time-series data available from training. Note that y^{n+1} represents the unknown, while y^n is known, since we assume that at T = 0 the function value is known, which is always true.

y^{n+1} = y^{i+1} - (J^{i+1})^{-1} [ B^{i+1} (y^n - y^i) + C^{i+1} (x^{n+1} - x^{i+1}) ]    (3.1)

The above equation is of the form y = θ^T x + θ_0, where θ^T x = -(J^{i+1})^{-1} C^{i+1} x^{n+1} and θ_0 = y^{i+1} - (J^{i+1})^{-1} B^{i+1} (y^n - y^i) + (J^{i+1})^{-1} C^{i+1} x^{i+1}.

Notation:

1. A superscript indicates the time level; e.g., i means time level i.

2. Coefficients with superscripts i, n and i+1 are known. Thus J, B and C are known coefficient matrices.

3. BHP is the variable that is varied between training and test and is represented by x^{n+1}. As described earlier, it is a time-varying variable; its variation for the training and test cases is shown in Figure 3.1. The relationship between x and y is highly non-linear.

4. y^i is a vector representing pressure and saturation at different spatial locations in the reservoir (Figure 2.1) at time level i.

5. The amount of oil produced at time level i is computed from the pressure and saturation information contained in the vector y^i.

Since at any instant of time the output y^i is a high-dimensional vector, we project y into a lower-dimensional space by constructing a PCA basis (Φ) from the different values of y available from training. We denote lower-dimensional vectors with the subscript r. The above equation in the lower-dimensional space is:

y_r^{n+1} = y_r^{i+1} - (Φ^T J^{i+1} Φ)^{-1} [ Φ^T B^{i+1} Φ (y_r^n - y_r^i) + Φ^T C^{i+1} (x^{n+1} - x^{i+1}) ]    (3.2)

4 Results

Time-series data for the training and test cases is shown in Figure 4.1. As observed from Figure 3.1, the input variable, BHP, is significantly different between training and test. Since we have only one training run, the training error is zero.
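The reduced update of equation 3.2 can be sketched as follows. This is a minimal illustration, not the study's implementation: the matrices Φ, J, B, C below are random stand-ins for the PCA basis and the coefficient matrices saved at the closest training point i, and the helper name `tpwl_step` is our own.

```python
import numpy as np

rng = np.random.default_rng(1)
n_full, n_r, n_ctrl = 50, 5, 2   # full-state, reduced-state, control dimensions (illustrative)

# Random stand-ins for quantities saved at the closest training point i:
Phi = np.linalg.qr(rng.standard_normal((n_full, n_r)))[0]         # orthonormal PCA basis
J = rng.standard_normal((n_full, n_full)) + 10.0 * np.eye(n_full)  # diagonally dominant, so invertible
B = rng.standard_normal((n_full, n_full))
C = rng.standard_normal((n_full, n_ctrl))

def tpwl_step(yr_n, yr_i, yr_ip1, x_np1, x_ip1):
    """One linearized update in the PCA-reduced space, as in eq. (3.2)."""
    Jr = Phi.T @ J @ Phi   # Phi^T J^{i+1} Phi
    Br = Phi.T @ B @ Phi   # Phi^T B^{i+1} Phi
    Cr = Phi.T @ C         # Phi^T C^{i+1}
    rhs = Br @ (yr_n - yr_i) + Cr @ (x_np1 - x_ip1)
    return yr_ip1 - np.linalg.solve(Jr, rhs)

# Sanity check: if the query controls equal the training controls and the
# current reduced state lies exactly on the training trajectory, the update
# simply reproduces the next training state y_r^{i+1}.
yr_i, yr_ip1 = rng.standard_normal(n_r), rng.standard_normal(n_r)
x_i1 = rng.standard_normal(n_ctrl)
yr_next = tpwl_step(yr_i, yr_i, yr_ip1, x_i1, x_i1)
assert np.allclose(yr_next, yr_ip1)
```

Note that only n_r x n_r systems are solved at each step, which is the source of the speedup over the full n_full-dimensional simulation.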

Figure 3.1: BHP profiles for producer and injector wells: (a) training BHP, represented by x^i; (b) test BHP, represented by x^n

We observe from Figure 4.1 an excellent match between the solution obtained from the ML technique described in Section 3.1 (shown in blue) and the reference solution to the test problem (shown in black), and the match is consistently good for both oil and water rates. To evaluate the test error, we adopted a three-step procedure:

1. Create multiple test cases, each with a unique BHP profile. One such profile is shown in Figure 3.1b.

2. For each test case, solve the governing equations to obtain a reference solution.

Figure 4.1: Time-series data for (a) oil and (b) water production rates: training, reference solution for test, and ML solution

3. Find the error in the oil and water production rates by comparing the solution obtained using the ML technique against the reference solution (obtained as part of the post-processing step described above).

The error for this time-varying process is defined in equation 4.1, and the error behavior is shown in Figure 4.2.
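The per-well error metric of equation 4.1 can be sketched numerically with trapezoidal integration. The rate series below are synthetic stand-ins (an ML solution exactly 5% above the reference), not the study's data:

```python
import numpy as np

def trapezoid(f, t):
    """Trapezoidal rule for integrating samples f over time grid t."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t)))

def well_error(t, q_ml, q_ref, q_oil_ref):
    """E^j for one well (eq. 4.1): time-integrated absolute rate mismatch,
    normalized by the time-integrated reference oil rate."""
    return trapezoid(np.abs(q_ml - q_ref), t) / trapezoid(q_oil_ref, t)

# Synthetic stand-in oil rates for two producer wells over [0, T].
t = np.linspace(0.0, 100.0, 201)
q_ref = np.vstack([100.0 + 0.2 * t, 80.0 - 0.1 * t])  # "reference" rates
q_ml = 1.05 * q_ref                                    # "ML" rates, uniformly 5% high

E = [well_error(t, q_ml[j], q_ref[j], q_ref[j]) for j in range(2)]
error_o = np.mean(E)   # average over the n_pw producer wells
assert abs(error_o - 0.05) < 1e-9
```

With a uniform 5% overshoot the metric returns 0.05 for each well, which matches the intuition that it measures an oil-rate-weighted relative mismatch.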

Figure 4.2: Test error in (a) oil rates and (b) water rates

E^j_{o,w} = ( ∫_0^T | Q^j_{o,w,ML} - Q^j_{o,w,ref} | dt ) / ( ∫_0^T Q^j_{o,ref} dt ),    Error_{o,w} = (1/n_pw) Σ_{j=1}^{n_pw} E^j_{o,w}    (4.1)

The error plots indicate that for the majority of the test cases the test error is around 5-8%, an acceptable value given the other uncertainties present in the reservoir (outside the scope of this study). Moreover, the large speedups obtained using the ML technique indicate that this is a promising approach.

5 Conclusion and Future Work

We have implemented a linear regression model without explicitly constructing its parameters. Further, we found that traditional linear or weighted linear regression does not work well for this problem. To make the algorithm work, we successfully applied the concepts of closest neighbors and dimensionality reduction using PCA. Future work includes exploring neural networks and the EM algorithm for such problems.

6 References

1. O. J. Isebor. Derivative-free optimization for generalized oil field development. PhD thesis, Department of Petroleum Engineering, Stanford University, 2013.

2. J. He. Reduced-order modeling for oil-water and compositional systems, with application

to data assimilation and production optimization. PhD thesis, Department of Petroleum Engineering, Stanford University, 2013.

3. S. Maliassov, B. Beckner, and V. Dyadechko. Parallel reservoir simulation using a specific software framework (SPE 163653). In SPE Reservoir Simulation Symposium, The Woodlands, TX, February 2013.

4. M. Rewienski and J. White. A trajectory piecewise-linear approach to model order reduction and fast simulation of nonlinear circuits and micromachined devices. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 22(2):155-170, February 2003.