Application of the Ensemble Kalman Filter to History Matching. Presented at Texas A&M, November 16, 2010
Outline: Philosophy; EnKF for Data Assimilation; Field History Match Using EnKF with Covariance Localization
History Matching: Philosophy. Generating multiple history-matched models is desirable, but without a formal definition of probabilities it is difficult to know how to interpret the uncertainty in the reservoir description and in future performance predictions. It is impossible to correctly estimate all the parameters of a model from inaccurate, inconsistent, and insufficient data, so most practitioners select a reduced parameterization of the model. Reducing the number of parameters reduces the ill-conditioning of the inverse problem, but it also results in an underestimation of uncertainty, which may be dangerous for performance predictions, especially when mechanisms or operating conditions change. Within a Bayesian framework, apply randomized maximum likelihood (RML) to sample the posterior pdf for m. This requires minimization of a sequence of objective functions, which is done with a quasi-Newton method with the gradient computed by an adjoint method.
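Gradient Based History Matching. Bayesian framework: apply randomized maximum likelihood (RML) to sample the posterior pdf for the reservoir model parameters. This requires minimization of a sequence of objective functions (quasi-Newton with adjoint gradient). The posterior pdf is
$$f(m \mid d_{\text{obs}}) = a \exp\left(-O(m)\right),$$
$$O(m) = \tfrac{1}{2}(m - m_{\text{prior}})^T C_M^{-1} (m - m_{\text{prior}}) + \tfrac{1}{2}\left(d^f(m) - d_{\text{obs}}\right)^T C_D^{-1} \left(d^f(m) - d_{\text{obs}}\right).$$
For each sample $j$, draw $m_{uc,j} \sim N(m_{\text{prior}}, C_M)$ and $d_{uc,j} \sim N(d_{\text{obs}}, C_D)$, then minimize
$$O_j(m) = \tfrac{1}{2}(m - m_{uc,j})^T C_M^{-1} (m - m_{uc,j}) + \tfrac{1}{2}\left(d^f(m) - d_{uc,j}\right)^T C_D^{-1} \left(d^f(m) - d_{uc,j}\right).$$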
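The RML recipe above can be sketched on a toy problem. This is only an illustration under stated assumptions: the forward model is a hypothetical linear map `G` (not a reservoir simulator), so each objective function is minimized in closed form rather than with a quasi-Newton method and an adjoint gradient. All names and numbers are made up for the example.

```python
import numpy as np

# Hypothetical toy problem: 3 parameters, 2 data, linear forward model d = G m.
G = np.array([[1.0, 0.5, 0.0],
              [0.0, 1.0, 2.0]])
m_prior = np.zeros(3)
C_M = np.eye(3)            # prior model covariance
C_D = 0.01 * np.eye(2)     # measurement-error covariance
d_obs = np.array([1.0, -0.5])

def grad_O(m, m_uc, d_uc):
    """Gradient of O_j(m) for the linear forward model."""
    return (np.linalg.solve(C_M, m - m_uc)
            + G.T @ np.linalg.solve(C_D, G @ m - d_uc))

def rml_minimizer(m_uc, d_uc):
    """Minimizer of O_j(m). Closed form only because G is linear (assumption);
    a nonlinear simulator would need an iterative solve instead."""
    K = C_M @ G.T @ np.linalg.inv(G @ C_M @ G.T + C_D)
    return m_uc + K @ (d_uc - G @ m_uc)

rng = np.random.default_rng(0)
samples = []
for _ in range(2000):
    m_uc = rng.multivariate_normal(m_prior, C_M)   # unconditional model draw
    d_uc = rng.multivariate_normal(d_obs, C_D)     # perturbed observation
    samples.append(rml_minimizer(m_uc, d_uc))
samples = np.array(samples)

# Exact Gaussian posterior mean, available here because the problem is linear.
K = C_M @ G.T @ np.linalg.inv(G @ C_M @ G.T + C_D)
post_mean = m_prior + K @ (d_obs - G @ m_prior)
```

In this linear-Gaussian setting the sample mean of `samples` should approach `post_mean` as the number of samples grows; for a nonlinear simulator no such guarantee holds.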
Commercial simulators have little or no capability of computing gradients with the adjoint method. It is difficult to attach adjoint code to a simulator without detailed knowledge of the simulator numerics. Thus, our more recent focus has been on gradient-free optimization algorithms, or at least algorithms that use the simulator as a black box, e.g., SPSA, the ensemble Kalman filter, quadratic interpolation, and ensemble optimization.
Ensemble Kalman Filter. We start at time zero with an ensemble of models (N_e realizations) and then assimilate data sequentially in time. For linear problems with measurement errors uncorrelated in time, sequential data assimilation is equivalent to matching all data simultaneously. Every time we have new production data, we integrate the data into each model in the ensemble to obtain a new history-matched model. Moreover, we also update the primary variables of the simulator to attempt to make them consistent with the values we would have obtained if we reran the simulator from time zero using the updated vectors of model parameters. In gradient-based history matching algorithms, by contrast, we match all historical data simultaneously, and every time we update the model we run the simulator from time zero to update the simulation primary (state) variables.
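The equivalence of sequential and simultaneous assimilation for linear problems can be checked numerically. The sketch below uses the exact linear-Gaussian (Kalman) update on a static parameter vector, not an ensemble, and the matrices `G1`, `G2` and the data values are hypothetical stand-ins chosen only for illustration.

```python
import numpy as np

def kalman_update(m, C, G, d, C_D):
    """Linear-Gaussian update of mean m and covariance C with data d = G m + noise."""
    K = C @ G.T @ np.linalg.inv(G @ C @ G.T + C_D)
    return m + K @ (d - G @ m), C - K @ G @ C

# Hypothetical toy setup: 3 parameters, two data sets measured at different times.
m0, C0 = np.zeros(3), np.eye(3)
G1, d1, CD1 = np.array([[1.0, 0.0, 1.0]]), np.array([0.8]), np.array([[0.1]])
G2, d2, CD2 = np.array([[0.0, 2.0, 1.0]]), np.array([-0.3]), np.array([[0.2]])

# Sequential assimilation: first data set, then the second.
m_seq, C_seq = kalman_update(m0, C0, G1, d1, CD1)
m_seq, C_seq = kalman_update(m_seq, C_seq, G2, d2, CD2)

# Simultaneous assimilation: stack the data. The block-diagonal C_D encodes
# measurement errors that are uncorrelated in time.
G_all = np.vstack([G1, G2])
d_all = np.concatenate([d1, d2])
CD_all = np.block([[CD1, np.zeros((1, 1))],
                   [np.zeros((1, 1)), CD2]])
m_sim, C_sim = kalman_update(m0, C0, G_all, d_all, CD_all)
```

If the off-diagonal blocks of `CD_all` were nonzero (errors correlated in time), the two routes would generally no longer agree.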
Ensemble Kalman Filter. After updating both parameters and states, we continue to run the simulator forward until we reach a time where we want to match new measured data. (Thus, EnKF fits perfectly into the real-time reservoir management concept.) Figure: data and KF data mean versus time.
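Ensemble Kalman Filter Equations. Comment: Similar to doing one iteration of the Gauss-Newton method with an average sensitivity matrix, Reynolds et al. (ECMOR, 2006). For ensemble member $j$ at assimilation step $n$:
$$y_j^{n,a} = y_j^{n,f} + C_{Y^{n,f} D^{n,f}} \left( C_{D^n} + C_{D^{n,f} D^{n,f}} \right)^{-1} \left( d_{uc,j}^{n} - d_j^{n,f} \right),$$
$$x_j \equiv \left( C_{D^n} + C_{D^{n,f} D^{n,f}} \right)^{-1} \left( d_{uc,j}^{n} - d_j^{n,f} \right),$$
$$C_{Y^{n,f} D^{n,f}}\, x_j = \frac{1}{N_e - 1} \sum_{k=1}^{N_e} \left( y_k^{n,f} - \bar{y}^{n,f} \right) \left( d_k^{n,f} - \bar{d}^{n,f} \right)^T x_j \equiv \frac{1}{N_e - 1} \sum_{k=1}^{N_e} \left( y_k^{n,f} - \bar{y}^{n,f} \right) b_{k,j}.$$
Important Comment: Each updated $y_j^{n,a}$ is a linear combination of the ensemble of forecasts. Each $m_j^{n,a}$ is a linear combination of the initial ensemble of reservoir models.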
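A minimal sketch of the analysis step, assuming the perturbed-observation form of EnKF with covariances estimated from ensemble anomalies. The function name `enkf_analysis`, the observation matrix `H`, and the toy sizes are hypothetical; in practice the predicted data would come from the reservoir simulator.

```python
import numpy as np

def enkf_analysis(Y_f, D_f, d_obs, C_D, rng):
    """One EnKF analysis step.
    Y_f: (Ny, Ne) forecast state vectors, D_f: (Nd, Ne) predicted data."""
    Ne = Y_f.shape[1]
    dY = Y_f - Y_f.mean(axis=1, keepdims=True)      # state anomalies
    dD = D_f - D_f.mean(axis=1, keepdims=True)      # predicted-data anomalies
    C_YD = dY @ dD.T / (Ne - 1)                     # cross-covariance
    C_DD = dD @ dD.T / (Ne - 1)                     # predicted-data covariance
    # Perturbed observations d_uc,j ~ N(d_obs, C_D), one column per member.
    D_uc = rng.multivariate_normal(d_obs, C_D, size=Ne).T
    X = np.linalg.solve(C_D + C_DD, D_uc - D_f)     # columns are x_j
    return Y_f + C_YD @ X

# Toy usage with a hypothetical linear "simulator" that observes entry 0.
rng = np.random.default_rng(3)
Ny, Ne = 5, 50
Y_f = rng.normal(size=(Ny, Ne))
H = np.zeros((1, Ny))
H[0, 0] = 1.0
D_f = H @ Y_f
d_obs = np.array([2.0])
C_D = np.array([[0.01]])
Y_a = enkf_analysis(Y_f, D_f, d_obs, C_D, rng)
```

Note that `C_YD` is never needed on its own; an implementation for large state vectors would apply the anomalies to `X` directly instead of forming the matrix.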
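Ensemble Kalman Filter Equations. For notational simplicity, assume reservoir simulation steps coincide with data assimilation steps. The forecast state vector is
$$y_j^{n,f} = \begin{bmatrix} m_j^{n,f} \\ p_j^{n,f} \end{bmatrix} = \begin{bmatrix} m_j^{n-1,a} \\ g\left(p_j^{n-1,a}, m_j^{n-1,a}\right) \end{bmatrix},$$
and the update of the model parameters is
$$m_j^{n,a} = m_j^{n,f} + C_{M^{n,f}, D^{n,f}} \left( C_{D^n} + C_{D^{n,f}, D^{n,f}} \right)^{-1} \left( d_{uc,j}^{n} - d_j^{n,f} \right) = m_j^{n-1,a} + C_{M^{n-1,a}, D^{n,f}}\, x_j^n$$
$$= m_j^{n-1,a} + \frac{1}{N_e - 1} \sum_{k=1}^{N_e} \left( m_k^{n,f} - \bar{m}^{n,f} \right) \left( d_k^{n,f} - \bar{d}^{n,f} \right)^T x_j^n = \sum_{k=1}^{N_e} a_{k,j}^n\, m_k^{n-1,a}.$$
Important Comment: Applying the preceding equation recursively shows that for every $n$, $m_j^{n,a}$ is a linear combination of the initial ensemble of vectors of model parameters. Thus, if we cannot match data with a linear combination of this ensemble, EnKF cannot give a good data match.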
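The linear-combination property can be made explicit by writing the update as a matrix product over the ensemble. The sketch below uses random numbers as stand-ins for the forecast parameters, predicted data, and perturbed observations (all hypothetical), and builds a coefficient matrix `A` whose entry `A[k, j]` plays the role of the weight on member k in the update of member j.

```python
import numpy as np

rng = np.random.default_rng(1)
Nm, Nd, Ne = 6, 2, 4
M_f = rng.normal(size=(Nm, Ne))     # forecast parameters (equal to previous analysis)
D_f = rng.normal(size=(Nd, Ne))     # predicted data, stand-in for simulator output
D_uc = rng.normal(size=(Nd, Ne))    # perturbed observations
C_D = 0.1 * np.eye(Nd)

# Centering matrix: M_f @ P gives the ensemble anomalies.
P = np.eye(Ne) - np.ones((Ne, Ne)) / Ne
dM, dD = M_f @ P, D_f @ P

# Columns of X are the vectors x_j.
X = np.linalg.solve(C_D + dD @ dD.T / (Ne - 1), D_uc - D_f)

# Standard form of the update.
M_a = M_f + dM @ dD.T @ X / (Ne - 1)

# Same update written as a linear combination of the forecast members.
A = np.eye(Ne) + P @ dD.T @ X / (Ne - 1)
M_a_combo = M_f @ A
```

In this construction each column of `A` sums to one, because the centering matrix removes the ensemble mean. The updated members therefore never leave the subspace spanned by the forecast ensemble.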
Problems with EnKF. Because each updated vector of reservoir model parameters is a linear combination of the initial set, there are a limited number of degrees of freedom available to assimilate data. Approximating covariances with a small ensemble of realizations introduces sampling error. This leads to (a) spurious correlations between data and states, which cause changes in components of the model and state vectors far from observations even though the data are insensitive to these components; and (b) underestimation of variances (uncertainty). It can lead to ensemble collapse, where all ensemble members become essentially the same; this occurs because of loss of rank. A common solution is covariance localization, which effectively kills off long-distance spurious correlations and updates individual entries of the combined state vector using different linear combinations of the corresponding entries of the initial ensemble. This is similar to using only data near a component of y to update that component.
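Both sampling-error effects are easy to reproduce with synthetic numbers. The sketch below is illustrative only: it measures the average spurious correlation between two independent Gaussian variables for a small and a large ensemble, and checks the rank of the anomaly matrix (which equals the rank of the sample covariance) when the state dimension exceeds the ensemble size. Sizes and seeds are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)

def mean_abs_spurious_corr(Ne, trials=200):
    """Average |sample correlation| between two independent Gaussian variables,
    estimated from ensembles of size Ne. The true correlation is zero."""
    vals = []
    for _ in range(trials):
        a, b = rng.normal(size=Ne), rng.normal(size=Ne)
        vals.append(abs(np.corrcoef(a, b)[0, 1]))
    return float(np.mean(vals))

small = mean_abs_spurious_corr(10)      # small ensemble: sizeable spurious correlation
large = mean_abs_spurious_corr(1000)    # large ensemble: much closer to zero

# Rank deficiency: 50 state variables but only 10 ensemble members.
Ny, Ne = 50, 10
Y = rng.normal(size=(Ny, Ne))
dY = Y - Y.mean(axis=1, keepdims=True)
rank_anomalies = np.linalg.matrix_rank(dY)   # at most Ne - 1
```

Subtracting the ensemble mean removes one degree of freedom, which is why the rank is bounded by the ensemble size minus one rather than by the ensemble size.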
Problems with EnKF. We can only show that EnKF samples the correct pdf when the dynamical system (reservoir simulator) is linear, predicted data are linearly related to the state vector, the random vector of predicted states is Gaussian, and the size of the ensemble goes to infinity. In this situation, EnKF is identical to RML. For highly nonlinear problems, the consistency between the updated state vector and the updated vector of model parameters can occasionally break down. Crude fix: periodic reruns from time zero, or some other iterative method. This reduces computational efficiency, but computational experiments indicate that we can often get by with only a few reruns. Samples generated with EnKF are only from a local region of the pdf and are relatively low probability, as they correspond to relatively high values of O(m).
Ensemble size = 100
Ensemble size = 10
Covariance Localization. The forecast covariance is replaced by its localized version,
$$C^{f,\rho}_{Y^n} = \rho_n \circ C^f_{Y^n},$$
where $\circ$ denotes the Schur product. Hamill (2001) showed that a covariance matrix becomes full rank when this Schur product is employed, as long as $\rho_n$ is real symmetric positive definite. In history matching applications of the EnKF, it is not feasible to explicitly form the covariance $C^f_{Y^n}$ or the Schur product involving this matrix. Instead, we partition $C^f_{Y^n}$ and $\rho_n$, and after extensive algebra (Emerick and Reynolds, Comp. Geosciences 2010) the EnKF analysis equation with covariance localization becomes
$$y_j^{n,a} = y_j^{n,f} + \rho_{Y^n D^n} \circ C_{Y^{n,f} D^{n,f}} \left( C_{D^n} + \rho_{D^n D^n} \circ C_{D^{n,f} D^{n,f}} \right)^{-1} \left( d_{uc,j}^{n} - d_j^{n,f} \right). \quad (1)$$
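The rank statement can be illustrated on a small one-dimensional example. As a stand-in for the correlation matrix, the sketch below uses a simple tent-shaped taper with compact support (an assumption made to keep the example short; it is not the correlation function used in the field case), and compares the rank of a sample covariance before and after the Schur product.

```python
import numpy as np

rng = np.random.default_rng(5)
Ny, Ne = 30, 10

# Sample covariance from a small ensemble: rank is at most Ne - 1.
Y = rng.normal(size=(Ny, Ne))
dY = Y - Y.mean(axis=1, keepdims=True)
C = dY @ dY.T / (Ne - 1)

# Tent-shaped taper on a 1D grid (stand-in for rho): 1 at zero lag,
# decreasing linearly to 0 at a lag of 3 grid cells.
idx = np.arange(Ny)
lag = np.abs(idx[:, None] - idx[None, :])
rho = np.clip(1.0 - lag / 3.0, 0.0, None)

C_loc = rho * C          # Schur (elementwise) product

rank_raw = np.linalg.matrix_rank(C)
rank_loc = np.linalg.matrix_rank(C_loc)
```

The example forms the full matrices only because the grid is tiny; as noted above, this is not feasible for reservoir-scale state vectors.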
Covariance Localization. As in standard implementations of EnKF, we can avoid direct formation of matrices such as $C_{Y^{n,f} D^{n,f}}$ and $\rho_{Y^n D^n} \circ C_{Y^{n,f} D^{n,f}}$. Once the vector
$$x_j \equiv \left( C_{D^n} + \rho_{D^n D^n} \circ C_{D^{n,f} D^{n,f}} \right)^{-1} \left( d_{uc,j}^{n} - d_j^{n,f} \right)$$
has been formed, calculation of $\left( \rho_{Y^n D^n} \circ C_{Y^{n,f} D^{n,f}} \right) x_j$ requires only vector inner products and vector sums. Here, $\rho_{M^{n,f} D^{n,f}}$, $\rho_{P^{n,f} D^{n,f}}$, and $\rho_{D^n D^n}$ are based on the same correlation function.
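One way to see why only inner products and vector sums are needed is to expand the sample cross-covariance inside the Schur product and sum over ensemble members. The sketch below is an illustration with random stand-in data; the function name `localized_cov_times_x` is hypothetical, and the localization matrix `rho_YD` is filled with arbitrary values because the identity holds for any matrix of that shape.

```python
import numpy as np

def localized_cov_times_x(dY, dD, rho_YD, x):
    """Compute (rho_YD o C_YD) x without forming C_YD, where
    C_YD = dY dD^T / (Ne - 1) and 'o' is the Schur product."""
    Ne = dY.shape[1]
    out = np.zeros(dY.shape[0])
    for k in range(Ne):
        # Each row of rho_YD takes an inner product with (dD[:, k] * x);
        # the result scales member k's state anomaly entry by entry.
        out += dY[:, k] * (rho_YD @ (dD[:, k] * x))
    return out / (Ne - 1)

rng = np.random.default_rng(4)
Ny, Nd, Ne = 200, 5, 20
dY = rng.normal(size=(Ny, Ne))
dY -= dY.mean(axis=1, keepdims=True)
dD = rng.normal(size=(Nd, Ne))
dD -= dD.mean(axis=1, keepdims=True)
rho_YD = rng.uniform(size=(Ny, Nd))      # arbitrary stand-in localization values
x = rng.normal(size=Nd)

fast = localized_cov_times_x(dY, dD, rho_YD, x)

# Reference: form the cross-covariance explicitly (only viable for small Ny).
explicit = (rho_YD * (dY @ dD.T / (Ne - 1))) @ x
```

The loop stores nothing larger than `rho_YD` and the anomalies, so the Ny-by-Nd cross-covariance never has to be assembled.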
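Calculation of correlation matrix. We use the fifth-order compact correlation function defined by Gaspari and Cohn (1999) to calculate the elements of the correlation matrix:
$$\rho = \begin{cases} -\dfrac{1}{4}\left(\dfrac{\delta}{L}\right)^5 + \dfrac{1}{2}\left(\dfrac{\delta}{L}\right)^4 + \dfrac{5}{8}\left(\dfrac{\delta}{L}\right)^3 - \dfrac{5}{3}\left(\dfrac{\delta}{L}\right)^2 + 1, & 0 \le \delta \le L, \\[2ex] \dfrac{1}{12}\left(\dfrac{\delta}{L}\right)^5 - \dfrac{1}{2}\left(\dfrac{\delta}{L}\right)^4 + \dfrac{5}{8}\left(\dfrac{\delta}{L}\right)^3 + \dfrac{5}{3}\left(\dfrac{\delta}{L}\right)^2 - 5\left(\dfrac{\delta}{L}\right) + 4 - \dfrac{2}{3}\left(\dfrac{\delta}{L}\right)^{-1}, & L \le \delta \le 2L, \\[2ex] 0, & \delta > 2L, \end{cases}$$
where $L$ is the length scale of the correlation function, and $\delta$ is the Euclidean distance between any grid point and an observation location. Throughout, we refer to $L$ as the critical length. Note that $\rho$ is zero for $\delta \ge 2L$, and when $\delta = L$, $\rho \approx 0.21$. For an anisotropic two-dimensional model, we assume that the shape of the area correlated to the data is an elliptical area determined by two critical lengths, $L_x$ and $L_y$, rotated by $\theta$, as illustrated in the following figure.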
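A short sketch of the fifth-order Gaspari and Cohn correlation function for the isotropic case. The function name `gaspari_cohn` and the array-based interface are choices made for this example.

```python
import numpy as np

def gaspari_cohn(delta, L):
    """Fifth-order compact correlation function of Gaspari and Cohn (1999).
    delta: distance(s) to the observation; L: critical length."""
    r = np.abs(np.atleast_1d(np.asarray(delta, dtype=float))) / L
    rho = np.zeros_like(r)                 # stays zero beyond 2L
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri, ro = r[inner], r[outer]
    rho[inner] = (-0.25 * ri**5 + 0.5 * ri**4 + 0.625 * ri**3
                  - (5.0 / 3.0) * ri**2 + 1.0)
    rho[outer] = (ro**5 / 12.0 - 0.5 * ro**4 + 0.625 * ro**3
                  + (5.0 / 3.0) * ro**2 - 5.0 * ro + 4.0 - 2.0 / (3.0 * ro))
    return rho

# Correlation at distances 0, L, 2L and 3L for L = 1.
values = gaspari_cohn([0.0, 1.0, 2.0, 3.0], L=1.0)
```

At a distance equal to the critical length the function evaluates to 5/24, which is the value of about 0.21 quoted above.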
Correlation Function. Figure: Anisotropic correlation function (ellipse in the x-y plane with critical lengths L_x and L_y, rotated by θ).
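The slides do not spell out how the ellipse enters the correlation function. One plausible reading, shown here purely as an assumption, is to rotate the offset from the observation into the ellipse axes and replace the isotropic ratio δ/L by a scaled distance that equals 1 on the ellipse boundary. The function name and argument names are hypothetical.

```python
import numpy as np

def elliptical_scaled_distance(x, y, x0, y0, Lx, Ly, theta):
    """Scaled distance from (x0, y0) to (x, y) for an ellipse with critical
    lengths Lx, Ly whose major axis is rotated by theta (radians) from the
    x axis. Returns 1 on the ellipse boundary (assumed convention)."""
    dx = np.asarray(x, dtype=float) - x0
    dy = np.asarray(y, dtype=float) - y0
    c, s = np.cos(theta), np.sin(theta)
    u = c * dx + s * dy       # offset along the rotated Lx axis
    v = -s * dx + c * dy      # offset along the rotated Ly axis
    return np.sqrt((u / Lx) ** 2 + (v / Ly) ** 2)

# Points one critical length away along each rotated axis.
theta = np.pi / 4
r_major = elliptical_scaled_distance(2.0 * np.cos(theta), 2.0 * np.sin(theta),
                                     0.0, 0.0, Lx=2.0, Ly=1.0, theta=theta)
r_minor = elliptical_scaled_distance(-np.sin(theta), np.cos(theta),
                                     0.0, 0.0, Lx=2.0, Ly=1.0, theta=theta)
```

Under this convention the scaled distance would be passed to the correlation function in place of δ/L, so the correlation vanishes outside an ellipse with semi-axes 2 L_x and 2 L_y.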
Emerick and Reynolds, Comp. Geosciences, 2010: the region of correlation should be equal to the geometric sum of the region of sensitivity and the prior correlation region. Figure panels: (a) Sensitivity ($G$). (b) Prior covariance ($C^f_{M^n}$). (c) Product $C^f_{M^n} G^T$. Figure: Example illustrating that the ranges of the product $C^f_{M^n} G^T$ are equal to the sum of the radius of influence of the sensitivity matrix and the correlation lengths of the prior covariance.
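The additive behavior of the ranges can be reproduced on a one-dimensional grid. The sketch below is a toy construction, not the example from the paper: the prior covariance is a tent function with compact support, and the sensitivity row is an indicator of the cells near a hypothetical well.

```python
import numpy as np

n, well = 101, 50          # 1D grid with a well in the middle cell
a, b = 10, 5               # tent half-width (prior) and sensitivity radius
idx = np.arange(n)

# Prior covariance with compact support: nonzero only for lags below a.
lag = np.abs(idx[:, None] - idx[None, :])
C_M = np.clip(1.0 - lag / a, 0.0, None)

# Sensitivity of one datum: nonzero only within b cells of the well.
G = (np.abs(idx - well) <= b).astype(float)[None, :]

CGt = (C_M @ G.T).ravel()  # product of prior covariance and sensitivity

def radius(v):
    """Largest distance from the well at which v is nonzero."""
    nz = np.flatnonzero(np.abs(v) > 1e-12)
    return int(np.max(np.abs(nz - well)))

prior_radius = radius(C_M[well])     # range of the prior covariance
sens_radius = radius(G.ravel())      # radius of influence of the sensitivity
prod_radius = radius(CGt)            # range of the product
```

All entries are non-negative here, so no cancellation occurs and the range of the product is exactly the sum of the other two ranges.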
Drainage areas. Damiani (2007) proposed a method based on calculation of a pseudo-tracer concentration to define drainage areas associated with each well. The computations are performed as a postprocessing of the results of a reservoir simulation, using the velocity fields stored during the simulation. Drainage areas obtained with this procedure are equivalent to the ones obtained by tracing the streamlines. Figure: Streamlines versus pseudo-tracer concentration (Damiani, 2007).
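Drainage area versus sensitivities. Well P4, time: 12000 days. Figure: True sensitivities versus drainage areas from streamlines. Panels: permeability (true model); P4 drainage area; sensitivity of P4 oil rate to ln k ($\partial q_o / \partial \ln k$); sensitivity of P4 water rate to ln k ($\partial q_w / \partial \ln k$). Wells P4 and I1 are labeled.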
Field Example. Description: upper zone of a turbidite reservoir in the Campos Basin. Grid: 165 x 86 x 4, with 20,258 active gridblocks. Observed data: water rate in producer wells. Well controls: producers, oil rate; injectors, water rate. Current model (manual history matching): matches data fairly well but is very artificial (several patches). Total history: ~10 years. We used 7.6 years for history matching and 2.4 years to compare predictions.
Initial ensemble: 200 models. Sequential Gaussian Simulation. Hard data: permeabilities from well tests. Cases: (1) EnKF; (2) EnKF with localization (drainage + prior). Horizontal permeabilities. Vertical permeability is calculated using kv/kh = 0.3.
Turbidite systems
Permeability: manual history-matched model. Legend: producer, water injector. Layer 4.
Localization. Figure panels: prior correlation; drainage region; ellipse obtained from drainage region; localization (Gaspari-Cohn).
Permeability: initial ensemble mean. Legend: producer, water injector. Layer 4.
Permeability: ensemble mean, EnKF. Legend: producer, water injector. Layer 4.
Permeability: ensemble mean, EnKF with localization. Legend: producer, water injector. Layer 4.
Standard deviation of ln(k). Figure panels: initial ensemble; EnKF; EnKF + localization.
Objective function: data mismatch considering the history part only. Normalized: $O_N = O_d / N_d$. Figure: histograms of $O_N$ (relative and cumulative frequency) for EnKF (mean 95.4) and EnKF + localization (mean 66.9); the manual history match gives $O_N = 79.8$.
Objective function: data mismatch considering the prediction part only. Normalized: $O_N = O_d / N_d$. Figure: histograms of $O_N$ (relative and cumulative frequency) for EnKF (mean 19.4) and EnKF + localization (mean 14.1); the manual history match gives $O_N = 30.8$.