P016 Toward Gauss-Newton and Exact Newton Optimization for Full Waveform Inversion

L. Métivier* (ISTerre), R. Brossier (ISTerre), J. Virieux (ISTerre) & S. Operto (Géoazur)

SUMMARY

Full Waveform Inversion (FWI) applications classically rely on efficient first-order optimization schemes, such as steepest descent or nonlinear conjugate gradient. However, second-order information provided by the Hessian matrix has been shown to help scale the FWI problem and speed up the optimization. In this study, we propose an efficient matrix-free Hessian-vector formalism that should make it possible to tackle Gauss-Newton (GN) and Exact-Newton (EN) optimization for large and realistic FWI targets. Our method relies on general second-order adjoint formulas, based on a Lagrangian formalism. These formulas make it possible to compute Hessian-vector products at the cost of 2 forward simulations per shot. In this context, the computational cost per shot of one GN or one EN nonlinear iteration amounts to 2 forward simulations for the computation of the gradient, plus 2 forward simulations per inner linear conjugate gradient iteration. A numerical test is provided, emphasizing the possible improvement of the resolution when accounting for the exact Hessian in the inversion algorithm.

Introduction

Full Waveform Inversion (FWI) is becoming an efficient tool to derive high-resolution quantitative models of the subsurface parameters. The method relies on the minimization, through an iterative procedure, of the residual between recorded data and synthetic data computed by solving the two-way wave equation in a subsurface model. The growth of available computational resources and recent developments of the method now make applications possible to 2D and 3D data in the acoustic approximation (see for example Prieux et al., 2011; Plessix et al., 2010) and even in the elastic approximation (Brossier et al., 2009).

Most FWI applications rely on fast optimization schemes such as preconditioned steepest descent or preconditioned conjugate gradient (PCG) methods. Second-order information provided by the Hessian is often neglected in FWI, because of the high computational cost of building this matrix and solving the normal equation system. However, a significant improvement of the results can be obtained using this information. Pratt et al. (1998) have shown the improved resolution of the Gauss-Newton method compared to steepest descent in a canonical application. Hu et al. (2011) have shown the improvement provided by a non-diagonal truncated Hessian in PCG. Brossier et al. (2009) have shown the impact of the Hessian estimated by the quasi-Newton l-BFGS method (Nocedal, 1980) on image resolution and convergence speed, compared to PCG. Epanomeritakis et al. (2008) and Fichtner and Trampert (2011) have also discussed the interest of the Hessian for inversion and uncertainty estimation, as well as the prohibitive cost of Hessian computation and storage.

In this study, we develop the mathematical framework for an efficient matrix-free Hessian-vector product algorithm for FWI, and illustrate the interest of accounting for the exact Hessian in the inversion process. The final aim is to tackle Gauss-Newton and Full-Newton methods for large-scale applications.
Our development relies on a general second-order adjoint-state formula, valid either in the time or in the frequency domain.

Problem statement

We consider the forward problem equation

$$S(p)\,u = \varphi, \tag{1}$$

where $p$ denotes the subsurface parameters, an element of the model space, $S(p)$ denotes the linear forward problem operator corresponding to the discretization of the two-way wave equation¹, $\varphi$ is the source vector, and $u$ is the wavefield vector. In the following, we consider that $p$ and $u$ have $N$ components². These notations are general and can be applied either in the time domain or in the frequency domain.

¹ Note that the operator $S(p)$ depends nonlinearly on $p$.
² This simplification is satisfied whenever $u$ is discretized on the same grid as $p$. However, the formulas can be straightforwardly extended to the situation where $p$ and $u$ do not have the same number of components.

The FWI problem is expressed as the least-squares problem

$$\min_p f(p) = \frac{1}{2} \sum_{s=1}^{N_s} \left\| R_s u_s(p) - d_s \right\|^2, \tag{2}$$

where $d_s$ and $u_s(p)$ are respectively the recorded dataset and the solution of the forward problem associated with the source $\varphi_s$, $N_s$ is the total number of sources in the dataset, and $R_s$ is a restriction operator that maps the wavefield $u_s$ to the receiver locations. The misfit gradient is

$$\nabla f(p) = \Re\left( \sum_{s=1}^{N_s} J_s^\dagger R_s^\dagger \left( R_s u_s(p) - d_s \right) \right), \tag{3}$$

where $J_s(p)$ denotes the Jacobian matrix $\partial_p u_s(p)$. The complex conjugate transpose operator is denoted by the symbol $\dagger$ and the real part by $\Re$. The Hessian operator is

$$H(p) = \Re\left( \sum_{s=1}^{N_s} \left[ J_s^\dagger R_s^\dagger R_s J_s + \sum_{j=1}^{N} \left[ R_s^\dagger \left( R_s u_s(p) - d_s \right) \right]_j H_{sj} \right] \right), \qquad H_{sj} = \partial^2_{pp} u_{sj}(p), \tag{4}$$

and its Gauss-Newton approximation is

$$B(p) = \Re\left( \sum_{s=1}^{N_s} J_s^\dagger R_s^\dagger R_s J_s \right). \tag{5}$$

In our study, problem (2) is solved using a Newton or a Gauss-Newton algorithm, a local iterative optimization method that computes a sequence $p_k$ from an initial guess $p_0$ using the update formula

$$p_{k+1} = p_k + \alpha_k d_k, \tag{6}$$

where $d_k$ is the solution of

$$H(p_k)\,d_k = -\nabla f(p_k), \quad \text{or} \quad B(p_k)\,d_k = -\nabla f(p_k) \ \text{in the Gauss-Newton approximation}, \tag{7}$$

and $\alpha_k$ is computed through a globalization procedure (linesearch or trust region).

Explicit computation and storage of the matrices $J(p)$, $H(p)$ and $B(p)$ is prohibitive for large-scale problems and realistic FWI applications. Hence, equation (7) should be solved using matrix-free iterative linear solvers such as the conjugate gradient (CG) method, as proposed in Epanomeritakis et al. (2008) and Fichtner and Trampert (2011). This requires computing efficiently both the gradient $\nabla f(p)$ and the Hessian-vector products $H(p)v$ or $B(p)v$ for arbitrary vectors $v$. The computation of $\nabla f(p)$ can be performed efficiently through the classical adjoint-state formula (Plessix, 2006)

$$\left( \nabla f(p) \right)_i = \Re\left( \partial_{p_i} S(p)\,u(p),\ \lambda(p) \right), \quad i = 1, \ldots, N, \tag{8}$$

where the adjoint state $\lambda(p)$ is the solution of

$$S(p)^\dagger \lambda = R^\dagger \left( d - R u \right). \tag{9}$$

For the computation of $H(p)v$ and $B(p)v$, we propose a general second-order adjoint-state formula related to the Lagrangian formalism and nonlinear constrained optimization theory.

Computing Hessian-vector products

For the sake of clarity, we consider in the following that $N_s = 1$ and we drop the index $s$. Formulas for $N_s > 1$ are directly obtained by summation. First, we define the function $g_v(p)$ such that

$$g_v(p) = \left( \nabla f(p),\ v \right) = \Re\left( J^\dagger R^\dagger \left( R u(p) - d \right),\ v \right), \tag{10}$$

where $u(p)$ is the solution of (1). We have $\nabla g_v(p) = H(p)v$.
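
The first-order adjoint-state formulas (8) and (9) can be checked numerically on a very small problem. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the toy operator `S(p) = A + diag(p**2)`, the matrix `A`, the receiver layout `R` and all function names are hypothetical, chosen so that the operator depends nonlinearly on `p` and its derivative with respect to `p_i` has a single non-zero entry. The adjoint gradient is compared with a central finite-difference gradient of the misfit for a single source.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n_rec = 6, 3                      # model/wavefield size, number of receivers

# Toy assumption: S(p) = A + diag(p**2), with A a fixed complex matrix.
A = 4.0 * np.eye(N) + 0.2 * (rng.standard_normal((N, N))
                             + 1j * rng.standard_normal((N, N)))
phi = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # source vector
R = np.eye(N)[:n_rec]                # restriction to the first n_rec points


def S(p):
    return A + np.diag(p**2)


def forward(p):
    """Solve the forward problem S(p) u = phi, as in equation (1)."""
    return np.linalg.solve(S(p), phi)


p_true = 1.0 + 0.3 * rng.random(N)
d = R @ forward(p_true)              # synthetic "recorded" data


def misfit(p):
    """Least-squares misfit of equation (2) for a single source."""
    r = R @ forward(p) - d
    return 0.5 * np.vdot(r, r).real


def gradient_adjoint(p):
    u = forward(p)
    # Adjoint problem (9): S(p)^dagger lambda = R^dagger (d - R u)
    lam = np.linalg.solve(S(p).conj().T, R.conj().T @ (d - R @ u))
    # d_{p_i} S(p) u has a single non-zero entry, 2 p_i u_i, in row i,
    # so formula (8) reduces to Re(conj(lambda_i) * 2 p_i u_i).
    return np.real(np.conj(lam) * 2.0 * p * u)


def gradient_fd(p, h=1e-6):
    """Central finite differences: 2 N misfit evaluations."""
    g = np.zeros(N)
    for i in range(N):
        e = np.zeros(N)
        e[i] = h
        g[i] = (misfit(p + e) - misfit(p - e)) / (2.0 * h)
    return g


p0 = np.ones(N)
grad = gradient_adjoint(p0)
grad_fd = gradient_fd(p0)
print("max abs difference:", np.max(np.abs(grad - grad_fd)))
```

The adjoint route needs one forward and one adjoint solve whatever the value of `N`, whereas the finite-difference check needs `2 * N` forward solves, which is why only the former scales to realistic model sizes.
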
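We define the Lagrangian function associated with the functional $g_v(p)$:

$$L_v(p, u, \alpha, \lambda, \mu) = \Re\left( R^\dagger \left( R u - d \right),\ \alpha \right) + \Re\left( S(p)u - \varphi,\ \mu \right) + \Re\left( S(p)\alpha + \sum_{j=1}^{N} v_j\, \partial_{p_j} S(p)\,u,\ \lambda \right). \tag{11}$$

The Lagrangian $L_v$ is composed of three terms: the first accounts for the function $g_v$; the second accounts for the constraint on the wavefield $u$, solution of the forward problem; the third accounts for the constraint on the first-order derivatives of the wavefield $u$ with respect to $p$. For $\tilde{u}$ and $\tilde{\alpha}$ such that

$$S(p)\,\tilde{u} = \varphi, \qquad S(p)\,\tilde{\alpha} = -\sum_{j=1}^{N} v_j\, \partial_{p_j} S(p)\,\tilde{u}, \tag{12}$$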

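we have

$$\nabla g_v(p) = \partial_p L_v(p, \tilde{u}, \tilde{\alpha}, \lambda, \mu) + \partial_u L_v(p, \tilde{u}, \tilde{\alpha}, \lambda, \mu)\, \partial_p \tilde{u}(p) + \partial_\alpha L_v(p, \tilde{u}, \tilde{\alpha}, \lambda, \mu)\, \partial_p \tilde{\alpha}(p). \tag{13}$$

We define $\tilde{\lambda}$ and $\tilde{\mu}$ such that

$$\partial_u L_v(p, \tilde{u}, \tilde{\alpha}, \tilde{\lambda}, \tilde{\mu}) = 0, \qquad \partial_\alpha L_v(p, \tilde{u}, \tilde{\alpha}, \tilde{\lambda}, \tilde{\mu}) = 0. \tag{14}$$

We have

$$S(p)^\dagger \tilde{\mu} = -R^\dagger R\, \tilde{\alpha} - \sum_{j=1}^{N} v_j \left( \partial_{p_j} S(p) \right)^\dagger \tilde{\lambda}, \qquad S(p)^\dagger \tilde{\lambda} = -R^\dagger \left( R \tilde{u} - d \right), \tag{15}$$

and

$$\left( H(p)v \right)_i = \Re\left( \partial_{p_i} S(p)\,\tilde{u},\ \tilde{\mu} \right) + \Re\left( \partial_{p_i} S(p)\,\tilde{\alpha},\ \tilde{\lambda} \right) + \Re\left( \sum_{j=1}^{N} v_j\, \partial_{p_j} \partial_{p_i} S(p)\,\tilde{u},\ \tilde{\lambda} \right), \quad i = 1, \ldots, N. \tag{16}$$

Note that $\tilde{\lambda}$ corresponds to the adjoint state defined for the computation of $\nabla f$. In addition, it can be proved that the computation of $B(p)v$ amounts to setting $\tilde{\lambda}$ to 0 in equations (15) and (16):

$$\left( B(p)v \right)_i = \Re\left( \partial_{p_i} S(p)\,\tilde{u},\ \tilde{\mu} \right), \quad i = 1, \ldots, N, \quad \text{with} \quad S(p)^\dagger \tilde{\mu} = -R^\dagger R\, \tilde{\alpha}. \tag{17}$$

The computation of one matrix-vector product $H(p)v$ or $B(p)v$ thus requires solving one additional forward problem for $\tilde{\alpha}$ and one additional adjoint problem for $\tilde{\mu}$, as reported in Epanomeritakis et al. (2008) and Fichtner and Trampert (2011). For $N_s > 1$, the overall computation cost is multiplied by $N_s$.

Numerical results

We consider the estimation of the pressure wave velocity $v_p$ using a 2D acoustic FWI algorithm in the frequency domain. The exact model $v_p$ is composed of a homogeneous background $v_p^0 = 1500\ \mathrm{m\,s^{-1}}$ defined on a square of side 2000 m, and two inclusions where $v_p = 4500\ \mathrm{m\,s^{-1}}$. Perfectly matched layers (PML) surround the domain. The two inclusions are only 40 m apart (see Figure 1). The high velocity contrast generates high-amplitude multi-scattered waves. We use a full acquisition with four lines of 29 sources, placed every 50 m, one line on each side of the domain. Each source is associated with four lines of 29 receivers placed every 50 m all around the domain. The model is discretized over a 101 × 101 grid with a spatial step of 20 m. The initial guess is the background $v_p^0$. We use one dataset corresponding to the frequency of 5 Hz. The associated average wavelength is $\lambda \approx 300$ m. At this frequency, the distance between the two inclusions is smaller than the expected resolution of classical FWI algorithms based on the l-BFGS or Gauss-Newton approximation. We compare the results obtained by performing 50 iterations of the l-BFGS algorithm, 20 iterations of Gauss-Newton inversion and 20 iterations of Exact-Newton inversion³.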
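
The second-order adjoint recipe of equations (12), (15), (16) and (17) can be exercised on a toy problem. The following sketch rests on assumptions that are not in the paper: the operator `S(p) = A + diag(p**2)`, the matrix `A`, the receiver layout and every function name are hypothetical and serve only as an illustration. The exact Hessian-vector product is compared with a finite difference of the adjoint gradient, and the Gauss-Newton product with the explicit expression built from the Jacobian.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n_rec = 6, 3
# Toy assumption: S(p) = A + diag(p**2), so that
#   d_{p_j} S        = 2 p_j e_j e_j^T   (real, diagonal)
#   d_{p_j} d_{p_i} S = 2 e_i e_i^T if i == j, else 0
A = 4.0 * np.eye(N) + 0.2 * (rng.standard_normal((N, N))
                             + 1j * rng.standard_normal((N, N)))
phi = rng.standard_normal(N) + 1j * rng.standard_normal(N)
R = np.eye(N)[:n_rec]


def S(p):
    return A + np.diag(p**2)


def forward(p, rhs):
    return np.linalg.solve(S(p), rhs)


def adjoint(p, rhs):
    return np.linalg.solve(S(p).conj().T, rhs)


d = R @ forward(1.0 + 0.3 * rng.random(N), phi)   # synthetic data


def gradient(p):
    """First-order adjoint gradient, equations (8)-(9)."""
    u = forward(p, phi)
    lam = adjoint(p, R.conj().T @ (d - R @ u))
    return np.real(np.conj(lam) * 2.0 * p * u)


def hessian_vector(p, v, gauss_newton=False):
    u = forward(p, phi)                              # S u = phi
    alpha = forward(p, -(2.0 * p * v * u))           # second equation of (12)
    if gauss_newton:
        lam = np.zeros(N, dtype=complex)             # lambda set to 0, eq. (17)
    else:
        lam = adjoint(p, -R.conj().T @ (R @ u - d))  # second equation of (15)
    # First equation of (15); sum_j v_j (d_{p_j} S)^dagger lam = 2 p v lam here.
    mu = adjoint(p, -R.conj().T @ (R @ alpha) - 2.0 * p * v * lam)
    # Equation (16), term by term, for the diagonal toy operator.
    return np.real(np.conj(mu) * 2.0 * p * u
                   + np.conj(lam) * 2.0 * p * alpha
                   + np.conj(lam) * 2.0 * v * u)


p0 = np.ones(N)
v = rng.standard_normal(N)

Hv = hessian_vector(p0, v)
h = 1e-5
Hv_fd = (gradient(p0 + h * v) - gradient(p0 - h * v)) / (2.0 * h)

Bv = hessian_vector(p0, v, gauss_newton=True)
u0 = forward(p0, phi)
J = np.linalg.solve(S(p0), -np.diag(2.0 * p0 * u0))   # explicit Jacobian
Bv_ref = np.real(J.conj().T @ (R.conj().T @ (R @ (J @ v))))

print("exact Hessian check:", np.max(np.abs(Hv - Hv_fd)))
print("Gauss-Newton check :", np.max(np.abs(Bv - Bv_ref)))
```

In this sketch each call to `hessian_vector` recomputes `u` and `lam` for readability. Both are already available from the gradient computation, so only the solves for `alpha` and `mu` are genuinely additional, which matches the cost stated above. The explicit Jacobian `J` is formed here purely for verification and would not be affordable at realistic sizes.
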
The corresponding estimated models are plotted in Figure 1.

Figure 1: Acoustic FWI for pressure wave velocity. From left to right: exact model, l-BFGS result, Gauss-Newton result, Exact-Newton result.

³ This corresponds approximately to the same computation cost for 2.5 conjugate gradient inner iterations per Newton nonlinear iteration.

These results emphasize the role of the second-order part of the Hessian, which is neglected in the Gauss-Newton approximation and only poorly estimated in the l-BFGS approximation. As mentioned by different authors, this part of the Hessian makes it possible to account for double scattering during the inversion (Pratt et al., 1998; Virieux and Operto, 2009). An enhancement of the resolution is obtained: the two inclusions are identified only when the exact Hessian is used. In this case, the normalized misfit is $7 \times 10^{-4}$. In the case of the Gauss-Newton inversion, the final normalized misfit is around 0.25: neglecting the double-scattered waves prevents the optimizer from converging. In the case of the l-BFGS inversion, the final normalized misfit is around $10^{-3}$: the l-BFGS Hessian approximation accounts for part of the scattered wavefield.

Conclusions and perspectives

The preliminary results we have obtained illustrate how accounting for the Hessian can improve FWI results. The second-order adjoint formula is an efficient tool to implement Gauss-Newton or Exact-Newton algorithms in a matrix-free fashion. The next step consists in developing globalization methods, based either on linesearch or on trust region, that render the overall method competitive with the l-BFGS method. Two main problems have to be tackled. The first is the definition of a proper criterion to stop the inner conjugate gradient iterations, in order to minimize the computation cost. The second is the development of efficient preconditioners to speed up the convergence of the inner conjugate gradient loop. Among several possibilities, we are interested in the Newton-Steihaug algorithm (Steihaug, 1983), based on the trust-region method, which naturally provides a stopping criterion for the inner iterations. In addition, for each resolution of the inner linear systems, an l-BFGS preconditioner can be computed simultaneously and applied to the inner linear system arising at the next outer nonlinear iteration.
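
To make the stopping rules of a Steihaug-type inner loop concrete, here is a minimal sketch of truncated conjugate gradient inside a trust region, written in the way the method is usually presented in the optimization literature. It is not the authors' implementation: the function names, the relative tolerance and the small test matrices are assumptions made for illustration. The Hessian enters only through a callable `hess_vec`, so the loop is matrix-free, and the inner iterations stop on a small residual, on negative curvature, or when the iterate reaches the trust-region boundary.

```python
import numpy as np


def _step_to_boundary(d, q, radius):
    """Return tau >= 0 such that ||d + tau * q|| equals radius."""
    qq, dq, dd = q @ q, d @ q, d @ d
    return (-dq + np.sqrt(dq**2 + qq * (radius**2 - dd))) / qq


def steihaug_cg(hess_vec, grad, radius, rel_tol=1e-10, max_iter=50):
    """Approximately minimise grad.d + 0.5 d.H d subject to ||d|| <= radius."""
    d = np.zeros_like(grad)
    r = grad.copy()              # residual H d + grad, with d = 0 initially
    q = -r                       # first direction: steepest descent
    g_norm = np.linalg.norm(grad)
    if g_norm == 0.0:
        return d, "converged"
    for _ in range(max_iter):
        Hq = hess_vec(q)         # the only access to the Hessian
        curvature = q @ Hq
        if curvature <= 0.0:     # non-convex direction: go to the boundary
            return d + _step_to_boundary(d, q, radius) * q, "negative curvature"
        step = (r @ r) / curvature
        if np.linalg.norm(d + step * q) >= radius:
            return d + _step_to_boundary(d, q, radius) * q, "boundary"
        d = d + step * q
        r_new = r + step * Hq
        if np.linalg.norm(r_new) <= rel_tol * g_norm:
            return d, "converged"
        q = -r_new + ((r_new @ r_new) / (r @ r)) * q
        r = r_new
    return d, "max_iter"


# Hypothetical test data: a small convex model and a small indefinite one.
H_spd = np.array([[4.0, 1.0], [1.0, 3.0]])
H_ind = np.diag([1.0, -2.0])
g = np.array([1.0, 2.0])

d_newton, status_newton = steihaug_cg(lambda x: H_spd @ x, g, radius=10.0)
d_small, status_small = steihaug_cg(lambda x: H_spd @ x, g, radius=0.1)
d_neg, status_neg = steihaug_cg(lambda x: H_ind @ x, np.array([1.0, 1.0]), radius=1.0)
print(status_newton, status_small, status_neg)
```

In an FWI setting, `hess_vec` would wrap a second-order adjoint product such as $H(p)v$ or $B(p)v$, so each inner iteration would cost the two additional solves per shot discussed above.
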
Acknowledgements

This research was funded by the SEISCOPE consortium, sponsored by BP, CGG-VERITAS, ENI, EXXON MOBIL, PETROBRAS, SAUDI ARAMCO, SHELL, STATOIL and TOTAL. The linear systems were solved with the MUMPS package. This work was performed using the high-performance computing facilities of CIMENT (Université de Grenoble, France) and the HPC resources of GENCI-CINES under Grant 2011-046091.

References

Brossier, R., Operto, S. and Virieux, J. [2009] Seismic imaging of complex onshore structures by 2D elastic frequency-domain full-waveform inversion. Geophysics, 74(6), WCC63-WCC76, doi:10.1190/1.3215771.

Epanomeritakis, I., Akçelik, V., Ghattas, O. and Bielak, J. [2008] A Newton-CG method for large-scale three-dimensional elastic full waveform seismic inversion. Inverse Problems, 24, 1-26.

Fichtner, A. and Trampert, J. [2011] Hessian kernels of seismic data functionals based upon adjoint techniques. Geophysical Journal International, 185(2), 775-798, ISSN 1365-246X, doi:10.1111/j.1365-246x.2011.04966.x.

Hu, W., Abubakar, A., Habashy, T.M. and Liu, J. [2011] Preconditioned non-linear conjugate gradient method for frequency domain full-waveform seismic inversion. Geophysical Prospecting, 59(3), 477-491, ISSN 1365-2478, doi:10.1111/j.1365-2478.2010.00938.x.

Nocedal, J. [1980] Updating Quasi-Newton Matrices With Limited Storage. Mathematics of Computation, 35(151), 773-782.

Plessix, R.E. [2006] A review of the adjoint-state method for computing the gradient of a functional with geophysical applications. Geophysical Journal International, 167(2), 495-503.

Plessix, R.E., Baeten, G., de Maag, J.W., Klaassen, M., Rujie, Z. and Zhifei, T. [2010] Application of acoustic full waveform inversion to a low-frequency large-offset land data set. SEG Technical Program Expanded Abstracts, 29(1), 930-934, doi:10.1190/1.3513930.

Pratt, R.G., Shin, C. and Hicks, G.J. [1998] Gauss-Newton and full Newton methods in frequency-space seismic waveform inversion. Geophysical Journal International, 133, 341-362.

Prieux, V. et al. [2011] On the footprint of anisotropy on isotropic full waveform inversion: the Valhall case study. Geophysical Journal International, 187, 1495-1515, doi:10.1111/j.1365-246X.2011.05209.x.

Steihaug, T. [1983] The conjugate gradient method and trust regions in large scale optimization. SIAM Journal on Numerical Analysis, 20, 626-637.

Virieux, J. and Operto, S. [2009] An overview of full waveform inversion in exploration geophysics. Geophysics, 74(6), WCC127-WCC152.