Uncertainty quantification for inverse problems with a weak wave-equation constraint
Zhilong Fang*, Curt Da Silva*, Rachel Kuske** and Felix J. Herrmann*
*Seismic Laboratory for Imaging and Modeling (SLIM), University of British Columbia
**Department of Mathematics, Georgia Institute of Technology
Motivation: Forward problem
[Figure: schematic of a model m mapped to data d by the forward map F(m), with a source (*) and receivers (x).]
Motivation: Forward map F(m)
A(m)u = q, \quad d = Pu
Here m is the squared slowness, u the wavefields, q the source, P the projection operator onto the receiver locations, and A(m) the discretized wave-equation (Helmholtz) operator at angular frequency \omega.
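As a concrete illustration, the forward map can be sketched on a tiny 1-D toy problem; the grid size, spacing, velocity, and source/receiver layout below are illustrative assumptions, not the discretization used in the talk.

```python
import numpy as np

n = 50                            # grid points (assumed toy size)
h = 20.0                          # grid spacing [m]
omega = 2 * np.pi * 5.0           # angular frequency of a 5 Hz monochromatic source
m = np.full(n, 1.0 / 2000.0**2)   # squared slowness of a 2 km/s medium [s^2/m^2]

# Helmholtz operator A(m) = omega^2 diag(m) + L, with L a 1-D Laplacian stencil.
L = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2
A = omega**2 * np.diag(m) + L

q = np.zeros(n)
q[0] = 1.0                        # point source at the first grid point
P = np.eye(n)[::5]                # sample the wavefield at every 5th grid point

u = np.linalg.solve(A, q)         # wavefield: A(m) u = q
d = P @ u                         # data: d = P u
```

The data d are a linear function of the wavefield u, but a nonlinear function of the model m through A(m)^{-1}.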
Motivation: Inverse problem
m = F^{-1}(d)
[Figure: data d mapped back to the model m.]
Motivation: Statistical inverse problem
m_{\mathrm{noisy}} = F^{-1}(d_{\mathrm{noisy}})
[Figure: noisy data d mapped back to an uncertain model m.]
Motivation: Statistical inverse problem
[Figure: velocity [km/s] versus depth [m] profile with its confidence interval.]
Bayesian inference [A. Tarantola and B. Valette, 1982] [J. Kaipio and E. Somersalo, 2004]
Prior probability density function (PDF): m \to \pi_{\mathrm{prior}}(m)
Likelihood PDF (for given data d): m \to \pi_{\mathrm{like}}(d \mid m)
Posterior PDF (Bayes' rule): \pi_{\mathrm{post}}(m \mid d) \propto \pi_{\mathrm{like}}(d \mid m)\, \pi_{\mathrm{prior}}(m)
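Bayes' rule can be made tangible with a scalar toy example evaluated on a grid; the prior mean, likelihood width, and observed datum below are arbitrary assumptions chosen only to illustrate the product of prior and likelihood.

```python
import numpy as np

m_grid = np.linspace(-5.0, 5.0, 1001)
dm = m_grid[1] - m_grid[0]

# Gaussian prior centered at 0 with unit std; Gaussian likelihood for an
# observed datum d = 2.0 with noise std 0.5 (all values are assumptions).
prior = np.exp(-0.5 * (m_grid - 0.0)**2 / 1.0**2)
like = np.exp(-0.5 * (2.0 - m_grid)**2 / 0.5**2)

post = prior * like               # Bayes' rule up to a constant
post /= post.sum() * dm           # normalize numerically
m_map = m_grid[np.argmax(post)]   # posterior mode
```

For two Gaussians the posterior is again Gaussian, with its mode pulled from the prior mean toward the datum in proportion to the noise levels.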
Bayes w/ strong PDE constraints
[Figure: observed shot gathers decomposed as d = F(m) + \epsilon, with noise \epsilon \sim N(0, \Sigma_{\mathrm{noise}}).]
Bayes w/ strong PDE constraints [A. Tarantola and B. Valette, 1982]
Posterior PDF w/ strong PDE constraints:
\pi_{\mathrm{post}}(m \mid d) \propto \exp\big(-\tfrac{1}{2}\|P A(m)^{-1} q - d\|^2_{\Sigma_{\mathrm{noise}}^{-1}} - \tfrac{1}{2}\|m - m_{\mathrm{prior}}\|^2_{\Sigma_{\mathrm{prior}}^{-1}}\big)
Challenges:
- high-dimensional model space and data space
- non-linear and expensive model-to-data map
- no closed-form solution
Bayes w/ strong PDE constraints [J. Kaipio and E. Somersalo, 2004] [A. M. Stuart et al., 2004] [J. Martin et al., 2012]
McMC-type methods:
- Metropolis-Hastings method: draw samples with a proposal PDF
- Langevin method: construct the proposal PDF with a preconditioning matrix
- Newton-type McMC method: construct the proposal PDF with the local Hessian matrix
[Figure: target PDF \pi_{\mathrm{post}}(m) and proposal PDF \pi_{\mathrm{prp}}(m).]
Bayes w/ strong PDE constraints
Goal: a proposal PDF that is both accurate and cheap to construct.
Bayes w/ strong PDE constraints [J. Kaipio and E. Somersalo, 2004] [A. M. Stuart et al., 2004] [J. Martin et al., 2012] [R. G. Pratt, 1999]
McMC-type methods:
- Metropolis-Hastings method: draw samples with a proposal PDF
- Langevin method: construct the proposal PDF with a preconditioning matrix
- Newton-type McMC method: construct the proposal PDF with the local Hessian matrix
The Hessian and Gauss-Newton (GN) Hessian of \log \pi_{\mathrm{like}}(d \mid m) are dense and require additional PDE solves, e.g. the GN Hessian:
H_{\mathrm{GN}}(m) = \omega^4\, \mathrm{diag}(\mathrm{conj}(u))\, A(m)^{-\top} P^{\top} \Sigma_{\mathrm{noise}}^{-1} P\, A(m)^{-1}\, \mathrm{diag}(u)
PDE-constrained optimization problems [M. Fisher et al., 2005] [T. van Leeuwen and F. J. Herrmann, 2013] [T. van Leeuwen and F. J. Herrmann, 2015]
Original problem:
\min_{u,m} \tfrac{1}{2}\|Pu - d\|^2 \quad \text{s.t.} \quad A(m)u = q
Adjoint-state method (strong constraint):
\min_{m} \tfrac{1}{2}\|P A(m)^{-1} q - d\|^2
Penalty method (weak constraint):
\min_{u,m} \tfrac{1}{2}\|Pu - d\|^2 + \tfrac{\lambda^2}{2}\|A(m)u - q\|^2
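The two formulations can be compared on a tiny dense stand-in problem; A, P, q, the noise level, and the penalty parameter lam below are illustrative assumptions. For fixed m, the penalty objective is quadratic in u and its minimizer follows from the normal equations (variable projection).

```python
import numpy as np

rng = np.random.default_rng(0)
n, nr = 20, 5
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # stand-in for A(m)
P = np.eye(n)[:nr]                                  # 5 receivers
q = rng.standard_normal(n)
d = P @ np.linalg.solve(A, q) + 0.01 * rng.standard_normal(nr)  # noisy data
lam = 10.0                                          # penalty parameter

# Strong constraint: eliminate u exactly via u = A^{-1} q.
f_reduced = 0.5 * np.linalg.norm(P @ np.linalg.solve(A, q) - d)**2

# Weak constraint: minimize over u with the PDE as a quadratic penalty;
# the normal equations give the minimizer u_bar.
H = P.T @ P + lam**2 * A.T @ A
u_bar = np.linalg.solve(H, P.T @ d + lam**2 * A.T @ q)
f_penalty = (0.5 * np.linalg.norm(P @ u_bar - d)**2
             + 0.5 * lam**2 * np.linalg.norm(A @ u_bar - q)**2)
```

Because u = A^{-1}q is a feasible candidate in the penalty minimization, f_penalty never exceeds f_reduced, and as lam grows u_bar approaches the exact PDE solution.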
Extend the search space [T. van Leeuwen and F. J. Herrmann, 2013] [T. van Leeuwen and F. J. Herrmann, 2015]
A larger number of degrees of freedom makes the objective more convex, with fewer local minima.
Extend the search space
Introduce the auxiliary variable u:
\pi_{\mathrm{post}}(u, m \mid d) \propto \pi(d \mid u, m)\, \pi(u, m), \quad \pi(u, m) = \pi(u \mid m)\, \pi_{\mathrm{prior}}(m)
Likelihood:
\pi(d \mid u, m) \propto \exp\big(-\tfrac{1}{2}\|Pu - d\|^2_{\Sigma_{\mathrm{noise}}^{-1}}\big)
Strong PDE constraint:
\pi(u \mid m) = \delta(A(m)u - q)
[Figure: the conditional \pi(u \mid m) as a delta function in A(m)u - q.]
Extend the search space
Weaken the PDE constraint: replace the delta function \delta(A(m)u - q) with a Gaussian,
\pi(u \mid m) \propto \exp\big(-\tfrac{\lambda^2}{2}\|A(m)u - q\|^2\big)
[Figure: the delta function relaxed to a Gaussian in A(m)u - q.]
Bayes w/ weak PDE constraints
Joint posterior PDF:
\pi_{\mathrm{post}}(u, m \mid d) \propto \exp\big(-\tfrac{1}{2}\|Pu - d\|^2_{\Sigma_{\mathrm{noise}}^{-1}} - \tfrac{\lambda^2}{2}\|A(m)u - q\|^2 - \tfrac{1}{2}\|m - m_{\mathrm{prior}}\|^2_{\Sigma_{\mathrm{prior}}^{-1}}\big)
This is a bi-Gaussian PDF!
Bayes w/ weak PDE constraints
Joint posterior PDF:
\pi_{\mathrm{post}}(u, m \mid d) \propto \exp\big(-\tfrac{1}{2}\|Pu - d\|^2_{\Sigma_{\mathrm{noise}}^{-1}} - \tfrac{\lambda^2}{2}\|A(m)u - q\|^2 - \tfrac{1}{2}\|m - m_{\mathrm{prior}}\|^2_{\Sigma_{\mathrm{prior}}^{-1}}\big)
For fixed m, the exponent involves only linear operators acting on u.
Conditional PDF [T. van Leeuwen and F. J. Herrmann, 2013]
For fixed m:
u \sim N(\bar{u}, H_u^{-1}) with
\bar{u} = H_u^{-1}\big(P^{\top} \Sigma_{\mathrm{noise}}^{-1} d + \lambda^2 A^{\top}(m)\, q\big)
H_u = P^{\top} \Sigma_{\mathrm{noise}}^{-1} P + \lambda^2 A^{\top}(m) A(m)
Bayes w/ weak PDE constraints
Joint posterior PDF:
\pi_{\mathrm{post}}(u, m \mid d) \propto \exp\big(-\tfrac{1}{2}\|Pu - d\|^2_{\Sigma_{\mathrm{noise}}^{-1}} - \tfrac{\lambda^2}{2}\|A(m)u - q\|^2 - \tfrac{1}{2}\|m - m_{\mathrm{prior}}\|^2_{\Sigma_{\mathrm{prior}}^{-1}}\big)
For fixed m: A(m)u = \omega^2\,\mathrm{diag}(m)\,u + Lu is linear in u.
For fixed u: A(m)u = \omega^2\,\mathrm{diag}(u)\,m + Lu is linear in m.
Conditional PDF [T. van Leeuwen and F. J. Herrmann, 2013]
For fixed u:
m \sim N(\bar{m}, H_m^{-1}) with
\bar{m} = H_m^{-1}\big(\lambda^2 \omega^2\, \mathrm{diag}(\mathrm{conj}(u))\,(q - Lu) + \Sigma_{\mathrm{prior}}^{-1} m_{\mathrm{prior}}\big)
H_m = \lambda^2 \omega^4\, \mathrm{diag}(\mathrm{conj}(u))\, \mathrm{diag}(u) + \Sigma_{\mathrm{prior}}^{-1}
H_m is a diagonal matrix.
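Because H_m is diagonal, the conditional mean and a sample of m cost only elementwise operations. A minimal sketch, assuming a real-valued wavefield (so conj(u) = u), a 1-D Laplacian L, and illustrative hyperparameters that are not the talk's actual settings:

```python
import numpy as np

rng = np.random.default_rng(1)
n, h = 100, 20.0
omega = 2 * np.pi * 5.0
lam = 1.0e2                        # penalty parameter (assumed)
sigma_prior = 1.0e-8               # prior std (assumed)
m_prior = np.full(n, 2.5e-7)       # prior mean squared slowness [s^2/m^2]

L = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2
u = rng.standard_normal(n)         # stand-in for the current wavefield sample
q = np.zeros(n)
q[0] = 1.0

# H_m = lam^2 omega^4 diag(u)^2 + sigma_prior^{-2} I is diagonal, so the
# conditional mean and one Gibbs draw are elementwise operations.
Hm = lam**2 * omega**4 * u * u + sigma_prior**-2
rhs = lam**2 * omega**2 * u * (q - L @ u) + m_prior / sigma_prior**2
m_mean = rhs / Hm
m_sample = m_mean + rng.standard_normal(n) / np.sqrt(Hm)
```

No linear solve, let alone a PDE solve, is needed for the m-update; the cost is O(n) per frequency and source.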
Gibbs sampling
[Figure: the (u, m) plane with the constraint curve A(m)u - q = 0.]
Gibbs sampling
[Figure: the sampling path alternates between u- and m-updates and stays in a band around the constraint curve where \|A(m)u - q\| = O(1/\lambda).]
Bayes w/ weak PDE constraints
Gibbs sampling:
1. start with m_0
2. for k = 1 : n_{\mathrm{smp}}
3.   compute \bar{u}(m_k) and H_u(m_k)   (main computational cost)
4.   draw u_k from N(\bar{u}(m_k), H_u^{-1}(m_k))
5.   compute \bar{m}(u_k) and H_m(u_k)
6.   draw m_{k+1} from N(\bar{m}(u_k), H_m^{-1}(u_k))
7. end
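The whole loop can be sketched end to end on a tiny 1-D toy problem. Every size and hyperparameter below is an illustrative assumption (not the slides' examples); the u-update follows the Gaussian conditional with a Cholesky factorization, and the m-update uses the diagonal Hessian.

```python
import numpy as np

rng = np.random.default_rng(3)
n, h, omega = 30, 20.0, 2 * np.pi * 5.0
lam, sig_n, sig_p = 1.0e2, 1.0e-2, 1.0e-7
Lap = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
       + np.diag(np.ones(n - 1), -1)) / h**2
P = np.eye(n)[::3]                          # receivers at every 3rd point
q = np.zeros(n)
q[0] = 1.0
m_prior = np.full(n, 2.5e-7)
m_true = m_prior * (1 + 0.1 * rng.standard_normal(n))
A_true = omega**2 * np.diag(m_true) + Lap
d = P @ np.linalg.solve(A_true, q) + sig_n * rng.standard_normal(P.shape[0])

m_k, samples = m_prior.copy(), []
for k in range(200):
    # u | m: Gaussian with Hessian H_u = P^T P / sig_n^2 + lam^2 A^T A.
    A = omega**2 * np.diag(m_k) + Lap
    Hu = P.T @ P / sig_n**2 + lam**2 * A.T @ A
    R = np.linalg.cholesky(Hu).T            # upper-triangular factor
    rhs_u = P.T @ d / sig_n**2 + lam**2 * A.T @ q
    u_bar = np.linalg.solve(R, np.linalg.solve(R.T, rhs_u))
    u_k = u_bar + np.linalg.solve(R, rng.standard_normal(n))
    # m | u: Gaussian with a diagonal Hessian, sampled elementwise.
    Hm = lam**2 * omega**4 * u_k * u_k + sig_p**-2
    rhs_m = lam**2 * omega**2 * u_k * (q - Lap @ u_k) + m_prior / sig_p**2
    m_k = rhs_m / Hm + rng.standard_normal(n) / np.sqrt(Hm)
    samples.append(m_k)
post_mean = np.mean(samples[100:], axis=0)  # discard burn-in
```

Per iteration there is exactly one factorization of H_u, which plays the role of the one PDE solve per source noted on the cost slide.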
Computational cost
Drawing u_k \sim N(\bar{u}(m_k), H_u^{-1}(m_k)):
Step 1: factorize H_u(m_k) = R^{\top} R   O(n_{\mathrm{grid}}^3)
Step 2: \bar{u}(m_k) = R^{-1} R^{-\top}\big(P^{\top} \Sigma_{\mathrm{noise}}^{-1} d + \lambda^2 A^{\top}(m_k)\, q\big)   O(n_{\mathrm{grid}}^2)
Step 3: u_k = \bar{u}(m_k) + R^{-1}\xi, \quad \xi \sim N(0, I)   O(n_{\mathrm{grid}}^2)
Cost per source = solving one PDE
Weak constraint vs. strong constraint
Weak constraint w/ Gibbs sampling method:
- joint bi-Gaussian distribution
- sparse Hessian matrix
- constructing the Hessian matrix does not require additional PDE solves
- samples are drawn in a straightforward manner from the Gaussian conditional distributions
Strong constraint w/ M-H-type method:
- no special structure
- dense Hessian matrix
- constructing the Hessian matrix requires additional PDE solves
- samples are drawn from a proposal distribution with an accept-reject criterion
- constructing the proposal distribution may require more PDE solves
Numerical example: transmission case
[Figure: true squared-slowness model [s^2/m^2] with sources (o) on one side and receivers (x) on the other.]
Model size: 1000 m x 1000 m
Grid size: 20 m x 20 m
Frequencies: [2, 6, 10, 14] Hz
Number of sources: 51
Number of receivers: 51
Number of samples: 1e6
Numerical example: transmission case
[Figure: prior mean model [s^2/m^2].]
\Sigma_{\mathrm{prior}} = \sigma_{\mathrm{prior}}^2 I, \quad \sigma_{\mathrm{prior}} = 1e-8 s^2/m^2
\Sigma_{\mathrm{noise}} = \sigma_{\mathrm{noise}}^2 I
\lambda = 1e4
Posterior mean and STD
[Figure: posterior mean model (left) and posterior standard deviation (right), in s^2/m^2.]
Distribution
[Figure: marginal posterior and prior PDFs of m [s^2/m^2] at three locations in the model (marked x).]
95% confidence interval
[Figure: 95% confidence interval of m [s^2/m^2] over the model.]
95% confidence interval
[Figure: posterior and prior 95% confidence intervals of m versus depth, compared with the true model, at three lateral positions.]
Samples
[Figure: Gibbs sample traces of m at two locations in the model.]
Numerical example: reflection case
[Figure: true squared-slowness model [s^2/m^2] with sources (o) and receivers (x) near the surface.]
Model size: 1500 m x 3000 m
Grid size: 30 m x 30 m
Frequencies: [2, 4, 6, 8] Hz
Number of sources: 1
Number of receivers: 11
Number of samples: 5e5
Numerical example: reflection case
[Figure: prior mean model [s^2/m^2].]
\Sigma_{\mathrm{prior}} = \sigma_{\mathrm{prior}}^2 I, \quad \sigma_{\mathrm{prior}} = 3e-8 s^2/m^2
\Sigma_{\mathrm{noise}} = \sigma_{\mathrm{noise}}^2 I, \quad \sigma_{\mathrm{noise}} = 1e1
\lambda = 1e3
Posterior mean and STD
[Figure: posterior mean model (left) and posterior standard deviation (right), in s^2/m^2.]
Distribution
[Figure: marginal posterior and prior PDFs of m [s^2/m^2] at three locations in the model (marked x).]
95% confidence interval
[Figure: 95% confidence interval of m [s^2/m^2] over the model.]
95% confidence interval
[Figure: posterior and prior 95% confidence intervals of m versus depth, compared with the true model, at three lateral positions.]
Samples
[Figure: Gibbs sample traces of m at two locations in the model.]
Conclusions
Posterior distribution for inverse problems with weak PDE constraints:
- a joint distribution with respect to the model parameters and the wavefields
- the conditional distributions are Gaussian with sparse covariance matrices
Gibbs sampling method:
- alternately sample the model parameters and the wavefields from the corresponding conditional distributions
Computational cost:
- one PDE solve per source per iteration
References
Albert Tarantola and Bernard Valette. "Inverse problems = quest for information." Journal of Geophysics 50(3):159-170, 1982.
Andrew M. Stuart, Jochen Voss, and Petter Wiberg. "Conditional path sampling of SDEs and the Langevin MCMC method." Communications in Mathematical Sciences 2(4):685-697, 2004.
James Martin, Lucas C. Wilcox, Carsten Burstedde, and Omar Ghattas. "A stochastic Newton MCMC method for large-scale statistical inverse problems with application to seismic inversion." SIAM Journal on Scientific Computing 34(3):A1460-A1487, 2012.
Jean Virieux and Stéphane Operto. "An overview of full-waveform inversion in exploration geophysics." Geophysics 74(6):WCC1-WCC26, 2009.
Tristan van Leeuwen and Felix J. Herrmann. "Mitigating local minima in full-waveform inversion by expanding the search space." Geophysical Journal International 195:661-667, 2013.
Tristan van Leeuwen and Felix J. Herrmann. "A penalty method for PDE-constrained optimization in inverse problems." Inverse Problems 32(1):015007, 2015.
Michael Fisher, M. Leutbecher, and G. A. Kelly. "On the equivalence between Kalman smoothing and weak-constraint four-dimensional variational data assimilation." Quarterly Journal of the Royal Meteorological Society 131(613):3235-3246, 2005.
Acknowledgements
This research was carried out as part of the SINBAD project with the support of the member organizations of the SINBAD Consortium.