Simulation based optimization


SimBOpt p.1/52 Simulation based optimization. Eldad Haber (haber@mathcs.emory.edu), Emory University. Feb 2005.

SimBOpt p.2/52 Outline
Introduction; a few words about discretization; the unconstrained framework; calculation of the gradient; getting a decent descent direction; globalization; summary.

SimBOpt p.3/52 Simulation and Optimization
The problem: $\min J(y, u)$ subject to $c(y, u) = 0$. Work within the discretize-optimize framework.

SimBOpt p.4/52 Discretize-Optimize
Optimize-Discretize can yield inconsistent gradients of the objective functional: the approximate gradient obtained this way is not a true gradient of anything, not of the continuous functional nor of the discrete functional.
Discretize-Optimize requires differentiating computational facilitators such as turbulence models, shock-capturing devices, or outflow boundary treatment. [M. Gunzburger]
We want to use the wealth of existing optimization algorithms.

SimBOpt p.5/52 Simulation and Optimization
Need to discretize the PDE (the constraint).
Parameters change, so the modeling needs to be flexible.
Need to optimize, so we need derivatives.

SimBOpt p.6/52 Discretizing c(y, u) = 0 - difficulties
Stability with respect to parameters: explicit vs implicit. Consider
$c(y,u) = y_t - u\,y_{xx}$
Explicit: $c_h(y_h, u_h) = \frac{y_h^{n+1} - y_h^n}{\delta t} - \frac{u_h}{\delta x^2}\,L\,y_h^n = 0$, with $L$ the second-difference stencil.

SimBOpt p.7/52 Discretizing c(y, u) = 0 - difficulties
Stability requires $u_h\,\delta t/\delta x^2 \le 1/2$, but if we do not know $u$ in advance it may be hard to guarantee stability; the code has to make sure the discretization is compatible. Implicit methods are unconditionally stable. (See the sketch below.)
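A minimal 1D sketch of the explicit vs implicit trade-off, assuming a standard second-difference Laplacian L with Dirichlet boundary conditions; all names and values are illustrative.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n, dx, dt = 100, 0.01, 1e-5        # grid size, mesh width, time step
u = 1.0                            # (scalar) diffusion parameter
L = sp.diags([1, -2, 1], [-1, 0, 1], shape=(n, n)) / dx**2

# Explicit forward Euler is only stable under a step-size restriction
# that depends on the unknown parameter u.
assert u * dt / dx**2 <= 0.5, "explicit step would be unstable"
y = np.random.rand(n)
y_explicit = y + dt * u * (L @ y)

# Implicit backward Euler is unconditionally stable but needs a solve.
I = sp.identity(n)
y_implicit = spla.spsolve((I - dt * u * L).tocsc(), y)
```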

SimBOpt p.8/52 Discretizing c(y, u) = 0 - difficulties
Differentiability of the discretization:
$c(y,u) = -\epsilon\,y_{xx} + u\,y_x = 0$
Common discretization, upwind:
$-\frac{\epsilon}{h^2}(y_{j+1} - 2y_j + y_{j-1}) + \frac{1}{h}\big(\max(u_j,0)(y_j - y_{j-1}) + \min(u_j,0)(y_{j+1} - y_j)\big) = 0$

SimBOpt p.9/52 Discretizing c(y, u) = 0 - difficulties
The continuous problem is continuously differentiable w.r.t. $u$, but the discrete problem is not differentiable w.r.t. $u_h$: the max/min switch in the upwind scheme is nonsmooth. How hard this is to deal with depends on the application.

SimBOpt p.10/52 The optimization problem - example
Example, the mother of all elliptic problems:
$-\nabla\cdot(u\,\nabla y) = q$
Finite volume discretization:
$A(u_h)\,y_h = D^\top \mathrm{diag}(N(u_h))\,D\,y_h = q_h$
Comment: $N(u)$ is a harmonic averaging of $u$ onto the cell faces.
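A 1D sketch of this discretization, assuming homogeneous Dirichlet boundary conditions; function names are illustrative. In this nodal 1D setting the cell values of u sit directly on the differences, while harmonic_average only illustrates what N(u) does on a cell-centered grid.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def harmonic_average(u):
    """N(u): harmonic average of adjacent cell values onto interior faces."""
    return 2.0 * u[:-1] * u[1:] / (u[:-1] + u[1:])

def solve_forward(u, q, dx):
    """Solve A(u) y = D^T diag(u) D y = q, a 1D analogue of the finite
    volume system for -(u y')' = q with homogeneous Dirichlet BC.
    u: conductivity on n cells; y, q: values on the n-1 interior nodes."""
    n = len(u)
    # D maps interior nodal values to per-cell differences (y_i - y_{i-1})/dx
    D = sp.diags([np.full(n - 1, 1.0), np.full(n - 1, -1.0)],
                 [0, -1], shape=(n, n - 1)) / dx
    A = (D.T @ sp.diags(u) @ D).tocsc()
    return spla.spsolve(A, q)
```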

SimBOpt p.11/52 The optimization problem
Constrained approach: solve $\min J(y,u)$ subject to $c(y,u) = 0$ (e.g. $A(u)y - q = 0$).
Unconstrained approach: eliminate $y$ to obtain $\min_u J(y(u),u) = J(A(u)^{-1}q,\,u)$.

SimBOpt p.12/52 Constrained vs Unconstrained
Constrained approach: a saddle point problem, algorithmically hard, but no need to solve the constraints.
Unconstrained approach: simple from an optimization standpoint, but we need to solve the constraint equation (the PDE), which becomes even messier for nonlinear PDEs.

SimBOpt p.13/52 Constrained vs Unconstrained (figure)

SimBOpt p.14/52 Derivatives - Unconstrained approach
In the inverse problem, minimize $J(y(u), u)$.
Linearize: $y(u+s) \approx y(u) + \underbrace{\tfrac{\partial y}{\partial u}\,s}_{\delta y}$
Need to compute the sensitivities $C = \frac{\partial y}{\partial u}$.

SimBOpt p.15/52 Computing Derivatives
Rewrite the constraint: $0 = c(y+\delta y,\, u+s) \approx c_y\,\delta y + c_u\,s = \big(c_y \underbrace{\tfrac{\partial y}{\partial u}}_{C} + c_u\big)\,s$
Therefore $C = \frac{\partial y}{\partial u} = -c_y^{-1} c_u$.

SimBOpt p.16/52 Computing Derivatives - example
With $c(y,u) = A(u)y - q = D^\top \mathrm{diag}(u)\,D\,y - q$:
$c_y = A(u)$
$c_u = \frac{\partial}{\partial u}\big(D^\top \mathrm{diag}(u)\,D\,y\big) = \frac{\partial}{\partial u}\big(D^\top \mathrm{diag}(Dy)\,u\big) = D^\top \mathrm{diag}(Dy)$
Then $C = -A(u)^{-1} D^\top \mathrm{diag}(Dy)$.

SimBOpt p.17/52 The sensitivities
$C = \frac{\partial y}{\partial u} = -c_y^{-1} c_u = -A(u)^{-1} D^\top \mathrm{diag}(Dy)$
$c_y$ is a discretized (linearized) PDE, and $c_y^{-1}$ is (usually) dense, so do not compute $C$ directly. Whenever $Cv$ is needed, use $w = Cv = -c_y^{-1} c_u v$, i.e. solve $c_y w = -c_u v$ (see the sketch below).
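A minimal sketch of these matrix-free sensitivity products, assuming $c_y$ and $c_u$ are available as scipy sparse matrices (hypothetical names); each product costs one sparse solve.

```python
import scipy.sparse.linalg as spla

def sens_matvec(cy, cu, v):
    """w = C v = -c_y^{-1} (c_u v): one (linearized) forward solve."""
    return -spla.spsolve(cy.tocsc(), cu @ v)

def sens_rmatvec(cy, cu, w):
    """v = C^T w = -c_u^T c_y^{-T} w: one adjoint solve."""
    return -(cu.T @ spla.spsolve(cy.T.tocsc(), w))
```

If many products with the same $c_y$ are needed, a single sparse factorization (e.g. spla.splu) can be reused across solves.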

SimBOpt p.18/52 Computing the gradient
The optimization problem: $\min_u J(y(u),u)$
Gradient, by the chain rule: $\nabla_u J(y(u),u) = \big(\tfrac{\partial y}{\partial u}\big)^\top J_y + J_u = C^\top J_y + J_u$

SimBOpt p.19/52 Computing the gradient
Gradient: $g(u) = \nabla_u J(y(u),u) = C^\top J_y + J_u$
To compute the gradient we need $w = C^\top J_y = -c_u^\top c_y^{-\top} J_y$:
Solve the adjoint problem $c_y^\top z = J_y$
Set $w = -c_u^\top z$
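A sketch of the adjoint gradient computation under the same assumptions (sparse $c_y$, $c_u$; $J_y$, $J_u$ given as vectors); one adjoint solve yields the whole gradient, independent of the number of parameters.

```python
import scipy.sparse.linalg as spla

def gradient(cy, cu, Jy, Ju):
    """g(u) = C^T J_y + J_u via the adjoint method:
    solve c_y^T z = J_y, then C^T J_y = -c_u^T z."""
    z = spla.spsolve(cy.T.tocsc(), Jy)
    return -(cu.T @ z) + Ju
```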

SimBOpt p.20/52 Optimization algorithms
The optimization problem: $\min_u J(y(u),u)$
Optimization algorithms framework:
Guess $u_0$
While not converged:
  Evaluate $J(u_k)$, $g(u_k)$ and an approximation $B(u_k)$ to the Hessian
  Compute $\delta u = -B(u_k)^{-1}\,g(u_k)$
  Take a step $u_{k+1} = u_k + \alpha\,\delta u$, with $0 < \alpha \le 1$
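The framework as a short Python sketch; solve_B abstracts the choice of B discussed on the next slide (identity, Newton, quasi-Newton), and all names are illustrative.

```python
import numpy as np

def minimize(J, grad, solve_B, u0, tol=1e-6, max_iter=100):
    """Generic descent framework: du = -B(u)^{-1} g(u), damped step."""
    u = u0.copy()
    for k in range(max_iter):
        g = grad(u)
        if np.linalg.norm(g) < tol:
            break
        du = solve_B(u, -g)       # apply B(u)^{-1} to -g
        alpha = 1.0               # a globalization strategy picks 0 < alpha <= 1
        u = u + alpha * du
    return u
```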

SimBOpt p.21/52 Getting a decent descent direction
In a nutshell, the difference between optimization algorithms is the choice of B:
steepest descent: $B = I$
Newton: $B = J_{uu}$, the full Hessian
Quasi-Newton: use $[g_{k-j}, g_{k-j+1}, \dots, g_{k-1}]$ and $[s_{k-j}, s_{k-j+1}, \dots, s_{k-1}]$ to construct an approximation to the Hessian

SimBOpt p.22/52 More about the Newton direction
Need to compute the Hessian:
$H = \frac{\partial g(u)}{\partial u} = \frac{\partial}{\partial u}\big(C^\top J_y\big) + J_{uu}$
Evaluating the second term is usually easy.

SimBOpt p.23/52 More about the Newton direction
$H = \frac{\partial}{\partial u}\big(C^\top J_y\big) + J_{uu}$
To evaluate the first term, use the chain rule: $\frac{\partial J_y}{\partial u} = J_{yy}\,C$

SimBOpt p.24/52 More about the Newton direction
Gauss-Newton family: ignore the dependency of C on u, giving
$H \approx C^\top J_{yy}\,C + J_{uu}$
If $J_{yy}$ and $J_{uu}$ are SPD, then $H$ is SPD.

SimBOpt p.25/52 Computing the GN direction
Need to solve $(C^\top J_{yy} C + J_{uu})\,\delta u = -g(u)$
The problem is large, so the natural choice is the Conjugate Gradient method. Each CG iteration multiplies with $C$ and $C^\top$, requiring one forward and one adjoint solve.
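A matrix-free sketch of the Gauss-Newton CG solve; the *_mv arguments are user-supplied matvec callables (hypothetical names) wrapping the forward/adjoint solves above, and truncating CG yields the inexact direction of the next slide.

```python
import scipy.sparse.linalg as spla

def gn_direction(n_u, C_mv, CT_mv, Jyy_mv, Juu_mv, g, n_cg=10):
    """Solve (C^T J_yy C + J_uu) du = -g with CG, never forming the
    Hessian: each matvec costs one forward (C_mv) and one adjoint
    (CT_mv) solve; maxiter truncates CG (inexact Gauss-Newton)."""
    H = spla.LinearOperator(
        (n_u, n_u),
        matvec=lambda v: CT_mv(Jyy_mv(C_mv(v))) + Juu_mv(v))
    du, info = spla.cg(H, -g, maxiter=n_cg)
    return du
```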

SimBOpt p.26/52 Computing the GN direction
Need to solve $(C^\top J_{yy} C + J_{uu})\,\delta u = -g(u)$
Cost per iteration: $(\#\mathrm{ITER}_{CG} + 1) \times (\mathrm{COST}_{\mathrm{FORWARD}} + \mathrm{COST}_{\mathrm{ADJOINT}})$
Typically we do not solve the system to high tolerance (inexact Gauss-Newton).

SimBOpt p.27/52 Computing the GN direction
Need to solve $(C^\top J_{yy} C + J_{uu})\,\delta u = -g(u)$
Open question: preconditioning? Use Quasi-Newton approximate Hessians as preconditioners [Nocedal, Haber, Bradsly & Vogel, Newman & Boggs, ...]. Some problem-dependent preconditioners [Mackie, Vogel, Farquharson, ...]. Waiting for the big break.

SimBOpt p.28/52 More about Quasi-Newton
Use previous gradients and descent directions, $[g_{k-j}, \dots, g_{k-1}]$ and $[s_{k-j}, \dots, s_{k-1}]$, to construct an approximation to the Hessian.
Basic idea, a Taylor expansion: $B_k\,(s_k - s_{k-1}) = g_k - g_{k-1}$
Given $s_k - s_{k-1}$ and $g_k - g_{k-1}$, update $B_k$.

SimBOpt p.29/52 More about Quasi-Newton
Cheap: no extra PDEs to solve. Very effective for some problems. Most popular: LBFGS, DFP. Recent active research on application and improvement [Bradsly & Vogel, Navon, Haber, ...]. (A BFGS update sketch follows.)
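A minimal dense BFGS update satisfying the secant condition above, with s the step difference and y the gradient difference; L-BFGS stores only the recent pairs instead of a dense B, but the idea is the same.

```python
import numpy as np

def bfgs_update(B, s, y):
    """BFGS update of the Hessian approximation so that B_new s = y.
    Skips the update when the curvature s^T y is not safely positive."""
    sy = s @ y
    if sy <= 1e-12 * np.linalg.norm(s) * np.linalg.norm(y):
        return B
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / sy
```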

SimBOpt p.30/52 Globalization
Make sure that $J(u_{k+1}) < J(u_k)$:
line search: approximately $\min_\alpha J(u_k + \alpha s)$
trust region: approximately $\min_w J(u_k + w)$ s.t. $w \in \mathrm{span}[s, g(u)]$, $\|w\| \le \Delta$
homotopy: solve a sequence of problems $g(u, \alpha_k) = 0$

SimBOpt p.31/52 Globalization
Every backtracking iteration requires the solution of a PDE, so it is important to get the most we can from each step.
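An Armijo backtracking sketch making the PDE cost explicit: every trial value of alpha triggers one objective evaluation, i.e. one forward solve. All names are illustrative.

```python
def armijo(J, u, g, du, alpha0=1.0, c1=1e-4, shrink=0.5, max_tries=10):
    """Backtrack until J(u + a du) <= J(u) + c1 * a * g^T du.
    Each call to J costs a PDE solve, so few, well-chosen trials matter."""
    J0, slope = J(u), g @ du
    alpha = alpha0
    for _ in range(max_tries):
        if J(u + alpha * du) <= J0 + c1 * alpha * slope:
            break
        alpha *= shrink
    return alpha
```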

SimBOpt p.32/52 Grid Sequencing
The problems we solve have an underlying continuous structure; use this structure for continuation. Main idea: the solution of the problem on a coarse grid can approximate the solution on a fine grid. Use coarse grids to evaluate parameters within the optimization. [Burger, Ascher & Haber; Haber & Modersitzki; Haber, Moré (see talk)]

SimBOpt p.33/52 Algorithm
Solve the optimization problem on a coarse grid H. Refine the grid to a fine grid h. Interpolate the solution from H to h and use it as the initial guess. In many cases grid continuation is sufficient for global convergence, although there is no proof that this is always the case.
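The grid-sequencing loop as a sketch; solve_on_grid and interpolate are placeholders for the optimization driver and the coarse-to-fine interpolation $I_H^h$.

```python
def grid_continuation(grids, solve_on_grid, interpolate, u0):
    """Solve on the coarsest grid first, then warm-start each finer grid
    with the interpolated coarse solution."""
    u = solve_on_grid(grids[0], u0)
    for coarse, fine in zip(grids, grids[1:]):
        u = solve_on_grid(fine, interpolate(u, coarse, fine))
    return u
```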

SimBOpt p.34/52 Application: Impedance Tomography Joint project with R. Knight and A. Pidlovski, Stanford Environmental Geophysics Group

SimBOpt p.35/52 Application: Impedance Tomography (figure)

SimBOpt p.36/52 Application: Impedance Tomography (figure: reference potential electrode, cone-mounted potential electrode, permanent current electrodes)

SimBOpt p.37/52 Application: Impedance Tomography (figure)

SimBOpt p.38/52 The mathematical problem
The constraint (PDE, with some BC):
$c(y,u) = \nabla\cdot(\exp(u)\,\nabla y) - q_j = 0, \quad j = 1,\dots,k$
The objective function:
$\min\ \underbrace{\tfrac{1}{2}\,\|Q\,(y(u) - y^{\mathrm{obs}})\|^2}_{\text{misfit}} + \underbrace{\alpha}_{\text{reg.\ param.}}\,\underbrace{R(u)}_{\text{regularization}}$
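A sketch of evaluating this Tikhonov-type objective; forward, Q, and R are placeholders for the stacked multi-source PDE solve, the measurement operator, and the regularizer.

```python
def objective(u, forward, Q, y_obs, alpha, R):
    """J(u) = 0.5 ||Q (y(u) - y_obs)||^2 + alpha R(u)."""
    r = Q @ (forward(u) - y_obs)       # data residual over all k sources
    return 0.5 * float(r @ r) + alpha * R(u)
```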

SimBOpt p.39/52 Discretization (figure)

SimBOpt p.40/52 Discretization
Use $128 \times 128 \times 64$ cells. # of states = k × # of controls. In practical experiments $k \approx 10$ to $1000$.

SimBOpt p.41/52 The discrete mathematical problem
The constraint (PDE):
$c_h(y_h,u_h) = A(u_h)\,y_h - q_h = D^T S(u_h)\,D\,y_h - q_h = 0$
The objective function:
$\min\ \underbrace{\tfrac{1}{2}\,\|Q\,(A(u_h)^{-1}q_h - y^{\mathrm{obs}})\|^2}_{\text{misfit}} + \underbrace{\alpha}_{\text{reg.\ param.}}\,\underbrace{R(u_h)}_{\text{regularization}}$

SimBOpt p.42/52 The Data - 63 sources (figure)

SimBOpt p.43/52 The Inversion (figure)

SimBOpt p.44/52 Computational Cost

    alpha     misfit      Total iterations      Forward solves
                          IGN    QN    PIGN     IGN    QN    PIGN
    10^-5     6x10^-2     11     11    11       89     28    46
    10^-6     4x10^-2     16     25    16       112    52    68
    10^-7     2x10^-2     17     36    17       131    78    79
    10^-8     8x10^-3     21     59    21       158    131   108

IGN: inexact Gauss-Newton. QN: Quasi-Newton. PIGN: QN preconditioner applied to IGN.

SimBOpt p.45/52 Application - Image Registration
Joint work with J. Modersitzki, Lübeck, Germany.
Given a template image $T(x) = T(x_1,x_2,x_3)$ and a reference image $R(x) = R(x_1,x_2,x_3)$, find a transformation $u(x) = [u(x), v(x), w(x)]$ such that $T(x + u(x)) \approx R(x)$.

SimBOpt p.46/52 Example I (figure: R, T; $\|T_0 - R\| = 0.24784 = 100.00\%$; animation: HeadSpin)

SimBOpt p.47/52 Example - ML (figure: R, T; $\|T_0 - R\| = 0.24784 = 100.00\%$)

SimBOpt p.48/52 Example - ML (figure: R, T, $T_6$; $\|T_0 - R\| = 1.666 = 100.00\%$, $\|T_1 - R\| = 0.29452 = 17.68\%$)

SimBOpt p.49/52 Example - ML (figure: R, T, $T_3$; $\|T_0 - R\| = 1.1637 = 100.00\%$, $\|T_1 - R\| = 0.17017 = 14.62\%$)

SimBOpt p.50/52 Example - ML (figure: R, T, $T_3$; $\|T_0 - R\| = 0.75664 = 100.00\%$, $\|T_1 - R\| = 0.12648 = 16.72\%$)

SimBOpt p.51/52 Example - ML (figure: R, T, $T_4$; $\|T_0 - R\| = 0.45381 = 100.00\%$, $\|T_1 - R\| = 0.10713 = 23.61\%$)

SimBOpt p.52/52 Summary
Introduction; discretization of PDEs; the unconstrained framework; calculation of the gradient; getting a decent descent direction; globalization; summary.