Simulation-based optimization
Eldad Haber, haber@mathcs.emory.edu, Emory University
February 2005
Outline
- Introduction
- A few words about discretization
- The unconstrained framework
- Calculation of the gradient
- Getting a decent descent direction
- Globalization
- Summary
Simulation and Optimization
The problem:
    $\min\ J(y, u)$ subject to $c(y, u) = 0$
Work within the discretize-optimize framework.
Discretize-Optimize
Optimize-discretize: can yield inconsistent gradients of the objective functionals. The approximate gradient obtained this way is not a true gradient of anything: neither of the continuous functional nor of the discrete functional.
Discretize-optimize: requires differentiating computational facilitators such as turbulence models, shock-capturing devices, or outflow boundary treatment. (M. Gunzburger)
We want to use the wealth of existing optimization algorithms.
Simulation and Optimization
- Need to discretize the PDE (the constraint)
- Parameters change, so the modeling needs to be flexible
- Need to optimize, so we need derivatives
Discretizing $c(y, u) = 0$: difficulties
Stability with respect to parameters; explicit vs. implicit. Consider
    $c(y, u) = y_t - u\, y_{xx}$
Explicit discretization ($L$ the second-difference matrix):
    $c_h(y_h, u_h) = y_h^{n+1} - y_h^n - \frac{u_h\, \delta t}{\delta x^2}\, L\, y_h^n = 0$
Discretizing $c(y, u) = 0$: difficulties
Stability requires $u_h\, \delta t / \delta x^2 \le 1/2$, but if we do not know $u$ in advance it may be hard to guarantee stability. The code has to make sure the discretization is compatible. Implicit methods are unconditionally stable. (See the sketch below.)
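A minimal sketch (mine, not the talk's) of this stability issue for the 1D heat equation: the explicit step blows up once $u\, \delta t / \delta x^2 > 1/2$, while backward Euler stays stable. All names and values (n, dx, dt, u) are illustrative.

```python
import numpy as np

n, dx, dt = 50, 1.0 / 50, 1e-3
u = 2.0                                   # coefficient we may not know in advance

# discrete Laplacian L (homogeneous Dirichlet BC): y_xx ~ (L @ y) / dx^2
L = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)

y = np.exp(-100 * (np.linspace(0, 1, n) - 0.5) ** 2)   # initial condition

r = u * dt / dx**2
print("u*dt/dx^2 =", r, "->", "stable" if r <= 0.5 else "UNSTABLE")

y_explicit = y + r * (L @ y)                         # explicit (forward Euler) step
y_implicit = np.linalg.solve(np.eye(n) - r * L, y)   # implicit step: costs a solve,
                                                     # but unconditionally stable
```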
Discretizing $c(y, u) = 0$: difficulties
Differentiability of the discretization. Consider
    $c(y, u) = -\epsilon\, y_{xx} + u\, y_x = 0$
A common discretization, upwind:
    $-\frac{\epsilon}{h^2}\,(y_{j+1} - 2 y_j + y_{j-1}) + \frac{1}{h}\,\big(\max(u_j, 0)\,(y_j - y_{j-1}) + \min(u_j, 0)\,(y_{j+1} - y_j)\big) = 0$
Discretizing $c(y, u) = 0$: difficulties
- The continuous problem is continuously differentiable w.r.t. $u$
- The discrete problem is not differentiable w.r.t. $u_h$ (the max/min switch in the upwinding)
- How much this matters depends on the application; it can be hard to deal with (see the sketch below)
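A small sketch (my own, not from the slides) making the kink explicit: the upwind residual at node $j$ has different one-sided derivatives in $u_j$ at $u_j = 0$. The nodal values are illustrative.

```python
import numpy as np

eps, h = 0.01, 0.1
y_jm1, y_j, y_jp1 = 0.0, 0.3, 1.0          # illustrative nodal values

def residual(u):
    # upwind residual at node j, as on the previous slide
    diffusion = -eps / h**2 * (y_jp1 - 2 * y_j + y_jm1)
    advection = (max(u, 0) * (y_j - y_jm1) + min(u, 0) * (y_jp1 - y_j)) / h
    return diffusion + advection

d = 1e-6
print((residual(d) - residual(0)) / d)     # right slope: (y_j - y_jm1)/h = 3.0
print((residual(0) - residual(-d)) / d)    # left slope:  (y_jp1 - y_j)/h = 7.0
```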
The optimization problem: example
Example, the mother of all elliptic problems:
    $-\nabla \cdot (u\, \nabla y) = q$
Finite volume discretization:
    $A(u_h)\, y_h = D^\top \mathrm{diag}(N(u_h))\, D\, y_h = q_h$
Comment: $N(u)$ denotes harmonic averaging of $u$ onto the cell faces.
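A 1D sketch of this operator (my simplifications for boundary conditions and scaling, not the talk's code): $A(u) = D^\top \mathrm{diag}(N(u))\, D$ with harmonic averaging of the cell coefficient onto faces.

```python
import numpy as np

n, h = 64, 1.0 / 64
u = 1.0 + np.random.rand(n)                # cell-centered coefficient

# D: difference matrix mapping cell values to interior faces, (D y)_i = (y_{i+1} - y_i)/h
D = (np.eye(n - 1, n, k=1) - np.eye(n - 1, n)) / h

def N(u):
    # harmonic average of the two cells sharing each face
    return 2.0 * u[:-1] * u[1:] / (u[:-1] + u[1:])

A = D.T @ np.diag(N(u)) @ D                # A(u_h); singular until BCs are imposed
q = h * np.ones(n)
A[0, :] = 0.0; A[0, 0] = 1.0; q[0] = 0.0   # crude stand-in for real boundary conditions
y = np.linalg.solve(A, q)
```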
The optimization problem
Constrained approach, solve
    $\min\ J(y, u)$ subject to $c(y, u) = 0$ (e.g. $A(u)\, y - q = 0$)
Unconstrained approach, eliminate $y$ to obtain
    $\min\ J(y(u), u) = J(A(u)^{-1} q,\, u)$
Constrained vs. Unconstrained
Constrained approach:
- Saddle point problem, algorithmically hard
- No need to solve the constraints
Unconstrained approach:
- Simple from an optimization standpoint
- Need to solve the constraint equation (the PDE)
- Becomes even messier for nonlinear PDEs
Constrained vs. Unconstrained [figure]
Derivatives: the unconstrained approach
In the inverse problem, minimize $J(y(u), u)$. Linearize:
    $y(u + s) \approx y(u) + \underbrace{\frac{\partial y}{\partial u}\, s}_{\delta y}$
Need to compute the sensitivities $C = \frac{\partial y}{\partial u}$.
Computing Derivatives
Rewrite the constraint:
    $0 = c(y + \delta y,\, u + s) \approx c_y\, \delta y + c_u\, s = \big(c_y \underbrace{\tfrac{\partial y}{\partial u}}_{C} + c_u\big)\, s$
Therefore
    $C = \frac{\partial y}{\partial u} = -c_y^{-1}\, c_u$
Computing Derivatives: example
For
    $c(y, u) = A(u)\, y - q = D^\top \mathrm{diag}(u)\, D\, y - q$
we have
    $c_y = A(u)$
    $c_u = \frac{\partial}{\partial u}\big(D^\top \mathrm{diag}(u)\, D\, y\big) = \frac{\partial}{\partial u}\big(D^\top \mathrm{diag}(D y)\, u\big) = D^\top \mathrm{diag}(D y)$
Then
    $C = -A(u)^{-1}\, D^\top \mathrm{diag}(D y)$
The sensitivities
    $C = \frac{\partial y}{\partial u} = -c_y^{-1}\, c_u = -A(u)^{-1}\, D^\top \mathrm{diag}(D y)$
- $c_y$ is a discretized (linearized) PDE
- $c_y^{-1}$ is (usually) dense
- Do not compute $C$ directly; whenever $C v$ is needed, use $w = C v = -c_y^{-1} c_u\, v$, i.e. solve $c_y\, w = -c_u\, v$ (see the sketch below)
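A matrix-free sketch of this recipe (mine, not the talk's code), assuming $c_y$ and $c_u$ are available as SciPy sparse matrices; in practice the spsolve calls would be your forward and adjoint PDE solvers.

```python
from scipy.sparse.linalg import spsolve

def C_matvec(c_y, c_u, v):
    """w = C v = -c_y^{-1} (c_u v): one forward(-type) solve per product."""
    return -spsolve(c_y.tocsc(), c_u @ v)

def CT_matvec(c_y, c_u, v):
    """w = C^T v = -c_u^T (c_y^{-T} v): one adjoint solve per product."""
    return -(c_u.T @ spsolve(c_y.T.tocsc(), v))
```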
Computing the gradient
The optimization problem:
    $\min_u\ J(y(u), u)$
Gradients: use the chain rule,
    $\nabla_u J(y(u), u) = \Big(\frac{\partial y}{\partial u}\Big)^{\!\top} J_y + J_u = C^\top J_y + J_u$
Computing the gradient
    $g(u) = \nabla_u J(y(u), u) = C^\top J_y + J_u$
To compute the gradient we need $w = C^\top J_y = -c_u^\top\, c_y^{-\top}\, J_y$:
- Solve the adjoint problem $c_y^\top z = J_y$
- Set $w = -c_u^\top z$
(A sketch follows.)
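The same idea as a sketch (assumed sparse $c_y$, $c_u$ and precomputed vectors $J_y$, $J_u$; names are mine):

```python
from scipy.sparse.linalg import spsolve

def gradient(c_y, c_u, J_y, J_u):
    """g = C^T J_y + J_u at the cost of a single adjoint solve."""
    z = spsolve(c_y.T.tocsc(), J_y)   # adjoint problem: c_y^T z = J_y
    return -(c_u.T @ z) + J_u
```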
Optimization algorithms
The optimization problem: $\min_u\ J(y(u), u)$
Optimization algorithms, the framework:
    Guess $u_0$
    while not converged:
        Evaluate $J(u_k)$, $g(u_k)$, and an approximation $B(u_k)$ to the Hessian
        Compute $\delta u = -B(u_k)^{-1}\, g(u_k)$
        Take a step $u_{k+1} = u_k + \alpha\, \delta u$, with $0 < \alpha \le 1$
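A bare-bones rendering of this framework (a sketch; evaluate_J_and_g and hessian_approx are placeholders for problem-specific code):

```python
import numpy as np

def minimize_reduced(u, evaluate_J_and_g, hessian_approx, tol=1e-6, max_iter=50):
    for k in range(max_iter):
        J, g = evaluate_J_and_g(u)        # each evaluation hides PDE solve(s)
        if np.linalg.norm(g) < tol:
            break
        B = hessian_approx(u)             # I, the Hessian, or a quasi-Newton matrix
        du = np.linalg.solve(B, -g)       # delta_u = -B^{-1} g
        alpha = 1.0                       # 0 < alpha <= 1, chosen by globalization
        u = u + alpha * du
    return u
```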
Getting a decent descent direction
In a nutshell, the difference between optimization algorithms is how to choose $B$:
- Steepest descent: $B = I$
- Newton: $B = J_{uu}$, the Hessian
- Quasi-Newton: use $[g_{k-j}, g_{k-j+1}, \ldots, g_{k-1}]$ and $[s_{k-j}, s_{k-j+1}, \ldots, s_{k-1}]$ to construct an approximation to the Hessian
More about the Newton direction
Need to compute the Hessian:
    $H = \frac{\partial g(u)}{\partial u} = \frac{\partial}{\partial u}\big(C^\top J_y\big) + J_{uu}$
Evaluating the second term is usually easy.
More about the Newton direction
    $H = \frac{\partial g(u)}{\partial u} = \frac{\partial}{\partial u}\big(C^\top J_y\big) + J_{uu}$
To evaluate $\frac{\partial}{\partial u}\big(C^\top J_y\big)$, use the chain rule:
    $\frac{\partial J_y}{\partial u} = J_{yy}\, C$
More about the Newton direction
    $H = \frac{\partial g(u)}{\partial u} = \frac{\partial}{\partial u}\big(C^\top J_y\big) + J_{uu}$
Gauss-Newton family: ignore the dependence of $C$ on $u$,
    $H \approx C^\top J_{yy}\, C + J_{uu}$
If $J_{yy}$ and $J_{uu}$ are SPD, then $H$ is SPD.
Computing the GN direction
Need to solve
    $(C^\top J_{yy}\, C + J_{uu})\, \delta u = -g(u)$
The problem is large, so the natural choice is Conjugate Gradient. Each CG iteration multiplies by $C^\top v$ and $C w$, requiring one forward and one adjoint solve.
Computing the GN direction
Need to solve
    $(C^\top J_{yy}\, C + J_{uu})\, \delta u = -g(u)$
Cost per iteration:
    $(\#\mathrm{ITER}_{CG} + 1) \times (\mathrm{COST}_{\mathrm{FORWARD}} + \mathrm{COST}_{\mathrm{ADJOINT}})$
Typically we do not solve the system to high tolerance (inexact Gauss-Newton); a sketch follows.
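A matrix-free sketch of the inexact Gauss-Newton step (my construction; C_matvec and CT_matvec are assumed to be one-argument closures around the forward and adjoint solves, e.g. the earlier helpers with $c_y$, $c_u$ bound, and J_yy, J_uu are SPD operators):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def gauss_newton_step(C_matvec, CT_matvec, J_yy, J_uu, g, maxiter=20):
    n = g.size
    def H_matvec(v):
        # (C^T J_yy C + J_uu) v: one forward + one adjoint solve per product
        return CT_matvec(J_yy @ C_matvec(v)) + J_uu @ v
    H = LinearOperator((n, n), matvec=H_matvec)
    # a small maxiter / loose tolerance is the "inexact" in inexact Gauss-Newton
    du, info = cg(H, -g, maxiter=maxiter)
    return du
```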
Computing the GN direction
Need to solve
    $(C^\top J_{yy}\, C + J_{uu})\, \delta u = -g(u)$
Open question: preconditioning?
- Use quasi-Newton approximate Hessians as preconditioners [Nocedal, Haber, Bardsley & Vogel, Newman & Boggs, ...]
- Some problem-dependent preconditioners [Mackie, Vogel, Farquharson, ...]
- Waiting for the big break
More about Quasi-Newton
Use previous gradients and descent directions, $[g_{k-j}, g_{k-j+1}, \ldots, g_{k-1}]$ and $[s_{k-j}, s_{k-j+1}, \ldots, s_{k-1}]$, to construct an approximation to the Hessian.
Basic idea, Taylor expansion:
    $B_k\, (s_k - s_{k-1}) = g_k - g_{k-1}$
Given $s_k - s_{k-1}$ and $g_k - g_{k-1}$, update $B_k$.
More about Quasi-Newton
- Cheap: no extra PDE solves needed
- Very effective for some problems
- Most popular: L-BFGS, DFP
- Recent active research on application and improvement [Bardsley & Vogel, Navon, Haber, ...]
(Sketch below.)
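In practice one rarely hand-codes the update; a sketch on a stand-in objective (the real $J$ would hide the forward and adjoint solves inside it):

```python
import numpy as np
from scipy.optimize import minimize

def J_and_grad(u):
    # stand-in smooth objective and its gradient
    J = 0.5 * np.dot(u, u) + np.sum(np.cos(u))
    g = u - np.sin(u)
    return J, g

res = minimize(J_and_grad, x0=np.ones(100), jac=True, method="L-BFGS-B",
               options={"maxcor": 10})     # number of stored (s, g) update pairs
u_opt = res.x
```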
Globalization
Make sure that $J(u_{k+1}) < J(u_k)$:
- Line search: approximately $\min_\alpha\ J(u_k + \alpha s)$
- Trust region: approximately $\min\ J(u_k + w)$ s.t. $w \in \mathrm{span}\{s, g(u)\}$, $\|w\| \le \Delta$
- Homotopy: solve a sequence of problems $g(u, \alpha_k) = 0$
Globalization
Every backtracking iteration requires the solution of a PDE, so it is important to get the most we can from each step. (A sketch follows.)
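A standard Armijo backtracking loop (a generic sketch, not the talk's code); each trial evaluation of $J$ costs a PDE solve, which is why the step should be made to count:

```python
import numpy as np

def backtrack(J, u, du, J_u, g, c1=1e-4, shrink=0.5, max_tries=10):
    """Return a step length alpha satisfying sufficient decrease."""
    alpha, slope = 1.0, np.dot(g, du)      # slope < 0 for a descent direction
    for _ in range(max_tries):
        if J(u + alpha * du) <= J_u + c1 * alpha * slope:
            return alpha
        alpha *= shrink                    # every retry costs another PDE solve
    return alpha
```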
Grid Sequencing
The problems we solve have an underlying continuous structure; use this structure for continuation.
Main idea: the solution of the problem on a coarse grid can approximate the problem on a fine grid. Use coarse grids to evaluate parameters within the optimization.
[Burger, Ascher & Haber; Haber & Modersitzki; Haber & Moré (see talk)]
Algorithm
- Solve the optimization problem on a coarse grid $H$
- Refine the grid to a fine grid $h$
- Interpolate the solution from $H$ to $h$ ($I_H^h$) and use it as the initial guess
In many cases grid continuation is sufficient for global convergence, but there is no proof that this is always the case. (Sketch below.)
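The continuation loop as a sketch (solve_on_grid and interpolate stand for the optimizer and the prolongation $I_H^h$; both are placeholders):

```python
def grid_continuation(grids, solve_on_grid, interpolate, u0):
    """grids is ordered coarse to fine; u0 is the coarsest-grid initial guess."""
    u = u0
    for coarse, fine in zip(grids[:-1], grids[1:]):
        u = solve_on_grid(coarse, u)       # optimize on grid H
        u = interpolate(u, coarse, fine)   # I_H^h u: initial guess on grid h
    return solve_on_grid(grids[-1], u)     # final solve on the finest grid
```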
Application: Impedance Tomography
Joint project with R. Knight and A. Pidlisecky, Stanford Environmental Geophysics Group
Application: Impedance Tomography [figure]
Application: Impedance Tomography
[Figure: borehole setup showing the reference potential electrode, the cone-mounted potential electrode, and the permanent current electrodes]
Application: Impedance Tomography [figure]
The mathematical problem
The constraint (PDE), with some boundary conditions:
    $c(y, u) = \nabla \cdot \big(\exp(u)\, \nabla y_j\big) - q_j = 0, \qquad j = 1, \ldots, k$
The objective function:
    $\min\ \underbrace{\tfrac{1}{2}\, \|Q\,(y(u) - y^{obs})\|^2}_{\text{misfit}} + \underbrace{\alpha}_{\text{regpar}}\, \underbrace{R(u)}_{\text{regularization}}$
Discretization [figure]
Discretization
- Use $128 \times 128 \times 64$ cells
- # of states = $k\,\times$ # of controls
- In practical experiments $k \approx 10$ to $1000$
The discrete mathematical problem
The constraint (PDE):
    $c_h(y_h, u_h) = A(u_h)\, y_h - q_h = D^\top S(u_h)\, D\, y_h - q_h = 0$
The objective function:
    $\min\ \underbrace{\tfrac{1}{2}\, \|Q\, A(u_h)^{-1} q_h - y^{obs}\|^2}_{\text{misfit}} + \underbrace{\alpha}_{\text{regpar}}\, \underbrace{R(u_h)}_{\text{regularization}}$
(A sketch of one objective evaluation follows.)
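How one objective evaluation decomposes (a sketch with the slide's operators as assumed inputs: sparse D, face conductivities S(u_h), measurement operator Q, a single source q_h, data y_obs, regularizer R):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def objective(u, D, S, Q, q, y_obs, R, alpha):
    A = D.T @ sp.diags(S(u)) @ D          # A(u_h) = D^T S(u_h) D  (one source shown)
    y = spsolve(A.tocsc(), q)             # forward solve
    r = Q @ y - y_obs                     # predicted minus observed data
    return 0.5 * np.dot(r, r) + alpha * R(u)
```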
The Data: 63 sources [figure]
The Inversion [figure]
Computational Cost

    alpha    misfit      | total iterations | forward solves
                         | IGN   QN   PIGN  | IGN   QN   PIGN
    10^-5    6 x 10^-2   |  11   11    11   |  89    28    46
    10^-6    4 x 10^-2   |  16   25    16   | 112    52    68
    10^-7    2 x 10^-2   |  17   36    17   | 131    78    79
    10^-8    8 x 10^-3   |  21   59    21   | 158   131   108

IGN: inexact Gauss-Newton; QN: quasi-Newton; PIGN: QN preconditioner applied to IGN
Application: Image Registration
Joint work with J. Modersitzki, Lübeck, Germany.
Given a template image $T(x) = T(x_1, x_2, x_3)$ and a reference image $R(x) = R(x_1, x_2, x_3)$, find a transformation $u(x) = [u(x), v(x), w(x)]$ such that
    $T(x + u(x)) \approx R(x)$
(A sketch of the misfit follows.)
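A minimal version of the SSD misfit behind $T(x + u(x)) \approx R(x)$ (my sketch; T and R are 3D arrays, u stacks one displacement field per axis):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def ssd(T, R, u):
    """0.5 * || T(x + u(x)) - R(x) ||^2 on the voxel grid."""
    x = np.indices(R.shape).astype(float)          # identity grid, shape (3, n1, n2, n3)
    Tu = map_coordinates(T, x + u, order=1)        # T(x + u), trilinear interpolation
    return 0.5 * np.sum((Tu - R) ** 2)
```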
Example I
[Figures: R, T]
$\|T_0 - R\| = 0.24784$ (100.00%)
[Animation: HeadSpin]
Example: ML
[Figures: R, T]
$\|T_0 - R\| = 0.24784$ (100.00%)
Example: ML
[Figures: R, T, $T_6$]
$\|T_0 - R\| = 1.666$ (100.00%), $\|T_1 - R\| = 0.29452$ (17.68%)
Example: ML
[Figures: R, T, $T_3$]
$\|T_0 - R\| = 1.1637$ (100.00%), $\|T_1 - R\| = 0.17017$ (14.62%)
Example: ML
[Figures: R, T, $T_3$]
$\|T_0 - R\| = 0.75664$ (100.00%), $\|T_1 - R\| = 0.12648$ (16.72%)
Example: ML
[Figures: R, T, $T_4$]
$\|T_0 - R\| = 0.45381$ (100.00%), $\|T_1 - R\| = 0.10713$ (23.61%)
Summary
- Introduction and discretization of PDEs
- The unconstrained framework
- Calculation of the gradient
- Getting a decent descent direction
- Globalization