Lecture 23 18.86
Report due date. Please note: the report has to be handed in by Monday, May 16, at noon.
Course evaluation: please submit your feedback (you should have gotten an email).
From the weak form to finite elements

Electrostatics: the electrostatic potential φ and the charge distribution ρ fulfill the Poisson equation (1D):

  φ″(x) = −4π ρ(x)   on the interval Ω = [a, b]

(Alternative system: ρ the mass density, φ the gravitational potential.)

Dirichlet BCs to simplify the analysis (but not necessary): φ(a) = φ_a, φ(b) = φ_b.

We have seen that this can be expressed in its weak form as

  ∫_a^b (φ″ v + 4π ρ v) dx = 0   for all test functions v

which, after integration by parts (the boundary term [φ′ v]_a^b vanishes because of the BCs), becomes

  ∫_a^b (−φ′ v′ + 4π ρ v) dx = 0   for all test functions v
FE approximation

Ansatz: φ(x) = Σ_{i=1}^N φ_i v_i(x). Plugging into the weak form we get:

Find coefficients φ_i such that

  ∫_a^b [ −( Σ_{i=1}^N φ_i v_i′(x) ) v′(x) + 4π ρ(x) v(x) ] dx = 0   for all v ∈ H¹_D(Ω)

What about those v's? Since we have a basis and the problem is linear in v and v′, we can just write the equivalent problem:

Find coefficients φ_i such that

  ∫_a^b [ −( Σ_{i=1}^N φ_i v_i′(x) ) v_j′(x) + 4π ρ(x) v_j(x) ] dx = 0   for j = 1, …, N
FE approximation

Find coefficients φ_i such that

  ∫_a^b [ −( Σ_{i=1}^N φ_i v_i′(x) ) v_j′(x) + 4π ρ(x) v_j(x) ] dx = 0   for j = 1, …, N

Rearranging the terms a bit, the equation can be written as: solve A φ⃗ = b⃗ with

  φ⃗ = (φ_1, φ_2, …, φ_N),   A_ij = ∫_a^b v_i′(x) v_j′(x) dx,   b_i = 4π ∫_a^b ρ(x) v_i(x) dx

Since we need to invert a matrix, it makes sense to choose the basis such that most entries in A are zero! This is the case if the different v_i have only very small support (hence the name "finite elements").
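As a concrete sketch of the assembly above (my own illustration, not from the lecture): with piecewise-linear "hat" functions on a uniform grid, A is tridiagonal with entries 2/h and −1/h, and the load integral can be approximated by nodal quadrature. The function name and the quadrature choice are assumptions.

```python
import numpy as np

def solve_poisson_fem(rho, a, b, phi_a, phi_b, n):
    """Solve phi'' = -4*pi*rho on [a, b] with n interior hat elements.

    Discrete system: sum_i phi_i * int v_i' v_j' dx = 4*pi * int rho v_j dx.
    """
    h = (b - a) / (n + 1)
    x = a + h * np.arange(1, n + 1)          # interior nodes

    # Tridiagonal stiffness matrix: A_jj = 2/h, A_{j,j+1} = A_{j+1,j} = -1/h.
    A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h

    # Load vector: 4*pi * int rho v_j dx ~ 4*pi * rho(x_j) * h (nodal quadrature),
    # plus the Dirichlet boundary values moved to the right-hand side.
    rhs = 4.0 * np.pi * rho(x) * h
    rhs[0] += phi_a / h
    rhs[-1] += phi_b / h

    # A is sparse/banded; a banded solver would exploit that, but a dense
    # solve keeps the sketch short.
    phi = np.linalg.solve(A, rhs)
    return x, phi

# Sanity check: rho = 0 with phi(0) = 0, phi(1) = 2 has the exact solution 2x,
# which linear elements reproduce to machine precision.
x, phi = solve_poisson_fem(lambda x: np.zeros_like(x), 0.0, 1.0, 0.0, 2.0, 99)
print(np.max(np.abs(phi - 2.0 * x)))
```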
Nonlinear finite elements

Solve φ(x) φ″(x) = −4π ρ(x). Equivalent weak statement: find φ such that for all v_k:

  ∫_a^b ( φ(x) φ″(x) + 4π ρ(x) ) v_k(x) dx = 0

The same derivation as before now results in nonlinear coupled equations

  Σ_{i,j} A_ijk φ_i φ_j = b_k    (1)

with b_k as before, but A_ijk = ∫_a^b v_i′(x) v_j′(x) v_k(x) dx. We now need a nonlinear root-finding algorithm to solve (1) for the φ_i.
Nonlinear FE problems

How to solve Σ_{i,j} A_ijk φ_i φ_j = b_k? Write it as a residual

  r_k(φ⃗) = Σ_{i,j} A_ijk φ_i φ_j − b_k

and minimize ‖r⃗(φ⃗)‖₂².

We thus need methods to solve nonlinear problems. Quite generally: given f : Rⁿ → R, find min_{x ∈ Rⁿ} f(x). In general this is hard, but some cases, like convex f, are fairly solvable.

Let's assume f is differentiable. Then the problem becomes a root-finding problem: find x* such that ∇f(x*) = 0.
Nonlinear root-finding

Let's first consider a particular form of f:

  f(x) = ½ xᵀ A x − xᵀ b + c

Interpretation: an energy, quadratic in x. We have seen these kinds of energies in the context of CG. Here:

  ∇f(x) = Ax − b = 0   ⇒   x = A⁻¹ b

In other words, quadratic optimization amounts to a single matrix inversion (using direct methods, or iterative methods such as CG). But A has to be more than invertible!
Max, min, saddle, or what?

Positive definiteness reexamined: A not only needs to be invertible, but also positive definite. Otherwise x = A⁻¹b could also be a maximum or a saddle point, or even a whole line of stationary points!

[Figure: quadratic surfaces showing a minimum, a maximum, a saddle, and a degenerate trough.]

Positive definiteness is crucial to guarantee that we found a minimum by matrix inversion for quadratic energies! For general energies, positive definiteness (everywhere) requires convex f(x).

image source: Ross A. Lippert, D. E. Shaw Research
Nonlinear root-finding

What do we do if f(x) is more complicated? If you have no clue, Taylor you can do!

To start, let's consider the 1D case: minimize f(x). If x* is the solution, we can write

  f′(x*) = f′(x) + f″(x)(x* − x) + ½ f‴(x)(x* − x)² + …

We need to find x* such that f′(x*) = 0; truncating after the linear term gives

  x* ≈ x − f′(x)/f″(x)   ⇒   x_{n+1} = x_n − f′(x_n)/f″(x_n)
Nonlinear root-finding

  x_{n+1} = x_n − f′(x_n)/f″(x_n)

Instead of minimizing f, we could also search for a root of g(x) = f′(x):

  x_{n+1} = x_n − g(x_n)/g′(x_n)
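The 1D iteration above can be sketched in a few lines (my own sketch; the cosine test function is an illustrative choice, not from the lecture):

```python
import math

def newton_minimize_1d(fp, fpp, x0, tol=1e-12, max_iter=50):
    """Minimize f via Newton's iteration x_{n+1} = x_n - f'(x_n)/f''(x_n)."""
    x = x0
    for _ in range(max_iter):
        step = fp(x) / fpp(x)
        x -= step
        if abs(step) < tol:      # stop when the update is negligible
            break
    return x

# Minimize f(x) = cos(x) near x0 = 3: f'(x) = -sin(x), f''(x) = -cos(x).
x_star = newton_minimize_1d(lambda x: -math.sin(x), lambda x: -math.cos(x), 3.0)
print(x_star)   # converges to pi
```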
Nonlinear root finding
Nonlinear root-finding

Now in higher dimensions: Taylor is now

  f(x*) ≈ f(x) + (x* − x)ᵀ ∇f(x) + ½ (x* − x)ᵀ ∇∇f(x) (x* − x) + …

∇∇f(x) is the matrix of second derivatives (the Hessian of f, i.e. the Jacobian of ∇f). Again, min f(x) means ∇f(x*) = 0 for the solution x*:

  x* ≈ x − (∇∇f(x))⁻¹ ∇f(x)
Naive nonlinear root solver (Newton–Raphson method)

Algorithm: find x* s.t. ∇f(x*) = 0.

Start with an initial guess x_0. Then iterate

  Δx_i = −(∇∇f(x_i))⁻¹ ∇f(x_i),   x_{i+1} = x_i + Δx_i

until ‖Δx_i‖ < tolerance or the maximum number of steps is reached.

This means we need to invert the Hessian, i.e. again solve a linear problem! However (as in 1D), we have to do this many times (at each iteration!).
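A minimal multidimensional sketch (my own illustration; the quartic test energy is an assumed example, and a linear solve replaces the explicit Hessian inverse, which is the standard practice):

```python
import numpy as np

def newton_nd(grad, hess, x0, tol=1e-10, max_iter=100):
    """Newton-Raphson for grad f = 0: solve H dx = -grad f at every step."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # One linear solve per iteration instead of forming H^{-1}.
        dx = np.linalg.solve(hess(x), -grad(x))
        x += dx
        if np.linalg.norm(dx) < tol:
            break
    return x

# Convex test energy: f(x, y) = (x - 1)^4 + (y + 2)^2, minimum at (1, -2).
grad = lambda v: np.array([4 * (v[0] - 1)**3, 2 * (v[1] + 2)])
hess = lambda v: np.array([[12 * (v[0] - 1)**2, 0.0], [0.0, 2.0]])
x_min = newton_nd(grad, hess, [3.0, 3.0])
print(x_min)
```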
Why and where it fails

  Δx_i = −(∇∇f(x_i))⁻¹ ∇f(x_i),   x_{i+1} = x_i + Δx_i

If ∇∇f(x_i) is positive definite, (∇f(x_i))ᵀ (x_{i+1} − x_i) < 0, so Δx_i is a direction of decrease (could overshoot).
If ∇∇f(x_i) is not positive definite, Δx_i might be in an increasing direction.
If f is convex, f(x_{i+1}) ≤ f(x_i), so these problems go away.
Example of trouble in 1D

  f(x) = x⁴ − 2x² + 12x

[Figure: plot of f(x) on [−2, 1].]

Has one local minimum. Is not convex (note the concavity near x = 0).

source: Ross A. Lippert, D. E. Shaw Research
Example in 1D

The derivative of the trouble function: f′(x) = 4x³ − 4x + 12.

[Figure: plot of f′(x) on [−2, 1.5].]

The region of negative f″ around x = 0 is a barrier to reaching the solution: it repels the iterates.
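We can watch this failure happen on Lippert's example (my own demonstration code). Where f″ < 0, the Newton step points uphill: from x₀ = 0 the very first step is 0 − 12/(−4) = +3, away from the root near x ≈ −1.67, while starting left of the barrier converges cleanly.

```python
def newton_root(g, gp, x0, tol=1e-12, max_iter=100):
    """Newton's method for g(x) = 0: x_{n+1} = x_n - g(x_n)/g'(x_n)."""
    x = x0
    for _ in range(max_iter):
        x -= g(x) / gp(x)
        if abs(g(x)) < tol:
            return x
    return x

g  = lambda x: 4 * x**3 - 4 * x + 12    # f'(x) for f(x) = x^4 - 2x^2 + 12x
gp = lambda x: 12 * x**2 - 4            # f''(x); negative for |x| < 1/sqrt(3)

# From the left of the barrier, Newton converges to the root near -1.67:
root = newton_root(g, gp, -2.0)
print(root)

# From x0 = 0, f''(0) = -4 < 0 flips the step direction: the first iterate
# lands at +3, on the wrong side of the concave region.
x1 = 0.0 - g(0.0) / gp(0.0)
print(x1)   # 3.0
```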
Line search methods

Algorithm: try to enforce f(x_{i+1}) < f(x_i) by going in the Newton direction "far enough":

  Δx_i = −(∇∇f(x_i))⁻¹ ∇f(x_i),   x_{i+1} = x_i + α_i Δx_i

Since Δx_i is a direction in which f decreases, there will be some α_i > 0 such that f(x_i + α_i Δx_i) < f(x_i). But since f is nonlinear, we would need to solve the nonlinear 1D optimization problem

  min_{α_i ∈ (0, β]} f(x_i + α_i Δx_i)

It's getting complicated; in practice one uses a backtracking rule such as α_i = ρ μⁿ for some n. Convergence very much depends on the quality of the initial guess.
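A sketch of damped Newton with backtracking (my own illustration; the Armijo sufficient-decrease test is an assumed concrete filling-in of the rule α_i = ρ μⁿ, and it presumes Δx_i is a descent direction, i.e. a positive-definite Hessian at the current iterate):

```python
import numpy as np

def damped_newton(f, grad, hess, x0, mu=0.5, c=1e-4, tol=1e-10, max_iter=100):
    """Newton direction plus backtracking line search (Armijo condition)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        d = np.linalg.solve(hess(x), -g)          # Newton direction
        alpha = 1.0
        # Backtracking: shrink alpha by mu until f decreases sufficiently.
        while f(x + alpha * d) > f(x) + c * alpha * (g @ d) and alpha > 1e-12:
            alpha *= mu
        x = x + alpha * d
    return x

# The 1D trouble function as a 1-vector problem, started left of the barrier.
f_e  = lambda v: v[0]**4 - 2 * v[0]**2 + 12 * v[0]
grad = lambda v: np.array([4 * v[0]**3 - 4 * v[0] + 12])
hess = lambda v: np.array([[12 * v[0]**2 - 4]])
x_min = damped_newton(f_e, grad, hess, [-2.0])
print(x_min)   # close to -1.6717
```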
Alternatives

Instead of solving ∇f = 0, it is often easier to just minimize f by using, e.g., gradient descent and a good guess for the step length:

1. Start with an initial guess x_0.
2. Search direction: r_i = −∇f(x_i)
3. Search step: x_{i+1} = x_i + α_i r_i
4. Pick α_i (depends on what's cheap):
   (a) linearized: α_i = (r_iᵀ r_i) / (r_iᵀ (∇∇f) r_i)
   (b) approximate 1D minimization of f(x_i + α r_i)
   (c) zero-finding: r_iᵀ ∇f(x_i + α r_i) = 0

Can be extended to nonlinear CG as well. Except for the linearized step, no second-derivative evaluations are needed!
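For a quadratic energy the linearized step (a) is the exact 1D minimizer along r, which gives classic steepest descent (my own sketch; function name and test matrix are illustrative):

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, max_iter=10_000):
    """Minimize f(x) = 1/2 x^T A x - x^T b by gradient descent with the
    linearized step alpha = (r^T r) / (r^T A r)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = b - A @ x                       # r_i = -grad f(x_i)
        if np.linalg.norm(r) < tol:
            break
        alpha = (r @ r) / (r @ A @ r)       # exact 1D minimizer along r
        x = x + alpha * r
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])      # symmetric positive definite
b = np.array([1.0, 2.0])
x_sd = steepest_descent(A, b, np.zeros(2))
print(x_sd, np.linalg.solve(A, b))          # the two agree
```

For badly conditioned A, steepest descent zig-zags; that is exactly the weakness nonlinear CG addresses.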
What if f has many local minima, is not differentiable, or x is discrete?

Traveling salesman problem: find the shortest route that visits all cities exactly once! Here f = length of the route and x = ordered indices of the cities to visit.

Many different methods are available: Monte Carlo, simulated annealing, genetic algorithms, etc.
What if f has many local minima?

Rough energy landscape: gradient descent fails (it gets stuck in the nearest local minimum).

General idea: make sure we don't get stuck in a local minimum!
Genetic algorithm

Sketch: start with a population of "genes" (for instance, random values of x in f(x)). Calculate the cost for each. Then, XXX (no drugs) and rock & roll:

- Generate new x by crossing parents of good fitness.
- Incorporate random mutations into the children (⇒ overcomes local minima).
- Run long enough until most of the population converges to an optimal set of genes (⇒ best values of the vector x).
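The three bullets above can be turned into a toy implementation (my own sketch, not a tuned library; the crossover-by-averaging, Gaussian mutation, and elitism choices are all assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def genetic_minimize(f, dim, pop_size=50, n_gen=200, sigma=0.5):
    """Toy genetic algorithm: genes are real vectors, fitness = -cost."""
    pop = rng.uniform(-10, 10, size=(pop_size, dim))      # random initial genes
    for _ in range(n_gen):
        cost = np.array([f(g) for g in pop])
        order = np.argsort(cost)
        parents = pop[order[: pop_size // 2]]             # good-fitness parents
        # Cross random pairs of parents, then mutate the children.
        i = rng.integers(0, len(parents), size=(pop_size - 1, 2))
        children = 0.5 * (parents[i[:, 0]] + parents[i[:, 1]])
        children += sigma * rng.normal(size=children.shape)
        pop = np.vstack([pop[order[0]], children])        # elitism: keep the best
    cost = np.array([f(g) for g in pop])
    return pop[np.argmin(cost)], cost.min()

best, best_cost = genetic_minimize(lambda x: np.sum(x**2), dim=2)
print(best_cost)   # small: near the global minimum at the origin
```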
Monte Carlo

Idea: find a way to statistically sample all states according to a well-defined probability (lower energy states = higher probability).

Need: a way to walk through phase space such that there is a finite probability to visit each state in a finite number of steps.

Example: a particle in an energy landscape, coupled to a heat bath (temperature T). Statistical physics: the PDF for states with energy E is given by

  p(E) ∝ e^{−βE},   β = 1/(k_B T)

In order to visit all possible states with the correct probability, the Metropolis algorithm can be used.
Metropolis algorithm

- Make a trial move from the current state x to a new, randomly selected state x′.
- Calculate the energy change dE = E(x′) − E(x).
- If dE < 0, accept the new state x′ and repeat.
- If dE > 0, accept the state with probability e^{−β dE}: draw a random number r in [0, 1) and accept the state if e^{−β dE} > r.

Application: packing problems (see class).
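The four steps above translate almost line by line into code (my own sketch; the harmonic-well test energy and the uniform proposal width are illustrative choices):

```python
import math
import random

random.seed(0)

def metropolis(E, x0, beta, n_steps, step=1.0):
    """Metropolis sampling of p(x) ~ exp(-beta * E(x)) for a 1D state x."""
    x, samples = x0, []
    for _ in range(n_steps):
        x_new = x + random.uniform(-step, step)       # trial move
        dE = E(x_new) - E(x)
        # Accept if dE < 0, otherwise with probability exp(-beta * dE).
        if dE < 0 or math.exp(-beta * dE) > random.random():
            x = x_new
        samples.append(x)
    return samples

# Harmonic well E(x) = x^2/2 at beta = 1: the samples should be ~ N(0, 1).
samples = metropolis(lambda x: 0.5 * x * x, 0.0, 1.0, 200_000)
mean = sum(samples) / len(samples)
var = sum(s * s for s in samples) / len(samples)
print(mean, var)   # mean near 0, variance near 1
```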