MATH 3795 Lecture 13. Numerical Solution of Nonlinear Equations in R^N.
Dmitriy Leykekhman, Fall 2008

Goals: Learn about different methods for the solution of F(x) = 0, their advantages and disadvantages; convergence rates; MATLAB's fsolve.

D. Leykekhman - MATH 3795 Introduction to Computational Mathematics
Nonlinear Equations.
Goal: Given a function F : R^N -> R^N we want to find x* in R^N such that F(x*) = 0.

Definition. A point x* with F(x*) = 0 is called a root of F or a zero of F.

We want to extend the methods from the last lecture, like Newton's method and the secant method, from scalar equations to vector-valued problems.
Convergence of Sequences.
Let {x_k} be a sequence of vectors in R^N and let ||.|| denote a vector norm.
1. The sequence is called q-linearly convergent if there exist c in (0, 1) and k^ in N such that
     ||x_{k+1} - x*|| <= c ||x_k - x*||   for all k >= k^.
2. The sequence is called q-superlinearly convergent if there exists a sequence {c_k} with c_k > 0 and lim_{k->inf} c_k = 0 such that
     ||x_{k+1} - x*|| <= c_k ||x_k - x*||,
   or, equivalently, if
     lim_{k->inf} ||x_{k+1} - x*|| / ||x_k - x*|| = 0.
3. The sequence is called q-quadratically convergent to x* if lim_{k->inf} x_k = x* and if there exist c > 0 and k^ in N such that
     ||x_{k+1} - x*|| <= c ||x_k - x*||^2   for all k >= k^.
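These rates can be observed numerically. The following sketch (not from the lecture; pure Python) compares a q-linearly convergent model sequence e_k = 2^-k with a q-quadratically convergent one e_k = 2^-2^k, both with limit x* = 0:

```python
# Compare convergence rates toward the limit x* = 0.
# q-linear:    e_{k+1} / e_k    stays near a constant c in (0, 1)
# q-quadratic: e_{k+1} / e_k^2  stays bounded
linear = [2.0 ** -k for k in range(1, 8)]            # e_k = 2^-k
quadratic = [2.0 ** -(2 ** k) for k in range(1, 6)]  # e_k = 2^-2^k

linear_ratios = [linear[k + 1] / linear[k] for k in range(len(linear) - 1)]
quad_ratios = [quadratic[k + 1] / quadratic[k] ** 2 for k in range(len(quadratic) - 1)]

print(linear_ratios)  # all 0.5: the error is halved each step
print(quad_ratios)    # all 1.0: the error is squared each step
```

The quadratic sequence roughly doubles the number of correct digits per step, which is the behavior seen in the Newton iteration tables later in this lecture.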
Taylor Expansion.
Given a differentiable function F, we define its Jacobian

  F'(x) = JF(x) = ( dF_1/dx_1(x)  ...  dF_1/dx_n(x) )
                  (     ...       ...       ...     )
                  ( dF_n/dx_1(x)  ...  dF_n/dx_n(x) )

If F is continuously differentiable around x_0, then

  F(x) ~ F(x_0) + F'(x_0)(x - x_0).
Taylor Expansion.
Example. Let F : R^2 -> R^2 be given by

  F(x) = ( x_1^2 + x_2^2 - 2        )
         ( e^(x_1 - 1) + x_2^3 - 2 )

Then its Jacobian is

  F'(x) = ( 2x_1          2x_2   )
          ( e^(x_1 - 1)   3x_2^2 )

If x = (x_1, x_2)^T is close to x_0 = (0, 0)^T, then

  F(x) ~ F(0, 0) + F'(0, 0) x
       = ( -2          )   ( 0       0 ) ( x_1 )
         ( e^(-1) - 2  ) + ( e^(-1)  0 ) ( x_2 )
       = ( -2                   )
         ( e^(-1)(1 + x_1) - 2  )
Newton's Method.
Suppose we are given an approximation x_0 of a root x* of F. Taylor approximation of F around x_0 gives

  F(x*) = F(x_0 + (x* - x_0)) ~ F(x_0) + F'(x_0)(x* - x_0).

We use M(x) = F(x_0) + F'(x_0)(x - x_0) as a model for F. The root of M(x), i.e. the solution of the linear system

  F'(x_0)(x - x_0) = -F(x_0),

is used as an approximation of the root of F. Write the previous identity as

  F'(x_0) s_0 = -F(x_0)   (s_0 is the step, or correction),
  x_1 = x_0 + s_0.
Newton's Method.
Input: initial value x_0, tolerance tol, maximum number of iterations maxit
Output: approximation of the root
1. For k = 0, ..., maxit do
2.   Solve F'(x_k) s_k = -F(x_k) for s_k (via LU decomposition).
3.   Compute x_{k+1} = x_k + s_k.
4.   Check for termination.
5. End
Newton's Method.
Example. Consider

  F(x) = ( x_1^2 + x_2^2 - 2        )
         ( e^(x_1 - 1) + x_2^3 - 2 )

with starting point x_0 = (1.5, 2)^T and stopping criterion ||F(x_k)||_2 <= 1e-10. Newton's method gives the computed solution x = (1.0000, 1.0000)^T and the history

  k   ||x_k||_2       ||F(x_k)||_2    ||s_k||_2
  0   2.500000e+000   8.750168e+000   8.805454e-001
  1   1.665941e+000   2.073196e+000   3.234875e-001
  2   1.450739e+000   4.127937e-001   1.606253e-001
  3   1.423306e+000   6.177196e-002   2.206725e-002
  4   1.414386e+000   1.401191e-003   6.087256e-004
  5   1.414214e+000   9.730653e-007   3.964708e-007
  6   1.414214e+000   4.415589e-013   0.000000e+000
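This run can be reproduced with a short script. The following is a sketch in Python rather than MATLAB, kept self-contained by solving the 2x2 linear system with Cramer's rule:

```python
import math

def F(x1, x2):
    # The example system: F(x) = (x_1^2 + x_2^2 - 2, e^(x_1-1) + x_2^3 - 2)
    return (x1**2 + x2**2 - 2.0, math.exp(x1 - 1.0) + x2**3 - 2.0)

def J(x1, x2):
    # Analytic Jacobian of F
    return ((2.0 * x1, 2.0 * x2),
            (math.exp(x1 - 1.0), 3.0 * x2**2))

def newton(x1, x2, tol=1e-10, maxit=50):
    for k in range(maxit):
        f1, f2 = F(x1, x2)
        if math.hypot(f1, f2) <= tol:      # stop when ||F(x_k)||_2 <= tol
            break
        (a, b), (c, d) = J(x1, x2)
        det = a * d - b * c
        # Solve F'(x_k) s_k = -F(x_k) by Cramer's rule (2x2 only)
        s1 = (-f1 * d - (-f2) * b) / det
        s2 = (a * (-f2) - c * (-f1)) / det
        x1, x2 = x1 + s1, x2 + s2          # x_{k+1} = x_k + s_k
    return x1, x2

root = newton(1.5, 2.0)
print(root)  # close to (1.0, 1.0), as in the table above
```

For larger systems one would of course use an LU factorization instead of Cramer's rule, as step 2 of the algorithm indicates.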
Convergence of Newton's Method.
Theorem. Let D in R^N be an open set and let F : D -> R^N be differentiable on D with Lipschitz continuous derivative, i.e. let L > 0 be such that

  ||F'(y) - F'(x)|| <= L ||y - x||   for all x, y in D.

If x* in D is a root and if F'(x*) is nonsingular, then there exists an eps > 0 such that Newton's method with starting point x_0 satisfying ||x_0 - x*|| < eps generates iterates x_k which converge to x*,

  lim_{k->inf} x_k = x*,

and which obey

  ||x_{k+1} - x*|| <= c ||x_k - x*||^2   for all k and some positive constant c.

Newton's method is locally q-quadratically convergent (under the assumptions stated in the theorem).
Stopping Criteria.
We have discussed iterative methods which generate sequences x_k with (under suitable conditions) lim_{k->inf} x_k = x*. When do we stop the iteration?
For a given tolerance tol_a > 0 we want to find x_k such that

  ||x_k - x*|| < tol_a          (stop if the absolute error is small), or
  ||x_k - x*|| < tol_r ||x*||   (stop if the relative error is small).

Since x* is unknown, these errors cannot be evaluated directly; instead one monitors the residual F(x_k). If x_k is sufficiently close to x*, then

  ||F(x_k)|| = ||F(x_k) - F(x*)|| >= (1/2) ||(F'(x*))^{-1}||^{-1} ||x_k - x*||.

Hence, if ||F(x_k)|| < tol_f and x_k is sufficiently close to x*, then

  ||x_k - x*|| < 2 tol_f ||(F'(x*))^{-1}||.

In addition, limit the maximum number of iterations.
Variations of Newton's Method.
Recall Newton's method:
Input: initial value x_0, tolerance tol, maximum number of iterations maxit
Output: approximation of the root
1. For k = 0, ..., maxit do
2.   Solve F'(x_k) s_k = -F(x_k) for s_k.
3.   x_{k+1} = x_k + s_k.
4.   Check for termination.
5. End
It requires the evaluation of derivatives and the solution of a linear system. Linear system solves are done using the LU decomposition (cost: 2/3 n^3 flops for each matrix factorization). If Jacobian/derivative evaluations are expensive or difficult to compute, or if the LU factorization of the derivative is expensive, the following variations are useful: the finite difference Newton method and the secant method.
Finite Difference Newton Method.
Recall

  dF_i(x)/dx_j = lim_{h->0} [ F_i(x_1, ..., x_{j-1}, x_j + h, x_{j+1}, ..., x_n) - F_i(x) ] / h.

Thus, collecting the components i = 1, ..., n, the j-th column of F'(x) is

  ( dF_1(x)/dx_j, ..., dF_n(x)/dx_j )^T
      = lim_{h->0} [ F(x_1, ..., x_{j-1}, x_j + h, x_{j+1}, ..., x_n) - F(x) ] / h.

The idea is to replace this column by the difference quotient with a small but fixed step h_j:

  (1/h_j) ( F(x_1, ..., x_{j-1}, x_j + h_j, x_{j+1}, ..., x_n) - F(x) ).

A good choice for the step size is h_j = sqrt(eps) |x_j|, where eps is the machine precision.
Finite Difference Newton Method.
Input: initial value x_0, tolerance tol, maximum number of iterations maxit
Output: approximation of the root
1. For k = 0, ..., maxit do
2.   Compute a finite difference approximation B of F'(x_k), i.e., compute the matrix B column by column via
       B e_j = (1/h_j) ( F(x_1, ..., x_{j-1}, x_j + h_j, x_{j+1}, ..., x_n) - F(x_k) ),
     where x = x_k.
3.   Factor B.
4.   Solve B s_k = -F(x_k).
5.   Compute x_{k+1} = x_k + s_k.
6.   Check for termination.
7. End
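Step 2 of this algorithm can be sketched as follows. This is a Python illustration (the helper name fd_jacobian is made up here, not from the lecture), using the same 2x2 example F as on the earlier slides and the step size h_j tied to machine precision:

```python
import math
import sys

def F(x):
    # Example from the earlier slides: F : R^2 -> R^2
    return [x[0]**2 + x[1]**2 - 2.0,
            math.exp(x[0] - 1.0) + x[1]**3 - 2.0]

def fd_jacobian(F, x):
    """Approximate F'(x) column by column with forward differences."""
    n = len(x)
    Fx = F(x)
    B = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # h_j = sqrt(eps) * |x_j|, safeguarded away from zero
        h = math.sqrt(sys.float_info.epsilon) * max(abs(x[j]), 1.0)
        xh = list(x)
        xh[j] += h
        Fxh = F(xh)
        for i in range(n):
            B[i][j] = (Fxh[i] - Fx[i]) / h   # j-th column of B
    return B

B = fd_jacobian(F, [1.5, 2.0])
print(B)  # analytic Jacobian at (1.5, 2) is [[3, 4], [e^0.5, 12]]
```

Each Jacobian approximation costs n extra evaluations of F per iteration, which is the trade-off against evaluating F' analytically.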
Secant Method.
Recall that in the scalar secant method we replace the derivative f'(x_{k+1}) by

  b_{k+1} = ( f(x_k) - f(x_{k+1}) ) / ( x_k - x_{k+1} ).

This b_{k+1} satisfies

  b_{k+1} (x_{k+1} - x_k) = f(x_{k+1}) - f(x_k),

which is called the secant equation. The next iterate is given by x_{k+2} = x_{k+1} + s_{k+1}, where s_{k+1} is the solution of b_{k+1} s_{k+1} = -f(x_{k+1}).
Secant Method.
Try to extend this to the problem of finding a root of F : R^n -> R^n. Given two iterates x_k, x_{k+1} in R^n we try to find a nonsingular matrix B_{k+1} in R^{n x n} which satisfies the so-called secant equation

  B_{k+1} (x_{k+1} - x_k) = F(x_{k+1}) - F(x_k).

Then we compute the new iterate as follows:

  Solve B_{k+1} s_{k+1} = -F(x_{k+1}),
  x_{k+2} = x_{k+1} + s_{k+1}.
Secant Method.
There is a problem with this approach. If n = 1, the secant equation

  b_{k+1} (x_{k+1} - x_k) = f(x_{k+1}) - f(x_k)

has a unique solution. If n > 1, then we need to determine the n^2 entries of B_{k+1} from only n equations

  B_{k+1} (x_{k+1} - x_k) = F(x_{k+1}) - F(x_k),

so there is no unique solution. For example, for n = 2, x_{k+1} - x_k = (1, 1)^T and F(x_{k+1}) - F(x_k) = (1, 2)^T, both matrices

  ( 1  0 )        ( 0  1 )
  ( 0  2 )  and   ( 2  0 )

satisfy the secant equation.
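The non-uniqueness is easy to check directly (a quick Python sketch, not part of the lecture):

```python
def matvec(B, s):
    # Multiply a 2x2 matrix B by a vector s
    return [B[0][0] * s[0] + B[0][1] * s[1],
            B[1][0] * s[0] + B[1][1] * s[1]]

s = [1.0, 1.0]   # x_{k+1} - x_k
y = [1.0, 2.0]   # F(x_{k+1}) - F(x_k)

B1 = [[1.0, 0.0], [0.0, 2.0]]
B2 = [[0.0, 1.0], [2.0, 0.0]]

print(matvec(B1, s))  # [1.0, 2.0] -> satisfies B s = y
print(matvec(B2, s))  # [1.0, 2.0] -> also satisfies B s = y
```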
Secant Method.
Therefore we choose B_{k+1} in R^{n x n} as the solution of

  min ||B - B_k||_F   subject to   B (x_{k+1} - x_k) = F(x_{k+1}) - F(x_k).

Motivation: B_{k+1} should satisfy the secant equation B(x_{k+1} - x_k) = F(x_{k+1}) - F(x_k), and B_{k+1} should be as close to the old matrix B_k as possible, to preserve as much of the information contained in B_k as possible.
Common notation: s_k = x_{k+1} - x_k and y_k = F(x_{k+1}) - F(x_k). With these definitions the previous minimization problem becomes

  min ||B - B_k||_F   s.t.   B s_k = y_k.

It has the unique solution

  B_{k+1} = B_k + (y_k - B_k s_k) s_k^T / (s_k^T s_k)   (Broyden update).
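That the Broyden update indeed satisfies the secant equation B_{k+1} s_k = y_k can be verified numerically. A Python sketch with arbitrary made-up data for B_k, s_k, y_k:

```python
def broyden_update(B, s, y):
    # B_{k+1} = B_k + (y_k - B_k s_k) s_k^T / (s_k^T s_k)
    n = len(s)
    Bs = [sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    sts = sum(sj * sj for sj in s)
    return [[B[i][j] + (y[i] - Bs[i]) * s[j] / sts for j in range(n)]
            for i in range(n)]

B0 = [[2.0, 1.0], [0.5, 3.0]]   # arbitrary B_k (made up for illustration)
s = [1.0, -2.0]                 # s_k
y = [0.3, 4.0]                  # y_k
B1 = broyden_update(B0, s, y)

# Check the secant equation: B_{k+1} s_k should equal y_k
Bs1 = [sum(B1[i][j] * s[j] for j in range(2)) for i in range(2)]
print(Bs1)  # equals y up to rounding
```

Algebraically, B_{k+1} s_k = B_k s_k + (y_k - B_k s_k)(s_k^T s_k)/(s_k^T s_k) = y_k, so the check holds exactly up to floating-point rounding.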
Broyden's Method.
Input: initial value x_0, initial matrix B_0, tolerance tol, maximum number of iterations maxit
Output: approximation of the root
1. For k = 0, ..., maxit do
2.   Solve B_k s_k = -F(x_k) for s_k.
3.   Set x_{k+1} = x_k + s_k.
4.   Evaluate F(x_{k+1}).
5.   Check for termination.
6.   Set y_k = F(x_{k+1}) - F(x_k).
7.   Set B_{k+1} = B_k + (y_k - B_k s_k) s_k^T / (s_k^T s_k).
8. End
Broyden's Method.
Note that, since B_k s_k = -F(x_k), the definitions of y_k and s_k give y_k - B_k s_k = F(x_{k+1}) - F(x_k) + F(x_k) = F(x_{k+1}), so the update can also be written as

  B_{k+1} = B_k + (y_k - B_k s_k) s_k^T / (s_k^T s_k) = B_k + F(x_{k+1}) s_k^T / (s_k^T s_k).

To start Broyden's method we need an initial guess x_0 for the root x* and an initial matrix B_0 in R^{n x n}. In practice one often chooses B_0 = F'(x_0) or B_0 = gamma*I, where gamma is a suitable scalar. Other choices, for example finite difference approximations to F'(x_0) or choices based on the specific structure of F', are also used.
Broyden's Method.
Example. Consider

  F(x) = ( x_1^2 + x_2^2 - 2        )
         ( e^(x_1 - 1) + x_2^3 - 2 )

with starting point x_0 = (1.5, 2)^T, B_0 = F'(x_0), and stopping criterion ||F(x_k)||_2 <= 1e-10.
Broyden's Method.
Broyden's method produces

  k    ||x_k||_2       ||F(x_k)||_2    ||s_k||_2
  0    2.500000e+000   8.750168e+000   8.805454e-001
  1    1.665941e+000   2.073196e+000   1.922038e-001
  2    1.476513e+000   8.734179e-001   1.321894e-001
  3    1.410326e+000   3.812507e-001   1.555213e-001
  4    1.417633e+000   1.586346e-001   9.620188e-002
  5    1.423860e+000   4.298504e-002   1.043037e-002
  6    1.415846e+000   4.681398e-003   2.583147e-003
  7    1.414375e+000   6.074087e-004   6.288185e-004
  8    1.414212e+000   4.051447e-006   1.805771e-006
  9    1.414214e+000   2.724111e-008   1.154246e-008
  10   1.414214e+000   1.182169e-011   0.000000e+000
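The run above can be reproduced with a compact script. A Python sketch (not from the lecture), with B_0 = F'(x_0) as on the previous slide, 2x2 solves via Cramer's rule, and the simplified update B_{k+1} = B_k + F(x_{k+1}) s_k^T / (s_k^T s_k):

```python
import math

def F(x1, x2):
    return (x1**2 + x2**2 - 2.0, math.exp(x1 - 1.0) + x2**3 - 2.0)

def solve2(B, r1, r2):
    # Solve B s = (r1, r2) for a 2x2 matrix B by Cramer's rule
    (a, b), (c, d) = B
    det = a * d - b * c
    return ((r1 * d - b * r2) / det, (a * r2 - r1 * c) / det)

def broyden(x1, x2, tol=1e-10, maxit=100):
    # B_0 = F'(x_0), the analytic Jacobian at the starting point
    B = [[2.0 * x1, 2.0 * x2],
         [math.exp(x1 - 1.0), 3.0 * x2**2]]
    f1, f2 = F(x1, x2)
    for k in range(maxit):
        if math.hypot(f1, f2) <= tol:
            break
        s1, s2 = solve2(B, -f1, -f2)   # B_k s_k = -F(x_k)
        x1, x2 = x1 + s1, x2 + s2
        g1, g2 = F(x1, x2)
        # Broyden update: B_{k+1} = B_k + F(x_{k+1}) s_k^T / (s_k^T s_k)
        sts = s1 * s1 + s2 * s2
        B = [[B[0][0] + g1 * s1 / sts, B[0][1] + g1 * s2 / sts],
             [B[1][0] + g2 * s1 / sts, B[1][1] + g2 * s2 / sts]]
        f1, f2 = g1, g2
    return x1, x2

root = broyden(1.5, 2.0)
print(root)  # close to (1.0, 1.0), matching the table
```

Note that Broyden's method needs more iterations than Newton's method on the same problem (10 vs. 6), but each iteration avoids evaluating the Jacobian.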
MATLAB's fsolve.
Similar to the built-in function fzero, the Optimization Toolbox has a function fsolve that tries to solve a system of nonlinear equations F(x) = 0.
Warning: You need this toolbox in order to use this function.
Syntax:
  x = fsolve(fun,x0)
  x = fsolve(fun,x0,options)
  x = fsolve(problem)
  [x,fval] = fsolve(fun,x0)
  [x,fval,exitflag] = fsolve(...)
  [x,fval,exitflag,output] = fsolve(...)
  [x,fval,exitflag,output,jacobian] = fsolve(...)
MATLAB's fsolve.
x = fsolve(fun,x0) starts at x0 and tries to solve the equations described in fun.
[x,fval] = fsolve(fun,x0) returns the value of the objective function fun at the solution x.
MATLAB's fsolve.
Thus in our example with

  F(x) = ( x_1^2 + x_2^2 - 2        )
         ( e^(x_1 - 1) + x_2^3 - 2 )

and x_0 = (1.5, 2)^T, the call

  [x,fval] = fsolve('ex1',[1.5 2])

produces

  x =
     0.999999999999919
     1.000000000000164
  fval =
     1.0e-012 *
     0.165201186064223
     0.410338429901458
MATLAB's fsolve.
Example. Let's say you want to solve

   2x_1 - x_2  = e^(-x_1)
  -x_1 + 2x_2  = e^(-x_2).

In other words, we want to solve the nonlinear system

   2x_1 - x_2  - e^(-x_1) = 0
  -x_1 + 2x_2  - e^(-x_2) = 0.
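Before handing this system to fsolve, the root can be sanity-checked with a few Newton steps. A pure-Python sketch (not part of the lecture); by symmetry the solution has x_1 = x_2, the root of x = e^(-x), approximately 0.567143:

```python
import math

def F(x1, x2):
    return (2.0 * x1 - x2 - math.exp(-x1),
            -x1 + 2.0 * x2 - math.exp(-x2))

def J(x1, x2):
    # Analytic Jacobian of the system above
    return ((2.0 + math.exp(-x1), -1.0),
            (-1.0, 2.0 + math.exp(-x2)))

x1, x2 = -5.0, -5.0          # same starting point as the fsolve call below
for k in range(50):
    f1, f2 = F(x1, x2)
    if math.hypot(f1, f2) <= 1e-12:
        break
    (a, b), (c, d) = J(x1, x2)
    det = a * d - b * c
    # Cramer's rule for F'(x) s = -F(x)
    x1 += (-f1 * d + f2 * b) / det
    x2 += (-f2 * a + f1 * c) / det
print(x1, x2)  # both approach 0.56714329...
```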
MATLAB's fsolve.
First we create a function in an m-file.

  function varargout = ex3(x)
  %
  % [F]     = ex3(x) returns the function value in F
  % [F,Jac] = ex3(x) returns the function value in F and
  %           the Jacobian in Jac
  %
  % return the function value
  varargout{1} = [ 2*x(1) - x(2) - exp(-x(1));
                  -x(1) + 2*x(2) - exp(-x(2))];
  if( nargout > 1 )
      % return the Jacobian as the second argument
      varargout{2} = [ 2+exp(-x(1))   -1;
                       -1             2+exp(-x(2))];
  end
MATLAB's fsolve.
Now calling

  [x,fval] = fsolve('ex3',[-5 -5])

produces

  x =
     0.567143031397357
     0.567143031397357
  fval =
     1.0e-006 *
    -0.405909605705190
    -0.405909605705190
MATLAB's fsolve.
We got 6 digits of accuracy. If we need more, we can set

  options = optimset('TolFun',1e-10)

and then calling

  [x,fval] = fsolve('ex3',[-5 -5],options)

produces

  x =
     0.567143290409772
     0.567143290409772
  fval =
     1.0e-013 *
    -0.179856129989275
    -0.179856129989275
MATLAB's fsolve.
You can learn more about fsolve by typing help fsolve or by looking at the fsolve page on the MathWorks website. Similarly, you can learn more about the options by typing help optimset or simply optimset.