Handout on Newton's Method for Systems

The following summarizes the main points of our class discussion of Newton's method for approximately solving a system of nonlinear equations $F(x) = 0$, $F: \mathbb{R}^n \to \mathbb{R}^n$.

Conventions: Notation for the nonlinear system is
\[
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \qquad
F(x) = \begin{pmatrix} f_1(x) \\ \vdots \\ f_n(x) \end{pmatrix}, \qquad
F'(x) = \begin{pmatrix}
\partial f_1(x)/\partial x_1 & \cdots & \partial f_1(x)/\partial x_n \\
\vdots & & \vdots \\
\partial f_n(x)/\partial x_1 & \cdots & \partial f_n(x)/\partial x_n
\end{pmatrix}.
\]
Subscripts are used to denote vector components and matrix entries. Superscripts are used to denote members of sequences, e.g., $x^{(0)}$ is the initial member of $\{x^{(k)}\}$. The norm $\|\cdot\|$ is an arbitrary norm of interest. The phrase "$x$ is sufficiently near $x^*$" means that $\|x - x^*\|$ is sufficiently small. Similarly, "$x$ is near $x^*$" means that $\|x - x^*\|$ is appropriately small.

Newton's method. The basic method is

    Newton's Method: Given an initial $x$, iterate:
        Update $x \leftarrow x - F'(x)^{-1} F(x)$.

A more appropriate framework for practical implementation is

    Newton's Method: Given an initial $x$, evaluate $F(x)$. Iterate:
        Solve $F'(x)\,s = -F(x)$.
        Update $x \leftarrow x + s$ and evaluate $F(x)$.
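As a concrete illustration of the practical framework above, here is a minimal Python (NumPy) sketch; the two-equation test system and the names `newton`, `F`, and `Fprime` are illustrative choices only, not part of any particular library.

```python
# Minimal sketch of the basic Newton iteration: solve F'(x) s = -F(x), update x <- x + s.
import numpy as np

def newton(F, Fprime, x, tol=1e-10, max_iter=20):
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) <= tol:
            break
        s = np.linalg.solve(Fprime(x), -Fx)   # Newton step s = -F'(x)^{-1} F(x)
        x = x + s
    return x

# Hypothetical test problem: intersection of the unit circle with the line x0 = x1.
F = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])
Fprime = lambda x: np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])
print(newton(F, Fprime, np.array([1.0, 0.5])))   # converges to (sqrt(2)/2, sqrt(2)/2)
```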

The following is our basic local convergence theorem for Newton's method.

Theorem 1: Suppose that $F$ is continuously differentiable near $x^* \in \mathbb{R}^n$ such that $F(x^*) = 0$ and $F'(x^*)$ is non-singular. Then whenever $x^{(0)}$ is sufficiently near $x^*$, the Newton iterates $\{x^{(k)}\}$ converge to $x^*$ superlinearly, i.e., with
\[
\|x^{(k+1)} - x^*\| \le \beta_k\,\|x^{(k)} - x^*\|, \qquad k = 0, 1, \ldots,
\]
where $\beta_k \to 0$. If $F'$ also satisfies an inequality
\[
\|F'(x) - F'(x^*)\| \le L\,\|x - x^*\| \qquad (1)
\]
for $x$ near $x^*$, then the convergence is quadratic, i.e.,
\[
\|x^{(k+1)} - x^*\| \le \beta\,\|x^{(k)} - x^*\|^2, \qquad k = 0, 1, \ldots,
\]
for a constant $\beta$ independent of $k$.

Remark: The property (1) is called Lipschitz continuity of $F'$ at $x^*$. A proof of local quadratic convergence assuming (1) is given in [1, Th. 5.2.1]. With only a little effort, this can be extended to a proof of local superlinear convergence assuming only continuity of $F'$ near $x^*$.

Newton's method with backtracking. In this, we augment Newton's method with a globalization procedure that tests each step for adequate progress toward a solution and, if necessary, modifies it to obtain a step that gives adequate progress. The globalization considered here is backtracking: at the current approximate solution $x$, the procedure begins with the Newton step $s^N = -F'(x)^{-1} F(x)$ and shortens it, if necessary, to obtain an acceptable step $s = \lambda s^N$ for some $\lambda \in (0, 1]$. Our test for adequate progress is based on the actual reduction in $\|F\|$ and the predicted reduction in $\|F\|$, given, respectively, by
\[
\mathrm{ared} = \|F(x)\| - \|F(x+s)\|, \qquad
\mathrm{pred} = \|F(x)\| - \|F(x) + F'(x)\,s\|.
\]
We accept a step $s$ from the current approximate solution $x$ if
\[
\mathrm{ared} \ge t\,\mathrm{pred} > 0 \qquad (2)
\]
for a prescribed $t \in (0, 1)$. The following proposition confirms that a sufficiently short step obtained by backtracking will be acceptable.

Proposition 2: If $F$ is differentiable at $x$ and $F(x) \ne 0$, then a step $s = \lambda s^N$ satisfies (2) for all sufficiently small $\lambda > 0$.

Proof. Note that if $s = \lambda s^N$ and $0 < \lambda \le 1$, then
\[
\begin{aligned}
\mathrm{pred} &= \|F(x)\| - \|F(x) + F'(x)\,s\| = \|F(x)\| - \|F(x) + F'(x)(\lambda s^N)\| \\
&= \|F(x)\| - \|(1-\lambda)F(x) + \lambda\,[F(x) + F'(x)\,s^N]\| \\
&= \|F(x)\| - (1-\lambda)\,\|F(x)\| = \lambda\,\|F(x)\|.
\end{aligned} \qquad (3)
\]
To justify the third line in (3), we note that $F(x) + F'(x)\,s^N = 0$ since $s^N = -F'(x)^{-1}F(x)$ and that $\|(1-\lambda)F(x)\| = (1-\lambda)\,\|F(x)\|$ since $1 - \lambda \ge 0$. Then
\[
\begin{aligned}
\mathrm{ared} &= \|F(x)\| - \|F(x+s)\| = \|F(x)\| - \|F(x) + F'(x)\,s + o(\|s\|)\| \\
&\ge \|F(x)\| - \|F(x) + F'(x)\,s\| + o(\|s\|) \\
&= \mathrm{pred} + o(\|\lambda s^N\|) = \lambda\,\|F(x)\| + o(\lambda).
\end{aligned}
\]
It follows that if $F(x) \ne 0$ and $t \in (0, 1)$, then $\mathrm{ared} \ge t\,(\lambda\,\|F(x)\|) = t\,\mathrm{pred}$ for all sufficiently small $\lambda > 0$.
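In code, the acceptance test (2) with the Euclidean norm might look like the following sketch; the name `step_is_acceptable` and the default $t = 10^{-4}$ (the value recommended in the Remarks at the end of this handout) are illustrative assumptions.

```python
# Small sketch of the acceptance test (2), assuming the Euclidean norm and
# NumPy-style callables F (the residual) and Fprime (the Jacobian).
import numpy as np

def step_is_acceptable(F, Fprime, x, s, t=1e-4):
    """Return True if ared >= t * pred > 0 for the trial step s."""
    norm_Fx = np.linalg.norm(F(x))
    ared = norm_Fx - np.linalg.norm(F(x + s))              # actual reduction in ||F||
    pred = norm_Fx - np.linalg.norm(F(x) + Fprime(x) @ s)  # predicted reduction in ||F||
    return pred > 0.0 and ared >= t * pred
```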

Our first method is the following somewhat general formulation.

    Newton's Method with Backtracking: Given $t \in (0,1)$, $0 < \theta_{\min} < \theta_{\max} < 1$, and an initial $x$, evaluate $F(x)$. Iterate:
        Solve $F'(x)\,s = -F(x)$.
        Evaluate $F(x+s)$.
        While $\mathrm{ared} < t\,\mathrm{pred}$ do:
            Choose $\theta \in [\theta_{\min}, \theta_{\max}]$.
            Update $s \leftarrow \theta s$ and re-evaluate $F(x+s)$.
        Update $x \leftarrow x + s$ and $F(x) \leftarrow F(x+s)$.

The backtracking globalization is implemented in the while loop. At each pass through the loop, the step $s$ is shortened by a factor $\theta \in [\theta_{\min}, \theta_{\max}]$, where $0 < \theta_{\min} < \theta_{\max} < 1$. This is known as safeguarded backtracking. The requirement $\theta \le \theta_{\max} < 1$ ensures that each pass through the loop multiplies the step length by at most $\theta_{\max}$, and it follows from Proposition 2 that an acceptable step will be determined after at most a finite number of passes through the loop. The requirement $0 < \theta_{\min} \le \theta$ ensures that step lengths will not be reduced so much that the iterates cannot converge to a solution.

The following is the global convergence result for the method.

Theorem 3 [2, Cor. 6.2]: Suppose that $F$ is continuously differentiable and that $\{x^{(k)}\}$ is a sequence of iterates produced by the method. If $x^*$ is a limit point$^1$ of $\{x^{(k)}\}$ such that $F'(x^*)$ is non-singular, then $F(x^*) = 0$, $x^{(k)} \to x^*$, and
\[
s^{(k)} \equiv x^{(k+1)} - x^{(k)} = -F'(x^{(k)})^{-1} F(x^{(k)})
\]
for all sufficiently large $k$.

Note that the theorem does not guarantee that the iterates will always converge to a solution. (Indeed, there can be no such guarantee: some problems have no solutions!) Rather, it only asserts that the iterates will behave about as desirably as the function $F$ will allow. Another way of stating the result, which may offer additional insight, is that exactly one of the following must hold: (i) $\|x^{(k)}\| \to \infty$; (ii) $\{x^{(k)}\}$ has one or more limit points, and $F'$ is singular at each of them; (iii) $\{x^{(k)}\}$ converges to a solution $x^*$ such that $F'(x^*)$ is non-singular, and the iterates are ultimately those of Newton's method. In the case of alternative (i), the iterates diverge. In the case of (ii), the iterates may or may not converge, depending on additional properties of $F$. Alternative (iii) is the desirable outcome; in this case the iterates converge to a solution, ultimately with the speed of Newton iterates (at least superlinearly and typically quadratically).

$^1$ We say $x^*$ is a limit point of $\{x^{(k)}\}$ if, for every $\delta > 0$, there are infinitely many $x^{(k)}$ such that $\|x^{(k)} - x^*\| < \delta$. Note that if $\{x^{(k)}\}$ is bounded, i.e., there exists an $M$ such that $\|x^{(k)}\| \le M$ for all $k$, then $\{x^{(k)}\}$ converges to $x^*$ if and only if $x^*$ is the only limit point of $\{x^{(k)}\}$.
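A minimal Python sketch of this general formulation, in the Euclidean norm and with the simplest admissible choice $\theta = 1/2$ at every pass, is the following; the name `newton_backtracking`, the stopping tolerance, and the iteration caps are illustrative assumptions, not a definitive implementation.

```python
# Sketch of the general backtracking formulation above (Euclidean norm,
# fixed theta = 1/2 in [theta_min, theta_max]).
import numpy as np

def newton_backtracking(F, Fprime, x, t=1e-4, theta=0.5, tol=1e-10, max_iter=50):
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) <= tol:
            break
        J = Fprime(x)
        s = np.linalg.solve(J, -Fx)                        # Newton step s_N
        for _ in range(40):                                # backtracking passes
            ared = np.linalg.norm(Fx) - np.linalg.norm(F(x + s))
            pred = np.linalg.norm(Fx) - np.linalg.norm(Fx + J @ s)
            if ared >= t * pred:                           # acceptance test (2)
                break
            s = theta * s                                  # shorten the step
        x = x + s
    return x
```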

We now work toward a more refined version of the method. With $s = \lambda s^N$ for $\lambda \in (0, 1]$ and with $\mathrm{pred} = \lambda\,\|F(x)\|$ by (3), the condition $\mathrm{ared} < t\,\mathrm{pred}$ can be simplified to
\[
\|F(x+s)\| / \|F(x)\| > 1 - t\lambda.
\]
Also, we can make a sophisticated choice of each $\theta \in [\theta_{\min}, \theta_{\max}]$ in an important (and common) special case: that in which the norm is an inner-product norm, i.e., $\|v\| = \langle v, v\rangle^{1/2}$ for all $v \in \mathbb{R}^n$, where $\langle\cdot,\cdot\rangle$ is an inner product on $\mathbb{R}^n$.$^2$ Then, in the while loop, we can choose each $\theta$ to minimize over $[\theta_{\min}, \theta_{\max}]$ a quadratic $p(\theta) = a + b\theta + c\theta^2$ that satisfies
\[
p(0) = \|F(x)\|^2, \qquad p(1) = \|F(x+s)\|^2, \qquad
p'(0) = \frac{d}{d\theta}\Bigl(\|F(x+\theta s)\|^2\Bigr)\Big|_{\theta=0} = 2\,\langle F(x),\, F'(x)\,s\rangle.
\]
The quadratic satisfying these conditions is
\[
p(\theta) = \|F(x)\|^2 + 2\,\langle F(x),\, F'(x)\,s\rangle\,\theta
+ \bigl\{\|F(x+s)\|^2 - \|F(x)\|^2 - 2\,\langle F(x),\, F'(x)\,s\rangle\bigr\}\,\theta^2.
\]
Writing $s = \lambda s^N$ and noting $F'(x)\,s = \lambda F'(x)\,s^N = -\lambda F(x)$, we have $\langle F(x),\, F'(x)\,s\rangle = -\lambda\,\|F(x)\|^2$ and
\[
p(\theta) = \|F(x)\|^2\Bigl[\,1 - 2\lambda\theta + \bigl\{\|F(x+s)\|^2/\|F(x)\|^2 - 1 + 2\lambda\bigr\}\,\theta^2\,\Bigr].
\]
We have that $p'(\theta) = 0$ if and only if
\[
\theta = \lambda \big/ \bigl\{\|F(x+s)\|^2/\|F(x)\|^2 - 1 + 2\lambda\bigr\},
\]
and this $\theta$ minimizes $p$ if
\[
p''(\theta) = 2\,\|F(x)\|^2\,\bigl\{\|F(x+s)\|^2/\|F(x)\|^2 - 1 + 2\lambda\bigr\} > 0.
\]
These observations lead to the more refined method given below.

$^2$ An inner product on $\mathbb{R}^n$ is a function $\langle\cdot,\cdot\rangle$ from pairs of vectors $(u, v)$ to scalars in $\mathbb{R}$ that satisfies (a) $\langle v, v\rangle \ge 0$ for all $v \in \mathbb{R}^n$, with $\langle v, v\rangle = 0$ if and only if $v = 0$; and, for all $u \in \mathbb{R}^n$ and $v \in \mathbb{R}^n$, (b) $\langle u, v\rangle = \langle v, u\rangle$, (c) $\langle \alpha u, v\rangle = \alpha\,\langle u, v\rangle$ for all $\alpha \in \mathbb{R}$, and (d) $\langle u + v, w\rangle = \langle u, w\rangle + \langle v, w\rangle$ for all $w \in \mathbb{R}^n$. The most familiar example is the Euclidean inner product (the usual dot product), given by $\langle u, v\rangle = \sum_{i=1}^n u_i v_i$.
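As a concrete rendering of this choice of $\theta$ (it is exactly the step-length selection used in the refined method below), here is a small Python sketch for the Euclidean norm; the name `choose_theta` and the safeguard values are illustrative assumptions.

```python
# Safeguarded quadratic-interpolation choice of theta derived above:
# rho = ||F(x+s)|| / ||F(x)||, lam is the current step-length factor, and
# delta = rho^2 - 1 + 2*lam equals p''(theta) / (2 ||F(x)||^2).
def choose_theta(rho, lam, theta_min=0.1, theta_max=0.5):
    delta = rho**2 - 1.0 + 2.0 * lam
    if delta <= 0.0:                  # p is not strictly convex: fall back to theta_max
        return theta_max
    theta = lam / delta               # unconstrained minimizer of p
    return min(max(theta, theta_min), theta_max)   # clamp to [theta_min, theta_max]
```

Clamping the unconstrained minimizer to $[\theta_{\min}, \theta_{\max}]$ yields the constrained minimizer when $p$ is convex; when $\delta \le 0$ the sketch follows the refined method below and simply takes $\theta_{\max}$.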

    Newton's Method with Backtracking: Given $t \in (0,1)$, $0 < \theta_{\min} < \theta_{\max} < 1$, and an initial $x$, evaluate $F(x)$. Iterate:
        Solve $F'(x)\,s = -F(x)$.
        Evaluate $F(x+s)$ and set $\lambda = 1$.
        While $\rho \equiv \|F(x+s)\| / \|F(x)\| > 1 - t\lambda$ do:
            If $\delta \equiv \rho^2 - 1 + 2\lambda \le 0$, set $\theta = \theta_{\max}$.
            Else do:
                Set $\theta = \lambda/\delta$.
                If $\theta > \theta_{\max}$, $\theta \leftarrow \theta_{\max}$.
                If $\theta < \theta_{\min}$, $\theta \leftarrow \theta_{\min}$.
            Update $s \leftarrow \theta s$, $\lambda \leftarrow \theta\lambda$, and re-evaluate $F(x+s)$.
        Update $x \leftarrow x + s$ and $F(x) \leftarrow F(x+s)$.

Remarks: Common practical recommendations are to take $t = 10^{-4}$, $\theta_{\min} = 1/10$, and $\theta_{\max} = 1/2$. An additional refinement can be added to the backtracking, as follows: after the first step-length reduction in the while loop, there is enough information about $\|F(x+\theta s)\|$ to construct a cubic interpolating polynomial, and one can choose $\theta$ to minimize this cubic over $[\theta_{\min}, \theta_{\max}]$. See [1, Ch. 6] for details.

References.
1. J. E. Dennis, Jr., and R. B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Classics in Applied Mathematics, SIAM, Philadelphia, 1996; originally published in the Series in Automatic Computation, Prentice-Hall, Englewood Cliffs, NJ, 1983.
2. S. C. Eisenstat and H. F. Walker, Globally convergent inexact Newton methods, SIAM J. Optimization, 4 (1994), pp. 393–422.
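For reference, the refined method can be assembled into a single Python sketch, using the recommended values $t = 10^{-4}$, $\theta_{\min} = 1/10$, $\theta_{\max} = 1/2$ and the Euclidean norm; the names, stopping test, and iteration caps are illustrative assumptions rather than a definitive implementation.

```python
# Sketch of the refined backtracking Newton method above (Euclidean norm).
import numpy as np

def newton_backtracking_refined(F, Fprime, x, t=1e-4, theta_min=0.1,
                                theta_max=0.5, tol=1e-10, max_iter=50):
    for _ in range(max_iter):
        Fx = F(x)
        norm_Fx = np.linalg.norm(Fx)
        if norm_Fx <= tol:
            break
        s = np.linalg.solve(Fprime(x), -Fx)      # Newton step s_N
        lam = 1.0
        Fxs = F(x + s)
        for _ in range(40):                      # backtracking passes
            rho = np.linalg.norm(Fxs) / norm_Fx
            if rho <= 1.0 - t * lam:             # sufficient decrease achieved
                break
            delta = rho**2 - 1.0 + 2.0 * lam
            if delta <= 0.0:
                theta = theta_max                # quadratic model not convex
            else:
                theta = min(max(lam / delta, theta_min), theta_max)
            s = theta * s
            lam = theta * lam
            Fxs = F(x + s)
        x = x + s
    return x
```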