Nonlinear equations. Norms for R^n. Convergence orders for iterative methods


Nonlinear equations

Norms for R^n

Assume that X is a vector space. A norm is a mapping X → R, x ↦ ‖x‖, such that for all x, y ∈ X and α ∈ R:

- ‖x‖ ≥ 0, and ‖x‖ = 0 if and only if x = 0
- ‖αx‖ = |α| ‖x‖
- ‖x + y‖ ≤ ‖x‖ + ‖y‖

We define the following norms on the vector space R^n:

  ‖x‖_1 = |x_1| + ... + |x_n|
  ‖x‖_2 = (|x_1|² + ... + |x_n|²)^(1/2)
  ‖x‖_∞ = max{|x_1|, ..., |x_n|}

A matrix A ∈ R^(n×n) corresponds to a linear mapping from R^n to R^n. If ‖x‖ denotes a vector norm for x ∈ R^n we can define the matrix norm ‖A‖ as follows:

  ‖A‖ = sup_{x ∈ R^n, x ≠ 0} ‖Ax‖/‖x‖ = sup_{x ∈ R^n, ‖x‖ = 1} ‖Ax‖

Note that S = {x ∈ R^n : ‖x‖ = 1} is a compact set, hence there exists a point x ∈ S with ‖Ax‖ = sup_{‖x‖=1} ‖Ax‖, and we can write max instead of sup.

Lemma 1. For E ∈ R^(n×n) with ‖E‖ < 1 the matrix I + E is invertible and

  ‖(I + E)^(-1)‖ ≤ 1/(1 − ‖E‖).

Proof: ‖x‖ = ‖(I + E)x − Ex‖ ≤ ‖(I + E)x‖ + ‖E‖ ‖x‖, hence ‖(I + E)x‖ ≥ (1 − ‖E‖) ‖x‖.

Convergence orders for iterative methods

We want to approximate x* ∈ R^n. An iterative method gives a sequence x_0, x_1, x_2, x_3, .... We say the method converges if lim_{k→∞} x_k = x*. We can specify more precisely how quickly it converges by looking at quotients

  ‖x_{k+1} − x*‖ / ‖x_k − x*‖^α  for α ≥ 1.

This leads to so-called q-orders of convergence (where q stands for quotient):

- the method converges with q-order 1 or q-linearly if there exists C < 1 such that for all k: ‖x_{k+1} − x*‖ ≤ C ‖x_k − x*‖
- the method converges with q-order 2 or q-quadratically if lim_{k→∞} x_k = x* and there exists C such that for all k: ‖x_{k+1} − x*‖ ≤ C ‖x_k − x*‖²
- the method converges q-superlinearly if lim_{k→∞} ‖x_{k+1} − x*‖ / ‖x_k − x*‖ = 0

Example: We consider a nonlinear equation f(x) = 0 where x ∈ D ⊂ R. Assume
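The induced ∞-norm has a closed form (the maximum absolute row sum). A small numerical check, with a 2×2 matrix chosen purely for illustration:

```python
import numpy as np

A = np.array([[2.0, -1.0], [0.5, 3.0]])

# Closed form for the induced infinity-norm: maximum absolute row sum
# (the rows of |A| sum to 3.0 and 3.5).
norm_inf = max(abs(A).sum(axis=1))

# Compare with the definition sup over the unit sphere ‖x‖∞ = 1.
# For the ∞-norm the sup is attained at a sign vector x ∈ {-1, +1}^n,
# so checking the corners of the unit cube suffices.
candidates = [np.array([sx, sy]) for sx in (-1.0, 1.0) for sy in (-1.0, 1.0)]
sup_est = max(abs(A @ x).max() for x in candidates)

print(norm_inf, sup_est)
```

The sup is attained at a sign vector because each component of Ax is linear in x, so the maximum over the cube occurs at an extreme point.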

- f(x*) = 0
- the derivatives f′ and f″ exist around x* and are continuous
- f′(x*) ≠ 0

Then there exists ε > 0 such that

- for |x_0 − x*| < ε the Newton method converges q-quadratically
- for |x_0 − x*| < ε and |x_1 − x*| < ε the secant method converges with q-order α = (√5 + 1)/2 ≈ 1.618; hence it converges q-superlinearly.

In many cases we cannot prove that the error improves for every single step from k to k + 1, but we still have upper bounds ‖x_k − x*‖ ≤ E_k where E_k converges with a certain q-order to zero. This leads to so-called r-orders of convergence:

- a method converges with r-order 1 or r-linearly if ‖x_k − x*‖ ≤ E_k where E_k converges with q-order 1 to 0
- a method converges with r-order 2 or r-quadratically if ‖x_k − x*‖ ≤ E_k where E_k converges with q-order 2 to 0
- a method converges r-superlinearly if ‖x_k − x*‖ ≤ E_k where E_k converges q-superlinearly to 0

Example: We consider a nonlinear equation f(x) = 0 where

- f is continuous on the interval [a_0, b_0]
- f(a_0) f(b_0) < 0

Then the bisection method gives a sequence of intervals [a_k, b_k]. Let c_k = (a_k + b_k)/2. Then the sequence c_k converges r-linearly to a point x* with f(x*) = 0:

  |c_k − x*| ≤ E_k = (b_0 − a_0) / 2^(k+1)

Note that in general we do not have q-linear convergence of c_k for the bisection method.

Derivatives

Let X and Y denote normed vector spaces. Consider a mapping f from D ⊂ X to Y. We say that f is Fréchet differentiable at a point x_0 ∈ D if there exists a bounded linear mapping A: X → Y such that

  ‖f(x) − f(x_0) − A(x − x_0)‖ / ‖x − x_0‖ → 0  as x → x_0.

Lipschitz conditions

We say a function f satisfies a Lipschitz condition with constant L on D if

  ‖f(x) − f(y)‖ ≤ L ‖x − y‖  for all x, y ∈ D.

Lemma 2. Assume

- D is convex
- f′(x) exists and is continuous on D
- ‖f′(x)‖ ≤ L for x ∈ D

Then f satisfies a Lipschitz condition on D with constant L.
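The bisection bound |c_k − x*| ≤ (b_0 − a_0)/2^(k+1) is easy to check numerically; a minimal sketch, with f(x) = x² − 2 on [1, 2] as an illustrative example:

```python
import math

def bisection(f, a, b, n):
    """Return the midpoints c_0, ..., c_{n-1} produced by the bisection method."""
    assert f(a) * f(b) < 0
    midpoints = []
    for _ in range(n):
        c = (a + b) / 2
        midpoints.append(c)
        if f(a) * f(c) <= 0:
            b = c          # a sign change remains in [a, c]
        else:
            a = c          # a sign change remains in [c, b]
    return midpoints

f = lambda x: x * x - 2.0        # root x* = sqrt(2) in [1, 2]
cs = bisection(f, 1.0, 2.0, 30)
xstar = math.sqrt(2.0)

# r-linear error bound: |c_k - x*| <= (b_0 - a_0) / 2^(k+1)
for k, c in enumerate(cs):
    assert abs(c - xstar) <= (2.0 - 1.0) / 2 ** (k + 1) + 1e-15
```

The individual errors |c_k − x*| need not decrease monotonically, which is exactly why bisection is only r-linear and not q-linear.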

Taylor remainder terms

For a function f: D → R with D ⊂ R we have estimates

  |f(x + h) − f(x)| ≤ (max_{u ∈ conv(x, x+h)} |f′(u)|) |h|   if f ∈ C¹(D)

  |f(x + h) − f(x) − f′(x)h| ≤ (1/2) (max_{u ∈ conv(x, x+h)} |f″(u)|) |h|²   if f ∈ C²(D).

We would like to have analogous estimates for a function f: D → R^n with D ⊂ R^n. Assume that f ∈ C¹(D). Then the derivative exists in the Fréchet sense:

  f(x + h) = f(x) + Df(x) h + R,  ‖R‖ = o(‖h‖)  for h → 0

where Df(x): R^n → R^n is a linear mapping h ↦ Df(x) h. Since linear mappings correspond to matrices in R^(n×n) we use the notation

  f(x) = (f_1(x), ..., f_n(x))^T,  Df(x) = f′(x) = [∂_j f_i(x)]_{i,j=1,...,n}  with ∂_j = ∂/∂x_j.

For y = Df(x) h we have

  y_i = Σ_{j=1}^n ∂_j f_i(x) h_j,  |y_i| ≤ (Σ_{j=1}^n |∂_j f_i(x)|) ‖h‖_∞,

  ‖Df(x)‖_∞ = max_{i=1,...,n} Σ_{j=1}^n |∂_j f_i(x)|.

Assume that D is convex and x, y ∈ D. Let h := y − x and let conv(x, y) ⊂ D denote the line segment between x and y. We have by the fundamental theorem of calculus and the chain rule for

  F(t) := f(x + th),  F′(t) = Df(x + th) h

that

  f(y) − f(x) = F(1) − F(0) = ∫₀¹ F′(t) dt = ∫₀¹ f′(x + th) h dt,

  ‖f(y) − f(x)‖ ≤ ∫₀¹ ‖f′(x + th)‖ ‖h‖ dt ≤ (max_{u ∈ conv(x,y)} ‖f′(u)‖) ‖y − x‖.

This shows that the Lipschitz constant for f in D is bounded by max_{u ∈ D} ‖f′(u)‖. Note that we first choose a vector norm, e.g., ‖v‖_∞ = max_{j=1,...,n} |v_j|. For A ∈ R^(n×n) this induces a matrix norm:

  ‖A‖_∞ = max_{‖v‖_∞=1} ‖Av‖_∞ = max_i Σ_j |A_ij|,  hence ‖f′(x)‖_∞ = max_{‖v‖_∞=1} ‖f′(x)v‖_∞ = max_i Σ_j |∂_j f_i(x)|.

We can read this as an estimate for the Taylor remainder term of order 1:

  f(y) = f(x) + R_1,  ‖R_1‖ ≤ (max_{u ∈ conv(x,y)} ‖f′(u)‖) ‖y − x‖.

Now we want to consider the next Taylor term and remainder: f(y) = f(x) + f′(x)(y − x) + R_2 with

  R_2 = f(y) − f(x) − f′(x)(y − x) = ∫₀¹ [f′(x + t(y − x)) − f′(x)] (y − x) dt.

Assume that f′ satisfies a Lipschitz condition for x, y ∈ D:

  ‖f′(y) − f′(x)‖ ≤ γ ‖y − x‖  for x, y ∈ D,
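The mean value estimate ‖f(y) − f(x)‖ ≤ (max ‖f′‖) ‖y − x‖ can be sanity-checked numerically; a sketch with an illustrative map f of my own choosing and a hand-computed bound L = 2 on D = [−1, 1]²:

```python
import numpy as np

# f : R^2 -> R^2 with Jacobian Df(x) = [[2*x1, 0], [x2, x1]]
def f(x):
    return np.array([x[0] ** 2, x[0] * x[1]])

# On the convex set D = [-1, 1]^2 the induced inf-norm of the Jacobian is
# max(|2*x1|, |x2| + |x1|) <= 2, so L = 2 is a Lipschitz constant for f on D.
L = 2.0

rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.uniform(-1.0, 1.0, size=2)
    y = rng.uniform(-1.0, 1.0, size=2)
    lhs = np.abs(f(y) - f(x)).max()    # ‖f(y) - f(x)‖∞
    rhs = L * np.abs(y - x).max()      # L * ‖y - x‖∞
    assert lhs <= rhs + 1e-12
```

Random sampling of course only illustrates the bound; the proof is the integral estimate above.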

then we obtain

  ‖R_2‖ ≤ ∫₀¹ γ t ‖y − x‖² dt = (γ/2) ‖y − x‖².

Now we assume that f ∈ C²(D): For a fixed u ∈ R^n let G(x) := Df(x) u. Then DG(x) v = D²f(x)⟨u, v⟩ where y = D²f(x)⟨u, v⟩ means

  y_i = Σ_{j,k=1}^n (∂_k ∂_j f_i) u_j v_k,  |y_i| ≤ Σ_{j,k=1}^n |∂_k ∂_j f_i| ‖u‖_∞ ‖v‖_∞,

  ‖D²f(x)‖_∞ = max_{i=1,...,n} Σ_{j,k=1}^n |∂_k ∂_j f_i|.

Now we have for any u ∈ R^n and h = y − x with

  G(t) := Df(x + th) u,  G′(t) = D²f(x + th)⟨u, h⟩

that

  [Df(y) − Df(x)] u = G(1) − G(0) = ∫₀¹ G′(t) dt = ∫₀¹ D²f(x + th)⟨u, h⟩ dt,

  ‖Df(y) − Df(x)‖ ≤ (max_{z ∈ D} ‖D²f(z)‖) ‖y − x‖.

Local convergence results for fixed point iteration

Assume that D is open so that x* ∈ D implies that a ball B_δ(x*) is contained in D.

Theorem 3. Assume g: D → R^n with D ⊂ R^n, and g(x*) = x* for x* ∈ D. If g ∈ C¹(D) and ‖g′(x*)‖ < 1 then the fixed point iteration x_{k+1} := g(x_k) is locally convergent: There exists δ > 0 so that for ‖x_0 − x*‖ < δ there holds lim_{k→∞} x_k = x*.

Proof: Since g′ is continuous there exist δ > 0 and q < 1 so that ‖g′(x)‖ ≤ q for x ∈ B_δ := {x : ‖x − x*‖ < δ}. Assume that x_k ∈ B_δ. Then

  ‖x_{k+1} − x*‖ = ‖g(x_k) − g(x*)‖ ≤ (max_{u ∈ B_δ} ‖g′(u)‖) ‖x_k − x*‖ ≤ q ‖x_k − x*‖.

Hence x_{k+1} ∈ B_δ. Then x_0 ∈ B_δ implies ‖x_k − x*‖ ≤ q^k ‖x_0 − x*‖ and therefore lim_{k→∞} x_k = x*.

Remark: For a chosen vector norm (e.g., ‖·‖_∞) and the induced matrix norm one may have ‖g′(x*)‖ > 1, but for a different vector norm and its induced matrix norm one could still have ‖g′(x*)‖ < 1. The spectral radius ρ(A) := max_j |λ_j(A)| denotes the maximum of the absolute values of the eigenvalues of A and satisfies [see e.g. Stoer-Bulirsch, Theorem (6.9.2)]:

- ρ(A) ≤ ‖A‖ for any choice of vector norm and induced matrix norm
- for any ε > 0 there exists a vector norm so that its induced matrix norm satisfies ‖A‖ ≤ ρ(A) + ε.

Therefore we can replace the condition ‖g′(x*)‖ < 1 by ρ(g′(x*)) < 1 and the conclusion of the theorem still holds.

Theorem: Assume g: D → R^n with D ⊂ R^n, and g(x*) = x* for x* ∈ D. If g′(x*) = 0 and g′ is Lipschitz in D then the fixed point iteration x_{k+1} := g(x_k) is locally convergent of order 2: There exists δ > 0 so that for ‖x_0 − x*‖ < δ there holds lim_{k→∞} x_k = x* and ‖x_{k+1} − x*‖ ≤ C ‖x_k − x*‖².

Proof: Assume that we have ‖g′(y) − g′(x)‖ ≤ γ ‖y − x‖ for x, y ∈ D. Then

  ‖x_{k+1} − x*‖ = ‖g(x_k) − g(x*) − g′(x*)(x_k − x*)‖ ≤ (γ/2) ‖x_k − x*‖².
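The local convergence of fixed point iteration can be illustrated with a contraction of my own choosing (‖g′(x)‖∞ ≤ 1/4 everywhere, so the q-linear estimate holds with q = 1/4 for any starting point):

```python
import numpy as np

def g(x):
    # ‖g'(x)‖∞ = max(0.25*|sin x2|, 0.25*|cos x1|) <= 1/4 < 1 for all x,
    # so g is a contraction in the ∞-norm (illustrative map, not from the notes)
    return np.array([0.25 * np.cos(x[1]), 0.25 * np.sin(x[0]) + 0.5])

# find the fixed point to machine precision by iterating long enough
xstar = np.zeros(2)
for _ in range(80):
    xstar = g(xstar)

# q-linear convergence: ‖x_{k+1} - x*‖ <= q * ‖x_k - x*‖ with q = 1/4
x = np.array([1.0, -1.0])
err = np.abs(x - xstar).max()
for _ in range(15):
    x = g(x)
    new_err = np.abs(x - xstar).max()
    assert new_err <= 0.25 * err + 1e-15
    err = new_err
```

Because ‖g′‖∞ ≤ 1/4 holds on all of R², the contraction here is global; the theorem only needs the bound on a ball around x*.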

Newton method

We have local convergence of order 2 as long as f′(x*) is a nonsingular matrix:

Theorem 4. Assume that

- f(x*) = 0
- f′ exists and satisfies a Lipschitz condition in a neighborhood of x*
- f′(x*) is nonsingular

Then there exists δ > 0 such that for an initial guess x_0 with ‖x_0 − x*‖ ≤ δ:

- the Newton iterates x_k exist for k = 1, 2, 3, ... since f′(x_{k−1}) is nonsingular
- lim_{k→∞} x_k = x*
- the convergence is of q-order 2.

Note that we have with d_k := x_{k+1} − x_k = −f′(x_k)^(-1) f(x_k) and using f′(x_k) d_k = −f(x_k) that

  f(x_{k+1}) = f(x_{k+1}) − f(x_k) − f′(x_k) d_k = ∫₀¹ [f′(x_k + t d_k) − f′(x_k)] d_k dt

implying

  ‖f(x_{k+1})‖ ≤ ∫₀¹ ‖f′(x_k + t d_k) − f′(x_k)‖ ‖d_k‖ dt ≤ ∫₀¹ L t ‖d_k‖² dt = (L/2) ‖d_k‖².

Using d_k = −f′(x_k)^(-1) f(x_k) we can also write

  f(x_{k+1}) = −∫₀¹ [f′(x_k + t d_k) − f′(x_k)] f′(x_k)^(-1) f(x_k) dt = −∫₀¹ [f′(x_k + t d_k) f′(x_k)^(-1) − I] f(x_k) dt.

Newton-Kantorovich theorem

The Newton-Kantorovich theorem does not assume that a solution x* with f(x*) = 0 exists.

Theorem 5. Assume

- ‖f(x_0)‖ ≤ C_0 and ‖f′(x_0)^(-1)‖ ≤ C_1
- f′(x) is Lipschitz with constant L for ‖x − x_0‖ ≤ ρ

where

  h := C_0 C_1² L ≤ 1/2,  ρ := (1 − √(1 − 2h)) / (C_1 L).

Then

- there exists a unique solution x* ∈ B̄_ρ(x_0) with f(x*) = 0
- the Newton iterates x_k are well-defined and converge to x*
- the convergence is r-quadratic for h < 1/2, and r-linear for h = 1/2.
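A sketch of the Newton iteration for a system in R²; the test system, its root x* = (1, 1), and the starting point are my own illustrative choices:

```python
import numpy as np

# illustrative system: f(x) = 0 has the root x* = (1, 1),
# and f'(x*) = [[2, 2], [1, -1]] is nonsingular
def f(x):
    return np.array([x[0] ** 2 + x[1] ** 2 - 2.0, x[0] - x[1]])

def Df(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])

x = np.array([2.0, 0.5])                # starting point near the root
errs = []
for _ in range(8):
    d = np.linalg.solve(Df(x), -f(x))   # Newton step: f'(x_k) d_k = -f(x_k)
    x = x + d
    errs.append(np.abs(x - 1.0).max())  # ‖x_k - x*‖∞ with x* = (1, 1)

# q-quadratic decay: each error is roughly the square of the previous one
print(errs[:4])
```

Note that the step is computed by solving the linear system f′(x_k) d_k = −f(x_k) rather than by forming f′(x_k)^(-1) explicitly, which is the usual practice.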

In preparation for the proof we consider the Newton method for the quadratic equation f(t) = t² − 2t + 2h_0 = 0.

(Figure: the Newton iterates t_0 < t_1 < t_2 < t_3 < ... increase monotonically towards the roots t_− ≤ t_+.)

Lemma 6. Assume ω_0 > 0 and h_0 ≤ 1/2 and define ω_k, h_k for k = 1, 2, ... by

  ω_{k+1} := ω_k / (1 − h_k),  h_{k+1} := (1/2) h_k² / (1 − h_k)².   (1)

Then

  Σ_{k=0}^∞ h_k/ω_k = (1 − √(1 − 2h_0)) / ω_0

and the sum converges q-quadratically for h_0 < 1/2, and q-linearly for h_0 = 1/2.

Proof: It is sufficient to consider the case ω_0 = 1. Consider the quadratic equation f(t) = t² − 2t + 2h_0 = 0 which has the two roots t_± = 1 ± √(1 − 2h_0) with 0 < t_− < 1 < t_+ for h_0 < 1/2 and 0 < t_− = t_+ = 1 for h_0 = 1/2. We define the sequence t_k using the Newton method with t_0 = 0. We have t_k < t_{k+1} < t_− for all k; this implies that lim_{k→∞} t_k = t_−. In the case of h_0 < 1/2 we have quadratic convergence, in the case of h_0 = 1/2 we have linear convergence.

Note that for a quadratic function f(t) = t² + a_1 t + a_0 and its tangent line p(t) at t = c we have f(t) − p(t) = (t − c)² (since the difference must be a quadratic function with leading coefficient 1 and minimum at t = c). Hence for c = t_{k−1} we have by the definition of the Newton method p(t_k) = 0 and (t_k − t_{k−1})² = f(t_k) − p(t_k) = f(t_k) for all k, so that

  t_{k+1} − t_k = −f(t_k)/f′(t_k) = −(t_k − t_{k−1})² / (2t_k − 2) = (t_k − t_{k−1})² / (2(1 − t_k)).   (2)

We now want to show that for k = 0, 1, 2, ... we have

  t_{k+1} − t_k = ω_k^(-1) h_k,  1 − t_k = ω_k^(-1).   (3)

We use induction: For k = 0 we have t_0 = 0, hence 1 − t_0 = 1 = ω_0^(-1) as ω_0 = 1, and

  t_1 − t_0 = −f(0)/f′(0) = −2h_0/(−2) = h_0 = ω_0^(-1) h_0.
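The scalar model iteration of the proof is easy to reproduce; a sketch with the illustrative value h_0 = 0.3:

```python
import math

h0 = 0.3                                  # any 0 < h0 < 1/2
f = lambda t: t * t - 2.0 * t + 2.0 * h0
df = lambda t: 2.0 * t - 2.0

t = 0.0                                   # Newton iteration with t_0 = 0
for _ in range(8):
    t = t - f(t) / df(t)                  # iterates increase monotonically

# the limit is the smaller root t_- = 1 - sqrt(1 - 2*h0)
t_minus = 1.0 - math.sqrt(1.0 - 2.0 * h0)
print(t, t_minus)
```

For h0 close to 1/2 the two roots nearly coincide and the convergence visibly degrades towards the linear rate, matching the h_0 = 1/2 case of the lemma.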

Induction step: Assume that (3) holds for k. We have

  1 − t_{k+1} = (1 − t_k) − (t_{k+1} − t_k) = ω_k^(-1) − ω_k^(-1) h_k = ω_k^(-1)(1 − h_k) = ω_{k+1}^(-1)

which is the second equation in (3) with k + 1 instead of k. We get from (2) that

  t_{k+2} − t_{k+1} = (t_{k+1} − t_k)² / (2(1 − t_{k+1})) = (1/2) (h_k/ω_k)² ω_{k+1} = (1/2) ω_{k+1}^(-1) h_k²/(1 − h_k)² = ω_{k+1}^(-1) h_{k+1}

which is the first equation in (3) with k + 1 instead of k. Note that

  Σ_{k=0}^N h_k/ω_k = Σ_{k=0}^N (t_{k+1} − t_k) = t_{N+1} − t_0 → t_− = 1 − √(1 − 2h_0)  as N → ∞.

We use the following notation for open and closed balls:

  B_r(x_0) := {x : ‖x − x_0‖ < r},  B̄_r(x_0) := {x : ‖x − x_0‖ ≤ r}.

The Newton method uses the iteration x_{k+1} = x_k + d_k with d_k := −f′(x_k)^(-1) f(x_k).

Assumption: There are constants α, ω > 0 such that h_0 := αω ≤ 1/2 and

  ‖f′(x_0)^(-1) f(x_0)‖ ≤ α   (4)
  ‖f′(x_0)^(-1) (f′(y) − f′(x))‖ ≤ ω ‖y − x‖  for x, y ∈ D   (5)

Then, with ρ := (1 − √(1 − 2h_0))/ω and assuming B̄_ρ(x_0) ⊂ D:

- there exists a unique solution x* ∈ B̄_ρ(x_0) with f(x*) = 0
- the Newton iterates x_k are well-defined and converge to x*
- the convergence is r-quadratic for h_0 < 1/2, and r-linear for h_0 = 1/2.

Induction: We claim that for k = 0, 1, 2, ... (with ω_0 := ω):

  (A_k):  ω_k ‖d_k‖ ≤ h_k   (6)
  (B_k):  ‖f′(x_k)^(-1) (f′(y) − f′(x))‖ ≤ ω_k ‖y − x‖  for x, y ∈ D   (7)
  x_k ∈ B̄_ρ(x_0)   (8)

where

  ω_{k+1} := ω_k/(1 − h_k),  h_{k+1} := (1/2) h_k²/(1 − h_k)².   (9)

Proof: For k = 0 the statements (6), (7) are just the assumptions (4), (5). Induction step: Assume that (6), (7), (8) hold for k; we have to prove the corresponding statements for k + 1. We first prove (8): From (A_j) for j ≤ k and the Lemma we obtain that

  ‖x_{k+1} − x_0‖ ≤ Σ_{j=0}^k ‖d_j‖ ≤ Σ_{j=0}^k h_j/ω_j ≤ ρ.   (10)

We then prove (7) for k + 1: Let M := f′(x_k)^(-1) f′(x_{k+1}); then f′(x_{k+1})^(-1) = M^(-1) f′(x_k)^(-1) and

  ‖f′(x_{k+1})^(-1) (f′(y) − f′(x))‖ ≤ ‖M^(-1)‖ ‖f′(x_k)^(-1) (f′(y) − f′(x))‖ ≤ ‖M^(-1)‖ ω_k ‖y − x‖ ≤ ω_k/(1 − h_k) ‖y − x‖ = ω_{k+1} ‖y − x‖

using (B_k), since

  M = I + E,  E := f′(x_k)^(-1) (f′(x_{k+1}) − f′(x_k)),  ‖E‖ ≤ ω_k ‖d_k‖ ≤ h_k

by (B_k) and (A_k), and hence by Lemma 1

  ‖M^(-1)‖ = ‖(I + E)^(-1)‖ ≤ 1/(1 − ‖E‖) ≤ 1/(1 − h_k).

We now prove (6) for k + 1: With x_{k+1} = x_k + d_k and f(x_k) = −f′(x_k) d_k we have

  d_{k+1} = −f′(x_{k+1})^(-1) f(x_{k+1}) = −f′(x_{k+1})^(-1) [f(x_{k+1}) − f(x_k) − f′(x_k) d_k] = −∫₀¹ f′(x_{k+1})^(-1) [f′(x_k + t d_k) − f′(x_k)] d_k dt

so that by (B_{k+1})

  ‖d_{k+1}‖ ≤ ∫₀¹ ω_{k+1} t ‖d_k‖ ‖d_k‖ dt = (1/2) ω_{k+1} ‖d_k‖² ≤ (1/2) ω_{k+1} h_k²/ω_k²,

  ω_{k+1} ‖d_{k+1}‖ ≤ (1/2) (ω_{k+1}/ω_k)² h_k² = (1/2) h_k²/(1 − h_k)² = h_{k+1}.

This concludes the induction proof. We now obtain from (10) that (x_k) is a Cauchy sequence and therefore has a limit x* ∈ B̄_ρ(x_0) ⊂ D. By taking the limit in f′(x_k)(x_{k+1} − x_k) = −f(x_k) we obtain by the continuity of f and f′ that f(x*) = 0. We skip the proof of the uniqueness.
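The constants in (4), (5) can be computed by hand for a scalar example (my own choice, not from the lecture); the resulting ball B̄_ρ(x_0) indeed contains a root:

```python
import math

# scalar illustration of the assumptions (4), (5): f(x) = x^2 - 2, x0 = 1.5
f = lambda x: x * x - 2.0
df = lambda x: 2.0 * x

x0 = 1.5
alpha = abs(f(x0) / df(x0))     # (4): |f'(x0)^{-1} f(x0)| = 1/12
omega = 2.0 / abs(df(x0))       # (5): |f'(x0)^{-1}(f'(y) - f'(x))| = (2/3)|y - x|
h0 = alpha * omega              # = 1/18 <= 1/2, so Newton converges from x0
rho = (1.0 - math.sqrt(1.0 - 2.0 * h0)) / omega

# the root sqrt(2) lies in the closed ball of radius rho around x0
print(h0, rho, abs(math.sqrt(2.0) - x0))
```

For this particular example ρ works out to exactly 1.5 − √2, so the root sits on the boundary of the closed ball; the theorem's existence statement is sharp here.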