FIXED POINT ITERATIONS

MARKUS GRASMAIR

Date: February 2014.

1. Fixed Point Iteration for Non-linear Equations

Our goal is the solution of an equation

    (1)    F(x) = 0,

where F: ℝ^n → ℝ^n is a continuous vector valued mapping in n variables. Since the target space is the same as the domain of F, one can equivalently rewrite this as

    x = x + F(x).

More generally, if G: ℝ^n → ℝ^(n×n) is any matrix valued function such that G(x) is invertible for every x ∈ ℝ^n, then (1) is equivalent to the equation

    G(x)F(x) = 0,

which in turn is equivalent to

    x = x + G(x)F(x).

Setting Φ(x) := x + G(x)F(x), it follows that x solves (1) if and only if x solves Φ(x) = x; that is, x is a fixed point of Φ.

A simple idea for the solution of fixed point equations is that of fixed point iteration. Starting with any point x^(0) ∈ ℝ^n (which, preferably, should be an approximation of a solution of (1)), one defines the sequence (x^(k))_{k∈ℕ} ⊂ ℝ^n by

    x^(k+1) := Φ(x^(k)).

The rationale behind this definition is that the sequence becomes stationary after some index k if (and only if) x^(k) is a fixed point of Φ. Moreover, one can expect x^(k) to be close to a fixed point if x^(k) ≈ x^(k+1). In the following we will investigate conditions that guarantee that this iteration actually works. To that end, we will have to study (briefly) the concept of matrix norms.
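In code, the iteration itself is only a few lines. The following Python sketch is an addition to these notes, not part of the original; the function name, the tolerance, and the stopping rule based on the distance between successive iterates are illustrative choices, and NumPy is assumed to be available.

    import numpy as np

    def fixed_point_iteration(Phi, x0, tol=1e-10, max_iter=200):
        # Iterate x^(k+1) = Phi(x^(k)) starting from x0, stopping as soon as
        # two successive iterates are within tol of each other.
        x = np.asarray(x0, dtype=float)
        for k in range(max_iter):
            x_new = Phi(x)
            if np.linalg.norm(x_new - x) <= tol:
                return x_new, k + 1
            x = x_new
        return x, max_iter

    # Example: F(x) = x - cos(x) = 0 with G(x) = -1, so Phi(x) = x - F(x) = cos(x).
    root, iters = fixed_point_iteration(np.cos, [1.0])
    print(root, iters)   # root is approximately 0.7390851, the fixed point of cos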

2. Matrix Norms

Recall that a vector norm on ℝ^n is a mapping ‖·‖: ℝ^n → ℝ satisfying the following conditions:

    ‖x‖ > 0 for x ≠ 0.
    ‖λx‖ = |λ| ‖x‖ for x ∈ ℝ^n and λ ∈ ℝ.
    ‖x + y‖ ≤ ‖x‖ + ‖y‖ for all x, y ∈ ℝ^n.

Since the space ℝ^(n×n) of all matrices is also a vector space, it is also possible to consider norms there. In contrast to usual vectors, it is, however, also possible to multiply matrices (that is, the matrices form not only a vector space but also an algebra). One now calls a norm ‖·‖ on ℝ^(n×n) a matrix norm if the norm behaves well under multiplication of matrices. More precisely, we require the norm to satisfy, in addition, the condition

    ‖AB‖ ≤ ‖A‖ ‖B‖ for all A, B ∈ ℝ^(n×n).

This condition is called the sub-multiplicativity of the norm.

Assume now that we are given a norm ‖·‖ on ℝ^n. Then this norm induces a norm on ℝ^(n×n) by means of the formula

    ‖A‖ := sup { ‖Ax‖ : ‖x‖ ≤ 1 }.

That is, the norm measures the maximal elongation of a vector x when it is multiplied by the matrix A. Such a norm is interchangeably referred to as either the matrix norm induced by ‖·‖ or the matrix norm subordinate to ‖·‖. One can show that a subordinate matrix norm always has the following properties:

(1) We always have ‖Id‖ = 1.
(2) A subordinate matrix norm is always sub-multiplicative.
(3) If A is invertible, then ‖A‖ ‖A^(-1)‖ ≥ 1.
(4) For every A ∈ ℝ^(n×n) and x ∈ ℝ^n we have the inequality

    ‖Ax‖ ≤ ‖A‖ ‖x‖.

Even more, one can write ‖A‖ as

    ‖A‖ = inf { C > 0 : ‖Ax‖ ≤ C ‖x‖ for all x ∈ ℝ^n }.

Put differently, ‖A‖ is the minimal Lipschitz constant of the linear mapping A: ℝ^n → ℝ^n.

Finally, we briefly discuss the most important subordinate matrix norms:

The 1-norm is induced by the 1-norm on ℝ^n, defined by

    ‖x‖_1 := ∑_{i=1}^n |x_i|.

The induced matrix norm has the form

    ‖A‖_1 = max_{1≤j≤n} ∑_{i=1}^n |a_ij|.

The ∞-norm is induced by the ∞-norm on ℝ^n, defined by

    ‖x‖_∞ := max_{1≤i≤n} |x_i|.

The induced matrix norm has the form

    ‖A‖_∞ = max_{1≤i≤n} ∑_{j=1}^n |a_ij|.

The 2-norm or spectral norm is induced by the Euclidean norm on ℝ^n, defined by

    ‖x‖_2 := ( ∑_{i=1}^n |x_i|^2 )^(1/2).

For the definition of the induced matrix norm, we have to recall that the singular values of a matrix A ∈ ℝ^(n×n) are precisely the square roots of the eigenvalues of the (symmetric and positive semi-definite) matrix A^T A. That is, σ ≥ 0 is a singular value of A if and only if σ^2 is an eigenvalue of A^T A. One then has

    ‖A‖_2 = max { σ ≥ 0 : σ is a singular value of A }.
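These three formulas are easy to check numerically. The following sketch is an illustration added here (assuming NumPy): it evaluates the column-sum, row-sum, and largest-singular-value expressions directly and compares them with NumPy's built-in induced norms.

    import numpy as np

    A = np.array([[1.0, -2.0], [3.0, 4.0]])

    norm_1 = np.abs(A).sum(axis=0).max()                  # maximal column sum
    norm_inf = np.abs(A).sum(axis=1).max()                # maximal row sum
    norm_2 = np.linalg.svd(A, compute_uv=False).max()     # largest singular value

    # NumPy's built-in subordinate norms agree with the formulas above.
    assert np.isclose(norm_1, np.linalg.norm(A, 1))
    assert np.isclose(norm_inf, np.linalg.norm(A, np.inf))
    assert np.isclose(norm_2, np.linalg.norm(A, 2))
    print(norm_1, norm_inf, norm_2)   # 6.0, 7.0, about 5.117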

Remark 1. Another quite common matrix norm is the Frobenius norm

    ‖A‖_F := ( ∑_{i,j=1}^n |a_ij|^2 )^(1/2).

One can show that this is indeed a matrix norm (that is, it is sub-multiplicative), but it is not induced by any norm on ℝ^n unless n = 1. This can easily be seen from the fact that ‖Id‖_F = √n, which is different from 1 for n > 1.

3. Main Theorems

Recall that a fixed point of a mapping Φ: ℝ^n → ℝ^n is a solution of the equation Φ(x) = x. In other words, if we apply the mapping Φ to a fixed point, then we obtain as a result the same point.

Theorem 2 (Banach's fixed-point theorem). Assume that K ⊂ ℝ^n is closed and that Φ: K → K is a contraction. That is, there exists 0 ≤ C < 1 such that

    ‖Φ(x) − Φ(y)‖ ≤ C ‖x − y‖ for all x, y ∈ K,

where ‖·‖ is any norm on ℝ^n. Then the function Φ has a unique fixed point x̂ ∈ K. Now let x^(0) ∈ K be arbitrary and define the sequence (x^(k))_{k∈ℕ} ⊂ K by x^(k+1) := Φ(x^(k)). Then we have the estimates

    ‖x^(k+1) − x̂‖ ≤ C ‖x^(k) − x̂‖,

    ‖x^(k) − x̂‖ ≤ C^k/(1 − C) ‖x^(1) − x^(0)‖,

    ‖x^(k) − x̂‖ ≤ C/(1 − C) ‖x^(k) − x^(k−1)‖.

In particular, the sequence (x^(k))_{k∈ℕ} converges to x̂.

Proof. Assume that x̂, ŷ are two fixed points of Φ in K. Then

    ‖x̂ − ŷ‖ = ‖Φ(x̂) − Φ(ŷ)‖ ≤ C ‖x̂ − ŷ‖.

Since 0 ≤ C < 1, it follows that x̂ = ŷ. Thus the fixed point (if it exists) is unique.

Let now x^(0) ∈ K be arbitrary and define inductively x^(k+1) = Φ(x^(k)). Then we have for all k ∈ ℕ and m ∈ ℕ the estimate

    (2)    ‖x^(k+m) − x^(k)‖ ≤ ∑_{j=1}^m ‖x^(k+j) − x^(k+j−1)‖
                             ≤ ∑_{j=1}^m C^(j−1) ‖x^(k+1) − x^(k)‖
                             ≤ ∑_{j=1}^∞ C^(j−1) ‖x^(k+1) − x^(k)‖
                             = 1/(1 − C) ‖x^(k+1) − x^(k)‖
                             ≤ C/(1 − C) ‖x^(k) − x^(k−1)‖
                             ≤ C^k/(1 − C) ‖x^(1) − x^(0)‖.

This shows that the sequence (x^(k))_{k∈ℕ} is a Cauchy sequence and thus has a limit x̂ ∈ ℝ^n. Since the set K is closed, we further obtain that x̂ ∈ K. Now note that the mapping Φ is by definition Lipschitz continuous with Lipschitz constant C. In particular, this shows that Φ is continuous. Thus

    x̂ = lim_{k→∞} x^(k) = lim_{k→∞} x^(k+1) = lim_{k→∞} Φ(x^(k)) = Φ(lim_{k→∞} x^(k)) = Φ(x̂),

which shows that x̂ is a fixed point of Φ. The different estimates claimed in the theorem now follow easily from the fact that Φ is a contraction and from (2) by considering the limit m → ∞ (note that the norm is a continuous mapping, and thus the left hand side of said estimate tends to ‖x̂ − x^(k)‖).
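As a small numerical illustration of these error estimates (an addition to the notes, assuming NumPy), consider Φ(x) = cos(x) on K = [0, 1]: Φ maps K into K and, by the mean value theorem, is a contraction with constant C = sin(1) ≈ 0.84, since |Φ'(x)| = |sin x| ≤ sin(1) on K. The script prints the true error together with the bound C^k/(1 − C) |x^(1) − x^(0)| and the bound C/(1 − C) |x^(k) − x^(k−1)|.

    import numpy as np

    C = np.sin(1.0)                   # contraction constant of cos on [0, 1]
    x_hat = 0.7390851332151607        # reference fixed point, cos(x_hat) = x_hat

    x = 1.0                           # x^(0)
    first_step = None
    for k in range(1, 11):
        x_prev, x = x, np.cos(x)
        if first_step is None:
            first_step = abs(x - x_prev)              # |x^(1) - x^(0)|
        err = abs(x - x_hat)                          # true error |x^(k) - x_hat|
        bound_initial = C**k / (1 - C) * first_step   # C^k/(1-C) |x^(1) - x^(0)|
        bound_last = C / (1 - C) * abs(x - x_prev)    # C/(1-C) |x^(k) - x^(k-1)|
        print(k, err, bound_initial, bound_last)
    # Both bounds stay above the true error, as Theorem 2 guarantees.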

Theorem 3. Assume that Φ: ℝ^n → ℝ^n is continuously differentiable in a neighborhood of a fixed point x̂ of Φ and that there exists a norm ‖·‖ on ℝ^n with subordinate matrix norm ‖·‖ on ℝ^(n×n) such that ‖DΦ(x̂)‖ < 1. Then there exists a closed neighborhood K of x̂ such that Φ is a contraction on K. In particular, the fixed point iteration x^(k+1) = Φ(x^(k)) converges for every x^(0) ∈ K to x̂.

Proof. Because of the continuous differentiability of Φ there exist ρ > 0 and a constant 0 < C < 1 such that ‖DΦ(x)‖ ≤ C for all x ∈ B_ρ(x̂) := { y ∈ ℝ^n : ‖y − x̂‖ ≤ ρ }. Let now x, y ∈ B_ρ(x̂). Then

    Φ(y) − Φ(x) = ∫_0^1 DΦ(x + t(y − x)) (y − x) dt.

Thus

    ‖Φ(y) − Φ(x)‖ ≤ ∫_0^1 ‖DΦ(x + t(y − x)) (y − x)‖ dt
                  ≤ ∫_0^1 ‖DΦ(x + t(y − x))‖ ‖y − x‖ dt
                  ≤ C ‖y − x‖.

In particular, we obtain with y = x̂ the inequality

    ‖x̂ − Φ(x)‖ = ‖Φ(x̂) − Φ(x)‖ ≤ C ‖x̂ − x‖ ≤ Cρ < ρ.

Thus whenever x ∈ B_ρ(x̂) we also have Φ(x) ∈ B_ρ(x̂). This shows that, indeed, Φ is a contraction on B_ρ(x̂). Application of Banach's fixed point theorem yields the final claim of the theorem.

Remark 4. Assume that A ∈ ℝ^(n×n) and define the spectral radius of A (which is usually different from the spectral norm of A) by

    ρ(A) := max { |λ| : λ ∈ ℂ is an eigenvalue of A }.

That is, ρ(A) is the largest absolute value of an eigenvalue of A. Then every subordinate matrix norm ‖·‖ on ℝ^(n×n) satisfies the inequality

    ρ(A) ≤ ‖A‖.

In case a largest eigenvalue is real, this easily follows from the fact that, whenever λ is an eigenvalue with eigenvector x, we have

    |λ| ‖x‖ = ‖λx‖ = ‖Ax‖ ≤ ‖A‖ ‖x‖.

Division by ‖x‖ yields the desired estimate. Conversely, it is also possible to show that for every ε > 0 there exists a norm ‖·‖_ε on ℝ^n such that the induced matrix norm ‖·‖_ε on ℝ^(n×n) satisfies

    ‖A‖_ε ≤ ρ(A) + ε

(this is a rather tedious construction involving real versions of the Jordan normal form of A). Together, these results imply that there exists a subordinate matrix norm ‖·‖ on ℝ^(n×n) with ‖A‖ < 1 if and only if ρ(A) < 1. Thus, the main condition of the previous theorem could also be equivalently reformulated as

    ρ(DΦ(x̂)) < 1.
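The relation between the spectral radius and subordinate norms is easy to probe numerically. In the sketch below (an addition, assuming NumPy; the matrix is an arbitrary example), the spectral radius is 0.9 although the induced 1-, 2-, and ∞-norms all exceed 1; by the remark above there nevertheless exists some subordinate norm in which the matrix has norm below 1, so a fixed point iteration whose Jacobian at the fixed point equals this matrix converges locally.

    import numpy as np

    A = np.array([[0.9, 0.5],
                  [0.0, 0.8]])

    rho = np.abs(np.linalg.eigvals(A)).max()              # spectral radius rho(A)
    norms = {p: np.linalg.norm(A, p) for p in (1, 2, np.inf)}

    print(rho, norms)
    # rho(A) = 0.9 < 1, but ||A||_1 = 1.3, ||A||_2 ~ 1.14, ||A||_inf = 1.4;
    # every subordinate norm is at least rho(A), and some subordinate norm
    # comes within epsilon of it.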

4. Consequences for the Solution of Equations

Assume for simplicity that n = 1. That is, we are only trying to solve a single equation in one variable. The fixed point function will then have the form

    Φ(x) = x + g(x) f(x),

and g should be different from zero everywhere, or at least near the sought-for solution. Assume now that f(x̂) = 0. Then Theorem 3 states that the fixed point iteration will converge to x̂ for all starting values x^(0) sufficiently close to x̂, if |Φ'(x̂)| < 1 (note that on ℝ the only norm satisfying ‖1‖ = 1 is the absolute value). Now, if f and g are sufficiently smooth, then the equation f(x̂) = 0 implies that

    Φ'(x̂) = 1 + g'(x̂) f(x̂) + g(x̂) f'(x̂) = 1 + g(x̂) f'(x̂).

In particular, |Φ'(x̂)| < 1 if and only if the following conditions hold:

(1) f'(x̂) ≠ 0.
(2) sgn(g(x̂)) = −sgn(f'(x̂)).
(3) |g(x̂)| < 2/|f'(x̂)|.

This provides sufficient conditions for the convergence of the above fixed point iteration in one dimension. In the case of higher dimensions with fixed point function Φ(x) = x + G(x)F(x) with G: ℝ^n → ℝ^(n×n), we similarly obtain (though with more complicated calculations)

    DΦ(x̂) = Id + G(x̂) DF(x̂).

Consequently, said fixed point iteration will converge locally if the matrix DF(x̂) is invertible and the matrix G(x̂) is sufficiently close to −DF(x̂)^(-1).

5. The Theory Behind Newton's Method

Newton's method (in one dimension) has the form of the fixed point iteration

    x^(k+1) := Φ(x^(k)) := x^(k) − f(x^(k)) / f'(x^(k)).

It is therefore of the form discussed above with

    g(x) := −1 / f'(x).

In particular, this shows that it converges locally to a solution x̂ of the equation f(x) = 0 provided that f'(x̂) ≠ 0. When actually performing the iterations, however, it turns out that the method works too well: While Banach's fixed point theorem predicts a linear convergence of the iterates to the limit x̂, the actual convergence of the iterates is much faster.
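This behaviour is easy to observe in practice. The following sketch is an addition to the notes; the test function f(x) = x^2 − 2 and the starting point are arbitrary choices. It runs Newton's method and prints the error in each step, and the error is roughly squared from one iteration to the next.

    import math

    # Newton's method for f(x) = x^2 - 2, whose positive root is sqrt(2).
    f = lambda x: x * x - 2.0
    df = lambda x: 2.0 * x

    x = 2.0                      # x^(0), reasonably close to the root
    root = math.sqrt(2.0)
    for k in range(6):
        x = x - f(x) / df(x)     # x^(k+1) = x^(k) - f(x^(k)) / f'(x^(k))
        print(k + 1, abs(x - root))
    # errors: ~8.6e-2, ~2.5e-3, ~2.1e-6, ~1.6e-12, then machine precision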

Definition 5. We say that a sequence (x^(k))_{k∈ℕ} ⊂ ℝ^n converges to x̂ with convergence order (or: convergence speed) p ≥ 1, if x^(k) → x̂ and there exists 0 < C < +∞ (0 < C < 1 in case p = 1) such that

    ‖x^(k+1) − x̂‖ ≤ C ‖x^(k) − x̂‖^p.

In the particular case of p = 1 we speak of linear convergence. The rough interpretation is that in each step the number of correct digits increases by a constant number. For p = 2 we speak of quadratic convergence. Here the number of correct digits roughly doubles in each step of the iteration. In general, numerical methods with higher order are (all else being equal) preferable to methods with lower order, if one aims for (reasonably) high accuracy.

Theorem 6. Assume that Φ: ℝ → ℝ is p-times continuously differentiable with p ≥ 2 in a neighborhood of a fixed point x̂ of Φ. Assume moreover that

    0 = Φ'(x̂) = Φ''(x̂) = ... = Φ^(p−1)(x̂).

Then the fixed point sequence x^(k+1) = Φ(x^(k)) converges to x̂ with order (at least) p, provided that the starting point x^(0) is sufficiently close to x̂.

Proof. First note that Theorem 3 shows that the fixed point sequence indeed converges to x̂ for suitable starting points x^(0). A Taylor expansion of Φ at the fixed point x̂ now shows that

    x^(k+1) − x̂ = Φ(x^(k)) − Φ(x̂) = ∑_{s=1}^{p−1} Φ^(s)(x̂)/s! (x^(k) − x̂)^s + Φ^(p)(ξ)/p! (x^(k) − x̂)^p

for some ξ between x̂ and x^(k). Since Φ^(s)(x̂) = 0 for 1 ≤ s ≤ p − 1, this further implies that

    x^(k+1) − x̂ = Φ^(p)(ξ)/p! (x^(k) − x̂)^p.

Because Φ^(p) is continuous, there exists C > 0 such that

    |Φ^(p)(ξ)|/p! ≤ C

for ξ sufficiently close to x̂. Thus

    |x^(k+1) − x̂| ≤ C |x^(k) − x̂|^p,

and therefore the convergence order of the sequence (x^(k))_{k∈ℕ} is (at least) p.

Remark 7. It is possible to show conversely that the convergence order is precisely p, if in addition to the conditions of Theorem 6 the function Φ is (p + 1)-times differentiable near x̂ and Φ^(p)(x̂) ≠ 0.

Corollary 8. Newton's method converges at least quadratically, provided that f'(x̂) ≠ 0 and f is twice differentiable.

Remark 9. In higher dimensions, Newton's method is defined by the iteration

    x^(k+1) = x^(k) + h^(k),

where h^(k) solves the linear system

    DF(x^(k)) h = −F(x^(k)).

Equivalently,

    x^(k+1) = x^(k) − DF(x^(k))^(-1) F(x^(k)).

As in the one-dimensional case, one can show that this method converges (at least) quadratically, provided that F is twice differentiable and DF(x̂) is an invertible matrix.
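A minimal sketch of this iteration in Python follows (an addition to the notes; the function name, the tolerance, and the test system are illustrative). It solves the linear system with numpy.linalg.solve in each step rather than forming DF(x^(k))^(-1) explicitly.

    import numpy as np

    def newton(F, DF, x0, tol=1e-12, max_iter=20):
        # Newton's method in R^n: solve DF(x^(k)) h = -F(x^(k)), then set
        # x^(k+1) = x^(k) + h. Stops once the update h is smaller than tol.
        x = np.asarray(x0, dtype=float)
        for k in range(max_iter):
            h = np.linalg.solve(DF(x), -F(x))
            x = x + h
            if np.linalg.norm(h) <= tol:
                break
        return x

    # Example system: F(x, y) = (x^2 + y^2 - 1, x - y), root (1/sqrt(2), 1/sqrt(2)).
    F = lambda v: np.array([v[0]**2 + v[1]**2 - 1.0, v[0] - v[1]])
    DF = lambda v: np.array([[2.0 * v[0], 2.0 * v[1]], [1.0, -1.0]])

    print(newton(F, DF, x0=[1.0, 0.5]))   # approximately [0.70710678, 0.70710678]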

Department of Mathematics, Norwegian University of Science and Technology, 7491 Trondheim, Norway

E-mail address: markus.grasmair@math.ntnu.no