Iterative Methods for Linear Systems


1. Introduction: Direct solvers versus iterative solvers

In many applications we have to solve a linear system $Ax = b$ with $A \in \mathbb{R}^{n\times n}$ and $b \in \mathbb{R}^n$ given. If $n$ is large the solution of the linear system takes a lot of operations, and standard Gaussian elimination may take too long. But in many cases most entries of the matrix $A$ are zero and $A$ is a so-called sparse matrix. This means each equation only couples very few of the $n$ unknowns $x_1,\dots,x_n$. A typical example are discretizations of partial differential equations; see the next section for an example.

Direct solvers give the exact solution after finitely many operations (if we ignore roundoff errors).

Gaussian elimination with partial pivoting: This gives a decomposition
$$LU = \begin{pmatrix} \text{row } p_1 \text{ of } A \\ \vdots \\ \text{row } p_n \text{ of } A \end{pmatrix}$$
where $L$ is lower triangular, $U$ is upper triangular, and $p_1,\dots,p_n$ is a permutation of the row indices.

Cholesky decomposition: We need that $A$ is symmetric positive definite. This gives a decomposition $A = LL^\top$ where $L$ is lower triangular. The Cholesky decomposition takes about half the number of operations of Gaussian elimination.

Cost for Gaussian elimination and the Cholesky algorithm:
- For full matrices, finding the decomposition takes $Cn^3$ operations. Once we have the decomposition, solving the linear system for a given vector $b$ takes $Cn^2$ operations.
- For band matrices with bandwidth $m$, i.e., $A_{ij} = 0$ for $|i-j| > m$: finding the decomposition takes $Cm^2 n$ operations, solving a linear system then takes $Cmn$ operations.

In Matlab we should initialize the matrix $A$ as a sparse matrix structure. Then Matlab will only use storage and operations for the nonzero elements of $L$, $U$. For a matrix $A$ with bandwidth $m$ the factors $L$, $U$ will also have bandwidth $m$. For a general sparse matrix $A$, the factors $L$, $U$ will usually have additional nonzero elements at locations where $A$ had zero elements. This is called fill-in, and it increases the number of operations.

Reordering: If you use the Matlab command x=A\b (where the matrix has sparse array type) then Matlab will try to renumber the unknowns in such a way that the amount of fill-in is minimized. This can substantially reduce the number of operations. The command spparms('spumoni',1) makes Matlab print out details about the algorithms used for each following \ command. There are also versions of the lu and chol commands that use reordering to minimize fill-in.
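As a small illustration of these Matlab remarks (a sketch only; the matrix, its size, and the right-hand side are made-up examples), the following snippet builds a sparse banded matrix and lets the backslash operator exploit its structure:

    % Sketch: sparse banded system solved with backslash.
    % The matrix, its size and the right-hand side are illustrative.
    N = 1000; h = 1/N;
    e = ones(N-1,1);
    A = spdiags([-e 2*e -e], -1:1, N-1, N-1) / h^2;   % sparse tridiagonal matrix
    b = ones(N-1,1);
    spparms('spumoni',1)   % print details about the sparse algorithms used by \
    u = A \ b;             % sparse factorization with fill-reducing reordering
    spparms('spumoni',0)   % switch the diagnostic output off again
    % For comparison, full(A)\b ignores the band structure and costs O(N^3).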

2. Convection-diffusion problem in $\mathbb{R}^d$

We consider a typical application problem which leads to a large sparse linear system. Equilibrium problems for elastic deformations or heat transfer lead to elliptic differential equations.

Boundary value problem

In the convection-diffusion problem we have a domain $\Omega \subset \mathbb{R}^d$. For $d = 1$ we consider an interval, for $d = 2$ we consider a polygon, for $d = 3$ we consider a polyhedron. We want to find a function $u(x)$ for $x \in \Omega$ such that
$$-\Delta u + b\cdot\nabla u = f \ \text{ in } \Omega, \qquad u = 0 \ \text{ on the boundary } \partial\Omega,$$
where $b \in \mathbb{R}^d$ is a constant vector and $f$ is a given function on $\Omega$. This is called a boundary value problem.

Variational formulation

We first want to find the variational formulation: If we multiply the PDE by a test function $v$ which is zero on the boundary and integrate over $\Omega$, we obtain after using the first Green formula
$$\int_\Omega (-\Delta u)\,v\,dx = -\int_{\partial\Omega} (\partial_n u)\,v\,ds + \int_\Omega \nabla u\cdot\nabla v\,dx,$$
and since the boundary term vanishes,
$$\underbrace{\int_\Omega \bigl[\nabla u\cdot\nabla v + (b\cdot\nabla u)\,v\bigr]\,dx}_{a(u,v)} = \underbrace{\int_\Omega f v\,dx}_{\ell(v)}.$$

We use the Hilbert space
$$V = H^1_0(\Omega) = \Bigl\{u : \int_\Omega \bigl(|\nabla u|^2 + u^2\bigr)\,dx < \infty,\ u = 0 \text{ on } \partial\Omega\Bigr\}$$
with the norm $\|u\|_V^2 = \int_\Omega (|\nabla u|^2 + u^2)\,dx = \|\nabla u\|_{L^2(\Omega)}^2 + \|u\|_{L^2(\Omega)}^2$.

Note that $\int_\Omega (b\cdot\nabla u)\,u\,dx = \int_\Omega (b_1 u_{x_1}u + b_2 u_{x_2}u)\,dx = 0$ (written here for $d = 2$): for $\int_{x_2=\alpha}^{\beta}\int_{x_1=a(x_2)}^{b(x_2)} b_1 u_{x_1}u\,dx_1\,dx_2$ integration by parts gives for the inner integral
$$\int_{x_1=a(x_2)}^{b(x_2)} u_{x_1}u\,dx_1 = -\int_{x_1=a(x_2)}^{b(x_2)} u_{x_1}u\,dx_1$$
since $u$ is zero on the boundary, so this integral vanishes; the $b_2$-term is handled in the same way.

We then obtain that $a(\cdot,\cdot): V \times V \to \mathbb{R}$ is a bilinear form satisfying for all $u, v \in V$
$$|a(u, v)| \le L_a \|u\|_V \|v\|_V$$   (1)
$$a(u, u) \ge \gamma_a \|u\|_V^2$$   (2)
The first inequality follows from the Cauchy-Schwarz inequality. For the second inequality we use $a(u, u) = \int_\Omega |\nabla u|^2\,dx$ and the Poincare inequality $\|u\|_{L^2(\Omega)} \le C\|\nabla u\|_{L^2(\Omega)}$. We obtain that $\ell: V \to \mathbb{R}$ is a linear functional such that $|\ell(v)| \le C_\ell \|v\|_V$.

The variational formulation is: Find $u \in V$ such that
$$\forall v \in V:\quad a(u, v) = \ell(v).$$
By the Lax-Milgram theorem (see Appendix A below) this variational problem has a unique solution $u \in V$.

Finite element discretization

We choose a finite dimensional subspace $V_h \subset V$. For $d = 2$ the domain is a polygon, and we divide it into a mesh of triangles. (For $d = 1$ we divide the interval into subintervals, for $d = 3$ we divide the polyhedron into a mesh of tetrahedra.) Then we define $V_h$ as the space of piecewise linear functions on the mesh which are continuous in $\overline\Omega$ and are zero on the boundary. The discrete problem is: Find $u_h \in V_h$ such that
$$\forall v_h \in V_h:\quad a(u_h, v_h) = \ell(v_h)$$   (3)
Since $V_h \subset V$ the inequalities (1), (2) are satisfied for $u, v \in V_h$. Hence by the Lax-Milgram theorem the discrete problem has a unique solution $u_h$.

We can specify a function $v_h \in V_h$ by specifying the values $v_1,\dots,v_n$ at the interior nodes $x_1,\dots,x_n$ of the mesh. The basis function $\varphi_j$ is the function in $V_h$ with $\varphi_j(x_j) = 1$ and $\varphi_j(x_k) = 0$ for $k \ne j$. We can then write $u_h$ as $u_h = u_1\varphi_1 + \cdots + u_n\varphi_n$ where $u = (u_1,\dots,u_n)^\top \in \mathbb{R}^n$ is the coefficient vector. Now (3) for $v_h = \varphi_1,\dots,\varphi_n$ gives the linear system
$$Au = b, \qquad A_{jk} = a(\varphi_k, \varphi_j), \qquad b_j = \ell(\varphi_j).$$
Therefore the finite element method involves the following steps (a sketch of the assembly step in the simplest case follows below):
- pick a mesh on $\Omega$
- assemble the stiffness matrix $A$ and the right hand side vector $b$
- solve the linear system $Au = b$

Work for direct solvers

For $d = 1$ we obtain a tridiagonal matrix $A$. Hence the work is proportional to $N = 1/h$.

As a simple example for $d = 3$ consider the cube $\Omega = (0, 1)^3$. Let $h = 1/N$ with positive integer $N$. By using uniform grids $x_1 = j_1/N$, $x_2 = j_2/N$, $x_3 = j_3/N$ with $j_1, j_2, j_3 \in \{0,\dots,N\}$ for each coordinate we can subdivide $\Omega$ into $N^3$ smaller cubes. We can then subdivide each of the smaller cubes into tetrahedra. We have $n = (N-1)^3$ interior nodes with $j_1, j_2, j_3 \in \{1,\dots,N-1\}$, and we can order them lexicographically by $(j_1, j_2, j_3)$: $(1,1,1),\dots,(1,1,N-1),(1,2,1),\dots,(N-1,N-1,N-1)$. Then the resulting stiffness matrix $A$ has size $n \times n$ with $n = (N-1)^3$ and bandwidth $(N-1)^2$.

This will also hold for a more general domain $\Omega \subset \mathbb{R}^d$, assuming that all triangles/tetrahedra are of size $h$, up to a constant: we will have $n = \dim V_h \approx ch^{-d}$ and a bandwidth $m \le c\,h^{1-d}$. Therefore the work for Gaussian elimination is, with $h \approx cN^{-1}$,
$$m^2 n \le C\bigl(N^{d-1}\bigr)^2 N^d = CN^{3d-2}.$$
For $d = 2$ we therefore have $O(N^4)$ operations, for $d = 3$ we have $O(N^7)$ operations. Using Matlab's reordering algorithms reduces the work to $N^3$ for $d = 2$, but for $d = 3$ it does not improve the rate $O(N^7)$.

Work of direct solvers for the convection-diffusion problem:

                                                 d = 1   d = 2   d = 3
    Gaussian elimination (using band structure)   N^1     N^4     N^7
    Gaussian elimination with reordering          N^1     N^3     N^7
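To make the assembly step concrete, here is a minimal sketch for the simplest case $d = 1$, $\Omega = (0,1)$, with a uniform mesh, piecewise linear hat functions, and a constant convection coefficient. The entries $\int \varphi_k'\varphi_j'\,dx$ and $b\int \varphi_k'\varphi_j\,dx$ are inserted in closed form; the load vector is only approximated by $\ell(\varphi_j) \approx h f(x_j)$, and the value of $b$ and the function $f$ are illustrative assumptions:

    % Sketch: assemble and solve the 1D convection-diffusion FEM system.
    N = 100; h = 1/N; bconv = 5;             % mesh size and convection coefficient b
    x = (1:N-1)'*h;                          % interior nodes x_1,...,x_{N-1}
    e = ones(N-1,1);
    Adiff = spdiags([-e 2*e -e], -1:1, N-1, N-1) / h;                 % int phi_k' phi_j' dx
    Aconv = (bconv/2) * spdiags([-e zeros(N-1,1) e], -1:1, N-1, N-1); % b * int phi_k' phi_j dx
    A = Adiff + Aconv;                       % stiffness matrix A_{jk} = a(phi_k, phi_j)
    f = @(xx) ones(size(xx));                % illustrative right-hand side f
    rhs = h * f(x);                          % approximate load vector l(phi_j)
    u = A \ rhs;                             % nodal values of the FEM solution u_h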

Estimates $(Au, v) \le L\|u\|\,\|v\|$ and $(Au, u) \ge \gamma\|u\|^2$ for the stiffness matrix $A$

A function $v_h \in V_h$ is given by a coefficient vector $v = (v_1,\dots,v_n)^\top$. For the function $v_h$ we have the norms $\|v_h\|_{L^2}$ and $\|\nabla v_h\|_{L^2}$. How are these norms related to the norm $\|v\|$ of the coefficient vector $v$?

We assume that all triangles are of size $h$, up to a constant. More precisely: we assume that a circle of radius $c_0 h$ fits inside each triangle, and each triangle fits inside a circle of radius $C_0 h$. Then one can show that there exist constants $c_1, c_2, c_3$ depending on $c_0$ and $C_0$ such that
$$c_1 h^{d/2}\|v\| \le \|v_h\|_{L^2} \le c_2 h^{d/2}\|v\|, \qquad \|\nabla v_h\|_{L^2} \le c_3 h^{-1} h^{d/2}\|v\|.$$
This implies
$$\|v_h\|_{H^1}^2 = \|v_h\|_{L^2}^2 + \|\nabla v_h\|_{L^2}^2 \le c\,h^{d-2}\|v\|^2.$$
Hence we obtain for functions $u_h, v_h \in V_h$ with coefficient vectors $u, v$
$$(Au, v) = a(u_h, v_h) \le L_a \|u_h\|_{H^1}\|v_h\|_{H^1} \le L_a c\,h^{d-2}\|u\|\,\|v\|$$   (4)
$$(Au, u) = a(u_h, u_h) \ge \gamma_a \|u_h\|_{L^2}^2 \ge \gamma_a c_1^2 h^{d}\|u\|^2$$   (5)
Hence we obtain
$$L = \|A\| \le C h^{d-2}, \qquad \gamma = \lambda_{\min}\bigl(\tfrac12(A + A^\top)\bigr) \ge C' h^{d}.$$

We can split the bilinear form $a(u, v)$ into a diffusion part and a convection part,
$$a(u, v) = \underbrace{\int_\Omega \nabla u\cdot\nabla v\,dx}_{a_{\mathrm{diff}}(u,v)} + \underbrace{\int_\Omega (b\cdot\nabla u)\,v\,dx}_{a_{\mathrm{conv}}(u,v)},$$
hence we have $A = A_{\mathrm{diff}} + A_{\mathrm{conv}}$ with
$$(A_{\mathrm{diff}}u, v) = \int_\Omega \nabla u_h\cdot\nabla v_h\,dx \le \|\nabla u_h\|_{L^2}\|\nabla v_h\|_{L^2} \le c_3^2 h^{d-2}\|u\|\,\|v\|,$$
$$(A_{\mathrm{conv}}u, v) = \int_\Omega (b\cdot\nabla u_h)\,v_h\,dx \le \|\nabla u_h\|_{L^2}\,|b|\,\|v_h\|_{L^2} \le c_3 c_2 |b|\, h^{d-1}\|u\|\,\|v\|.$$

3. 1-step minimum residual method aka GMRES(1)

We want to solve the linear system $Au = b$ where $A \in \mathbb{R}^{n\times n}$ and $b \in \mathbb{R}^n$. We assume that $A$ is positive definite:
$$(Au, u) \ge \gamma\|u\|^2 \quad \text{for all } u \in \mathbb{R}^n.$$
The current guess is $u^{(k)}$. We compute the residual $r^{(k)} := b - Au^{(k)}$ and define the new guess as $u^{(k+1)} := u^{(k)} + \alpha_k r^{(k)}$, where we choose $\alpha_k$ such that the new residual $r^{(k+1)} := b - Au^{(k+1)}$ has minimum norm $\|r^{(k+1)}\|$, yielding
$$\alpha_k := \frac{\bigl(Ar^{(k)}, r^{(k)}\bigr)}{\|Ar^{(k)}\|^2}.$$
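A minimal Matlab sketch of this iteration (the function name, argument list, and stopping criterion are illustrative choices, not part of the notes):

    % Sketch: 1-step minimum residual method (GMRES(1)); save as minres1.m.
    % A is assumed positive definite in the sense (Au,u) >= gamma*||u||^2.
    function [u, resnorm, k] = minres1(A, b, u0, tol, maxit)
        u = u0;
        r = b - A*u;
        resnorm = norm(r);
        for k = 1:maxit
            Ar = A*r;
            alpha = (Ar'*r) / (Ar'*Ar);   % minimizes ||b - A(u + alpha*r)||
            u = u + alpha*r;
            r = r - alpha*Ar;             % new residual b - A*u
            resnorm = norm(r);
            if resnorm <= tol, break; end
        end
    end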

Note that each step requires one matrix-vector product $Ar^{(k)}$ (and a few inner products of vectors). If the matrix $A$ satisfies (4), (5) we obtain (see Appendix A below)
$$\|r^{(k+1)}\| \le \bigl(1 - K^{-1}\bigr)^{1/2}\|r^{(k)}\| \quad \text{with } K = \Bigl(\frac{L}{\gamma}\Bigr)^2$$   (6)
implying $\|r^{(k)}\| \le (1 - K^{-1})^{k/2}\|r^{(0)}\|$ and
$$\|u^{(k)} - u_*\| \le \gamma^{-1}\bigl(1 - K^{-1}\bigr)^{k/2}\|r^{(0)}\|.$$   (7)
Let $\kappa := L/\gamma$. Since $\|A^{-1}\| \le \gamma^{-1}$ we have $\mathrm{cond}(A) \le \kappa$:
$$\mathrm{cond}(A) = \|A\|\,\|A^{-1}\| \le L\gamma^{-1} = \kappa.$$
If the matrix $A$ is symmetric we have $\mathrm{cond}(A) = \kappa$:
$$L = \|A\| = \lambda_{\max}(A), \quad \gamma = \lambda_{\min}(A), \quad \mathrm{cond}(A) = \|A\|\,\|A^{-1}\| = L\gamma^{-1} = \kappa.$$

Assume that we have $\|r^{(k+1)}\| \le q\|r^{(k)}\|$ with $q = 1 - \varepsilon$. In order to achieve $\|r^{(k)}\| \le \delta$ we need to pick $k$ such that $\|r^{(k)}\| \le q^k\|r^{(0)}\| \le \delta$, hence
$$k \ge \frac{\log\bigl(\delta/\|r^{(0)}\|\bigr)}{\log q}.$$
For $q = 1 - \varepsilon$ the first order Taylor approximation gives $\log(1-\varepsilon) \approx -\varepsilon$, hence we need approximately
$$k \approx \varepsilon^{-1}\log\Bigl(\frac{\|r^{(0)}\|}{\delta}\Bigr) = \varepsilon^{-1}C_\delta$$
steps for the iterative method. Here we have $q = (1 - K^{-1})^{1/2} \approx 1 - \tfrac12 K^{-1}$ using Taylor. Hence we need
$$k \approx 2C_\delta K = 2C_\delta \kappa^2$$
steps for the iterative method.

Note that for the convection-diffusion problem we have $\kappa = Ch^{-2}$ and $q \approx 1 - ch^4$. Therefore it would seem that we need $Ch^{-4}$ steps of our iterative method. But it turns out that this estimate is too pessimistic. Actually we have $q \approx 1 - ch^2$ and we need only $Ch^{-2}$ steps of our iterative method, as we will see in the next section.

4. Sharper estimates for the convergence factor

Symmetric case

Recall that $r^{(k+1)} = (I - \alpha A)r^{(k)}$, hence $\|r^{(k+1)}\| \le \|I - \alpha A\|\,\|r^{(k)}\|$ (and the minimizing choice $\alpha_k$ can only do better than any fixed $\alpha$). If $A$ is symmetric, then also $I - \alpha A$ is symmetric and we have with the eigenvalues $\lambda_1,\dots,\lambda_n$ of $A$
$$\|I - \alpha A\| = \max_{j=1,\dots,n}|1 - \alpha\lambda_j|.$$   (8)
If $A$ is positive definite, the eigenvalues are positive, and we can minimize (8) by choosing $\alpha$ such that $\alpha\lambda_{\max} - 1 = 1 - \alpha\lambda_{\min}$, yielding
$$\alpha = \frac{2}{\lambda_{\max} + \lambda_{\min}}, \qquad \|I - \alpha A\| = \frac{\lambda_{\max} - \lambda_{\min}}{\lambda_{\max} + \lambda_{\min}} = 1 - \frac{2}{\kappa + 1} \quad \text{with } \kappa = \frac{\lambda_{\max}}{\lambda_{\min}}.$$
So the convergence factor is $q = 1 - \frac{2}{\kappa+1}$ and the number of iterations is proportional to $\kappa = \mathrm{cond}(A)$ (and not $\kappa^2$ as the earlier estimate (6) would suggest).
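The optimal step size and the resulting contraction factor in the symmetric case are easy to check numerically; the following sketch uses a small made-up SPD matrix:

    % Sketch: optimal alpha and contraction factor for a symmetric A (illustrative matrix).
    n = 100;
    A = full(gallery('tridiag', n, -1, 2, -1));  % SPD test matrix
    lmin = min(eig(A)); lmax = max(eig(A));
    kappa = lmax/lmin;
    alpha = 2/(lmin + lmax);                     % optimal fixed step size
    q1 = norm(eye(n) - alpha*A)                  % = (lmax - lmin)/(lmax + lmin)
    q2 = 1 - 2/(kappa + 1)                       % same value, written via kappa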

Nonsymmetric case

We can write $A$ as a sum of a symmetric part $H$ and an antisymmetric part $S$:
$$A = H + S, \qquad H := \tfrac12(A + A^\top), \qquad S := \tfrac12(A - A^\top).$$
We assume that $A$ is positive definite, i.e., $(Av, v) = (Hv, v) > 0$ for $v \ne 0$. Let $u$ denote the current guess, and $r := b - Au$ the residual. The next approximation is $u_{\mathrm{new}} = u + \alpha r$, with the residual $r_{\mathrm{new}} = b - Au_{\mathrm{new}} = (I - \alpha A)r$. Hence
$$\|r_{\mathrm{new}}\|^2 = \bigl((I - \alpha A)r, (I - \alpha A)r\bigr) = \|r\|^2 - 2\alpha(Ar, r) + \alpha^2\|Ar\|^2.$$
Note that both $\|Ar\|$ and $(Ar, r)^{1/2} = (Hr, r)^{1/2}$ define norms on $\mathbb{R}^n$. Therefore there exists $C > 0$ such that
$$\|Ar\|^2 \le C\,(Ar, r) \quad \text{for all } r \in \mathbb{R}^n.$$   (9)
Then $\|r_{\mathrm{new}}\|^2 \le \|r\|^2 + [-2\alpha + C\alpha^2](Ar, r)$. The bound is minimal for $\alpha = C^{-1}$, and with this we get
$$\|r_{\mathrm{new}}\|^2 \le \|r\|^2 - C^{-1}(Ar, r) \le \Bigl[1 - \frac{\gamma}{C}\Bigr]\|r\|^2.$$
It remains to find $C$ such that (9) holds: with $v = Ar$ we get
$$C^{-1}(v, v) \le (Ar, r) = (Hr, r) = \bigl(HA^{-1}v, A^{-1}v\bigr).$$
Then we obtain $(v, v) \le C(\underbrace{A^{-\top}HA^{-1}}_{B}v, v)$ with $C = \lambda_{\min}(B)^{-1} = \lambda_{\max}(B^{-1})$ since $B$ is symmetric. Hence we need an estimate $(w, B^{-1}w) \le C(w, w)$ with $B^{-1} = AH^{-1}A^\top$: using $A^\top = H - S$ we get
$$(w, B^{-1}w) = \bigl((H - S)w, H^{-1}(H - S)w\bigr) = (Hw, w) \underbrace{- (Sw, w)}_{0} \underbrace{- (Hw, H^{-1}Sw)}_{0} + (Sw, H^{-1}Sw) \le \lambda_{\max}(H)\|w\|^2 + \lambda_{\min}(H)^{-1}\|Sw\|^2.$$
Since $\|Sw\|^2 = (Sw, Sw) = -(S^2 w, w) \le \rho(S)^2\|w\|^2$ we obtain
$$C = \lambda_{\max}(H) + \frac{\rho(S)^2}{\lambda_{\min}(H)}.$$
Note: the eigenvalues of $H$ are real and positive; the eigenvalues of $S$ are of the form $\pm\alpha_j i$ with $\alpha_j \ge 0$. Proof: If $Sv = \mu_j v$ then $S^2 v = \mu_j^2 v$. The matrix $S^2$ is symmetric and has real eigenvalues $\mu_j^2 \le 0$ because $(S^2 w, w) = -(Sw, Sw) \le 0$; hence $\mu_j = \pm\alpha_j i$. Since the matrix $S$ is real, taking the complex conjugate of $Sv = \mu_j v$ gives $S\bar v = \bar\mu_j \bar v$, hence $\bar\mu_j$ is also an eigenvalue of $S$.

With $\gamma = \lambda_{\min}(H)$ and $K := C/\gamma$ this gives $\|r_{\mathrm{new}}\|^2 \le (1 - K^{-1})\|r\|^2$, which proves:

Theorem 4.1. Let $A \in \mathbb{R}^{n\times n}$, let $H := \tfrac12(A + A^\top)$, $S := \tfrac12(A - A^\top)$. If $A$ is positive definite, i.e., $\lambda_{\min}(H) > 0$, the 1-step minimum residual iteration satisfies
$$\|r^{(k+1)}\| \le \bigl(1 - K^{-1}\bigr)^{1/2}\|r^{(k)}\|, \qquad K := \mathrm{cond}(H) + \Bigl(\frac{\rho(S)}{\lambda_{\min}(H)}\Bigr)^2.$$   (10)
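For a concrete matrix the quantities in Theorem 4.1 are easy to evaluate; the following sketch (with a small made-up nonsymmetric matrix) computes the bound K and the corresponding per-step residual reduction factor:

    % Sketch: evaluate the bound (10) for a small illustrative matrix.
    A = [4 1 0; -1 3 1; 0 -1 5];      % nonsymmetric, positive definite example
    H = (A + A')/2;                   % symmetric part
    S = (A - A')/2;                   % antisymmetric part
    lminH = min(eig(H));
    rhoS  = max(abs(eig(S)));         % spectral radius of S (eigenvalues are +-alpha*i)
    K = max(eig(H))/lminH + (rhoS/lminH)^2;   % bound (10)
    q = sqrt(1 - 1/K)                 % bound on ||r^(k+1)|| / ||r^(k)||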

Note: The number of iterations is proportional to $K$. In our earlier estimate (6) we had $K = \bigl(\|A\|/\gamma\bigr)^2$, whereas we now obtain
$$K = \frac{\|H\|}{\gamma} + \Bigl(\frac{\|S\|}{\gamma}\Bigr)^2.$$
This shows that for a symmetric matrix $A = H$ the number of steps is proportional to the condition number. If we have nonsymmetric $A = H + S$ then $K$ increases by $(\|S\|/\gamma)^2$. So we see that the quadratic term $(\|A\|/\gamma)^2$ in our earlier estimate is actually only caused by the antisymmetric part $S$.

Application to the convection-diffusion problem

Recall that $A = A_{\mathrm{diff}} + A_{\mathrm{conv}}$ with the symmetric matrix $H = A_{\mathrm{diff}}$ and the antisymmetric matrix $S = A_{\mathrm{conv}}$, and
$$C_1 h^d\|u\|^2 \le (A_{\mathrm{diff}}u, u) \le C_2 h^{d-2}\|u\|^2, \qquad (A_{\mathrm{conv}}u, v) \le C_3 h^{d-1}\|u\|\,\|v\|.$$
Therefore we have
$$\lambda_{\min}(H) \ge C_1 h^d, \qquad \lambda_{\max}(H) \le C_2 h^{d-2}, \qquad \rho(S) = \|S\| \le C_3 h^{d-1},$$
yielding with (10)
$$K = \mathrm{cond}(H) + \Bigl(\frac{\rho(S)}{\lambda_{\min}(H)}\Bigr)^2 \le \frac{C_2}{C_1}h^{-2} + \Bigl(\frac{C_3}{C_1}\Bigr)^2 h^{-2} = Ch^{-2}, \qquad q = \bigl(1 - C^{-1}h^2\bigr)^{1/2}.$$
This means that we need $Ch^{-2}$ steps of the iterative method to reduce the norm of the residual by a fixed factor.

Note: $C_3$ is proportional to $|b|$, so we obtain $K = \bigl(C + C'|b|^2\bigr)h^{-2}$. So for a problem with strong convection the number of iterations can be very large.

Recall that the stiffness matrix $A$ is of size $n \times n$ with $\approx cn$ nonzero elements, where $n \approx ch^{-d} \approx cN^d$ (for the mesh size $h \approx 1/N$). Therefore the work of a matrix-vector product is given by the number of nonzero matrix elements, $\approx cN^d$. The work of one step of the 1-step minimum residual method is one matrix-vector product and a few inner products, so the work per step is $\approx cN^d$. The number of steps is proportional to $h^{-2} \approx N^2$ if we want to achieve a residual with $\|r\| \le \delta$. Hence the total work for our iterative method with $q = 1 - ch^2$ is
$$CN^2 N^d.$$
If we had an iterative method with $q = 1 - ch$ we would obtain a total work of $CN N^d$ instead.

Summary: work for solving the convection-diffusion problem

                                                 d = 1   d = 2   d = 3
    Gaussian elimination (using band structure)   N^1     N^4     N^7
    Gaussian elimination with reordering          N^1     N^3     N^7
    1-step min. res. method, q = 1 - ch^2         N^3     N^4     N^5
    iterative method with q = 1 - ch              N^2     N^3     N^4

Note that for $d = 1$ using an iterative method is pointless. For $d = 2$ the direct solver with reordering is better than the 1-step minimum residual method. For $d = 3$ the iterative method is clearly better than the direct method. In the case of a symmetric matrix we can construct an iterative method with $q = 1 - ch$. This is the conjugate gradient method, which we will discuss next.
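Before moving on, the $h^{-2}$ growth of the step count can also be observed experimentally. The sketch below uses a finite difference analogue of the 2D convection-diffusion matrix (grid sizes, convection strength, and tolerance are illustrative assumptions) and runs the 1-step minimum residual iteration for two mesh sizes; doubling N should roughly quadruple the number of steps:

    % Sketch: observe that the number of GMRES(1) steps grows roughly like N^2.
    for N = [16 32]
        h = 1/N; e = ones(N-1,1);
        D  = spdiags([-e 2*e -e], -1:1, N-1, N-1) / h^2;            % 1D -u''
        C  = spdiags([-e zeros(N-1,1) e], -1:1, N-1, N-1) / (2*h);  % 1D centered u'
        Id = speye(N-1);
        A = kron(Id,D) + kron(D,Id) + 10*kron(Id,C);   % diffusion + convection term
        b = ones((N-1)^2,1);
        u = zeros(size(b)); r = b; k = 0;
        while norm(r) > 1e-6*norm(b) && k < 2e5
            Ar = A*r;
            alpha = (Ar'*r)/(Ar'*Ar);
            u = u + alpha*r;  r = r - alpha*Ar;  k = k + 1;
        end
        fprintf('N = %3d: %6d steps\n', N, k);
    end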

A. Solving $F(u) = b$ using Richardson iteration or minimal residual method

Lemma A.1. Let $V$ be a Hilbert space. Assume that the function $F: V \to V$ satisfies with constants $L, \gamma > 0$ for all $u, v \in V$
$$\|F(v) - F(u)\| \le L\|v - u\|$$   (11)
$$\langle F(v) - F(u), v - u\rangle \ge \gamma\|v - u\|^2$$   (12)
Then the equation $F(u) = b$ with $b \in V$ has a unique solution $u \in V$. The inverse mapping satisfies for $b, c \in V$
$$\|F^{-1}(c) - F^{-1}(b)\| \le \gamma^{-1}\|c - b\|.$$   (13)

Proof. Consider the Richardson iteration $u_{k+1} = G(u_k)$ with $G(u) := u + \alpha(b - F(u))$ with $\alpha > 0$. We claim that $G$ is a contraction if $\alpha$ is small: with $e := v - u$ we have
$$\|G(v) - G(u)\|^2 = \|e - \alpha(F(v) - F(u))\|^2 = \|e\|^2 - 2\alpha\langle F(v) - F(u), e\rangle + \alpha^2\|F(v) - F(u)\|^2 \le (1 - 2\alpha\gamma + \alpha^2 L^2)\|e\|^2.$$
For $g(\alpha) := 1 - 2\alpha\gamma + \alpha^2 L^2$ we have $g(0) = 1$ and $g'(0) < 0$, so $G$ is a contraction for sufficiently small $\alpha$. It is easy to see that $g(\alpha) < 1$ for $\alpha \in (0, 2\gamma/L^2)$; we can minimize $g(\alpha)$ by choosing $\alpha = \gamma/L^2$ and obtain $\|G(v) - G(u)\| \le (1 - \gamma^2/L^2)^{1/2}\|v - u\|$. By the contraction mapping theorem the equation $G(u) = u \iff F(u) = b$ has a unique solution. We obtain (13) from $\gamma\|v - u\|^2 \le \langle F(v) - F(u), v - u\rangle \le \|F(v) - F(u)\|\,\|v - u\|$.

If we know some (possibly nonoptimal) constants $L, \gamma$ satisfying (11), (12), the Richardson iteration $u_{k+1} = u_k + \alpha(b - F(u_k))$ can be used with $\alpha = \gamma/L^2$ to find an approximate solution of the nonlinear equation, and we have
$$\|u_{k+1} - u_*\| \le \Bigl(1 - \frac{\gamma^2}{L^2}\Bigr)^{1/2}\|u_k - u_*\|.$$
If we do not know the constants $\gamma, L$ we can use a line search for $\alpha > 0$ such that for $u_{k+1} = u_k + \alpha r_k$ the new residual $r_{k+1} = b - F(u_{k+1})$ has minimal norm, i.e., $f_k(\alpha) := \|b - F(u_k + \alpha r_k)\|^2$ becomes minimal:
$$\|r_{k+1}\|^2 = f_k(\alpha) = \|r_k + F(u_k) - F(u_k + \alpha r_k)\|^2 = \|r_k\|^2 - 2\alpha^{-1}\langle F(u_k + \alpha r_k) - F(u_k), \alpha r_k\rangle + \|F(u_k + \alpha r_k) - F(u_k)\|^2 \le (1 - 2\alpha\gamma + \alpha^2 L^2)\|r_k\|^2.$$
This means that $f_k(\alpha) < f_k(0)$ for $\alpha \in (0, 2\gamma/L^2)$, and $f_k(\alpha) \le (1 - \gamma^2/L^2)f_k(0)$ for $\alpha = \gamma/L^2$. Assume that our approximate line search yields a value $\alpha$ so that $f_k(\alpha) \le q\,f_k(0)$ with some $q < 1$ independent of $k$; then
$$\|r_k\| = \|b - F(u_k)\| \le q^{k/2}\|r_0\| \to 0 \quad \text{as } k \to \infty.$$
Note that (13) implies $\|u_k - u_*\| \le \gamma^{-1}\|F(u_k) - b\|$ and hence $\|u_k - u_*\| \le \gamma^{-1}q^{k/2}\|r_0\| \to 0$ as $k \to \infty$.

Corollary A.2. Assume that $f: [t_0, T] \times \mathbb{R}^n \to \mathbb{R}^n$ satisfies with constants $L > 0$, $\tilde L \in \mathbb{R}$ for all $t \in [t_0, T]$, $y, \tilde y \in \mathbb{R}^n$
$$\|f(t, \tilde y) - f(t, y)\| \le L\|\tilde y - y\|$$
$$\langle f(t, \tilde y) - f(t, y), \tilde y - y\rangle \le \tilde L\|\tilde y - y\|^2$$
Then for $t \in [t_0, T]$, $y_0 \in \mathbb{R}^n$ the backward Euler equation $y = y_0 + hf(t, y)$ has a unique solution $y \in \mathbb{R}^n$ if $h\tilde L < 1$. In particular for $\tilde L \le 0$ (dissipative problem) there is a unique solution for any $h > 0$.

Proof. Define $F: \mathbb{R}^n \to \mathbb{R}^n$ by $F(y) = y - hf(t, y)$. Then
$$\langle F(v) - F(u), v - u\rangle = \langle (v - u) - h[f(t, v) - f(t, u)], v - u\rangle \ge (1 - h\tilde L)\|v - u\|^2.$$
Note that the assertion is independent of $L$. Let $y_j, \tilde y_j$ be values at time $t_j$. Then the values at time $t_{j+1} = t_j + h$ are given by $y_{j+1} = F^{-1}(y_j)$, $\tilde y_{j+1} = F^{-1}(\tilde y_j)$, and we obtain from (13) with $\gamma = 1 - h\tilde L$
$$\|\tilde y_{j+1} - y_{j+1}\| \le \frac{1}{1 - h\tilde L}\|\tilde y_j - y_j\|.$$

Corollary A.3. Let $V$ be a Hilbert space. Assume that the function $F: V \to V'$ satisfies with constants $L, \gamma > 0$ for all $u, v \in V$
$$\|F(v) - F(u)\|_{V'} \le L\|v - u\|$$
$$[F(v) - F(u)](v - u) \ge \gamma\|v - u\|^2$$
Then the equation $F(u) = f$ with $f \in V'$ has a unique solution $u \in V$.

Proof. By the Riesz representation theorem there is a linear isometry $\phi: V' \to V$ such that $f(v) = \langle \phi f, v\rangle$ for all $v \in V$. Therefore we define $\tilde F := \phi \circ F: V \to V$ and can apply the previous Lemma.

Corollary A.4 (Lax-Milgram). Let $V$ be a Hilbert space. Assume that the bilinear form $a: V \times V \to \mathbb{R}$ satisfies with constants $L, \gamma > 0$ for all $u, v \in V$
$$|a(u, v)| \le L\|u\|\,\|v\|$$
$$a(u, u) \ge \gamma\|u\|^2$$
Then for every $f \in V'$ there is a unique $u \in V$ which satisfies $a(u, v) = f(v)$ for all $v \in V$, and we have $\|u\| \le \gamma^{-1}\|f\|_{V'}$.

Proof. Define $F: V \to V'$ by $F(u)(v) := a(u, v)$. By the definition of $\|\cdot\|_{V'}$ the function $F$ satisfies the assumptions of Corollary A.3. The estimate for $\|u\|$ follows from $\gamma\|u\|^2 \le a(u, u) = f(u) \le \|f\|_{V'}\|u\|$.

Corollary A.5. Assume $A \in \mathbb{R}^{n\times n}$ satisfies for all $u \in \mathbb{R}^n$
$$\langle Au, u\rangle \ge \gamma\|u\|^2$$   (14)
Then the equation $Au = b$ has for every $b \in \mathbb{R}^n$ a unique solution and we have $\|A^{-1}\| \le \gamma^{-1}$. It can be found with the following iterative methods:

1. Richardson iteration: Let $L := \|A\|$; then for $\alpha \in (0, 2\gamma/L^2)$ the iteration
$$u_{k+1} = u_k + \alpha(b - Au_k)$$   (15)
converges. In particular, for $\alpha = \gamma/L^2$ we have $\|u_{k+1} - u_*\| \le (1 - \gamma^2/L^2)^{1/2}\|u_k - u_*\|$. A drawback of the method is that we need to know some (possibly nonoptimal) constants $\gamma, L$ satisfying (14) and $\|A\| \le L$ in order to choose $\alpha$ (or we have to experiment with different values of $\alpha$).

2. 1-step minimum residual: We use (15) and choose $\alpha$ so that the norm of the new residual $r_{k+1} = b - Au_{k+1}$ becomes minimal:
$$\|r_{k+1}\|^2 = \|r_k - \alpha Ar_k\|^2 = \|r_k\|^2 - 2\alpha\langle Ar_k, r_k\rangle + \alpha^2\|Ar_k\|^2,$$
i.e.,
$$\alpha_k := \frac{\langle Ar_k, r_k\rangle}{\|Ar_k\|^2},$$
then
$$\|r_{k+1}\|^2 = \|r_k\|^2 - \frac{\langle Ar_k, r_k\rangle^2}{\|Ar_k\|^2} \le \Bigl(1 - \frac{\gamma^2}{L^2}\Bigr)\|r_k\|^2.$$
Therefore the residuals $r_k = A(u_* - u_k)$ converge and
$$\|r_k\| \le \Bigl(1 - \frac{\gamma^2}{L^2}\Bigr)^{k/2}\|r_0\|, \qquad \|u_k - u_*\| \le \gamma^{-1}\Bigl(1 - \frac{\gamma^2}{L^2}\Bigr)^{k/2}\|r_0\|.$$
This method corresponds to the first step of the GMRES method, or the GMRES(1) method which is restarted after every step. The full GMRES method minimizes the residuals over multiple directions, so the norm of the residual can only be lower. Hence the above estimates for $\|r_k\|$ and $\|u_k - u_*\|$ also hold for the GMRES method.
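Matlab's built-in gmres function covers both variants; the following usage sketch (the matrix and all parameters are illustrative) runs the restarted GMRES(1) iteration and the full GMRES method on the same positive definite, nonsymmetric system:

    % Sketch: GMRES(1) (restart after every step) versus full GMRES in Matlab.
    n = 200;
    T = gallery('tridiag', n, -1, 2, -1);           % symmetric contribution
    Z = gallery('tridiag', n, -1, 0, 1);            % antisymmetric contribution
    A = speye(n) + 0.1*T + 0.5*Z;                   % positive definite, nonsymmetric
    b = ones(n,1);
    [x1,flag1,relres1,iter1] = gmres(A, b, 1, 1e-8, 400);   % restart = 1: GMRES(1)
    [x2,flag2,relres2,iter2] = gmres(A, b, [], 1e-8, 200);  % no restart: full GMRES

With restart = 1, gmres minimizes over a single search direction per step, exactly as above; without restart it minimizes over the whole Krylov subspace, so it needs at most as many steps.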