Class notes: Approximation


1 Introduction

1.1 Vector spaces, linear independence, subspace

The goal of numerical analysis is to compute approximations. We want to approximate, e.g.,
- numbers in $\mathbb{R}$ or $\mathbb{C}$,
- vectors in $\mathbb{R}^m$ or $\mathbb{C}^m$,
- functions on an interval: $[a,b]\to\mathbb{R}$ or $[a,b]\to\mathbb{C}$.

Therefore we are trying to approximate an element $v$ of a vector space $V$. In a vector space the addition $v+w$ of vectors is defined, as well as the product $\alpha v$ of a scalar and a vector. Here $\alpha$ is an element of the field of scalars, in our case either $\mathbb{R}$ (real vector space) or $\mathbb{C}$ (complex vector space).

For vectors $u^{(1)},\dots,u^{(n)}$ and scalars $c_1,\dots,c_n$ the linear combination $v = c_1 u^{(1)} + \cdots + c_n u^{(n)}$ is again a vector. If $c_1 u^{(1)} + \cdots + c_n u^{(n)}$ can only be the zero vector for $c_1 = \cdots = c_n = 0$ we say that the vectors $u^{(1)},\dots,u^{(n)}$ are linearly independent. The span of the vectors $u^{(1)},\dots,u^{(n)}$ is the set of all linear combinations
$$\operatorname{span}\{u^{(1)},\dots,u^{(n)}\} = \{\, c_1 u^{(1)} + \cdots + c_n u^{(n)} \mid c_1,\dots,c_n \in \mathbb{C} \,\}.$$

A subset $W \subseteq V$ of a vector space $V$ which is itself a vector space is called a subspace.

Example: For $V = \mathbb{R}^3$ we have the following subspaces: the whole space $\mathbb{R}^3$, planes through the origin, lines through the origin, and $\{0\}$.

If a subspace $W$ is the span of linearly independent vectors $u^{(1)},\dots,u^{(n)}$ we say that $u^{(1)},\dots,u^{(n)}$ is a basis of the subspace $W$, and the dimension of $W$ is $n$. Note that $V = \mathbb{R}^3$ has dimension 3, since $(1,0,0)$, $(0,1,0)$, $(0,0,1)$ forms a basis. Note that $V = C([0,1])$ is not finite dimensional: there do not exist finitely many vectors $u^{(1)},\dots,u^{(N)}$ which span the whole space $V$.
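As a small numerical illustration we can let Matlab compute the dimension of a span as the rank of the matrix whose columns are the given vectors (a minimal sketch; the example vectors below are arbitrarily chosen):

% Dimension of span{u1,u2,u3} in R^4 via the rank (example vectors)
u1 = [1; 0; 2; -1];
u2 = [0; 1; 1;  3];
u3 = [2; -1; 3; -5];          % u3 = 2*u1 - u2, so the three vectors are dependent
U  = [u1 u2 u3];
rank(U)                       % returns 2: the span has dimension 2, not 3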

1.2 Norms

Assume we want to find $v \in V$, and we are able to find the approximation $w \in W$. We then want to measure the size of the error $w - v$. For this we use a norm $\|\cdot\|$, and the size of the error is $\|w - v\|$.

A norm is a mapping $\|\cdot\|\colon V \to [0,\infty)$ with the following properties:
- $\|v\| = 0$ implies that $v = \vec 0$ (null vector). (I will usually not use arrows to denote vectors, but I will use $\vec 0$ to emphasize that this is the zero vector in $V$ and not the scalar $0$.)
- $\|\alpha v\| = |\alpha|\,\|v\|$ for all scalars $\alpha$
- $\|v + w\| \le \|v\| + \|w\|$ (triangle inequality)

Examples of norms: For a vector $v = (v_1,\dots,v_m)$ in $\mathbb{R}^m$ or $\mathbb{C}^m$:
$$\|v\|_\infty = \max_{j=1,\dots,m} |v_j|, \qquad \|v\|_1 = |v_1| + \cdots + |v_m|, \qquad \|v\|_2 = \left(|v_1|^2 + \cdots + |v_m|^2\right)^{1/2}.$$
For a continuous function $f$ defined on an interval $[a,b]$:
$$\|f\|_\infty = \max_{x\in[a,b]} |f(x)|, \qquad \|f\|_1 = \int_a^b |f(x)|\,dx, \qquad \|f\|_2 = \left(\int_a^b |f(x)|^2\,dx\right)^{1/2}.$$

A sequence $v^{(1)}, v^{(2)}, v^{(3)},\dots$ of elements of $V$ forms a Cauchy sequence if for every $\varepsilon > 0$ there exists $N\in\mathbb{N}$ such that $m,n\ge N$ implies $\|v^{(m)} - v^{(n)}\| < \varepsilon$.

Note that the vector spaces $\mathbb{R}^n$ and $\mathbb{C}^n$ equipped with any of these norms are complete: any Cauchy sequence $v^{(1)}, v^{(2)}, v^{(3)},\dots$ has a limit element $v\in\mathbb{C}^n$ such that $v^{(k)}$ converges to $v$ as $k\to\infty$, i.e., $\|v^{(k)} - v\| \to 0$ as $k\to\infty$.

The vector space $C([a,b])$ of continuous functions on $[a,b]$ equipped with $\|\cdot\|_\infty$ is complete: for a Cauchy sequence of continuous functions there exists a limit $v$ which is also a continuous function. We have $\|v^{(k)} - v\|_\infty \to 0$; this is called uniform convergence.

The vector space $C([a,b])$ of continuous functions on $[a,b]$ equipped with $\|\cdot\|_1$ or $\|\cdot\|_2$ is NOT complete: consider e.g. the functions $v^{(k)}$ on $[-1,1]$ defined by
$$v^{(k)}(x) = \begin{cases} -1 & \text{for } x \in [-1,-\tfrac1k) \\ kx & \text{for } x \in [-\tfrac1k,\tfrac1k] \\ 1 & \text{for } x \in (\tfrac1k,1] \end{cases}$$
which have the limit function
$$v(x) = \begin{cases} -1 & \text{for } x < 0 \\ 1 & \text{for } x > 0 \end{cases}$$
where we can define any value for $x = 0$. We then have $\|v^{(k)} - v\|_1 \to 0$ and $\|v^{(k)} - v\|_2 \to 0$ as $k\to\infty$.

If we have $\|v^{(k)} - v\|_2 \to 0$ as $k\to\infty$, i.e., $\int_a^b |v^{(k)}(x) - v(x)|^2\,dx \to 0$ as $k\to\infty$, we say that $v^{(k)}$ converges to $v$ in the mean square sense. Note that $v^{(k)}$ converges to $v$ in the mean square sense, but we do not have uniform convergence. Note that two functions $v,\tilde v$ which only differ in the function value at $x = 0$ satisfy $\|v - \tilde v\|_1 = 0$ and $\|v - \tilde v\|_2 = 0$.
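A quick Matlab illustration of these norms and of convergence in the mean square sense (a minimal sketch; the sample vector and the use of integral for the error are choices made here, not part of the notes themselves):

% Vector norms of a sample vector
v = [3; -4; 0; 1];
[norm(v,Inf), norm(v,1), norm(v,2)]           % max norm, 1-norm, 2-norm

% Mean square convergence of the ramp functions v^(k) to the sign function
for k = [1 10 100 1000]
    vk  = @(x) max(-1, min(1, k*x));          % the piecewise function v^(k)
    err = @(x) (vk(x) - sign(x)).^2;
    fprintf('k = %4d:  ||v^(k) - v||_2 = %.4f\n', k, sqrt(integral(err,-1,1)));
end
% the errors tend to 0, while max|v^(k)(x) - v(x)| = 1 for every k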

The set of possible limits with respect to $\|\cdot\|_1$ is called $L^1([a,b])$. These are functions $v$ such that $|v|$ is Lebesgue integrable on $[a,b]$. The set of possible limits with respect to $\|\cdot\|_2$ is called $L^2([a,b])$. These are functions $v$ such that $|v|^2$ is Lebesgue integrable on $[a,b]$.

Note that functions $v$ and $\tilde v$ which differ only on a set of measure zero are considered identical in $L^1$ and $L^2$. This means that $L^1([a,b])$ and $L^2([a,b])$ are quotient spaces: their elements are not functions, but rather sets of functions which differ only on a set of measure zero.

Example: The function $v(x) = x^{-1/2}$ is in $L^1([0,1])$, but not in $L^2([0,1])$.

The space $L^1([a,b])$ with $\|\cdot\|_1$ is complete. The space $L^2([a,b])$ with $\|\cdot\|_2$ is complete.

1.3 Inner product spaces

For the vector space $\mathbb{C}^n$ we introduce the inner product
$$(u,v) := \sum_{i=1}^n u_i \overline{v_i}.$$
Here $\overline{v_i}$ denotes the complex conjugate: for a complex number $z = x + iy$ we have $\bar z = x - iy$. We can then define the norm of a vector as
$$\|u\| := (u,u)^{1/2}.$$
For a vector $u\in\mathbb{C}^n$ this means that $\|u\| = \left(|u_1|^2 + \cdots + |u_n|^2\right)^{1/2} = \|u\|_2$.

We say that two vectors $u,v$ are orthogonal if $(u,v) = 0$. An important property is the Cauchy-Schwarz inequality:
$$|(u,v)| \le \|u\|\,\|v\|.$$

For the vector spaces $C([a,b])$ and $L^2([a,b])$ we can also define an inner product: for functions $u$ and $v$ we define
$$(u,v) := \int_a^b u(x)\overline{v(x)}\,dx$$
and have
$$\|u\| = (u,u)^{1/2} = \left(\int_a^b |u(x)|^2\,dx\right)^{1/2} = \|u\|_2.$$
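A small Matlab check of the Cauchy-Schwarz inequality and of the example $v(x) = x^{-1/2}$ (a minimal sketch with random vectors; in Matlab u'*v is the inner product up to conjugation, which does not affect the absolute value):

% Cauchy-Schwarz for random complex vectors
u = randn(5,1) + 1i*randn(5,1);  v = randn(5,1) + 1i*randn(5,1);
abs(u'*v) <= norm(u)*norm(v)               % always returns true (logical 1)

% x^(-1/2) is in L^1((0,1)): the integral of |x^(-1/2)| is finite (= 2).
% It is not in L^2((0,1)) since the integral of |x^(-1/2)|^2 = 1/x diverges.
integral(@(x) abs(x.^(-1/2)), 0, 1)        % approximately 2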

2 Least squares approximation

We want to approximate an element $v$ of a vector space $V$. We want to find an approximation $w$ in the finite dimensional subspace $W = \operatorname{span}\{a^{(1)},\dots,a^{(n)}\}$. We measure the error $w - v$ with a norm, and we want to find the best approximation.

Example problem: We want to understand how a calculator or computer can evaluate $\sin x$ for a given value $x$. The processor can essentially only perform addition, multiplication, and division. Therefore we can try to approximate the function with a polynomial. We use the notation $P_k$ for polynomials of degree $\le k$:
$$P_k = \{\, c_0 + c_1 x + \cdots + c_k x^k \mid c_0,\dots,c_k \in \mathbb{R} \,\}.$$
It is sufficient to consider $x\in[0,\pi/2]$. Instead of approximating $\sin x$ for $x\in[0,\pi/2]$ we can consider the function $u(x) = \sin(\tfrac{\pi}{2}x)$ for $x\in[0,1]$. We want to find a polynomial $w(x)\in P_n$ such that the error is minimal.

Problem 1: Find the best approximation with respect to $\|u - w\|_\infty$: find a polynomial $w\in P_n$ such that the maximal error
$$\|u - w\|_\infty = \max_{x\in[0,1]} |u(x) - w(x)|$$
is minimal.

Problem 2: Find the best approximation in the least squares sense: find a polynomial $w\in P_n$ such that
$$\|u - w\|_2 = \left(\int_0^1 |u(x) - w(x)|^2\,dx\right)^{1/2}$$
is minimal.

It turns out that Problem 2 is much easier to solve than Problem 1.

2.1 Abstract version of the least squares problem

Let $V$ denote a vector space which is equipped with an inner product $(\cdot,\cdot)$. This defines a norm $\|v\| := (v,v)^{1/2}$. We are given:
- a vector $u\in V$, and
- an $n$-dimensional subspace $W := \operatorname{span}\{a^{(1)},\dots,a^{(n)}\}$, where the vectors $a^{(1)},\dots,a^{(n)}$ are assumed to be linearly independent (otherwise some of them are redundant and can be dropped).

Least Squares Problem: Find a vector $w = c_1 a^{(1)} + \cdots + c_n a^{(n)}$ such that $\|u - w\|$ is minimal.

3 Algorithm 1: Normal Equations

We are given a point $u$ outside of the subspace $W$. We want to find the closest point $w$ in this subspace. Geometric intuition tells us that $w - u$ should be orthogonal to the subspace $W$. This orthogonality condition is called the Normal Equations: find $w = c_1 a^{(1)} + \cdots + c_n a^{(n)}$ such that
$$\left(w - u,\, a^{(j)}\right) = 0 \qquad \text{for } j = 1,\dots,n.$$
This gives the following linear system of $n$ equations for the $n$ unknowns $c_1,\dots,c_n$:
$$\text{for } j = 1,\dots,n: \qquad \sum_{k=1}^n \left(a^{(k)},a^{(j)}\right) c_k = \left(u,a^{(j)}\right).$$
We can write this in matrix-vector notation with $c = (c_1,\dots,c_n)^\top \in \mathbb{C}^n$ as $Mc = b$:
- the Gram matrix $M\in\mathbb{C}^{n\times n}$ has entries $M_{jk} = \left(a^{(k)},a^{(j)}\right)$,
- the right-hand side vector $b\in\mathbb{C}^n$ has entries $b_j = \left(u,a^{(j)}\right)$.

After we have computed the entries of $M$ and $b$ we have to solve the $n\times n$ linear system $Mc = b$. We can use Gaussian elimination with pivoting for this. In Matlab we can use c=M\b. The linear system has a unique solution if and only if the matrix $M$ is nonsingular.
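Before turning to the solvability question, here is how the normal equations look in Matlab for a small data fitting example in $V = \mathbb{R}^m$ (a minimal sketch; the data values below are arbitrary):

% Least squares fit of data by w = c1 + c2*x via the normal equations
x = (0:0.25:1)';
u = [1.1; 1.6; 2.2; 2.4; 3.1];     % sample data values
A = [ones(size(x)), x];            % columns a^(1) = 1, a^(2) = x
M = A'*A;                          % Gram matrix, M_jk = (a^(k),a^(j))
b = A'*u;                          % right-hand side, b_j = (u,a^(j))
c = M\b                            % coefficients of the least squares fit
norm(A*c - u)                      % norm of the residual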

Proposition 1: $M$ is nonsingular.

Proof: We have to show that $Mv = 0$ implies $v = 0$. Assume $Mv = 0$ with $v\in\mathbb{C}^n$. Then
$$0 = v^H M v = \sum_{j=1}^n\sum_{k=1}^n \bar v_j\, v_k\, \big(a^{(k)},a^{(j)}\big) = \big\|v_1 a^{(1)} + \cdots + v_n a^{(n)}\big\|^2,$$
hence $v_1 a^{(1)} + \cdots + v_n a^{(n)} = 0$, and linear independence of $a^{(1)},\dots,a^{(n)}$ gives $v_1 = \cdots = v_n = 0$.

We claim that the unique solution $c$ of the normal equations is the solution of the least squares problem.

Proposition 2: The solution vector $c$ of the normal equations yields the unique solution of the least squares problem.

Proof: Let $w = c_1 a^{(1)} + \cdots + c_n a^{(n)}$. We want to show that any $\tilde w\in W$ different from $w$ will have $\|\tilde w - u\| > \|w - u\|$. We can write $\tilde w = w + v$ with some $v\in W$; then multiplying out and using the normal equations we obtain
$$\|u - \tilde w\|^2 = (u - w - v,\, u - w - v) = (u-w,\,u-w) - \underbrace{(v,\,u-w)}_{0} - \underbrace{(u-w,\,v)}_{0} + (v,v) = \|u - w\|^2 + \|v\|^2,$$
which is actually Pythagoras' theorem for the triangle with vertices $u, w, \tilde w$. Hence $\|u - \tilde w\| \ge \|u - w\|$, and we have equality only if $v = d_1 a^{(1)} + \cdots + d_n a^{(n)} = 0$. Since $a^{(1)},\dots,a^{(n)}$ are by assumption linearly independent we must have $d_1 = \cdots = d_n = 0$. Hence the coefficients $c_1,\dots,c_n$ of the least squares solution are unique.

3.1 Case of V = C^m

We now consider the special case $V = \mathbb{C}^m$ with the inner product
$$(u,v) = u_1\overline{v_1} + \cdots + u_m\overline{v_m} = [\,\overline{v_1},\dots,\overline{v_m}\,]\begin{bmatrix} u_1 \\ \vdots \\ u_m \end{bmatrix} = v^H u,$$
where we use the notation $v^H := \bar v^\top$. Note that in Matlab $v^H$ is obtained by v'.

We need to have $m\ge n$, otherwise the vectors $a^{(1)},\dots,a^{(n)}\in\mathbb{C}^m$ are linearly dependent. We define the matrix $A\in\mathbb{C}^{m\times n}$ using the columns $a^{(1)},\dots,a^{(n)}$:
$$A = \left[a^{(1)},\dots,a^{(n)}\right].$$
Then we can write $w = c_1 a^{(1)} + \cdots + c_n a^{(n)} = Ac$.

Least Squares Problem: Given $A\in\mathbb{C}^{m\times n}$ and $u\in\mathbb{C}^m$, find a vector $c\in\mathbb{C}^n$ such that $\|Ac - u\|_2$ is minimal.

Note that we can write the Gram matrix $M$ and the right-hand side vector $b$ as
$$M = A^H A, \qquad b = A^H u.$$
Hence we can write the normal equations as
$$A^H A\, c = A^H u.$$
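A quick check that the normal equations $A^HAc = A^Hu$ give the same coefficients as Matlab's backslash operator, which solves least squares problems for rectangular matrices (a sketch with random complex data; A' is the conjugate transpose $A^H$ in Matlab):

m = 6; n = 3;
A = randn(m,n) + 1i*randn(m,n);
u = randn(m,1) + 1i*randn(m,1);
c1 = (A'*A) \ (A'*u);          % solve the normal equations A^H*A*c = A^H*u
c2 = A \ u;                    % Matlab's least squares solution (QR based)
norm(c1 - c2)                  % agreement up to roundoff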

3.2 Notation A = [a^(1),...,a^(n)]

We can use the matrix-vector notation also in the general case of a vector space $V$, e.g. for functions in $V = L^2([0,1])$. For $n$ vectors $a^{(1)},\dots,a^{(n)}\in V$ we denote this $n$-tuple by $A = \left[a^{(1)},\dots,a^{(n)}\right]$. We can then write a linear combination with coefficient vector $c\in\mathbb{C}^n$ in matrix-vector notation:
$$c_1 a^{(1)} + \cdots + c_n a^{(n)} = \left[a^{(1)},\dots,a^{(n)}\right]\begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} = Ac.$$
Note that we have for vectors $v = Ac$ and $\tilde v = A\tilde c$ with $c,\tilde c\in\mathbb{C}^n$ that, with the Gram matrix $M$,
$$(v,\tilde v) = (Ac, A\tilde c) = \tilde c^H M c = (Mc, \tilde c).$$

Example: Consider the vector space $V = C([0,1])$. For $v = \sin(\tfrac{\pi}{2}x)$ find the best least squares approximation using a linear combination of $a^{(1)} = 1$, $a^{(2)} = x$. Using the above notation we find the solution as follows:
$$A := [1, x],$$
$$M := \int_0^1 A^H A\,dx = \int_0^1 \begin{bmatrix} 1 & x \\ x & x^2 \end{bmatrix} dx = \begin{bmatrix} 1 & \tfrac12 \\ \tfrac12 & \tfrac13 \end{bmatrix},$$
$$b := \int_0^1 A^H \sin(\tfrac{\pi}{2}x)\,dx = \int_0^1 \begin{bmatrix} \sin(\tfrac{\pi}{2}x) \\ x\sin(\tfrac{\pi}{2}x) \end{bmatrix} dx = \begin{bmatrix} \tfrac{2}{\pi} \\ \tfrac{4}{\pi^2} \end{bmatrix}.$$
Solving the linear system $Mc = b$ gives
$$c = \frac{1}{\pi^2}\begin{bmatrix} 8\pi - 24 \\ 48 - 12\pi \end{bmatrix}$$
and we obtain the solution
$$w = Ac = c_1 a^{(1)} + c_2 a^{(2)} = \frac{1}{\pi^2}\big[(8\pi - 24) + (48 - 12\pi)x\big].$$

We can do this with symbolic operations in Matlab as follows:

>> syms x; Pi = sym('pi');
>> v = sin(Pi/2*x); A = x.^(0:1)
A =
[ 1, x]
>> M = int(A'*A,x,0,1)
M =
[   1, 1/2]
[ 1/2, 1/3]
>> b = int(A'*v,x,0,1)
b =
  2/pi
 4/pi^2
>> c = M\b
c =
  (8*(pi - 3))/pi^2
 -(12*(pi - 4))/pi^2
>> w = A*c;
>> A2 = x.^(0:2); M2 = int(A2'*A2,x,0,1); b2 = int(A2'*v,x,0,1); c2 = M2\b2; w2 = A2*c2;
>> fplot([v,w,w2],[0,1]); legend('sin(\pi x/2)','approx with 1,x','approx with 1,x,x^2')

[Figure: $\sin(\tfrac{\pi}{2}x)$ on $[0,1]$ together with the least squares approximations using $1,x$ and using $1,x,x^2$.]

4 Algorithm 2: Orthogonalization

We can use a new basis $p^{(1)},\dots,p^{(n)}$ for $W = \operatorname{span}\{a^{(1)},\dots,a^{(n)}\}$. Then we can write $w\in W$ as $w = d_1 p^{(1)} + \cdots + d_n p^{(n)}$ and we obtain the normal equations
$$\begin{bmatrix} (p^{(1)},p^{(1)}) & \cdots & (p^{(n)},p^{(1)}) \\ \vdots & & \vdots \\ (p^{(1)},p^{(n)}) & \cdots & (p^{(n)},p^{(n)}) \end{bmatrix} \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix} = \begin{bmatrix} (u,p^{(1)}) \\ \vdots \\ (u,p^{(n)}) \end{bmatrix}.$$
If the vectors $p^{(1)},\dots,p^{(n)}$ are orthogonal to each other, the matrix is diagonal and the linear system is very easy to solve.

4.1 Gram-Schmidt orthogonalization and decomposition A = PS

Assume the vectors $a^{(1)},\dots,a^{(n)}\in V$ are linearly independent. We want to find vectors $p^{(1)},\dots,p^{(n)}$ which are orthogonal to each other and satisfy
$$\operatorname{span}\{p^{(1)},\dots,p^{(k)}\} = \operatorname{span}\{a^{(1)},\dots,a^{(k)}\} \qquad \text{for } k = 1,\dots,n.$$
We define
$$p^{(1)} := a^{(1)}, \qquad p^{(2)} := a^{(2)} - s_{12}\,p^{(1)}, \qquad p^{(3)} := a^{(3)} - s_{13}\,p^{(1)} - s_{23}\,p^{(2)}, \qquad \dots$$

where we choose the coefficients $s_{kj}$ such that we have $(p^{(j)},p^{(k)}) = 0$ for $j = 2,\dots,n$, $k = 1,\dots,j-1$. E.g., the condition $(p^{(2)},p^{(1)}) = 0$ gives
$$s_{12} = \frac{(a^{(2)},p^{(1)})}{(p^{(1)},p^{(1)})}.$$
The conditions $(p^{(3)},p^{(1)}) = 0$ and $(p^{(3)},p^{(2)}) = 0$ give
$$s_{13} = \frac{(a^{(3)},p^{(1)})}{(p^{(1)},p^{(1)})}, \qquad s_{23} = \frac{(a^{(3)},p^{(2)})}{(p^{(2)},p^{(2)})}.$$
Therefore we obtain the

Gram-Schmidt orthogonalization algorithm: For $j = 1,\dots,n$:
$$p^{(j)} := a^{(j)} - \sum_{k=1}^{j-1} s_{kj}\,p^{(k)} \qquad \text{where } s_{kj} := \frac{(a^{(j)},p^{(k)})}{(p^{(k)},p^{(k)})}.$$

We see that $p^{(j)}\in\operatorname{span}\{a^{(1)},\dots,a^{(j)}\}$ for $j = 1,\dots,n$. All the vectors $p^{(j)}$ are nonzero: e.g., if we obtained $p^{(3)} = 0$ this would mean that $a^{(3)}\in\operatorname{span}\{p^{(1)},p^{(2)}\} = \operatorname{span}\{a^{(1)},a^{(2)}\}$, hence the vectors $a^{(1)},a^{(2)},a^{(3)}$ would be linearly dependent, in contradiction to our assumption that $a^{(1)},\dots,a^{(n)}$ are linearly independent.

Note that we have
$$a^{(1)} = p^{(1)}, \qquad a^{(2)} = p^{(2)} + s_{12}\,p^{(1)}, \qquad a^{(3)} = p^{(3)} + s_{13}\,p^{(1)} + s_{23}\,p^{(2)}, \qquad \dots$$
i.e., we have the decomposition $A = PS$:
$$\left[a^{(1)},\dots,a^{(n)}\right] = \left[p^{(1)},\dots,p^{(n)}\right]\begin{bmatrix} 1 & s_{12} & \cdots & s_{1n} \\ & 1 & \ddots & \vdots \\ & & \ddots & s_{n-1,n} \\ & & & 1 \end{bmatrix}$$
with the upper triangular matrix $S\in\mathbb{C}^{n\times n}$. This is the QR decomposition without normalization.

We can write $w\in W$ as $w = Ac = Pd$. Because of $A = PS$ the vectors $c,d\in\mathbb{C}^n$ are related by $Sc = d$.

This gives the following algorithm for solving the least squares problem (a Matlab sketch follows below):
1. Given $a^{(1)},\dots,a^{(n)}$, use the Gram-Schmidt algorithm to find $p^{(1)},\dots,p^{(n)}$ and the matrix $S$.
2. Given $u\in V$, compute the coefficients $d_1,\dots,d_n$ from Section 4:
$$d_j := \frac{(u,p^{(j)})}{(p^{(j)},p^{(j)})} \qquad \text{for } j = 1,\dots,n.$$
3. We obtain $w\in W$ with $\|w - u\| = \min$ as $w = d_1 p^{(1)} + \cdots + d_n p^{(n)}$. We obtain the coefficient vector $c$ with $\|Ac - u\| = \min$ by solving the linear system $Sc = d$ using back substitution.
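For $V = \mathbb{C}^m$ the Gram-Schmidt loop can be written as a short Matlab function (a minimal sketch, saved e.g. as gram_schmidt.m; the function name is arbitrary, and this is the classical unnormalized variant, which can lose orthogonality in machine arithmetic, cf. the remark on Householder reflections in Section 4.3):

function [P,S] = gram_schmidt(A)
% Gram-Schmidt: A = P*S with orthogonal columns P and unit upper triangular S
[m,n] = size(A);
P = zeros(m,n); S = eye(n);
for j = 1:n
    p = A(:,j);
    for k = 1:j-1
        S(k,j) = (P(:,k)'*A(:,j)) / (P(:,k)'*P(:,k));   % s_kj = (a^(j),p^(k))/(p^(k),p^(k))
        p = p - S(k,j)*P(:,k);
    end
    P(:,j) = p;                                         % p^(j)
end
end

Steps 1-3 then look as follows for random real data (S\d performs the back substitution since S is upper triangular):

m = 8; n = 4;
A = randn(m,n);  u = randn(m,1);
[P,S] = gram_schmidt(A);        % step 1: orthogonal basis p^(1),...,p^(n) and S
d = (P'*u) ./ diag(P'*P);       % step 2: d_j = (u,p^(j)) / (p^(j),p^(j))
c = S \ d;                      % step 3: solve S*c = d by back substitution
norm(c - A\u)                   % agrees with Matlab's least squares solution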

4.2 Orthonormalization and decomposition A = QR

Sometimes it is useful to have an orthonormal basis (ON basis) $q^{(1)},\dots,q^{(n)}$ for $\operatorname{span}\{a^{(1)},\dots,a^{(n)}\}$. We can obtain this by normalizing the vectors $p^{(1)},\dots,p^{(n)}$. If we multiply row $j$ of the matrix $S$ by $\|p^{(j)}\|$ and set
$$q^{(j)} = \frac{p^{(j)}}{\|p^{(j)}\|}, \qquad [r_{jj},\dots,r_{jn}] := \|p^{(j)}\|\,[1,\,s_{j,j+1},\dots,s_{jn}],$$
we get from $A = PS$ the so-called QR decomposition
$$\left[a^{(1)},\dots,a^{(n)}\right] = \left[q^{(1)},\dots,q^{(n)}\right]\begin{bmatrix} r_{11} & \cdots & r_{1n} \\ & \ddots & \vdots \\ & & r_{nn} \end{bmatrix}, \qquad A = QR,$$
where the matrix $R$ is upper triangular with positive entries on the diagonal.

4.3 Case of V = C^m

We are given $n$ linearly independent vectors $a^{(1)},\dots,a^{(n)}\in\mathbb{C}^m$. Therefore we must have $m\ge n$. This corresponds to the matrix $A = \left[a^{(1)},\dots,a^{(n)}\right]\in\mathbb{C}^{m\times n}$ with linearly independent columns. We then obtain the QR decomposition
$$\underbrace{A}_{m\times n} = \underbrace{Q}_{m\times n}\ \underbrace{R}_{n\times n}, \qquad Q = \left[q^{(1)},\dots,q^{(n)}\right] \text{ with orthonormal columns}, \quad R = \begin{bmatrix} r_{11} & \cdots & r_{1n} \\ & \ddots & \vdots \\ & & r_{nn} \end{bmatrix}.$$
We can compute this in Matlab by [Q,R]=qr(A,0). This is known as the compact (economy size) version of the QR decomposition. Note that Matlab uses a slightly different algorithm, and therefore some of the columns of $Q$ are multiplied by $-1$ and the corresponding $r_{jj}$ are negative.

The vectors $q^{(1)},\dots,q^{(n)}$ form an orthonormal basis for $\operatorname{range}(A)$. This is a subspace of dimension $n$ of the vector space $V = \mathbb{C}^m$ which has dimension $m\ge n$. We can extend the orthonormal basis $q^{(1)},\dots,q^{(n)}$ to a basis $q^{(1)},\dots,q^{(m)}$ of the whole space $\mathbb{C}^m$. Here the additional vectors $q^{(n+1)},\dots,q^{(m)}$ form an orthonormal basis for the orthogonal complement of $\operatorname{range}(A)$. This yields the full version of the QR decomposition:
$$\underbrace{A}_{m\times n} = \underbrace{\tilde Q}_{m\times m}\ \underbrace{\tilde R}_{m\times n}, \qquad \tilde Q = \left[q^{(1)},\dots,q^{(n)},q^{(n+1)},\dots,q^{(m)}\right] \text{ orthonormal}, \quad \tilde R = \begin{bmatrix} R \\ 0 \end{bmatrix}.$$
We can compute this in Matlab by [Qt,Rt]=qr(A). Note that the additional part $\left[q^{(n+1)},\dots,q^{(m)}\right]$ does not contribute anything to the decomposition since it is multiplied by zero. In many applications (e.g. solving a least squares problem) the economy size version [Q,R]=qr(A,0) is sufficient. Only use the full version [Qt,Rt]=qr(A) if you need a basis for the orthogonal complement of $\operatorname{range}(A)$.

The most accurate way to compute the QR decomposition in machine arithmetic is to transform the columns of $A$ step by step to an upper triangular matrix $R$ using so-called Householder reflections. This is the algorithm used by Matlab for the qr command.
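With the economy size QR decomposition the least squares problem is solved by one triangular solve; a minimal sketch with random data:

m = 10; n = 3;
A = randn(m,n);  u = randn(m,1);
[Q,R] = qr(A,0);               % A = Q*R, Q is m x n with orthonormal columns
c = R \ (Q'*u);                % minimizes ||A*c - u||_2
norm(c - A\u)                  % same result as Matlab's backslash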

5 Polynomials

Let $P_k$ denote the polynomials of degree $\le k$:
$$P_k = \{\, c_0 + c_1 x + \cdots + c_k x^k \mid c_0,\dots,c_k \in \mathbb{C} \,\}.$$
The space $P_n$ has dimension $n+1$. A basis is given by the monomials $1,x,\dots,x^n$. However, this basis may cause problems in computations with machine numbers for higher degree polynomials. E.g., the Gram matrix of the monomials on $[0,1]$ is the infamous Hilbert matrix $H$ with entries
$$H_{ij} = \frac{1}{i+j-1},$$
where the condition number grows exponentially fast with $n$:

>> for n=5:12; disp([n,cond(hilb(n))]); end

The condition numbers grow from about $4.8\cdot 10^{5}$ for $n = 5$ to about $1.7\cdot 10^{16}$ for $n = 12$, so for larger $n$ the Gram matrix is numerically singular in double precision.

5.1 Space L^2_w with weight function w(x) on (a,b)

We consider functions $v(x)$ defined on the interval $(a,b)$. We define the weighted $L^2$ norm
$$\|v\| = \left(\int_a^b |v(x)|^2\,w(x)\,dx\right)^{1/2}$$
with a weight function $w(x)$. We want $w(x) > 0$ except for finitely many points in $(a,b)$, and $\|p\| < \infty$ for any polynomial $p$. We have the inner product
$$(u,v) := \int_a^b u(x)\overline{v(x)}\,w(x)\,dx$$
and $\|v\| = (v,v)^{1/2}$. We denote the space of all functions $v$ with $\|v\| < \infty$ by $L^2_w$ (where integrals are understood in the Lebesgue sense, and functions which only differ on sets of measure zero are understood to be the same vector). Examples are
- $(-1,1)$ with $w(x) = 1$,
- $(-1,1)$ with $w(x) = (1-x^2)^{-1/2}$,
- $(0,\infty)$ with $w(x) = e^{-x}$.

We consider the subspace $P_{n-1} = \operatorname{span}\{a^{(1)},\dots,a^{(n)}\}$ where $a^{(1)} = 1,\dots,a^{(n)} = x^{n-1}$. We can use the Gram-Schmidt algorithm to construct an orthogonal basis $p^{(1)},\dots,p^{(n)}$. Note that these polynomials have leading coefficient 1, i.e., $p^{(j)} = x^{j-1} + \text{lower order terms}$.

Let us denote these orthogonal polynomials by $p_0(x), p_1(x), p_2(x),\dots$. They are defined by $p_0(x) = 1$ and
$$p_j(x) := x^j - \sum_{k=0}^{j-1} \frac{(x^j, p_k)}{(p_k, p_k)}\,p_k \qquad \text{for } j = 1,2,3,\dots$$
and satisfy for $j,k = 0,1,2,\dots$
$$p_j(x) = x^j + \text{lower order terms}, \qquad (p_j, p_k) = 0 \text{ for } j\ne k.$$
Here we start with the function $x^j$ and subtract a linear combination of $p_{j-1}, p_{j-2},\dots,p_0$ to achieve $(p_j, x^k) = 0$ for $k = j-1, j-2,\dots,0$.
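The Gram-Schmidt construction can be carried out symbolically; a minimal sketch for the weight $w(x) = 1$ on $(-1,1)$ (monic Legendre polynomials):

syms x
ip = @(f,g) int(f*g, x, -1, 1);           % inner product (f,g) for w = 1
p  = sym(ones(1,4));                       % p(1) represents p_0 = 1
for j = 1:3
    q = x^j;
    for k = 0:j-1
        q = q - ip(x^j, p(k+1)) / ip(p(k+1), p(k+1)) * p(k+1);
    end
    p(j+1) = expand(q);                    % p_j = x^j + lower order terms
end
p                                          % [1, x, x^2 - 1/3, x^3 - (3*x)/5]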

Instead we can also start with the function $x\,p_{j-1}(x) = x^j + \text{lower order terms}$:
$$p_j(x) := x\,p_{j-1} - \sum_{k=0}^{j-1} \frac{(x\,p_{j-1}, p_k)}{(p_k, p_k)}\,p_k \qquad \text{for } j = 1,2,3,\dots$$
This also yields a polynomial $p_j(x) = x^j + \text{lower order terms}$ which is orthogonal to $P_{j-1}$. Hence it must be the same polynomial (if we have two such polynomials $p_j$ and $\tilde p_j$ we have $p_j - \tilde p_j \in P_{j-1}$ and $p_j - \tilde p_j$ is orthogonal to $P_{j-1}$, hence $p_j - \tilde p_j = 0$).

Then we have $(x\,p_{j-1}, p_k) = (p_{j-1}, x\,p_k) = 0$ for $k < j-2$, hence we obtain the three term recursion
$$p_j(x) := x\,p_{j-1} - \frac{(x\,p_{j-1}, p_{j-1})}{(p_{j-1}, p_{j-1})}\,p_{j-1} - \frac{(x\,p_{j-1}, p_{j-2})}{(p_{j-2}, p_{j-2})}\,p_{j-2}.$$
We have $(x\,p_{j-1}, p_{j-2}) = (p_{j-1}, x\,p_{j-2}) = (p_{j-1}, p_{j-1})$ since $x\,p_{j-2} - p_{j-1}\in P_{j-2}$, hence
$$p_0 = 1, \qquad p_1 = (x - \delta_1)\,p_0, \qquad p_j = (x - \delta_j)\,p_{j-1} - \gamma_j\,p_{j-2}$$
with
$$\delta_j := \frac{(x\,p_{j-1}, p_{j-1})}{(p_{j-1}, p_{j-1})}, \qquad \gamma_j := \frac{(p_{j-1}, p_{j-1})}{(p_{j-2}, p_{j-2})}.$$

Here are the most popular orthogonal polynomials:

Name         | (a,b)    | w(x)             | δ_{j+1}                          | γ_{j+1}
Legendre     | (-1,1)   | 1                | 0                                | 1/(4 - j^{-2})
Chebyshev    | (-1,1)   | (1-x^2)^{-1/2}   | 0                                | 1/2 for j = 1,  1/4 for j ≥ 2
Jacobi       | (-1,1)   | (1+x)^α (1-x)^β  | (β^2-α^2)/[(2j+α+β)(2j+2+α+β)]   | 4j(j+α)(j+β)(j+α+β)/[(2j+α+β)^2((2j+α+β)^2-1)]
Jacobi, α=β  | (-1,1)   | (1-x^2)^α        | 0                                | j(j+2α)/(4(j+α)^2-1)
Laguerre     | (0,∞)    | e^{-x}           | 2j+1                             | j^2
Hermite      | (-∞,∞)   | e^{-x^2}         | 0                                | j/2

For the Jacobi polynomials we need $\alpha,\beta > -1$.

The orthogonal polynomial $p_n\in P_n$ has oscillating behavior; see, e.g., the graph of the Legendre polynomial plotted on the next page. We have

Theorem 1: The orthogonal polynomial $p_n(x)$ changes sign at $n$ distinct points $t_1,\dots,t_n\in(a,b)$.

Proof: Let $t_1,\dots,t_m$ be the distinct points in $(a,b)$ where $p_n(x)$ changes sign. Assume $m < n$. Then the function $q(x) := p_n(x)\,(x - t_1)\cdots(x - t_m)$ does not change its sign. We have, e.g., $q(x)w(x) > 0$ for all $x\in(a,b)$ with finitely many exceptions, and hence
$$0 < \int_a^b p_n(x)\,(x - t_1)\cdots(x - t_m)\,w(x)\,dx = \big(p_n,\,(x - t_1)\cdots(x - t_m)\big)_{L^2_w}.$$
But this inner product must be zero since $p_n$ is orthogonal to $P_m$. This is a contradiction.

We could first use the recursion formula to find $p_k(x)$ in the form $c_0 + c_1 x + \cdots + c_k x^k$. Then we could use this monomial form to, e.g., evaluate $p_k(x)$ for given values $x$, or to find the zeros of $p_k$. This is usually a bad idea since we are using the monomial basis, which can cause problems with numerical stability. We can often avoid using the monomial form of $p_n$, and use the recursion coefficients $\delta_1,\delta_2,\delta_3,\dots$ and $\gamma_2,\gamma_3,\dots$ directly.
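A quick check of the three term recursion against the Gram-Schmidt construction above: with $\delta_j = 0$ and $\gamma_j = 1/(4 - (j-1)^{-2})$ from the Legendre row of the table, the recursion reproduces $p_3(x) = x^3 - \tfrac{3}{5}x$ at a sample point (a minimal sketch):

x0   = 0.3;                             % sample evaluation point
pold = 1;  p = x0;                      % p_0(x0) = 1, p_1(x0) = x0
for j = 2:3
    gam  = 1/(4 - 1/(j-1)^2);           % gamma_j for the Legendre case
    pnew = x0*p - gam*pold;             % p_j = (x - delta_j)*p_{j-1} - gamma_j*p_{j-2}
    pold = p;  p = pnew;
end
[p, x0^3 - 3*x0/5]                      % both entries equal p_3(0.3)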

Theorem 2: The zeros of the orthogonal polynomial $p_n$ are the eigenvalues of the symmetric tridiagonal matrix
$$A = \begin{bmatrix} \delta_1 & \sqrt{\gamma_2} & & \\ \sqrt{\gamma_2} & \delta_2 & \ddots & \\ & \ddots & \ddots & \sqrt{\gamma_n} \\ & & \sqrt{\gamma_n} & \delta_n \end{bmatrix}.$$

Proof: We normalize the orthogonal polynomials so that they all have the same norm:
$$\tilde p_0 := p_0, \quad \tilde p_1 := \frac{p_1}{\sqrt{\gamma_2}}, \quad \tilde p_2 := \frac{p_2}{\sqrt{\gamma_2}\sqrt{\gamma_3}}, \quad \dots, \quad \tilde p_n := \frac{p_n}{\sqrt{\gamma_2}\cdots\sqrt{\gamma_{n+1}}}.$$
Then the recursion formula $p_j = (x - \delta_j)\,p_{j-1} - \gamma_j\,p_{j-2}$ (with $p_{-1} := 0$, $p_0 := 1$) becomes
$$\sqrt{\gamma_j}\,\tilde p_{j-2} + \delta_j\,\tilde p_{j-1} + \sqrt{\gamma_{j+1}}\,\tilde p_j = x\,\tilde p_{j-1} \qquad \text{for } j = 1,\dots,n,$$
or, in matrix-vector notation,
$$\begin{bmatrix} \delta_1 & \sqrt{\gamma_2} & & \\ \sqrt{\gamma_2} & \delta_2 & \ddots & \\ & \ddots & \ddots & \sqrt{\gamma_n} \\ & & \sqrt{\gamma_n} & \delta_n \end{bmatrix}\begin{bmatrix} \tilde p_0(x) \\ \vdots \\ \tilde p_{n-1}(x) \end{bmatrix} + \sqrt{\gamma_{n+1}}\begin{bmatrix} 0 \\ \vdots \\ 0 \\ \tilde p_n(x) \end{bmatrix} = x\begin{bmatrix} \tilde p_0(x) \\ \vdots \\ \tilde p_{n-1}(x) \end{bmatrix}.$$
Hence, if $x_*$ is a zero of $p_n$ (and therefore of $\tilde p_n$), the nonzero vector $(\tilde p_0(x_*),\dots,\tilde p_{n-1}(x_*))^\top$ is an eigenvector of $A$ with eigenvalue $x_*$.

Example: Plot the Legendre polynomial $p_{10}$ and find its zeros:

n = 10;
A = sparse(n,n);                  % initialize matrix A as sparse array
x = -1:.01:1;                     % evaluate p_j for these x-values
pold = x.^0; p = x;               % initialize p_0, p_1
for j = 2:n
    g = 1/(4 - (j-1)^(-2));       % gamma_j
    g = g^.5;
    A(j-1,j) = g; A(j,j-1) = g;   % build tridiagonal matrix A
    pnew = x.*p - g^2*pold;       % find p_j from p_{j-1}, p_{j-2}
    pold = p; p = pnew;
end
plot(x,p); grid on; hold on
z = eig(full(A));                 % zeros of p_n are eigenvalues of A
plot(z,zeros(n,1),'ro'); hold off

[Figure: the monic Legendre polynomial $p_n$ on $[-1,1]$; the red circles on the x-axis mark its zeros, the eigenvalues of $A$.]

Is it a good idea to solve an eigenvalue problem in order to compute the zeros of a polynomial? Yes!

Finding the roots of a polynomial $p(x) = c_0 + c_1 x + \cdots + c_n x^n$ from given coefficients $c_0,\dots,c_n$ can be a very ill-conditioned problem (see the famous Wilkinson polynomial $p(x) = (x-1)(x-2)\cdots(x-20)$). In contrast, finding the eigenvalues of a symmetric tridiagonal $n\times n$ matrix is a very well-behaved numerical problem. There are several efficient algorithms available (QR algorithm with shift, used by Matlab; divide-and-conquer; ...) which typically need $O(n^2)$ operations to compute the eigenvalues $\lambda_1,\dots,\lambda_n$. (For faster $O(n\log n)$ algorithms see e.g. [Coakley, Rokhlin 2013].)
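A quick Matlab illustration of this ill-conditioning (a minimal sketch; poly(1:20) forms the monomial coefficients of the Wilkinson polynomial):

c = poly(1:20);                 % monomial coefficients of (x-1)(x-2)...(x-20)
r = sort(real(roots(c)));       % roots recovered from the coefficients
max(abs(r - (1:20)'))           % error is far above machine precision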

6 Review of Hermitian matrices and matrix norms

6.1 Review: Hermitian matrices, positive definite matrices

We call a matrix $A\in\mathbb{C}^{n\times n}$ Hermitian if $A^H = A$. For a real matrix this is the same as symmetric.

Theorem 3: Assume $A\in\mathbb{C}^{n\times n}$ is Hermitian. Then the eigenvalues $\lambda_1,\dots,\lambda_n$ are real, and there exists an orthonormal basis of eigenvectors $v^{(1)},\dots,v^{(n)}$: with $V = \left[v^{(1)},\dots,v^{(n)}\right]$ we have
$$A = V\begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix}V^H, \qquad V^H V = I.$$

Let $A\in\mathbb{C}^{n\times n}$ be Hermitian and $u\in\mathbb{C}^n$. Then $q := u^H A u\in\mathbb{R}$ since $\bar q = (u^H A u)^H = u^H A^H (u^H)^H = u^H A u = q$. We have with $v := V^H u$ that $\|v\|_2 = \|u\|_2$ and
$$u^H A u = u^H V\begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix}V^H u = v^H\begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix}v = \lambda_1|v_1|^2 + \cdots + \lambda_n|v_n|^2.$$
Let $\lambda_{\min} := \min_{j=1,\dots,n}\lambda_j$ and $\lambda_{\max} := \max_{j=1,\dots,n}\lambda_j$. Hence we obtain lower and upper bounds for $u^H A u$:
$$\lambda_{\min}\|v\|_2^2 \le \lambda_1|v_1|^2 + \cdots + \lambda_n|v_n|^2 \le \lambda_{\max}\|v\|_2^2, \qquad \lambda_{\min}\|u\|_2^2 \le u^H A u \le \lambda_{\max}\|u\|_2^2.$$

We call a Hermitian matrix $A$ positive definite if for all nonzero vectors $u$
$$u^H A u > 0.$$
Note that a Hermitian matrix is positive definite if and only if all eigenvalues are positive.

Consider linearly independent vectors $a^{(1)},\dots,a^{(n)}\in V$. Then the Gram matrix $M$ with $M_{jk} = (a^{(k)},a^{(j)})$ is Hermitian since
$$M_{kj} = (a^{(j)},a^{(k)}) = \overline{(a^{(k)},a^{(j)})} = \overline{M_{jk}}.$$
It is positive definite since for $c\ne 0$
$$c^H M c = \big\|c_1 a^{(1)} + \cdots + c_n a^{(n)}\big\|^2 > 0.$$

6.2 Matrix norms: 2-norm, Frobenius norm

A matrix $A\in\mathbb{C}^{m\times n}$ corresponds to a linear mapping $\mathbb{C}^n\to\mathbb{C}^m$, $c\mapsto Ac$. We would like to have a bound such that for all $v\in\mathbb{C}^n$:
$$\|Av\| \le C\,\|v\|$$
with the smallest possible constant $C$. We can define a norm as the maximal "magnification factor":
$$\|A\| = \sup_{\substack{v\in\mathbb{C}^n \\ v\ne 0}}\frac{\|Av\|}{\|v\|} = \sup_{\substack{v\in\mathbb{C}^n \\ \|v\|=1}}\|Av\|.$$
Since the unit ball in $\mathbb{C}^n$ is compact there exists $v$ with $\|v\| = 1$ such that $\|Av\|$ is maximal, and we can write $\max$ instead of $\sup$.

For the vector norms $\|v\|_\infty$, $\|v\|_1$, $\|v\|_2$ we obtain in this way the matrix norms $\|A\|_\infty$, $\|A\|_1$, $\|A\|_2$. One can show that
$$\|A\|_\infty = \max_{j=1,\dots,m}\sum_{k=1}^n |a_{jk}| \qquad \text{(maximal row sum of absolute values)},$$
$$\|A\|_1 = \max_{k=1,\dots,n}\sum_{j=1}^m |a_{jk}| \qquad \text{(maximal column sum of absolute values)},$$
$$\|A\|_2 = \lambda_{\max}\big(A^H A\big)^{1/2} \qquad \text{(}\lambda_{\max}\text{ denotes the maximal eigenvalue of the Gram matrix } M = A^H A\text{)}.$$
The last line follows from
$$\|Ac\|_2^2 = \big\|c_1 a^{(1)} + \cdots + c_n a^{(n)}\big\|_2^2 = \sum_{j=1}^n\sum_{k=1}^n c_j\bar c_k\,\big(a^{(j)},a^{(k)}\big) = c^H M c \le \lambda_{\max}(M)\,\|c\|_2^2.$$

We can also define the Frobenius norm (aka Hilbert-Schmidt norm)
$$\|A\|_F := \left(\big\|a^{(1)}\big\|_2^2 + \cdots + \big\|a^{(n)}\big\|_2^2\right)^{1/2} = \left(M_{11} + M_{22} + \cdots + M_{nn}\right)^{1/2}.$$
Note that
$$\|Ac\|_2^2 = \sum_{j=1}^n\sum_{k=1}^n c_j\bar c_k\,\big(a^{(j)},a^{(k)}\big) \le \sum_{j=1}^n\sum_{k=1}^n |c_j|\,|c_k|\,\big\|a^{(j)}\big\|_2\,\big\|a^{(k)}\big\|_2 = \left(|c_1|\,\big\|a^{(1)}\big\|_2 + \cdots + |c_n|\,\big\|a^{(n)}\big\|_2\right)^2$$
$$\le \left(|c_1|^2 + \cdots + |c_n|^2\right)\underbrace{\left(\big\|a^{(1)}\big\|_2^2 + \cdots + \big\|a^{(n)}\big\|_2^2\right)}_{\|A\|_F^2}.$$
Hence we have
$$\|Ac\|_2 \le \|A\|_F\,\|c\|_2 \qquad \text{and} \qquad \|A\|_2 \le \|A\|_F.$$
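A small Matlab check of these norm formulas on a random matrix (a minimal sketch):

A = randn(4,3);
[norm(A,Inf),   max(sum(abs(A),2))]       % maximal row sum of absolute values
[norm(A,1),     max(sum(abs(A),1))]       % maximal column sum of absolute values
[norm(A,2),     sqrt(max(eig(A'*A)))]     % square root of lambda_max(A^H A)
[norm(A,'fro'), sqrt(trace(A'*A))]        % Frobenius norm
norm(A,2) <= norm(A,'fro')                % always true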

6.3 Case A = [a^(1),...,a^(n)] with a^(j) in V

For a general inner product space $V$ we can consider $A = \left[a^{(1)},\dots,a^{(n)}\right]$ with $a^{(j)}\in V$. For $c\in\mathbb{C}^n$ we can consider the mapping
$$c\mapsto Ac = c_1 a^{(1)} + \cdots + c_n a^{(n)}, \qquad \mathbb{C}^n\to V,$$
and define the norm
$$\|A\| = \sup_{\substack{c\in\mathbb{C}^n \\ c\ne 0}}\frac{\|Ac\|}{\|c\|_2} = \sup_{\substack{c\in\mathbb{C}^n \\ \|c\|_2=1}}\|Ac\|.$$
Since the unit ball in $\mathbb{C}^n$ is compact there exists $c$ with $\|c\|_2 = 1$ such that $\|Ac\|$ is maximal, and we can write $\max$ instead of $\sup$. By the same argument as above we have
$$\|A\| = \lambda_{\max}(M)^{1/2}$$
with the Gram matrix $M\in\mathbb{C}^{n\times n}$, $M_{jk} = (a^{(k)},a^{(j)})$. With the Frobenius norm
$$\|A\|_F := \left(\big\|a^{(1)}\big\|^2 + \cdots + \big\|a^{(n)}\big\|^2\right)^{1/2}$$
we have, with the same arguments as above,
$$\|Ac\| \le \|A\|_F\,\|c\|_2 \qquad \text{and} \qquad \|A\| \le \|A\|_F.$$
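For the function space example $A = [1, x]$ in $L^2((0,1))$ from Section 3 the Gram matrix was $M = \begin{bmatrix}1 & 1/2\\ 1/2 & 1/3\end{bmatrix}$, so $\|A\|$ and $\|A\|_F$ can be evaluated numerically (a minimal sketch):

M     = [1 1/2; 1/2 1/3];        % Gram matrix of a^(1) = 1, a^(2) = x on (0,1)
normA = sqrt(max(eig(M)))        % ||A||   = lambda_max(M)^(1/2)   (about 1.126)
frobA = sqrt(trace(M))           % ||A||_F = (M_11 + M_22)^(1/2)   (about 1.155)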