
Chapter 3

Newton methods

3.1 Linear and nonlinear eigenvalue problems

When solving linear eigenvalue problems we want to find values λ ∈ C such that λI - A is singular. Here A ∈ F^{n×n} is a given real or complex matrix. Equivalently, we want to find values λ ∈ C such that there is a nontrivial (nonzero) x that satisfies

(3.1)    (A - λI)x = 0  ⟺  Ax = λx.

In the linear eigenvalue problem (3.1) the eigenvalue λ appears linearly. However, as the unknown λ is multiplied with the unknown vector x, the problem is in fact nonlinear. We have n+1 unknowns λ, x_1, ..., x_n that are not uniquely defined by the n equations in (3.1). We noticed earlier that the length of the eigenvector x is not determined. Therefore, we add to (3.1) a further equation that fixes this. The straightforward condition is

(3.2)    ‖x‖_2^2 = x*x = 1,

which determines x up to a complex scalar of modulus 1, in the real case up to ±1. Another condition to normalize x is to request that

(3.3)    c^T x = 1

for some nonzero vector c. Eq. (3.3) is linear in x and thus simpler. However, the combined equations (3.1)-(3.3) are nonlinear anyway. Furthermore, c must be chosen such that it has a strong component in the (unknown) direction of the sought eigenvector. This requires some knowledge about the solution.

In nonlinear eigenvalue problems we have to find values λ ∈ C such that

(3.4)    A(λ)x = 0,

where A(λ) is a matrix whose elements depend on λ in a nonlinear way. An example is a matrix polynomial,

(3.5)    A(λ) = Σ_{k=0}^{d} λ^k A_k,    A_k ∈ F^{n×n}.

The linear eigenvalue problem (3.1) is the special case with d = 1, A(λ) = A_0 - λA_1, A_0 = A, A_1 = I.
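As a quick numerical sanity check, the matrix polynomial (3.5) can be evaluated directly; for the linear special case, det A(λ) then vanishes exactly at the eigenvalues of A. The following is a minimal Python/NumPy sketch; the helper name and the 2×2 test matrix are illustrative, not from the text.

```python
import numpy as np

def polyeval(coeffs, lam):
    """Evaluate the matrix polynomial A(lam) = sum_k lam**k * A_k, cf. (3.5)."""
    return sum(lam ** k * Ak for k, Ak in enumerate(coeffs))

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# Linear special case d = 1: A(lam) = A - lam*I, i.e. A_0 = A, A_1 = -I.
coeffs = [A, -np.eye(2)]

# det A(lam) vanishes at the eigenvalues of A (here 2 and 3).
residuals = [np.linalg.det(polyeval(coeffs, lam)) for lam in np.linalg.eigvals(A)]
```

The same `polyeval` works unchanged for quadratic or higher-degree polynomials by supplying more coefficient matrices.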

Quadratic eigenvalue problems are matrix polynomials in which λ appears squared [6],

(3.6)    Ax + λKx + λ²Mx = 0.

Matrix polynomials can be linearized, i.e., they can be transformed into a linear eigenvalue problem of size dn, where d is the degree of the polynomial and n is the size of the matrix. Thus, the quadratic eigenvalue problem (3.6) can be transformed into a linear eigenvalue problem of size 2n. Setting y = λx we get

    [ A  O ] [ x ]     [ -K  -M ] [ x ]
    [ O  I ] [ y ] = λ [  I   O ] [ y ]

or

    [ A  K ] [ x ]     [ O  -M ] [ x ]
    [ O  I ] [ y ] = λ [ I   O ] [ y ].

Notice that other linearizations are possible [2, 6]. Notice also the relation with the transformation of higher order to first order ODEs [5, p. 478].

Instead of looking at the nonlinear system (3.4) (complemented with (3.3) or (3.2)) we may look at the nonlinear scalar equation

(3.7)    f(λ) := det A(λ) = 0

and apply some zero finder. Here the question arises how to compute f(λ) and in particular f'(λ) = d/dλ det A(λ). This is what we discuss next.

3.2 Zeros of the determinant

We first consider the computation of eigenvalues, and subsequently eigenvectors, by means of computing zeros of the determinant det A(λ). Gaussian elimination with partial pivoting (GEPP) applied to A(λ) provides the decomposition

(3.8)    P(λ)A(λ) = L(λ)U(λ),

where P(λ) is the permutation matrix due to partial pivoting, L(λ) is a lower unit triangular matrix, and U(λ) is an upper triangular matrix. From well-known properties of the determinant function, equation (3.8) gives

    det P(λ) · det A(λ) = det L(λ) · det U(λ).

Taking the particular structures of the factors in (3.8) into account, we get

(3.9)    f(λ) = det A(λ) = ±1 · ∏_{i=1}^{n} u_{ii}(λ).

The derivative of det A(λ) is

(3.10)   f'(λ) = ± Σ_{i=1}^{n} u'_{ii}(λ) ∏_{j≠i} u_{jj}(λ) = ± Σ_{i=1}^{n} (u'_{ii}(λ)/u_{ii}(λ)) ∏_{j=1}^{n} u_{jj}(λ) = ( Σ_{i=1}^{n} u'_{ii}(λ)/u_{ii}(λ) ) f(λ).

How can we compute the derivatives u'_{ii} of the diagonal elements of U(λ)?
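Formula (3.9) is easy to verify numerically. The sketch below (Python/NumPy; the small GEPP routine and the test matrix are illustrative, not from the text) computes det A(λ) as the signed product of the diagonal of U, with the sign ±1 given by the parity of the row swaps.

```python
import numpy as np

def det_via_gepp(M):
    """det(M) from Gaussian elimination with partial pivoting:
    det = (-1)^(number of row swaps) * product of diag(U), cf. (3.9)."""
    U = M.astype(float).copy()
    n = U.shape[0]
    sign = 1.0
    for k in range(n - 1):
        p = k + np.argmax(np.abs(U[k:, k]))   # pivot row for column k
        if p != k:
            U[[k, p]] = U[[p, k]]             # row swap flips the sign
            sign = -sign
        for i in range(k + 1, n):
            U[i, k:] -= (U[i, k] / U[k, k]) * U[k, k:]
    return sign * np.prod(np.diag(U))

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
lam = 0.7
f = det_via_gepp(A - lam * np.eye(3))   # f(lam) = det A(lam) for A(lam) = A - lam*I
```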

3.2.1 Algorithmic differentiation

A clever way to compute derivatives of a function is by algorithmic differentiation, see e.g. [1]. Here we assume that we have an algorithm available that computes the value f(λ) of a function f, given the input argument λ. By algorithmic differentiation a new algorithm is obtained that computes, besides f(λ), the derivative f'(λ).

The idea is easily explained by means of the Horner scheme to evaluate polynomials. Let

    f(z) = Σ_{i=0}^{n} c_i z^i

be a polynomial of degree n. f(z) can be written in the form

    f(z) = c_0 + z(c_1 + z(c_2 + ··· + z(c_n) ··· ))

which gives rise to the recurrence

    p_n := c_n,
    p_i := z p_{i+1} + c_i,    i = n-1, n-2, ..., 0,
    f(z) := p_0.

Note that each of the p_i can be considered as a function (polynomial) in z. We use the above recurrence to determine the derivatives dp_i,

    dp_n := 0,                      p_n := c_n,
    dp_i := p_{i+1} + z dp_{i+1},   p_i := z p_{i+1} + c_i,    i = n-1, n-2, ..., 0,
    f'(z) := dp_0,                  f(z) := p_0.

Here, to each equation in the original algorithm the equation for its derivative is prepended.

We can proceed in a similar fashion for computing det A(λ). We need to be able to compute the derivatives a'_{ij}, though. Then, we can differentiate each single assignment in the GEPP algorithm. If we restrict ourselves to the standard eigenvalue problem Ax = λx, then A(λ) = A - λI and a'_{ij} = -δ_{ij}, with δ the Kronecker delta.

3.2.2 Hyman's algorithm

In a Newton iteration we have to compute the determinant for possibly many values λ. Using the factorization (3.8) leads to computational costs of (2/3)n³ flops (floating point operations) for each factorization, i.e., per iteration step. If this algorithm were used to compute all eigenvalues, an excessive number of flops would be required. Can we do better?

The strategy is to transform A by a similarity transformation to a Hessenberg matrix, i.e., a matrix H whose entries below the first subdiagonal are zero,

    h_{ij} = 0,    i > j+1.
Any matrix A can be transformed into a similar Hessenberg matrix H by means of a sequence of elementary unitary matrices called Householder transformations. The details are given in Section 4.3.
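As an aside, the differentiated Horner recurrence from the algorithmic-differentiation discussion above takes only a few lines of code. This is a plain-Python sketch with illustrative coefficients; note that the derivative update must use the old value of p, mirroring the prepending of the derivative equation.

```python
def horner_with_derivative(c, z):
    """Evaluate p(z) = sum_i c[i] * z**i and p'(z) simultaneously by
    prepending to each Horner assignment the assignment for its derivative."""
    n = len(c) - 1
    p, dp = c[n], 0.0
    for i in range(n - 1, -1, -1):
        dp = p + z * dp      # derivative line prepended ...
        p = z * p + c[i]     # ... to the original Horner step
    return p, dp

# f(z) = 1 + 2z + 3z^2, so f(2) = 17 and f'(z) = 2 + 6z, f'(2) = 14
p, dp = horner_with_derivative([1.0, 2.0, 3.0], 2.0)   # (17.0, 14.0)
```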

Let S*AS = H, where S is unitary; S is the product of the just-mentioned Householder transformations. Then

    Ax = λx  ⟺  Hy = λy,    x = Sy.

So, A and H have equal eigenvalues (A and H are similar) and the eigenvectors are related by S. We now assume that H is unreduced, i.e., h_{i+1,i} ≠ 0 for all i. Otherwise we can split Hx = λx into smaller problems.

Let λ be an eigenvalue of H and

(3.11)   (H - λI)x = 0,

i.e., x is an eigenvector of H associated with the eigenvalue λ. Then the last component of x cannot be zero, x_n ≠ 0. The proof is by contradiction: let x_n = 0. Then (for n = 4)

    [ h_{11}-λ  h_{12}    h_{13}    h_{14}   ] [ x_1 ]   [ 0 ]
    [ h_{21}    h_{22}-λ  h_{23}    h_{24}   ] [ x_2 ] = [ 0 ]
    [ 0         h_{32}    h_{33}-λ  h_{34}   ] [ x_3 ]   [ 0 ]
    [ 0         0         h_{43}    h_{44}-λ ] [ 0   ]   [ 0 ]

The last equation reads h_{n,n-1} x_{n-1} + (h_{nn} - λ) · 0 = 0, from which x_{n-1} = 0 follows, since we assumed that H is unreduced, i.e., that h_{n,n-1} ≠ 0. In the same manner we obtain x_{n-2} = 0, ..., x_1 = 0. But the zero vector cannot be an eigenvector. Therefore, x_n cannot be zero.

Without loss of generality we can set x_n = 1. We continue to expose the procedure for problem size n = 4. If λ is an eigenvalue, then there are x_i, 1 ≤ i < n, such that

(3.12)   [ h_{11}-λ  h_{12}    h_{13}    h_{14}   ] [ x_1 ]   [ 0 ]
         [ h_{21}    h_{22}-λ  h_{23}    h_{24}   ] [ x_2 ] = [ 0 ]
         [ 0         h_{32}    h_{33}-λ  h_{34}   ] [ x_3 ]   [ 0 ]
         [ 0         0         h_{43}    h_{44}-λ ] [ 1   ]   [ 0 ]

If λ is not an eigenvalue, then we determine the x_i such that

(3.13)   [ h_{11}-λ  h_{12}    h_{13}    h_{14}   ] [ x_1 ]   [ p(λ) ]
         [ h_{21}    h_{22}-λ  h_{23}    h_{24}   ] [ x_2 ] = [ 0    ]
         [ 0         h_{32}    h_{33}-λ  h_{34}   ] [ x_3 ]   [ 0    ]
         [ 0         0         h_{43}    h_{44}-λ ] [ 1   ]   [ 0    ]

We determine the n-1 numbers x_{n-1}, x_{n-2}, ..., x_1 by

    x_i = - ( (h_{i+1,i+1} - λ) x_{i+1} + h_{i+1,i+2} x_{i+2} + ··· + h_{i+1,n} x_n ) / h_{i+1,i},    i = n-1, ..., 1.

The x_i are functions of λ; in fact, x_i ∈ P_{n-i}. The first equation in (3.13) gives

(3.14)   (h_{11} - λ) x_1 + h_{12} x_2 + ··· + h_{1n} x_n =: p(λ).

Eq. (3.13) is obtained by looking at the last column of

    [ h_{11}-λ  h_{12}    h_{13}    h_{14}   ] [ 1  0  0  x_1 ]   [ h_{11}-λ  h_{12}    h_{13}    p(λ) ]
    [ h_{21}    h_{22}-λ  h_{23}    h_{24}   ] [ 0  1  0  x_2 ] = [ h_{21}    h_{22}-λ  h_{23}    0    ]
    [ 0         h_{32}    h_{33}-λ  h_{34}   ] [ 0  0  1  x_3 ]   [ 0         h_{32}    h_{33}-λ  0    ]
    [ 0         0         h_{43}    h_{44}-λ ] [ 0  0  0  1   ]   [ 0         0         h_{43}    0    ]

Taking determinants yields

    det(H - λI) = (-1)^{n-1} ( ∏_{i=1}^{n-1} h_{i+1,i} ) p(λ) = c · p(λ).

So, p(λ) is a constant multiple of the determinant of H - λI. Therefore, we can solve p(λ) = 0 instead of det(H - λI) = 0. Since the quantities x_1, x_2, ..., x_{n-1} and thus p(λ) are differentiable functions of λ, we can get p'(λ) by algorithmic differentiation. For i = n-1, ..., 1 we have

    x'_i = - ( -x_{i+1} + (h_{i+1,i+1} - λ) x'_{i+1} + h_{i+1,i+2} x'_{i+2} + ··· + h_{i+1,n} x'_n ) / h_{i+1,i}.

Finally,

    c · f'(λ) = p'(λ) = -x_1 + (h_{11} - λ) x'_1 + h_{12} x'_2 + ··· + h_{1n} x'_n.

Algorithm 3.1 implements Hyman's algorithm; it returns p(λ) and p'(λ) for a given input parameter λ [7].

Algorithm 3.1 Hyman's algorithm
1: Choose a value λ.
2: x_n := 1; dx_n := 0;
3: for i = n-1 downto 1 do
4:   s := (λ - h_{i+1,i+1}) x_{i+1};  ds := x_{i+1} + (λ - h_{i+1,i+1}) dx_{i+1};
5:   for j = i+2 to n do
6:     s := s - h_{i+1,j} x_j;  ds := ds - h_{i+1,j} dx_j;
7:   end for
8:   x_i := s/h_{i+1,i};  dx_i := ds/h_{i+1,i};
9: end for
10: s := -(λ - h_{11}) x_1;  ds := -x_1 - (λ - h_{11}) dx_1;
11: for i = 2 to n do
12:   s := s + h_{1,i} x_i;  ds := ds + h_{1,i} dx_i;
13: end for
14: p(λ) := s;  p'(λ) := ds;

This algorithm computes p(λ), a constant multiple of det(H - λI), and its derivative p'(λ) in O(n²) operations. Inside a Newton iteration the new iterate is obtained by

    λ_{k+1} = λ_k - p(λ_k)/p'(λ_k),    k = 0, 1, ...
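A direct transcription of Hyman's algorithm combined with the Newton update above might look as follows (Python/NumPy sketch; the 3×3 Hessenberg test matrix and the function names are illustrative). Note that the unknown constant c never has to be formed, since it cancels in the quotient p/p'.

```python
import numpy as np

def hyman(H, lam):
    """Hyman's algorithm: return p(lam) and p'(lam) for an unreduced
    Hessenberg matrix H in O(n^2) operations; det(H - lam*I) = c * p(lam)."""
    n = H.shape[0]
    x = np.zeros(n)
    dx = np.zeros(n)
    x[n - 1] = 1.0                       # x_n := 1, dx_n := 0
    for i in range(n - 2, -1, -1):       # back substitution using row i+1
        s = (lam - H[i + 1, i + 1]) * x[i + 1]
        ds = x[i + 1] + (lam - H[i + 1, i + 1]) * dx[i + 1]
        for j in range(i + 2, n):
            s -= H[i + 1, j] * x[j]
            ds -= H[i + 1, j] * dx[j]
        x[i] = s / H[i + 1, i]           # requires h_{i+1,i} != 0 (unreduced)
        dx[i] = ds / H[i + 1, i]
    s = -(lam - H[0, 0]) * x[0]          # first row yields p(lam)
    ds = -x[0] - (lam - H[0, 0]) * dx[0]
    for j in range(1, n):
        s += H[0, j] * x[j]
        ds += H[0, j] * dx[j]
    return s, ds

# Newton iteration on p(lam) = 0; the constant c cancels in p/p'.
H = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 1.0],
              [0.0, 2.0, 1.0]])
lam = 5.0
for _ in range(50):
    p, dp = hyman(H, lam)
    lam -= p / dp                        # converges to an eigenvalue of H
```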

The factor c cancels. An initial guess λ_0 has to be chosen to start the iteration. It is clear that a good guess reduces the iteration count of the Newton method. The iteration is considered converged if f(λ_k) ≈ 0. The vector x = (x_1, x_2, ..., x_{n-1}, 1)^T is then a good approximation of the corresponding eigenvector.

Remark 3.1. Higher order derivatives of f can be computed in an analogous fashion. Higher order zero finders (e.g. Laguerre's zero finder) are then applicable [3].

3.2.3 Computing multiple zeros

If we have found a zero z_1 of f(x) = 0 and want to compute another one, we want to avoid recomputing the already found z_1. We can explicitly deflate the zero by defining a new function

(3.15)   f_1(x) := f(x)/(x - z_1),

and apply our method of choice to f_1. This procedure can in particular be done with polynomials. The coefficients of f_1 are, however, very sensitive to inaccuracies in z_1. We can proceed similarly for multiple zeros z_1, ..., z_m.

Explicit deflation is not recommended and often not feasible, since f is usually not given explicitly. For the reciprocal Newton correction for f_1 in (3.15) we get

    f_1'(x)/f_1(x) = ( f'(x)/(x - z_1) - f(x)/(x - z_1)² ) / ( f(x)/(x - z_1) ) = f'(x)/f(x) - 1/(x - z_1).

Then a Newton correction becomes

(3.16)   x^{(k+1)} = x^{(k)} - 1 / ( f'(x_k)/f(x_k) - 1/(x_k - z_1) ),

and similarly for multiple zeros z_1, ..., z_m. Working with (3.16) is called implicit deflation. Here, f is not modified. In this way errors in z_1 are not propagated to f_1.

3.3 Newton methods for the constrained matrix problem

We consider the nonlinear eigenvalue problem (3.4) equipped with the normalization condition (3.3),

(3.17)   T(λ)x = 0,    c^T x = 1,

where c is some given vector. At a solution (x, λ), x ≠ 0, the matrix T(λ) is singular. Note that x is defined only up to a (nonzero) multiplicative factor. c^T x = 1 is just one way to normalize x; another one would be ‖x‖_2 = 1, cf. the next section. Solving (3.17) is equivalent to finding a zero of the nonlinear function f(x, λ),

(3.18)   f(x, λ) = [ T(λ)x     ]   [ 0 ]
                   [ c^T x - 1 ] = [ 0 ].
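Returning briefly to the scalar setting of Section 3.2.3, implicit deflation can be sketched as follows (plain Python; the cubic test function with zeros 1, 2, 3 is illustrative). The function f itself is never modified; the already computed zeros only enter the Newton correction.

```python
def newton_implicit_deflation(f, df, x0, zeros, tol=1e-12, maxit=100):
    """Newton's method with implicit deflation: zeros already found are
    subtracted from the reciprocal Newton correction, leaving f untouched."""
    x = x0
    for _ in range(maxit):
        fx = f(x)
        if abs(fx) < tol:                # already at a zero
            return x
        recip = df(x) / fx - sum(1.0 / (x - z) for z in zeros)
        x -= 1.0 / recip                 # deflated Newton step
    return x

f = lambda x: (x - 1.0) * (x - 2.0) * (x - 3.0)
df = lambda x: 3 * x**2 - 12 * x + 11    # derivative of x^3 - 6x^2 + 11x - 6

z1 = newton_implicit_deflation(f, df, 0.0, [])        # first zero, near 1
z2 = newton_implicit_deflation(f, df, 0.0, [z1])      # next zero, near 2
z3 = newton_implicit_deflation(f, df, 0.0, [z1, z2])  # last zero, near 3
```

Starting each search from the same point x0 = 0 shows the effect: without deflation Newton would return to the zero at 1 every time.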

To apply Newton's zero-finding method to (3.18) we need the Jacobian J of f,

(3.19)   J(x, λ) := ∂f(x,λ)/∂(x,λ) = [ T(λ)  T'(λ)x ]
                                     [ c^T   0      ].

Here, T'(λ) denotes the (elementwise) derivative of T with respect to λ. Then, a step of Newton's iteration is given by

(3.20)   [ x_{k+1} ]   [ x_k ]
         [ λ_{k+1} ] = [ λ_k ] - J(x_k, λ_k)^{-1} f(x_k, λ_k),

or, with the abbreviations T_k := T(λ_k) and T'_k := T'(λ_k), an equation for the Newton corrections is obtained,

(3.21)   [ T_k   T'_k x_k ] [ x_{k+1} - x_k ]     [ T_k x_k     ]
         [ c^T   0        ] [ λ_{k+1} - λ_k ] = - [ c^T x_k - 1 ].

If x_k is normalized, c^T x_k = 1, then the second equation in (3.21) yields

(3.22)   c^T (x_{k+1} - x_k) = 0  ⟺  c^T x_{k+1} = 1.

The first equation in (3.21) gives

    T_k (x_{k+1} - x_k) + (λ_{k+1} - λ_k) T'_k x_k = -T_k x_k  ⟺  T_k x_{k+1} = -(λ_{k+1} - λ_k) T'_k x_k.

We introduce the auxiliary vector u_{k+1} by

(3.23)   T_k u_{k+1} = T'_k x_k.

Note that

(3.24)   x_{k+1} = -(λ_{k+1} - λ_k) u_{k+1}.

So, u_{k+1} points in the right direction; it just needs to be normalized. Premultiplying (3.24) by c^T and using (3.22) gives

    1 = c^T x_{k+1} = -(λ_{k+1} - λ_k) c^T u_{k+1},

or

(3.25)   λ_{k+1} = λ_k - 1/(c^T u_{k+1}).

In summary, we get the procedure of Algorithm 3.2. If the linear eigenvalue problem is solved by Algorithm 3.2 then T'(λ)x = -x. In each iteration step a linear system has to be solved, which requires the factorization of a matrix.

We now change the way we normalize x. Problem (3.17) becomes

(3.26)   T(λ)x = 0,    ‖x‖_2 = 1,

with the corresponding nonlinear system of equations

(3.27)   f(x, λ) = [ T(λ)x         ]   [ 0 ]
                   [ ½(x^T x - 1)  ] = [ 0 ].

Algorithm 3.2 Newton iteration for solving (3.18)
1: Choose a starting vector x_0 ∈ R^n with c^T x_0 = 1. Set k := 0.
2: repeat
3:   Solve T(λ_k) u_{k+1} = T'(λ_k) x_k for u_{k+1};    (3.23)
4:   μ_k := c^T u_{k+1};
5:   x_{k+1} := u_{k+1}/μ_k;    (normalize u_{k+1})
6:   λ_{k+1} := λ_k - 1/μ_k;    (3.25)
7:   k := k + 1;
8: until some convergence criterion is satisfied

The Jacobian now is

(3.28)   J(x, λ) := ∂f(x,λ)/∂(x,λ) = [ T(λ)  T'(λ)x ]
                                     [ x^T   0      ].

The Newton step (3.21) is changed into

(3.29)   [ T_k    T'_k x_k ] [ x_{k+1} - x_k ]     [ T_k x_k          ]
         [ x_k^T  0        ] [ λ_{k+1} - λ_k ] = - [ ½(x_k^T x_k - 1) ].

If x_k is normalized, ‖x_k‖ = 1, then the second equation in (3.29) gives

(3.30)   x_k^T (x_{k+1} - x_k) = 0  ⟺  x_k^T x_{k+1} = 1.

The correction Δx_k := x_{k+1} - x_k is orthogonal to the current approximation. The first equation in (3.29) is identical with the first equation in (3.21). Again, we employ the auxiliary vector u_{k+1} defined in (3.23). Premultiplying (3.24) by x_k^T and using (3.30) gives

    1 = x_k^T x_{k+1} = -(λ_{k+1} - λ_k) x_k^T u_{k+1},

or

(3.31)   λ_{k+1} = λ_k - 1/(x_k^T u_{k+1}).

The next iterate x_{k+1} is obtained by normalizing u_{k+1},

(3.32)   x_{k+1} = u_{k+1}/‖u_{k+1}‖.

Algorithm 3.3 Newton iteration for solving (3.27)
1: Choose a starting vector x_0 ∈ R^n with ‖x_0‖ = 1. Set k := 0.
2: repeat
3:   Solve T(λ_k) u_{k+1} = T'(λ_k) x_k for u_{k+1};    (3.23)
4:   μ_k := x_k^T u_{k+1};
5:   λ_{k+1} := λ_k - 1/μ_k;    (3.31)
6:   x_{k+1} := u_{k+1}/‖u_{k+1}‖;    (normalize u_{k+1})
7:   k := k + 1;
8: until some convergence criterion is satisfied
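Algorithm 3.3 translates almost line by line into code. Below is a minimal Python/NumPy sketch for the linear special case T(λ) = A - λI, T'(λ) = -I, with an illustrative symmetric test matrix; the function and variable names are assumptions, not from the text.

```python
import numpy as np

def newton_nep(T, Tprime, lam0, x0, maxit=50, tol=1e-10):
    """Algorithm 3.3: Newton iteration for T(lam) x = 0 with ||x||_2 = 1."""
    lam = lam0
    x = x0 / np.linalg.norm(x0)
    for _ in range(maxit):
        u = np.linalg.solve(T(lam), Tprime(lam) @ x)   # auxiliary vector
        mu = x @ u
        lam_new = lam - 1.0 / mu                       # eigenvalue update
        x = u / np.linalg.norm(u)                      # normalize the iterate
        if abs(lam_new - lam) < tol:
            return lam_new, x
        lam = lam_new
    return lam, x

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
T = lambda l: A - l * np.eye(3)     # linear special case
Tp = lambda l: -np.eye(3)           # T'(lam) = -I
lam, x = newton_nep(T, Tp, 4.5, np.ones(3))
```

For this linear case the iteration amounts to an inverse-iteration-like scheme with a Rayleigh-quotient-style eigenvalue update; for a genuinely nonlinear T only the two lambdas change.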

3.4 Successive linear approximations

Ruhe [4] suggested the following method, which is not derived as a Newton method. It is based on an expansion of T(λ) at some approximate eigenvalue λ_k,

(3.33)   0 = T(λ)x ≈ (T(λ_k) - ϑ T'(λ_k)) x,    λ = λ_k - ϑ.

Equation (3.33) is a generalized eigenvalue problem with eigenvalue ϑ. If λ_k is a good approximation of an eigenvalue, then it is straightforward to compute the smallest eigenvalue ϑ of

(3.34)   T(λ_k)x = ϑ T'(λ_k)x

and update λ_k by λ_{k+1} = λ_k - ϑ.

Algorithm 3.4 Algorithm of successive linear problems
1: Start with an approximation λ_1 of an eigenvalue of T(λ).
2: for k = 1, 2, ... do
3:   Solve the linear eigenvalue problem T(λ_k)u = ϑ T'(λ_k)u.
4:   Choose an eigenvalue ϑ smallest in modulus.
5:   λ_{k+1} := λ_k - ϑ;
6:   if ϑ is small enough then
7:     Return with eigenvalue λ_{k+1} and associated eigenvector u/‖u‖.
8:   end if
9: end for

Remark 3.2. If T is twice continuously differentiable, and λ is an eigenvalue of problem (3.4) such that T(λ) is singular and 0 is an algebraically simple eigenvalue of T'(λ)^{-1} T(λ), then the method of Algorithm 3.4 converges quadratically towards λ.

Bibliography

[1] P. Arbenz and W. Gander, Solving nonlinear eigenvalue problems by algorithmic differentiation, Computing, 36 (1986), pp. 205-215.
[2] D. S. Mackey, N. Mackey, C. Mehl, and V. Mehrmann, Structured polynomial eigenvalue problems: Good vibrations from good linearizations, SIAM J. Matrix Anal. Appl., 28 (2006), pp. 1029-1051.
[3] B. Parlett, Laguerre's method applied to the matrix eigenvalue problem, Math. Comput., 18 (1964), pp. 464-485.
[4] A. Ruhe, Algorithms for the nonlinear eigenvalue problem, SIAM J. Numer. Anal., 10 (1973), pp. 674-689.
[5] G. Strang, Introduction to Applied Mathematics, Wellesley-Cambridge Press, Wellesley, 1986.
[6] F. Tisseur and K. Meerbergen, The quadratic eigenvalue problem, SIAM Rev., 43 (2001), pp. 235-286.
[7] R. C. Ward, The QR algorithm and Hyman's method on vector computers, Math. Comput., 30 (1976), pp. 132-142.
