Lab 1: Krylov Subspaces


Lab Objective: Discuss simple Krylov subspace methods for finding eigenvalues and show some interesting applications.

One of the biggest difficulties in computational linear algebra is the amount of memory needed to store a large matrix and the amount of time needed to read its entries. Krylov subspace methods avoid this difficulty by studying how a matrix acts on vectors; in many cases this makes it unnecessary to form the matrix itself. More precisely, we can construct a Krylov subspace knowing only how a linear transformation acts on vectors, and with these subspaces we can closely approximate eigenvalues of the transformation and solutions to associated linear systems.

The Arnoldi iteration is an algorithm for finding an orthonormal basis of a Krylov subspace. Its outputs can also be used to approximate the eigenvalues of the original matrix.

The Arnoldi Iteration

The order-n Krylov subspace of A generated by x is

    K_n(A, x) = span{x, Ax, A^2 x, ..., A^(n-1) x}.

If the vectors {x, Ax, A^2 x, ..., A^(n-1) x} are linearly independent, then they form a basis for K_n(A, x). However, this basis is usually far from orthogonal: repeated multiplication by A pulls each new vector toward the dominant eigenvector, so computations using this basis are likely to be ill-conditioned.
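To see how quickly the power basis degenerates, consider the following small experiment (an illustrative sketch, not part of the lab's required code; the random matrix and seed are arbitrary):

import numpy as np

np.random.seed(0)
n = 10
A = np.random.rand(n, n)
x = np.random.rand(n)

# Build the matrix whose columns are x, Ax, A^2 x, ..., A^(n-1) x.
K = np.empty((n, n))
K[:, 0] = x
for j in range(1, n):
    K[:, j] = A @ K[:, j - 1]

# The condition number is enormous: the columns are nearly dependent.
print(np.linalg.cond(K))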

One way to find an orthonormal basis for K_n(A, x) would be to run the modified Gram-Schmidt algorithm from Lab TODO on the set {x, Ax, A^2 x, ..., A^(n-1) x}. More efficiently, the Arnoldi iteration interleaves the creation of this set with the modified Gram-Schmidt algorithm, directly producing an orthonormal basis for K_n(A, x). The method is described in Algorithm 1.1. In Algorithm 1.1, k is the number of times we multiply by A, which results in an order-(k+1) Krylov subspace.

Algorithm 1.1 The Arnoldi Iteration. This algorithm operates on a vector b of length n and an n x n matrix A. It iterates k times or until the norm of the next vector in the iteration is less than tol.

 1: procedure arnoldi(b, A, k, tol=1E-8)
 2:     Q <- empty(b.size, k+1)                 # Some initialization steps.
 3:     H <- zeros(k+1, k)
 4:     Q[:,0] <- b
 5:     Q[:,0] <- Q[:,0] / ||Q[:,0]||_2
 6:     for j = 0, j < k do                     # Perform the actual iteration.
 7:         Q[:,j+1] <- A Q[:,j]
 8:         for i = 0, i < j+1 do               # Modified Gram-Schmidt.
 9:             H[i,j] <- <Q[:,i], Q[:,j+1]>
10:             Q[:,j+1] <- Q[:,j+1] - H[i,j] Q[:,i]
11:         H[j+1,j] <- ||Q[:,j+1]||_2          # Set subdiagonal element of H.
12:         if H[j+1,j] < tol then              # Stop if ||Q[:,j+1]||_2 is too small.
13:             return H[:j+1,:j+1], Q[:,:j+1]
14:         Q[:,j+1] <- Q[:,j+1] / H[j+1,j]     # Normalize q_{j+1}.
15:     return H[:-1,:], Q                      # Return H_k and Q.

Something perhaps unexpected happens in the Arnoldi iteration if the starting vector x is an eigenvector of A. If the corresponding eigenvalue is λ, then by definition

    K_k(A, x) = span{x, λx, λ^2 x, ..., λ^k x},

which is equal to the span of x. Let us trace through Algorithm 1.1 in this case, writing q_i for the i-th column of Q. In line 5 we normalize x, setting q_0 = x/||x||. In line 7 we set q_1 = A q_0 = λ q_0. Then in line 9,

    H[0,0] = <q_0, q_1> = <q_0, λ q_0> = λ <q_0, q_0> = λ,

so in line 10 we subtract λ q_0 from q_1, ending with q_1 = 0. The vector q_1 is supposed to be the next vector in the orthonormal basis for K_k(A, x), but since it is 0, it is not linearly independent of q_0. In fact, q_0 already spans K_k(A, x). Hence, when in line 12 we find that the norm of q_1 is zero (or close to it, allowing for numerical error), we terminate the algorithm early, returning the 1 x 1 matrix H = [λ] and the n x 1 matrix Q = [q_0]. A similar early termination may occur whenever the starting vector x is contained in a proper invariant subspace of A.

Problem 1. Using Algorithm 1.1, complete the following Python function that performs the Arnoldi iteration. Write this function so that it can run on complex arrays.

def arnoldi(b, Amul, k, tol=1e-8):
    '''Perform `k` steps of the Arnoldi iteration on the linear operator
    defined by `Amul`, starting with the vector `b`.

    INPUTS:
    b    - A NumPy array. The starting vector for the Arnoldi iteration.
    Amul - A function handle. Should describe a linear operator.
    k    - Number of times to perform the Arnoldi iteration.
    tol  - Stop iterating if the next vector in the Arnoldi iteration has
           norm less than `tol`. Defaults to 1E-8.

    RETURN:
    Return the matrices H_n and Q_n defined by the Arnoldi iteration.
    The number n will equal k, unless the algorithm terminated early,
    in which case n will be less than k.

    Examples:
    >>> A = np.array([[1,0,0],[0,2,0],[0,0,3]])
    >>> Amul = lambda x: A.dot(x)
    >>> H, Q = arnoldi(np.array([1,1,1]), Amul, 3)
    >>> np.allclose(H, np.conjugate(Q.T).dot(A).dot(Q))
    True
    >>> H, Q = arnoldi(np.array([1,0,0]), Amul, 3)
    >>> H
    array([[ 1.+0.j]])
    >>> np.conjugate(Q.T).dot(A).dot(Q)
    array([[ 1.+0.j]])
    '''

Hints:
1. Since H and Q will eventually hold complex numbers, initialize them as complex arrays (e.g., A = np.empty((3,3), dtype=np.complex128)).
2. Remember to use complex inner products.
3. TODO: does it matter if you return H or a copy of H?
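For reference, here is one possible NumPy sketch of Algorithm 1.1. It is a minimal version, not the only correct solution; note the use of np.vdot, which conjugates its first argument, to get a complex inner product.

import numpy as np

def arnoldi(b, Amul, k, tol=1e-8):
    '''A minimal sketch of Algorithm 1.1 (one possible solution).'''
    Q = np.empty((b.size, k + 1), dtype=np.complex128)
    H = np.zeros((k + 1, k), dtype=np.complex128)
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        Q[:, j + 1] = Amul(Q[:, j])
        for i in range(j + 1):                   # Modified Gram-Schmidt.
            H[i, j] = np.vdot(Q[:, i], Q[:, j + 1])   # Complex inner product.
            Q[:, j + 1] -= H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(Q[:, j + 1])
        if abs(H[j + 1, j]) < tol:               # Early termination.
            return H[:j + 1, :j + 1].copy(), Q[:, :j + 1].copy()
        Q[:, j + 1] /= H[j + 1, j]
    # Copies avoid returning views into the larger workspace (cf. Hint 3).
    return H[:-1, :].copy(), Q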

Finding Eigenvalues Using the Arnoldi Iteration

Let A be an n x n matrix. The matrices H_k and Q_k created by k iterations of the Arnoldi algorithm satisfy

    H_k = Q_k^H A Q_k,

where Q_k^H is the conjugate transpose of Q_k. Since H_k is a k x k matrix, if k < n then H_k is a low-rank approximation to A, and we may use its eigenvalues as approximations to the eigenvalues of A. The eigenvalues of H_k are called Ritz values, and in fact they converge quickly to the eigenvalues of A of largest magnitude.

Problem 2. Finish the following function that computes the Ritz values of a matrix.

def ritz(Amul, dim, k, iters):
    '''Find the `k` Ritz values with largest real part of the linear
    operator defined by `Amul`.

    INPUTS:
    Amul  - A function handle. Should describe a linear operator
            on R^(dim).
    dim   - The dimension of the space on which `Amul` acts.
    k     - The number of Ritz values to return.
    iters - The number of times to perform the Arnoldi iteration.
            Must be between `k` and `dim`.

    RETURN:
    Return the `k` Ritz values with largest real part of the operator
    defined by `Amul`.
    '''

One application of the Arnoldi iteration is to find the eigenvalues of linear operators that are too large to store in memory. For example, if an operator acts on C^(2^20), then its matrix representation contains 2^40 complex values. Storing such a matrix (at 16 bytes per complex number) would require 16 terabytes of memory! An example of such an operator is the Fast Fourier Transform, named by SIAM as one of the top algorithms of the century [TODO: cite!]. The Fast Fourier Transform is used in many applications, including oil exploration and MP3 compression.

Problem 3. The eigenvalues of the Fast Fourier Transform are known to be {1, -1, i, -i}. Use your function ritz() from Problem 2 to approximate the eigenvalues of the Fast Fourier Transform. Set k to be 10 and set dim to be 2^20. For the argument Amul, use the fft function from scipy.fftpack.

The Arnoldi iteration for finding eigenvalues is implemented in a Fortran library called ARPACK. SciPy interfaces with the Arnoldi iteration in this library via the function scipy.sparse.linalg.eigs(). This function has many more options than the implementation we wrote in Problem 2. In this example, the keyword argument k=5 specifies that we want five Ritz values. Note that even though this function comes from the sparse library in SciPy, we can still call it on regular NumPy arrays.

>>> B = np.random.rand(10000).reshape(100, 100)
>>> sp.sparse.linalg.eigs(B, k=5, return_eigenvectors=False)
array([ -1.15577072-2.59438308j,  -2.63675878-1.09571889j,
        -2.63675878+1.09571889j,  -3.00915592+0.j        ,
        50.14472893+0.j        ])
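As a quick sanity check (an illustrative snippet, not part of the lab), these Ritz values can be compared with the output of a dense eigensolver; for a 100 x 100 matrix both are cheap to compute:

import numpy as np
import scipy.sparse.linalg as spla

B = np.random.rand(100, 100)
ritz_vals = spla.eigs(B, k=5, return_eigenvectors=False)

# Five eigenvalues of largest magnitude from a dense solver, for comparison.
dense_vals = np.linalg.eigvals(B)
dense_vals = dense_vals[np.argsort(-np.abs(dense_vals))][:5]

print(np.sort_complex(ritz_vals))
print(np.sort_complex(dense_vals))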

Convergence

The Arnoldi method for finding eigenvalues converges quickly to eigenvalues whose magnitude is distinctly larger than the rest.

Problem 4. Finish the following function to visualize the convergence of the Ritz values.

def plot_ritz(Amul, dim, numvals, iters):
    '''Plot the Ritz values of the linear operator defined by `Amul`.

    INPUTS:
    Amul    - A function handle. Should describe a linear operator
              on R^(dim).
    dim     - The dimension of the space on which `Amul` acts.
    numvals - The number of Ritz values to plot.
    iters   - The number of times to perform the Arnoldi iteration.
              Must be between `numvals` and `dim`.

    Creates a plot with the number of Arnoldi iterations k on the x-axis
    and the Ritz values of the Hessenberg approximation H_k on the y-axis.
    '''

Hints:
1. Before saving the Ritz values of H_k, use np.sort() to ensure that they are in a canonical order.
2. It is not efficient to call your function from Problem 2 for increasing values of iters. If your code takes too long to run, consider integrating your solutions to Problems 1 and 2 with the body of this function.
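One way to organize the computation is outlined below (an illustrative sketch, not the only solution; it assumes the arnoldi() sketch shown earlier and plots only the real parts of the Ritz values, cf. the TODO note below). The key observation is that H_k is the leading k x k submatrix of the final H, so a single Arnoldi run suffices.

import numpy as np
from matplotlib import pyplot as plt

def plot_ritz(Amul, dim, numvals, iters):
    '''Illustrative outline; assumes the arnoldi() sketch defined above.'''
    b = np.random.rand(dim)
    H, Q = arnoldi(b, Amul, iters)            # One run gives every H_k.
    history = np.full((iters, numvals), np.nan, dtype=complex)
    for k in range(1, H.shape[0] + 1):
        vals = np.linalg.eigvals(H[:k, :k])
        vals = vals[np.argsort(-np.abs(vals))][:numvals]  # Largest first.
        history[k - 1, :len(vals)] = np.sort(vals)        # Canonical order.
    for j in range(numvals):
        plt.plot(np.arange(1, iters + 1), history[:, j].real, '.')
    plt.xlabel("Arnoldi iteration k")
    plt.ylabel("Ritz values of H_k (real part)")
    plt.show()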

Run your function on matrices like those described in Figures 1.1 and 1.2: one with random eigenvalues between 0 and 1, and one with random entries between 0 and 1. TODO: output real and complex plots separately.

[Figure 1.1: The convergence of the Ritz values to the largest eigenvalues of a matrix with random eigenvalues between 0 and 1.]

[Figure 1.2: The convergence of the Ritz values to the largest eigenvalues of a matrix with random entries between 0 and 1. Matrices of this form generally have a single isolated eigenvalue that is much larger than the rest; the Ritz values converge to this eigenvalue much more quickly.]

Lanczos Iteration (Optional)

Depending on the symmetry of the problem, we may be able to make the Arnoldi iteration more efficient. Consider the case where A is symmetric. Then the matrix H is both upper Hessenberg and symmetric, so it is tridiagonal. Since H is tridiagonal, each Aq_n is orthogonal to q_0, ..., q_{n-2}. This means that storing all the columns of Q_k is no longer necessary: we can run the entire algorithm while storing only the two most recently computed columns of Q. We can also represent H as two vectors: a vector α storing the values along the main diagonal of H, and a vector β storing the values in the first subdiagonal (the values in the first superdiagonal are the same). This change in storage allows Algorithm 1.1 to be simplified to Algorithm 1.2, known as the Lanczos iteration.

Algorithm 1.2 The Lanczos Iteration

 1: procedure lanczos(b, Amul, k, tol=1E-8)
 2:     q_0 <- 0                            # Some initialization.
 3:     q_1 <- b / ||b||_2
 4:     α <- empty(k)
 5:     β <- empty(k)
 6:     β_{-1} <- 0
 7:     for i = 0, i < k do                 # Perform the iteration.
 8:         z <- Amul(q_1)                  # z temporarily stores q_{i+1}.
 9:         α_i <- <q_1, z>                 # q_1 stores the previous q_i.
10:         z <- z - α_i q_1 - β_{i-1} q_0  # q_0 stores q_{i-1}.
11:         β_i <- ||z||_2                  # Initialize β_i.
12:         if β_i < tol then               # Stop if ||q_{i+1}||_2 is too small.
13:             return α[:i+1], β[:i]
14:         z <- z / β_i
15:         q_0, q_1 <- q_1, z              # Store new q_{i+1} and q_i on top of q_1 and q_0.
16:     return α, β[:-1]

Problem 5. Write a Python function that performs the Lanczos iteration. Have it accept a starting vector b, a function Amul that computes Ax for any vector x, a number k of iterations to perform, and an optional argument tol that defaults to 1E-8.

In its most basic form, the Lanczos iteration is not numerically stable. In exact arithmetic the vectors q_i are exactly orthogonal, but in the presence of roundoff error this can fail completely: the q_i may suffer so much roundoff error that they are no longer even linearly independent. There are a variety of modifications to the Lanczos iteration that address this instability. The Lanczos code used by SciPy relies on an algorithm called the Implicitly Restarted Lanczos Method. We will not discuss these algorithms in detail here.
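A minimal NumPy sketch of Algorithm 1.2 might look like the following (one possible approach to Problem 5, assuming real arithmetic since A is symmetric):

import numpy as np

def lanczos(b, Amul, k, tol=1e-8):
    '''A minimal sketch of Algorithm 1.2 for a real symmetric operator.'''
    q0 = np.zeros_like(b, dtype=float)
    q1 = b / np.linalg.norm(b)
    alpha = np.empty(k)
    beta = np.empty(k)
    beta_prev = 0.0                      # Plays the role of beta_{i-1}.
    for i in range(k):
        z = Amul(q1)
        alpha[i] = q1 @ z
        z = z - alpha[i] * q1 - beta_prev * q0
        beta[i] = np.linalg.norm(z)
        if beta[i] < tol:                # Early termination.
            return alpha[:i + 1], beta[:i]
        q0, q1 = q1, z / beta[i]
        beta_prev = beta[i]
    return alpha, beta[:-1]

The Ritz values are then the eigenvalues of the symmetric tridiagonal matrix defined by alpha and beta, which can be computed with, e.g., scipy.linalg.eigh_tridiagonal(alpha, beta, eigvals_only=True).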

Problem 6. The following code performs matrix multiplication by a symmetric tridiagonal matrix. It accepts vectors a, b, and u: a stores the entries of the main diagonal of the matrix, and b stores the entries of the first sub/superdiagonal. The function returns the image of u under the matrix represented by a and b.

def tri_mul(a, b, u):
    v = a * u
    v[:-1] += b * u[1:]
    v[1:] += b * u[:-1]
    return v

Use the Lanczos iteration function you wrote for Problem 5 to estimate the 5 largest eigenvalues of a symmetric tridiagonal matrix A with random values in its nonzero diagonals (i.e., make a and b random). For demonstration purposes, let A be 1000 x 1000 and perform 100 iterations. Compare the 5 eigenvalues of largest absolute value with the 5 Ritz values of largest norm. How do they compare?

Try running your simulation a few times with different vectors a and b. You may notice that, occasionally, the largest eigenvalue is repeated among the Ritz values. This happens because of the loss of orthogonality between the vectors used in the Lanczos iteration. These erroneous eigenvalues are called ghost eigenvalues. They generally converge to actual eigenvalues of the matrix and can make the multiplicity of an eigenvalue look higher than it really is.

Arnoldi Iteration in SciPy

SciPy interfaces with a Fortran library called ARPACK that has good implementations of the Arnoldi and Lanczos algorithms. The Arnoldi iteration is found in scipy.sparse.linalg.eigs(), and the Implicitly Restarted Lanczos iteration in scipy.sparse.linalg.eigsh(). These functions allow you to find either the largest or smallest eigenvalues of a sparse or dense matrix. The function scipy.sparse.linalg.svds uses the Implicitly Restarted Lanczos iteration on A^H A to find singular values.
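For example, the two routines can be called as follows (a usage sketch; the random test matrix and parameter values are arbitrary):

import scipy.sparse as sps
import scipy.sparse.linalg as spla

# A random sparse symmetric matrix.
A = sps.rand(1000, 1000, density=0.01)
A = A + A.T

# Five eigenvalues of largest magnitude, via the Lanczos-based eigsh().
evals = spla.eigsh(A, k=5, which='LM', return_eigenvectors=False)

# Five largest singular values.
svals = spla.svds(A, k=5, return_singular_vectors=False)

print(evals)
print(svals)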