ARPACK. Dick Kachuma & Alex Prideaux. November 3, Oxford University Computing Laboratory


ARPACK Dick Kachuma & Alex Prideaux Oxford University Computing Laboratory November 3, 2006

What is ARPACK?
- ARnoldi PACKage.
- A collection of routines for solving large-scale eigenvalue problems.
- Developed at Rice University, Texas.
- www.caam.rice.edu/software/arpack/

The History of ARPACK
Main authors:
- Danny Sorensen (Rice University, Texas)
- Kristi Maschhoff (Rice University, Texas)
- Rich Lehoucq (Sandia National Laboratories)
- Chao Yang (Lawrence Berkeley National Laboratory, California)
Funded by:
- National Science Foundation
- ARPA (administered by the US Army Research Office)

What can it do?
- Designed to compute a few eigenvalues and corresponding eigenvectors of a general n by n matrix A.
- Most appropriate for large sparse or structured matrices A, where "structured" means that a matrix-vector product w ← Av requires order n rather than the usual order n^2 floating point operations.
- Suitable for solving large-scale symmetric, nonsymmetric, and generalized eigenproblems from significant application areas.
- The eigenvalues and eigenvectors found can be chosen to have specified features, such as largest real part or largest magnitude.
- Storage requirements are on the order of n·k locations, where k is the number of eigenvalues requested.

Algorithms used
- In the general case, the Implicitly Restarted Arnoldi Method (IRAM) is used.
- If A is symmetric, it reduces to the Implicitly Restarted Lanczos Method (IRLM).
- For most problems, a matrix factorization is not required. Only the action of the matrix on a vector is needed or, more generally, a user-defined operation.
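To illustrate the point that only the action of the matrix on a vector is needed, here is a minimal sketch of the plain (unrestarted) Arnoldi iteration in pure Python. It is not ARPACK's implementation, which adds implicit restarting on top of this basic step; the names `arnoldi`, `matvec` and `laplacian_matvec` are illustrative assumptions.

```python
import math

def arnoldi(matvec, v0, k):
    """Run k Arnoldi steps with modified Gram-Schmidt.

    matvec: callable returning A*v for a list v (the only access to A).
    Returns (V, H): V holds up to k+1 orthonormal vectors, H is the
    (k+1) x k upper Hessenberg matrix stored as a list of rows.
    """
    norm = math.sqrt(sum(x * x for x in v0))
    V = [[x / norm for x in v0]]
    H = [[0.0] * k for _ in range(k + 1)]
    for j in range(k):
        w = matvec(V[j])
        # Orthogonalize w against all previous basis vectors.
        for i in range(j + 1):
            H[i][j] = sum(a * b for a, b in zip(V[i], w))
            w = [wi - H[i][j] * vi for wi, vi in zip(w, V[i])]
        H[j + 1][j] = math.sqrt(sum(x * x for x in w))
        if H[j + 1][j] < 1e-14:  # invariant subspace found, stop early
            break
        V.append([x / H[j + 1][j] for x in w])
    return V, H

def laplacian_matvec(v):
    """Apply tridiag(-1, 2, -1) without ever forming the matrix."""
    n = len(v)
    return [2.0 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

V, H = arnoldi(laplacian_matvec, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 3)
```

Because the example operator is symmetric, the computed H is tridiagonal up to rounding, which is the sense in which Arnoldi reduces to Lanczos.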

Libraries used by ARPACK
ARPACK makes use of:
- LAPACK
- BLAS
- SuperLU (depending on the problem being solved)
- UMFPack (depending on the problem being solved)
The required LAPACK and BLAS routines are shipped with ARPACK (although different versions can be used); SuperLU and UMFPack must be installed separately.

LAPACK routines used
The main LAPACK routines used are:
- Xlahqr (computes the Schur decomposition of a Hessenberg matrix)
- Xgeqr (computes the QR factorization of a matrix)
- Xsteqr (diagonalizes a symmetric tridiagonal matrix)
- Ytrevc (computes the eigenvectors of a matrix in upper (quasi-)triangular form)

BLAS routines used
The main BLAS routines used are:
- Xtrmm (matrix times upper triangular matrix - Level 3)
- Xgemv (matrix-vector product - Level 2) - IMPORTANT
- Xger (rank-one matrix update - Level 2)
- Cgeru (rank-one complex matrix update - Level 2)
- Many other Level 1 routines.

Code structure - Fortran implementation
- The original code is written in Fortran77.
- Functions are complicated: they use a 'Reverse Communication Interface', in which a function is called repeatedly with a number of parameters. After each call, the caller carries out an operation determined by the return value and passes the updated parameters to the next call.
- Further functions are then called for 'Post-Processing' to return the values required.
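The reverse communication idea can be sketched with a Python generator: the solver never sees the matrix and instead hands control back to the caller each time it needs a matrix-vector product. This is a toy power iteration, not ARPACK's interface or argument list; `power_iteration_rc`, `apply_A` and the request strings are hypothetical names chosen for illustration.

```python
import math

def power_iteration_rc(n, tol=1e-10, max_iter=1000):
    """Toy reverse-communication solver written as a generator.

    Yields ("matvec", x) whenever it needs y = A*x; the caller supplies y
    with .send(y). Yields ("done", (eigenvalue, x)) when finished.
    """
    x = [1.0 / math.sqrt(n)] * n
    lam = 0.0
    for _ in range(max_iter):
        y = yield ("matvec", x)  # hand control back to the caller
        new_lam = sum(a * b for a, b in zip(x, y))  # Rayleigh quotient
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    yield ("done", (lam, x))

def apply_A(x):
    """The caller owns the matrix; here A = diag(1, 2, 3)."""
    return [1.0 * x[0], 2.0 * x[1], 3.0 * x[2]]

# Driver loop: call repeatedly, act on each request, stop when done.
solver = power_iteration_rc(3)
request, payload = next(solver)
while request == "matvec":
    request, payload = solver.send(apply_A(payload))
eigenvalue, eigenvector = payload
```

The design choice is the same one the slide describes: the library stays independent of how the matrix is stored, at the price of a driver loop that the user must write.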

Code structure - Fortran function notation
In the names below, X denotes the data type (s, d, c or z) and Y the problem type (s for symmetric, n for nonsymmetric).
Reverse Communication Functions:
- XYaupd: RCF using IRAM or IRLM.
Post-processing functions:
- XYeupd: return the computed eigenvalues/eigenvectors.
- Xneigh: compute Ritz values and error bounds for the nonsymmetric case.
- Xseigt: compute Ritz values and error bounds for the symmetric case.
Other (auxiliary) functions:
- XYaup2
- Xgetv0
- XYconv
- XYapps

Code structure - C++ Implementation
- Fortunately for us, a C++ version is now available, the latest version being released in 2000 (although still in beta test).
- The library is actually only an interface to the Fortran code, written by D. Sorensen and F. Gomes, but it has a number of improvements.
- The library offers many Class Templates for defining your operator, in order to avoid the complications of the Reverse Communication Interface (although this is still available if required).

Code structure - C++ function notation
- Either use a simple 'Matlab-like' function, AREig,
- or use the various functions/templates for defining the matrix or operation (these use SuperLU (CSC format), UMFPack (CSC format) or LAPACK (banded or dense)).
- Choose the function according to the problem (eigenvalue, SVD etc.).

The AREig function
The simple way to use ARPACK++! There are many different overloaded functions designed to take different parameters, e.g. for eigenvalues and eigenvectors required from a matrix in CSC form:
    int AREig(FLOAT EigValR[ ], FLOAT EigValI[ ], FLOAT EigVec[ ], int n, int nnz, FLOAT A[ ], int irow[ ], int pcol[ ], int nev,...)
- EigValR (real part of eigenvalues)
- EigValI (imaginary part of eigenvalues)
- EigVec (eigenvectors)
- n (dimension)
- nnz, A, irow, pcol (CSC form for matrix)
- nev (number of eigenvalues required)
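As a small illustration of the CSC (compressed sparse column) arguments listed above, the following Python sketch stores a 3 by 3 matrix in arrays named `A`, `irow` and `pcol`, as in the signature, and applies it to a vector. Zero-based indices are assumed here, and `csc_matvec` is a hypothetical helper, not part of ARPACK++.

```python
def csc_matvec(n, A, irow, pcol, x):
    """Return y = M*x for an n x n matrix M stored in CSC form.

    A holds the nonzero values column by column, irow the row index of
    each value, and pcol[j]..pcol[j+1]-1 the positions of column j.
    """
    y = [0.0] * n
    for j in range(n):
        for p in range(pcol[j], pcol[j + 1]):
            y[irow[p]] += A[p] * x[j]
    return y

# Example matrix:
#   [ 4 0 1 ]
#   [ 0 3 0 ]
#   [ 2 0 5 ]
A = [4.0, 2.0, 3.0, 1.0, 5.0]   # nonzeros, column by column
irow = [0, 2, 1, 0, 2]          # row index of each nonzero
pcol = [0, 2, 3, 5]             # start of each column, plus end marker
nnz = len(A)

y = csc_matvec(3, A, irow, pcol, [1.0, 1.0, 1.0])  # row sums: [5.0, 3.0, 7.0]
```

The cost of the product is proportional to nnz, which is what makes sparse matrices a good fit for a method that only needs matrix-vector products.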

Expected Performance - according to the documentation!
Asymptotic performance:
- For a fixed number k of requested eigenvalues and a fixed length ncv Arnoldi basis, the computational cost scales linearly with n.
- The rate of execution (in FLOPS) for the IRA iteration is asymptotic to the rate of execution of Xgemv, i.e. time = O(n k^2).
Fortran vs C++ performance:
- Performance is very much comparable in the real variable case.
- Performance is much worse in C++ for the complex case. Matrix-vector multiplications are of the order of 750% slower (how?!)
- In their complex test case, which was VERY sparse, performance was down 31% on the Fortran implementation.

Implementation - Matlab vs ARPACK vs ARPACK++
We implemented six test problems on Henrici:
- Problem 1 - 1D Laplacian (Real Symmetric)
- Problem 2 - 2D Harmonic Oscillator (Real Symmetric)
- Problem 3 - Convection-Diffusion Operator (Real Non-symmetric)
- Problem 4 - Squire Equation (Complex)
- Problem 5 - Toy Problem (Real Symmetric)
- Problem 6 - Grcar Matrix (Real Non-symmetric, highly non-normal)

Problem 1 - 1D Laplacian
Eigenvalues of the 1D Laplacian operator:
$$ -\frac{d^2 u}{dx^2} = \lambda u, \qquad x \in [0, 1] \tag{1} $$
Discretized using central finite differences.
Benchmarking was carried out by measuring the time taken to calculate the smallest 4 eigenvalues using implementations in ARPACK and ARPACK++, in comparison with the Matlab 'eigs'.
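A rough sketch of this discretization follows, assuming homogeneous Dirichlet boundary conditions u(0) = u(1) = 0 (the slides do not state the boundary conditions). With N interior points and h = 1/(N+1), central differences give the matrix tridiag(-1, 2, -1)/h^2, whose smallest eigenvalue is 4 sin^2(pi h/2)/h^2. The code finds it by inverse iteration with a tridiagonal solve; this stands in for the eigensolvers benchmarked in the talk and is not how ARPACK computes it. The function names are illustrative.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm. lower[i] = A[i][i-1], upper[i] = A[i][i+1];
    lower[0] and upper[-1] are ignored. No pivoting is performed."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def smallest_laplacian_eigenvalue(N, iterations=50):
    """Inverse iteration on tridiag(-1, 2, -1) / h^2 with h = 1/(N+1)."""
    h = 1.0 / (N + 1)
    lower = [-1.0 / h**2] * N
    diag = [2.0 / h**2] * N
    upper = [-1.0 / h**2] * N
    x = [1.0] * N
    for _ in range(iterations):
        y = solve_tridiagonal(lower, diag, upper, x)
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    # Rayleigh quotient x^T A x for the normalized iterate.
    Ax = [(2.0 * x[i]
           - (x[i - 1] if i > 0 else 0.0)
           - (x[i + 1] if i < N - 1 else 0.0)) / h**2
          for i in range(N)]
    return sum(a * b for a, b in zip(x, Ax))

N = 100
lam = smallest_laplacian_eigenvalue(N)
exact = 4.0 * (N + 1) ** 2 * math.sin(math.pi / (2.0 * (N + 1))) ** 2
```

Both `lam` and `exact` approximate the continuous eigenvalue pi^2, with a discretization error of order h^2.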

Problem 1 - Results
[Figure: log-log plot of time against N for ARPACK, ARPACK++ and MATLAB, in double and single precision.]

Problem 2 - 2D Harmonic Oscillator
Eigenvalues of the harmonic oscillator:
$$ -\frac{\partial^2 u}{\partial x^2} - \frac{\partial^2 u}{\partial y^2} + (x^2 + y^2)\,u = \lambda u, \qquad (x, y) \in [-5, 5] \times [-5, 5] \tag{2} $$
Discretized using central finite differences.
Benchmarking was carried out by measuring the time taken to calculate the smallest 4 eigenvalues using implementations in ARPACK and ARPACK++, in comparison with the Matlab 'eigs'.

Problem 2 - Results
[Figure: log-log plot of time against N for ARPACK, ARPACK++ and MATLAB, in double and single precision.]

Problem 3 - Convection-Diffusion Operator
Eigenvalues of the convection-diffusion operator:
$$ -\Delta u + 2\,a \cdot \nabla u = \lambda u, \qquad (x, y) \in [0, 1] \times [0, 1], \qquad a = (-1, 2)^T \tag{3} $$
Discretized using central finite differences.
Benchmarking was carried out by measuring the time taken to calculate the smallest 4 eigenvalues using an implementation in ARPACK++, in comparison with the Matlab 'eigs'.

Problem 3 - Results
[Figure: log-log plot of time against N for ARPACK++ and MATLAB, in double and single precision.]

Problem 4 - Squire Equation
Eigenvalues of the Squire equation:
$$ \left[ U(y) - \frac{1}{i k\, Re} \left( \frac{d^2}{dy^2} - k^2 \right) \right] \omega = c\,\omega \tag{4} $$
where U(y) = 1 - y^2, k = 1.0, Re = 2000.
The equation was again discretized using central finite differences.
Benchmarking was carried out by measuring the time taken to calculate the 4 eigenvalues of largest imaginary part using an implementation in ARPACK++, in comparison with the Matlab 'eigs'.

Problem 4 - Results
[Figure: log-log plot of time against N for ARPACK++ and MATLAB.]

Problem 5 - Toy Problem
We now look at the eigenvalues of a 'toy' problem, to try to obtain O(N) time dependence for both Matlab and ARPACK:
$$ A x = \lambda x \tag{5} $$
where A is tridiagonal with 1.002 on the diagonal and 0.001 on the off-diagonals.
Benchmarking was carried out by measuring the time taken to calculate the smallest 4 eigenvalues using implementations in both ARPACK and ARPACK++, in comparison with the Matlab 'eigs'.
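For reference, the spectrum of this toy matrix is known in closed form. The following is the standard result for an N by N tridiagonal Toeplitz matrix with diagonal entry a and off-diagonal entry b, applied with a = 1.002 and b = 0.001 as read from the slide; it is added here as background and is not taken from the talk.

```latex
\lambda_j = a + 2b \cos\!\left( \frac{j\pi}{N+1} \right)
          = 1.002 + 0.002 \cos\!\left( \frac{j\pi}{N+1} \right),
\qquad j = 1, \dots, N,
\qquad\text{so}\qquad 1.000 < \lambda_j < 1.004 .
```

The whole spectrum therefore lies in a narrow interval just above 1 for every N, and the smallest eigenvalues correspond to j = N, N-1, and so on. The set of eigenvalues is unchanged if the off-diagonal entries are negated.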

Problem 5 - Results
[Figure: log-log plot of time against N for ARPACK, ARPACK++ and MATLAB, in double and single precision, with an annotation marking a gap of "factor ~10".]

Problem 6 - Grcar Matrix
We finally look at the eigenvalues of the Grcar matrix. The use of this matrix, and its non-normality in particular, was designed to force Matlab to use stricter error bounds and therefore make it comparable to ARPACK.
$$
A = \begin{pmatrix}
1 & 1 & 1 & 1 & & & & \\
-1 & 1 & 1 & 1 & 1 & & & \\
 & -1 & 1 & 1 & 1 & 1 & & \\
 & & \ddots & \ddots & \ddots & \ddots & \ddots & \\
 & & & -1 & 1 & 1 & 1 & 1 \\
 & & & & -1 & 1 & 1 & 1 \\
 & & & & & -1 & 1 & 1 \\
 & & & & & & -1 & 1
\end{pmatrix} \tag{6}
$$
Benchmarking was carried out by measuring the time taken to calculate the 4 eigenvalues of smallest magnitude using implementations in both ARPACK and ARPACK++, in comparison with the Matlab 'eigs'.
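As a sketch of how such a test matrix might be assembled for a sparse eigensolver, the code below builds the Grcar matrix (-1 on the subdiagonal, 1 on the diagonal and on the first three superdiagonals) in CSC form with zero-based indices. The helper names `grcar_csc` and `csc_to_dense` are assumptions made for this illustration; they are not taken from the talk or from ARPACK++.

```python
def grcar_csc(n, k=3):
    """Return (A, irow, pcol) for the n x n Grcar matrix in CSC form.

    Column j holds 1 in rows max(0, j-k)..j and -1 in row j+1 (if any).
    """
    A, irow, pcol = [], [], [0]
    for j in range(n):
        for i in range(max(0, j - k), j + 1):
            A.append(1.0)
            irow.append(i)
        if j + 1 < n:
            A.append(-1.0)
            irow.append(j + 1)
        pcol.append(len(A))  # end of column j, start of column j+1
    return A, irow, pcol

def csc_to_dense(n, A, irow, pcol):
    """Expand CSC arrays into a list of rows, for inspection only."""
    dense = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for p in range(pcol[j], pcol[j + 1]):
            dense[irow[p]][j] = A[p]
    return dense

A, irow, pcol = grcar_csc(5)
dense = csc_to_dense(5, A, irow, pcol)
```

Each column has at most five nonzeros, so a matrix-vector product costs order n operations, which matches the kind of structured matrix the method is intended for.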

Problem 6 - Results
[Figure: log-log plot of time against N for ARPACK++ and MATLAB.]

Problem Solving Squad
Find the 2-norm condition number of the Toeplitz matrix
$$
\begin{pmatrix}
3 & 4 & & & & 1 & 0 \\
0 & 3 & 4 & & & & 1 \\
1 & 0 & 3 & 4 & & & \\
 & \ddots & \ddots & \ddots & \ddots & & \\
 & & 1 & 0 & 3 & 4 & \\
 & & & 1 & 0 & 3 & 4 \\
4 & & & & 1 & 0 & 3
\end{pmatrix} \tag{7}
$$
with dimension n = 2^{20}.

Problem Solving Squad - BRUTE FORCE
As a demonstration of Matlab's 'svds' and the functions in ARPACK, we show the time taken to calculate the norm simply by generating the sparse matrix and finding the required singular values:
[Figure: log-log plot of time against N for ARPACK++ and MATLAB.]
(Note that the sensible method is to extrapolate the results for small n to n = 2^{20}.)

Conclusions
- Our results show that Matlab is clearly the package of choice, offering ease of use and good speed.
- We would expect ARPACK to be comparable, as it underlies Matlab's 'eigs', but we can only demonstrate this in the highly non-normal case.
- From an implementation point of view, ARPACK++ is vastly preferable to ARPACK, but it should be used only cautiously in the complex case!