Block Matrix Computations and the Singular Value Decomposition: A Tale of Two Ideas


Block Matrix Computations and the Singular Value Decomposition: A Tale of Two Ideas. Charles F. Van Loan, Department of Computer Science, Cornell University. Supported in part by NSF contract CCR-9901988.

Block Matrices. A block matrix is a matrix with matrix entries, e.g.,

A = [ A_11  A_12 ; A_21  A_22 ],   A_ij ∈ IR^{3×2}.

Operations are pretty much business as usual, e.g.,

[ A_11  A_12 ; A_21  A_22 ]^T = [ A_11^T  A_21^T ; A_12^T  A_22^T ]

and the (1,1) block of [ A_11  A_12 ; A_21  A_22 ]^T [ A_11  A_12 ; A_21  A_22 ] is A_11^T A_11 + A_21^T A_21, etc.
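The following is a small numpy sketch, not part of the talk, of the "business as usual" claim: block assembly, the block transpose, and the (1,1) block of A^T A computed two ways.

```python
import numpy as np

# Assemble a 2-by-2 block matrix with 3-by-2 blocks.
rng = np.random.default_rng(0)
A11, A12, A21, A22 = (rng.standard_normal((3, 2)) for _ in range(4))
A = np.block([[A11, A12], [A21, A22]])                      # 6-by-4

# Block transpose: transpose each block and swap the off-diagonal ones.
assert np.allclose(A.T, np.block([[A11.T, A21.T], [A12.T, A22.T]]))

# The (1,1) block of A^T A is A11^T A11 + A21^T A21, "business as usual".
assert np.allclose((A.T @ A)[:2, :2], A11.T @ A11 + A21.T @ A21)
```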

E.g., Strassen Multiplication:

[ C_11  C_12 ; C_21  C_22 ] = [ A_11  A_12 ; A_21  A_22 ] [ B_11  B_12 ; B_21  B_22 ]

P_1 = (A_11 + A_22)(B_11 + B_22)
P_2 = (A_21 + A_22) B_11
P_3 = A_11 (B_12 - B_22)
P_4 = A_22 (B_21 - B_11)
P_5 = (A_11 + A_12) B_22
P_6 = (A_21 - A_11)(B_11 + B_12)
P_7 = (A_12 - A_22)(B_21 + B_22)

C_11 = P_1 + P_4 - P_5 + P_7
C_12 = P_3 + P_5
C_21 = P_2 + P_4
C_22 = P_1 + P_3 - P_2 + P_6
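Here is a hedged illustration, not from the slides: one level of Strassen recursion in numpy, with ordinary multiplication used on the half-size blocks and a final check against A @ B.

```python
import numpy as np

def strassen_once(A, B):
    """One level of Strassen multiplication for square matrices of even order.
    The seven products P1..P7 follow the slide; plain @ is used on the blocks."""
    n = A.shape[0] // 2
    A11, A12, A21, A22 = A[:n, :n], A[:n, n:], A[n:, :n], A[n:, n:]
    B11, B12, B21, B22 = B[:n, :n], B[:n, n:], B[n:, :n], B[n:, n:]
    P1 = (A11 + A22) @ (B11 + B22)
    P2 = (A21 + A22) @ B11
    P3 = A11 @ (B12 - B22)
    P4 = A22 @ (B21 - B11)
    P5 = (A11 + A12) @ B22
    P6 = (A21 - A11) @ (B11 + B12)
    P7 = (A12 - A22) @ (B21 + B22)
    C11 = P1 + P4 - P5 + P7
    C12 = P3 + P5
    C21 = P2 + P4
    C22 = P1 + P3 - P2 + P6
    return np.block([[C11, C12], [C21, C22]])

rng = np.random.default_rng(1)
A, B = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
assert np.allclose(strassen_once(A, B), A @ B)
```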

Singular Value Decomposition (SVD). If A ∈ IR^{m×n} then there exist orthogonal U ∈ IR^{m×m} and V ∈ IR^{n×n} so that U^T A V = Σ = diag(σ_1, ..., σ_n), where σ_1 ≥ σ_2 ≥ ... ≥ σ_n are the singular values.

Singular Value Decomposition (SVD). If A ∈ IR^{m×n} then there exist orthogonal U ∈ IR^{m×m} and V ∈ IR^{n×n} so that U^T A V = Σ = diag(σ_1, ..., σ_n), where σ_1 ≥ σ_2 ≥ ... ≥ σ_n are the singular values. Fact 1. The columns of V = [ v_1 | ... | v_n ] and U = [ u_1 | ... | u_n ] are the right and left singular vectors and they are related: A v_j = σ_j u_j and A^T u_j = σ_j v_j for j = 1:n.

Singular Value Decomposition (SVD). If A ∈ IR^{m×n} then there exist orthogonal U ∈ IR^{m×m} and V ∈ IR^{n×n} so that U^T A V = Σ = diag(σ_1, ..., σ_n), where σ_1 ≥ σ_2 ≥ ... ≥ σ_n are the singular values. Fact 2. The SVD of A is related to the eigen-decompositions of A^T A and A A^T: V^T (A^T A) V = diag(σ_1^2, ..., σ_n^2) and U^T (A A^T) U = diag(σ_1^2, ..., σ_n^2, 0, ..., 0), with m - n trailing zeros.

Singular Value Decomposition (SVD). If A ∈ IR^{m×n} then there exist orthogonal U ∈ IR^{m×m} and V ∈ IR^{n×n} so that U^T A V = Σ = diag(σ_1, ..., σ_n), where σ_1 ≥ σ_2 ≥ ... ≥ σ_n are the singular values. Fact 3. The smallest singular value is the distance from A to the set of rank-deficient matrices: σ_min = min_{rank(B) < n} ||A - B||_F.

Singular Value Decomposition (SVD). If A ∈ IR^{m×n} then there exist orthogonal U ∈ IR^{m×m} and V ∈ IR^{n×n} so that U^T A V = Σ = diag(σ_1, ..., σ_n), where σ_1 ≥ σ_2 ≥ ... ≥ σ_n are the singular values. Fact 4. The matrix σ_1 u_1 v_1^T is the closest rank-1 matrix to A, i.e., it solves the problem min_{rank(B) = 1} ||A - B||_F.
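A quick numpy check of Facts 1, 2, and 4 (a sketch, not part of the slides); full_matrices=True gives the square U and V used above.

```python
import numpy as np

m, n = 6, 4
A = np.random.default_rng(2).standard_normal((m, n))
U, s, Vt = np.linalg.svd(A, full_matrices=True)
V = Vt.T

# Fact 1: A v_j = sigma_j u_j and A^T u_j = sigma_j v_j for j = 1:n.
for j in range(n):
    assert np.allclose(A @ V[:, j], s[j] * U[:, j])
    assert np.allclose(A.T @ U[:, j], s[j] * V[:, j])

# Fact 2: the eigenvalues of A^T A are the squared singular values.
assert np.allclose(np.sort(np.linalg.eigvalsh(A.T @ A)), np.sort(s**2))

# Fact 4: sigma_1 u_1 v_1^T is the nearest rank-1 matrix in the Frobenius norm;
# the error it leaves behind is sqrt(sigma_2^2 + ... + sigma_n^2).
A1 = s[0] * np.outer(U[:, 0], V[:, 0])
assert np.isclose(np.linalg.norm(A - A1, 'fro'), np.sqrt(np.sum(s[1:]**2)))
```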

The High-Level Message. It is important to be able to think at the block level because of problem structure. It is important to be able to develop block matrix algorithms. There is a progression: Simple Linear Algebra → Block Linear Algebra → Multilinear Algebra.

Reasoning at the Block Level

Uncontrollability. The system ẋ = Ax + Bu, with A ∈ IR^{n×n}, B ∈ IR^{n×p}, n > p, is uncontrollable if G = ∫_0^t e^{A(t-τ)} B B^T e^{A^T(t-τ)} dτ is singular.

Uncontrollability. The system ẋ = Ax + Bu, with A ∈ IR^{n×n}, B ∈ IR^{n×p}, n > p, is uncontrollable if G = ∫_0^t e^{A(t-τ)} B B^T e^{A^T(t-τ)} dτ is singular.

Ã = [ A  BB^T ; 0  -A^T ],   e^{Ãt} = [ F_11  F_12 ; 0  F_22 ],   G = F_11 F_12^T

Nearness to Uncontrollability. The system ẋ = Ax + Bu, with A ∈ IR^{n×n}, B ∈ IR^{n×p}, n > p, is nearly uncontrollable if the minimum singular value of G = ∫_0^t e^{A(t-τ)} B B^T e^{A^T(t-τ)} dτ is small.

Ã = [ A  BB^T ; 0  -A^T ],   e^{Ãt} = [ F_11  F_12 ; 0  F_22 ],   G = F_11 F_12^T
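Below is a sketch, not from the slides, of how the block-exponential idea can be coded with scipy.linalg.expm. The block matrix uses the [A, BB^T; 0, -A^T] form and the recovery G = F_11 F_12^T exactly as reconstructed above, so treat that sign convention as an assumption of this example.

```python
import numpy as np
from scipy.linalg import expm, svdvals

def gramian_via_expm(A, B, t):
    """Sketch of the block-exponential trick: embed A and B B^T in the block
    matrix [[A, B B^T], [0, -A^T]] and recover the Gramian from the (1,1) and
    (1,2) blocks of its exponential as G = F11 @ F12.T (assumed convention)."""
    n = A.shape[0]
    Atil = np.block([[A, B @ B.T], [np.zeros((n, n)), -A.T]])
    F = expm(Atil * t)
    F11, F12 = F[:n, :n], F[:n, n:]
    return F11 @ F12.T

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 1))
G = gramian_via_expm(A, B, t=1.0)
print("sigma_min(G) =", svdvals(G)[-1])   # small value => nearly uncontrollable
```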

Developing Algorithms at the Block Level

Block Matrix Factorizations: A Challenge. By a block algorithm we mean an algorithm that is rich in matrix-matrix multiplication. Is there a block algorithm for Gaussian elimination? I.e., is there a way to rearrange the O(n^3) operations so that the implementation spends at most O(n^2) time not doing matrix-matrix multiplication?

Why? Re-use ideology: when you touch data, you want to use it a lot. Not all linear algebra operations are equal in this regard.

Level   Example          Data      Work
1       α = y^T z        O(n)      O(n)
1       y = y + αz       O(n)      O(n)
2       y = y + As       O(n^2)    O(n^2)
2       A = A + yz^T     O(n^2)    O(n^2)
3       A = A + BC       O(n^2)    O(n^3)

Here, α is a scalar, y, z are vectors, and A, B, C are matrices.

Scalar LU

[ a_11 a_12 a_13 ; a_21 a_22 a_23 ; a_31 a_32 a_33 ] = [ 1 0 0 ; l_21 1 0 ; l_31 l_32 1 ] [ u_11 u_12 u_13 ; 0 u_22 u_23 ; 0 0 u_33 ]

a_11 = u_11                            u_11 = a_11
a_12 = u_12                            u_12 = a_12
a_13 = u_13                            u_13 = a_13
a_21 = l_21 u_11                       l_21 = a_21 / u_11
a_31 = l_31 u_11                       l_31 = a_31 / u_11
a_22 = l_21 u_12 + u_22                u_22 = a_22 - l_21 u_12
a_23 = l_21 u_13 + u_23                u_23 = a_23 - l_21 u_13
a_32 = l_31 u_12 + l_32 u_22           l_32 = (a_32 - l_31 u_12) / u_22
a_33 = l_31 u_13 + l_32 u_23 + u_33    u_33 = a_33 - l_31 u_13 - l_32 u_23

Recursive Description. If α ∈ IR, v, w ∈ IR^{n-1}, and B ∈ IR^{(n-1)×(n-1)}, then

A = [ α  w^T ; v  B ] = [ 1  0 ; v/α  L̃ ] [ α  w^T ; 0  Ũ ]

is the LU factorization of A if L̃ Ũ = Ã is the LU factorization of Ã = B - v w^T / α. Rich in level-2 operations.
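A minimal sketch (not from the slides) of the unblocked, outer-product form of this recursion: peel off the first row and column, do the rank-1 level-2 update B - v w^T / α, and repeat on the trailing matrix. No pivoting, so all pivots are assumed nonzero.

```python
import numpy as np

def lu_nopivot(A):
    """Outer-product LU without pivoting, following the recursive description.
    A sketch only: assumes every pivot alpha encountered is nonzero."""
    A = A.astype(float).copy()
    n = A.shape[0]
    L, U = np.eye(n), np.zeros((n, n))
    for k in range(n):
        alpha = A[k, k]
        U[k, k:] = A[k, k:]                 # first row of the current block -> U
        L[k+1:, k] = A[k+1:, k] / alpha     # v / alpha -> L
        # rank-1 update of the trailing submatrix: B - v w^T / alpha
        A[k+1:, k+1:] -= np.outer(L[k+1:, k], U[k, k+1:])
    return L, U

A = np.array([[4., 3., 2.], [8., 9., 5.], [4., 7., 9.]])
L, U = lu_nopivot(A)
assert np.allclose(L @ U, A)
```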

Block LU: Recursive Description

[ L_11  0 ; L_21  L_22 ] [ U_11  U_12 ; 0  U_22 ] = [ A_11  A_12 ; A_21  A_22 ]    (first block row/column of width p, second of width n - p)

A_11 = L_11 U_11                 Get L_11, U_11 via LU of A_11
A_21 = L_21 U_11                 Solve triangular systems for L_21
A_12 = L_11 U_12                 Solve triangular systems for U_12
A_22 = L_21 U_12 + L_22 U_22     Form Ã = A_22 - L_21 U_12; get L_22, U_22 via LU of Ã

Block LU: Recursive Description

[ L_11  0 ; L_21  L_22 ] [ U_11  U_12 ; 0  U_22 ] = [ A_11  A_12 ; A_21  A_22 ]    (first block row/column of width p, second of width n - p)

A_11 = L_11 U_11                 Get L_11, U_11 via LU of A_11           O(p^3)
A_21 = L_21 U_11                 Solve triangular systems for L_21       O(np^2)
A_12 = L_11 U_12                 Solve triangular systems for U_12       O(np^2)
A_22 = L_21 U_12 + L_22 U_22     Form Ã = A_22 - L_21 U_12               O(n^2 p)
Recur:                           Get L_22, U_22 via LU of Ã

Rich in level-3 operations!
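The sketch below (not part of the talk) turns the four block equations into a right-looking block LU in numpy/scipy. It assumes the leading principal blocks are nonsingular, so no pivoting is attempted; almost all of the arithmetic sits in the level-3 update A22 - L21 @ U12.

```python
import numpy as np
from scipy.linalg import solve_triangular

def lu_unblocked(A):
    """Unblocked LU without pivoting (helper for the p-by-p diagonal blocks)."""
    A = A.copy()
    n = A.shape[0]
    L, U = np.eye(n), np.zeros((n, n))
    for k in range(n):
        U[k, k:] = A[k, k:]
        L[k+1:, k] = A[k+1:, k] / A[k, k]
        A[k+1:, k+1:] -= np.outer(L[k+1:, k], U[k, k+1:])
    return L, U

def block_lu(A, p):
    """Right-looking block LU following the slide: factor the diagonal block,
    solve triangular systems for L21 and U12, then do the level-3 update
    Atilde = A22 - L21 @ U12.  Sketch only: no pivoting."""
    A = A.astype(float).copy()
    n = A.shape[0]
    L, U = np.eye(n), np.zeros((n, n))
    for k in range(0, n, p):
        e = min(k + p, n)
        L[k:e, k:e], U[k:e, k:e] = lu_unblocked(A[k:e, k:e])        # A11 = L11 U11
        if e < n:
            # A21 = L21 U11  and  A12 = L11 U12  (triangular solves)
            L[e:, k:e] = solve_triangular(U[k:e, k:e].T, A[e:, k:e].T, lower=True).T
            U[k:e, e:] = solve_triangular(L[k:e, k:e], A[k:e, e:],
                                          lower=True, unit_diagonal=True)
            A[e:, e:] -= L[e:, k:e] @ U[k:e, e:]                    # level-3 update
    return L, U

rng = np.random.default_rng(4)
A = rng.standard_normal((8, 8)) + 10 * np.eye(8)    # safely nonsingular leading blocks
L, U = block_lu(A, p=3)
assert np.allclose(L @ U, A)
```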

Consider Ã = A_22 - L_21 U_12 with p = 3: p must be big enough so that the advantage of data re-use is realized.

Block Algorithms for (Some) Matrix Factorizations

LU with pivoting: PA = LU, with P a permutation, L lower triangular, U upper triangular.
Cholesky factorization for symmetric positive definite A: A = G G^T, with G lower triangular.
The QR factorization for rectangular matrices: A = QR, with Q ∈ IR^{m×m} orthogonal and R ∈ IR^{m×n} upper triangular.

Block Matrix Factorization Algorithms via Aggregation

Developing a Block QR Factorization. The standard algorithm computes Q as a product of Householder reflections, Q^T A = H_n ··· H_1 A = R. After 2 steps, H_2 H_1 A has zeros below the diagonal in its first two columns. The H matrices look like this: H = I - 2 v v^T, with v a unit 2-norm column vector.

Developing a Block QR Factorization. The standard algorithm computes Q as a product of Householder reflections, Q^T A = H_n ··· H_1 A = R. After 2 steps, H_2 H_1 A has zeros below the diagonal in its first two columns, the zeros in column 2 being the ones just introduced by H_2. The H matrices look like this: H = I - 2 v v^T, with v a unit 2-norm column vector.

Developing a Block QR Factorization. The standard algorithm computes Q as a product of Householder reflections, Q^T A = H_n ··· H_1 A = R. After 3 steps, H_3 H_2 H_1 A has zeros below the diagonal in its first three columns. The H matrices look like this: H = I - 2 v v^T, with v a unit 2-norm column vector.
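For concreteness, here is a hedged sketch of the standard unblocked Householder QR; Q is accumulated explicitly only to make the A = QR check easy, which is not how a production code would proceed.

```python
import numpy as np

def householder_qr(A):
    """Unblocked Householder QR: Q^T A = R with Q = H_1 H_2 ... H_n, H = I - 2 v v^T."""
    A = A.astype(float).copy()
    m, n = A.shape
    Q = np.eye(m)
    for k in range(n):
        x = A[k:, k]
        v = x.copy()
        v[0] += np.copysign(np.linalg.norm(x), x[0])    # sign choice avoids cancellation
        nv = np.linalg.norm(v)
        if nv == 0.0:
            continue
        v /= nv                                         # unit 2-norm Householder vector
        A[k:, k:] -= 2.0 * np.outer(v, v @ A[k:, k:])   # apply H_k to the trailing block
        Q[:, k:]  -= 2.0 * np.outer(Q[:, k:] @ v, v)    # accumulate Q = H_1 ... H_k
    return Q, np.triu(A)

A = np.random.default_rng(5).standard_normal((6, 4))
Q, R = householder_qr(A)
assert np.allclose(Q @ R, A) and np.allclose(Q.T @ Q, np.eye(6))
```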

Aggregation. Partition A = [ A_1 | A_2 ], where A_1 has p columns and A_2 has n - p columns, p << n.

Generate the first p Householders based on A_1:  H_p ··· H_1 A_1 = R_11 (upper triangular).
Aggregate H_1, ..., H_p:  H_p ··· H_1 = I - 2 W Y^T, with W, Y ∈ IR^{m×p}.
Apply to the rest of the matrix:  (H_p ··· H_1) A = [ (H_p ··· H_1) A_1 | (I - 2 W Y^T) A_2 ].

The WY Representation.

Aggregation: (I - 2 W Y^T)(I - 2 v v^T) = I - 2 W_+ Y_+^T, where W_+ = [ W | (I - 2 W Y^T) v ] and Y_+ = [ Y | v ].

Application: A ← (I - 2 W Y^T) A = A - (2W)(Y^T A).
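The following numpy snippet (an illustration, not from the slides) builds W and Y with exactly the update rule above, checks that I - 2 W Y^T equals the product of the reflections, and applies the aggregate with the two matrix-matrix products A - (2W)(Y^T A).

```python
import numpy as np

rng = np.random.default_rng(6)
m, p = 8, 3
W = np.zeros((m, 0))
Y = np.zeros((m, 0))
for _ in range(p):
    v = rng.standard_normal(m)
    v /= np.linalg.norm(v)                       # unit 2-norm Householder vector
    w_new = v - 2 * W @ (Y.T @ v)                # (I - 2 W Y^T) v
    W = np.hstack([W, w_new[:, None]])           # W_+ = [ W | (I - 2WY^T) v ]
    Y = np.hstack([Y, v[:, None]])               # Y_+ = [ Y | v ]

# The aggregate equals the product of the reflections, in the order of the rule above.
H = np.eye(m)
for j in range(p):
    H = H @ (np.eye(m) - 2 * np.outer(Y[:, j], Y[:, j]))
assert np.allclose(np.eye(m) - 2 * W @ Y.T, H)

# Application is rich in matrix-matrix multiplication: A <- A - (2 W)(Y^T A).
A = rng.standard_normal((m, 5))
assert np.allclose(H @ A, A - 2 * W @ (Y.T @ A))
```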

The Curse of Similarity Transforms

A Block Householder Tridiagonalization? A symmetric. Compute orthogonal Q such that Q^T A Q = T is tridiagonal (zero outside the main diagonal and the first super- and subdiagonals). The standard algorithm computes Q as a product of Householder reflections, Q = H_1 ··· H_{n-2}.

A Block Householder Tridiagonalization? H_1 introduces zeros in the first column: H_1^T A is zero in entries 3 through 6 of column 1. It scrambles rows 2 through 6.

A Block Householder Tridiagonalization? Must also post-multiply: H_1^T A H_1 is zero in entries 3 through 6 of column 1 and of row 1. H_1 scrambles columns 2 through 6.

A Block Householder Tridiagonalization? H_2 introduces zeros into the second column of H_2^T H_1^T A H_1 H_2. Note that because of H_1's impact, H_2 depends on all of A's entries. The H_i can be aggregated, but A must be completely updated along the way, which destroys the advantage.

Jacobi Methods for the Symmetric Eigenproblem. These methods do not initially reduce A to condensed form. They repeatedly reduce the sum of squares of the off-diagonal elements.

A ← J^T A J,   J = [ 1 0 0 0 ; 0 c 0 s ; 0 0 1 0 ; 0 -s 0 c ]   (a rotation in the (2,4) plane)

The (2,4) subproblem:

[ d_1 0 ; 0 d_2 ] = [ c s ; -s c ]^T [ a_22 a_24 ; a_42 a_44 ] [ c s ; -s c ],   c^2 + s^2 = 1,

and the off-diagonal sum of squares is reduced by 2 a_24^2.

Jacobi Methods for the Symmetric Eigenproblem. Cycle through the subproblems: (1,2), (1,3), (1,4), (2,3), (2,4), (3,4).

(1,2): A ← J^T A J,   J = [ c s 0 0 ; -s c 0 0 ; 0 0 1 0 ; 0 0 0 1 ]
(1,3): A ← J^T A J,   J = [ c 0 s 0 ; 0 1 0 0 ; -s 0 c 0 ; 0 0 0 1 ]
etc.
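As an illustration (a sketch, not from the slides), here is one cyclic sweep in numpy using the standard 2-by-2 symmetric Schur formulas; repeating a few sweeps drives the off-diagonal mass toward zero and leaves the eigenvalues on the diagonal.

```python
import numpy as np

def jacobi_sweep(A):
    """One cyclic Jacobi sweep over the subproblems (1,2),(1,3),...,(n-1,n).
    Each rotation zeroes A[p,q] and reduces off(A)^2 by 2*A[p,q]^2.
    A hedged sketch; A is overwritten by J^T A J for each subproblem."""
    A = A.copy()
    n = A.shape[0]
    for p in range(n - 1):
        for q in range(p + 1, n):
            if A[p, q] == 0.0:
                continue
            # choose c, s with c^2 + s^2 = 1 so the (p,q) 2-by-2 block is diagonalized
            tau = (A[q, q] - A[p, p]) / (2.0 * A[p, q])
            t = np.sign(tau) / (abs(tau) + np.sqrt(1.0 + tau**2)) if tau != 0 else 1.0
            c = 1.0 / np.sqrt(1.0 + t**2)
            s = t * c
            J = np.eye(n)
            J[p, p] = c; J[q, q] = c; J[p, q] = s; J[q, p] = -s
            A = J.T @ A @ J
    return A

A = np.random.default_rng(7).standard_normal((5, 5)); A = A + A.T
off = lambda M: np.linalg.norm(M - np.diag(np.diag(M)))
B = A
for _ in range(6):
    B = jacobi_sweep(B)
print("off(A) =", off(A), "-> after 6 sweeps:", off(B))
print(np.sort(np.diag(B)), np.sort(np.linalg.eigvalsh(A)))   # should agree
```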

Parallel Jacobi. Parallel ordering: {(1,2), (3,4)}, {(1,4), (2,3)}, {(1,3), (2,4)}.

(1,2): A ← J^T A J,   J = [ c s 0 0 ; -s c 0 0 ; 0 0 1 0 ; 0 0 0 1 ]
(3,4): A ← J^T A J,   J = [ 1 0 0 0 ; 0 1 0 0 ; 0 0 c s ; 0 0 -s c ]

The rotations within each group touch disjoint rows and columns, so they can be applied in parallel.

Block Jacobi.

A ← Q^T A Q,   Q = [ I 0 0 0 ; 0 Q_11 0 Q_12 ; 0 0 I 0 ; 0 Q_21 0 Q_22 ]

The (2,4) subproblem:

[ D_1 0 ; 0 D_2 ] = [ Q_11 Q_12 ; Q_21 Q_22 ]^T [ A_22 A_24 ; A_42 A_44 ] [ Q_11 Q_12 ; Q_21 Q_22 ]

Convergence analysis requires an understanding of 2-by-2 block matrices that are orthogonal...

The CS Decomposition. The blocks of an orthogonal matrix have related SVDs:

[ Q_11 Q_12 ; Q_21 Q_22 ] = [ U_1 0 ; 0 U_2 ] [ C S ; -S C ] [ V_1 0 ; 0 V_2 ]^T

C = diag(cos(θ_1), ..., cos(θ_m)),   S = diag(sin(θ_1), ..., sin(θ_m))

U_1^T Q_11 V_1 = C,   U_1^T Q_12 V_2 = S,   U_2^T Q_21 V_1 = -S,   U_2^T Q_22 V_2 = C
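A small numerical illustration of that statement (not from the slides): split a random orthogonal matrix into equal m-by-m blocks and compare singular values; by the CS decomposition, Q_11 and Q_22 share the cosines, Q_12 and Q_21 the sines, and matching cosine/sine pairs satisfy c^2 + s^2 = 1.

```python
import numpy as np

m = 4
Q, _ = np.linalg.qr(np.random.default_rng(8).standard_normal((2 * m, 2 * m)))
Q11, Q12, Q21, Q22 = Q[:m, :m], Q[:m, m:], Q[m:, :m], Q[m:, m:]

c = np.linalg.svd(Q11, compute_uv=False)          # cosines, in descending order
s = np.linalg.svd(Q12, compute_uv=False)          # sines, in descending order
assert np.allclose(c, np.linalg.svd(Q22, compute_uv=False))
assert np.allclose(s, np.linalg.svd(Q21, compute_uv=False))
assert np.allclose(c**2 + s[::-1]**2, 1.0)        # pair cosines and sines of the same angle
```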

The Curse of Sparsity

How to Extract Level-3 Performance? Suppose we want to solve a large sparse Ax = b problem. Iterative methods for this problem can often be dramatically accelerated if you can find a matrix M with two properties: it is easy/efficient to solve linear systems of the form Mz = r, and M approximates the essence of A. M is called a preconditioner and the original iteration is modified to effectively solve the equivalent linear system (M^{-1}A)x = M^{-1}b.

Idea: Kronecker Product Preconditioners

[ b_11 b_12 ; b_21 b_22 ] ⊗ [ c_11 c_12 c_13 ; c_21 c_22 c_23 ; c_31 c_32 c_33 ] =

b_11 c_11  b_11 c_12  b_11 c_13  b_12 c_11  b_12 c_12  b_12 c_13
b_11 c_21  b_11 c_22  b_11 c_23  b_12 c_21  b_12 c_22  b_12 c_23
b_11 c_31  b_11 c_32  b_11 c_33  b_12 c_31  b_12 c_32  b_12 c_33
b_21 c_11  b_21 c_12  b_21 c_13  b_22 c_11  b_22 c_12  b_22 c_13
b_21 c_21  b_21 c_22  b_21 c_23  b_22 c_21  b_22 c_22  b_22 c_23
b_21 c_31  b_21 c_32  b_21 c_33  b_22 c_31  b_22 c_32  b_22 c_33

Replicated block structure.

Some Properties and a Big Fact.

Properties:
(B ⊗ C)^T = B^T ⊗ C^T
(B ⊗ C)^{-1} = B^{-1} ⊗ C^{-1}
(B ⊗ C)(D ⊗ F) = BD ⊗ CF
B ⊗ (C ⊗ D) = (B ⊗ C) ⊗ D

Big Fact: If B and C are m-by-m, then the m^2-by-m^2 linear system (B ⊗ C) z = r can be solved in O(m^3) time rather than O(m^6) time.
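The Big Fact rests on the identity (B ⊗ C) vec(X) = vec(C X B^T): reshape r, do two m-by-m solves, reshape back. Here is a hedged numpy sketch (B and C assumed nonsingular); this is exactly why a preconditioner of the form M = B ⊗ C is cheap to apply.

```python
import numpy as np

def kron_solve(B, C, r):
    """Solve (B kron C) z = r in O(m^3) via vec(C X B^T) = (B kron C) vec(X):
    reshape r into R (column-major), solve C X B^T = R with two m-by-m solves,
    and flatten back.  Sketch only; B and C are assumed nonsingular."""
    m = B.shape[0]
    R = r.reshape(m, m, order='F')        # column-stacking "vec" convention
    X = np.linalg.solve(C, R)             # C Y = R        ->  Y = C^{-1} R
    X = np.linalg.solve(B, X.T).T         # Z B^T = Y      ->  Z = (B^{-1} Y^T)^T
    return X.reshape(-1, order='F')

rng = np.random.default_rng(9)
m = 5
B, C = rng.standard_normal((m, m)), rng.standard_normal((m, m))
r = rng.standard_normal(m * m)
z = kron_solve(B, C, r)
assert np.allclose(np.kron(B, C) @ z, r)
```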

Capturing Essence with a Kronecker Product. Given

A = [ A_11 A_12 A_13 ; A_21 A_22 A_23 ; A_31 A_32 A_33 ],   A_ij ∈ IR^{m×m},

choose B and C to minimize

|| A - B ⊗ C ||_F = || [ A_11 A_12 A_13 ; A_21 A_22 A_23 ; A_31 A_32 A_33 ] - [ b_11 C  b_12 C  b_13 C ; b_21 C  b_22 C  b_23 C ; b_31 C  b_32 C  b_33 C ] ||_F

The exact solution can be obtained via the SVD...

Solution of the min ||A - B ⊗ C||_F Problem. Make the blocks of A into vectors and arrange them in block-column-major order:

Ã = [ col(A_11)  col(A_21)  col(A_31)  col(A_12)  ...  col(A_33) ]

Compute the largest singular value σ_max and the corresponding singular vectors u_max and v_max. Then

B_opt = sqrt(σ_max) · reshape(v_max, 3, 3),   C_opt = sqrt(σ_max) · reshape(u_max, m, m).
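A sketch of the rearrangement-plus-SVD recipe in numpy (not the talk's code; the sqrt(σ_max) scaling split between B and C follows the reconstruction above). It recovers B and C from a matrix that is a perturbed Kronecker product.

```python
import numpy as np

def nearest_kron(A, mb, nb, mc, nc):
    """Nearest Kronecker product via rearrangement + rank-1 SVD (a sketch).
    The mc-by-nc blocks of A become the columns of Atilde in block-column-major
    order; block sizes are assumed to divide the dimensions of A exactly."""
    cols = []
    for j in range(nb):
        for i in range(mb):
            blk = A[i*mc:(i+1)*mc, j*nc:(j+1)*nc]
            cols.append(blk.reshape(-1, order='F'))        # col(A_ij)
    Atilde = np.column_stack(cols)
    U, s, Vt = np.linalg.svd(Atilde, full_matrices=False)
    B = np.sqrt(s[0]) * Vt[0].reshape(mb, nb, order='F')   # vec(B) = sqrt(sigma) v_max
    C = np.sqrt(s[0]) * U[:, 0].reshape(mc, nc, order='F') # vec(C) = sqrt(sigma) u_max
    return B, C

rng = np.random.default_rng(10)
m = 4
A = np.kron(rng.standard_normal((3, 3)), rng.standard_normal((m, m)))
A += 0.01 * rng.standard_normal(A.shape)                   # a near-Kronecker matrix
B, C = nearest_kron(A, 3, 3, m, m)
print("relative error:", np.linalg.norm(A - np.kron(B, C)) / np.linalg.norm(A))
```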

Conclusions. It is important to be able to think at the block level because of problem structure. It is important to be able to develop block matrix algorithms. There is a progression: Simple Linear Algebra → Block Linear Algebra → Multilinear Algebra (Thursday).