Parallel Numerical Algorithms
1 Parallel Numerical Algorithms
Chapter 6 Matrix Models, Section 6.2 Low Rank Approximation
Edgar Solomonik
Department of Computer Science, University of Illinois at Urbana-Champaign
CS 554 / CSE 512
Edgar Solomonik, Parallel Numerical Algorithms, 1 / 35
2 Outline

1. Truncated SVD
   - Fast Algorithms with Truncated SVD
2. Direct Computation; Indirect Computation
3. Randomized Approximation
   - Basics
   - Structured Randomized Factorization
4. HSS Matrix Vector Multiplication
   - Parallel HSS Matrix Vector Multiplication
3 Rank-k Singular Value Decomposition (SVD)

For any matrix A ∈ R^{m×n} of rank k there exists a factorization

  A = U D V^T

- U ∈ R^{m×k} is a matrix of orthonormal left singular vectors
- D ∈ R^{k×k} is a nonnegative diagonal matrix of singular values in decreasing order, σ_1 ≥ ⋯ ≥ σ_k
- V ∈ R^{n×k} is a matrix of orthonormal right singular vectors
4 Truncated SVD

Given A ∈ R^{m×n}, seek its best approximation of rank k < rank(A):

  B = argmin_{B ∈ R^{m×n}, rank(B) ≤ k} ‖A − B‖_2

Eckart-Young theorem: given the SVD

  A = [U_1 U_2] [D_1, 0; 0, D_2] [V_1 V_2]^T,   where D_1 is k × k,

the minimizer is B = U_1 D_1 V_1^T. This U_1 D_1 V_1^T is the rank-k truncated SVD of A, and

  ‖A − U_1 D_1 V_1^T‖_2 = min_{B ∈ R^{m×n}, rank(B) ≤ k} ‖A − B‖_2 = σ_{k+1}
5 Computational Cost

Given a rank-k truncated SVD A ≈ U D V^T of A ∈ R^{m×n} with m ≥ n:

- Computing y ≈ Ax requires O(mk) work: y = U(D(V^T x))
- Solving Ax = b approximately requires O(mk) work: x = V D^{−1} U^T b
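Both operations can be sketched in a few lines of NumPy; the helper names and test sizes below are illustrative assumptions, not part of the slides:

```python
import numpy as np

# A minimal sketch of the two O(mk) operations above, assuming a
# precomputed rank-k truncated SVD A ~= U @ np.diag(d) @ V.T.
def apply_truncated(U, d, V, x):
    # y ~= A x, evaluated right-to-left so no m-by-n matrix is formed
    return U @ (d * (V.T @ x))

def solve_truncated(U, d, V, b):
    # x ~= A^+ b via the pseudoinverse of the rank-k factorization
    return V @ ((U.T @ b) / d)

rng = np.random.default_rng(0)
m, n, k = 200, 100, 10
A = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))  # exactly rank k
Uf, df, Vft = np.linalg.svd(A, full_matrices=False)
U, d, V = Uf[:, :k], df[:k], Vft[:k].T

x = rng.standard_normal(n)
print(np.allclose(apply_truncated(U, d, V, x), A @ x))  # True
```

Because the test matrix is exactly rank k, the truncated factors reproduce A @ x to rounding error.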
6 Computing the Truncated SVD

- Reduction to bidiagonal form via two-sided orthogonal updates can be used to compute the full SVD
- Given the full SVD, we can obtain the truncated SVD by keeping only the largest k singular value/vector pairs
- Given a set of transformations Q_1, …, Q_s so that U = Q_1 ⋯ Q_s, we can obtain the leading k columns of U by computing

    U_1 = Q_1 (⋯ (Q_s [I; 0]))

- This method requires O(mn^2) work for the computation of singular values and O(mnk) for k singular vectors
7 Computing the Truncated SVD by Krylov Subspace Methods

- Seek the k ≪ m, n leading right singular vectors of A
- Find a basis for the Krylov subspace of B = A^T A
- Rather than computing B, compute the products Bx = A^T (Ax)
- For instance, do k + O(1) iterations of Lanczos to yield BQ = QH
- Left singular vectors can be obtained by working with C = A A^T
- This method requires O(mnk) work for k singular vectors
- However, Θ(k) sparse-matrix-vector multiplications are needed (high latency and low flop/byte ratio)
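This product-form idea is what library sparse SVD routines implement; for instance, SciPy's `scipy.sparse.linalg.svds` applies A and A^T as operators (a Lanczos-type iteration via ARPACK) without ever forming A^T A. The test matrix below is an illustrative assumption:

```python
import numpy as np
from scipy.sparse.linalg import svds

rng = np.random.default_rng(1)
# A low-rank test matrix; svds only applies A and A^T, never A^T A
A = rng.standard_normal((400, 9)) @ rng.standard_normal((9, 300))

U, d, Vt = svds(A, k=5)  # 5 leading singular triplets
# svds returns singular values in ascending order
full = np.linalg.svd(A, compute_uv=False)
print(np.allclose(d[::-1], full[:5]))  # True
```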
8 Generic Low-Rank Factorizations

- A matrix A ∈ R^{m×n} has rank k if, for some X ∈ R^{m×k} and Y ∈ R^{n×k} with k ≤ min(m, n), A = X Y^T
- If A = X Y^T (exact low-rank factorization), we can obtain the reduced SVD A = U D V^T via
  1. [U_1, R] = QR(X)
  2. [U_2, D, V] = SVD(R Y^T)
  3. U = U_1 U_2
  with cost O(mk^2), using an SVD of a k × n rather than an m × n matrix
- If instead ‖A − X Y^T‖_2 ≤ ε, then ‖A − U D V^T‖_2 ≤ ε
- So we can obtain a truncated SVD given an optimal generic low-rank approximation
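The three steps above can be sketched directly in NumPy; the function name and test sizes are illustrative assumptions:

```python
import numpy as np

# A sketch of the three-step conversion above: given an exact low-rank
# factorization A = X @ Y.T, recover a reduced SVD in O(m k^2) work by
# factorizing only small matrices.
def low_rank_to_svd(X, Y):
    U1, R = np.linalg.qr(X)                                  # 1. [U1, R] = QR(X)
    U2, d, Vt = np.linalg.svd(R @ Y.T, full_matrices=False)  # 2. SVD of k-by-n R Y^T
    return U1 @ U2, d, Vt.T                                  # 3. U = U1 U2

rng = np.random.default_rng(2)
m, n, k = 300, 200, 8
X, Y = rng.standard_normal((m, k)), rng.standard_normal((n, k))
U, d, V = low_rank_to_svd(X, Y)
print(np.allclose(U @ (d[:, None] * V.T), X @ Y.T))  # True: reproduces A
```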
9 Rank-Revealing QR

- If A is of rank k and its first k columns are linearly independent,

    A = Q [R_11, R_12; 0, 0]

  where R_11 is upper-triangular and k × k, and R_12 = R_11 Y^T with an (n − k) × k matrix Y
- For arbitrary A we need a column-ordering permutation P: A = Q R P
- QR with column pivoting (due to Gene Golub) is an effective method for this:
  - pivot so that the leading column has the largest 2-norm
  - the method can break down in the presence of roundoff error (see the Kahan matrix), but is very robust in practice
10 Low-Rank Factorization by QR with Column Pivoting

QR with column pivoting can be used to either
- determine the (numerical) rank of A, or
- compute a low-rank approximation with a bounded error

Stopped after k pivots, it performs only O(mnk) rather than the O(mn^2) work of a full QR or SVD
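A sketch of this approach using SciPy's `scipy.linalg.qr(..., pivoting=True)`; note that this routine computes the full pivoted factorization and is then truncated, whereas the O(mnk) bound above assumes stopping after k pivot steps:

```python
import numpy as np
from scipy.linalg import qr

# Rank-k approximation from a pivoted QR: keep the leading k columns of Q
# and rows of R, then undo the column permutation.
def pivoted_qr_lowrank(A, k):
    Q, R, piv = qr(A, mode="economic", pivoting=True)  # A[:, piv] = Q @ R
    B = np.empty_like(A)
    B[:, piv] = Q[:, :k] @ R[:k, :]                    # truncate to rank k
    return B

rng = np.random.default_rng(3)
A = rng.standard_normal((100, 12)) @ rng.standard_normal((12, 80))  # rank 12
print(np.allclose(pivoted_qr_lowrank(A, 12), A))  # True: exact once k = rank(A)
```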
11 Parallel QR with Column Pivoting

- In distributed memory, column pivoting poses further challenges
- We need at least one message to decide on each pivot column, which leads to Ω(k) synchronizations
- Existing work tries to pivot many columns at a time by finding subsets of them that are sufficiently linearly independent
- Randomized approaches provide alternatives and flexibility
12 Randomization Basics

Intuition: consider a random vector w of dimension n; all of the following hold with high probability in exact arithmetic.

- Given any basis Q for the n-dimensional space, random w is not orthogonal to any row of Q^T
- Let A = U D V^T with V ∈ R^{n×k}. The vector w is not orthogonal to any row of V^T; consequently z = V^T w is a random vector
- Aw = U D z is a random linear combination of the columns of U D
- Given k random vectors, i.e., a random matrix W ∈ R^{n×k}:
  - the columns of B = A W give k random linear combinations of the columns of U D
  - B has the same span as U!
13 Using the Basis to Compute a Factorization

If B has the same span as the range of A:

- [Q, R] = QR(B) gives an orthogonal basis Q for B = A W
- Q Q^T A = Q Q^T U D V^T = (Q Q^T U) D V^T; now Q^T U is orthogonal, so Q Q^T U is a basis for the range of A
- so compute H = Q^T A with H ∈ R^{k×n}, and compute [U_1, D, V] = SVD(H)
- then compute U = Q U_1, and we have a rank-k truncated SVD of A: A = U D V^T
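The steps above can be sketched as a short NumPy function; oversampling (discussed on a later slide) is omitted here, so this version is exact only for exactly rank-k input, and the names are illustrative:

```python
import numpy as np

# Randomized truncated SVD: sample B = A W with a Gaussian W, orthogonalize,
# project H = Q^T A, take a small SVD, and lift U = Q U1.
def randomized_svd(A, k, rng):
    W = rng.standard_normal((A.shape[1], k))
    Q, _ = np.linalg.qr(A @ W)          # orthogonal basis for span(A W)
    U1, d, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return Q @ U1, d, Vt.T              # rank-k truncated SVD factors

rng = np.random.default_rng(4)
A = rng.standard_normal((500, 6)) @ rng.standard_normal((6, 300))  # rank 6
U, d, V = randomized_svd(A, 6, rng)
print(np.allclose(U @ (d[:, None] * V.T), A))  # True
```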
14 Cost of the Randomized Method

- The matrix multiplications, e.g. A W, all require O(mnk) operations
- QR and SVD require O((m + n) k^2) operations
- If k ≪ min(m, n), the bulk of the computation is within matrix multiplication, which can be done with fewer synchronizations and higher efficiency than QR with column pivoting or Arnoldi
15 Randomized Approximate Factorization

- Now let's consider the case A = U D V^T + E, where D ∈ R^{k×k} and E is a small perturbation
- E may be noise in the data or numerical error
- To obtain a basis for U, it is insufficient to multiply by a random W ∈ R^{n×k}, due to the influence of E
- However, oversampling, for instance l = k + 10, with a random W ∈ R^{n×l}, gives good results
- A Gaussian random distribution provides particularly good accuracy
- So far the dimension of W has assumed knowledge of the target approximate rank k; to determine it dynamically, generate random vectors (columns of W) one at a time or a block at a time, which results in a provably accurate basis
16 Cost Analysis of Randomized Low-Rank Factorization

The cost of the randomized algorithm is

  T_p^{MM}(m, n, k) + T_p^{QR}(m, k, k)

which means that the work is O(mnk) and the algorithm is well-parallelizable.
This assumes we factorize the basis by QR and an SVD of R.
17 Fast Algorithms via Pseudo-Randomness

- We can lower the number of operations needed by the randomized algorithm by generating W so that A W can be computed more rapidly
- Generate W as a pseudo-random matrix W = D F R:
  - D is diagonal with random elements
  - F can be applied to a vector in O(n log(n)) operations, e.g. the DFT or the Hadamard matrix H_{2n} = [H_n, H_n; H_n, −H_n]
  - R is k columns of the n × n identity matrix
- This computes A W with O(mn log(n)) operations (if m > n)
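A sketch of applying such a structured sampling matrix implicitly, with F taken as the DFT; uniform random column selection for R is an assumption:

```python
import numpy as np

# Apply W = D F R implicitly: scale columns by random signs (D), FFT each
# row (F as the DFT), and keep k sampled columns (R).
def structured_sample(A, k, rng):
    m, n = A.shape
    d = rng.choice([-1.0, 1.0], size=n)          # random diagonal D
    AF = np.fft.fft(A * d, axis=1)               # rows of (A D) F, O(m n log n)
    cols = rng.choice(n, size=k, replace=False)  # R: k columns of the identity
    return AF[:, cols]                           # B = A D F R, an m-by-k sample

rng = np.random.default_rng(5)
A = rng.standard_normal((64, 32))
B = structured_sample(A, 8, rng)
print(B.shape)  # (64, 8)
```

The sample B is complex because F is the DFT; a real Hadamard transform could be substituted to stay in real arithmetic.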
18 Cost of Pseudo-Randomized Factorization

- Instead of a matrix multiplication, apply m FFTs of dimension n
- Each FFT is independent, so it suffices to perform a single transpose
- So we have the following overall cost (assuming m > n):

    O((mn log(n) / p) · γ) + T_p^{all-to-all}(mn/p) + T_p^{QR}(m, k, k)

- This is lower than for the unstructured/randomized version; however, this idea does not extend well to the case when A is sparse
19 Hierarchical Low-Rank Structure

- Consider a two-way partitioning of the vertices of a graph
- The connectivity within each partition is given by a block-diagonal matrix [A_1, 0; 0, A_2]
- If the graph is nicely separable, there is little connectivity between vertices in the two partitions
- Consequently, it is often possible to approximate the off-diagonal blocks by low-rank factorizations:

    A ≈ [A_1, U_1 D_1 V_1^T; U_2 D_2 V_2^T, A_2]

- Doing this recursively to A_1 and A_2 yields a matrix with hierarchical low-rank structure
20 HSS Matrix, Two Levels

[Figure: a hierarchically semi-separable (HSS) matrix, space padded, partitioned to two levels]
21 HSS Matrix, Three Levels

[Figure: an HSS matrix partitioned to three levels]
22 HSS Matrix Formal Definition

The l-level HSS factorization is described by

  H_l(A) = { {U, V, T_12, T_21, A_11, A_22}                   : l = 1
           { {U, V, T_12, T_21, H_{l−1}(A_11), H_{l−1}(A_22)} : l > 1

The low-rank representation of the off-diagonal blocks is given by

  A_21 = Ū_2 T_21 V̄_1^T,   A_12 = Ū_1 T_12 V̄_2^T

where for a ∈ {1, 2},

  Ū_a = U_a(H_l(A)) = { U_a                                                : l = 1
                        [U_1(H_{l−1}(A_aa)), 0; 0, U_2(H_{l−1}(A_aa))] U_a : l > 1

  V̄_a = V_a(H_l(A)) = { V_a                                                : l = 1
                        [V_1(H_{l−1}(A_aa)), 0; 0, V_2(H_{l−1}(A_aa))] V_a : l > 1
23 HSS Matrix Vector Multiplication

- We now consider computing y = Ax
- With H_1(A) we would just compute
    y_1 = A_11 x_1 + U_1 (T_12 (V_2^T x_2))
    y_2 = A_22 x_2 + U_2 (T_21 (V_1^T x_1))
- For general H_l(A), perform an up-sweep and a down-sweep:
  - the up-sweep computes w = [V̄_1^T x_1; V̄_2^T x_2] at every tree node
  - the down-sweep computes a tree sum of [Ū_1 T_12 w_2; Ū_2 T_21 w_1]
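The one-level (H_1) formulas can be checked in a few lines against the assembled dense matrix; all dimensions and factors below are randomly generated assumptions:

```python
import numpy as np

# One-level HSS matvec: y_1 = A11 x1 + U1 T12 V2^T x2, and symmetrically
# for y_2, verified against the explicitly assembled 2x2 block matrix.
rng = np.random.default_rng(6)
h, k = 50, 4
A11, A22 = rng.standard_normal((h, h)), rng.standard_normal((h, h))
U1, U2, V1, V2 = (rng.standard_normal((h, k)) for _ in range(4))
T12, T21 = rng.standard_normal((k, k)), rng.standard_normal((k, k))

x = rng.standard_normal(2 * h)
x1, x2 = x[:h], x[h:]
y = np.concatenate([A11 @ x1 + U1 @ (T12 @ (V2.T @ x2)),   # y_1
                    A22 @ x2 + U2 @ (T21 @ (V1.T @ x1))])  # y_2

A = np.block([[A11, U1 @ T12 @ V2.T],
              [U2 @ T21 @ V1.T, A22]])
print(np.allclose(y, A @ x))  # True
```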
24 HSS Matrix Vector Multiplication, Up-Sweep

The up-sweep is performed by using the nested structure of V̄:

  w = W(H_l(A), x) = { [V_1^T, 0; 0, V_2^T] x                                              : l = 1
                       [V_1^T, 0; 0, V_2^T] [W(H_{l−1}(A_11), x_1); W(H_{l−1}(A_22), x_2)] : l > 1
25 HSS Matrix Vector Multiplication, Down-Sweep

Use w = W(H_l(A), x) from the root to the leaves to get

  y = Ax = [Ū_1 T_12 w_2; Ū_2 T_21 w_1] + [A_11 x_1; A_22 x_2]
         = [Ū_1, 0; 0, Ū_2] [0, T_12; T_21, 0] w + [A_11, 0; 0, A_22] x

Using the nested structure of Ū_a and v = [U_1, 0; 0, U_2] [0, T_12; T_21, 0] w,

  y_a = [U_1(H_{l−1}(A_aa)), 0; 0, U_2(H_{l−1}(A_aa))] v_a + A_aa x_a   for a ∈ {1, 2}

which gives the down-sweep recurrence

  y = Ax + z = Y(H_l(A), x, z) = { [U_1 q_1; U_2 q_2] + [A_11 x_1; A_22 x_2]                        : l = 1
                                   [Y(H_{l−1}(A_11), x_1, U_1 q_1); Y(H_{l−1}(A_22), x_2, U_2 q_2)] : l > 1

where q = [0, T_12; T_21, 0] w + z
26 Prefix Sum as HSS Matrix Vector Multiplication

We can express the n-element prefix sum y(i) = Σ_{j=1}^{i−1} x(j) as y = Lx, where

  L = [L_11, 0; L_21, L_22]

is the strictly lower-triangular matrix of ones.

- L is an H-matrix, since L_21 = 1 1^T = [1 ⋯ 1]^T [1 ⋯ 1] has rank 1
- L also has rank-1 HSS structure; in particular,

    H_l(L) = { {1_2, 1_2, [0], [1], 0, 0}                         : l = 1
               {1_2, 1_2, [0], [1], H_{l−1}(L_11), H_{l−1}(L_22)} : l > 1

  so each U, V, Ū, V̄ is a vector of 1s, T_12 = [0], and T_21 = [1]
27 Prefix Sum HSS Up-Sweep

We can use the HSS structure of L to compute the prefix sum of x. Recall that the up-sweep recurrence has the general form

  w = W(H_l(A), x) = { [V_1^T, 0; 0, V_2^T] x                                              : l = 1
                       [V_1^T, 0; 0, V_2^T] [W(H_{l−1}(A_11), x_1); W(H_{l−1}(A_22), x_2)] : l > 1

For the prefix sum this becomes

  w = W(H_l(L), x) = { [1^T x_1; 1^T x_2]                                     : l = 1
                       [1^T W(H_{l−1}(L_11), x_1); 1^T W(H_{l−1}(L_22), x_2)] : l > 1

so the up-sweep computes w = [S(x_1); S(x_2)], where S(y) = Σ_i y(i)
28 Prefix Sum HSS Down-Sweep

The down-sweep has the general structure

  y = Y(H_l(A), x, z) = { [U_1, 0; 0, U_2] q + [A_11, 0; 0, A_22] x                          : l = 1
                          [Y(H_{l−1}(A_11), x_1, U_1 q_1); Y(H_{l−1}(A_22), x_2, U_2 q_2)]   : l > 1

where q = [0, T_12; T_21, 0] W(H_l(A), x) + z. For the prefix sum,

  [0, T_12; T_21, 0] W(H_l(L), x) = [0, 0; 1, 0] [S(x_1); S(x_2)] = [0; S(x_1)] = q − z

  y = Y(H_l(L), x, z) = { [z_1; x_1 + z_2]                                                          : l = 1
                          [Y(H_{l−1}(L_11), x_1, 1_2 z_1); Y(H_{l−1}(L_22), x_2, 1_2 (S(x_1) + z_2))] : l > 1

Initially the prefix z = 0, and it will always be the case that z_1 = z_2
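For L, the two recurrences collapse into a short exclusive-scan recursion: the up-sweep contributes the block sums S(x_1), and the down-sweep threads the running prefix z toward the leaves. This sketch uses leaves of size 1 rather than 2, an inessential simplification:

```python
import numpy as np

# Exclusive prefix sum via the rank-1 HSS recursion above.
def hss_prefix(x, z=0.0):
    if len(x) == 1:
        return np.array([z])               # leaf: y(i) = sum of all earlier entries
    h = len(x) // 2
    x1, x2 = x[:h], x[h:]
    left = hss_prefix(x1, z)               # left child keeps prefix z
    right = hss_prefix(x2, z + x1.sum())   # right child gets z + S(x1)
    return np.concatenate([left, right])

x = np.arange(1.0, 9.0)                    # [1, 2, ..., 8]
print(hss_prefix(x))  # exclusive prefix sums: 0, 1, 3, 6, 10, 15, 21, 28
```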
29 Cost of HSS Matrix Vector Multiplication

- The down-sweep and the up-sweep perform small dense matrix-vector multiplications at each recursive step
- Let's assume k is the dimension of the leaf blocks as well as the rank at each level (the number of columns in each U_a, V_a)
- The work for both the down-sweep and the up-sweep is

    Q(n, k) = 2 Q(n/2, k) + O(k^2 · γ),   Q(k, k) = O(k^2 · γ)
    ⇒ Q(n, k) = O(nk · γ)

- The depth of the algorithm scales as D = Θ(log(n)) for fixed k
30 Parallelization of HSS Matrix Vector Multiplication

If we assign each tree node to a single processor for the first log_2(p) levels, and execute each leaf subtree on a different processor,

  T_p(n, k) = 2 T_{p/2}(n, k) + O(k^2 · γ + k · β + α)
            = O((nk/p + k^2 log(p)) · γ + k log(p) · β + log(p) · α)
31 Synchronization-Efficient HSS Multiplication

- The leaf subtrees can be computed independently:

    T_p^{leaf-subtrees}(n, k) = O(nk/p · γ + k · β + α)

- Focus on the up-sweep and down-sweep over the top log_2(p) levels
- Executing the root subtree sequentially takes time

    T_p^{root-subtree}(pk, k) = O(pk^2 · γ + pk · β + α)

- Instead, have p^r (r < 1) processors compute subtrees with p^{1−r} leaves, then recurse on the p^r roots:

    T_p^{rec-tree}(pk, k) = T_{p^r}^{rec-tree}(p^r k, k) + O(p^{1−r} k^2 · γ + p^{1−r} k · β + α)
                          = O(p^{1−r} k^2 · γ + p^{1−r} k · β + log_{1/r}(log(p)) · α)
32 Synchronization-Efficient HSS Multiplication

- Focus on the top tree with p leaves (the leaf subtrees)
- Assign each processor a unique path from a leaf to the root
- Given w = W(H_l(A), x) at every node, each processor can compute a down-sweep path in the subtree independently
- For the up-sweep, realize that the tree applies a linear transformation, so we can sum the results computed in each path
- For each tree node, there is a contribution from every processor assigned a leaf of that node's subtree
- Therefore, there are p − 1 sums over a total of O(p log(p)) contributions, i.e., a total of O(kp log(p)) elements
- Do these with min(p, k log_2(p)) processors, each obtaining max(p, k log_2(p)) contributions, so

    T_p^{root-paths}(k) = O(k^2 log(p) · γ + (k log(p) + p) · β + log(p) · α)
33 HSS Multiplication by Multiple Vectors

Consider the multiplication C = AB, where A ∈ R^{n×n} is HSS and B ∈ R^{n×b}

- Let's consider the case p ≤ b ≪ n
- If we assign all of A to each processor, each can compute a column of C simultaneously, but this requires a prohibitive amount of memory
- Instead, perform the leaf-level multiplications, processing n/p rows of B with each processor (call the intermediate result C̄)
- Transpose C̄ and apply the log_2(p) root levels of the HSS tree to the columns of C̄ independently
- This algorithm requires replication only of the root O(log(p)) levels of the HSS tree, i.e., O(pb) data
- For large k or larger p, different algorithms may be desirable
34 References

- Hochstenbach, M. E. A Jacobi-Davidson type SVD method. SIAM Journal on Scientific Computing.
- Chan, T. F. (1987). Rank revealing QR factorizations. Linear Algebra and its Applications, 88.
- Businger, P., and Golub, G. H. (1965). Linear least squares solutions by Householder transformations. Numerische Mathematik, 7(3).
35 References

- Quintana-Ortí, G., Sun, X., and Bischof, C. H. (1998). A BLAS-3 version of the QR factorization with column pivoting. SIAM Journal on Scientific Computing, 19(5).
- Demmel, J. W., Grigori, L., Gu, M., and Xiang, H. Communication avoiding rank revealing QR factorization with column pivoting. SIAM Journal on Matrix Analysis and Applications, 36(1).
- Bischof, C. H. A parallel QR factorization algorithm with controlled local pivoting. SIAM Journal on Scientific and Statistical Computing, 12(1).
- Halko, N., Martinsson, P. G., and Tropp, J. A. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Review, 53(2).
Sparse factorization using low rank submatrices Cleve Ashcraft LSTC cleve@lstc.com 21 MUMPS User Group Meeting April 15-16, 21 Toulouse, FRANCE ftp.lstc.com:outgoing/cleve/mumps1 Ashcraft.pdf 1 LSTC Livermore
More informationAMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences)
AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences) Lecture 1: Course Overview; Matrix Multiplication Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical
More informationNumerical Methods. Elena loli Piccolomini. Civil Engeneering. piccolom. Metodi Numerici M p. 1/??
Metodi Numerici M p. 1/?? Numerical Methods Elena loli Piccolomini Civil Engeneering http://www.dm.unibo.it/ piccolom elena.loli@unibo.it Metodi Numerici M p. 2/?? Least Squares Data Fitting Measurement
More informationRandomized methods for computing low-rank approximations of matrices
Randomized methods for computing low-rank approximations of matrices by Nathan P. Halko B.S., University of California, Santa Barbara, 2007 M.S., University of Colorado, Boulder, 2010 A thesis submitted
More informationSOLVING SPARSE LINEAR SYSTEMS OF EQUATIONS. Chao Yang Computational Research Division Lawrence Berkeley National Laboratory Berkeley, CA, USA
1 SOLVING SPARSE LINEAR SYSTEMS OF EQUATIONS Chao Yang Computational Research Division Lawrence Berkeley National Laboratory Berkeley, CA, USA 2 OUTLINE Sparse matrix storage format Basic factorization
More informationSummary of Iterative Methods for Non-symmetric Linear Equations That Are Related to the Conjugate Gradient (CG) Method
Summary of Iterative Methods for Non-symmetric Linear Equations That Are Related to the Conjugate Gradient (CG) Method Leslie Foster 11-5-2012 We will discuss the FOM (full orthogonalization method), CG,
More informationLeast Squares. Tom Lyche. October 26, Centre of Mathematics for Applications, Department of Informatics, University of Oslo
Least Squares Tom Lyche Centre of Mathematics for Applications, Department of Informatics, University of Oslo October 26, 2010 Linear system Linear system Ax = b, A C m,n, b C m, x C n. under-determined
More informationLinear Analysis Lecture 16
Linear Analysis Lecture 16 The QR Factorization Recall the Gram-Schmidt orthogonalization process. Let V be an inner product space, and suppose a 1,..., a n V are linearly independent. Define q 1,...,
More information7. Symmetric Matrices and Quadratic Forms
Linear Algebra 7. Symmetric Matrices and Quadratic Forms CSIE NCU 1 7. Symmetric Matrices and Quadratic Forms 7.1 Diagonalization of symmetric matrices 2 7.2 Quadratic forms.. 9 7.4 The singular value
More informationVector Spaces, Orthogonality, and Linear Least Squares
Week Vector Spaces, Orthogonality, and Linear Least Squares. Opening Remarks.. Visualizing Planes, Lines, and Solutions Consider the following system of linear equations from the opener for Week 9: χ χ
More informationProbabilistic upper bounds for the matrix two-norm
Noname manuscript No. (will be inserted by the editor) Probabilistic upper bounds for the matrix two-norm Michiel E. Hochstenbach Received: date / Accepted: date Abstract We develop probabilistic upper
More informationBlock Lanczos Tridiagonalization of Complex Symmetric Matrices
Block Lanczos Tridiagonalization of Complex Symmetric Matrices Sanzheng Qiao, Guohong Liu, Wei Xu Department of Computing and Software, McMaster University, Hamilton, Ontario L8S 4L7 ABSTRACT The classic
More informationScientific Computing: An Introductory Survey
Scientific Computing: An Introductory Survey Chapter 4 Eigenvalue Problems Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction
More informationB553 Lecture 5: Matrix Algebra Review
B553 Lecture 5: Matrix Algebra Review Kris Hauser January 19, 2012 We have seen in prior lectures how vectors represent points in R n and gradients of functions. Matrices represent linear transformations
More informationExploiting Low-Rank Structure in Computing Matrix Powers with Applications to Preconditioning
Exploiting Low-Rank Structure in Computing Matrix Powers with Applications to Preconditioning Erin C. Carson, Nicholas Knight, James Demmel, Ming Gu U.C. Berkeley SIAM PP 12, Savannah, Georgia, USA, February
More informationKrylov Space Methods. Nonstationary sounds good. Radu Trîmbiţaş ( Babeş-Bolyai University) Krylov Space Methods 1 / 17
Krylov Space Methods Nonstationary sounds good Radu Trîmbiţaş Babeş-Bolyai University Radu Trîmbiţaş ( Babeş-Bolyai University) Krylov Space Methods 1 / 17 Introduction These methods are used both to solve
More informationNumerical Methods for Solving Large Scale Eigenvalue Problems
Peter Arbenz Computer Science Department, ETH Zürich E-mail: arbenz@inf.ethz.ch arge scale eigenvalue problems, Lecture 2, February 28, 2018 1/46 Numerical Methods for Solving Large Scale Eigenvalue Problems
More informationMinisymposia 9 and 34: Avoiding Communication in Linear Algebra. Jim Demmel UC Berkeley bebop.cs.berkeley.edu
Minisymposia 9 and 34: Avoiding Communication in Linear Algebra Jim Demmel UC Berkeley bebop.cs.berkeley.edu Motivation (1) Increasing parallelism to exploit From Top500 to multicores in your laptop Exponentially
More informationRandomized Methods for Computing Low-Rank Approximations of Matrices
University of Colorado, Boulder CU Scholar Applied Mathematics Graduate Theses & Dissertations Applied Mathematics Spring 1-1-2012 Randomized Methods for Computing Low-Rank Approximations of Matrices Nathan
More informationA communication-avoiding parallel algorithm for the symmetric eigenvalue problem
A communication-avoiding parallel algorithm for the symmetric eigenvalue problem Edgar Solomonik Department of Computer Science University of Illinois at Urbana-Champaign Householder Symposium XX June
More informationAMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning
AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 18 Outline
More informationRank Revealing QR factorization. F. Guyomarc h, D. Mezher and B. Philippe
Rank Revealing QR factorization F. Guyomarc h, D. Mezher and B. Philippe 1 Outline Introduction Classical Algorithms Full matrices Sparse matrices Rank-Revealing QR Conclusion CSDA 2005, Cyprus 2 Situation
More informationWhy the QR Factorization can be more Accurate than the SVD
Why the QR Factorization can be more Accurate than the SVD Leslie V. Foster Department of Mathematics San Jose State University San Jose, CA 95192 foster@math.sjsu.edu May 10, 2004 Problem: or Ax = b for
More informationTowards an algebraic formalism for scalable numerical algorithms
Towards an algebraic formalism for scalable numerical algorithms Edgar Solomonik Department of Computer Science University of Illinois at Urbana-Champaign Householder Symposium XX June 22, 2017 Householder
More informationProblem set 5: SVD, Orthogonal projections, etc.
Problem set 5: SVD, Orthogonal projections, etc. February 21, 2017 1 SVD 1. Work out again the SVD theorem done in the class: If A is a real m n matrix then here exist orthogonal matrices such that where
More informationarxiv: v1 [math.na] 23 Nov 2018
Randomized QLP algorithm and error analysis Nianci Wu a, Hua Xiang a, a School of Mathematics and Statistics, Wuhan University, Wuhan 43007, PR China. arxiv:1811.09334v1 [math.na] 3 Nov 018 Abstract In
More informationFast Hessenberg QR Iteration for Companion Matrices
Fast Hessenberg QR Iteration for Companion Matrices David Bindel Ming Gu David Garmire James Demmel Shivkumar Chandrasekaran Fast Hessenberg QR Iteration for Companion Matrices p.1/20 Motivation: Polynomial
More informationScientific Computing with Case Studies SIAM Press, Lecture Notes for Unit VII Sparse Matrix
Scientific Computing with Case Studies SIAM Press, 2009 http://www.cs.umd.edu/users/oleary/sccswebpage Lecture Notes for Unit VII Sparse Matrix Computations Part 1: Direct Methods Dianne P. O Leary c 2008
More informationA fast randomized algorithm for overdetermined linear least-squares regression
A fast randomized algorithm for overdetermined linear least-squares regression Vladimir Rokhlin and Mark Tygert Technical Report YALEU/DCS/TR-1403 April 28, 2008 Abstract We introduce a randomized algorithm
More informationCheat Sheet for MATH461
Cheat Sheet for MATH46 Here is the stuff you really need to remember for the exams Linear systems Ax = b Problem: We consider a linear system of m equations for n unknowns x,,x n : For a given matrix A
More informationRandomized methods for computing the Singular Value Decomposition (SVD) of very large matrices
Randomized methods for computing the Singular Value Decomposition (SVD) of very large matrices Gunnar Martinsson The University of Colorado at Boulder Students: Adrianna Gillman Nathan Halko Sijia Hao
More informationσ 11 σ 22 σ pp 0 with p = min(n, m) The σ ii s are the singular values. Notation change σ ii A 1 σ 2
HE SINGULAR VALUE DECOMPOSIION he SVD existence - properties. Pseudo-inverses and the SVD Use of SVD for least-squares problems Applications of the SVD he Singular Value Decomposition (SVD) heorem For
More informationOrthogonal Transformations
Orthogonal Transformations Tom Lyche University of Oslo Norway Orthogonal Transformations p. 1/3 Applications of Qx with Q T Q = I 1. solving least squares problems (today) 2. solving linear equations
More informationA randomized algorithm for approximating the SVD of a matrix
A randomized algorithm for approximating the SVD of a matrix Joint work with Per-Gunnar Martinsson (U. of Colorado) and Vladimir Rokhlin (Yale) Mark Tygert Program in Applied Mathematics Yale University
More informationCS6964: Notes On Linear Systems
CS6964: Notes On Linear Systems 1 Linear Systems Systems of equations that are linear in the unknowns are said to be linear systems For instance ax 1 + bx 2 dx 1 + ex 2 = c = f gives 2 equations and 2
More informationA communication-avoiding parallel algorithm for the symmetric eigenvalue problem
A communication-avoiding parallel algorithm for the symmetric eigenvalue problem Edgar Solomonik 1, Grey Ballard 2, James Demmel 3, Torsten Hoefler 4 (1) Department of Computer Science, University of Illinois
More informationJim Lambers MAT 610 Summer Session Lecture 1 Notes
Jim Lambers MAT 60 Summer Session 2009-0 Lecture Notes Introduction This course is about numerical linear algebra, which is the study of the approximate solution of fundamental problems from linear algebra
More informationBindel, Fall 2016 Matrix Computations (CS 6210) Notes for
1 A cautionary tale Notes for 2016-10-05 You have been dropped on a desert island with a laptop with a magic battery of infinite life, a MATLAB license, and a complete lack of knowledge of basic geometry.
More informationKrylov Subspaces. Lab 1. The Arnoldi Iteration
Lab 1 Krylov Subspaces Lab Objective: Discuss simple Krylov Subspace Methods for finding eigenvalues and show some interesting applications. One of the biggest difficulties in computational linear algebra
More informationMatrix decompositions
Matrix decompositions Zdeněk Dvořák May 19, 2015 Lemma 1 (Schur decomposition). If A is a symmetric real matrix, then there exists an orthogonal matrix Q and a diagonal matrix D such that A = QDQ T. The
More informationIntroduction to Numerical Linear Algebra II
Introduction to Numerical Linear Algebra II Petros Drineas These slides were prepared by Ilse Ipsen for the 2015 Gene Golub SIAM Summer School on RandNLA 1 / 49 Overview We will cover this material in
More informationAMS526: Numerical Analysis I (Numerical Linear Algebra)
AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 5: Projectors and QR Factorization Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 14 Outline 1 Projectors 2 QR Factorization
More informationrandomized block krylov methods for stronger and faster approximate svd
randomized block krylov methods for stronger and faster approximate svd Cameron Musco and Christopher Musco December 2, 25 Massachusetts Institute of Technology, EECS singular value decomposition n d left
More informationConceptual Questions for Review
Conceptual Questions for Review Chapter 1 1.1 Which vectors are linear combinations of v = (3, 1) and w = (4, 3)? 1.2 Compare the dot product of v = (3, 1) and w = (4, 3) to the product of their lengths.
More informationParallel Iterative Methods for Sparse Linear Systems. H. Martin Bücker Lehrstuhl für Hochleistungsrechnen
Parallel Iterative Methods for Sparse Linear Systems Lehrstuhl für Hochleistungsrechnen www.sc.rwth-aachen.de RWTH Aachen Large and Sparse Small and Dense Outline Problem with Direct Methods Iterative
More information