Sketching as a Tool for Numerical Linear Algebra


Slide 1: Sketching as a Tool for Numerical Linear Algebra (Part 2)

David P. Woodruff
Presented by Sepehr Assadi
o(n) Big Data Reading Group, University of Pennsylvania, February 2015

Slides 2-3: Goal

New survey by David Woodruff: Sketching as a Tool for Numerical Linear Algebra.

Topics:
- Subspace Embeddings
- Least Squares Regression
- Least Absolute Deviation Regression
- Low Rank Approximation
- Graph Sparsification
- Sketching Lower Bounds


Slides 4-5: Introduction

You have big data!
- Computationally expensive to deal with
- Excessive storage requirements
- Hard to communicate
- ...

Summarize your data:
- Sampling: a representative subset of the data
- Sketching: an aggregate summary of the whole data

Slide 6: Model

Input:
- matrix $A \in \mathbb{R}^{n \times d}$
- vector $b \in \mathbb{R}^n$

Output: a function $F(A, b, \ldots)$, e.g. least squares regression.

Different goals:
- Faster algorithms
- Streaming
- Distributed

Slide 7: Linear Sketching

- Input: matrix $A \in \mathbb{R}^{n \times d}$
- Let $r \ll n$ and let $S \in \mathbb{R}^{r \times n}$ be a random matrix
- Let $S \cdot A$ be the sketch
- Compute $F(S \cdot A)$ instead of $F(A)$

Slide 8: Linear Sketching (cont.)

Pros:
- Compute on an $r \times d$ matrix instead of an $n \times d$ one: smaller representation and faster computation
- Linearity: $S \cdot (A + B) = S \cdot A + S \cdot B$, so we can compose linear sketches!

Cons:
- $F(S \cdot A)$ is only an approximation of $F(A)$
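
To make the setup concrete, here is a minimal numpy sketch (a toy example of my own, not from the slides) that applies a Gaussian sketching matrix, one simple choice of random $S$, and checks the linearity property:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, r = 10_000, 20, 400

A = rng.standard_normal((n, d))
B = rng.standard_normal((n, d))

# Gaussian sketching matrix with entries N(0, 1/r), so E[||Sx||^2] = ||x||^2.
S = rng.standard_normal((r, n)) / np.sqrt(r)

SA = S @ A  # r x d: much smaller than the original n x d matrix

# Linearity: the sketch of a sum is the sum of the sketches.
assert np.allclose(S @ (A + B), S @ A + S @ B)

# Norms of vectors in the column space are roughly preserved.
x = rng.standard_normal(d)
print(np.linalg.norm(SA @ x) / np.linalg.norm(A @ x))  # close to 1
```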

Slide 9: Approximate $\ell_2$-Regression

Input:
- matrix $A \in \mathbb{R}^{n \times d}$ (full column rank)
- vector $b \in \mathbb{R}^n$
- parameter $0 < \varepsilon < 1$

Output: $\hat{x} \in \mathbb{R}^d$ such that

$$\|A\hat{x} - b\|_2 \le (1 + \varepsilon) \min_x \|Ax - b\|_2$$

Slide 10: Subspace Embedding

Definition ($\ell_2$-subspace embedding): a $(1 \pm \varepsilon)$ $\ell_2$-subspace embedding for a matrix $A \in \mathbb{R}^{n \times d}$ is a matrix $S$ for which, for all $x \in \mathbb{R}^d$,

$$\|SAx\|_2^2 = (1 \pm \varepsilon) \|Ax\|_2^2$$

Strictly speaking, $S$ is a subspace embedding for the column space of $A$.

Slide 11: Previous Session

Oblivious $\ell_2$-subspace embeddings:
- The distribution from which $S$ is chosen is oblivious to $A$
- One very common tool: the Johnson-Lindenstrauss transform (JLT)
- Immediately yields an approximation of the $\ell_2$-regression problem

Slide 12: Today

Non-oblivious $\ell_2$-subspace embeddings:
- The distribution from which $S$ is chosen depends on $A$
- One very common tool: leverage score sampling
- Can still be used to approximate the $\ell_2$-regression problem

Slide 13: Leverage Scores

Thin singular value decomposition (SVD) of $A$:

$$A_{n \times d} = U_{n \times d} \, \Sigma_{d \times d} \, V^\top_{d \times d}$$

where $U$ is an orthonormal basis of the column space of $A$.

Leverage score of the $i$-th row of $A$: $\ell_i = \|U_{(i)}\|_2$, where $U_{(i)}$ denotes the $i$-th row of $U$.

Properties:
- Independent of the choice of basis (a property of the column space)
- The $\ell_i^2$ sum to $d$, so they form a probability distribution after simple normalization
- Let $H = A(A^\top A)^{-1} A^\top$; then $\ell_i^2 = H_{i,i}$
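
The definitions above translate directly into a few lines of numpy. The following illustrative sketch (variable names are my own) computes the leverage scores from the thin SVD and verifies the hat-matrix property $\ell_i^2 = H_{i,i}$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 1000, 5
A = rng.standard_normal((n, d))

# Thin SVD: A = U @ diag(s) @ Vt with U of shape n x d.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Leverage score of row i: l_i = ||U_(i)||_2.
lev = np.linalg.norm(U, axis=1)

# Check: l_i^2 equals the diagonal of the hat matrix H = A (A^T A)^{-1} A^T.
H = A @ np.linalg.solve(A.T @ A, A.T)
assert np.allclose(lev**2, np.diag(H))

# The squared scores sum to d, so p_i = l_i^2 / d is a probability distribution.
print(np.isclose((lev**2).sum(), d))  # True
```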

Slide 14: Leverage Score Sampling

Definition (SampleRescale(n, s, p)): the procedure $S = \text{SampleRescale}(n, s, p)$ returns $S_{s \times n} = D \cdot \Omega$, where each row of $\Omega$ is a random standard basis vector of $\mathbb{R}^n$ chosen according to the probability distribution $p$, and $D$ is a diagonal matrix with $D_{i,i} = 1/\sqrt{p_j \, s}$ if $e_j$ was chosen for the $i$-th row of $\Omega$.

Leverage Score Sampling ($p = \text{LS-Sampling}(A, \beta)$):
- $p = (p_1, \ldots, p_n)$ is a probability distribution satisfying $p_i \ge \beta \, \ell_i^2 / d$, where $\ell_i$ is the $i$-th leverage score of $A_{n \times d}$
- Compute $S = \text{SampleRescale}(n, s, p)$
- Return $S \cdot A$
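
A direct, unoptimized implementation of SampleRescale and of the exact ($\beta = 1$) LS-Sampling distribution might look as follows; the slide leaves sampling with replacement implicit, which this sketch assumes:

```python
import numpy as np

def sample_rescale(n, s, p, rng):
    """S (s x n) = D @ Omega: row i of Omega is e_j drawn from p, and
    D_ii = 1 / sqrt(p_j * s) for the index j chosen in that row."""
    idx = rng.choice(n, size=s, p=p)  # sample s indices with replacement
    S = np.zeros((s, n))
    S[np.arange(s), idx] = 1.0 / np.sqrt(p[idx] * s)
    return S

def ls_sampling(A):
    """Exact leverage-score distribution (beta = 1): p_i = l_i^2 / d."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return np.linalg.norm(U, axis=1) ** 2 / A.shape[1]

rng = np.random.default_rng(2)
n, d, s = 5000, 10, 500
# Rows with very different scales, so the leverage scores are non-uniform.
A = rng.standard_normal((n, d)) * np.linspace(0.1, 10, n)[:, None]

p = ls_sampling(A)
S = sample_rescale(n, s, p, rng)
print((S @ A).shape)  # (500, 10): the sketch S . A
```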

Slide 15: Subspace Embedding via LS-Sampling

Theorem: let $s = \Theta\left(\frac{d \log d}{\beta \varepsilon^2}\right)$, $S = \text{SampleRescale}(n, s, p)$ for $p = \text{LS-Sampling}(A, \beta)$, and let $U$ be an orthonormal basis of the column space of $A$. Then, with probability 0.99, simultaneously for all $i \in [d]$,

$$1 - \varepsilon \le \sigma_i^2(SU) \le 1 + \varepsilon$$

This immediately implies that $S$ is a subspace embedding.
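
A quick empirical check of the theorem (a toy experiment of my own, with $\beta = 1$ and exact leverage scores): all squared singular values of $SU$ should land in $[1 - \varepsilon, 1 + \varepsilon]$. Instead of forming $S$ explicitly, the sampled rows of $U$ are rescaled directly:

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, s = 20_000, 5, 2000  # s on the order of d log(d) / eps^2 for a modest eps

A = rng.standard_normal((n, d)) * np.linspace(0.1, 10, n)[:, None]
U, _, _ = np.linalg.svd(A, full_matrices=False)
p = np.linalg.norm(U, axis=1) ** 2 / d  # exact leverage scores (beta = 1)

# SampleRescale applied implicitly: pick rows, rescale by 1/sqrt(p_j * s).
idx = rng.choice(n, size=s, p=p)
SU = U[idx] / np.sqrt(p[idx] * s)[:, None]  # equals S @ U without forming S

sig2 = np.linalg.svd(SU, compute_uv=False) ** 2
print(sig2.min(), sig2.max())  # both close to 1, i.e. within 1 +/- eps
```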

Slide 16: Subspace Embedding via LS-Sampling (cont.)

The proof of the theorem (stated on the previous slide) relies on a matrix Chernoff bound:

Matrix Chernoff: suppose $X_1, \ldots, X_s$ are independent copies of a symmetric random matrix $X \in \mathbb{R}^{d \times d}$ with $\mathbb{E}[X] = 0$, $\|X\|_2 \le \gamma$, and $\|\mathbb{E}[X^\top X]\|_2 \le \sigma^2$, and let $W = \frac{1}{s} \sum_{i=1}^{s} X_i$. Then

$$\Pr(\|W\|_2 > \varepsilon) \le 2d \exp\left(-s\varepsilon^2 / (2\sigma^2 + 2\gamma\varepsilon/3)\right)$$

Slide 17: Linear Regression via LS-Sampling

Theorem: let $s = \Theta\left(\frac{d \log d}{\beta \varepsilon^2}\right)$, $S = \text{SampleRescale}(n, s, p)$ for $p = \text{LS-Sampling}(A, \beta)$, and $\hat{x} = \arg\min_x \|SAx - Sb\|_2$. Then, with probability 0.99,

$$\|A\hat{x} - b\|_2 \le (1 + \varepsilon) \min_x \|Ax - b\|_2$$

Slide 18: Linear Regression via LS-Sampling (cont.)

Theorem (Approximate Matrix Multiplication): for an orthonormal matrix $C \in \mathbb{R}^{n \times m}$, an arbitrary vector $d \in \mathbb{R}^n$, and probabilities $p = (p_1, \ldots, p_n)$ such that

$$p_k \ge \beta \, \frac{\|C_{(k)}\|_2^2}{\|C\|_F^2},$$

let $S = \text{SampleRescale}(n, s, p)$; then, with probability 0.99,

$$\|(SC)^\top (Sd) - C^\top d\|_F \le O\left(\tfrac{1}{\sqrt{s\beta}}\right) \|C\|_F \, \|d\|_F$$

Warning: this statement is neither general nor precise! See [DKM06].
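
The statement can still be illustrated empirically. In this toy check (my own, with $\beta = 1$), $C$ is an orthonormal basis, so $\|C\|_F = \sqrt{m}$, and the observed error should be on the order of $\|C\|_F \|d\|_2 / \sqrt{s}$:

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, s = 50_000, 8, 4000

C, _ = np.linalg.qr(rng.standard_normal((n, m)))  # orthonormal n x m
d = rng.standard_normal(n)

# Probabilities proportional to squared row norms of C (beta = 1).
p = np.linalg.norm(C, axis=1) ** 2 / m

idx = rng.choice(n, size=s, p=p)
scale = 1.0 / np.sqrt(p[idx] * s)
SC = C[idx] * scale[:, None]  # S @ C
Sd = d[idx] * scale           # S @ d

err = np.linalg.norm(SC.T @ Sd - C.T @ d)
bound = np.linalg.norm(C, 'fro') * np.linalg.norm(d) / np.sqrt(s)
print(err, bound)  # err is typically of the same order as bound
```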

Slide 19: Linear Regression via LS-Sampling (cont.)

Proof of the regression theorem (stated on slide 17): on the board.
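
Putting the pieces together, here is an end-to-end toy comparison (all data and names are my own) of leverage-score-sampled least squares against the exact solution; by the theorem, the printed residual ratio should be close to 1:

```python
import numpy as np

rng = np.random.default_rng(5)
n, d, s = 100_000, 10, 3000

# Heavy-tailed row scales make the leverage scores very non-uniform.
A = rng.standard_normal((n, d)) * np.exp(rng.standard_normal(n))[:, None]
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

x_star, *_ = np.linalg.lstsq(A, b, rcond=None)  # exact solution

# Leverage-score sampling with exact scores (beta = 1).
U, _, _ = np.linalg.svd(A, full_matrices=False)
p = np.linalg.norm(U, axis=1) ** 2 / d
idx = rng.choice(n, size=s, p=p)
w = 1.0 / np.sqrt(p[idx] * s)

# Solve the (s x d) sketched problem: argmin_x ||S A x - S b||_2.
x_hat, *_ = np.linalg.lstsq(A[idx] * w[:, None], b[idx] * w, rcond=None)

# ||A x_hat - b|| / ||A x_star - b|| should be at most about 1 + eps.
print(np.linalg.norm(A @ x_hat - b) / np.linalg.norm(A @ x_star - b))
```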

Slide 20: Approximating Leverage Scores

- Computing the leverage scores exactly is as hard as solving the regression problem! Can we approximate them?
- For $\beta = 1/2$: in time $O(nd \log n + d^3)$ [DMIMW12]
- Improved to $O(\text{nnz}(A) \log n + d^3)$ [CW13]
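
The rough idea behind the fast algorithm, in a deliberately simplified sketch of my own (a Gaussian sketch stands in for the fast transform used in [DMIMW12], and their final column-compression step is omitted): sketch $A$, take $R$ from a QR factorization of the sketch, and use the squared row norms of $A R^{-1}$ as leverage-score estimates.

```python
import numpy as np

rng = np.random.default_rng(6)
n, d = 50_000, 10
A = rng.standard_normal((n, d)) * np.linspace(0.1, 10, n)[:, None]

# 1. Sketch A down to r x d (Gaussian here; [DMIMW12] uses a fast transform
#    to get the stated running time), then extract R from a QR of the sketch.
r = 20 * d
SA = (rng.standard_normal((r, n)) / np.sqrt(r)) @ A
_, R = np.linalg.qr(SA)

# 2. Squared row norms of A @ R^{-1} approximate the squared leverage scores.
#    Solve against R^T instead of inverting R explicitly.
AR_inv = np.linalg.solve(R.T, A.T).T
lev_approx = np.linalg.norm(AR_inv, axis=1) ** 2

# Compare with the exact scores from the SVD.
U, _, _ = np.linalg.svd(A, full_matrices=False)
lev_exact = np.linalg.norm(U, axis=1) ** 2
print(np.max(np.abs(lev_approx - lev_exact) / lev_exact))  # modest relative error
```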

Slide 21: Questions?

Slide 22: References

[CW13] Kenneth L. Clarkson and David P. Woodruff. Low rank approximation and regression in input sparsity time. In Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing (STOC), 2013.

[DKM06] Petros Drineas, Ravi Kannan, and Michael W. Mahoney. Fast Monte Carlo algorithms for matrices I: Approximating matrix multiplication. SIAM Journal on Computing, 36(1):132-157, 2006.

[DMIMW12] Petros Drineas, Malik Magdon-Ismail, Michael W. Mahoney, and David P. Woodruff. Fast approximation of matrix coherence and statistical leverage. Journal of Machine Learning Research, 13(1):3475-3506, 2012.
