Lecture 9: Matrix approximation continued


Algorithms in Data Mining, Fall 2013. Lecturer: Edo Liberty.

Warning: This note may contain typos and other inaccuracies which are usually discussed during class. Please do not cite this note as a reliable source. If you find mistakes, please inform me.

Recap

Last class was dedicated to approximating matrices by sampling individual entries from them. We used the following useful concentration result for sums of random matrices.

Lemma 0.1 (Matrix Bernstein Inequality [1]). Let $X_1, \ldots, X_s$ be independent $m \times n$ matrix valued random variables such that $\forall k \in [s]$, $E[X_k] = 0$ and $\|X_k\| \le R$. Set $\sigma^2 = \max\{\|\sum_k E[X_k X_k^T]\|, \|\sum_k E[X_k^T X_k]\|\}$. Then
$$\Pr\left[\left\|\sum_k X_k\right\| > t\right] \le (m+n)\exp\left(\frac{-t^2/2}{\sigma^2 + Rt/3}\right).$$

We obtained an approximate sampled matrix B which was sparser than A. For example, for a matrix A containing values in $\{-1, 0, 1\}$ it sufficed that B contains only s entries where
$$s \in O\left(\frac{n r \log(n/\delta)}{\varepsilon^2}\right).$$
Moreover, we got that $\|A - B\| \le \varepsilon \|A\|$. We also claimed that we can compute the PCA projection of B, $P_k^B$, instead of that of A, $P_k$, and still have
$$\sigma_{k+1} \le \|A - P_k A\| \le \|A - P_k^B A\| \le \sigma_{k+1} + \varepsilon \|A\|.$$

In this class we will reduce the amount of space needed to be independent of n. We will do this by approximating $AA^T$ directly. Here we follow the ideas in [2] and give a much simpler proof which, unfortunately, obtains slightly worse bounds.
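To make the recap claim concrete, here is a minimal numpy sketch (not part of the original notes; the function name and toy data are illustrative). It computes the projection $P_k^B$ from the top k left singular vectors of a sketch B and measures $\|A - P_k^B A\|$; with B = A it returns exactly $\sigma_{k+1}(A)$.

import numpy as np

def pca_projection_error(A, B, k):
    # Top-k left singular vectors of the sketch B span the projection P_k^B.
    U, _, _ = np.linalg.svd(B, full_matrices=False)
    Uk = U[:, :k]
    # Spectral norm of (I - P_k^B) A, to be compared against sigma_{k+1}(A).
    return np.linalg.norm(A - Uk @ (Uk.T @ A), 2)

# Toy check: with B = A the error equals sigma_{k+1}(A).
m, n, k = 50, 200, 5
A = np.random.randn(m, k) @ np.random.randn(k, n) + 0.1 * np.random.randn(m, n)
print(pca_projection_error(A, A, k), np.linalg.svd(A, compute_uv=False)[k])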

PCA with an approximate covariance matrix

Here we see that approximating $BB^T \approx AA^T$ is sufficient in order to compute an approximate PCA projection (Lemma 3.8 in [2]). Let $P_k^B$ denote the projection on the top k left singular vectors of B.

\begin{align}
\|A - P_k^B A\|^2 &= \sup_{\|x\| \le 1} \|x^T A - x^T P_k^B A\|^2 \tag{1}\\
&= \sup_{x \in \mathrm{null}(P_k^B),\ \|x\| \le 1} \|x^T A\|^2 \tag{2}\\
&= \sup_{x \in \mathrm{null}(P_k^B),\ \|x\| \le 1} x^T A A^T x \tag{3}\\
&\le \sup_{x \in \mathrm{null}(P_k^B),\ \|x\| \le 1} x^T (A A^T - B B^T) x + x^T B B^T x \tag{4}\\
&\le \|A A^T - B B^T\| + \sigma_{k+1}(B B^T) \tag{5}\\
&\le 2\|A A^T - B B^T\| + \sigma_{k+1}(A A^T) \tag{6}
\end{align}

The last transition is due to the fact that $\sigma_{k+1}(BB^T) \le \sigma_{k+1}(AA^T) + \|AA^T - BB^T\|$. By taking the square root and recalling that $\sigma_{k+1}(AA^T) = \sigma_{k+1}^2(A) = \sigma_{k+1}^2$ we get
$$\|A - P_k^B A\| \le \sqrt{\sigma_{k+1}^2 + 2\|AA^T - BB^T\|} \le \sigma_{k+1} + \sqrt{2\|AA^T - BB^T\|}.$$
Therefore, we have that
$$\|AA^T - BB^T\| \le \varepsilon^2 \|A\|^2/2 \;\Longrightarrow\; \|A - P_k^B A\| \le \sigma_{k+1} + \varepsilon \|A\|.$$

Column subset selection by sampling

From this point on, our goal is to find a matrix B such that $\|AA^T - BB^T\|$ is small and B is as sparse or small as possible. Note that $AA^T = \sum_j A_j A_j^T$ where we denote by $A_j$ the j'th column of the matrix A. Let
$$B = A_j / \sqrt{p_j} \quad \text{with probability } p_j.$$
Computing the expectation of $BB^T$ we have
$$E[BB^T] = \sum_j p_j (A_j/\sqrt{p_j})(A_j/\sqrt{p_j})^T = \sum_j A_j A_j^T = AA^T.$$
Clearly, we cannot hope to approximate the matrix A with only one column. We therefore define B to be s such sampled columns from A, side by side:
$$B = \frac{1}{\sqrt{s}}\,[B_1 \ldots B_s].$$
Just to clarify, B is an $m \times s$ matrix containing s columns from A (rescaled) and $B_k = A_j/\sqrt{p_j}$ with probability $p_j$. Computing the expectation of $BB^T$ we get that
$$E[BB^T] = E\left[\frac{1}{s}\sum_{k=1}^{s} B_k B_k^T\right] = E[B_k B_k^T] = AA^T.$$
We are now ready to use the matrix Bernstein inequality above:
$$BB^T - AA^T = \sum_{k=1}^{s} \frac{1}{s}(B_k B_k^T - AA^T) = \sum_{k=1}^{s} X_k.$$
To make things simpler we pick $p_j = \|A_j\|^2 / \|A\|_F^2$. In words, the columns are picked with probability proportional to their squared norm. Then
$$R = \max_k \|X_k\| \le \max_j \frac{1}{s}\|A_j A_j^T\|/p_j + \frac{1}{s}\|AA^T\| \le \frac{1}{s}\|A\|_F^2 + \frac{1}{s}\|A\|^2$$
\begin{align}
\sigma^2 = \left\|\sum_{k=1}^{s} E[X_k X_k^T]\right\| &= \frac{1}{s}\left\|E[B_k B_k^T B_k B_k^T] - AA^T AA^T\right\| \tag{7}\\
&\le \frac{1}{s}\left\|\sum_{j=1}^{n} p_j \frac{A_j A_j^T A_j A_j^T}{p_j^2}\right\| + \frac{1}{s}\|A\|^4 \le \frac{1}{s}\|A\|_F^2\|A\|^2 + \frac{1}{s}\|A\|^4 \tag{8}
\end{align}
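Before plugging these bounds into the inequality, here is a minimal numpy sketch of the column-sampling construction just described (my own illustration, not from the notes; the function name is arbitrary).

import numpy as np

def sample_columns(A, s, seed=0):
    # p_j = ||A_j||^2 / ||A||_F^2: columns are picked with probability
    # proportional to their squared norm.
    rng = np.random.default_rng(seed)
    col_norms_sq = np.sum(A * A, axis=0)
    p = col_norms_sq / col_norms_sq.sum()
    idx = rng.choice(A.shape[1], size=s, p=p)   # s i.i.d. column indices
    # Each sampled column is A_j / sqrt(p_j), and B carries an overall
    # 1/sqrt(s) factor, so that E[B B^T] = A A^T.
    return A[:, idx] / np.sqrt(s * p[idx])

m, n, s = 100, 5000, 400
A = np.random.randn(m, 10) @ np.random.randn(10, n)
B = sample_columns(A, s)
print(np.linalg.norm(A @ A.T - B @ B.T, 2) / np.linalg.norm(A, 2) ** 2)

The printed ratio estimates $\|BB^T - AA^T\|/\|A\|^2$ and should shrink as s grows.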

Plugging both expressions into the matrix Bernstein inequality above we get that
$$\Pr[\|BB^T - AA^T\| > \varepsilon^2\|A\|^2/2] \le 2m\exp\left(\frac{-s\varepsilon^4\|A\|^4/4}{2\|A\|_F^2\|A\|^2 + 2\|A\|^4 + \varepsilon^2\|A\|^2\|A\|_F^2/3 + \varepsilon^2\|A\|^4/3}\right).$$
By using that $\|A\|_F \ge \|A\|$ and that $\varepsilon \le 1$, and denoting the numeric rank of A by $r = \|A\|_F^2/\|A\|^2$, we can simplify this to be
$$\Pr[\|BB^T - AA^T\| > \varepsilon^2\|A\|^2/2] \le 2m\, e^{-\frac{s\varepsilon^4}{16 r}}.$$
To conclude, it is enough to sample
$$s \ge \frac{16 r \log(m/\delta)}{\varepsilon^4}$$
columns from A (with probability proportional to their squared norm) to form a matrix B such that $\|BB^T - AA^T\| \le \varepsilon^2\|A\|^2/2$ with probability at least $1-\delta$. According to the above this also gives us that $\|A - P_k^B A\| \le \sigma_{k+1} + \varepsilon\|A\|$, which completes our claim. Note that the number of non-zeros in B is bounded by $s \cdot m$, which is independent of n, the number of columns. This is potentially significantly better than the results obtained in last week's class.

Remark 0.1. More elaborate column selection algorithms exist which provide better approximation but these will not be discussed here. See [3] for the latest result that I am aware of.

Deterministic Lossy SVD

In this section we will see that the above approximation can be improved using a simple trick [4]. The algorithm keeps an $m \times s$ sketch matrix B which is updated every time a new column from the input matrix A is added. It maintains the invariant that the last column in the sketch is always zero. When a new input column is added it is placed in the last (all zeros) column of the sketch. Then, using its SVD, the sketch is rotated from the right so that its columns are orthogonal. Finally, the sketch column norms are shrunk so that the last column is again all zeros. In the algorithm we denote by $[U, \Sigma, V] = \mathrm{SVD}(B)$ the Singular Value Decomposition of B. We use the convention that $U\Sigma V^T = B$, $U^T U = V^T V = I$, and $\Sigma = \mathrm{diag}([\sigma_1, \ldots, \sigma_s])$, $\sigma_1 \ge \ldots \ge \sigma_s$. The notation I stands for the $s \times s$ identity matrix while $B_s$ denotes the s'th column of B (similarly $A_j$).

Input: s, $A \in \mathbb{R}^{m \times n}$
$B^0 \leftarrow$ all zeros matrix in $\mathbb{R}^{m \times s}$
for $i \in [n]$ do
    $C^i \leftarrow B^{i-1}$
    $C^i_s \leftarrow A_i$    (put $A_i$ in the last, all zeros, column of $C^i$)
    $[U^i, \Sigma^i, V^i] \leftarrow \mathrm{SVD}(C^i)$
    $D^i \leftarrow U^i \Sigma^i$
    $\delta_i \leftarrow (\Sigma^i_{s,s})^2$
    $W^i \leftarrow \sqrt{I - \delta_i (\Sigma^i)^{-2}}$
    $B^i \leftarrow D^i W^i$
end for
Return: $B = B^n$
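Before stating and proving the guarantee, here is a minimal, unoptimized numpy rendering of the procedure above (my own sketch, assuming $s \le m$; it recomputes a full SVD per column insertion exactly as in the pseudocode).

import numpy as np

def lossy_svd_sketch(A, s):
    # Maintain an m x s sketch B whose last column is all zeros before each insertion.
    m, n = A.shape
    B = np.zeros((m, s))
    for i in range(n):
        C = B.copy()
        C[:, -1] = A[:, i]                    # place A_i in the last (zero) column
        U, sig, _ = np.linalg.svd(C, full_matrices=False)
        delta = sig[-1] ** 2                  # delta_i = (Sigma_{s,s})^2
        # B^i = D^i W^i = U^i sqrt((Sigma^i)^2 - delta_i I); its last column is zero.
        B = U * np.sqrt(np.maximum(sig ** 2 - delta, 0.0))
    return B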

Lemma 0.2. Let B be the output of the above algorithm for a matrix A and an integer s. Then:
$$\|AA^T - BB^T\| \le \|A\|_F^2 / s.$$

Proof. We start by bounding $\|AA^T - BB^T\|$ from above:
$$\|AA^T - BB^T\| = \max_{\|x\| \le 1}\left(x^T AA^T x - x^T BB^T x\right) = \max_{\|x\| \le 1}\left(\|x^T A\|^2 - \|x^T B\|^2\right).$$
We open this with a simple telescoping sum $\|x^T B\|^2 = \|x^T B^n\|^2 = \sum_{i=1}^{n}\left(\|x^T B^i\|^2 - \|x^T B^{i-1}\|^2\right)$ and by replacing $\|x^T A\|^2 = \sum_{i=1}^{n}(x^T A_i)^2$:

\begin{align}
\|x^T A\|^2 - \|x^T B\|^2 &= \sum_{i=1}^{n}\left[(x^T A_i)^2 - \left(\|x^T B^i\|^2 - \|x^T B^{i-1}\|^2\right)\right] \tag{9}\\
&= \sum_{i=1}^{n}\left[\left((x^T A_i)^2 + \|x^T B^{i-1}\|^2\right) - \|x^T B^i\|^2\right] \tag{10}\\
&= \sum_{i=1}^{n}\left[\|x^T C^i\|^2 - \|x^T B^i\|^2\right] \tag{11}\\
&= \sum_{i=1}^{n}\left[\|x^T D^i\|^2 - \|x^T D^i W^i\|^2\right] \tag{12}\\
&= \sum_{i=1}^{n} x^T\left[D^i (D^i)^T - D^i (W^i)^2 (D^i)^T\right]x \tag{13}\\
&= \sum_{i=1}^{n} x^T\left[\delta_i D^i (\Sigma^i)^{-2} (D^i)^T\right]x \tag{14}\\
&= \sum_{i=1}^{n} x^T\left[\delta_i U^i (U^i)^T\right]x \tag{15}\\
&\le \sum_{i=1}^{n} \delta_i \|U^i (U^i)^T\| \le \sum_{i=1}^{n} \delta_i \tag{16}
\end{align}

For this to mean anything we have to bound the term $\sum_{i=1}^{n}\delta_i$. We do this by computing the Frobenius norm of B:

\begin{align}
\|B\|_F^2 = \|B^n\|_F^2 &= \sum_{i=1}^{n}\left[\|B^i\|_F^2 - \|B^{i-1}\|_F^2\right] \tag{17}\\
&= \sum_{i=1}^{n}\left[\|B^i\|_F^2 - \|D^i\|_F^2\right] + \left[\|D^i\|_F^2 - \|C^i\|_F^2\right] + \left[\|C^i\|_F^2 - \|B^{i-1}\|_F^2\right] \tag{18}
\end{align}

Let us deal with each term separately:

\begin{align}
\|B^i\|_F^2 - \|D^i\|_F^2 &= \mathrm{tr}\left(B^i (B^i)^T - D^i (D^i)^T\right) = -\mathrm{tr}\left(\delta_i U^i (U^i)^T\right) \ge -s\delta_i \tag{19}\\
\|D^i\|_F^2 - \|C^i\|_F^2 &= 0 \tag{20}\\
\|C^i\|_F^2 - \|B^{i-1}\|_F^2 &= \|A_i\|^2 \tag{21}
\end{align}

Putting these together we get
$$\|B\|_F^2 = \|B^n\|_F^2 \ge \|A\|_F^2 - s\sum_{i=1}^{n}\delta_i.$$
Since $\|B\|_F^2 \ge 0$ we conclude that $\sum_{i=1}^{n}\delta_i \le \|A\|_F^2/s$. Combining with the above:
$$\|BB^T - AA^T\| \le \|A\|_F^2/s.$$
Finally, we recall that to achieve the approximation guarantee $\|A - P_k^B A\| \le \sigma_{k+1} + \varepsilon\|A\|$ it is sufficient to require $\|BB^T - AA^T\| \le \varepsilon^2\|A\|^2/2$, which is obtained when
$$s \ge \frac{2r}{\varepsilon^2}.$$
This completes the claim.
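As a quick numerical sanity check (not part of the notes), one can compare the measured error against the $\|A\|_F^2/s$ bound of Lemma 0.2, reusing the lossy_svd_sketch function sketched above:

import numpy as np

A = np.random.randn(80, 3000)
for s in (10, 20, 40):
    B = lossy_svd_sketch(A, s)               # the sketch from the code above
    err = np.linalg.norm(A @ A.T - B @ B.T, 2)
    bound = np.linalg.norm(A, 'fro') ** 2 / s
    print(s, err, bound)                     # err should stay below the bound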

Discussion

Note that while the deterministic lossy SVD procedure requires less space it also requires more operations per column insertion. Namely, it computes the SVD of the sketch in every iteration. This can be somewhat improved (see [4]) but it is still far from being as efficient as column sampling. Making this algorithm faster while maintaining its approximation guarantee is an open problem. Another interesting problem is to make this algorithm take advantage of the matrix sparsity.

References

[1] Emmanuel Candès and Benjamin Recht. Exact matrix completion via convex optimization. Commun. ACM, 55(6), June 2012.
[2] Mark Rudelson and Roman Vershynin. Sampling from large matrices: An approach through geometric functional analysis. J. ACM, 54(4), July 2007.
[3] Christos Boutsidis, Petros Drineas, and Malik Magdon-Ismail. Near optimal column-based matrix reconstruction. In Proceedings of the 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, FOCS '11, Washington, DC, USA, 2011. IEEE Computer Society.
[4] Edo Liberty. Simple and deterministic matrix sketching. CoRR, abs/1206.0594, 2012.
