Guaranteed Rank Minimization via Singular Value Projection: Supplementary Material


Prateek Jain, Microsoft Research Bangalore, Bangalore, India
Raghu Meka, Dept. of Computer Sciences, UT Austin, Austin, TX, USA
Inderjit Dhillon, Dept. of Computer Sciences, UT Austin, Austin, TX, USA

Appendix A: Analysis for Affine Constraints Satisfying RIP

We first give a proof of Lemma 2.1, which bounds the error of the (t+1)-st iterate, ψ(X^{t+1}), in terms of the error incurred by the t-th iterate and the optimal solution.

Proof of Lemma 2.1. Recall that ψ(X) = (1/2)‖A(X) − b‖_2². Since ψ(·) is a quadratic function, we have

ψ(X^{t+1}) − ψ(X^t) = ⟨∇ψ(X^t), X^{t+1} − X^t⟩ + (1/2)‖A(X^{t+1} − X^t)‖_2²
 ≤ ⟨A^T(A(X^t) − b), X^{t+1} − X^t⟩ + (1/2)(1 + δ_{2k})‖X^{t+1} − X^t‖_F²,    (0.1)

where the inequality follows from RIP applied to the matrix X^{t+1} − X^t of rank at most 2k. Let

Y^{t+1} = X^t − (1/(1 + δ_{2k})) A^T(A(X^t) − b)

and

f_t(X) = ⟨A^T(A(X^t) − b), X − X^t⟩ + (1/2)(1 + δ_{2k})‖X − X^t‖_F².

Now,

f_t(X) = (1/2)(1 + δ_{2k}) ‖X − X^t + (1/(1 + δ_{2k})) A^T(A(X^t) − b)‖_F² − (1/(2(1 + δ_{2k}))) ‖A^T(A(X^t) − b)‖_F²
 = (1/2)(1 + δ_{2k}) ‖X − Y^{t+1}‖_F² − (1/(2(1 + δ_{2k}))) ‖A^T(A(X^t) − b)‖_F².

Thus, by definition, X^{t+1} = P_k(Y^{t+1}) is the minimizer of f_t(X) over all matrices X ∈ C(k) (of rank at most k). In particular, f_t(X^{t+1}) ≤ f_t(X*) and

ψ(X^{t+1}) − ψ(X^t) ≤ f_t(X^{t+1}) ≤ f_t(X*)
 = ⟨A^T(A(X^t) − b), X* − X^t⟩ + (1/2)(1 + δ_{2k})‖X* − X^t‖_F²
 ≤ ⟨A^T(A(X^t) − b), X* − X^t⟩ + ((1 + δ_{2k})/(2(1 − δ_{2k}))) ‖A(X* − X^t)‖_2²    (0.2)
 = ψ(X*) − ψ(X^t) + (δ_{2k}/(1 − δ_{2k})) ‖A(X* − X^t)‖_2²,

where inequality (0.2) follows from RIP applied to X* − X^t.
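For concreteness, here is a minimal NumPy sketch of the SVP iteration analyzed above; it is not the authors' code. The linear map A and its adjoint are supplied as callables A_op and At_op (illustrative names), and the step size 1/(1 + δ_{2k}) is the one used in the analysis.

```python
import numpy as np

def svd_project(Y, k):
    # P_k(Y): best rank-k approximation of Y via the top-k singular triplets.
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

def svp(A_op, At_op, b, shape, k, delta_2k=0.1, iters=100):
    # Gradient step on psi(X) = 0.5 * ||A(X) - b||_2^2 followed by
    # projection onto the rank-k matrices, repeated `iters` times.
    X = np.zeros(shape)
    eta = 1.0 / (1.0 + delta_2k)            # step size from the analysis
    for _ in range(iters):
        grad = At_op(A_op(X) - b)            # A^T (A(X^t) - b)
        X = svd_project(X - eta * grad, k)   # X^{t+1} = P_k(Y^{t+1})
    return X
```

A dense measurement operator can be passed, for instance, as A_op = lambda X: A_mat @ X.ravel() together with the matching adjoint At_op = lambda r: (A_mat.T @ r).reshape(shape), where A_mat is a d × mn matrix.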

We now prove Theorem 1.2.

Proof of Theorem 1.2. Let the current solution X^t satisfy ψ(X^t) ≥ G‖e‖_2²/2, where G ≥ 0 is a universal constant. Using Lemma 2.1 and the fact that b − A(X*) = e,

ψ(X^{t+1}) ≤ ‖e‖_2²/2 + (δ_{2k}/(1 − δ_{2k})) ‖b − A(X^t) − e‖_2²
 ≤ ‖e‖_2²/2 + (2δ_{2k}/(1 − δ_{2k})) (ψ(X^t) − e^T(b − A(X^t)) + ‖e‖_2²/2)
 ≤ (1/G) ψ(X^t) + (2δ_{2k}/(1 − δ_{2k})) (ψ(X^t) + (2/√G) ψ(X^t) + (1/G) ψ(X^t))
 ≤ D ψ(X^t),

where D = 1/G + (2δ_{2k}/(1 − δ_{2k})) (1 + 1/√G)². Recall that δ_{2k} < 1/3. Hence, selecting G > (1 + δ_{2k})/(1 − 3δ_{2k}), we get D < 1. Also, ψ(X^0) = ψ(0) = ‖b‖_2²/2. Hence, ψ(X^τ) ≤ C‖e‖_2² + ε, where

τ = (1/log(1/D)) · log( ‖b‖_2² / (2(C‖e‖_2² + ε)) )

and C = G/2.

Appendix B: Proof of Weak RIP for Matrix Completion

We now give a detailed proof of the weak RIP property for the matrix completion problem, Theorem 2.2.

Proof of Lemma 2.4. Let X = UΣV^T be the singular value decomposition of X. Then,

X_{ij} = e_i^T U Σ V^T e_j = Σ_{l=1}^{k} Σ_{ll} U_{il} V_{jl}.

Since X is μ-incoherent,

|X_{ij}| ≤ Σ_{l=1}^{k} Σ_{ll} |U_{il}| |V_{jl}| ≤ (μ/√(mn)) Σ_{l=1}^{k} Σ_{ll} ≤ (μ/√(mn)) √k ( Σ_{l=1}^{k} Σ_{ll}² )^{1/2} = (μ√k/√(mn)) ‖X‖_F.

We also need the following technical lemma for our proof.

Lemma 0.1. Let a, b, c, x, y, z ∈ [−1, 1]. Then,

|abc − xyz| ≤ |a − x| + |b − y| + |c − z|.

Proof. We have

abc − xyz = (abc − xbc) + (xbc − xyc) + (xyc − xyz),

so

|abc − xyz| ≤ |a − x||bc| + |b − y||xc| + |c − z||xy| ≤ |a − x| + |b − y| + |c − z|.

We now prove Lemma 2.5, using the following form of the classical Bernstein large-deviation inequality.

Lemma 0.2 (Bernstein's Inequality, see [3]). Let X_1, X_2, ..., X_n be independent random variables with E[X_i] = 0 and |X_i| ≤ M for all i. Then,

P[ Σ_i X_i > t ] ≤ exp( − (t²/2) / (Σ_i Var(X_i) + Mt/3) ).

Proof of Lemma 2.5. For (i, j) ∈ [m] × [n], let ω_{ij} be the indicator variable with ω_{ij} = 1 if (i, j) ∈ Ω and ω_{ij} = 0 otherwise. Then the ω_{ij} are independent random variables with Pr[ω_{ij} = 1] = p. Let the random variable Z_{ij} = ω_{ij} X_{ij}². Note that E[Z_{ij}] = p X_{ij}² and Var(Z_{ij}) = p(1 − p) X_{ij}⁴. Since X is α-regular, |Z_{ij} − E[Z_{ij}]| ≤ X_{ij}² ≤ (α²/(mn)) ‖X‖_F². Thus,

M = max_{ij} |Z_{ij} − E[Z_{ij}]| ≤ (α²/(mn)) ‖X‖_F².    (0.3)

Now define the random variable S = Σ_{ij} Z_{ij} = Σ_{(i,j)∈Ω} X_{ij}² = ‖P_Ω(X)‖_F². Note that E[S] = p‖X‖_F². Since the Z_{ij} are independent random variables,

Var(S) = p(1 − p) Σ_{ij} X_{ij}⁴ ≤ p (max_{ij} X_{ij}²) Σ_{ij} X_{ij}² ≤ p (α²/(mn)) ‖X‖_F⁴.    (0.4)

Using Bernstein's inequality (Lemma 0.2) for S with t = δp‖X‖_F², together with Equations (0.3) and (0.4), we get

Pr[ |S − E[S]| > t ] ≤ 2 exp( − (t²/2) / (Var(S) + Mt/3) ) ≤ 2 exp( − δ²pmn / (2α²(1 + δ/3)) ) ≤ 2 exp( − δ²pmn / (3α²) ).

Proof of Lemma 2.6. We construct S(μ, ε) by discretizing the space of low-rank incoherent matrices. Let ρ = ε/(3k√(mn)) and D(ρ) = {ρi : i ∈ ℤ, |i| < 1/ρ}. Let

U(ρ) = {U ∈ ℝ^{m×k} : U_{ij} ∈ √(μ/m) · D(ρ)},
V(ρ) = {V ∈ ℝ^{n×k} : V_{ij} ∈ √(μ/n) · D(ρ)},
Σ(ρ) = {Σ ∈ ℝ^{k×k} : Σ_{ij} = 0 for i ≠ j, Σ_{ii} ∈ D(ρ)},
S(μ, ε) = {UΣV^T : U ∈ U(ρ), Σ ∈ Σ(ρ), V ∈ V(ρ)}.

We will show that S(μ, ε) satisfies the conditions of the lemma. Observe that |D(ρ)| < 2/ρ. Thus |U(ρ)| < (2/ρ)^{mk}, |V(ρ)| < (2/ρ)^{nk} and |Σ(ρ)| < (2/ρ)^{k}. Hence,

|S(μ, ε)| < (2/ρ)^{mk + nk + k} < (mnk/ε)^{3(m+n)k}.

Fix a μ-incoherent X ∈ ℝ^{m×n} of rank at most k with ‖X‖_2 = 1, and let X = UΣV^T be its singular value decomposition. Let U' be the matrix obtained by rounding the entries of U to integer multiples of √(μ/m)·ρ as follows: for (i, l) ∈ [m] × [k], let

(U')_{il} = (√μ ρ/√m) ⌊ (√m/(√μ ρ)) U_{il} ⌋.

Since |U_{il}| ≤ √(μ/m), it follows that U' ∈ U(ρ). Further, for all i ∈ [m], l ∈ [k],

|(U')_{il} − U_{il}| < √(μ/m) ρ ≤ ρ.

Similarly, define V', Σ' by rounding the entries of V and Σ to integer multiples of √(μ/n)·ρ and ρ respectively. Then V' ∈ V(ρ), Σ' ∈ Σ(ρ) and, for (j, l) ∈ [n] × [k],

|(V')_{jl} − V_{jl}| < √(μ/n) ρ ≤ ρ,    |(Σ')_{ll} − Σ_{ll}| < ρ.

Let X(ρ) = U'Σ'V'^T. Then, by the above equations and Lemma 0.1, for i ∈ [m], l ∈ [k], j ∈ [n],

|(U')_{il}(Σ')_{ll}(V')_{jl} − U_{il}Σ_{ll}V_{jl}| < 3ρ.

Thus, for (i, j) ∈ [m] × [n],

|X(ρ)_{ij} − X_{ij}| = | Σ_{l=1}^{k} ((U')_{il}(Σ')_{ll}(V')_{jl} − U_{il}Σ_{ll}V_{jl}) | ≤ Σ_{l=1}^{k} |(U')_{il}(Σ')_{ll}(V')_{jl} − U_{il}Σ_{ll}V_{jl}| < 3kρ.    (0.5)
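As an aside, the rounding step just described is easy to mirror in code. The sketch below is purely illustrative (it is not part of the proof); it rounds the top-k SVD factors of a matrix onto the grids √(μ/m)·D(ρ), D(ρ) and √(μ/n)·D(ρ), with function and variable names chosen for exposition.

```python
import numpy as np

def round_to_net(X, k, rho, mu):
    # Round the top-k SVD factors of X onto the grids used to build S(mu, eps).
    m, n = X.shape
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U, s, Vt = U[:, :k], s[:k], Vt[:k, :]
    gu = np.sqrt(mu / m) * rho          # grid spacing for entries of U
    gv = np.sqrt(mu / n) * rho          # grid spacing for entries of V
    U_r = gu * np.floor(U / gu)         # entries now lie in sqrt(mu/m)*D(rho)
    Vt_r = gv * np.floor(Vt / gv)
    s_r = rho * np.floor(s / rho)       # singular values rounded to D(rho)
    return U_r @ np.diag(s_r) @ Vt_r    # X(rho) = U' Sigma' V'^T
```

Under the incoherence assumptions above, each entry of the returned matrix differs from the corresponding entry of X by less than 3kρ, which is the bound recorded in (0.5).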

Using Lemma 2.4 and Equation (0.5),

max_{ij} |X(ρ)_{ij}| ≤ max_{ij} |X_{ij}| + 3kρ ≤ (μ√k/√(mn)) ‖X‖_F + ε/√(mn).

Also, using (0.5),

‖X(ρ) − X‖_F² = Σ_{ij} |X(ρ)_{ij} − X_{ij}|² < 9k²mn ρ² = ε².

Further, by the triangle inequality, ‖X(ρ)‖_F > ‖X‖_F − ε > ‖X‖_F/2. Since ε < 1 and μ√k ‖X‖_F ≥ 1,

max_{ij} |X(ρ)_{ij}| < (2μ√k/√(mn)) ‖X‖_F < (4μ√k/√(mn)) ‖X(ρ)‖_F.

Thus, X(ρ) is 4μ√k-regular. The lemma now follows by taking Y = X(ρ).

We now prove Theorem 2.2 by combining Lemmas 2.5 and 2.6.

Proof of Theorem 2.2. Assume without loss of generality that m ≤ n, let ε = δ/(9mnk) and set

S'(μ, ε) = {Y ∈ S(μ, ε) : Y is 4μ√k-regular},

where S(μ, ε) is as in Lemma 2.6. Then, by Lemma 2.5 and a union bound,

Pr[ |‖P_Ω(Y)‖_F² − p‖Y‖_F²| ≥ δp‖Y‖_F² for some Y ∈ S'(μ, ε) ]
 ≤ 2 (mnk/ε)^{3(m+n)k} exp( − δ²pmn/(16μ²k) )
 ≤ exp(C₁ nk log n) exp( − δ²pmn/(16μ²k) ),

where C₁ ≥ 0 is a constant independent of m, n and k. Thus, if p > Cμ²k² log n/(δ²m), where C = 16(C₁ + 1), then with probability at least 1 − exp(−n log n) the following holds:

∀ Y ∈ S'(μ, ε):  |‖P_Ω(Y)‖_F² − p‖Y‖_F²| ≤ δp‖Y‖_F².    (0.6)

As the statement of the theorem is invariant under scaling, it is enough to prove it for all μ-incoherent matrices X of rank at most k with ‖X‖_2 = 1. Fix such an X and suppose that (0.6) holds. By Lemma 2.6 there exists Y ∈ S'(μ, ε) such that ‖Y − X‖_F ≤ ε. Moreover,

‖Y‖_F² ≤ (‖X‖_F + ε)² ≤ ‖X‖_F² + 2ε‖X‖_F + ε² ≤ ‖X‖_F² + 3ε√k.

Proceeding similarly, we can show that

‖X‖_F² − ‖Y‖_F² ≤ 3ε√k.    (0.7)

Further, starting with ‖P_Ω(Y − X)‖_F ≤ ‖Y − X‖_F ≤ ε and arguing as above, we get that

|‖P_Ω(Y)‖_F² − ‖P_Ω(X)‖_F²| ≤ 3ε√k.    (0.8)

Combining the inequalities (0.6), (0.7), (0.8) above, we have

|‖P_Ω(X)‖_F² − p‖X‖_F²| ≤ |‖P_Ω(X)‖_F² − ‖P_Ω(Y)‖_F²| + p|‖X‖_F² − ‖Y‖_F²| + |‖P_Ω(Y)‖_F² − p‖Y‖_F²|
 ≤ 6ε√k + δp‖Y‖_F²    (from (0.6), (0.7), (0.8))
 ≤ 6ε√k + δp(‖X‖_F² + 3ε√k)    (from (0.7))
 ≤ 9ε√k + δp‖X‖_F²
 ≤ 2δp‖X‖_F²    (since 9ε√k ≤ δp and ‖X‖_F ≥ ‖X‖_2 = 1).

The theorem now follows.
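The concentration that drives Theorem 2.2 is easy to observe numerically. The following sketch (with arbitrary, illustrative sizes) samples Ω entrywise with probability p and compares ‖P_Ω(X)‖_F² with p‖X‖_F², as in Lemma 2.5.

```python
import numpy as np

rng = np.random.default_rng(0)

m, n, k, p = 200, 300, 5, 0.3                                   # illustrative sizes
X = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))   # a rank-k test matrix

# Sample Omega entrywise with probability p and compute ||P_Omega(X)||_F^2.
trials = np.array([np.sum((rng.random((m, n)) < p) * X**2) for _ in range(50)])

target = p * np.sum(X**2)                                       # p * ||X||_F^2
print("p * ||X||_F^2           :", target)
print("mean ||P_Omega(X)||_F^2 :", trials.mean())
print("max relative deviation  :", np.max(np.abs(trials - target)) / target)
```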

Appendix C: SVP-Newton

The affine rank minimization problem is a natural generalization to matrices of the following compressed sensing problem (CSP) for vectors:

min_x ‖x‖_0  s.t.  Ax = b,    (0.9)

where ‖x‖_0 is the ℓ_0 norm (the size of the support) of x ∈ ℝ^n, A ∈ ℝ^{m×n} is the sensing matrix and b ∈ ℝ^m is the vector of measurements. Like ARMP, the compressed sensing problem is NP-hard in general; however, like ARMP, it can be solved when the measurement matrix satisfies a restricted isometry property. The Restricted Isometry Property (RIP) for the compressed sensing problem is analogous to the corresponding definition for matrices:

(1 − δ_k)‖x‖_2² ≤ ‖Ax‖_2² ≤ (1 + δ_k)‖x‖_2²    (0.10)

for all x with |supp(x)| ≤ k. Note that the support size of a vector equals the rank of the diagonal matrix whose diagonal is given by that vector. Compressed sensing is therefore a special case of the Affine Rank Minimization Problem (ARMP) with diagonal matrices, i.e., with fixed basis vectors U_k = I_k, V_k = I_k (see step 4 of the SVP algorithm), so SVP-Newton can be applied directly to Problem (0.9). The corresponding updates for SVP-Newton applied to compressed sensing are:

y^{t+1} ← x^t − η_t A^T(Ax^t − b),    (0.11)
S ← set of the k largest (in magnitude) entries of y^{t+1},    (0.12)
x^{t+1}_S ← argmin_{x_S} ‖A_S x_S − b‖_2²,  with x^{t+1} zero outside S.    (0.13)

We now present a theorem showing that the SVP-Newton method applied to the compressed sensing problem converges to the optimal solution in O(log k) iterations, where k is the size of the optimal support.

Theorem 0.3. Suppose the isometry constant of A in CSP satisfies δ_{2k} < 1/3. Let b = Ax* for a support-k vector x*, i.e., ‖x*‖_0 = k, with |b_i| ≤ 1 for all i. Then, the SVP-Newton algorithm for CSP (Updates (0.11)-(0.13)) with step-size η_t = 4/3 converges to the exact x* in at most

(1/log(1/α)) · log( k/((1 − δ_{2k}) c²) )

iterations, where c = min_{i ∈ supp(x*)} |x*_i| and α = (1/3 + δ_{2k})/(1 − δ_{2k}).

Proof. Assume that x^0 = 0. Using Lemma 2.1 specialized to CSP (or see Theorem 2.1 in Garg and Khandekar [1]),

(1 − δ_{2k}) ‖x^t − x*‖_2² ≤ ‖Ax^t − b‖_2² ≤ α^t ‖b‖_2²,

where α = (1/3 + δ_{2k})/(1 − δ_{2k}). Suppose the support set S discovered by SVP-Newton at the t-th step differs from the optimal support S* (the support of x*) in at least one element; note that if S equals S*, then x^t = x*, since x^t minimizes ‖A_S x_S − b‖_2². Since S* \ S ≠ ∅, we have ‖x^t − x*‖_2² ≥ c². Hence,

(1 − δ_{2k}) c² ≤ α^t ‖b‖_2²,

implying

t ≤ (1/log(1/α)) · log( k/((1 − δ_{2k}) c²) ).

Note that SVP-Newton converges to the exact optimal solution in a logarithmic number of iterations, rather than to the ε-approximate solution obtained by GraDeS [1], which is simply the specialization of SVP to the compressed sensing problem. Note also that the number of iterations required is an improvement over the O(k) bound provided by [2].
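A minimal NumPy sketch of Updates (0.11)-(0.13) is given below. It is illustrative only: the matrix A, step size eta and iteration count are placeholder choices rather than the settings of Theorem 0.3.

```python
import numpy as np

def svp_newton_cs(A, b, k, eta=0.75, iters=50):
    # SVP-Newton for compressed sensing:
    #   (0.11) gradient step, (0.12) keep the k largest entries,
    #   (0.13) least squares restricted to that support, zero elsewhere.
    # eta and iters are illustrative defaults, not the theorem's settings.
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(iters):
        y = x - eta * (A.T @ (A @ x - b))                   # update (0.11)
        S = np.argsort(np.abs(y))[-k:]                      # update (0.12)
        x = np.zeros(n)
        x[S] = np.linalg.lstsq(A[:, S], b, rcond=None)[0]   # update (0.13)
    return x

# Illustrative usage on a random instance:
# A = np.random.randn(80, 200); x_true = np.zeros(200); x_true[:5] = 1.0
# b = A @ x_true; x_hat = svp_newton_cs(A, b, k=5)
```

The refit over the selected support in (0.13) is what distinguishes SVP-Newton from the plain gradient-plus-thresholding step of SVP/GraDeS and is what yields the exact-recovery guarantee above.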

Moreover, [2] requires an incoherence property¹ for the measurement matrix, which is a stronger condition than RIP; in fact, one can construct measurement matrices A with a bounded RIP constant but an unbounded incoherence constant.

References

[1] Rahul Garg and Rohit Khandekar. Gradient descent with sparsification: an iterative algorithm for sparse recovery with restricted isometry property. In ICML, 2009.
[2] Arian Maleki. Coherence analysis of iterative thresholding algorithms. CoRR, 2009.
[3] Wikipedia. Bernstein inequalities (probability theory) — Wikipedia, The Free Encyclopedia, 2009. URL: https://en.wikipedia.org/wiki/Bernstein_inequalities_(probability_theory). [Online; accessed 16-April-2009].

¹ The incoherence property mentioned here is different from the one used in the context of matrix completion.
