Guaranteed Rank Minimization via Singular Value Projection: Supplementary Material
Prateek Jain, Microsoft Research Bangalore, Bangalore, India
Raghu Meka, UT Austin, Dept. of Computer Sciences, Austin, TX, USA
Inderjit Dhillon, UT Austin, Dept. of Computer Sciences, Austin, TX, USA

Appendix A: Analysis for Affine Constraints Satisfying RIP

We first give a proof of Lemma 2.1, which bounds the error of the $(t+1)$-st iterate, $\psi(X^{t+1})$, in terms of the error incurred by the $t$-th iterate and the optimal solution.

Proof of Lemma 2.1. Recall that $\psi(X) = \frac{1}{2}\|\mathcal{A}(X) - b\|_2^2$. Since $\psi(\cdot)$ is a quadratic function, we have
$$\psi(X^{t+1}) - \psi(X^t) = \langle \nabla\psi(X^t), X^{t+1} - X^t \rangle + \tfrac{1}{2}\|\mathcal{A}(X^{t+1} - X^t)\|_2^2 \le \langle \mathcal{A}^T(\mathcal{A}(X^t) - b), X^{t+1} - X^t \rangle + \tfrac{1}{2}(1+\delta_{2k})\|X^{t+1} - X^t\|_F^2, \qquad (0.1)$$
where the inequality follows from RIP applied to the matrix $X^{t+1} - X^t$ of rank at most $2k$. Let
$$Y^{t+1} = X^t - \tfrac{1}{1+\delta_{2k}}\,\mathcal{A}^T(\mathcal{A}(X^t) - b) \quad \text{and} \quad f_t(X) = \langle \mathcal{A}^T(\mathcal{A}(X^t) - b), X - X^t \rangle + \tfrac{1}{2}(1+\delta_{2k})\|X - X^t\|_F^2.$$
Now,
$$f_t(X) = \tfrac{1}{2}(1+\delta_{2k})\left[\|X - X^t\|_F^2 + 2\left\langle \tfrac{\mathcal{A}^T(\mathcal{A}(X^t) - b)}{1+\delta_{2k}},\, X - X^t \right\rangle\right] = \tfrac{1}{2}(1+\delta_{2k})\|X - Y^{t+1}\|_F^2 - \tfrac{1}{2(1+\delta_{2k})}\|\mathcal{A}^T(\mathcal{A}(X^t) - b)\|_F^2.$$
Thus, by definition, $P_k(Y^{t+1}) = X^{t+1}$ is the minimizer of $f_t(X)$ over all matrices $X \in C(k)$ (of rank at most $k$). In particular, $f_t(X^{t+1}) \le f_t(X^*)$, and hence
$$\psi(X^{t+1}) - \psi(X^t) \le f_t(X^{t+1}) \le f_t(X^*) = \langle \mathcal{A}^T(\mathcal{A}(X^t) - b), X^* - X^t \rangle + \tfrac{1}{2}(1+\delta_{2k})\|X^* - X^t\|_F^2$$
$$\le \langle \mathcal{A}^T(\mathcal{A}(X^t) - b), X^* - X^t \rangle + \tfrac{1}{2}\cdot\tfrac{1+\delta_{2k}}{1-\delta_{2k}}\,\|\mathcal{A}(X^* - X^t)\|_2^2 \qquad (0.2)$$
$$= \psi(X^*) - \psi(X^t) + \tfrac{\delta_{2k}}{1-\delta_{2k}}\,\|\mathcal{A}(X^* - X^t)\|_2^2,$$
where inequality (0.2) follows from RIP applied to $X^* - X^t$.
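To make the iteration analyzed above concrete, here is a minimal sketch of SVP for an affine map $\mathcal{A}$ represented by sensing matrices $A_i$ (our illustration, not the authors' code; the dimensions, the number of measurements, and the step size $\eta = 1$ standing in for $1/(1+\delta_{2k})$ are arbitrary choices):

```python
import numpy as np

def P_k(Y, k):
    """Project Y onto the set of rank-at-most-k matrices via truncated SVD."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

def svp(A_mats, b, shape, k, eta, iters=100):
    """SVP iteration: X^{t+1} = P_k(X^t - eta * A^T(A(X^t) - b)).

    A_mats is a list of sensing matrices A_i, so (A(X))_i = <A_i, X>;
    eta plays the role of 1/(1 + delta_2k) in the analysis.
    """
    X = np.zeros(shape)
    for _ in range(iters):
        residual = np.array([np.sum(Ai * X) for Ai in A_mats]) - b  # A(X) - b
        grad = sum(r * Ai for r, Ai in zip(residual, A_mats))       # A^T(A(X) - b)
        X = P_k(X - eta * grad, k)
    return X

# Illustrative run: Gaussian measurements of a random rank-2 matrix.
rng = np.random.default_rng(0)
m, n, k, num_meas = 20, 15, 2, 400
X_star = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))
A_mats = [rng.standard_normal((m, n)) / np.sqrt(num_meas) for _ in range(num_meas)]
b = np.array([np.sum(Ai * X_star) for Ai in A_mats])
X_hat = svp(A_mats, b, (m, n), k, eta=1.0)
print(np.linalg.norm(X_hat - X_star) / np.linalg.norm(X_star))  # small on success
```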
2 Proof of Theorem.2 Let the current solution X t satisfy ψ(x t ) G e 2 /2, where G 0 is a universal constant. Using Lemma 2. and the fact that b A(X ) = e, δ 2k ψ(x t+ ) e ( δ 2k ) b A(Xt ) e 2 2 e δ 2k 2 ( δ 2k ) (ψ(xt ) e T (b A(X t )) + e 2 /2) ψ(xt ) G 2 + 2δ ( 2k ψ(x t ) + 2 ( δ 2k ) G ψ(xt ) + ) G 2 ψ(xt ) Dψ(X t ), ( where D = G + 2δ ( ) ) 2 2k ( δ 2k ) + 2 G. Recall that δ 2k < /3. Hence, selecting G > ( + δ 2k )/( 3δ 2k ), we get D <. Also, ψ(x 0 ) = ψ(0) = b 2 /2. Hence, ψ(x τ ) C e 2 + ϵ where τ = log(/d) log b 2 2(C e 2 +ϵ) and C = G 2 /2. Appendix B: Proof of Weak RIP for Matrix Completion We now give a detailed proof of the weak RIP property for the matrix completion problem, Theorem 2.2. Proof of Lemma 2.4 Let X = UΣV T be the singular value decomposition of X. Then, Since X is µ-incoherent, X ij l= X ij = e T i UΣV T e j = Σ ll U il V jl µ mn ( l= U il Σ ll V jl l= Σ ll U il V jl. l= Σ ll ) µ mn k ( We also need the following technical lemma for our proof. Lemma 0. Let a, b, c, x, y, z [, ]. Then, l= abc xyz a x + b y + c z. Σ 2 ll) /2 = µ k mn X F. Proof We have, abc xyz = (abc xbc) + (xbc xyc) + (xyc xyz) a x bc + b y xc + c z xy a x + b y + c z. We now prove Lemma 2.5. We use the following form of the classical Bernstein s large deviation inequality in our proof. Lemma 0.2 (Bernstein s Inequality, see [3]) Let X, X 2,..., X n be independent random variables with E[X i ] = 0 and X i M for all i. Then, P [ ( t 2 ) /2 X i > t] exp i i V ar(x. i) + Mt/3 Proof of Lemma 2.5 For (i, j) [m] [n], let ω ij be the indicator variables with ω ij = if (i, j) Ω and 0 otherwise. Then, ω ij are independent random variables with Pr[ω ij = ] = p. Let random variable Z ij = ω ij Xij 2. Note that, E[Z ij ] = px 2 ij, V ar(z ij ) = p( p)x 4 ij. Since X is α-regular, Z ij E[Z ij ] X ij 2 (α 2 /mn) X 2 F. Thus, M = max Z ij E[Z ij ] α2 mn X 2 F. (0.3) 2
Proof of Lemma 2.5. For $(i,j) \in [m]\times[n]$, let $\omega_{ij}$ be the indicator variable with $\omega_{ij} = 1$ if $(i,j) \in \Omega$ and $0$ otherwise. Then the $\omega_{ij}$ are independent random variables with $\Pr[\omega_{ij} = 1] = p$. Let $Z_{ij} = \omega_{ij}X_{ij}^2$. Note that
$$E[Z_{ij}] = pX_{ij}^2, \qquad \mathrm{Var}(Z_{ij}) = p(1-p)X_{ij}^4.$$
Since $X$ is $\alpha$-regular, $|Z_{ij} - E[Z_{ij}]| \le X_{ij}^2 \le (\alpha^2/mn)\|X\|_F^2$. Thus,
$$M = \max_{ij}\,|Z_{ij} - E[Z_{ij}]| \le \frac{\alpha^2}{mn}\,\|X\|_F^2. \qquad (0.3)$$
Now, define the random variable $S = \sum_{ij} Z_{ij} = \sum_{ij}\omega_{ij}X_{ij}^2 = \|P_\Omega(X)\|_F^2$. Note that $E[S] = p\|X\|_F^2$. Since the $Z_{ij}$ are independent random variables,
$$\mathrm{Var}(S) = p(1-p)\sum_{ij} X_{ij}^4 \le p\,\big(\max_{ij} X_{ij}^2\big)\sum_{ij} X_{ij}^2 \le p\,\frac{\alpha^2}{mn}\,\|X\|_F^4. \qquad (0.4)$$
Using Bernstein's inequality (Lemma 0.2) for $S$ with $t = \delta p\|X\|_F^2$ and Equations (0.3) and (0.4), we get
$$\Pr[|S - E[S]| > t] \le 2\exp\left(-\frac{t^2/2}{\mathrm{Var}(S) + Mt/3}\right) \le 2\exp\left(-\frac{\delta^2 pmn}{2\alpha^2(1+\delta/3)}\right) \le 2\exp\left(-\frac{\delta^2 pmn}{3\alpha^2}\right).$$
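The following small simulation (a sketch only; the matrix size, rank, and sampling probability are arbitrary choices) illustrates the concentration that Lemma 2.5 guarantees, namely that $\|P_\Omega(X)\|_F^2$ stays within a $(1 \pm \delta)$ factor of $p\|X\|_F^2$:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, k, p = 200, 150, 3, 0.3

# A random rank-k matrix; Gaussian factors are incoherent/regular with
# high probability, which is the regime Lemma 2.5 addresses.
X = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))
fro2 = np.linalg.norm(X, 'fro')**2

ratios = []
for _ in range(2000):
    omega = rng.random((m, n)) < p  # each entry sampled independently w.p. p
    S = np.sum((X * omega)**2)      # ||P_Omega(X)||_F^2
    ratios.append(S / (p * fro2))
print(f"min ratio {min(ratios):.3f}, max ratio {max(ratios):.3f}")  # both close to 1
```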
Proof of Lemma 2.6. We construct $S(\mu,\epsilon)$ by discretizing the space of low-rank incoherent matrices. Let $\rho = \epsilon/\sqrt{9k^2mn}$ and $D(\rho) = \{\rho i : i \in \mathbb{Z},\ |i| < \lceil 1/\rho\rceil\}$. Let
$$U(\rho) = \{U \in \mathbb{R}^{m\times k} : U_{ij} \in \sqrt{\mu/m}\;D(\rho)\}, \qquad V(\rho) = \{V \in \mathbb{R}^{n\times k} : V_{ij} \in \sqrt{\mu/n}\;D(\rho)\},$$
$$\Sigma(\rho) = \{\Sigma \in \mathbb{R}^{k\times k} : \Sigma_{ij} = 0 \text{ for } i \ne j,\ \Sigma_{ii} \in D(\rho)\}, \qquad S(\mu,\epsilon) = \{U\Sigma V^T : U \in U(\rho),\ \Sigma \in \Sigma(\rho),\ V \in V(\rho)\}.$$
We will show that $S(\mu,\epsilon)$ satisfies the conditions of the lemma. Observe that $|D(\rho)| < 2/\rho$. Thus,
$$|U(\rho)| < (2/\rho)^{mk}, \qquad |V(\rho)| < (2/\rho)^{nk}, \qquad |\Sigma(\rho)| < (2/\rho)^{k}.$$
Hence, $|S(\mu,\epsilon)| < (2/\rho)^{mk+nk+k} < (mnk/\epsilon)^{3(m+n)k}$.

Fix a $\mu$-incoherent $X \in \mathbb{R}^{m\times n}$ of rank at most $k$ with $\|X\|_2 = 1$, and let $X = U\Sigma V^T$ be its singular value decomposition. Let $U'$ be the matrix obtained by rounding the entries of $U$ to integer multiples of $\sqrt{\mu/m}\,\rho$ as follows: for $(i,l) \in [m]\times[k]$, let
$$(U')_{il} = \sqrt{\mu/m}\;\rho\left\lfloor \frac{U_{il}}{\sqrt{\mu/m}\;\rho} \right\rfloor.$$
Now, since $|U_{il}| \le \sqrt{\mu/m}$, it follows that $U' \in U(\rho)$. Further, for all $i \in [m]$, $l \in [k]$,
$$|(U')_{il} - U_{il}| < \sqrt{\mu/m}\;\rho \le \rho.$$
Similarly, define $V', \Sigma'$ by rounding the entries of $V, \Sigma$ to integer multiples of $\sqrt{\mu/n}\,\rho$ and $\rho$ respectively. Then $V' \in V(\rho)$, $\Sigma' \in \Sigma(\rho)$, and for $(j,l) \in [n]\times[k]$,
$$|(V')_{jl} - V_{jl}| < \sqrt{\mu/n}\;\rho \le \rho, \qquad |(\Sigma')_{ll} - \Sigma_{ll}| < \rho.$$
Let $X(\rho) = U'\Sigma' V'^T$. Then, by the above equations and Lemma 0.1, for $i \in [m]$, $l \in [k]$, $j \in [n]$,
$$|(U')_{il}(\Sigma')_{ll}(V')_{jl} - U_{il}\Sigma_{ll}V_{jl}| < 3\rho.$$
Thus, for $(i,j) \in [m]\times[n]$,
$$|X(\rho)_{ij} - X_{ij}| = \Big|\sum_{l=1}^k \big[(U')_{il}(\Sigma')_{ll}(V')_{jl} - U_{il}\Sigma_{ll}V_{jl}\big]\Big| \le \sum_{l=1}^k \big|(U')_{il}(\Sigma')_{ll}(V')_{jl} - U_{il}\Sigma_{ll}V_{jl}\big| < 3k\rho. \qquad (0.5)$$
Using Lemma 2.4 and Equation (0.5),
$$\max_{ij}|X(\rho)_{ij}| \le \max_{ij}|X_{ij}| + 3k\rho \le \frac{\mu\sqrt{k}}{\sqrt{mn}}\,\|X\|_F + \frac{\epsilon}{\sqrt{mn}}.$$
Also, using (0.5),
$$\|X(\rho) - X\|_F^2 = \sum_{ij}|X(\rho)_{ij} - X_{ij}|^2 < 9k^2mn\rho^2 = \epsilon^2.$$
Further, by the triangle inequality, $\|X(\rho)\|_F \ge \|X\|_F - \epsilon \ge \|X\|_F/2$, since $\epsilon \le 1/2 \le \|X\|_F/2$ (recall $\|X\|_2 = 1$, so $\|X\|_F \ge 1$). Since $\epsilon < 1 \le \mu\sqrt{k}\,\|X\|_F$,
$$\max_{ij}|X(\rho)_{ij}| \le \frac{2\mu\sqrt{k}}{\sqrt{mn}}\,\|X\|_F \le \frac{4\mu\sqrt{k}}{\sqrt{mn}}\,\|X(\rho)\|_F.$$
Thus, $X(\rho)$ is $4\mu\sqrt{k}$-regular. The lemma now follows by taking $Y = X(\rho)$.

We now prove Theorem 2.2 by combining Lemmas 2.5 and 2.6.

Proof of Theorem 2.2. Let $m \le n$, $\epsilon = \delta/(9mnk)$, and
$$S'(\mu,\epsilon) = \{Y : Y \in S(\mu,\epsilon),\ Y \text{ is } 4\mu\sqrt{k}\text{-regular}\},$$
where $S(\mu,\epsilon)$ is as in Lemma 2.6. Then, by Lemma 2.5 (applied with $\alpha = 4\mu\sqrt{k}$) and a union bound,
$$\Pr\left[\big|\|P_\Omega(Y)\|_F^2 - p\|Y\|_F^2\big| \ge \delta p\|Y\|_F^2 \text{ for some } Y \in S'(\mu,\epsilon)\right] \le 2\left(\frac{mnk}{\epsilon}\right)^{3(m+n)k}\exp\left(-\frac{\delta^2 pmn}{48\mu^2k}\right) \le \exp(C_1 nk\log n)\exp\left(-\frac{\delta^2 pmn}{48\mu^2k}\right),$$
where $C_1 > 0$ is a constant independent of $m, n, k$. Thus, if $p > C\mu^2k^2\log n/(\delta^2 m)$, where $C = 48(C_1+1)$, then with probability at least $1 - \exp(-n\log n)$ the following holds:
$$\forall Y \in S'(\mu,\epsilon), \qquad \big|\|P_\Omega(Y)\|_F^2 - p\|Y\|_F^2\big| \le \delta p\|Y\|_F^2. \qquad (0.6)$$
As the statement of the theorem is invariant under scaling, it is enough to prove it for all $\mu$-incoherent matrices $X$ of rank at most $k$ with $\|X\|_2 = 1$. Fix such an $X$ and suppose that (0.6) holds. Now, by Lemma 2.6 there exists $Y \in S'(\mu,\epsilon)$ such that $\|Y - X\|_F \le \epsilon$. Moreover, since $\|X\|_F \le \sqrt{k}$,
$$\|Y\|_F^2 \le (\|X\|_F + \epsilon)^2 \le \|X\|_F^2 + 2\epsilon\|X\|_F + \epsilon^2 \le \|X\|_F^2 + 3\epsilon k.$$
Proceeding similarly, we can show that
$$\big|\|X\|_F^2 - \|Y\|_F^2\big| \le 3\epsilon k. \qquad (0.7)$$
Further, starting with $\|P_\Omega(Y - X)\|_F \le \|Y - X\|_F \le \epsilon$ and arguing as above, we get that
$$\big|\|P_\Omega(Y)\|_F^2 - \|P_\Omega(X)\|_F^2\big| \le 3\epsilon k. \qquad (0.8)$$
Combining inequalities (0.7) and (0.8) above, we have
$$\big|\|P_\Omega(X)\|_F^2 - p\|X\|_F^2\big| \le \big|\|P_\Omega(X)\|_F^2 - \|P_\Omega(Y)\|_F^2\big| + p\,\big|\|X\|_F^2 - \|Y\|_F^2\big| + \big|\|P_\Omega(Y)\|_F^2 - p\|Y\|_F^2\big|$$
$$\le 6\epsilon k + \delta p\|Y\|_F^2 \qquad \text{from (0.6), (0.7), (0.8)}$$
$$\le 6\epsilon k + \delta p\big(\|X\|_F^2 + 3\epsilon k\big) \qquad \text{from (0.7)}$$
$$\le 9\epsilon k + \delta p\|X\|_F^2 \le 2\delta p\|X\|_F^2, \qquad \text{since } \|X\|_F \ge 1 \text{ and } 9\epsilon k = \delta/(mn) \le \delta p.$$
The theorem now follows.
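As a sanity check on the discretization underlying Lemma 2.6 (again a sketch; $m$, $n$, $k$, $\epsilon$, and the random test matrix are our choices), the following code rounds the SVD factors of a matrix with $\|X\|_2 = 1$ to the grid with resolution $\rho$ and verifies $\|X(\rho) - X\|_F \le \epsilon$:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, k, eps = 60, 50, 2, 0.1
rho = eps / np.sqrt(9 * k**2 * m * n)

# Random rank-k matrix with ||X||_2 = 1 via its SVD factors.
U, _ = np.linalg.qr(rng.standard_normal((m, k)))
V, _ = np.linalg.qr(rng.standard_normal((n, k)))
sigma = np.sort(rng.random(k))[::-1]
sigma /= sigma[0]                      # top singular value = 1

# Empirical incoherence parameter: |U_il| <= sqrt(mu/m), |V_jl| <= sqrt(mu/n).
mu = max(np.max(U**2) * m, np.max(V**2) * n)

def round_to(A, step):
    """Round each entry down to an integer multiple of `step`."""
    return step * np.floor(A / step)

X = U @ np.diag(sigma) @ V.T
X_rho = (round_to(U, np.sqrt(mu / m) * rho)
         @ np.diag(round_to(sigma, rho))
         @ round_to(V, np.sqrt(mu / n) * rho).T)

print(np.linalg.norm(X_rho - X, 'fro'), "<=", eps)  # Frobenius error below eps
```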
Appendix C: SVP-Newton

The affine rank minimization problem is a natural generalization to matrices of the following compressed sensing problem (CSP) for vectors:
$$\min_x \|x\|_0 \quad \text{s.t. } Ax = b, \qquad (0.9)$$
where $\|x\|_0$ is the $\ell_0$ norm (the size of the support) of $x \in \mathbb{R}^n$, $A \in \mathbb{R}^{m\times n}$ is the sensing matrix, and $b \in \mathbb{R}^m$ is the vector of measurements. Like ARMP, the compressed sensing problem is NP-hard in general; also like ARMP, it can be solved for measurement matrices that satisfy the restricted isometry property. The Restricted Isometry Property (RIP) for the compressed sensing problem is analogous to the corresponding definition for matrices:
$$(1-\delta_k)\|x\|_2^2 \le \|Ax\|_2^2 \le (1+\delta_k)\|x\|_2^2, \qquad (0.10)$$
for all $x$ with $|\mathrm{support}(x)| \le k$. Note that the support of a vector equals the rank of the corresponding diagonal matrix whose diagonal is given by that vector. Thus, compressed sensing is a special case of the affine rank minimization problem (ARMP) restricted to diagonal matrices, i.e., with fixed basis vectors $U_k = I_k$, $V_k = I_k$ (see step 4 of the SVP algorithm), and SVP-Newton can be applied directly to Problem (0.9). The corresponding updates for SVP-Newton applied to compressed sensing are:
$$y^{t+1} \leftarrow x^t - \eta_t A^T(Ax^t - b), \qquad (0.11)$$
$$S \leftarrow \text{the set of the top } k \text{ elements (in magnitude) of } y^{t+1}, \qquad (0.12)$$
$$x^{t+1}_S \leftarrow \mathop{\mathrm{argmin}}_{x_S}\ \|A_S x_S - b\|_2^2, \qquad x^{t+1}_{\bar S} \leftarrow 0. \qquad (0.13)$$
We now present a theorem showing that the SVP-Newton method applied to the compressed sensing problem converges to the optimal solution in $O(\log k)$ iterations, where $k$ is the optimal support size.

Theorem 0.3. Suppose the isometry constant of $A$ in CSP satisfies $\delta_{2k} < 1/3$. Let $b = Ax^*$ for a support-$k$ vector $x^*$, i.e., $\|x^*\|_0 = k$, with $|b_i| \le 1$ for all $i$. Then the SVP-Newton algorithm for CSP (updates (0.11)-(0.13)) with step size $\eta_t = 3/4$ converges to the exact $x^*$ in at most
$$\frac{1}{\log(1/\alpha)}\,\log\frac{\|b\|_2^2}{(1-\delta_{2k})\,c^2}$$
iterations, where $c = \min_{i \in \mathrm{supp}(x^*)}|x^*_i|$ and $\alpha = \frac{1/3+\delta_{2k}}{1-\delta_{2k}}$.

Proof. Assume that $x^0 = 0$. Using Lemma 2.1 specialized to CSP (or see Theorem 2.1 in Garg and Khandekar [1]),
$$(1-\delta_{2k})\,\|x^t - x^*\|_2^2 \le \|Ax^t - b\|_2^2 \le \alpha^t\,\|b\|_2^2, \quad \text{where } \alpha = \frac{1/3+\delta_{2k}}{1-\delta_{2k}}.$$
Suppose that the support set $S$ discovered by SVP-Newton at the $t$-th step differs from the optimal support $S^*$ (the support of $x^*$) in at least one element. (Note that if $S = S^*$, then $x^t = x^*$, since $x^t$ minimizes $\|A_S x_S - b\|_2^2$.) Since $S^* \setminus S \ne \emptyset$, we have $\|x^t - x^*\|_2^2 \ge c^2$. Hence, $(1-\delta_{2k})c^2 \le \alpha^t\|b\|_2^2$, implying
$$t \le \frac{1}{\log(1/\alpha)}\,\log\frac{\|b\|_2^2}{(1-\delta_{2k})\,c^2}.$$

Note that SVP-Newton converges to the exact optimal solution in a logarithmic number of iterations, rather than to the $\epsilon$-approximate solution obtained by GraDeS [1], which is itself just a specialization of SVP to the compressed sensing problem. Note also that the number of iterations required is an improvement over the $O(k)$ bound provided by [2]. Moreover, [2] requires an incoherence property for the measurement matrix, which is a stronger condition than RIP; in fact, one can construct measurement matrices $A$ with bounded RIP constant but unbounded incoherence constant. (The incoherence property mentioned here is different from the one used in the context of matrix completion.)
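A minimal sketch of updates (0.11)-(0.13) follows (illustrative only: the Gaussian sensing matrix, dimensions, and sparsity level are assumptions of ours, and NumPy's least-squares routine stands in for the exact minimization in (0.13)):

```python
import numpy as np

def svp_newton_csp(A, b, k, eta=0.75, iters=50):
    """SVP-Newton for compressed sensing: gradient step, keep the top-k
    coordinates in magnitude, then solve least squares on that support."""
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(iters):
        y = x - eta * A.T @ (A @ x - b)                       # update (0.11)
        S = np.argsort(np.abs(y))[-k:]                        # update (0.12)
        x = np.zeros(n)
        x[S] = np.linalg.lstsq(A[:, S], b, rcond=None)[0]     # update (0.13)
    return x

# Illustrative run: recover a k-sparse vector from Gaussian measurements.
rng = np.random.default_rng(3)
m, n, k = 80, 200, 5
A = rng.standard_normal((m, n)) / np.sqrt(m)  # approximately RIP w.h.p.
x_star = np.zeros(n)
x_star[rng.choice(n, k, replace=False)] = rng.choice([-1.0, 1.0], k)
b = A @ x_star
print(np.linalg.norm(svp_newton_csp(A, b, k) - x_star))  # ~0 on success
```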
References

[1] Rahul Garg and Rohit Khandekar. Gradient descent with sparsification: an iterative algorithm for sparse recovery with restricted isometry property. In ICML, 2009.

[2] Arian Maleki. Coherence analysis of iterative thresholding algorithms. CoRR.

[3] Wikipedia. Bernstein inequalities (probability theory). URL http://en.wikipedia.org/wiki/Bernstein_inequalities_(probability_theory). [Online; accessed 16 April 2009].