Robust PCA. CS5240 Theoretical Foundations in Multimedia. Leow Wee Kheng


Robust PCA CS5240 Theoretical Foundations in Multimedia Leow Wee Kheng Department of Computer Science School of Computing National University of Singapore Leow Wee Kheng (NUS) Robust PCA 1 / 52

Previously... Not robust against outliers: linear least squares, PCA. Robust against outliers: trimmed least squares, trimmed PCA. Various ways to robustify PCA: trimming: remove outliers; covariance matrix with 0-1 weights [Xu95]: similar to trimming; weighted SVD [Gabriel79]: weighting; robust error function [Torre2001]: winsorizing. Strength: simple concepts. Weakness: no guarantee of optimality. Leow Wee Kheng (NUS) Robust PCA 2 / 52

Robust PCA. PCA can be formulated as follows: Given a data matrix D, recover a low-rank matrix A from D such that the error E = D − A is minimized: min_{A,E} ‖E‖_F, subject to rank(A) ≤ r, D = A + E. (1) Here r ≤ min(m,n) is the target rank of A, and ‖·‖_F is the Frobenius norm. Notes: This definition of PCA includes dimensionality reduction. PCA is severely affected by large-amplitude noise; it is not robust. Leow Wee Kheng (NUS) Robust PCA 3 / 52
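As a concrete aside (my own illustration, not part of the slides), Problem 1 is solved by truncating the SVD of D to its r largest singular values (Eckart-Young); the numpy sketch below uses a made-up random matrix and a hypothetical function name.

```python
import numpy as np

def pca_low_rank(D, r):
    """Best rank-r approximation of D in Frobenius norm, i.e. a solution A
    of Problem (1); the residual E = D - A is what PCA discards."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    A = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]
    return A, D - A

# Toy usage: a rank-2 matrix plus small noise.
rng = np.random.default_rng(0)
D = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 30)) \
    + 0.01 * rng.standard_normal((50, 30))
A, E = pca_low_rank(D, r=2)
print(np.linalg.matrix_rank(A), np.linalg.norm(E, "fro"))
```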

[Wright2009] formulated the robust PCA problem as follows: Given a data matrix D = A + E where A and E are unknown, but A is low-rank and E is sparse, recover A. An obvious way to state the robust PCA problem in math is: Given a data matrix D, find A and E that solve the problem min_{A,E} rank(A) + λ‖E‖_0, subject to A + E = D. (2) λ is a Lagrange multiplier. ‖E‖_0: ℓ0-norm, the number of non-zero elements of E. E is sparse if ‖E‖_0 is small. Leow Wee Kheng (NUS) Robust PCA 4 / 52

min_{A,E} rank(A) + λ‖E‖_0, subject to A + E = D. (2) [Figure: the data matrix D decomposed into a low-rank matrix A plus a sparse error matrix E.] Problem 2 is a matrix recovery problem. rank(A) and ‖E‖_0 are not continuous and not convex; the problem is very hard to solve, and no efficient algorithm exists. Leow Wee Kheng (NUS) Robust PCA 5 / 52

[Candès2011, Wright2009] reformulated Problem 2 as follows: Given an m × n data matrix D, find A and E that solve min_{A,E} ‖A‖_* + λ‖E‖_1, subject to A + E = D. (3) ‖A‖_*: nuclear norm, the sum of the singular values of A; a surrogate for rank(A). ‖E‖_1: ℓ1-norm, the sum of the absolute values of the elements of E; a surrogate for ‖E‖_0. The solution of Problem 3 recovers A and E exactly if A is sufficiently low-rank but not sparse, and E is sufficiently sparse but not low-rank, with optimal λ = 1/√max(m,n). ‖A‖_* and ‖E‖_1 are convex; we can apply convex optimization. Leow Wee Kheng (NUS) Robust PCA 6 / 52
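A minimal numpy sketch (my own illustration, not from the slides) of the two convex surrogates and the suggested weight λ; the function name and dimensions are made up.

```python
import numpy as np

def rpca_objective(A, E, lam):
    """Objective of Problem (3): nuclear norm of A plus lam * l1-norm of E."""
    nuclear = np.linalg.svd(A, compute_uv=False).sum()  # sum of singular values
    l1 = np.abs(E).sum()                                # sum of |entries|
    return nuclear + lam * l1

m, n = 120, 80
lam = 1.0 / np.sqrt(max(m, n))  # the lambda suggested for exact recovery
```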

Introduction to Convex Optimization. For a differentiable function f(x), its minimizer x̂ is given by df(x̂)/dx = [∂f(x̂)/∂x_1 ⋯ ∂f(x̂)/∂x_m]^T = 0. (4) [Figure: a smooth curve f(x) with its minimum at the point x̂ where df/dx = 0.] Leow Wee Kheng (NUS) Robust PCA 7 / 52

‖E‖_1 is not differentiable when any of its elements is zero! [Figure: surface plot of f(x) = |x_1| + |x_2|, which has non-differentiable creases along the coordinate axes.] We cannot write the following because they don't exist: d‖E‖_1/dE, ∂‖E‖_1/∂e_ij, d|e_ij|/de_ij. WRONG! (5) Fortunately, ‖E‖_1 is convex. Leow Wee Kheng (NUS) Robust PCA 8 / 52

A function f(x) is convex if and only if for all x_1, x_2 and all α ∈ [0,1], f(αx_1 + (1−α)x_2) ≤ αf(x_1) + (1−α)f(x_2). (6) [Figure: a convex curve f(x); the chord joining (x_1, f(x_1)) and (x_2, f(x_2)) lies on or above the curve.] Leow Wee Kheng (NUS) Robust PCA 9 / 52

A vector g(x) is a subgradient of a convex function f at x if f(y) − f(x) ≥ g(x)^T (y − x) for all y. (7) [Figure: a convex curve with supporting lines l_1 at the differentiable point x_1 and l_2, l_3 at the non-differentiable point x_2.] At a differentiable point x_1: one unique subgradient, equal to the gradient. At a non-differentiable point x_2: multiple subgradients. Leow Wee Kheng (NUS) Robust PCA 10 / 52

The subdifferential ∂f(x) is the set of all subgradients of f at x: ∂f(x) = { g(x) : g(x)^T (y − x) ≤ f(y) − f(x), ∀y }. (8) Caution: the subdifferential ∂f(x) is not the partial derivative ∂f(x)/∂x; don't mix them up. Partial differentiation is defined for differentiable functions and is a single vector. The subdifferential is defined for convex functions, which may be non-differentiable, and is a set of vectors. Leow Wee Kheng (NUS) Robust PCA 11 / 52

Example: absolute value. |x| is not differentiable at x = 0 but is subdifferentiable at x = 0: |x| = x if x > 0, 0 if x = 0, −x if x < 0; ∂|x| = {+1} if x > 0, [−1,+1] if x = 0, {−1} if x < 0. (9) [Figures: the absolute-value function |x|, and its subdifferential, which is −1 for x < 0, +1 for x > 0, and the whole interval [−1,+1] at x = 0.] Leow Wee Kheng (NUS) Robust PCA 12 / 52

Example: ℓ1-norm of an m-dimensional vector x. ‖x‖_1 = Σ_{i=1}^m |x_i|, ∂‖x‖_1 = Σ_{i=1}^m ∂|x_i|. (10) Caution: the right-hand side of Eq. 10 is a set sum. It gives a product of m sets, one for each element x_i: ∂‖x‖_1 = J_1 × ⋯ × J_m, where J_i = {+1} if x_i > 0, [−1,+1] if x_i = 0, {−1} if x_i < 0. (11) Alternatively, we can write ∂‖x‖_1 = {g} such that g_i = +1 if x_i > 0, g_i ∈ [−1,+1] if x_i = 0, g_i = −1 if x_i < 0. (12) Leow Wee Kheng (NUS) Robust PCA 13 / 52

Another alternative: ∂‖x‖_1 = {g} such that g_i = sgn(x_i) if x_i ≠ 0, and |g_i| ≤ 1 if x_i = 0. (13) Here are some examples of subdifferentials of the 2-D ℓ1-norm: ∂f(0,0) = [−1,1] × [−1,1], ∂f(1,0) = {1} × [−1,1], ∂f(1,1) = {(1,1)}. Leow Wee Kheng (NUS) Robust PCA 14 / 52

Optimality Condition. x̂ is a minimum of f if 0 ∈ ∂f(x̂), i.e., 0 is a subgradient of f at x̂. [Figure: a convex curve with a horizontal supporting line g(x̂) = 0 at the minimum x̂.] For more details on convex functions and convex optimization, refer to [Bertsekas2003, Boyd2004, Rockafellar70, Shor85]. Leow Wee Kheng (NUS) Robust PCA 15 / 52

Back to Robust PCA. Shrinkage or soft-threshold operator: T_ε(x) = x − ε if x > ε, x + ε if x < −ε, 0 otherwise. (14) [Figure: T_ε(x) is zero on [−ε, ε] and linear with slope 1 outside that interval.] Leow Wee Kheng (NUS) Robust PCA 16 / 52
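A one-function numpy sketch of the shrinkage operator in Eq. 14 (my own; it assumes, as in the later slides, that T_ε is applied elementwise to a matrix):

```python
import numpy as np

def soft_threshold(X, eps):
    """Shrinkage operator T_eps of Eq. (14), applied elementwise:
    shift each entry toward zero by eps and clamp the small ones to zero."""
    return np.sign(X) * np.maximum(np.abs(X) - eps, 0.0)
```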

Using convex optimization, the following minimizers can be shown [Cai2010, Hale2007]: For an m × n matrix M with SVD M = USV^T, U T_ε(S) V^T = argmin_X ε‖X‖_* + (1/2)‖X − M‖_F^2, (15) T_ε(M) = argmin_X ε‖X‖_1 + (1/2)‖X − M‖_F^2. (16) Leow Wee Kheng (NUS) Robust PCA 17 / 52
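Eq. 15 is the singular value thresholding operator of [Cai2010]; a minimal numpy sketch (mine, with a hypothetical function name) that soft-thresholds the singular values and reassembles the matrix:

```python
import numpy as np

def singular_value_threshold(M, eps):
    """Minimizer of Eq. (15): soft-threshold the singular values of M,
    then rebuild the matrix from the thresholded SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - eps, 0.0)) @ Vt
```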

Solving Robust PCA. There are several ways to solve robust PCA (Problem 3) [Lin2009, Wright2009]: principal component pursuit, iterative thresholding, accelerated proximal gradient, and augmented Lagrange multipliers. Leow Wee Kheng (NUS) Robust PCA 18 / 52

Augmented Lagrange Multipliers. Consider the problem min_x f(x) subject to c_j(x) = 0, j = 1, ..., m. (17) This is a constrained optimization problem. The Lagrange multiplier method reformulates Problem 17 as min_x f(x) + Σ_{j=1}^m λ_j c_j(x) (18) with some constants λ_j. Leow Wee Kheng (NUS) Robust PCA 19 / 52

The penalty method reformulates Problem 17 as min_x f(x) + µ Σ_{j=1}^m c_j^2(x). (19) The parameter µ increases over the iterations, e.g., by a factor of 10; the penalty method needs µ → ∞ to get a good solution. The augmented Lagrange multiplier method combines Lagrange multipliers and the penalty: min_x f(x) + Σ_{j=1}^m λ_j c_j(x) + (µ/2) Σ_{j=1}^m c_j^2(x). (20) Denote λ = [λ_1 ⋯ λ_m]^T and c(x) = [c_1(x) ⋯ c_m(x)]^T. Then Problem 20 becomes min_x f(x) + λ^T c(x) + (µ/2)‖c(x)‖^2. (21) Leow Wee Kheng (NUS) Robust PCA 20 / 52

If the constraints form a matrix C = [c_jk], then define Λ = [λ_jk]. Problem 20 then becomes min_x f(x) + ⟨Λ, C(x)⟩ + (µ/2)‖C(x)‖_F^2. (22) ⟨Λ, C⟩ is the sum of products of corresponding elements: ⟨Λ, C⟩ = Σ_j Σ_k λ_jk c_jk. ‖C‖_F is the Frobenius norm: ‖C‖_F^2 = Σ_j Σ_k c_jk^2. Leow Wee Kheng (NUS) Robust PCA 21 / 52

Augmented Lagrange Multipliers (ALM) Method
1. Initialize Λ, µ > 0, ρ ≥ 1.
2. Repeat until convergence:
2.1 Compute x = argmin_x L(x), where L(x) = f(x) + ⟨Λ, C(x)⟩ + (µ/2)‖C(x)‖_F^2.
2.2 Update Λ = Λ + µC(x).
2.3 Update µ = ρµ.
What kind of optimization algorithm is this? ALM does not need µ → ∞ to get a good solution; that means it can converge faster. Leow Wee Kheng (NUS) Robust PCA 22 / 52
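To make the loop concrete, here is a tiny self-contained sketch of ALM (my own toy example, not from the slides) for the problem min ½‖x − b‖² subject to aᵀx = 0, where step 2.1 has a closed-form solution:

```python
import numpy as np

def alm_demo(a, b, mu=1.0, rho=1.5, iters=30):
    """ALM for: minimize 0.5*||x - b||^2  subject to  a^T x = 0.
    Step 2.1 is solved in closed form: (I + mu*a*a^T) x = b - lam*a."""
    lam = 0.0
    x = b.copy()
    for _ in range(iters):
        x = np.linalg.solve(np.eye(b.size) + mu * np.outer(a, a), b - lam * a)
        lam += mu * (a @ x)   # step 2.2: update the multiplier
        mu *= rho             # step 2.3: increase the penalty
    return x

a = np.array([1.0, 2.0, -1.0])
b = np.array([3.0, 1.0, 2.0])
x = alm_demo(a, b)
print(x, a @ x)  # a @ x is (numerically) close to 0
```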

With ALM, robust PCA (Problem 3) is reformulated as min_{A,E} ‖A‖_* + λ‖E‖_1 + ⟨Y, D − A − E⟩ + (µ/2)‖D − A − E‖_F^2, (23) where Y is the matrix of Lagrange multipliers and the constraint matrix is C = D − A − E. To implement Step 2.1 of ALM, we need to minimize over both A and E; adopt alternating optimization. Leow Wee Kheng (NUS) Robust PCA 23 / 52

Trace of a Matrix. The trace of a matrix A is the sum of its diagonal elements a_ii: tr(A) = Σ_{i=1}^n a_ii. (24) Strictly speaking, a scalar c and a 1 × 1 matrix [c] are not the same thing. Nevertheless, since tr([c]) = c, we often write tr(c) = c for simplicity. Properties: tr(A) = Σ_{i=1}^n λ_i, where the λ_i are the eigenvalues of A. tr(A + B) = tr(A) + tr(B). tr(cA) = c tr(A). tr(A) = tr(A^T). Leow Wee Kheng (NUS) Robust PCA 24 / 52

tr(ABCD) = tr(BCDA) = tr(CDAB) = tr(DABC). tr(X^T Y) = tr(XY^T) = tr(Y^T X) = tr(YX^T) = Σ_i Σ_j x_ij y_ij. Leow Wee Kheng (NUS) Robust PCA 25 / 52
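A quick numerical sanity check of the last identity (my own aside; the random matrices are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
X, Y = rng.standard_normal((4, 3)), rng.standard_normal((4, 3))
# tr(X^T Y) equals the sum of elementwise products, up to round-off.
print(np.trace(X.T @ Y), np.sum(X * Y))
```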

From Problem 23, the minimizing E with the other variables fixed is given by
min_E λ‖E‖_1 + ⟨Y, D − A − E⟩ + (µ/2)‖D − A − E‖_F^2
min_E λ‖E‖_1 + tr(Y^T(D − A − E)) + (µ/2)‖D − A − E‖_F^2
min_E λ‖E‖_1 + tr(−Y^T E) + (µ/2) tr((D − A − E)^T (D − A − E)) (Homework)
min_E λ‖E‖_1 + (µ/2) tr((E − (D − A + Y/µ))^T (E − (D − A + Y/µ)))
min_E λ‖E‖_1 + (µ/2)‖E − (D − A + Y/µ)‖_F^2
min_E (λ/µ)‖E‖_1 + (1/2)‖E − (D − A + Y/µ)‖_F^2. (25)
These problems are equivalent: terms that do not depend on E are dropped, and the last line rescales the objective by 1/µ. Leow Wee Kheng (NUS) Robust PCA 26 / 52

Compare Eq. 16 and Problem 25: T_ε(M) = argmin_X ε‖X‖_1 + (1/2)‖X − M‖_F^2, versus min_E (λ/µ)‖E‖_1 + (1/2)‖E − (D − A + Y/µ)‖_F^2. Set X = E, M = D − A + Y/µ and ε = λ/µ. Then E = T_{λ/µ}(D − A + Y/µ). (26) Leow Wee Kheng (NUS) Robust PCA 27 / 52

From Problem 23, the minimizing A with the other variables fixed is given by
min_A ‖A‖_* + ⟨Y, D − A − E⟩ + (µ/2)‖D − A − E‖_F^2
min_A ‖A‖_* + tr(Y^T(D − A − E)) + (µ/2)‖D − A − E‖_F^2 (Homework)
min_A (1/µ)‖A‖_* + (1/2)‖A − (D − E + Y/µ)‖_F^2. (27)
Compare Problem 27 and Eq. 15: U T_ε(S) V^T = argmin_X ε‖X‖_* + (1/2)‖X − M‖_F^2. Set X = A, M = D − E + Y/µ and ε = 1/µ, where USV^T is the SVD of D − E + Y/µ. Then A = U T_{1/µ}(S) V^T. (28) Leow Wee Kheng (NUS) Robust PCA 28 / 52

Robust PCA for Matrix Recovery via ALM
Inputs: D.
1. Initialize A = 0, E = 0.
2. Initialize Y, µ > 0, ρ > 1.
3. Repeat until convergence:
4.   Repeat until convergence:
5.     U, S, V = SVD(D − E + Y/µ).
6.     A = U T_{1/µ}(S) V^T.
7.     E = T_{λ/µ}(D − A + Y/µ).
8.   Update Y = Y + µ(D − A − E).
9.   Update µ = ρµ.
Outputs: A, E. Leow Wee Kheng (NUS) Robust PCA 29 / 52

Typical initialization [Lin2009]: Y = sgn(D)/J(sgn(D)), where sgn(D) gives the sign of each element of D, and J(·) gives a scaling factor: J(X) = max(‖X‖_2, λ⁻¹‖X‖_∞). ‖X‖_2 is the spectral norm, the largest singular value of X; ‖X‖_∞ is the largest absolute value of the elements of X. µ = 1.25/‖D‖_2, ρ = 1.5, and λ = 1/√max(m,n) for an m × n matrix D. Leow Wee Kheng (NUS) Robust PCA 30 / 52
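Putting the pieces together, a compact numpy sketch of the matrix-recovery algorithm above with this initialization (my own rendering; for brevity it runs one A/E pass per outer iteration, i.e. an inexact variant of the nested loops, and the stopping tolerance and iteration cap are arbitrary choices):

```python
import numpy as np

def soft_threshold(X, eps):
    return np.sign(X) * np.maximum(np.abs(X) - eps, 0.0)

def rpca_alm(D, rho=1.5, tol=1e-7, max_iter=500):
    """Robust PCA by ALM (sketch of slides 29-30): split D into a
    low-rank A and a sparse E with A + E ~= D."""
    m, n = D.shape
    lam = 1.0 / np.sqrt(max(m, n))
    S = np.sign(D)
    Y = S / max(np.linalg.norm(S, 2), np.abs(S).max() / lam)  # Y = sgn(D)/J(sgn(D))
    mu = 1.25 / np.linalg.norm(D, 2)
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    for _ in range(max_iter):
        # A-update (Eq. 28): singular value thresholding of D - E + Y/mu.
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        A = U @ np.diag(np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # E-update (Eq. 26): elementwise shrinkage of D - A + Y/mu.
        E = soft_threshold(D - A + Y / mu, lam / mu)
        R = D - A - E
        Y = Y + mu * R        # step 8
        mu = rho * mu         # step 9
        if np.linalg.norm(R, "fro") <= tol * np.linalg.norm(D, "fro"):
            break
    return A, E
```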

Example: recovery of video background. [Figure: video frames are vectorized and stacked as the columns of the data matrix D.] Leow Wee Kheng (NUS) Robust PCA 31 / 52

Sample results: [Figure: columns of the data matrix with the corresponding low-rank component and error component, shown for two examples.] Leow Wee Kheng (NUS) Robust PCA 32 / 52

Example: removal of specular reflection and shadow. [Figure: input images D, recovered low-rank component A, and error component E.] Leow Wee Kheng (NUS) Robust PCA 33 / 52

Fixed-Rank Robust PCA. In reflection removal, the reflection may be global. [Figure: ground truth and input image with a global reflection.] Then E is not sparse: this violates the RPCA condition! But the rank of A is 1. Fix the rank of A to deal with non-sparse E [Leow2013]. Leow Wee Kheng (NUS) Robust PCA 34 / 52

Fixed-Rank Robust PCA via ALM
Inputs: D.
1. Initialize A = 0, E = 0.
2. Initialize Y, µ > 0, ρ > 1.
3. Repeat until convergence:
4.   Repeat until convergence:
5.     U, S, V = SVD(D − E + Y/µ).
6.     If rank(T_{1/µ}(S)) < r, A = U T_{1/µ}(S) V^T; else A = U S_r V^T.
7.     E = T_{λ/µ}(D − A + Y/µ).
8.   Update Y = Y + µ(D − A − E).
9.   Update µ = ρµ.
Outputs: A, E.
S_r is S with the last m − r singular values set to 0. Leow Wee Kheng (NUS) Robust PCA 35 / 52
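The only change from the matrix-recovery algorithm is step 6; a small numpy sketch of that branch (my own; the function name and argument layout are assumptions):

```python
import numpy as np

def fixed_rank_A_update(D, E, Y, mu, r):
    """Steps 5-6 of fixed-rank RPCA: soft-threshold the singular values,
    but if thresholding would leave rank >= r, keep exactly the r largest
    (unthresholded) singular values, i.e. use S_r."""
    U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
    s_thr = np.maximum(s - 1.0 / mu, 0.0)
    if np.count_nonzero(s_thr) < r:
        s_new = s_thr                                    # first branch of step 6
    else:
        s_new = np.where(np.arange(s.size) < r, s, 0.0)  # S_r
    return U @ np.diag(s_new) @ Vt
```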

Example: removal of local reflections. [Figure: ground truth, input, FRPCA result, RPCA result.] Leow Wee Kheng (NUS) Robust PCA 36 / 52

Example: removal of global reflections. [Figure: ground truth, input, FRPCA result, RPCA result.] Leow Wee Kheng (NUS) Robust PCA 37 / 52

Example: removal of global reflections (second example). [Figure: ground truth, input, FRPCA result, RPCA result.] Leow Wee Kheng (NUS) Robust PCA 38 / 52

Example: background recovery for traffic video with fast-moving vehicles. [Figure: input, FRPCA result, RPCA result.] Leow Wee Kheng (NUS) Robust PCA 39 / 52

Example: background recovery for traffic video with slow-moving vehicles. [Figure: input, FRPCA result, RPCA result.] Leow Wee Kheng (NUS) Robust PCA 40 / 52

Example: background recovery for traffic video with a temporary stop. [Figure: input, FRPCA result, RPCA result.] Leow Wee Kheng (NUS) Robust PCA 41 / 52

Matrix Completion. Customers are asked to rate movies from 1 (poor) to 5 (excellent). [Table: customer-by-movie rating matrix with missing entries; e.g., customer A gave ratings 5, 5, 3, 2; B gave 3, 4, 3; C gave 3, 4, 5, 4; D gave 4, 5, 5, 3; each left some movies unrated.] Customers rate only some movies, so some data are missing. How do we estimate the missing data? Matrix completion. Leow Wee Kheng (NUS) Robust PCA 42 / 52

Let D denote the data matrix with missing elements set to 0, and let M = {(i,j)} denote the indices of the missing elements of D. Then the matrix completion problem can be formulated as: Given D and M, find the matrix A that solves min_A ‖A‖_* subject to A + E = D, E_ij = 0 ∀(i,j) ∉ M. (29) For (i,j) ∉ M, constraining E_ij = 0 forces A_ij = D_ij: no change. For (i,j) ∈ M, D_ij = 0, i.e., A_ij = −E_ij: the recovered value. Reformulating Problem 29 using augmented Lagrange multipliers gives min_{A,E} ‖A‖_* + ⟨Y, D − A − E⟩ + (µ/2)‖D − A − E‖_F^2 (30) such that E_ij = 0 ∀(i,j) ∉ M. Leow Wee Kheng (NUS) Robust PCA 43 / 52

Robust PCA for Matrix Completion
Inputs: D.
1. Initialize A = 0, E = 0.
2. Initialize Y, µ > 0, ρ > 1.
3. Repeat until convergence:
4.   Repeat until convergence:
5.     U, S, V = SVD(D − E + Y/µ).
6.     A = U T_{1/µ}(S) V^T.
7.     E = Γ_M(D − A + Y/µ), where Γ_M(X)_ij = X_ij for (i,j) ∈ M and 0 for (i,j) ∉ M.
8.   Update Y = Y + µ(D − A − E).
9.   Update µ = ρµ.
Outputs: A, E. Leow Wee Kheng (NUS) Robust PCA 44 / 52
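Compared with matrix recovery, only the E-update changes; a small numpy sketch of step 7 (my own; `missing_mask` is an assumed boolean array that is True on the indices in M):

```python
import numpy as np

def completion_E_update(D, A, Y, mu, missing_mask):
    """Step 7 of matrix completion: the projection Gamma_M keeps the
    residual only on missing entries and forces E = 0 on observed ones."""
    X = D - A + Y / mu
    return np.where(missing_mask, X, 0.0)
```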

Example: recovery of occluded parts in face images. [Figure: input, ground truth, matrix completion result, matrix recovery result.] Leow Wee Kheng (NUS) Robust PCA 45 / 52

Summary. Robustification of PCA: trimming → trimmed PCA; 0-1 covariance weights → trimmed PCA; weighted SVD → weighted PCA; robust error function → winsorized PCA. Leow Wee Kheng (NUS) Robust PCA 46 / 52

Summary. Low-rank + sparse matrix decomposition: PCA → robust PCA (matrix recovery); fixing the rank of A → fixed-rank robust PCA; fixing elements of E → robust PCA (matrix completion). Leow Wee Kheng (NUS) Robust PCA 47 / 52

Probing Questions. If the data matrix of a problem is composed of a low-rank matrix, a sparse matrix and something else, can you still use robust PCA methods? If yes, how? If no, why not? In applying robust PCA to high-resolution colour image processing, the data matrix contains three times as many rows as the number of pixels in the images, which can lead to a very large data matrix that takes a long time to process. Suggest a way to overcome this problem. In applying robust PCA to video processing, the data matrix contains as many columns as the number of video frames, which can lead to a very large data matrix that requires more memory than is available to store it. Suggest a way to overcome this problem. Leow Wee Kheng (NUS) Robust PCA 48 / 52

Homework
1. Show that tr(X^T Y) = Σ_i Σ_j x_ij y_ij.
2. Show that the following two optimization problems are equivalent:
min_E λ‖E‖_1 + tr(−Y^T E) + (µ/2) tr((D − A − E)^T (D − A − E))
min_E λ‖E‖_1 + (µ/2) tr((E − (D − A + Y/µ))^T (E − (D − A + Y/µ)))
3. Show that the minimizing A of Problem 23 with the other variables fixed is given by min_A (1/µ)‖A‖_* + (1/2)‖A − (D − E + Y/µ)‖_F^2.
Leow Wee Kheng (NUS) Robust PCA 49 / 52

References I References 1. D. P. Bertsekas. Convex Analysis and Optimization. Athena Scientific, 2003. 2. S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004. 3. J. F. Cai, E. J. Candès, and Z. Shen. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4):1956-1982, 2010. 4. E. J. Candès, X. Li, Y. Ma, and J. Wright. Robust principal component analysis? Journal of the ACM, 58(3), 2011. 5. F. De la Torre and M. J. Black. Robust principal component analysis for computer vision. In Proc. Int. Conf. on Computer Vision, 2001. 6. R. T. Rockafellar. Convex Analysis. Princeton University Press, 1970. Leow Wee Kheng (NUS) Robust PCA 50 / 52

References II References 7. K. Gabriel and S. Zamir. Lower rank approximation of matrices by least squares with any choice of weights. Technometrics, 21:489-498, 1979. 8. E. T. Hale, W. Yin, and Y. Zhang. A fixed-point continuation method for ℓ1-regularized minimization with applications to compressed sensing. Tech. Report TR07-07, Dept. of Computational and Applied Mathematics, Rice University, 2007. 9. W. K. Leow, Y. Cheng, L. Zhang, T. Sim, and L. Foo. Background recovery by fixed-rank robust principal component analysis. In Proc. Int. Conf. on Computer Analysis of Images and Patterns, 2013. 10. Z. Lin, M. Chen, L. Wu, and Y. Ma. The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. Tech. Report UILU-ENG-09-2215, UIUC, 2009. 11. N. Z. Shor. Minimization Methods for Non-Differentiable Functions. Springer-Verlag, 1985. Leow Wee Kheng (NUS) Robust PCA 51 / 52

References III References 12. J. Wright, Y. Peng, Y. Ma, A. Ganesh, and S. Rao. Robust principal component analysis: Exact recovery of corrupted low-rank matrices by convex optimization. In Proc. Conf. on Neural Information Processing Systems, 2009. 13. L. Xu and A. Yuille. Robust principal component analysis by self-organizing rules based on statistical physics approach. IEEE Trans. Neural Networks, 6(1):131-143, 1995. Leow Wee Kheng (NUS) Robust PCA 52 / 52