A Convex Approach for Designing Good Linear Embeddings. Chinmay Hegde


Redundancy in Images

Image Acquisition. Sensor: information content <<< size of signal. Can we leverage this fact to build an acquisition system that is specifically tailored to such signals?

(Linear) Dim. Reduction. Design criteria: a linear mapping that reduces dimensionality as much as possible while ensuring information preservation, i.e., a near-isometric embedding. Q. Is this possible?

(Linear) Dim. Reduction. A. Yes [Johnson-Lindenstrauss]. Suppose Φ is an M x N matrix with i.i.d. Gaussian entries and M = O(δ^-2 log Q), where Q is the number of data points. Then, w.h.p., (1 - δ) ||x_i - x_j||² ≤ ||Φ(x_i - x_j)||² ≤ (1 + δ) ||x_i - x_j||² for every pair (i, j).
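
As a quick numerical illustration of the JL phenomenon (not from the slides; the dimensions, point count, and random seed below are arbitrary choices), one can draw a Gaussian Φ and check the worst-case distortion over all normalized pairwise differences:

    import numpy as np

    rng = np.random.default_rng(0)
    N, Q, M = 1000, 50, 300                          # ambient dim., #points, embedding dim.
    X = rng.standard_normal((Q, N))                  # Q points in R^N
    Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # i.i.d. Gaussian map, scaled

    worst = 0.0
    for i in range(Q):
        for j in range(i + 1, Q):
            v = X[i] - X[j]
            v /= np.linalg.norm(v)                   # normalized pairwise difference
            worst = max(worst, abs(np.linalg.norm(Phi @ v) ** 2 - 1.0))
    print("empirical isometry constant delta:", worst)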

(Linear) Dim. Reduction. Q. Can we beat random projections? A. On the one hand, there are lower bounds for JL [Alon 03]; on the other hand, carefully constructed linear projections can often do better. This talk: an optimization-based approach for designing good linear embeddings.

Designing a Good Linear Map. Given: (normalized) pairwise differences, i.e., secants v_ij = (x_i - x_j) / ||x_i - x_j||. Want: the shortest matrix Φ (fewest rows M) such that 1 - δ ≤ ||Φ v_ij||² ≤ 1 + δ for all pairs (i, j).
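
A minimal sketch of forming the secant set from training data (the function and variable names are illustrative, not from the slides):

    import numpy as np

    def secants(X):
        """Normalized pairwise differences of the rows of X (Q points in R^N)."""
        Q = X.shape[0]
        V = []
        for i in range(Q):
            for j in range(i + 1, Q):
                d = X[i] - X[j]
                V.append(d / np.linalg.norm(d))
        return np.array(V)            # shape (Q*(Q-1)/2, N)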

Designing a Good Linear Map. Convert the quadratic constraints in Φ into linear constraints in P = Φ^T Φ (the "lifting trick"), since ||Φ v||² = v^T P v; the rank of P equals the number of rows in Φ. Use a nuclear-norm relaxation of the rank. Simplified problem: minimize ||P||_* subject to |v_ij^T P v_ij - 1| ≤ δ for all (i, j) and P ⪰ 0. [HSYB12]
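
A hedged CVXPY sketch of this lifted, relaxed problem (the solver choice and the eigenvalue tolerance are assumptions, not from the slides; for a PSD variable the nuclear norm reduces to the trace):

    import cvxpy as cp
    import numpy as np

    def numax_sdp(V, delta):
        """V: array of secants (unit-norm rows) in R^N; delta: target isometry constant."""
        N = V.shape[1]
        P = cp.Variable((N, N), PSD=True)
        # |v^T P v - 1| <= delta is linear in P once v is fixed (the lifting trick)
        constraints = [cp.abs(cp.quad_form(v, P) - 1) <= delta for v in V]
        prob = cp.Problem(cp.Minimize(cp.trace(P)), constraints)  # nuclear norm = trace for P >= 0
        prob.solve(solver=cp.SCS)
        return P.value

    def factorize(P, tol=1e-6):
        """Recover an embedding Phi from P via an eigendecomposition."""
        w, U = np.linalg.eigh(P)
        keep = w > tol
        return np.sqrt(w[keep])[:, None] * U[:, keep].T           # shape (rank, N)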

Semidefinite Formulation - Solvable via standard interior-point techniques - Rank of the solution is controlled by the isometry constant δ - Practical considerations: N large, Q very large! For an N x N matrix P with Q constraints, the per-iteration cost of interior-point solvers grows rapidly with both N and Q, which quickly becomes impractical.

Algorithm: NuMax. Alternating Direction Method of Multipliers (ADMM): - solve for P using spectral thresholding - solve for L using least-squares - solve for q by "squishing" (clipping onto the feasible interval [1 - δ, 1 + δ]). The resulting per-iteration cost is far lower than that of interior-point methods. [HSYB12]

Ext: Task Adaptivity. Can prune the secants according to the task at hand: if the goal is signal reconstruction, preserve all pairwise differences; if the goal is classification, preserve only inter-class pairwise differences. Can also preferentially weight the input vectors according to importance (connections with boosting).
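
For instance, a hedged sketch of the classification case, keeping only secants between differently labelled points (names are illustrative):

    import numpy as np

    def interclass_secants(X, y):
        """Normalized differences between points whose labels differ."""
        V = []
        for i in range(len(X)):
            for j in range(i + 1, len(X)):
                if y[i] != y[j]:
                    d = X[i] - X[j]
                    V.append(d / np.linalg.norm(d))
        return np.array(V)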

Ext: Model Adaptivity. Designing sensing matrices for redundant dictionaries. Desiderata: the holographic basis must be as incoherent as possible (in terms of mutual/average coherence) for good reconstruction [Elad06]. Upshot: coherence is measured through pairwise inner products, which lift to linear constraints in P; hence a simple variant of NuMax works.
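
To see why these constraints stay linear after lifting (the dictionary notation Ψ below is an assumption; it does not appear explicitly in the transcript): each coherence term is an inner product between projected dictionary atoms,

\[
  \langle \Phi \psi_i, \Phi \psi_j \rangle
  \;=\; \psi_i^\top \Phi^\top \Phi\, \psi_j
  \;=\; \psi_i^\top P\, \psi_j ,
  \qquad \text{so a bound } \bigl|\psi_i^\top P\, \psi_j\bigr| \le \mu
  \text{ is a linear constraint in } P .
\]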

Squares dataset. [Plot: number of measurements M vs. isometry constant δ, for NuMax, PCA, and random projections.] M = 40 linear measurements are enough to ensure an isometry constant of 0.01.

Squares: CS Reconstruction. [Plot: reconstruction MSE vs. number of measurements M for NuMax and random projections at 20 dB, 6 dB, and 0 dB.] NuMax uniformly beats random projections.
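
A hedged sketch of the kind of decoder used to compare embeddings in such an experiment (basis-pursuit denoising via CVXPY; the solver and the noise bound eps are assumptions, not from the slides):

    import cvxpy as cp

    def bp_denoise(Phi, y, eps):
        """Recover a sparse x from measurements y = Phi x + noise."""
        x = cp.Variable(Phi.shape[1])
        prob = cp.Problem(cp.Minimize(cp.norm(x, 1)),
                          [cp.norm(Phi @ x - y, 2) <= eps])
        prob.solve(solver=cp.SCS)
        return x.value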

Circles vs. Squares. [Plot: probability of classification error vs. number of measurements M for NuMax, PCA, and random projections.] NuMax beats random projections and PCA.

MNIST Dataset. M = 10 basis functions suffice to achieve δ = 0.2.

LabelMe: Image Retrieval. [Plot: percentage of preserved nearest neighbors vs. number of NN for NuMax, PCA, and random projections.] - Goal: preserve neighborhood structure - N = 512, Q = 4000; M = 45 suffices - NuMax beats random projections and PCA.
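
A hedged sketch of the retrieval metric (fraction of the k nearest neighbors in the original space that survive the embedding; k and the function names are illustrative):

    import numpy as np

    def preserved_nn(X, Phi, k):
        """Average overlap between k-NN lists computed before and after embedding."""
        def knn(Z):
            D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
            np.fill_diagonal(D, np.inf)
            return np.argsort(D, axis=1)[:, :k]
        before, after = knn(X), knn(X @ Phi.T)
        return float(np.mean([len(set(before[i]) & set(after[i])) / k
                              for i in range(len(X))]))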

(Some) Theory. Can we predict (or bound) the rank of the NuMax solution? Result: the rank r of the optimal P is bounded in terms of the number of active isometry constraints; a Pataki-type counting argument gives r(r + 1)/2 ≤ (number of active constraints) (easy: count the number of active constraints).
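
A hedged numerical check of this heuristic (the tolerance and names are arbitrary): count the constraints that are tight at the solution and compare with the numerical rank of P.

    import numpy as np

    def rank_vs_active(P, V, delta, tol=1e-4):
        """Numerical rank of P and the number of isometry constraints active at P."""
        vals = np.einsum("qi,ij,qj->q", V, P, V)              # v^T P v for each secant
        active = int(np.sum(delta - np.abs(vals - 1.0) < tol))
        rank = int(np.sum(np.linalg.eigvalsh(P) > tol))
        return rank, active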

Summary. Goal: develop a representation for data that is linear and isometric. This can be posed as a rank-minimization problem; a semidefinite program (SDP) achieves it, and NuMax achieves it very efficiently. Applications: compressive sensing, classification, retrieval, and more. [HSYB12] C. Hegde, A. C. Sankaranarayanan, W. Yin, and R. G. Baraniuk, "A Convex Approach for Learning Near-Isometric Linear Embeddings," Nov. 2012.