Model Selection with Partly Smooth Functions

Model Selection with Partly Smooth Functions
Samuel Vaiter, Gabriel Peyré and Jalal Fadili
vaiter@ceremade.dauphine.fr
August 27, 2014, iTWIST'14
Based on: Model Consistency of Partly Smooth Regularizers, arXiv:1405.1004, 2014

Linear Inverse Problems

Forward model: y = Φ x_0 + w
Forward operator: Φ : ℝ^n → ℝ^q linear (q ≤ n), so the problem is ill-posed.
Examples: denoising, inpainting, deblurring.

Variational Regularization

Trade-off between prior regularization and data fidelity:

x⋆ ∈ Argmin_{x ∈ ℝ^n} J(x) + 1/(2λ) ‖y − Φx‖²    (P_{y,λ})

In the limit λ → 0⁺:

x⋆ ∈ Argmin_{x ∈ ℝ^n} J(x) subject to y = Φx    (P_{y,0})

J is a convex, finite-valued function, bounded from below and typically non-smooth.
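
Not part of the talk, but as a concrete illustration: with J = ‖·‖_1, the penalized problem (P_{y,λ}) can be solved by a basic proximal gradient (ISTA) iteration. A minimal NumPy sketch, where the operator Φ, the sparsity level and the regularization weight are arbitrary illustrative choices:

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(Phi, y, lam, n_iter=2000):
    """Minimize lam * ||x||_1 + 0.5 * ||y - Phi x||^2 by proximal gradient.
    Same minimizers as the slide's J(x) + 1/(2*lam) * ||y - Phi x||^2."""
    L = np.linalg.norm(Phi, 2) ** 2        # Lipschitz constant of the fidelity gradient
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ x - y)       # gradient step on the data fidelity
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Toy instance: sparse x0, random Gaussian Phi, small noise w.
rng = np.random.default_rng(0)
n, q, s = 200, 80, 5
Phi = rng.standard_normal((q, n)) / np.sqrt(q)
x0 = np.zeros(n)
x0[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
w = 0.01 * rng.standard_normal(q)
y = Phi @ x0 + w

x_star = ista(Phi, y, lam=0.05)
print("recovered support:", np.flatnonzero(np.abs(x_star) > 1e-6))
print("true support     :", np.flatnonzero(x0))
```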

Objective
[Diagram: from the unknown x_0, observe y = Φ x_0 + w, and recover a solution x⋆ close to x_0.]

Low Complexity Models

Sparsity: J(x) = Σ_{i=1,…,n} |x_i|,   M_x = { x′ : supp(x′) ⊆ supp(x) }

Group sparsity: J(x) = Σ_{b ∈ B} ‖x_b‖

Low rank: J(x) = Σ_{i=1,…,n} σ_i(x),   M_x = { x′ : rank(x′) = rank(x) }
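
For reference, the three regularizers are straightforward to evaluate; a small NumPy sketch, where the vector x, the groups B and the matrix X are made up for illustration:

```python
import numpy as np

x = np.array([1.5, 0.0, -2.0, 0.0, 0.3, 0.0])

# Sparsity: J(x) = sum_i |x_i| (the l1 norm); the model is the support of x.
J_l1 = np.sum(np.abs(x))

# Group sparsity: J(x) = sum_{b in B} ||x_b||_2 for a partition B of the indices.
B = [[0, 1], [2, 3], [4, 5]]                  # illustrative groups
J_group = sum(np.linalg.norm(x[b]) for b in B)

# Low rank: J(X) = sum_i sigma_i(X) (the nuclear norm); the model is the rank of X.
X = np.outer([1.0, 2.0, 3.0], [0.5, -1.0]) + 0.01 * np.ones((3, 2))
J_nuclear = np.sum(np.linalg.svd(X, compute_uv=False))

print(J_l1, J_group, J_nuclear)
```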

Partly Smooth Functions [Lewis 2002]

[Figure: the manifold M through x and its tangent space T_x M.]

J is partly smooth at x relative to a C² manifold M if:
  Smoothness: J restricted to M is C² around x.
  Sharpness: for every h ∈ (T_x M)^⊥, t ↦ J(x + th) is non-smooth at t = 0.
  Continuity: ∂J restricted to M is continuous around x.

Calculus rules: if J and G are partly smooth, then so are J + G, J ∘ D with D a linear operator, and J ∘ σ (spectral lift).
Examples: ‖·‖_1, ‖·‖_∞, ‖·‖_{1,2}, the nuclear norm, and max_i (⟨d_i, x⟩)_+ are partly smooth.
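
The smoothness/sharpness split can be probed numerically for J = ‖·‖_1 at a sparse point: along the support manifold, t ↦ J(x + th) is smooth (here affine), while along a transverse direction it has a kink at t = 0. A small sketch, with an arbitrary sparse point and arbitrary directions:

```python
import numpy as np

x = np.array([2.0, 0.0, -1.0, 0.0])          # sparse point, supp(x) = {0, 2}
J = lambda z: np.sum(np.abs(z))

h_tangent = np.array([1.0, 0.0, 0.5, 0.0])   # direction inside T_x M (same support)
h_normal  = np.array([0.0, 1.0, 0.0, 0.0])   # direction in (T_x M)^perp

ts = np.linspace(-0.5, 0.5, 11)
along_M    = [J(x + t * h_tangent) for t in ts]   # smooth (affine) in t around 0
transverse = [J(x + t * h_normal) for t in ts]    # |t| kink at t = 0: sharpness

print(np.round(along_M, 3))
print(np.round(transverse, 3))
```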

Dual Certificates

x⋆ ∈ Argmin_{x ∈ ℝ^n} J(x) subject to y = Φx    (P_{y,0})

Source condition: there exists p such that Φ* p ∈ ∂J(x_0).
Non-degenerate source condition: Φ* p ∈ ri ∂J(x_0).

[Figure: the subdifferential ∂J(x_0), the vector Φ* p, and the affine feasible set {x : Φx = Φx_0}.]

Proposition. There exists a dual certificate p if, and only if, x_0 is a solution of (P_{y,0}).

Linearized Precertificate

Minimal-norm certificate:  p_0 = argmin { ‖p‖ : Φ* p ∈ ∂J(x_0) }

Linearized precertificate:  p_F = argmin { ‖p‖ : Φ* p ∈ aff ∂J(x_0) }

Proposition. Assume Ker Φ ∩ T_{x_0}M = {0}. Then Φ* p_F ∈ ri ∂J(x_0) ⟹ p_F = p_0.
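
For J = ‖·‖_1, aff ∂J(x_0) fixes Φ* p on the support I to sign(x_{0,I}), so p_F = Φ_I^{+,*} sign(x_{0,I}) and the non-degeneracy test reduces to checking ‖Φ_{I^c}* p_F‖_∞ < 1 ([Fuchs 2004]). A minimal NumPy sketch of this check on a random instance (all sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n, q, s = 100, 60, 4
Phi = rng.standard_normal((q, n)) / np.sqrt(q)

x0 = np.zeros(n)
I = rng.choice(n, s, replace=False)
x0[I] = rng.standard_normal(s)

sign_I = np.sign(x0[I])
Phi_I = Phi[:, I]

# p_F = argmin ||p|| s.t. Phi_I^* p = sign(x0_I), i.e. p_F = (Phi_I^T)^+ sign(x0_I).
p_F = np.linalg.pinv(Phi_I.T) @ sign_I

# Non-degeneracy check: Phi^* p_F equals sign(x0) on I by construction (assuming
# restricted injectivity of Phi_I) and must have sup-norm < 1 outside I.
Ic = np.setdiff1d(np.arange(n), I)
crit = np.max(np.abs(Phi[:, Ic].T @ p_F))
print("||Phi_{I^c}^* p_F||_inf =", crit,
      "-> non-degenerate" if crit < 1 else "-> degenerate")
```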

Manifold Selection

Theorem. Assume J is partly smooth at x_0 relative to M, Φ* p_F ∈ ri ∂J(x_0), and Ker Φ ∩ T_{x_0}M = {0}. Then there exists C > 0 such that if max(λ, ‖w‖/λ) ≤ C, the unique solution x⋆ of (P_{y,λ}) satisfies x⋆ ∈ M and ‖x⋆ − x_0‖ = O(‖w‖).

Almost sharp analysis: Φ* p_F ∉ ∂J(x_0) ⟹ x⋆ ∉ M_{x_0}.

Previous works: [Fuchs 2004] for ℓ^1; [Bach 2008] for ℓ^1–ℓ^2 and the nuclear norm.
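
The theorem can be observed numerically for J = ‖·‖_1: with small noise and λ of the order of the noise level, the solution of (P_{y,λ}) should land on the same support as x_0 with an error proportional to ‖w‖. A rough sketch using scikit-learn's Lasso solver; the scaling λ = 2σ and the problem sizes are illustrative choices, not prescriptions from the talk:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n, q, s = 128, 64, 4
Phi = rng.standard_normal((q, n)) / np.sqrt(q)
x0 = np.zeros(n)
I = rng.choice(n, s, replace=False)
x0[I] = rng.choice([-1.0, 1.0], s) * (1 + rng.random(s))

for sigma in [0.01, 0.05, 0.1]:
    w = sigma * rng.standard_normal(q)
    y = Phi @ x0 + w
    lam = 2.0 * sigma                      # lambda of the order of the noise level
    # sklearn minimizes (1/(2q)) ||y - Phi x||^2 + alpha ||x||_1, hence alpha = lam / q.
    est = Lasso(alpha=lam / q, fit_intercept=False, max_iter=100000).fit(Phi, y)
    x_hat = est.coef_
    same_support = set(np.flatnonzero(np.abs(x_hat) > 1e-8)) == set(I)
    print(f"sigma={sigma:.0e}  support recovered: {same_support}  "
          f"error/noise: {np.linalg.norm(x_hat - x0) / np.linalg.norm(w):.2f}")
```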

Sparse Spike Deconvolution

Φx = Σ_i x_i φ(· − i),   J(x) = ‖x‖_1,   I = supp(x_0)

[Figure: a sparse spike train x_0 (spikes separated by γ) and its blurred observation Φx_0.]

Non-degeneracy: Φ* η_F ∈ ri ∂J(x_0) ⟺ ‖Φ_{I^c}* Φ_I^{+,*} sign(x_{0,I})‖_∞ < 1 ⟹ stable recovery.

[Plot: ‖η_{0,I^c}‖_∞ as a function of the spike spacing γ, crossing the threshold 1 at a critical spacing γ_crit.]
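
The criterion above can be traced as a function of the spike spacing; a small numerical sketch with a Gaussian blur kernel on a discrete grid (kernel width, grid size and the two-spike configuration are made-up choices, only meant to reproduce the qualitative behaviour around γ_crit):

```python
import numpy as np

n = 200                                        # grid size
grid = np.arange(n)
width = 4.0                                    # std of the (assumed) Gaussian kernel

# Columns of Phi: the kernel centred at each grid point, normalized to unit norm.
Phi = np.exp(-0.5 * ((grid[:, None] - grid[None, :]) / width) ** 2)
Phi /= np.linalg.norm(Phi, axis=0)

def fuchs_criterion(I, signs):
    """||Phi_{I^c}^* Phi_I^{+,*} sign(x0_I)||_inf ; < 1 means stable l1 recovery."""
    eta_F = np.linalg.pinv(Phi[:, I].T) @ signs          # linearized precertificate
    Ic = np.setdiff1d(np.arange(n), I)
    return np.max(np.abs(Phi[:, Ic].T @ eta_F))

# Two spikes of opposite signs separated by gamma grid points.
for gamma in [2, 4, 6, 8, 12, 16]:
    I = np.array([n // 2, n // 2 + gamma])
    val = fuchs_criterion(I, np.array([1.0, -1.0]))
    print(f"gamma = {gamma:2d}  criterion = {val:.3f}  "
          f"{'stable' if val < 1 else 'unstable'}")
```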

1D Total Variation and Jump Set

J(x) = ‖d x‖_1 (d the discrete derivative),   M_x = { x′ : supp(d x′) ⊆ supp(d x) },   Φ = Id

The precertificate takes the form Φ* p_F = div u for a dual vector u.

[Figure: a piecewise-constant signal x_i together with the dual vector u_k (values in [−1, 1]); jumps are labelled stable or unstable according to the behaviour of u.]

Take-away Message

Partial smoothness: encodes models using singularities.

Future Work

Extended-valued functions: minimization under constraints
  min_{x ∈ ℝ^n} ½ ‖y − Φx‖² + λ J(x) subject to x ≥ 0

Non-convexity: fidelity and regularization, dictionary learning
  min_{x_k ∈ ℝ^n, D ∈ D} Σ_k ½ ‖y_k − Φ D x_k‖² + λ J(x_k)

Infinite-dimensional problems: partial smoothness for BV, Besov spaces
  min_{f ∈ BV(Ω) ∩ L²(Ω)} ½ ‖g − Ψ f‖²_{L²(Ω)} + λ |Df|(Ω)

Compressed sensing: optimal bounds for partly smooth regularizers.

Thanks for your attention.