Sparsifying Transform Learning for Compressed Sensing MRI


Sparsifying Transform Learning for Compressed Sensing MRI. Saiprasad Ravishankar and Yoram Bresler, Department of Electrical and Computer Engineering and Coordinated Science Laboratory, University of Illinois at Urbana-Champaign. April 8, 2013

Outline: Why compressed sensing MRI (CSMRI)? Nonadaptive CSMRI; synthesis dictionary learning MRI; transform vs. synthesis model; transform learning MRI: formulations, algorithms, results (static MRI); conclusions.

Motivation for Compressed Sensing MRI. Data are samples in k-space, acquired sequentially in time; the acquisition rate is limited by MR physics, etc. CS allows recovery of images from limited measurements: sparsity in a transform domain or dictionary, acquisition incoherent with the sparse model, and non-linear, non-convex reconstruction. Fig. from Lustig et al. '07.

Compressed Sensing MRI (Nonadaptive). min_x ||F_u x − y||_2^2 + λ ||Ψx||_1 (1), where x ∈ C^P is the image as a vector, y ∈ C^m the measurements, F_u ∈ C^{m×P} the undersampled Fourier encoding matrix (m < P), and Ψ ∈ C^{T×P} a global, orthonormal transform. A Total Variation penalty is also added to (1) [Lustig et al. '07]. CSMRI with non-adaptive transforms is limited to 2.5-3 fold undersampling [Ma et al. '08].
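As an editorial illustration of how (1) can be attacked, the sketch below applies plain proximal-gradient descent (ISTA) with F_u realized as a masked orthonormal FFT. The LDP method of Lustig et al. '07 instead uses nonlinear conjugate gradient, so this is only an assumed, minimal solver; the function names and interface are illustrative, not from the talk.

import numpy as np

def ista_csmri(y_kspace, mask, Psi, PsiH, lam, step, iters=200):
    # Minimal ISTA sketch for (1): min_x ||F_u x - y||_2^2 + lam ||Psi x||_1.
    # Assumptions: F_u = masked orthonormal 2D FFT; y_kspace holds the measured
    # samples on the full grid (zeros off the mask); Psi/PsiH apply an
    # orthonormal sparsifying transform and its inverse.
    x = np.zeros(mask.shape, dtype=complex)
    for _ in range(iters):
        # Gradient step on the data-fidelity term: x <- x - step * F_u^H (F_u x - y)
        resid = mask * np.fft.fft2(x, norm="ortho") - y_kspace
        x = x - step * np.fft.ifft2(mask * resid, norm="ortho")
        # Proximal step for lam ||Psi x||_1: soft-threshold in the Psi domain
        z = Psi(x)
        z = z * np.maximum(1.0 - step * lam / np.maximum(np.abs(z), 1e-12), 0.0)
        x = PsiH(z)
    return x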

Synthesis Dictionary Learning. The DL problem: min_{D,{α_j}} Σ_j ||R_j x − D α_j||_2^2 s.t. ||α_j||_0 ≤ s ∀ j (2), where R_j ∈ C^{n×P} extracts the j-th √n × √n patch of x, D ∈ C^{n×K} is a patch-based dictionary, α_j ∈ C^K is sparse with R_j x ≈ D α_j, and s is the sparsity level. The DL problem is NP-hard. Algorithms such as K-SVD [Aharon et al. '06] alternate between finding D and {α_j}.
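For reference, a minimal numpy sketch of one standard way to carry out the synthesis sparse coding step in (2), via orthogonal matching pursuit (OMP). OMP is the usual greedy companion of K-SVD, but the exact routine used in the talk's experiments is not specified, so treat this as an assumption; the function name is illustrative.

import numpy as np

def omp_sparse_code(D, y, s):
    # Greedy OMP sketch for min_alpha ||y - D alpha||_2^2 s.t. ||alpha||_0 <= s.
    K = D.shape[1]
    support = []
    residual = y.astype(complex)
    coef = np.zeros(0, dtype=complex)
    for _ in range(s):
        j = int(np.argmax(np.abs(D.conj().T @ residual)))  # atom most correlated with residual
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)  # refit on current support
        residual = y - D[:, support] @ coef
    alpha = np.zeros(K, dtype=complex)
    alpha[support] = coef
    return alpha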

Learning Dictionaries from Undersampled Data (DLMRI) [Ravishankar & Bresler '11]. (P0) min_{x,D,{α_j}} Σ_j ||R_j x − D α_j||_2^2 + ν ||F_u x − y||_2^2 s.t. ||α_j||_0 ≤ s ∀ j, where the first term is the sparse fitting term and the second the data fidelity term. (P0) learns D, and reconstructs x, from only the undersampled y. But (P0) is NP-hard, and non-convex even if the l0 quasi-norm is relaxed to l1. DLMRI solves (P0) by alternating between DL (solving for D, {α_j}) and a reconstruction update (solving for x).

2D Random Sampling Example - 6x undersampling. [Figure: LDP reconstruction (22 dB) and error magnitude vs. DLMRI reconstruction (32 dB) and error magnitude.] Data from Miki Lustig, UC Berkeley. LDP: Lustig, Donoho, and Pauly ('07).

Drawbacks of DLMRI. DLMRI computations do not scale well: O(K n^2 P) for a P-pixel image and D ∈ C^{n×K}. The cost is dominated by dictionary learning, particularly sparse coding, which by itself is an NP-hard problem. DL algorithms such as K-SVD can get stuck in bad local minima or even saddle points. Can we learn better, more efficient sparse models for MR images?

Synthesis Model for Sparse Representation. Given a signal y ∈ C^n and a dictionary D ∈ C^{n×K}, we assume y = Dx with ||x||_0 ≪ K. Real-world signals are modeled as y = Dx + e, where e is a deviation term. Given D and sparsity level s, x̂ = argmin_x ||y − Dx||_2^2 s.t. ||x||_0 ≤ s. This is the NP-hard synthesis sparse coding problem. Greedy and l1-relaxation algorithms are computationally expensive.

Transform Model for Sparse Representation. Given a signal y ∈ C^n and a transform W ∈ C^{m×n}, we model Wy = x + η with ||x||_0 ≪ m and η an error term. Natural images are approximately sparse in wavelets and the DCT. Given W and sparsity s, transform sparse coding is x̂ = argmin_x ||Wy − x||_2^2 s.t. ||x||_0 ≤ s. x̂ is computed exactly by thresholding Wy (retaining the s largest-magnitude coefficients). Sparse coding is cheap! The signal is recovered as W† x̂. Sparsifying transforms are exploited for compression (JPEG2000), etc.
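A minimal numpy sketch of this thresholding-based sparse coding (the function name is illustrative, not from the talk):

import numpy as np

def transform_sparse_code(W, y, s):
    # Exact transform sparse coding: keep the s largest-magnitude entries of W y
    # and zero the rest; no iterative solver is needed.
    z = W @ y
    x_hat = np.zeros_like(z)
    keep = np.argsort(np.abs(z))[-s:]   # indices of the s largest magnitudes
    x_hat[keep] = z[keep]
    return x_hat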

Square Transform Learning. (P1) min_{W,{α_j}} Σ_j ||W R_j x − α_j||_2^2 + λ (− log det W + ||W||_F^2) s.t. ||α_j||_0 ≤ s ∀ j. The sparsification error (first term) measures the deviation of each patch in the transform domain from perfect sparsity; the regularizers are weighted by λ > 0. The − log det W restricts the solution to full-rank transforms, and ||W||_F^2 keeps the objective function bounded from below. Minimizing λ(− log det W + ||W||_F^2) encourages reduction of the condition number; the solution to (P1) is perfectly conditioned (κ = 1) as λ → ∞. (P1) is non-convex.
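For concreteness, a small sketch evaluating the (P1) objective for given iterates. Using |det W| inside the log (so it is defined for any non-singular real W) and stacking patches and codes as matrix columns are assumptions about the convention; names are illustrative.

import numpy as np

def p1_objective(W, X, A, lam):
    # X: n x N patches R_j x as columns; A: n x N sparse codes alpha_j as columns.
    sparsification_error = np.linalg.norm(W @ X - A, 'fro') ** 2
    regularizer = -np.log(np.abs(np.linalg.det(W))) + np.linalg.norm(W, 'fro') ** 2
    return sparsification_error + lam * regularizer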

Transform Learning (TL) Algorithm. The algorithm for (P1) alternates between updating {α_j} and W. Sparse Coding Step: solve (P1) with fixed W, min_{{α_j}} Σ_j ||W R_j x − α_j||_2^2 s.t. ||α_j||_0 ≤ s ∀ j (3). This is an easy problem: the solution α̂_j is computed exactly by thresholding W R_j x and retaining the s largest-magnitude coefficients. Transform Update Step: solve (P1) with fixed α_j's, min_W Σ_j ||W R_j x − α_j||_2^2 − λ log det W + μ ||W||_F^2 (4). Closed-form solution: Ŵ = 0.5 U (Σ + (Σ^2 + 2λ I_n)^{1/2}) Q^H L^{−1}, where Σ_j (R_j x)(R_j x)^H + μ I_n = L L^H, and L^{−1} Σ_j (R_j x) α_j^H = Q Σ U^H (a full SVD).
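The transform update step can be coded directly from this closed-form expression. Below is a sketch for the real-valued case, with the patches R_j x stacked as the columns of X and the codes α_j as the columns of A; the function name and interface are illustrative assumptions.

import numpy as np

def transform_update(X, A, lam, mu):
    # Closed-form minimizer of (4):
    #   W = 0.5 U (Sigma + (Sigma^2 + 2 lam I)^{1/2}) Q^T L^{-1},
    # where X X^T + mu I = L L^T and L^{-1} X A^T = Q Sigma U^T (full SVD).
    n = X.shape[0]
    L = np.linalg.cholesky(X @ X.T + mu * np.eye(n))
    Linv = np.linalg.inv(L)
    Q, sig, Ut = np.linalg.svd(Linv @ X @ A.T)
    U = Ut.T
    scale = 0.5 * (sig + np.sqrt(sig ** 2 + 2.0 * lam))
    return U @ np.diag(scale) @ Q.T @ Linv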

TL Properties. The objective converges for our exact alternating algorithm. Empirical evidence suggests convergence to the same objective value regardless of initialization. The computational cost of TL, O(M N n^2) for N training signals, M iterations, and W ∈ C^{n×n}, is significantly lower than the cost of DL, O(M N n^3): a reduction in order by a factor of n for √n × √n patches. Large values of λ enforce well-conditioning of the transform.

Transform Learning MRI (TLMRI). (P2) min_{x,W,{α_j}} Σ_j ||W R_j x − α_j||_2^2 + λ Q(W) + ν ||F_u x − y||_2^2 s.t. ||α_j||_0 ≤ s ∀ j. This is similar to the DLMRI formulation, but uses the transform model; here Q(W) = − log det W + ||W||_F^2. We modify (P2) by introducing extra variables x̂_j in a penalty-type formulation, which leads to efficient algorithms: (P2) min_{x,W,{x̂_j},{α_j}} Σ_j ||W x̂_j − α_j||_2^2 + λ Q(W) + ν ||F_u x − y||_2^2 + τ Σ_j ||R_j x − x̂_j||_2^2 s.t. ||α_j||_0 ≤ s ∀ j. The penalty Σ_j ||R_j x − x̂_j||_2^2 will also help us adaptively choose the sparsities s_j.

TLMRI Algorithm - Denoising Step. (P2) is solved using alternating minimization. For a given (corrupted) image x, (P2) reduces to a denoising problem, with the x̂_j as denoised patches. Denoising Step - (P3): min_{W,{x̂_j},{α_j}} Σ_j ||W x̂_j − α_j||_2^2 + λ Q(W) + τ Σ_j ||R_j x − x̂_j||_2^2 s.t. ||α_j||_0 ≤ s_j ∀ j. Denoising involves: transform learning (solving for W and {α_j} with the x̂_j fixed at R_j x and s_j = s), and a variable sparsity patch update (solving for the x̂_j and s_j).

TLMRI Algorithm - Denoising Step. The variable sparsity patch update involves solving (P3a): min_{{x̂_j}} Σ_j ||W x̂_j − H_{s_j}(W R_j x)||_2^2 + τ Σ_j ||R_j x − x̂_j||_2^2, where H_s(b) retains the s largest-magnitude elements of b ∈ C^n and zeros the rest. For fixed s_j, (P3a) reduces to separate least squares problems in each x̂_j. As s_j ↗ n, the denoising error ||R_j x − x̂_j^{LS}||_2 ↘ 0, where x̂_j^{LS} is the least squares solution for a specific s_j. We pick s_j so that the error is below a threshold C; this can be done efficiently [Ravishankar & Bresler '12]. C decreases over iterations, as the iterates become more refined.
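A brute-force numpy sketch of this variable-sparsity patch update follows; the efficient implementation of [Ravishankar & Bresler '12] avoids re-solving from scratch for every candidate s_j, so this is only illustrative, and the names are assumptions.

import numpy as np

def variable_sparsity_patch_update(W, patch, tau, C):
    # For each candidate sparsity s, solve the least squares problem
    #   min_xhat ||W xhat - H_s(W patch)||_2^2 + tau ||patch - xhat||_2^2
    # and return the smallest s whose denoising error ||patch - xhat|| <= C.
    n = patch.size
    z = W @ patch
    order = np.argsort(np.abs(z))[::-1]                    # coefficients by decreasing magnitude
    M = np.linalg.inv(W.conj().T @ W + tau * np.eye(n))    # reused normal-equations inverse
    for s in range(1, n + 1):
        zs = np.zeros_like(z)
        zs[order[:s]] = z[order[:s]]                       # H_s(W patch)
        xhat = M @ (W.conj().T @ zs + tau * patch)         # LS solution for this s
        if np.linalg.norm(patch - xhat) <= C:
            break
    return xhat, s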

TLMRI Algorithm - Reconstruction Update Step. Reconstruction Update Step - (P4): min_x τ Σ_j ||R_j x − x̂_j||_2^2 + ν ||F_u x − y||_2^2. The update is performed directly in k-space: Fx(k_x, k_y) = S(k_x, k_y) for (k_x, k_y) ∉ Ω, and Fx(k_x, k_y) = (S(k_x, k_y) + ν' S_0(k_x, k_y)) / (1 + ν') for (k_x, k_y) ∈ Ω (5). Here Fx(k_x, k_y) is the updated k-space value, S_0(k_x, k_y) the measured value, and Ω the subset of k-space sampled; S = F (Σ_j R_j^H x̂_j)/β and ν' = ν/(τβ), with β the number of patches covering any pixel.
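A minimal numpy sketch of the k-space update (5), assuming a 2D image, an orthonormal 2D FFT for F, the measured k-space S_0 given on the full grid, and a boolean mask for Ω (the interface and names are illustrative):

import numpy as np

def reconstruction_update(patch_avg, S0, omega, nu, tau, beta):
    # patch_avg = (1/beta) * sum_j R_j^H xhat_j  (patch-averaged image estimate)
    # omega     = boolean mask of sampled k-space locations
    S = np.fft.fft2(patch_avg, norm="ortho")
    nu_p = nu / (tau * beta)                               # nu' = nu / (tau * beta)
    Fx = S.copy()
    Fx[omega] = (S[omega] + nu_p * S0[omega]) / (1.0 + nu_p)
    return np.fft.ifft2(Fx, norm="ortho")                  # updated image x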

TLMRI Algorithm Properties. Every step of our algorithm involves efficient closed-form solutions. The per-iteration computational cost of TLMRI is lower than that of DLMRI in order by a factor of n (the patch size).

Cartesian Sampling with 4x undersampling. [Figure: original image; zero-filling reconstruction (PSNR = 28.94 dB); PSNR vs. iteration number for TLMRI, DLMRI, LDP, and zero filling.] TLMRI with a square transform is better and 12x faster than 4x overcomplete DLMRI with 6×6 patches. TLMRI is significantly better and also faster than LDP, which employs fixed transforms.

Cartesian Sampling with 4x undersampling. [Figure: TLMRI reconstruction (PSNR = 32.54 dB) and error map; DLMRI reconstruction (PSNR = 32.40 dB) and error map.]

Unconstrained TLMRI. The TLMRI algorithm requires setting error thresholds for the variable sparsity update, and uses a penalty-method-type approach. An alternative scheme employs an l0 penalty instead, and uses the augmented Lagrangian: (P5) min_{x,W,{x̂_j},{α_j},μ} Σ_j ||W x̂_j − α_j||_2^2 + λ Q(W) + ν ||F_u x − y||_2^2 + η^2 Σ_j ||α_j||_0 + Σ_j Re{μ_j^H (R_j x − x̂_j)} + (τ/2) Σ_j ||R_j x − x̂_j||_2^2, where μ is a Lagrange multiplier matrix with the μ_j ∈ C^n as columns. This is an unconstrained formulation, which is still non-convex. We solve it using the alternating direction method of multipliers (ADMM). We can group some terms together in (P5) by working with the scaled multipliers μ̃_j = −μ_j/τ.

Algorithm for Unconstrained TLMRI. Update of the α_j uses simple hard thresholding of W x̂_j with threshold level η: min_{{α_j}} Σ_j ||W x̂_j − α_j||_2^2 + η^2 Σ_j ||α_j||_0 (6). Update of W uses the closed-form solution: min_W Σ_j ||W x̂_j − α_j||_2^2 − λ log det W + λ ||W||_F^2 (7). Update of {x̂_j} involves a least squares problem in each x̂_j: min_{{x̂_j}} Σ_j ||W x̂_j − α_j||_2^2 + (τ/2) Σ_j ||R_j x − x̂_j − μ̃_j||_2^2 (8). Update of the multipliers: μ̃_j ← μ̃_j − (R_j x − x̂_j). The update of x is done efficiently in k-space: min_x (τ/2) Σ_j ||R_j x − x̂_j − μ̃_j||_2^2 + ν ||F_u x − y||_2^2 (9).
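Two of these updates are one-liners; the sketch below shows the hard-thresholding sparse code update (6) and the multiplier update in numpy, with the patches x̂_j and R_j x stacked as matrix columns (an assumed layout; names are illustrative).

import numpy as np

def update_sparse_codes(W, Xhat, eta):
    # (6): entrywise hard thresholding of W xhat_j at level eta.
    Z = W @ Xhat
    return np.where(np.abs(Z) > eta, Z, 0)

def update_multipliers(Mu, RX, Xhat):
    # Scaled multiplier update: mu_j <- mu_j - (R_j x - xhat_j), columnwise.
    return Mu - (RX - Xhat)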

Unconstrained TLMRI - Cartesian 4x undersampling. [Figure: PSNR vs. iteration number for unconstrained TLMRI and DLMRI; error maps for unconstrained TLMRI (PSNR = 32.55 dB) and DLMRI (PSNR = 32.40 dB).] Our algorithm for unconstrained TLMRI is better and 19x faster than DLMRI. The penalty approach performs similarly to the unconstrained one with an appropriate choice of error thresholds, but is slower.

Unconstrained TLMRI - 2D Random 5x undersampling. [Figure: unconstrained TLMRI reconstruction (PSNR = 30.52 dB, 12x speedup over DLMRI) and error map; DLMRI reconstruction (PSNR = 28.70 dB) and error map.] Data from Miki Lustig, UC Berkeley.

Conclusions. We proposed transform learning for undersampled MRI (TLMRI). Each step in our algorithms involves simple closed-form solutions. TLMRI provides comparable or better reconstructions than DLMRI, and is significantly faster. The unconstrained TLMRI algorithm is faster than the penalty-based approach. Speedups over DLMRI increase with patch size (>40x for 8×8 patches), which is attractive for application to 3D/4D reconstruction. The iterates in our TLMRI algorithms are empirically observed to converge. Future Work: adaptive overcomplete transforms and doubly sparse transforms for MRI; extension to dynamic MRI, functional MRI, etc.