A FISTA-like scheme to accelerate GISTA?
C. Cloquet (1), I. Loris (2), C. Verhoeven (2) and M. Defrise (1)
(1) Dept. of Nuclear Medicine, Vrije Universiteit Brussel
(2) Dept. of Mathematics, Université Libre de Bruxelles
MIMS seminar, Manchester, January 11, 2013.
ccloquet@vub.ac.be, igloris@ulb.ac.be, cverhoev@ulb.ac.be, mdefrise@vub.ac.be
tiny.cc/cloquet

Cone Beam CT @ VUB [Philips Brightview XCT]
Figure 1: X-ray beam projection scheme comparing a single detector array fan-beam CT (a) and cone-beam CT (b) geometry. [newscenter.philips.com; W. Scarfe et al., J Can Dent Assoc 2006, 72(1):75-80]

Cone Beam CT @ VUB [Bruker Skyscan microCT 1178]

Challenges
CT: 0.4 to 2.0 % of the cancers in the US caused by CT studies? [Brenner and Hall, NEJM, 2007] Lower the dose, i.e. do more with less.
Cone-beam specific: cone-beam artifacts.

Medical images are piecewise constant... [image: ar.in.tum.de]

... i.e. the gradient of medical images is sparse
Sparsity. Sparse gradient: ∇f = (∂f/∂x, ∂f/∂y, ∂f/∂z)^T with Σ_i ||∇f(x_i)||_2 small: few transitions, sharp edges between flat regions. [Defrise et al., 2011; Rudin et al., 1992; Sidky and Pan, 2008; Sidky et al., 2006] [image: brsoc.org.uk]

CT acquisition and reconstruction
Attenuation of an X-ray beam: I = I_0 exp(−∫ μ(s) ds), i.e. y := log(I_0/I) = ∫ μ(s) ds.
CT acquisition: image x ∈ R^M, data y ∈ R^{P·J} (P projections of J pixels), projector K : R^M → R^{P·J}, x ↦ {y_p = K_p x}_{p=1..P}.
Cost function: Φ(x, y) = G(x, y) + λ H(A x).
Reconstruction: x* = arg min_x Φ(x, y).

Cost function
Φ(x, y) = G(x, y) + λ H(A x)
Data term: G convex and smooth, G(x, y) = (1/2) ||K x − y||_2^2.
Penalty term: H convex and non-smooth, with A any linear operator.
Total variation penalty [isotropic]: A = ∇, with
(A x)_i = (∇x)_i = (x_{i+[100]} − x_i, x_{i+[010]} − x_i, x_{i+[001]} − x_i)^T
and H(z) = ||z||_1, the sum over voxels of the Euclidean norms ||(∇x)_i||_2.
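To make the TV penalty concrete, here is a minimal 2D sketch (the helper names `grad` and `tv` are mine, not from the talk) of the forward-difference operator A = ∇ and the isotropic TV value:

```python
import numpy as np

# Sketch of the forward-difference gradient operator A = grad on a 2D image,
# and the isotropic TV penalty H(A x): sum over pixels of the Euclidean norm
# of the local gradient. Names and boundary convention are assumptions.
def grad(x):
    """Forward differences, zero at the boundary: (A x), shape (2, H, W)."""
    gx = np.zeros_like(x)
    gy = np.zeros_like(x)
    gx[:-1, :] = x[1:, :] - x[:-1, :]   # x_{i+[10]} - x_i
    gy[:, :-1] = x[:, 1:] - x[:, :-1]   # x_{i+[01]} - x_i
    return np.stack([gx, gy])

def tv(x):
    """Isotropic total variation: sum_i ||(grad x)_i||_2."""
    g = grad(x)
    return np.sqrt((g ** 2).sum(axis=0)).sum()
```

A piecewise-constant image with a single vertical edge of height 4 has `tv == 4.0`: the penalty counts transitions, not smooth area, which is why it favors flat regions with sharp edges.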

How to solve?

Simultaneous Algebraic Reconstruction Technique [Andersen and Kak, 1984]
Initialization: x^0 an arbitrary image in R^M; 0 < τ_p < 2/||K_p K_p^T||.
Iteration x^{n+1} = I_SART(x^n):
  x^{(n,0)} = x^n
  for p = 0 ... P−1: x^{(n,p+1)} = x^{(n,p)} + τ_p K_p^T (y_p − K_p x^{(n,p)})
  x^{n+1} = x^{(n,P)}
Takes no account of any penalty.
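As a sketch (toy random system and assumed names, not the authors' CT code), one outer SART iteration sweeps over the projection blocks:

```python
import numpy as np

# Toy SART sketch: each "projection" p is a block of rows K_p of the system.
rng = np.random.default_rng(0)
M, J, P = 16, 8, 4                    # voxels, pixels per projection, projections
K_blocks = [rng.standard_normal((J, M)) for _ in range(P)]
x_true = rng.standard_normal(M)
y_blocks = [Kp @ x_true for Kp in K_blocks]

def sart_iteration(x):
    """One outer iteration: for p = 0..P-1, x <- x + tau_p K_p^T (y_p - K_p x)."""
    for Kp, yp in zip(K_blocks, y_blocks):
        tau_p = 1.0 / np.linalg.norm(Kp @ Kp.T, 2)   # inside 0 < tau_p < 2/||K_p K_p^T||
        x = x + tau_p * Kp.T @ (yp - Kp @ x)
    return x

x = np.zeros(M)
for _ in range(200):
    x = sart_iteration(x)
```

On this consistent toy system the sweeps move x towards x_true; as the slide notes, no penalty term is taken into account.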

Current algorithms suited for TV
PICCS, ASD-POCS: alternate the minimization of the data term and of TV.
ISTA: x_k = arg min_x { H(x) + (1/(2 t_k)) ||x − (x_{k−1} − t_k ∇G(x_{k−1}))||^2 }.
SART-TV: uses a surrogate and a differentiable TV.
[PICCS: Chen et al., 2008; ASD-POCS: Ramani and Fessler, 2012, Sidky and Pan, 2008; ISTA: Beck and Teboulle, 2009b, Daubechies et al., 2004; SART-TV: Defrise et al., 2011]
Drawbacks:
- nested iterations needed when using TV (all);
- a differentiable penalty needed (SART-TV);
- no proof of convergence (PICCS, ASD-POCS);
- slow convergence (ISTA): Φ(x^n, y) − Φ(x*, y) ∝ n^{−1}.

FISTA accelerates the convergence of ISTA [Beck and Teboulle, 2009a]
Initialization: x^{−1} = 0; x^0 an arbitrary image in R^M; t_0 = 1, θ_0 = 0.
Iteration x^{n+1} = I_FISTA(x^n, x^{n−1}):
  x^{n+1} = I_ISTA((1 + θ_n) x^n − θ_n x^{n−1})
  (t_{n+1}, θ_{n+1}) = s(t_n)
with s(t_n) = ((1 + √(1 + 4 t_n^2))/2, (t_n − 1)/t_{n+1}).
Speed of convergence: Φ(x^n, y) − Φ(x*, y) ∝ n^{−2}.
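A self-contained FISTA sketch on a toy l1-regularized least-squares problem, where the soft-thresholding prox plays the role of I_ISTA (all names and data here are illustrative assumptions):

```python
import numpy as np

# Toy problem: Phi(x) = 1/2 ||K x - y||^2 + lam ||x||_1
rng = np.random.default_rng(1)
K = rng.standard_normal((40, 100))
x_true = np.zeros(100); x_true[[3, 30, 70]] = [2.0, -1.5, 1.0]
y = K @ x_true
lam = 0.1
L = np.linalg.norm(K, 2) ** 2               # Lipschitz constant of grad G

def phi(x):
    return 0.5 * np.sum((K @ x - y) ** 2) + lam * np.abs(x).sum()

def ista_step(x):
    """Gradient step on G, then soft-thresholding (the prox of lam ||.||_1)."""
    z = x - K.T @ (K @ x - y) / L
    return np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)

def s(t):
    """s(t_n) = (t_{n+1}, theta_{n+1}) as on the slide."""
    t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
    return t_next, (t - 1.0) / t_next

x_prev = x = np.zeros(100)
t, theta = 1.0, 0.0
for _ in range(300):
    v = (1 + theta) * x - theta * x_prev    # overrelaxation with the old iterate
    x_prev, x = x, ista_step(v)
    t, theta = s(t)
```

The only change relative to plain ISTA is the overrelaxed point v, which costs one extra vector combination per iteration yet upgrades the rate from n^{−1} to n^{−2}.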

A way to overcome the difficulties: Generalized ISTA (GISTA) [Loris and Verhoeven, 2011]
Cost function Φ(x, y) = G(x, y) + λ H(A x):
- suitable for A = ∇;
- reduces to ISTA for A orthogonal;
- no internal iteration;
- proven convergence.

GISTA [Loris and Verhoeven, 2011]
Initialization: x^0 an arbitrary image in R^M; w^0 = 0 ∈ R^{D×M}; τ < 2/||K^T K||; σ < 1/||A A^T||.
Iteration (x^{n+1}, w^{n+1}) = I_GISTA(x^n, w^n):
  x̄^{n+1} = x^n + τ K^T (y − K x^n) − τ ∇^T w^n
  w^{n+1} = P_λ(w^n + (σ/τ) ∇ x̄^{n+1})
  x^{n+1} = x^n + τ K^T (y − K x^n) − τ ∇^T w^{n+1}
with P_λ(u) = { λ u_i/||u_i|| if ||u_i|| > λ; u_i if ||u_i|| ≤ λ }_{i=1..M}, where u_i ∈ R^D and ||u_i|| = √(u_{i,x}^2 + u_{i,y}^2 + u_{i,z}^2).
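A 1D sketch of this iteration (with A = forward differences, P_λ reduces to a componentwise clip; the toy data and names are my assumptions, not the authors' implementation):

```python
import numpy as np

# Toy problem: Phi(x) = 1/2 ||K x - y||^2 + lam ||A x||_1, A = finite differences.
rng = np.random.default_rng(2)
M = 50
K = rng.standard_normal((30, M))
A = (np.eye(M, k=1) - np.eye(M))[:-1]        # (M-1) x M forward differences
x_true = np.zeros(M); x_true[20:35] = 1.0    # piecewise-constant ground truth
y = K @ x_true
lam = 0.05
tau = 1.0 / np.linalg.norm(K.T @ K, 2)       # tau < 2/||K^T K||
sigma = 0.9 / np.linalg.norm(A @ A.T, 2)     # sigma < 1/||A A^T||

def phi(x):
    return 0.5 * np.sum((K @ x - y) ** 2) + lam * np.abs(A @ x).sum()

x, w = np.zeros(M), np.zeros(M - 1)
for _ in range(500):
    grad_step = tau * K.T @ (y - K @ x)
    x_bar = x + grad_step - tau * A.T @ w    # tentative primal update
    w = np.clip(w + (sigma / tau) * A @ x_bar, -lam, lam)   # P_lambda in 1D
    x = x + grad_step - tau * A.T @ w        # final update with the new w
```

Note that no inner loop appears anywhere: the dual variable w is advanced once per iteration, which is exactly what removes the nested TV sub-iterations of the earlier algorithms.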

But GISTA is slow
Cross-section of a mouse, short scan, 98 projections, acquired on the Skyscan 1178, after 260 (left) and 1000 iterations (right), reconstructed with GISTA and λ = 0.01.

This work
How to go as fast as possible? Initialization, restart, FGISTA.

Numerical experiment
Dataset: FORBILD thorax phantom [imp.uni-erlangen.de]. Poisson noise: 10^4 photons/LOR. P = 200 projections of J = 600 pixels. Image: 2D, M = 600 × 600.

GISTA: initialization matters
How many initial SART iterations lead to the lowest cost within N < 10^4 iterations?
[Figure: cost function vs. iteration for λ = 0.0025, starting with 0, 1 or 4 iterations of SART, and the optimal number of initial SART iterations as a function of λ.]

Restarted GISTA
Initialization: M_1 iterations of SART.
Iteration: (1) perform 1 iteration of SART; (2) run GISTA during N iterations; (3) set w = 0; (4) go back to (1).
Inspired by the restart of the conjugate gradient method (see also [O'Donoghue and Candès, 2012; Powell, 1977; Sidky and Pan, 2008]).
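The restart idea itself is generic. As a self-contained illustration (this is NOT the RGISTA scheme above, but the function-value restart of O'Donoghue and Candès applied to accelerated gradient descent on a toy least-squares problem): reset the momentum whenever the cost goes up.

```python
import numpy as np

# Function-value restart: whenever the cost increases, reset (t, theta) to
# (1, 0), i.e. drop the accumulated momentum. Toy data and names are assumed.
rng = np.random.default_rng(3)
K = rng.standard_normal((60, 60))
y = rng.standard_normal(60)
L = np.linalg.norm(K, 2) ** 2
phi = lambda x: 0.5 * np.sum((K @ x - y) ** 2)

def run(n_iter, restart):
    """Accelerated gradient descent on phi; returns the final cost."""
    x_prev = x = np.zeros(60)
    t, theta, prev_cost = 1.0, 0.0, np.inf
    for _ in range(n_iter):
        v = (1 + theta) * x - theta * x_prev
        x_prev, x = x, v - K.T @ (K @ v - y) / L      # gradient step at v
        t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
        t, theta = t_next, (t - 1) / t_next
        if restart and phi(x) > prev_cost:            # cost went up: restart
            t, theta = 1.0, 0.0
        prev_cost = phi(x)
    return phi(x)
```

On many problems the restarted run reaches a lower cost within the same iteration budget, which is the behavior RGISTA tries to exploit; whether it helps here depends on λ, as the next slide shows.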

Restarted GISTA (RGISTA)
Does RGISTA lead to a lower cost within N < 10^2 iterations?
[Figure: cost function vs. iteration. Left, λ = 0.5 (no restart, restart after 1, restart after 30): NO. Right, λ = 0.0025 (no restart, restart after 2): YES; e.g. with a restart after 2 iterations, 6 iterations suffice instead of 18.]
The efficiency of RGISTA depends on λ.

Restarted GISTA (RGISTA)
[Figure: reconstructions after 4 and 5000 iterations, and the corresponding profiles, for λ = 0.5 and λ = 0.0025.]

FGISTA
Initialization: x^{−1} = 0; x^0 an arbitrary image in R^M; w^{−1} = w^0 = 0 ∈ R^{D×M}; t_0 = 1, θ_0 = 0.
Iteration (x^{n+1}, w^{n+1}) = I_FGISTA(x^n, w^n, x^{n−1}, w^{n−1}):
  v^n = (1 + θ_n) x^n − θ_n x^{n−1}
  z^n = (1 + θ_n) w^n − θ_n w^{n−1}
  (x^{n+1}, w^{n+1}) = I_GISTA(v^n, z^n)
  (t_{n+1}, θ_{n+1}) = s(t_n)
Same fixed points as GISTA; reduces to FISTA when A is orthogonal; no proof of convergence.
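A sketch of FGISTA on the same kind of 1D toy problem as the GISTA sketch above (assumed names and data; recall the slide's caveat that there is no convergence proof): the FISTA overrelaxation is applied to both x and w before one GISTA step.

```python
import numpy as np

# FGISTA sketch: FISTA-style momentum on the primal x AND the dual w, wrapped
# around one GISTA step. Toy 1D problem; names/data are assumptions.
rng = np.random.default_rng(2)
M = 50
K = rng.standard_normal((30, M))
A = (np.eye(M, k=1) - np.eye(M))[:-1]
x_true = np.zeros(M); x_true[20:35] = 1.0
y = K @ x_true
lam = 0.05
tau = 1.0 / np.linalg.norm(K.T @ K, 2)
sigma = 0.9 / np.linalg.norm(A @ A.T, 2)

def phi(x):
    return 0.5 * np.sum((K @ x - y) ** 2) + lam * np.abs(A @ x).sum()

def gista_step(x, w):
    grad_step = tau * K.T @ (y - K @ x)
    x_bar = x + grad_step - tau * A.T @ w
    w_new = np.clip(w + (sigma / tau) * A @ x_bar, -lam, lam)
    return x + grad_step - tau * A.T @ w_new, w_new

x_prev = x = np.zeros(M)
w_prev = w = np.zeros(M - 1)
t, theta = 1.0, 0.0
for _ in range(300):
    v = (1 + theta) * x - theta * x_prev        # overrelax the image ...
    z = (1 + theta) * w - theta * w_prev        # ... and the dual variable
    x_prev, w_prev = x, w
    x, w = gista_step(v, z)
    t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
    t, theta = t_next, (t - 1) / t_next
```

Since each iteration is one GISTA step at an extrapolated point, any fixed point of this scheme is a fixed point of GISTA, matching the slide's claim.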

FGISTA
[Figure: cost function vs. iteration for λ = 0.25: GISTA, FGISTA, and FGISTA switched to GISTA after 15 iterations.]

FGISTA
[Figure: reconstructions after 100 iterations with λ = 0.25, FGISTA (left) vs. GISTA (right), and the corresponding profiles.]

Discussion
FGISTA and GISTA share the same fixed points.
Why do FGISTA and GISTA not converge to the same values? Rounding errors in the algorithm? A limit cycle? Another update of the parameters (cf. Chambolle-Pock)?
A fixed-point algorithm that appears to converge numerically does not necessarily minimize Φ.

Open issues
How to determine, on the fly, the optimal initialization, and the optimal number and positions of the restarts?
Why is cost(FGISTA) > cost(GISTA)?

Remember
GISTA reconstructs CT images with proven convergence and no internal iteration.
Initialization matters.
Restart and FGISTA may help further.

References
A. H. Andersen and A. C. Kak. Simultaneous algebraic reconstruction technique (SART): a superior implementation of the ART algorithm. Ultrasonic Imaging, 6:81-94, 1984.
Amir Beck and Marc Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sciences, 2:183-202, 2009a.
Amir Beck and Marc Teboulle. Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. 2009b.
G.-H. Chen, J. Tang, and S. Leng. Prior image constrained compressed sensing (PICCS): a method to accurately reconstruct dynamic CT images from highly undersampled projection data sets. Med. Phys., 35:660-663, 2008.
I. Daubechies, M. Defrise, and C. De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Communications on Pure and Applied Mathematics, 57(11):1413-1457, 2004. doi: 10.1002/cpa.20042.
Michel Defrise, Christian Vanhove, and Xuan Liu. An algorithm for total variation regularization in high-dimensional linear problems. Inverse Problems, 27(6):065002, 2011. doi: 10.1088/0266-5611/27/6/065002.
Ignace Loris and Caroline Verhoeven. On a generalization of the iterative soft-thresholding algorithm for the case of non-separable penalty. Inverse Problems, 27:125007, 2011. doi: 10.1088/0266-5611/27/12/125007.
B. O'Donoghue and E. Candès. Adaptive restart for accelerated gradient schemes. arXiv:1204.3982, April 2012.
M. J. D. Powell. Restart procedures for the conjugate gradient method. Mathematical Programming, 12:241-254, 1977.
S. Ramani and J. A. Fessler. A splitting-based iterative algorithm for accelerated statistical X-ray CT reconstruction. IEEE Trans. Med. Imaging, 31(3):677-688, 2012.
L. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. Physica D, 60:259-268, 1992.
Emil Y. Sidky and Xiaochuan Pan. Image reconstruction in circular cone-beam computed tomography by constrained, total-variation minimization. Physics in Medicine and Biology, 53(17):4777-4807, 2008. doi: 10.1088/0031-9155/53/17/021.
Emil Y. Sidky, Chien-Min Kao, and Xiaochuan Pan. Accurate image reconstruction from few-views and limited-angle data in divergent-beam CT. J. X-Ray Sci. Technol., 14:119-139, 2006.

Chambolle-Pock derived algorithm
Initialization: p^0 = 0 ∈ R^{P·J}; w^0 = 0 ∈ R^{D×M}; x^0 = 0 ∈ R^M; τσ < 1/||K||^2.
Iteration (p^{n+1}, w^{n+1}, x^{n+1}) = I_CP(p^n, w^n, x^n):
  p^{n+1} = (p^n + σ(y − K x^n))/(1 + σ)
  w^{n+1} = P_λ(w^n + σ ∇ x^n)
  x^{n+1} = x^n + τ K^T p^{n+1} − τ ∇^T w^{n+1}
Caution must be taken to adapt the dimensions of K to those of the ∇ operator.
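For comparison, here is a sketch of the standard Chambolle-Pock primal-dual iteration for (1/2)||Kx − y||^2 + λ||Ax||_1 on a 1D toy problem. This is the textbook form with the extrapolated point x̄ and step condition τσ||[K; A]||^2 < 1; it may differ in details (e.g. the extrapolation) from the exact variant on this slide, and all names and data are assumptions.

```python
import numpy as np

# Textbook Chambolle-Pock for 1/2||Kx - y||^2 + lam ||Ax||_1 (toy 1D problem).
rng = np.random.default_rng(2)
M = 50
K = rng.standard_normal((30, M))
A = (np.eye(M, k=1) - np.eye(M))[:-1]        # finite-difference operator
x_true = np.zeros(M); x_true[20:35] = 1.0
y = K @ x_true
lam = 0.05
# tau*sigma*||[K; A]||^2 < 1; bound the stacked norm by ||K||^2 + ||A||^2.
L2 = np.linalg.norm(K, 2) ** 2 + np.linalg.norm(A, 2) ** 2
tau = sigma = 0.9 / np.sqrt(L2)

def phi(x):
    return 0.5 * np.sum((K @ x - y) ** 2) + lam * np.abs(A @ x).sum()

x = x_bar = np.zeros(M)
p, w = np.zeros(30), np.zeros(M - 1)
for _ in range(500):
    p = (p + sigma * (K @ x_bar - y)) / (1 + sigma)   # prox of the data dual
    w = np.clip(w + sigma * A @ x_bar, -lam, lam)     # P_lambda in 1D
    x_new = x - tau * (K.T @ p + A.T @ w)             # primal gradient step
    x_bar = 2 * x_new - x                             # extrapolation
    x = x_new
```

The step condition involves the stacked operator [K; A], which is one concrete reading of the slide's remark that the dimensions of K must be adapted to those of ∇.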

Philips Brightview XCT: TV leads to less noise, flat regions and sharp edges
Flat panel: 1024 × 768 square elements, side of 0.388 mm. Images: 300 × 270 × 256, resolution of 0.8 mm.
Figure: (a) SART, P = 720 views, 5 iterations. (b) SART, P = 100 views, 5 iterations. (c) Profiles (blue = SART, P = 720; red = GISTA, P = 720). (d) GISTA, P = 720 views, λ = 0.018, 30 iterations. The images are averages of 3 consecutive slices.

Philips Brightview XCT: RISTA vs GISTA
Figure: (a) (LS, TV) curves for GISTA, P = 100 views, λ = 0.01; the 41st iterations are highlighted by a cross. (b) Profiles at the 41st iteration. (c) GISTA, 41 iterations. (d) RISTA, 41 iterations. The images are averages of 3 consecutive slices. In this figure, blue = GISTA without restart, green = GISTA with a restart after the 10th iteration (RISTA). See Fig. 1 for SART with 100 views.

Skyscan [image-only result slides]