(4D) Variational Models Preserving Sharp Edges Institute for Computational and Applied Mathematics
2 Mathematical Imaging Workgroup @WWU [Figure: Raman spectra for DNA, Akrosom (acrosome), Flagellum, and Glass; x-axis: Raman Shift (cm^-1), y-axis: Intensity (cnt)] Linz, 2011
3 Some Philosophy: "No matter what question, L1 is the answer" (Stanley O.). Regularization in data assimilation is in the same state today that biomedical imaging was in 10 years ago. The understanding and methods we gained in medical imaging can hopefully be useful in geosciences and data assimilation.
4 Biomedical Imaging: 2000 vs 2010
Modality | State of the art 2000 | State of the art 2010
Full CT | Filtered Backprojection | Exact Reconstruction
PET/SPECT | Filtered Backprojection / EM | EM-TV / Dynamic Sparse
PET-CT | - | EM-AnatomicalTV
Acousto-Opt. | - | Wavelet Sparse / TV
EEG/MEG | LORETA | Sparsity / Bayesian
ECG-BSPM | Least Norm | L1 of normal derivative
Microscopy | None, linear Filter | Poisson-TV / Shearlet-L1
5 Based on joint work with: Martin Benning, Michael Möller, Felix Lucka, Jahn Müller (Münster); Stanley Osher (UCLA); Christoph Brune (Münster / UCLA / Vancouver); Fabian Lenz (Münster); Silvia Comelli (Milano / Münster); Eldad Haber (Vancouver); Mohammad Dawood, Klaus Schäfers (NucMed / EIMI Münster). SFB 656
6 Regularization of Inverse Problems: We want to solve $Ku = f$, where the forward operator $K$ acts between Banach spaces and comes with a finite-dimensional approximation (sampling, averaging).
Dynamic Biomedical Imaging 7 Maximum Likelihood / Bayes: Reconstruct the maximum-likelihood estimate. Model the posterior probability (Bayes). This yields a regularized variational problem for finite m. Saarbrücken, 9.7.10
8 Minimization of penalized log-likelihood: General variational approach. It combines a nonlocal part (including K) with a local regularization functional. Gaussian noise (note: the covariance is hidden in the output norm).
9 Example Gauss: Additive noise, i.i.d. on each pixel, with mean zero and variance $\sigma^2$. Minimization of the negative posterior log-likelihood yields the asymptotic variational model.
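The step from the Gaussian noise model to a variational problem can be sketched as follows. The prior density proportional to $\exp(-\alpha J(u))$ and the discrete Euclidean norm are assumptions made here for illustration; the slide's own formula was not preserved.

```latex
p(u \mid f) \;\propto\;
\exp\!\Big(-\tfrac{1}{2\sigma^2}\,\|Ku - f\|^2\Big)\,
\exp\!\big(-\alpha\, J(u)\big)
\quad\Longrightarrow\quad
\hat u \in \arg\min_u \; \tfrac{1}{2\sigma^2}\,\|Ku - f\|^2 + \alpha\, J(u)
```

Taking the negative logarithm turns the product of likelihood and prior into a sum of a quadratic fidelity term and the regularization functional.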
10 Optimality: Existence and uniqueness follow by variational methods. General case: the optimality condition is a nonlinear integro-differential equation / inclusion (integral operator K, differential operator in J). Gauss:
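For the Gaussian case the condition can be written out as a short sketch, assuming the functional $\tfrac12\|Ku-f\|^2 + \alpha J(u)$ (noise variance absorbed into $\alpha$) and a convex $J$ with subdifferential $\partial J$; this scaling is an assumption, not taken from the slide.

```latex
K^{*}(Ku - f) + \alpha\, p = 0, \qquad p \in \partial J(u)
```

The integral operator enters through $K^{*}K$, while the (possibly multivalued) subgradient $p$ carries the differential operator hidden in $J$.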
11 Robustness: Because the data are noisy, robustness of the minimizer with respect to errors in f is important. The problem is robust for large $\alpha$, but the data are only reproduced for small $\alpha$. Solutions converge in the weak* topology as f converges or as $\alpha$ tends to zero.
12 Structure of Solutions: Analysis by convex optimization techniques and duality. The structure of subgradients is important. Possible solutions satisfy a source condition, which allows one to gain information about regularity (e.g. of edges).
13 Structure of Solutions: Optimality condition for the quadratic (Gaussian) prior. The structure of u is determined completely by the properties of $u_B$ and $K^*$. For smoothing operators K, singularities not present in $u_B$ cannot be detected. Model error goes into K and $K^*$, respectively, and directly modifies u.
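A worked version of this statement, under the assumption that the prior is the quadratic background term $J(u) = \tfrac12\|u - u_B\|^2$ (the slide's formula is missing, so this choice is a guess at the intended setting):

```latex
K^{*}(Ku - f) + \alpha\,(u - u_B) = 0
\quad\Longrightarrow\quad
u = u_B + K^{*} w, \qquad w = \tfrac{1}{\alpha}\,(f - Ku)
```

Every minimizer is the background state plus an element of the range of $K^{*}$, so a smoothing $K^{*}$ can only add smooth corrections to $u_B$.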
14 4D VAR: Given time dynamics starting from an unknown initial value. Variational problem to estimate the initial state for further prediction.
15 4D VAR = 3D Variational Problem: Elimination of further states from the dynamics. Effective variational problem for the initial value in 3D.
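The elimination step can be sketched in formulas. The symbols $M_k$ (model propagator from time 0 to $t_k$), $H_k$ (observation operator), and $f_k$ (data at $t_k$) are notation assumed here, not taken from the slides.

```latex
u(t_k) = M_k u_0
\quad\Longrightarrow\quad
\min_{u_0} \; \frac12 \sum_{k} \| H_k M_k u_0 - f_k \|^2 + \alpha\, J(u_0)
```

With the stacked operator $K u_0 = (H_k M_k u_0)_k$ this has the same form $Ku = f$ as the static inverse problem, posed for the initial value only.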
16 Example: Linear Advection. Minimize quadratic fidelity + TV of the initial value subject to the linear advection equation. Upwind discretization.
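A minimal sketch of an upwind discretization for one-dimensional linear advection, assuming a constant positive velocity and a periodic grid (both are assumptions for illustration; the function names are hypothetical and the talk's actual setup is not preserved):

```python
import numpy as np

def upwind_step(u, c):
    """One explicit upwind step for u_t + a u_x = 0 with a > 0, periodic grid.

    c = a * dt / dx is the Courant number; the scheme is stable for 0 <= c <= 1.
    """
    return u - c * (u - np.roll(u, 1))

def advect(u0, c, steps):
    """Apply the upwind step repeatedly (plays the role of the model operator)."""
    u = np.asarray(u0, dtype=float).copy()
    for _ in range(steps):
        u = upwind_step(u, c)
    return u

# Step-shaped initial value with sharp edges.
u0 = np.zeros(100)
u0[20:40] = 1.0

u_shift = advect(u0, 1.0, 10)    # c = 1: each step is an exact shift by one cell
u_smeared = advect(u0, 0.5, 20)  # c = 0.5: same transport distance, edges get smeared
```

For Courant number below one the scheme introduces numerical diffusion, which is one way model error can enter the operator K.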
17 4D VAR for Linear Advection: Gibbs phenomenon as usual
18 4D VAR for Linear Advection [Figure: full observations (black), noisy (blue), 40 noisy samples (red)]
19 4D VAR for Linear Advection: Different noise variances
20 Analysis of Model Error: Optimality. The exact operator for linear advection is almost unitary. Hence
21 Beyond Gaussian Priors: Again, consider the optimality condition for the MAP estimate. If J is strictly convex and smooth, the subdifferential is a singleton containing only the gradient of J, which can be inverted to obtain a similar relation. Again, the operator determines the structure. The only chance to obtain full robustness is a multivalued subdifferential: singular regularization.
22 Singular Regularization: Construct J such that the subdifferential is large at the points you want to be robust. Example: l1 sparsity. Zeros are robust.
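The robustness of zeros under l1 regularization can be illustrated with a small sketch for the denoising case K = identity (an assumption made here; the data and perturbation values are hypothetical). The minimizer of 0.5*||u - f||^2 + alpha*||u||_1 is componentwise soft thresholding:

```python
import numpy as np

def soft_threshold(f, alpha):
    """Minimizer of 0.5 * ||u - f||^2 + alpha * ||u||_1, computed componentwise."""
    f = np.asarray(f, dtype=float)
    return np.sign(f) * np.maximum(np.abs(f) - alpha, 0.0)

alpha = 1.0
f = np.array([3.0, 0.2, -0.5, -2.0])      # hypothetical data
noise = np.array([0.1, -0.3, 0.4, 0.1])   # hypothetical perturbation

u_clean = soft_threshold(f, alpha)          # [2, 0, 0, -1]
u_noisy = soft_threshold(f + noise, alpha)  # zeros stay exactly zero
```

The subdifferential of the absolute value at zero is the whole interval [-1, 1], so any perturbation that keeps |f| below alpha leaves the zero unchanged, while nonzero entries move with the data.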
23 TV-Methods: Structural Prior (Cartooning). Penalization of total variation (formal and exact definition). ROF model for denoising g: minimize total variation subject to a constraint on the data misfit (Rudin-Osher-Fatemi 89, 92).
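For reference, the standard formal and exact definitions of total variation and the constrained ROF problem can be written as below. The domain $\Omega \subset \mathbb{R}^d$ and the noise level $\sigma$ in the constraint are notation assumed here, since the slide's formulas were not preserved.

```latex
\text{formal:}\quad TV(u) = \int_\Omega |\nabla u| \, dx
\qquad
\text{exact:}\quad TV(u) = \sup\Big\{ \int_\Omega u \,\operatorname{div}\varphi \, dx \;:\;
\varphi \in C_c^\infty(\Omega;\mathbb{R}^d),\ \|\varphi\|_\infty \le 1 \Big\}

\text{ROF:}\quad \min_u \; TV(u)
\quad\text{subject to}\quad \int_\Omega (u - g)^2 \, dx = \sigma^2\,|\Omega|
```

The exact definition remains finite for functions with jumps, which is why TV allows sharp edges while still penalizing oscillations.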
24 Why TV-Methods? Cartooning [Figure panels: Linear Filter, TV-Method]
ROF Model [Figure panels: clean, noisy, ROF]
26 $\mathrm{H}_2{}^{15}\mathrm{O}$ PET, Left Ventricular Time Frame [Figure panels: EM, EM-Gauss, EM-TV]
27 $\mathrm{H}_2{}^{15}\mathrm{O}$ PET, Right Ventricular Time Frame [Figure panels: EM, EM-Gauss, EM-TV]
28 4D VAR for Linear Advection: Gibbs phenomenon as usual
29 4D VAR for Linear Advection [Figure: full observations (black), noisy (blue), 40 noisy samples (red)]
30 4D VAR TV for Linear Advection: Comparison for full observations
31 4D VAR TV for Linear Advection: Comparison for observed samples
32 4D VAR TV for Linear Advection: Comparison for observed samples with noise
33 Analysis of Model Error: Variational problem as before, plus an added term. Optimality condition, as before.
34 Analysis of Model Error Structures are robust: apply T in region where If we find s solving Poisson equation with then
35 Numerical Solution: Splitting or ALM. Operator splitting into a standard problem (dependent on code) and a simple denoising-type problem. Example: Peaceman-Rachford splitting for
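As a toy illustration of Peaceman-Rachford splitting, the following sketch applies it to the small model problem min_u 0.5*||u - b||^2 + alpha*||u||_1. This problem is chosen here only because both proximal maps are explicit; it is not the 4D VAR problem of the talk, and the function names and step size gamma are hypothetical choices.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal map of t * ||.||_1 (componentwise shrinkage)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def peaceman_rachford(b, alpha, gamma=0.5, iters=60):
    """Peaceman-Rachford splitting for min_u 0.5*||u - b||^2 + alpha*||u||_1."""
    b = np.asarray(b, dtype=float)
    z = np.zeros_like(b)
    for _ in range(iters):
        x = soft_threshold(z, gamma * alpha)           # prox of gamma * alpha * ||.||_1
        y = (2.0 * x - z + gamma * b) / (1.0 + gamma)  # prox of gamma * 0.5 * ||. - b||^2 at 2x - z
        z = z + 2.0 * (y - x)                          # full reflection step (factor 2)
    return soft_threshold(z, gamma * alpha)

b = np.array([3.0, 0.4, -2.5, -0.9])
u = peaceman_rachford(b, alpha=1.0)
```

In a 4D VAR setting the quadratic subproblem would be replaced by a call to the existing assimilation code, and the shrinkage step by a denoising-type solver. The iteration converges here because the quadratic term is strongly convex; with the factor 2 replaced by 1 the update becomes Douglas-Rachford splitting.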
36 Bayes and Uncertainty: Natural prior probabilities for singular regularizations can be constructed even in a Gaussian framework. Interpret J(u) as a random variable with variance $\sigma^2$. Prior probability density. MAP estimate minimizes
37 Bayes and Uncertainty: Equivalence to the original form via constraint regularization. For an appropriate choice of $\alpha$ and $\gamma$, minimization of and is equivalent to subject to
38 Uncertainty Quantification: Sampling with standard MCMC schemes is difficult. Novel Gibbs sampler by F. Lucka, based on analytical integration of the posterior distribution function in 1D. Theoretical insight: MSc thesis of Silvia Comelli, CM estimate for the TV prior.
39 Uncertainty Quantification II: Error estimates in dependence on the noise, using source conditions. Error estimates need an appropriate distance measure, the generalized Bregman distance (mb-osher 04, Resmerita 05, mb-resmerita-he 07, Benning-mb 09). Estimates for Bayesian distributions in Bregman transport distances (w. H. Pikkarainen) = 2-Wasserstein distance in the Gaussian case.
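For reference, the generalized Bregman distance for a convex functional $J$ and a subgradient $p$ is defined as follows (standard definition; the notation $D_J^{p}$ is assumed here):

```latex
D_J^{p}(u, v) = J(u) - J(v) - \langle p,\, u - v \rangle, \qquad p \in \partial J(v)
```

For $J(u) = \tfrac12\|u\|^2$ this reduces to $\tfrac12\|u - v\|^2$, which is consistent with the Gaussian case leading to a classical quadratic distance.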
40 Uncertainty Quantification III: Idea: construct linear functionals from nonlinear eigenvectors. We have For TV-denoising (also for the linear advection example): estimate of the maximal error for the mean value on balls. For l1-sparsity: estimate of the error in single components. (Benning PhD 11, Benning-mb 11)
Loss of Contrast: ROF minimization loses contrast; the total variation of the reconstruction is smaller than the total variation of the clean image. Image features are left in the residual f-u. [Figure panels: g (clean), f (noisy), u (ROF), f-u] (mb-gilboa-osher-xu 06)
42 Loss of Contrast = Systematic Bias of TV: It becomes more severe in ill-posed problems with operator K. It is not just a simple visual effect to be corrected, but a loss of information. Simple idea for least squares: add back the noise to amplify = Augmented Lagrangian (Osher-mb-Goldfarb-Xu-Yin 2005).
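The "add back the noise" idea can be sketched on a toy problem. The example below uses l1 shrinkage with K = identity instead of TV, an assumption made so that each step has a closed form; the data values and function names are hypothetical.

```python
import numpy as np

def soft_threshold(f, alpha):
    """Minimizer of 0.5 * ||u - f||^2 + alpha * ||u||_1."""
    return np.sign(f) * np.maximum(np.abs(f) - alpha, 0.0)

def add_back_iteration(f, alpha, iterations):
    """Repeatedly solve the regularized problem with the residual added back to the data."""
    f = np.asarray(f, dtype=float)
    v = np.zeros_like(f)   # accumulated residual that is added back
    u = np.zeros_like(f)
    for _ in range(iterations):
        u = soft_threshold(f + v, alpha)
        v = v + (f - u)
    return u

f = np.array([5.0, -3.0, 0.2, 0.0])
u_one = add_back_iteration(f, 1.0, 1)   # plain shrinkage: large entries lose contrast
u_two = add_back_iteration(f, 1.0, 2)   # one add-back step restores the large entries
```

After a single step the large entries are shrunk by alpha (the systematic bias), while the second step recovers them exactly and still keeps the small entry at zero. Running many more iterations would eventually bring the small entries back as well, which is why the iteration has to be stopped appropriately.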
43 Bregman Iteration: The add-back procedure can be shown to be equivalent to Bregman iteration. Immediate generalization to convex fidelities and regularizers. Generalization to Gauss-Newton type methods for nonlinear K: use the linearization of K around the last iterate $u^l$ (Bachmayr-mb 2009).
44 Bregman Iteration: Properties like an iterative regularization method. The regularizing effect comes from appropriate termination of the iteration. Better performance for oversmoothing single steps, i.e. regularization parameter $\alpha$ very large. Limit: Inverse Scale Space Method (mb-gilboa-osher-xu 2006).
45 Why does Inverse Scale Space work? Singular value decomposition in the fully quadratic case. Eigenfunctions: yields Convergence is faster in small frequencies (large eigenvalues).
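A worked version of the fully quadratic case, as a sketch under the assumptions $J(u) = \tfrac12\|u\|^2$, zero initial value, and an orthonormal eigensystem $K^{*}K u_n = \lambda_n u_n$ (this notation is assumed here, since the slide's formulas are missing):

```latex
\partial_t u = K^{*}(f - Ku), \quad u(0) = 0
\quad\Longrightarrow\quad
\langle u(t), u_n \rangle
= \frac{1 - e^{-\lambda_n t}}{\lambda_n}\, \langle K^{*} f, u_n \rangle
```

Components with large eigenvalue $\lambda_n$ reach their limit value quickly, while components with small $\lambda_n$ (high frequencies for a smoothing $K$) enter only slowly.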
46 Why does Inverse Scale Space work? Convex one-homogeneous regularization J (TV, l1, ...). Eigenfunctions: yields Again, large frequencies appear later. Not at all for small t! Eigenvalues in TV are indeed related to jump measures (PhD thesis Benning, 2011).
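The "not at all for small t" behaviour can be sketched for the denoising case K = identity. The notation below is assumed for illustration: $u_\lambda$ is a nonlinear eigenfunction with $\lambda u_\lambda \in \partial J(u_\lambda)$, the data are $f = c\, u_\lambda$ with $c > 0$, and the inverse scale space flow is $\partial_t p = f - u$, $p \in \partial J(u)$, $p(0) = 0$.

```latex
p(t) = t\, c\, u_\lambda \in \partial J(0) \ \text{ for } t \le \tfrac{\lambda}{c}
\quad\Longrightarrow\quad
u(t) =
\begin{cases}
0, & t < \lambda / c, \\
f, & t \ge \lambda / c .
\end{cases}
```

The eigenfunction is absent until the time $\lambda / c$ and then appears at once with full contrast, so components with larger eigenvalue (finer scales) appear later.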
47 Why does Inverse Scale Space work? Multiple frequencies are not simple in the nonlinear case. However, various theoretical and computational results confirm exact scale decomposition (PhD thesis Benning, 2011 / mb-frick-scherzer-osher 2007). Complete characterization of inverse scale space for discrete l1-functionals yields jump dynamics in time and an adaptive basis pursuit method with guaranteed convergence (mb-möller-benning-osher, 2011).
48 $^{18}$F-FDG PET [Figure panels: EM, 20 min; EM-TV, 5 s; EM, 5 s; BREG, 5 s] Jahn Müller, 2011. Data from Nuclear Medicine Department, UKM
49 STED Microscopy: Christoph Brune, 2009. Data from MPI for Biophys. Chem. Göttingen (K. Willig, A. Schönle, Hell)
50 4D Reconstruction: 4D imaging of transport with penalization of large velocities: Minimize subject to
51 Analysis of Motion Model: The functional is related to the Benamou-Brenier formulation of optimal transport. The analysis differs from optimal transport, since usually no initial and final densities are given (more closely related to mean-field games, Lasry-Lions 07). Existence by transformation to
- A-priori estimate for w in $L^2$; weak compactness.
- A-priori estimates for u in $L^p(0,T;BV)$ and for the time derivative in $L^q(0,T;W^{-1,s})$.
- An adaptation of Aubin-Lions gives strong compactness of u in $L^r(0,T;L^r)$, and thus of the square root in $L^{2r}(0,T;L^{2r})$.
52 4D TV Model: The analysis relies on superlinear growth of F, although F = identity seems a very reasonable choice. Choosing F equal to the identity would imply that we seek a minimal L1 norm of the vector of total variations. This favours sparsity, i.e. solutions with very large total variation at some time step are allowed if it is small elsewhere. This does not correspond to a smooth motion model, hence superlinear choices are preferable. There are some indications of this effect in the numerical results.
53 Numerical solution: Complicated 4D variational problem combining various integral and differential operators plus a nonlinearity. Convexity is achieved by a formulation in the momentum variable $m = uV$. Efficient GPU implementation by Christoph Brune on CUDA with specially designed algorithms. All subproblems are solvable by FFT or shrinkage. Realized by introducing new variables and an inexact Uzawa Augmented Lagrangian approach.
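Why the momentum variable gives convexity can be sketched as follows, assuming the velocity penalty is the kinetic energy $u|V|^2$ and the constraint is the continuity equation, as in the Benamou-Brenier formulation (the exact functional of the talk is not preserved, so both are assumptions):

```latex
m = uV: \qquad
u\,|V|^2 = \frac{|m|^2}{u} \quad (u > 0), \qquad
\partial_t u + \nabla\!\cdot(uV) = 0
\;\;\Longleftrightarrow\;\;
\partial_t u + \nabla\!\cdot m = 0
```

The function $(u, m) \mapsto |m|^2 / u$ is jointly convex for $u > 0$, and the constraint becomes linear in $(u, m)$, whereas both are nonconvex in the original variables $(u, V)$.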
54 Augmented Lagrangian
55 Inexact Uzawa Augmented Lagrangian
56 Update of Primal Variables
57 Results: Deblurring, Synthetic Data [Figure panels: Exact solution, Blurred Data]
58 Results: Deblurring, Synthetic Data [Figure panels: Exact solution, Reconstruction]
59 Results: Cardiac $^{18}$F-FDG PET (Eulerian) [Figure panels: PET Reconstruction (Data), Registration to Diastole, Registration to Systole]
60 Info http://imaging.uni-muenster.de http://www.cells-in-motion.de http://www.herzforscher.de