Conjugate gradient acceleration of non-linear smoothing filters


MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com

Conjugate gradient acceleration of non-linear smoothing filters

Knyazev, A.; Malyshev, A.

TR2015-143 December 2015

Abstract

The most efficient edge-preserving smoothing filters, e.g., for denoising, are non-linear. Thus, their acceleration is challenging and is often performed in practice by tuning filter parameters, such as by increasing the width of the local smoothing neighborhood, resulting in more aggressive smoothing of a single sweep at the cost of increased edge blurring. We propose an alternative technology, accelerating the original filters without tuning, by running them through a special conjugate gradient method, not affecting their quality. The filter nonlinearity is dealt with by careful freezing and restarting. Our initial numerical experiments on toy one-dimensional signals demonstrate 2x acceleration of the classical bilateral filter and 3-5x acceleration of the recently developed guided filter.

2015 IEEE Global Conference on Signal and Information Processing (GlobalSIP)

This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission to copy in whole or in part without payment of fee is granted for nonprofit educational and research purposes provided that all such whole or partial copies include the following: a notice that such copying is by permission of Mitsubishi Electric Research Laboratories, Inc.; an acknowledgment of the authors and individual contributions to the work; and all applicable portions of the copyright notice. Copying, reproduction, or republishing for any other purpose shall require a license with payment of fee to Mitsubishi Electric Research Laboratories, Inc. All rights reserved.

Copyright © Mitsubishi Electric Research Laboratories, Inc., 2015
201 Broadway, Cambridge, Massachusetts 02139

Conjugate Gradient Acceleration of Non-Linear Smoothing Filters

Andrew Knyazev
Mitsubishi Electric Research Laboratories (MERL)
201 Broadway, 8th floor, Cambridge, MA 02139, USA
Email: knyazev@merl.com

Alexander Malyshev
Mitsubishi Electric Research Laboratories (MERL)
201 Broadway, 8th floor, Cambridge, MA 02139, USA
Email: malyshev@merl.com

Abstract—The most efficient edge-preserving smoothing filters, e.g., for denoising, are non-linear. Thus, their acceleration is challenging and is often performed in practice by tuning filter parameters, such as by increasing the width of the local smoothing neighborhood, resulting in more aggressive smoothing of a single sweep at the cost of increased edge blurring. We propose an alternative technology, accelerating the original filters without tuning, by running them through a special conjugate gradient method, not affecting their quality. The filter nonlinearity is dealt with by careful freezing and restarting. Our initial numerical experiments on toy one-dimensional signals demonstrate 2x acceleration of the classical bilateral filter and 3-5x acceleration of the recently developed guided filter.

Index Terms—conjugate gradient algorithm, edge-preserving denoising, low-pass filters

I. INTRODUCTION

This paper is concerned with noise removal from a given noisy signal, which is a basic problem in signal processing, with many applications, e.g., in image denoising [6]. Modern denoising algorithms preserve signal details while removing most of the noise. A very popular denoising filter is the bilateral filter (BF), which smooths signals while preserving edges by taking the weighted average of the nearby pixels. The weights depend on both the spatial distance between the sampling locations and the similarity between signal values, thus providing local adaptivity to the input signal. Bilateral filtering was initially proposed in [12] as an intuitive tool without theoretical justification.
Since then, connections between BF and other well-known filtering techniques such as anisotropic diffusion, weighted least squares, Bayesian methods, kernel regression and non-local means have been explored; see, e.g., the survey [13]. We make use of the graph-based framework for signal analysis developed in [5], [7], where polynomial low-pass filters based on the BF coefficients are proposed. A nice introduction to signal processing on graphs is found in [2]. A single application of BF can be interpreted as a vertex domain transform on a graph with pixels as vertices, intensity values of each node as the graph signal, and filter coefficients as link weights that capture the similarity between nodes. The BF transform is a special nonlinear anisotropic diffusion, cf. [9], [10], determined by the entries of the graph Laplacian matrix, which are related to the BF weights. (A preliminary version of this paper is posted at arXiv.org.) The eigenvectors and eigenvalues of the graph Laplacian matrix allow us to extend Fourier analysis to graph signals or images as in [2] and perform frequency selective filtering operations on graphs, similar to those in traditional signal processing.

Another very interesting smoothing filter is the guided filter (GF), recently proposed in [11], [3], and included in the MATLAB image processing toolbox. Some ideas behind GF are developed in [4]. According to our limited experience, GF is faster than BF. The authors of [11] advocate that GF is gradient preserving and avoids the gradient reversal artifacts of BF, which is not gradient preserving.

The explicit smoothing filters similar to BF and GF can be interpreted as matrix power iterations, which are, in the general case, nonlinear, or equivalently, as explicit integration in time of the corresponding nonlinear anisotropic diffusion equation [9], [10]. The suitable graph Laplacian matrices are determined by means of the graph-based interpretation of these power iterations.
Our main contribution is accelerating the smoothing filters by means of a special variant of the conjugate gradient (CG) method, applied to the corresponding graph Laplacian matrices. To avoid oversmoothing, only a few iterations of the CG acceleration can be performed. We note that there exist several nonlinear variants of the CG algorithm; see, e.g., [8]. However, the developed theory is not directly applicable in our case, because it is not clear how to interpret the vector L(x)x as a gradient of a scalar function of the signal x, where L(x) is a graph Laplacian matrix depending on a signal x.

II. BILATERAL FILTER (BF)

We consider discrete signals defined on an undirected graph G = (V, E), where the vertices V = {1, 2, ..., N} denote, e.g., time instances of a discrete-time signal or pixels of an image. The set of edges E = {(i, j)} contains only those pairs of vertices i and j that are neighbors in some predefined sense. We suppose in addition that a spatial position p_i is assigned to each vertex i ∈ V, so that a distance \|p_i - p_j\| is determined between vertices i and j. Let x[j], j ∈ V, be a discrete signal, which is an input to the bilateral filter. The output signal y[i] is the weighted average of the values in x[j]:

y[i] = \frac{\sum_j w_{ij} x[j]}{\sum_j w_{ij}}.  (1)

The weights w_{ij} are defined for (i, j) ∈ E in terms of a guidance signal g[i]:

w_{ij} = \exp\left(-\frac{\|p_i - p_j\|^2}{2\sigma_d^2}\right) \exp\left(-\frac{(g[i] - g[j])^2}{2\sigma_r^2}\right),  (2)

where σ_d and σ_r are the filter parameters [12]. The guidance signal g[i] is chosen depending on the purpose of filtering. When g coincides with the input signal x, the bilateral filter is nonlinear and called self-guided.

The weights w_{ij} determine the adjacency matrix W of the graph G. The matrix W is symmetric, has nonnegative elements, and has diagonal elements equal to 1. Let D be the diagonal matrix with the nonnegative diagonal entries d_i = \sum_j w_{ij}. Thus, the BF operation (1) is the vector transform defined by the aid of the matrices W(g) and D(g) as

y = D^{-1} W x = x - D^{-1} L x,

where L = D - W is called the Laplacian matrix of the weighted graph G with the BF weights. The eigenvalues of the matrix D^{-1} W are real. The eigenvalues corresponding to the highest oscillations lie near the origin.

The BF transform y = D^{-1} W x can be applied iteratively, (i) by changing the weights w_{ij} at each iteration using the result of the previous iteration as a guidance signal g, or (ii) by using fixed weights, calculated from the initial signal as a guidance, for all iterations. The former alternative results in a nonlinear filter. The latter produces a linear filter, which may be faster, since the BF weights are computed only once in the very beginning. An iterative application of the BF matrix transform is the power iteration with the amplification matrix D^{-1} W. Slow convergence of the power iteration can be boosted by the aid of suitable Krylov subspace iterative methods [1], [14].

III. GUIDED FILTER (GF)

Algorithm 1 Guided Filter (GF)
Input: x, g, ρ, ε
Output: y
  mean_g = f_mean(g, ρ)
  mean_x = f_mean(x, ρ)
  corr_g = f_mean(g .* g, ρ)
  corr_gx = f_mean(g .* x, ρ)
  var_g = corr_g - mean_g .* mean_g
  cov_gx = corr_gx - mean_g .* mean_x
  a = cov_gx ./ (var_g + ε)
  b = mean_x - a .* mean_g
  mean_a = f_mean(a, ρ)
  mean_b = f_mean(b, ρ)
  y = mean_a .* g + mean_b

Algorithm 1 is a pseudo-code of GF proposed in [11], where x and y are, respectively, the input and output signals on the graph G described in Section II. GF is built by means of a guidance signal g, which equals x in the self-guided case. The function f_mean(·, ρ) denotes a mean filter of a spatial radius ρ. The constant ε determines the smoothness degree of the filter: the larger ε, the stronger the smoothing effect. The operations .* and ./ denote componentwise multiplication and division. A typical arithmetical complexity of the GF algorithm is O(N), where N is the number of elements in x; see [11].

The guided filter operation of Algorithm 1 is the matrix transform y = W(g)x, where the implicitly constructed transform matrix W(g) has the following entries; see [11]:

W_{ij}(g) = \frac{1}{|\omega|^2} \sum_{k:(i,j)\in\omega_k} \left(1 + \frac{(g_i - \mu_k)(g_j - \mu_k)}{\sigma_k^2 + \epsilon}\right).  (3)

The mean filter f_mean(·, ρ) is applied in the neighborhoods ω_k of a spatial radius ρ around all vertices k ∈ V. The number of pixels in ω_k is denoted by |ω|, the same for all k. The values μ_k and σ_k^2 are the mean and variance of g over ω_k. The matrix W is symmetric and satisfies the property \sum_j W_{ij} = 1. The standard construction of the graph Laplacian matrix gives L = I - W, because d_i = \sum_j W_{ij} = 1, i.e., the matrix D is the identity. The eigenvalues of L(g) are real and nonnegative, with the low frequencies accumulated near 0 and the high frequencies near 1. Application of a single transform y = W x attenuates the high frequency modes of x while approximately preserving the low frequency modes, cf. [7], [5].

Similar to the BF filter, the guided filter can be applied iteratively. When the guidance signal g is fixed, the iterated GF filter is linear. When g varies, for example, g = x in the self-guided case, the iterated GF filter is nonlinear.

IV. CONJUGATE GRADIENT ACCELERATION

Since the graph Laplacian matrix L is symmetric and nonnegative definite, the iterative application of the transform y = D^{-1} W x can be accelerated by adopting the CG technology.
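The GF pseudo-code above can be transcribed for one-dimensional signals in a few lines. The sketch below is illustrative only: NumPy's np.convolve stands in for the mean filter f_mean (with zero padding at the borders, one of several possible boundary conventions), and all parameter values in the usage example are placeholders.

```python
import numpy as np

def f_mean(v, rho):
    # box mean filter of spatial radius rho (zero-padded at the borders)
    kernel = np.ones(2 * rho + 1) / (2 * rho + 1)
    return np.convolve(v, kernel, mode="same")

def guided_filter_1d(x, g, rho, eps):
    # direct transcription of the GF pseudo-code for 1-D signals;
    # '*' and '/' act componentwise, matching '.*' and './'
    mean_g = f_mean(g, rho)
    mean_x = f_mean(x, rho)
    corr_g = f_mean(g * g, rho)
    corr_gx = f_mean(g * x, rho)
    var_g = corr_g - mean_g * mean_g
    cov_gx = corr_gx - mean_g * mean_x
    a = cov_gx / (var_g + eps)
    b = mean_x - a * mean_g
    return f_mean(a, rho) * g + f_mean(b, rho)
```

Passing g = x gives the self-guided, nonlinear case; repeating the call produces the iterated GF discussed above.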
We use two variants of CG: 1) with the fixed guidance signal equal to the input signal or to the clean signal, and 2) with the varying guidance signal equal to the current value of x.

Algorithm 2 Truncated PCG(k_max)
Input: x_0, g, k_max
Output: x
  x = x_0; r = W(g)x - D(g)x
  for k = 1, ..., k_max do
    s = D^{-1}(g) r; γ = s^T r
    if k = 1 then
      p = s
    else
      β = γ/γ_old; p = s + βp
    endif
    q = D(g)p - W(g)p; α = γ/(p^T q)
    x = x + αp; r = r - αq; γ_old = γ
  endfor

Algorithm 2 is the standard preconditioned conjugate gradient algorithm formally applied to the system of linear equations Lx = 0 and truncated after k_max evaluations of the matrix-vector operation Lx. The initial vector x_0 is a noisy input signal. This variant of the CG algorithm was first suggested in [5]. Algorithm 3 is a special nonlinear preconditioned CG with l_max restarts, formally applied to L(x)x = 0 and truncated after k_max iterations between restarts. Restarts are necessary because of the nonlinearity of the self-guided filtering.
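For concreteness, here is a minimal NumPy sketch of Algorithms 2 and 3 applied to the 1-D bilateral filter: the weights follow (2) on a path graph with positions p_i = i, and Algorithm 3 refreezes the weights at the current iterate before every restart. The neighborhood radius and the values of σ_d and σ_r below are placeholders chosen only for the example, not the paper's settings.

```python
import numpy as np

def bf_weights(g, radius=2, sigma_d=1.5, sigma_r=0.1):
    # bilateral adjacency matrix W(g) as in (2) for a 1-D signal, p_i = i
    n = len(g)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            W[i, j] = (np.exp(-((i - j) ** 2) / (2 * sigma_d ** 2))
                       * np.exp(-((g[i] - g[j]) ** 2) / (2 * sigma_r ** 2)))
    return W

def truncated_pcg(x0, g, k_max):
    # Algorithm 2: PCG for L(g) x = 0 with L = D - W and preconditioner D,
    # truncated after k_max matrix-vector evaluations
    W = bf_weights(g)
    d = W.sum(axis=1)                      # diagonal of D
    x = np.array(x0, dtype=float)
    r = W @ x - d * x                      # r = -L x
    p, gamma_old = None, None
    for k in range(k_max):
        s = r / d                          # s = D^{-1} r
        gamma = s @ r
        p = s if k == 0 else s + (gamma / gamma_old) * p
        q = d * p - W @ p                  # q = L p
        alpha = gamma / (p @ q)
        x = x + alpha * p
        r = r - alpha * q
        gamma_old = gamma
    return x

def restarted_pcg(x0, k_max, l_max):
    # Algorithm 3: self-guided variant; the guidance (and hence W)
    # is refrozen at the current iterate before each of l_max restarts
    x = np.array(x0, dtype=float)
    for _ in range(l_max):
        x = truncated_pcg(x, x, k_max)
    return x
```

With l_max = 1, restarted_pcg reduces to Algorithm 2 with the noisy input taken as the frozen guidance.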

Algorithm 3 Truncated PCG(k_max) with l_max restarts
Input: x_0, k_max, l_max
Output: x
  x = x_0
  for l = 1, ..., l_max do
    r = W(x)x - D(x)x
    for k = 1, ..., k_max do
      s = D^{-1}(x) r; γ = s^T r
      if k = 1 then
        p = s
      else
        β = γ/γ_old; p = s + βp
      endif
      q = D(x)p - W(x)p; α = γ/(p^T q)
      x = x + αp; r = r - αq; γ_old = γ
    endfor
  endfor

V. NUMERICAL EXPERIMENTS

As a proof of concept, our MATLAB tests use the clean one-dimensional signal x_c of length N = 473 shown in Figure 1. We choose this rather difficult, although one-dimensional, example to better visually illustrate both the denoising and edge-preserving features of the filters. The noisy signal, also displayed in Figure 1, is the same for all tests and given by the formula x = x_c + η, where the Gaussian white noise η has zero mean and variance σ² = 0.01. The bilateral filter is used with σ_d = 1.5 and σ_r = 0.1. The neighborhood width in BF equals 5, so that the band of W consists of 5 diagonals. The guided filter is used with ε = 0.01 and the neighborhood width 3. The matrix W of GF also has 5 diagonals. The CG accelerated BF/GF is called CG-BF/CG-GF. Typical numerical results of the average performance are displayed. The error after denoising is x_c - x, where x stands for the output denoised signal. We calculate the peak signal-to-noise ratio (PSNR) and the signal-to-noise ratio (SNR). The parameters are manually optimized to reach the best possible match of the errors in the compared filters, resulting in indistinguishable error curves in our figures.

The results in Figures 2 and 3 are obtained by the iterated BF and GF filters with the fixed guidance g = x_c and by CG-BF and CG-GF implemented in Algorithm 2 with the same fixed guidance g = x_c. These tests are performed only for comparison purposes, because the guidance x_c seems to be ideal for the best possible denoising results. We say that Algorithm 3 uses l_max × k_max iterations if it executes l_max restarts with k_max evaluations of L(x)x between restarts.
The best denoising performance for our test problem is achieved with the following combinations l_max × k_max of the self-guided CG-BF: 30×3, 17×4, 12×5, 9×6, 7×7, 6×8, 5×9, 4×10, 3×11, and 2×19. The best combinations for the self-guided CG-GF are 10×3, 7×4, 5×5, 4×6, and 3×7. Figures 4 and 5 show the results after 3×11 iterations of CG-BF and 5×5 iterations of CG-GF. The numerical tests demonstrate about a 2-times reduction of iterations for the self-guided bilateral filter and a 3-times reduction of iterations for the guided filter with self-guidance after the conjugate gradient acceleration. It is also interesting to observe that both filters, with properly chosen parameters and iteration numbers, produce almost identical output signals.

Fig. 1. Clean and noisy signals (noisy signal: PSNR = 20.1494, SNR = 13.0753).

Fig. 2. 5 BF iterations versus 2 CG-BF iterations with the guidance x_c (BF: PSNR = 34.3237, SNR = 27.3497; CG: PSNR = 34.2763, SNR = 27.322).

Fig. 3. 9 GF iterations versus 3 CG-GF iterations with the guidance x_c (GF: PSNR = 34.4542, SNR = 27.482; CG: PSNR = 34.3978, SNR = 27.4237).

VI. CONCLUSION

Iterative application of BF and GF, including their nonlinear self-guided variants, can be drastically accelerated by using the CG technology. Our future work concerns developing automated procedures for choosing the optimal number of CG iterations and investigating CG acceleration for 2D signals.

Fig. 4. 60 iterations of the self-guided BF versus 3×11 iterations of CG-BF (BF: PSNR = 32.6646, SNR = 25.696; CG: PSNR = 32.759, SNR = 25.7768).

Fig. 5. 75 iterations of the self-guided GF versus 5×5 iterations of CG-GF (GF: PSNR = 33.832, SNR = 26.29; CG: PSNR = 33.4, SNR = 26.4).
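The PSNR and SNR figures of merit reported above can be computed as follows. This is a generic sketch: the peak value is taken to be the maximum magnitude of the clean signal, which is an assumption, since the paper does not spell out its convention.

```python
import numpy as np

def psnr(clean, output):
    # peak signal-to-noise ratio in dB; peak = max |clean| by assumption
    clean, output = np.asarray(clean, float), np.asarray(output, float)
    mse = np.mean((clean - output) ** 2)
    return 10 * np.log10(np.max(np.abs(clean)) ** 2 / mse)

def snr(clean, output):
    # signal-to-noise ratio in dB: signal power over error power
    clean, output = np.asarray(clean, float), np.asarray(output, float)
    err = clean - output
    return 10 * np.log10(np.sum(clean ** 2) / np.sum(err ** 2))
```

For a clean signal with peak 1 and a uniform error of 0.1, both measures evaluate to 20 dB, which matches the roughly 20 dB level of the noisy input in Figure 1.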

REFERENCES

[1] A. Greenbaum, Iterative Methods for Solving Linear Systems, SIAM, Philadelphia, PA, 1997.
[2] D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega and P. Vandergheynst, "The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains," IEEE Signal Processing Magazine, vol. 30, no. 3, pp. 83–98, 2013.
[3] K. He and J. Sun, "Fast guided filter," Tech. Report, 2015, arXiv:1505.00996v1.
[4] A. Levin, D. Lischinski and Y. Weiss, "A closed-form solution to natural image matting," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 2, pp. 228–242, 2008.
[5] H. Mansour, D. Tian, A. Knyazev and A. Vetro, "Chebyshev and conjugate gradient filters for graph image denoising," in Proc. IEEE International Conference on Multimedia and Expo Workshops (ICMEW), Chengdu, 2014, pp. 1–6.
[6] P. Milanfar, "A tour of modern image filtering: New insights and methods, both practical and theoretical," IEEE Signal Processing Magazine, vol. 30, no. 1, pp. 106–128, 2013.
[7] S. K. Narang, A. Gadde and A. Ortega, "Bilateral filter: Graph spectral interpretation and extensions," in Proc. 20th IEEE International Conference on Image Processing (ICIP), Melbourne, Australia, 2013, pp. 1222–1226.
[8] J. Nocedal and S. J. Wright, Numerical Optimization, 2nd ed., Springer Science+Business Media, 2006.
[9] P. Perona and J. Malik, "Scale-space and edge detection using anisotropic diffusion," in Proc. IEEE Computer Society Workshop on Computer Vision, Miami, FL, 1987, pp. 16–27.
[10] P. Perona and J. Malik, "Scale-space and edge detection using anisotropic diffusion," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 7, pp. 629–639, 1990.
[11] K. He, J. Sun and X. Tang, "Guided image filtering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 6, pp. 1397–1409, 2013.
[12] C. Tomasi and R. Manduchi, "Bilateral filtering for gray and color images," in Proc. IEEE International Conference on Computer Vision, Bombay, 1998, pp. 839–846.
[13] S. Paris, P. Kornprobst, J. Tumblin and F. Durand, "Bilateral filtering: Theory and applications," Foundations and Trends in Computer Graphics and Vision, vol. 4, no. 1, pp. 1–73, 2009.
[14] H. A. van der Vorst, Iterative Krylov Methods for Large Linear Systems, Cambridge University Press, Cambridge, UK, 2003.