Representation and Processing of Big Visual Data: A Data-driven Perspective


1 Representation and Processing of Big Visual Data. National University of Singapore, July 7, 2017

2 Main topics
- Linear representation in Hilbert spaces: wavelet and Gabor systems
- Sparse learning in classification/recognition
- Numerical methods for the related optimization problems
- From sparse coding to convolutional neural networks

3 Development of real applications: problem-specific modeling, sparsity-motivated solutions, methods for the related optimization problems, and the design of application systems.

4 Section I: Linear representation and analysis
Let H denote a Hilbert space with inner product ⟨·,·⟩. Linear representation: consider a system X = {x_n}_{n∈Z} that represents f ∈ H via a coefficient sequence {c[n]}_{n∈Z} ∈ ℓ²(Z): f = Σ_{n∈Z} c[n] x_n. The properties of X determine the basic characteristics of the linear representation, i.e., the relationship between {c[n]} and {⟨f, x_n⟩}: existence, completeness, uniqueness, correspondence. Two types of fundamental systems (span(X) = H):
- Non-redundant: Riesz bases and orthonormal bases
- Redundant: frames and tight frames

5 Basic linear operators in Hilbert spaces
Theorem (adjoint operator). Let H₁, H₂ be two Hilbert spaces, and let T be a bounded linear operator from H₁ to H₂. Then there exists a unique bounded linear operator T*: H₂ → H₁ such that ⟨Tx, y⟩_{H₂} = ⟨x, T*y⟩_{H₁} for all x ∈ H₁, y ∈ H₂, and ‖T*‖ = ‖T‖.
Theorem (self-adjoint operator). An operator S on a Hilbert space H is called self-adjoint if S* = S. Suppose there exist two constants A, B > 0 such that A‖f‖² ≤ ⟨Sf, f⟩ ≤ B‖f‖² for all f ∈ H. Then S is an invertible operator with ‖S‖ ≤ B and ‖S⁻¹‖ ≤ A⁻¹.

6 Operators for linear representation
A sequence X = {x_n}_{n∈Z} is called a Bessel sequence if there exists a constant B > 0 such that Σ_n |⟨f, x_n⟩|² ≤ B‖f‖² for all f ∈ H. For a Bessel sequence X, define
- the synthesis operator T: ℓ²(Z) → H, Tc = Σ_n c[n] x_n;
- the analysis operator T*: H → ℓ²(Z), (T*f)[n] = ⟨f, x_n⟩, n ∈ Z.
T and T* are an adjoint pair with ‖T‖ = ‖T*‖ ≤ √B, and Ker(T*) = (Range(T))^⊥. X is linearly independent if T is injective on ℓ²(Z); X is fundamental (i.e., span(X) = H) if T* is injective.

7 Definitions: Bessel sequences with additional properties
Bessel sequences with Riesz properties:
- A Bessel sequence X is called a Riesz sequence if A Σ_{n∈Z} |c[n]|² ≤ ‖Σ_{n∈Z} c[n] x_n‖² ≤ B Σ_{n∈Z} |c[n]|².
- A Riesz sequence is called an orthonormal sequence if A = B = 1.
- A Riesz (or orthonormal) sequence X is called a Riesz (or orthonormal) basis if span(X) = H.
Bessel sequences with frame properties:
- A Bessel sequence X is called a frame if A‖f‖² ≤ Σ_n |⟨f, x_n⟩|² ≤ B‖f‖² for all f ∈ H; A and B are called the lower and upper frame bounds.
- A frame is called a tight frame if A = B = 1.

8 Two adjoint-composed operators for analysis
Consider a Bessel sequence X = {x_k}_{k∈Z} and define two operators:
- the Gramian operator T*T: ℓ²(Z) → ℓ²(Z);
- the frame operator TT*: H → H.
Both are bounded self-adjoint operators. The Gramian operator T*T analyzes Riesz properties: (T*T)[i, j] = ⟨x_j, x_i⟩. The frame operator TT* analyzes frame properties: (TT*)(f) = Σ_{k∈Z} ⟨f, x_k⟩ x_k.

9 Characterization via operators
Characterization of Riesz properties via the Gramian operator T*T:
- X is ℓ²-independent iff T*T is injective;
- X is a Riesz sequence iff T*T has a bounded inverse;
- X is an orthonormal sequence iff T*T = I.
Characterization of frame properties via the frame operator TT*:
- X is fundamental iff TT* is injective;
- X is a frame iff TT* has a bounded inverse;
- X is a tight frame iff TT* = I.
Connections between Riesz bases and frames: a Riesz basis is an exact frame, and a frame becomes a Riesz basis if it is ℓ²-independent. An orthonormal basis is a tight frame, and a tight frame becomes an orthonormal basis if ‖x_n‖ = 1 for all n.

10 Gramian and dual Gramian
The Gramian and dual Gramian matrices represent T*T and TT*.
Definition (pre-Gramian). For a Bessel sequence X, the pre-Gramian matrix J_X is defined as J_X := (⟨x_n, e_k⟩)_{k,n}, where {e_k}_{k∈Z} is an orthonormal basis of H. J_X is the matrix form of the synthesis operator T under the orthonormal basis {e_k}_{k∈Z}.
Definition (Gramian and dual Gramian). For a Bessel sequence X, the matrix G_X = J_X* J_X is called the Gramian matrix of X, and G̃_X = J_X J_X* is called the dual Gramian matrix.

11 Eigenvalue analysis via Gramian and dual Gramian
The Gramian G_X = J_X* J_X corresponds to T*T for Riesz properties; the dual Gramian G̃_X = J_X J_X* corresponds to TT* for frame properties. Let U denote the unitary operator that maps ℓ²(Z) to H.
Theorem (bridge between the Gramians and the operators). Consider a Bessel sequence X. We have T*T c = G_X c for all c ∈ ℓ²(Z), and U*(TT*)U d = G̃_X d for all d ∈ ℓ²(Z).
Theorem (Riesz properties and frame properties). Consider a Bessel sequence X. Then X is ℓ²-independent iff G_X is injective; a Riesz sequence iff G_X is bounded below; an orthonormal sequence iff G_X = I. X is fundamental iff G̃_X is injective; a frame iff G̃_X is bounded below; a tight frame iff G̃_X = I.

12 Duality principle
Definition (adjoint system). A system Y in a Hilbert space H̃ is called an adjoint system of a Bessel sequence X in H if J_Y = V₁ J_X* V₂, where V₁, V₂ are two unitary operators. X and Y are then connected via G̃_X = V₁ G_Y V₁* and G_X = V₂ G̃_Y V₂*, i.e., up to unitary operators, T_X = T_Y* and T_X* = T_Y.
Theorem (duality principle). Consider a Bessel sequence X ⊂ H and let Y denote its adjoint system in H̃. Then X forms a frame for H iff Y forms a Riesz sequence in H̃, and X forms a tight frame for H iff Y forms an orthonormal sequence in H̃.

13 Canonical dual frame
For a frame X, the frame operator S = TT* is self-adjoint, positive definite and invertible.
Definition (dual frame). For a frame X, a system Y is called a dual frame of X if T_Y T_X* = T_X T_Y* = I, i.e., f = Σ_n ⟨f, x_n⟩ y_n = Σ_n ⟨f, y_n⟩ x_n for all f ∈ H. For a frame X, Y = S⁻¹X is called its canonical dual frame.
Theorem (frame and dual frame). Consider a frame X ⊂ H. Then U G̃_X⁻¹ U* X is the canonical dual frame of X, and U G̃_X^{−1/2} U* X forms a tight frame for H.

14 Example in R²
Tight frame X = {x₁, x₂, x₃} with x₁ = √(2/3) [1, 0], x₂ = √(2/3) [cos 2π/3, sin 2π/3], x₃ = √(2/3) [cos 4π/3, sin 4π/3]. Its frame operator is the identity, TT* = [1 0; 0 1], while the Gramian T*T has 2/3 on the diagonal and −1/3 off the diagonal: X is a tight frame for R² but not an orthonormal basis.
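The frame operator and Gramian of this example can be checked numerically; a small numpy sketch (variable names are illustrative):

```python
import numpy as np

# The three frame vectors from the slide: x_i = sqrt(2/3) [cos(2*pi*i/3), sin(2*pi*i/3)]
X = np.sqrt(2/3) * np.array(
    [[np.cos(2*np.pi*i/3), np.sin(2*np.pi*i/3)] for i in range(3)]
).T                      # columns are x_1, x_2, x_3; shape (2, 3)

TTstar = X @ X.T         # frame operator T T*: the 2x2 identity
G = X.T @ X              # Gramian T* T: 2/3 on the diagonal, -1/3 off it
```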

15 Example in L²[0,1]: X = {√b e^{2πibkt}}_{k∈Z} with b ≤ 1 forms a tight frame
Proof. Consider the orthonormal basis {e_k(t) = e^{2πikt}}_{k∈Z} for L²[0,1]. The pre-Gramian of X is J_X[k, n] = √b ∫₀¹ e^{−2πi(k − bn)t} dt. The system Y = {b^{−1/2} e^{2πikb^{−1}t} 1_{[0,b]}(t)}_{k∈Z} is an adjoint system of X with J_Y[k, n] = (J_X[n, k])*, since the substitution t → bt gives J_Y[k, n] = b^{−1/2} ∫₀^b e^{2πi(nb^{−1} − k)t} dt = √b ∫₀¹ e^{2πi(n − bk)t} dt. Clearly, Y forms an orthonormal sequence in L²[0,1], and thus X forms a tight frame for L²[0,1].

16 References
1. A. Ron, Z. Shen, Gramian analysis of affine bases and affine frames, Approximation Theory VIII: Wavelets and Multilevel Approximation, World Scientific Publishing (1995).
2. A. Ron, Z. Shen, Frames and stable bases for subspaces of L²(R^d): the duality principle of Weyl-Heisenberg sets, Proceedings of the Lanczos Centenary Conference, SIAM (1993).
3. P. G. Casazza, The art of frame theory, arXiv preprint (1999).
4. Z. Fan, H. Ji, Z. Shen, Dual Gramian analysis: duality principle and unitary extension principle, Mathematics of Computation, 85 (2016).
5. Z. Fan, A. Heinecke, Z. Shen, Duality for frames, Journal of Fourier Analysis and Applications, 22(1) (2016).

17 Windowed Fourier transform (Gabor transform)
Motivation: analyzing the local frequency information of signals. Basic idea: localize a function by a window function and run the Fourier transform on it. Consider an even window function g ∈ L²(R) with translation u and modulation η: g_{u,η}(t) = e^{iηt} g(t − u), so that ĝ_{u,η}(ω) = e^{−iu(ω−η)} ĝ(ω − η).
Definition (Gabor transform). Gf(u, η) = ∫_R f(t) g_{u,η}(t)* dt = (1/2π) ∫_R f̂(ω) ĝ_{u,η}(ω)* dω.
Heisenberg uncertainty principle: the time and frequency variances of a window function g satisfy σ_t² σ_ω² ≥ 1/4, with the minimum attained only when g is a Gaussian function, i.e., g = c₀ e^{−(t−u₀)²/(2σ²)}.

18 From the Gabor transform to discrete Gabor systems
A discrete Gabor system for L²(R) is obtained by sampling the Gabor transform on a lattice in the time-frequency plane (u, η): g_{j,k}(t) = g(t − u₀j) e^{i2πkη₀t}, j, k ∈ Z:
- slice the time domain into intervals by translations of a window function with compact support (or with fast decay);
- construct a Fourier system on each interval.
Example (constructing a localized o.n. basis for L²(R)). Define the window function g = 1_{[0,1)} and let {e_k}_{k∈Z} denote an orthonormal basis for L²([0,1]). Then the system {g(t − j) e_k(t − j)}_{j,k∈Z} forms an orthonormal basis for L²(R).

19 Orthonormal bases with Gabor structure
Consider the indicator window function g = 1_{[−u₀/2, u₀/2)}.
- Using the Fourier basis for L²[−u₀/2, u₀/2), e_k(t) = e^{i2πkη₀t} with η₀ = u₀⁻¹, we obtain the blocked Fourier basis for L²(R): {g(t − u₀j) e^{i2πkη₀t}}_{j,k∈Z}.
- Using the Cosine-I basis for L²([−u₀/2, u₀/2)), e_k(t) = λ_k cos(πkη₀t) with λ_k = 1 if k = 0 and √2 otherwise, we obtain the blocked cosine basis for L²(R): {λ_k g(t − u₀j) cos(πη₀kt)}_{j,k∈Z}.
Main weakness of the indicator window function: signals are multiplied by a discontinuous window g, which creates artificial discontinuities.

20 Tight frames with Gabor structure
Theorem (non-existence of smooth localized Fourier bases). Consider a compactly supported non-negative window function g ∈ L²(R) and positive constants u₀, η₀. The system {g(t − u₀j) e^{i2πkη₀t}}_{j,k∈Z} forms an orthonormal basis if and only if u₀η₀ = 1 and g = 1_{[0,u₀)} (up to a translation).
Theorem (tight frame with localized atoms). Consider a window function g with supp(g) = [0, a] and g > 0 on (0, a). Then {g(t − u₀j) e^{iη₀kt}}_{j,k∈Z} forms a tight frame for L²(R) iff (i) η₀ a ≤ 2π; and (ii) Σ_{j∈Z} g(t − u₀j)² = η₀.

21 From discrete systems to digital systems
From a discrete Gabor system in L²(R), a digital system for ℓ²(Z) is constructed by directly sampling the system in L²(R).
Definition (local DFT/DCT). Digital orthonormal bases from the local DFT/DCT:
- DFT: {e_k[n] = N^{−1/2} e^{i2πkn/N}}_{k=0}^{N−1};
- DCT: {e_k[n] = λ_k N^{−1/2} cos(kπ(n + 1/2)/N)}_{k=0}^{N−1}, with λ₀ = 1 and λ_k = √2 for k ≥ 1.
Each forms an orthonormal basis for C^N (resp. R^N). Consider the digital indicator sequence g ∈ ℓ²(Z) defined by g[n] = 1_{[0:N−1]}[n]. Then the system {g_{j,k}: g_{j,k}[n] = g[n − jN] e_k[n − jN], n ∈ Z}_{j∈Z, 0≤k≤N−1} forms an orthonormal basis for ℓ²(Z).
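The orthonormality of the local DCT basis above can be verified directly; a small numpy sketch (the matrix E stacks the basis vectors e_k as rows):

```python
import numpy as np

N = 8
n = np.arange(N)
lam = np.where(n == 0, 1.0, np.sqrt(2.0))     # lambda_0 = 1, lambda_k = sqrt(2)
# DCT basis: e_k[n] = lambda_k / sqrt(N) * cos(k * pi * (n + 1/2) / N)
E = (lam[:, None] / np.sqrt(N)) * np.cos(np.outer(n, n + 0.5) * np.pi / N)
```

Since the rows form an orthonormal basis of R^N, E E^T is the identity matrix.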

22 Digital Gabor tight frames
Consider a digital system X = {g_{j,k}}_{j,k} ⊂ ℓ²(Z) defined by g_{j,k}[n] = g[n − u₀j] e^{i2πη₀kn}, where u₀ is a non-zero integer.
Theorem (digital Gabor tight frame for ℓ²(Z)). Suppose that g ∈ ℓ²(Z) is a non-negative sequence with support [0 : N − 1]. Then the system X is a tight frame for ℓ²(Z) iff (i) η₀ ≤ N⁻¹; and (ii) Σ_j g[n − u₀j]² = η₀.
The same conclusion holds when using the DCT as the local system: g_{j,k}[n] = g[n − u₀j] λ_k cos(πη₀k(n + 1/2)).

23 Examples of smooth window functions
On the integer knot sequence Z, the B-spline of order 1 is defined as B₁ = 1_{[0,1)}, and the B-spline of order m is defined recursively by convolution: B_m(t) = (B_{m−1} ∗ B₁)(t) = ∫₀¹ B_{m−1}(t − x) dx.
We have (i) supp(B_m) = [0, m] and B_m(t) > 0 for t ∈ (0, m); (ii) Σ_{j∈Z} B_m(· − j) = 1 (partition of unity).
Consider the window function g = √(B_m(·/u₀)). Then Σ_j g(t − u₀j)² = Σ_j B_m(t/u₀ − j) = 1.
Remark: the suitably normalized and rescaled B-spline converges uniformly to a Gaussian function as m → ∞.
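A minimal numpy sketch of the B-spline evaluation and the partition-of-unity property, using the equivalent Cox-de Boor recursion B_m(t) = (t·B_{m−1}(t) + (m − t)·B_{m−1}(t − 1))/(m − 1), which agrees with the convolution definition above:

```python
import numpy as np

def bspline(m, t):
    # Cardinal B-spline of order m with support [0, m], via the Cox-de Boor recursion
    t = np.asarray(t, dtype=float)
    if m == 1:
        return ((t >= 0) & (t < 1)).astype(float)
    return (t * bspline(m - 1, t) + (m - t) * bspline(m - 1, t - 1)) / (m - 1)

m = 4
t = np.linspace(0.05, 0.95, 10)
# partition of unity: sum_j B_m(t - j) = 1 on the grid
pou = sum(bspline(m, t - j) for j in range(-m, m + 1))
```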

24 References
1. C. de Boor, B(asic)-spline basics, Mathematics Research Center, University of Wisconsin-Madison (1986).
2. A. Ron, Z. Shen, Weyl-Heisenberg frames and Riesz bases in L²(R^d), Duke Mathematical Journal, 89 (1997).
3. S. Mallat, A wavelet tour of signal processing, Academic Press.
4. P. G. Casazza, G. Kutyniok, M. C. Lammers, Duality principles, localization of frames, and Gabor theory, Vol. 10 (2005).
5. H. Ji, Z. Shen, Y. Zhao, Directional frames for image recovery: discrete Gabor frames, Journal of Fourier Analysis and Applications (2016).

25 Wavelet systems
Non-stationary signals have varying local structures. An affine system with multi-scale and multi-band structure (each sub-band covering a range of frequencies): X = {2^{n/2} ψ₁(2^n · − k), 2^{n/2} ψ₂(2^n · − k), ..., 2^{n/2} ψ_L(2^n · − k)}_{n,k∈Z}.
Definition (wavelet tight frame). An affine system X = {2^{n/2} ψ_l(2^n · − k)}_{l∈[1:L], n,k∈Z} ⊂ L²(R) is called a wavelet tight frame if X is a tight frame for L²(R), i.e., f = Σ_{l,n,k} ⟨f, 2^{n/2} ψ_l(2^n · − k)⟩ 2^{n/2} ψ_l(2^n · − k) for all f ∈ L²(R).

26 Multi-resolution analysis (MRA)
Definition (multi-resolution analysis). A sequence {V_n}_{n∈Z} of closed subspaces forms an MRA for L²(R) if (i) V_n ⊂ V_{n+1}; (ii) the closure of ∪_n V_n is L²(R); (iii) ∩_n V_n = {0}.
A function φ ∈ L²(R) with φ̂(0) = 1 is called the scaling function of the MRA if V_n = span{2^{n/2} φ(2^n · − k)}_{k∈Z}, which gives the refinability property φ = √2 Σ_k a₀[k] φ(2· − k).

27 MRA-based wavelet tight frames
A block decomposition in the scale-time plane: V₁ = V₀ ⊕ W₀; V₂ = V₁ ⊕ W₁ = V₀ ⊕ W₀ ⊕ W₁; .... Then ⊕_{n∈Z} W_n = L²(R) and W_i ⊥ W_j for i ≠ j.
Definition (MRA-based wavelet tight frame). The system X = {2^{n/2} ψ_l(2^n · − k)}_{l∈[1:L], n,k∈Z} ⊂ L²(R) is an MRA-based tight frame if X forms a tight frame for L²(R) and ψ_l = √2 Σ_k a_l[k] φ(2· − k), 1 ≤ l ≤ L.
The set of sequences A = {a₀, a₁, ..., a_L} ⊂ ℓ²(Z) is called the filter bank of an MRA-based wavelet tight frame, where a₀ is a low-pass filter and the others are high-pass filters.

28 Unitary extension principle (UEP)
Theorem (UEP). Consider an MRA with scaling function φ. The affine system X = {2^{n/2} ψ_l(2^n · − k)}_{l∈[1:L], n,k∈Z} ⊂ L²(R) forms a tight frame for L²(R) if the filter bank A = {a₀, a₁, ..., a_L} ⊂ ℓ²(Z) satisfies, for all m ∈ Z,
Σ_{l=0}^{L} Σ_{n∈Z} a_l[2n] a_l[2n + m] = δ_m and Σ_{l=0}^{L} Σ_{n∈Z} a_l[2n + 1] a_l[2n + 1 + m] = δ_m. (1)
Conversely, a finitely supported filter a₀ with Σ_n a₀[n] = √2 admits a compactly supported refinable function φ ∈ L²(R) of an MRA.
Theorem (filter bank and continuum space). A finitely supported filter bank satisfying Σ_n a₀[n] = √2 and (1) admits an MRA-based compactly supported wavelet tight frame for L²(R).

29 Examples
Example (Daubechies DB4 wavelets). Filter bank:
a₀ = √2 [0.1629, 0.5055, 0.4461, −0.0198, −0.1323, 0.0218, 0.0233, −0.0075]
a₁ = √2 [0.0075, 0.0233, −0.0218, −0.1323, 0.0198, 0.4461, −0.5055, 0.1629]
Example (linear B-spline framelet). Filter bank: a₀ = (1/4)[1, 2, 1]; a₁ = (√2/4)[−1, 0, 1]; a₂ = (1/4)[−1, 2, −1]. (Figure: plots of φ, ψ₁ and ψ₂.)
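The UEP conditions of slide 28 can be checked numerically for the linear B-spline framelet filter bank; in the sketch below the filters are rescaled by √2 so that Σ_n a₀[n] = √2, the normalization under which the sums in (1) equal δ_m:

```python
import numpy as np

# Linear B-spline framelet filters, rescaled by sqrt(2)
s2 = np.sqrt(2.0)
filters = [s2 * np.array([1, 2, 1]) / 4,
           s2 * np.array([-1, 0, 1]) * s2 / 4,
           s2 * np.array([-1, 2, -1]) / 4]

def uep_sum(filters, parity, m):
    # sum_l sum_{n in parity + 2Z} a_l[n] a_l[n + m], filters supported on [0 : T-1]
    total = 0.0
    for a in filters:
        T = len(a)
        for n in range(parity, T, 2):
            if 0 <= n + m < T:
                total += a[n] * a[n + m]
    return total
```

Both the even-indexed (parity 0) and odd-indexed (parity 1) sums equal 1 at m = 0 and vanish for all other shifts.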

30 Basic operators in a filter bank
Definition (discrete convolution). For a signal f ∈ ℓ²(Z) and a filter a ∈ ℓ²(Z), the discrete convolution of f and a is defined by (f ∗ a)[k] = Σ_{j∈Z} f[j] a[k − j], k ∈ Z.
Definition (down/up-sampling). The down-sampling operator with sampling rate p, denoted ↓p, is defined as (f↓p)[n] = f[pn], n ∈ Z. Its adjoint operator, the up-sampling operator ↑p, is defined by (g↑p)[pn] = g[n], n ∈ Z, with all other entries of g↑p being zero.
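The adjoint relation between ↓p and ↑p, i.e., ⟨f↓p, g⟩ = ⟨f, g↑p⟩, can be verified directly; a small numpy sketch on finite vectors:

```python
import numpy as np

def down(f, p):
    # (f ↓ p)[n] = f[pn]
    return f[::p]

def up(g, p):
    # (g ↑ p)[pn] = g[n], all other entries zero
    out = np.zeros(len(g) * p)
    out[::p] = g
    return out

rng = np.random.default_rng(1)
f, g = rng.standard_normal(16), rng.standard_normal(8)
lhs = np.dot(down(f, 2), g)   # <f↓2, g>
rhs = np.dot(f, up(g, 2))     # <f, g↑2>
```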

31 Cascade algorithm for dyadic wavelets
Consider f ∈ L²(R) and define the quasi-interpolant g_n = Σ_k ⟨f, φ_{n,k}⟩ φ_{n,k}, where φ_{n,k} = 2^{n/2} φ(2^n · − k) and ψ_{l,n,k} = 2^{n/2} ψ_l(2^n · − k), n = 0, 1, ....
Define the coefficient sequences w_{0,n}[k] = ⟨g, φ_{n,k}⟩ and w_{l,n}[k] = ⟨g, ψ_{l,n,k}⟩, 1 ≤ l ≤ L.
The unitary extension principle yields the multi-scale cascade algorithm:
- decomposition: w_{l,n}[k] = Σ_{m∈Z} a_l[m − 2k] w_{0,n−1}[m], l = 0, ..., L;
- reconstruction: w_{0,n−1}[k] = Σ_{l=0}^{L} Σ_{m∈Z} a_l[k − 2m] w_{l,n}[m], l = 0, ..., L.
(Diagram: analysis with filters a_l(−·) followed by ↓2; synthesis with ↑2 followed by filters a_l; ∗ denotes convolution, ↓2 down-sampling, ↑2 up-sampling.)
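A minimal sketch of one decomposition/reconstruction level with the linear B-spline framelet filter bank of slide 29 (periodic boundary extension; with the Σ_n a₀[n] = 1 normalization in which those filters are printed, the synthesis step carries an extra factor of 2):

```python
import numpy as np

# Linear B-spline framelet filter bank, sum(a0) = 1 normalization
filters = [np.array([1, 2, 1]) / 4,
           np.array([-1, 0, 1]) * np.sqrt(2) / 4,
           np.array([-1, 2, -1]) / 4]

def decompose(f, filters):
    # w_l[k] = sum_m a_l[m - 2k] f[m], periodic boundary extension
    N = len(f)
    return [np.array([sum(a[i] * f[(i + 2*k) % N] for i in range(len(a)))
                      for k in range(N // 2)]) for a in filters]

def reconstruct(ws, filters, N):
    # f[k] = 2 * sum_l sum_m a_l[k - 2m] w_l[m]
    f = np.zeros(N)
    for a, w in zip(filters, ws):
        for k in range(N):
            for m in range(N // 2):
                i = (k - 2*m) % N
                if i < len(a):
                    f[k] += a[i] * w[m]
    return 2 * f   # the factor 2 compensates the sum(a0) = 1 normalization

rng = np.random.default_rng(0)
f = rng.standard_normal(16)
f_rec = reconstruct(decompose(f, filters), filters, len(f))
```

Perfect reconstruction holds: f_rec coincides with f up to floating-point error.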

32 An adjoint system for a filter bank frame
Consider the system X = {a_l[· − 2k]}_{l∈[0:L], k∈Z}. The pre-Gramian matrix of X is J_X = (a_l[n − 2k])_{n∈Z, (l,k)∈[0:L]×Z}. Its adjoint system is Y = {y_j}_{j∈Z}, with y_j = (a_l[n])_{l∈[0:L], n∈j+2Z}. The dual Gramian matrix G̃_X is G̃_X[n, n′] = Σ_{l=0}^{L} Σ_{k∈Z} a_l[n − 2k] a_l[n′ − 2k].
X is a tight frame iff G̃_X = I, i.e., for m ∈ Z and j ∈ [0 : 1], Σ_{l=0}^{L} Σ_{n∈j+2Z} a_l[n] a_l[n + m] = δ_m.

33 Digital multi-scale wavelet tight frames
Example (single-level wavelet tight frame for ℓ²(Z)). Define the system X = {X₀, X₁, ..., X_L} ⊂ ℓ²(Z) with X_l = {a_l(· − 2k)}_{k∈Z}. Then the system X forms a tight frame for ℓ²(Z).
Example (N-level wavelet tight frame for ℓ²(Z)). Define the system X = {{X_{l,1}}_{l=1}^{L}, {X_{l,2}}_{l=1}^{L}, ..., {X_{l,N}}_{l=1}^{L}, X_{0,N}} ⊂ ℓ²(Z), where X_{l,n} denotes the sub-system of the l-th sub-band channel at scale 2^n: X_{l,n} = {a_{l,n}(· − 2^n k)}_{k∈Z}, l = 0, ..., L, n = 1, ..., N, with a_{l,n} = a_{0,n−1} ∗ (a_l ↑ 2^{n−1}) and a_{0,0} = δ.

34 Analysis operator and synthesis operator
Consider the periodic boundary extension for finite convolution: (h ∗ f)[k] = Σ_j h[j] f[(k − j) mod N], f, h ∈ C^N.
Define a basic operator W_a: C^{2N} → C^N by W_a f = (a(−·) ∗ f)↓2; its adjoint is W_a* g = a ∗ (g↑2).
Analysis operator W and synthesis operator W*: W = [W_{a₀}; W_{a₁}; ...; W_{a_L}] (stacked), W* = [W_{a₀}*, W_{a₁}*, ..., W_{a_L}*]. We have W*W = I_{2N×2N}, while WW* ≠ I_{(L+1)N×(L+1)N}.

35 Tensor product approach for multi-dimensional data
Tensor product of Hilbert spaces: ⟨f₁ ⊗ g₁, f₂ ⊗ g₂⟩ = ⟨f₁, f₂⟩⟨g₁, g₂⟩.
Consider a scaling function φ and one wavelet function ψ. Then V_{n+1} ⊗ V_{n+1} = (V_n ⊗ V_n) ⊕ W_n^{(0,1)} ⊕ W_n^{(1,0)} ⊕ W_n^{(1,1)}, where
- LH: W_n^{(0,1)} = span{2^n φ(2^n x − k₁) ψ(2^n y − k₂): k₁, k₂ ∈ Z};
- HL: W_n^{(1,0)} = span{2^n ψ(2^n x − k₁) φ(2^n y − k₂): k₁, k₂ ∈ Z};
- HH: W_n^{(1,1)} = span{2^n ψ(2^n x − k₁) ψ(2^n y − k₂): k₁, k₂ ∈ Z}.
(Diagram: a discrete decomposition/reconstruction scheme, filtering f[n₁, n₂] along the rows with a₀[−n₁], a₁[−n₁] and ↓2, then along the columns with a₀[−n₂], a₁[−n₂] and ↓2, producing the LL, LH, HL and HH sub-bands.)
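The separable row/column scheme can be illustrated with the orthonormal Haar filters (a stand-in for the filter bank in the diagram); since the transform is orthonormal, the energies of the four sub-bands sum to the energy of the image:

```python
import numpy as np

def haar_1d(x):
    # one level of the orthonormal Haar transform along axis 0
    s = (x[0::2] + x[1::2]) / np.sqrt(2)   # low-pass + downsample
    d = (x[0::2] - x[1::2]) / np.sqrt(2)   # high-pass + downsample
    return s, d

rng = np.random.default_rng(2)
img = rng.standard_normal((8, 8))
L, H = haar_1d(img)                              # transform along the first axis
LL, LH = haar_1d(L.T)                            # then along the second axis
HL, HH = haar_1d(H.T)
subbands = [LL, LH, HL, HH]
```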

36 Other variations and extensions
- Complex-valued wavelet systems: the only real-valued symmetric/anti-symmetric wavelet orthonormal bases are the Haar wavelets; complex-valued symmetric/anti-symmetric wavelet orthonormal bases can be smooth.
- Translation invariance for better processing with fewer artifacts: un-decimated discrete wavelet systems and frames/tight frames without the down-sampling (up-sampling) step.
- Larger sampling rates for computational efficiency: p-dilation wavelet systems whose down-sampling (up-sampling) rate is p ≥ 2.
- Non-separable wavelet systems for multi-dimensional signals, as opposed to a separable function g(x, y) = g_x(x) g_y(y).

37 Connection between Gabor frames and MRA-based wavelet tight frames
Consider a discrete Gabor system X = {g[n − 2j] e^{i2π(k/L)n}} and its digital Gabor filter bank {g_l}_{l=0}^{L} with g_l[n] = g[n] e^{i2πln/L}. The analysis operator can be implemented as a single-level wavelet filter bank, up to a diagonal matrix of phase shifts: f ↦ [(g₀ ∗ f)↓2, (g₁ ∗ f)↓2, ..., (g_L ∗ f)↓2].
Question: can MRA and affine structure be introduced to Gabor systems?
Theorem. The filter bank {g_l[n] = g[n] e^{i2πln/L}}_{l=0}^{L} satisfies the UEP if and only if g = L⁻¹ [1, 1, ..., 1].
Remark: there is no MRA-based wavelet tight frame generated by a digital Gabor filter bank with a smooth window function.

38 Wavelet systems for multi-dimensional data
Multi-dimensional signals, e.g. images, have geometrical regularities: discontinuities occur only along the normal direction of edges. Tensor-product real-valued wavelets only have two orientations.
(Figures: original image and its local edges; 2D filters of linear spline frames; 2D filters of the DCT.)

39 Multi-dimensional systems with strong orientation selectivity
Multi-dimensional systems for gaining more orientation selectivity:
- non-separable real-valued systems and related constructions: ridgelets, curvelets, shearlets and many others;
- tensor products of complex-valued systems: the dual-tree complex wavelet transform and discrete Gabor bi-frames.
(Figures: real and imaginary parts of Gabor filters.)

40 References
1. S. Mallat, A wavelet tour of signal processing, Academic Press.
2. I. Daubechies, Ten lectures on wavelets, Vol. 61, Society for Industrial and Applied Mathematics, Philadelphia.
3. Z. Shen, Wavelet frames and image restorations, Proceedings of ICM, Vol. IV, Hyderabad, India (2010).
4. B. Dong, Z. Shen, MRA-based wavelet frames and applications, IAS/Park City Mathematics Series: The Mathematics of Image Processing, Vol. 19 (2010).
5. B. Dong, Z. Shen, Image restoration: a data-driven perspective, Proceedings of ICIAM, Beijing, China (2015).
6. Z. Fan, H. Ji, Z. Shen, Dual Gramian analysis: duality principle and unitary extension principle, Mathematics of Computation, 85 (2016).
7. H. Ji, Z. Shen, Y. Zhao, Digital Gabor filters do generate MRA-based wavelet tight frames, preprint (2016).

41 Sparse representation of signals
Definition of sparsity and compressibility:
- A vector is called sparse if most entries are zero or close to zero.
- A class of signals is called compressible if there exists a system X such that the linear expansion of any signal in this class is sparse.
When the system X is a redundant (tight) frame, the linear expansion is no longer unique. Consider a signal f ∈ H and a redundant frame X = {x_k}_{k∈Z}.
- Synthesis sparse model: there exists a sparse coefficient sequence c such that f = Σ_k c[k] x_k.
- Analysis sparse model: the canonical coefficient sequence {⟨f, x_k⟩}_{k∈Z} is sparse.

42 Linear inverse problems
Consider a linear system Af = b + η, where A ∈ R^{M×N} is the measurement matrix, f ∈ R^N the unknown signal, b ∈ R^M the available observation, and η ∈ R^M the measurement noise. The matrix A is often ill-conditioned or non-invertible:
- image in-painting: a non-invertible down-sampling matrix;
- image deconvolution: an ill-conditioned convolution matrix;
- compressed sensing: a random matrix with N ≫ M;
- tomography: a non-invertible partial Radon transform.
A straightforward inversion will magnify the noise η in the solution.

43 Sparse solutions of ill-posed inverse problems
Consider an ill-posed or non-invertible linear system Af = b + η, and suppose there exists a system W such that the signal f is compressible.
- Synthesis model: A W*c = b + η, where most entries of c are zero. Suppose the support Ω of c is known by an oracle, and define c_Ω[k] = c[i_k], i_k ∈ Ω. Then the system A W_Ω* c_Ω = b + η is well-posed if the cardinality |Ω| is sufficiently small.
- Analysis model: most entries of Wf are zero. Suppose the support of Wf is given by an oracle, and let Ω^c denote its complement. Then the stacked system [A; W_{Ω^c}] f = [b; 0] + [η; ε] is well-posed if the cardinality of Ω^c is sufficiently large.

44 Sparsity-based convex regularization
Use the ℓ₁-norm as the convex relaxation of the sparsity-prompting ℓ₀-norm.
- Synthesis sparse model: x = W*c, c := argmin_c ½‖A(W*c) − b‖₂² + λ‖c‖₁.
- Analysis sparse model: x := argmin_x ½‖Ax − b‖₂² + λ‖Wx‖₁.
- Balanced sparse model: x = W*c, c := argmin_c ½‖A(W*c) − b‖₂² + (µ/2)‖(I − WW*)c‖₂² + λ‖c‖₁.
When µ = 0, the balanced model becomes the synthesis model; when µ → +∞, it becomes the analysis model.
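These convex models can be solved by proximal methods; a minimal ISTA (iterative soft-thresholding) sketch for the special case W = I, i.e., min_x ½‖Ax − b‖₂² + λ‖x‖₁, with illustrative problem data:

```python
import numpy as np

def soft(x, t):
    # soft-thresholding: proximal operator of t * ||.||_1
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(A, b, lam, iters=500):
    # minimizes 0.5 * ||A x - b||^2 + lam * ||x||_1
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = soft(x - (A.T @ (A @ x - b)) / L, lam / L)
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((30, 60)) / np.sqrt(30)
x_true = np.zeros(60); x_true[[3, 17, 42]] = [2.0, -1.5, 1.0]
b = A @ x_true
x_hat = ista(A, b, lam=1e-3)
```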

45 Mutual coherence and exact recovery
Definition (mutual coherence). The mutual coherence of a system X = {x_l}_{l=1}^{N} with normalized atoms is defined as µ(X) = max_{i≠j} |⟨x_i, x_j⟩|. A vector x ∈ R^N is called s-sparse if at most s entries are non-zero.
Theorem (sufficient condition). Consider a full-rank matrix X ∈ R^{d×N} with normalized columns. Any s-sparse vector ȳ ∈ R^N satisfying s ≤ ½(1 + µ(X)⁻¹) is the unique solution of min ‖y‖₁ s.t. Xy = Xȳ.
Theorem (boundedness of mutual coherence). For a d × N redundant system X, µ(X) ≥ ((N − d)/(d(N − 1)))^{1/2}.
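Both the mutual coherence and the lower bound in the second theorem (the Welch bound) are easy to compute; a small numpy sketch with an illustrative random dictionary:

```python
import numpy as np

def mutual_coherence(X):
    # X: d x N dictionary with unit-norm columns
    G = np.abs(X.T @ X)
    np.fill_diagonal(G, 0.0)
    return G.max()

rng = np.random.default_rng(4)
d, N = 16, 64
X = rng.standard_normal((d, N))
X /= np.linalg.norm(X, axis=0)                 # normalize the atoms
mu = mutual_coherence(X)
welch = np.sqrt((N - d) / (d * (N - 1)))       # lower bound from the theorem
```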

46 Exact recovery in the noise-free case
Definition (null space property). A matrix X is said to satisfy the null space property of order s if there exists a constant 0 ≤ β < 1 such that for any h ∈ ker(X) and any index set Ω of size |Ω| ≤ s, ‖h_Ω‖₁ ≤ β‖h_{Ω^c}‖₁.
Theorem (sufficient and necessary condition). Consider a matrix X ∈ R^{M×N}. Every s-sparse vector ȳ is the unique solution of min ‖y‖₁ s.t. Xy = Xȳ if and only if X satisfies the null space property of order s.
The mutual-coherence-based condition is easy to check but very strong; the null space property is much weaker but difficult to check.

47 Demonstration
Image deconvolution (figures: original image, blurry image, de-convolved image). Image in-painting (figures: original image, damaged image, in-painted image).

48 Real applications
Blind image deconvolution. Model: g = p ∗ f + η, a bi-linear problem with unknowns p and f (figures: motion-blurred image and recovered image).
Electron microscopy. Model: g_k = A_k f + η, a linear problem with a very low signal-to-noise ratio (figures: multiple EM images and the reconstructed 3D structure of a virus).

49 References
1. B. Dong, Z. Shen, MRA-based wavelet frames and applications, IAS/Park City Mathematics Series: The Mathematics of Image Processing, Vol. 19 (2010).
2. J. Cai, S. Osher, Z. Shen, Split Bregman methods and frame based image restoration, SIAM Multiscale Modeling and Simulation, 8(2) (2009).
3. J. Cai, H. Ji, C. Liu, Z. Shen, Blind motion deblurring from a single image using sparse approximation, IEEE CVPR, Miami (2009).
4. M. Li, Z. Fan, H. Ji, Z. Shen, Wavelet frame based algorithm for 3D reconstruction in electron microscopy, SIAM Journal on Scientific Computing, 36(1), B45-B69 (2014).
5. H. Ji, K. Wang, Robust image deconvolution with an inaccurate blur kernel, IEEE Transactions on Image Processing, 21(4) (2012).
6. B. Dong, H. Ji, J. Li, Z. Shen, Y. Xu, Wavelet frame based blind image inpainting, Applied and Computational Harmonic Analysis, 32(2) (2012).
7. H. Ji, Y. Luo, Z. Shen, Image recovery via geometrically structured approximation, Applied and Computational Harmonic Analysis, to appear (2016).

50 Dictionary learning
For a sequence f ∈ ℓ²(Z), partition it into finitely supported blocks (with possible overlap): {f_n = f|_{Ω_n}}_{n∈Z} ⊂ R^T, Ω_n = [nP + 1, nP + T] ∩ Z.
The K-SVD method by Aharon et al. learns a dictionary D ∈ R^{T×L} which sparsifies {f_n}, by solving min_{{‖D_l‖=1}_{l=1}^{L}, {c_n}_n} Σ_n ½‖f_n − Dc_n‖₂² + λ‖c_n‖₀.
Suppose that (i) span{D₁, ..., D_L} = R^T; and (ii) A‖f‖₂² ≤ Σ_n ‖f_n‖₂² ≤ B‖f‖₂². Then the system {D_{l,n}}_{l,n} forms a frame for ℓ²(Z), where D_{l,n} denotes the translate of D_l to the n-th block. (Remark: condition (i) is not guaranteed by the model.)

51 Dictionary-based representation and Gabor/wavelet systems
Given an index partition Ω_n = [nP, nP + T − 1] and f_n = f|_{Ω_n}:
- For a Gabor system with window function g, step size u₀ = P and supp(g) = [0 : T − 1], the entries of its analysis operator are ⟨f, D_{l,n}⟩ = ⟨f_n, D_l⟩, where D_l[k] = g[k] e^{i2πblk}.
- For a single-level wavelet system with dilation factor p = P and supp(a_l) ⊂ [0 : T − 1], the entries of its analysis operator are ⟨f, D_{l,n}⟩ = ⟨f_n, D_l⟩, where D_l[k] = a_l[k].
In terms of the analysis operator, Gabor and wavelet systems are exactly dictionary-based representations with a predefined dictionary: either Gabor atoms or wavelet filters.

52 Data-driven tight frames
Dictionary-based frames (existing dictionary learning techniques): no closed-form linear expansion, and the completeness of D in R^T is difficult to guarantee.
Data-driven tight frames: a fast linear expansion f = Σ_k ⟨f, x_k⟩ x_k, and the UEP on D generates tight frames with multi-scale structure.
A general data-driven tight frame model: min_{D, {c_n}} Σ_n ½‖f_n − Dc_n‖₂² + λ‖c_n‖₀, where the filter bank D satisfies the UEP. The wavelet system generated by D forms a wavelet tight frame as long as {f_n}_n is a uniform (overlapped) partition of f.

53 A simple and efficient construction
Consider an un-decimated discrete wavelet tight frame, i.e., X := {X_n}_{n∈Z}, where X_n = [a₀[· − n], ..., a_{L−1}[· − n]]. The UEP for such a system simplifies to Σ_{l=0}^{L−1} Σ_n a_l[n + k] a_l[n] = δ_k.
Consider a finitely supported filter bank {a₀, ..., a_{L−1}} with supp(a_l) ⊂ [0 : T − 1], and define a dictionary D ∈ R^{T×L} by D_l = a_{l−1}[0 : T − 1], l = 1, ..., L.
Theorem (local and global). The filter bank {a₀, ..., a_{L−1}} generates an un-decimated wavelet tight frame for ℓ²(Z), provided that {T^{−1/2} D_l} forms a tight frame for R^T, i.e., DD* = T⁻¹ I.

54 Variational model for dictionary learning
The sparsity-based model for data-driven tight frames: min_{D, {c_k}} Σ_k ‖f_k − Dc_k‖₂² + µ‖c_k‖₀, s.t. DD* = T⁻¹ I, which, after rescaling D by √T, is equivalent to the real-valued dictionary learning model min_{D, C} ‖Y − DC‖_F² + λ‖C‖₀, s.t. DD* = I, where Y = √T [..., f₋₁, f₀, f₁, ...] and C = [..., c₋₁, c₀, c₁, ...].
The optimization problem is a challenging non-convex problem. When D is an over-complete tight frame, the subproblem of computing the sparse code C under D is an NP-hard problem.

55 Fast methods for orthogonal dictionary learning
A further simplification: consider a square matrix D ∈ R^{T×T} with D*D = DD* = I, which gives min_{D*D=I, C} ‖C − D*Y‖_F² + λ‖C‖₀.
Alternating iteration scheme: for k = 0, 1, 2, ...,
P1: C_{k+1} := argmin_C ‖C − D_k*Y‖_F² + λ‖C‖₀;
P2: D_{k+1} := argmin_{D*D=I} ‖Y − DC_{k+1}‖_F².
Each step of the iteration has a closed-form solution:
P1: C_{k+1} := Γ_λ(D_k*Y);
P2: D_{k+1} := UV*,
where Γ_µ denotes the hard-thresholding operator (Γ_µ(x) = x if |x| > µ and 0 otherwise), and U, V denote the orthogonal matrices of the SVD of Y C_{k+1}* = UΣV*.
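A minimal numpy sketch of the alternating scheme (the threshold √λ solves the entrywise problem min_c (c − v)² + λ·1_{c≠0}); since D is square orthogonal, both steps minimize the same objective, so it decreases monotonically:

```python
import numpy as np

rng = np.random.default_rng(5)
T, M, lam = 8, 100, 0.05
Y = rng.standard_normal((T, M))

def hard(X, tau):
    # hard-thresholding Gamma_tau: keep entries with |x| > tau
    return np.where(np.abs(X) > tau, X, 0.0)

D = np.linalg.qr(rng.standard_normal((T, T)))[0]   # orthogonal initialization
tau = np.sqrt(lam)
obj = []
for _ in range(15):
    C = hard(D.T @ Y, tau)                 # P1: closed-form sparse code
    U, _, Vt = np.linalg.svd(Y @ C.T)      # P2: orthogonal Procrustes
    D = U @ Vt
    obj.append(np.linalg.norm(C - D.T @ Y) ** 2 + lam * np.count_nonzero(C))
```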

56 Demonstration of data-driven filter banks
(Figures: images and the corresponding learned filter banks.)

57 References
1. K. Kreutz-Delgado, J. Murray, B. Rao, K. Engan, T. Lee, T. Sejnowski, Dictionary learning algorithms for sparse representation, Neural Computation, 15(2) (2003).
2. J. Mairal, F. Bach, J. Ponce, G. Sapiro, A. Zisserman, Supervised dictionary learning, NIPS (2008).
3. M. Aharon, M. Elad, A. Bruckstein, K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation, IEEE Transactions on Signal Processing, 54(11) (2006).
4. M. Elad, M. Aharon, Image denoising via sparse and redundant representations over learned dictionaries, IEEE Transactions on Image Processing, 15(12) (2006).
5. J. Cai, H. Ji, Z. Shen, G. Ye, Data-driven tight frame construction and image denoising, Applied and Computational Harmonic Analysis, 37(1) (2014).
6. Y. Quan, H. Ji, Z. Shen, Data-driven multi-scale non-local wavelet frame construction and image recovery, Journal of Scientific Computing, 63(2) (2015).
7. C. Bao, H. Ji, Z. Shen, Convergence analysis for iterative data-driven tight frame construction scheme, Applied and Computational Harmonic Analysis (2015).

58 Structured dictionary learning
Consider a data matrix Y ∈ R^{N×M} whose columns represent input features. Dictionary learning is about determining a dictionary D ∈ R^{N×K} under which the linear expansion {c_n} of Y_n, Y_n ≈ Σ_{k=1}^{K} D_k c_n[k], has desired properties. A general variational model: ½‖Y − DC‖_F² + λΨ(C) + Φ(D), where
- Ψ is a functional on the code C for desired properties, e.g. sparse approximation ‖·‖₀, or label (L) consistency of a linear classifier V: min_V ‖L − VC‖_F²;
- Φ is a functional on the dictionary for desired structures, e.g. normalization constraints Φ(·) = δ_{‖D_k‖=1}(·), or orthogonality (tight frame) constraints Φ(·) = δ_{D*D=I}(·), where δ_𝒟(D) = 0 if D ∈ 𝒟 and +∞ otherwise.

59 A list of dictionary learning models
1. K-SVD for general sparse coding: ½‖Y − DC‖_F² + λ‖C‖₀, s.t. ‖D_k‖₂ = 1, k ∈ [1 : K].
2. Orthogonal dictionary (tight frame): ½‖Y − DC‖_F² + λ‖C‖₀, s.t. D*D = I.
3. Discriminative dictionary learning for recognition: ½‖Y − DC‖_F² + (α/2)‖L − WC‖_F² + λ‖C‖₀ + µ‖W‖_F², s.t. ‖D_k‖₂ = 1, k ∈ [1 : K], where the matrix L denotes the class labels of the training samples and W denotes the linear classifier.

60 In-coherent dictionary learning for sparse coding
Recall that for an input feature y, finding a sparse vector c such that y = Dc is NP-hard if D is an over-complete dictionary. Consider two representative methods:
- orthogonal matching pursuit (OMP) for solving min ‖c‖₀ s.t. y = Dc;
- the ℓ₁-norm related convex relaxation: min ‖c‖₁ s.t. y = Dc.
Both exactly recover an s-sparse c provided the mutual coherence µ(D) = max_{i≠j} |⟨D_i, D_j⟩| is sufficiently small; for example, µ(D) ≤ (2s − 1)⁻¹ for OMP.
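A minimal OMP sketch (greedy atom selection followed by a least-squares refit on the selected atoms); the sanity check uses an orthonormal dictionary, where exact recovery is guaranteed:

```python
import numpy as np

def omp(D, y, s):
    # orthogonal matching pursuit: s greedy steps for y ≈ D c
    residual, support = y.copy(), []
    for _ in range(s):
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        c_sub, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ c_sub
    c = np.zeros(D.shape[1])
    c[support] = c_sub
    return c

rng = np.random.default_rng(6)
D = np.linalg.qr(rng.standard_normal((16, 16)))[0]   # orthonormal dictionary
c_true = np.zeros(16); c_true[[2, 7, 11]] = [3.0, -2.0, 1.5]
y = D @ c_true
c_hat = omp(D, y, s=3)
```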

61 Model for in-coherent dictionary learning
Theorem (boundedness of mutual coherence). Consider an over-complete normalized dictionary D ∈ R^{N×K} in R^N. We have µ(D) ≥ ((K − N)/(N(K − 1)))^{1/2}.
Remark: the lower bound on the mutual coherence is achieved when the dictionary is an equiangular frame for R^N satisfying |⟨D_i, D_j⟩| = const for i ≠ j. An equiangular frame is a tight frame after normalization.
As the uniform norm max_{i≠j} |⟨D_i, D_j⟩| is not differentiable, we relax it by a squared Frobenius norm, which leads to the model min_{‖D_l‖=1, C} ‖Y − DC‖_F² + λΨ(C) + µ‖D*D − I‖_F².

62 Demonstration of dictionary learning for sparse coding
Visualization of the mutual coherence of dictionaries: K-SVD vs. in-coherent dictionary learning (IDL).
Table: classification accuracies (%) on face and object recognition, comparing K-SVD and IDL on the Extended YaleB (face), AR Face (face) and Caltech (object) datasets.

63 Kernel methods for classification
Capturing non-linear patterns in data: non-linear regression and non-linear classification. Kernels make linear models work in non-linear settings by mapping the data to a higher dimension where it exhibits linear patterns. (Figure: Euclidean space vs. kernel space.)

64 Feature mapping
If the data set is not linearly separable, adding new features may make it linearly separable; any m + 1 points in R^m in general position are linearly separable. (Figure: mapping x_k ↦ z_k = (x_k, x_k²) separates the classes.)
Consider a data set {x_k} ⊂ R^m. Map it to a larger feature space via a so-called feature mapping function, e.g. φ(x) = [x₁, x₁², sin(x₂), e^{x₂+x₃}, tan(x₄), ...].

65 Feature mapping and kernels
1. The data set is often mapped to a very high (possibly infinite) dimensional space. How can this be computed?
2. Many classifiers use only inner products, e.g. the perceptron and the SVM. For two feature vectors x_i, x_j, let κ(x_i, x_j) = ⟨φ(x_i), φ(x_j)⟩. Can we just evaluate κ(x_i, x_j) without explicitly calling φ?
Definition (kernel). A kernel K: X × X → R has an associated feature mapping Φ: X → H such that K(x, z) = ⟨Φ(x), Φ(z)⟩, where X is the input space and H is a Hilbert feature space with inner product ⟨·,·⟩.

66 Mercer's theorem

Conditions on K:
1. K(·,·) ∈ L2(X × X) is symmetric: K(x, y) = K(y, x).
2. K is positive semi-definite: Σ_i Σ_j c_i c_j K(x_i, x_j) ≥ 0 for all finite {x_i} ⊂ X and real {c_i}.
Define the associated Hilbert-Schmidt integral operator on L2(X):
    [T_K φ](x) = ∫_X K(x, s) φ(s) ds.

Theorem (Mercer). Suppose K is continuous, symmetric and positive semi-definite. Then there exists an orthonormal basis {e_i} of L2(X) such that each e_i is an eigenfunction of T_K with eigenvalue λ_i ≥ 0, and
    K(x, y) = Σ_j λ_j e_j(x) e_j(y).
Examples: the polynomial kernel K(x, z) = (1 + ⟨x, z⟩)^d; the radial basis function K(x, z) = e^{−γ‖x − z‖_2^2}.
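The positive semi-definiteness condition can be sanity-checked numerically for the radial basis function kernel; sizes, γ, and seed below are arbitrary:

```python
import numpy as np

def rbf_gram(X, gamma=0.5):
    """Gram matrix K_ij = exp(-gamma * ||x_i - x_j||^2) for the rows of X."""
    sq = np.sum(X * X, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # pairwise squared distances
    return np.exp(-gamma * np.maximum(d2, 0.0))      # clip tiny negative round-off

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
K = rbf_gram(X)
eigs = np.linalg.eigvalsh(K)
print(eigs.min() > -1e-10)   # all eigenvalues (numerically) non-negative
```

For distinct points the RBF Gram matrix is in fact strictly positive definite, which is why it is such a reliable default kernel.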

67 Equiangular frames in kernel space

Definition. A frame X for a Hilbert space H is called a Grassmannian frame, or equiangular frame, for H if there exists a constant c_0 > 0 such that (i) ‖x_k‖ = 1 for all x_k ∈ X; (ii) |⟨x_i, x_j⟩| = c_0 for all i ≠ j.
An equiangular frame is, up to a constant, a tight frame. Among all finite-dimensional frames of the same cardinality, the equiangular frame has the minimum mutual coherence.

Theorem. Let Φ denote the feature mapping from X to a Hilbert space H with associated kernel K(x, y) = γ(‖x − y‖^2) for some function γ. Then the image of an equiangular frame X in X, denoted Φ(X), is also an equiangular frame of the linear space span{Φ(X)}.
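A concrete equiangular tight frame is the three-vector "Mercedes-Benz" frame in R^2; the short check below verifies both defining properties (this classical example is added for illustration):

```python
import numpy as np

# Mercedes-Benz frame: three unit vectors in R^2 separated by 120 degrees
angles = np.array([np.pi / 2, np.pi / 2 + 2 * np.pi / 3, np.pi / 2 + 4 * np.pi / 3])
X = np.stack([np.cos(angles), np.sin(angles)])     # 2 x 3, columns are frame vectors

G = X.T @ X                                        # Gram matrix
off = np.abs(G[~np.eye(3, dtype=bool)])            # off-diagonal inner products
print(np.allclose(off, 0.5))                       # equiangular: |<x_i, x_j>| = 1/2
print(np.allclose(X @ X.T, 1.5 * np.eye(2)))       # tight: frame operator = (3/2) I
```

Its coherence 1/2 attains the lower bound ((K − N)/(N(K − 1)))^{1/2} = (1/4)^{1/2} for N = 2, K = 3, illustrating the minimality claim above.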

68 Equiangular kernel dictionary learning

A direct imposition of equiangular constraints leads to challenging optimization problems. A simplified approach considers an orthogonal dictionary D ∈ R^{T×T} with DᵀD = I.
Variational model for equiangular kernel dictionary learning:
    min_{DᵀD = I, C}  ‖Φ(Y) − Φ(D)C‖^2 + λψ(C).
Let D* denote its minimizer. Then Φ(D*) forms an equiangular dictionary in H. When solving the optimization problem above, only the kernel K(·,·) needs to be evaluated, e.g. the radial basis function K(x, z) = e^{−γ‖x − z‖^2}.

69 References

1. J. Mairal, F. Bach, J. Ponce, and G. Sapiro, Online learning for matrix factorization and sparse coding, J. Mach. Learn. Res., 11, 19-60, 2010.
2. Q. Zhang and B. Li, Discriminative K-SVD for dictionary learning in face recognition, CVPR, 2010.
3. C. Bao, H. Ji, Y. Quan and Z. Shen, Dictionary learning for sparse coding: algorithms and convergence, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016.
4. Y. Quan, C. Bao and H. Ji, Equiangular kernel dictionary learning with applications to dynamic texture, IEEE CVPR, Las Vegas, 2016.
5. C. Bao, Y. Quan and H. Ji, A convergent incoherent dictionary learning algorithm for sparse coding, ECCV, Zurich, 2014.
6. S. Gao, I. W. Tsang, and L.-T. Chia, Sparse representation with kernels, IEEE Trans. Image Process., 22(2), 2013.

70 Basics of the proximal gradient descent method

Define the proximal mapping of a proper and lower semi-continuous function g:
    prox_{αg}(x) = argmin_y (1/(2α)) ‖y − x‖^2 + g(y),  α > 0.
Examples:
- l1-norm, g(x) = ‖x‖_1 (soft thresholding): prox_{α‖·‖_1}(x) = sign(x) max(|x| − α, 0).
- l0-norm, g(x) = ‖x‖_0 (hard thresholding):
    prox_{α‖·‖_0}(x) = x, if |x| > √(2α); {0, x}, if |x| = √(2α); 0, if |x| < √(2α).
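Both proximal mappings above have closed forms that take two lines of numpy each; the test vector and α values are illustrative:

```python
import numpy as np

def prox_l1(x, alpha):
    """prox of alpha * ||.||_1: componentwise soft thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - alpha, 0.0)

def prox_l0(x, alpha):
    """prox of alpha * ||.||_0: componentwise hard thresholding at sqrt(2*alpha)."""
    return np.where(np.abs(x) > np.sqrt(2.0 * alpha), x, 0.0)

x = np.array([-3.0, -0.4, 0.0, 0.9, 2.5])
soft = prox_l1(x, 1.0)   # shrinks all entries toward 0, kills the small ones
hard = prox_l0(x, 0.5)   # threshold sqrt(2 * 0.5) = 1: keep or kill, no shrinkage
```

The contrast is visible on the same input: soft thresholding biases the surviving entries toward zero, while hard thresholding leaves them untouched.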

71 Proximal gradient descent method

Consider the unconstrained minimization
    min_x f(x) + g(x),
where
- f ∈ C^1 has a Lipschitz continuous gradient with modulus L;
- g is proper and lower semi-continuous.
Example: the synthesis (or balanced) model for sparse recovery, with f(c) = (1/2)‖A(Wc) − b‖_2^2 and g(c) = λ‖c‖_1.
Proximal gradient method (PG): for k = 0, 1, ...,
    x^{k+1} = prox_{αg}(x^k − α∇f(x^k)).
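With the soft-thresholding prox, PG applied to the l1 model (taking W = Id for simplicity) is the classical ISTA iteration; a self-contained sketch on a synthetic problem, with sizes, λ, iteration count, and seed chosen arbitrarily:

```python
import numpy as np

def ista(A, b, lam, alpha, iters=500):
    """Proximal gradient (ISTA) for min_x 0.5*||Ax - b||^2 + lam*||x||_1."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        z = x - alpha * (A.T @ (A @ x - b))                    # gradient step on f
        x = np.sign(z) * np.maximum(np.abs(z) - alpha * lam, 0.0)  # prox step on g
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((60, 100))
x_true = np.zeros(100)
x_true[[5, 37]] = [1.0, -2.0]
b = A @ x_true
L = np.linalg.norm(A, 2) ** 2        # Lipschitz modulus of the gradient of f
x_hat = ista(A, b, lam=0.05, alpha=1.0 / L)
```

With step size 1/L each iteration is a monotone descent step on the composite objective, which the test below relies on.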

72 Acceleration and convergence

Accelerated proximal gradient method (APG): for k = 0, 1, ...,
    t^k = (1 + √(1 + 4(t^{k−1})^2)) / 2,
    y^k = x^k + ((t^{k−1} − 1)/t^k)(x^k − x^{k−1}),
    x^{k+1} = prox_{αg}(y^k − α∇f(y^k)).

Theorem (convergence behavior of PG and APG). Let h* = min h(x), where h = f + g, let x* be any minimizer, and let 0 < α < 1/L. Then:
- the proximal gradient method satisfies h(x^k) − h* ≤ ‖x^0 − x*‖^2 / (2αk);
- the accelerated proximal gradient method satisfies h(x^k) − h* ≤ 2‖x^0 − x*‖^2 / (α(k + 1)^2).
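Specialized to the same l1 problem, the APG recursion above is FISTA; a sketch on the same kind of synthetic setup (all concrete numbers are illustrative):

```python
import numpy as np

def fista(A, b, lam, alpha, iters=300):
    """Accelerated proximal gradient (FISTA) for min_x 0.5*||Ax - b||^2 + lam*||x||_1."""
    x = x_prev = np.zeros(A.shape[1])
    t = 1.0
    for _ in range(iters):
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x + ((t - 1.0) / t_next) * (x - x_prev)       # extrapolation (momentum) step
        z = y - alpha * (A.T @ (A @ y - b))               # gradient step at y
        x_prev, x = x, np.sign(z) * np.maximum(np.abs(z) - alpha * lam, 0.0)
        t = t_next
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((60, 100))
x_true = np.zeros(100)
x_true[[5, 37]] = [1.0, -2.0]
b = A @ x_true
alpha = 1.0 / np.linalg.norm(A, 2) ** 2
x_hat = fista(A, b, lam=0.05, alpha=alpha)
obj = 0.5 * np.linalg.norm(A @ x_hat - b) ** 2 + 0.05 * np.abs(x_hat).sum()
```

The O(1/k^2) bound in the theorem guarantees the objective gap after 300 iterations is already tiny here, at the cost of losing the per-step monotonicity of plain PG.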

73 Alternating direction method of multipliers

Consider a convex model with linear constraints:
    min_{x,y} f(x) + h(y),  s.t.  Ax + By = b.
Example: analysis model based sparse recovery,
    min_{x,y} (1/2)‖Ax − b‖_2^2 + λ‖y‖_1,  s.t.  Wx − y = 0.
Define the augmented Lagrangian
    L_σ(x, y; z) = f(x) + h(y) + ⟨z, Ax + By − b⟩ + (σ/2)‖Ax + By − b‖_2^2.
Alternating direction method of multipliers (ADMM):
    x^{k+1} ∈ argmin_x L_σ(x, y^k; z^k),
    y^{k+1} ∈ argmin_y L_σ(x^{k+1}, y; z^k),
    z^{k+1} = z^k + σ(Ax^{k+1} + By^{k+1} − b).

74 Convergence of ADMM

Assumptions on the functions f and h:
1. The extended-real-valued functions f and h are closed, proper and convex.
2. The unaugmented Lagrangian L_0 has a saddle point.
3. Both sub-problems in ADMM have bounded solutions.

Theorem (convergence of ADMM). Suppose assumptions 1-3 hold. Let (x*, y*; z*) be a saddle point of L_0, let p* be the optimal value, and let (x^k, y^k; z^k) be the infinite sequence generated by ADMM. Then we have:
- residual convergence: Ax^k + By^k − b → 0 as k → ∞;
- objective convergence: f(x^k) + h(y^k) → p* as k → ∞;
- dual variable convergence: z^k → z* as k → ∞.
Convergence of the primal variables can be obtained under additional assumptions, e.g. A surjective and B = Id.

75 Multi-block iterations for non-convex problems

Consider the unconstrained non-convex minimization
    min_{x = (x_1, ..., x_n)}  F(x) = f(x) + Σ_{i=1}^n g_i(x_i),  (2)
where
- inf F > −∞, inf f > −∞ and inf g_i > −∞ for all i;
- f ∈ C^1 and ∇f is Lipschitz continuous on any bounded set;
- for each block x_i, ∇_i f is L_i-Lipschitz continuous;
- g_i, i = 1, ..., n, are proper and lower semi-continuous.
Example: multi-linear problems with (non-)convex regularization.
- Blind deconvolution: f(x, h) = (1/2)‖y − x ∗ h‖^2, g_1(x) = ‖W_1 x‖_1, g_2(h) = ‖W_2 h‖_1 + µ‖h‖_2^2.
- Dictionary learning: f(D, C) = (1/2)‖Y − DC‖^2, g_1(C) = ‖C‖_0, g_2(D) = δ_{‖D_k‖ = 1}(D).

76 Kurdyka-Łojasiewicz property

Define
    X_η(x̄) = X ∩ {x : f(x̄) < f(x) < f(x̄) + η},
    Φ_η = {φ ∈ C^1(0, η) ∩ C[0, η) : φ(0) = 0, φ is concave, φ' > 0 on (0, η)}.

Definition (Kurdyka-Łojasiewicz (KL) property). Let f be a proper and lower semi-continuous function. The function is said to have the KL property at x̄ if there exist η > 0 and φ ∈ Φ_η such that
    φ'(f(x) − f(x̄)) · dist(0, ∂f(x)) ≥ 1
holds for all x ∈ X_η(x̄). If f satisfies the KL property at each point of dom ∂f = {x ∈ R^d : ∂f(x) ≠ ∅}, then f is called a KL function.

Example (typical KL functions). Semi-algebraic or real-analytic functions: polynomial functions, norm functions, the TV norm, the l0 norm, the rank function, the logistic regression function, etc.

77 Hybrid proximal alternating method

Recall the problem
    min_{x = (x_1, ..., x_n)}  F(x) = f(x) + Σ_{i=1}^n g_i(x_i).
Define f_i^k(x_i) = f(x_1^k, ..., x_{i−1}^k, x_i, x_{i+1}^{k−1}, ..., x_n^{k−1}).
Hybrid proximal alternating iteration: for k = 0, 1, ...,
    for i = 1, ..., n:
        x_i^{k+1} = prox_{λ_i^k (f_i^k + g_i)}(x_i^k)   or   x_i^{k+1} = prox_{λ_i^k g_i}(x_i^k − λ_i^k ∇f_i^k(x_i^k)).

Theorem (global convergence). Let F be as defined in (2) and let {x^k = (x_1^k, ..., x_n^k)} be the infinite sequence generated by the hybrid proximal alternating method. If F is a KL function and {x^k} is bounded, then {x^k} converges to a critical point x* of F, i.e. 0 ∈ ∂F(x*).

78 Demonstration in dictionary learning

The performance depends on the block partition, the initialization, and the updating order and scheme. Consider a classic dictionary learning problem:
    min_{D,C} (1/2)‖Y − DC‖_F^2 + λ‖C‖_0,  s.t.  ‖C‖_∞ ≤ M, ‖d_j‖_2 = 1 for all j.
Choice of block partition:
- C1: (x_1, ..., x_n) = (c_1, ..., c_q, d_1, ..., d_q);
- C2: (x_1, ..., x_n) = (C, d_1, ..., d_q).
Choice of update scheme and order:
- S1 Proximal alternating: C1 + proximal method.
- S2 Proximal linearized alternating: C1/C2 + proximal linearized method.
- S3 Hybrid method with acceleration: C2 + proximal method, with an updating order that re-updates the coefficient c_i right after updating the corresponding dictionary atom d_i.
All sub-problems in S1-S3 have closed-form solutions.
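A simplified S2-style (proximal linearized alternating) sketch of this problem is below; it illustrates the block structure only — it is not the authors' accelerated S3 scheme, it drops the bound ‖C‖_∞ ≤ M, and its step sizes follow the 1/L_i rule:

```python
import numpy as np

def dict_learn(Y, q, lam=0.1, iters=50, seed=0):
    """Proximal linearized alternating sketch for
    min 0.5*||Y - DC||_F^2 + lam*||C||_0, s.t. unit-norm atoms."""
    rng = np.random.default_rng(seed)
    m, _ = Y.shape
    D = rng.standard_normal((m, q))
    D /= np.linalg.norm(D, axis=0)
    C = np.zeros((q, Y.shape[1]))
    for _ in range(iters):
        # C-block: gradient step on the fit term, then hard thresholding (prox of l0)
        tC = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-12)
        G = C - tC * (D.T @ (D @ C - Y))
        C = np.where(np.abs(G) > np.sqrt(2.0 * lam * tC), G, 0.0)
        # D-block: gradient step, then project each atom back onto the unit sphere
        tD = 1.0 / (np.linalg.norm(C, 2) ** 2 + 1e-12)
        D = D - tD * ((D @ C - Y) @ C.T)
        norms = np.linalg.norm(D, axis=0)
        norms[norms < 1e-12] = 1.0
        D /= norms
    return D, C

rng = np.random.default_rng(1)
Y = rng.standard_normal((16, 200))
D, C = dict_learn(Y, q=32)
fit0 = 0.5 * np.linalg.norm(Y) ** 2                                   # objective at C = 0
fit = 0.5 * np.linalg.norm(Y - D @ C) ** 2 + 0.1 * np.count_nonzero(C)
```

Because each block step uses a step size strictly below 1/L_i, the objective is non-increasing along the iterations, which matches the descent behavior the convergence theory requires.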

79 References

1. J. Bolte, et al., Proximal alternating linearized minimization for nonconvex and nonsmooth problems, Math. Program., 136.
2. H. Attouch, et al., Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods, Math. Program., 137(1-2), 2013.
3. H. Attouch, et al., Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka-Łojasiewicz inequality, Mathematics of Operations Research, 35(2), 2010.
4. Y. Xu and W. Yin, A block coordinate descent method for multi-convex optimization with applications to nonnegative tensor factorization and completion, SIAM J. Imaging Sci., 6(3), 2013.
5. C. Bao, et al., Dictionary learning for sparse coding: algorithms and convergence, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016.
6. J.-F. Cai, et al., Split Bregman methods and frame based image restoration, Multiscale Modeling and Simulation, 8(2), 2009.
7. Z. Shen, et al., An accelerated proximal gradient algorithm for frame-based image restoration via the balanced approach, SIAM Journal on Imaging Sciences, 4(2), 2011.


More information

Frames and operator representations of frames

Frames and operator representations of frames Frames and operator representations of frames Ole Christensen Joint work with Marzieh Hasannasab HATA DTU DTU Compute, Technical University of Denmark HATA: Harmonic Analysis - Theory and Applications

More information

Wavelets: Theory and Applications. Somdatt Sharma

Wavelets: Theory and Applications. Somdatt Sharma Wavelets: Theory and Applications Somdatt Sharma Department of Mathematics, Central University of Jammu, Jammu and Kashmir, India Email:somdattjammu@gmail.com Contents I 1 Representation of Functions 2

More information

5742 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 12, DECEMBER /$ IEEE

5742 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 12, DECEMBER /$ IEEE 5742 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 12, DECEMBER 2009 Uncertainty Relations for Shift-Invariant Analog Signals Yonina C. Eldar, Senior Member, IEEE Abstract The past several years

More information

Recent Advances in Structured Sparse Models

Recent Advances in Structured Sparse Models Recent Advances in Structured Sparse Models Julien Mairal Willow group - INRIA - ENS - Paris 21 September 2010 LEAR seminar At Grenoble, September 21 st, 2010 Julien Mairal Recent Advances in Structured

More information

EXAMPLES OF REFINABLE COMPONENTWISE POLYNOMIALS

EXAMPLES OF REFINABLE COMPONENTWISE POLYNOMIALS EXAMPLES OF REFINABLE COMPONENTWISE POLYNOMIALS NING BI, BIN HAN, AND ZUOWEI SHEN Abstract. This short note presents four examples of compactly supported symmetric refinable componentwise polynomial functions:

More information

Symmetric Wavelet Tight Frames with Two Generators

Symmetric Wavelet Tight Frames with Two Generators Symmetric Wavelet Tight Frames with Two Generators Ivan W. Selesnick Electrical and Computer Engineering Polytechnic University 6 Metrotech Center, Brooklyn, NY 11201, USA tel: 718 260-3416, fax: 718 260-3906

More information

PAIRS OF DUAL PERIODIC FRAMES

PAIRS OF DUAL PERIODIC FRAMES PAIRS OF DUAL PERIODIC FRAMES OLE CHRISTENSEN AND SAY SONG GOH Abstract. The time-frequency analysis of a signal is often performed via a series expansion arising from well-localized building blocks. Typically,

More information

RKHS, Mercer s theorem, Unbounded domains, Frames and Wavelets Class 22, 2004 Tomaso Poggio and Sayan Mukherjee

RKHS, Mercer s theorem, Unbounded domains, Frames and Wavelets Class 22, 2004 Tomaso Poggio and Sayan Mukherjee RKHS, Mercer s theorem, Unbounded domains, Frames and Wavelets 9.520 Class 22, 2004 Tomaso Poggio and Sayan Mukherjee About this class Goal To introduce an alternate perspective of RKHS via integral operators

More information

1 Sparsity and l 1 relaxation

1 Sparsity and l 1 relaxation 6.883 Learning with Combinatorial Structure Note for Lecture 2 Author: Chiyuan Zhang Sparsity and l relaxation Last time we talked about sparsity and characterized when an l relaxation could recover the

More information

Preconditioning techniques in frame theory and probabilistic frames

Preconditioning techniques in frame theory and probabilistic frames Preconditioning techniques in frame theory and probabilistic frames Department of Mathematics & Norbert Wiener Center University of Maryland, College Park AMS Short Course on Finite Frame Theory: A Complete

More information

Contents. Acknowledgments

Contents. Acknowledgments Table of Preface Acknowledgments Notation page xii xx xxi 1 Signals and systems 1 1.1 Continuous and discrete signals 1 1.2 Unit step and nascent delta functions 4 1.3 Relationship between complex exponentials

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression

More information

Machine Learning. Support Vector Machines. Manfred Huber

Machine Learning. Support Vector Machines. Manfred Huber Machine Learning Support Vector Machines Manfred Huber 2015 1 Support Vector Machines Both logistic regression and linear discriminant analysis learn a linear discriminant function to separate the data

More information

EE613 Machine Learning for Engineers. Kernel methods Support Vector Machines. jean-marc odobez 2015

EE613 Machine Learning for Engineers. Kernel methods Support Vector Machines. jean-marc odobez 2015 EE613 Machine Learning for Engineers Kernel methods Support Vector Machines jean-marc odobez 2015 overview Kernel methods introductions and main elements defining kernels Kernelization of k-nn, K-Means,

More information

Recovering overcomplete sparse representations from structured sensing

Recovering overcomplete sparse representations from structured sensing Recovering overcomplete sparse representations from structured sensing Deanna Needell Claremont McKenna College Feb. 2015 Support: Alfred P. Sloan Foundation and NSF CAREER #1348721. Joint work with Felix

More information

An Introduction to Filterbank Frames

An Introduction to Filterbank Frames An Introduction to Filterbank Frames Brody Dylan Johnson St. Louis University October 19, 2010 Brody Dylan Johnson (St. Louis University) An Introduction to Filterbank Frames October 19, 2010 1 / 34 Overview

More information

EUSIPCO

EUSIPCO EUSIPCO 013 1569746769 SUBSET PURSUIT FOR ANALYSIS DICTIONARY LEARNING Ye Zhang 1,, Haolong Wang 1, Tenglong Yu 1, Wenwu Wang 1 Department of Electronic and Information Engineering, Nanchang University,

More information

EE 367 / CS 448I Computational Imaging and Display Notes: Image Deconvolution (lecture 6)

EE 367 / CS 448I Computational Imaging and Display Notes: Image Deconvolution (lecture 6) EE 367 / CS 448I Computational Imaging and Display Notes: Image Deconvolution (lecture 6) Gordon Wetzstein gordon.wetzstein@stanford.edu This document serves as a supplement to the material discussed in

More information

Sparse Optimization Lecture: Dual Methods, Part I

Sparse Optimization Lecture: Dual Methods, Part I Sparse Optimization Lecture: Dual Methods, Part I Instructor: Wotao Yin July 2013 online discussions on piazza.com Those who complete this lecture will know dual (sub)gradient iteration augmented l 1 iteration

More information

A New Look at First Order Methods Lifting the Lipschitz Gradient Continuity Restriction

A New Look at First Order Methods Lifting the Lipschitz Gradient Continuity Restriction A New Look at First Order Methods Lifting the Lipschitz Gradient Continuity Restriction Marc Teboulle School of Mathematical Sciences Tel Aviv University Joint work with H. Bauschke and J. Bolte Optimization

More information

Topics in Harmonic Analysis, Sparse Representations, and Data Analysis

Topics in Harmonic Analysis, Sparse Representations, and Data Analysis Topics in Harmonic Analysis, Sparse Representations, and Data Analysis Norbert Wiener Center Department of Mathematics University of Maryland, College Park http://www.norbertwiener.umd.edu Thesis Defense,

More information

A Fast Augmented Lagrangian Algorithm for Learning Low-Rank Matrices

A Fast Augmented Lagrangian Algorithm for Learning Low-Rank Matrices A Fast Augmented Lagrangian Algorithm for Learning Low-Rank Matrices Ryota Tomioka 1, Taiji Suzuki 1, Masashi Sugiyama 2, Hisashi Kashima 1 1 The University of Tokyo 2 Tokyo Institute of Technology 2010-06-22

More information

Proximal Minimization by Incremental Surrogate Optimization (MISO)

Proximal Minimization by Incremental Surrogate Optimization (MISO) Proximal Minimization by Incremental Surrogate Optimization (MISO) (and a few variants) Julien Mairal Inria, Grenoble ICCOPT, Tokyo, 2016 Julien Mairal, Inria MISO 1/26 Motivation: large-scale machine

More information

A BRIEF INTRODUCTION TO HILBERT SPACE FRAME THEORY AND ITS APPLICATIONS AMS SHORT COURSE: JOINT MATHEMATICS MEETINGS SAN ANTONIO, 2015 PETER G. CASAZZA Abstract. This is a short introduction to Hilbert

More information

Variable Metric Forward-Backward Algorithm

Variable Metric Forward-Backward Algorithm Variable Metric Forward-Backward Algorithm 1/37 Variable Metric Forward-Backward Algorithm for minimizing the sum of a differentiable function and a convex function E. Chouzenoux in collaboration with

More information

Kernel Methods. Machine Learning A W VO

Kernel Methods. Machine Learning A W VO Kernel Methods Machine Learning A 708.063 07W VO Outline 1. Dual representation 2. The kernel concept 3. Properties of kernels 4. Examples of kernel machines Kernel PCA Support vector regression (Relevance

More information

Wavelet Bi-frames with Uniform Symmetry for Curve Multiresolution Processing

Wavelet Bi-frames with Uniform Symmetry for Curve Multiresolution Processing Wavelet Bi-frames with Uniform Symmetry for Curve Multiresolution Processing Qingtang Jiang Abstract This paper is about the construction of univariate wavelet bi-frames with each framelet being symmetric.

More information

A primer on the theory of frames

A primer on the theory of frames A primer on the theory of frames Jordy van Velthoven Abstract This report aims to give an overview of frame theory in order to gain insight in the use of the frame framework as a unifying layer in the

More information