Regularization in Reproducing Kernel Banach Spaces
Guohui Song
School of Mathematical and Statistical Sciences, Arizona State University
Comp Math Seminar, September 16, 2010
Joint work with Dr. Fred Hickernell (Illinois Institute of Technology) and Dr. Haizhang Zhang (Sun Yat-Sen University)
Outline
1. Scattered Data Approximation
2. Reproducing Kernel Hilbert Spaces
3. Reproducing Kernel Banach Spaces
Scattered Data Approximation

Setting: Given data $\{(x_j, y_j) : j = 1, 2, \ldots, n\} \subset \mathbb{R}^d \times \mathbb{R}$, find a function $Pf$ that is a good fit to the given data.

Question 1: What is a good fit?
A 1-D Example

[Figure: scattered data points]
A Regularization Approach

We want to control
- the closeness to the given data,
- the complexity of the function.

A regularization approach:
- Target function: $L(f, y, H) := \sum_{j=1}^{n} (f(x_j) - y_j)^2 + \lambda \|f\|_H^2$
- $Pf := \arg\min_{f \in H} L(f, y, H)$

Some fancy names: penalized least squares, ridge regression, smoothing splines.

Question 2: What is the hypothesis space $H$?
Reproducing Kernel Hilbert Spaces (RKHS)

We need
- $H$ to be a Hilbert space,
- $\|f_n - f\|_H \to 0 \implies f_n(x) \to f(x)$ for all $x \in X$.

RKHS: a Hilbert space $H$ on which every point evaluation functional is continuous:
$$|f(x)| \le M_x \|f\|_H \quad \text{for all } x \in X.$$

Question 3: Where is the kernel?
Kernel

Suppose $X$ is a subset of $\mathbb{R}^d$ and $K$ is a real-valued function on $X \times X$:
$$K : X \times X \to \mathbb{R}.$$

$K$ is a kernel if for any positive integer $m$ and any finite set $\{x_1, \ldots, x_m\} \subseteq X$, the kernel Gram matrix
$$K_m := [K(x_j, x_l) : 1 \le j, l \le m]$$
is symmetric and positive semi-definite.
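For concreteness, here is a minimal Python sketch (not part of the original slides) that builds the Gram matrix on a few arbitrary points for the exponential kernel $e^{-|s-t|}$ and checks symmetry and positive semi-definiteness numerically; the kernel choice and point set are assumptions for illustration only.

```python
import numpy as np

def exp_kernel(s, t):
    """Exponential kernel K(s, t) = exp(-|s - t|) (an example choice)."""
    return np.exp(-np.abs(s - t))

x = np.array([0.0, 0.3, 0.7, 1.0])            # sample points x_1, ..., x_m
K_m = exp_kernel(x[:, None], x[None, :])      # Gram matrix [K(x_j, x_l)]

print(np.allclose(K_m, K_m.T))                # symmetric
print(np.linalg.eigvalsh(K_m))                # eigenvalues >= 0 (up to rounding)
```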
Connections Between RKHS and Kernels

[Aronszajn, 1950] There is a bijective correspondence between RKHSs and kernels such that
- $K(\cdot, x) \in H$ for any $x \in X$,
- $f(x) = (f(\cdot), K(\cdot, x))_H$ for any $f \in H$.

Some properties of the RKHS $H_K$ and the kernel $K$:
- $H_0 := \operatorname{span}\{K(\cdot, x) : x \in X\}$ is dense in $H_K$.
- For any $f = \sum_{j=1}^{m} c_j K(\cdot, x_j) \in H_0$, $\|f\|_{H_K} = \|K_m^{1/2} c\|_2$.
Some Examples of RKHS and Kernels

- Sobolev space $H^2(\mathbb{R})$: $K(s, t) = \frac{\sqrt{3}}{3}\, e^{-\frac{\sqrt{3}}{2}|s-t|} \sin\!\left(\frac{|s-t|}{2} + \frac{\pi}{6}\right)$.
- $C^0$ Matérn kernel: $K(s, t) = e^{-|s-t|}$.
- Gaussian kernel: $K(s, t) = e^{-w(s-t)^2}$, $w > 0$.
- Sinc kernel: $K(s, t) = \operatorname{sinc}(s - t)$.
- Polynomial kernels: $K(s, t) = (st)^d$, $d = 1, 2, \ldots$.
- $L_2(\mathbb{R})$ is NOT an RKHS.
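As a small illustration (not from the slides), a few of the kernels listed above written as Python functions; note that numpy's `sinc` is the normalized $\sin(\pi x)/(\pi x)$, which may or may not match the slide's convention.

```python
import numpy as np

def matern_c0(s, t):
    """C^0 Matern (exponential) kernel K(s, t) = exp(-|s - t|)."""
    return np.exp(-np.abs(s - t))

def gaussian(s, t, w=1.0):
    """Gaussian kernel K(s, t) = exp(-w (s - t)^2), w > 0."""
    return np.exp(-w * (s - t) ** 2)

def sinc_kernel(s, t):
    """Sinc kernel K(s, t) = sinc(s - t); numpy uses sin(pi x)/(pi x)."""
    return np.sinc(s - t)

def polynomial(s, t, d=2):
    """Polynomial kernel K(s, t) = (s t)^d."""
    return (s * t) ** d
```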
Regularization in the RKHS $H_K$

[Kimeldorf and Wahba, 1971] Representer Theorem:
- Target function: $L(f, y, H_K) = \sum_{j=1}^{n} (f(x_j) - y_j)^2 + \lambda \|f\|_{H_K}^2$.
- Let $S_n := \operatorname{span}\{K(\cdot, x_j) : j = 1, 2, \ldots, n\}$. The optimization problem reduces to a finite-dimensional one:
$$\min_{f \in H_K} L(f, y, H_K) = \min_{f \in S_n} L(f, y, H_K).$$
- The minimizer is given explicitly by $Pf = \sum_{j=1}^{n} \alpha_j K(\cdot, x_j)$, where $\alpha = (K_n + \lambda I_n)^{-1} y$.
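A minimal sketch of this closed-form solution in Python, assuming a Gaussian kernel and made-up 1-D data (both are illustrative choices, not from the slides):

```python
import numpy as np

def kernel(s, t, w=10.0):
    """Gaussian kernel (assumed choice) K(s, t) = exp(-w (s - t)^2)."""
    return np.exp(-w * (s - t) ** 2)

x = np.linspace(0.0, 1.0, 8)                                  # data sites x_1, ..., x_n
y = np.sin(2 * np.pi * x) + 0.1 * np.random.randn(x.size)     # noisy samples
lam = 1e-2                                                    # regularization parameter lambda

K_n = kernel(x[:, None], x[None, :])                          # Gram matrix on the data sites
alpha = np.linalg.solve(K_n + lam * np.eye(x.size), y)        # alpha = (K_n + lam I_n)^{-1} y

def Pf(t):
    """Evaluate the regularized fit Pf(t) = sum_j alpha_j K(t, x_j)."""
    return kernel(np.atleast_1d(t)[:, None], x[None, :]) @ alpha

print(Pf(np.array([0.25, 0.5, 0.75])))
```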
Reproducing Kernel Banach Spaces

We try to construct a Banach space $B$ on which every point evaluation functional $\delta_x$ is continuous.

A specific construction:
- Let $B_0 := \operatorname{span}\{K(\cdot, x) : x \in X\}$.
- For any $f = \sum_{j=1}^{m} c_j K(\cdot, x_j) \in B_0$, define $\|f\|_B := \|c\|_1$.
- $\delta_x$ is continuous on $B_0$ if $K(\cdot, \cdot)$ is uniformly bounded.
- Let $B$ be the Banach completion of $B_0$ with respect to the norm $\|\cdot\|_B$.
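A small numerical sketch (kernel and points are made up) contrasting the RKBS norm $\|f\|_B = \|c\|_1$ with the RKHS norm $\|f\|_{H_K} = \|K_m^{1/2} c\|_2$ of the same element of $B_0$:

```python
import numpy as np

x = np.array([0.1, 0.4, 0.9])                     # centers x_1, ..., x_m (assumed)
c = np.array([1.0, -2.0, 0.5])                    # coefficients of f = sum_j c_j K(., x_j)
K_m = np.exp(-np.abs(x[:, None] - x[None, :]))    # exponential-kernel Gram matrix

norm_B = np.linalg.norm(c, 1)                     # ||f||_B = ||c||_1
norm_HK = np.sqrt(c @ K_m @ c)                    # ||f||_{H_K} = ||K_m^{1/2} c||_2
print(norm_B, norm_HK)
```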
Some Properties of RKBS

[Song2010+] The point evaluation functional is continuous on $B$ if and only if
$$\sum_{j=1}^{\infty} \alpha_j K(\cdot, x_j) = 0 \implies \alpha = 0.$$

[Song2010+] The reproducing property still holds:
- Define a bilinear form $\langle \cdot, \cdot \rangle$ on $B_0 \times B_0$ such that
$$\Big\langle \sum_{j=1}^{m} \alpha_j K(\cdot, x_j), \sum_{j=1}^{m} \beta_j K(\cdot, x_j) \Big\rangle = \alpha^T K_m \beta.$$
- The bilinear form $\langle \cdot, \cdot \rangle$ can be extended to $B \times B$ such that
$$\langle f, K(\cdot, x) \rangle = f(x), \quad x \in X,\ f \in B.$$
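A quick numerical check of the reproducing property at the data sites (an illustration with an assumed kernel and points, not from the slides): for $f = \sum_j \alpha_j K(\cdot, x_j)$ one has $\langle f, K(\cdot, x_l) \rangle = (K_m \alpha)_l = f(x_l)$.

```python
import numpy as np

kernel = lambda s, t: np.exp(-np.abs(s - t))      # exponential kernel (assumed)
x = np.array([0.2, 0.5, 0.8])
alpha = np.array([1.0, -1.0, 2.0])

K_m = kernel(x[:, None], x[None, :])

def f(t):
    """Evaluate f = sum_j alpha_j K(., x_j) at the points t."""
    return kernel(np.atleast_1d(t)[:, None], x[None, :]) @ alpha

print(K_m @ alpha)        # <f, K(., x_l)> for each node x_l
print(f(x))               # f(x_l) -- the two vectors agree
```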
Regularization in RKBS

Target function: $L(f, y, B) = \sum_{j=1}^{n} (f(x_j) - y_j)^2 + \lambda \|f\|_B$.

Recall $S_n = \operatorname{span}\{K(\cdot, x_j) : j = 1, 2, \ldots, n\}$.

Does the optimization problem reduce to a finite-dimensional one?
$$\min_{f \in B} L(f, y, B) \stackrel{?}{=} \min_{f \in S_n} L(f, y, B)$$

If it does, how do we find the minimizer $Pf = \sum_{j=1}^{n} \alpha_j K(\cdot, x_j)$?
Regularization and Interpolation

Define the interpolation space $I_n(y) := \{f \in B : f(x_j) = y_j,\ j = 1, 2, \ldots, n\}$.

[Song2010+] The following two statements are equivalent:
- $\min_{f \in B} L(f, y, B) = \min_{f \in S_n} L(f, y, B)$ for all $y \in \mathbb{R}^n$.
- $\min_{f \in I_n(y)} \|f\|_B = \min_{f \in I_n(y) \cap S_n} \|f\|_B$ for all $y \in \mathbb{R}^n$.

Note that $I_n(y) \cap S_n$ has only one element when $K_n$ is invertible. We only need to show that the minimal norm interpolation problem admits a minimizer in the finite-dimensional space $S_n$.
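A sketch of the last remark (kernel and data are assumptions for illustration): when $K_n$ is invertible, the unique element of $I_n(y) \cap S_n$ is $f = \sum_j c_j K(\cdot, x_j)$ with $c = K_n^{-1} y$.

```python
import numpy as np

kernel = lambda s, t: np.exp(-np.abs(s - t))      # exponential kernel (assumed)
x = np.array([0.0, 0.5, 1.0])                     # data sites
y = np.array([1.0, 0.2, -0.4])                    # data values

K_n = kernel(x[:, None], x[None, :])
c = np.linalg.solve(K_n, y)                       # coefficients of the unique interpolant in S_n

interp = lambda t: kernel(np.atleast_1d(t)[:, None], x[None, :]) @ c
print(interp(x))                                  # reproduces y at the data sites
```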
Representer Theorem in RKBS

Let $k(x) := (K(x, x_1), \ldots, K(x, x_n))^T$.

[Song2010+] Minimal norm interpolation:
$$\min_{f \in I_n(y)} \|f\|_B = \min_{f \in I_n(y) \cap S_n} \|f\|_B \text{ for all } y \in \mathbb{R}^n \iff \|K_n^{-1} k(x)\|_1 \le 1 \text{ for all } x \in X.$$

[Song2010+] Regularization:
$$\min_{f \in B} L(f, y, B) = \min_{f \in S_n} L(f, y, B) \text{ for all } y \in \mathbb{R}^n \iff \|K_n^{-1} k(x)\|_1 \le 1 \text{ for all } x \in X.$$
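A minimal sketch (not from the slides) of probing the condition $\|K_n^{-1} k(x)\|_1 \le 1$ numerically on a grid, here for the exponential kernel $e^{-|s-t|}$, which the next slide reports as satisfying the condition; the node set and grid are arbitrary choices.

```python
import numpy as np

kernel = lambda s, t: np.exp(-np.abs(s - t))
x_nodes = np.array([0.0, 0.3, 0.6, 1.0])           # x_1, ..., x_n (assumed)
K_n = kernel(x_nodes[:, None], x_nodes[None, :])
K_inv = np.linalg.inv(K_n)

grid = np.linspace(-1.0, 2.0, 301)                 # test points x
k_x = kernel(x_nodes[:, None], grid[None, :])      # columns are k(x) = (K(x, x_j))_j

# largest l^1 norm of K_n^{-1} k(x) over the grid; should be <= 1 (up to rounding)
print(np.max(np.abs(K_inv @ k_x).sum(axis=0)))
```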
Some Examples

The condition $\|K_n^{-1} k(x)\|_1 \le 1$ is not easy to check. We have only been able to find two kernels satisfying it so far:
- $K(s, t) = \min\{s, t\} - st$, $s, t \in [0, 1]$
- $K(s, t) = e^{-|s-t|}$, $s, t \in \mathbb{R}$

Counterexamples that do not satisfy this condition:
- Gaussian kernel: $K(s, t) = e^{-(s-t)^2}$, $s, t \in \mathbb{R}$
- Sinc kernel: $K(s, t) = \operatorname{sinc}(s - t)$, $s, t \in \mathbb{R}$
How to find the minimizer?

$$\min_{f \in S_n} L(f, y, B) = \min\left\{ \sum_{j=1}^{n} (f(x_j) - y_j)^2 + \lambda \|c\|_1 : f = \sum_{j=1}^{n} c_j K(\cdot, x_j) \right\}$$

We do not have a closed form for the minimizer. Standard optimization methods may work, but we still need efficient methods, especially for large data sets.
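As a sketch of one such standard method (not the slides' proposal): since $f(x_j) = (K_n c)_j$ for $f \in S_n$, the problem is $\min_c \|K_n c - y\|_2^2 + \lambda \|c\|_1$, which can be attacked with proximal gradient descent (ISTA). Kernel, data, and step-size choice below are assumptions; faster variants (e.g., FISTA) would be preferable in practice.

```python
import numpy as np

kernel = lambda s, t: np.exp(-np.abs(s - t))          # exponential kernel (assumed)
x = np.linspace(0.0, 1.0, 20)
y = np.sin(2 * np.pi * x) + 0.05 * np.random.randn(x.size)
lam = 0.1                                             # regularization parameter lambda

K_n = kernel(x[:, None], x[None, :])
step = 1.0 / (2.0 * np.linalg.norm(K_n, 2) ** 2)      # 1 / Lipschitz constant of the gradient

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

c = np.zeros(x.size)
for _ in range(2000):
    grad = 2.0 * K_n.T @ (K_n @ c - y)                # gradient of the data-fit term
    c = soft_threshold(c - step * grad, step * lam)   # proximal gradient step

print(np.count_nonzero(np.abs(c) > 1e-8), "nonzero coefficients")
```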
Thank you!