Data fitting by vector (V,f)-reproducing kernels

M.-N. Benbourhim

To appear in ESAIM: Proceedings, 2007.

Abstract

In this paper we propose a constructive method to build vector reproducing kernels. We define the notion of a vector (V,f)-reproducing kernel and we prove that every vector reproducing kernel is a (V,f)-reproducing kernel. We study minimal approximation by these (V,f)-reproducing kernels for different choices of V and f.

Keywords: vector (V,f)-reproducing kernels, approximation theory, smoothing and interpolating (V,f)-splines.

AMS classification: 65Dxx, 41A15, 65D15, 60E05.

Introduction

Kernels are valuable tools in various fields of numerical analysis, including approximation, interpolation, meshless methods for solving partial differential equations, neural networks, and machine learning. This contribution proposes a constructive method for building vector reproducing kernels for use in approximation theory. The problem of computing a function from empirical data arises in several areas of mathematics and engineering; depending on the context, it goes under the name of function estimation (statistics) or function approximation and interpolation (approximation theory), among others.

The outline of this paper is as follows. In Section 1, we recall some fundamental results on vector reproducing kernels and we define the notion of (V,f)-reproducing kernels; we state the fundamental result that a matrix-valued function is a vector reproducing kernel if and only if it is a (V,f)-reproducing kernel. In Section 2, we give examples of (V,f)-reproducing kernels. In Section 3, we present the first vector approximation problem and we prove the existence and uniqueness of the solution. In Section 4, we present the second vector approximation problem, which preserves a finite-dimensional vector space, and we prove the existence and uniqueness of the solution.

Laboratoire MIP-UMR 5640, Université Paul Sabatier, UFR MIG, 118, route de Narbonne, F-31062 Toulouse Cedex 04, FRANCE.
E-mail: bbourhim@cict.fr
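The positivity condition (1.2) of Definition 1.1 below is easy to probe numerically: assemble the $nN \times nN$ block matrix $(H_{k,l}(t_i,t_j))$ for a candidate kernel and check that it is symmetric positive semi-definite. The following sketch does this for an assumed separable $2\times 2$ kernel of the form (1.6); the kernel and the points are illustrative choices, not taken from the paper.

```python
import numpy as np

def block_gram(H, pts, n):
    """Assemble the nN x nN matrix whose (k, l) block is [H(t_i, t_j)[k, l]]_{i,j}.

    H(t, s) must return an n x n matrix; pts is a list of N points.
    """
    N = len(pts)
    G = np.zeros((n * N, n * N))
    for i, t in enumerate(pts):
        for j, s in enumerate(pts):
            Hts = H(t, s)
            for k in range(n):
                for l in range(n):
                    G[k * N + i, l * N + j] = Hts[k, l]
    return G

# A hypothetical kernel of the separable form (1.6) with V = R^2 and
# f_1(t) = (1, t), f_2(t) = (t, t^2), so H(t, s) = (<f_k(t), f_l(s)>)_{k,l}.
def H(t, s):
    f = lambda u: np.array([[1.0, u], [u, u ** 2]])  # row k is f_k(u)
    return f(t) @ f(s).T

pts = [0.0, 0.5, 1.0, 2.0]
G = block_gram(H, pts, n=2)
assert np.allclose(G, G.T)                   # symmetry, as in (1.1)
assert np.linalg.eigvalsh(G).min() > -1e-10  # positivity, as in (1.2)
```

Any kernel of the form (1.6) passes this test automatically, since the assembled matrix is then a Gram matrix; a candidate matrix-function that fails it cannot be a vector reproducing kernel.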
1 Vector (V,f)-reproducing kernels

1.1 Vector reproducing kernels

For any set $\Omega$ we denote by $(\mathbb{R}^n)^\Omega$ the real vector space of functions $h : \Omega \to \mathbb{R}^n$, equipped with the topology of pointwise convergence.

Definition 1.1 A real matrix-valued function $H(t,s) = (H_{k,l}(t,s))_{1\le k,l\le n}$ defined on $\Omega\times\Omega$ is a reproducing kernel (RK) if

1- It is symmetric:
$$H(t,s) = H^T(s,t), \quad \forall t,s \in \Omega. \quad (1.1)$$

2- For every finite set $\{t_j\}_{1\le j\le N}$ of distinct points in $\Omega$ and for every set of real scalars $\{\lambda_{i,l}\}_{1\le i\le N,\,1\le l\le n}$, we have
$$\sum_{1\le k,l\le n}\ \sum_{1\le i,j\le N} H_{k,l}(t_i,t_j)\,\lambda_{j,l}\,\lambda_{i,k} \ \ge\ 0. \quad (1.2)$$

Remark 1.1 Taking $\lambda_{j,l} = \mu_j c_{j,l}$ in equation (1.2), it is easy to see that Definition 1.1 is equivalent to the following: for every finite set $\{t_j\}_{j=1}^N$ of distinct points in $\Omega$ and for all elements $c_j = (c_{j,l})_{1\le l\le n}$ of $\mathbb{R}^n$, the matrix $(c_i^T H(t_i,t_j)\,c_j)_{1\le i,j\le N}$ is positive semi-definite.

Proposition 1.1 The RK $H(t,s)$ has the following properties:

1- For all $k = 1,\dots,n$, the function $H_{k,k}(t,s)$ is a RK.

2- For all $k,l = 1,\dots,n$ and $t,s$ in $\Omega$, we have the Cauchy–Schwarz inequality
$$|H_{k,l}(t,s)|^2 \le H_{k,k}(t,t)\,H_{l,l}(s,s). \quad (1.3)$$

Proof. It is a consequence of Remark 1.1 and the properties of symmetric positive semi-definite matrices.

Definition 1.2 A vector subspace $\mathcal{H}$ of $(\mathbb{R}^n)^\Omega$ equipped with a scalar product $\langle\cdot,\cdot\rangle_{\mathcal{H}}$ is called a hilbertian subspace of $(\mathbb{R}^n)^\Omega$ if

1- $(\mathcal{H},\langle\cdot,\cdot\rangle_{\mathcal{H}})$ is a Hilbert space.

2- The natural injection from $\mathcal{H}$ into $(\mathbb{R}^n)^\Omega$ is continuous.

We recall some important results on a RK and its associated hilbertian subspace, which are studied in [9].

Theorem 1.1 For any reproducing kernel $H(t,s)$ there exists a unique hilbertian subspace $\mathcal{H}_H$ of $(\mathbb{R}^n)^\Omega$ such that:

1- The space
$$\mathcal{H}_0 = \Big\{ u \in (\mathbb{R}^n)^\Omega :\ u(t) = \sum_{i=1}^N H(t,t_i)\,c_i,\ c_i \in \mathbb{R}^n,\ 1\le i\le N,\ t \in \Omega \Big\} \quad (1.4)$$
is a dense subspace of $\mathcal{H}_H$.
2- $H(t,s)$ is the reproducing kernel of $\mathcal{H}_H$:
$$\langle u, H(\cdot,t)c \rangle_{\mathcal{H}_H} = c^T u(t), \quad (1.5)$$
for all $u \in \mathcal{H}_H$, $c \in \mathbb{R}^n$ and $t \in \Omega$.

1.2 Vector (V,f)-reproducing kernels

Definition 1.3 Let $H(t,s) = (H_{k,l}(t,s))_{1\le k,l\le n}$ be a real matrix-valued function defined on $\Omega\times\Omega$. We say that $H(t,s)$ is a (V,f)-reproducing kernel ((V,f)-RK) if there exist a real Hilbert space $(V,\langle\cdot,\cdot\rangle_V)$ and a function $f = (f_k)_{1\le k\le n}$ from $\Omega$ into $V^n$ such that
$$H(t,s) = \big( \langle f_k(t), f_l(s) \rangle_V \big)_{1\le k,l\le n}. \quad (1.6)$$

Theorem 1.2 A real matrix-valued function $H(t,s) = (H_{k,l}(t,s))_{1\le k,l\le n}$ is a reproducing kernel if and only if it is a (V,f)-reproducing kernel.

Proof. It is clear that a (V,f)-RK $H(t,s) = (\langle f_k(t), f_l(s)\rangle_V)_{1\le k,l\le n}$ is symmetric and satisfies
$$\sum_{1\le k,l\le n}\ \sum_{1\le i,j\le N} \lambda_{i,k}\lambda_{j,l}\,H_{k,l}(t_i,t_j) = \sum_{1\le k,l\le n}\ \sum_{1\le i,j\le N} \lambda_{i,k}\lambda_{j,l}\,\langle f_k(t_i), f_l(t_j)\rangle_V = \Big\| \sum_{k=1}^n \sum_{i=1}^N \lambda_{i,k}\,f_k(t_i) \Big\|_V^2 \ \ge\ 0,$$
which implies that $H(t,s)$ is a RK. Conversely, let $H(t,s) = (H_{k,l}(t,s))_{1\le k,l\le n}$ be a RK. From Theorem 1.1, there exists a hilbertian subspace $\mathcal{H}_H$ of $(\mathbb{R}^n)^\Omega$ which admits $H(t,s)$ as a reproducing kernel. Let $V = \mathcal{H}_H$ and $f_k(t) = H(\cdot,t)e_k \in \mathcal{H}_H$, with $e_k = (\delta_{k,l})_{1\le l\le n}$. From the reproducing formula (1.5), we get
$$H_{k,l}(t,s) = \langle f_k(t), f_l(s) \rangle_{\mathcal{H}_H}.$$
Then $H(t,s)$ is a (V,f)-RK.

In the following theorem we establish a characterization of the hilbertian subspace $\mathcal{H}_f$ associated to $H$.

Theorem 1.3 Let $H(t,s) = (\langle f_k(t), f_l(s)\rangle_V)_{1\le k,l\le n}$ be a (V,f)-reproducing kernel. Its associated hilbertian subspace $\mathcal{H}_f$ of $(\mathbb{R}^n)^\Omega$ is given by
$$\mathcal{H}_f = \big\{ u = (u_k)_{1\le k\le n} \in (\mathbb{R}^n)^\Omega :\ \exists v \in V :\ u_k(t) = \langle v, f_k(t)\rangle_V,\ 1\le k\le n,\ \forall t \in \Omega \big\}. \quad (1.7)$$

Proof. Let $A_f : V \to (\mathbb{R}^n)^\Omega$ be defined by $(A_f v)_k(t) = \langle v, f_k(t)\rangle_V$, $1\le k\le n$. The map $A_f$ is linear, and from the inequality
$$\|(A_f v)(t)\|^2 = \sum_{k=1}^n |\langle v, f_k(t)\rangle_V|^2 \ \le\ \Big( \sum_{k=1}^n \|f_k(t)\|_V^2 \Big)\,\|v\|_V^2 = \Big( \sum_{k=1}^n H_{k,k}(t,t) \Big)\,\|v\|_V^2,$$
we deduce that $A_f$ is continuous. Let $\ker(A_f) = \{ v \in V :\ \langle v, f_k(t)\rangle_V = 0,\ k = 1,\dots,n,\ t \in \Omega \}$ and let $B = (\ker(A_f))^\perp$ be the orthogonal complement of $\ker(A_f)$ in $V$. One easily verifies that $B$ is the closure of the space $\mathrm{span}\,\{f_k(t)\}_{(k,t)\in\mathbb{N}_n\times\Omega}$, with $\mathbb{N}_n = \{1,\dots,n\}$. We denote by $P_B$ the orthogonal projector onto $B$.
We define on $\mathcal{H}_f = A_f(V)$ the bilinear form
$$\langle A_f u, A_f v \rangle_{\mathcal{H}_f} = \langle P_B u, P_B v \rangle_V.$$
It is easy to see that this bilinear form is a scalar product on $\mathcal{H}_f$. Then the linear map $A_f : (\ker(A_f))^\perp \to \mathcal{H}_f$
is an isometry, and consequently $(\mathcal{H}_f, \langle\cdot,\cdot\rangle_{\mathcal{H}_f})$ is a Hilbert space. For all $s \in \Omega$ and $c = (c_l)_{1\le l\le n} \in \mathbb{R}^n$, the function
$$H(\cdot,s)c :\ t \in \Omega \ \mapsto\ \Big( \sum_{l=1}^n H_{k,l}(t,s)\,c_l \Big)_{1\le k\le n} = \Big( \Big\langle f_k(t), \sum_{l=1}^n f_l(s)\,c_l \Big\rangle_V \Big)_{1\le k\le n} = A_f\Big( \sum_{l=1}^n f_l(s)\,c_l \Big)(t)$$
is an element of $\mathcal{H}_f$ and satisfies the reproducing formula (1.5). Indeed, for all $v$ in $V$,
$$\langle A_f v, H(\cdot,s)c \rangle_{\mathcal{H}_f} = \Big\langle P_B v,\, P_B\Big( \sum_{l=1}^n f_l(s)\,c_l \Big) \Big\rangle_V = \Big\langle v,\, \sum_{l=1}^n f_l(s)\,c_l \Big\rangle_V = \sum_{l=1}^n c_l\,(A_f v)_l(s) = c^T (A_f v)(s).$$
Consequently (see Theorem 1.1), $\mathcal{H}_f$ is a hilbertian subspace of $(\mathbb{R}^n)^\Omega$ and admits $H$ as a reproducing kernel.

2 Examples of (V,f)-reproducing kernels

2.1 Example 1

Let $V = L^2(a,b)$ and let $\Omega$ be a subset of $\mathbb{R}^d$. For any functions $c_k : \Omega \to \mathbb{R}$, $1\le k\le n$, let $f_k(t)(x) = \exp(c_k(t)\,x)$. We have
$$H_{k,l}(t,s) = \begin{cases} \dfrac{\exp\!\big(b\,(c_k(t)+c_l(s))\big) - \exp\!\big(a\,(c_k(t)+c_l(s))\big)}{c_k(t)+c_l(s)} & \text{if } c_k(t)+c_l(s) \ne 0, \\[2mm] b-a & \text{otherwise,} \end{cases} \quad (2.1)$$
and
$$\mathcal{H}_f = \Big\{ u = (u_k)_{1\le k\le n} \in (\mathbb{R}^n)^\Omega :\ \exists v \in L^2(a,b) :\ u_k(t) = \int_a^b v(x)\,\exp(c_k(t)\,x)\,dx,\ 1\le k\le n,\ t \in \Omega \Big\}.$$

2.2 Example 2

Let $V = L^2(0,+\infty)$, let $\Omega$ be a subset of $\mathbb{R}^d$, and let $c_k : \Omega \to\ ]0,+\infty[$, $1\le k\le n$.

1- If $f_k(t)(x) = \exp(-c_k(t)\,x^2)$, then
$$H_{k,l}(t,s) = \frac{1}{2}\sqrt{\frac{\pi}{c_k(t)+c_l(s)}}.$$

2- If $f_k(t)(x) = \exp(-c_k(t)\,x)$, then
$$H_{k,l}(t,s) = \frac{1}{c_k(t)+c_l(s)},$$
and in particular, if $c_k(t) = \dfrac{P_k(t)}{Q_k(t)}$, where $P_k$ and $Q_k$ are polynomials, we obtain a rational reproducing kernel
$$H_{k,l}(t,s) = \frac{Q_k(t)\,Q_l(s)}{P_k(t)\,Q_l(s) + P_l(s)\,Q_k(t)}. \quad (2.2)$$
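The kernels of Examples 1 and 2 come from elementary integrals, so the closed forms can be checked by direct quadrature of $\langle f_k(t), f_l(s)\rangle_V$. The sketch below does this; the midpoint-rule quadrature, the truncation of the half-line at $R = 50$, and the sampled values of $r = c_k(t) + c_l(s)$ are our illustrative choices, not the paper's.

```python
import numpy as np

# Example 1: V = L^2(a, b), f_k(t)(x) = exp(c_k(t) x); check the closed
# form (2.1) against a midpoint-rule quadrature of <f_k(t), f_l(s)>_V.
a, b = 0.0, 1.0

def H1_closed(r):              # r = c_k(t) + c_l(s)
    return b - a if r == 0.0 else (np.exp(b * r) - np.exp(a * r)) / r

def midpoint(g, lo, hi, m=20000):
    x = lo + (hi - lo) * (np.arange(m) + 0.5) / m
    return g(x).sum() * (hi - lo) / m

for r in [1.0, -0.9, 0.0, 2.5]:
    quad = midpoint(lambda x: np.exp(r * x), a, b)
    assert abs(H1_closed(r) - quad) < 1e-6

# Example 2, item 2: V = L^2(0, +inf), f_k(t)(x) = exp(-c_k(t) x); the
# kernel is 1 / (c_k(t) + c_l(s)).  Truncate the half-line at R = 50.
for r in [0.8, 1.7, 3.0]:      # r = c_k(t) + c_l(s) > 0
    quad = midpoint(lambda x: np.exp(-r * x), 0.0, 50.0, m=200000)
    assert abs(1.0 / r - quad) < 1e-6
```

The case $r = 0$ exercises the $b - a$ branch of (2.1).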
2.3 Example 3: (V,f)-RK of convolution type

We consider the case where

1- $V = L^2(\mathbb{R}^d)$ and $\Omega = \mathbb{R}^d$.

2- $f_k(t)(x) = f_k(t-x)$, with $f_k$ in the usual Sobolev space $H^m(\mathbb{R}^d)$.

Then
$$H_{k,l}(t,s) = \int_{\mathbb{R}^d} f_k(t-x)\,f_l(s-x)\,dx.$$

Theorem 2.1 We have the following properties:

1- $H_{k,l}(t,s) = G_{k,l}(t-s)$ with $G_{k,l} = f_k * \check f_l$, i.e. $\mathcal{F}(G_{k,l})(\xi) = \mathcal{F}(f_k)(\xi)\,\mathcal{F}(\check f_l)(\xi)$, where $\check f_l(x) = f_l(-x)$ and $\mathcal{F}$ is the Fourier transform.

2- $G_{k,l} \in C_0^m(\mathbb{R}^d)$, where $C_0^m(\mathbb{R}^d)$ is the space of functions of class $C^m$ on $\mathbb{R}^d$ vanishing at infinity.

3- The associated hilbertian subspace of $H$ is
$$\mathcal{H}_f = \big\{ u = (u_k)_{1\le k\le n} \in C(\mathbb{R}^d;\mathbb{R}^n) :\ \exists v \in L^2(\mathbb{R}^d) :\ u_k = f_k * v,\ 1\le k\le n \big\} \ \subset\ \big(C_0^m(\mathbb{R}^d)\big)^n.$$

4- If $f_k$ is radial, $1\le k\le n$, then the (V,f)-RK is radial: $H_{k,l}(t,s) = G_{k,l}(t-s)$ depends only on $\|t-s\|$, for $1\le k,l\le n$.

Proof.

1- We have
$$H_{k,l}(t,s) = \int_{\mathbb{R}^d} f_k(t-x)\,f_l(s-x)\,dx = \int_{\mathbb{R}^d} f_k(y)\,f_l\big(y-(t-s)\big)\,dy = (f_k * \check f_l)(t-s).$$

2- Since $f_k \in H^m(\mathbb{R}^d)$, we have $D^\alpha G_{k,l} = D^\alpha f_k * \check f_l \in C_0^0(\mathbb{R}^d)$ for $|\alpha| \le m$ (see [4]).

3- It is a consequence of Theorem 1.2 and the property given in item 1.

4- For any orthogonal matrix $A$,
$$G_{k,l}(At) = \int_{\mathbb{R}^d} f_k(x)\,f_l(x-At)\,dx = \int_{\mathbb{R}^d} f_k(Ax)\,f_l\big(A(x-t)\big)\,dx = \int_{\mathbb{R}^d} f_k(x)\,f_l(x-t)\,dx = G_{k,l}(t),$$
since $f_k(Ax) = f_k(x)$, $f_l(Ax) = f_l(x)$ and $|\det A| = 1$.

3 Data fitting by vector (V,f)-reproducing kernels

Given a subset $\Omega_N = \{t_1,\dots,t_N\}$ of $\Omega$ and a set of vectors $Z_N = \{z_1,\dots,z_N\}$ in $\mathbb{R}^n$, the scattered data approximation problem consists in finding a vector-valued function $\sigma_\epsilon$ such that the system of equations
$$\sigma_\epsilon(t_i) = z_i + \theta_i(\epsilon), \quad 1\le i\le N, \quad (3.1)$$
has a solution of the form
$$\sigma_\epsilon(t) = \sum_{i=1}^N H_f(t,t_i)\,a_i^\epsilon, \quad (3.2)$$
where the unknown error functions $\theta_i(\epsilon)$ satisfy $\theta_i(0) = 0$, and $H_f(t,s)$ denotes the (V,f)-RK of $\mathcal{H}_f$.

Let $A_N$ be the linear operator from $\mathcal{H}_f$ into $(\mathbb{R}^n)^N$ defined by
$$A_N u = \big( u_1(t_1),\dots,u_1(t_N),\dots,u_n(t_1),\dots,u_n(t_N) \big).$$

First, we give the following definition.

Definition 3.1 For all $Z_N \in (\mathbb{R}^n)^N$ and $\epsilon \ge 0$, we define a (V,f)-spline function as a solution $\sigma_\epsilon$ of the following approximation problem:
$$(P_\epsilon(Z_N)):\quad \min_{v \in C_\epsilon(Z_N)} \Big( (1-\epsilon)\,\|v\|_{\mathcal{H}_f}^2 + \epsilon\,\|A_N v - Z_N\|_{(\mathbb{R}^n)^N}^2 \Big), \quad (3.3)$$
where
$$C_\epsilon(Z_N) = \begin{cases} A_N^{-1}(Z_N) & \text{for } \epsilon = 0 \text{ (interpolating problem)}, \\ \mathcal{H}_f & \text{for } \epsilon > 0 \text{ (smoothing problem)}, \end{cases} \quad (3.4)$$
and
$$A_N^{-1}(Z_N) = \{ v \in \mathcal{H}_f :\ A_N v = Z_N \}. \quad (3.5)$$
Here $\|\cdot\|_{(\mathbb{R}^n)^N}$ denotes the standard Euclidean norm on $(\mathbb{R}^n)^N$.

The explicit expression of the solution of problem (3.3) is given in the following theorem.

Theorem 3.1 For all $u \in \mathcal{H}_f$, the problem (3.3) with $A_N u = Z_N$ admits a unique solution $\sigma_\epsilon \in \mathcal{H}_f$, which is explicitly given by
$$\sigma_\epsilon(t) = \sum_{i=1}^N H_f(t,t_i)\,a_i^\epsilon. \quad (3.6)$$
The coefficients $a_i^\epsilon$, $i = 1,\dots,N$, are obtained by solving the $nN \times nN$ linear system
$$\big( H_f^N + c_\epsilon I_{nN} \big)\,a^\epsilon = Z_N, \quad \text{with } c_\epsilon = \begin{cases} 0 & \text{if } \epsilon = 0, \\[1mm] \dfrac{1-\epsilon}{\epsilon} & \text{if } \epsilon > 0, \end{cases} \quad (3.7)$$
where $a^\epsilon$ and $Z_N$ are the vectors given by
$$a^\epsilon = (a_{1,1}^\epsilon,\dots,a_{N,1}^\epsilon,\dots,a_{1,n}^\epsilon,\dots,a_{N,n}^\epsilon)^t \in \mathbb{R}^{nN}, \qquad Z_N = (z_{1,1},\dots,z_{N,1},\dots,z_{1,n},\dots,z_{N,n})^t \in \mathbb{R}^{nN},$$
$I_{nN}$ is the $nN \times nN$ identity matrix, and $H_f^N = (H_f^{N,(l,k)})_{1\le l,k\le n}$ is the $nN \times nN$ block matrix whose blocks are given by
$$H_f^{N,(l,k)} = \big( \langle f_k(t_i), f_l(t_j) \rangle_V \big)_{1\le i,j\le N}.$$

Proof. From the continuous embedding of $\mathcal{H}_f$ into $(\mathbb{R}^n)^\Omega$ (see Theorem 1.1), we deduce that $A_N$ is continuous. Let $I$ be the identity operator on $\mathcal{H}_f$. We have:
1- $A_N(\mathcal{H}_f)$ is a closed subspace of $(\mathbb{R}^n)^N$.

2- $I(\mathcal{H}_f) = \mathcal{H}_f$ is closed.

3- $\ker(A_N) \cap \ker(I) = \{0\}$.

4- $\ker(A_N) + \ker(I) = \ker(A_N)$ is closed in $\mathcal{H}_f$.

According to the general spline theory (see [2, 5]), we get the theorem.

Using the general spline theory, we obtain the following proposition in the case of the smoothing problem ($\epsilon > 0$).

Proposition 3.1 For all $(\epsilon, Z_N) \in\ ]0,1[\ \times (\mathbb{R}^n)^N$, the problem (3.3) admits a unique solution $\sigma_\epsilon \in \mathcal{H}_f$, which is explicitly given by
$$\sigma_\epsilon(t) = \sum_{i=1}^N H_f(t,t_i)\,a_i^\epsilon. \quad (3.8)$$
The coefficients $a_i^\epsilon$, $i = 1,\dots,N$, are obtained by solving the $nN \times nN$ nonsingular linear system
$$\Big( H_f^N + \frac{1-\epsilon}{\epsilon}\,I_{nN} \Big)\,a^\epsilon = Z_N. \quad (3.9)$$

For the case of the interpolating problem ($\epsilon = 0$), we have the following proposition.

Proposition 3.2 The following properties are equivalent:

1- The linear map $A_N$ is surjective from $\mathcal{H}_f$ onto $(\mathbb{R}^n)^N$.

2- For all $Z_N \in (\mathbb{R}^n)^N$, the problem $(P_0(Z_N))$ admits a unique solution.

3- The matrix $H_f^N$ is nonsingular, i.e. positive definite.

4- The system $\{f_k(t_i)\}_{1\le k\le n,\,1\le i\le N}$ is linearly independent in $V$.

Proof. We have:

$1 \Leftrightarrow 2$: It is a consequence of the general spline theory (see [2, 5]).

$2 \Rightarrow 3$: Let $a = (a_1,\dots,a_N)$ be a solution of the homogeneous system $H_f^N a = 0$ and let $\sigma(t) = \sum_{i=1}^N H_f(t,t_i)\,a_i$. We have $A_N\sigma = H_f^N a = 0$, and item 2 implies that $\sigma = 0$. According to the reproducing formula (1.5), we get, for all $v$ in $\mathcal{H}_f$,
$$0 = \langle \sigma, v \rangle_{\mathcal{H}_f} = \sum_{i=1}^N a_i^T\,v(t_i). \quad (3.10)$$
For $i = 1,\dots,N$ and $k = 1,\dots,n$, let $Z^{(i,k)} \in (\mathbb{R}^n)^N$ be the data whose $i$-th component is $e_k = (\delta_{k,l})_{1\le l\le n}$ and whose other components vanish. There exists an interpolating (V,f)-spline function $\sigma^{0,(i,k)}$, solution of the problem $(P_0(Z^{(i,k)}))$ given by (3.3). Taking $v = \sigma^{0,(i,k)}$ successively in equation (3.10), we get $a_i = 0$ for $i = 1,\dots,N$.
$3 \Rightarrow 4$: Since the matrix $H_f^N = (\langle f_k(t_i), f_l(t_j)\rangle_V)$ is a Gram matrix, it is invertible if and only if the system $\{f_k(t_i)\}_{1\le k\le n,\,1\le i\le N}$ is linearly independent in $V$.

$4 \Rightarrow 1$: By the preceding Gram-matrix argument, $H_f^N$ is invertible, so for all $Z_N \in (\mathbb{R}^n)^N$ there exists $a = (a_i)_{1\le i\le N}$ in $(\mathbb{R}^n)^N$ such that $H_f^N a = Z_N$. The element $\sigma_0$ of $\mathcal{H}_f$ defined by $\sigma_0(t) = \sum_{i=1}^N H_f(t,t_i)\,a_i$ satisfies $A_N\sigma_0 = Z_N$.

4 Data fitting preserving polynomials

Let $\mathcal{P}$ be a finite-dimensional vector subspace of $(\mathbb{R}^n)^\Omega$ and let $\{p_1,\dots,p_m\}$ be a basis of $\mathcal{P}$. In this second scattered data approximation problem, the $\mathcal{P}$-reproduction property is required. Given a subset $\Omega_N = \{t_1,\dots,t_N\}$ of $\Omega$ and a set of vectors $Z_N = \{z_1,\dots,z_N\}$ in $\mathbb{R}^n$, the scattered data approximation problem consists in finding a vector-valued function $\sigma_\epsilon$ such that the system of equations
$$\sigma_\epsilon(t_i) = z_i + \theta_i(\epsilon), \quad 1\le i\le N, \quad (4.1)$$
has a solution of the form
$$\sigma_\epsilon(t) = \sum_{i=1}^N H_f(t,t_i)\,a_i^\epsilon + \sum_{j=1}^m p_j(t)\,b_j^\epsilon, \quad (4.2)$$
where the unknown error functions $\theta_i(\epsilon)$ satisfy $\theta_i(0) = 0$. Hereafter we assume that

(H1) For all $p \in \mathcal{P}$: $\{p(t_i) = 0,\ 1\le i\le N\} \Rightarrow p = 0$. (4.3)

(H2) $\mathcal{H}_f \cap \mathcal{P} = \{0\}$. (4.4)

Let $\mathcal{HP}_f$ denote the direct sum $\mathcal{HP}_f = \mathcal{H}_f \oplus \mathcal{P}$. We denote by $\Pi_f$ the orthogonal projector from $\mathcal{HP}_f$ onto $\mathcal{H}_f$, and we define on $\mathcal{HP}_f$ the linear map
$$A_N u = \big( u_1(t_1),\dots,u_1(t_N),\dots,u_n(t_1),\dots,u_n(t_N) \big).$$

First, we give the following definition.

Definition 4.1 For all $Z_N \in (\mathbb{R}^n)^N$ and $\epsilon \ge 0$, we define a (V,f)-spline function as a solution $\sigma_\epsilon$ of the following approximation problem:
$$(P_\epsilon(Z_N)):\quad \min_{v \in C_\epsilon(Z_N)} \Big( (1-\epsilon)\,\|\Pi_f v\|_{\mathcal{H}_f}^2 + \epsilon\,\|A_N v - Z_N\|_{(\mathbb{R}^n)^N}^2 \Big), \quad (4.5)$$
where
$$C_\epsilon(Z_N) = \begin{cases} A_N^{-1}(Z_N) & \text{for } \epsilon = 0 \text{ (interpolating problem)}, \\ \mathcal{HP}_f & \text{for } 0 < \epsilon < 1 \text{ (smoothing problem)}, \\ \mathcal{P} & \text{for } \epsilon = 1 \text{ (least squares)}, \end{cases} \quad (4.6)$$
and
$$A_N^{-1}(Z_N) = \{ v \in \mathcal{HP}_f :\ A_N v = Z_N \}. \quad (4.7)$$
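Before giving the explicit solution, the interpolating case $\epsilon = 0$ of this problem can be illustrated numerically. The sketch below takes $n = 1$, an assumed Gaussian kernel $H_f(t,s) = \exp(-(t-s)^2)$ and $\mathcal{P} = \mathrm{span}\{1, t\}$ (all illustrative choices, not from the paper), solves the standard saddle-point system $H_f^N a + M b = Z_N$, $M^t a = 0$, and checks the $\mathcal{P}$-reproduction property: data sampled from an element of $\mathcal{P}$ is reproduced exactly.

```python
import numpy as np

# n = 1, assumed Gaussian kernel; P = span{1, t}, so m = 2.
def kernel(t, s):
    return np.exp(-(t - s) ** 2)

t_nodes = np.array([0.0, 0.7, 1.3, 2.0])          # the set Omega_N
N, m = len(t_nodes), 2
HN = kernel(t_nodes[:, None], t_nodes[None, :])   # the matrix H_f^N
M = np.column_stack([np.ones(N), t_nodes])        # p_1 = 1, p_2 = t at the nodes

def solve_block(z):
    """Solve the interpolating saddle-point system (c_eps = 0)."""
    K = np.block([[HN, M], [M.T, np.zeros((m, m))]])
    rhs = np.concatenate([z, np.zeros(m)])
    sol = np.linalg.solve(K, rhs)
    return sol[:N], sol[N:]     # kernel part a, polynomial part b

# Data sampled from q(t) = 3 - 2 t, an element of P:
a, b = solve_block(3.0 - 2.0 * t_nodes)
assert np.allclose(a, 0.0, atol=1e-8)          # kernel coefficients vanish,
assert np.allclose(b, [3.0, -2.0], atol=1e-8)  # so sigma_0 = q (P-reproduction)
```

Since $M$ has full rank (the analogue of (H1)) and the Gaussian kernel matrix is positive definite, the block matrix is nonsingular and the decomposition of $\sigma_0$ into a kernel part and a polynomial part is unique.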
The explicit expression of the solution of problem (4.5) is given in the following theorem.

Theorem 4.1 For all $u \in \mathcal{HP}_f$, the problem (4.5) with $A_N u = Z_N$ admits a unique solution $\sigma_\epsilon \in \mathcal{HP}_f$, which is explicitly given by
$$\sigma_\epsilon(t) = \sum_{i=1}^N H_f(t,t_i)\,a_i^\epsilon + \sum_{j=1}^m p_j(t)\,b_j^\epsilon. \quad (4.8)$$
The coefficients $a_i^\epsilon$ and $b_j^\epsilon$ are obtained by solving the $(nN+m) \times (nN+m)$ linear system
$$\begin{pmatrix} H_f^N + c_\epsilon I_{nN} & M \\ M^t & O \end{pmatrix} \begin{pmatrix} a^\epsilon \\ b^\epsilon \end{pmatrix} = \begin{pmatrix} Z_N \\ 0 \end{pmatrix}, \quad \text{with } c_\epsilon = \begin{cases} 0 & \text{if } \epsilon = 0, \\[1mm] \dfrac{1-\epsilon}{\epsilon} & \text{if } \epsilon > 0, \end{cases} \quad (4.9)$$
where $a^\epsilon$, $b^\epsilon$ and $Z_N$ are the vectors given by
$$a^\epsilon = (a_{1,1}^\epsilon,\dots,a_{N,1}^\epsilon,\dots,a_{1,n}^\epsilon,\dots,a_{N,n}^\epsilon)^t \in \mathbb{R}^{nN}, \qquad b^\epsilon = (b_1^\epsilon,\dots,b_m^\epsilon)^t \in \mathbb{R}^m,$$
$$Z_N = (z_{1,1},\dots,z_{N,1},\dots,z_{1,n},\dots,z_{N,n})^t \in \mathbb{R}^{nN},$$
$H_f^N = (H_f^{N,(l,k)})_{1\le l,k\le n}$ is the $nN \times nN$ block matrix of Theorem 3.1, and $M$ is the $nN \times m$ matrix of the values of the basis of $\mathcal{P}$ at the nodes,
$$M_{(k-1)N+i,\,j} = (p_j)_k(t_i), \quad 1\le i\le N,\ 1\le k\le n,\ 1\le j\le m.$$
In particular, when $\mathcal{P} = \mathcal{Q}^n$ for a scalar space $\mathcal{Q} = \mathrm{span}\{q_1,\dots,q_\mu\}$ (so that $m = n\mu$), $M$ has the block structure $M = (M^{(l,k)})$ with $M^{(l,k)} = \delta_{l,k}\,(q_j(t_i))_{1\le i\le N,\,1\le j\le \mu}$.

In particular, the property of preserving polynomials holds: if there exists $q \in \mathcal{P}$ such that $q(t_i) = z_i$ for $i = 1,\dots,N$, then $\sigma_\epsilon = q$.

Proof. We have:

1- $A_N$ is continuous and $A_N(\mathcal{HP}_f)$ is closed: from the continuous embedding of $\mathcal{H}_f$ into $(\mathbb{R}^n)^\Omega$ (see Theorem 1.1), we deduce that $A_N$ is continuous, and $A_N(\mathcal{HP}_f)$ is closed because it is a finite-dimensional space.

2- $\Pi_f(\mathcal{HP}_f) = \mathcal{H}_f$ is closed.

3- $\ker(A_N) \cap \ker(\Pi_f) = \{0\}$, by (H1).

4- $\ker(A_N) + \ker(\Pi_f)$ is closed: it is a consequence of the fact that $\ker(A_N)$ is closed and $\ker(\Pi_f) = \mathcal{P}$ is a finite-dimensional space.

According to the general spline theory (see [2, 5]), we get the theorem.

Using the general spline theory, we obtain the following proposition in the case of the smoothing problem ($0 < \epsilon < 1$).
Proposition 4.1 For all $(\epsilon, Z_N) \in\ ]0,1[\ \times (\mathbb{R}^n)^N$, the problem (4.5) admits a unique solution $\sigma_\epsilon \in \mathcal{HP}_f$, which is explicitly given by
$$\sigma_\epsilon(t) = \sum_{i=1}^N H_f(t,t_i)\,a_i^\epsilon + \sum_{j=1}^m p_j(t)\,b_j^\epsilon. \quad (4.10)$$
The coefficients $a_i^\epsilon$ and $b_j^\epsilon$ are the solution of the nonsingular linear system
$$\begin{pmatrix} H_f^N + \frac{1-\epsilon}{\epsilon}\,I_{nN} & M \\ M^t & O \end{pmatrix} \begin{pmatrix} a^\epsilon \\ b^\epsilon \end{pmatrix} = \begin{pmatrix} Z_N \\ 0 \end{pmatrix}. \quad (4.11)$$

For the case of the interpolating problem ($\epsilon = 0$), using a similar proof as in Proposition 3.2, we get:

Proposition 4.2 The following properties are equivalent:

1- The linear map $A_N$ is surjective from $\mathcal{HP}_f$ onto $(\mathbb{R}^n)^N$.

2- For all $Z_N \in (\mathbb{R}^n)^N$, the problem $(P_0(Z_N))$ admits a unique solution.

3- The matrix $H_f^N$ is nonsingular, i.e. positive definite.

4- The system $\{f_k(t_i)\}_{1\le k\le n,\,1\le i\le N}$ is linearly independent in $V$.

References

[1] L. Amodei, Reproducing Kernels of Vector-Valued Function Spaces, in: Curve and Surface Fitting: Chamonix 1996, A. Le Méhauté, C. Rabut and L.L. Schumaker, eds., Vanderbilt University Press, Nashville, 2000, 1-9.

[2] M. Attéia, Hilbertian Kernels and Spline Functions, Elsevier Science, North-Holland, 1992.

[3] M.N. Benbourhim, Constructive Approximation by (V,f)-Reproducing Kernels, in: Curve and Surface Fitting: Saint-Malo 1999, A. Cohen, C. Rabut and L.L. Schumaker, eds., Vanderbilt University Press, Nashville, 2000, 57-64.

[4] W.F. Donoghue, Distributions and Fourier Transforms, Academic Press, 1972.

[5] P.J. Laurent, Approximation et Optimisation, Hermann, Paris, 1972.

[6] C.A. Micchelli and M. Pontil, On Learning Vector-Valued Functions, Research Note RN/03/08, Department of Computer Science, University College London, 2003.

[7] S. Saitoh, Theory of Reproducing Kernels and its Applications, Pitman Research Notes in Mathematics Series, Longman Scientific and Technical, 1994.

[8] R. Schaback and H. Wendland, Kernel Techniques: From Machine Learning to Meshless Methods, Acta Numerica (2006), 1-97, Cambridge University Press, 2006.

[9] L. Schwartz, Sous-espaces hilbertiens d'espaces vectoriels topologiques et noyaux associés, J. Analyse Math. 13 (1964), 115-256.