
A simple remark on the order of approximation by compactly supported radial basis functions, and related networks

Xin Li
Department of Mathematical Sciences, University of Nevada, Las Vegas
E-Mail: xinlixin@nevada.edu

Abstract

We consider simultaneous approximation of multivariate functions and their derivatives by Wendland's compactly supported radial basis functions $\phi_{s,k}$. By applying a greedy algorithm, we show that, regardless of the dimension, an $O(m^{-1/2})$ order of approximation can be achieved by a linear combination of $m$ translates of $\phi_{s,k}$. A similar result on approximation by neural networks is established by using univariate radial functions as the activation functions of the networks.

1. Introduction

Multivariate interpolation by radial basis functions has been studied and applied in several areas of mathematics, such as approximation theory (cf. Franke [6], Micchelli [11], Schaback [14]), curve and surface fitting (cf. Daehlen, Lyche, and Schumaker [4]), and numerical solutions of partial differential equations (cf. Golberg and Chen [7]). A function $\Phi$ is radial if $\Phi(x) = \phi(|x|)$, where $\phi : \mathbb{R}_+ \to \mathbb{R}$ is a univariate function and $|x|$ is the usual Euclidean norm of $x \in \mathbb{R}^s$. For $f \in C(\mathbb{R}^s)$ and a set $X = \{x_1, \dots, x_N\} \subset \mathbb{R}^s$ of distinct points, the radial basis function interpolant $s_{f,X}$ is given by
$$s_{f,X}(x) = \sum_{j=1}^{N} \alpha_j \phi(|x - x_j|), \tag{1.1}$$
where the coefficients $\alpha_1, \dots, \alpha_N$ are determined by the interpolation conditions
$$s_{f,X}(x_j) = f(x_j), \quad 1 \le j \le N. \tag{1.2}$$

336 Boundary Element Technology

To ensure the solvability of the interpolation problem (1.1)–(1.2), positive definite functions are used. A continuous function $\Phi : \mathbb{R}^s \to \mathbb{R}$ is positive definite, denoted $\Phi \in PD$, if for any $N \in \mathbb{Z}_+$, any set of pairwise distinct centers $X = \{x_1, \dots, x_N\} \subset \mathbb{R}^s$, and any vector $\alpha \in \mathbb{R}^N \setminus \{0\}$, the quadratic form
$$\sum_{j=1}^{N} \sum_{\ell=1}^{N} \alpha_j \alpha_\ell\, \Phi(x_j - x_\ell)$$
is positive. If $\Phi$ is compactly supported, written $\Phi \in CS$, the coefficients $\alpha_1, \dots, \alpha_N$ in (1.1) are easy to determine from (1.2). A celebrated theorem of Bochner (cf. Stewart [15]) characterizes all positive definite functions. In the case that $\Phi$ is compactly supported, the theorem may be interpreted as follows: $\Phi$ is positive definite if and only if its Fourier transform is nonnegative and positive on an open subset. Since the Fourier transform of a radial basis function $\Phi(x) = \phi(|x|)$ is given by
$$\hat{\Phi}(\omega) = (2\pi)^{-s/2}\, \rho^{-(s-2)/2} \int_0^\infty \phi(r)\, r^{s/2}\, J_{(s-2)/2}(r\rho)\, dr, \tag{1.3}$$
where $\rho = |\omega|$ and $J_{(s-2)/2}$ is the Bessel function of the first kind, compactly supported radial basis functions were constructed in Wu [18] and Wendland [16]. In this paper we consider approximation by Wendland's functions, whose definition and properties we now review. Following Reference [16], let
$$(I\phi)(r) = \int_r^\infty t\,\phi(t)\, dt, \quad r \ge 0,$$
and
$$\phi_{s,k} = I^k \phi_{\lfloor s/2 \rfloor + k + 1}, \tag{1.4}$$
where $\phi_\ell(r) = (1 - r)_+^\ell$ and $\lfloor x \rfloor$ denotes the largest integer $\le x$. It is shown in Reference [16] that $\phi_{s,k}$ is compactly supported in $[0,1]$ and induces a positive definite function on $\mathbb{R}^s$; it has the form
$$\phi_{s,k}(r) = \begin{cases} p_{s,k}(r), & 0 \le r \le 1, \\ 0, & r > 1, \end{cases}$$
where $r = |x|$ and $p_{s,k}$ is a univariate polynomial of degree $\lfloor s/2 \rfloor + 3k + 1$. Moreover, $\phi_{s,k}$ possesses continuous derivatives up to order $2k$; it is of minimal degree for a given space dimension $s$ and smoothness $2k$, and it is, up to a constant factor, uniquely determined by these requirements. It is also shown in Reference [17] that the Fourier transform of $\Phi_{s,k}(x) = \phi_{s,k}(|x|)$ satisfies
$$K_1 (1 + |\omega|^2)^{-s/2 - k - 1/2} \le \hat{\Phi}_{s,k}(\omega) \le K_2 (1 + |\omega|^2)^{-s/2 - k - 1/2} \tag{1.5}$$
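To make (1.4) concrete: for $s = 3$, $k = 1$ one has $\lfloor s/2 \rfloor + k + 1 = 3$, and a single application of $I$ to $\phi_3(r) = (1-r)_+^3$ yields the well-known Wendland function $\phi_{3,1}(r) = (1-r)_+^4(4r+1)$ up to a constant factor. The following sketch (assuming SymPy is available) carries out this computation symbolically on $[0,1]$:

```python
import sympy as sp

r, t = sp.symbols("r t", nonnegative=True)

def I_op(phi):
    # (I phi)(r) = int_r^oo t * phi(t) dt; the integrand vanishes for t > 1,
    # so integrating up to 1 suffices for functions supported in [0, 1].
    return sp.integrate(t * phi.subs(r, t), (t, r, 1))

ell = 3                      # floor(s/2) + k + 1 for s = 3, k = 1
phi_3 = (1 - r)**ell         # phi_ell(r) = (1 - r)_+^ell, restricted to [0, 1]

phi_31 = sp.expand(I_op(phi_3))
# On [0, 1] this equals (1 - r)^4 (4r + 1) / 20, i.e. phi_{3,1} up to a constant
print(sp.factor(phi_31))
```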

for some constants $K_1$ and $K_2$. It was derived in Theorem 2.2 of Reference [17], by applying the results of Wu and Schaback [19], that if $X = \{x_1, \dots, x_N\} \subset \Omega$ for some compact subset $\Omega$ of $\mathbb{R}^s$ satisfying a uniform interior cone condition, then for any $f \in H_\ell(\mathbb{R}^s)$, where $\ell = s/2 + k + 1/2$ and $H_\ell(\mathbb{R}^s)$ is the Sobolev space,
$$\|f - s_{f,X}\|_{L_\infty(\Omega)} \le C\, h^{k+1/2}\, \|f\|_{H_\ell(\mathbb{R}^s)},$$
with
$$h := \sup_{x \in \Omega} \min_{x_j \in X} |x - x_j|$$
being sufficiently small.

In this paper, instead of using radial interpolants, we apply a greedy algorithm and discuss approximation by convex linear combinations of translates of $\phi_{s,k}$. By applying the results in Reference [10], we show that, regardless of the dimension, an $O(m^{-1/2})$ order of approximation can be achieved by a linear combination of $m$ translates of $\phi_{s,k}$, which parallels a well-known result in the literature of neural networks. We then derive a similar result on neural networks by using compactly supported univariate radial functions as the activation functions of the networks.

2. Greedy Algorithm and Order of Approximation by Compactly Supported Radial Basis Functions

We begin by describing a greedy algorithm. For more information on this subject, readers are referred to Jones [8], DeVore and Temlyakov [5], Davis, Mallat, and Avellaneda [2], and Donahue, Gurvits, Darken, and Sontag [3]. We state a result by Jones [8], also due to Maurey (see Pisier [12]).

Lemma 1 If $f$ is in the closure of the convex hull of a set $G$ in a Hilbert space, with $\|g\| \le b$ for each $g \in G$, then for every $n \ge 1$ and every $c' > b^2 - \|f\|^2$, there is an $f_n$ in the convex hull of $n$ points in $G$ such that
$$\|f - f_n\|^2 \le \frac{c'}{n}.$$

An iterative procedure for achieving the above approximation order is provided by Jones, which we describe as a greedy algorithm as follows. Let $f_1 \in G$ be such that
$$\|f - f_1\|^2 \le \inf_{g \in G} \|f - g\|^2 + \varepsilon_1.$$
Inductively, for $k > 1$, choose $\alpha_k \in [0,1]$ and $g_k \in G$

such that
$$\left\|f - \big((1 - \alpha_k) f_{k-1} + \alpha_k g_k\big)\right\|^2 \le \inf_{\alpha \in [0,1],\ g \in G} \left\|f - \big((1 - \alpha) f_{k-1} + \alpha g\big)\right\|^2 + \varepsilon_k.$$
Then, setting $f_k = (1 - \alpha_k) f_{k-1} + \alpha_k g_k$, one obtains, for suitably chosen $\varepsilon_k$,
$$\|f - f_k\|^2 \le \frac{c'}{k}.$$
This algorithm is greedy in the sense that we choose an optimal approximation at each step. As is well understood, to ensure that such an algorithm converges, certain conditions need to be imposed on the function $f$.

Lemma 1 was first applied by Barron [1] to establish a well-known result on approximating a multivariate function at the order $O(m^{-1/2})$ by a network with $m$ neurons (cf. Section 3). It was then applied by the author [10] to derive a similar result on simultaneous approximation by translated radial networks, which we here apply in particular to compactly supported radial basis functions.

For a region $\Omega \subset \mathbb{R}^s$, denote by $H_n(\Omega)$ the Sobolev space consisting of all distributions $f$ on $\Omega$ with $D^\alpha f \in L_2(\Omega)$ for every multi-index $\alpha$ with $|\alpha| \le n$, where $n \ge 0$ is an integer, equipped with the norm
$$\|f\|_{H_n(\Omega)} = \Big( \sum_{|\alpha| \le n} \|D^\alpha f\|_{L_2(\Omega)}^2 \Big)^{1/2}.$$
When $\Omega = \mathbb{R}^s$, an equivalent norm for $f \in H_n(\mathbb{R}^s)$ is given by
$$\Big( \int_{\mathbb{R}^s} |\hat{f}(\omega)|^2 (1 + |\omega|^2)^n\, d\omega \Big)^{1/2}.$$
In this paper, we consider approximation in $H_n := H_n([-\pi,\pi]^s)$.

For a compactly supported $\phi_{s,k}$, we define a function $G$ by
$$G(x) = \sum_{\mathbf{k} \in \mathbb{Z}^s} \phi_{s,k}(|x - 2\pi \mathbf{k}|), \tag{2.1}$$
which is $2\pi$-periodic in each coordinate. (The function $G$ in (2.1) differs slightly from the one used in Reference [10], where the periodization is modified to enable the study of simultaneous approximation.) For $f \in L_2([-\pi,\pi]^s)$, its Fourier series expansion is
$$f(x) = \sum_{\mathbf{k} \in \mathbb{Z}^s} c_{\mathbf{k}}(f)\, e^{i \langle \mathbf{k}, x \rangle},$$
where
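The iteration above is easy to simulate in a finite-dimensional Hilbert space. The sketch below is an illustration, not the paper's setting: $G$ is a finite dictionary of unit vectors in $\mathbb{R}^d$ (so $b = 1$ in Lemma 1) and $f$ a convex combination of them, and the greedy step computes the optimal $\alpha \in [0,1]$ in closed form for each candidate $g$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Dictionary G: finitely many unit vectors in R^d, so b = 1 in Lemma 1.
d, n_atoms = 20, 200
G = rng.normal(size=(n_atoms, d))
G /= np.linalg.norm(G, axis=1, keepdims=True)

# Target f: a convex combination of dictionary elements,
# hence f lies in the convex hull of G.
w = rng.random(n_atoms)
w /= w.sum()
f = w @ G

def greedy(f, G, m):
    # Jones-type greedy iteration: f_k = (1 - a) f_{k-1} + a g,
    # minimizing over g in G with the optimal a in [0, 1] in closed form.
    fk = min(G, key=lambda g: np.linalg.norm(f - g))  # best single element
    for _ in range(m - 1):
        best_err, best_fk = None, None
        for g in G:
            u = g - fk
            denom = u @ u
            a = 0.0 if denom == 0.0 else float(np.clip((f - fk) @ u / denom, 0.0, 1.0))
            cand = (1.0 - a) * fk + a * g
            err = np.linalg.norm(f - cand)
            if best_err is None or err < best_err:
                best_err, best_fk = err, cand
        fk = best_fk
    return fk

errs = [np.linalg.norm(f - greedy(f, G, m)) for m in (1, 5, 20)]
print(errs)  # non-increasing: each step can only improve the error
```

Since $\alpha = 0$ (keeping $f_{k-1}$) is always admissible, the error is non-increasing in the number of steps, and its decay is consistent with the $O(m^{-1/2})$ rate of Lemma 1.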

$$c_{\mathbf{k}}(f) = (2\pi)^{-s} \int_{[-\pi,\pi]^s} f(x)\, e^{-i \langle \mathbf{k}, x \rangle}\, dx$$
are the Fourier coefficients of $f$. Notice that, by unfolding the periodization (2.1),
$$c_{\mathbf{k}}(G) = (2\pi)^{-s} \int_{\mathbb{R}^s} \phi_{s,k}(|x|)\, e^{-i \langle \mathbf{k}, x \rangle}\, dx = \hat{\Phi}_{s,k}(\mathbf{k}). \tag{2.2}$$
For $q > 0$, let
$$M_q(G) = \{ c\, G(x - t) : t \in [-\pi,\pi]^s,\ c \in \mathbb{R} \text{ with } |c| \le q \}.$$
Then the following lemma follows from (1.5), (2.2), and Lemma 4.3 of Reference [10] and its proof; it is similar in spirit to Lemma 5, proved in the next section.

Lemma 2 Let $G$ be given in (2.1) with respect to $\phi_{s,k}$. Then for any $f \in H_{n+s+2k+1}$, where $2k > n$, the function $f - c_0(f)$ is in the closure of the convex hull of $M_{E_f}(G)$ in the $H_n$ norm, where
$$E_f = \sum_{\mathbf{k} \in \mathbb{Z}^s} \frac{|c_{\mathbf{k}}(f)|}{\hat{\Phi}_{s,k}(\mathbf{k})}.$$

In the above lemma and the following results, we require $2k > n$ to ensure $\Phi_{s,k} \in H_n$, since $\phi_{s,k} \in C^{2k}$. As an obvious consequence of Lemmas 1 and 2, we arrive at the following result.

Proposition 3 Let $f \in H_{n+s+2k+1}$. Let $\phi_{s,k}$ denote the compactly supported radial basis function of minimal degree that is positive definite on $\mathbb{R}^s$ and in $C^{2k}$, where $2k > n$. Then for any integer $m \ge 1$, there is a function
$$N_m(x) = \sum_{j=1}^{2^s m} c_j\, \phi_{s,k}(|x - t_j|),$$
where $c_j \in \mathbb{R}$, $t_j \in [-3\pi, 3\pi]^s$ for $1 \le j \le 2^s m$, such that
$$\|f - c_0(f) - N_m\|_{H_n} \le \frac{C_f}{\sqrt{m}},$$
where $C_f$ is a constant independent of $m$.

Proof By Lemmas 1 and 2, there is a function
$$f_m(x) = \sum_{j=1}^{m} c_j\, G(x - t_j),$$
where $c_j \in \mathbb{R}$, $t_j \in [-\pi,\pi]^s$ for $1 \le j \le m$, such that
$$\|f - c_0(f) - f_m\|_{H_n} \le \frac{C_f}{\sqrt{m}}.$$
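The periodization (2.1) and the localization used in the proof of Proposition 3 can be checked numerically. The sketch below takes $s = 2$, $k = 1$, for which $\phi_{2,1}$ coincides, up to a constant, with $(1-r)_+^4(4r+1)$ (since $\lfloor 2/2 \rfloor + k + 1 = 3$). Because the support radius $1$ is smaller than the period $2\pi$, at most one translate is active at any point:

```python
import numpy as np
from itertools import product

def phi_21(r):
    # Wendland phi_{2,1}(r) = (1 - r)^4 (4r + 1) on [0, 1], 0 otherwise (up to a constant)
    return np.where(r < 1.0, (1.0 - r)**4 * (4.0*r + 1.0), 0.0)

def G(x):
    # Truncated periodization (2.1) for s = 2; since supp(phi_21) = [0, 1] and the
    # translates are spaced 2*pi apart, |k_i| <= 2 covers every non-vanishing term
    # for x in [-3*pi, 3*pi]^2.
    return sum(phi_21(np.linalg.norm(x - 2.0*np.pi*np.array(k)))
               for k in product(range(-2, 3), repeat=2))
```

For $|x| < 1$ only the $\mathbf{k} = 0$ term survives, so $G(x) = \phi_{2,1}(|x|)$ there, and shifting $x$ by $2\pi$ in one coordinate leaves $G$ unchanged.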

Notice that for each $t \in [-\pi,\pi]^s$, by (2.1), $G(\cdot - t)$ is a sum of at most $2^s$ functions $\phi_{s,k}(|\cdot - t - 2\pi \mathbf{k}|)$, $\mathbf{k} \in \mathbb{Z}^s$, that do not vanish identically on $[-\pi,\pi]^s$, since $\Phi_{s,k}(\cdot) = \phi_{s,k}(|\cdot|)$ is supported in $[-1,1]^s$; for each such $\mathbf{k}$, the center $t + 2\pi \mathbf{k}$ lies in $[-3\pi, 3\pi]^s$. The conclusion then follows.

3. Neural Networks by Compactly Supported Functions

A neural network with one hidden layer is mathematically expressed as
$$N(x) = \sum_{k=1}^{n} c_k\, \sigma(\langle w_k, x \rangle + \theta_k), \tag{3.1}$$
where $\sigma$ is the activation function of the network, $c_k \in \mathbb{R}$, $w_k \in \mathbb{R}^s$, $\theta_k \in \mathbb{R}$ for $1 \le k \le n$, and $n$ is the number of neurons. Approximation by networks of the form (3.1) has been studied extensively in the literature, with various results established by many authors under more or less restrictive conditions (cf. [1, 9] and the references therein). In this section we use a compactly supported univariate function $\phi_{1,k}$ to construct neural networks, and establish the following result.

Proposition 4 Let $f \in H_{n+2k+2}$. Let $\phi_{1,k}$ denote the compactly supported univariate radial basis function of minimal degree that is positive definite on $\mathbb{R}$ and in $C^{2k}$, where $2k > n$. Then for any $m \ge 1$, there is a network of the form
$$N_m(x) = c_0(f) + \sum_{j=1}^{(s+2)m} c_j\, \phi_{1,k}(|\langle w_j, x \rangle + \theta_j|)$$
such that
$$\|f - N_m\|_{H_n} \le \frac{C_f}{\sqrt{m}},$$
where $C_f$ is independent of $m$.

Let
$$\sigma(x) = \sum_{n \in \mathbb{Z}} \phi_{1,k}(|x + 2n\pi|), \quad x \in \mathbb{R}. \tag{3.2}$$
For a constant $M > 0$, set
$$\Omega_M(\sigma) = \{ c\, \sigma(\langle w, x \rangle + \theta) : c, \theta \in \mathbb{R},\ w \in \mathbb{R}^s, \text{ with } |c| \le M,\ |w| \le 1,\ |\theta| \le \pi \}. \tag{3.3}$$
Then we have the following lemma. For the reader's convenience, we present its proof, which uses an argument similar to that of Reference [10].

Lemma 5 Let $\sigma$ be given in (3.2) with respect to $\phi_{1,k}$. Then for any $f \in H_{n+2k+2}$, where $2k > n$, the function $f(x) - c_0(f)$ is in the closure of the convex hull of $\Omega_{M_f}(\sigma)$ in the $H_n$ norm, where
$$M_f = \sum_{\mathbf{j} \in \mathbb{Z}^s} \frac{|c_{\mathbf{j}}(f)|}{\hat{\Phi}_{1,k}(|\mathbf{j}|)}.$$
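A small sketch of the objects in this section (assuming NumPy; the normalization of $\phi_{1,1}$ is again fixed only up to a constant): for $s = 1$, $k = 1$ one has $\phi_{1,1}(r) \propto (1-r)_+^3(3r+1)$, the activation $\sigma$ of (3.2) is its $2\pi$-periodization, and a one-hidden-layer network (3.1) is a weighted sum of ridge compositions of $\sigma$:

```python
import numpy as np

def phi_11(r):
    # Univariate Wendland phi_{1,1}(r) = (1 - r)^3 (3r + 1) on [0, 1] (up to a constant)
    r = np.abs(r)
    return np.where(r < 1.0, (1.0 - r)**3 * (3.0*r + 1.0), 0.0)

def sigma(x):
    # 2*pi-periodic activation (3.2); finitely many translates suffice on a bounded range
    return sum(phi_11(x + 2.0*np.pi*n) for n in range(-4, 5))

def network(x, c, W, theta):
    # One-hidden-layer network (3.1) with activation sigma
    return sum(ck * sigma(W[k] @ x + theta[k]) for k, ck in enumerate(c))
```

Since the translates in (3.2) are spaced $2\pi$ apart while the support of $\phi_{1,1}(|\cdot|)$ has length $2$, they do not overlap: $\sigma$ restricted to $[-\pi, \pi]$ equals $\phi_{1,1}(|x|)$, and $\sigma(x + 2\pi) = \sigma(x)$.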

Proof Without loss of generality, assume $c_0(f) = 0$. Denote the closure of the convex hull of $\Omega_{M_f}(\sigma)$ by $\mathrm{Co}(\Omega_{M_f}(\sigma))$. Then $\mathrm{Co}(\Omega_{M_f}(\sigma))$ is a bounded subset of $H_n$. Suppose that $f$ is not in $\mathrm{Co}(\Omega_{M_f}(\sigma))$. Then, by a standard argument on the dual space and an application of the Hahn–Banach theorem (cf. [13], for instance), there exist $g_\alpha \in L_2[-\pi,\pi]^s$, $|\alpha| \le n$, such that
$$\sup_{h \in \mathrm{Co}(\Omega_{M_f}(\sigma))} \sum_{|\alpha| \le n} \int_{[-\pi,\pi]^s} D^\alpha h\, g_\alpha\, dx \le r < \sum_{|\alpha| \le n} \int_{[-\pi,\pi]^s} D^\alpha f\, g_\alpha\, dx \tag{3.4}$$
for some constant $r$. Observe that for any $\varepsilon > 0$, one can easily construct $\tilde{g}_\alpha$ with partial derivatives of all orders, compactly supported in $[-\pi,\pi]^s$, such that
$$\|g_\alpha - \tilde{g}_\alpha\|_{L_2[-\pi,\pi]^s} < \varepsilon.$$
Therefore, without loss of generality, we may assume that the functions $g_\alpha$ in (3.4) have partial derivatives of all orders and are compactly supported in $[-\pi,\pi]^s$. Let
$$s(x) = \sum_{|\alpha| \le n} (-1)^{|\alpha|} D^\alpha g_\alpha(x).$$
Evaluating the integrals in (3.4) by parts then yields
$$\int_{[-\pi,\pi]^s} h(x)\, s(x)\, dx \le r < \int_{[-\pi,\pi]^s} f(x)\, s(x)\, dx \tag{3.5}$$
for any $h \in \mathrm{Co}(\Omega_{M_f}(\sigma))$. In particular, for any $c, \theta \in \mathbb{R}$ and $w \in \mathbb{R}^s$ with $|c| \le M_f$, $|w| \le 1$, $|\theta| \le \pi$,
$$\int_{[-\pi,\pi]^s} c\, \sigma(\langle w, x \rangle + \theta)\, s(x)\, dx \le r < \int_{[-\pi,\pi]^s} f(x)\, s(x)\, dx. \tag{3.6}$$
Then (3.6) implies
$$M_f \left| \int_{[-\pi,\pi]^s} \sigma(\langle w, x \rangle + \theta)\, s(x)\, dx \right| \le r. \tag{3.7}$$
For any multi-integer $\mathbf{j} \in \mathbb{Z}^s$, $\mathbf{j} \ne 0$, let $w_{\mathbf{j}} = \mathbf{j}/|\mathbf{j}|$; hence $|w_{\mathbf{j}}| \le 1$. Expanding the $2\pi$-periodic function $\sigma$ in its Fourier series, whose coefficients are $\hat{\Phi}_{1,k}(m)$ by the analogue of (2.2), we obtain
$$\int_{[-\pi,\pi]^s} s(x)\, \sigma(\langle w_{\mathbf{j}}, x \rangle + \theta)\, dx = \sum_{m \in \mathbb{Z}} \hat{\Phi}_{1,k}(m)\, e^{im\theta} \int_{[-\pi,\pi]^s} s(x)\, e^{im \langle w_{\mathbf{j}}, x \rangle}\, dx.$$

From (3.7), isolating the relevant frequency and using the decay (1.5) of $\hat{\Phi}_{1,k}$, we have, for every $\mathbf{j} \ne 0$,
$$\left| \int_{[-\pi,\pi]^s} s(x)\, e^{i \langle \mathbf{j}, x \rangle}\, dx \right| \le \frac{r}{M_f\, \hat{\Phi}_{1,k}(|\mathbf{j}|)}.$$
According to the definition of $M_f$, since $c_0(f) = 0$,
$$\int_{[-\pi,\pi]^s} f(x)\, s(x)\, dx = \sum_{\mathbf{j} \ne 0} c_{\mathbf{j}}(f) \int_{[-\pi,\pi]^s} s(x)\, e^{i \langle \mathbf{j}, x \rangle}\, dx \le \sum_{\mathbf{j} \ne 0} |c_{\mathbf{j}}(f)|\, \frac{r}{M_f\, \hat{\Phi}_{1,k}(|\mathbf{j}|)} \le r,$$
which contradicts (3.5). The proof of the lemma is finished.

Proof of Proposition 4 First, by Lemmas 1 and 5, we conclude that there is a network
$$\tilde{N}_m(x) = c_0(f) + \sum_{k=1}^{m} c_k\, \sigma(\langle w_k, x \rangle + \theta_k), \quad |w_k| \le 1,\ |\theta_k| \le \pi,$$
such that
$$\|f - \tilde{N}_m\|_{H_n} \le \frac{C_f}{\sqrt{m}}.$$
Observe that $|\langle w_k, x \rangle + \theta_k| \le (s+1)\pi$ for $x \in [-\pi,\pi]^s$, and for $t \in [-(s+1)\pi, (s+1)\pi]$, $\sigma(t)$ is a sum of at most $s+2$ functions $\phi_{1,k}(|t + 2n\pi|)$, $|2n| \le s+1$. Therefore, the conclusion of Proposition 4 follows.

Acknowledgement: The author thanks the referees for their helpful comments.

References

[1] A. Barron, Universal approximation bounds for superpositions of a sigmoidal function, IEEE Trans. Inf. Theory, 39 (1993), 930-945.

[2] G. Davis, S. Mallat, and M. Avellaneda, Adaptive greedy approximations, Constr. Approx., 13 (1997), 57-98.

[3] M.J. Donahue, L. Gurvits, C. Darken, and E. Sontag, Rates of convex approximation in non-Hilbert spaces, Constr. Approx., 13 (1997), 187-220.

[4] M. Daehlen, T. Lyche, and L.L. Schumaker, Mathematical Methods for Curves and Surfaces, Vanderbilt University Press, Nashville & London, 1995.

[5] R.A. DeVore and V.N. Temlyakov, Some remarks on greedy algorithms, Advances in Computational Mathematics, 5 (1996), 173-187.

[6] R. Franke, Scattered data interpolation: tests of some methods, Mathematics of Computation, 38 (1982), 181-200.

[7] M.A. Golberg and C.S. Chen, Discrete Projection Methods for Integral Equations, Computational Mechanics Publications, Southampton, Boston, 1997.

[8] L.K. Jones, A simple lemma on greedy approximation in Hilbert space and convergence rates for projection pursuit regression and neural network training, Ann. Statist., 20 (1992), 608-613.

[9] X. Li, Simultaneous approximations of multivariate functions and their derivatives by neural networks with one hidden layer, Neurocomputing, 12 (1996), 327-343.

[10] X. Li, On simultaneous approximations by radial basis function neural networks, Applied Mathematics and Computation, 95 (1998), 75-89.

[11] C.A. Micchelli, Interpolation of scattered data: distance matrices and conditionally positive definite functions, Constr. Approx., 1 (1986), 11-22.

[12] G. Pisier, Remarques sur un résultat non publié de B. Maurey, Séminaire d'analyse fonctionnelle, Exposés 1-12, École Polytechnique, Palaiseau, 1981.

[13] W. Rudin, Functional Analysis, McGraw-Hill, New York, 1973.

[14] R. Schaback, Improved error bounds for scattered data interpolation by radial basis functions, Math. Comp., to appear.

[15] J. Stewart, Positive definite functions and generalizations, an historical survey, Rocky Mountain J. Math., 6 (1976), 409-434.

[16] H. Wendland, Piecewise polynomial, positive definite and compactly supported radial basis functions of minimal degree, Advances in Computational Mathematics, 4 (1995), 389-396.

[17] H. Wendland, Error estimates for interpolation by compactly supported radial basis functions of minimal degree, J. Approx. Theory, 93 (1998), 258-272.

[18] Z. Wu, Compactly supported positive definite radial functions, Advances in Computational Mathematics, 4 (1995), 283-292.

[19] Z. Wu and R. Schaback, Local error estimates for radial basis function interpolation of scattered data, IMA Journal of Numerical Analysis, 13 (1993), 13-27.