An l1 Regularized Method for Numerical Differentiation Using Empirical Eigenfunctions


Journal of Mathematical Research with Applications, Jul. 2017, Vol. 37, No. 4, pp. 496–504
DOI: 10.3770/j.issn:2095-2651.2017.04.0  Http://jmre.dlut.edu.cn

An l1 Regularized Method for Numerical Differentiation Using Empirical Eigenfunctions

Junbin LI, Renhong WANG, Min XU
School of Mathematical Sciences, Dalian University of Technology, Liaoning 116024, P. R. China

Abstract  We propose an l1 regularized method for numerical differentiation using empirical eigenfunctions. Compared with traditional methods for numerical differentiation, the output of our method can be considered directly as the derivative of the underlying function. Moreover, our method can produce sparse representations with respect to empirical eigenfunctions. Numerical results show that our method is quite effective.

Keywords  numerical differentiation; empirical eigenfunctions; l1 regularization; Mercer kernel

MR(2010) Subject Classification  65D15; 65F22

Received April 25, 2017; Accepted May 25, 2017
Supported by the National Natural Science Foundation of China (Grant Nos. 30052; 30045; 27060; 60064; 67068), the Fundamental Research Funds for the Central Universities (Grant No. DUT6LK33) and the Fundamental Research of Civil Aircraft (Grant No. MJ-F-202-04).
* Corresponding author. E-mail address: junbin@mail.dlut.edu.cn (Junbin LI)

1. Introduction

Numerical differentiation is the problem of determining the derivatives of a function from its values on scattered points. It plays an important role in scientific research and applications, such as solving Volterra integral equations [1], image processing [2], option pricing models [3] and identification [4]. The main difficulty of numerical differentiation is that it is an ill-posed problem, which means that a small error in the measurements may cause a huge error in the computed derivatives [5]. Several methods for numerical differentiation have been proposed in the literature, including difference methods [6] and interpolation methods [7]. In particular, some researchers have proposed to use Tikhonov regularization for numerical differentiation problems, which has been shown to be quite effective [8–10].

Note that most regularization methods for numerical differentiation consist of estimating a function from the given data and then computing derivatives of that function. However, in many practical applications, what we need to obtain is the derivative of the underlying function, not the underlying function itself [3,4]. Thus, a natural approach to computing derivatives would be to estimate the derivatives directly. In this paper, we propose an algorithm for numerical differentiation in the framework of statistical learning theory. More specifically, we study an l1 regularized algorithm for numerical differentiation using empirical eigenfunctions.

The key advantage of the algorithm is that its output can be considered directly as the derivative of the underlying function. Moreover, the algorithm produces sparse representations with respect to empirical eigenfunctions, without assuming sparsity in terms of any basis or system.

The remainder of this paper is organized as follows. In Section 2, we first review some basic facts in statistical learning theory and then present our main algorithm. In Section 3, we present an approach for computing the empirical eigenfunctions explicitly. In Section 4, we establish the representer theorem of the algorithm. To illustrate the effectiveness of the algorithm, we provide several numerical examples in Section 5. Finally, some concluding remarks are given in Section 6.

2. Formulation of the method

To present our main algorithm, let us first describe the basic setting of statistical learning theory. Let $X$ be the input space and $Y \subseteq \mathbb{R}$ the output space. Assume that $\rho$ is a Borel probability measure on $Z = X \times Y$. Let $\rho_X$ be the marginal distribution on $X$ and $\rho(\cdot\,|\,x)$ the conditional distribution on $Y$ given $x$. Let $f_\rho$ be the function defined by
$$f_\rho(x) = \int_Y y \, d\rho(y\,|\,x), \quad x \in X.$$
Given a sample $z = \{(x_i, y_i)\}_{i=1}^m$ drawn independently and identically according to $\rho$, we are interested in estimating the derivative of $f_\rho$. More precisely, we want to find a function $f_z : X \to \mathbb{R}$ that can be used as an approximation of the derivative of $f_\rho$.

Before proceeding further, we need to introduce some notions related to kernels [11,12]. A Mercer kernel on $X$ is defined to be a symmetric continuous function $K : X \times X \to \mathbb{R}$ such that for any finite subset $\{x_i\}_{i=1}^m$ of $X$, the matrix $K$ whose $(i,j)$ entry is $K(x_i, x_j)$ is positive semidefinite. Let $\mathrm{span}\{K_x : x \in X\}$ denote the space spanned by the set $\{K_x = K(\cdot, x) : x \in X\}$. We define an inner product in the space $\mathrm{span}\{K_x : x \in X\}$ as follows:
$$\Big\langle \sum_{i=1}^{s} \alpha_i K_{x_i}, \; \sum_{j=1}^{t} \beta_j K_{t_j} \Big\rangle_K = \sum_{i=1}^{s}\sum_{j=1}^{t} \alpha_i \beta_j K(x_i, t_j).$$
The reproducing kernel Hilbert space $H_K$ associated with $K$ is defined to be the completion of $\mathrm{span}\{K_x : x \in X\}$ under the norm $\|\cdot\|_K$ induced by the inner product $\langle\cdot,\cdot\rangle_K$. The reproducing property in $H_K$ takes the form $f(x) = \langle f, K_x\rangle_K$ for all $x \in X$ and $f \in H_K$. Let $\kappa = \sup_{x,y \in X} \sqrt{K(x,y)}$. Then it follows from the reproducing property that $\|f\|_\infty \le \kappa \|f\|_K$ for all $f \in H_K$.

Taylor's expansion of a function $g(u)$ about the point $x$ gives, for $u \approx x$,
$$g(u) \approx g(x) + g'(x)(u - x).$$
Thus the empirical error incurred by a function $f$ on the sample points $x = x_i$, $u = x_j$ can be measured by
$$\big(g(u) - g(x) - g'(x)(u - x)\big)^2 \approx \big(y_i - y_j + f(x_i)(x_j - x_i)\big)^2.$$

The restriction $u \approx x$ can be enforced by weights $\omega_{i,j} = \omega^{(s)}_{i,j} > 0$ associated with $(x_i, x_j)$, with the requirement that $\omega^{(s)}_{i,j} \to 0$ as $|x_i - x_j|/s \to \infty$. One possible choice of weights is given by a Gaussian with variance $s > 0$. Let $\omega$ be the function on $\mathbb{R}$ given by
$$\omega(x) = \frac{1}{s^4}\, e^{-\frac{x^2}{2s^2}}.$$
Then this choice of weights is $\omega_{i,j} = \omega^{(s)}_{i,j} = \omega(x_i - x_j)$. The following regularized algorithm for numerical differentiation was proposed in [13]:
$$\min_{f \in H_K} \Big\{ \frac{1}{m^2} \sum_{i,j=1}^{m} \omega(x_i - x_j)\big(y_i - y_j - f(x_i)(x_i - x_j)\big)^2 + \gamma \|f\|_K^2 \Big\}. \tag{1}$$
In this paper, we shall modify the algorithm (1) by using an l1 regularizer. Note that the l1 regularizer plays a key role in producing sparse approximations. This phenomenon has been observed in LASSO [14] and compressed sensing [15], under the assumption that the approximated function has a sparse representation with respect to some basis.

Let $L_{K,s}$ denote the operator defined by
$$L_{K,s}(f) = \int_X \int_X \omega(x - u)\, K_x\, (u - x)^2 f(x) \, d\rho_X(x)\, d\rho_X(u), \quad f \in H_K. \tag{2}$$
The operator $L_{K,s}$ is compact, positive, and self-adjoint [13]. Therefore it has at most countably many eigenvalues, and all of these eigenvalues are nonnegative. One can arrange these eigenvalues $\{\lambda_l\}$ (with multiplicities) as a nonincreasing sequence tending to 0 and take an associated sequence of eigenfunctions $\{\phi_l\}$ to be an orthonormal basis of $H_K$.

Let $x$ denote the unlabeled part of the sample $z = \{(x_i, y_i)\}_{i=1}^m$, i.e., $x = \{x_i\}_{i=1}^m$. We consider another operator $L^x_{K,s}$ defined on $H_K$ as follows:
$$L^x_{K,s}(f) = \frac{1}{m(m-1)} \sum_{i,j=1}^{m} \omega(x_i - x_j)\, K_{x_i}\, (x_j - x_i)^2 f(x_i), \quad f \in H_K. \tag{3}$$
It is easy to show that $E_x(L^x_{K,s} f) = L_{K,s} f$, which means $E_x(L^x_{K,s}) = L_{K,s}$. As a result, $L^x_{K,s}$ can be viewed as an empirical version of the operator $L_{K,s}$ with respect to $x$. The operator $L^x_{K,s}$ is self-adjoint and positive. Its eigensystem, called an empirical eigensystem, is denoted by $\{(\lambda^x_l, \phi^x_l)\}$, where the eigenvalues $\{\lambda^x_l\}$ are arranged in nonincreasing order. We notice here two important facts: for one thing, all the empirical eigenfunctions $\{\phi^x_l\}$ form an orthonormal basis of $H_K$; for another, at most $m$ eigenvalues are nonzero, i.e., $\lambda^x_l = 0$ whenever $l > m$.

Based on the first $m$ empirical eigenfunctions $\{\phi^x_l\}_{l=1}^m$, we are now in a position to present our main algorithm as follows:
$$c^z_\gamma = \arg\min_{c \in \mathbb{R}^m} \Big\{ \frac{1}{m^2} \sum_{i,j=1}^{m} \omega(x_i - x_j)\Big(y_i - y_j + \Big(\sum_{l=1}^{m} c_l \phi^x_l(x_i)\Big)(x_j - x_i)\Big)^2 + \gamma \|c\|_1 \Big\}. \tag{4}$$
The output function of algorithm (4) is
$$f^z_\gamma = \sum_{l=1}^{m} c^z_{\gamma,l}\, \phi^x_l,$$
which is expected to approximate the derivative of the underlying target function $f_\rho$. Next we shall focus on the computation of the empirical eigenpairs, the representer theorem (i.e., the explicit solution to problem (4)), and the sparsity of the coefficients in the representation $f^z_\gamma = \sum_{l=1}^{m} c^z_{\gamma,l}\, \phi^x_l$.
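To make the objective in (4) concrete, the following is a minimal NumPy sketch (not from the paper) that evaluates it for a given coefficient vector. It assumes a matrix Phi of eigenfunction values $\phi^x_l(x_i)$ has already been computed (this is the subject of Section 3) and uses the Gaussian weight as reconstructed above; all function and variable names are illustrative.

```python
import numpy as np

def gaussian_weights(x, s):
    """omega(x_i - x_j) for the Gaussian weight omega(t) = s**(-4) * exp(-t**2 / (2 s**2));
    the normalizing constant only rescales the data-fit term relative to gamma."""
    diff = x[:, None] - x[None, :]
    return np.exp(-diff**2 / (2.0 * s**2)) / s**4

def objective_4(c, x, y, Phi, s, gamma):
    """Value of problem (4): (1/m^2) sum_ij w_ij (y_i - y_j + f(x_i)(x_j - x_i))^2 + gamma*||c||_1,
    where f(x_i) = sum_l c_l phi_l^x(x_i) and Phi[i, l] = phi_l^x(x_i) is assumed given."""
    m = len(x)
    w = gaussian_weights(x, s)
    diff = x[None, :] - x[:, None]        # diff[i, j] = x_j - x_i
    f_at_x = Phi @ c                      # f(x_i)
    resid = y[:, None] - y[None, :] + f_at_x[:, None] * diff
    return (w * resid**2).sum() / m**2 + gamma * np.abs(c).sum()
```

Any generic l1 solver (coordinate descent, proximal gradient) could minimize this objective directly, but Sections 3 and 4 show that the empirical eigenfunctions decouple the problem coordinatewise and yield a closed-form solution.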

3. Computations of empirical eigenpairs

We shall establish in this section an approach for computing the empirical eigenpairs $\{(\lambda^x_l, \phi^x_l)\}$ explicitly. To present our method, some notations and definitions are needed. Recall that $K$ denotes the matrix whose $(i,j)$ entry is $K(x_i, x_j)$. For $1 \le i \le m$, define
$$b_i = \sum_{j=1}^{m} \omega(x_i - x_j)(x_j - x_i)^2, \qquad d_i = \sqrt{b_i}.$$
Let $B = \mathrm{diag}\{b_1, b_2, \ldots, b_m\}$, $D = \mathrm{diag}\{d_1, d_2, \ldots, d_m\}$, and $A = DKD$. Denote by $\mathrm{rank}(A)$ and $\mathrm{rank}(L^x_{K,s})$ the ranks of the matrix $A$ and the operator $L^x_{K,s}$, respectively. In the following theorem, we express the empirical eigenpairs of the operator $L^x_{K,s}$ in terms of the eigenpairs of a matrix.

Theorem 3.1  Let $d = \mathrm{rank}(A)$. Denote all eigenvalues of $A$ as $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d > \lambda_{d+1} = \cdots = \lambda_m = 0$, and the corresponding orthonormal eigenvectors as $u_1, u_2, \ldots, u_m$. Then $\mathrm{rank}(L^x_{K,s}) = \mathrm{rank}(A)$, and the empirical eigenpairs $\{(\lambda^x_l, \phi^x_l)\}_{l=1}^{d}$ of $L^x_{K,s}$ can be computed in terms of the eigenpairs of $A$ as follows:
$$\lambda^x_l = \frac{\lambda_l}{m(m-1)}, \qquad \phi^x_l = \frac{1}{\sqrt{\lambda_l}} \sum_{j=1}^{m} d_j (u_l)_j K_{x_j}.$$

Proof  By the definitions of $L^x_{K,s}$ and $\phi^x_l$, we have
$$\begin{aligned}
L^x_{K,s}(\phi^x_l) &= \frac{1}{m(m-1)} \sum_{i,p=1}^{m} \omega(x_i - x_p)\, K_{x_i} (x_p - x_i)^2 \cdot \frac{1}{\sqrt{\lambda_l}} \sum_{j=1}^{m} d_j (u_l)_j K(x_i, x_j) \\
&= \frac{1}{m(m-1)\sqrt{\lambda_l}} \sum_{i=1}^{m} K_{x_i} \Big(\sum_{p=1}^{m} \omega(x_i - x_p)(x_p - x_i)^2\Big) \sum_{j=1}^{m} K(x_i, x_j)\, d_j (u_l)_j \\
&= \frac{1}{m(m-1)\sqrt{\lambda_l}} \sum_{i=1}^{m} K_{x_i}\, b_i \sum_{j=1}^{m} K(x_i, x_j)\, d_j (u_l)_j
 = \frac{1}{m(m-1)\sqrt{\lambda_l}} \sum_{i=1}^{m} K_{x_i}\, d_i \sum_{j=1}^{m} d_i K(x_i, x_j)\, d_j (u_l)_j \\
&= \frac{1}{m(m-1)\sqrt{\lambda_l}} \sum_{i=1}^{m} K_{x_i}\, d_i\, \lambda_l (u_l)_i
 = \frac{\lambda_l}{m(m-1)} \cdot \frac{1}{\sqrt{\lambda_l}} \sum_{i=1}^{m} d_i (u_l)_i K_{x_i} = \lambda^x_l \phi^x_l,
\end{aligned}$$
where we used $\sum_{j=1}^{m} d_i K(x_i, x_j)\, d_j (u_l)_j = (DKD u_l)_i = (A u_l)_i = \lambda_l (u_l)_i$. Moreover, for $1 \le p, q \le d$,
$$\begin{aligned}
\langle \phi^x_p, \phi^x_q \rangle_K &= \frac{1}{\sqrt{\lambda_p \lambda_q}} \Big\langle \sum_{i=1}^{m} d_i (u_p)_i K_{x_i}, \; \sum_{j=1}^{m} d_j (u_q)_j K_{x_j} \Big\rangle_K
= \frac{1}{\sqrt{\lambda_p \lambda_q}} \sum_{i,j=1}^{m} d_i (u_p)_i\, K(x_i, x_j)\, d_j (u_q)_j \\
&= \frac{(D u_p)^T K (D u_q)}{\sqrt{\lambda_p \lambda_q}} = \frac{u_p^T D^T K D u_q}{\sqrt{\lambda_p \lambda_q}} = \frac{u_p^T A u_q}{\sqrt{\lambda_p \lambda_q}} = \frac{\lambda_q}{\sqrt{\lambda_p \lambda_q}}\, u_p^T u_q = \delta_{p,q}.
\end{aligned}$$

Therefore, the numbers $\{\lambda^x_l\}_{l=1}^{d}$ are eigenvalues of $L^x_{K,s}$ with corresponding orthonormal eigenfunctions $\{\phi^x_l\}_{l=1}^{d}$, and $\mathrm{rank}(L^x_{K,s}) \ge \mathrm{rank}(A)$.

On the other hand, let $t = \mathrm{rank}(L^x_{K,s})$. Then, for $1 \le l \le t$, it follows from $L^x_{K,s}(\phi^x_l) = \lambda^x_l \phi^x_l$ that
$$\frac{1}{m(m-1)} \sum_{i,j=1}^{m} \omega(x_i - x_j)\, K(x_i, x_p)(x_j - x_i)^2 \phi^x_l(x_i) = \frac{1}{m(m-1)} \sum_{i=1}^{m} K(x_i, x_p)\, b_i\, \phi^x_l(x_i) = \lambda^x_l \phi^x_l(x_p), \quad 1 \le p \le m.$$
Let $\phi^x_l|_x = (\phi^x_l(x_1), \ldots, \phi^x_l(x_m))^T$. Then
$$\frac{1}{m(m-1)}\, K B\, \phi^x_l|_x = \lambda^x_l\, \phi^x_l|_x, \qquad \frac{1}{m(m-1)}\, DKD^2\, \phi^x_l|_x = \lambda^x_l\, D\phi^x_l|_x, \qquad \frac{1}{m(m-1)}\, A\, \big(D\phi^x_l|_x\big) = \lambda^x_l\, \big(D\phi^x_l|_x\big).$$
Now, for $1 \le p, q \le t$, we have
$$\begin{aligned}
\delta_{p,q}\, \lambda^x_p &= \langle L^x_{K,s}(\phi^x_p), \phi^x_q \rangle_K
= \Big\langle \frac{1}{m(m-1)} \sum_{i,j=1}^{m} \omega(x_i - x_j)\, K_{x_i} (x_j - x_i)^2 \phi^x_p(x_i), \; \phi^x_q \Big\rangle_K \\
&= \frac{1}{m(m-1)} \sum_{i,j=1}^{m} \omega(x_i - x_j)(x_j - x_i)^2 \phi^x_p(x_i)\,\phi^x_q(x_i)
= \frac{1}{m(m-1)} \sum_{i=1}^{m} \phi^x_p(x_i)\, b_i\, \phi^x_q(x_i) \\
&= \frac{1}{m(m-1)} \sum_{i=1}^{m} \big(d_i \phi^x_p(x_i)\big)\big(d_i \phi^x_q(x_i)\big)
= \frac{1}{m(m-1)} \big\langle D\phi^x_p|_x, \; D\phi^x_q|_x \big\rangle.
\end{aligned}$$
It follows that, for $1 \le l \le t$, the vectors $D\phi^x_l|_x$ form an orthogonal eigenvector system of $A$, and hence $\mathrm{rank}(A) \ge \mathrm{rank}(L^x_{K,s})$. The proof of the theorem is now completed.

Remark 3.2  According to the proof of Theorem 3.1, the eigenfunctions $\{\phi^x_l\}_{l=1}^{d}$ satisfy the following two properties:
(1) $\frac{1}{m(m-1)} \sum_{i,j=1}^{m} \omega(x_i - x_j)(x_j - x_i)^2 \phi^x_p(x_i)\phi^x_q(x_i) = \delta_{p,q}\, \lambda^x_p$.
(2) If $\lambda^x_l = 0$, then $\phi^x_l(x_i)(x_j - x_i) = 0$ for all $1 \le i, j \le m$.
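As a sanity check on Theorem 3.1, here is a small NumPy sketch (not from the paper) of the construction: it forms $b_i$, $D$ and $A = DKD$, eigendecomposes $A$, and returns the empirical eigenvalues $\lambda^x_l = \lambda_l/(m(m-1))$ together with the expansion coefficients of $\phi^x_l$ in the $K_{x_j}$. The Gaussian kernel at the end is only an assumed example, since the transcription does not say which Mercer kernel the authors used.

```python
import numpy as np

def empirical_eigenpairs(x, s, kernel, tol=1e-12):
    """Theorem 3.1: b_i = sum_j omega(x_i - x_j)(x_j - x_i)^2, D = diag(sqrt(b)), A = D K D.
    Returns lam_x (empirical eigenvalues lam_l / (m(m-1))) and a matrix C whose column l
    holds the coefficients of phi_l^x = sum_j C[j, l] K_{x_j}, i.e. C[:, l] = D u_l / sqrt(lam_l)."""
    m = len(x)
    diff = x[None, :] - x[:, None]                      # diff[i, j] = x_j - x_i
    w = np.exp(-diff**2 / (2.0 * s**2)) / s**4          # omega(x_i - x_j), as in Section 2
    K = kernel(x[:, None], x[None, :])                  # kernel matrix K(x_i, x_j)
    b = (w * diff**2).sum(axis=1)
    d = np.sqrt(b)
    A = d[:, None] * K * d[None, :]                     # A = D K D
    lam, U = np.linalg.eigh(A)                          # ascending eigenvalues
    lam, U = lam[::-1], U[:, ::-1]                      # reorder to nonincreasing
    keep = lam > tol                                    # keep l <= rank(A)
    lam, U = lam[keep], U[:, keep]
    C = d[:, None] * U / np.sqrt(lam)[None, :]
    return lam / (m * (m - 1)), C

# Assumed example kernel (the paper's choice of K is not given in the transcription):
gauss_kernel = lambda u, v, sigma=1.0: np.exp(-(u - v)**2 / (2.0 * sigma**2))
```

With these outputs, Phi = K @ C gives Phi[i, l] = $\phi^x_l(x_i)$, and property (1) of Remark 3.2 corresponds to Phi.T @ np.diag(b) @ Phi / (m*(m-1)) being (approximately) the diagonal matrix of the returned empirical eigenvalues.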

4. Representer theorem

The following theorem provides the solution to problem (4) explicitly.

Theorem 4.1  For $1 \le l \le m$, denote
$$S^z_l = \begin{cases} \dfrac{1}{m^2 \lambda^x_l} \displaystyle\sum_{i,j=1}^{m} \omega^{(s)}(x_i - x_j)(y_i - y_j)\,\phi^x_l(x_i)(x_j - x_i), & \text{if } \lambda^x_l > 0, \\[4pt] 0, & \text{otherwise.} \end{cases}$$
Then the solution to problem (4) is given by
$$c^z_{\gamma,l} = \begin{cases} 0, & \text{if } 2\lambda^x_l |S^z_l| \le \gamma, \\[4pt] \dfrac{m}{m-1}\Big(-S^z_l + \dfrac{\gamma}{2\lambda^x_l}\Big), & \text{if } 2\lambda^x_l |S^z_l| > \gamma \text{ and } S^z_l > \dfrac{\gamma}{2\lambda^x_l}, \\[4pt] \dfrac{m}{m-1}\Big(-S^z_l - \dfrac{\gamma}{2\lambda^x_l}\Big), & \text{if } 2\lambda^x_l |S^z_l| > \gamma \text{ and } S^z_l < -\dfrac{\gamma}{2\lambda^x_l}. \end{cases} \tag{5}$$

Proof  Let $\omega_{i,j} = \omega(x_i - x_j)$. By using Remark 3.2, we can reduce the empirical error part in algorithm (4) as follows:
$$\begin{aligned}
&\frac{1}{m^2} \sum_{i,j=1}^{m} \omega_{i,j} \Big(y_i - y_j + \Big(\sum_{l=1}^{m} c_l \phi^x_l(x_i)\Big)(x_j - x_i)\Big)^2 \\
&\quad= \frac{1}{m^2} \sum_{i,j=1}^{m} \omega_{i,j} \Big[\Big(\sum_{l=1}^{m} c_l \phi^x_l(x_i)(x_j - x_i)\Big)^2 + 2(y_i - y_j)\sum_{l=1}^{m} c_l \phi^x_l(x_i)(x_j - x_i) + (y_i - y_j)^2\Big] \\
&\quad= \frac{1}{m^2} \sum_{p,q=1}^{m} c_p c_q \sum_{i,j=1}^{m} \omega_{i,j}(x_j - x_i)^2 \phi^x_p(x_i)\phi^x_q(x_i)
+ \frac{2}{m^2} \sum_{l=1}^{m} c_l \sum_{i,j=1}^{m} \omega_{i,j}(y_i - y_j)\phi^x_l(x_i)(x_j - x_i)
+ \frac{1}{m^2} \sum_{i,j=1}^{m} \omega_{i,j}(y_i - y_j)^2 \\
&\quad= \frac{m-1}{m} \sum_{l=1}^{m} c_l^2 \lambda^x_l + 2\sum_{l=1}^{m} \lambda^x_l S^z_l\, c_l + \frac{1}{m^2} \sum_{i,j=1}^{m} \omega_{i,j}(y_i - y_j)^2.
\end{aligned}$$
We now have an equivalent form of the algorithm as
$$c^z_\gamma = \arg\min_{c \in \mathbb{R}^m} \Big\{ \sum_{l=1}^{m} \Big[\lambda^x_l \Big(c_l + \frac{m}{m-1} S^z_l\Big)^2 + \frac{m}{m-1}\,\gamma\, |c_l|\Big] \Big\}.$$
It is easy to see that $c^z_{\gamma,l} = 0$ when $\lambda^x_l = 0$. When $\lambda^x_l > 0$, the component $c^z_{\gamma,l}$ can be found by solving the following optimization problem
$$c^z_{\gamma,l} = \arg\min_{c \in \mathbb{R}} \Big\{ \Big(c + \frac{m}{m-1} S^z_l\Big)^2 + \frac{m\gamma}{(m-1)\lambda^x_l}\, |c| \Big\},$$
which has the solution given by (5). This proves the theorem.
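In code, Theorem 4.1 amounts to a coordinatewise soft-thresholding step. The sketch below (not from the paper, and relying on the constants as reconstructed above) computes $S^z_l$ and applies formula (5); lam_x and Phi are assumed to come from the Section 3 sketch, and all names are illustrative.

```python
import numpy as np

def l1_coefficients(x, y, s, gamma, lam_x, Phi):
    """Closed-form solution (5).  lam_x[l] > 0 are the empirical eigenvalues and
    Phi[i, l] = phi_l^x(x_i), e.g. Phi = kernel_matrix @ C from the Section 3 sketch."""
    m = len(x)
    diff = x[None, :] - x[:, None]                      # x_j - x_i
    w = np.exp(-diff**2 / (2.0 * s**2)) / s**4          # omega(x_i - x_j)
    dy = y[:, None] - y[None, :]                        # y_i - y_j
    # S_l = (1 / (m^2 lam_x[l])) * sum_{i,j} w_ij (y_i - y_j) phi_l(x_i) (x_j - x_i)
    S = np.einsum('ij,il->l', w * dy * diff, Phi) / (m**2 * lam_x)
    thresh = gamma / (2.0 * lam_x)                      # gamma / (2 lam_x[l])
    # Formula (5): zero when 2 lam_x[l] |S_l| <= gamma, otherwise a shifted (soft-thresholded) value
    return -(m / (m - 1)) * np.sign(S) * np.maximum(np.abs(S) - thresh, 0.0)
```

The returned vector gives the derivative estimate $f^z_\gamma = \sum_l c^z_{\gamma,l}\phi^x_l$, and its number of nonzero entries is the quantity reported as the rate of non-zero coefficients in Table 1 below.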

5. Numerical examples

We present four numerical examples to illustrate the approximation performance for numerical differentiation. We consider the following functions:
$$f_1(x) = x^2 \exp(-x^2/4), \tag{6}$$
$$f_2(x) = \sin(x)\exp(-x^2/8), \tag{7}$$
$$f_3(x) = x^2 \cos(x)/8, \tag{8}$$
$$f_4(x) = x \sin(x). \tag{9}$$
To estimate the computational error, we choose $N$ test points $\{t_i\}_{i=0}^{N}$ on the interval $[-4, 4]$ and then compute the errors by using the following two formulae:
$$E_1(f) = \frac{1}{N} \sum_{i=0}^{N} \big|f^z_\gamma(t_i) - f'(t_i)\big|, \qquad E_2(f) = \Big(\frac{1}{N} \sum_{i=0}^{N} \big(f^z_\gamma(t_i) - f'(t_i)\big)^2\Big)^{1/2}.$$
In the experiments, the points $\{x_i\}_{i=0}^{20}$ are uniformly distributed over $[-4, 4]$, i.e., $x_i = -4 + 0.4i$ $(0 \le i \le 20)$. The parameters $s$ and $\gamma$ are chosen as 0.1 and 0.001, respectively. The resulting numerical results are shown in Figures 1 and 2. Moreover, the errors are listed in Table 1. From these figures, it can be observed that the function $f^z_\gamma$ matches the derivative function $f'_\rho$ well. Meanwhile, the sparsity properties can be seen explicitly from the number of non-zero coefficients in Table 1.

Function    E_1(f)    E_2(f)    rate of non-zero coefficients
f_1(x)      0.042     0.0473    10/21
f_2(x)      0.0222    0.0257    10/21
f_3(x)      0.0489    0.0868    10/21
f_4(x)      0.030     0.0349    10/21

Table 1  Errors
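For reference, the following rough end-to-end sketch (not from the paper) strings together the pieces sketched in Sections 2–4 for the first example $f_1$. The kernel choice and weight normalization are assumptions carried over from those sketches, so the exact figures in Table 1 should not be expected to reproduce.

```python
import numpy as np
# Reuses empirical_eigenpairs and gauss_kernel (Section 3 sketch) and l1_coefficients (Section 4 sketch).

f1  = lambda t: t**2 * np.exp(-t**2 / 4)
df1 = lambda t: (2*t - t**3 / 2) * np.exp(-t**2 / 4)     # exact derivative of f_1

x = -4 + 0.4 * np.arange(21)                             # x_i = -4 + 0.4 i, 0 <= i <= 20
y = f1(x)
s, gamma = 0.1, 0.001

lam_x, C = empirical_eigenpairs(x, s, gauss_kernel)
Phi = gauss_kernel(x[:, None], x[None, :]) @ C           # Phi[i, l] = phi_l^x(x_i)
c = l1_coefficients(x, y, s, gamma, lam_x, Phi)

t = np.linspace(-4, 4, 201)                              # test points on [-4, 4]
fz = gauss_kernel(t[:, None], x[None, :]) @ (C @ c)      # f^z_gamma(t) = sum_l c_l phi_l^x(t)
E1 = np.mean(np.abs(fz - df1(t)))
E2 = np.sqrt(np.mean((fz - df1(t))**2))
print(E1, E2, f"{np.count_nonzero(c)}/{len(x)} non-zero coefficients")
```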

6. Discussion

In this paper, we study a method for numerical differentiation in the framework of statistical learning theory. Based on empirical eigenfunctions, we propose an l1 regularized algorithm. We present an approach for computing the empirical eigenfunctions explicitly and establish the representer theorem of the algorithm. Compared with traditional methods for numerical differentiation, the output of our method can be considered directly as the derivative of the underlying function. Moreover, the algorithm can produce sparse representations with respect to empirical eigenfunctions, without assuming sparsity in terms of any basis or system. Finally, this work leaves several open issues for further study. For example, it is interesting to extend our method to the estimation of gradients in high dimensional spaces.

Figure 1  (a) Approximate derivative of f_1(x); (b) Approximate derivative of f_2(x)

Figure 2  (a) Approximate derivative of f_3(x); (b) Approximate derivative of f_4(x)

Acknowledgements  The authors are indebted to the anonymous reviewers for their careful comments and constructive suggestions.

References

[1] Jinquan CHENG, Y. C. HON, Yanbo WANG. A Numerical Method for the Discontinuous Solutions of Abel Integral Equations. Amer. Math. Soc., Providence, RI, 2004.
[2] S. R. DEANS. The Radon Transform and Some of Its Applications. Courier Corporation, 2007.
[3] E. G. HAUG. The Complete Guide to Option Pricing Formulas. McGraw-Hill Companies, 2007.
[4] M. HANKE, O. SCHERZER. Error analysis of an equation error method for the identification of the diffusion coefficient in a quasi-linear parabolic differential equation. SIAM J. Appl. Math., 1999, 59(3): 1012–1027.
[5] A. N. TIKHONOV, V. Y. ARSENIN. Solutions of Ill-Posed Problems. Washington, DC: Winston, 1977.
[6] R. S. ANDERSSEN, M. HEGLAND. For numerical differentiation, dimensionality can be a blessing!. Math. Comp., 1999, 68(227): 1121–1141.
[7] T. J. RIVLIN. Optimally stable Lagrangian numerical differentiation. SIAM J. Numer. Anal., 1975, 12(5): 712–725.
[8] J. CULLUM. Numerical differentiation and regularization. SIAM J. Numer. Anal., 1971, 8(2): 254–265.

[9] Shuai LU, S. PEREVERZEV. Numerical differentiation from a viewpoint of regularization theory. Math. Comp., 2006, 75(256): 1853–1870.
[10] Ting WEI, Y. C. HON, Yanbo WANG. Reconstruction of numerical derivatives from scattered noisy data. Inverse Problems, 2005, 21(2): 657–672.
[11] N. ARONSZAJN. Theory of reproducing kernels. Trans. Amer. Math. Soc., 1950, 68(3): 337–404.
[12] F. CUCKER, Dingxuan ZHOU. Learning Theory: An Approximation Theory Viewpoint. Cambridge University Press, Cambridge, 2007.
[13] S. MUKHERJEE, Dingxuan ZHOU. Learning coordinate covariances via gradients. J. Mach. Learn. Res., 2006, 7: 519–549.
[14] E. J. CANDÈS, J. ROMBERG, T. TAO. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inform. Theory, 2006, 52(2): 489–509.
[15] Hongyan WANG, Quanwu XIAO, Dingxuan ZHOU. An approximation theory approach to learning with l1 regularization. J. Approx. Theory, 2013, 167: 240–258.