Stable Parameterization Schemes for Gaussians


1 Stable Parameterization Schemes for Gaussians
Michael McCourt
Department of Mathematical and Statistical Sciences, University of Colorado Denver
ICOSAHOM 2014, Salt Lake City, June 24, 2014

2 Special Thanks
I would like to offer my thanks to Rodrigo Platte for the invitation to this session, and to all of you in attendance for your attention.

3 Introduction
The shape parameter ε associated with some RBFs plays an important role in the quality of associated approximations.
Example
[Figure: Gaussian support vector machine decision contours for ε = 10 (too erratic), ε = 1 (nice), and ε = 0.1 (insensitive).]
Comparison with the true pattern allows for judgment. How can the quality be judged without the true pattern?

4 Kernel-based Interpolation
Basic strategies for parameterizing kernels are based on:
Error bounds ([9], [10])
Stationary interpolation strategies (e.g., [13] or [7])
Trial and error (almost everyone)
Today, we focus on statistical approaches to optimizing ε, generally based on the interpretation of scattered data through spatial statistics or data assimilation. We restrict our discussion to Gaussian kernels
K(x, z) = e^{-ε² ‖x - z‖²};
most results apply to other positive definite kernels as well.
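
To make the role of ε concrete, here is a minimal NumPy sketch (the node set, test function, and names are ours, purely for illustration) that forms the standard-basis interpolant s(x) = k(x)^T K^{-1} y for a few shape parameters; the growth of cond(K) as ε shrinks is the ill-conditioning the rest of the talk works around.

```python
import numpy as np

def gauss_kernel(X, Z, ep):
    """Gaussian kernel matrix K[i, j] = exp(-ep^2 ||X[i] - Z[j]||^2)."""
    sq = np.sum((X[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    return np.exp(-ep ** 2 * sq)

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(30, 1))               # scattered nodes
y = np.cos(3 * X[:, 0]) + X[:, 0] ** 2             # data to interpolate
xx = np.linspace(-1, 1, 200)[:, None]              # evaluation grid
f_true = np.cos(3 * xx[:, 0]) + xx[:, 0] ** 2

for ep in (10.0, 1.0, 0.1):
    K = gauss_kernel(X, X, ep)
    s = gauss_kernel(xx, X, ep) @ np.linalg.solve(K, y)   # s(x) = k(x)^T K^{-1} y
    print(f"ep = {ep:5.2f}  cond(K) = {np.linalg.cond(K):9.2e}  "
          f"max error = {np.max(np.abs(s - f_true)):.2e}")
```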

5 Statistical Parameterization Schemes
By using spatial statistics to analyze data, we open up avenues:
Kriging variance, a.k.a. the power function
Golomb-Weinberger bound
Maximum likelihood estimation
Maximum a posteriori estimation
k-fold cross-validation (not discussed here)
Generalized cross-validation (also not discussed)
Goal
In this talk, we will look at how these quantities can be computed for small ε values using a stable basis for Gaussians [3].

6 Hilbert-Schmidt Eigenexpansion
The term Hilbert-Schmidt SVD (HS-SVD) was introduced in [2], though the idea originated in [5]. We recall the Hilbert-Schmidt eigenvalue problem
∫ K(x, z) φ_n(z) ρ(z) dz = λ_n φ_n(x),
which allows us to write a positive definite kernel as
K(x, z) = Σ_{‖n‖=d}^∞ λ_n φ_n(x) φ_n(z),
where n is a multiindex. By exploiting this series, the standard basis k(x)^T = ( K(x, x_1) ··· K(x, x_N) ) can be expressed using a stable basis ψ(x)^T.

7 Hilbert-Schmidt Eigenexpansion for Gaussians
Gaussians in d dimensions have the eigenexpansion
λ_n = α^d ε^{2(‖n‖-d)} (ε² + δ² + α²)^{-(2‖n‖-d)/2},   φ_n(x) = γ_n e^{-δ²‖x‖²} H_{n-1}(βαx),
where α is a free parameter (another one...), and β, γ_n, δ² are defined by ε and α; ignore them.
Important Limits
For small ε, the eigenvalues decay quickly and the eigenfunctions approach Hermite polynomials:
λ_n ∼ ε^{2(‖n‖-d)} and φ_n(x) → H_{n-1}(αx) as ε → 0.
Designer Kernels
Work such as [15] and [4] suggests constructing kernels by choosing eigenfunctions to satisfy certain properties.
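
For reference, a sketch of these eigenpairs in code for d = 1. The normalization of β, δ², and γ_n follows our reading of [3] and should be treated as an assumption; the final check only passes if that reading is right.

```python
import numpy as np
from math import factorial
from numpy.polynomial.hermite import hermval

def gauss_eigenpairs(x, M, ep, alpha):
    """First M eigenvalues and eigenfunction columns of the 1D Gaussian kernel."""
    beta = (1.0 + (2.0 * ep / alpha) ** 2) ** 0.25
    delta2 = 0.5 * alpha ** 2 * (beta ** 2 - 1.0)
    denom = alpha ** 2 + delta2 + ep ** 2
    lam = np.sqrt(alpha ** 2 / denom) * (ep ** 2 / denom) ** np.arange(M)
    Phi = np.empty((x.size, M))
    for n in range(M):                                 # column n holds phi_{n+1}
        gamma_n = np.sqrt(beta / (2.0 ** n * factorial(n)))
        Hn = hermval(alpha * beta * x, np.eye(n + 1)[n])   # Hermite polynomial H_n
        Phi[:, n] = gamma_n * np.exp(-delta2 * x ** 2) * Hn
    return Phi, lam

# Sanity check: a truncated Phi diag(lam) Phi^T should reproduce K.
x = np.linspace(-1, 1, 20)
ep, alpha, M = 0.5, 1.0, 40
Phi, lam = gauss_eigenpairs(x, M, ep, alpha)
K = np.exp(-ep ** 2 * (x[:, None] - x[None, :]) ** 2)
print(np.max(np.abs(Phi @ (lam[:, None] * Phi.T) - K)))   # small if the normalization above is right
```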

8 Eigenfunction Decompositions and Matrix Manipulations
The N×N kernel matrix K can be written as the infinite matrix product
K(x, z) = Σ_{‖n‖=d}^∞ λ_n φ_n(x) φ_n(z)   ⟹   K = Φ Λ Φ^T = ( Φ_1  Φ_2 ) ( Λ_1  0 ; 0  Λ_2 ) ( Φ_1^T ; Φ_2^T ),
where Φ_1, Λ_1 are N×N and invertible. The Hilbert-Schmidt eigenvalues appear in nonincreasing order on the diagonal of Λ. We may write
K = ( Φ_1  Φ_2 ) ( Λ_1 Φ_1^T ; Λ_2 Φ_2^T ) = ( Φ_1  Φ_2 ) ( I_N ; Λ_2 Φ_2^T Φ_1^{-T} Λ_1^{-1} ) Λ_1 Φ_1^T = Ψ Λ_1 Φ_1^T.
K = Ψ Λ_1 Φ_1^T is the Hilbert-Schmidt SVD, or just HS-SVD. Ψ is made of the eigenfunctions plus a small correction:
Ψ = ( Φ_1  Φ_2 ) ( I_N ; Λ_2 Φ_2^T Φ_1^{-T} Λ_1^{-1} ) = Φ_1 + Φ_2 Λ_2 Φ_2^T Φ_1^{-T} Λ_1^{-1}.

9 Stable Interpolation Through the Hilbert-Schmidt SVD
Forgoing the details, the standard basis and stable basis are related by
k(x)^T = ψ(x)^T Λ_1 Φ_1^T,   K = Ψ Λ_1 Φ_1^T,
where k(x_i)^T and ψ(x_i)^T are the i-th rows of K and Ψ respectively. Evaluating the interpolant s at the point x can now be done stably:
s(x) = k(x)^T K^{-1} y = ψ(x)^T Λ_1 Φ_1^T Φ_1^{-T} Λ_1^{-1} Ψ^{-1} y = ψ(x)^T Ψ^{-1} y.
This new basis ψ is more stable than k because the Λ_1^{-1} term is eliminated analytically; for small ε, the condition of Λ_1 becomes untenable.
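
Given eigenfunction values, forming Ψ and evaluating s(x) = ψ(x)^T Ψ^{-1} y is only a few lines. This sketch assumes Phi and lam come from the gauss_eigenpairs helper above (with M comfortably larger than N) and that Phi_eval holds the same eigenfunctions at the evaluation points; it illustrates the idea rather than reproducing the RBF-QR implementation of [3].

```python
import numpy as np

def hs_svd_interpolant(Phi, lam, Phi_eval, y):
    """Interpolate through the stable basis: s(x) = psi(x)^T Psi^{-1} y.

    Phi      : (N, M) eigenfunctions at the data sites, M > N
    lam      : (M,)   eigenvalues in nonincreasing order
    Phi_eval : (P, M) eigenfunctions at the evaluation points
    y        : (N,)   data values
    """
    N = Phi.shape[0]
    Phi1, Phi2 = Phi[:, :N], Phi[:, N:]
    lam1, lam2 = lam[:N], lam[N:]
    # Correction Lambda_2 Phi_2^T Phi_1^{-T} Lambda_1^{-1}: only the ratios
    # lam2 / lam1 appear, which is why Lambda_1 is never inverted on its own.
    Corr = (lam2[:, None] / lam1[None, :]) * np.linalg.solve(Phi1, Phi2).T
    Psi = Phi1 + Phi2 @ Corr
    Psi_eval = Phi_eval[:, :N] + Phi_eval[:, N:] @ Corr
    b = np.linalg.solve(Psi, y)                  # stable coefficients, Psi b = y
    return Psi_eval @ b, Psi, b
```

The returned Psi and b are the ingredients reused by the norm, likelihood, and cross-validation sketches later in the talk.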

10 More About the HS-SVD
Notes
This strategy is often referred to as RBF-QR when Φ_2^T Φ_1^{-T} is computed through a QR factorization.
The vector k(x)^T K^{-1} = ψ(x)^T Ψ^{-1} contains the cardinal functions.
We are confident in the existence of an HS-SVD. Uniqueness is another question...
Strategy
We want to study the role that this idea plays in the objective functions associated with various parameterization schemes. Most of our work today involves exploiting the HS-SVD to try to open up the range of viable ε values for our Gaussians.

11 HS-SVD of K is NOT a Symmetric Decomposition
Interesting Point
Despite the SPD nature of K, Ψ Λ_1 Φ_1^T is NOT a symmetric decomposition, i.e., Ψ ≠ Φ_1. While surprising, this does not severely hinder analysis.
To get some practice working with the HS-SVD, let us study Ψ^{-1} Φ_1 and carefully apply inverses:
Ψ^{-1} Φ_1 = (Φ_1 + Φ_2 Λ_2 Φ_2^T Φ_1^{-T} Λ_1^{-1})^{-1} Φ_1
           = (I_N + Φ_1^{-1} Φ_2 Λ_2 Φ_2^T Φ_1^{-T} Λ_1^{-1})^{-1}
           = Λ_1 (Λ_1 + Φ_1^{-1} Φ_2 Λ_2 Φ_2^T Φ_1^{-T})^{-1}
           = Λ_1 (Λ_1 + G^T G)^{-1},   where G = Λ_2^{1/2} Φ_2^T Φ_1^{-T}.
Clearly, Ψ^{-1} Φ_1 is symmetric, since it is a product of symmetric matrices.

12 HS-SVD of K Approaches a Symmetric Decomposition as ε → 0
We know that Ψ^{-1} Φ_1 = Λ_1 (Λ_1 + G^T G)^{-1}, with G = Λ_2^{1/2} Φ_2^T Φ_1^{-T}. We can study the magnitude of these components:
lim_{ε→0} ‖Λ_1‖_2 = 1,
lim_{ε→0} ‖G^T G‖_2 = lim_{ε→0} ‖Λ_2^{1/2} Φ_2^T Φ_1^{-T}‖_2² = O(ε^{2N^{1/d}}).
This suggests that (Λ_1 + G^T G)^{-1} → Λ_1^{-1} and Ψ^{-1} Φ_1 → Λ_1 Λ_1^{-1} = I_N as ε → 0.
Applied to Parameterization Schemes
Enough practice... let's see how this really works.

13 Kernel-based Interpolation as Kriging
Two strategies for studying scattered data {x_i, y_i}_{i=1}^N ≡ {X, y}:
Numerical Analysis
Data was generated by a function from the RKHS induced by K, H_K:
s(x) = k(x)^T K^{-1} y,   |y(x) - s(x)| ≤ ‖y‖_{H_K} √( K(x, x) - k(x)^T K^{-1} k(x) ).
Spatial Statistics
Data was realized from a zero-mean Gaussian random field Y with covariance K:
Y_x | {X, y} ∼ N( k(x)^T K^{-1} y,  K(x, x) - k(x)^T K^{-1} k(x) ).
Note that V(x) = P(x)² = K(x, x) - k(x)^T K^{-1} k(x).

14 Kriging Variance
Minimizing V(x) would, in a sense, yield the most trustworthy predictions. Its computation can be subject to ill-conditioning as ε → 0:
V(x) = K(x, x) - k(x)^T K^{-1} k(x) = K(x, x) - ψ(x)^T Ψ^{-1} k(x).
Recall the cardinal functions: k(x)^T K^{-1} = ψ(x)^T Ψ^{-1}. Better stability, but numerical cancelation is unavoidable.
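
A sketch of this partially stabilized computation (argument names are ours; Psi and Psi_eval are assumed to come from the HS-SVD sketch earlier):

```python
import numpy as np

def kriging_variance(Kdiag, Keval, Psi, Psi_eval):
    """Kriging variance V(x) = K(x, x) - psi(x)^T Psi^{-1} k(x) at P points.

    Kdiag    : (P,)   K(x, x) at the evaluation points (all ones for Gaussians)
    Keval    : (P, N) rows k(x)^T = (K(x, x_1), ..., K(x, x_N))
    Psi      : (N, N) stable-basis matrix from the HS-SVD sketch above
    Psi_eval : (P, N) rows psi(x)^T at the evaluation points
    """
    # cardinal functions: each row is k(x)^T K^{-1} = psi(x)^T Psi^{-1}
    cardinal = np.linalg.solve(Psi.T, Psi_eval.T).T
    # the subtraction still suffers cancelation, so accuracy bottoms out
    # near machine precision even though Lambda_1^{-1} never appears
    return Kdiag - np.sum(cardinal * Keval, axis=1)
```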

15 Golomb-Weinberger Bound
The standard native space error bound can be augmented into the Golomb-Weinberger bound [6]:
|y(x) - s(x)| ≤ ‖y‖_{H_K} √V(x) ≤ (1 + δ) ‖s‖_{H_K} √V(x),
for some small δ value related to the quality of the interpolation. Ignoring the fact that √V(x) is still limited by cancelation, we must also compute ‖s‖_{H_K} successfully to evaluate this bound. Recall that, through the reproducing property,
‖s‖²_{H_K} = ⟨s, s⟩_{H_K} = ⟨y^T K^{-1} k(·), k(·)^T K^{-1} y⟩_{H_K} = y^T K^{-1} ⟨k(·), k(·)^T⟩_{H_K} K^{-1} y = y^T K^{-1} K K^{-1} y = y^T K^{-1} y.
We will study this using the HS-SVD.

16 The Hilbert Space Norm (Mahalanobis Distance) y^T K^{-1} y
Define Ψ b = y, the system for the stable basis interpolation coefficients. Using this and K = Ψ Λ_1 Φ_1^T in y^T K^{-1} y gives
y^T K^{-1} y = b^T Ψ^T Φ_1^{-T} Λ_1^{-1} Ψ^{-1} Ψ b = b^T (Ψ^{-1} Φ_1)^{-T} Λ_1^{-1} b.
Our old friend Ψ^{-1} Φ_1 = (I_N + Φ_1^{-1} Φ_2 Λ_2 Φ_2^T Φ_1^{-T} Λ_1^{-1})^{-1} returns! Canceling the double inverse and applying the transpose gives
y^T K^{-1} y = b^T (I_N + Λ_1^{-1} Φ_1^{-1} Φ_2 Λ_2 Φ_2^T Φ_1^{-T}) Λ_1^{-1} b
            = b^T Λ_1^{-1} b + b^T Λ_1^{-1} Φ_1^{-1} Φ_2 Λ_2 Φ_2^T Φ_1^{-T} Λ_1^{-1} b
            = b^T Λ_1^{-1} b + (Λ_2^{1/2} Φ_2^T Φ_1^{-T} Λ_1^{-1} b)^T (Λ_2^{1/2} Φ_2^T Φ_1^{-T} Λ_1^{-1} b)
            ≥ b^T Λ_1^{-1} b.
Computing this quantity is feasible, although it grows quickly. The bound in the last line may be rather tight for small ε.
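
In code the split into the main term and its correction is direct; this sketch assumes b, Phi1, Phi2, lam1, and lam2 are the quantities from the HS-SVD sketch above, with names chosen here for illustration.

```python
import numpy as np

def native_norm_sq(b, Phi1, Phi2, lam1, lam2):
    """y^T K^{-1} y via the HS-SVD, returned as (full value, lower bound).

    Full value: b^T Lambda_1^{-1} b + ||Lambda_2^{1/2} Phi_2^T Phi_1^{-T} Lambda_1^{-1} b||^2
    Lower bound: b^T Lambda_1^{-1} b
    """
    main = np.sum(b ** 2 / lam1)                               # b^T Lambda_1^{-1} b
    w = np.sqrt(lam2) * (np.linalg.solve(Phi1, Phi2).T @ (b / lam1))
    return main + w @ w, main
```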

17 The Hilbert Space Norm - A Tight Bound
Again considering Gaussians with small ε,
lim_{ε→0} ‖Λ_2^{1/2} Φ_2^T Φ_1^{-T} Λ_1^{-1} b‖_2 ≤ lim_{ε→0} ‖Λ_2^{1/2} Φ_2^T Φ_1^{-T}‖_2 ‖Λ_1^{-1}‖_2 ‖b‖_2 ∼ ε^{N^{1/d}} ε^{-2(N^{1/d}-1)} ‖b‖_2 = ε^{2-N^{1/d}} ‖b‖_2.
This lets us say that
‖Λ_2^{1/2} Φ_2^T Φ_1^{-T} Λ_1^{-1} b‖_2² ∼ ε^{4-2N^{1/d}} ‖b‖_2²,
as compared to
b^T Λ_1^{-1} b = ‖Λ_1^{-1/2} b‖_2² ∼ ‖Λ_1^{-1/2}‖_2² ‖b‖_2² = ε^{2-2N^{1/d}} ‖b‖_2².
Thus we may compute, or at least bound, ‖s‖_{H_K} = √(y^T K^{-1} y).

18 Maximum Likelihood Estimation
Even though cancelation in the kriging variance prevents us from fully leveraging y^T K^{-1} y in the Golomb-Weinberger bound, it can be used in the maximum likelihood estimator [11]. Saving some time, we write that the MLE ε is the ε which minimizes
C_MLE(ε) = N log(y^T K^{-1} y) + log det K.
Using the HS-SVD, we know that
log det K = log det(Ψ Λ_1 Φ_1^T) = log det Ψ + log det Λ_1 + log det Φ_1.
This gives us a tool for parameterizing Gaussians with small ε.
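
A sketch of the resulting objective, again assuming the HS-SVD factors and norm_sq = y^T K^{-1} y come from the earlier sketches; one would recompute those quantities for each candidate ε and minimize this value, e.g., over a logarithmic grid.

```python
import numpy as np

def mle_criterion(Psi, Phi1, lam1, norm_sq, N):
    """C_MLE(ep) = N log(y^T K^{-1} y) + log det K, with
    log det K = log det Psi + log det Lambda_1 + log det Phi_1."""
    _, logdet_Psi = np.linalg.slogdet(Psi)       # log |det Psi|
    _, logdet_Phi1 = np.linalg.slogdet(Phi1)     # log |det Phi_1|
    # det K > 0, so working with the absolute values of the factors suffices
    logdet_K = logdet_Psi + np.sum(np.log(lam1)) + logdet_Phi1
    return N * np.log(norm_sq) + logdet_K
```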

19 MLE Example
Example: N = 169 Halton points fit to log(3 + x + y) + 2(1 + x + y) cos(2xy) on [-1, 1]².
[Figure: MLE-related quantities versus ε for this example, showing the primary bound b^T Λ_1^{-1} b and the correction ‖Λ_2^{1/2} Φ_2^T Φ_1^{-T} Λ_1^{-1} b‖_2².]

20 Maximum A Posteriori Estimation
When prior beliefs regarding an optimal ε value are available, they can supplement the likelihood function. This idea is discussed in general in, e.g., [8], and in the context of Gaussian processes in [12] and [14].

21 Conclusion
Results
Repeating the strategy for interpolation, the HS-SVD may provide opportunities to study Gaussian parameterization schemes for small ε.
The Hilbert space norm, MLE, and cross-validation (not discussed) are viable for small ε.
Other PD kernels will work as well if their eigenexpansion is available or can be approximated.
To Be Accomplished
Kriging variance is better but still troubled.
Reduce computational cost.
Consider incorporation with matrix-free approaches, e.g., [1].
Tests on applications.

22 Appendix: k-fold Cross-Validation
Cross-Validation Strategy
Using subsets of your data X_I, predict values at omitted points X_O and use the residual to measure the quality of that parameterization.
If we block up our kernel system K c = y as
K = ( K_II  K_IO ; K_OI  K_OO ),   y = ( y_I ; y_O ),   c = ( c_I ; c_O ),
the residual at X_O is y_O - K_OI K_II^{-1} y_I. Summing all the residuals over k disjoint omitted point sets O = {X_O^{(1)}, ..., X_O^{(k)}} would produce
C_CV(ε; O) = Σ_{X_O ∈ O} ‖y_O - K_OI K_II^{-1} y_I‖.
This is what we want to minimize, but K_II^{-1} may be dangerous.

23 Appendix: k-fold Cross-Validation with HS-SVD
Rewriting the blocks of K with the HS-SVD gives
K = ( K_II  K_IO ; K_OI  K_OO ) = ( Ψ_II Λ_I Φ_I^T   Ψ_IO Λ_O Φ_O^T ; Ψ_OI Λ_I Φ_I^T   Ψ_OO Λ_O Φ_O^T ).
Notably, this says K_II = Ψ_II Λ_I Φ_I^T and K_OI = Ψ_OI Λ_I Φ_I^T. Substituting into the residual y_O - K_OI K_II^{-1} y_I,
K_OI K_II^{-1} = Ψ_OI Λ_I Φ_I^T Φ_I^{-T} Λ_I^{-1} Ψ_II^{-1} = Ψ_OI Ψ_II^{-1},
gives the stable residual (unsurprisingly) as y_O - Ψ_OI Ψ_II^{-1} y_I.

24 Appendix: Cross-Validation, Rippa Trick
Define the blocked kernel system K c = y, so that c = A y with A = K^{-1}:
K = ( K_II  K_IO ; K_OI  K_OO ),   A = ( A_II  A_IO ; A_OI  A_OO ),   y = ( y_I ; y_O ),   c = ( c_I ; c_O ).
This says that c_O = A_OI y_I + A_OO y_O, and A K = I gives us A_OI = -A_OO K_IO^T K_II^{-1}, so
c_O = -A_OO K_OI K_II^{-1} y_I + A_OO y_O,   A_OO^{-1} c_O = -K_OI K_II^{-1} y_I + y_O.
Thus, the residual y_O - K_OI K_II^{-1} y_I = A_OO^{-1} c_O.

25 Appendix: Cross-Validation, Rippa Trick for HS-SVD
Rewriting the blocks of K with the HS-SVD gives
K = ( K_II  K_IO ; K_OI  K_OO ) = ( Ψ_II  Ψ_IO ; Ψ_OI  Ψ_OO ) ( Λ_I Φ_I^T  0 ; 0  Λ_O Φ_O^T ),
so, by absorbing the ΛΦ^T matrix into c, K c = y becomes
( Ψ_II  Ψ_IO ; Ψ_OI  Ψ_OO ) ( b_I ; b_O ) = ( y_I ; y_O ),   ( b_I ; b_O ) = ( B_II  B_IO ; B_OI  B_OO ) ( y_I ; y_O ),
where B Ψ = I. The absorption gives us c_O = Φ_O^{-T} Λ_O^{-1} b_O and, if you stare long enough, you can show A_OO = Φ_O^{-T} Λ_O^{-1} B_OO, so
A_OO^{-1} c_O = B_OO^{-1} b_O.
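
As a sketch (the fold handling and names are ours), this identity lets every fold's residual be read off from a single factorization of Ψ, or of K itself when conditioning allows. With singleton folds it reduces to Rippa's leave-one-out rule, residual_i = c_i / A_ii.

```python
import numpy as np

def cv_residuals(Psi, y, folds):
    """k-fold cross-validation residuals via the blockwise identity.

    For each omitted index set O, the residual y_O - K_OI K_II^{-1} y_I
    equals B_OO^{-1} b_O, where B = Psi^{-1} and Psi b = y.
    """
    B = np.linalg.inv(Psi)
    b = B @ y
    # e.g. folds = np.array_split(permuted_indices, k) for some index permutation
    return [np.linalg.solve(B[np.ix_(O, O)], b[O]) for O in folds]

# A CV objective is then, e.g., the sum of squared residual norms over folds:
# C_CV = sum(np.sum(r ** 2) for r in cv_residuals(Psi, y, folds))
```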

26 Appendix: Kriging Variance - A Deeper Exploration
The HS-SVD helps avoid instability, but it cannot circumvent numerical cancelation, hence the accuracy limit near machine precision. The earlier computation showed the basic idea behind using the HS-SVD; here is another approach, aimed at computing V(x) more efficiently. Using k(x)^T = ψ(x)^T Λ_1 Φ_1^T and K = Ψ Λ_1 Φ_1^T in k(x)^T K^{-1} k(x) gives
k(x)^T K^{-1} k(x) = ψ(x)^T Λ_1 Φ_1^T ( Ψ Λ_1 Φ_1^T )^{-1} Φ_1 Λ_1 ψ(x) = ψ(x)^T Ψ^{-1} Φ_1 Λ_1 ψ(x).
Aha! We recognize Ψ^{-1} Φ_1 from our earlier practice (practice makes perfect).

27 Appendix: Kriging Variance - Fun With Matrices
We use the result Ψ^{-1} Φ_1 = Λ_1 (Λ_1 + Φ_1^{-1} Φ_2 Λ_2 Φ_2^T Φ_1^{-T})^{-1} in our current expression:
k(x)^T K^{-1} k(x) = ψ(x)^T Ψ^{-1} Φ_1 Λ_1 ψ(x) = ψ(x)^T Λ_1 (Λ_1 + Φ_1^{-1} Φ_2 Λ_2 Φ_2^T Φ_1^{-T})^{-1} Λ_1 ψ(x).
We can pull Λ_1^{1/2} from both sides of the inverse to write this as
k(x)^T K^{-1} k(x) = ψ(x)^T Λ_1^{1/2} (I_N + Λ_1^{-1/2} Φ_1^{-1} Φ_2 Λ_2 Φ_2^T Φ_1^{-T} Λ_1^{-1/2})^{-1} Λ_1^{1/2} ψ(x)
                   = ψ(x)^T Λ_1^{1/2} (I_N + L L^T)^{-1} Λ_1^{1/2} ψ(x),
where L = Λ_1^{-1/2} Φ_1^{-1} Φ_2 Λ_2^{1/2}. As ε → 0, for Gaussians,
‖L L^T‖_2 ≤ ‖L‖_2² ≤ ‖Λ_2^{1/2} Φ_2^T Φ_1^{-T}‖_2² ‖Λ_1^{-1/2}‖_2² ∼ ε^{2N^{1/d}} ε^{-2(N^{1/d}-1)} = ε².
So for small ε, a von Neumann series allows us to write
k(x)^T K^{-1} k(x) ≈ ψ(x)^T Λ_1 ψ(x) - ψ(x)^T Λ_1^{1/2} L L^T Λ_1^{1/2} ψ(x) + ...

28 Appendix: Kriging Variance - Reduced Computation
So, for small ε, we can write
k(x)^T K^{-1} k(x) ≈ ψ(x)^T Λ_1 ψ(x) - ψ(x)^T Λ_1^{1/2} L L^T Λ_1^{1/2} ψ(x) + ...
This allows us to evaluate the kriging variance for small ε as
V(x) = K(x, x) - k(x)^T K^{-1} k(x) ≈ K(x, x) - ψ(x)^T Λ_1 ψ(x) + ψ(x)^T Λ_1^{1/2} L L^T Λ_1^{1/2} ψ(x) - ...
After forming the stable basis ψ, no further matrix inverses (such as ψ(x)^T Ψ^{-1}) need be computed. Of course, this will not bypass the numerical cancelation, but it can reduce the cost.
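
A sketch of the reduced-cost evaluation; lam1 and L = Λ_1^{-1/2} Φ_1^{-1} Φ_2 Λ_2^{1/2} are assumed to be precomputed from the HS-SVD factors, and only the first two terms of the series are kept.

```python
import numpy as np

def kriging_variance_series(Kdiag, Psi_eval, lam1, L):
    """Small-ep approximation of V(x) using the truncated von Neumann series.

    Kdiag    : (P,)      K(x, x) at the evaluation points
    Psi_eval : (P, N)    rows psi(x)^T
    lam1     : (N,)      diagonal of Lambda_1
    L        : (N, M-N)  Lambda_1^{-1/2} Phi_1^{-1} Phi_2 Lambda_2^{1/2}
    """
    t = Psi_eval * np.sqrt(lam1)[None, :]       # rows psi(x)^T Lambda_1^{1/2}
    quad = np.sum(t ** 2, axis=1)               # psi^T Lambda_1 psi
    corr = np.sum((t @ L) ** 2, axis=1)         # psi^T Lam1^{1/2} L L^T Lam1^{1/2} psi
    return Kdiag - quad + corr                  # no linear solves required
```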

References
[1] M. Anitescu, J. Chen, and L. Wang. A matrix-free approach for solving the parametric Gaussian process maximum likelihood problem. SIAM Journal on Scientific Computing, 34(1):A240-A262, 2012.
[2] R. Cavoretto, G. E. Fasshauer, and M. McCourt. An introduction to the Hilbert-Schmidt SVD using iterated Brownian bridge kernels. Numerical Algorithms, pages 1-30.
[3] G. E. Fasshauer and M. J. McCourt. Stable evaluation of Gaussian radial basis function interpolants. SIAM J. Sci. Comput., 34(2):A737-A762, 2012.
[4] G. E. Fasshauer and Q. Ye. Reproducing kernels of generalized Sobolev spaces via a Green function approach with distributional operators. Numerische Mathematik, 119, 2011.
[5] B. Fornberg and C. Piret. A stable algorithm for flat radial basis functions on a sphere. SIAM J. Sci. Comput., 30(1):60-80, 2007.
[6] M. Golomb and H. F. Weinberger. Optimal approximation and error bounds. In R. E. Langer, editor, On Numerical Approximation. University of Wisconsin Press, 1959.
[7] A. Iske. Multiresolution Methods in Scattered Data Modelling, volume 37 of Lecture Notes in Computational Science and Engineering. Springer, Berlin, 2004.
[8] J. S. Liu. Monte Carlo Strategies in Scientific Computing. Springer, 2001.
[9] L.-T. Luh. The shape parameter in the Gaussian function. Computers & Mathematics with Applications, 63(3), 2012.
[10] W. R. Madych. Miscellaneous error bounds for multiquadric and related interpolators. Computers & Mathematics with Applications, 24(12), 1992.
[11] K. V. Mardia and R. J. Marshall. Maximum likelihood estimation of models for residual covariance in spatial regression. Biometrika, 71(1), 1984.
[12] B. Nagy, J. L. Loeppky, and W. J. Welch. Fast Bayesian inference for Gaussian process models. Technical Report 230, Dept. Statistics, Univ. British Columbia.
[13] R. Schaback and H. Wendland. Kernel techniques: From machine learning to meshless methods. Acta Numerica, 15, 2006.
[14] C. K. I. Williams and D. Barber. Bayesian classification with Gaussian processes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(12), 1998.
[15] B. Zwicknagl and R. Schaback. Interpolation and approximation in Taylor spaces. Journal of Approximation Theory, 171:65-83, 2013.
