THE GAUSSIAN RADON TRANSFORM AND MACHINE LEARNING


THE GAUSSIAN RADON TRANSFORM AND MACHINE LEARNING

IRINA HOLMES AND AMBAR N. SENGUPTA

Abstract. There has been growing recent interest in probabilistic interpretations of kernel-based methods as well as learning in Banach spaces. The absence of a useful Lebesgue measure on an infinite-dimensional reproducing kernel Hilbert space is a serious obstacle for such stochastic models. We propose an estimation model for the ridge regression problem within the framework of abstract Wiener spaces, and show how the support vector machine solution to such problems can be interpreted in terms of the Gaussian Radon transform.

Date: October. Mathematics Subject Classification: Primary 44A12; Secondary 28C20, 62J07. Key words and phrases: Gaussian Radon transform, ridge regression, kernel methods, abstract Wiener space, Gaussian process. Irina Holmes is a Dissertation Year Fellow at Louisiana State University, Department of Mathematics (irina.c.holmes@gmail.com). Ambar Niel Sengupta is Professor of Mathematics at Louisiana State University (ambarnsg@gmail.com).

1. Introduction

A central task in machine learning is the prediction of unknown values based on learning from a given set of outcomes. More precisely, suppose $\mathcal{X}$ is a non-empty set, the input space, and
$$D = \{(p_1, y_1), (p_2, y_2), \ldots, (p_n, y_n)\} \subset \mathcal{X} \times \mathbb{R} \tag{1.1}$$
is a finite collection of input values $p_j$ together with their corresponding real outputs $y_j$. The goal is to predict the value $y$ corresponding to a yet unobserved input value $p \in \mathcal{X}$. The fundamental assumption of kernel-based methods such as support vector machines (SVM) is that the decision or prediction function $\hat{f}_\lambda$ belongs to a reproducing kernel Hilbert space (RKHS) $H$ over $\mathcal{X}$, corresponding to a positive definite function $K : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ (see Section 2.1).

In particular, in ridge regression, the predictor function $\hat{f}_\lambda$ is the solution to the minimization problem
$$\hat{f}_\lambda = \arg\min_{f \in H} \left( \sum_{j=1}^n (y_j - f(p_j))^2 + \lambda \|f\|^2 \right), \tag{1.2}$$
where $\lambda > 0$ is a regularization parameter and $\|f\|$ denotes the norm of $f$ in $H$. The regularization term $\lambda\|f\|^2$ penalizes functions that overfit the training data. The problem (1.2) has a unique solution, and this solution is a linear combination of the functions $K_{p_j} = K(p_j, \cdot)$. Specifically,
$$\hat{f}_\lambda = \sum_{j=1}^n c_j K_{p_j}, \tag{1.3}$$
where $c \in \mathbb{R}^n$ is given by
$$c = (K_D + \lambda I_n)^{-1} y, \tag{1.4}$$
where $K_D \in \mathbb{R}^{n \times n}$ is the matrix with entries $[K_D]_{ij} = K(p_i, p_j)$, $I_n$ is the identity matrix of size $n$, and $y = (y_1, \ldots, y_n) \in \mathbb{R}^n$ is the vector of outputs. We present a geometric proof of this classical result in Theorem 5.1.

Recently there has been a surge of interest in probabilistic interpretations of kernel-based learning methods; see for instance [1, 12, 17, 18, 20, 25]. Let us first consider a Bayesian perspective: suppose the data arise as $y_j = f(p_j) + \epsilon_j$, where $f$ is random and $\epsilon_1, \ldots, \epsilon_n$ are independent Gaussian with mean 0 and unknown variance $\sigma^2 > 0$, independent of $f$. This gives us the likelihood
$$P(y \mid p, f) = \prod_{j=1}^n \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(y_j - f(p_j))^2}.$$
If the RKHS $H$ under consideration is finite-dimensional, we have standard Gaussian measure on $H$, with density function
$$P(f) = \frac{1}{(2\pi)^{\dim(H)/2}}\, e^{-\frac{1}{2}\|f\|^2}$$
with respect to Lebesgue measure on $H$. This gives rise to the posterior distribution
$$P(f \mid p, y) \propto e^{-\frac{1}{2\sigma^2}\sum_{j=1}^n (y_j - f(p_j))^2 - \frac{1}{2}\|f\|^2}. \tag{1.5}$$
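The closed form (1.3)-(1.4) is straightforward to compute. The following is a minimal sketch (not from the paper) using the Gaussian kernel $e^{-\|p-q\|^2/2s}$ mentioned in Section 2; the data, scale and regularization values are illustrative.

```python
import numpy as np

def gaussian_kernel(p, q, s=1.0):
    # K(p, q) = exp(-|p - q|^2 / (2 s)), the scale-parameter kernel of Section 2
    return np.exp(-np.sum((p - q) ** 2) / (2.0 * s))

def ridge_regression_fit(points, y, lam, kernel=gaussian_kernel):
    """Return the coefficient vector c = (K_D + lam I_n)^{-1} y of (1.4)."""
    n = len(points)
    K_D = np.array([[kernel(points[i], points[j]) for j in range(n)]
                    for i in range(n)])
    return np.linalg.solve(K_D + lam * np.eye(n), y)

def ridge_regression_predict(p, points, c, kernel=gaussian_kernel):
    """Evaluate f_hat_lambda(p) = sum_j c_j K_{p_j}(p) as in (1.3)."""
    return sum(c[j] * kernel(points[j], p) for j in range(len(points)))

# Tiny example: five training pairs, then predict at a new input.
rng = np.random.default_rng(0)
points = rng.uniform(0, 1, size=(5, 1))
y = np.sin(2 * np.pi * points[:, 0]) + 0.1 * rng.standard_normal(5)
c = ridge_regression_fit(points, y, lam=0.1)
print(ridge_regression_predict(np.array([0.5]), points, c))
```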

The maximum a posteriori (MAP) estimator then yields the same function as in (1.3), with regularization parameter $\lambda = \sigma^2$.

Equivalently, one may assume that the data arise as a sample path of a Gaussian process, perturbed by Gaussian measurement error:
$$y(p) = f(p) + \epsilon(p),$$
where $\{f(p) : p \in \mathcal{X}\}$ is a Gaussian process indexed by $\mathcal{X}$ with mean 0 and covariance $K$:
$$\mathrm{Cov}(f(p), f(q)) = K(p, q),$$
and $\{\epsilon(p) : p \in \mathcal{X}\}$ is a Gaussian process independent of $f$, with mean 0 and covariance $\sigma^2 I$:
$$\mathrm{Cov}(\epsilon(p), \epsilon(q)) = \sigma^2 \delta_{p,q}.$$
Then $\{y(p) : p \in \mathcal{X}\}$ is also a Gaussian process, with mean 0 and covariance $K + \sigma^2 I$. The natural choice for the prediction $\hat{y}$ at a future input value $p$ is the conditional expectation
$$\hat{y} = \mathbb{E}\left[f(p) \mid y(p_1) = y_1, \ldots, y(p_n) = y_n\right]. \tag{1.6}$$
Then, as we show in Lemma 4.1,
$$\hat{y} = [K(p, p_1) \ \cdots \ K(p, p_n)]\,(K_D + \sigma^2 I_n)^{-1} y = \sum_{j=1}^n c_j K_{p_j}(p) = \hat{f}_\lambda(p),$$
where $\hat{f}_\lambda$ is again the SVM estimator in (1.3), with regularization parameter $\lambda = \sigma^2$.

Clearly the Bayesian approach does not work when $H$ is infinite-dimensional, which is often the case, because of the absence of a useful Lebesgue measure in infinite dimensions. The SVM approach, though, still goes through in infinite dimensions. Our goal in this article is to show that so does the Gaussian process approach and, moreover, that the two remain equivalent.

Since the SVM solution in (1.3) is contained in a finite-dimensional subspace, the possibly infinite dimensionality of $H$ is less of a problem in this setting. However, if we want to predict based on a stochastic model for $f$ and $H$ is infinite-dimensional, the absence of Lebesgue measure becomes a significant problem. In particular, we cannot have the desired Gaussian process $\{f(p) : p \in \mathcal{X}\}$ on $H$ itself. The approach we propose is to work within the framework of abstract Wiener spaces, introduced by L. Gross in the celebrated work [8]. The concept of an abstract Wiener space was born exactly from this need for a standard Gaussian measure in infinite dimensions, and has become the standard framework in infinite-dimensional analysis.
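The equality of the two predictions can be checked numerically in finite dimensions. The sketch below (illustrative kernel, data and noise level; not from the paper) computes the conditional expectation (1.6) and the SVM estimator (1.3) with $\lambda = \sigma^2$, and confirms that they coincide.

```python
import numpy as np

rng = np.random.default_rng(1)
P = rng.uniform(0, 1, size=(6, 1))      # training inputs p_1, ..., p_n
y = rng.standard_normal(6)              # training outputs
p = np.array([[0.3]])                   # future input
sigma2 = 0.25                           # noise variance, playing the role of lambda

def K(a, b, s=0.5):
    # Gaussian kernel, evaluated on all pairs of rows of a and b
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2 * s))

K_D = K(P, P)
k_p = K(p, P)                           # row [K(p, p_1) ... K(p, p_n)]

gp_mean = (k_p @ np.linalg.solve(K_D + sigma2 * np.eye(6), y))[0]  # (1.6)
c = np.linalg.solve(K_D + sigma2 * np.eye(6), y)                   # (1.4)
svm_pred = (k_p @ c)[0]                                            # (1.3)
print(gp_mean, svm_pred)                # agree to machine precision
```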

We outline the basics of this theory in Section 2.2 and showcase the essentials of the classical Wiener space, the original special case of an abstract Wiener space, in Section 2.4. Simply put, we construct a Gaussian measure with the desired properties on a larger Banach space $B$ that contains $H$ as a dense subspace; the geometry of this measure is dictated by the inner product on $H$. This construction is presented in Section 2.3.

We then show in Section 4 how the ridge regression learning problem outlined above can be understood in terms of the Gaussian Radon transform. The Gaussian Radon transform associates to a function $f$ on $B$ the function $Gf$, defined on the set of closed affine subspaces of $H$, whose value $Gf(L)$ on an affine subspace $L \subset H$ is the integral $\int f \, d\mu_L$, where $\mu_L$ is a Gaussian measure on the closure $\bar{L}$ of $L$ in $B$, obtained from the given Gaussian measure on $B$. This is explained in more detail in Section 3. For more on the Gaussian Radon transform we refer to the works [2, 3, 10, 11, 16].

In Section 5 we present a geometric view. First we give a geometric proof of the representer theorem for the SVM minimization problem, and then describe the relationship with the Gaussian Radon transform in geometric terms.

Finally, another area that has recently seen strong activity is learning in Banach spaces; see for instance [6, 21, 24]. Of particular interest are reproducing kernel Banach spaces, introduced in [24], which are special Banach spaces whose elements are functions. The Banach space $B$ we use in Section 4 is a completion of a reproducing kernel Hilbert space, but does not directly consist of functions. A notable exception is the classical Wiener space, where the Banach space is $C[0,1]$. In Section 6 we address this issue and propose a realization of $B$ as a space of functions.

2. Realizing covariance structures with Banach spaces

In this section we construct a Gaussian measure on a Banach space, along with random variables defined on this space, whose covariance structure is specified in advance. In more detail, suppose $\mathcal{X}$ is a non-empty set and
$$K : \mathcal{X} \times \mathcal{X} \to \mathbb{R} \tag{2.1}$$
is a function that is symmetric and positive definite (in the sense that the matrix $[K(p,q)]_{p,q \in \mathcal{X}}$ is symmetric and positive definite); we will construct a measure $\mu$ on a certain Banach space $B$ along with a family of Gaussian random variables $\tilde{K}_p$, one for each $p \in \mathcal{X}$, such that
$$K(p,q) = \mathrm{Cov}(\tilde{K}_p, \tilde{K}_q) \tag{2.2}$$
for all $p, q \in \mathcal{X}$.

A well-known choice of $K(p,q)$, for $p, q \in \mathbb{R}^d$, is given by $e^{-\frac{\|p-q\|^2}{2s}}$, where $s > 0$ is a scale parameter.

The strategy is as follows: first we construct a Hilbert space $H$ along with elements $K_p \in H$, for each $p \in \mathcal{X}$, for which $\langle K_p, K_q \rangle = K(p,q)$ for all $p, q \in \mathcal{X}$. Next we describe how to obtain a Banach space $B$, equipped with a Gaussian measure, along with random variables $\tilde{K}_p$ that have the required covariance structure (2.2). The first step is a standard result for reproducing kernel Hilbert spaces.

2.1. Constructing a Hilbert space from a covariance structure. For this we simply quote the well-known Moore-Aronszajn theorem (see, for example, Chapter 4 of Steinwart [22]).

Theorem 2.1. Let $\mathcal{X}$ be a non-empty set and $K : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ a function for which the matrix $[K(p,q)]_{p,q\in\mathcal{X}}$ is symmetric and positive definite:
(i) $K(p,q) = K(q,p)$ for all $p, q \in \mathcal{X}$, and
(ii) $\sum_{j,k=1}^N c_j c_k K(p_j, p_k) \geq 0$ holds for all integers $N \geq 1$, all points $p_1, \ldots, p_N \in \mathcal{X}$, and all $c_1, \ldots, c_N \in \mathbb{R}$.
Then there is a unique Hilbert space $H$ consisting of real-valued functions defined on the set $\mathcal{X}$, containing the functions $K_p = K(p, \cdot)$ for all $p \in \mathcal{X}$, with the inner product on $H$ being such that
$$f(p) = \langle K_p, f \rangle \quad \text{for all } f \in H \text{ and } p \in \mathcal{X}. \tag{2.3}$$
Moreover, the linear span of $\{K_p : p \in \mathcal{X}\}$ is dense in $H$.

The Hilbert space $H$ is called the reproducing kernel Hilbert space (RKHS) over $\mathcal{X}$ with reproducing kernel $K$. For the mapping $\Phi : \mathcal{X} \to H : p \mapsto \Phi(p) = K_p$, known as the canonical feature map, we have
$$\|\Phi(p) - \Phi(q)\|^2 = K(p,p) - 2K(p,q) + K(q,q) \tag{2.4}$$
for all $p, q \in \mathcal{X}$. Hence if $\mathcal{X}$ is a topological space and $K$ is continuous, then so is the function $(p,q) \mapsto \|\Phi(p) - \Phi(q)\|^2$ given in (2.4); since its value is 0 when $p = q$, it follows that $p \mapsto \Phi(p)$ is continuous. In particular, if $\mathcal{X}$ has a countable dense subset then so does the image $\Phi(\mathcal{X})$, and since this spans a dense subspace of $H$ it follows that $H$ is separable.
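The identity (2.4) lets one compute feature-space distances purely from kernel evaluations. A small sketch, assuming the Gaussian kernel as the example; it checks that $\|\Phi(p) - \Phi(q)\|^2 \to 0$ as $q \to p$, which is the continuity statement above.

```python
import numpy as np

def K(p, q, s=1.0):
    # Gaussian kernel on R^d, the example given after (2.2)
    return np.exp(-np.sum((p - q) ** 2) / (2 * s))

def feature_distance_sq(p, q):
    # ||Phi(p) - Phi(q)||^2 computed purely from kernel values, as in (2.4)
    return K(p, p) - 2 * K(p, q) + K(q, q)

p = np.array([0.0, 0.0])
for t in [1.0, 0.1, 0.01]:
    q = p + t
    print(t, feature_distance_sq(p, q))  # tends to 0 as q -> p, so Phi is continuous
```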

2.2. Gaussian measures on Banach spaces. The theory of abstract Wiener spaces developed by Gross [8] provides the machinery for constructing a Gaussian measure on a Banach space $B$ obtained by completing a given Hilbert space $H$ with respect to a special type of norm called a measurable norm. Conversely, according to a fundamental result of Gross, any centered, nondegenerate Gaussian measure on a real separable Banach space arises in this way from completing an underlying Hilbert space called the Cameron-Martin space. We work here with real Hilbert spaces, as a complex structure plays no role in the Gaussian measure in this context.

[Figure 1. A measurable norm on H]

Definition 2.1. A norm $|\cdot|$ on a real separable Hilbert space $H$ is said to be a measurable norm provided that for every $\epsilon > 0$ there is a finite-dimensional subspace $F_\epsilon$ of $H$ such that
$$\gamma_F\{h \in F : |h| > \epsilon\} < \epsilon \tag{2.5}$$
for every finite-dimensional subspace $F$ of $H$ with $F \perp F_\epsilon$, where $\gamma_F$ denotes standard Gaussian measure on $F$.

We denote the norm on $H$ arising from the inner product $\langle \cdot, \cdot \rangle$ by $\|\cdot\|$, which is not to be confused with a measurable norm $|\cdot|$. Here are three facts about measurable norms (for proofs see Gross [8], Kuo [15] or Eldredge [7]):
(1) A measurable norm is always weaker than the original norm: there is $c > 0$ such that
$$|h| \leq c\|h\| \tag{2.6}$$
for all $h \in H$.
(2) If $H$ is infinite-dimensional, the original Hilbert norm $\|\cdot\|$ is not a measurable norm.
(3) If $A$ is an injective Hilbert-Schmidt operator on $H$ then
$$|h| = \|Ah\|, \quad \text{for all } h \in H, \tag{2.7}$$
specifies a measurable norm on $H$.
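Fact (3) can be illustrated numerically. In the sketch below (an assumed toy setup, not from the paper), $A e_n = \frac{1}{n} e_n$ as in Section 6, and we estimate $\gamma_F\{h \in F : |h| > \epsilon\}$ by Monte Carlo for subspaces $F$ spanned by coordinates beyond an index $N$; the measure shrinks as $N$ grows, which is what Definition 2.1 requires of subspaces orthogonal to $F_\epsilon = \mathrm{span}\{e_1, \ldots, e_N\}$.

```python
import numpy as np

# Monte Carlo illustration of Definition 2.1 for |h| = ||Ah|| with A e_n = e_n / n.
# On F = span{e_{N+1}, ..., e_{N+m}}, orthogonal to F_eps = span{e_1, ..., e_N},
# a standard Gaussian vector has small |.|-norm with high probability for large N.
rng = np.random.default_rng(2)
eps, m, trials = 0.1, 50, 20000          # dim(F) = m, coordinates N+1 .. N+m

for N in [0, 10, 100, 1000]:
    idx = np.arange(N + 1, N + m + 1)    # coordinate indices spanning F
    h = rng.standard_normal((trials, m)) # samples from gamma_F, the standard Gaussian on F
    norms = np.sqrt(np.sum((h / idx) ** 2, axis=1))   # |h| = sqrt(sum (h_n / n)^2)
    print(N, np.mean(norms > eps))       # estimate of gamma_F{ |h| > eps }, shrinking in N
```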

Henceforth we denote by $B$ the Banach space obtained by completion of $H$ with respect to a measurable norm $|\cdot|$. Any element $f \in H$ gives a linear functional $H \to \mathbb{R} : g \mapsto \langle f, g \rangle$ that is continuous with respect to the norm $\|\cdot\|$. Let $H_B$ be the subspace of $H$ consisting of all $f$ for which $g \mapsto \langle f, g \rangle$ is continuous with respect to the norm $|\cdot|$. By fact (1) above, any linear functional on $H$ that is continuous with respect to $|\cdot|$ is also continuous with respect to $\|\cdot\|$, and hence is of the form $\langle f, \cdot \rangle$ for a unique $f \in H$, by the traditional Riesz theorem. Thus $H_B$ consists precisely of those $f \in H$ for which the linear functional $\langle f, \cdot \rangle$ is the restriction to $H$ of a (unique) continuous linear functional on the Banach space $B$. We denote this extension of $\langle f, \cdot \rangle$ to an element of $B^*$ by $I(f)$:
$$I(f) = \text{the continuous linear functional on } B \text{ agreeing with } \langle f, \cdot \rangle \text{ on } H \tag{2.8}$$
for all $f \in H_B$. Moreover, $B^*$ consists exactly of the elements $I(f)$ for $f \in H_B$: for any $\phi \in B^*$ the restriction $\phi|_H$ is in $H^*$ (because of the relation (2.6)), and hence $\phi = I(f)$ for a unique $f \in H_B$.

The fundamental result of Gross [8] is that there is a unique Borel measure $\mu$ on $B$ such that for every $f \in H_B$ the linear functional $I(f)$, viewed as a random variable on $(B, \mu)$, is Gaussian with mean 0 and variance $\|f\|^2$; thus
$$\int_B e^{i I(f)} \, d\mu = e^{-\frac{1}{2}\|f\|_H^2} \tag{2.9}$$
for all $f \in H_B \subset H$. The mapping $I : H_B \to L^2(B, \mu)$ is a linear isometry. A triple $(H, B, \mu)$, where $H$ is a real separable Hilbert space, $B$ is the Banach space obtained by completing $H$ with respect to a measurable norm, and $\mu$ is the Gaussian measure in (2.9), is called an abstract Wiener space. The measure $\mu$ is known as Wiener measure on $B$.

Suppose $g \in H$ is orthogonal to $H_B$; then for any $\phi \in B^*$ we have $\phi(g) = \langle f, g \rangle = 0$, where $f$ is the element of $H_B$ for which $I(f) = \phi$. Since $\phi(g) = 0$ for all $\phi \in B^*$, it follows that $g = 0$. Thus,
$$H_B \text{ is dense in } H. \tag{2.10}$$
Consequently $I$ extends uniquely to a linear isometry of $H$ into $L^2(B, \mu)$. We denote this extension again by $I$:
$$I : H \to L^2(B, \mu). \tag{2.11}$$
It follows by taking $L^2$-limits that the random variable $I(f)$ is again Gaussian, satisfying (2.9), for every $f \in H$.

It will be important for our purposes to emphasize again that if $f \in H$ is such that the linear functional $\langle f, \cdot \rangle$ on $H$ is continuous with respect to the norm $|\cdot|$, then the random variable $I(f)$ is the unique continuous linear functional on $B$ that agrees with $\langle f, \cdot \rangle$ on $H$. If $\langle f, \cdot \rangle$ is not continuous on $H$ with respect to the norm $|\cdot|$, then $I(f)$ is obtained by $L^2(B, \mu)$-approximation with elements of $B^*$.

[Figure 2. Abstract Wiener space]

2.3. Construction of the random variables $\tilde{K}_p$. As before, we assume given a model covariance structure $K : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$. We have seen how this gives rise to a Hilbert space $H$, which is the closed linear span of the functions $K_p = K(p, \cdot)$, with $p$ running over $\mathcal{X}$. Now let $|\cdot|$ be any measurable norm on $H$, and let $B$ be the Banach space obtained by completion of $H$ with respect to $|\cdot|$. Let $\mu$ be the centered Gaussian measure on $B$ discussed above in the context of (2.9). We set
$$\tilde{K}_p \stackrel{\text{def}}{=} I(K_p) \tag{2.12}$$
for all $p \in \mathcal{X}$; thus $\tilde{K}_p \in L^2(B, \mu)$.

As a random variable on $(B, \mu)$, the function $\tilde{K}_p$ is Gaussian with mean 0 and variance $\|K_p\|^2$, and the covariance structure is given by
$$\mathrm{Cov}(\tilde{K}_p, \tilde{K}_q) = \langle \tilde{K}_p, \tilde{K}_q \rangle_{L^2(\mu)} = \langle K_p, K_q \rangle_H \ \text{(since } I \text{ is an isometry)} = K(p, q). \tag{2.13}$$
Thus we have produced Gaussian random variables $\tilde{K}_p$, for each $p \in \mathcal{X}$, on a Banach space, with a given covariance structure.

2.4. Classical Wiener space. Let us look at the case of the classical Wiener space, as an example of the preceding structures. We take $\mathcal{X} = [0, T]$ for some positive real $T$, and
$$K^{BM}(s, t) = \min\{s, t\} = \langle 1_{[0,s]}, 1_{[0,t]} \rangle_{L^2} \tag{2.14}$$
for all $s, t \in [0, T]$, where the superscript refers to Brownian motion. Then $K_s^{BM}(x)$ is $x$ for $x \in [0, s]$ and is constant at $s$ for $x \in [s, T]$. It follows then that $H_0$, the linear span of the functions $K_s^{BM}$ for $s \in [0, T]$, is the set of all functions on $[0, T]$ with initial value 0 and graph consisting of linear segments; in other words, $H_0$ consists of all functions $f$ on $[0, T]$ whose derivative exists and is locally constant outside a finite set of points. The inner product on $H_0$ is determined by observing that
$$\langle K_s^{BM}, K_t^{BM} \rangle = K^{BM}(s, t) = \min\{s, t\}; \tag{2.15}$$
we can verify that this coincides with
$$\langle f, g \rangle = \int_0^T f'(u)\, g'(u) \, du \tag{2.16}$$
for all $f, g \in H_0$. As a consequence, $H$ is the Hilbert space consisting of functions $f$ on $[0, T]$ that can be expressed as
$$f(t) = \int_0^t g(x) \, dx \quad \text{for all } t \in [0, T], \tag{2.17}$$
for some $g \in L^2[0, T]$; equivalently, $H$ consists of all absolutely continuous functions $f$ on $[0, T]$, with $f(0) = 0$, for which the derivative $f'$ exists almost everywhere and is in $L^2[0, T]$. The sup norm $\|\cdot\|_{\sup}$ is a measurable norm on $H$ (see Gross [8, Example 2] or Kuo [15]). The completion $B$ is then $C_0[0, T]$, the space of all continuous functions starting at 0. If $f \in H_0$ then $I(f)$ is Gaussian of mean 0 and variance $\|f\|^2$, relative to the Gaussian measure $\mu$ on $B$. We can check readily that
$$\tilde{K}_t^{BM} = I(K_t^{BM})$$
is Gaussian of mean 0 and variance $t$ for all $t \in [0, T]$. The process $t \mapsto \tilde{K}_t^{BM}$ yields standard Brownian motion, after a continuous version is chosen.
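Discretizing this example makes it concrete: sampling a centered Gaussian vector with covariance matrix $[\min(t_i, t_j)]$ produces (a discretization of) a Brownian path. A minimal sketch; the horizon, grid and seed are arbitrary choices.

```python
import numpy as np

# Simulate the centered Gaussian process with covariance K_BM(s,t) = min(s,t)
# on a grid of [0, T]; empirically, Cov(path_s, path_t) should be min(s, t).
rng = np.random.default_rng(5)
T, m, runs = 1.0, 200, 5000
t = np.linspace(T / m, T, m)                      # grid points in (0, T]
K_BM = np.minimum.outer(t, t)                     # covariance matrix [min(t_i, t_j)]
L = np.linalg.cholesky(K_BM + 1e-12 * np.eye(m))  # tiny jitter for numerical stability
paths = rng.standard_normal((runs, m)) @ L.T      # rows are sample paths of K~_t^BM

i, j = 59, 139                                    # two grid indices, s < t
print(np.cov(paths[:, i], paths[:, j])[0, 1], min(t[i], t[j]))  # approximately equal
```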

3. The Gaussian Radon transform

The classical Radon transform $Rf$ of a function $f : \mathbb{R}^n \to \mathbb{R}$ associates to each hyperplane $P \subset \mathbb{R}^n$ the integral of $f$ over $P$; this is useful in scanning technologies and image reconstruction. The Gaussian Radon transform generalizes this to infinite dimensions and works with Gaussian measures (instead of Lebesgue measure, for which there is no useful infinite-dimensional analog); the motivation for studying this transform comes from the task of reconstructing a random variable from its conditional expectations. We have developed this transform in the setting of abstract Wiener spaces in our work [10] (earlier works, in other frameworks, include [2, 3, 16]).

Let $H$ be a real separable Hilbert space, and $B$ the Banach space obtained by completing $H$ with respect to a measurable norm. Let $M_0$ be a closed subspace of $H$. Then
$$M_p = p + M_0 \tag{3.1}$$
is a closed affine subspace of $H$, for any point $p \in H$. In [10, Theorem 2.1] we have constructed a probability measure $\mu_{M_p}$ on $B$ uniquely specified by its Fourier transform
$$\int_B e^{i x^*} \, d\mu_{M_p} = e^{i (p, x^*) - \frac{1}{2}\|P_{M_0} x^*\|^2}, \quad \text{for all } x^* \in B^* \subset H_B \subset H, \tag{3.2}$$
where $P_{M_0}$ denotes orthogonal projection onto $M_0$ in $H$.

For our present purposes we formulate the description of the measure $\mu_{M_p}$ in slightly different terms. For every closed affine subspace $L \subset H$ there is a Borel probability measure $\mu_L$ on $B$ and a linear mapping $I_L : H \to L^2(B, \mu_L)$ such that for every $h \in H$ the random variable $I_L(h)$ satisfies
$$\int_B e^{i I_L(h)} \, d\mu_L = e^{i \langle f_L, h \rangle - \frac{1}{2}\|Ph\|^2}, \quad \text{for all } h \in H, \tag{3.3}$$
where $f_L$ is the point on $L$ closest to the origin in $H$, and $P : H \to H$ is the orthogonal projection operator onto the closed subspace $L - f_L$. Moreover, $\mu_L$ is concentrated on the closure of $L$ in $B$:
$$\mu_L(\bar{L}) = 1,$$
where $\bar{L}$ denotes the closure in $B$ of the affine subspace $L \subset H$.

(Note that the mapping $I_L$ depends on the subspace $L$. For more details about $I_L$, see the corollary and observation (v) following Theorem 2.1 in [11].) From the condition (3.3), holding for all $h \in H$, we see that $I_L(h)$ is Gaussian with mean $\langle h, f_L \rangle$ and variance $\|Ph\|^2$.

The Gaussian Radon transform $Gf$ of a Borel function $f$ on $B$ is the function defined on the set of all closed affine subspaces $L$ of $H$ by
$$Gf(L) \stackrel{\text{def}}{=} \int_B f \, d\mu_L. \tag{3.4}$$
(Of course, $Gf$ is defined if the integral is finite for all such $L$.) The value $Gf(L)$ corresponds to the conditional expectation of $f$, the conditioning being given by the closed affine subspace $L$. For a precise formulation of this, and a proof for $L$ of finite codimension, see the disintegration formula given in [11, Theorem 3.1].

The following is an immediate consequence of [11, Corollary 3.2] and will play a role in Section 4:

Proposition 3.1. Let $(H, B, \mu)$ be an abstract Wiener space and $h_1, h_2, \ldots, h_n$ linearly independent elements of $H$. Let $f$ be a Borel function on $B$, square-integrable with respect to $\mu$, and let
$$F : \mathbb{R}^n \to \mathbb{R} : (y_1, \ldots, y_n) \mapsto F(y_1, \ldots, y_n) = Gf\Bigl(\bigcap_{j=1}^n [\langle h_j, \cdot \rangle = y_j]\Bigr).$$
Then $F(I(h_1), I(h_2), \ldots, I(h_n))$ is a version of the conditional expectation
$$\mathbb{E}\left[f \mid \sigma(I(h_1), I(h_2), \ldots, I(h_n))\right].$$
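In finite dimensions, where $(\mathbb{R}^d, \text{standard Gaussian})$ plays the role of $(B, \mu)$, the content of Proposition 3.1 can be checked directly by Monte Carlo. The toy setup below is hypothetical (a single conditioning functional, a specific $f$): it compares $Gf(L)$, computed by integrating against $\mu_L$, with the conditional expectation estimated by rejection sampling.

```python
import numpy as np

# Finite-dimensional sketch: on (R^d, standard Gaussian), the Gaussian Radon
# transform of f on the affine subspace L = {x : <h, x> = y1} should agree with
# the conditional expectation E[f(x) | <h, x> = y1].
rng = np.random.default_rng(3)
d, y1 = 5, 1.7
h = np.zeros(d); h[0] = 1.0                      # unit vector, so L = {x : x_0 = y1}
f = lambda x: np.sum(x ** 2, axis=-1)            # a square-integrable Borel function

# Gf(L): mu_L is Gaussian with mean f_L = y1*h and covariance = projection onto h-perp,
# so we add noise only in the coordinates orthogonal to h.
samples_L = y1 * h + rng.standard_normal((200000, d)) * (1 - h)
gf_L = np.mean(f(samples_L))

# Conditional expectation, estimated by rejection: keep samples with <h, x> near y1.
x = rng.standard_normal((1000000, d))
keep = np.abs(x @ h - y1) < 0.02
cond_mean = np.mean(f(x[keep]))

print(gf_L, cond_mean, y1 ** 2 + (d - 1))        # all three agree approximately
```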

4. Machine learning using the Gaussian Radon transform

Consider again the learning problem outlined in the introduction: suppose $\mathcal{X}$ is a separable topological space and $H$ is the RKHS over $\mathcal{X}$ with reproducing kernel $K : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$. Let
$$D = \{(p_1, y_1), (p_2, y_2), \ldots, (p_n, y_n)\} \subset \mathcal{X} \times \mathbb{R} \tag{4.1}$$
be our training data, and suppose we want to predict the value $y$ at a future value $p \in \mathcal{X}$. Note that $H$ is a real separable Hilbert space, so, as outlined in Section 2.2, we may complete $H$ with respect to a measurable norm to obtain the Banach space $B$ and Gaussian measure $\mu$. Moreover, recall from Section 2.3 that we constructed for every $p \in \mathcal{X}$ the random variable
$$\tilde{K}_p = I(K_p) \in L^2(B, \mu), \tag{4.2}$$
where $I : H \to L^2(B, \mu)$ is the isometry in (2.11), and
$$\mathrm{Cov}(\tilde{K}_p, \tilde{K}_q) = \langle K_p, K_q \rangle = K(p, q) \tag{4.3}$$
for all $p, q \in \mathcal{X}$. Therefore $\{\tilde{K}_p : p \in \mathcal{X}\}$ is a centered Gaussian process with covariance $K$. Now since for every $f \in H$ the value $f(p)$ is given by $\langle K_p, f \rangle$, we work with $\tilde{K}_p(f)$ as our desired random-variable prediction. The first guess would therefore be to use
$$\hat{y} = \mathbb{E}\left[\tilde{K}_p \mid \tilde{K}_{p_1} = y_1, \ldots, \tilde{K}_{p_n} = y_n\right] \tag{4.4}$$
as our prediction of the output $y$ corresponding to the input $p$. We will need the following observation.

Lemma 4.1. Suppose $(Z, Z_1, Z_2, \ldots, Z_n)$ is a centered $\mathbb{R}^{n+1}$-valued Gaussian random variable on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, let $A \in \mathbb{R}^{n \times n}$ be the covariance matrix $A_{jk} = \mathrm{Cov}(Z_j, Z_k)$, for all $1 \leq j, k \leq n$, and suppose that $A$ is invertible. Then
$$\mathbb{E}\left[Z \mid Z_1, Z_2, \ldots, Z_n\right] = a_1 Z_1 + a_2 Z_2 + \cdots + a_n Z_n, \tag{4.5}$$
where $a = (a_1, \ldots, a_n) \in \mathbb{R}^n$ is given by
$$a = A^{-1} z, \tag{4.6}$$
with $z \in \mathbb{R}^n$ given by $z_k = \mathrm{Cov}(Z, Z_k)$ for all $1 \leq k \leq n$.

Proof. Let $Z_0$ be the orthogonal projection of $Z$ on the linear span of $Z_1, \ldots, Z_n$; thus $Y = Z - Z_0$ is orthogonal to $Z_1, \ldots, Z_n$ and, of course, $(Y, Z_1, \ldots, Z_n)$ is Gaussian. Hence, being all jointly Gaussian, the random variable $Y$ is independent of $(Z_1, \ldots, Z_n)$. Then for any $S \in \sigma(Z_1, \ldots, Z_n)$ we have
$$\mathbb{E}[Z 1_S] = \mathbb{E}[Z_0 1_S] + \mathbb{E}[Y 1_S] = \mathbb{E}[Z_0 1_S] + \mathbb{E}[Y]\,\mathbb{E}[1_S] = \mathbb{E}[Z_0 1_S]. \tag{4.7}$$
Since this holds for all $S \in \sigma(Z_1, \ldots, Z_n)$, and since the random variable $Z_0$, being a linear combination of $Z_1, \ldots, Z_n$, is $\sigma(Z_1, \ldots, Z_n)$-measurable, we conclude that
$$Z_0 = \mathbb{E}\left[Z \mid \sigma(Z_1, \ldots, Z_n)\right]. \tag{4.8}$$
Thus the conditional expectation of $Z$ is the orthogonal projection $Z_0$ onto the linear span of the variables $Z_1, \ldots, Z_n$. Writing
$$Z_0 = a_1 Z_1 + a_2 Z_2 + \cdots + a_n Z_n,$$
we have
$$\mathbb{E}[Z Z_j] = \mathbb{E}[Z_0 Z_j] = \sum_{k=1}^n \mathbb{E}[Z_j Z_k]\, a_k = (Aa)_j,$$
noting that $A_{jk} = \mathrm{Cov}(Z_j, Z_k) = \mathbb{E}[Z_j Z_k]$, since all these variables have mean 0 by hypothesis. Hence $a = A^{-1} z$. $\square$
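A quick numerical check of Lemma 4.1 (with an arbitrary covariance, not from the paper): the residual $Y = Z - Z_0$, where $Z_0 = \sum_k a_k Z_k$ and $a = A^{-1} z$, should be uncorrelated with each $Z_k$, which is exactly the orthogonality used in the proof.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
M = rng.standard_normal((n + 1, n + 1))
S = M @ M.T                                    # covariance of (Z, Z_1, ..., Z_n)
A, z = S[1:, 1:], S[0, 1:]                     # A_jk = Cov(Z_j, Z_k), z_k = Cov(Z, Z_k)
a = np.linalg.solve(A, z)                      # coefficients from (4.6)

# Sample the Gaussian vector and verify that Y = Z - Z_0 is uncorrelated
# with the conditioning variables, as the projection argument requires.
X = rng.multivariate_normal(np.zeros(n + 1), S, size=500000)
Z, Zvec = X[:, 0], X[:, 1:]
Y = Z - Zvec @ a
print(np.cov(Y, Zvec[:, 0])[0, 1], np.cov(Y, Zvec[:, 1])[0, 1])  # both near 0
```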

Then our prediction in (4.4) becomes
$$\hat{y} = [K(p_1, p) \ \cdots \ K(p_n, p)]\,(K_D)^{-1} y = \sum_{j=1}^n b_j K_{p_j}(p),$$
where $b = (K_D)^{-1} y$ and $K_D$ is, as in the introduction, the matrix with entries $K(p_i, p_j)$. As we shall see in Theorem 5.2, the function $\sum_{j=1}^n b_j K_{p_j}$ is the element $\hat{f}_0 \in H$ of minimal norm such that $\hat{f}_0(p_i) = y_i$ for all $1 \leq i \leq n$. Therefore (4.4) is the solution in the traditional spline setting, and does not take into account the regularization parameter $\lambda$. However, the next theorem shows that the ridge regression solution can be obtained in terms of the Gaussian Radon transform, by taking the Gaussian process approach outlined in the introduction.

Theorem 4.1. Let $H$ be the RKHS over a separable topological space $\mathcal{X}$ with real-valued reproducing kernel $K$, and let $B$ be the completion of $H$ with respect to a measurable norm, with Wiener measure $\mu$. Let
$$D = \{(p_1, y_1), (p_2, y_2), \ldots, (p_n, y_n)\} \subset \mathcal{X} \times \mathbb{R} \tag{4.9}$$
be fixed and $p \in \mathcal{X}$. Let $\{e_1, e_2, \ldots, e_n\} \subset H$ be an orthonormal set such that
$$\{e_1, \ldots, e_n\} \subset [\mathrm{span}\{K_{p_1}, \ldots, K_{p_n}, K_p\}]^\perp \tag{4.10}$$
and for every $1 \leq j \leq n$ let $\tilde{e}_j = I(e_j)$, where $I : H \to L^2(B, \mu)$ is the isometry in (2.11). Then for any $\lambda > 0$:
$$\mathbb{E}\left[\tilde{K}_p \mid \tilde{K}_{p_1} + \sqrt{\lambda}\,\tilde{e}_1 = y_1, \ldots, \tilde{K}_{p_n} + \sqrt{\lambda}\,\tilde{e}_n = y_n\right] = \hat{f}_\lambda(p), \tag{4.11}$$
where $\hat{f}_\lambda$ is the solution to the ridge regression problem (1.2) with regularization parameter $\lambda$. Consequently,
$$\hat{f}_\lambda(p) = G\tilde{K}_p\Bigl(\bigcap_{j=1}^n \bigl[\langle K_{p_j} + \sqrt{\lambda}\,e_j, \cdot\rangle = y_j\bigr]\Bigr), \tag{4.12}$$
where $G\tilde{K}_p$ is the Gaussian Radon transform of $\tilde{K}_p$.

Note that a completely precise statement of the relation (4.11) is as follows. Let us for the moment write the quantity $\hat{f}_\lambda(p)$, involving the vector $y$, as $\hat{f}_\lambda(p; y)$. Then
$$\hat{f}_\lambda\bigl(p;\, (\tilde{K}_{p_1} + \sqrt{\lambda}\,\tilde{e}_1, \ldots, \tilde{K}_{p_n} + \sqrt{\lambda}\,\tilde{e}_n)\bigr) \tag{4.13}$$
is a version of the conditional expectation
$$\mathbb{E}\bigl[\tilde{K}_p \mid \sigma(\tilde{K}_{p_1} + \sqrt{\lambda}\,\tilde{e}_1, \ldots, \tilde{K}_{p_n} + \sqrt{\lambda}\,\tilde{e}_n)\bigr]. \tag{4.14}$$

Proof. By our assumption in (4.10),
$$\mathrm{Cov}\bigl(\tilde{K}_{p_i} + \sqrt{\lambda}\,\tilde{e}_i,\ \tilde{K}_{p_j} + \sqrt{\lambda}\,\tilde{e}_j\bigr) = \langle K_{p_i} + \sqrt{\lambda}\,e_i,\ K_{p_j} + \sqrt{\lambda}\,e_j\rangle = K(p_i, p_j) + \lambda\,\delta_{i,j}$$
for all $1 \leq i, j \leq n$. We note that this covariance matrix is invertible (see the argument preceding equation (5.16) below). Moreover,
$$\mathrm{Cov}\bigl(\tilde{K}_p,\ \tilde{K}_{p_j} + \sqrt{\lambda}\,\tilde{e}_j\bigr) = K(p_j, p)$$
for all $1 \leq j \leq n$. Then by Lemma 4.1,
$$\mathbb{E}\bigl[\tilde{K}_p \mid \tilde{K}_{p_1} + \sqrt{\lambda}\,\tilde{e}_1 = y_1, \ldots, \tilde{K}_{p_n} + \sqrt{\lambda}\,\tilde{e}_n = y_n\bigr] = [K(p_1, p) \ \cdots \ K(p_n, p)]\,(K_D + \lambda I_n)^{-1} y = \sum_{j=1}^n c_j K_{p_j}(p) = \hat{f}_\lambda(p),$$
where $c = (K_D + \lambda I_n)^{-1} y$. Finally, (4.12) follows directly from Proposition 3.1. $\square$

The interpretation of the predicted value in terms of the Gaussian Radon transform allows for quite a broad class of functions that can be considered for prediction. As a simple example, consider the task of predicting the maximum value of an unknown function over a future period, using knowledge from the training data. For this we take both the RKHS and the Banach space to be spaces of real-valued functions on $\mathcal{X}$ (see Section 6 for more on this); the predicted value would be
$$GF(L), \tag{4.15}$$
where $L$ is the closed affine subspace of the RKHS reflecting the training data, and $F$ is, for example, a function of the form
$$F(b) = \sup_{p \in S} \tilde{K}_p(b) \tag{4.16}$$
for some given set $S$ of future dates. We note that the prediction (4.15) is, in general, different from
$$\sup_{p \in S} G\tilde{K}_p(L),$$
where $G\tilde{K}_p(L)$ is the SVM prediction as in (4.12); in other words, the prediction (4.15) is not the same as simply taking the supremum over the predictions given by the SVM minimizer. We note also that in this type of problem the Hilbert space, being a function space, is necessarily infinite-dimensional.
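This difference is easy to see numerically. The finite-dimensional sketch below (illustrative kernel, data, and future set $S$; not from the paper) samples the conditional law of the process on $S$ given noisy training values and compares the Monte Carlo estimate of $\mathbb{E}[\sup_{p \in S} f(p) \mid \text{data}]$ with the supremum of the pointwise predictions; the former is strictly larger.

```python
import numpy as np

rng = np.random.default_rng(6)

def K(a, b, s=0.05):
    return np.exp(-np.subtract.outer(a, b) ** 2 / (2 * s))

P = np.array([0.1, 0.3, 0.5])                    # training inputs
y = np.array([0.0, 1.0, 0.2])                    # training outputs
S = np.linspace(0.6, 1.0, 40)                    # "future dates"
lam = 0.1

A = np.linalg.solve(K(P, P) + lam * np.eye(3), np.eye(3))
mean = K(S, P) @ A @ y                           # pointwise predictions f_hat(p), p in S
cov = K(S, S) - K(S, P) @ A @ K(P, S)            # conditional covariance on S

Lc = np.linalg.cholesky(cov + 1e-8 * np.eye(len(S)))
sims = mean + rng.standard_normal((20000, len(S))) @ Lc.T  # conditional sample paths
print(np.mean(sims.max(axis=1)), mean.max())     # E[sup | data] exceeds sup of means
```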

5. A geometric formulation

In this section we present a geometric view of the relationship between the Gaussian Radon transform and the representer theorem used in support vector machine theory; thus, this will be a geometric interpretation of Theorem 4.1.

Given a reproducing kernel Hilbert space $H$ of functions defined on a set $\mathcal{X}$ with reproducing kernel $K : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$, we wish to find a function $\hat{f}_\lambda \in H$ that minimizes the functional
$$E_\lambda(f) = \sum_{j=1}^n [y_j - f(p_j)]^2 + \lambda \|f\|^2, \tag{5.1}$$
where $p_1, \ldots, p_n$ are given points in $\mathcal{X}$, $y_1, \ldots, y_n$ are given values in $\mathbb{R}$, and $\lambda > 0$ is a parameter. Our first goal in this section is to present a geometric proof of the following representer theorem, widely used in support vector machine theory. The result has its roots in the work of Kimeldorf and Wahba [13, 14] (for example, [13, Lemmas 2.1 and 2.2]) in the context of splines; in this context it is also worth noting the work of de Boor and Lynch [5], where Hilbert space methods were used to study splines.

Theorem 5.1. With notation and hypotheses as above, there is a unique $\hat{f}_\lambda \in H$ such that $E_\lambda(\hat{f}_\lambda) = \inf_{f \in H} E_\lambda(f)$. Moreover, $\hat{f}_\lambda$ is given explicitly by
$$\hat{f}_\lambda = \sum_{i=1}^n c_i K_{p_i}, \tag{5.2}$$
where the vector $c \in \mathbb{R}^n$ is $(K_D + \lambda I_n)^{-1} y$, with $K_D$ being the $n \times n$ matrix with entries $[K_D]_{ij} = K(p_i, p_j)$ and $y = (y_1, \ldots, y_n) \in \mathbb{R}^n$.

Proof. It will be convenient to scale the inner product $\langle \cdot, \cdot \rangle$ of $H$ by $\lambda$. Consequently, we denote by $H_\lambda$ the space $H$ with inner product
$$\langle f, g \rangle_{H_\lambda} = \lambda \langle f, g \rangle, \quad \text{for all } f, g \in H. \tag{5.3}$$
We shall use the linear mapping
$$T : \mathbb{R}^n \to H_\lambda \tag{5.4}$$
that maps $e_i$ to $\lambda^{-1} K_{p_i}$ for $i \in \{1, \ldots, n\}$, where $\{e_1, \ldots, e_n\}$ is the standard basis of $\mathbb{R}^n$:
$$T(e_i) = \lambda^{-1} K_{p_i} \quad \text{for } i \in \{1, \ldots, n\}.$$
We observe then that for any $f \in H_\lambda$,
$$\langle T^* f, e_i \rangle_{\mathbb{R}^n} = \langle f, T e_i \rangle_{H_\lambda} = \lambda \langle f, \lambda^{-1} K_{p_i} \rangle = f(p_i) \tag{5.5}$$
for each $i$, and so
$$T^* f = \sum_{j=1}^n f(p_j)\, e_j. \tag{5.6}$$
Consequently, we can rewrite $E_\lambda(f)$ as
$$E_\lambda(f) = \|y - T^* f\|_{\mathbb{R}^n}^2 + \|f\|_{H_\lambda}^2, \tag{5.7}$$
and from this we see that $E_\lambda(f)$ has a geometric meaning: it is the squared distance from the point $(f, T^* f) \in H_\lambda \times \mathbb{R}^n$ to the point $(0, y)$ in $H_\lambda \times \mathbb{R}^n$:
$$E_\lambda(f) = \|(0, y) - (f, T^* f)\|_{H_\lambda \times \mathbb{R}^n}^2. \tag{5.8}$$
Thus the minimization problem for $E_\lambda(\cdot)$ is equivalent to finding the point on the subspace
$$M = \{(f, T^* f) : f \in H_\lambda\} \subset H_\lambda \times \mathbb{R}^n \tag{5.9}$$
closest to $(0, y)$. Now the subspace $M$ is just the graph $\mathrm{Gr}(T^*)$, and it is a closed subspace of $H_\lambda \times \mathbb{R}^n$ because it is the orthogonal complement of a subspace (as we see below in (5.13)). Hence by standard Hilbert space theory there is indeed a unique point on $M$ that is closest to $(0, y)$, and this point is in fact of the form
$$(\hat{f}_\lambda, T^* \hat{f}_\lambda) = (0, y) + (a, b), \tag{5.10}$$
where the vector $(a, b) \in H_\lambda \times \mathbb{R}^n$ is orthogonal to $M$.

[Figure 3. A geometric interpretation of Theorem 5.1]

Now the condition for orthogonality to $M$ means that
$$\langle (a, b), (f, T^* f) \rangle_{H_\lambda \times \mathbb{R}^n} = 0 \quad \text{for all } f \in H,$$
and this is equivalent to
$$0 = \langle a, f \rangle_{H_\lambda} + \langle b, T^* f \rangle_{\mathbb{R}^n} = \langle a + T b, f \rangle_{H_\lambda} = \lambda \langle a + T b, f \rangle \quad \text{for all } f \in H.$$
Therefore
$$a + T b = 0. \tag{5.11}$$
Thus,
$$[\mathrm{Gr}(T^*)]^\perp = \{(-T c, c) : c \in \mathbb{R}^n\}. \tag{5.12}$$
Conversely, we can check directly that
$$\mathrm{Gr}(T^*) = \{(-T c, c) : c \in \mathbb{R}^n\}^\perp. \tag{5.13}$$
Returning to (5.10), we see that the point on $M$ closest to $(0, y)$ is of the form
$$(\hat{f}_\lambda, T^* \hat{f}_\lambda) = (a, y + b) = (-T b, y + b) \tag{5.14}$$
for some $b \in \mathbb{R}^n$. Since the second component is $T^*$ applied to the first, we have $y + b = -T^* T b$, and solving for $b$ we obtain
$$b = -(T^* T + I_n)^{-1} y. \tag{5.15}$$
Note that the operator $T^* T + I_n$ on $\mathbb{R}^n$ is invertible, since $\langle (T^* T + I_n) w, w \rangle_{\mathbb{R}^n} \geq \|w\|_{\mathbb{R}^n}^2$, so that if $(T^* T + I_n) w = 0$ then $w = 0$. Then from (5.14) we have $\hat{f}_\lambda = a = -T b$, given by
$$\hat{f}_\lambda = T\left[(T^* T + I_n)^{-1} y\right]. \tag{5.16}$$

Now we just need to write this in coordinates. The matrix of $T^* T$ has entries
$$\langle (T^* T) e_i, e_j \rangle_{\mathbb{R}^n} = \langle T e_i, T e_j \rangle_{H_\lambda} = \lambda \langle \lambda^{-1} K_{p_i}, \lambda^{-1} K_{p_j} \rangle = \lambda^{-1} [K_D]_{ij}, \tag{5.17}$$
and so
$$\hat{f}_\lambda = T\left[(T^* T + I_n)^{-1} y\right] = T\Bigl[\sum_{i,j=1}^n (I_n + \lambda^{-1} K_D)^{-1}_{ij}\, y_j\, e_i\Bigr].$$
Since $T e_i = \lambda^{-1} K_{p_i}$, we can write this as
$$\hat{f}_\lambda = \sum_{i=1}^n \Bigl[\sum_{j=1}^n (I_n + \lambda^{-1} K_D)^{-1}_{ij}\, y_j\Bigr]\, \lambda^{-1} K_{p_i},$$
which simplifies readily to (5.2). $\square$

The observations about the graph $\mathrm{Gr}(T^*)$ used in the preceding proof are in the spirit of the analysis of adjoints of operators carried out by von Neumann [23].

With $\hat{f}_\lambda$ being the minimizer as above, we can calculate the minimum value of $E_\lambda(\cdot)$:
$$E_\lambda(\hat{f}_\lambda) = \|(0, y) - (\hat{f}_\lambda, T^* \hat{f}_\lambda)\|_{H_\lambda \times \mathbb{R}^n}^2 = \|(a, b)\|_{H_\lambda \times \mathbb{R}^n}^2 \quad \text{(using (5.10))}$$
$$= \langle T b, T b \rangle_{H_\lambda} + \langle b, b \rangle_{\mathbb{R}^n} = \langle T^* T b, b \rangle_{\mathbb{R}^n} + \langle b, b \rangle_{\mathbb{R}^n} = \langle (T^* T + I_n) b, b \rangle_{\mathbb{R}^n} \tag{5.18}$$
$$= -\langle y, b \rangle_{\mathbb{R}^n} \ \text{(using (5.15))} \ = \|(T^* T + I_n)^{-1/2} y\|_{\mathbb{R}^n}^2 \ \text{(again using (5.15))}.$$
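Both identities in this proof are easy to confirm numerically. A sketch with toy data: the coefficients obtained from (5.16), using the matrix identity $T^*T = \lambda^{-1} K_D$ of (5.17), agree with $c = (K_D + \lambda I_n)^{-1} y$, and the minimum (5.18) agrees with a direct evaluation of $E_\lambda(\hat f_\lambda)$.

```python
import numpy as np

rng = np.random.default_rng(7)
n, lam = 4, 0.3
M = rng.standard_normal((n, n))
K_D = M @ M.T + n * np.eye(n)                      # a positive definite Gram matrix
y = rng.standard_normal(n)

TT = K_D / lam                                     # matrix of T*T, from (5.17)
c_geo = np.linalg.solve(TT + np.eye(n), y) / lam   # T e_i = K_{p_i}/lam, hence the 1/lam
c_rep = np.linalg.solve(K_D + lam * np.eye(n), y)  # representer coefficients (5.2)
print(np.allclose(c_geo, c_rep))                   # True

# Minimum value (5.18) vs direct evaluation of E_lambda at f_hat:
# E_lambda(f) = ||y - K_D c||^2 + lam * c^T K_D c for f = sum_i c_i K_{p_i}.
E_min = y @ np.linalg.solve(TT + np.eye(n), y)
r = y - K_D @ c_rep
E_dir = r @ r + lam * (c_rep @ K_D @ c_rep)
print(np.isclose(E_min, E_dir))                    # True
```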

It is useful to keep in mind that our definition of $T$ in (5.4), and hence of $T^*$, depends on $\lambda$. We note that the norm squared of $\hat{f}_\lambda$ itself is
$$\|\hat{f}_\lambda\|^2 = \lambda^{-1} \langle T(T^* T + I_n)^{-1} y,\ T(T^* T + I_n)^{-1} y \rangle_{H_\lambda} = \lambda^{-1} \|(T^* T)^{1/2} (T^* T + I_n)^{-1} y\|_{\mathbb{R}^n}^2 = \|K_D^{1/2} (K_D + \lambda I_n)^{-1} y\|_{\mathbb{R}^n}^2, \tag{5.19}$$
where the last step uses $T^* T = \lambda^{-1} K_D$ from (5.17).

Let us now turn to the traditional spline setting. A function $f \in H$, whose graph passes through the training points $(p_i, y_i)$ for $i \in \{1, \ldots, n\}$, of minimum norm has to be found. We present here a geometric description in the spirit of Theorem 5.1. This is in fact the result one would obtain by formally taking $\lambda = 0$ in Theorem 5.1.

Theorem 5.2. Let $H$ be a reproducing kernel Hilbert space of functions on a set $\mathcal{X}$, and let $(p_1, y_1), \ldots, (p_n, y_n)$ be points in $\mathcal{X} \times \mathbb{R}$. Let $K : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ be the reproducing kernel for $H$, and let $K_q = K(q, \cdot) : \mathcal{X} \to \mathbb{R}$ for every $q \in \mathcal{X}$. Assume that the functions $K_{p_1}, \ldots, K_{p_n}$ are linearly independent. Then, for any $y = (y_1, \ldots, y_n) \in \mathbb{R}^n$, the element of
$$\{f \in H : f(p_1) = y_1, \ldots, f(p_n) = y_n\}$$
of minimum norm is given by
$$\hat{f}_0 = \sum_{i=1}^n c_{0i} K_{p_i}, \tag{5.20}$$
where $c_0 = (c_{01}, \ldots, c_{0n}) = K_D^{-1} y$, with $K_D$ being the $n \times n$ matrix whose $(i,j)$-th entry is $K(p_i, p_j)$.

The assumption of linear independence of the $K_{p_i}$ is simply to ensure that there does exist a function $f \in H$ with values $y_i$ at the points $p_i$.

Proof. Let $T_0 : \mathbb{R}^n \to H$ be the linear mapping specified by $T_0 e_i = K_{p_i}$ for $i \in \{1, \ldots, n\}$. Then the adjoint $T_0^* : H \to \mathbb{R}^n$ is given explicitly by
$$T_0^* f = \sum_{i=1}^n \langle f, K_{p_i} \rangle\, e_i = \sum_{i=1}^n f(p_i)\, e_i,$$
and so
$$\{f \in H : f(p_1) = y_1, \ldots, f(p_n) = y_n\} = \{f \in H : T_0^* f = y\}. \tag{5.21}$$
Since the functions $K_{p_i}$ are linearly independent, no nontrivial linear combination of them is 0, and so the only vector in $\mathbb{R}^n$ orthogonal to the range of $T_0^*$ is 0; thus
$$\mathrm{Im}(T_0^*) = \mathbb{R}^n. \tag{5.22}$$
Let $\hat{f}_0$ be the point on the closed affine subspace in (5.21) that is nearest the origin in $H$. Then $\hat{f}_0$ is the point on $\{f \in H : T_0^* f = y\}$ orthogonal to $\ker T_0^*$.

[Figure 4. The point on $\{f \in H : T_0^* f = y\}$ closest to the origin]

Now it is a standard observation that $\ker T_0^* = [\mathrm{Im}(T_0)]^\perp$. (If $T_0^* v = 0$ then $\langle v, T_0 w \rangle = \langle T_0^* v, w \rangle = 0$, so that $\ker T_0^* \subset [\mathrm{Im}(T_0)]^\perp$; conversely, if $v \in [\mathrm{Im}(T_0)]^\perp$ then $\langle T_0^* v, w \rangle = \langle v, T_0 w \rangle = 0$ for all $w \in \mathbb{R}^n$, and so $T_0^* v = 0$.) Hence $(\ker T_0^*)^\perp$ is the closure of $\mathrm{Im}(T_0)$. Now $\mathrm{Im}(T_0)$ is a finite-dimensional subspace of $H$ and hence is closed; therefore
$$(\ker T_0^*)^\perp = \mathrm{Im}(T_0). \tag{5.23}$$
Returning to our point $\hat{f}_0$, we conclude that $\hat{f}_0 \in \mathrm{Im}(T_0)$. Thus $\hat{f}_0 = T_0 g$ for some $g \in \mathbb{R}^n$. The requirement that $T_0^* \hat{f}_0$ be equal to $y$ means that $(T_0^* T_0) g = y$, and so
$$\hat{f}_0 = T_0 (T_0^* T_0)^{-1} y. \tag{5.24}$$
We observe here that $T_0^* T_0 : \mathbb{R}^n \to \mathbb{R}^n$ is invertible because any $v \in \ker(T_0^* T_0)$ satisfies $\|T_0 v\|^2 = \langle T_0^* T_0 v, v \rangle = 0$, so that $\ker(T_0^* T_0) = \ker(T_0)$, and this is $\{0\}$, again by the linear independence of the functions $K_{p_i}$. The matrix of $T_0^* T_0$ is just the matrix $K_D$, because its $(i,j)$-th entry is
$$(T_0^* T_0)_{ij} = \langle T_0 e_i, T_0 e_j \rangle = \langle K_{p_i}, K_{p_j} \rangle = K(p_i, p_j) = (K_D)_{ij}.$$
Thus $\hat{f}_0$ works out to $\sum_{i,j=1}^n (K_D^{-1})_{ij}\, y_j\, K_{p_i}$. $\square$
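A numerical sketch of Theorem 5.2 (the Gaussian kernel and the data are illustrative choices): the function $\hat f_0$ interpolates the training values exactly, and perturbing it by any function vanishing at the $p_i$ only increases the norm, since such a perturbation is orthogonal to $\hat f_0$.

```python
import numpy as np

rng = np.random.default_rng(8)

def K(a, b, s=0.5):
    return np.exp(-np.subtract.outer(a, b) ** 2 / (2 * s))

P = np.array([0.0, 0.4, 0.9])
y = np.array([1.0, -0.5, 0.3])
K_D = K(P, P)
c0 = np.linalg.solve(K_D, y)       # c_0 = K_D^{-1} y, as in (5.20)

print(K_D @ c0)                    # equals y: f_hat_0 interpolates the data
norm2_f0 = c0 @ K_D @ c0           # ||f_hat_0||^2 = y^T K_D^{-1} y

# Competitor: f_hat_0 + g, where g = K_q - sum_i d_i K_{p_i} vanishes at every p_i,
# so it still interpolates, but g is orthogonal to f_hat_0 and adds to the norm.
q = np.array([0.6])                # an extra point used to build the competitor
k_q = K(P, q)[:, 0]
d = np.linalg.solve(K_D, k_q)
norm2_g = K(q, q)[0, 0] - d @ k_q  # ||g||^2 = K(q,q) - k_q^T K_D^{-1} k_q
print(norm2_f0, norm2_f0 + norm2_g)  # the competitor has strictly larger norm
```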

Working in the setting of Theorem 5.2, and assuming that $H$ is separable, let $B$ be the Banach space obtained as the completion of $H$ with respect to a measurable norm. Recall from (3.3) that the Gaussian Radon transform of a function $F$ on $B$ is the function on the set of closed affine subspaces of $H$ given by
$$GF(L) = \int_B F \, d\mu_L, \tag{5.25}$$
where $L$ is any closed affine subspace of $H$, and $\mu_L$ is the Borel measure on $B$ specified by the Fourier transform
$$\int_B e^{i I(h)} \, d\mu_L = e^{i \langle h, \hat{f}_0 \rangle - \frac{1}{2}\|Ph\|^2}, \quad \text{for all } h \in H, \tag{5.26}$$
wherein $\hat{f}_0$ is the point on $L$ closest to the origin in $H$ and $P$ is the orthogonal projection onto the closed subspace $L - \hat{f}_0$.

Let us apply this to the closed affine subspace
$$L = \{f \in H : f(p_1) = y_1, \ldots, f(p_n) = y_n\}.$$
From equation (5.26) we see that $I(h)$ is a Gaussian variable with mean $\langle h, \hat{f}_0 \rangle$ and variance $\|Ph\|^2$. Now let us take for $h$ the function $K_p \in H$, for any point $p \in \mathcal{X}$; then $I(K_p)$ is Gaussian, with respect to the measure $\mu_L$, with mean
$$\mathbb{E}_{\mu_L}(\tilde{K}_p) = \langle K_p, \hat{f}_0 \rangle = \hat{f}_0(p), \tag{5.27}$$
where $\tilde{K}_p$ is the random variable $I(K_p)$ defined on $(B, \mu_L)$. The function $\hat{f}_0$ is as given in (5.20).

Now consider the special case where $p = p_i$ from the training data. Then $\mathbb{E}_{\mu_L}(\tilde{K}_{p_i}) = \hat{f}_0(p_i) = y_i$, because $\hat{f}_0$ belongs to $L$ by definition. Moreover, the variance of $\tilde{K}_{p_i}$ is the norm squared of the orthogonal projection of $K_{p_i}$ onto the closed subspace
$$L_0 = L - \hat{f}_0 = \{f \in H : f(p_1) = 0, \ldots, f(p_n) = 0\}.$$

However, for any $f \in L$ we have
$$\langle f - \hat{f}_0, K_{p_i} \rangle = f(p_i) - \hat{f}_0(p_i) = 0,$$
and so the variance of $\tilde{K}_{p_i}$ is 0. Thus, with respect to the measure $\mu_L$, the functions $\tilde{K}_{p_i}$ take the constant values $y_i$ almost everywhere. This is analogous to our result Theorem 4.1; in the present context the conclusion is
$$\hat{f}_0(p) = \mathbb{E}\left[\tilde{K}_p \mid \tilde{K}_{p_1} = y_1, \ldots, \tilde{K}_{p_n} = y_n\right]. \tag{5.28}$$

We can also obtain a geometric perspective on Theorem 4.1 itself, by working with the closed affine subspace of $H_\lambda \times \mathbb{R}^n$ given by
$$L_1 = (0, y) + \mathrm{Gr}(-T^*).$$
The point in this set closest to $(0, 0)$ is
$$v = (0, y) + (\hat{f}_\lambda, -T^* \hat{f}_\lambda), \tag{5.29}$$
where $(\hat{f}_\lambda, T^* \hat{f}_\lambda)$ is the point on $\mathrm{Gr}(T^*)$ closest to $(0, y)$. Thus $v = (\hat{f}_\lambda,\ y - T^* \hat{f}_\lambda)$.

Since $\mathrm{Gr}(-T^*)$ is the orthogonal complement of the subspace $\{(T c, c) : c \in \mathbb{R}^n\}$, which has basis vectors $(T e_i, e_i)$ for $i \in \{1, \ldots, n\}$, we observe that for any $(f, z) \in L_1$:
$$\langle (f, z), (T e_i, e_i) \rangle_{H_\lambda \times \mathbb{R}^n} = \langle (0, y), (T e_i, e_i) \rangle_{H_\lambda \times \mathbb{R}^n} = y_i.$$
Thus,
$$L_1 = \bigcap_{i=1}^n \{(f, z) \in H_\lambda \times \mathbb{R}^n : \langle (f, z), (T e_i, e_i) \rangle_{H_\lambda \times \mathbb{R}^n} = y_i\}.$$
Using the procedure of Section 2.2, we complete $H_\lambda \times \mathbb{R}^n$ with respect to a measurable norm to obtain a Banach space $\tilde{B}$. Now consider the measure $\mu_{L_1}$, specified as in (3.2), for the closed affine subspace $L_1$ of $H_\lambda \times \mathbb{R}^n$, and the map $I_{L_1} : H_\lambda \times \mathbb{R}^n \to L^2(\tilde{B}, \mu_{L_1})$. For every $h \in H_\lambda \times \mathbb{R}^n$, $I_{L_1}(h)$ is Gaussian with mean $\langle v, h \rangle$, with $v$ being the point of $L_1$ closest to $(0, 0)$. With this in mind, take $h = (\lambda^{-1} K_q, 0)$, where $q \in \mathcal{X}$ and $K_q \in H$. We have
$$\langle (\lambda^{-1} K_q, 0), v \rangle_{H_\lambda \times \mathbb{R}^n} = \langle (\lambda^{-1} K_q, 0), (\hat{f}_\lambda,\ y - T^* \hat{f}_\lambda) \rangle_{H_\lambda \times \mathbb{R}^n} = \hat{f}_\lambda(q).$$
So the predicted value $\hat{f}_\lambda(q)$ is exactly the mean of the random variable $I_{L_1}(\lambda^{-1} K_q, 0)$ on $\tilde{B}$ with respect to the measure $\mu_{L_1}$.

6. Realizing B as a space of functions

In this section we present some results of a somewhat technical nature to address the question of whether the elements of the Banach space $B$ can be viewed as functions.

6.1. Continuity of $\tilde{K}_p$. A general measurable norm on $H$ does not know about the kernel function $K$, and hence there seems to be no reason why the functionals $\langle K_p, \cdot \rangle$ on $H$ would be continuous with respect to the norm $|\cdot|$. To remedy this situation we prove that there exist measurable norms on $H$ relative to which the functionals $\langle K_p, \cdot \rangle$ are continuous for $p$ running along a dense sequence of points in $\mathcal{X}$:

Proposition 6.1. Let $H$ be the reproducing kernel Hilbert space associated to a continuous kernel function $K : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$, where $\mathcal{X}$ is a separable metric space. Let $D$ be a countable dense subset of $\mathcal{X}$. Then there is a measurable norm on $H$ with respect to which $\langle K_p, \cdot \rangle$ is a continuous linear functional for every $p \in D$.

Proof. Let $D$ consist of the points $p_1, p_2, \ldots$, forming a dense subset of $\mathcal{X}$. By the Gram-Schmidt process we obtain an orthonormal basis $e_1, e_2, \ldots$ of $H$ such that $K_{p_n}$ is contained in the span of $e_1, \ldots, e_n$, for every $n$. Now consider the bounded linear operator $A : H \to H$ specified by requiring that
$$A e_n = \frac{1}{n}\, e_n$$
for all $n$; this is Hilbert-Schmidt because $\sum_n \|A e_n\|^2 < \infty$, and is clearly injective. Hence, by the property (3) discussed in the context of (2.7),
$$|f| \stackrel{\text{def}}{=} \|Af\| = \Bigl[\sum_n \frac{1}{n^2} \langle e_n, f \rangle^2\Bigr]^{1/2} \quad \text{for all } f \in H$$
specifies a measurable norm on $H$. Then $|\langle e_n, f \rangle| \leq n\,|f|$, from which we see that the linear functional $\langle e_n, \cdot \rangle$ on $H$ is continuous with respect to the norm $|\cdot|$. Hence, by the definition of the linear isometry $I : H \to L^2(B, \mu)$ given in (2.8), $I(e_n)$ is the element of $B^*$ that agrees with $\langle e_n, \cdot \rangle$ on $H$. In particular each $I(e_n)$ is continuous, and hence $\tilde{K}_p = I(K_p)$ is a continuous linear functional on $B$ for every $p \in D$. QED

The measurable norm we have constructed in the preceding proof arises from a (new) inner product on $H$. However, given any other measurable norm $|\cdot|_0$ on $H$, the sum
$$|\cdot|' \stackrel{\text{def}}{=} |\cdot| + |\cdot|_0$$
is also a measurable norm (not necessarily arising from an inner product), and the linear functional $\langle K_p, \cdot \rangle : H \to \mathbb{R}$ is continuous with respect to the norm $|\cdot|'$ for every $p \in D$.

6.2. Elements of B as functions. If a Banach space $B$ is obtained by completing a Hilbert space $H$ of functions, the elements of $B$ need not consist of functions. However, when $H$ is a reproducing kernel Hilbert space as we have been discussing, and under reasonable conditions on the reproducing kernel function $K$, it is true that the elements of $B$ can almost be thought of as functions on $\mathcal{X}$. For this we first develop a lemma:

Lemma 6.1. Suppose $H$ is a separable real Hilbert space and $B$ the Banach space obtained by completing $H$ with respect to a measurable norm $|\cdot|$. Let $B_0$ be a closed subspace of $B$ that is transverse to $H$ in the sense that $H \cap B_0 = \{0\}$, and let
$$B_1 = B/B_0 = \{b + B_0 : b \in B\} \tag{6.1}$$
be the quotient Banach space, with the standard quotient norm given by
$$|b + B_0|_1 \stackrel{\text{def}}{=} \inf\{|x| : x \in b + B_0\}. \tag{6.2}$$
Then the mapping
$$\iota_1 : H \to B/B_0 : h \mapsto \iota(h) + B_0, \tag{6.3}$$
where $\iota : H \to B$ is the inclusion, is a continuous linear injective map, and
$$|h|_1 \stackrel{\text{def}}{=} |\iota_1(h)|_1 \quad \text{for all } h \in H \tag{6.4}$$
specifies a measurable norm on $H$. The image of $H$ under $\iota_1$ is a dense subspace of $B_1$.

Proof. Let us first note that, by definition of the quotient norm, $|b + B_0|_1 \leq |b|$ for all $b \in B$. Hence $|h|_1 \leq |h|$ for all $h \in H \subset B$. Let $\epsilon > 0$. Since $|\cdot|$ is a measurable norm on $H$, there is a finite-dimensional subspace $F_\epsilon \subset H$ such that if $F$ is any finite-dimensional subspace of $H$ orthogonal to $F_\epsilon$ then, as noted back in (2.5),
$$\gamma_F\{h \in F : |h| > \epsilon\} < \epsilon,$$
where $\gamma_F$ is standard Gaussian measure on $F$. Then
$$\gamma_F\{h \in F : |h|_1 > \epsilon\} \leq \gamma_F\{h \in F : |h| > \epsilon\} < \epsilon, \tag{6.5}$$
where the first inequality holds because whenever $|h|_1 > \epsilon$ we also have $|h| \geq |h|_1 > \epsilon$. Thus, $|\cdot|_1$ is a measurable norm on $H$.

The image $\iota_1(H)$ is the same as the projection of the dense subspace $H \subset B$ onto the quotient space $B_1 = B/B_0$, and hence this image is dense in $B_1$ (an open set in the complement of $\overline{\iota_1(H)}$ would have inverse image in $B$ lying in the complement of $H$, and would have to be empty because $H$ is dense in $B$). QED

We can now establish the identification of $B$ as a function space.

Proposition 6.2. Let $K : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ be a continuous function, symmetric and non-negative definite, where $\mathcal{X}$ is a separable topological space, and $D$ a countable dense subset of $\mathcal{X}$. Let $H$ be the corresponding reproducing kernel Hilbert space. Then there is a measurable norm $|\cdot|_1$ on $H$ such that the Banach space $B_1$ obtained by completing $H$ with respect to $|\cdot|_1$ can be realized as a space of functions on the set $D$.

Proof. Let $B$ be the completion of $H$ with respect to a measurable norm $|\cdot|$ of the type given in Proposition 6.1. Thus $\langle K_p, \cdot \rangle : H \to \mathbb{R}$ is continuous with respect to $|\cdot|$ when $p \in D$; let $\tilde{K}_p : B \to \mathbb{R}$ be the continuous linear extension of $\langle K_p, \cdot \rangle$ to the Banach space $B$, for $p \in D$. Now let
$$B_0 \stackrel{\text{def}}{=} \bigcap_{p \in D} \ker \tilde{K}_p, \tag{6.6}$$
a closed subspace of $B$. We observe that $B_0$ is transverse to $H$; for if $x \in B_0$ is in $H$ then $\langle K_p, x \rangle = 0$ for all $p \in D$, and so $x = 0$, since $\{K_p : p \in D\}$ spans a dense subspace of $H$, as noted in Theorem 2.1 and the remark following it. Then by Lemma 6.1, $B_1 = B/B_0$ is a Banach space that is a completion of $H$, in the sense that $\iota_1 : H \to B_1 : h \mapsto h + B_0$ is continuous, linear, and injective with dense image, and $|h|_1 \stackrel{\text{def}}{=} |\iota_1(h)|_1$, for $h \in H$, specifies a measurable norm on $H$. Let $\tilde{K}_p^1$ be the linear functional on $B_1$ induced by $\tilde{K}_p$:
$$\tilde{K}_p^1(b + B_0) \stackrel{\text{def}}{=} \tilde{K}_p(b) \quad \text{for all } b \in B, \tag{6.7}$$
which is well-defined because the linear functional $\tilde{K}_p$ is 0 on $B_0$. We note that
$$\tilde{K}_p^1(\iota_1(h)) = \tilde{K}_p^1(h + B_0) = \tilde{K}_p(h) = \langle K_p, h \rangle \quad \text{for all } h \in H,$$
and so $\tilde{K}_p^1$ is the continuous linear extension of $\langle K_p, \cdot \rangle$ to $B_1$ through $\iota_1 : H \to B_1$, viewed as an inclusion map.

Now to each $b_1 \in B_1$ associate the function $f_{b_1}$ on $D$ given by
$$f_{b_1} : D \to \mathbb{R} : p \mapsto \tilde{K}_p^1(b_1).$$
We will show that the mapping $b_1 \mapsto f_{b_1}$ is injective; thus it realizes $B_1$ as a set of functions on the set $D$. To this end, suppose that $f_{b+B_0} = f_{c+B_0}$ for some $b, c \in B$. This means $\tilde{K}_p^1(b + B_0) = \tilde{K}_p^1(c + B_0)$ for all $p \in D$, and so $\tilde{K}_p(b) = \tilde{K}_p(c)$ for all $p \in D$. Then $b - c \in \ker \tilde{K}_p$ for all $p \in D$. Thus $b - c \in B_0$, and so $b + B_0 = c + B_0$. Thus we have shown that $b_1 \mapsto f_{b_1}$ is injective. QED

We have defined the function $f_b$ on the set $D \subset \mathcal{X}$, with notation and hypotheses as above. Now taking a general point $p \in \mathcal{X}$ and a sequence of points $p_n \in D$ converging to $p$, the function $\tilde{K}_p$ on $B$ is the $L^2(\mu)$-limit of the sequence of functions $\tilde{K}_{p_n}$. Thus we can define $f_b(p) = \tilde{K}_p(b)$, with the understanding that, for a given $p \in \mathcal{X}$, the value $f_b(p)$ is $\mu$-almost-surely defined in terms of its dependence on $b$. In the theory of Gaussian random fields one has conditions on the covariance function $K$ that ensure that $\mathcal{X} \to \mathbb{R} : p \mapsto \tilde{K}_p(b)$ is continuous in $p$ for $\mu$-a.e. $b \in B$, and in this case the function $f_b$ extends uniquely to a continuous function on $\mathcal{X}$, for $\mu$-almost-every $b \in B$.

Acknowledgments. This work is part of a research project covered by NSA grant H. I. Holmes would like to express her gratitude to the Louisiana State University Graduate School for awarding her the LSU Graduate School Dissertation Year Fellowship, which made most of her contribution to this work possible. We thank Kalyan B. Sinha for useful discussions.

References

[1] Aleksandr Y. Aravkin, Bradley M. Bell, James V. Burke, and Gianluigi Pillonetto, The connection between Bayesian estimation of a Gaussian random field and RKHS, submitted to IEEE Transactions on Neural Networks and Learning Systems.

[2] Jeremy J. Becnel, The support theorem for the Gauss-Radon transform, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 15 (2012), no. 2.
[3] Jeremy J. Becnel and Ambar N. Sengupta, A support theorem for a Gaussian Radon transform in infinite dimensions, Trans. Amer. Math. Soc. 364 (2012), no. 3.
[4] Vladimir I. Bogachev, Gaussian measures, Mathematical Surveys and Monographs, vol. 62, American Mathematical Society, Providence, RI, 1998.
[5] Carl de Boor and Robert E. Lynch, On splines and their minimum properties, J. Math. Mech. 15 (1966).
[6] Ricky Der and Daniel Lee, Large-margin classification in Banach spaces, Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics (AISTATS 2007), 2007.
[7] Nate Eldredge, Math 7770: Analysis and Probability on Infinite-Dimensional Spaces, online class notes, 2012.
[8] Leonard Gross, Abstract Wiener spaces, Proc. Fifth Berkeley Sympos. Math. Statist. and Probability (Berkeley, Calif., 1965/66), Vol. II: Contributions to Probability Theory, Part 1, Univ. California Press, Berkeley, Calif., 1967.
[9] Sigurdur Helgason, The Radon transform, 2nd ed., Progress in Mathematics, vol. 5, Birkhäuser Boston Inc., Boston, MA.
[10] Irina Holmes and Ambar N. Sengupta, A Gaussian Radon transform for Banach spaces, J. Funct. Anal. 263 (2012), no. 11.
[11] Irina Holmes, An inversion formula for the Gaussian Radon transform for Banach spaces, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 16 (2013), no. 4.
[12] Ferenc Huszár and Simon Lacoste-Julien, A kernel approach to tractable Bayesian nonparametrics, arXiv preprint, 2011.
[13] George S. Kimeldorf and Grace Wahba, A correspondence between Bayesian estimation on stochastic processes and smoothing by splines, Ann. Math. Statist. 41 (1970).
[14] George S. Kimeldorf and Grace Wahba, Some results on Tchebycheffian spline functions, J. Math. Anal. Appl. 33 (1971), no. 1, 82-95.
[15] Hui-Hsiung Kuo, Gaussian measures in Banach spaces, Lecture Notes in Mathematics, vol. 463, Springer-Verlag, Berlin, 1975.
[16] Vochita Mihai and Ambar N. Sengupta, The Radon-Gauss transform, Soochow J. Math. 33 (2007), no. 3.

[17] Umut Ozertem and Deniz Erdogmus, RKHS Bayes discriminant: a subspace constrained nonlinear feature projection for signal detection, IEEE Transactions on Neural Networks 20 (2009), no. 7.
[18] Natesh S. Pillai, Qiang Wu, Feng Liang, Sayan Mukherjee, and Robert L. Wolpert, Characterizing the function space for Bayesian kernel models, J. Mach. Learn. Res. 8 (2007).
[19] Johann Radon, Über die Bestimmung von Funktionen durch ihre Integralwerte längs gewisser Mannigfaltigkeiten, Computed tomography (Cincinnati, Ohio, 1982), Proc. Sympos. Appl. Math., vol. 27, Amer. Math. Soc., Providence, R.I., 1982 (German).
[20] Peter Sollich, Bayesian methods for support vector machines: evidence and predictive class probabilities, Mach. Learn. 46 (2002), no. 1-3, 21-52.
[21] Bharath K. Sriperumbudur, Kenji Fukumizu, and Gert R. G. Lanckriet, Learning in Hilbert vs. Banach spaces: a measure embedding viewpoint, NIPS, 2011.
[22] Ingo Steinwart and Andreas Christmann, Support vector machines, Information Science and Statistics, Springer, New York, 2008.
[23] J. von Neumann, Über adjungierte Funktionaloperatoren, Ann. of Math. (2) 33 (1932), no. 2 (German).
[24] Haizhang Zhang, Yuesheng Xu, and Jun Zhang, Reproducing kernel Banach spaces for machine learning, J. Mach. Learn. Res. 10 (2009).
[25] Zhihua Zhang, Guang Dai, Donghui Wang, and Michael I. Jordan, Bayesian generalized kernel models, Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2010), 2010.

Department of Mathematics, Louisiana State University, Baton Rouge, LA 70803; irina.c.holmes@gmail.com
Department of Mathematics, Louisiana State University, Baton Rouge, LA 70803; ambarnsg@gmail.com


SHARP BOUNDARY TRACE INEQUALITIES. 1. Introduction

SHARP BOUNDARY TRACE INEQUALITIES. 1. Introduction SHARP BOUNDARY TRACE INEQUALITIES GILES AUCHMUTY Abstract. This paper describes sharp inequalities for the trace of Sobolev functions on the boundary of a bounded region R N. The inequalities bound (semi-)norms

More information

4 Hilbert spaces. The proof of the Hilbert basis theorem is not mathematics, it is theology. Camille Jordan

4 Hilbert spaces. The proof of the Hilbert basis theorem is not mathematics, it is theology. Camille Jordan The proof of the Hilbert basis theorem is not mathematics, it is theology. Camille Jordan Wir müssen wissen, wir werden wissen. David Hilbert We now continue to study a special class of Banach spaces,

More information

The Schwartz Space: Tools for Quantum Mechanics and Infinite Dimensional Analysis

The Schwartz Space: Tools for Quantum Mechanics and Infinite Dimensional Analysis Mathematics 2015, 3, 527-562; doi:10.3390/math3020527 Article OPEN ACCESS mathematics ISSN 2227-7390 www.mdpi.com/journal/mathematics The Schwartz Space: Tools for Quantum Mechanics and Infinite Dimensional

More information

Strictly Positive Definite Functions on a Real Inner Product Space

Strictly Positive Definite Functions on a Real Inner Product Space Strictly Positive Definite Functions on a Real Inner Product Space Allan Pinkus Abstract. If ft) = a kt k converges for all t IR with all coefficients a k 0, then the function f< x, y >) is positive definite

More information

THE GAUSSIAN RADON TRANSFORM AS A LIMIT OF SPHERICAL TRANSFORMS

THE GAUSSIAN RADON TRANSFORM AS A LIMIT OF SPHERICAL TRANSFORMS THE GAUSSIAN RADON TRANSFORM AS A LIMIT OF SPHERICAL TRANSFORMS AMBAR N. SENGUPTA Abstract. The Gaussian Radon transform is an analog of the traditional Radon transform in an infinite dimensional Gaussian

More information

What is an RKHS? Dino Sejdinovic, Arthur Gretton. March 11, 2014

What is an RKHS? Dino Sejdinovic, Arthur Gretton. March 11, 2014 What is an RKHS? Dino Sejdinovic, Arthur Gretton March 11, 2014 1 Outline Normed and inner product spaces Cauchy sequences and completeness Banach and Hilbert spaces Linearity, continuity and boundedness

More information

A DECOMPOSITION THEOREM FOR FRAMES AND THE FEICHTINGER CONJECTURE

A DECOMPOSITION THEOREM FOR FRAMES AND THE FEICHTINGER CONJECTURE PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 00, Number 0, Pages 000 000 S 0002-9939(XX)0000-0 A DECOMPOSITION THEOREM FOR FRAMES AND THE FEICHTINGER CONJECTURE PETER G. CASAZZA, GITTA KUTYNIOK,

More information

引用北海学園大学学園論集 (171): 11-24

引用北海学園大学学園論集 (171): 11-24 タイトル 著者 On Some Singular Integral Operato One to One Mappings on the Weight Hilbert Spaces YAMAMOTO, Takanori 引用北海学園大学学園論集 (171): 11-24 発行日 2017-03-25 On Some Singular Integral Operators Which are One

More information

Intertibility and spectrum of the multiplication operator on the space of square-summable sequences

Intertibility and spectrum of the multiplication operator on the space of square-summable sequences Intertibility and spectrum of the multiplication operator on the space of square-summable sequences Objectives Establish an invertibility criterion and calculate the spectrum of the multiplication operator

More information

Trace Class Operators and Lidskii s Theorem

Trace Class Operators and Lidskii s Theorem Trace Class Operators and Lidskii s Theorem Tom Phelan Semester 2 2009 1 Introduction The purpose of this paper is to provide the reader with a self-contained derivation of the celebrated Lidskii Trace

More information

Means of unitaries, conjugations, and the Friedrichs operator

Means of unitaries, conjugations, and the Friedrichs operator J. Math. Anal. Appl. 335 (2007) 941 947 www.elsevier.com/locate/jmaa Means of unitaries, conjugations, and the Friedrichs operator Stephan Ramon Garcia Department of Mathematics, Pomona College, Claremont,

More information

Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation.

Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation. CS 189 Spring 2015 Introduction to Machine Learning Midterm You have 80 minutes for the exam. The exam is closed book, closed notes except your one-page crib sheet. No calculators or electronic items.

More information

Karhunen-Loève decomposition of Gaussian measures on Banach spaces

Karhunen-Loève decomposition of Gaussian measures on Banach spaces Karhunen-Loève decomposition of Gaussian measures on Banach spaces Jean-Charles Croix jean-charles.croix@emse.fr Génie Mathématique et Industriel (GMI) First workshop on Gaussian processes at Saint-Etienne

More information

Self-adjoint extensions of symmetric operators

Self-adjoint extensions of symmetric operators Self-adjoint extensions of symmetric operators Simon Wozny Proseminar on Linear Algebra WS216/217 Universität Konstanz Abstract In this handout we will first look at some basics about unbounded operators.

More information

HILBERT SPACES AND THE RADON-NIKODYM THEOREM. where the bar in the first equation denotes complex conjugation. In either case, for any x V define

HILBERT SPACES AND THE RADON-NIKODYM THEOREM. where the bar in the first equation denotes complex conjugation. In either case, for any x V define HILBERT SPACES AND THE RADON-NIKODYM THEOREM STEVEN P. LALLEY 1. DEFINITIONS Definition 1. A real inner product space is a real vector space V together with a symmetric, bilinear, positive-definite mapping,

More information

On the simplest expression of the perturbed Moore Penrose metric generalized inverse

On the simplest expression of the perturbed Moore Penrose metric generalized inverse Annals of the University of Bucharest (mathematical series) 4 (LXII) (2013), 433 446 On the simplest expression of the perturbed Moore Penrose metric generalized inverse Jianbing Cao and Yifeng Xue Communicated

More information

Infinite-dimensional Vector Spaces and Sequences

Infinite-dimensional Vector Spaces and Sequences 2 Infinite-dimensional Vector Spaces and Sequences After the introduction to frames in finite-dimensional vector spaces in Chapter 1, the rest of the book will deal with expansions in infinitedimensional

More information

NUCLEAR SPACE FACTS, STRANGE AND PLAIN

NUCLEAR SPACE FACTS, STRANGE AND PLAIN NUCLEAR SPACE FACTS, STRANGE AND PLAIN JEREMY J. BECNEL AND AMBAR N. SENGUPTA Abstract. We present a scenic, but practical drive through nuclear spaces, stopping to look at unexpected results both for

More information

Optimization and Optimal Control in Banach Spaces

Optimization and Optimal Control in Banach Spaces Optimization and Optimal Control in Banach Spaces Bernhard Schmitzer October 19, 2017 1 Convex non-smooth optimization with proximal operators Remark 1.1 (Motivation). Convex optimization: easier to solve,

More information

REAL AND COMPLEX ANALYSIS

REAL AND COMPLEX ANALYSIS REAL AND COMPLE ANALYSIS Third Edition Walter Rudin Professor of Mathematics University of Wisconsin, Madison Version 1.1 No rights reserved. Any part of this work can be reproduced or transmitted in any

More information

DEFINABLE OPERATORS ON HILBERT SPACES

DEFINABLE OPERATORS ON HILBERT SPACES DEFINABLE OPERATORS ON HILBERT SPACES ISAAC GOLDBRING Abstract. Let H be an infinite-dimensional (real or complex) Hilbert space, viewed as a metric structure in its natural signature. We characterize

More information

3. (a) What is a simple function? What is an integrable function? How is f dµ defined? Define it first

3. (a) What is a simple function? What is an integrable function? How is f dµ defined? Define it first Math 632/6321: Theory of Functions of a Real Variable Sample Preinary Exam Questions 1. Let (, M, µ) be a measure space. (a) Prove that if µ() < and if 1 p < q

More information

ON MATRIX VALUED SQUARE INTEGRABLE POSITIVE DEFINITE FUNCTIONS

ON MATRIX VALUED SQUARE INTEGRABLE POSITIVE DEFINITE FUNCTIONS 1 2 3 ON MATRIX VALUED SQUARE INTERABLE POSITIVE DEFINITE FUNCTIONS HONYU HE Abstract. In this paper, we study matrix valued positive definite functions on a unimodular group. We generalize two important

More information

S. DUTTA AND T. S. S. R. K. RAO

S. DUTTA AND T. S. S. R. K. RAO ON WEAK -EXTREME POINTS IN BANACH SPACES S. DUTTA AND T. S. S. R. K. RAO Abstract. We study the extreme points of the unit ball of a Banach space that remain extreme when considered, under canonical embedding,

More information

Topological vectorspaces

Topological vectorspaces (July 25, 2011) Topological vectorspaces Paul Garrett garrett@math.umn.edu http://www.math.umn.edu/ garrett/ Natural non-fréchet spaces Topological vector spaces Quotients and linear maps More topological

More information

COVARIANCE IDENTITIES AND MIXING OF RANDOM TRANSFORMATIONS ON THE WIENER SPACE

COVARIANCE IDENTITIES AND MIXING OF RANDOM TRANSFORMATIONS ON THE WIENER SPACE Communications on Stochastic Analysis Vol. 4, No. 3 (21) 299-39 Serials Publications www.serialspublications.com COVARIANCE IDENTITIES AND MIXING OF RANDOM TRANSFORMATIONS ON THE WIENER SPACE NICOLAS PRIVAULT

More information

Almost Invariant Half-Spaces of Operators on Banach Spaces

Almost Invariant Half-Spaces of Operators on Banach Spaces Integr. equ. oper. theory Online First c 2009 Birkhäuser Verlag Basel/Switzerland DOI 10.1007/s00020-009-1708-8 Integral Equations and Operator Theory Almost Invariant Half-Spaces of Operators on Banach

More information

TWO MAPPINGS RELATED TO SEMI-INNER PRODUCTS AND THEIR APPLICATIONS IN GEOMETRY OF NORMED LINEAR SPACES. S.S. Dragomir and J.J.

TWO MAPPINGS RELATED TO SEMI-INNER PRODUCTS AND THEIR APPLICATIONS IN GEOMETRY OF NORMED LINEAR SPACES. S.S. Dragomir and J.J. RGMIA Research Report Collection, Vol. 2, No. 1, 1999 http://sci.vu.edu.au/ rgmia TWO MAPPINGS RELATED TO SEMI-INNER PRODUCTS AND THEIR APPLICATIONS IN GEOMETRY OF NORMED LINEAR SPACES S.S. Dragomir and

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 22: Preliminaries for Semiparametric Inference

Introduction to Empirical Processes and Semiparametric Inference Lecture 22: Preliminaries for Semiparametric Inference Introduction to Empirical Processes and Semiparametric Inference Lecture 22: Preliminaries for Semiparametric Inference Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics

More information

Spectral isometries into commutative Banach algebras

Spectral isometries into commutative Banach algebras Contemporary Mathematics Spectral isometries into commutative Banach algebras Martin Mathieu and Matthew Young Dedicated to the memory of James E. Jamison. Abstract. We determine the structure of spectral

More information

Reproducing Kernel Hilbert Spaces

Reproducing Kernel Hilbert Spaces Reproducing Kernel Hilbert Spaces Lorenzo Rosasco 9.520 Class 03 February 12, 2007 About this class Goal To introduce a particularly useful family of hypothesis spaces called Reproducing Kernel Hilbert

More information

Approximate Kernel Methods

Approximate Kernel Methods Lecture 3 Approximate Kernel Methods Bharath K. Sriperumbudur Department of Statistics, Pennsylvania State University Machine Learning Summer School Tübingen, 207 Outline Motivating example Ridge regression

More information

Vector Spaces. Vector space, ν, over the field of complex numbers, C, is a set of elements a, b,..., satisfying the following axioms.

Vector Spaces. Vector space, ν, over the field of complex numbers, C, is a set of elements a, b,..., satisfying the following axioms. Vector Spaces Vector space, ν, over the field of complex numbers, C, is a set of elements a, b,..., satisfying the following axioms. For each two vectors a, b ν there exists a summation procedure: a +

More information

Real Analysis Notes. Thomas Goller

Real Analysis Notes. Thomas Goller Real Analysis Notes Thomas Goller September 4, 2011 Contents 1 Abstract Measure Spaces 2 1.1 Basic Definitions........................... 2 1.2 Measurable Functions........................ 2 1.3 Integration..............................

More information

Regularity of the density for the stochastic heat equation

Regularity of the density for the stochastic heat equation Regularity of the density for the stochastic heat equation Carl Mueller 1 Department of Mathematics University of Rochester Rochester, NY 15627 USA email: cmlr@math.rochester.edu David Nualart 2 Department

More information

THE FORM SUM AND THE FRIEDRICHS EXTENSION OF SCHRÖDINGER-TYPE OPERATORS ON RIEMANNIAN MANIFOLDS

THE FORM SUM AND THE FRIEDRICHS EXTENSION OF SCHRÖDINGER-TYPE OPERATORS ON RIEMANNIAN MANIFOLDS THE FORM SUM AND THE FRIEDRICHS EXTENSION OF SCHRÖDINGER-TYPE OPERATORS ON RIEMANNIAN MANIFOLDS OGNJEN MILATOVIC Abstract. We consider H V = M +V, where (M, g) is a Riemannian manifold (not necessarily

More information

CHAPTER I THE RIESZ REPRESENTATION THEOREM

CHAPTER I THE RIESZ REPRESENTATION THEOREM CHAPTER I THE RIESZ REPRESENTATION THEOREM We begin our study by identifying certain special kinds of linear functionals on certain special vector spaces of functions. We describe these linear functionals

More information

Reproducing Kernel Hilbert Spaces

Reproducing Kernel Hilbert Spaces Reproducing Kernel Hilbert Spaces Lorenzo Rosasco 9.520 Class 03 February 11, 2009 About this class Goal To introduce a particularly useful family of hypothesis spaces called Reproducing Kernel Hilbert

More information

Measurable Choice Functions

Measurable Choice Functions (January 19, 2013) Measurable Choice Functions Paul Garrett garrett@math.umn.edu http://www.math.umn.edu/ garrett/ [This document is http://www.math.umn.edu/ garrett/m/fun/choice functions.pdf] This note

More information

EECS 598: Statistical Learning Theory, Winter 2014 Topic 11. Kernels

EECS 598: Statistical Learning Theory, Winter 2014 Topic 11. Kernels EECS 598: Statistical Learning Theory, Winter 2014 Topic 11 Kernels Lecturer: Clayton Scott Scribe: Jun Guo, Soumik Chatterjee Disclaimer: These notes have not been subjected to the usual scrutiny reserved

More information

Non-separable AF-algebras

Non-separable AF-algebras Non-separable AF-algebras Takeshi Katsura Department of Mathematics, Hokkaido University, Kita 1, Nishi 8, Kita-Ku, Sapporo, 6-81, JAPAN katsura@math.sci.hokudai.ac.jp Summary. We give two pathological

More information

FREMLIN TENSOR PRODUCTS OF CONCAVIFICATIONS OF BANACH LATTICES

FREMLIN TENSOR PRODUCTS OF CONCAVIFICATIONS OF BANACH LATTICES FREMLIN TENSOR PRODUCTS OF CONCAVIFICATIONS OF BANACH LATTICES VLADIMIR G. TROITSKY AND OMID ZABETI Abstract. Suppose that E is a uniformly complete vector lattice and p 1,..., p n are positive reals.

More information

Vector measures of bounded γ-variation and stochastic integrals

Vector measures of bounded γ-variation and stochastic integrals Vector measures of bounded γ-variation and stochastic integrals Jan van Neerven and Lutz Weis bstract. We introduce the class of vector measures of bounded γ-variation and study its relationship with vector-valued

More information

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory Part V 7 Introduction: What are measures and why measurable sets Lebesgue Integration Theory Definition 7. (Preliminary). A measure on a set is a function :2 [ ] such that. () = 2. If { } = is a finite

More information

9 Brownian Motion: Construction

9 Brownian Motion: Construction 9 Brownian Motion: Construction 9.1 Definition and Heuristics The central limit theorem states that the standard Gaussian distribution arises as the weak limit of the rescaled partial sums S n / p n of

More information

ADJOINT FOR OPERATORS IN BANACH SPACES

ADJOINT FOR OPERATORS IN BANACH SPACES ADJOINT FOR OPERATORS IN BANACH SPACES T. L. GILL, S. BASU, W. W. ZACHARY, AND V. STEADMAN Abstract. In this paper we show that a result of Gross and Kuelbs, used to study Gaussian measures on Banach spaces,

More information

Chapter 8 Integral Operators

Chapter 8 Integral Operators Chapter 8 Integral Operators In our development of metrics, norms, inner products, and operator theory in Chapters 1 7 we only tangentially considered topics that involved the use of Lebesgue measure,

More information

Independence of some multiple Poisson stochastic integrals with variable-sign kernels

Independence of some multiple Poisson stochastic integrals with variable-sign kernels Independence of some multiple Poisson stochastic integrals with variable-sign kernels Nicolas Privault Division of Mathematical Sciences School of Physical and Mathematical Sciences Nanyang Technological

More information

ON BURKHOLDER'S BICONVEX-FUNCTION CHARACTERIZATION OF HILBERT SPACES

ON BURKHOLDER'S BICONVEX-FUNCTION CHARACTERIZATION OF HILBERT SPACES PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 118, Number 2, June 1993 ON BURKHOLDER'S BICONVEX-FUNCTION CHARACTERIZATION OF HILBERT SPACES JINSIK MOK LEE (Communicated by William J. Davis) Abstract.

More information

Extensions of Lipschitz functions and Grothendieck s bounded approximation property

Extensions of Lipschitz functions and Grothendieck s bounded approximation property North-Western European Journal of Mathematics Extensions of Lipschitz functions and Grothendieck s bounded approximation property Gilles Godefroy 1 Received: January 29, 2015/Accepted: March 6, 2015/Online:

More information

Hilbert spaces. 1. Cauchy-Schwarz-Bunyakowsky inequality

Hilbert spaces. 1. Cauchy-Schwarz-Bunyakowsky inequality (October 29, 2016) Hilbert spaces Paul Garrett garrett@math.umn.edu http://www.math.umn.edu/ garrett/ [This document is http://www.math.umn.edu/ garrett/m/fun/notes 2016-17/03 hsp.pdf] Hilbert spaces are

More information

Tools from Lebesgue integration

Tools from Lebesgue integration Tools from Lebesgue integration E.P. van den Ban Fall 2005 Introduction In these notes we describe some of the basic tools from the theory of Lebesgue integration. Definitions and results will be given

More information

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces.

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces. Math 350 Fall 2011 Notes about inner product spaces In this notes we state and prove some important properties of inner product spaces. First, recall the dot product on R n : if x, y R n, say x = (x 1,...,

More information

Elementary linear algebra

Elementary linear algebra Chapter 1 Elementary linear algebra 1.1 Vector spaces Vector spaces owe their importance to the fact that so many models arising in the solutions of specific problems turn out to be vector spaces. The

More information

Real Analysis, 2nd Edition, G.B.Folland Elements of Functional Analysis

Real Analysis, 2nd Edition, G.B.Folland Elements of Functional Analysis Real Analysis, 2nd Edition, G.B.Folland Chapter 5 Elements of Functional Analysis Yung-Hsiang Huang 5.1 Normed Vector Spaces 1. Note for any x, y X and a, b K, x+y x + y and by ax b y x + b a x. 2. It

More information

I teach myself... Hilbert spaces

I teach myself... Hilbert spaces I teach myself... Hilbert spaces by F.J.Sayas, for MATH 806 November 4, 2015 This document will be growing with the semester. Every in red is for you to justify. Even if we start with the basic definition

More information

On metric characterizations of some classes of Banach spaces

On metric characterizations of some classes of Banach spaces On metric characterizations of some classes of Banach spaces Mikhail I. Ostrovskii January 12, 2011 Abstract. The first part of the paper is devoted to metric characterizations of Banach spaces with no

More information

212a1214Daniell s integration theory.

212a1214Daniell s integration theory. 212a1214 Daniell s integration theory. October 30, 2014 Daniell s idea was to take the axiomatic properties of the integral as the starting point and develop integration for broader and broader classes

More information

Yuqing Chen, Yeol Je Cho, and Li Yang

Yuqing Chen, Yeol Je Cho, and Li Yang Bull. Korean Math. Soc. 39 (2002), No. 4, pp. 535 541 NOTE ON THE RESULTS WITH LOWER SEMI-CONTINUITY Yuqing Chen, Yeol Je Cho, and Li Yang Abstract. In this paper, we introduce the concept of lower semicontinuous

More information

A BOOLEAN ACTION OF C(M, U(1)) WITHOUT A SPATIAL MODEL AND A RE-EXAMINATION OF THE CAMERON MARTIN THEOREM

A BOOLEAN ACTION OF C(M, U(1)) WITHOUT A SPATIAL MODEL AND A RE-EXAMINATION OF THE CAMERON MARTIN THEOREM A BOOLEAN ACTION OF C(M, U(1)) WITHOUT A SPATIAL MODEL AND A RE-EXAMINATION OF THE CAMERON MARTIN THEOREM JUSTIN TATCH MOORE AND S LAWOMIR SOLECKI Abstract. We will demonstrate that if M is an uncountable

More information

Conditional moment representations for dependent random variables

Conditional moment representations for dependent random variables Conditional moment representations for dependent random variables W lodzimierz Bryc Department of Mathematics University of Cincinnati Cincinnati, OH 45 22-0025 bryc@ucbeh.san.uc.edu November 9, 995 Abstract

More information

Bernstein-Szegö Inequalities in Reproducing Kernel Hilbert Spaces ABSTRACT 1. INTRODUCTION

Bernstein-Szegö Inequalities in Reproducing Kernel Hilbert Spaces ABSTRACT 1. INTRODUCTION Malaysian Journal of Mathematical Sciences 6(2): 25-36 (202) Bernstein-Szegö Inequalities in Reproducing Kernel Hilbert Spaces Noli N. Reyes and Rosalio G. Artes Institute of Mathematics, University of

More information

Reminder Notes for the Course on Measures on Topological Spaces

Reminder Notes for the Course on Measures on Topological Spaces Reminder Notes for the Course on Measures on Topological Spaces T. C. Dorlas Dublin Institute for Advanced Studies School of Theoretical Physics 10 Burlington Road, Dublin 4, Ireland. Email: dorlas@stp.dias.ie

More information

Asymptotically Efficient Nonparametric Estimation of Nonlinear Spectral Functionals

Asymptotically Efficient Nonparametric Estimation of Nonlinear Spectral Functionals Acta Applicandae Mathematicae 78: 145 154, 2003. 2003 Kluwer Academic Publishers. Printed in the Netherlands. 145 Asymptotically Efficient Nonparametric Estimation of Nonlinear Spectral Functionals M.

More information

b i (µ, x, s) ei ϕ(x) µ s (dx) ds (2) i=1

b i (µ, x, s) ei ϕ(x) µ s (dx) ds (2) i=1 NONLINEAR EVOLTION EQATIONS FOR MEASRES ON INFINITE DIMENSIONAL SPACES V.I. Bogachev 1, G. Da Prato 2, M. Röckner 3, S.V. Shaposhnikov 1 The goal of this work is to prove the existence of a solution to

More information

Karhunen-Loève decomposition of Gaussian measures on Banach spaces

Karhunen-Loève decomposition of Gaussian measures on Banach spaces Karhunen-Loève decomposition of Gaussian measures on Banach spaces Jean-Charles Croix GT APSSE - April 2017, the 13th joint work with Xavier Bay. 1 / 29 Sommaire 1 Preliminaries on Gaussian processes 2

More information

On the Baum-Connes conjecture for Gromov monster groups

On the Baum-Connes conjecture for Gromov monster groups On the Baum-Connes conjecture for Gromov monster groups Martin Finn-Sell Fakultät für Mathematik, Oskar-Morgenstern-Platz 1, 1090 Wien martin.finn-sell@univie.ac.at June 2015 Abstract We present a geometric

More information