Geometry on Probability Spaces


Steve Smale
Toyota Technological Institute at Chicago
1427 East 60th Street, Chicago, IL 60637, USA

Ding-Xuan Zhou
Department of Mathematics, City University of Hong Kong
Tat Chee Avenue, Kowloon, Hong Kong, CHINA

December 4, 2007

Preliminary Report

1 Introduction

Partial differential equations and the Laplacian operator on domains in Euclidean spaces have played a central role in understanding natural phenomena. However, this avenue has been limited in many areas where calculus is obstructed, as in singular spaces, and in spaces of functions on a space $X$ where $X$ itself is a function space. Examples of the last occur in vision and quantum field theory. In vision it would be useful to do analysis on the space of images, and an image is a function on a patch.

Moreover, in analysis and geometry, the Lebesgue measure and its counterpart on manifolds are central. These measures are unavailable in the vision example and even in learning theory in general.

There is one situation where, in the last several decades, the problem has been studied with some success: when the underlying space is finite (or even discrete). The introduction of the graph Laplacian has been a major development in algorithm research and is certainly useful for unsupervised learning theory.

The point of view taken here is to benefit from both the classical research and the newer graph-theoretic ideas to develop geometry on probability spaces. This starts with a space $X$ equipped with a kernel (like a Mercer kernel) which gives a topology and geometry; $X$ is to be equipped as well with a probability measure. The main focus is on a construction of a (normalized) Laplacian, an associated heat equation, diffusion distance, etc. In this setting the point estimates of calculus are replaced by integral quantities; one thinks of secants rather than tangents. Our main result, Theorem 1 below, bounds the error of an empirical approximation to this Laplacian on $X$.

2 Kernels and Metric Spaces

Let $X$ be a set and $K : X \times X \to \mathbb{R}$ be a reproducing kernel, that is, $K$ is symmetric and for any finite set of points $\{x_1, \ldots, x_l\} \subset X$, the matrix $(K(x_i, x_j))_{i,j=1}^{l}$ is positive semidefinite. The Reproducing Kernel Hilbert Space (RKHS) $\mathcal{H}_K$ associated with the kernel $K$ is defined to be the completion of the linear span of the set of functions $\{K_x = K(x, \cdot) : x \in X\}$ with the inner product $\langle\cdot,\cdot\rangle_K$ satisfying $\langle K_x, K_y\rangle_K = K(x, y)$. See [1, 10, 11].

We assume that the feature map $F : X \to \mathcal{H}_K$ mapping $x \in X$ to $K_x \in \mathcal{H}_K$ is injective.

Definition 1. Define a metric $d = d_K$ induced by $K$ on $X$ as $d(x, y) = \|K_x - K_y\|_K$.

We assume that $(X, d)$ is complete and separable (if it is not complete, we may complete it). The feature map is an isometry. Observe that

$$K(x, y) = \langle K_x, K_y\rangle_K = \frac{1}{2}\Big\{\langle K_x, K_x\rangle_K + \langle K_y, K_y\rangle_K - (d(x, y))^2\Big\}.$$

Then $K$ is continuous on $X \times X$, and hence is a Mercer kernel. Let $\rho_X$ be a Borel probability measure on the metric space $X$.

Example 1. Let $X = \mathbb{R}^n$ and $K$ be the reproducing kernel $K(x, y) = \langle x, y\rangle_{\mathbb{R}^n}$. Then $d_K(x, y) = \|x - y\|_{\mathbb{R}^n}$ and $\mathcal{H}_K$ is the space of linear functions.

Example 2. Let $X$ be a closed subset of $\mathbb{R}^n$ and $K$ be the kernel given in the above example restricted to $X \times X$. Then $X$ is complete and separable, and $K$ is a Mercer kernel. Moreover, one may take $\rho_X$ to be any Borel probability measure on $X$.
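The metric of Definition 1 is directly computable from the kernel. The following is a minimal numerical sketch (not from the paper; the Gaussian kernel and the two sample points are illustrative choices) of the kernel-induced metric $d_K$, together with a check that the linear kernel of Example 1 recovers the Euclidean distance.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    # K(x, y) = exp(-|x - y|^2 / (2 sigma^2)), a Mercer kernel on R^n
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

def d_K(x, y, kernel):
    # d(x, y) = ||K_x - K_y||_K = sqrt(K(x,x) + K(y,y) - 2 K(x,y))
    val = kernel(x, x) + kernel(y, y) - 2 * kernel(x, y)
    return np.sqrt(max(val, 0.0))          # clamp tiny negative round-off

x, y = np.array([0.0, 0.0]), np.array([1.0, 2.0])
print(d_K(x, y, gaussian_kernel))           # distance induced by the Gaussian kernel
# Example 1: the linear kernel K(x, y) = <x, y> induces the Euclidean distance.
print(d_K(x, y, lambda a, b: float(a @ b)), np.linalg.norm(x - y))
```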

Remark 1. In Examples 1 and 2, one can replace $\mathbb{R}^n$ by a separable Hilbert space.

Example 3. In Example 2, let $\mathbf{x} = \{x_i\}_{i=1}^m$ be a finite subset of $X$ drawn from $\rho_X$. Then the restriction $K|_{\mathbf{x}}$ is a Mercer kernel on $\mathbf{x}$ and it is natural to take $\rho_{\mathbf{x}}$ to be the uniform measure on $\mathbf{x}$.

3 Approximation of Integral Operators

The integral operator $L_K : L^2_{\rho_X} \to \mathcal{H}_K$ associated with the pair $(K, \rho_X)$ is defined by

$$L_K(f)(x) := \int_X K(x, y) f(y)\, d\rho_X(y), \qquad x \in X. \qquad (3.1)$$

It may be considered as a self-adjoint operator on $L^2_{\rho_X}$ or $\mathcal{H}_K$. See [10, 11]. We shall use the same notation $L_K$ for these operators defined on different domains.

Assume that $\mathbf{x} := \{x_i\}_{i=1}^m$ is a sample independently drawn according to $\rho_X$. Recall the sampling operator $S_{\mathbf{x}} : \mathcal{H}_K \to \ell^2(\mathbf{x})$ associated with $\mathbf{x}$ defined [17, 18] by $S_{\mathbf{x}}(f) = \big(f(x_i)\big)_{i=1}^m$. By the reproducing property of $\mathcal{H}_K$, taking the form $\langle K_x, f\rangle_K = f(x)$ with $x \in X$, $f \in \mathcal{H}_K$, the adjoint of the sampling operator, $S_{\mathbf{x}}^T : \ell^2(\mathbf{x}) \to \mathcal{H}_K$, is given by

$$S_{\mathbf{x}}^T c = \sum_{i=1}^m c_i K_{x_i}, \qquad c \in \ell^2(\mathbf{x}).$$

As the sample size $m$ tends to infinity, the operator $\frac{1}{m} S_{\mathbf{x}}^T S_{\mathbf{x}}$ converges to the integral operator $L_K$. We prove the following convergence rate.

Proposition 1. Assume

$$\kappa := \sup_{x \in X} \sqrt{K(x, x)} < \infty. \qquad (3.2)$$

Let $\mathbf{x}$ be a sample independently drawn from $(X, \rho_X)$. With confidence $1 - \delta$, we have

$$\Big\|\frac{1}{m} S_{\mathbf{x}}^T S_{\mathbf{x}} - L_K\Big\|_{\mathcal{H}_K \to \mathcal{H}_K} \le \frac{4\kappa^2 \log(2/\delta)}{\sqrt{m}}. \qquad (3.3)$$

Proof. The proof follows [18]. We need a probability inequality [15, 18] for a random variable $\xi$ on $(X, \rho_X)$ with values in a Hilbert space $(H, \|\cdot\|)$. It says that if $\|\xi\| \le M < \infty$ almost surely, then with confidence $1 - \delta$,

$$\Big\|\frac{1}{m}\sum_{i=1}^m \big[\xi(x_i) - E(\xi)\big]\Big\| \le \frac{4 M \log(2/\delta)}{\sqrt{m}}. \qquad (3.4)$$

We apply this probability inequality to the separable Hilbert space $H$ of Hilbert–Schmidt operators on $\mathcal{H}_K$, denoted as $HS(\mathcal{H}_K)$. The inner product for $A_1, A_2 \in HS(\mathcal{H}_K)$ is defined as $\langle A_1, A_2\rangle_{HS} = \sum_j \langle A_1 e_j, A_2 e_j\rangle_K$, where $\{e_j\}_j$ is an orthonormal basis of $\mathcal{H}_K$. The space $HS(\mathcal{H}_K)$ is the subspace of the space of bounded linear operators on $\mathcal{H}_K$ consisting of those $A$ with $\|A\|_{HS} < \infty$.

The random variable $\xi$ on $(X, \rho_X)$ in (3.4) is given by $\xi(x) = \langle \cdot, K_x\rangle_K K_x$, $x \in X$. The values of $\xi$ are rank-one operators in $HS(\mathcal{H}_K)$. For each $x \in X$, if we choose an orthonormal basis $\{e_j\}_j$ of $\mathcal{H}_K$ with $e_1 = K_x/\sqrt{K(x, x)}$, we see that

$$\|\xi(x)\|_{HS}^2 = \sum_j \|\xi(x) e_j\|_K^2 = (K(x, x))^2 \le \kappa^4.$$

Observe that $\frac{1}{m} S_{\mathbf{x}}^T S_{\mathbf{x}} = \frac{1}{m}\sum_{i=1}^m \langle \cdot, K_{x_i}\rangle_K K_{x_i} = \frac{1}{m}\sum_{i=1}^m \xi(x_i)$ and $E(\xi) = L_K$. Then the desired bound (3.3) follows from (3.4) with $M = \kappa^2$ and the norm relation

$$\|A\|_{\mathcal{H}_K \to \mathcal{H}_K} \le \|A\|_{HS}. \qquad (3.5)$$

This proves the proposition.

Some analysis related to Proposition 1 can be found in [19, 14, 5, 12].

Definition 2. Denote $p = \int_X K_x\, d\rho_X \in \mathcal{H}_K$. We assume that $p$ is positive on $X$. The normalized Mercer kernel on $X$ associated with the pair $(K, \rho_X)$ is defined by

$$\tilde K(x, y) = \frac{K(x, y)}{\sqrt{p(x)}\,\sqrt{p(y)}}, \qquad x, y \in X. \qquad (3.6)$$

This construction is in [19], except that we have replaced their positive weighting function by a Mercer kernel. In the following, $\tilde{\ }$ means expressions with respect to the normalized kernel $\tilde K$; in particular, $\tilde K_{\mathbf{x}} := \big(\tilde K(x_i, x_j)\big)_{i,j=1}^m$.
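To make these objects concrete, here is a small numerical sketch (not part of the paper; the Gaussian kernel, the uniform sample, and the plug-in estimate of $p$ are assumptions made only for the example): it forms the Gram matrix, replaces $p(x_i)$ by its empirical counterpart $\frac{1}{m}\sum_l K(x_i, x_l)$, and assembles the resulting normalized kernel matrix, which is exactly the matrix $\hat K_{\mathbf{x}}$ appearing later in (6.2).

```python
import numpy as np

rng = np.random.default_rng(0)
m, sigma = 200, 0.5
X = rng.uniform(-1.0, 1.0, size=(m, 2))           # sample x = {x_i} drawn from rho_X

# Gram matrix K_x = (K(x_i, x_j))_{i,j=1}^m for the Gaussian kernel
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
Kx = np.exp(-sq / (2 * sigma ** 2))

# Plug-in estimate of p(x_i) = int_X K(x_i, y) d rho_X(y)
p_hat = Kx.mean(axis=1)

# Empirical counterpart of the normalized kernel (3.6), with p replaced by p_hat
Kx_hat = Kx / np.sqrt(np.outer(p_hat, p_hat))

# Leading eigenvalues of (1/m) * Kx_hat; for this nonnegative kernel the top one equals 1,
# with eigenvector proportional to sqrt(p_hat)
eigvals = np.linalg.eigvalsh(Kx_hat / m)[::-1]
print(eigvals[:5])
```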

Consider the RKHS $\mathcal{H}_{\tilde K}$ and the integral operator $L_{\tilde K} : \mathcal{H}_{\tilde K} \to \mathcal{H}_{\tilde K}$ associated with the normalized Mercer kernel $\tilde K$. Denote the sampling operator $\mathcal{H}_{\tilde K} \to \ell^2(\mathbf{x})$ associated with the pair $(\tilde K, \mathbf{x})$ as $\tilde S_{\mathbf{x}}$. If we denote $\mathcal{H}_{\tilde K, \mathbf{x}} = \mathrm{span}\{\tilde K_{x_i}\}_{i=1}^m$ in $\mathcal{H}_{\tilde K}$ and its orthogonal complement as $\mathcal{H}_{\tilde K, \mathbf{x}}^{\perp}$, then we see that $\tilde S_{\mathbf{x}}(f) = 0$ for any $f \in \mathcal{H}_{\tilde K, \mathbf{x}}^{\perp}$. Hence $\tilde S_{\mathbf{x}}^T \tilde S_{\mathbf{x}}\big|_{\mathcal{H}_{\tilde K, \mathbf{x}}^{\perp}} = 0$. Moreover, $\tilde K_{\mathbf{x}}$ is the matrix representation of the operator $\tilde S_{\mathbf{x}}^T \tilde S_{\mathbf{x}}\big|_{\mathcal{H}_{\tilde K, \mathbf{x}}}$.

Proposition 1, with $\tilde K$ replacing $K$, yields the following bound.

Theorem 1. Denote $p_0 = \min_{x \in X} p(x)$. For any $0 < \delta < 1$, with confidence $1 - \delta$,

$$\Big\|L_{\tilde K} - \frac{1}{m}\tilde S_{\mathbf{x}}^T \tilde S_{\mathbf{x}}\Big\|_{\mathcal{H}_{\tilde K} \to \mathcal{H}_{\tilde K}} \le \frac{4\kappa^2 \log(2/\delta)}{p_0\sqrt{m}}. \qquad (3.7)$$

Note that $\frac{1}{m}\tilde S_{\mathbf{x}}^T \tilde S_{\mathbf{x}}$ is given from the data $\mathbf{x}$ by a linear algorithm [16]. So Theorem 1 is an error estimate for that algorithm.

The analysis in [19] deeply involves the constant $\min_{x,y \in X} \tilde K(x, y)$. In the case of the Gaussian kernel $K^{\sigma}(x, y) = \exp\{-\frac{|x-y|^2}{2\sigma^2}\}$, this constant decays exponentially fast as the variance $\sigma$ of the Gaussian becomes small, while in Theorem 1 the constant $p_0$ decays polynomially fast, at least in the manifold setting [20].

4 Tame Heat Equation

Define a tame version of the Laplacian as $\Delta_{\tilde K} = I - L_{\tilde K}$. The tame heat equation takes the form

$$\frac{du}{dt} = -\Delta_{\tilde K}\, u. \qquad (4.1)$$

One can see that the solution to (4.1) with the initial condition $u(0) = u_0$ is given by

$$u(t) = \exp\{-t \Delta_{\tilde K}\}\, u_0. \qquad (4.2)$$
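As a finite-dimensional illustration of (4.1)–(4.2) (a sketch only: the empirical matrix below merely stands in for $L_{\tilde K}$, and the Gaussian kernel and uniform sample are assumptions of the example), one can solve the tame heat equation by the matrix exponential and check it against the spectral formula used in the next section.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
m, sigma = 150, 0.5
X = rng.uniform(-1.0, 1.0, size=(m, 2))
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
Kx = np.exp(-sq / (2 * sigma ** 2))
p_hat = Kx.mean(axis=1)
L_emp = (Kx / np.sqrt(np.outer(p_hat, p_hat))) / m   # empirical stand-in for L_{tilde K}

Delta = np.eye(m) - L_emp                 # tame Laplacian  Delta = I - L
u0 = rng.standard_normal(m)               # initial condition u(0) = u_0
t = 5.0
u_t = expm(-t * Delta) @ u0               # u(t) = exp(-t * Delta) u_0, cf. (4.2)

# Spectral form: u(t) = sum_i exp(-t (1 - lambda_i)) <u_0, phi_i> phi_i
lam, Phi = np.linalg.eigh(L_emp)
u_t_spec = Phi @ (np.exp(-t * (1.0 - lam)) * (Phi.T @ u0))
print(np.allclose(u_t, u_t_spec))         # True: the two expressions agree
```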

5 Diffusion Distances

Let $1 = \lambda_1 \ge \lambda_2 \ge \cdots \ge 0$ be the eigenvalues of $L_{\tilde K} : L^2_{\rho_X} \to L^2_{\rho_X}$ and $\{\phi^{(i)}\}_i$ be corresponding normalized eigenfunctions, which form an orthonormal basis of $L^2_{\rho_X}$. The Mercer Theorem asserts that

$$\tilde K(x, y) = \sum_{i=1}^{\infty} \lambda_i\, \phi^{(i)}(x)\,\phi^{(i)}(y),$$

where the convergence holds in $L^2_{\rho_X}$ and uniformly.

Definition 3. For $t > 0$ we define a reproducing kernel on $X$ as

$$\tilde K^t(x, y) = \sum_{i=1}^{\infty} \lambda_i^t\, \phi^{(i)}(x)\,\phi^{(i)}(y). \qquad (5.1)$$

Then the tame diffusion distance $D_t$ on $X$ is defined as $d_{\tilde K^t}$, that is,

$$D_t(x, y) = \big\|\tilde K^t_x - \tilde K^t_y\big\|_{\tilde K^t} = \Big\{\sum_{i=1}^{\infty} \lambda_i^t \big[\phi^{(i)}(x) - \phi^{(i)}(y)\big]^2\Big\}^{1/2}. \qquad (5.2)$$

The kernel $\tilde K^t$ may be interpreted as a tame version of the heat kernel on a manifold. The functions $(\tilde K^t)_x$ are in $\mathcal{H}_{\tilde K}$ for $t \ge 1$, but only in $L^2_{\rho_X}$ for $t > 0$. The quantity corresponding to $D_t$ for the classical heat kernel and other varieties has been developed by Coifman et al. [8, 9], where it is used in data analysis and more. Note that $\tilde K^t$ is continuous for $t \ge 1$, in contrast to the classical kernel (Green's function), which is of course singular on the diagonal of $X \times X$.

The integral operator $L_{\tilde K^t}$ associated with the reproducing kernel $\tilde K^t$ is the $t$-th power of the positive operator $L_{\tilde K}$.

Note that for $t = 1$, the distance $D_1$ is the one induced by the kernel $\tilde K$. When $t$ becomes large, the effect of the eigenfunctions $\phi^{(i)}$ with small eigenvalues $\lambda_i$ is little. In particular, when $\lambda_1$ has multiplicity $1$, since $\phi^{(1)} = \sqrt{p}$ we have

$$\lim_{t \to \infty} D_t(x, y) = \big|\sqrt{p(x)} - \sqrt{p(y)}\big|.$$

In the same way, since the tame Laplacian $\Delta_{\tilde K}$ has eigenpairs $(1 - \lambda_i, \phi^{(i)})$, we know that the solution (4.2) to the tame heat equation (4.1) with the initial condition $u(0) = u_0 \in L^2_{\rho_X}$ satisfies

$$\lim_{t \to \infty} u(t) = \langle u_0, \sqrt{p}\rangle_{L^2_{\rho_X}}\, \sqrt{p}.$$
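On a sample, the diffusion distance (5.2) can be evaluated from an eigendecomposition. The sketch below is an empirical surrogate, not the construction of the paper itself: eigenvectors of the normalized kernel matrix stand in for the eigenfunction values $\phi^{(i)}(x_j)$, and the Gaussian kernel, the sample, and $t = 4$ are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
m, sigma, t = 150, 0.5, 4
X = rng.uniform(-1.0, 1.0, size=(m, 2))
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
Kx = np.exp(-sq / (2 * sigma ** 2))
p_hat = Kx.mean(axis=1)
L_emp = (Kx / np.sqrt(np.outer(p_hat, p_hat))) / m   # empirical stand-in for L_{tilde K}

lam, Phi = np.linalg.eigh(L_emp)          # columns of Phi: orthonormal eigenvectors
lam = np.clip(lam, 0.0, None)             # guard against tiny negative round-off

def diffusion_distance(i, j):
    # D_t(x_i, x_j)^2 = sum_k lambda_k^t (phi_k(x_i) - phi_k(x_j))^2, cf. (5.2)
    diff = Phi[i, :] - Phi[j, :]
    return float(np.sqrt(np.sum(lam ** t * diff ** 2)))

print(diffusion_distance(0, 1), diffusion_distance(0, 2))
```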

6 Graph Laplacian

Consider the reproducing kernel $K$ and the finite subset $\mathbf{x} = \{x_i\}_{i=1}^m$ of $X$.

Definition 4. Let $K_{\mathbf{x}}$ be the symmetric matrix $K_{\mathbf{x}} = (K(x_i, x_j))_{i,j=1}^m$ and $D_{\mathbf{x}}$ be the diagonal matrix with $(D_{\mathbf{x}})_{i,i} = \sum_{j=1}^m (K_{\mathbf{x}})_{i,j} = \sum_{j=1}^m K(x_i, x_j)$. Define a discrete Laplacian matrix $L = L_{\mathbf{x}} = D_{\mathbf{x}} - K_{\mathbf{x}}$. The normalized discrete Laplacian is defined by

$$\Delta_{K, \mathbf{x}} = I - D_{\mathbf{x}}^{-1/2} K_{\mathbf{x}} D_{\mathbf{x}}^{-1/2} = I - \frac{1}{m}\hat K_{\mathbf{x}}, \qquad (6.1)$$

where $\hat K_{\mathbf{x}}$ is the special case of the matrix $\tilde K_{\mathbf{x}}$ when $\rho_X$ is the uniform distribution on the finite set $\mathbf{x}$, that is,

$$\hat K_{\mathbf{x}} := \left(\frac{K(x_i, x_j)}{\sqrt{\frac{1}{m}\sum_{l=1}^m K(x_i, x_l)}\;\sqrt{\frac{1}{m}\sum_{l=1}^m K(x_j, x_l)}}\right)_{i,j=1}^m. \qquad (6.2)$$

The above construction is the same as that for graph Laplacians [7]. See [2] for a discussion in learning theory. But the setting here is different: the entries $K(x_i, x_j)$ are induced by a reproducing kernel, they might take negative values while satisfying a positive definiteness condition, and they are different from weights of a graph or entries of an adjacency matrix.

In the following, $\hat{\ }$ means expressions taken with respect to the sample $\mathbf{x}$ (that is, with $\rho_X$ replaced by the uniform measure on $\mathbf{x}$).
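Definition 4 is directly computable from the Gram matrix. A short sketch (Gaussian kernel and sample are again illustrative assumptions) builds $L_{\mathbf{x}}$ and $\Delta_{K,\mathbf{x}}$ and verifies the identity $I - D_{\mathbf{x}}^{-1/2} K_{\mathbf{x}} D_{\mathbf{x}}^{-1/2} = I - \frac{1}{m}\hat K_{\mathbf{x}}$ stated in (6.1)–(6.2).

```python
import numpy as np

rng = np.random.default_rng(3)
m, sigma = 100, 0.5
X = rng.uniform(-1.0, 1.0, size=(m, 2))
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
Kx = np.exp(-sq / (2 * sigma ** 2))                 # K_x = (K(x_i, x_j))

d = Kx.sum(axis=1)                                  # (D_x)_{ii} = sum_j K(x_i, x_j)
L = np.diag(d) - Kx                                 # discrete Laplacian  L_x = D_x - K_x

D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
Delta = np.eye(m) - D_inv_sqrt @ Kx @ D_inv_sqrt    # normalized discrete Laplacian (6.1)

# Same matrix written through hat-K_x of (6.2), normalized by (1/m) sum_l K(x_i, x_l)
p_hat = d / m
Kx_hat = Kx / np.sqrt(np.outer(p_hat, p_hat))
print(np.allclose(Delta, np.eye(m) - Kx_hat / m))   # True
```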

7 Approximation of Eigenfunctions

In this section we study the approximation of eigenfunctions from the bound for the approximation of linear operators. We need the following result, which follows from the general discussion on perturbation of eigenvalues and eigenvectors; see e.g. [4]. For completeness, we give a detailed proof for the approximation of eigenvectors.

Proposition 2. Let $A$ and $\hat A$ be two compact positive self-adjoint operators on a Hilbert space $H$, with eigenvalues $\{\lambda_j\}$ and $\{\hat\lambda_j\}$ arranged in nonincreasing order and counted with multiplicity. Then there holds

$$\max_j |\lambda_j - \hat\lambda_j| \le \|A - \hat A\|. \qquad (7.1)$$

Let $w_k$ be a normalized eigenvector of $A$ associated with the eigenvalue $\lambda_k$. If $r > 0$ satisfies

$$\lambda_{k-1} - \lambda_k \ge r, \qquad \lambda_k - \lambda_{k+1} \ge r, \qquad \|A - \hat A\| \le \frac{r}{2}, \qquad (7.2)$$

then

$$\|w_k - \hat w_k\| \le \frac{4}{r}\,\|A - \hat A\|,$$

where $\hat w_k$ is a normalized eigenvector of $\hat A$ associated with the eigenvalue $\hat\lambda_k$.

Proof. The eigenvalue estimate (7.1) follows from Weyl's Perturbation Theorem (see e.g. [4]).

Let $\{w_j\}_j$ be an orthonormal basis of $H$ consisting of eigenvectors of $A$ associated with the eigenvalues $\{\lambda_j\}$. Then $\hat w_k = \sum_j \alpha_j w_j$ where $\alpha_j = \langle \hat w_k, w_j\rangle$; we may choose the sign of $\hat w_k$ so that $\alpha_k \ge 0$. Consider $\hat A \hat w_k - A \hat w_k$. It can be expressed as

$$\hat A \hat w_k - A \hat w_k = \hat\lambda_k \hat w_k - \sum_j \alpha_j A w_j = \sum_j \alpha_j \big(\hat\lambda_k - \lambda_j\big) w_j.$$

When $j \ne k$, we see from (7.2) and (7.1) that

$$|\hat\lambda_k - \lambda_j| \ge |\lambda_k - \lambda_j| - |\hat\lambda_k - \lambda_k| \ge \min\{\lambda_{k-1} - \lambda_k,\, \lambda_k - \lambda_{k+1}\} - \|A - \hat A\| \ge \frac{r}{2}.$$

Hence

$$\|\hat A \hat w_k - A \hat w_k\|^2 = \sum_j \alpha_j^2 \big(\hat\lambda_k - \lambda_j\big)^2 \ge \sum_{j \ne k} \alpha_j^2 \big(\hat\lambda_k - \lambda_j\big)^2 \ge \frac{r^2}{4}\sum_{j \ne k} \alpha_j^2.$$

But $\|\hat A \hat w_k - A \hat w_k\| \le \|\hat A - A\|$. It follows that

$$\Big\{\sum_{j \ne k} \alpha_j^2\Big\}^{1/2} \le \frac{2}{r}\,\|\hat A - A\|.$$

From $\|\hat w_k\|^2 = \sum_j \alpha_j^2 = 1$, we also have

$$1 - \alpha_k \le 1 - \alpha_k^2 = \sum_{j \ne k} \alpha_j^2 \le \Big\{\sum_{j \ne k} \alpha_j^2\Big\}^{1/2}.$$

Therefore,

$$\|w_k - \hat w_k\| = \big\|(1 - \alpha_k) w_k + (\alpha_k w_k - \hat w_k)\big\| = \Big\|(1 - \alpha_k) w_k - \sum_{j \ne k}\alpha_j w_j\Big\| \le |1 - \alpha_k| + \Big\|\sum_{j \ne k}\alpha_j w_j\Big\| \le 2\Big\{\sum_{j \ne k}\alpha_j^2\Big\}^{1/2} \le \frac{4}{r}\,\|\hat A - A\|.$$

This proves the desired bound.
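Proposition 2 can be checked numerically on small matrices. The following sanity check is not from the paper: a random positive semidefinite matrix plays the role of $A$, a small symmetric perturbation gives $\hat A$, and the Weyl bound (7.1) and the eigenvector bound are verified for one index $k$.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 8, 2                                    # track the k-th largest eigenvalue/eigenvector

B = rng.standard_normal((n, n))
A = B @ B.T / n                                # positive semidefinite, self-adjoint
E = rng.standard_normal((n, n))
A_hat = A + 1e-3 * (E + E.T) / 2               # perturbed operator

lam, W = np.linalg.eigh(A)
lam, W = lam[::-1], W[:, ::-1]                 # nonincreasing order
lam_h, W_h = np.linalg.eigh(A_hat)
lam_h, W_h = lam_h[::-1], W_h[:, ::-1]

gap = min(lam[k - 2] - lam[k - 1], lam[k - 1] - lam[k])      # r in (7.2)
diff = np.linalg.norm(A - A_hat, 2)                          # operator norm ||A - A_hat||

w, w_hat = W[:, k - 1], W_h[:, k - 1]
w_hat = np.sign(w @ w_hat) * w_hat             # fix the sign, as in the proof (alpha_k >= 0)

print(np.max(np.abs(lam - lam_h)) <= diff + 1e-12)           # Weyl bound (7.1)
if diff <= gap / 2:                                           # condition (7.2)
    print(np.linalg.norm(w - w_hat) <= 4 * diff / gap + 1e-12)
```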

An immediate easy corollary of Proposition 2 and Theorem 1 concerns the approximation of eigenfunctions of $L_{\tilde K}$ by those of $\frac{1}{m}\tilde S_{\mathbf{x}}^T \tilde S_{\mathbf{x}}$.

Recall the eigenpairs $\{\lambda_i, \phi^{(i)}\}_i$ of the compact self-adjoint positive operator $L_{\tilde K}$ on $L^2_{\rho_X}$. Observe that $\|\phi^{(k)}\|_{L^2_{\rho_X}} = 1$, but

$$\|\phi^{(k)}\|_{\tilde K}^2 = \frac{1}{\lambda_k}\langle L_{\tilde K}\phi^{(k)}, \phi^{(k)}\rangle_{\tilde K} = \frac{1}{\lambda_k}\,\|\phi^{(k)}\|_{L^2_{\rho_X}}^2 = \frac{1}{\lambda_k}.$$

So $\sqrt{\lambda_k}\,\phi^{(k)}$ is a normalized eigenfunction of $L_{\tilde K}$ on $\mathcal{H}_{\tilde K}$ associated with the eigenvalue $\lambda_k$. Also, $\frac{1}{m}\tilde S_{\mathbf{x}}^T \tilde S_{\mathbf{x}}\big|_{\mathcal{H}_{\tilde K, \mathbf{x}}^{\perp}} = 0$ and $\frac{1}{m}\tilde K_{\mathbf{x}}$ is the matrix representation of $\frac{1}{m}\tilde S_{\mathbf{x}}^T \tilde S_{\mathbf{x}}\big|_{\mathcal{H}_{\tilde K, \mathbf{x}}}$.

Corollary 1. Let $k \in \{1, \ldots, m\}$ be such that $r := \min\{\lambda_{k-1} - \lambda_k,\, \lambda_k - \lambda_{k+1}\} > 0$. If $0 < \delta < 1$ and $m \in \mathbb{N}$ satisfy

$$\frac{4\kappa^2\log(4/\delta)}{p_0\sqrt{m}} \le \frac{r}{2},$$

then with confidence $1 - \delta$ we have

$$\big\|\sqrt{\lambda_k}\,\phi^{(k)} - \Phi^{(k)}\big\|_{\tilde K} \le \frac{16\,\kappa^2\log(4/\delta)}{r\, p_0\sqrt{m}},$$

where $\Phi^{(k)}$ is a normalized eigenfunction of $\frac{1}{m}\tilde S_{\mathbf{x}}^T \tilde S_{\mathbf{x}}$ associated with its $k$-th largest eigenvalue $\tilde\lambda_{k,\mathbf{x}}$. Moreover, $\tilde\lambda_{k,\mathbf{x}}$ is the $k$-th largest eigenvalue of the matrix $\frac{1}{m}\tilde K_{\mathbf{x}}$, and with an associated normalized eigenvector $v$ we have $\Phi^{(k)} = \frac{1}{\sqrt{m\,\tilde\lambda_{k,\mathbf{x}}}}\,\tilde S_{\mathbf{x}}^T v$.

8 Algorithmic Issue for Approximating Eigenfunctions

Corollary 1 provides a quantitative understanding of the approximation of eigenfunctions of $L_{\tilde K}$ by those of the operator $\frac{1}{m}\tilde S_{\mathbf{x}}^T \tilde S_{\mathbf{x}}$. The matrix representation $\frac{1}{m}\tilde K_{\mathbf{x}}$ is given by the kernel $\tilde K$, which involves $K$ and the function $p$ defined by $\rho_X$. Since $\rho_X$ is unknown algorithmically, we want to study the approximation of eigenfunctions by methods involving only $K$ and the sample $\mathbf{x}$. This is done here.

We shall apply Proposition 2 to the operator $A = L_{\tilde K}$ and an operator $\hat A$ defined by means of the matrix $\hat K_{\mathbf{x}}$ given by $(K, \mathbf{x})$ in (6.2), and derive estimates for the approximation of eigenfunctions.

Let $\hat\lambda_{1,\mathbf{x}} \ge \hat\lambda_{2,\mathbf{x}} \ge \cdots \ge \hat\lambda_{m,\mathbf{x}} \ge 0$ be the eigenvalues of the matrix $\frac{1}{m}\hat K_{\mathbf{x}}$.
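The computable part of this program is the eigendecomposition of $\frac{1}{m}\hat K_{\mathbf{x}}$, which uses only $K$ and the sample. A minimal sketch follows (the Gaussian kernel and uniform sample are illustrative assumptions, not choices made in the paper); Theorem 2 below quantifies how these empirical eigenpairs relate to those of $L_{\tilde K}$.

```python
import numpy as np

rng = np.random.default_rng(5)
m, sigma, k = 300, 0.5, 2
X = rng.uniform(-1.0, 1.0, size=(m, 2))                  # the sample x: all that is available
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
Kx = np.exp(-sq / (2 * sigma ** 2))

# Matrix hat-K_x of (6.2): built from K and the sample alone, without knowledge of rho_X
p_hat = Kx.mean(axis=1)
Kx_hat = Kx / np.sqrt(np.outer(p_hat, p_hat))

lam_hat, V = np.linalg.eigh(Kx_hat / m)
lam_hat, V = lam_hat[::-1], V[:, ::-1]                   # hat-lambda_{1,x} >= ... >= hat-lambda_{m,x}
v = V[:, k - 1]                                          # normalized eigenvector for hat-lambda_{k,x}
print(lam_hat[:4])

# The eigenfunction estimate of Theorem 2 is the combination sum_i v_i * Ktilde_{x_i},
# suitably rescaled, and 1 - lam_hat[k-1] approximates the corresponding eigenvalue of
# the tame Laplacian (Remark 2).
```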

Theorem 2. Let $k \in \{1, \ldots, m\}$ be such that $r := \min\{\lambda_{k-1} - \lambda_k,\, \lambda_k - \lambda_{k+1}\} > 0$. If $0 < \delta < 1$ and $m \in \mathbb{N}$ satisfy

$$\frac{4\kappa^2\log(4/\delta)}{p_0\sqrt{m}} \le \frac{r}{8}, \qquad (8.1)$$

then with confidence $1 - \delta$ we have

$$\Big\|\sqrt{\lambda_k}\,\phi^{(k)} - \frac{1}{\sqrt{\lambda_k\, m}}\,\tilde S_{\mathbf{x}}^T v\Big\|_{\tilde K} \le \frac{72\,\kappa^2\log(4/\delta)}{r\, p_0\sqrt{m}}, \qquad (8.2)$$

where $v \in \ell^2(\mathbf{x})$ is a normalized eigenvector of the matrix $\frac{1}{m}\hat K_{\mathbf{x}}$ with eigenvalue $\hat\lambda_{k,\mathbf{x}}$.

Remark 2. Since $\Delta_{\tilde K} = I - L_{\tilde K}$ and $\Delta_{K,\mathbf{x}} = I - D_{\mathbf{x}}^{-1/2} K_{\mathbf{x}} D_{\mathbf{x}}^{-1/2} = I - \frac{1}{m}\hat K_{\mathbf{x}}$, Theorem 2 provides error bounds for the approximation of eigenfunctions of the tame Laplacian $\Delta_{\tilde K}$ by those of the normalized discrete Laplacian $\Delta_{K,\mathbf{x}}$. Note that $\tilde S_{\mathbf{x}}^T v = \sum_{i=1}^m v_i \tilde K_{x_i}$.

We need some preliminary estimates for the approximation of linear operators. Applying the probability inequality (3.4) to the random variable $\xi$ on $(X, \rho_X)$ with values in the Hilbert space $\mathcal{H}_K$ given by $\xi(x) = K_x$, we have

Lemma 1. Denote $\hat p_{\mathbf{x}} = \frac{1}{m}\sum_{j=1}^m K_{x_j} \in \mathcal{H}_K$. With confidence $1 - \delta/2$, we have

$$\|\hat p_{\mathbf{x}} - p\|_K \le \frac{4\kappa\log(4/\delta)}{\sqrt{m}}. \qquad (8.3)$$

The error bound (8.3) implies

$$\max_{i=1,\ldots,m}\big|\hat p_{\mathbf{x}}(x_i) - p(x_i)\big| \le \frac{4\kappa^2\log(4/\delta)}{\sqrt{m}}. \qquad (8.4)$$

This yields the following error bound between the matrices $\tilde K_{\mathbf{x}}$ and $\hat K_{\mathbf{x}}$.

Lemma 2. Let $\mathbf{x} \in X^m$. When (8.3) holds, we have

$$\Big\|\frac{1}{m}\tilde K_{\mathbf{x}} - \frac{1}{m}\hat K_{\mathbf{x}}\Big\| \le \Big(2 + \frac{4\kappa^2\log(4/\delta)}{p_0\sqrt{m}}\Big)\,\frac{4\kappa^2\log(4/\delta)}{p_0\sqrt{m}}. \qquad (8.5)$$

Proof. Let $\Sigma = \mathrm{diag}\{\Sigma_i\}_{i=1}^m$ with $\Sigma_i := \sqrt{\hat p_{\mathbf{x}}(x_i)}\big/\sqrt{p(x_i)}$. Then $\frac{1}{m}\tilde K_{\mathbf{x}} = \Sigma\,\frac{1}{m}\hat K_{\mathbf{x}}\,\Sigma$. Decompose

$$\frac{1}{m}\tilde K_{\mathbf{x}} - \frac{1}{m}\hat K_{\mathbf{x}} = \Sigma\,\frac{1}{m}\hat K_{\mathbf{x}}(\Sigma - I) + (\Sigma - I)\,\frac{1}{m}\hat K_{\mathbf{x}}.$$

We see that

$$\Big\|\frac{1}{m}\tilde K_{\mathbf{x}} - \frac{1}{m}\hat K_{\mathbf{x}}\Big\| \le \|\Sigma\|\,\Big\|\frac{1}{m}\hat K_{\mathbf{x}}\Big\|\,\|\Sigma - I\| + \|\Sigma - I\|\,\Big\|\frac{1}{m}\hat K_{\mathbf{x}}\Big\|.$$

Since $\frac{1}{m}\hat K_{\mathbf{x}} \preceq I$, we have $\|\frac{1}{m}\hat K_{\mathbf{x}}\| \le 1$. Also, for each $i$,

$$\frac{\big|\sqrt{\hat p_{\mathbf{x}}(x_i)} - \sqrt{p(x_i)}\big|}{\sqrt{p(x_i)}} \le \frac{\big|\hat p_{\mathbf{x}}(x_i) - p(x_i)\big|}{p(x_i)}.$$

Hence

$$\|\Sigma - I\| \le \max_{i=1,\ldots,m}\frac{\big|\hat p_{\mathbf{x}}(x_i) - p(x_i)\big|}{p(x_i)}.$$

This, in connection with (8.4) and $\|\Sigma\| \le 1 + \|\Sigma - I\|$, proves the statement.
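Lemma 1 compares the empirical function $\hat p_{\mathbf{x}}$ with $p$. A rough numerical illustration of (8.4) follows (not from the paper: a large independent sample is used as a Monte-Carlo stand-in for integration against $\rho_X$, and the kernel and distribution are the usual illustrative choices).

```python
import numpy as np

rng = np.random.default_rng(6)
sigma = 0.5

def kernel(A, B):
    sq = np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-np.clip(sq, 0.0, None) / (2 * sigma ** 2))

m = 400
X = rng.uniform(-1.0, 1.0, size=(m, 2))            # the sample of size m
Y = rng.uniform(-1.0, 1.0, size=(20000, 2))        # large sample standing in for rho_X

p_hat = kernel(X, X).mean(axis=1)                  # hat-p_x(x_i) = (1/m) sum_l K(x_i, x_l)
p_ref = kernel(X, Y).mean(axis=1)                  # Monte-Carlo reference value for p(x_i)

print(np.max(np.abs(p_hat - p_ref)))               # small, decaying like 1/sqrt(m), cf. (8.4)
```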

Now we can prove the estimates for the approximation of eigenfunctions.

Proof of Theorem 2. First, we define a linear operator which plays the role of $\hat A$ in Proposition 2. This is an operator, denoted $L_{\hat K, \mathbf{x}}$, on $\mathcal{H}_{\tilde K}$ such that $L_{\hat K, \mathbf{x}}(f) = 0$ for any $f \in \mathcal{H}_{\tilde K, \mathbf{x}}^{\perp}$, while $\mathcal{H}_{\tilde K, \mathbf{x}}$ is an invariant subspace of $L_{\hat K, \mathbf{x}}$ with matrix representation equal to the matrix $\frac{1}{m}\hat K_{\mathbf{x}}$ given by (6.2). That is, in $\mathcal{H}_{\tilde K, \mathbf{x}} \oplus \mathcal{H}_{\tilde K, \mathbf{x}}^{\perp}$, we have

$$L_{\hat K, \mathbf{x}}\big[\tilde K_{x_1}, \ldots, \tilde K_{x_m}\big] = \big[\tilde K_{x_1}, \ldots, \tilde K_{x_m}\big]\,\frac{1}{m}\hat K_{\mathbf{x}}, \qquad L_{\hat K, \mathbf{x}}\big|_{\mathcal{H}_{\tilde K, \mathbf{x}}^{\perp}} = 0.$$

Second, we consider the difference between the operators $L_{\tilde K}$ and $L_{\hat K, \mathbf{x}}$. By Theorem 1 (applied with $\delta$ replaced by $\delta/2$, so that $\log(2/\delta)$ becomes $\log(4/\delta)$), there exists a subset $X_1$ of $X^m$ of measure at least $1 - \delta/2$ such that (3.7) holds true for $\mathbf{x} \in X_1$. By Lemmas 1 and 2, we know that there exists another subset $X_2$ of $X^m$ of measure at least $1 - \delta/2$ such that (8.3) and (8.5) hold for $\mathbf{x} \in X_2$.

Recall that $\tilde S_{\mathbf{x}}^T \tilde S_{\mathbf{x}}\big|_{\mathcal{H}_{\tilde K, \mathbf{x}}^{\perp}} = 0$ and $\tilde K_{\mathbf{x}}$ is the matrix representation of the operator $\tilde S_{\mathbf{x}}^T \tilde S_{\mathbf{x}}\big|_{\mathcal{H}_{\tilde K, \mathbf{x}}}$. This, in connection with (8.5) and the definition of $L_{\hat K, \mathbf{x}}$, tells us that for $\mathbf{x} \in X_2$,

$$\Big\|\frac{1}{m}\tilde S_{\mathbf{x}}^T \tilde S_{\mathbf{x}} - L_{\hat K, \mathbf{x}}\Big\| = \Big\|\frac{1}{m}\tilde K_{\mathbf{x}} - \frac{1}{m}\hat K_{\mathbf{x}}\Big\| \le \Big(2 + \frac{4\kappa^2\log(4/\delta)}{p_0\sqrt{m}}\Big)\,\frac{4\kappa^2\log(4/\delta)}{p_0\sqrt{m}}.$$

Together with (3.7), we have

$$\big\|L_{\tilde K} - L_{\hat K, \mathbf{x}}\big\| \le \Big(3 + \frac{4\kappa^2\log(4/\delta)}{p_0\sqrt{m}}\Big)\,\frac{4\kappa^2\log(4/\delta)}{p_0\sqrt{m}}, \qquad \mathbf{x} \in X_1 \cap X_2. \qquad (8.6)$$

Third, we apply Proposition 2 to the operators $A = L_{\tilde K}$ and $\hat A = L_{\hat K, \mathbf{x}}$ on the Hilbert space $\mathcal{H}_{\tilde K}$. Let $\mathbf{x}$ be in the subset $X_1 \cap X_2$, which has measure at least $1 - \delta$. Since $r \le 1$, the estimate (8.6) together with the assumption (8.1) tells us that (7.2) holds. Then by Proposition 2, we have

$$\big\|\sqrt{\lambda_k}\,\phi^{(k)} - \hat\phi^{(k)}\big\|_{\tilde K} \le \frac{4}{r}\,\big\|L_{\tilde K} - L_{\hat K, \mathbf{x}}\big\| \le \frac{50\,\kappa^2\log(4/\delta)}{r\, p_0\sqrt{m}}, \qquad (8.7)$$

where $\hat\phi^{(k)}$ is a normalized eigenfunction of $L_{\hat K, \mathbf{x}}$ associated with the eigenvalue $\hat\lambda_{k,\mathbf{x}}$.

Finally, we specify $\hat\phi^{(k)}$. From the definition of the operator $L_{\hat K, \mathbf{x}}$, we see that for $u \in \ell^2(\mathbf{x})$ we have

$$L_{\hat K, \mathbf{x}}\Big(\sum_{i=1}^m u_i \tilde K_{x_i}\Big) = \sum_{i=1}^m \Big(\frac{1}{m}\hat K_{\mathbf{x}}\, u\Big)_i \tilde K_{x_i}.$$

Since $\big\|\sum_{i=1}^m u_i \tilde K_{x_i}\big\|_{\tilde K}^2 = u^T \tilde K_{\mathbf{x}}\, u$, we know that the normalized eigenfunction $\hat\phi^{(k)}$ of $L_{\hat K, \mathbf{x}}$ associated with the eigenvalue $\hat\lambda_{k,\mathbf{x}}$ can be taken as

$$\hat\phi^{(k)} = \big(v^T \tilde K_{\mathbf{x}}\, v\big)^{-1/2}\sum_{i=1}^m v_i \tilde K_{x_i} = \Big(\frac{1}{m} v^T \tilde K_{\mathbf{x}}\, v\Big)^{-1/2}\frac{1}{\sqrt{m}}\,\tilde S_{\mathbf{x}}^T v,$$

where $v \in \ell^2(\mathbf{x})$ is a normalized eigenvector of the matrix $\frac{1}{m}\hat K_{\mathbf{x}}$ with eigenvalue $\hat\lambda_{k,\mathbf{x}}$. Since (8.5) holds, we see that

$$\Big|\frac{1}{m} v^T \tilde K_{\mathbf{x}}\, v - \hat\lambda_{k,\mathbf{x}}\Big| = \Big|\frac{1}{m} v^T \tilde K_{\mathbf{x}}\, v - \frac{1}{m} v^T \hat K_{\mathbf{x}}\, v\Big| = \Big|v^T\Big(\frac{1}{m}\tilde K_{\mathbf{x}} - \frac{1}{m}\hat K_{\mathbf{x}}\Big)v\Big| \le \frac{9\,\kappa^2\log(4/\delta)}{p_0\sqrt{m}}.$$

Hence

$$\Big|\frac{1}{m} v^T \tilde K_{\mathbf{x}}\, v - \lambda_k\Big| \le \Big|\frac{1}{m} v^T \tilde K_{\mathbf{x}}\, v - \hat\lambda_{k,\mathbf{x}}\Big| + \big|\hat\lambda_{k,\mathbf{x}} - \lambda_k\big| \le \frac{22\,\kappa^2\log(4/\delta)}{p_0\sqrt{m}}.$$

Since $\big\|(\frac{1}{m} v^T \tilde K_{\mathbf{x}}\, v)^{-1/2}\frac{1}{\sqrt{m}}\tilde S_{\mathbf{x}}^T v\big\|_{\tilde K} = 1$, we have

$$\Big\|\frac{1}{\sqrt{\lambda_k\, m}}\tilde S_{\mathbf{x}}^T v - \hat\phi^{(k)}\Big\|_{\tilde K} = \Big|\frac{1}{\sqrt{\lambda_k}} - \Big(\frac{1}{m} v^T \tilde K_{\mathbf{x}}\, v\Big)^{-1/2}\Big|\,\Big\|\frac{1}{\sqrt{m}}\tilde S_{\mathbf{x}}^T v\Big\|_{\tilde K} = \frac{\big|\sqrt{\frac{1}{m} v^T \tilde K_{\mathbf{x}}\, v} - \sqrt{\lambda_k}\big|}{\sqrt{\lambda_k}} \le \frac{\big|\frac{1}{m} v^T \tilde K_{\mathbf{x}}\, v - \lambda_k\big|}{\lambda_k} \le \frac{22\,\kappa^2\log(4/\delta)}{\lambda_k\, p_0\sqrt{m}}.$$

This, in connection with (8.7), tells us that for $\mathbf{x} \in X_1 \cap X_2$ we have

$$\Big\|\sqrt{\lambda_k}\,\phi^{(k)} - \frac{1}{\sqrt{\lambda_k\, m}}\tilde S_{\mathbf{x}}^T v\Big\|_{\tilde K} \le \frac{50\,\kappa^2\log(4/\delta)}{r\, p_0\sqrt{m}} + \frac{22\,\kappa^2\log(4/\delta)}{\lambda_k\, p_0\sqrt{m}}.$$

Since $r \le \lambda_k - \lambda_{k+1} \le \lambda_k$, this proves the stated error bound in Theorem 2.

References

[1] N. Aronszajn, Theory of reproducing kernels, Trans. Amer. Math. Soc. 68 (1950).

[2] M. Belkin and P. Niyogi, Laplacian eigenmaps for dimensionality reduction and data representation, Neural Comput. 15 (2003).

[3] M. Belkin and P. Niyogi, Convergence of Laplacian eigenmaps, preprint.

[4] R. Bhatia, Matrix Analysis, Graduate Texts in Mathematics 169, Springer-Verlag, New York, 1997.

[5] G. Blanchard, O. Bousquet, and L. Zwald, Statistical properties of kernel principal component analysis, Mach. Learn. 66 (2007).

[6] S. Bougleux, A. Elmoataz, and M. Melkemi, Discrete regularization on weighted graphs for image and mesh filtering, Lecture Notes in Computer Science 4485 (2007).

[7] F. R. K. Chung, Spectral Graph Theory, Regional Conference Series in Mathematics 92, SIAM, Philadelphia, 1997.

[8] R. Coifman, S. Lafon, A. Lee, M. Maggioni, B. Nadler, F. Warner, and S. Zucker, Geometric diffusions as a tool for harmonic analysis and structure definition of data: diffusion maps, PNAS of USA 102 (2005).

[9] R. Coifman and M. Maggioni, Diffusion wavelets, Appl. Comput. Harmonic Anal. 21 (2006).

[10] F. Cucker and S. Smale, On the mathematical foundations of learning, Bull. Amer. Math. Soc. 39 (2002), 1–49.

[11] F. Cucker and D. X. Zhou, Learning Theory: An Approximation Theory Viewpoint, Cambridge University Press, 2007.

[12] E. De Vito, A. Caponnetto, and L. Rosasco, Model selection for regularized least-squares algorithm in learning theory, Found. Comput. Math. 5 (2005).

[13] G. Gilboa and S. Osher, Nonlocal operators with applications to image processing, UCLA CAM Report 07-23, July 2007.

[14] V. Koltchinskii and E. Giné, Random matrix approximation of spectra of integral operators, Bernoulli 6 (2000).

[15] I. Pinelis, Optimum bounds for the distributions of martingales in Banach spaces, Ann. Probab. 22 (1994).

[16] S. Smale and D. X. Zhou, Shannon sampling and function reconstruction from point values, Bull. Amer. Math. Soc. 41 (2004).

[17] S. Smale and D. X. Zhou, Shannon sampling II. Connections to learning theory, Appl. Comput. Harmonic Anal. 19 (2005).

[18] S. Smale and D. X. Zhou, Learning theory estimates via integral operators and their approximations, Constr. Approx. 26 (2007).

[19] U. von Luxburg, M. Belkin, and O. Bousquet, Consistency of spectral clustering, Ann. Stat., to appear.

[20] G. B. Ye and D. X. Zhou, Learning and approximation by Gaussians on Riemannian manifolds, Adv. Comput. Math., in press.

[21] D. Zhou and B. Schölkopf, Regularization on discrete spaces, in Pattern Recognition, Proc. 27th DAGM Symposium, Berlin, 2005.
