Nonlinear Dimensionality Reduction


1 Nonlinear Dimensionality Reduction. Hong Chang, Institute of Computing Technology, Chinese Academy of Sciences. Machine Learning Methods (Fall 2012).

2 Outline
1. Kernel PCA
2. Isomap
3. Locally Linear Embedding
4. Laplacian Eigenmap

3 Centering in Feature Space

Suppose we use a kernel function $\hat{k}(\cdot,\cdot)$ which induces a nonlinear feature map $\hat\phi$ from the input space $\mathcal{X}$ to some feature space $\mathcal{F}$. The images of the $N$ points in $\mathcal{F}$ are $\hat\phi(x^{(1)}),\ldots,\hat\phi(x^{(N)})$, which in general are not centered. The corresponding kernel matrix $\hat{K}$ is
$$\hat{K} = [\hat{K}_{ij}]_{N\times N} = [\hat{k}(x^{(i)}, x^{(j)})]_{N\times N} = [\langle \hat\phi(x^{(i)}), \hat\phi(x^{(j)})\rangle]_{N\times N}.$$
We want to translate the coordinate system of $\mathcal{F}$ so that the new origin is at the sample mean of the $N$ points, i.e.,
$$\phi(x^{(i)}) = \hat\phi(x^{(i)}) - \frac{1}{N}\sum_{j=1}^{N}\hat\phi(x^{(j)}).$$

4 Centering in Feature Space (2)

As a result, we also convert the kernel matrix $\hat{K}$ to $K$:
$$K = [K_{ij}]_{N\times N} = [k(x^{(i)}, x^{(j)})]_{N\times N} = [\langle \phi(x^{(i)}), \phi(x^{(j)})\rangle]_{N\times N}.$$
Let
$$Z = [\phi(x^{(1)}),\ldots,\phi(x^{(N)})]^T, \quad \hat{Z} = [\hat\phi(x^{(1)}),\ldots,\hat\phi(x^{(N)})]^T, \quad H = I - \frac{1}{N}\mathbf{1}\mathbf{1}^T,$$
where $\mathbf{1}$ is a column vector of ones. We can write $Z = H\hat{Z}$. Hence,
$$K = ZZ^T = H\hat{Z}\hat{Z}^T H = H\hat{K}H.$$
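A minimal NumPy sketch of this centering step (the function name `center_kernel` is just an illustration, not from the lecture):

```python
import numpy as np

def center_kernel(K_hat):
    """Center a kernel matrix in feature space: K = H K_hat H with H = I - (1/N) 11^T."""
    N = K_hat.shape[0]
    H = np.eye(N) - np.ones((N, N)) / N   # centering matrix
    return H @ K_hat @ H
```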

5 Eigenvalue Equation Based on Covariance Matrix

The covariance matrix of the $N$ centered points in $\mathcal{F}$ is given by
$$C = \frac{1}{N}\sum_{i=1}^{N}\phi(x^{(i)})\phi(x^{(i)})^T. \tag{1}$$
If $\mathcal{F}$ is infinite-dimensional (e.g., $\mathcal{F}$ is a Hilbert space), we can think of $\phi(x^{(i)})\phi(x^{(i)})^T$ as a linear operator on $\mathcal{F}$, mapping $z \mapsto \phi(x^{(i)})\langle \phi(x^{(i)}), z\rangle$. To perform PCA in $\mathcal{F}$, we solve the following eigenvalue equation for the eigenvalues $\lambda_k$ and eigenvectors $v_k$ ($k = 1,\ldots,N$) of $C$:
$$Cv = \lambda v. \tag{2}$$

6 Eigenvalue Equation Based on Covariance Matrix (2)

Substituting (1) into (2) gives an equivalent form of (2):
$$\lambda v = \frac{1}{N}\sum_{i=1}^{N}\phi(x^{(i)})\phi(x^{(i)})^T v = \frac{1}{N}\sum_{i=1}^{N}\langle \phi(x^{(i)}), v\rangle\, \phi(x^{(i)}).$$
If $\lambda \neq 0$, then we have the following dual eigenvector representation:
$$v = \sum_{i=1}^{N}\frac{\langle \phi(x^{(i)}), v\rangle}{\lambda N}\,\phi(x^{(i)}) = \sum_{i=1}^{N}\alpha_i \phi(x^{(i)}) \tag{3}$$
for some coefficients $\alpha_i$ ($i = 1,\ldots,N$). Thus, all eigenvector solutions $v$ with nonzero eigenvalues $\lambda \neq 0$ must lie in the span of $\phi(x^{(1)}),\ldots,\phi(x^{(N)})$. Equation (2) can then be written as the following set of equations:
$$\langle \phi(x^{(k)}), Cv\rangle = \lambda \langle \phi(x^{(k)}), v\rangle, \quad k = 1,\ldots,N. \tag{4}$$

7 Eigenvalue Equation Based on Kernel Matrix

Substituting (1) and (3) into (4), we have
$$\frac{1}{N}\sum_{j=1}^{N}K_{kj}\sum_{i=1}^{N}\alpha_i K_{ji} = \lambda \sum_{i=1}^{N}\alpha_i K_{ki}, \quad k = 1,\ldots,N,$$
or in matrix form:
$$K^2\alpha = N\lambda K\alpha, \tag{5}$$
where $K$ is the kernel matrix (or Gram matrix) and $\alpha = (\alpha_1,\ldots,\alpha_N)^T$. If $K$ is invertible, (5) can be expressed as the following (dual) eigenvalue equation:
$$K\alpha = \xi\alpha, \tag{6}$$
where $\xi = N\lambda$.

8 Normalization of Eigenvectors

Let $\xi_1 \geq \cdots \geq \xi_N \geq 0$ denote the $N$ eigenvalues of $K$ and $\alpha_1,\ldots,\alpha_N$ be the corresponding eigenvectors. Suppose $\xi_p$ is the smallest nonzero eigenvalue for some $1 \leq p \leq N$. We normalize $\alpha_1,\ldots,\alpha_p$ such that
$$\langle v_k, v_k\rangle = 1, \quad k = 1,\ldots,p. \tag{7}$$

9 Normalization of Eigenvectors (2)

Substituting (3) into (7), we have
$$\sum_{i,j=1}^{N}\alpha_{ik}\alpha_{jk}K_{ij} = 1, \qquad \langle \alpha_k, K\alpha_k\rangle = 1, \qquad \langle \alpha_k, \xi_k\alpha_k\rangle = 1, \qquad \langle \alpha_k, \alpha_k\rangle = \frac{1}{\xi_k},$$
for all $k = 1,\ldots,p$.

10 Normalization of Eigenvectors (3)

Suppose the eigenvectors obtained from (6) are such that $\|\alpha_k\| = 1$, $k = 1,\ldots,p$. Then we should modify (3) to
$$v_k = \frac{1}{\sqrt{\xi_k}}\sum_{i=1}^{N}\alpha_{ik}\phi(x^{(i)})$$
in order to satisfy (7).

11 Embedding of New Data Points

For any input $x$, the $k$th principal component $y_k$ of $\phi(x)$ is given by
$$y_k = \langle v_k, \phi(x)\rangle = \frac{1}{\sqrt{\xi_k}}\sum_{i=1}^{N}\alpha_{ik}\langle \phi(x^{(i)}), \phi(x)\rangle = \frac{1}{\sqrt{\xi_k}}\sum_{i=1}^{N}\alpha_{ik}\,k(x^{(i)}, x).$$
If $x = x^{(j)}$ for some $1 \leq j \leq N$, i.e., $x$ is one of the $N$ original points, then the $k$th principal component $y_{jk}$ of $\phi(x^{(j)})$ becomes
$$y_{jk} = \langle v_k, \phi(x^{(j)})\rangle = \frac{1}{\sqrt{\xi_k}}\sum_{i=1}^{N}\alpha_{ik}K_{ij} = \frac{1}{\sqrt{\xi_k}}(K\alpha_k)_j = \frac{1}{\sqrt{\xi_k}}(\xi_k\alpha_k)_j = \sqrt{\xi_k}\,\alpha_{jk},$$
which is proportional to the expansion coefficient $\alpha_{jk}$.

12 Embedding of New Data Points (2)

Let $Y = [y_{jk}]_{N\times p}$. Then we can express $Y$ as
$$Y = [\alpha_1,\ldots,\alpha_p]\,\mathrm{diag}(\sqrt{\xi_1},\ldots,\sqrt{\xi_p}).$$
Note that $K = YY^T$.
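Putting the pieces together, here is a minimal NumPy sketch of the whole kernel PCA procedure for embedding the training points; the RBF kernel, the function names, and the eigenvalue threshold are illustrative assumptions, not part of the lecture:

```python
import numpy as np

def kernel_pca(X, kernel, p):
    """Kernel PCA sketch: returns the N x p embedding Y = [alpha_1..alpha_p] diag(sqrt(xi))."""
    N = X.shape[0]
    # Uncentered kernel matrix K_hat, then centered kernel matrix K = H K_hat H.
    K_hat = np.array([[kernel(xi, xj) for xj in X] for xi in X])
    H = np.eye(N) - np.ones((N, N)) / N
    K = H @ K_hat @ H
    # Eigendecomposition of the symmetric matrix K; reorder eigenvalues descending.
    xi, alpha = np.linalg.eigh(K)
    xi, alpha = xi[::-1], alpha[:, ::-1]
    # Keep the top p components with (numerically) positive eigenvalues, scale by sqrt(xi).
    keep = xi[:p] > 1e-12
    return alpha[:, :p][:, keep] * np.sqrt(xi[:p][keep])

# Example usage with an RBF kernel (sigma is a hypothetical bandwidth choice).
rbf = lambda a, b, sigma=1.0: np.exp(-np.sum((a - b) ** 2) / (2 * sigma ** 2))
X = np.random.randn(100, 5)
Y = kernel_pca(X, rbf, p=2)
```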

13 Geodesic Distance Euclidean distance in the high-dimensional input space cannot reflect the true low-dimensional geometry of the manifold.

14 Geodesic Distance (2)

The geodesic (shortest path) distance should be used instead.
Neighboring points: the input-space Euclidean distance provides a good approximation of the geodesic distance.
Faraway points: the geodesic distance can be approximated by adding up a sequence of short hops between neighboring points, each measured by Euclidean distance.

15 Isomap Algorithm

Isomap is a nonlinear dimensionality reduction (NLDR) method that is based on metric MDS but seeks to preserve the intrinsic geometry of the data as captured in the geodesic distances between data points.
Three steps of the Isomap algorithm:
1. Construct the neighborhood graph
2. Compute the shortest paths
3. Construct the low-dimensional embedding

16 Isomap Algorithm (2)

Given distances $d(i,j)$ between point pairs for the $N$ points in $\mathcal{X}$ (a code sketch follows this list):
1. Construct the neighborhood graph: define a graph $G$ over all $N$ data points by connecting points $i$ and $j$ if their distance $d(i,j)$ is smaller than $\epsilon$ ($\epsilon$-Isomap) or if $i$ is one of the $K$ nearest neighbors of $j$ ($K$-Isomap). Set edge lengths equal to $d(i,j)$.
2. Compute the shortest paths: initialize $d_G(i,j) = d(i,j)$ if $i$ and $j$ are linked by an edge and $d_G(i,j) = \infty$ otherwise. For each $k = 1,\ldots,N$, replace all entries $d_G(i,j)$ by $\min\big(d_G(i,j),\, d_G(i,k) + d_G(k,j)\big)$. Then $D_G = [d_G(i,j)]$ contains the shortest path distances between all point pairs in $G$.
3. Construct the low-dimensional embedding (by applying metric MDS to $D_G$).
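A minimal NumPy/SciPy sketch of these three steps; the $K$-nearest-neighbor rule, the use of Dijkstra's algorithm for step 2, and the eigenvalue clipping in the MDS step are implementation choices assumed here, and a disconnected neighborhood graph would need extra handling:

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

def isomap(X, K, p):
    """K-Isomap sketch: neighborhood graph -> geodesic distances -> classical MDS."""
    N = X.shape[0]
    # Pairwise Euclidean distances in the input space.
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    # Step 1: keep only edges to the K nearest neighbors (0 encodes "no edge").
    G = np.zeros_like(D)
    for i in range(N):
        nbrs = np.argsort(D[i])[1:K + 1]
        G[i, nbrs] = D[i, nbrs]
    # Step 2: shortest-path (geodesic) distances on the symmetrized graph (Dijkstra).
    DG = shortest_path(np.maximum(G, G.T), method='D', directed=False)
    # Step 3: classical MDS on the geodesic distances.
    H = np.eye(N) - np.ones((N, N)) / N
    B = -0.5 * H @ (DG ** 2) @ H            # double-centered squared distances
    w, V = np.linalg.eigh(B)
    w, V = w[::-1][:p], V[:, ::-1][:, :p]   # top p eigenpairs
    return V * np.sqrt(np.maximum(w, 0))
```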

17 Isomap Algorithm (3)

There are two computational bottlenecks in the Isomap algorithm:
Shortest path computation: Floyd's algorithm is $O(N^3)$; Dijkstra's algorithm (with Fibonacci heaps) is $O(KN^2\log N)$, where $K$ is the neighborhood size.
Eigendecomposition: $O(N^3)$.

18 Example - Face Images

19 Example - Face Images

20 Intrinsic Dimensionality of Data Manifolds In practice, some of the eigenvalues may be so close to zero that they can be ignored. As with PCA and MDS, the true (intrinsic) dimensionality of the data can be estimated from the decrease in error as the dimensionality of the low-dimensional space increases. For nonlinear manifolds, PCA and MDS tend to overestimate the intrinsic dimensionality. The intrinsic degrees of freedom provide a simple way to analyze and manipulate high-dimensional data.

21 Intrinsic Dimensionality of Data Manifolds (2)

[Figure: residual variance of PCA, MDS, and Isomap on four data sets: (A) face images, (B) Swiss roll data, (C) hand images, (D) handwritten 2s.]

22 Global vs. Local Embedding Methods Metric MDS and Isomap compute embeddings that seek to preserve inter-point straight-line (Euclidean) distances or geodesic distances between all pairs of points. Hence they are global methods. Both locally linear embedding (LLE) and Laplacian eigenmap try to recover the global nonlinear structure from local geometric properties. They are local methods. Overlapping local neighborhoods, collectively analyzed, can provide information about the global geometry.

23 Computational Advantages of LLE Like PCA and MDS, LLE is simple to implement and its optimization problems do not have local minima. Although only linear algebraic methods are used, the constraint that points are reconstructed only from their neighbors based on locally linear fits can result in highly nonlinear embeddings. Its main step involves a sparse eigenvalue problem, which scales better to large, high-dimensional data sets.

24 Problem Setting

Let $\mathcal{X} = \{x^{(1)},\ldots,x^{(N)}\}$ be a set of $N$ points in a high-dimensional input space $\mathbb{R}^D$. The $N$ data points are assumed to lie on or near a nonlinear manifold of intrinsic dimensionality $p < D$ (typically $p \ll D$). Provided that sufficient data are available by sampling well from the manifold, the goal of LLE is to find a low-dimensional embedding of $\mathcal{X}$ by mapping the $D$-dimensional data into a single global coordinate system in $\mathbb{R}^p$. Let us denote the set of $N$ points in the embedding space $\mathbb{R}^p$ by $\mathcal{Y} = \{y^{(1)},\ldots,y^{(N)}\}$.

25 LLE Algorithm

1. For each data point $x^{(i)} \in \mathcal{X}$: find the set $N_i$ of $K$ nearest neighbors of $x^{(i)}$, and compute the reconstruction weights of the neighbors that minimize the error of reconstructing $x^{(i)}$.
2. Compute the low-dimensional embedding $\mathcal{Y}$ that best preserves the local geometry represented by the reconstruction weights.

26 Locally Linear Fitting

If the manifold is sampled sufficiently densely, then each point and its neighbors are expected to lie on or close to a locally linear patch of the manifold. The local geometry of a patch is characterized by the reconstruction weights with which a data point is reconstructed from its neighbors. Let $w_i$ denote the $K$-dimensional vector of local reconstruction weights for data point $x^{(i)}$. (One may also consider the full $N$-dimensional weight vector by constraining the entries $w_{ij}$ with $x^{(j)} \notin N_i$ to be 0.)

27 Constrained Least Squares Problem

Optimality is achieved by minimizing the local reconstruction error function for each data point $x^{(i)}$:
$$E_i(w_i) = \Big\| x^{(i)} - \sum_{x^{(j)} \in N_i} w_{ij} x^{(j)} \Big\|^2,$$
which is the squared distance between $x^{(i)}$ and its reconstruction, subject to the constraints
$$\sum_{x^{(j)} \in N_i} w_{ij} = \mathbf{1}^T w_i = 1 \quad\text{and}\quad w_{ij} = 0 \ \text{for any } x^{(j)} \notin N_i.$$
This is a constrained least squares problem that can be solved using the classical method of Lagrange multipliers.

28 Constrained Least Squares Problem (2)

The error function $E_i(w_i)$ can be rewritten as follows:
$$E_i(w_i) = \Big[\sum_{x^{(j)} \in N_i} w_{ij}(x^{(i)} - x^{(j)})\Big]^T \Big[\sum_{x^{(j)} \in N_i} w_{ij}(x^{(i)} - x^{(j)})\Big] = \sum_{x^{(j)}, x^{(k)} \in N_i} w_{ij} w_{ik}\,(x^{(i)} - x^{(j)})^T (x^{(i)} - x^{(k)}) = w_i^T G_i w_i,$$
where $G_i = [(x^{(i)} - x^{(j)})^T (x^{(i)} - x^{(k)})]_{K\times K}$ is the local Gram matrix for $x^{(i)}$. To minimize $E_i(w_i)$ subject to the constraint $\mathbf{1}^T w_i = 1$, we define the Lagrangian with multiplier $\lambda$:
$$L(w_i, \lambda) = w_i^T G_i w_i + \lambda(1 - \mathbf{1}^T w_i).$$

29 Constrained Least Squares Problem (3)

The partial derivatives of $L(w_i, \lambda)$ w.r.t. $w_i$ and $\lambda$ are
$$\frac{\partial L}{\partial w_i} = 2G_i w_i - \lambda\mathbf{1}, \qquad \frac{\partial L}{\partial \lambda} = 1 - \mathbf{1}^T w_i.$$
Setting the above equations to 0, we finally get (if $G_i^{-1}$ exists)
$$w_i = \frac{G_i^{-1}\mathbf{1}}{\mathbf{1}^T G_i^{-1}\mathbf{1}}.$$

30 A More Efficient Method

Instead of inverting $G_i$, a more efficient way is to solve the linear system of equations $G_i \hat{w}_i = \mathbf{1}$ for $\hat{w}_i$ and then compute $w_i$ as
$$w_i = \frac{\hat{w}_i}{\mathbf{1}^T \hat{w}_i}$$
so that the equality constraint $\mathbf{1}^T w_i = 1$ is satisfied. Based on the reconstruction weights computed for all $N$ data points, we form a weight matrix $W = [w_{ij}]_{N\times N}$ that will be used in the next step.
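A minimal sketch of this weight computation; the regularization term added to $G_i$ is a common practical safeguard against singular local Gram matrices and is an assumption here, not part of the derivation above:

```python
import numpy as np

def lle_weights(X, K, reg=1e-3):
    """Step 1 of LLE: reconstruction weight matrix W (N x N) from K nearest neighbors."""
    N = X.shape[0]
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    W = np.zeros((N, N))
    for i in range(N):
        nbrs = np.argsort(D[i])[1:K + 1]            # indices of the K nearest neighbors
        diff = X[i] - X[nbrs]                        # K x D matrix of (x_i - x_j)
        G = diff @ diff.T                            # local Gram matrix G_i
        G += reg * np.trace(G) * np.eye(K)           # regularize in case G_i is singular
        w_hat = np.linalg.solve(G, np.ones(K))       # solve G_i w_hat = 1
        W[i, nbrs] = w_hat / w_hat.sum()             # rescale so that 1^T w_i = 1
    return W
```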

31 Low-Dimensional Embedding

Given the weight matrix $W$, the best low-dimensional embedding $\mathcal{Y}$ can be computed by minimizing the following error function w.r.t. $Y = [y^{(1)},\ldots,y^{(N)}]^T \in \mathbb{R}^{N\times p}$:
$$J(Y) = \sum_{i=1}^{N} \Big\| y^{(i)} - \sum_{x^{(j)} \in N_i} w_{ij} y^{(j)} \Big\|^2.$$
Let $b_i$ be the $i$th column of the identity matrix $I$ and $w_i$ be the $i$th column of $W^T$ (i.e., $w_i$ is the weight vector for $x^{(i)}$).

32 Optimization

We can rewrite $J(Y)$ as
$$J(Y) = \sum_{i=1}^{N}\|Y^T b_i - Y^T w_i\|^2 = \sum_{i=1}^{N}\|Y^T(b_i - w_i)\|^2 = \|Y^T(I - W^T)\|_F^2 = \mathrm{Tr}[Y^T(I-W)^T(I-W)Y] = \mathrm{Tr}[Y^T M Y],$$
where $M = (I-W)^T(I-W)$ is a symmetric and positive semi-definite matrix (since $x^T M x \geq 0$ for all $x$). $M$ is sparse for reasonable choices of the neighborhood size $K$ (i.e., $K \ll N$).

33 Invariance to Translation, Rotation and Scaling

Note that the error function $J(Y)$ is invariant to translation, rotation and scaling of the vectors $y^{(i)}$ in the low-dimensional embedding $\mathcal{Y}$. To remove the translational degree of freedom, we require the vectors $y^{(i)}$ to have zero mean, i.e.,
$$\sum_{i=1}^{N} y^{(i)} = Y^T\mathbf{1} = \mathbf{0}.$$
To remove the degrees of freedom due to rotation and scaling, we constrain the vectors $y^{(i)}$ to have covariance matrix equal to the identity matrix, i.e.,
$$\frac{1}{N}\sum_{i=1}^{N} y^{(i)}(y^{(i)})^T = \frac{1}{N}Y^T Y = I.$$

34 Eigenvalue Problem

The optimization problem can thus be stated as
$$\min_{Y}\ \mathrm{Tr}(Y^T M Y) \quad \text{subject to} \quad Y^T\mathbf{1} = \mathbf{0} \ \text{and}\ Y^T Y = NI.$$
If we express $Y$ as $[y_1,\ldots,y_p]$, then the optimization problem can also be expressed as
$$\min_{Y}\ \sum_{k=1}^{p} y_k^T M y_k \quad \text{subject to} \quad y_k^T\mathbf{1} = 0 \ \text{and}\ y_k^T y_k = N \ \text{for } k = 1,\ldots,p.$$

35 Eigenvalue Problem (2)

Thus the solution to the optimization problem can be obtained by solving the following eigenvalue problem
$$My = \lambda y$$
for the eigenvectors $y_k$ ($k = 1,\ldots,p$) that correspond to the $p$ smallest nonzero eigenvalues. The eigenvectors are normalized such that $y_k^T y_k = N$ for all $k = 1,\ldots,p$.
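Putting the two steps together, a minimal sketch of the embedding step; discarding the bottom eigenvector (the constant vector, which has eigenvalue zero because the rows of $W$ sum to one) is the standard convention assumed here:

```python
import numpy as np

def lle_embedding(W, p):
    """Step 2 of LLE: embed using the p eigenvectors of M = (I - W)^T (I - W)
    with the smallest nonzero eigenvalues."""
    N = W.shape[0]
    M = (np.eye(N) - W).T @ (np.eye(N) - W)
    evals, evecs = np.linalg.eigh(M)       # eigenvalues in ascending order
    Y = evecs[:, 1:p + 1]                  # skip the zero-eigenvalue (constant) vector
    return Y * np.sqrt(N)                  # normalize so that y_k^T y_k = N

# Usage together with lle_weights from the earlier sketch:
# Y = lle_embedding(lle_weights(X, K=10), p=2)
```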

36 Algorithm Overview

Let $x^{(1)},\ldots,x^{(N)}$ be $N$ points in $\mathbb{R}^D$. Like Isomap, the Laplacian eigenmap algorithm first constructs a weighted graph with $N$ nodes representing the neighborhood relationships. It then computes an eigenmap based on the graph.

37 Edge Creation

An edge is created between nodes $i$ and $j$ if $x^{(i)}$ and $x^{(j)}$ are close to each other. Two possible criteria for edge creation:
$\epsilon$-neighborhood: nodes $i$ and $j$ are connected by an edge if $\|x^{(i)} - x^{(j)}\| < \epsilon$ for some $\epsilon \in \mathbb{R}^+$.
$K$ nearest neighbors: nodes $i$ and $j$ are connected by an edge if $x^{(i)}$ is among the $K$ nearest neighbors of $x^{(j)}$ or $x^{(j)}$ is among the $K$ nearest neighbors of $x^{(i)}$.

38 Edge Weighting

Two common variations for edge weighting:
Heat kernel:
$$w_{ij} = \begin{cases} \exp\!\big(-\|x^{(i)} - x^{(j)}\|^2 / \sigma^2\big) & \text{if nodes } i \text{ and } j \text{ are connected} \\ 0 & \text{otherwise} \end{cases}$$
for some $\sigma^2 \in \mathbb{R}^+$.
Binary weights:
$$w_{ij} = \begin{cases} 1 & \text{if nodes } i \text{ and } j \text{ are connected} \\ 0 & \text{otherwise.} \end{cases}$$

39 Construction of Eigenmap

If the graph constructed above is not connected, then the following procedure is applied to each connected component separately. We first consider the special case of finding a 1-dimensional embedding, and then generalize it to the general $p$-dimensional case for $p > 1$. Let $y = (y^{(1)},\ldots,y^{(N)})^T$ denote the 1-dimensional embedding. The objective function for minimization is given by
$$\sum_{i,j=1}^{N} (y^{(i)} - y^{(j)})^2 w_{ij}.$$

40 Construction of Eigenmap (2)

We can rewrite the objective function as
$$\frac{1}{2}\sum_{i,j=1}^{N}(y^{(i)} - y^{(j)})^2 w_{ij} = \frac{1}{2}\Big[\sum_{i,j=1}^{N}(y^{(i)})^2 w_{ij} + \sum_{i,j=1}^{N}(y^{(j)})^2 w_{ij} - 2\sum_{i,j=1}^{N} y^{(i)} y^{(j)} w_{ij}\Big] = \sum_{i=1}^{N}(y^{(i)})^2 d_{ii} - \sum_{i,j=1}^{N} y^{(i)} y^{(j)} w_{ij} = y^T(D - W)y = y^T L y,$$
where $d_{ii} = \sum_j w_{ij}$, $D = \mathrm{diag}(d_{11},\ldots,d_{NN})$, and $L = D - W$ is the graph Laplacian.
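As a quick numerical check of the identity above, a short sketch using an arbitrary random symmetric weight matrix, purely for illustration:

```python
import numpy as np

# Random symmetric nonnegative weights with zero diagonal, just for the check.
rng = np.random.default_rng(0)
W = rng.random((6, 6))
W = (W + W.T) / 2
np.fill_diagonal(W, 0)

D = np.diag(W.sum(axis=1))
L = D - W                                  # graph Laplacian
y = rng.standard_normal(6)

lhs = 0.5 * sum(W[i, j] * (y[i] - y[j]) ** 2 for i in range(6) for j in range(6))
rhs = y @ L @ y
assert np.isclose(lhs, rhs)                # (1/2) sum_ij w_ij (y_i - y_j)^2 = y^T L y
```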

41 Scale Invariance

To remove the arbitrary scaling factor in the embedding, we enforce the constraint $y^T D y = 1$ (the larger $d_{ii}$ is, the more important the corresponding node $i$ is). The optimization problem can thus be restated as
$$\min_{y}\ y^T L y \quad \text{subject to} \quad y^T D y = 1, \qquad\text{or equivalently}\qquad \min_{y}\ \frac{y^T L y}{y^T D y}.$$

42 Generalized Eigenvalue Problem

This corresponds to solving the following eigenvalue problem
$$(D^{-1}L)y = \lambda y,$$
or the corresponding generalized eigenvalue problem
$$Ly = \lambda Dy,$$
for the smallest eigenvalue $\lambda$ and the corresponding eigenvector $y$. Note that $\lambda = 0$ and $y = c\mathbf{1}$ for any $c \neq 0$ form a solution, since
$$cL\mathbf{1} = c(D - W)\mathbf{1} = \mathbf{0} = 0 \cdot D\mathbf{1}.$$

43 Generalized Eigenvalue Problem (2)

To eliminate such trivial solutions, we modify the minimization problem to
$$\min_{y}\ y^T L y \quad \text{subject to} \quad y^T D y = 1,\ \ y^T D\mathbf{1} = 0.$$
Note that if $D = I$, then $y^T D\mathbf{1} = 0$ is equivalent to centering. Finally, we can conclude that the solution is the eigenvector $y$ of the generalized eigenvalue problem
$$Ly = \lambda Dy$$
corresponding to the smallest nonzero eigenvalue $\lambda$. Normalization of $y$ is performed such that $y^T D y = 1$.

44 Construction of Eigenmap for p > 1

Let the $p$-dimensional embedding be denoted by the $N\times p$ matrix $Y = [y_1,\ldots,y_p] = [y^{(1)},\ldots,y^{(N)}]^T$. Note that $y^{(i)}$ is the $p$-dimensional representation of $x^{(i)}$ in the embedding space. The objective function for minimization is given by
$$\sum_{i,j=1}^{N}\|y^{(i)} - y^{(j)}\|^2 w_{ij}.$$

45 Construction of Eigenmap for p > 1 (2)

The minimization problem can be stated as
$$\min_{Y}\ \mathrm{Tr}(Y^T L Y) \quad \text{subject to} \quad Y^T D Y = I,\ \ Y^T D\mathbf{1} = \mathbf{0}.$$
Solution: the eigenvectors $y_k$ ($k = 1,\ldots,p$) of the generalized eigenvalue problem $Ly = \lambda Dy$ corresponding to the $p$ smallest nonzero eigenvalues give the solution to the optimization problem above. The eigenvectors are normalized such that $y_k^T D y_k = 1$ for all $k = 1,\ldots,p$.
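A minimal end-to-end sketch under the $K$-nearest-neighbor construction; the symmetrization rule, the choice of $\sigma^2$, and the assumption that the graph is connected (so only one zero eigenvalue is dropped) are all assumptions of this sketch:

```python
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmap(X, K, p, sigma2=1.0):
    """Laplacian eigenmap sketch: K-NN graph, heat-kernel weights, generalized
    eigenproblem L y = lambda D y, keeping the p smallest nonzero eigenvalues."""
    N = X.shape[0]
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)    # squared distances
    W = np.zeros((N, N))
    for i in range(N):
        nbrs = np.argsort(D2[i])[1:K + 1]
        W[i, nbrs] = np.exp(-D2[i, nbrs] / sigma2)          # heat-kernel weights
    W = np.maximum(W, W.T)                                  # connect if i or j is a neighbor
    D = np.diag(W.sum(axis=1))
    L = D - W                                               # graph Laplacian
    # Generalized eigenproblem L y = lambda D y; eigenvalues returned in ascending
    # order, eigenvectors D-orthonormal (so y_k^T D y_k = 1 automatically).
    evals, evecs = eigh(L, D)
    return evecs[:, 1:p + 1]       # drop the constant eigenvector with eigenvalue 0
```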

46 And Beyond

Robust versions of dimensionality reduction; out-of-sample extensions for LLE, Isomap, MDS; a kernel view of embedding methods; probabilistic views; supervised and semi-supervised extensions. Applications: super-resolution, image recognition, and many others.

47 Main References

Kernel PCA: [SSM98]. Isomap: [Ten98] [TdL00] [BST+02]. Locally Linear Embedding: [RS00] [SR03]. Laplacian Eigenmap: [BN02] [BN03].

48 [BN02] M. Belkin and P. Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. In T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14. MIT Press, Cambridge, MA, USA, 2002.
[BN03] M. Belkin and P. Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6):1373-1396, 2003.
[BST+02] M. Balasubramanian, E. L. Schwartz, J. B. Tenenbaum, V. de Silva, and J. C. Langford. The Isomap algorithm and topological stability. Science, 295(5552):7a, 2002.
[RS00] S. T. Roweis and L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500):2323-2326, 2000.

49 [SR03] L. K. Saul and S. T. Roweis. Think globally, fit locally: unsupervised learning of low dimensional manifolds. Journal of Machine Learning Research, 4:119-155, 2003.
[SSM98] B. Schölkopf, A. J. Smola, and K.-R. Müller. Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10:1299-1319, 1998.
[TdL00] J. B. Tenenbaum, V. de Silva, and J. C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500):2319-2323, 2000.
[Ten98] J. B. Tenenbaum. Mapping a manifold of perceptual observations. In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors, Advances in Neural Information Processing Systems 10. MIT Press, 1998.
