A spectral clustering algorithm based on Gram operators
1 A spectral clustering algorithm based on Gram operators. Ilaria Giulini, Département de Mathématiques et Applications, ENS, Paris. Joint work with Olivier Catoni. 1 July 2015.
3 Clustering: the task of grouping objects into classes (clusters) according to their similarities. Spectral clustering algorithms use data-dependent matrices to construct the clusters.
4 Similarity graph. Assume a notion of similarity given by a symmetric affinity matrix $A = (a_{ij})$, where $a_{ij} \ge 0$ measures the similarity between $X_i$ and $X_j$. Represent the data points in a similarity graph $G = (V, E)$, where $V = \{X_1, \dots, X_n\}$ is the set of vertices and $E \subset V \times V$ is the set of edges; the edge between $X_i$ and $X_j$ is weighted by $a_{ij}$.
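A minimal sketch of building such an affinity matrix, assuming the Gaussian similarity that appears later on slide 11 (the bandwidth `sigma` is a free parameter, not fixed by the slides):

```python
import numpy as np

def affinity_matrix(X, sigma=1.0):
    """Symmetric Gaussian affinities a_ij = exp(-||X_i - X_j||^2 / (2 sigma^2)).

    X is an (n, d) array of data points; sigma is an assumed bandwidth.
    """
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))
```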
5 Graph partitioning. Goal: find a partition of the graph such that edges between different groups have low weight (dissimilar points) and edges within a group have high weight (similar points).
6 Graph (bi-)partitioning: cut. Find the partition that minimizes
$$\mathrm{cut}(S, S^c) = \sum_{i \in S,\, j \in S^c} a_{ij}.$$
There are efficient algorithms to solve this problem [M. Stoer, F. Wagner]. Problem: it tends to separate a single vertex from the rest of the graph.
7 Graph partitioning: Ncut [Shi and Malik]. Find the partition that minimizes
$$\mathrm{Ncut}(S, S^c) = \left( \frac{1}{\mathrm{vol}(S)} + \frac{1}{\mathrm{vol}(S^c)} \right) \mathrm{cut}(S, S^c),$$
where
$$\mathrm{vol}(S) = \sum_{i \in S,\, j \in V} a_{ij}, \qquad \mathrm{cut}(S, S^c) = \sum_{i \in S,\, j \in S^c} a_{ij}.$$
Problem: minimizing Ncut is NP-hard. Spectral clustering is a way to solve a relaxation of Ncut.
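As a concrete illustration of the two objectives on slides 6 and 7, here is a small sketch that evaluates cut and Ncut for a candidate bipartition; the affinity matrix `A` and the boolean membership vector `S` are illustrative names, not taken from the slides:

```python
import numpy as np

def cut_value(A, S):
    """cut(S, S^c): total affinity across the partition boundary."""
    S = np.asarray(S, dtype=bool)
    return A[np.ix_(S, ~S)].sum()

def ncut_value(A, S):
    """Ncut(S, S^c) = (1/vol(S) + 1/vol(S^c)) * cut(S, S^c)."""
    S = np.asarray(S, dtype=bool)
    d = A.sum(axis=1)  # degrees d_i = sum_j a_ij
    return (1.0 / d[S].sum() + 1.0 / d[~S].sum()) * cut_value(A, S)
```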
8 Graph partitioning: Ncut. Minimizing Ncut is equivalent to
$$\min_{S} \; v^\top (D - A) v \quad \text{subject to} \quad Dv \perp \mathbf{1}, \;\; v^\top D v = \mathrm{vol}(V),$$
where $v \in \mathbb{R}^n$ is defined by
$$v_i = \sqrt{\frac{\mathrm{vol}(S^c)}{\mathrm{vol}(S)}}\, \mathbf{1}[i \in S] - \sqrt{\frac{\mathrm{vol}(S)}{\mathrm{vol}(S^c)}}\, \mathbf{1}[i \in S^c],$$
and
- $A = (a_{ij})$ is the affinity matrix,
- $D = \mathrm{diag}(d_1, \dots, d_n)$ with $d_i = \sum_{j \in V} a_{ij}$,
- $\mathrm{vol}(V) = \sum_{i \in V} d_i$,
- $\mathbf{1} = (1, \dots, 1)$.
9 Relaxation of Ncut.
$$\min_{v \in \mathbb{R}^n} \; v^\top (D - A) v \quad \text{subject to} \quad Dv \perp \mathbf{1}, \;\; v^\top D v = \mathrm{vol}(V).$$
Define the Laplacian matrix $L = D^{-1/2} A D^{-1/2}$. With the change of variable $u = D^{1/2} v$, the relaxed Ncut problem is equivalent to
$$\min_{u \in \mathbb{R}^n} \; u^\top (I - L) u \quad \text{subject to} \quad u \perp D^{1/2} \mathbf{1}, \;\; \|u\|^2 = \mathrm{vol}(V),$$
where $(I - L)\, D^{1/2} \mathbf{1} = 0$. Solution: $u$ is the eigenvector associated with the second smallest eigenvalue of $I - L$.
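A sketch of the relaxed solution, using the fact (slide 10) that the second smallest eigenvector of $I - L$ is the second largest eigenvector of $L$; thresholding $v$ at zero to recover a bipartition is a common heuristic, not something specified on the slides:

```python
import numpy as np

def relaxed_ncut_bipartition(A):
    """Bipartition from the relaxed Ncut solution for affinity matrix A."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]  # L = D^{-1/2} A D^{-1/2}
    # eigh returns eigenvalues in ascending order, so the second largest
    # eigenvector of L (= second smallest of I - L) is column -2
    _, eigvecs = np.linalg.eigh(L)
    u = eigvecs[:, -2]
    v = d_inv_sqrt * u          # undo the change of variable u = D^{1/2} v
    return v > 0                # sign-based split (heuristic)
```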
10 Partitioning into c classes. Compute the c smallest eigenvectors of $I - L$, i.e. the c largest eigenvectors of $L$.
11 Ng, Jordan, Weiss algorithm. Let $X_1, \dots, X_n$ be the points to cluster and $c$ the number of classes.
1. Form
$$a_{ij} = \begin{cases} \exp(-\|X_i - X_j\|^2 / 2\sigma^2) & \text{if } i \neq j \\ 0 & \text{otherwise.} \end{cases}$$
2. Construct $L = D^{-1/2} A D^{-1/2}$, where $D_{ii} = \sum_j a_{ij}$.
3. Compute the $c$ largest eigenvectors $v_1, \dots, v_c$ of $L$ and form $T = [v_1 \dots v_c] \in \mathbb{R}^{n \times c}$.
4. Renormalize $T$ to form $Y_{ij} = T_{ij} / \bigl( \sum_j T_{ij}^2 \bigr)^{1/2}$.
5. Treat each row of $Y$ as a vector in $\mathbb{R}^c$.
6. Cluster the points according to this new representation.
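A compact NumPy rendering of these six steps; using k-means in step 6 follows the original paper's choice, and `sigma` is again a free parameter:

```python
import numpy as np
from sklearn.cluster import KMeans

def njw_spectral_clustering(X, c, sigma=1.0):
    """Ng-Jordan-Weiss spectral clustering, following steps 1-6 above."""
    # 1. Gaussian affinities with zero diagonal
    sq = np.sum((X[:, None] - X[None, :]) ** 2, axis=-1)
    A = np.exp(-sq / (2 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    # 2. Normalized matrix L = D^{-1/2} A D^{-1/2}
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    # 3. c largest eigenvectors of L, stacked as columns of T (n x c)
    _, vecs = np.linalg.eigh(L)
    T = vecs[:, -c:]
    # 4. Renormalize each row of T to unit length
    Y = T / np.linalg.norm(T, axis=1, keepdims=True)
    # 5-6. Treat rows of Y as points in R^c and cluster them
    return KMeans(n_clusters=c, n_init=10).fit_predict(Y)
```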
12 Continuous counterpart. Let $K(x, y) = \exp(-\|x - y\|^2 / 2\sigma^2)$. Consider $L = D^{-1/2} A D^{-1/2}$ as the empirical version of the integral operator with kernel
$$\widetilde{K}(x, y) = \left( \int K(x, z)\, dP(z) \right)^{-1/2} K(x, y) \left( \int K(y, z)\, dP(z) \right)^{-1/2}$$
($P$ unknown). References: 1. [U. von Luxburg, M. Belkin, O. Bousquet]; 2. [L. Rosasco, M. Belkin, E. De Vito].
13 Gram operators. Recall
$$\widetilde{K}(x, y) = \left( \int K(x, z)\, dP(z) \right)^{-1/2} K(x, y) \left( \int K(y, z)\, dP(z) \right)^{-1/2}.$$
By the Moore-Aronszajn theorem, $\widetilde{K}(x, y) = \langle \varphi(x), \varphi(y) \rangle_{\mathcal{H}}$. Define the Gram operator on $\mathcal{H}$ by
$$G v = \int \langle v, \varphi(z) \rangle_{\mathcal{H}}\, \varphi(z)\, dP(z).$$
14 Ideal algorithm. Let $K(x, y) = \exp(-\beta \|x - y\|^2)$.
1. Form $\widetilde{K}(x, y) = \langle \varphi(x), \varphi(y) \rangle_{\mathcal{H}}$.
2. Construct $\widetilde{K}_m(x, y) = \bigl\langle G^{\frac{m-1}{2}} \varphi(x),\, G^{\frac{m-1}{2}} \varphi(y) \bigr\rangle_{\mathcal{H}}$.
3. Renormalize to obtain $\overline{K}_m(x, y) = \widetilde{K}_m(x, x)^{-1/2}\, \widetilde{K}_m(x, y)\, \widetilde{K}_m(y, y)^{-1/2}$.
4. Cluster the points according to this new representation.
Goal: construct an empirical version of this algorithm.
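The following plug-in sketch illustrates the ideal algorithm on a finite sample, under the assumption that $P$ is replaced by the empirical measure: $G$ then acts on the sample as the matrix $\widetilde{K}/n$, so step 2 reduces to a matrix power. This is only a naive stand-in for the robust estimators constructed on the following slides:

```python
import numpy as np

def ideal_representation(X, m, beta=1.0):
    """Plug-in, finite-sample sketch of the ideal algorithm (P replaced by
    the empirical measure, so G becomes K_tilde / n on the sample)."""
    n = len(X)
    sq = np.sum((X[:, None] - X[None, :]) ** 2, axis=-1)
    K = np.exp(-beta * sq)
    # Step 1: K_tilde(x,y) = mu(x)^{-1/2} K(x,y) mu(y)^{-1/2},
    # with mu estimated by the empirical mean of K(x, X_j)
    mu = K.mean(axis=1)
    Kt = K / np.sqrt(np.outer(mu, mu))
    # Step 2: on the sample, G^{m-1} acts as (Kt / n)^{m-1},
    # so K_m = Kt @ (Kt / n)^{m-1}
    Km = Kt @ np.linalg.matrix_power(Kt / n, m - 1)
    # Step 3: diagonal renormalization
    dm = np.sqrt(np.diag(Km))
    return Km / np.outer(dm, dm)
```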
15 Step 1. Recall
$$\widetilde{K}(x, y) = \left( \int K(x, z)\, dP(z) \right)^{-1/2} K(x, y) \left( \int K(y, z)\, dP(z) \right)^{-1/2}.$$
Let $\hat{\mu}(x)$ be any estimator of $\mu(x) = \int K(x, z)\, dP(z)$; an estimator of $\widetilde{K}(x, y)$ is then
$$\hat{K}(x, y) = \hat{\mu}(x)^{-1/2}\, K(x, y)\, \hat{\mu}(y)^{-1/2}.$$
16 By the Moore-Aronszajn theorem, $\hat{K}(x, y) = \langle \hat{\varphi}(x), \hat{\varphi}(y) \rangle_{\hat{\mathcal{H}}}$. Replacing $\widetilde{K}(x, y)$ with $\hat{K}(x, y)$ gives
$$\hat{K}_m(x, y) = \bigl\langle \hat{G}^{\frac{m-1}{2}} \hat{\varphi}(x),\, \hat{G}^{\frac{m-1}{2}} \hat{\varphi}(y) \bigr\rangle_{\hat{\mathcal{H}}},$$
where
$$\hat{G} v = \int \langle v, \hat{\varphi}(z) \rangle_{\hat{\mathcal{H}}}\, \hat{\varphi}(z)\, dP(z) \quad \text{(still unknown!)}.$$
Next step: estimation of Gram operators.
17 Estimation of Gram operators. Goal: estimate
$$G\theta = \int \langle \theta, v \rangle_{\mathcal{H}}\, v\, dP(v), \quad \theta \in \mathcal{H},$$
from $X_1, \dots, X_n$ i.i.d. $\sim P$. Related problem: estimate the quadratic form
$$\langle G\theta, \theta \rangle_{\mathcal{H}} = \int \langle \theta, v \rangle_{\mathcal{H}}^2\, dP(v).$$
Idea: Step 1, work in finite dimension with non-asymptotic, dimension-free bounds; Step 2, generalize to infinite dimension.
18 Finite-dimensional case. Define the Gram matrix $G = \mathbb{E}[X X^\top]$, $X \in \mathbb{R}^d$, $X \sim P$. Goal: estimate the quadratic form $\theta^\top G \theta = \mathbb{E}[\langle \theta, X \rangle^2]$, $\theta \in \mathbb{R}^d$, from $X_1, \dots, X_n$ i.i.d. $\sim P$. The classical empirical estimator is
$$\frac{1}{n} \sum_{i=1}^n \langle \theta, X_i \rangle^2 \;\xrightarrow[\text{law of large numbers}]{}\; \mathbb{E}[\langle \theta, X \rangle^2].$$
We construct instead a robust estimator.
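For contrast with the robust construction that follows, the classical plug-in estimator is a one-liner, and is sensitive to heavy tails of $\langle \theta, X \rangle^2$:

```python
import numpy as np

def empirical_quadratic_form(X, theta):
    """Naive estimator (1/n) sum_i <theta, X_i>^2 of E[<theta, X>^2]."""
    return np.mean((X @ theta) ** 2)
```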
19 To reduce the influence of the tail of the distribution of $\langle \theta, X \rangle^2$, define
$$r_\lambda(\theta) = \frac{1}{n} \sum_{i=1}^n \psi\bigl( \langle \theta, X_i \rangle^2 - \lambda \bigr), \quad \lambda > 0,$$
where $\psi$ is a non-decreasing influence function satisfying
$$-\log\left(1 - t + \frac{t^2}{2}\right) \le \psi(t) \le \log\left(1 + t + \frac{t^2}{2}\right), \quad t \in \mathbb{R}.$$
20 Truncated version of the empirical estimator. With
$$r_\lambda(\theta) = \frac{1}{n} \sum_{i=1}^n \psi\bigl( \langle \theta, X_i \rangle^2 - \lambda \bigr),$$
introduce
$$\hat{\alpha}_\theta = \sup\{ \alpha \in \mathbb{R}_+ : r_\lambda(\alpha \theta) \le 0 \}, \quad \text{so that} \quad r_\lambda(\hat{\alpha}_\theta\, \theta) = 0.$$
The estimator is linked to $\lambda / \hat{\alpha}_\theta^2$.
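A sketch of this truncated estimator, under the reconstruction above: $\psi$ is taken to be the tightest non-decreasing function within the two logarithmic bounds, and $\hat{\alpha}_\theta$ is found by root-finding, since $r_\lambda(\alpha\theta)$ is non-decreasing in $\alpha$. The exact argument of $\psi$ and the bracketing scheme are reconstructions from the slides, not a verbatim implementation:

```python
import numpy as np
from scipy.optimize import brentq

def psi(t):
    """Influence function with -log(1 - t + t^2/2) <= psi(t) <= log(1 + t + t^2/2)."""
    t = np.asarray(t, dtype=float)
    return np.where(t >= 0, np.log1p(t + t * t / 2), -np.log1p(-t + t * t / 2))

def truncated_quadratic_form(X, theta, lam):
    """Estimate E[<theta, X>^2] by lam / alpha_hat^2, where
    r_lambda(alpha_hat * theta) = 0. Assumes lam > 0 and X @ theta != 0."""
    s = (X @ theta) ** 2                          # <theta, X_i>^2
    r = lambda alpha: psi(alpha ** 2 * s - lam).mean()
    hi = 1.0
    while r(hi) < 0:                              # r(0) < 0, so bracket the root
        hi *= 2.0
    alpha_hat = brentq(r, 0.0, hi)
    return lam / alpha_hat ** 2
```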
21 Use a PAC-Bayesian approach to construct a confidence region: with probability $1 - 2\epsilon$, for any $\theta \in S_d$ (the unit sphere of $\mathbb{R}^d$),
$$B_-\bigl(\lambda / \hat{\alpha}_\theta^2\bigr) \le \theta^\top G \theta \le B_+\bigl(\lambda / \hat{\alpha}_\theta^2\bigr).$$
Optimal confidence region: $B_-(\theta) \le \theta^\top G \theta \le B_+(\theta)$. Define as an estimator of $G$
$$\hat{G} = \arg\min\bigl\{ \|H\|_F : H = H^\top,\;\; B_-(\theta) \le \theta^\top H \theta \le B_+(\theta), \;\; \theta \in \Theta_\delta \bigr\},$$
with $\Theta_\delta$ any finite $\delta$-net of $S_d$.
22 Proposition. Notation: $N(\theta) = \theta^\top G \theta$. Let
$$\kappa = \sup_\theta \frac{\mathbb{E}[\langle \theta, X \rangle^4]}{\mathbb{E}[\langle \theta, X \rangle^2]^2} < +\infty.$$
With probability $1 - 2\epsilon$, for any $\theta \in S_d$,
$$\bigl| N(\theta) - \theta^\top \hat{G} \theta \bigr| \le 2 \max\{N(\theta), \sigma\}\, \frac{\mu\bigl(N(\theta)\bigr)}{\bigl(1 - 4 \mu\bigl(N(\theta)\bigr)\bigr)_+} + 7\delta \left( \sqrt{\mathrm{tr}(G^2)} + \sigma \right),$$
where, if $n \ge 2.023\,(\kappa - 1)$,
$$\mu\bigl(N(\theta)\bigr) = \sqrt{\frac{a\, \mathrm{tr}(G)}{n \max\{N(\theta), \sigma\}}} + \frac{b + c\, \kappa\, \mathrm{tr}(G) \log(\epsilon^{-1})}{n \max\{N(\theta), \sigma\}}.$$
The result extends to any Hilbert space, assuming $\mathrm{tr}(G) < +\infty$.
23 Empirical results: a sample in $\mathbb{R}^{10}$ of size $n = 100$ drawn from a Gaussian mixture distribution. Left: projection onto the first two coordinates. Right: projection onto the 2nd and 3rd coordinates.
24 Empirical results (approximation errors): 500 empirical errors sorted in increasing order. Figure: $\|\hat{G} - G\|_F^2$ versus $\|\overline{G} - G\|_F^2$.
25 Infinite-dimensional case. Let $(\mathcal{H}_k)_k$ be an increasing sequence of subspaces of $\mathcal{H}$ with $\dim(\mathcal{H}_k) < +\infty$ and $\overline{\bigcup_k \mathcal{H}_k} = \mathcal{H}$. By a continuity argument, with probability $1 - 2\epsilon$, for any $\theta \in S_{\mathcal{H}}$,
$$B_-(\theta) \le \langle G\theta, \theta \rangle_{\mathcal{H}} \le B_+(\theta).$$
Notation: $V_k = \mathrm{span}\{\Pi_k X_1, \dots, \Pi_k X_n\}$, so $\dim(V_k) < +\infty$. Take $\hat{G}_k : V_k \to V_k$ such that $\mathrm{tr}(\hat{G}_k^2) \le \mathrm{tr}(G^2)$ and
$$B_-(\theta) \le \langle \hat{G}_k \theta, \theta \rangle_{\mathcal{H}} \le B_+(\theta), \quad \theta \in \Theta_\delta \subset S_{\mathcal{H}} \cap V_k.$$
Define $Q = \hat{G}_k\, \Pi_{V_k}$.
26 Proposition. Notation: $N(\theta) = \langle G\theta, \theta \rangle_{\mathcal{H}}$. Let
$$\kappa = \sup_\theta \frac{\mathbb{E}[\langle \theta, X \rangle_{\mathcal{H}}^4]}{\mathbb{E}[\langle \theta, X \rangle_{\mathcal{H}}^2]^2} < +\infty.$$
With probability at least $1 - 2\epsilon$, for any $\theta \in S_{\mathcal{H}}$,
$$\bigl| N(\theta) - \langle Q\theta, \theta \rangle_{\mathcal{H}} \bigr| \le 2 \max\{N(\theta), \sigma\}\, \frac{\mu\bigl(N(\theta)\bigr)}{\bigl(1 - 4 \mu\bigl(N(\theta)\bigr)\bigr)_+} + 7\delta \left( \sqrt{\mathrm{tr}(G^2)} + \sigma \right) + v_k,$$
where $v_k \to 0$ as $k \to +\infty$ and, if $n \ge 2.023\,(\kappa - 1)$, $\mu\bigl(N(\theta)\bigr)$ is as in the finite-dimensional proposition.
27 Final representation.
$$\overline{K}_m(x, y) = \hat{K}_m(x, x)^{-1/2}\, \hat{K}_m(x, y)\, \hat{K}_m(y, y)^{-1/2}.$$
In this way, the smallest eigenvalues are killed, which yields a natural dimensionality reduction and an automatic estimate of the number of classes. Moreover, the clusters are sent to the vertices of a simplex.
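One way to read the "automatic estimate of the number of classes" claim, sketched as a heuristic (the threshold `tol` is an assumption, not given on the slides): after renormalization, the spectrum of the final kernel matrix separates the few eigenvalues carrying cluster structure from the rest, so counting the large ones estimates the number of clusters.

```python
import numpy as np

def estimate_num_classes(K_bar, tol=0.1):
    """Heuristic: count eigenvalues of K_bar / n above tol * (largest one).
    tol is an assumed tuning parameter."""
    vals = np.linalg.eigvalsh(K_bar / len(K_bar))
    return int(np.sum(vals > tol * vals.max()))
```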
28-33 Empirical results ($n = 900$): figures.
34 Bibliography
- I. Giulini, Generalization bounds for random samples in Hilbert spaces, PhD thesis.
- O. Catoni, Estimating the Gram matrix through PAC-Bayes bounds, preprint.
- O. Catoni, Challenging the empirical mean and empirical variance: a deviation study, Ann. Inst. H. Poincaré Probab. Statist. 48(4) (2012).
- A. Ng, M. Jordan, Y. Weiss, On spectral clustering: analysis and an algorithm, Advances in Neural Information Processing Systems (2001).
- L. Rosasco, M. Belkin, E. De Vito, On learning with integral operators, J. Mach. Learn. Res. (2010).
- J. Shi, J. Malik, Normalized cuts and image segmentation, IEEE Trans. Pattern Anal. Mach. Intell. 22(8) (2000).
- M. Stoer, F. Wagner, A simple min-cut algorithm, J. ACM (1997).
- U. von Luxburg, M. Belkin, O. Bousquet, Consistency of spectral clustering, Ann. Statist. (2008).