Global vs. Multiscale Approaches
1 Harmonic Analysis on Graphs: Global vs. Multiscale Approaches. Weizmann Institute of Science, Rehovot, Israel, July 2011. Joint work with Matan Gavish (WIS/Stanford) and Ronald Coifman (Yale); ICML 2010.
3 Challenge: Organization / Understanding of Data. In many fields massive amounts of data are collected or generated. Examples: financial data, multi-sensor data, simulations, documents, web pages, images, video streams, medical data, astrophysical data, etc. Need to organize / understand their structure; inference / learning from data.
7 Classical vs. Modern Data Analysis: Setup and Tasks. Classical setup: data typically in a (low-dimensional) Euclidean space; small to medium sample sizes (n < 1000); either all data unlabeled (unsupervised) or all labeled (supervised). Modern setup: high-dimensional data, or data encoded as a graph; huge datasets, with n = 10^6 samples or more; few labeled data. The well-developed and well-understood standard tools of statistics are not always applicable. Question: how to do harmonic analysis in such settings?
10 Harmonic Analysis and Learning. In the past 20 years (multiscale) harmonic analysis has had a profound impact on statistics and signal processing. Well-developed theory, long tradition: the geometry of a space X yields bases for {f : X → R}. Simplest example: X = [0, 1]. Construct a (multiscale) basis {ψ_i} that allows control of ⟨f, ψ_i⟩ for some (smooth) class of functions f. Theorem: let ψ be a smooth wavelet, {ψ_{l,k}} the corresponding wavelet basis, and f : [0, 1] → R. Then |f(x) − f(y)| ≤ C|x − y|^α ⟺ |⟨f, ψ_{l,k}⟩| ≤ C' 2^{−l(α+1/2)}.
15 Setting. Harmonic analysis wisdom for f : R^d → R: expand f in an orthonormal basis {ψ_i} (e.g. wavelets), f = Σ_i ⟨f, ψ_i⟩ ψ_i; shrink / estimate the coefficients. Useful when f is well approximated by a few terms (fast coefficient decay / sparsity). Can this work on general datasets? Need a data-adaptive basis {ψ_i} for the space of functions f : X → R.
16 Harmonic Analysis on Graphs. Setup: we are given a dataset X = {x_1, …, x_N} with similarity / affinity matrix W_{i,j}. Goal: statistical inference of a smooth f : X → R. Denoise f; SSL / regression / classification: extend f from a labeled subset of X to all of X.
18 Semi-Supervised Learning. In many applications it is easy to collect lots of unlabeled data, BUT labeling the data is expensive. Question: given a (small) labeled set {(x_i, y_i)} ⊂ X, with y = f(x), construct f̂ to label the rest of X. Key assumption: the function f(x) has some smoothness w.r.t. the graph affinities W_{i,j}; otherwise unlabeled data is useless or may even harm prediction.
19 The Graph Laplacian. In most previous approaches the key object is the GRAPH LAPLACIAN: L = D − W, where D_{ii} = Σ_j W_{ij}.
20 Global Graph-Laplacian-Based SSL Methods. Zhu, Ghahramani, Lafferty, SSL using Gaussian fields and harmonic functions [ICML 2003]; Azran [ICML 2007]: f̂ = argmin_{f(x_i)=y_i} (1/n²) Σ_{i,j=1}^n W_{i,j}(f_i − f_j)² = argmin_{f(x_i)=y_i} (1/n²) f^T L f. Y. Bengio, O. Delalleau, N. Le Roux [2006]; D. Zhou, O. Bousquet, T. Navin Lal, J. Weston, B. Schölkopf [NIPS 2004]: f̂ = argmin (1/n²) Σ_{i,j=1}^n W_{i,j}(f_i − f_j)² + λ Σ_{i=1}^l (f_i − y_i)².
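The first (interpolated) objective above has a closed-form solution on the unlabeled points. A minimal numpy sketch, with function and variable names that are ours, not from the slides:

```python
import numpy as np

def harmonic_ssl(W, labeled_idx, y_labeled):
    """Gaussian-fields / harmonic-function SSL, in the spirit of Zhu et al. (2003).

    Minimizes sum_ij W_ij (f_i - f_j)^2 subject to f_i = y_i on the labeled
    points; the minimizer satisfies L_uu f_u = W_ul y_l on the unlabeled block.
    """
    n = W.shape[0]
    D = np.diag(W.sum(axis=1))
    L = D - W                                     # unnormalized graph Laplacian
    u = np.setdiff1d(np.arange(n), labeled_idx)   # unlabeled indices
    f = np.zeros(n)
    f[labeled_idx] = y_labeled
    # Since L_ul = -W_ul (D is diagonal), stationarity gives f_u = L_uu^{-1} W_ul y_l.
    f[u] = np.linalg.solve(L[np.ix_(u, u)], W[np.ix_(u, labeled_idx)] @ y_labeled)
    return f
```

On a chain graph with the two endpoints labeled 0 and 1, the interior values interpolate linearly, which illustrates why the method is well posed in 1-d.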
21 Global Graph-Laplacian-Based SSL. Belkin & Niyogi (2003): given the similarity matrix W, find the first few eigenvectors of the graph Laplacian, L e_j = (D − W) e_j = λ_j e_j. Expand f̂ = Σ_{j=1}^p a_j e_j and estimate the coefficients a_j from the labeled data.
22 Statistical Analysis of Laplacian Regularization [N., Srebro, Zhou, NIPS 2009]. Theorem: in the limit of large unlabeled data from Euclidean space, with W_{i,j} = K(‖x_i − x_j‖), graph Laplacian regularization methods f̂ = argmin (1/n²) Σ_{i,j=1}^n W_{i,j}(f_i − f_j)² + λ Σ_{i=1}^l (f_i − y_i)² are well posed for underlying data in 1-d, but are not well posed for data in dimension d ≥ 2. In particular, in the limit of infinite unlabeled data, f(x) → const at all unlabeled x.
23 Eigenvector (Fourier) Methods. Belkin, Niyogi [2003] suggested a different approach based on the graph Laplacian L = W − D: given the similarity W, find the first few eigenvectors (W − D) e_j = λ_j e_j; expand ŷ(x) = Σ_{j=1}^p a_j e_j; find the coefficients a_j by least squares, (â_1, …, â_p) = argmin Σ_{j=1}^l (y_j − ŷ(x_j))².
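The eigenbasis method above is a short computation in numpy. A sketch under our naming, using the unnormalized Laplacian: expand the labels in the p smoothest eigenvectors and fit the coefficients by least squares on the labeled points only.

```python
import numpy as np

def eigenbasis_ssl(W, labeled_idx, y_labeled, p):
    """Laplacian-eigenbasis SSL in the spirit of Belkin & Niyogi (2003)."""
    D = np.diag(W.sum(axis=1))
    L = D - W
    _, eigvecs = np.linalg.eigh(L)      # eigh returns eigenvalues in ascending order
    E = eigvecs[:, :p]                  # the p smoothest eigenvectors, as columns
    # Least-squares fit of the expansion coefficients on the labeled rows.
    a, *_ = np.linalg.lstsq(E[labeled_idx], y_labeled, rcond=None)
    return E @ a                        # extended label function on all of X
```

The number of eigenvectors p acts as the regularization knob: too few eigenvectors cannot represent a non-smooth label function, too many overfit the few labels.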
27 Toy Example: USPS benchmark. X is USPS (an ML benchmark) as 1500 vectors in R^256. Affinity W_{i,j} = exp(−‖x_i − x_j‖²). f : X → {−1, 1} is the class label.
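The affinity on this slide is a two-line computation; the bandwidth parameter sigma below is our addition (the slide fixes it to 1):

```python
import numpy as np

def gaussian_affinity(X, sigma=1.0):
    """W_ij = exp(-||x_i - x_j||^2 / sigma^2) for rows x_i of the data matrix X."""
    # Pairwise squared Euclidean distances via broadcasting.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / sigma**2)
```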
28 Toy example: visualization by kernel PCA
35 Toy example: prior art? Generalizing Fourier: the graph Laplacian eigenbasis. Take (W − D) ψ_i = λ_i ψ_i, where D_{i,i} = Σ_j W_{i,j} (Belkin, Niyogi 2004, and others). In the Euclidean setting, the typical coefficient decay rate in the Fourier basis is polynomial, but in wavelet bases it is exponential.
36 Toy example: Graph Laplacian Eigenbasis
44 Challenge. Challenge: build multiscale "wavelet-like" bases on X. 1. Construct an adaptive multiscale basis on a general dataset. 2. Investigate: coefficient ⟨f, ψ_i⟩ decay rate ⟺ f smooth. Previous works: Diffusion Wavelets [Coifman and Maggioni]; multiscale methods for data on graphs [Jansen, Nason, Silverman]; wavelets on graphs via spectral graph theory [Hammond et al.]. Our work: a (relatively) simple construction with accompanying theory.
49 Toy example: an intriguing experiment. [Plot: sorted |coefficients| on a log10 scale, comparing the graph Laplacian eigenbasis with a Haar-like basis.]
58 Challenge and Results. Challenge: build "wavelet" bases on X. 1. Construct an adaptive multiscale basis on a general dataset. 2. Investigate: coefficient ⟨f, ψ_i⟩ decay rate ⟺ f smooth. Key results: 1. A partition tree on X induces wavelet (Haar-like) bases. 2. Assuming a balanced tree, f smooth ⟺ fast coefficient decay. 3. A novel SSL scheme with learning guarantees, assuming smooth functions on the tree.
59 The Haar Basis on [0, 1]
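The classical picture on this slide can be reproduced numerically with a discrete Haar transform: a smooth signal on a dyadic grid of [0, 1] has rapidly decaying detail coefficients. A minimal sketch (the discretization choices here are ours):

```python
import numpy as np

def haar_details(signal):
    """Orthonormal discrete Haar transform of a signal of length 2^L.

    Returns a list of detail-coefficient arrays <f, psi_{l,k}>, finest scale
    first; the decay of these across scales is what the wavelet
    characterization of smoothness is about.
    """
    a = np.asarray(signal, dtype=float)
    details = []
    while len(a) > 1:
        even, odd = a[0::2], a[1::2]
        details.append((even - odd) / np.sqrt(2))  # detail coefficients at this scale
        a = (even + odd) / np.sqrt(2)              # averages feed the next, coarser scale
    return details
```

For a constant signal every detail coefficient vanishes, since each Haar function has zero mean; this is the extreme case of the decay bound on the previous theorem slide.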
70 Hierarchical partition of X
75 Partition Tree (Dendrogram)
84 Partition Tree → Haar-like Basis. Simple observation [Lee, N., Wasserman, AOAS 2008]: a hierarchical tree induces a multi-resolution analysis of the space of functions, and hence a Haar-like multiscale basis. V = {f | f : X → R}; V_l = {f | f is constant on the partitions at level l}; V_1 ⊂ V_2 ⊂ … ⊂ V_L = V.
85 Partition Tree Haar-like basis
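One concrete way to realize a Haar-like basis from a partition tree, restricted for simplicity to binary trees (the construction on the slides allows arbitrary branching; the data structure and names below are illustrative):

```python
import numpy as np

def haar_like_basis(tree, n):
    """Build a Haar-like orthonormal basis from a binary partition tree of n points.

    `tree` is a nested structure: a leaf is a list of point indices, an internal
    node is a pair (left_subtree, right_subtree). Each internal node with child
    supports A, B contributes one zero-mean, unit-norm vector, constant on A
    and on B and zero elsewhere.
    """
    def leaves(node):
        if isinstance(node, tuple):
            return leaves(node[0]) + leaves(node[1])
        return list(node)

    basis = [np.ones(n) / np.sqrt(n)]           # "father" function: global constant
    def visit(node):
        if not isinstance(node, tuple):
            return
        A, B = leaves(node[0]), leaves(node[1])
        psi = np.zeros(n)
        psi[A] = np.sqrt(len(B) / (len(A) * (len(A) + len(B))))
        psi[B] = -np.sqrt(len(A) / (len(B) * (len(A) + len(B))))
        basis.append(psi)                       # zero mean and unit norm by construction
        visit(node[0]); visit(node[1])
    visit(tree)
    return np.array(basis)                      # rows form an orthonormal set
```

On the balanced tree over {0,1,2,3} with splits {0,1} | {2,3}, {0} | {1}, {2} | {3}, this recovers exactly the classical Haar basis on four points.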
96 Toy example: Haar-like basis function
100 Smoothness ⟺ Coefficient Decay. How to define smoothness? The partition tree induces a tree metric d(x_i, x_j) [an ultrametric]; this is a particular example of a space of homogeneous type [Coifman & Weiss]. Measure the smoothness of f : X → R in the tree metric. Theorem: let f : X → R. Then |f(x_i) − f(x_j)| ≤ C d(x_i, x_j)^α ⟺ |⟨f, ψ_{l,i}⟩| ≤ C' q^{−l(α+1/2)}, where q measures the tree balance and α measures function smoothness w.r.t. the tree.
101 Application: SSL. Given a dataset X with weighted graph G, similarity matrix W, and labeled points {(x_i, y_i)}: 1. Construct a balanced hierarchical tree of the graph. 2. Construct the corresponding Haar-like basis. 3. Estimate the coefficients from the labeled points.
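Step 3 of the scheme can be sketched once a Haar-like basis matrix is in hand. Here the coefficients are fit by least squares on the labeled rows; the paper's actual estimator works coefficient by coefficient and comes with guarantees, so treat this as an illustration only:

```python
import numpy as np

def estimate_and_extend(basis, labeled_idx, y_labeled):
    """Least-squares sketch of coefficient estimation from labeled points.

    `basis` has orthonormal basis vectors as rows (e.g. a Haar-like basis
    induced by a partition tree); we fit expansion coefficients on the
    labeled points and evaluate the expansion on the whole dataset.
    """
    B = basis.T                                   # columns = basis functions
    a, *_ = np.linalg.lstsq(B[labeled_idx], y_labeled, rcond=None)
    return B @ a                                  # extended labels on all of X
```

In practice one would also restrict to the coarsest basis functions or threshold the fitted coefficients, exactly the wavelet-shrinkage step the theory above licenses.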
102 Toy Example: Benchmarks. [Plot: test error (%) vs. number of labeled points (out of 1500), comparing Laplacian eigenmaps, Laplacian regularization, adaptive-threshold Haar-like basis, and the state of the art.]
103 Toy Example: MNIST 8 vs. {3,4,5,7}. [Plot: test error (%) vs. number of labeled points (out of 1000) for the same four methods.]
110 Summary. 1. A balanced partition tree induces a useful wavelet basis. 2. Designing the basis allows a theory connecting coefficient decay, smoothness, and learnability. 3. The wavelet arsenal becomes available: wavelet shrinkage, etc. 4. Interesting harmonic analysis. Promising for data analysis? 5. Computational experiments motivate theory, and vice versa. The End. nadler
On the Phase Transition Phenomenon of Graph Laplacian Eigenfunctions on Trees Naoki Saito and Ernest Woei Department of Mathematics University of California Davis, CA 9566 USA Email: saito@math.ucdavis.edu;
More informationMulti-View Point Cloud Kernels for Semi-Supervised Learning
Multi-View Point Cloud Kernels for Semi-Supervised Learning David S. Rosenberg, Vikas Sindhwani, Peter L. Bartlett, Partha Niyogi May 29, 2009 Scope In semi-supervised learning (SSL), we learn a predictive
More informationManifold Denoising. Matthias Hein Markus Maier Max Planck Institute for Biological Cybernetics Tübingen, Germany
Manifold Denoising Matthias Hein Markus Maier Max Planck Institute for Biological Cybernetics Tübingen, Germany {first.last}@tuebingen.mpg.de Abstract We consider the problem of denoising a noisily sampled
More informationSpectral Hashing. Antonio Torralba 1 1 CSAIL, MIT, 32 Vassar St., Cambridge, MA Abstract
Spectral Hashing Yair Weiss,3 3 School of Computer Science, Hebrew University, 9904, Jerusalem, Israel yweiss@cs.huji.ac.il Antonio Torralba CSAIL, MIT, 32 Vassar St., Cambridge, MA 0239 torralba@csail.mit.edu
More informationSparse Approximation: from Image Restoration to High Dimensional Classification
Sparse Approximation: from Image Restoration to High Dimensional Classification Bin Dong Beijing International Center for Mathematical Research Beijing Institute of Big Data Research Peking University
More informationWavelets on Graphs, an Introduction
Wavelets on Graphs, an Introduction Pierre Vandergheynst and David Shuman Ecole Polytechnique Fédérale de Lausanne (EPFL) Signal Processing Laboratory {pierre.vandergheynst,david.shuman}@epfl.ch Université
More informationSpectral Techniques for Clustering
Nicola Rebagliati 1/54 Spectral Techniques for Clustering Nicola Rebagliati 29 April, 2010 Nicola Rebagliati 2/54 Thesis Outline 1 2 Data Representation for Clustering Setting Data Representation and Methods
More informationA Multiscale Framework for Markov Decision Processes using Diffusion Wavelets
A Multiscale Framework for Markov Decision Processes using Diffusion Wavelets Mauro Maggioni Program in Applied Mathematics Department of Mathematics Yale University New Haven, CT 6 mauro.maggioni@yale.edu
More informationPredicting Graph Labels using Perceptron. Shuang Song
Predicting Graph Labels using Perceptron Shuang Song shs037@eng.ucsd.edu Online learning over graphs M. Herbster, M. Pontil, and L. Wainer, Proc. 22nd Int. Conf. Machine Learning (ICML'05), 2005 Prediction
More informationFast Direct Policy Evaluation using Multiscale Analysis of Markov Diffusion Processes
Fast Direct Policy Evaluation using Multiscale Analysis of Markov Diffusion Processes Mauro Maggioni mauro.maggioni@yale.edu Department of Mathematics, Yale University, P.O. Box 88, New Haven, CT,, U.S.A.
More informationDimensionality Reduction:
Dimensionality Reduction: From Data Representation to General Framework Dong XU School of Computer Engineering Nanyang Technological University, Singapore What is Dimensionality Reduction? PCA LDA Examples:
More informationManifold Regularization
Manifold Regularization Vikas Sindhwani Department of Computer Science University of Chicago Joint Work with Mikhail Belkin and Partha Niyogi TTI-C Talk September 14, 24 p.1 The Problem of Learning is
More informationWeighted Nonlocal Laplacian on Interpolation from Sparse Data
Noname manuscript No. (will be inserted by the editor) Weighted Nonlocal Laplacian on Interpolation from Sparse Data Zuoqiang Shi Stanley Osher Wei Zhu Received: date / Accepted: date Abstract Inspired
More informationIntegrating Global and Local Structures: A Least Squares Framework for Dimensionality Reduction
Integrating Global and Local Structures: A Least Squares Framework for Dimensionality Reduction Jianhui Chen, Jieping Ye Computer Science and Engineering Department Arizona State University {jianhui.chen,
More informationMultiscale Analysis and Diffusion Semigroups With Applications
Multiscale Analysis and Diffusion Semigroups With Applications Karamatou Yacoubou Djima Advisor: Wojciech Czaja Norbert Wiener Center Department of Mathematics University of Maryland, College Park http://www.norbertwiener.umd.edu
More informationOnline Manifold Regularization: A New Learning Setting and Empirical Study
Online Manifold Regularization: A New Learning Setting and Empirical Study Andrew B. Goldberg 1, Ming Li 2, Xiaojin Zhu 1 1 Computer Sciences, University of Wisconsin Madison, USA. {goldberg,jerryzhu}@cs.wisc.edu
More informationImproved Local Coordinate Coding using Local Tangents
Improved Local Coordinate Coding using Local Tangents Kai Yu NEC Laboratories America, 10081 N. Wolfe Road, Cupertino, CA 95129 Tong Zhang Rutgers University, 110 Frelinghuysen Road, Piscataway, NJ 08854
More informationNeural Networks, Convexity, Kernels and Curses
Neural Networks, Convexity, Kernels and Curses Yoshua Bengio Work done with Nicolas Le Roux, Olivier Delalleau and Hugo Larochelle August 26th 2005 Perspective Curse of Dimensionality Most common non-parametric
More informationDiffusion/Inference geometries of data features, situational awareness and visualization. Ronald R Coifman Mathematics Yale University
Diffusion/Inference geometries of data features, situational awareness and visualization Ronald R Coifman Mathematics Yale University Digital data is generally converted to point clouds in high dimensional
More informationCluster Kernels for Semi-Supervised Learning
Cluster Kernels for Semi-Supervised Learning Olivier Chapelle, Jason Weston, Bernhard Scholkopf Max Planck Institute for Biological Cybernetics, 72076 Tiibingen, Germany {first. last} @tuebingen.mpg.de
More informationMultiscale Manifold Learning
Multiscale Manifold Learning Chang Wang IBM T J Watson Research Lab Kitchawan Rd Yorktown Heights, New York 598 wangchan@usibmcom Sridhar Mahadevan Computer Science Department University of Massachusetts
More informationMetabolic networks: Activity detection and Inference
1 Metabolic networks: Activity detection and Inference Jean-Philippe.Vert@mines.org Ecole des Mines de Paris Computational Biology group Advanced microarray analysis course, Elsinore, Denmark, May 21th,
More informationSimilarity and kernels in machine learning
1/31 Similarity and kernels in machine learning Zalán Bodó Babeş Bolyai University, Cluj-Napoca/Kolozsvár Faculty of Mathematics and Computer Science MACS 2016 Eger, Hungary 2/31 Machine learning Overview
More informationMachine Learning: Basis and Wavelet 김화평 (CSE ) Medical Image computing lab 서진근교수연구실 Haar DWT in 2 levels
Machine Learning: Basis and Wavelet 32 157 146 204 + + + + + - + - 김화평 (CSE ) Medical Image computing lab 서진근교수연구실 7 22 38 191 17 83 188 211 71 167 194 207 135 46 40-17 18 42 20 44 31 7 13-32 + + - - +
More informationLecture: Some Practical Considerations (3 of 4)
Stat260/CS294: Spectral Graph Methods Lecture 14-03/10/2015 Lecture: Some Practical Considerations (3 of 4) Lecturer: Michael Mahoney Scribe: Michael Mahoney Warning: these notes are still very rough.
More informationThe Curse of Dimensionality for Local Kernel Machines
The Curse of Dimensionality for Local Kernel Machines Yoshua Bengio, Olivier Delalleau, Nicolas Le Roux Dept. IRO, Université de Montréal P.O. Box 6128, Downtown Branch, Montreal, H3C 3J7, Qc, Canada {bengioy,delallea,lerouxni}@iro.umontreal.ca
More informationStable MMPI-2 Scoring: Introduction to Kernel. Extension Techniques
1 Stable MMPI-2 Scoring: Introduction to Kernel Extension Techniques Liberty,E., Almagor,M., Zucker,S., Keller,Y., and Coifman,R.R. Abstract The current study introduces a new technique called Geometric
More informationGraph-Laplacian PCA: Closed-form Solution and Robustness
2013 IEEE Conference on Computer Vision and Pattern Recognition Graph-Laplacian PCA: Closed-form Solution and Robustness Bo Jiang a, Chris Ding b,a, Bin Luo a, Jin Tang a a School of Computer Science and
More informationDetecting Sparse Structures in Data in Sub-Linear Time: A group testing approach
Detecting Sparse Structures in Data in Sub-Linear Time: A group testing approach Boaz Nadler The Weizmann Institute of Science Israel Joint works with Inbal Horev, Ronen Basri, Meirav Galun and Ery Arias-Castro
More informationAn Iterated Graph Laplacian Approach for Ranking on Manifolds
An Iterated Graph Laplacian Approach for Ranking on Manifolds Xueyuan Zhou Department of Computer Science University of Chicago Chicago, IL zhouxy@cs.uchicago.edu Mikhail Belkin Department of Computer
More informationValue Function Approximation with Diffusion Wavelets and Laplacian Eigenfunctions
Value Function Approximation with Diffusion Wavelets and Laplacian Eigenfunctions Sridhar Mahadevan Department of Computer Science University of Massachusetts Amherst, MA 13 mahadeva@cs.umass.edu Mauro
More informationScale-Invariance of Support Vector Machines based on the Triangular Kernel. Abstract
Scale-Invariance of Support Vector Machines based on the Triangular Kernel François Fleuret Hichem Sahbi IMEDIA Research Group INRIA Domaine de Voluceau 78150 Le Chesnay, France Abstract This paper focuses
More informationClustering in kernel embedding spaces and organization of documents
Clustering in kernel embedding spaces and organization of documents Stéphane Lafon Collaborators: Raphy Coifman (Yale), Yosi Keller (Yale), Ioannis G. Kevrekidis (Princeton), Ann B. Lee (CMU), Boaz Nadler
More informationAppendix to: On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts
Appendix to: On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts Hariharan Narayanan Department of Computer Science University of Chicago Chicago IL 6637 hari@cs.uchicago.edu
More informationTreelets An Adaptive Multi-Scale Basis for Sparse Unordered Data
Treelets An Adaptive Multi-Scale Basis for Sparse Unordered Data arxiv:0707.0481v1 [stat.me] 3 Jul 2007 Ann B Lee, Boaz Nadler, and Larry Wasserman Department of Statistics (ABL, LW) Carnegie Mellon University
More informationL 2,1 Norm and its Applications
L 2, Norm and its Applications Yale Chang Introduction According to the structure of the constraints, the sparsity can be obtained from three types of regularizers for different purposes.. Flat Sparsity.
More information