Spectral Clustering of Polarimetric SAR Data With Wishart-Derived Distance Measures

1 Spectral Clustering of Polarimetric SAR Data With Wishart-Derived Distance Measures Stian Normann Anfinsen, Robert Jenssen, Torbjørn Eltoft Computational Earth Observation and Machine Learning Laboratory, Department of Physics and Technology, University of Tromsø, Norway 1/54

2 Outline Motivation Introduction to Spectral Clustering Distance Measures for PolSAR Covariance Matrices A New Algorithm Results Conclusions and Future Work 2/54

3 Motivation Seeking (near) optimal statistical classification Disregarding covariance matrix structure (decomposition theory) and spatial information - for now Improve on the Wishart classifier Lee et al. (IJRS, 1994), Lee et al. (TGRS, 1999), Pottier & Lee (EUSAR, 2000),... Apply modern pattern recognition tools Kernel methods, spectral clustering, information theoretic learning 3/54

4 The Wishart Classifier Revisited Initialisation: Segmentation in H/A/α space Cloude-Pottier-Wishart (CPW) classifier Class mean coherency matrices $V_i$ calculated from initial partitioning of data: $V_i = \langle T_j \mid \text{pixel } j \in \text{class } i \rangle$, $i = 1,\dots,k$, where $T_j = \langle \mathbf{k}\mathbf{k}^H \rangle$ and $\mathbf{k} = \frac{1}{\sqrt{2}}\,[S_{hh}+S_{vv},\ S_{hh}-S_{vv},\ 2S_{hv}]^T$. 4/54

5 The Wishart Classifier Revisited Initialisation: Segmentation in H/A/α space Cloude-Pottier-Wishart (CPW) classifier Class mean coherency matrices $V_i$ calculated from initial partitioning of data Iterative classification Minimum distance classification based on Wishart distance between the pixel coherency matrix T and $V_i$: assign T to class $\omega_j$ where $d_W(T, V_j) = \min_i d_W(T, V_i)$, $i \in \{1,\dots,k\}$ Iterative reclassification and update of class means 5/54
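
To make the iteration above concrete, here is a minimal NumPy sketch of the minimum-distance Wishart classification and class-mean update steps, assuming the coherency matrices are stored as complex arrays; the helper names (`wishart_distance`, `wishart_classify`, `update_class_means`) are illustrative, not taken from the original implementation.

```python
import numpy as np

def wishart_distance(T, V):
    """Wishart distance d_W(T, V) = ln|V| + tr(V^{-1} T)."""
    _, logdet = np.linalg.slogdet(V)
    return logdet + np.trace(np.linalg.solve(V, T)).real

def wishart_classify(T_pixels, V_classes):
    """Assign each pixel coherency matrix to the class mean with minimum Wishart distance."""
    labels = np.empty(len(T_pixels), dtype=int)
    for n, T in enumerate(T_pixels):
        d = [wishart_distance(T, V) for V in V_classes]
        labels[n] = int(np.argmin(d))
    return labels

def update_class_means(T_pixels, labels, k):
    """Recompute the class mean coherency matrices V_i from the current labelling."""
    return [np.mean(T_pixels[labels == i], axis=0) for i in range(k)]
```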

6 The Wishart Classifier Revisited Delivers consistently good results. Few parameters, easy to use, computationally efficient, approaches an ML solution - if it converges. But has some drawbacks: The initialisation uses a fixed number of classes, and is restricted to one class per predetermined zone in H/A/α space. Inherits the well known disadvantages of k-means. E.g., convergence is not guaranteed, and may be slow. Conclusion: State of the art algorithms from pattern recognition and machine learning should be tested. 6/54

7 Clustering by Pairwise Affinities Based on distances $d_{ij}$ between all pixel pairs $(i,j)$. Propagates similarity from pixel to pixel. Yields flexible discrimination surfaces. Nonlinear mapping to kernel space, where clustering is done with linear methods. The mapping is found by eigendecomposition. Examples of capabilities (figure: input space vs. kernel space). 7/54

8 Spectral Clustering Pairwise distances $d_{ij}$ are transformed to affinities, e.g.: $a_{ij} = \exp\!\left\{-\frac{d_{ij}^2}{2\sigma^2}\right\}$ 8/54

9 Spectral Clustering Pairwise distances $d_{ij}$ are transformed to affinities, e.g.: $a_{ij} = \exp\!\left\{-\frac{d_{ij}^2}{2\sigma^2}\right\}$ Pairwise affinities $a_{ij}$ between N data points are stored in an affinity matrix A: $A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1N} \\ a_{21} & a_{22} & \cdots & a_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ a_{N1} & a_{N2} & \cdots & a_{NN} \end{bmatrix}$ 9/54

10 Spectral Clustering Pairwise distances $d_{ij}$ are transformed to affinities, e.g.: $a_{ij} = \exp\!\left\{-\frac{d_{ij}^2}{2\sigma^2}\right\}$ The optimal data partitioning is derived from the eigendecomposition of A. Hence, spectral clustering. Pairwise affinities $a_{ij}$ between N data points are stored in an affinity matrix A: $A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1N} \\ a_{21} & a_{22} & \cdots & a_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ a_{N1} & a_{N2} & \cdots & a_{NN} \end{bmatrix}$ 10/54

11 Spectral Clustering Pairwise distances $d_{ij}$ are transformed to affinities, e.g.: $a_{ij} = \exp\!\left\{-\frac{d_{ij}^2}{2\sigma^2}\right\}$ Pairwise affinities $a_{ij}$ between N data points are stored in an affinity matrix A: $A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1N} \\ a_{21} & a_{22} & \cdots & a_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ a_{N1} & a_{N2} & \cdots & a_{NN} \end{bmatrix}$ The optimal data partitioning is derived from the eigendecomposition of A. Hence, spectral clustering. There are different ways of using the eigenvalues and eigenvectors of A to obtain an optimal clustering. 11/54

12 Spectral Clustering Pairwise distances $d_{ij}$ are transformed to affinities, e.g.: $a_{ij} = \exp\!\left\{-\frac{d_{ij}^2}{2\sigma^2}\right\}$ Pairwise affinities $a_{ij}$ between N data points are stored in an affinity matrix A: $A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1N} \\ a_{21} & a_{22} & \cdots & a_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ a_{N1} & a_{N2} & \cdots & a_{NN} \end{bmatrix}$ The optimal data partitioning is derived from the eigendecomposition of A. Hence, spectral clustering. E.g.: Using the u eigenvectors corresponding to the largest eigenvalues gives a new u-dimensional feature space (eigenspace): $\begin{bmatrix} e_1^T \\ e_2^T \\ \vdots \\ e_u^T \end{bmatrix} = [\phi_1\ \phi_2\ \cdots\ \phi_N]$ 12/54
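
As an illustration of the steps above (distances to affinities, affinity matrix, eigendecomposition), here is a minimal NumPy sketch; the function names are illustrative, and the eigenspace features are returned with one column per data point, matching the row-stacking of $e_1^T,\dots,e_u^T$ on the slide.

```python
import numpy as np

def affinity_matrix(D, sigma):
    """Gaussian affinity a_ij = exp(-d_ij^2 / (2 sigma^2)) from a pairwise distance matrix D."""
    return np.exp(-D**2 / (2.0 * sigma**2))

def eigenspace_features(A, u):
    """Map the N points to a u-dimensional eigenspace spanned by the leading eigenvectors of A.

    Returns the u largest eigenvalues and a (u, N) array whose column n is the
    eigenspace feature phi_n of point n (rows are e_1^T, ..., e_u^T).
    """
    eigvals, eigvecs = np.linalg.eigh(A)           # A is symmetric
    order = np.argsort(eigvals)[::-1][:u]          # indices of the u largest eigenvalues
    return eigvals[order], eigvecs[:, order].T
```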

13 Spectral Clustering We have a mapping from input feature space to eigenspace: $\Phi(T_i): T_i \mapsto \phi_i$ 13/54

14 Spectral Clustering We have a mapping from input feature space to eigenspace: $\Phi(T_i): T_i \mapsto \phi_i$ The eigenspace feature set can be clustered by simple, linear discrimination methods, e.g. k-means with Euclidean distance. 14/54

15 Spectral Clustering We have a mapping from input feature space to eigenspace: $\Phi(T_i): T_i \mapsto \phi_i$ The eigenspace feature set can be clustered by simple, linear discrimination methods, e.g. k-means with Euclidean distance. 15/54

16 Spectral Clustering We have a mapping from input feature space to eigenspace: $\Phi(T_i): T_i \mapsto \phi_i$ The eigenspace feature set can be clustered by simple, linear discrimination methods, e.g. k-means with Euclidean distance. We use an information theoretic method, which partitions data by implicit maximization of the Cauchy-Schwarz divergence between the cluster pdfs in input space. Pdfs are estimated nonparametrically. 16/54
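
For readers unfamiliar with the Cauchy-Schwarz divergence, the sketch below shows a standard Parzen-window (Gaussian kernel) estimator of $D_{CS}$ between two clusters of feature vectors; it illustrates the quantity being maximised, not the authors' actual optimisation routine, and all names are illustrative.

```python
import numpy as np

def cross_information_potential(X, Y, sigma):
    """Parzen estimate of int p(x) q(x) dx with Gaussian kernels of width sigma on both pdfs."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    dim = X.shape[1]
    norm = (4.0 * np.pi * sigma**2) ** (dim / 2.0)   # normalisation of the convolved kernel
    return np.mean(np.exp(-d2 / (4.0 * sigma**2))) / norm

def cauchy_schwarz_divergence(X, Y, sigma):
    """D_CS(p, q) = -ln( int pq / sqrt(int p^2 * int q^2) ), estimated nonparametrically."""
    vxy = cross_information_potential(X, Y, sigma)
    vxx = cross_information_potential(X, X, sigma)
    vyy = cross_information_potential(Y, Y, sigma)
    return -np.log(vxy / np.sqrt(vxx * vyy))
```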

17 Spectral Clustering We have a mapping from input feature space to eigenspace: $\Phi(T_i): T_i \mapsto \phi_i$ The eigenspace feature set can be clustered by simple, linear discrimination methods, e.g. k-means with Euclidean distance. We use an information theoretic method, which partitions data by implicit maximization of the Cauchy-Schwarz divergence between the cluster pdfs in input space. Pdfs are estimated nonparametrically. Data points outside the size-N sample can be mapped to eigenspace using the Nyström approximation: $\Phi_j(T) \approx \frac{\sqrt{N}}{\lambda_j} \sum_{i=1}^{N} e_{ji}\, a(T, T_i), \quad j = 1,\dots,u$, where $a(T, T_i)$ is the affinity computed from the distance $d(T, T_i)$. 17/54
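
A sketch of how the Nyström extension above can be applied to a new coherency matrix, assuming the leading eigenvalues and row-stacked eigenvectors of the sample affinity matrix are available (e.g. from the `eigenspace_features` helper sketched earlier); as reconstructed here, the affinity to each sample point is computed from the chosen distance, and all names are illustrative.

```python
import numpy as np

def nystrom_extend(T_new, T_sample, eigvals, eigvecs, sigma, distance):
    """Approximate eigenspace coordinates Phi_j(T_new), j = 1..u, for an out-of-sample pixel.

    eigvals : u leading eigenvalues of the N x N affinity matrix
    eigvecs : (u, N) array whose rows are the corresponding eigenvectors
    distance: callable returning the chosen distance d(T_new, T_i)
    """
    N = len(T_sample)
    d = np.array([distance(T_new, T_i) for T_i in T_sample])
    a = np.exp(-d**2 / (2.0 * sigma**2))            # affinities to the sample points
    return (np.sqrt(N) / eigvals) * (eigvecs @ a)
```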

18 Relation to Kernel Methods May be related to Mercer kernel-based algorithms, such as: Support Vector Machines, Kernel PCA, Kernel k-means, etc. The pairwise affinities are inner products in a Mercer kernel space: $a_{ij} = a(T_i, T_j) = \langle \phi_i, \phi_j \rangle$. $a_{ij}$ is a Mercer kernel function and A a Mercer kernel matrix iff: $a(T_i, T_j)$ is positive semi-definite, $a(T_i, T_j)$ is symmetric, $a(T_i, T_j)$ is continuous. With these restrictions, how do we select the distance measure? 18/54

19 Coherency Matrix Distance Measures Wishart distance (Lee et al., IJRS 94) $d_W(T_1, T_2) = \ln|T_2| + \mathrm{tr}(T_2^{-1} T_1)$. 19/54

20 Coherency Matrix Distance Measures Wishart distance (Lee et al., IJRS 94) $d_W(T_1, T_2) = \ln|T_2| + \mathrm{tr}(T_2^{-1} T_1)$. Can be symmetrized, but $d_W(T_i, T_i)$ depends on $T_i$. Not suitable! 20/54

21 Coherency Matrix Distance Measures Bartlett distance (Conradsen et al., TGRS 03) $d_B(T_1, T_2) = \ln\!\left(\frac{|T_1 + T_2|^2}{|T_1|\,|T_2|}\right) - 2p\ln 2$. 21/54

22 Coherency Matrix Distance Measures Bartlett distance (Conradsen et al., TGRS 03) $d_B(T_1, T_2) = \ln\!\left(\frac{|T_1 + T_2|^2}{|T_1|\,|T_2|}\right) - 2p\ln 2$. Based on log-likelihood ratio test of equality for two unknown covariance matrices. 22/54

23 Coherency Matrix Distance Measures Bartlett distance (Conradsen et al., TGRS 03) $d_B(T_1, T_2) = \ln\!\left(\frac{|T_1 + T_2|^2}{|T_1|\,|T_2|}\right) - 2p\ln 2$. Based on log-likelihood ratio test of equality for two unknown covariance matrices. Symmetrized normalized log-likelihood distance (proposed here) $d_{SNLL}(T_1, T_2) = \frac{1}{2}\,\mathrm{tr}\!\left(T_1^{-1} T_2 + T_2^{-1} T_1\right) - p$. 23/54

24 Coherency Matrix Distance Measures Bartlett distance (Conradsen et al., TGRS 03) $d_B(T_1, T_2) = \ln\!\left(\frac{|T_1 + T_2|^2}{|T_1|\,|T_2|}\right) - 2p\ln 2$. Based on log-likelihood ratio test of equality for two unknown covariance matrices. Symmetrized normalized log-likelihood distance (proposed here) $d_{SNLL}(T_1, T_2) = \frac{1}{2}\,\mathrm{tr}\!\left(T_1^{-1} T_2 + T_2^{-1} T_1\right) - p$. Based on log-likelihood ratio test of equality for one known and one unknown covariance matrix. Symmetrized version of the revised Wishart distance (Kersten et al., TGRS 05) 24/54
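
A minimal NumPy sketch of the Bartlett and SNLL distances as written above (the Wishart distance was sketched earlier); here `p` is the dimension of the coherency matrix (3 for full-polarimetric data), and the helper names are illustrative.

```python
import numpy as np

def logdet(M):
    """Log-determinant of a Hermitian positive definite matrix."""
    return np.linalg.slogdet(M)[1]

def bartlett_distance(T1, T2):
    """d_B = ln( |T1 + T2|^2 / (|T1| |T2|) ) - 2 p ln 2."""
    p = T1.shape[0]
    return 2.0 * logdet(T1 + T2) - logdet(T1) - logdet(T2) - 2.0 * p * np.log(2.0)

def snll_distance(T1, T2):
    """d_SNLL = 0.5 * tr(T1^{-1} T2 + T2^{-1} T1) - p (symmetrized revised Wishart distance)."""
    p = T1.shape[0]
    t = np.trace(np.linalg.solve(T1, T2) + np.linalg.solve(T2, T1)).real
    return 0.5 * t - p
```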

25 The New Algorithm Summary Replaces H/A/α space initialisation with spectral clustering. 25/54

26 The New Algorithm Summary Replaces H/A/α space initialisation with spectral clustering. A subset of N pixels, randomly sampled from the image, is clustered. 26/54

27 The New Algorithm Summary Replaces H/A/α space initialisation with spectral clustering. A subset of N pixels, randomly sampled from the image, is clustered. Remaining pixels may be classified in eigenspace, using the Nyström approximation. 27/54

28 The New Algorithm Summary Replaces H/A/α space initialisation with spectral clustering. A subset of N pixels, randomly sampled from the image, is clustered. Remaining pixels may be classified in kernel space (eigenspace), using the Nyström approximation. Alternatively, remaining pixels may be classified in input space with the minimum distance Wishart classifier. 28/54

29 The New Algorithm Summary Replaces H/A/α space initialisation with spectral clustering. A subset of N pixels, randomly sampled from the image, is clustered. Remaining pixels may be classified in kernel space (eigenspace), using the Nyström approximation. Alternatively, remaining pixels may be classified in input space with the minimum distance Wishart classifier. The latter solution has much lower computational cost. Our experience is that the classification results are essentially equal. 29/54

30 The New Algorithm Summary Replaces H/A/α space initialisation with spectral clustering. A subset of N pixels, randomly sampled from the image, is clustered. Remaining pixels may be classified in kernel space (eigenspace), using the Nyström approximation. Alternatively, remaining pixels may be classified in input space with the minimum distance Wishart classifier. The latter solution has much lower computational cost. Our experience is that the classification results are essentially equal. Hence, only the initialisation of the CPW classifier is changed. 30/54
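
Putting the pieces together, here is a rough end-to-end sketch of the proposed scheme under stated assumptions: `T_all` is assumed to be a complex NumPy array of pixel coherency matrices of shape (num_pixels, 3, 3), the helper functions sketched earlier (`affinity_matrix`, `eigenspace_features`, `wishart_classify`, `update_class_means`) are reused, and plain k-means in eigenspace stands in for the information-theoretic clustering, so this is an approximation of the algorithm rather than the authors' implementation.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def spectral_wishart_segmentation(T_all, k, N, sigma, u, distance, n_iter=10, rng=None):
    """Spectral-clustering initialisation of the Wishart classifier (sketch)."""
    rng = np.random.default_rng(rng)
    sample = rng.choice(len(T_all), size=N, replace=False)
    T_s = T_all[sample]

    # 1) Pairwise distances (e.g. Bartlett or SNLL) and affinities on the random sample
    D = np.zeros((N, N))
    for i in range(N):
        for j in range(i + 1, N):
            D[i, j] = D[j, i] = distance(T_s[i], T_s[j])
    A = affinity_matrix(D, sigma)

    # 2) Eigenspace mapping and clustering of the sample
    #    (k-means used here in place of the information-theoretic clustering on the slides)
    _, Phi = eigenspace_features(A, u)
    _, sample_labels = kmeans2(Phi.T, k, minit='++')

    # 3) Class means from the clustered sample, then iterative Wishart classification of all pixels
    V = update_class_means(T_s, sample_labels, k)
    for _ in range(n_iter):
        labels = wishart_classify(T_all, V)
        V = update_class_means(T_all, labels, k)
    return labels
```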

31 The New Algorithm Parameters Number of clusters: k Must be manually selected, but the effective number of classes in the classification result, $k_{eff}$, is data adaptive. 31/54

32 The New Algorithm Parameters Number of clusters: k Must be manually selected, but the effective number of classes in the classification result, $k_{eff}$, is data adaptive. Sample size: N Trade-off with computational cost 32/54

33 The New Algorithm Parameters Number of clusters: k Must be manually selected, but the effective number of classes in the classification result, $k_{eff}$, is data adaptive. Sample size: N Trade-off with computational cost Kernel bandwidth: σ Robust automatic selection rule is under investigation 33/54

34 The New Algorithm Parameters Number of clusters: k Must be manually selected, but the effective number of classes in the classification result, $k_{eff}$, is data adaptive. Sample size: N Trade-off with computational cost Kernel bandwidth: σ Robust automatic selection rule is under investigation Eigenspace dimension: u Can be fixed to u = k for simplicity 34/54
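
The slides leave automatic bandwidth selection open; purely as a placeholder, a common generic heuristic (not the rule under investigation by the authors) ties σ to the distribution of pairwise distances, e.g.:

```python
import numpy as np

def median_distance_bandwidth(D, scale=1.0):
    """Generic heuristic, not the authors' rule: set the Gaussian kernel bandwidth sigma
    to a multiple of the median off-diagonal pairwise distance in D."""
    off_diag = D[np.triu_indices_from(D, k=1)]
    return scale * np.median(off_diag)
```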

35 Test Data Set: Flevoland, L-band 200×320 subset of an AIRSAR L-band data set over an agricultural area in Flevoland, The Netherlands, acquired in August. Courtesy of NASA/JPL. 35/54

36 Ground Truth Data 36/54

37 Evaluation Qualitative analysis (visual inspection). Quantitative analysis: we calculate a matching matrix M that relates predicted (P) and actual (A) class labels, and derive classification merits from M (Ferro-Famil et al., TGRS 01): Descriptivity $D_i$: The fraction of the dominant predicted class labels within an actual class (quantifies homogeneity). Compactness $C_i$: Quantifies to what extent the dominant predicted class also dominates other actual classes. Representivity $R_i$: Quantifies to what extent the dominant predicted class is predicted for other actual classes. 37/54
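
Of these merits, only the descriptivity is fully specified by the verbal definition above; the sketch below builds the matching matrix and computes $D_i$, and leaves compactness and representivity out since their exact formulas are given in Ferro-Famil et al. (TGRS 01) rather than on the slide. Function names are illustrative.

```python
import numpy as np

def matching_matrix(actual, predicted, n_actual, n_predicted):
    """M[i, j] counts pixels with actual class i and predicted class j."""
    M = np.zeros((n_actual, n_predicted), dtype=int)
    for a, p in zip(actual, predicted):
        M[a, p] += 1
    return M

def descriptivity(M):
    """D_i: fraction of pixels in actual class i that carry its dominant predicted label."""
    return M.max(axis=1) / M.sum(axis=1)
```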

38 Qualitative Analysis Cloude-Pottier-Wishart (CPW) Classifier Parameters: k=16, $k_{eff}$=9, it=10 (no. of iterations in Wishart classifier). 38/54

39 Qualitative Analysis Cloude-Pottier-Wishart (CPW) Classifier Observations: Class 2 and 9 covered by same cluster. Class 4 and 10 covered by same cluster. Homogeneous classification in the ground truth areas. 39/54

40 Qualitative Analysis Bartlett Spectral Wishart (BSW) Classifier Parameters: k=16, $k_{eff}$=12, σ = 0.42, N=6400 (10%), it=10. 40/54

41 Qualitative Analysis Bartlett Spectral Wishart (BSW) Classifier Observations: Classes 1 and 5 covered by the same cluster. Some interference from a second cluster in classes 3 and 5. The classification is not as homogeneous as for the CPW classifier, but the delineation of some areas is better. 41/54

42 Qualitative Analysis SNLL Spectral Wishart (SSW) Classifier Parameters: k=16, $k_{eff}$=15, σ = 0.42, N=6400 (10%), it=10. 42/54

43 Qualitative Analysis SNLL Spectral Wishart (SSW) Classifier Observations: Unique dominant cluster for all ground truth areas. Less homogeneous classification than the other methods, largely due to the higher effective number of classes. 43/54

44 Matching matrix for CPW classifier (table: rows are actual classes A1-A10, columns are predicted classes P1-P10, with descriptivity $D_i$, compactness $C_i$ and representivity $R_i$ margins; numeric entries not preserved in the transcription) 44/54

45 Matching matrix for Bartlett distance classifier (table: rows are actual classes A1-A10, columns are predicted classes P1-P10, with descriptivity $D_i$, compactness $C_i$ and representivity $R_i$ margins; numeric entries not preserved in the transcription) 45/54

46 Matching matrix for SNLL distance classifier (table: rows are actual classes A1-A10, columns are predicted classes P1-P10, with descriptivity $D_i$, compactness $C_i$ and representivity $R_i$ margins; numeric entries not preserved in the transcription) 46/54

47 Quantitative Analysis: Descriptivity 47/54

48 Quantitative Analysis: Compactness 48/54

49 Quantitative Analysis: Representivity 49/54

50 Quantitative Analysis: Effective no. classes 50/54

51 Convergence Speed 51/54

52 Conclusions and Future Work We have selected two distance measures suited for calculation of pairwise affinities for PolSAR data coherency matrices. We have demonstrated how PolSAR data can be segmented by spectral clustering of coherency matrices. The algorithm improves the classification result of the CPW classifier, while using the same information (derived from the statistics of a single pixel). Performance analysis shows that spectral clustering gives a better initialisation of the Wishart classifier than the H/A/α initialisation, both in terms of classification result and convergence speed. 52/54

53 Conclusions and Future Work Further work will concentrate on methods for robust selection of the kernel bandwidth σ, and studies of the data-adaptive $k_{eff}$, in order to develop and verify a fully automatic segmentation algorithm. We will also study how spatial information and information from polarimetric decompositions can be included in the distance measure, to assimilate more prior information in the kernel function. The algorithm will be tested on different data sets. 53/54

54 Thank you! Stian Normann Anfinsen Computational Earth Observation and Machine Learning Laboratory University of Tromsø URL: 54/54
