High Dimensional Kullback-Leibler divergence for grassland classification using satellite image time series with high spatial resolution


1 High Dimensional Kullback-Leibler divergence for grassland classification using satellite image time series with high spatial resolution
Presented by Maïlys Lopes (1), in collaboration with Mathieu Fauvel (1), Stéphane Girard (2) and David Sheeren (1).
(1) Dynafor, INRA, University of Toulouse, France
(2) Team Mistis, INRIA Grenoble, France

2 Study objectives
Agroecological application: grassland management practices.
Data: satellite image time series (SITS) with high spatial resolution (about 10 m) and high temporal resolution (2-3 images per month).
Method: supervised classification at the object scale.

3 Outline
- Context
- High Dimensional Divergence for measuring grassland similarity
- Experimental results
- Conclusion

4 Satellite remote sensing of grasslands
We propose to use dense multispectral time series with high spatial resolution to characterize grasslands. Grasslands in Europe are:
- relatively small (about 100 m x 100 m), hence the need for high spatial resolution images;
- heterogeneous, hence the need for multispectral images;
- subject to a natural cycle disturbed by human activities (mowing, grazing), hence the need for SITS.


7 Statistical problem
Learn $f$ such that $y_i = f(X_i)$, where $y_i$ is the predicted label and $X_i = [x_{i1} \ldots x_{in_i}]$ is the $(n_i \times d)$ matrix containing all the pixels inside $g_i$, with

$$x_{ik} = \big(x_{ik}(t_1), \ldots, x_{ik}(t_d)\big)^\top \in \mathbb{R}^d$$

where:
- $g_i$: grassland with index $i$,
- $n_i$: number of pixels in grassland $g_i$,
- $k$: pixel index, $k \in \{1, \ldots, n_i\}$,
- $d$: length of the time series,
- $x_{ik}(t_l)$: NDVI value of pixel $k$ at time $t_l$.
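For concreteness, a minimal sketch of this data layout (a hypothetical NumPy representation with synthetic values, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

d = 17     # length of the NDVI time series (17 acquisition dates)
n_i = 120  # number of pixels in grassland g_i (varies per parcel)

# X_i: one row per pixel, one column per date, i.e. the (n_i x d) matrix above.
X_i = rng.uniform(0.2, 0.9, size=(n_i, d))  # synthetic NDVI values

k, l = 3, 5
print(X_i[k, l])  # x_ik(t_l): NDVI of pixel k at date t_l
```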

8 Contributions
Thematic contributions:
- grassland management practices,
- Sentinel-2 contribution.
Methodological contributions:
- model the grassland distribution,
- perform grassland supervised classification at the parcel scale,
- be robust to the dimension of the data ($n_i$ pixels, $d$ temporal variables, with $n_i$ of the same order as $d$) and to the total number of grassland pixels, which might be large.

9 Section: High Dimensional Divergence for measuring grassland similarity

10 Statistical modelling of grasslands
For each grassland $g_i$, each pixel is modelled as a Gaussian random vector: $x_{ik} \sim \mathcal{N}(\mu_i, \Sigma_i)$.
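As a minimal sketch, the model parameters could be estimated per grassland with standard sample statistics (the slides do not spell out the estimators; these are assumed):

```python
import numpy as np

def gaussian_fit(X):
    """Sample mean and covariance of one grassland's pixels.
    X: (n_i, d) matrix of NDVI time series, one row per pixel."""
    mu = X.mean(axis=0)              # (d,) mean NDVI profile
    Sigma = np.cov(X, rowvar=False)  # (d, d) empirical covariance
    return mu, Sigma
```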

11 Measuring similarity between two grasslands
Measuring the similarity between $g_i$ and $g_j$ amounts to measuring the similarity between $\mathcal{N}(\mu_i, \Sigma_i)$ and $\mathcal{N}(\mu_j, \Sigma_j)$.
Symmetrized Kullback-Leibler divergence:

$$\mathrm{KLD}(g_i, g_j) = \frac{1}{2}\Big[\mathrm{Tr}\big(\Sigma_i^{-1}\Sigma_j + \Sigma_j^{-1}\Sigma_i\big) + (\mu_i - \mu_j)^\top\big(\Sigma_i^{-1} + \Sigma_j^{-1}\big)(\mu_i - \mu_j)\Big] - d$$
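A direct transcription of this formula as a sketch (it assumes the empirical covariances are invertible, which, as the next slide shows, often fails in practice):

```python
import numpy as np

def symmetrized_kld(mu_i, Sigma_i, mu_j, Sigma_j):
    """Symmetrized KL divergence between N(mu_i, Sigma_i) and N(mu_j, Sigma_j)."""
    d = mu_i.shape[0]
    Si_inv = np.linalg.inv(Sigma_i)
    Sj_inv = np.linalg.inv(Sigma_j)
    dm = mu_i - mu_j
    trace_term = np.trace(Si_inv @ Sigma_j + Sj_inv @ Sigma_i)
    mean_term = dm @ (Si_inv + Sj_inv) @ dm
    return 0.5 * (trace_term + mean_term) - d
```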

12 Measuring similarity between two grasslands
The number of pixels in a grassland is usually lower than the number of parameters to estimate!
[Figure: histogram of grassland size in number of pixels $n_i$; a red line marks the number of parameters to estimate for each grassland under the multivariate Gaussian model, derived from the number of variables via $d(d+3)/2 = 170$ for $d = 17$.]

13 High dimensional model
The High Dimensional Discriminant Analysis (HDDA) model (1) is used: it assumes that the last eigenvalues of the covariance matrix are equal,

$$\Sigma_i = P_i D_i P_i^\top, \qquad D_i = \mathrm{diag}\big(\lambda_{i1}, \ldots, \lambda_{ip_i}, \underbrace{\lambda_i, \ldots, \lambda_i}_{d - p_i}\big)$$

where:
- $p_i$ is the number of non-equal eigenvalues,
- $\lambda_i$ is the multiple eigenvalue corresponding to the noise term (the last, equal eigenvalues),
- $\lambda_{ij} \geq \lambda_i$ for $j = 1, \ldots, p_i$.

(1) C. Bouveyron, S. Girard and C. Schmid, "High-dimensional discriminant analysis", Communications in Statistics - Theory and Methods, vol. 36, no. 14, 2007.

14 High dimensional model
Equivalently, $\Sigma_i = Q_i \Lambda_i Q_i^\top + \lambda_i I_d$, where:
- $I_d$ is the identity matrix of size $d$,
- $Q_i = [q_{i1}, \ldots, q_{ip_i}]$,
- $\Lambda_i = \mathrm{diag}\big(\lambda_{i1} - \lambda_i, \ldots, \lambda_{ip_i} - \lambda_i\big)$.

Following this model, $\Sigma_i^{-1}$ can be computed explicitly:

$$\Sigma_i^{-1} = \lambda_i^{-1} I_d - Q_i V_i Q_i^\top, \qquad V_i = \mathrm{diag}\Big(\frac{1}{\lambda_i} - \frac{1}{\lambda_{i1}}, \ldots, \frac{1}{\lambda_i} - \frac{1}{\lambda_{ip_i}}\Big)$$
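A sketch of how these quantities could be computed from an eigendecomposition of the empirical covariance (hdda_params is a hypothetical helper; the data-driven choice of p appears on the backup slides):

```python
import numpy as np

def hdda_params(Sigma_hat, p):
    """HDDA parametrization: Sigma ~= Q diag(Lam) Q^T + lam * I and
    inv(Sigma) ~= (1/lam) * I - Q diag(V) Q^T."""
    evals, evecs = np.linalg.eigh(Sigma_hat)    # ascending eigenvalues
    evals, evecs = evals[::-1], evecs[:, ::-1]  # sort descending
    lam = evals[p:].mean()           # common noise eigenvalue (last d - p)
    Q = evecs[:, :p]                 # first p eigenvectors
    Lam = evals[:p] - lam            # diagonal of Lambda_i
    V = 1.0 / lam - 1.0 / evals[:p]  # diagonal of V_i
    return Q, Lam, V, lam
```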

15 High Dimensional Symmetrized KLD
To compute HDKLD, only the first $p_i$ eigenvalues/eigenvectors are required:
- the number of parameters to estimate is reduced,
- the unstable estimation of the eigenvectors associated with small eigenvalues is avoided.

$$\begin{aligned}
\mathrm{HDKLD}(g_i, g_j) = \frac{1}{2}\Big[\, &\lambda_i^{-1}\mathrm{Tr}(\Lambda_j) - \lambda_j \mathrm{Tr}(V_i) + \lambda_j^{-1}\mathrm{Tr}(\Lambda_i) - \lambda_i \mathrm{Tr}(V_j) \\
&- \big\|\Lambda_j^{1/2} Q_j^\top Q_i V_i^{1/2}\big\|_F^2 - \big\|\Lambda_i^{1/2} Q_i^\top Q_j V_j^{1/2}\big\|_F^2 \\
&- \big\|V_i^{1/2} Q_i^\top (\mu_i - \mu_j)\big\|^2 - \big\|V_j^{1/2} Q_j^\top (\mu_i - \mu_j)\big\|^2 \\
&+ \frac{\lambda_i + \lambda_j}{\lambda_i \lambda_j}\,\big\|\mu_i - \mu_j\big\|^2 + \frac{\lambda_i^2 + \lambda_j^2}{\lambda_i \lambda_j}\, d \,\Big] - d
\end{aligned}$$

where $\|L\|_F^2 = \mathrm{Tr}(L^\top L)$ is the Frobenius norm.
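A sketch on top of the hypothetical hdda_params helper above; the term grouping follows the reconstructed formula:

```python
import numpy as np

def hdkld(mu_i, Pi, mu_j, Pj):
    """High-dimensional symmetrized KLD between two grasslands.
    Pi, Pj: (Q, Lam, V, lam) tuples as returned by hdda_params."""
    Qi, Lami, Vi, lami = Pi
    Qj, Lamj, Vj, lamj = Pj
    d = mu_i.shape[0]
    dm = mu_i - mu_j
    # Frobenius cross terms ||Lam_j^(1/2) Qj^T Qi Vi^(1/2)||_F^2 (and symmetric)
    cross_ij = np.linalg.norm(np.sqrt(Lamj)[:, None] * (Qj.T @ Qi) * np.sqrt(Vi)) ** 2
    cross_ji = np.linalg.norm(np.sqrt(Lami)[:, None] * (Qi.T @ Qj) * np.sqrt(Vj)) ** 2
    # projected mean terms ||Vi^(1/2) Qi^T (mu_i - mu_j)||^2 (and symmetric)
    mean_i = np.sum(Vi * (Qi.T @ dm) ** 2)
    mean_j = np.sum(Vj * (Qj.T @ dm) ** 2)
    val = (Lamj.sum() / lami - lamj * Vi.sum()
           + Lami.sum() / lamj - lami * Vj.sum()
           - cross_ij - cross_ji - mean_i - mean_j
           + (lami + lamj) / (lami * lamj) * (dm @ dm)
           + (lami ** 2 + lamj ** 2) / (lami * lamj) * d)
    return 0.5 * val - d
```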

16 Construction of a positive definite kernel
HDKLD is used to build a positive definite kernel function, which can be plugged into any kernel method, such as SVM. Note that (HD)KLD is a semi-metric.

$$K(g_i, g_j) = \exp\Big(-\frac{\mathrm{(HD)KLD}(g_i, g_j)}{\sigma^2}\Big), \qquad \sigma \in \mathbb{R}_+$$
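A sketch of how such a precomputed Gram matrix could feed scikit-learn's SVM (sigma and C are placeholders; the slides tune them by cross-validation):

```python
import numpy as np
from sklearn.svm import SVC

def gram_matrix(grasslands, sigma):
    """K[i, j] = exp(-HDKLD(g_i, g_j) / sigma**2).
    grasslands: list of (mu, (Q, Lam, V, lam)) pairs; uses hdkld above."""
    n = len(grasslands)
    K = np.eye(n)  # hdkld(g, g) = 0, so the diagonal is exp(0) = 1
    for i in range(n):
        for j in range(i + 1, n):
            mu_i, Pi = grasslands[i]
            mu_j, Pj = grasslands[j]
            K[i, j] = K[j, i] = np.exp(-hdkld(mu_i, Pi, mu_j, Pj) / sigma ** 2)
    return K

# Usage sketch:
# clf = SVC(kernel="precomputed", C=10).fit(gram_matrix(train, 2**10), y_train)
```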

17 Section: Experimental results

18 Data
Study site and field data: 52 parcels with 3 management practices (field survey, 2015).

Class                       Nb of grasslands
Mowing                      34
Grazing                     10
Mixed (mowing & grazing)     8

Satellite data: Formosat-2
- spatial resolution: 8 m,
- 4 spectral bands (B, G, R, NIR),
- $d = 17$ images from Feb. 16 to Dec. 20, 2013.

19 Classification methods
Comparison with other methods:

Method           p-SVM        µ-SVM    KLD-SVM        HDKLD-SVM
Scale            Pixel        Object   Object         Object
Expl. variable   x_ik         µ_i      N(µ_i, Σ_i)    N(µ_i, Σ_i)
Kernel           RBF          RBF      K(g_i, g_j)    K(g_i, g_j)
Nb of samples    8741 pixels  52       52             52

For p-SVM, parcel labels are obtained from the pixel predictions by a majority voting rule (see the sketch below). Hyperparameters were optimized by cross-validation.
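A minimal sketch of that voting step (assumed aggregation; the slide only names the rule):

```python
import numpy as np

def majority_vote(pixel_labels):
    """Label a parcel with the most frequent predicted label of its pixels."""
    values, counts = np.unique(np.asarray(pixel_labels), return_counts=True)
    return values[np.argmax(counts)]
```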

20 Processing chain
Leave-One-Out Cross-Validation is used for error estimation.
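A minimal LOOCV sketch with a precomputed kernel matrix (scikit-learn; K and y are assumed to come from the steps above):

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.svm import SVC

def loocv_accuracy(K, y, C=10):
    """Leave-one-out accuracy with a precomputed (n, n) kernel matrix K."""
    y = np.asarray(y)
    correct = 0
    for train, test in LeaveOneOut().split(y):
        clf = SVC(kernel="precomputed", C=C)
        clf.fit(K[np.ix_(train, train)], y[train])
        correct += int(clf.predict(K[np.ix_(test, train)])[0] == y[test[0]])
    return correct / len(y)
```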

21 Results
HDKLD outperforms KLD, but the results are not significantly different from those of the other methods.

[Table: classification accuracy (OA, kappa, per-class REF/PRED scores) for p-SVM, µ-SVM, KLD-SVM and HDKLD-SVM; the numeric values did not survive transcription.]

Test of significance of the observed differences*: pairwise Z-scores between p-SVM, µ-SVM, KLD-SVM and HDKLD-SVM, with

$$Z = \frac{|\hat{\kappa}_m - \hat{\kappa}_n|}{\sqrt{\widehat{\mathrm{var}}(\hat{\kappa}_m) + \widehat{\mathrm{var}}(\hat{\kappa}_n)}}$$

(*from Congalton and Green, Assessing the Accuracy of Remotely Sensed Data: Principles and Practices, 2009)
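A sketch of the pairwise test, assuming the kappa estimates and their variances are already computed:

```python
import math

def kappa_z_score(kappa_m, var_m, kappa_n, var_n):
    """Z-statistic for the difference of two kappa coefficients
    (Congalton and Green, 2009); |Z| > 1.96 means the difference is
    significant at the 5% level."""
    return abs(kappa_m - kappa_n) / math.sqrt(var_m + var_n)
```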

22 Results
The proposed model:
- enables a proper grassland modelling at the parcel scale,
- reduces the number of elements to be processed by the SVM: G = 52 grasslands for HDKLD-SVM versus N = 8741 pixels for p-SVM.

23 Section: Conclusion

24 Conclusion
- Gaussian modelling appears well suited to the distribution of grassland pixels.
- HDKLD is efficient for measuring grassland proximity and robust to the dimension of the data.
- The best results are not significantly different across methods.
- HDKLD performs better than KLD.

25 Prospects
- The method will be tested on a larger dataset.
- The method will be extended to multispectral data.
- The method will be used for unsupervised classification.

Thank you for your attention.


27 Estimation of the HDKLD parameters (formula as on slide 15):
- $\hat{\lambda}_{ij}$ and $\hat{q}_{ij}$, $j \in \{1, \ldots, \hat{p}_i\}$, are the first eigenvalues/eigenvectors of $\hat{\Sigma}_i$;
- $\hat{p}_i$ is the number of eigenvalues needed to reach a given percentage of variance $t$, i.e. the smallest $p$ such that $\sum_{j=1}^{p} \hat{\lambda}_{ij} \big/ \mathrm{Tr}(\hat{\Sigma}_i) \geq t$, with $t$ a user-defined parameter;
- $\hat{\lambda}_i = \Big(\mathrm{Tr}(\hat{\Sigma}_i) - \sum_{j=1}^{\hat{p}_i} \hat{\lambda}_{ij}\Big) \big/ (d - \hat{p}_i)$.
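A sketch of this selection rule, complementing the hypothetical hdda_params helper above:

```python
import numpy as np

def select_p(Sigma_hat, t=0.95):
    """Smallest p such that the first p eigenvalues account for at least
    a fraction t of the total variance Tr(Sigma_hat)."""
    evals = np.linalg.eigvalsh(Sigma_hat)[::-1]  # descending eigenvalues
    ratios = np.cumsum(evals) / evals.sum()      # evals.sum() equals the trace
    return int(np.searchsorted(ratios, t)) + 1
```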

28 Hyperparameters were optimized during cross-validation over the following search grid:

Parameter   Methods                       Grid
σ           p-SVM, µ-SVM (RBF kernels)    {2^-5, 2^-4, ..., 2^5}
σ           KLD-SVM, HDKLD-SVM            {2^8, 2^9, ..., 2^12}
C           all methods                   {1, 10, 100}
t           HDKLD-SVM                     {0.80, 0.85, 0.90, 0.95, 0.99}

Bhattacharyya distance (for reference):

$$D_B(g_i, g_j) = \frac{1}{8}(\mu_i - \mu_j)^\top \bar{\Sigma}^{-1} (\mu_i - \mu_j) + \frac{1}{2} \ln\frac{\det(\bar{\Sigma})}{\sqrt{\det(\Sigma_i)\det(\Sigma_j)}}, \qquad \bar{\Sigma} = \frac{\Sigma_i + \Sigma_j}{2}$$
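A sketch of this baseline distance between two Gaussian models:

```python
import numpy as np

def bhattacharyya(mu_i, Sigma_i, mu_j, Sigma_j):
    """Bhattacharyya distance between N(mu_i, Sigma_i) and N(mu_j, Sigma_j)."""
    Sigma = 0.5 * (Sigma_i + Sigma_j)
    dm = mu_i - mu_j
    term1 = dm @ np.linalg.solve(Sigma, dm) / 8.0
    # log-determinants via slogdet for numerical stability
    _, ld_bar = np.linalg.slogdet(Sigma)
    _, ld_i = np.linalg.slogdet(Sigma_i)
    _, ld_j = np.linalg.slogdet(Sigma_j)
    return term1 + 0.5 * (ld_bar - 0.5 * (ld_i + ld_j))
```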
