Robust cartogram visualization of outliers in manifold learning


1 Robust cartogram visualization of outliers in manifold learning. Alessandra Tosi and Alfredo Vellido, LSI Department, UPC, Barcelona

2–3 Table of Contents: 1. Introduction: Goals; 2. NLDR methods: Generative Topographic Mapping, Distortion measures in NLDR: Magnification Factor, Cartogram-based representation; 3. Cartograms representations for GTM and its variants: Results

4–8 Goals
PROBLEM: an increasing amount of high-dimensional data sets is available, with different levels of complexity and a growing diversity of characteristics.
CHALLENGE: translating raw data into useful information that can be acted upon in practical terms.
Nonlinear Dimensionality Reduction: nonlinear techniques are applied to reduce the dimensionality of the data in order to explore multivariate data (MVD). It is almost impossible to completely avoid geometrical distortions while reducing dimensionality.
Distortion Measures: quantify and visualize this distortion itself in order to interpret the data more faithfully.
Visualization: explicitly reintroduce the local distortion created by NLDR models into the low-dimensional representation of the MVD that they produce.

9 Table of Contents: 2. NLDR methods: Generative Topographic Mapping; Distortion measures in NLDR: Magnification Factor; Cartogram-based representation

10–13 NLDR methods for MVD visualization
To successfully analyse real data, more complex models are often required: Nonlinear Dimensionality Reduction (NLDR) models.
Manifold learning attempts to describe MVD through nonlinear low-dimensional manifolds embedded in the observed data space. The aim is to discover the underlying geometry of the data, preserving its topology rather than pairwise distances, while generating a low-dimensional model.
Latent Variable Models provide an additional set of variables (latent or hidden variables) alongside the observed ones.
Vector quantization reduces the number of observations by replacing the original data with a smaller set of vectors of the same dimension, called prototypes (units, neurons, centroids, weight vectors); a minimal sketch follows below.
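As an illustration of the vector quantization idea only (a k-means-style sketch with toy data, not the GTM training procedure; all names and values are illustrative assumptions):

```python
import numpy as np

def vector_quantize(X, K, n_iter=50, seed=0):
    """Replace the N observations in X (N x D) with K prototypes (K x D)
    by alternating nearest-prototype assignment and mean updates."""
    rng = np.random.default_rng(seed)
    prototypes = X[rng.choice(len(X), size=K, replace=False)].copy()
    for _ in range(n_iter):
        # assign each observation to its nearest prototype
        d2 = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # move each prototype to the mean of its assigned observations
        for k in range(K):
            if np.any(labels == k):
                prototypes[k] = X[labels == k].mean(axis=0)
    return prototypes

X = np.random.default_rng(1).normal(size=(500, 3))  # toy 3-D multivariate data
prototypes = vector_quantize(X, K=16)               # 500 observations -> 16 prototypes
```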

14–16 Generative Topographic Mapping (GTM)
The Generative Topographic Mapping (GTM) is a nonlinear Latent Variable Model developed by Bishop, Svensén and Williams in the late nineties.
Basic GTM defines a Gaussian probability distribution in the latent space in order to induce the corresponding probability distribution in the observed data space, using concepts of Bayesian inference. The images of sampled latent points, or prototypes, are defined according to the rule $y_k = W \Phi(u_k)$.
The basic GTM model has some limitations when dealing with atypical data or outliers, as they are likely to bias the estimation of its parameters. More robust formulations of GTM have been proposed using a mixture of Student's t-distributions (t-GTM).
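The mapping rule $y_k = W \Phi(u_k)$ can be made concrete with a short numpy sketch; the grid sizes, Gaussian basis functions, and the untrained, randomly initialised weight matrix W below are illustrative assumptions, not a fitted GTM:

```python
import numpy as np

def rbf_basis(U, centres, sigma):
    """Phi: K x M matrix of Gaussian basis functions phi_m(u_k)."""
    d2 = ((U[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

grid = np.linspace(-1.0, 1.0, 10)
U = np.array([(a, b) for a in grid for b in grid])            # K = 100 latent points u_k
centres = np.array([(a, b) for a in np.linspace(-1, 1, 4)
                            for b in np.linspace(-1, 1, 4)])  # M = 16 basis centres
Phi = rbf_basis(U, centres, sigma=0.5)                        # K x M design matrix
W = np.random.default_rng(0).normal(size=(3, 16))             # D x M, untrained weights
Y = Phi @ W.T                                                 # row k is y_k = W Phi(u_k)
```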

17–18 Magnification Factor
$$\frac{dA'}{dA} = \sqrt{\det\!\left(J J^{\top}\right)}$$
where $J$ is the Jacobian (of dimension $2 \times D$) of the mapping transformation.
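When no closed form for $J$ is available, the MF can be approximated numerically; a minimal sketch, assuming a smooth map f from the 2-D latent space to R^D (the toy f below is an illustrative assumption):

```python
import numpy as np

def magnification_factor(f, u, eps=1e-5):
    """dA'/dA = sqrt(det(J J^T)) at latent point u, with the 2 x D Jacobian J
    of f estimated by central finite differences."""
    J = np.empty((2, f(u).size))
    for i in range(2):
        du = np.zeros(2)
        du[i] = eps
        J[i] = (f(u + du) - f(u - du)) / (2.0 * eps)
    return np.sqrt(np.linalg.det(J @ J.T))

# toy nonlinear embedding of the plane into R^3
f = lambda u: np.array([u[0], u[1], np.sin(u[0]) * np.cos(u[1])])
mf = magnification_factor(f, np.array([0.3, -0.2]))
```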

19–20 Cartograms
[Diagram: a density-equalizing cartogram transformation $T$ maps each map point $(x_1, x_2)$ to $(T_{x_1}, T_{x_2})$.]

21–25 Cartograms representations for NLDR methods
We propose a cartogram-based method in which:
- the political borders of geographic maps are replaced by the square grid of latent points $u_k$ in the visualization space;
- map-underlying quantities, such as population density, are replaced by the Magnification Factor;
- the level of distortion within each of the squares associated with $u_k$ is assumed to be uniform;
- the level of distortion in the space beyond this square grid is assumed to be uniform and equal to the mean distortion over the complete map, that is $\frac{1}{K}\sum_{k=1}^{K} J(u_k)$, where $J$ is the Jacobian of the transformation of the considered method (see the density-grid sketch below).
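A minimal sketch of the density grid this construction would feed to a density-equalizing (Gastner-Newman style) cartogram algorithm; the grid size, padding width, and MF values are illustrative assumptions, and the diffusion step itself is left to an external cartogram implementation:

```python
import numpy as np

def cartogram_density(mf_values, pad=4):
    """Build the 'population density' grid for a diffusion cartogram:
    one (uniform) MF value per latent-grid square, with the surrounding
    sea fixed at the mean distortion over the complete map."""
    side = int(np.sqrt(mf_values.size))     # K latent points on a side x side grid
    grid = mf_values.reshape(side, side)    # distortion assumed uniform per square
    sea = mf_values.mean()                  # (1/K) sum over k of the distortion values
    return np.pad(grid, pad, constant_values=sea)

density = cartogram_density(np.random.rand(100))   # toy MF values, K = 100
# 'density' would then be equalized by a diffusion cartogram routine.
```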

26–27 Cartograms representations for NLDR methods
GOAL: to better visualize the embedded model manifold, expecting inter-point distances in the observed data space to be more faithfully reflected in the low-dimensional representation space.
An advantage of this cartogram-based method is its portability: it should be easy to implement for different representation architectures and with alternative NLDR visualization techniques for which distortion can be quantified.

28 Table of Contents: 3. Cartograms representations for GTM and its variants; Results

29 Cartograms representations for t-GTM
In the following experiments we investigate the impact of outliers on the visualization, using both basic GTM and t-GTM.

30–31 Cartograms representations for GTM
Calculate, over the continuum, the Jacobian $J$ of the mapping transformation in the basic GTM algorithm, in terms of the derivatives of the basis functions $\Phi$, and apply the Magnification Factor (MF) formula $\frac{dA'}{dA} = \sqrt{\det(J J^{\top})}$.
Basic GTM:
$$\frac{dA'}{dA} = \sqrt{\det\!\left(\Psi^{\top} W^{\top} W \Psi\right)}$$
where $\Psi$ is an $M \times 2$ matrix with elements $\psi_{mi} = \partial \phi_m / \partial u_i$, $m = 1,\dots,M$, $i = 1,2$.
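The closed form is easy to evaluate for Gaussian bases $\phi_m(u) = \exp(-\lVert u - \mu_m \rVert^2 / 2\sigma^2)$, whose derivatives are $\psi_{mi} = -(u_i - \mu_{m,i})\,\phi_m(u)/\sigma^2$; a sketch in which the centres, width, and untrained W are illustrative assumptions:

```python
import numpy as np

def gtm_mf(u, centres, sigma, W):
    """Analytic MF for basic GTM at latent point u:
    sqrt(det(Psi^T W^T W Psi)), Psi being M x 2 with psi_mi = d phi_m / d u_i."""
    phi = np.exp(-((u - centres) ** 2).sum(axis=1) / (2.0 * sigma ** 2))  # M values
    Psi = -(u - centres) / sigma ** 2 * phi[:, None]                      # M x 2
    return np.sqrt(np.linalg.det(Psi.T @ W.T @ W @ Psi))                  # 2 x 2 det

centres = np.array([(a, b) for a in np.linspace(-1, 1, 4)
                            for b in np.linspace(-1, 1, 4)])  # M = 16 basis centres
W = np.random.default_rng(0).normal(size=(3, 16))             # untrained D x M weights
mf = gtm_mf(np.array([0.1, 0.2]), centres, sigma=0.5, W=W)
```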

32–33 Cartograms representations for t-GTM
The conditional distribution of the observed data variables given the latent variables, $p(\mathbf{x} \mid \mathbf{u})$, takes the following form for t-GTM:
$$p(\mathbf{x} \mid \mathbf{u}, W, \beta, \nu) = \frac{\Gamma\!\left(\frac{\nu+D}{2}\right)\beta^{D/2}}{\Gamma\!\left(\frac{\nu}{2}\right)(\nu\pi)^{D/2}} \left(1 + \frac{\beta}{\nu}\,\lVert \mathbf{x} - y(\mathbf{u}) \rVert^{2}\right)^{-\frac{\nu+D}{2}} \qquad (1)$$
To implement the Magnification Factor, we explicitly calculate the Jacobian $J = \Psi^{\top} W^{\top}$, where $\Psi$ is an $M \times 2$ matrix with elements $\psi_{mi} = \partial \phi_m / \partial u_i$, defined for t-GTM as:
$$\frac{\partial \phi_m}{\partial u_i} = \frac{\Gamma\!\left(\frac{\nu+D}{2}\right)(-\nu-D)\,\beta^{\frac{D+2}{2}}}{\Gamma\!\left(\frac{\nu}{2}\right)\pi^{D/2}\,\nu^{\frac{D+2}{2}}}\,(u_i - \mu_i^{m})\left(1 + \frac{\beta}{\nu}\,\lVert \mathbf{u} - \boldsymbol{\mu}_m \rVert^{2}\right)^{-\frac{\nu+D+2}{2}} \qquad (2)$$
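A literal transcription of Eq. (2) into code, under the assumption that the $\phi_m$ are Student-t bases centred at $\mu_m$ on the latent space (so D = 2 below); beta, nu, the centres, and W are illustrative values, not fitted parameters:

```python
import numpy as np
from math import gamma, pi

def t_basis_derivatives(u, centres, beta, nu):
    """Psi (M x 2) with psi_mi = d phi_m / d u_i as in Eq. (2)."""
    D = u.size  # dimension of the space the bases live on (2-D latent space here)
    coef = (gamma((nu + D) / 2) * (-nu - D) * beta ** ((D + 2) / 2)
            / (gamma(nu / 2) * pi ** (D / 2) * nu ** ((D + 2) / 2)))
    r2 = ((u - centres) ** 2).sum(axis=1)                  # ||u - mu_m||^2, M values
    radial = (1.0 + beta / nu * r2) ** (-(nu + D + 2) / 2)
    return coef * (u - centres) * radial[:, None]          # rows: (u_i - mu_i^m) terms

centres = np.array([(a, b) for a in np.linspace(-1, 1, 4)
                            for b in np.linspace(-1, 1, 4)])
Psi = t_basis_derivatives(np.array([0.1, 0.2]), centres, beta=1.0, nu=3.0)
W = np.random.default_rng(0).normal(size=(3, 16))
mf = np.sqrt(np.linalg.det(Psi.T @ W.T @ W @ Psi))         # MF as for basic GTM
```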

34 [Figure] Representation of the data together with the manifold grid (GTM on the left, t-GTM on the right).

35–36 [Figures] Representation of the MF maps and corresponding cartograms (GTM on the left, t-GTM on the right).

37 [Figure] Representation of the data together with the manifold grid (GTM on the left, t-GTM on the right).

38–39 [Figures] Representation of the MF maps and corresponding cartograms (GTM on the left, t-GTM on the right).

40 Useful Links
Cartograms software
SOM Toolbox for MATLAB
Netlab for MATLAB

41 A short bibliography
- M. Aupetit, Visualizing distortions and recovering topology in continuous projection techniques, Neurocomputing 70(7–9), 2007.
- C.M. Bishop, M. Svensén and C.K.I. Williams, Magnification factors for the SOM and GTM algorithms, Proceedings of the Workshop on Self-Organizing Maps (WSOM'97), June 4–6, Helsinki, Finland, 1997.
- M.T. Gastner and M.E.J. Newman, Diffusion-based method for producing density-equalizing maps, Proceedings of the National Academy of Sciences of the United States of America 101(20), 2004.
- A. Tosi and A. Vellido, Cartogram representation of the batch-SOM magnification factor, Proceedings of the European Symposium on Artificial Neural Networks (ESANN), Bruges, Belgium, 2012.
- A. Vellido, Missing data imputation through GTM as a mixture of t-distributions, Neural Networks 19(10), 2006.
- A. Vellido, Assessment of an unsupervised feature selection method for Generative Topographic Mapping, 16th International Conference on Artificial Neural Networks (ICANN), Athens, Greece, LNCS Vol. 4132, 2006.
- A. Vellido, P.J.G. Lisboa and D. Vicente, Robust analysis of MRS brain tumour data using t-GTM, Neurocomputing 69(7–9), 2006.
- A. Vellido, J.D. Martín, F. Rossi and P.J.G. Lisboa, Seeing is believing: the importance of visualization in real-world machine learning applications, in M. Verleysen, editor, Proceedings of the European Symposium on Artificial Neural Networks (ESANN), Bruges, Belgium, 2011.
- A. Vellido, J.D. Martín-Guerrero and P.J.G. Lisboa, Making machine learning models interpretable, Proceedings of the European Symposium on Artificial Neural Networks (ESANN), 2012.

42 THANK YOU - QUESTIONS?
Alessandra Tosi - atosi@lsi.upc.edu
Alfredo Vellido - avellido@lsi.upc.edu
