High Dimensional Discriminant Analysis


1 High Dimensional Discriminant Analysis. Charles Bouveyron, LMC-IMAG & INRIA Rhône-Alpes. Joint work with S. Girard and C. Schmid.

2 Introduction. High-dimensional data: many scientific domains need to analyze increasingly complex data; modern data are made up of many variables: imagery (MRI, vision), biology (DNA micro-arrays),... Classification is very difficult in high-dimensional spaces: many learning methods suffer from the curse of dimensionality [Bel61], since the number n of observations is generally not sufficient to learn in high dimensions. The empty space phenomenon [ST83] suggests that the data actually live in subspaces of lower dimensionality.

3 Introduction. Classification: supervised classification (discriminant analysis) requires labeled examples of the classes; unsupervised classification (clustering) aims to organize the data into homogeneous classes. Two families of methods: generative methods (QDA, LDA, GMM) and discriminative methods (logistic regression, SVM). Generative models can be used in both supervised and unsupervised classification.

4 Outline of the talk
- Discriminant analysis framework
- New model for high-dimensional data
- High dimensional discriminant analysis (HDDA): construction of the decision rule, a posteriori probability and reformulation
- Particular rules
- Estimators and intrinsic dimension estimation
- Numerical results: application to image categorization, application to object recognition
- Extension to unsupervised classification

5 Part 1 Discriminant analysis framework

6 Discriminant analysis framework. Discriminant analysis is the supervised part of classification, i.e. it requires a teacher! Goals of discriminant analysis: descriptive aspect: find a data representation which makes it possible to interpret the groups through the explanatory variables; decisional aspect: the main goal is to predict the class membership of a new observation x. Of course, HDDA favours the decisional aspect!

7 Discrimination problem. The basic problem: assign an observation x = (x_1, ..., x_p) ∈ R^p with unknown class membership to one of k classes C_1, ..., C_k known a priori. We have a learning dataset A = {(x_1, y_1), ..., (x_n, y_n) : x_j ∈ R^p and y_j ∈ {1, ..., k}}, where the vector x_j contains the p explanatory variables and y_j indicates the index of the class of x_j. We have to construct a decision rule δ : R^p → {1, ..., k}, x ↦ y.

8 Bayes decision rule. The optimal decision rule δ, called the Bayes decision rule, assigns x to the class C_i such that
i = argmax_{i=1,...,k} p(C_i | x) = argmin_{i=1,...,k} { −2 log(π_i f_i(x)) },
where π_i is the a priori probability of class C_i and f_i(x) denotes the class-conditional density of x. Generative methods usually assume that the class distributions are Gaussian N(µ_i, Σ_i).

9 Classical discriminant analysis methods. Quadratic discriminant analysis (QDA):
i = argmin_{i=1,...,k} { (x − µ_i)^t Σ_i^{-1} (x − µ_i) + log(det Σ_i) − 2 log(π_i) }.
Linear discriminant analysis (LDA), with the assumption that ∀i, Σ_i = Σ:
i = argmin_{i=1,...,k} { µ_i^t Σ^{-1} µ_i − 2 µ_i^t Σ^{-1} x − 2 log(π_i) }.
QDA and LDA have disappointing behavior when the size n of the training dataset is small compared to the number p of variables.
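As an illustration, a minimal numpy sketch of the QDA rule above (not the authors' code; `mus`, `Sigmas` and `priors` are illustrative names for the class parameters):

```python
import numpy as np

def qda_classify(x, mus, Sigmas, priors):
    """Return the class index minimizing the QDA cost
    (x - mu_i)^t Sigma_i^{-1} (x - mu_i) + log det(Sigma_i) - 2 log(pi_i)."""
    costs = []
    for mu, Sigma, pi in zip(mus, Sigmas, priors):
        diff = x - mu
        maha = diff @ np.linalg.solve(Sigma, diff)   # Mahalanobis term
        _, logdet = np.linalg.slogdet(Sigma)         # log det(Sigma_i)
        costs.append(maha + logdet - 2.0 * np.log(pi))
    return int(np.argmin(costs))
```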

10 Discriminant analysis regularization. Dimension reduction: PCA, FDA, feature selection. Fisher discriminant analysis (FDA) combines a dimension reduction step (projection on the k − 1 discriminant axes) with one of the previous methods (usually LDA). Parsimonious models: Regularized discriminant analysis (RDA, [Fri89]) is an intermediate classifier between QDA and LDA; Eigenvalue decomposition discriminant analysis (EDDA, [BC96]) is based on a reparametrization of the class covariance matrices: Σ_i = λ_i D_i A_i D_i^t.

11 Dimension reduction for classification. [Figure comparing PCA axes and discriminant axes.] Fig. 1 – High-dimensional data whose classes live in different subspaces of lower dimensionality.

12 Part 2 New model

13 New model. The empty space phenomenon allows us to assume that high-dimensional data live in subspaces of low dimensionality. The main idea of the new model: each class is decomposed over a specific subspace of low dimensionality and its complement, and the class is assumed to be spherical within each of them.

14 New model. We assume that the class-conditional densities are Gaussian N(µ_i, Σ_i) with means µ_i and covariance matrices Σ_i. Let Q_i be the orthogonal matrix of eigenvectors of the covariance matrix Σ_i, and let B_i be the basis of R^p made of these eigenvectors. The class-conditional covariance matrix ∆_i is defined in the basis B_i by: ∆_i = Q_i^t Σ_i Q_i.

15 New model. We assume in addition that ∆_i contains only two different eigenvalues a_i > b_i. Let E_i be the affine space spanned by the eigenvectors associated with the eigenvalue a_i and such that µ_i ∈ E_i. We also define E_i^⊥ such that E_i ⊕ E_i^⊥ = R^p and µ_i ∈ E_i^⊥. Let P_i and P_i^⊥ be the projection operators on E_i and E_i^⊥ respectively.

16 New model. Thus, we assume that ∆_i has the block-diagonal form
∆_i = diag(a_i, ..., a_i, b_i, ..., b_i),
with the eigenvalue a_i repeated d_i times and the eigenvalue b_i repeated (p − d_i) times.

17 New model: illustration

18 Part 3 High Dimensional Discriminant Analysis

19 High Dimensional Discriminant Analysis. Under the preceding assumptions, the Bayes decision rule yields a new decision rule δ+.
Theorem 1: The new decision rule δ+ consists in classifying x to the class C_i such that
i = argmin_{i=1,...,k} { (1/a_i) ||µ_i − P_i(x)||² + (1/b_i) ||x − P_i(x)||² + d_i log(a_i) + (p − d_i) log(b_i) − 2 log(π_i) }.
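A minimal sketch of the rule δ+, assuming each class i is summarized by (µ_i, W_i, a_i, b_i, π_i), where W_i holds the d_i leading eigenvectors of Σ_i as columns (these names are illustrative, not the authors' notation):

```python
import numpy as np

def hdda_cost(x, mu, W, a, b, prior):
    """Cost of Theorem 1. The projection on the affine subspace E_i is
    P_i(x) = mu + W W^t (x - mu), with W of shape (p, d_i)."""
    p, d = W.shape
    proj = mu + W @ (W.T @ (x - mu))                 # P_i(x)
    return (np.sum((mu - proj) ** 2) / a             # ||mu_i - P_i(x)||^2 / a_i
            + np.sum((x - proj) ** 2) / b            # ||x - P_i(x)||^2 / b_i
            + d * np.log(a) + (p - d) * np.log(b)
            - 2.0 * np.log(prior))

def hdda_classify(x, params):
    """params: one (mu, W, a, b, prior) tuple per class."""
    return int(np.argmin([hdda_cost(x, *theta) for theta in params]))
```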

20 HDDA: illustration.
K_i(x) = (1/a_i) ||µ_i − P_i(x)||² + (1/b_i) ||x − P_i(x)||² + d_i log(a_i) + (p − d_i) log(b_i) − 2 log(π_i).

21 HDDA: a posteriori probability. In many applications, it is useful to have access to the a posteriori probability p(C_i | x) that x belongs to C_i. The Bayes formula yields:
p(C_i | x) = exp(−K_i(x)/2) / Σ_{j=1}^{k} exp(−K_j(x)/2),
where K_i is the cost function of δ+ associated with the class C_i:
K_i(x) = (1/a_i) ||µ_i − P_i(x)||² + (1/b_i) ||x − P_i(x)||² + d_i log(a_i) + (p − d_i) log(b_i) − 2 log(π_i).
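With the costs K_i in hand, the posterior probabilities follow directly; a small sketch (the max-subtraction is only a numerical-stability trick, not part of the formula on the slide):

```python
import numpy as np

def hdda_posteriors(costs):
    """p(C_i | x) = exp(-K_i(x)/2) / sum_j exp(-K_j(x)/2)."""
    logits = -0.5 * np.asarray(costs, dtype=float)
    logits -= logits.max()          # shift before exponentiating, for stability
    w = np.exp(logits)
    return w / w.sum()
```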

22 HDDA: reformulation. In order to interpret the decision rule δ+ more easily, we introduce α_i and σ_i:
a_i = σ_i² / α_i and b_i = σ_i² / (1 − α_i), with α_i ∈ ]0, 1[ and σ_i > 0.
Thus, the decision rule δ+ consists in classifying x to the class C_i such that
i = argmin_{i=1,...,k} { (1/σ_i²) ( α_i ||µ_i − P_i(x)||² + (1 − α_i) ||x − P_i(x)||² ) + 2p log(σ_i) + d_i log((1 − α_i)/α_i) − p log(1 − α_i) − 2 log(π_i) }.
Notation: HDDA is the model [a_i b_i Q_i d_i] or [α_i σ_i Q_i d_i].

23 Part 4 Particular rules

24 Particular rules. By allowing only some of the HDDA parameters to vary between classes, we obtain 24 particular rules: they correspond to different regularizations, some of them have a simple geometric interpretation, and 9 of them have explicit formulations. HDDA can be interpreted as a classical discriminant analysis in particular cases: if ∀i, α_i = 1/2, then δ+ is QDA with spherical classes; if in addition ∀i, σ_i = σ, then δ+ is LDA with spherical classes.

25 Links with classical methods. [Diagram relating the models: QDA, EDDA (Σ_i = λ_i D_i A_i D_i^t), HDDA (Σ_i = Q_i ∆_i Q_i^t), LDA (Σ_i = λDAD^t), the spherical variants QDAs (Σ_i = σ_i² Id) and LDAs (σ_i = σ), and the geometric LDA (π_i = π).]

26 Model [α σ Q_i d_i]. The decision rule δ+ consists in classifying x to the class C_i such that
i = argmin_{i=1,...,k} { α ||µ_i − P_i(x)||² + (1 − α) ||x − P_i(x)||² }.

27 Part 5 Estimation

28 HDDA estimators. Estimators are computed using maximum likelihood estimation from the learning set A. Common estimators:
π̂_i = n_i / n, with n_i = #(C_i),
µ̂_i = (1/n_i) Σ_{x_j ∈ C_i} x_j,
Σ̂_i = (1/n_i) Σ_{x_j ∈ C_i} (x_j − µ̂_i)(x_j − µ̂_i)^t.
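A minimal sketch of these common estimators, assuming a data matrix X of shape (n, p) and labels y coded 0, ..., k−1 (coding and names are illustrative assumptions):

```python
import numpy as np

def class_estimators(X, y, k):
    """Empirical proportions, means and ML covariance matrices per class."""
    n, _ = X.shape
    priors, means, covs = [], [], []
    for i in range(k):
        Xi = X[y == i]
        ni = Xi.shape[0]
        mu = Xi.mean(axis=0)
        diff = Xi - mu
        priors.append(ni / n)
        means.append(mu)
        covs.append(diff.T @ diff / ni)   # ML estimator: divide by n_i, not n_i - 1
    return priors, means, covs
```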

29 Estimators of the model [a_i b_i Q_i d_i]. Assuming d_i is known, the ML estimators are:
Q̂_i is made of the eigenvectors of Σ̂_i, ordered by decreasing eigenvalue,
â_i is the mean of the d_i largest eigenvalues of Σ̂_i: â_i = (1/d_i) Σ_{l=1}^{d_i} λ_il,
b̂_i is the mean of the (p − d_i) smallest eigenvalues of Σ̂_i: b̂_i = (1/(p − d_i)) Σ_{l=d_i+1}^{p} λ_il.
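A sketch of these estimators from the empirical covariance matrix, assuming d_i is given (illustrative code, not the authors' implementation):

```python
import numpy as np

def hdda_class_parameters(Sigma_hat, d):
    """ML estimators of Q_i, a_i and b_i for the model [a_i b_i Q_i d_i]."""
    eigval, eigvec = np.linalg.eigh(Sigma_hat)       # ascending eigenvalues
    eigval, eigvec = eigval[::-1], eigvec[:, ::-1]   # reorder: decreasing eigenvalues
    Q = eigvec                                       # eigenvectors of Sigma_hat
    a = eigval[:d].mean()                            # mean of the d largest eigenvalues
    b = eigval[d:].mean()                            # mean of the (p - d) smallest ones
    return Q, a, b
```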

30 Estimation trick. The decision rule δ+ does not require computing the last (p − d_i) eigenvectors of Σ̂_i. Thus, in order to minimize the number of parameters to estimate, we use the following relation:
Σ_{l=d_i+1}^{p} λ_il = tr(Σ̂_i) − Σ_{l=1}^{d_i} λ_il.
Number of parameters to estimate with p = 100, d_i = 10 and k = 4: QDA: …, HDDA: …
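One way to exploit this relation is to compute only the d_i leading eigenpairs and recover b̂_i from the trace; a hedged sketch using scipy.sparse.linalg.eigsh (one possible implementation choice, not necessarily the authors'):

```python
import numpy as np
from scipy.sparse.linalg import eigsh

def hdda_parameters_truncated(Sigma_hat, d):
    """Estimate (W_i, a_i, b_i) from the d leading eigenpairs only;
    the mass of the remaining eigenvalues comes from the trace."""
    p = Sigma_hat.shape[0]
    top_vals, top_vecs = eigsh(Sigma_hat, k=d, which='LM')  # d largest eigenvalues
    a = top_vals.mean()
    b = (np.trace(Sigma_hat) - top_vals.sum()) / (p - d)    # trace relation above
    return top_vecs, a, b
```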

31 Intrinsic dimension estimation. We base our approach for choosing the values of d_i on the eigenvalues of Σ_i. We use two empirical methods:
common thresholding on the cumulative variance: d_i = min { d ∈ {1, ..., p−1} : Σ_{j=1}^{d} λ_ij / Σ_{j=1}^{p} λ_ij ≥ s },
scree test of Cattell: analyze the differences between successive eigenvalues in order to find a break in the scree of eigenvalues.
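Both empirical methods are easy to sketch from the ordered eigenvalues; the threshold values s and t below are illustrative choices, not values given on the slide:

```python
import numpy as np

def dim_cumulative_variance(eigvals, s=0.95):
    """Common thresholding: smallest d whose first d eigenvalues
    explain a fraction >= s of the total variance."""
    lam = np.sort(eigvals)[::-1]
    ratios = np.cumsum(lam) / lam.sum()
    return int(np.argmax(ratios >= s) + 1)

def dim_scree_test(eigvals, t=0.2):
    """Scree test of Cattell: keep the dimensions up to the last 'large' drop
    between successive eigenvalues (large relative to the biggest drop)."""
    lam = np.sort(eigvals)[::-1]
    diffs = -np.diff(lam)                        # successive decreases
    large = np.where(diffs >= t * diffs.max())[0]
    return int(large.max() + 1)
```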

32 Intrinsic dimension estimation. [Figure: ordered eigenvalues of Σ_i; left panel: cumulative sum of the eigenvalues (common thresholding); right panel: differences between successive eigenvalues (scree test of Cattell).]

33 Part 6 Numerical results

34 Results: artificial data
Method                      Classification rate
HDDA ([a_i b_i Q_i d_i])    …
HDDA ([a_i b_i Q_i d])      …
LDA                         …
FDA                         0.51
SVM                         …
Gaussian densities in R^15, with d_1 = 3, d_2 = 4 and d_3 = 5. In addition, the proportions are very different: π_1 = 1/2, π_2 = 1/3 and π_3 = 1/6.

35 Results: image categorization. A recent study [LBGGDH03] proposes an approach based on human perception to categorize natural images. An image is represented by a vector of 49 dimensions, each component being the response of the image to a Gabor filter.

36 Results: image categorization. Data: 328 descriptors in 49 dimensions. Results:
Method                      Classification rate
HDDA ([a_i b_i Q_i d_i])    …
HDDA ([a_i b Q_i d])        …
QDA                         …
LDA                         …
FDA (d = k − 1)             0.79
SVM                         …
Classification results for the image categorization experiment (leave-one-out).

37 Results: object recognition. Our approach uses local descriptors (Harris-Laplace + SIFT). We consider 3 object classes (wheels, seat and handlebars) and 1 background class. The dataset is made of 1000 descriptors in 128 dimensions: learning dataset: 500, test dataset: 500.

38 Results: object recognition. [Figure: ROC curves (true positives vs. false positives) comparing FDA, LDA and SVM classifiers with HDDA classifiers; HDDA shown with error probability < 10^-5 and with error probability < …] Classification results for the object recognition experiment.

39 Results: object recognition. [Images: recognition using HDDA vs. recognition using SVM.]

40 Part 7 Unsupervised classification

41 Extension to unsupervised classification. Unsupervised classification aims to organize data into homogeneous classes. Gaussian mixture models (GMM) are an efficient tool for unsupervised classification: the density of the mixture is
f(x; θ) = Σ_{i=1}^{k} π_i f_i(x; µ_i, Σ_i),
where θ = {π_1, ..., π_k, µ_1, ..., µ_k, Σ_1, ..., Σ_k}. Parameter estimation is generally done with the EM algorithm.
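For concreteness, a small sketch of the mixture density (scipy's multivariate_normal is used here only for illustration):

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_density(x, priors, means, covs):
    """f(x; theta) = sum_i pi_i N(x; mu_i, Sigma_i)."""
    return sum(pi * multivariate_normal.pdf(x, mean=mu, cov=S)
               for pi, mu, S in zip(priors, means, covs))
```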

42 Extension to unsupervised classification. Using our model for HD data, the two main steps of the EM algorithm are:
E step: compute t_ij^(q) = t_i^(q)(x_j) = exp(−K_i^(q)(x_j)/2) / Σ_{l=1}^{k} exp(−K_l^(q)(x_j)/2),
where K_i^(q)(x_j) = (1/a_i^(q)) ||µ_i^(q) − P_i^(q)(x_j)||² + (1/b_i^(q)) ||x_j − P_i^(q)(x_j)||² + d_i^(q) log(a_i^(q)) + (p − d_i^(q)) log(b_i^(q)) − 2 log(π_i^(q)).
M step: classical estimation of π_i, µ_i and Σ_i; the estimators of a_i, b_i and Q_i are the same as those of HDDA.
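A vectorized sketch of the E step under this model, reusing the per-class parameterization (µ_i, W_i, a_i, b_i, π_i) assumed earlier (illustrative, not the authors' code):

```python
import numpy as np

def e_step(X, params):
    """Posterior probabilities t_ij that observation x_j belongs to class C_i,
    given the current parameters (one (mu, W, a, b, prior) tuple per class)."""
    n = X.shape[0]
    k = len(params)
    K = np.empty((n, k))
    for i, (mu, W, a, b, prior) in enumerate(params):
        p, d = W.shape
        diff = X - mu                              # x_j - mu_i, shape (n, p)
        proj = diff @ W @ W.T                      # component of x_j - mu_i inside E_i
        K[:, i] = (np.sum(proj ** 2, axis=1) / a   # ||mu_i - P_i(x_j)||^2 / a_i
                   + np.sum((diff - proj) ** 2, axis=1) / b
                   + d * np.log(a) + (p - d) * np.log(b)
                   - 2.0 * np.log(prior))
    logits = -0.5 * K
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    T = np.exp(logits)
    return T / T.sum(axis=1, keepdims=True)        # rows sum to 1
```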

43 References
[BC96] H. Bensmail and G. Celeux. Regularized Gaussian discriminant analysis through eigenvalue decomposition. Journal of the American Statistical Association, 91, 1996.
[Bel61] R. Bellman. Adaptive Control Processes. Princeton University Press, 1961.
[Fri89] J.H. Friedman. Regularized discriminant analysis. Journal of the American Statistical Association, 84, 1989.
[LBGGDH03] H. Le Borgne, N. Guyader, A. Guérin-Dugué, and J. Hérault. Classification of images: ICA filters vs human perception. In 7th International Symposium on Signal Processing and its Applications, 2003.
[ST83] D. Scott and J. Thompson. Probability density estimation in higher dimensions. In Proceedings of the Fifteenth Symposium on the Interface, North Holland-Elsevier Science Publishers, 1983.
