ELM Classification: Postprocessing NBC
1 ELM Classification: Postprocessing by GMM or NBC
Amaury Lendasse, Andrey Gritsenko, Emil Eirola, Yoan Miche and Kaj-Mikael Björk
MIE Department and Informatics Initiative, UIOWA - amaury-lendasse@uiowa.edu
2 Who am I?
- Belgian: sorry for my French accent :)
- Born on April 16th
4 Who am I?
- Postdoc at the University of Memphis
- Postdoc and adjunct Prof. at Aalto Univ. in Finland
- Prof. at the University of the Basque Country in Spain
- Lecturer in Arcada
- Lecturer in Aalto
5 Who am I?
- Associate Prof. at The University of Iowa: 50% MIE and 50% Informatics Initiative
- Research: Machine Learning, Big Data, Environmental Modeling
6 What is Classification?
- Supervised Learning: we have to predict some output y based on some input x
- y can have:
  - two values (binary classification)
  - several ranked values (very small, small, medium, big, very big): multi-class problem
  - several unranked values (English, American, French, Belgian): multi-class problem
7 What is Classification?
- Old problem: Fisher, R.A. "The use of multiple measurements in taxonomic problems", Annals of Eugenics, 7, Part II (1936)
- Many models (classifiers) are available: DL, KNN, LVQ, RBFN, MLP, SVM, ELM, GMM, LDA, NBC
- What is the object on the right of this handsome guy?
8 Two types of Classifiers
- Classifiers that predict the class: DL, LVQ, RBFN, MLP, SVM, ELM, LDA
  - Easy to build
- Classifiers that provide a probability for each class: GMM, NBC, (KNN?)
  - Not so easy to build, or not working
9 Two types of Classifiers
- Classifiers that predict the class:
  - y = Belgian
  - y = French
  - y = American
- Classifiers that provide a probability for each class:
  - y = [99% Belgian, 1% French, 0% American]
  - y = [49% Belgian, 51% French, 0% American]
  - y = [33% Belgian, 33% French, 34% American]
11 What do we want?
A classifier that:
- is accurate/efficient (to be defined)
- is fast to build (to train)
- is fast to predict the class of a new element
- provides a probability for each class
- can be handled by a cute brunette (easy to use, automatic)
12 Bayes Formula
$$p(C_1 \mid x) = \frac{p(x \mid C_1)\,p(C_1)}{p(x \mid C_1)\,p(C_1) + p(x \mid C_2)\,p(C_2) + p(x \mid C_3)\,p(C_3)}$$
$$p(C_2 \mid x) = \frac{p(x \mid C_2)\,p(C_2)}{p(x \mid C_1)\,p(C_1) + p(x \mid C_2)\,p(C_2) + p(x \mid C_3)\,p(C_3)}$$
$$p(C_3 \mid x) = \frac{p(x \mid C_3)\,p(C_3)}{p(x \mid C_1)\,p(C_1) + p(x \mid C_2)\,p(C_2) + p(x \mid C_3)\,p(C_3)}$$
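The Bayes formula for three classes can be checked numerically with a few lines of Python. This is only a minimal sketch: the function name `posteriors` and the likelihood and prior values are made up for illustration and do not come from the slides.

```python
def posteriors(likelihoods, priors):
    """Return p(C_i | x) for every class from p(x | C_i) and p(C_i)."""
    joint = [lik * prior for lik, prior in zip(likelihoods, priors)]
    evidence = sum(joint)  # denominator shared by all classes
    if evidence == 0.0:
        raise ValueError("x has zero density under every class")
    return [j / evidence for j in joint]


# Hypothetical numbers for three classes C_1, C_2, C_3.
likelihoods = [0.30, 0.10, 0.05]  # p(x | C_i)
priors = [0.5, 0.3, 0.2]          # p(C_i)
post = posteriors(likelihoods, priors)
```

Because all three posteriors share the same denominator, they sum to one, and the predicted class is simply the index of the largest entry of `post`.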
14 Naive Bayes Classifier
$$p(C_1 \mid x) = \frac{p(x \mid C_1)\,p(C_1)}{\sum_{i=1}^{3} p(x \mid C_i)\,p(C_i)}$$
"Naive" conditional independence assumptions:
$$p(C_1 \mid x) = \frac{p(C_1)\prod_{j=1}^{d} p(x_j \mid C_1)}{\sum_{i=1}^{3} p(C_i)\prod_{j=1}^{d} p(x_j \mid C_i)}$$
- Easy to estimate
- Wrong!
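A small sketch may make the factorized formula concrete. It assumes, as an illustration only, that each one-dimensional density p(x_j | C_i) is Gaussian; the slides do not say which density the authors use. The function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate the prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n, d = len(rows), len(rows[0])
        means = [sum(r[j] for r in rows) / n for j in range(d)]
        # Small constant keeps the variance strictly positive.
        variances = [sum((r[j] - means[j]) ** 2 for r in rows) / n + 1e-9
                     for j in range(d)]
        model[label] = (n / len(X), means, variances)
    return model


def predict_proba(model, x):
    """p(C_i | x) under the naive factorization prod_j p(x_j | C_i)."""
    log_joint = {}
    for label, (prior, means, variances) in model.items():
        lp = math.log(prior)
        for xj, m, v in zip(x, means, variances):
            lp += -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
        log_joint[label] = lp
    top = max(log_joint.values())  # subtract the max for numerical stability
    unnorm = {c: math.exp(lp - top) for c, lp in log_joint.items()}
    total = sum(unnorm.values())
    return {c: u / total for c, u in unnorm.items()}


# Toy data, invented for illustration only.
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [4.0, 5.0], [4.2, 4.8], [3.8, 5.2]]
y = ["Belgian", "Belgian", "Belgian", "French", "French", "French"]
model = fit_gaussian_nb(X, y)
proba = predict_proba(model, [1.1, 2.1])
```

Each class needs only d means and d variances, which is why the model is easy to estimate. The price is that correlations between features are ignored.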
18 Gaussian Mixture Model
19 Gaussian Distribution
- The Gaussian (Normal) distribution is a common continuous probability distribution
- Has a multivariate extension
- Parameters: mean and covariance matrix
20 Gaussian Mixture Model
- The probability distribution of the given data is approximated by a mixture of multiple Gaussians
- Works for any data distribution, given enough Gaussian components
- Approximates the data density: unsupervised learning method
21 Mixture Model
22 Mixture of Gaussians
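23 Mixture of Gaussians
Density estimation:
$$p(x) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(x \mid \mu_k, \Sigma_k)$$
where
$$\mathcal{N}(x \mid \mu_k, \Sigma_k) = \frac{1}{(2\pi)^{d/2}\,|\Sigma_k|^{1/2}} \exp\left\{-\frac{1}{2}(x-\mu_k)^T \Sigma_k^{-1} (x-\mu_k)\right\}$$
is the multivariate Gaussian distribution.
- The number of components K is a hyper-parameter determined via cross-validation.
- Why the Gaussian distribution? For a given mean and covariance, it is the least restrictive (maximum-entropy) continuous distribution.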
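The mixture density can be evaluated directly from its definition: a weighted sum of multivariate Gaussian densities. The sketch below uses NumPy; the function names and the two-component mixture (weights, means, covariances) are hypothetical values chosen for illustration, not parameters from the talk.

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Multivariate normal density N(x | mean, cov)."""
    d = len(mean)
    diff = x - mean
    maha = diff @ np.linalg.solve(cov, diff)  # (x - mu)^T Sigma^{-1} (x - mu)
    norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm


def gmm_pdf(x, weights, means, covs):
    """Mixture density: sum over components of weight times Gaussian density."""
    return sum(w * gaussian_pdf(x, m, c)
               for w, m, c in zip(weights, means, covs))


# Hypothetical two-component mixture in two dimensions.
weights = [0.6, 0.4]
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
covs = [np.eye(2), 2.0 * np.eye(2)]
density_at_origin = gmm_pdf(np.array([0.0, 0.0]), weights, means, covs)
```

In practice the weights, means and covariances are not chosen by hand but fitted to data, typically with the EM algorithm.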
24 ELM
25 ELM: A Robust Modeling Technique? Yes!
What is Extreme Learning Machine? A simple BUT very smart idea! Prof. Guang-Bin Huang, NTU, Singapore
26 What is Extreme Learning Machine? Good old MLP
29 What is Extreme Learning Machine? Notations
31 What is Extreme Learning Machine? Training
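The body of the Training slides did not survive transcription. As a reminder of the general idea, here is a minimal sketch of the usual ELM recipe: the hidden-layer weights are drawn at random and left untrained, and only the output weights are computed, by linear least squares. The names `train_elm` and `predict_elm`, the tanh activation, the number of hidden neurons and the toy data are all assumptions made for this illustration, not the authors' exact setup.

```python
import numpy as np


def train_elm(X, T, n_hidden, rng):
    """Random hidden layer, then output weights by linear least squares."""
    W = rng.standard_normal((X.shape[1], n_hidden))  # random, never trained
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                           # hidden-layer outputs
    beta, *_ = np.linalg.lstsq(H, T, rcond=None)     # minimizes ||H beta - T||
    return W, b, beta


def predict_elm(X, W, b, beta):
    """Apply the fixed hidden layer, then the learned output weights."""
    return np.tanh(X @ W + b) @ beta


# Toy two-class problem, invented for illustration.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, (50, 2)), rng.normal(2.0, 0.5, (50, 2))])
labels = np.repeat([0, 1], 50)
T = np.eye(2)[labels]                                # one-hot targets
W, b, beta = train_elm(X, T, n_hidden=20, rng=rng)
Y_hat = predict_elm(X, W, b, beta)
accuracy = (Y_hat.argmax(axis=1) == labels).mean()
```

Training involves no iterative optimization, only one least-squares solve. The rows of `Y_hat` are real-valued scores rather than probabilities, which is the motivation for the postprocessing step discussed later in the talk.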
34 What is Extreme Learning Machine? First Example
36 My first Simple Test
38 (TR)OP-ELM
39 So you know ELM
43 Regularization Approaches
48 Regularization Approaches
- Provides a ranking of the variables
49 Tikhonov Regularization
51 Elastic Net
53 Drawback of these methods
54 OP-ELM
55 PRESS is Exact but Fast LOO Calculation
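The slide title claims that PRESS gives the leave-one-out (LOO) error exactly and cheaply. For a linear least-squares model this follows from a standard identity: the LOO residual of sample i equals its ordinary residual divided by one minus the i-th diagonal entry of the hat matrix. The sketch below checks this numerically against the slow refit-n-times computation. `H` and `t` are assumed names for the hidden-layer output matrix and the targets, and the random data is only a stand-in.

```python
import numpy as np


def press_residuals(H, t):
    """Leave-one-out residuals from a single fit: e_i / (1 - h_ii)."""
    P = np.linalg.pinv(H)                    # pseudo-inverse of H
    residuals = t - H @ (P @ t)              # ordinary training residuals
    leverage = np.einsum("ij,ji->i", H, P)   # diagonal of the hat matrix H P
    return residuals / (1.0 - leverage)


def naive_loo_residuals(H, t):
    """Same quantity the slow way: refit n times, each time without sample i."""
    out = np.empty(len(t))
    for i in range(len(t)):
        keep = np.arange(len(t)) != i
        beta, *_ = np.linalg.lstsq(H[keep], t[keep], rcond=None)
        out[i] = t[i] - H[i] @ beta
    return out


# Random stand-in for a hidden-layer output matrix H and targets t.
rng = np.random.default_rng(0)
H = rng.standard_normal((30, 5))
t = rng.standard_normal(30)
fast = press_residuals(H, t)
slow = naive_loo_residuals(H, t)
press = np.mean(fast ** 2)                   # LOO mean squared error
```

The fast version needs one pseudo-inverse instead of n separate fits, yet `fast` and `slow` agree up to rounding error when `H` has full column rank.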
56 TROP-ELM
57 TROP-ELM: TR-PRESS
58 TROP-ELM: Global Methodology
59 TROP-ELM
60 New Idea: ELM + GMM
Bayes formula on the input x:
$$p(C_1 \mid x) = \frac{p(x \mid C_1)\,p(C_1)}{\sum_{i=1}^{3} p(x \mid C_i)\,p(C_i)}$$
Replace the input x by the ELM output $\hat{y}$:
$$p(C_1 \mid \hat{y}) = \frac{p(\hat{y} \mid C_1)\,p(C_1)}{\sum_{i=1}^{3} p(\hat{y} \mid C_i)\,p(C_i)}$$
- The class-conditional densities $p(\hat{y} \mid C_i)$ are approximated using GMM
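The postprocessing step can be sketched as follows: fit a density to the ELM outputs of each class, then apply the Bayes formula to a new output vector. To stay short, the sketch replaces the GMM by a single full-covariance Gaussian per class (a one-component mixture), which is a simplification and not the authors' method. The ELM outputs are simulated as noisy one-hot vectors, and all names and the regularization constant `reg` are hypothetical.

```python
import numpy as np


def fit_class_gaussians(Y_hat, labels, reg=1e-3):
    """Fit one Gaussian per class to the classifier outputs (K = 1 mixture)."""
    n, m = Y_hat.shape
    params = {}
    for c in np.unique(labels):
        Yc = Y_hat[labels == c]
        mean = Yc.mean(axis=0)
        # Regularize: outputs trained on one-hot targets can be nearly collinear.
        cov = np.cov(Yc, rowvar=False, bias=True) + reg * np.eye(m)
        params[int(c)] = (len(Yc) / n, mean, cov)
    return params


def log_gauss(y, mean, cov):
    """Log-density of a multivariate Gaussian at y."""
    diff = y - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (len(y) * np.log(2 * np.pi) + logdet + maha)


def class_posteriors(params, y_hat):
    """Bayes formula applied to the output vector y_hat instead of the input x."""
    classes = sorted(params)
    log_joint = np.array([np.log(params[c][0]) + log_gauss(y_hat, *params[c][1:])
                          for c in classes])
    log_joint -= log_joint.max()             # numerical stability
    post = np.exp(log_joint)
    return post / post.sum()


# Synthetic stand-in for ELM outputs: noisy one-hot vectors for 3 classes.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 50)
Y_hat = np.eye(3)[labels] + 0.2 * rng.standard_normal((150, 3))
params = fit_class_gaussians(Y_hat, labels)
post = class_posteriors(params, np.array([0.9, 0.1, 0.0]))
```

The densities are fitted in the low-dimensional output space (one coordinate per class) rather than in the original input space, which keeps the density estimation cheap.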
64 Experiments
65 Results (1)
66 Are the Probabilities correctly evaluated?
- Let's ask an expert to analyze the misclassifications!
- No, it is boring (not automatic) and usually not possible
68 Are the Probabilities correctly evaluated?
- If the correct class corresponds to one of the two largest probabilities
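One possible reading of this criterion is a "top-2" score: a prediction counts as acceptable when the true class is among the two classes with the largest predicted probabilities. The sketch below implements that reading. The function name is hypothetical, the probability vectors are the ones shown earlier in the slides, and the true labels are invented.

```python
def top2_accuracy(prob_rows, true_classes):
    """Fraction of samples whose true class is among the two largest probabilities."""
    hits = 0
    for probs, truth in zip(prob_rows, true_classes):
        ranked = sorted(probs, key=probs.get, reverse=True)
        hits += truth in ranked[:2]
    return hits / len(true_classes)


# Probability vectors taken from the slides; the true labels are invented.
prob_rows = [
    {"Belgian": 0.99, "French": 0.01, "American": 0.00},
    {"Belgian": 0.49, "French": 0.51, "American": 0.00},
    {"Belgian": 0.33, "French": 0.33, "American": 0.34},
]
true_classes = ["Belgian", "American", "American"]
score = top2_accuracy(prob_rows, true_classes)
```

Here the first and third samples are hits and the second is a miss, because "American" received the smallest probability in the second vector.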
69 Are the Probabilities correctly evaluated?
- If the output of the classifier is a probability, THEN let's draw the classification randomly using this probability
- Seriously?
70 Are the Probabilities correctly evaluated?
- Classifiers that provide a probability for each class:
  - y = [99% Belgian, 1% French, 0% American]
  - y = [49% Belgian, 51% French, 0% American]
  - y = [33% Belgian, 33% French, 34% American]
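Drawing the class at random from the predicted probabilities can be sketched in a few lines. The expected accuracy of such draws is simply the average probability assigned to the true class, so it rewards classifiers that put high probability on the right answer. The function names and the true labels below are hypothetical; the probability vectors are the ones from the slide.

```python
import random


def draw_class(probs, rng):
    """Sample one class label according to the predicted probabilities."""
    labels = list(probs)
    return rng.choices(labels, weights=[probs[c] for c in labels], k=1)[0]


def expected_draw_accuracy(prob_rows, true_classes):
    """Expected accuracy of random draws: mean probability of the true class."""
    return sum(p[t] for p, t in zip(prob_rows, true_classes)) / len(true_classes)


# Probability vectors from the slide; the true labels are invented.
prob_rows = [
    {"Belgian": 0.99, "French": 0.01, "American": 0.00},
    {"Belgian": 0.49, "French": 0.51, "American": 0.00},
    {"Belgian": 0.33, "French": 0.33, "American": 0.34},
]
true_classes = ["Belgian", "French", "American"]
rng = random.Random(0)
draws = [draw_class(p, rng) for p in prob_rows]
expected = expected_draw_accuracy(prob_rows, true_classes)
```

With an argmax decision all three samples would be classified correctly, yet the expected accuracy of random draws is much lower because the second and third probability vectors are far from confident.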
71 Are the Probabilities correctly evaluated?
72 Conclusions?
We wanted a classifier that:
- is accurate: similar to the state of the art
- is fast to build: the fastest?
- is fast to predict the class of a new element: Oh yes baby!
- provides a probability for each class: Yes we can!
- can be handled by a cute brunette: to be determined
- Big Data?
Basic version accepted to IWANN'15, journal version: under review
79 Thank You! Questions?
Classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2012
Classification CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Topics Discriminant functions Logistic regression Perceptron Generative models Generative vs. discriminative
More informationMachine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function.
Bayesian learning: Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Let y be the true label and y be the predicted
More informationMachine Learning for Signal Processing Bayes Classification
Machine Learning for Signal Processing Bayes Classification Class 16. 24 Oct 2017 Instructor: Bhiksha Raj - Abelino Jimenez 11755/18797 1 Recap: KNN A very effective and simple way of performing classification
More informationMachine Learning Linear Classification. Prof. Matteo Matteucci
Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning Expectation Maximization Mark Schmidt University of British Columbia Winter 2018 Last Time: Learning with MAR Values We discussed learning with missing at random values in data:
More informationIntroduction to Machine Learning
1, DATA11002 Introduction to Machine Learning Lecturer: Teemu Roos TAs: Ville Hyvönen and Janne Leppä-aho Department of Computer Science University of Helsinki (based in part on material by Patrik Hoyer
More informationMachine Learning. Regression-Based Classification & Gaussian Discriminant Analysis. Manfred Huber
Machine Learning Regression-Based Classification & Gaussian Discriminant Analysis Manfred Huber 2015 1 Logistic Regression Linear regression provides a nice representation and an efficient solution to
More informationGenerative Model (Naïve Bayes, LDA)
Generative Model (Naïve Bayes, LDA) IST557 Data Mining: Techniques and Applications Jessie Li, Penn State University Materials from Prof. Jia Li, sta3s3cal learning book (Has3e et al.), and machine learning
More informationIntroduction to Machine Learning
1, DATA11002 Introduction to Machine Learning Lecturer: Antti Ukkonen TAs: Saska Dönges and Janne Leppä-aho Department of Computer Science University of Helsinki (based in part on material by Patrik Hoyer,
More informationISyE 6416: Computational Statistics Spring Lecture 5: Discriminant analysis and classification
ISyE 6416: Computational Statistics Spring 2017 Lecture 5: Discriminant analysis and classification Prof. Yao Xie H. Milton Stewart School of Industrial and Systems Engineering Georgia Institute of Technology
More informationIntro. ANN & Fuzzy Systems. Lecture 15. Pattern Classification (I): Statistical Formulation
Lecture 15. Pattern Classification (I): Statistical Formulation Outline Statistical Pattern Recognition Maximum Posterior Probability (MAP) Classifier Maximum Likelihood (ML) Classifier K-Nearest Neighbor
More informationClassification Methods II: Linear and Quadratic Discrimminant Analysis
Classification Methods II: Linear and Quadratic Discrimminant Analysis Rebecca C. Steorts, Duke University STA 325, Chapter 4 ISL Agenda Linear Discrimminant Analysis (LDA) Classification Recall that linear
More informationMidterm: CS 6375 Spring 2015 Solutions
Midterm: CS 6375 Spring 2015 Solutions The exam is closed book. You are allowed a one-page cheat sheet. Answer the questions in the spaces provided on the question sheets. If you run out of room for an
More informationThe classifier. Theorem. where the min is over all possible classifiers. To calculate the Bayes classifier/bayes risk, we need to know
The Bayes classifier Theorem The classifier satisfies where the min is over all possible classifiers. To calculate the Bayes classifier/bayes risk, we need to know Alternatively, since the maximum it is
More informationThe classifier. Linear discriminant analysis (LDA) Example. Challenges for LDA
The Bayes classifier Linear discriminant analysis (LDA) Theorem The classifier satisfies In linear discriminant analysis (LDA), we make the (strong) assumption that where the min is over all possible classifiers.
More informationGenerative Clustering, Topic Modeling, & Bayesian Inference
Generative Clustering, Topic Modeling, & Bayesian Inference INFO-4604, Applied Machine Learning University of Colorado Boulder December 12-14, 2017 Prof. Michael Paul Unsupervised Naïve Bayes Last week
More informationMachine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall
Machine Learning Gaussian Mixture Models Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall 2012 1 The Generative Model POV We think of the data as being generated from some process. We assume
More informationLearning with multiple models. Boosting.
CS 2750 Machine Learning Lecture 21 Learning with multiple models. Boosting. Milos Hauskrecht milos@cs.pitt.edu 5329 Sennott Square Learning with multiple models: Approach 2 Approach 2: use multiple models
More informationPerformance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project
Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Devin Cornell & Sushruth Sastry May 2015 1 Abstract In this article, we explore
More informationPATTERN RECOGNITION AND MACHINE LEARNING
PATTERN RECOGNITION AND MACHINE LEARNING Chapter 1. Introduction Shuai Huang April 21, 2014 Outline 1 What is Machine Learning? 2 Curve Fitting 3 Probability Theory 4 Model Selection 5 The curse of dimensionality
More informationMixtures of Gaussians continued
Mixtures of Gaussians continued Machine Learning CSE446 Carlos Guestrin University of Washington May 17, 2013 1 One) bad case for k-means n Clusters may overlap n Some clusters may be wider than others
More informationIntroduction to Machine Learning
Introduction to Machine Learning Bayesian Classification Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB CSE 474/574
More informationMachine Learning for Signal Processing Bayes Classification and Regression
Machine Learning for Signal Processing Bayes Classification and Regression Instructor: Bhiksha Raj 11755/18797 1 Recap: KNN A very effective and simple way of performing classification Simple model: For
More informationParametric Unsupervised Learning Expectation Maximization (EM) Lecture 20.a
Parametric Unsupervised Learning Expectation Maximization (EM) Lecture 20.a Some slides are due to Christopher Bishop Limitations of K-means Hard assignments of data points to clusters small shift of a
More informationIntroduction to Machine Learning
Outline Introduction to Machine Learning Bayesian Classification Varun Chandola March 8, 017 1. {circular,large,light,smooth,thick}, malignant. {circular,large,light,irregular,thick}, malignant 3. {oval,large,dark,smooth,thin},
More informationday month year documentname/initials 1
ECE471-571 Pattern Recognition Lecture 13 Decision Tree Hairong Qi, Gonzalez Family Professor Electrical Engineering and Computer Science University of Tennessee, Knoxville http://www.eecs.utk.edu/faculty/qi
More informationBayesian Networks Inference with Probabilistic Graphical Models
4190.408 2016-Spring Bayesian Networks Inference with Probabilistic Graphical Models Byoung-Tak Zhang intelligence Lab Seoul National University 4190.408 Artificial (2016-Spring) 1 Machine Learning? Learning
More informationNonparametric Bayesian Methods (Gaussian Processes)
[70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent
More informationData Mining. Practical Machine Learning Tools and Techniques. Slides for Chapter 4 of Data Mining by I. H. Witten, E. Frank and M. A.
Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter of Data Mining by I. H. Witten, E. Frank and M. A. Hall Statistical modeling Opposite of R: use all the attributes Two assumptions:
More informationCSCI-567: Machine Learning (Spring 2019)
CSCI-567: Machine Learning (Spring 2019) Prof. Victor Adamchik U of Southern California Mar. 19, 2019 March 19, 2019 1 / 43 Administration March 19, 2019 2 / 43 Administration TA3 is due this week March
More informationProbabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016
Probabilistic classification CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2016 Topics Probabilistic approach Bayes decision theory Generative models Gaussian Bayes classifier
More informationClassification of Ordinal Data Using Neural Networks
Classification of Ordinal Data Using Neural Networks Joaquim Pinto da Costa and Jaime S. Cardoso 2 Faculdade Ciências Universidade Porto, Porto, Portugal jpcosta@fc.up.pt 2 Faculdade Engenharia Universidade
More informationNaïve Bayes classification
Naïve Bayes classification 1 Probability theory Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. Examples: A person s height, the outcome of a coin toss
More informationFinal Overview. Introduction to ML. Marek Petrik 4/25/2017
Final Overview Introduction to ML Marek Petrik 4/25/2017 This Course: Introduction to Machine Learning Build a foundation for practice and research in ML Basic machine learning concepts: max likelihood,
More informationClustering and Gaussian Mixture Models
Clustering and Gaussian Mixture Models Piyush Rai IIT Kanpur Probabilistic Machine Learning (CS772A) Jan 25, 2016 Probabilistic Machine Learning (CS772A) Clustering and Gaussian Mixture Models 1 Recap
More informationCSC 411: Lecture 09: Naive Bayes
CSC 411: Lecture 09: Naive Bayes Class based on Raquel Urtasun & Rich Zemel s lectures Sanja Fidler University of Toronto Feb 8, 2015 Urtasun, Zemel, Fidler (UofT) CSC 411: 09-Naive Bayes Feb 8, 2015 1
More informationContents Lecture 4. Lecture 4 Linear Discriminant Analysis. Summary of Lecture 3 (II/II) Summary of Lecture 3 (I/II)
Contents Lecture Lecture Linear Discriminant Analysis Fredrik Lindsten Division of Systems and Control Department of Information Technology Uppsala University Email: fredriklindsten@ituuse Summary of lecture
More informationMachine Learning for Data Science (CS4786) Lecture 12
Machine Learning for Data Science (CS4786) Lecture 12 Gaussian Mixture Models Course Webpage : http://www.cs.cornell.edu/courses/cs4786/2016fa/ Back to K-means Single link is sensitive to outliners We
More informationWeek 5: Logistic Regression & Neural Networks
Week 5: Logistic Regression & Neural Networks Instructor: Sergey Levine 1 Summary: Logistic Regression In the previous lecture, we covered logistic regression. To recap, logistic regression models and
More informationLecture 4 Discriminant Analysis, k-nearest Neighbors
Lecture 4 Discriminant Analysis, k-nearest Neighbors Fredrik Lindsten Division of Systems and Control Department of Information Technology Uppsala University. Email: fredrik.lindsten@it.uu.se fredrik.lindsten@it.uu.se
More informationMultivariate statistical methods and data mining in particle physics
Multivariate statistical methods and data mining in particle physics RHUL Physics www.pp.rhul.ac.uk/~cowan Academic Training Lectures CERN 16 19 June, 2008 1 Outline Statement of the problem Some general
More informationL11: Pattern recognition principles
L11: Pattern recognition principles Bayesian decision theory Statistical classifiers Dimensionality reduction Clustering This lecture is partly based on [Huang, Acero and Hon, 2001, ch. 4] Introduction
More informationNotes on Discriminant Functions and Optimal Classification
Notes on Discriminant Functions and Optimal Classification Padhraic Smyth, Department of Computer Science University of California, Irvine c 2017 1 Discriminant Functions Consider a classification problem
More informationSoft Computing. Lecture Notes on Machine Learning. Matteo Mattecci.
Soft Computing Lecture Notes on Machine Learning Matteo Mattecci matteucci@elet.polimi.it Department of Electronics and Information Politecnico di Milano Matteo Matteucci c Lecture Notes on Machine Learning
More informationMachine Learning - MT Classification: Generative Models
Machine Learning - MT 2016 7. Classification: Generative Models Varun Kanade University of Oxford October 31, 2016 Announcements Practical 1 Submission Try to get signed off during session itself Otherwise,
More informationA Study of Relative Efficiency and Robustness of Classification Methods
A Study of Relative Efficiency and Robustness of Classification Methods Yoonkyung Lee* Department of Statistics The Ohio State University *joint work with Rui Wang April 28, 2011 Department of Statistics
More informationPattern Recognition. Parameter Estimation of Probability Density Functions
Pattern Recognition Parameter Estimation of Probability Density Functions Classification Problem (Review) The classification problem is to assign an arbitrary feature vector x F to one of c classes. The
More informationStatistical aspects of prediction models with high-dimensional data
Statistical aspects of prediction models with high-dimensional data Anne Laure Boulesteix Institut für Medizinische Informationsverarbeitung, Biometrie und Epidemiologie February 15th, 2017 Typeset by
More informationMachine Learning (CS 567) Lecture 5
Machine Learning (CS 567) Lecture 5 Time: T-Th 5:00pm - 6:20pm Location: GFS 118 Instructor: Sofus A. Macskassy (macskass@usc.edu) Office: SAL 216 Office hours: by appointment Teaching assistant: Cheol
More informationSupport Vector Machines. CAP 5610: Machine Learning Instructor: Guo-Jun QI
Support Vector Machines CAP 5610: Machine Learning Instructor: Guo-Jun QI 1 Linear Classifier Naive Bayes Assume each attribute is drawn from Gaussian distribution with the same variance Generative model:
More informationMachine Learning 4771
Machine Learning 4771 Instructor: Tony Jebara Topic 7 Unsupervised Learning Statistical Perspective Probability Models Discrete & Continuous: Gaussian, Bernoulli, Multinomial Maimum Likelihood Logistic
More informationLecture 9: Classification, LDA
Lecture 9: Classification, LDA Reading: Chapter 4 STATS 202: Data mining and analysis Jonathan Taylor, 10/12 Slide credits: Sergio Bacallado 1 / 1 Review: Main strategy in Chapter 4 Find an estimate ˆP
More informationNaïve Bayes classification. p ij 11/15/16. Probability theory. Probability theory. Probability theory. X P (X = x i )=1 i. Marginal Probability
Probability theory Naïve Bayes classification Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s height, the outcome of a coin toss Distinguish
More informationThe exam is closed book, closed notes except your one-page (two sides) or two-page (one side) crib sheet.
CS 189 Spring 013 Introduction to Machine Learning Final You have 3 hours for the exam. The exam is closed book, closed notes except your one-page (two sides) or two-page (one side) crib sheet. Please
More informationMidterm, Fall 2003
5-78 Midterm, Fall 2003 YOUR ANDREW USERID IN CAPITAL LETTERS: YOUR NAME: There are 9 questions. The ninth may be more time-consuming and is worth only three points, so do not attempt 9 unless you are
More informationRepresentation. Stefano Ermon, Aditya Grover. Stanford University. Lecture 2
Representation Stefano Ermon, Aditya Grover Stanford University Lecture 2 Stefano Ermon, Aditya Grover (AI Lab) Deep Generative Models Lecture 2 1 / 32 Learning a generative model We are given a training
More informationGenerative Models. CS4780/5780 Machine Learning Fall Thorsten Joachims Cornell University
Generative Models CS4780/5780 Machine Learning Fall 2012 Thorsten Joachims Cornell University Reading: Mitchell, Chapter 6.9-6.10 Duda, Hart & Stork, Pages 20-39 Bayes decision rule Bayes theorem Generative
More informationIntroduction to machine learning and pattern recognition Lecture 2 Coryn Bailer-Jones
Introduction to machine learning and pattern recognition Lecture 2 Coryn Bailer-Jones http://www.mpia.de/homes/calj/mlpr_mpia2008.html 1 1 Last week... supervised and unsupervised methods need adaptive
More informationSecurity Analytics. Topic 6: Perceptron and Support Vector Machine
Security Analytics Topic 6: Perceptron and Support Vector Machine Purdue University Prof. Ninghui Li Based on slides by Prof. Jenifer Neville and Chris Clifton Readings Principle of Data Mining Chapter
More informationThe Perceptron. Volker Tresp Summer 2016
The Perceptron Volker Tresp Summer 2016 1 Elements in Learning Tasks Collection, cleaning and preprocessing of training data Definition of a class of learning models. Often defined by the free model parameters
More informationGenerative Learning. INFO-4604, Applied Machine Learning University of Colorado Boulder. November 29, 2018 Prof. Michael Paul
Generative Learning INFO-4604, Applied Machine Learning University of Colorado Boulder November 29, 2018 Prof. Michael Paul Generative vs Discriminative The classification algorithms we have seen so far
More informationThe Bayes classifier
The Bayes classifier Consider where is a random vector in is a random variable (depending on ) Let be a classifier with probability of error/risk given by The Bayes classifier (denoted ) is the optimal
More informationEM-algorithm for Training of State-space Models with Application to Time Series Prediction
EM-algorithm for Training of State-space Models with Application to Time Series Prediction Elia Liitiäinen, Nima Reyhani and Amaury Lendasse Helsinki University of Technology - Neural Networks Research
More informationBayesian Networks Structure Learning (cont.)
Koller & Friedman Chapters (handed out): Chapter 11 (short) Chapter 1: 1.1, 1., 1.3 (covered in the beginning of semester) 1.4 (Learning parameters for BNs) Chapter 13: 13.1, 13.3.1, 13.4.1, 13.4.3 (basic
More informationMachine Learning 1. Linear Classifiers. Marius Kloft. Humboldt University of Berlin Summer Term Machine Learning 1 Linear Classifiers 1
Machine Learning 1 Linear Classifiers Marius Kloft Humboldt University of Berlin Summer Term 2014 Machine Learning 1 Linear Classifiers 1 Recap Past lectures: Machine Learning 1 Linear Classifiers 2 Recap
More informationInf2b Learning and Data
Inf2b Learning and Data Lecture 13: Review (Credit: Hiroshi Shimodaira Iain Murray and Steve Renals) Centre for Speech Technology Research (CSTR) School of Informatics University of Edinburgh http://www.inf.ed.ac.uk/teaching/courses/inf2b/
More information9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering
Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make
More informationDoes Modeling Lead to More Accurate Classification?
Does Modeling Lead to More Accurate Classification? A Comparison of the Efficiency of Classification Methods Yoonkyung Lee* Department of Statistics The Ohio State University *joint work with Rui Wang
More informationLECTURE NOTE #3 PROF. ALAN YUILLE
LECTURE NOTE #3 PROF. ALAN YUILLE 1. Three Topics (1) Precision and Recall Curves. Receiver Operating Characteristic Curves (ROC). What to do if we do not fix the loss function? (2) The Curse of Dimensionality.
More informationLecture 3. STAT161/261 Introduction to Pattern Recognition and Machine Learning Spring 2018 Prof. Allie Fletcher
Lecture 3 STAT161/261 Introduction to Pattern Recognition and Machine Learning Spring 2018 Prof. Allie Fletcher Previous lectures What is machine learning? Objectives of machine learning Supervised and
More informationMachine Learning. Lecture 9: Learning Theory. Feng Li.
Machine Learning Lecture 9: Learning Theory Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 2018 Why Learning Theory How can we tell
More informationStatistical Data Mining and Machine Learning Hilary Term 2016
Statistical Data Mining and Machine Learning Hilary Term 2016 Dino Sejdinovic Department of Statistics Oxford Slides and other materials available at: http://www.stats.ox.ac.uk/~sejdinov/sdmml Naïve Bayes
More informationMachine Learning (CS 567) Lecture 2
Machine Learning (CS 567) Lecture 2 Time: T-Th 5:00pm - 6:20pm Location: GFS118 Instructor: Sofus A. Macskassy (macskass@usc.edu) Office: SAL 216 Office hours: by appointment Teaching assistant: Cheol
More informationMachine Learning 2nd Edition
INTRODUCTION TO Lecture Slides for Machine Learning 2nd Edition ETHEM ALPAYDIN, modified by Leonardo Bobadilla and some parts from http://www.cs.tau.ac.il/~apartzin/machinelearning/ The MIT Press, 2010
School of Computer Science 10-701 Introduction to Machine Learning Naïve Bayes Readings: Mitchell Ch. 6.1-6.10 Murphy Ch. 3 Matt Gormley Lecture 3 September 14, 2016 Homework 1: due 9/26/16 Project Proposal:
Machine Learning 10-701 Practice Page 2 of 2 10/28/13 1. True or False Please give an explanation for your answer, this is worth 1 pt/question. (a) (2 points) No classifier can do better than a naive Bayes
COMS 4721: Machine Learning for Data Science Lecture 16, 3/28/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University SOFT CLUSTERING VS HARD CLUSTERING
COMP 551 Applied Machine Learning Lecture 5: Generative models for linear classification Instructor: (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/comp551 Unless otherwise noted, all material
Expectation maximization Subhransu Maji CMPSCI 689: Machine Learning 14 April 2015 Motivation Suppose you are building a naive Bayes spam classifier. After you are done, your boss tells you that there is
Support Vector Machine Industrial AI Lab. Prof. Seungchul Lee Classification (Linear) Autonomously figure out which category (or class) an unknown item should be categorized into Number of categories /
Machine Learning and Data Mining Bayes Classifiers Prof. Alexander Ihler A basic classifier Training data D = {x^(i), y^(i)}, Classifier f(x ; D) Discrete feature vector x f(x ; D) is a contingency table
Class 4: Classification Quaid Morris February 11th, 2011 ML4Bio Overview Basic concepts in classification: overfitting, cross-validation, evaluation. Linear Discriminant Analysis and Quadratic Discriminant
Bayes Decision Theory Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr 1 / 16
Deep Generative Models (Unsupervised Learning) CEng 783 Deep Learning Fall 2017 Emre Akbaş Reminders Next week: project progress demos in class Describe your problem/goal What you have done so far What
Logistic Regression Jia-Bin Huang ECE-5424G / CS-5824 Virginia Tech Spring 2019 Administrative Please start HW 1 early! Questions are welcome! Two principles for estimating parameters Maximum Likelihood
LINEAR CLASSIFICATION, PERCEPTRON, LOGISTIC REGRESSION, SVC, NAÏVE BAYES Supervised Learning Linear vs non linear classifiers In K-NN we saw an example of a non-linear classifier: the decision boundary
MULTIVARIATE PATTERN RECOGNITION FOR CHEMOMETRICS Richard Brereton r.g.brereton@bris.ac.uk Pattern Recognition Book Chemometrics for Pattern Recognition, Wiley, 2009 Pattern Recognition Pattern Recognition
. Consider the following four vectors:.5 (i) x = [.5 ] (ii) x = [ ] (iii) x 3 = [ (a) What is the magnitude of each vector?.5 ] (b) What is the result of each dot product below? x T x x 3 T x x T x 3.
CS 340 Lec. 18: Multivariate Gaussian Distributions and Linear Discriminant Analysis AD March 11 Multivariate Gaussian Consider data {x_i}, i = 1, ..., N, where x_i ∈ R^D and we assume they are
CS 188: Artificial Intelligence Lecture 21: Perceptrons Pieter Abbeel UC Berkeley Many slides adapted from Dan Klein. Outline Generative vs. Discriminative Binary Linear Classifiers Perceptron Multi-class
Advanced Introduction to Machine Learning CMU-10715 Risk Minimization Barnabás Póczos What have we seen so far? Several classification & regression algorithms seem to work fine on training datasets: Linear
Expectation Maximization Algorithm Vibhav Gogate The University of Texas at Dallas Slides adapted from Carlos Guestrin, Dan Klein, Luke Zettlemoyer and Dan Weld The Evils of Hard Assignments? Clusters
10-701 Introduction to Machine Learning Midterm Exam Instructors: Eric Xing, Ziv Bar-Joseph 17 November, 2015 There are 11 questions, for a total of 100 points. This exam is open book, open notes, but
Midterm Review CS 6375: Machine Learning Vibhav Gogate The University of Texas at Dallas Machine Learning Supervised Learning Unsupervised Learning Reinforcement Learning Parametric Y Continuous Non-parametric
Machine Learning 10-701/15-781, Spring 2008 Theory of Classification and Nonparametric Classifier Eric Xing Lecture 2, January 16, 2006 Reading: Chap. 2,5 CB and handouts Outline What is theoretically
Lecture 3: Statistical Decision Theory (Part II) Hao Helen Zhang Outline of This Note Part I: Statistics Decision Theory (Classical
Logistic Regression Machine Learning Fall 2018 Where are we? We have seen the following ideas Linear models Learning as loss minimization Bayesian learning criteria (MAP and MLE estimation) The Naïve Bayes
Statistical Methods in Particle Physics 8. Multivariate Analysis Prof. Dr. Klaus Reygers (lectures) Dr. Sebastian Neubert (tutorials) Heidelberg University WS 2017/18 Multi-Variate Classification Consider
Bayes Classifiers CAP5610 Machine Learning Instructor: Guo-Jun QI Recap: Joint distributions Joint distribution over Input vector X = (X_1, X_2) X_1 = B or ¬B (drinking beer or not) X_2 = H or ¬H (headache
CSE446: Clustering and EM Spring 2017 Ali Farhadi Slides adapted from Carlos Guestrin, Dan Klein, and Luke Zettlemoyer Clustering systems: Unsupervised learning Clustering Detect patterns in unlabeled