Introduction to Graphical Models


1 Introduction to Graphical Models. The 15th Winter School of Statistical Physics, POSCO International Center & POSTECH, Pohang (Tue.). Yung-Kyun Noh

2 GENERALIZATION FOR PREDICTION

3 Probabilistic Assumption and Bayes Classification. [Figure: class-conditional densities $p_1(x)$ and $p_2(x)$ over the data space; their overlap region determines the Bayes error.]

4 Probabilistic Assumption and Bayes Classification. [Figure: the class-conditional densities $p_1(x)$ and $p_2(x)$ over the data space.]

5 Learning. Data (regularity) → prediction. Learning: learn a prediction function $f \in \mathcal{H}$ from data ($\mathcal{H}$: hypothesis set/candidate set).

6 Quantify the Evaluation. Measure of quality: expected loss $R(f) = \mathbb{E}_{(x,y)}[\ell(f(x), y)]$, where $\ell$ is the loss function. Estimated (empirical) error: $\hat{R}_N(f) = \frac{1}{N}\sum_{i=1}^{N} \ell(f(x_i), y_i)$. Examples: classification (e.g., 0-1 loss), regression (e.g., squared loss).

7 Consistent Learner. A consistent learner satisfies $\sup_{f \in \mathcal{H}} |\hat{R}_N(f) - R(f)| \to 0$ in probability as $N \to \infty$ (uniform convergence). Caution: the definition of consistency is not …

8 Quiz 1. Consider a hypothesis set $\mathcal{H}$ which satisfies … . Explain why learning with $\mathcal{H}$ is not consistent even though it satisfies this condition. What is the possible problem in this case?

9 Consistency and Bayes Error. Minimizing expected error (the objective) vs. minimizing estimated error. [Figure: error vs. model complexity for the optimal parameter with $N$ data; the training error decreases from smaller (simple classifier, e.g., a linear classifier with regularization) to larger (complex classifier), while the Bayes error is the floor.]

10 Consistency and Bayes Error. [Figure: a consistent learner with many data; the minimum error of the consistent learner, for the optimal parameter, can stay above the Bayes error across the complexity range.] Consistency does not necessarily imply achieving the Bayes error!!! (For example, a linear classifier with regularization.)

11 Related Terms with Confining $\mathcal{H}$: linear model (VC-dim for classification = dimensionality + 1), small number of parameters, large margin, regularization, bias-variance trade-off, generalization ability and overfitting. Many of these terms are theoretically connected.

12 Supervised Learning (Prediction). Method: learn from labeled examples, then classify unseen data. [Figure: training images labeled horse, car, ship, plane; the learned classifier assigns a new unseen image the label "a horse".] Based on the assumption of regularity.

13 Representation of Data. Each datum is one point in a data space: $x = [1, 2, 5, 10, \ldots]^T$ with $D$ elements.

14 Classification

15 Bayes Optimal Classifier. Our ultimate goal is not zero error but the (optimal) Bayes error. [Figure credit: Masashi Sugiyama]

16 FISHER DISCRIMINANT ANALYSIS

17 Gaussian Random Variable. The principal axes of the density contours are the eigenvector directions of the covariance matrix $\Sigma$; $\lambda_1, \lambda_2$: eigenvalues.

18 Covariance Matrix and Projection. Covariance $= \Sigma$; the variance of the projection onto a unit vector $w$ is $w^T \Sigma w$; $\lambda_i$: eigenvalues of $\Sigma$ with $\lambda_1 \geq \lambda_2 \geq \cdots$.

19 Parameter Estimation (Maximum Likelihood Estimation). Data: $\{x_i\}_{i=1}^{N}$. Mean vector: $\hat{\mu} = \frac{1}{N}\sum_{i=1}^{N} x_i$. Covariance matrix: $\hat{\Sigma} = \frac{1}{N}\sum_{i=1}^{N} (x_i - \hat{\mu})(x_i - \hat{\mu})^T$.
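As a concrete illustration of these two estimators, a minimal numpy sketch (not from the slides); note it uses the maximum-likelihood $1/N$ normalization rather than the unbiased $1/(N-1)$ form.

```python
import numpy as np

def gaussian_mle(X):
    """Maximum likelihood estimates for a Gaussian from an (N, D) data matrix."""
    mu = X.mean(axis=0)             # mean vector: (1/N) sum_i x_i
    Xc = X - mu
    Sigma = Xc.T @ Xc / X.shape[0]  # covariance: (1/N) sum_i (x_i - mu)(x_i - mu)^T
    return mu, Sigma
```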

20 Fisher Discriminant Analysis: Consider Two Statistics. Between-class variance along $w$: $w^T S_B w$ with $S_B = (\mu_1 - \mu_2)(\mu_1 - \mu_2)^T$.

21 Fisher Discriminant Analysis: Consider Two Statistics. Within-class variance along $w$: $w^T S_W w$ with $S_W = \Sigma_1 + \Sigma_2$ ($\Sigma_c$: covariance matrix of each class).

22 Total Variance. Total covariance matrix: $S_T$. Total variance along $w$: $w^T S_T w$.

23 Fisher Discriminant Analysis (FDA). Find $w$ having maximal $J(w) = \frac{w^T S_B w}{w^T S_W w}$ (between-class variance over within-class variance) for given data. [Figure: two candidate projection directions $w_1$ and $w_2$.]

24 Take the Derivative. Setting $\frac{\partial J(w)}{\partial w} = 0$ gives $S_B w = \lambda S_W w$: a generalized eigenvector problem.
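A small sketch of how one might solve this generalized eigenproblem numerically for two classes; `scipy.linalg.eigh` handles the pair $(S_B, S_W)$ directly when $S_W$ is positive definite. The function name and the biased covariance estimate are my own choices, not from the slides.

```python
import numpy as np
from scipy.linalg import eigh

def fda_direction(X1, X2):
    """Fisher direction for two classes via S_B w = lam * S_W w."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    S_B = np.outer(m1 - m2, m1 - m2)                         # between-class scatter
    S_W = np.cov(X1.T, bias=True) + np.cov(X2.T, bias=True)  # within-class scatter
    evals, evecs = eigh(S_B, S_W)   # generalized symmetric eigenproblem
    return evecs[:, -1]             # eigenvector of the largest eigenvalue
```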

25 Quiz 1: Closed Form Solution. Problem: for a two-class problem with $N_1 = N_2$, assume $S_W$ is a full-rank matrix. Find the closed-form solution for $w$.

26 Quiz 2: How do we solve the generalized eigenvector problem? Problem: are the solution eigenvalues real (non-complex)? If not, how are these eigenvalues treated?

27 Quiz 3: Two Different FDAs. Some papers consider the criterion $\frac{w^T S_B w}{w^T S_T w}$ instead of $\frac{w^T S_B w}{w^T S_W w}$. What is the difference?

28 In High Dimensional Space (1/2). $S_W$ is not a full-rank matrix. [Figure: eigenvalue spectrum; only the first $N - C$ eigenvalues are nonzero for $N$ data in $C$ classes.]

29 In High Dimensional Space (2/2). FDA will trivially pick up the null space of $S_W$ as the solution. More seriously, the null space varies with sampling: an ill-posed problem.

30 Regularization in FDA. Regularize: $S_W \to S_W + \lambda I$. [Figure: the eigenvalue spectrum is lifted away from zero beyond the first $N - C$ eigenvalues.] The problem becomes well-posed. New solution for two classes: $w \propto (S_W + \lambda I)^{-1}(\mu_1 - \mu_2)$.
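A sketch of this regularized two-class solution; the regularization strength `lam` is a hypothetical hyperparameter to be tuned, not a value from the slides.

```python
import numpy as np

def fda_regularized(X1, X2, lam=1e-3):
    """Two-class Fisher direction with regularized within-class scatter:
    w proportional to (S_W + lam*I)^{-1} (mu1 - mu2)."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    S_W = np.cov(X1.T, bias=True) + np.cov(X2.T, bias=True)
    w = np.linalg.solve(S_W + lam * np.eye(S_W.shape[0]), m1 - m2)
    return w / np.linalg.norm(w)
```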

31 Well-Posed Problem vs. Ill-Posed Problem. [Figure: well-posed vs. ill-posed configurations.] The variation of the configuration may come from the sampling variation.

32 Quiz 4: New $S_W$ with Infinite Data. If an infinite number of data are generated around each datum with an isotropic Gaussian, what does $S_W$ become?

33 GRAPHICAL MODELS

34 Naïve Bayes As An Extreme Case. Naïve Bayes: $p(x) = \prod_{d=1}^{D} p_d(x_d) = p_1(x_1)\,p_2(x_2)\cdots p_D(x_D)$. Simply ignore every correlation and dependency between variables. True decomposition (chain rule): $p(x) = p(x_1)\,p(x_2 \mid x_1)\,p(x_3 \mid x_1, x_2)\cdots p(x_D \mid x_1, \ldots, x_{D-1})$.

35 Number of Parameters. For $D = 1000$, the number of parameters of a Gaussian is $D + D(D+1)/2 = 1000 + 1000 \cdot 1001/2 = 501{,}500$.

36 Independence. If $p(x) = p_1(x_1)\,p_2(x_2)$ with $D = D_1 + D_2$, $D_1 = 500$, $D_2 = 500$, the number of parameters becomes $(500 + 500 \cdot 501/2) \times 2 = 251{,}500$. Incorporating one independence can cut the number of parameters roughly in half.
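The counting argument is easy to reproduce; a tiny sketch:

```python
def gaussian_params(D):
    """Parameters of a D-dimensional Gaussian: D for the mean, D(D+1)/2 for the covariance."""
    return D + D * (D + 1) // 2

print(gaussian_params(1000))                        # 501500: one joint Gaussian
print(gaussian_params(500) + gaussian_params(500))  # 251500: two independent blocks
```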

37 Graphical Models. We utilize probabilities that are represented by a graph structure (directed & undirected), using the independencies and conditional independencies that the graph structure can capture.

38 Directed Graphical Models: factorization of a (large) joint pdf. [Figure: a directed acyclic graph over $X_1, \ldots, X_7$.] $P(X) = P(X_1, \ldots, X_7) = P(X_1)\,P(X_2)\,P(X_3 \mid X_1, X_2)\,P(X_4, X_5 \mid X_2)\,P(X_6)\,P(X_7 \mid X_4, X_5, X_6)$. In general, $P(X_1, \ldots, X_D) = \prod_{i=1}^{D} P(X_i \mid \mathrm{pa}(X_i))$. For given data, make a model for each decomposed probability, then estimate the parameters separately.
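A minimal sketch of this "model each factor separately" idea on a hypothetical three-variable network $X_1 \to X_3 \leftarrow X_2$ with made-up binary tables; the slides' seven-variable example works the same way, just with more tables.

```python
import numpy as np

p1 = np.array([0.6, 0.4])                 # P(X1)
p2 = np.array([0.7, 0.3])                 # P(X2)
p3 = np.array([[[0.9, 0.1], [0.4, 0.6]],  # P(X3 | X1, X2), indexed [x1, x2, x3]
               [[0.5, 0.5], [0.2, 0.8]]])

def joint(x1, x2, x3):
    """P(X1, X2, X3) = P(X1) P(X2) P(X3 | X1, X2)."""
    return p1[x1] * p2[x2] * p3[x1, x2, x3]

# Sanity check: the factorized joint sums to one (up to float rounding).
total = sum(joint(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1))
print(total)
```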

39 D-Separations. Three canonical three-node structures: causal path $X_1 \to X_2 \to X_3$ gives $X_1 \perp X_3 \mid X_2$; common cause $X_2 \leftarrow X_1 \to X_3$ gives $X_2 \perp X_3 \mid X_1$; common effect $X_1 \to X_3 \leftarrow X_2$ gives $X_1 \perp X_2$ but $X_1 \not\perp X_2 \mid X_3$.
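The common-effect case is the counterintuitive one, and it is easy to verify numerically. A sketch with the same hypothetical tables as above: marginally $X_1$ and $X_2$ factorize exactly, but conditioning on $X_3$ breaks the factorization.

```python
import numpy as np

p1 = np.array([0.6, 0.4])
p2 = np.array([0.7, 0.3])
p3 = np.array([[[0.9, 0.1], [0.4, 0.6]],
               [[0.5, 0.5], [0.2, 0.8]]])   # P(X3 | X1, X2)

joint = p1[:, None, None] * p2[None, :, None] * p3  # P(X1, X2, X3)

# Marginally, P(X1, X2) factorizes exactly:
p12 = joint.sum(axis=2)
print(np.allclose(p12, np.outer(p1, p2)))   # True: X1 and X2 independent

# Conditioned on X3 = 1, the joint no longer factorizes:
p12_given = joint[:, :, 1] / joint[:, :, 1].sum()
print(np.allclose(p12_given,
                  np.outer(p12_given.sum(1), p12_given.sum(0))))  # False
```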

40 Discrete Random Variables

41 Discrete Random Variables. [Figure: a directed graph over $X_1, \ldots, X_6$ with a conditional probability table attached to each node.]

42 Inference with Discrete Variables

43 Inference with Discrete Variables: marginalize $X_5$.

44 Inference with Discrete Variables: marginalize $X_5$!

45 Inference with Discrete Variables: marginalize $X_3$.

46 Inference with Discrete Variables: marginalize $X_3$.

47 Inference with Discrete Variables: marginalize $X_2$.

48 Inference with Discrete Variables. [Figure: the factors remaining after eliminating $X_5$, $X_3$, and $X_2$ in turn.]
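A minimal sketch of the idea behind these slides, on a binary chain $X_1 \to X_2 \to X_3$ with made-up tables: summing out one variable at a time keeps every intermediate factor small instead of materializing the full joint.

```python
import numpy as np

p1 = np.array([0.5, 0.5])                 # P(X1)
p21 = np.array([[0.8, 0.2], [0.3, 0.7]])  # P(X2 | X1), indexed [x1, x2]
p32 = np.array([[0.9, 0.1], [0.4, 0.6]])  # P(X3 | X2), indexed [x2, x3]

# P(X3) = sum_{x1, x2} P(x1) P(x2|x1) P(x3|x2), eliminated left to right:
m2 = p1 @ p21        # sum over X1 -> message over X2
p3 = m2 @ p32        # sum over X2 -> P(X3)
print(p3, p3.sum())  # a proper distribution over X3
```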

49 Continuous Random Variables. Each probability density is a Gaussian.

50 Continuous Random Variables. Jointly Gaussian with 6 variables: need $6 + 6 \cdot 7/2 = 27$ parameters. Using the graphical model, we only need the parameters of each conditional density $P(X_i \mid \mathrm{pa}(X_i))$: 19 parameters in total.

51 Gaussian Random Variable: Marginal. For a joint Gaussian over $(x_a, x_b)$, the marginal is $p(x_a) = \mathcal{N}(x_a \mid \mu_a, \Sigma_{aa})$.

52 Gaussian Random Variable: Conditional. $p(x_a \mid x_b) = \mathcal{N}\!\left(x_a \mid \mu_a + \Sigma_{ab}\Sigma_{bb}^{-1}(x_b - \mu_b),\; \Sigma_{aa} - \Sigma_{ab}\Sigma_{bb}^{-1}\Sigma_{ba}\right)$.
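Both formulas are mechanical to implement. A sketch, with index sets `idx_a`/`idx_b` selecting the two blocks (the names are mine, not the slides'):

```python
import numpy as np

def gaussian_condition(mu, Sigma, idx_a, idx_b, x_b):
    """Marginal and conditional of a joint Gaussian, partitioned into blocks a and b.

    Marginal:    p(x_a) = N(mu_a, S_aa)
    Conditional: p(x_a | x_b) = N(mu_a + S_ab S_bb^{-1} (x_b - mu_b),
                                  S_aa - S_ab S_bb^{-1} S_ba)
    """
    mu_a, mu_b = mu[idx_a], mu[idx_b]
    S_aa = Sigma[np.ix_(idx_a, idx_a)]
    S_ab = Sigma[np.ix_(idx_a, idx_b)]
    S_bb = Sigma[np.ix_(idx_b, idx_b)]
    gain = S_ab @ np.linalg.inv(S_bb)
    cond_mu = mu_a + gain @ (x_b - mu_b)
    cond_S = S_aa - gain @ S_ab.T
    return (mu_a, S_aa), (cond_mu, cond_S)
```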

53 KALMAN FILTER

54 Filtering: Hidden Markov Models (HMM) / Linear Dynamical Systems (LDS). [Figure: a state chain $X_1 \to X_2 \to \cdots \to X_{k-1} \to X_k$ with an observation $Y_t$ hanging off each state $X_t$. HMM: discrete states; LDS: continuous Gaussian states.]

55 Kalman Filter. Marginalizing $x_0, \ldots, x_{T-1}$. Filtering is conditional marginalization, carried out as marginalization from the left.

56 Kalman Filter: unconstrained distribution. Also, for the joint density if necessary.

57 Kalman Filter: unconstrained distribution.

58 Kalman Filter: constrained distribution, marginalizing $x_0$. The same $\bar{y}$ and covariance are from the unconstrained distribution; what matters is the observed $y_0$.

59 Kalman Filter: marginalizing $x_1, \ldots, x_{T-1}$. What matters is how we can find the covariance of the joint density function!

60 Kalman Filter. Filtering is conditional marginalization, carried out as marginalization from the left. Why filtering? Once we know the filtered distribution $p(x_t \mid y_1, \ldots, y_t)$, we don't have to know (or keep) the past observations.

61 Kalman Filter. Alternate the time update and the measurement update: the time update propagates the filtered Gaussian through the dynamics, $\hat{x}_{t+1|t} = A\hat{x}_{t|t}$ and $P_{t+1|t} = A P_{t|t} A^\top + Q$; the measurement update then conditions on the new observation. The time update is similar to the unconstrained distribution calculation!

62 Kalman Filter. Also, the joint density of the predicted state and the new observation if necessary.

63 Kalman Filter. Measurement update (conditional density): condition the joint Gaussian on the observed value. Sum-up.

64 Kalman Filter. With different notation, an alternative form of $K_{t+1}$.
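A compact sketch of one filtering step under the standard linear-Gaussian model $x_{t+1} = A x_t + w_t$, $y_t = C x_t + v_t$ with noise covariances $Q$ and $R$; this is the conventional notation and may differ from the slides'.

```python
import numpy as np

def kalman_step(x, P, y, A, Q, C, R):
    """One Kalman filter step: time update (predict), then measurement update."""
    # Time update: push the filtered Gaussian through the dynamics.
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Measurement update: Gaussian conditioning on the new observation.
    S = C @ P_pred @ C.T + R             # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)  # Kalman gain
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = P_pred - K @ C @ P_pred      # = (I - K C) P_pred
    return x_new, P_new
```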

65 Markov Random Field (Undirected Graph). If there is a direct edge between $X_i$ and $X_j$: $X_i \not\perp X_j \mid X_{\setminus i,j}$. If there is no direct edge between $X_i$ and $X_j$: $X_i \perp X_j \mid X_{\setminus i,j}$. [Figure: an undirected graph over $X_1, \ldots, X_5$.] Examples: $X_1 \not\perp X_3 \mid X_2, X_4, X_5$ and $X_1 \perp X_5 \mid X_3$.

66 Joint Distribution: product of functions on cliques. [Figure: an undirected graph over $X_1, \ldots, X_5$ with cliques $\{X_1, X_2\}$, $\{X_1, X_3\}$, $\{X_3, X_4, X_5\}$.] $P(X_1, \ldots, X_5) = \frac{1}{Z}\,\psi_{1,2}(X_1, X_2)\,\psi_{1,3}(X_1, X_3)\,\psi_{3,4,5}(X_3, X_4, X_5)$ with $Z = \sum_{X_1, \ldots, X_5} \psi_{1,2}(X_1, X_2)\,\psi_{1,3}(X_1, X_3)\,\psi_{3,4,5}(X_3, X_4, X_5)$. The set of distributions satisfying the MRF conditions (Markov random field) = the set of distributions decomposed by cliques (Gibbs random field) (Hammersley-Clifford theorem).
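A brute-force sketch of this factorization with made-up binary potentials on the slide's three cliques; enumerating all $2^5$ configurations gives the partition function $Z$ directly (feasible only for tiny models).

```python
import numpy as np
from itertools import product

psi12 = np.array([[2.0, 1.0], [1.0, 3.0]])  # psi_{1,2}(X1, X2)
psi13 = np.array([[1.0, 2.0], [2.0, 1.0]])  # psi_{1,3}(X1, X3)
psi345 = np.ones((2, 2, 2))                 # psi_{3,4,5}(X3, X4, X5)
psi345[1, 1, 1] = 5.0

def unnorm(x1, x2, x3, x4, x5):
    return psi12[x1, x2] * psi13[x1, x3] * psi345[x3, x4, x5]

Z = sum(unnorm(*x) for x in product((0, 1), repeat=5))  # partition function

def P(*x):
    return unnorm(*x) / Z

print(sum(P(*x) for x in product((0, 1), repeat=5)))  # 1.0 (up to float rounding)
```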

67 More Fancy Models. [Figure: latent variable models over $X$, $Y$, and a latent $Z$; filtering; a state-space chain $Z_1 \to Z_2 \to \cdots \to Z_{k-1} \to Z_k$ with observations $Y_1, \ldots, Y_k$.]

68 Topic Models

69 Topic Models

70 Correlation and Independence. Independence in a Gaussian means no correlation. Naïve Bayes? Mixture of Gaussians?

71 Conditional Independence vs. Independence.

72 Naïve Bayes for Classification. [Figure: $Y$ with children $X_1, X_2, \ldots, X_D$.] $P(X, Y) = P(Y)\,P(X_1 \mid Y)\cdots P(X_D \mid Y)$.

73 Undirected Graphical Models. [Figure: disconnected components of the graph are independent.]

74 Undirected Graphical Models: find all maximal cliques. [Figure: an undirected graph with its maximal cliques highlighted.]

75 Undirected Graphical Models: potential functions on cliques, $\Psi_1(\mathbf{X}_1), \Psi_2(\mathbf{X}_2), \ldots$ ($\mathbf{X}_1, \mathbf{X}_2, \ldots$: maximal cliques). $P(X) = \frac{1}{Z}\,\Psi_1(\mathbf{X}_1)\,\Psi_2(\mathbf{X}_2)\cdots\Psi_C(\mathbf{X}_C)$, where the partition function is $Z = \sum_{X_1, \ldots, X_D} \Psi_1(\mathbf{X}_1)\cdots\Psi_C(\mathbf{X}_C)$ in the discrete case, or $Z = \int \Psi_1(\mathbf{X}_1)\cdots\Psi_C(\mathbf{X}_C)\, dX_1 \cdots dX_D$ in the continuous case.

76 Undirected Graphical Models: two cliques over $X_1, X_2, X_3, X_4$, namely $\{X_1, X_2, X_3\}$ and $\{X_3, X_4\}$. $Z = \sum_{X_1, X_2, X_3, X_4} \Psi_{X_1,X_2,X_3}(X_1, X_2, X_3)\,\Psi_{X_3,X_4}(X_3, X_4) = \Psi_{X_1,X_2,X_3}(1,1,1)\,\Psi_{X_3,X_4}(1,1) + \Psi_{X_1,X_2,X_3}(1,1,1)\,\Psi_{X_3,X_4}(1,0) + \cdots$. Example: $P(1,0,0,0) = \Psi_{X_1,X_2,X_3}(1,0,0)\,\Psi_{X_3,X_4}(0,0)\,/\,Z$.

77 Estimating Parameters: two cliques over $X_1, \ldots, X_4$. With the graphical model: $\Psi_{X_1,X_2,X_3}$ has 8 entries and $\Psi_{X_3,X_4}$ has 4, so 12 parameters. Without the graphical model: $2^4 - 1 = 15$ parameters (e.g., $P(1,1,1,1)$, $P(1,1,1,0)$, ...).

78 Conditional Independence. For the chain $X_1 \to X_2 \to X_3$: $P(X_1, X_2, X_3) = P(X_1)\,P(X_2 \mid X_1)\,P(X_3 \mid X_2)$. What is $P(X_3 = 1 \mid X_1 = 0, X_2 = 1)$?
$P(X_3 = 1 \mid X_1 = 0, X_2 = 1)$
$= \dfrac{P(X_1=0, X_2=1, X_3=1)}{P(X_1=0, X_2=1, X_3=1) + P(X_1=0, X_2=1, X_3=0)}$
$= \dfrac{P(X_1=0)\,P(X_2=1 \mid X_1=0)\,P(X_3=1 \mid X_2=1)}{P(X_1=0)\,P(X_2=1 \mid X_1=0)\,P(X_3=1 \mid X_2=1) + P(X_1=0)\,P(X_2=1 \mid X_1=0)\,P(X_3=0 \mid X_2=1)}$
$= \dfrac{P(X_3=1 \mid X_2=1)}{P(X_3=1 \mid X_2=1) + P(X_3=0 \mid X_2=1)} = P(X_3 = 1 \mid X_2 = 1)$
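The identity is easy to confirm numerically for arbitrary tables; a sketch with made-up values:

```python
import numpy as np

# Numeric check: in the chain X1 -> X2 -> X3,
# P(X3=1 | X1=0, X2=1) equals P(X3=1 | X2=1) for any choice of tables.
p1 = np.array([0.3, 0.7])                   # P(X1)
p21 = np.array([[0.6, 0.4], [0.1, 0.9]])    # P(X2 | X1)
p32 = np.array([[0.8, 0.2], [0.25, 0.75]])  # P(X3 | X2)

joint = p1[:, None, None] * p21[:, :, None] * p32[None, :, :]

lhs = joint[0, 1, 1] / joint[0, 1, :].sum()  # P(X3=1 | X1=0, X2=1)
rhs = p32[1, 1]                              # P(X3=1 | X2=1)
print(np.isclose(lhs, rhs))                  # True
```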

79 Conditional Independence: two cliques over $X_1, X_2, X_3, X_4$. [Figure: tables of potential values for $\Psi_{X_1,X_2,X_3}$ and $\Psi_{X_3,X_4}$.] Questions: $P(X_4 = 1 \mid X_1 = 0, X_2 = 1) = ?$, $P(X_4 = 1 \mid X_1 = 0, X_2 = 0) = ?$, $P(X_4 = 1) = ?$ All answers are the same.

80 Marginalization. For the chain $X_1 \to X_2 \to X_3$ with $P(X_1, X_2, X_3) = P(X_1)\,P(X_2 \mid X_1)\,P(X_3 \mid X_2)$: $P(X_1, X_3) = \int P(X_1, X_2, X_3)\, dX_2$. Any good property, like $P(X_1, X_3) = P(X_1)\,P(X_3)$? [Figure: after marginalizing $X_2$, an edge $X_1 \to X_3$ remains.]

81 Marginalization. $P(X_1, \ldots, X_D) = P(X_1)\cdots P(X_{i-1})\,P(X_i \mid X_1, \ldots, X_{i-1})\,P(X_{i+1} \mid X_i)\cdots P(X_D \mid X_i)$. $P(X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_D) = \int P(X_1, \ldots, X_D)\, dX_i$. Any decomposition with $X_{i+1}, \ldots, X_D$? $P(X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_D) = P(X_1)\cdots P(X_{i-1})\,P(X_{i+1} \mid X_1, \ldots, X_{i-1})\cdots P(X_D \mid X_1, \ldots, X_{i-1})$.

82 Marginalization. Marginalizing $X_i$ leaves factors such as $P(X_D \mid X_1, \ldots, X_{i-1})$, connecting the children of $X_i$ directly to its parents.

83 Introducing Latent Variables. [Figure: $X_1, \ldots, X_4$ fully dependent vs. conditionally independent given a latent variable $Z$.] Issue: how can a model be simplified as much as possible while keeping enough flexibility to incorporate the true dependency?

84 Expectation-Maximization Algorithm: parameter estimation with latent variables, for which we have no data. E-step: values for the latent variables are obtained as expectations under the current parameter values. M-step: with the expected latent variables, parameters are obtained by maximizing the likelihood. The E-step and M-step are repeated back and forth until the likelihood converges.

85 Expectation-Maximization Algorithm: Gaussian mixture model. We are given $x_i$ for $i = 1, \ldots, N$. Parameters: $\pi_k, \mu_k, \Sigma_k$ for $k = 1, \ldots, K$. Unknown variables: $z_i$ for $i = 1, \ldots, N$. E-step: the distribution of $z_i$ (responsibilities), computed with the current parameters $\pi_k, \mu_k, \Sigma_k$: $\gamma(z_{ik}) = \dfrac{\pi_k\,\mathcal{N}(x_i \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j\,\mathcal{N}(x_i \mid \mu_j, \Sigma_j)}$. [Figure: two points A and B with example responsibilities $\gamma_{i1} = 0.2$, $\gamma_{i2} = 0.8$ and $\gamma_{j1} = 0.05$, $\gamma_{j2} = 0.95$.]

86 Expectation-Maximization Algorithm. M-step: estimate the parameters with the responsibilities fixed: $\mu_k = \frac{1}{N_k}\sum_{i=1}^{N} \gamma(z_{ik})\, x_i$, $\Sigma_k = \frac{1}{N_k}\sum_{i=1}^{N} \gamma(z_{ik})(x_i - \mu_k)(x_i - \mu_k)^\top$, $\pi_k = \frac{N_k}{N}$, where $N_k = \sum_{i=1}^{N} \gamma(z_{ik})$. Iterate until $\ln p(X \mid \pi, \mu, \Sigma) = \sum_{i=1}^{N} \ln\!\left(\sum_{k=1}^{K} \pi_k\,\mathcal{N}(x_i \mid \mu_k, \Sigma_k)\right)$ converges.
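Putting the E-step and M-step together, a minimal EM sketch for a Gaussian mixture; my own simplifications: it runs a fixed number of iterations instead of checking the log-likelihood for convergence, and adds a small ridge to each covariance to avoid singular components.

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

def em_gmm(X, K, n_iter=100, seed=0):
    """Minimal EM for a Gaussian mixture (no safeguards for collapsing components)."""
    N, D = X.shape
    rng = np.random.default_rng(seed)
    pi = np.full(K, 1.0 / K)
    mu = X[rng.choice(N, K, replace=False)]  # initialize means at random data points
    Sigma = np.array([np.cov(X.T) + 1e-6 * np.eye(D) for _ in range(K)])
    for _ in range(n_iter):
        # E-step: responsibilities gamma[i, k] = pi_k N(x_i | mu_k, Sigma_k) / sum_j ...
        dens = np.stack([pi[k] * mvn.pdf(X, mu[k], Sigma[k]) for k in range(K)], axis=1)
        gamma = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the expected assignments.
        Nk = gamma.sum(axis=0)
        mu = (gamma.T @ X) / Nk[:, None]
        for k in range(K):
            Xc = X - mu[k]
            Sigma[k] = (gamma[:, k, None] * Xc).T @ Xc / Nk[k] + 1e-6 * np.eye(D)
        pi = Nk / N
    return pi, mu, Sigma
```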

87 Gaussian Mixture Model with EM. [Figure credit: C. Bishop 2007, Figure 9.8(a)-(f).]

88 ANY QUESTIONS?

89 THANK YOU. Yung-Kyun Noh
