COMP 328: Machine Learning


1 COMP 328: Machine Learning
Lecture 2: Naive Bayes Classifiers
Nevin L. Zhang
Department of Computer Science and Engineering
The Hong Kong University of Science and Technology
Spring 2010

2 Two different types of classifiers
Decision tree classifiers: data → decision rules; classify unseen examples using the rules.
Naive Bayes classifiers: data → a probabilistic model of the relationship between the class and the attributes; classify unseen examples via inference based on the model.

3 Outline
1 Probabilistic Models and Classification
2 Probabilistic Independence
3 Naive Bayes Model Classifiers
4 Issues
5 Learning to Classify Text

4 Probabilistic Models and Classification: Joint Distribution
The joint distribution is the most general way to describe relationships among random variables.
Example: P(Temperature, Wind, PlayTennis)

Temperature  Wind    PlayTennis  Probability
Hot          Weak    No          0.1
Hot          Weak    Yes         0
Hot          Strong  No          0.2
Hot          Strong  Yes         0.3
Cool         Weak    No          0.1
Cool         Weak    Yes         0.2
Cool         Strong  No          0.1
Cool         Strong  Yes         0

There is one probability value for each combination of the states of the variables, and all the values must sum to 1.

5 Probabilistic Models and Classification: Joint Distribution and Classification
Suppose we have a joint distribution. How do we classify? Calculate the posterior probability of each class and pick the most probable one.
Play tennis on a Hot day with Strong wind?
P(Play = y | T = h, W = s) = P(Play = y, T = h, W = s) / P(T = h, W = s) = 0.3 / 0.5 = 0.6
P(Play = n | T = h, W = s) = 1 - P(Play = y | T = h, W = s) = 0.4
Answer: yes (if we must answer with yes or no).
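
To make the calculation concrete, here is a minimal Python sketch (mine, not from the slides) that stores the joint table from Slide 4 as a dictionary and computes the posterior by summing and normalizing; all names are illustrative.

# Joint distribution P(Temperature, Wind, PlayTennis) from Slide 4.
joint = {
    ('Hot', 'Weak', 'No'): 0.1,    ('Hot', 'Weak', 'Yes'): 0.0,
    ('Hot', 'Strong', 'No'): 0.2,  ('Hot', 'Strong', 'Yes'): 0.3,
    ('Cool', 'Weak', 'No'): 0.1,   ('Cool', 'Weak', 'Yes'): 0.2,
    ('Cool', 'Strong', 'No'): 0.1, ('Cool', 'Strong', 'Yes'): 0.0,
}

def posterior(temp, wind):
    """P(PlayTennis | Temperature, Wind), by summing and normalizing the joint."""
    evidence = sum(p for (t, w, _), p in joint.items() if t == temp and w == wind)
    return {c: joint[(temp, wind, c)] / evidence for c in ('Yes', 'No')}

print(posterior('Hot', 'Strong'))  # {'Yes': 0.6, 'No': 0.4} -> answer: yes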

6 Probabilistic Models and Classification: Joint Distribution and Classification
We can perform classification even with only a subset of the attributes.
Play tennis when the Wind is weak?
P(Play = y | W = w) = P(Play = y, W = w) / P(W = w) = 0.67
P(Play = n | W = w) = 1 - P(Play = y | W = w) = 0.33
Answer: yes.

7 Probabilistic Models and Classification: The General Case
Suppose we have a joint distribution P(A1, A2, ..., An, C) over attributes A1, A2, ..., An and the class variable C.
For a new example with attribute values a1, a2, ..., an:
For each possible value vj of the class variable C, compute the posterior probability P(C = vj | A1 = a1, A2 = a2, ..., An = an).
Assign the example to the class with the highest posterior probability:
c = arg max_{vj ∈ class labels} P(C = vj | A1 = a1, A2 = a2, ..., An = an)
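
The same rule in generic form, as a hedged sketch: given any joint table keyed by (attribute values, class), return the arg-max class. The evidence term P(a1, ..., an) is a common factor across classes, so it can be dropped from the comparison.

def classify_from_joint(joint, attr_values, classes):
    """Return the class vj maximizing P(C = vj | A1 = a1, ..., An = an).

    `joint` maps tuples (a1, ..., an, vj) to probabilities. The evidence
    P(a1, ..., an) is the same for every class, so comparing the
    unnormalized joint probabilities is enough for the arg max.
    """
    return max(classes, key=lambda v: joint.get(tuple(attr_values) + (v,), 0.0))

# Reusing the PlayTennis joint table from the previous sketch:
print(classify_from_joint(joint, ('Hot', 'Strong'), ('Yes', 'No')))  # Yes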

8 Probabilistic Models and Classification: A Difficulty
Suppose we have 100 binary attributes and a binary class variable. How many entries are in the joint probability table? 2^101, about 2.5 × 10^30.
Too many to handle, and too many to estimate from data; this leads to overfitting.
The Naive Bayes model reduces the number of model parameters by making an independence assumption.

9 Outline
1 Probabilistic Models and Classification
2 Probabilistic Independence
3 Naive Bayes Model Classifiers
4 Issues
5 Learning to Classify Text

10 Probabilistic Independence: Marginal Independence
Two random variables X and Y are marginally independent, written X ⊥ Y, if for any state x of X and any state y of Y,
P(X=x | Y=y) = P(X=x) whenever P(Y=y) > 0.
Meaning: learning the value of Y gives no information about X, and vice versa.
Equivalent definition: P(X=x, Y=y) = P(X=x)P(Y=y).
Shorthand for the equations: P(X | Y) = P(X), P(X, Y) = P(X)P(Y).

11 Probabilistic Independence: Marginal Independence
Examples:
X: result of tossing a fair coin for the first time; Y: result of the second toss of the same coin.
X: result of the US election; Y: your grades in this course.
Counterexample: X: midterm exam grade; Y: final exam grade.

12 Probabilistic Independence: Conditional Independence
Two random variables X and Y are conditionally independent given a third variable Z, written X ⊥ Y | Z, if
P(X=x | Y=y, Z=z) = P(X=x | Z=z) whenever P(Y=y, Z=z) > 0.
Meaning: if I already know the state of Z, then learning the state of Y gives no additional information about X. Y might contain some information about X, but all the information about X contained in Y is also contained in Z.
Shorthand for the equation: P(X | Y, Z) = P(X | Z).
Equivalent definition: P(X, Y | Z) = P(X | Z)P(Y | Z).

13 Probabilistic Independence: Example of Conditional Independence
There is a bag of 100 coins. 10 coins were made by a malfunctioning machine and are biased toward heads: tossing such a coin results in heads 80% of the time. The other coins are fair.
Randomly draw a coin from the bag and toss it a few times. Let Xi be the result of the i-th toss and Y be whether the coin was produced by the malfunctioning machine.
The Xi's are not marginally independent of each other: if I get 9 heads in the first 10 tosses, then the coin is probably biased, so the next toss is more likely to result in heads than tails. Learning the value of Xi gives some information about whether the coin is biased, which in turn gives some information about Xj.

14 Probabilistic Independence: Example of Conditional Independence
However, the Xi's are conditionally independent given Y:
If the coin is not biased, the probability of getting heads on one toss is 1/2 regardless of the results of the other tosses.
If the coin is biased, the probability of getting heads on one toss is 80% regardless of the results of the other tosses.
If I already know whether the coin is biased, learning the value of Xi gives no additional information about Xj.
Here is how the variables are related pictorially (we will return to this picture later):
Coin Type → Toss 1 Result, Toss 2 Result, ..., Toss n Result (one arrow from Coin Type to each toss)
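
This can be checked empirically. Below is a sketch of my own, under the assumptions of the example (10% biased coins, 80% heads when biased), that simulates the bag and estimates P(X2 = heads) with and without conditioning on the coin type.

import random

def draw_and_toss(n_tosses=2):
    """Draw a coin from the bag (10% biased toward heads) and toss it."""
    biased = random.random() < 0.1
    p_heads = 0.8 if biased else 0.5
    return biased, [random.random() < p_heads for _ in range(n_tosses)]

trials = [draw_and_toss() for _ in range(200_000)]

# Marginally, X1 carries information about X2 ...
p_x2 = sum(t[1] for _, t in trials) / len(trials)
x2_given_x1 = [t[1] for _, t in trials if t[0]]
print(p_x2, sum(x2_given_x1) / len(x2_given_x1))    # ~0.53 vs ~0.55: dependent

# ... but among the fair coins alone, X1 tells us nothing more about X2.
fair = [t for biased, t in trials if not biased]
fair_given_x1 = [t[1] for t in fair if t[0]]
print(sum(t[1] for t in fair) / len(fair),
      sum(fair_given_x1) / len(fair_given_x1))      # both ~0.5: independent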

15 Outline
1 Probabilistic Models and Classification
2 Probabilistic Independence
3 Naive Bayes Model Classifiers
4 Issues
5 Learning to Classify Text

16 Naive Bayes Model Classifiers: The Naive Bayes Model
The Naive Bayes model assumes that the attributes are mutually independent of each other given the class variable. Graphically, the class variable C has an arrow to each attribute Ai, just like the coin picture above.
The joint distribution is given by
P(C, A1, A2, ..., An) = P(C) ∏_{i=1}^{n} P(Ai | C)

17 Naive Bayes Model Classifiers: Learning the Naive Bayes Model
Learning amounts to estimating P(C), P(A1 | C), ..., P(An | C) from data. The estimates are straightforward to compute:
P̂(C = vj) = (# of examples with C = vj) / (total # of examples)
P̂(Ai = ai | C = vj) = (# of examples with C = vj and Ai = ai) / (# of examples with C = vj)
Although simple, these are the maximum likelihood estimates (MLE) of the parameters, which have nice properties.
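
As an illustration, here is a compact sketch of these counting formulas over examples given as (attribute dictionary, class label) pairs; the data format and helper names are assumptions for illustration, not anything prescribed by the slides.

from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """MLE estimates from examples of the form (attr_dict, class_label)."""
    class_counts = Counter(c for _, c in examples)
    cond_counts = defaultdict(Counter)   # (attribute, class) -> value counts
    for attrs, c in examples:
        for a, value in attrs.items():
            cond_counts[(a, c)][value] += 1

    prior = {c: n / len(examples) for c, n in class_counts.items()}
    likelihood = {key: {v: n / class_counts[key[1]] for v, n in counts.items()}
                  for key, counts in cond_counts.items()}
    return prior, likelihood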

18 Naive Bayes Model Classifiers: Classification with the Naive Bayes Model
For a new example with attribute values a1, a2, ..., an, assign it to the class
v_NB = arg max_{vj ∈ V} P̂(C = vj) ∏_{i=1}^{n} P̂(Ai = ai | C = vj)

19 Naive Bayes Model Classifiers: Example: PlayTennis

Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No

20 Naive Bayes Model Classifiers: Example: Estimate Parameters
P(PlayTennis = y) = 9/14              P(PlayTennis = n) = 5/14
P(Outlook = sunny | y) = 2/9          P(Outlook = sunny | n) = 3/5
P(Outlook = overcast | y) = 4/9       P(Outlook = overcast | n) = 0/5
P(Outlook = rain | y) = 3/9           P(Outlook = rain | n) = 2/5
P(Temp = hot | y) = 2/9               P(Temp = hot | n) = 2/5
P(Temp = mild | y) = 4/9              P(Temp = mild | n) = 2/5
P(Temp = cool | y) = 3/9              P(Temp = cool | n) = 1/5
P(Humidity = high | y) = 3/9          P(Humidity = high | n) = 4/5
P(Humidity = normal | y) = 6/9        P(Humidity = normal | n) = 1/5
P(Wind = strong | y) = 3/9            P(Wind = strong | n) = 3/5
P(Wind = weak | y) = 6/9              P(Wind = weak | n) = 2/5

21 Naive Bayes Model Classifiers: Example: Classification
New case: (Sunny, Cool, High, Strong). PlayTennis?
Inference:
P(y) P(sunny|y) P(cool|y) P(high|y) P(strong|y) = 9/14 × 2/9 × 3/9 × 3/9 × 3/9 ≈ 0.005
P(n) P(sunny|n) P(cool|n) P(high|n) P(strong|n) = 5/14 × 3/5 × 1/5 × 4/5 × 3/5 ≈ 0.021
Conclusion: v_NB = n. No, don't play.
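
These numbers are easy to reproduce. A minimal sketch, with the parameters from Slide 20 written out by hand rather than learned from the table:

import math

prior = {'y': 9/14, 'n': 5/14}
cond = {  # P(attribute value | class), taken from Slide 20
    'y': {'sunny': 2/9, 'cool': 3/9, 'high': 3/9, 'strong': 3/9},
    'n': {'sunny': 3/5, 'cool': 1/5, 'high': 4/5, 'strong': 3/5},
}

case = ['sunny', 'cool', 'high', 'strong']
scores = {c: prior[c] * math.prod(cond[c][a] for a in case) for c in ('y', 'n')}
print(scores)                       # {'y': ~0.0053, 'n': ~0.0206}
print(max(scores, key=scores.get))  # n -> don't play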

22 Outline
1 Probabilistic Models and Classification
2 Probabilistic Independence
3 Naive Bayes Model Classifiers
4 Issues
5 Learning to Classify Text

23 Issues: Zero Counts
What if none of the training instances with class vj have attribute value ai? Then P̂(ai | vj) = 0, and hence P̂(vj) ∏i P̂(ai | vj) = 0.
A future example with ai then has no chance of being classified as vj, even if all the other attribute values suggest vj.
Smoothing: add a virtual count of 1 to each case (Laplace smoothing/correction):
P̂(Ai = ai | C = vj) = ((# of examples with C = vj and Ai = ai) + 1) / ((# of examples with C = vj) + |Ai|)
where |Ai| is the number of possible values of attribute Ai.
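
A one-line version of the smoothed estimate, checked against the Weka numbers quoted on the next slide; the function name is illustrative.

def smoothed_estimate(count_both, count_class, n_values):
    """Laplace-smoothed P(Ai = ai | C = vj): add 1 per possible attribute value."""
    return (count_both + 1) / (count_class + n_values)

# P(Outlook = sunny | yes) on the PlayTennis data: (2 + 1) / (9 + 3) = 3/12
print(smoothed_estimate(2, 9, 3))  # 0.25, matching Weka on the next slide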

24 Issues: Weka does just that
With Laplace smoothing, P(Outlook | yes) is {3/12, 5/12, 4/12} instead of {2/9, 4/9, 3/9} as on Slide 20.

25 Issues: Continuous Attributes
What about continuous attributes?
Discretize them: equal intervals; Weka has an MDL-based method.
Or use a parametric form (e.g., Gaussian) for P(Ai | C).
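
For the Gaussian option, a hedged sketch: estimate a per-class mean and variance for the attribute (MLE), then use the normal density in place of a conditional probability table. The example values are made up.

import math

def gaussian_likelihood(x, values_in_class):
    """P(Ai = x | C): normal density with per-class mean and variance (MLE)."""
    n = len(values_in_class)
    mu = sum(values_in_class) / n
    var = sum((v - mu) ** 2 for v in values_in_class) / n
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# e.g., temperatures (in some unit) of the 'yes' examples:
print(gaussian_likelihood(21.0, [20.5, 22.0, 19.8, 23.1, 21.4]))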

26 Issues: The Conditional Independence Assumption
Assumption: the Ai's are mutually independent of each other given C. This is often violated and might lead to double counting.
To see this, suppose we duplicate A1, so we have two copies of A1 in the data: A1 and A1'. The information in the data remains the same, but the classification might be different:
arg max_{vj ∈ V} P̂(vj) P̂(A1 | vj) P̂(A1' | vj) P̂(A2 | vj) ...
The evidence from A1 is counted twice.
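
A toy numeric illustration of the double-counting effect (the probabilities here are made up, not from the slides): duplicating A1 flips the decision even though the data contain no new information.

prior = {'y': 0.5, 'n': 0.5}
p_a1 = {'y': 0.4, 'n': 0.6}   # A1 = a: weak evidence for class n
p_a2 = {'y': 0.7, 'n': 0.35}  # A2 = b: stronger evidence for class y

def score(c, copies_of_a1):
    return prior[c] * p_a1[c] ** copies_of_a1 * p_a2[c]

for k in (1, 2):
    print(k, max('yn', key=lambda c: score(c, k)))  # 1 copy -> y, but 2 copies -> n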

27 Issues: The Conditional Independence Assumption
The Naive Bayes classifier works surprisingly well anyway!
Reason: although P̂(vj) ∏i P̂(ai | vj) might be a poor estimate of P(vj, a1, ..., an), we might still have
arg max_{vj ∈ V} P̂(vj) ∏i P̂(ai | vj) = arg max_{vj ∈ V} P(vj, a1, ..., an)
That is, only the ranking of the classes needs to be correct, not the probability estimates themselves.

28 Issues: The Conditional Independence Assumption
Note: Bayesian (belief) network classifiers relax the assumption.

29 Issues: Overfitting
Overfitting is not an issue for the Naive Bayes classifier because the model complexity is fixed. However, it is an issue for Bayesian networks.

30 Outline
1 Probabilistic Models and Classification
2 Probabilistic Independence
3 Naive Bayes Model Classifiers
4 Issues
5 Learning to Classify Text

31 Learning to Classify Text: Spam and Non-Spam Emails
Classify emails into spam and non-spam according to their content.
Classes: S and ¬S.
Attributes: w1, w2, ..., wn, a list of words obtained from the training set after:
Stop-word removal: e.g., a, the.
Stemming: e.g., engineering, engineered, engineer → engineer.
Parameters:
P(wi | S): probability that word wi appears in a spam mail.
P(wi | ¬S): probability that word wi appears in a non-spam mail.
P(S) and P(¬S): probability of a mail being spam or non-spam.
All of these can be obtained from a training set by counting.

32 Learning to Classify Text: Spam and Non-Spam Emails
A new document D contains words X1, X2, ..., Xm, a subset of w1, w2, ..., wn.
P(D | S) = ∏_{i=1}^{m} P(Xi | S)
P(D | ¬S) = ∏_{i=1}^{m} P(Xi | ¬S)
The document is classified as spam if
P(S) ∏_{i=1}^{m} P(Xi | S) > P(¬S) ∏_{i=1}^{m} P(Xi | ¬S)
Can we do this with decision trees?
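
Putting the pieces together, a hedged end-to-end sketch of such a spam classifier; the tiny word lists, the Laplace smoothing, and the use of log probabilities (to avoid floating-point underflow with many words) are all illustrative choices, not from the slides.

import math
from collections import Counter

def train(spam_docs, ham_docs):
    """Estimate P(S), P(w|S), P(w|~S) by counting, with Laplace smoothing."""
    vocab = {w for d in spam_docs + ham_docs for w in d}
    stats = {}
    for label, docs in (('spam', spam_docs), ('ham', ham_docs)):
        counts = Counter(w for d in docs for w in d)
        total = sum(counts.values())
        stats[label] = {w: (counts[w] + 1) / (total + len(vocab)) for w in vocab}
    prior = {'spam': len(spam_docs) / (len(spam_docs) + len(ham_docs))}
    prior['ham'] = 1 - prior['spam']
    return prior, stats

def classify(doc, prior, stats):
    # Sum log probabilities instead of multiplying, to avoid underflow.
    score = {c: math.log(prior[c]) +
                sum(math.log(stats[c][w]) for w in doc if w in stats[c])
             for c in ('spam', 'ham')}
    return max(score, key=score.get)

spam = [['cheap', 'pills', 'buy'], ['buy', 'now', 'cheap']]
ham = [['meeting', 'tomorrow', 'agenda'], ['lecture', 'notes', 'agenda']]
prior, stats = train(spam, ham)
print(classify(['buy', 'cheap'], prior, stats))  # spam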

33 Learning to Classify Text [figure slide]

34 Learning to Classify Text: Final Remark
When to use Naive Bayes classifiers?
A moderate or large training set is available.
The attributes that describe instances are conditionally independent given the classification.
Successful applications: diagnosis; classifying text documents.
