Today. Statistical Learning. Coin Flip. Experiment 1: Heads. Which coin will I use?


1 Today: Statistical Learning. Parameter estimation: Maximum Likelihood (ML), Maximum A Posteriori (MAP), Bayesian; the continuous case. Learning parameters for a Bayesian network: Naive Bayes, maximum likelihood estimates, priors. Learning the structure of Bayesian networks.

Coin Flip: three coins C1, C2, C3 with P(H | C1) = 0.1, P(H | C2) = 0.5, P(H | C3) = 0.9. Which coin will I use? P(C1) = 1/3, P(C2) = 1/3, P(C3) = 1/3. Prior: probability of a hypothesis before we make any observations. Uniform prior: all hypotheses are equally likely before we make any observations.

Experiment 1: Heads. P(C1 | H) = ? P(C2 | H) = ? P(C3 | H) = ? By Bayes' rule, P(C1 | H) = P(H | C1) P(C1) / P(H), where P(H) = sum over i = 1..3 of P(H | Ci) P(Ci). Result: P(C1 | H) = 0.066, P(C2 | H) = 0.333, P(C3 | H) = 0.6. Posterior: probability of a hypothesis given data.
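To make the update concrete, here is a minimal Python sketch of the calculation above (my own illustration, not code from the lecture), assuming the uniform prior P(Ci) = 1/3:

    # One-flip posterior over the three coins after observing Heads
    p_heads = {"C1": 0.1, "C2": 0.5, "C3": 0.9}          # P(H | Ci) from the slide
    prior = {c: 1 / 3 for c in p_heads}                   # uniform prior
    p_h = sum(p_heads[c] * prior[c] for c in p_heads)     # P(H) = 0.5
    posterior = {c: p_heads[c] * prior[c] / p_h for c in p_heads}
    print({c: round(v, 3) for c, v in posterior.items()})
    # {'C1': 0.067, 'C2': 0.333, 'C3': 0.6}, matching the slide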

2 Experiment 2: Tails. P(C1 | HT) = ? P(C2 | HT) = ? P(C3 | HT) = ? Using the normalizer alpha: P(C1 | HT) = alpha P(HT | C1) P(C1) = alpha P(H | C1) P(T | C1) P(C1). With the uniform prior: P(C1 | HT) = 0.21, P(C2 | HT) = 0.58, P(C3 | HT) = 0.21.

Your Estimate? What is the probability of heads after two experiments? Most likely coin: C2. Best estimate for P(H): P(H | C2) = 0.5. Maximum Likelihood Estimate: the hypothesis that best fits the observed data, assuming a uniform prior.

Using Prior Knowledge. Should we always use a uniform prior? Background knowledge: heads => you go first in Abalone against the TA; TAs are nice people => the TA is more likely to use a coin biased in your favor.
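A small sketch of the alpha-normalized update for a whole flip sequence (again my own illustration; the function name posterior_after is mine):

    def posterior_after(flips, prior, p_heads):
        """Posterior P(Ci | flips) for a string like 'HT', assuming i.i.d. flips."""
        unnorm = {c: prior[c] for c in p_heads}
        for f in flips:
            for c in p_heads:
                unnorm[c] *= p_heads[c] if f == "H" else (1 - p_heads[c])
        alpha = 1 / sum(unnorm.values())          # the normalizer from the slide
        return {c: round(alpha * v, 3) for c, v in unnorm.items()}

    p_heads = {"C1": 0.1, "C2": 0.5, "C3": 0.9}
    uniform = {c: 1 / 3 for c in p_heads}
    print(posterior_after("HT", uniform, p_heads))
    # {'C1': 0.209, 'C2': 0.581, 'C3': 0.209}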

3 Using Prior Knowledge. We can encode it in the prior: P(C1) = 0.05, P(C2) = 0.25, P(C3) = 0.70.

Experiment 1: Heads. P(C1 | H) = ? P(C2 | H) = ? P(C3 | H) = ? P(C1 | H) = alpha P(H | C1) P(C1). Result: P(C1 | H) = 0.006, P(C2 | H) = 0.165, P(C3 | H) = 0.829. Compare with the ML posterior after Experiment 1: P(C1 | H) = 0.066, P(C2 | H) = 0.333, P(C3 | H) = 0.6.

Experiment 2: Tails. P(C1 | HT) = ? P(C2 | HT) = ? P(C3 | HT) = ? P(C1 | HT) = alpha P(HT | C1) P(C1) = alpha P(H | C1) P(T | C1) P(C1). Result: P(C1 | HT) = 0.035, P(C2 | HT) = 0.481, P(C3 | HT) = 0.485.
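The same two-flip update with the informative prior from this slide, as a short self-contained check (my own sketch):

    p_heads = {"C1": 0.1, "C2": 0.5, "C3": 0.9}
    prior = {"C1": 0.05, "C2": 0.25, "C3": 0.70}          # informative prior
    unnorm = {c: p_heads[c] * (1 - p_heads[c]) * prior[c] for c in p_heads}  # P(HT|Ci)P(Ci)
    alpha = 1 / sum(unnorm.values())
    print({c: round(alpha * v, 3) for c, v in unnorm.items()})
    # {'C1': 0.035, 'C2': 0.481, 'C3': 0.485}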

4 Your Estimate? What is the probability of heads after two experiments? Most likely coin: C3. Best estimate for P(H): P(H | C3) = 0.9. Maximum A Posteriori (MAP) Estimate: the hypothesis that best fits the observed data, assuming a non-uniform prior.

Did We Do The Right Thing? P(C1 | HT) = 0.035, P(C2 | HT) = 0.481, P(C3 | HT) = 0.485: C2 and C3 are almost equally likely.

A Better Estimate. Recall: P(H) = sum over i = 1..3 of P(H | Ci) P(Ci | HT) = 0.68. Bayesian Estimate: minimizes prediction error, given data and (generally) assuming a non-uniform prior.
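And the Bayesian (weighted) estimate computed from the posterior above, as a one-line check (my sketch):

    p_heads = {"C1": 0.1, "C2": 0.5, "C3": 0.9}
    posterior = {"C1": 0.035, "C2": 0.481, "C3": 0.485}   # P(Ci | HT) from the previous slide
    print(round(sum(p_heads[c] * posterior[c] for c in p_heads), 2))   # 0.68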

5 Comparison. ML (Maximum Likelihood): P(H) = 0.5; after 10 experiments (HT followed by eight more heads): P(H) = 0.9. MAP (Maximum A Posteriori): P(H) = 0.9; after 10 experiments: P(H) = 0.9. Bayesian: P(H) = 0.68; after 10 experiments: P(H) is approximately 0.9.

Comparison. ML: easy to compute. MAP: still easy to compute, and incorporates prior knowledge. Bayesian: minimizes error => great when data is scarce; potentially much harder to compute.

Summary For Now. Prior: probability of a hypothesis before we see any data. Likelihood: probability of data given a hypothesis. Uniform prior: a prior that makes all hypotheses equally likely. Posterior: probability of a hypothesis after we saw some data.

6 Summary For Now (continued): how the estimates differ. Maximum Likelihood Estimate: uniform prior; picks the most likely hypothesis (a point estimate). Maximum A Posteriori Estimate: any prior; picks the most likely hypothesis (a point estimate). Bayesian Estimate: any prior; a weighted combination of hypotheses (an average).

Continuous Case. In the previous example we chose from a discrete set of three coins. In general, we have to pick from a continuous distribution of biased coins.

[Figure: a prior over the coin bias (uniform vs. one encoding background knowledge), the posterior after two experiments (Exp 1: Heads, Exp 2: Tails), and the resulting ML, MAP, and Bayesian estimates.]
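A rough sketch of the continuous version, using a grid over theta in place of the density plots on the slide (my own illustration; numpy assumed available):

    import numpy as np

    thetas = np.linspace(0.001, 0.999, 999)              # grid of possible coin biases
    prior = np.ones_like(thetas) / thetas.size            # uniform prior over theta
    lik = thetas * (1 - thetas)                           # likelihood of the data HT
    post = lik * prior
    post /= post.sum()                                    # posterior after two experiments

    ml_est = thetas[np.argmax(lik)]                       # ML estimate
    map_est = thetas[np.argmax(post)]                     # MAP (= ML when the prior is uniform)
    bayes_est = float(np.sum(thetas * post))              # Bayesian estimate (posterior mean)
    print(ml_est, map_est, round(bayes_est, 3))           # all about 0.5 for HT with a uniform prior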

7 [Figure: the posterior over the coin bias after more experiments, with the ML, MAP, and Bayesian estimates marked, both with the uniform prior and with the prior encoding background knowledge.]

Parameter Estimation and Bayesian Networks. Training data (one observation per row):

E B R A J M
T F T T F T
F F F F F T
F T F T T T
F F F T T T
F T F F F F

We have: the Bayes net structure and the observations. We need: the Bayes net parameters. P(B) = ? Prior + data => now compute either a MAP or a Bayesian estimate. Likewise for the conditional tables: P(A | E, B) = ? P(A | E, ¬B) = ? P(A | ¬E, B) = ? P(A | ¬E, ¬B) = ?
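A hedged sketch of the counting involved, with the five rows of the table above encoded by hand (the dictionary encoding and variable ordering are my own):

    # ML estimates by counting; rows follow the table (E, B, R, A, J, M)
    data = [
        dict(E=True,  B=False, R=True,  A=True,  J=False, M=True),
        dict(E=False, B=False, R=False, A=False, J=False, M=True),
        dict(E=False, B=True,  R=False, A=True,  J=True,  M=True),
        dict(E=False, B=False, R=False, A=True,  J=True,  M=True),
        dict(E=False, B=True,  R=False, A=False, J=False, M=False),
    ]
    p_b = sum(r["B"] for r in data) / len(data)           # ML estimate of P(B) = 0.4
    match = [r for r in data if r["E"] and r["B"]]        # rows with E = true, B = true
    p_a_given_eb = sum(r["A"] for r in match) / len(match) if match else None
    print(p_b, p_a_given_eb)   # 0.4, None: no matching rows, the small-sample problem later slides fix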

8 Parameter Estimation and Bayesian Networks (continued). Using the same table of observations, estimate each entry of the conditional probability tables: P(A | E, B) = ? P(A | E, ¬B) = ? P(A | ¬E, B) = ? P(A | ¬E, ¬B) = ? Prior + data => now compute either a MAP or a Bayesian estimate. You know the drill.

Naive Bayes. [Figure: an example email containing words such as "FREE" and "apple".] A Bayes net where all nodes are children of a single root node. Why? Expressive and accurate? No (the independence assumption among the children is usually wrong). Easy to learn? Yes. Useful? Sometimes.

9 Inference In Naive Bayes. Spam filtering: the class S (spam) is the root, and the words are its children. From the slide: P(S) = 0.6; P(A | S) = …, P(A | ¬S) = …; P(B | S) = …, P(B | ¬S) = …; P(F | S) = 0.8, P(F | ¬S) = 0.5; and so on for the other words.

Goal: given the evidence (the words appearing in an email), decide whether the email is spam. E = {A, ¬B, F, ¬K, ...}.

P(S | E) = P(E | S) P(S) / P(E) = P(A, ¬B, F, ¬K, ... | S) P(S) / P(A, ¬B, F, ¬K, ...)

Independence to the rescue!

P(S | E) = P(A | S) P(¬B | S) P(F | S) P(¬K | S) P(... | S) P(S) / [P(A) P(¬B) P(F) P(¬K) P(...)]
P(¬S | E) = P(A | ¬S) P(¬B | ¬S) P(F | ¬S) P(¬K | ¬S) P(... | ¬S) P(¬S) / [P(A) P(¬B) P(F) P(¬K) P(...)]

Classify as spam if P(S | E) > P(¬S | E). But the denominators are identical, so it is enough to compare
P(S | E) proportional to P(A | S) P(¬B | S) P(F | S) P(¬K | S) P(... | S) P(S), against
P(¬S | E) proportional to P(A | ¬S) P(¬B | ¬S) P(F | ¬S) P(¬K | ¬S) P(... | ¬S) P(¬S).
(A small code sketch of this comparison appears at the end of this page.)

Back to parameters: can we calculate the maximum likelihood estimate of θ easily? Prior + data => maximum likelihood estimate of θ. Looking for the maximum of a function: find the derivative and set it to zero.
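A toy sketch of the spam / not-spam comparison described above. The word CPT values that are elided on the slide are replaced here with numbers I made up; only P(S) = 0.6, P(F | S) = 0.8, and P(F | ¬S) = 0.5 come from the slide:

    # Naive Bayes decision for an email with evidence E = {A, not B, F, not K}
    cpt_spam = {"A": 0.3, "B": 0.1, "F": 0.8, "K": 0.2}      # P(word | S); A, B, K assumed
    cpt_notspam = {"A": 0.2, "B": 0.4, "F": 0.5, "K": 0.3}   # P(word | not S); A, B, K assumed
    p_spam = 0.6                                              # P(S) from the slide
    present = {"A": True, "B": False, "F": True, "K": False}  # which words appear in the email

    def score(cpt, class_prior):
        s = class_prior
        for word, seen in present.items():
            s *= cpt[word] if seen else (1 - cpt[word])       # P(word | class) or P(not word | class)
        return s

    spam_score = score(cpt_spam, p_spam)
    ham_score = score(cpt_notspam, 1 - p_spam)
    print("spam" if spam_score > ham_score else "not spam")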

10 What function are we maximizing? P(data | hypothesis), where there is one hypothesis h_θ for each value of θ:

P(data | h_θ) = P(H | h_θ) P(T | h_θ) P(T | h_θ) P(H | h_θ) = θ (1 - θ)(1 - θ) θ = θ^#H (1 - θ)^#T

To find the θ that maximizes θ^#H (1 - θ)^#T, we take the derivative of the function and set it to 0, and we get θ = #H / (#H + #T). You knew it already, right?
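For completeness, the derivative step spelled out (a standard derivation via the log-likelihood, which has the same maximizer; this step is not written out on the slide):

    \log P(\text{data} \mid h_\theta) = \#H \log\theta + \#T \log(1-\theta)
    \frac{d}{d\theta}\log P(\text{data} \mid h_\theta) = \frac{\#H}{\theta} - \frac{\#T}{1-\theta} = 0
    \quad\Rightarrow\quad \theta = \frac{\#H}{\#H + \#T}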

11 Problems With Small Samples. What happens if apples are not mentioned in any spam message in your training data? Then P(A | S) = 0. Why is that bad? Because then P(S | E) proportional to P(A | S) P(¬B | S) P(F | S) P(¬K | S) P(... | S) P(S) = 0 for every email that mentions apples.

Smoothing. Smoothing is used when samples are small. Add-one smoothing is the simplest smoothing method: just add 1 to every count!

Priors. Recall that P(S) = #S / (#S + #¬S). If we have a slight hunch that P(S) is about p, use P(S) = (#S + p) / (#S + #¬S + 1). If we have a big hunch that P(S) is about p, use P(S) = (#S + m·p) / (#S + #¬S + m), where m can be any number > 0. Setting m in the formula above is like saying "I have seen m samples that make me believe that P(S) = p." Hence m is referred to as the equivalent sample size.
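A tiny sketch of the m-estimate formula above (my own code; note that add-one smoothing for a binary variable is the special case m = 2, p = 0.5):

    def m_estimate(count_s, count_not_s, p=0.5, m=1.0):
        """Smoothed estimate of P(S): prior guess p with equivalent sample size m."""
        return (count_s + m * p) / (count_s + count_not_s + m)

    print(m_estimate(0, 10))                 # about 0.045: no spam seen yet, estimate stays above zero
    print(m_estimate(3, 7, p=0.8, m=20))     # about 0.63: a "big hunch" that P(S) is near 0.8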

12 Priors (continued). P(S) = (#S + m·p) / (#S + #¬S + m). Where should p come from? With no prior knowledge, use p = 0.5. If you build a personalized spam filter, you can use p = P(S) from somebody else's filter!

Inference in Naive Bayes Revisited. Recall that P(S | E) is proportional to P(A | S) P(¬B | S) P(F | S) P(¬K | S) P(... | S) P(S). We are multiplying lots of small numbers together. Is there any potential for trouble here? Yes: danger of underflow! Solution? Use logs!

log P(S | E) is proportional to log P(A | S) + log P(¬B | S) + log P(F | S) + log P(¬K | S) + log P(... | S) + log P(S)

Now we add regular-sized numbers, so there is little danger of over- or underflow errors. (See the log-space sketch at the end of this page.)

Learning The Structure of Bayesian Networks. General idea: look at all possible network structures and pick the one that fits the observed data best. Impossibly slow: there is an exponential number of networks, and for each one we have to learn its parameters, too! What do we do if searching the space exhaustively is too expensive? Local search! Start with some network structure, try to make a change (add, delete, or reverse an edge), and see whether the new network is any better.
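A sketch of the same class comparison done in log space (my own illustration of the underflow point above):

    import math

    def log_score(class_prior, cond_probs):
        """Log of prior times product of conditional factors, computed as a sum of logs."""
        return math.log(class_prior) + sum(math.log(p) for p in cond_probs)

    factors = [1e-3] * 5000                  # thousands of small word probabilities
    # The raw product 0.6 * 1e-3 * ... underflows to 0.0; the log score stays a normal float.
    print(log_score(0.6, factors))           # about -34539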

13 Learning The Structure of Bayesian Networks. What network structure should we start with? A random one, with a uniform prior? A network that reflects our (or experts') knowledge of the field?

[Figure: a prior network plus an equivalent sample size, combined with a table of data over variables x1 through x9, yields one or more improved networks.]

We have just described how to get an ML or MAP estimate of the structure of a Bayes net. What would the Bayesian estimate look like? Find all possible networks and calculate their posteriors; when doing inference, the result is a weighted combination of all the networks!
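A minimal sketch of the local-search loop described above (the names and structure are mine, not the lecture's; score would be some fit-to-data measure such as log-likelihood or BIC, and neighbors would generate the nets reachable by one edge addition, deletion, or reversal):

    def local_search(initial_net, neighbors, score, max_steps=1000):
        """Greedy hill-climbing over network structures."""
        current = initial_net
        for _ in range(max_steps):
            best = max(neighbors(current), key=score, default=None)
            if best is None or score(best) <= score(current):
                return current          # no single edge change improves the fit: local optimum
            current = best
        return current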
