COSC343: Artificial Intelligence
Lecture 5: Bayesian Reasoning
Lech Szymanski, Dept. of Computer Science, University of Otago

In today's lecture: conditional probability and independence, the curse of dimensionality, Bayes rule, and the Naive Bayes classifier.

Recap: a simple medical example. Consider a medical scenario with three Boolean variables: Cavity (does the patient have a cavity or not?), Toothache (does the patient have a toothache or not?) and Catch (does the dentist's probe catch on the patient's tooth?). An example probability model for this scenario is the joint probability distribution over these three variables.

Conditional probabilities. A conditional probability is the probability of an event given information about another event: P(a | b) denotes the probability of a given b, and is defined as P(a | b) = P(a, b) / P(b) whenever P(b) > 0.
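To make the joint distribution concrete, here is a minimal sketch of how it could be stored in code. The transcription does not preserve the slide's table, so the numbers below are the standard dentist-example values from AIMA Chapter 13 (which the conditional probabilities later in the lecture appear to use); only the structure matters.

```python
# Full joint distribution over (cavity, toothache, catch), stored as a dictionary
# mapping each complete assignment to its probability.
# Values assumed from the dentist example in AIMA Chapter 13.
joint = {
    (True,  True,  True):  0.108,
    (True,  True,  False): 0.012,
    (True,  False, True):  0.072,
    (True,  False, False): 0.008,
    (False, True,  True):  0.016,
    (False, True,  False): 0.064,
    (False, False, True):  0.144,
    (False, False, False): 0.576,
}

# Sanity check: a probability model must sum to 1 over all assignments.
assert abs(sum(joint.values()) - 1.0) < 1e-9
```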

Computing conditional probabilities. Assume we are given the joint probability distribution above and we learn that a patient has a toothache. How do we calculate, for example, P(cavity | toothache)?

Conditional probability of whole distributions. Conditional probabilities can also be computed for whole distributions. For instance, the conditional probability distribution P(Cavity, Catch | toothache) shows all possible values of Cavity and Catch given toothache:

              catch             ¬catch
  cavity      .108/.2 = .54     .012/.2 = .06
  ¬cavity     .016/.2 = .08     .064/.2 = .32

Complexity of reasoning from a full joint distribution. Inference from full joint distributions is all about summing over hidden variables. If there are n Boolean variables in the table, it takes O(2^n) time to compute a probability. Moreover, the space complexity of the table itself is O(2^n). This means that as n increases, it becomes infeasible even to store the full joint distribution. And remember: someone has to build the full joint distribution from empirical data! In practice, probabilistic reasoning tasks often involve thousands of random variables. Obviously, a more efficient reasoning technique is necessary.

Curse of dimensionality. When the dimensionality of a problem increases, the volume of the space becomes so large that you need exponentially more points to sample the space at the same density. This phenomenon is sometimes referred to as the curse of dimensionality. https://haifengl.wordpress.com/2016/02/29/there-is-no-big-data-in-machine-learning/
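As a rough illustration of this style of inference, the sketch below answers the query above from the `joint` dictionary defined earlier, summing out hidden variables and dividing by the probability of the evidence. The helper name `p` and the variable ordering are my own choices; the point is that every query has to walk the whole exponentially large table.

```python
VAR_NAMES = ("cavity", "toothache", "catch")

def p(event):
    """Probability of a partial event: sum the joint over all matching entries.
    Variables not mentioned in `event` are hidden and get summed out."""
    total = 0.0
    for assignment, prob in joint.items():
        values = dict(zip(VAR_NAMES, assignment))
        if all(values[name] == wanted for name, wanted in event.items()):
            total += prob
    return total

# P(cavity | toothache) = P(cavity, toothache) / P(toothache)
p_toothache = p({"toothache": True})                          # 0.2
print(p({"cavity": True, "toothache": True}) / p_toothache)   # 0.6
```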

Independence. Huge reductions in complexity can be made if we know that some variables in our domain are independent of others. If P(A | B) = P(A), then A and B are said to be independent. If A and B are independent, then it follows that P(A, B) = P(A) P(B).

An example of independence. Say we add a variable Weather to our dentist model, with four values: sunny, cloudy, rainy and snowy. Weather and teeth don't affect each other, so to compute the probability of a proposition involving weather and teeth we can just multiply the results found in the two separate distributions. Given that Weather is independent of the dental variables, it follows that P(Toothache, Catch, Cavity, Weather) = P(Toothache, Catch, Cavity) P(Weather).

Independence. If we wanted to create a full joint distribution, we could do so by multiplying the appropriate probabilities from the two distributions. Instead of one 4-dimensional table of 2 x 2 x 2 x 4 = 32 entries, we have two tables of 3 and 1 dimensions with a total of 2 x 2 x 2 + 4 = 12 entries. The Weather distribution, for example:

  sunny   .6
  cloudy  .29
  rainy   .1
  snowy   .01

Conditional independence. Absolute independence is powerful, but rare. Can we factor things out further? For instance, in dentistry most variables are related, but not directly. What to do? A very useful concept is conditional independence. For instance, having a cavity changes the probability of a toothache and changes the probability of a catch, but beyond that the two are separate.
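A minimal sketch of this saving in code, reusing the `joint` dictionary from above together with a Weather distribution using the values just listed: 8 + 4 stored numbers suffice to reconstruct any of the 32 entries of the combined joint on demand.

```python
# Marginal distribution over Weather, stored separately (4 entries).
weather = {"sunny": 0.6, "cloudy": 0.29, "rainy": 0.1, "snowy": 0.01}

def joint_with_weather(cavity, toothache, catch, w):
    """P(cavity, toothache, catch, weather) under the independence assumption:
    the 8-entry dental joint times the 4-entry Weather marginal."""
    return joint[(cavity, toothache, catch)] * weather[w]

# One of the 32 entries of the combined joint, from only 12 stored numbers.
print(joint_with_weather(True, True, True, "cloudy"))   # 0.108 * 0.29 = 0.03132
```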

Conditional independence. Knowing the value of Toothache certainly gives us information about the value of Catch, but only indirectly, as they are both linked to the variable Cavity. If I already knew the value of Cavity, then learning about Catch won't tell me anything else about the value of Toothache: Toothache and Catch are conditionally independent given Cavity.

Bayes rule. From the definition of conditional probability it follows that P(a, b) = P(a | b) P(b), and equally P(a, b) = P(b | a) P(a). Therefore P(a | b) P(b) = P(b | a) P(a), and so we obtain Bayes rule: P(a | b) = P(b | a) P(a) / P(b).

A problem. Suppose we have information from previous patients: a table recording, for each patient, their symptoms and whether they had the flu, plus a new patient whose symptoms are known but whose flu status is not. How do we diagnose this person? Which is more likely: is P(flu | symptoms) larger or smaller than P(no flu | symptoms)? We compute both and see which one is more likely. It's hard to estimate P(flu | symptoms) directly: you would have to have information on many groups of people, one for each possible combination of symptoms, and test each person in each group for flu. It's far easier to estimate P(symptoms | flu): you take a sample of only two groups of people, those with flu and those without, and test them for symptoms. Bayes rule lets us get from one to the other.
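A rough sketch of this inversion with a single symptom, using made-up numbers (none of these values appear in the lecture): once P(symptom | flu), the prior P(flu) and the evidence probability P(symptom) are available, Bayes rule gives the diagnostic probability directly.

```python
# Illustrative numbers only; assumed, not taken from the lecture.
p_fever_given_flu = 0.9    # P(fever | flu): easy to estimate from people known to have flu
p_flu = 0.05               # P(flu): prior probability of having the flu
p_fever = 0.20             # P(fever): overall probability of having a fever

# Bayes rule: P(flu | fever) = P(fever | flu) * P(flu) / P(fever)
print(p_fever_given_flu * p_flu / p_fever)   # 0.225
```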

Invert the problem. We need the probability distributions P(symptoms | flu) and P(flu). From Bayes rule, P(flu | symptoms) = P(symptoms | flu) P(flu) / P(symptoms) and P(no flu | symptoms) = P(symptoms | no flu) P(no flu) / P(symptoms). The normalising denominator P(symptoms) is the same on both sides, so it plays no part in the comparison.

Gathering prior probabilities from the training set. From the table of previous patients, 5 of the 8 patients had the flu, so P(flu) = 5/8 and P(no flu) = 3/8.

Gathering conditional probabilities from the training set. The joint distribution over the symptoms given the class has 4 dimensions, i.e. 2 x 2 x 2 x 3 = 24 combinations of symptoms: three of the symptoms have 2 possible values each and one has 3 (no, mild, strong). Each individual symptom, on its own, has only 1 dimension. If the symptoms were independent events given the class, then from independence we would have P(s1, s2, s3, s4 | flu) = P(s1 | flu) P(s2 | flu) P(s3 | flu) P(s4 | flu). Let's just assume the symptoms are independent! This is the Naive Bayes assumption. The per-symptom conditional probabilities are then estimated by counting within the 5 flu patients and the 3 non-flu patients, giving fractions such as 3/5, 2/5, 4/5 and 1/5 given flu, and 1/3 and 2/3 given no flu.
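The sketch below shows that counting process end to end on a small hypothetical training set. The records, symptom names and values are invented for illustration (the slides' table is not legible in the transcription); only the class split of 5 flu to 3 no-flu patients is kept from the lecture.

```python
from collections import Counter, defaultdict

# Hypothetical training records: symptoms -> class label. Illustrative only.
records = [
    ({"chills": "y", "runny_nose": "n", "headache": "mild",   "fever": "y"}, "flu"),
    ({"chills": "y", "runny_nose": "y", "headache": "no",     "fever": "n"}, "flu"),
    ({"chills": "y", "runny_nose": "n", "headache": "strong", "fever": "y"}, "flu"),
    ({"chills": "n", "runny_nose": "y", "headache": "mild",   "fever": "y"}, "flu"),
    ({"chills": "n", "runny_nose": "y", "headache": "strong", "fever": "y"}, "flu"),
    ({"chills": "n", "runny_nose": "n", "headache": "no",     "fever": "n"}, "no flu"),
    ({"chills": "n", "runny_nose": "y", "headache": "no",     "fever": "n"}, "no flu"),
    ({"chills": "y", "runny_nose": "y", "headache": "mild",   "fever": "n"}, "no flu"),
]

# Priors: P(class) = count(class) / number of records, e.g. P(flu) = 5/8.
class_counts = Counter(label for _, label in records)
priors = {c: n / len(records) for c, n in class_counts.items()}

# Conditionals: P(symptom = value | class) = count within class / class count.
value_counts = defaultdict(Counter)
for symptoms, label in records:
    for name, value in symptoms.items():
        value_counts[(label, name)][value] += 1

def p_value_given_class(name, value, label):
    return value_counts[(label, name)][value] / class_counts[label]

def score(symptoms, label):
    """Unnormalised P(label | symptoms): prior times per-symptom conditionals."""
    result = priors[label]
    for name, value in symptoms.items():
        result *= p_value_given_class(name, value, label)
    return result

# Diagnose a new (hypothetical) patient by comparing the two unnormalised scores.
patient = {"chills": "y", "runny_nose": "n", "headache": "mild", "fever": "n"}
scores = {label: score(patient, label) for label in priors}
print(scores, "->", max(scores, key=scores.get))
```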

Naive Bayes Classifier. The probability of flu given the symptoms follows from Bayes rule and from the assumption of independence: P(flu | s1, s2, s3, s4) is proportional to P(flu) P(s1 | flu) P(s2 | flu) P(s3 | flu) P(s4 | flu), with each factor taken from the probability distributions derived from the training data, and P(no flu | s1, s2, s3, s4) is computed in the same way. And so, for the person with the symptoms in question, comparing the two scores, the Naive Bayes classifier deems this person more likely NOT to have the flu!

Summary and reading. Conditional probabilities model changes in uncertainty given some information. Independence simplifies computation. Bayes rule makes it possible to compute P(a | b) from P(b | a). Naive Bayes assumes that the input variables are independent given the class, so that P(x1, ..., xn | c) = P(x1 | c) ... P(xn | c). Reading for this lecture: AIMA Chapter 13, Sections 3-5. Reading for the next lecture: AIMA Chapter 18, Section 3.