Probability, Statistics, and Bayes Theorem Session 3


1 Introduction

Now that we know what Bayes Theorem is, we want to explore some of the ways that it can be used in real-life situations. Often the results are surprising and seem to contradict common sense. Before we turn to these, we'll have a quick review of what Bayes Theorem says.

1.1 Bayes Theorem: Review

Recall the definition of the conditional probability of A given B:

    P(A | B) = P(A ∩ B) / P(B).

On a formal level, Bayes Theorem relates two conditional probabilities:

    P(A | B) = P(B | A) P(A) / P(B).

Let the event A be interpreted as a proposition and let the event B be interpreted as evidence. Then P(A) is our initial belief in the truth of proposition A. This is called the prior probability for A, or just the prior. P(A | B) represents our belief that A is true, taking the evidence B into account. It is called the posterior probability of A given B, or just the posterior. The ratio P(B | A) / P(B) is interpreted as the support that the evidence B provides for A. With these interpretations, Bayes Theorem gives us a way to update our belief in the truth of proposition A based upon the available evidence B: we start with the prior belief in A, then we gather evidence B, and then we form the posterior belief P(A | B).
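This update rule is easy to put into code. Here is a minimal Python sketch (not part of the original notes; the function name and the example numbers are purely illustrative) that computes the posterior from the prior and the two likelihoods, expanding P(B) by the law of total probability:

    def bayes_update(prior, p_b_given_a, p_b_given_not_a):
        """Return P(A | B) from P(A), P(B | A), and P(B | A^c).

        P(B) is expanded by the law of total probability:
        P(B) = P(B | A) P(A) + P(B | A^c) (1 - P(A)).
        """
        p_b = p_b_given_a * prior + p_b_given_not_a * (1 - prior)
        return p_b_given_a * prior / p_b

    # Example: prior belief 0.5, evidence three times as likely when A is true.
    print(bayes_update(0.5, 0.9, 0.3))  # 0.75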

2 Medical Testing

Bayes Theorem can be used effectively to assess the results of any kind of test whose outcome is not completely accurate. These kinds of tests often arise in a medical context. Before we turn to the problem, we have to define formally what we mean when we say that the outcome of a test isn't completely accurate.

2.1 False Positives and False Negatives

We are going to assume that the test has only two outcomes, called Positive and Negative, which we will sometimes shorten to P and N. We are also going to assume that there are only two possible states that we can be in: we either have what the test is checking for, or we don't. We will call these states Sick, if we have it, and Healthy, if we don't; again, we will often shorten these to S and H. There are now four possible combinations of the two states and the two outcomes:

(N, H): The test comes back negative and we do not have the disease.
(P, S): The test comes back positive and we do have the disease.
(P, H): The test says that we have the disease but we don't.
(N, S): The test says that we don't have the disease but we do.

In the first two cases, the test is doing what it is supposed to do: if we're healthy, the test says that we are healthy, while if we are sick then the test confirms this. In the second two cases the test is not working, and it fails in two different ways. Given (P, H), we are healthy but the test comes back positive. This means that we think we have the disease even though we don't, which has widespread ramifications: emotional, physical, and financial. When this happens, we say that the test has produced a false positive, since the outcome of the test is positive but that is not the state that we are in. Similarly, given (N, S), we are sick but the test says that we are healthy. This means that we don't think we have the disease even though we do, which has even more dangerous ramifications, because we might fail to get the treatment we need to regain our health. When this happens, we say that the test has produced a false negative, because the outcome of the test is negative but that is not the state that we are in. To keep the terminology consistent, we refer to (N, H) as a true negative and (P, S) as a true positive. This can all be summarized as follows:

(N, H): True Negative
(P, S): True Positive
(P, H): False Positive
(N, S): False Negative

These values reflect the accuracy of the test, and they need to be taken into account before any decisions are made based on the outcome of the test.

2.2 The Situation

Suppose that you go to the doctor for a check-up. You feel perfectly healthy and look perfectly healthy. When you get to the doctor's office, he tells you that there is a pretty serious disease going around, and it seems that 1 in 5000 people in the Waterloo area have contracted it. A reasonable description of your knowledge would be the prior probabilities

    P(Healthy) = 0.9998,    P(Sick) = 0.0002.

He then tells you that he has a test for the disease, and that the test has about 5% false positives and about 2% false negatives, so that

    P(Positive | Healthy) = 0.05    P(Negative | Healthy) = 0.95
    P(Positive | Sick) = 0.98       P(Negative | Sick) = 0.02

You decide to take the test. A few days later, the doctor calls you with your results.

2.2.1 Exercises

1. How would you feel if the doctor told you the test came back negative?
2. How would you feel if the test came back positive?
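Before answering, it is worth running the numbers. The following minimal Python sketch (not part of the original notes) applies Bayes Theorem with the priors and error rates above:

    # Priors and error rates from Section 2.2.
    p_sick, p_healthy = 0.0002, 0.9998

    # P(Sick | Positive): positive results are dominated by false positives.
    p_pos = 0.98 * p_sick + 0.05 * p_healthy
    print(0.98 * p_sick / p_pos)   # ≈ 0.0039

    # P(Sick | Negative): a negative result makes sickness vanishingly unlikely.
    p_neg = 0.02 * p_sick + 0.95 * p_healthy
    print(0.02 * p_sick / p_neg)   # ≈ 4.2e-06

Even after a positive result, the posterior probability of being sick is under 0.4%, because the disease is so rare that false positives vastly outnumber true positives.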

3 The Monty Hall Problem

This is a famous probabilistic problem whose counterintuitive answer is a well-known application of Bayes Theorem.

3.1 The Problem

Suppose that you are a contestant on a game show. There are three closed doors, numbered 1 to 3. Behind one of the doors is a brand new sports car; behind the other two are goats. You have one chance to pick a door, and if you pick the door with the sports car behind it, you win the sports car. If you pick a door with a goat, you win the goat. Say that you pick door 1 and announce it to the host. Before he opens door 1, he opens door 3 and reveals a goat. The host turns to you and asks if you want to change your choice to door 2.

3.1.1 Exercises

1. Is it to your advantage to switch?

3.2 Discussion and Another Problem

Before the host opens door 3, you figure that you have a 1/3 chance of winning the sports car. When he opens door 3 and shows you a goat, you now know that the sports car is behind either door 1 or door 2. You might think that since there are now only two possibilities (door 1 and door 2), the probability of the car being behind door 1 is 1/2 and the probability that it is behind door 2 is 1/2, so there is no advantage in switching. In essence, once the host opens door 3, you reason that the sample space has shrunk from three options to just two, and since the information provided by opening door 3 doesn't change the location of the car, each option is equally likely.

You're standing there thinking and thinking. The host can tell that you're working through the various possibilities, and he asks you to share your thoughts with him. You explain that you're not going to switch, because you figure that the probabilities of the car being behind door 1 and door 2 are both 1/2 and you'll stick with your gut instinct, which told you to pick door 1. The host listens, nodding as you explain your reasoning. When you're finished, he says: "What if I told you that the probability of the car being behind door 2 is 2/3?"

3.2.1 Exercises

1. Do you believe him?
2. Would you switch now?
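One quick way to test the host's claim, before doing any mathematics, is to simulate the game. A minimal Python sketch (not part of the original notes; the trial count is arbitrary):

    import random

    def play(switch, trials=100_000):
        """Play the game `trials` times; return the fraction of games won."""
        wins = 0
        for _ in range(trials):
            car = random.randint(1, 3)   # door hiding the car
            pick = 1                     # you always pick door 1
            # The host opens a door that is neither your pick nor the car.
            opened = random.choice([d for d in (1, 2, 3)
                                    if d != pick and d != car])
            if switch:
                # Switch to the remaining unopened door.
                pick = next(d for d in (1, 2, 3) if d != pick and d != opened)
            wins += (pick == car)
        return wins / trials

    print(play(switch=False))  # ≈ 1/3
    print(play(switch=True))   # ≈ 2/3

The simulated frequencies come out near 1/3 for staying and 2/3 for switching, which is exactly what the host claimed.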

3.3 The Mathematical Model

Let's turn this into a probabilistic model that we can investigate rigorously. Our sample space is Ω = {1, 2, 3}, these numbers representing the respective doors. There are three random variables involved:

C, the number of the door with the car behind it
S, the number of the door selected by you
H, the number of the door opened by the host

At the start of the game, P(C = c) = 1/3 for all c ∈ Ω, since we are assuming that the car is equally likely to be behind any door. We also know that P(C = c | S = s) = P(C = c), since our choice doesn't affect the location of the car in any way. Finally, we have the following:

    P(H = h | C = c, S = s) =
        0      if h = s
        0      if h = c
        1/2    if h ≠ s, h ≠ c, and s = c
        1      if h ≠ s, h ≠ c, and s ≠ c

These probabilities are all that we have to work with. It turns out that they are all that we need.

3.3.1 Exercises

1. Explain the values of P(H = h | C = c, S = s).
2. Use Bayes Theorem to get an expression for P(C = c | H = h, S = s) in terms of the probabilities above.
3. Compute P(C = 2 | H = 3, S = 1).

4 The Prosecutor's Fallacy

This is an example from a legal setting. It is fictional, but based upon real trials that have affected real lives.
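The exercises can be checked by computing the posterior directly from the model. A minimal Python sketch (not part of the original notes), using exact fractions and taking s = 1 and h = 3 as in the problem:

    from fractions import Fraction

    def p_host(h, c, s):
        """P(H = h | C = c, S = s) from the model above."""
        if h == s or h == c:
            return Fraction(0)
        return Fraction(1, 2) if s == c else Fraction(1)

    s, h = 1, 3
    prior = Fraction(1, 3)  # P(C = c) for every door c
    # Bayes: P(C = c | H = h, S = s) is proportional to
    # P(H = h | C = c, S = s) * P(C = c).
    joint = {c: p_host(h, c, s) * prior for c in (1, 2, 3)}
    total = sum(joint.values())
    for c in (1, 2, 3):
        print(c, joint[c] / total)   # doors 1, 2, 3 -> 1/3, 2/3, 0

This reproduces the host's claim: P(C = 2 | H = 3, S = 1) = 2/3.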

4.1 On Trial

A person is accused of committing a pretty serious crime. They maintain that they are innocent, the victim of bad luck. There is some evidence that seems to link the defendant to the crime, but as with most evidence it is not absolutely conclusive. During the trial, the prosecutor calls to the witness stand an expert in interpreting exactly this kind of evidence. At one point, the prosecutor asks the expert: "What is the probability of finding this evidence, if the defendant was innocent?" The expert replies: "1 in 73 million." Later in the trial, during the final argument, the prosecutor says to the jury: "You've heard an expert witness testify that the probability that the defendant is innocent, given this evidence, is 1 in 73 million."

4.1.1 Exercises

1. If you were the defense attorney, would you object to the prosecutor's last statement?

4.2 Mathematical Formulation

The prosecutor's fallacy arises from the confusion of two different, though related, conditional probabilities. Let E represent the evidence, and I represent the innocence of the defendant. The expert witness testified to the probability of finding the evidence, assuming the innocence of the defendant; in terms of conditional probabilities, this is P(E | I). What the prosecutor said to the jury involved the probability of the defendant being innocent, given the evidence; in terms of conditional probabilities, this is P(I | E). The fallacy in the prosecutor's statement is that, in general, P(E | I) ≠ P(I | E): we know from Bayes Theorem that we can't just switch the order of terms in conditional probabilities. The correct relationship between the two probabilities is instead given by

    P(I | E) = P(E | I) P(I) / [P(E | I) P(I) + P(E | I^c) (1 − P(I))]

Here P(I) is the prior probability of innocence, independent of all evidence, and I^c represents the defendant being guilty.

4.2.1 Exercises

1. What would happen if the evidence were truly a smoking gun, meaning that P(E | I) = 0?
2. What would happen in the opposite case, where the evidence is absolutely untrustworthy, meaning that P(E | I) = 1?
3. What would happen if the evidence didn't really provide any additional insight, meaning that P(E | I) = 1/2?
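A short numerical illustration (not part of the original notes; the prior and P(E | I^c) below are purely hypothetical) shows how far apart the two probabilities can be. Suppose the defendant was, before seeing the evidence, just one of a million plausible suspects, and the evidence is certain to appear if the defendant is guilty:

    def p_innocent_given_e(prior_i, p_e_given_i, p_e_given_guilty):
        """P(I | E) via the formula in Section 4.2."""
        num = p_e_given_i * prior_i
        return num / (num + p_e_given_guilty * (1 - prior_i))

    p_e_i = 1 / 73_000_000   # expert's figure: P(E | I)
    prior = 1 - 1e-6         # hypothetical prior P(I): one suspect in a million
    print(p_innocent_given_e(prior, p_e_i, 1.0))  # ≈ 0.0135

Under these invented assumptions, the posterior probability of innocence is about 1.4%: small, but roughly a million times larger than the 1-in-73-million figure the prosecutor quoted to the jury.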