SDS 321: Introduction to Probability and Statistics


SDS 321: Introduction to Probability and Statistics. Lecture 2: Conditional probability. Purnamrita Sarkar, Department of Statistics and Data Science, The University of Texas at Austin. www.cs.cmu.edu/~psarkar/teaching

Example
Alice and Bob toss two fair coins separately. Denote the event that Alice gets a head by E_A and the event that Bob gets a head by E_B. We know that P(E_A) = P(E_B) = 1/2.

P(E_A ∪ E_B) = P(E_A) + P(E_B) = 1. So it's certain that at least one of them will get a head.

What's wrong with this argument? Both could get tails!

Are E_A and E_B disjoint? Not quite. The sample space you are considering is {HT, TH, TT, HH}. So E_A = {HT, HH} and E_B = {TH, HH}.

So the union is really P(E_A ∪ E_B) = P(E_A) + P(E_B) − P(E_A ∩ E_B) = 3/4.

Maybe what you are really thinking about is independent, not disjoint, events.
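To see the 3/4 concretely, here is a minimal Python simulation of the two tosses; it is a sketch, not part of the original slides, and the variable names are just illustrative:

```python
import random

random.seed(0)
n = 100_000
at_least_one_head = 0  # counts occurrences of E_A ∪ E_B
both_tails = 0         # counts the outcome the faulty argument forgets

for _ in range(n):
    alice_head = random.random() < 0.5  # E_A: Alice's coin shows a head
    bob_head = random.random() < 0.5    # E_B: Bob's coin shows a head
    if alice_head or bob_head:
        at_least_one_head += 1
    else:
        both_tails += 1

print(at_least_one_head / n)  # close to 3/4, not 1
print(both_tails / n)         # close to 1/4: TT really does happen
```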

Partial information
So far, we have assumed we know nothing about the outcome of our experiment, except for the information encoded in the probability law. Sometimes, however, we have partial information that may affect the likelihood of a given event.
The experiment involves rolling a die. You are told that the number is odd.
The experiment involves the weather tomorrow. You know that the weather today is rainy.
The experiment involves the presence or absence of a disease. A blood test comes back positive.
In each case, knowing about some event B (e.g. "it is raining today") changes our beliefs about event A ("Will it rain tomorrow?"). We want to update our probability law to incorporate this new knowledge.

Conditional Probability
Original problem: What is the probability of some event A? E.g., what is the probability that we roll a number less than 4? In other words, what is P(A)? This is given by our probability law.
New problem: Assuming event B (equivalently, given event B), what is the probability of event A? E.g., given that the number rolled is odd, what is the probability that it is less than 4?
We call this the conditional distribution of A given B. We write this as P(A | B). Read "|" as "given" or "conditioned on the fact that".
Our conditional probability is still describing the probability of something, so we expect it to behave like a probability distribution.

Conditional Probability
Consider rolling a fair 6-sided die (uniform, discrete probability distribution). Let A be the event "outcome is equal to 1". What is P(A)?
Let's now assume that the number rolled is odd. What is the set, B, that we are conditioning on? What do you think P(A | B) should be?

Conditional Probability
You are conditioning on event B. So your new sample space is {1, 3, 5} instead of {1, 2, 3, 4, 5, 6}. Each of these outcomes is equally likely, so P(A | B) = 1/3.
Formally, if all outcomes are equally likely, we have
P(A | B) = (# elements in A ∩ B) / (# elements in B).
More generally,
P(A | B) = P(A ∩ B) / P(B).
A conditional probability is only defined if P(B) > 0.
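Both forms of the formula are easy to check mechanically on the die example. A small Python sketch (the set and function names are illustrative, not from the slides):

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}  # fair six-sided die, all outcomes equally likely
A = {1}                     # "outcome is equal to 1"
B = {1, 3, 5}               # "outcome is odd"

def p(event):
    # Discrete uniform law: P(E) = |E| / |Omega|
    return Fraction(len(event), len(omega))

# Counting form: P(A | B) = |A ∩ B| / |B|
print(Fraction(len(A & B), len(B)))  # 1/3

# General form: P(A | B) = P(A ∩ B) / P(B)
print(p(A & B) / p(B))               # 1/3 again
```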

Conditional Probability Axioms
Nonnegativity: Check.
Normalization: Your new universe is now B, and we know that P(B | B) = 1.
Additivity: P(A_1 ∪ A_2 | B) = P(A_1 | B) + P(A_2 | B) for two disjoint sets A_1 and A_2.
(Figure: disjoint events A_1 and A_2 inside Ω; conditioning on B replaces them with A_1 ∩ B and A_2 ∩ B inside the new universe B.)
Since A_1 ∩ B and A_2 ∩ B are also disjoint, additivity of the original law gives P((A_1 ∪ A_2) ∩ B) = P(A_1 ∩ B) + P(A_2 ∩ B), so
P(A_1 ∪ A_2 | B) = [P(A_1 ∩ B) + P(A_2 ∩ B)] / P(B) = P(A_1 | B) + P(A_2 | B).
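These axioms can be verified directly on the die example; a short sketch assuming the same uniform law as above:

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
B = {1, 3, 5}

def p(event):
    return Fraction(len(event), len(omega))

def p_given(event, given):
    # P(A | B) = P(A ∩ B) / P(B), defined only when P(B) > 0
    assert p(given) > 0
    return p(event & given) / p(given)

# Nonnegativity: ratios of counts are >= 0 by construction.
# Normalization: the new universe B gets probability 1.
assert p_given(B, B) == 1

# Additivity for two disjoint sets A1 and A2:
A1, A2 = {1}, {3, 4}
assert A1.isdisjoint(A2)
assert p_given(A1 | A2, B) == p_given(A1, B) + p_given(A2, B)
```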

Properties of conditional probability
If P(B) > 0:
If A_1, ..., A_n are all pairwise disjoint, then P(A_1 ∪ ... ∪ A_n | B) = P(A_1 | B) + ... + P(A_n | B).
If A_1 ⊆ A_2, then P(A_1 | B) ≤ P(A_2 | B).
P(A_1 ∪ A_2 | B) = P(A_1 | B) + P(A_2 | B) − P(A_1 ∩ A_2 | B).
Union bound: P(A_1 ∪ A_2 | B) ≤ P(A_1 | B) + P(A_2 | B), and more generally P(A_1 ∪ ... ∪ A_n | B) ≤ P(A_1 | B) + ... + P(A_n | B).
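A quick numeric check of inclusion-exclusion, the union bound, and monotonicity under conditioning; this sketch reuses the hypothetical helpers from the previous one:

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
B = {1, 3, 5}

def p(event):
    return Fraction(len(event), len(omega))

def p_given(event, given):
    return p(event & given) / p(given)

A1, A2 = {1, 3}, {3, 5}  # overlapping events, so inclusion-exclusion matters

# Inclusion-exclusion under conditioning:
lhs = p_given(A1 | A2, B)
assert lhs == p_given(A1, B) + p_given(A2, B) - p_given(A1 & A2, B)

# Union bound: dropping the intersection term can only overestimate.
assert lhs <= p_given(A1, B) + p_given(A2, B)

# Monotonicity: A1 ⊆ A1 ∪ A2 implies P(A1 | B) <= P(A1 ∪ A2 | B).
assert p_given(A1, B) <= lhs
```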

Example: Coin toss
Consider the experiment of tossing a fair coin three times. What is the probability of getting alternating heads and tails, conditioned on the event that your first toss gives a head?
Notation: Let A := {tosses yield alternating tails and heads} and B := {the first toss is a head}. We want P(A | B).
What if you wanted to follow the formula? The sample space for three coin tosses is {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}. A = {HTH, THT} and A ∩ B = {HTH}. We have P(B) = 4/8 and P(A ∩ B) = 1/8, so P(A | B) = 1/4.
Alternatively: your new world/sample space when you condition is B = {HHT, HTH, HTT, HHH}. Each of these is equally likely, and exactly one of them (HTH) has alternating heads and tails. So, again, P(A | B) = 1/4.
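Since the sample space has only eight outcomes, full enumeration verifies both routes to 1/4. A minimal sketch (the set names mirror the slide; the set-builder conditions are just one way to encode the events):

```python
from fractions import Fraction
from itertools import product

# All 8 equally likely outcomes of three fair coin tosses.
outcomes = {''.join(t) for t in product('HT', repeat=3)}

# A: consecutive tosses always differ (alternating heads and tails).
A = {s for s in outcomes if all(a != b for a, b in zip(s, s[1:]))}
# B: the first toss is a head.
B = {s for s in outcomes if s[0] == 'H'}

p = lambda event: Fraction(len(event), len(outcomes))

print(sorted(A))        # ['HTH', 'THT']
print(p(A & B) / p(B))  # 1/4, via the formula

# Restricted-universe view: count alternating outcomes inside B.
print(Fraction(len(A & B), len(B)))  # 1/4 again
```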

Homework
Read Sections 1.1, 1.2, and 1.3 of Bertsekas and Tsitsiklis. The first homework will be posted online today. It is due next Thursday by 5pm via Canvas.