Probability, Random Processes and Inference


INSTITUTO POLITÉCNICO NACIONAL CENTRO DE INVESTIGACION EN COMPUTACION Laboratorio de Ciberseguridad Probability, Random Processes and Inference Dr. Ponciano Jorge Escamilla Ambrosio pescamilla@cic.ipn.mx http://www.cic.ipn.mx/~pescamilla/

Probability, Random Processes and Inference CIC Instructor Dr. Ponciano Jorge Escamilla Ambrosio pescamilla@cic.ipn.mx http://www.cic.ipn.mx/~pescamilla/ Class meetings Mondays and Wednesdays, 12:00 to 14:00 hrs. Classroom Aula A3 2

Course web site Course web site: http://www.cic.ipn.mx/~pescamilla/academy.html Reader material and homework exercises, etc. 3

Course Objective The student will learn the fundamentals of probability theory: probabilistic models, discrete and continuous random variables, multiple random variables and limit theorems as well as an introduction to more advanced topics such as random processes and statistical inference. At the end of the course the student will be able to develop and analyse probabilistic models in a manner that combines intuitive understanding and mathematical precision. 4

Course content 1. Probability 1.1. What is Probability? 1.1.1. Statistical Probability 1.1.2. Probability as a Measure of Uncertainty 1.2. Sample Space and Probability 1.2.1. Probabilistic Models 1.2.2. Conditional Probability 1.2.3. Total Probability Theorem and Bayes Rule 1.2.4. Independence 1.2.5. Counting 1.2.6. The Probabilistic Method 5

Course content 1.3. Discrete Random Variables 1.3.1. Basic Concepts 1.3.2. Probability Mass Functions 1.3.3. Functions of Random Variables 1.3.4. Expectation and Variance 1.3.5. Joint PMFs of Multiple Random Variables 1.3.6. Conditioning 1.3.7. Independence 6

Course content 1.4. General Random Variables 1.4.1. Continuous Random Variables and PDFs 1.4.2. Cumulative Distribution Function 1.4.3. Normal Random Variables 1.4.4. Joint PDFs of Multiple Random Variables 1.4.5. Conditioning 1.4.6. The Continuous Bayes Rule 1.4.7. The Strong Law of Large Numbers 7

Course content 2. Introduction to Random Processes 2.1. Markov Chains 2.1.1. Discrete Time Markov Chains 2.1.2. Classification of States 2.1.3. Steady State Behavior 2.1.4. Absorption Probabilities and Expected Time to Absorption 2.1.5. Continuous Time Markov Chains 2.1.6. Ergodic Theorem for Discrete Markov Chains 2.1.7. Markov Chain Monte Carlo Method 2.1.8. Queueing Theory 8

Course content 3. Statistics 3.2. Classical Statistical Inference 3.2.1. Classical Parameter Estimation 3.2.2. Linear Regression 3.2.3. Analysis of Variance and Regression 3.2.4. Binary Hypothesis Testing 3.2.5. Significance Testing 9

Course text books Dimitri P. Bertsekas and John N. Tsitsiklis. Introduction to Probability, 2nd Edition, Athena Scientific, 2008. http://athenasc.com/probbook.html Joseph Blitzstein, Jessica Hwang. Introduction to Probability, CRC Press, 2014. https://www.crcpress.com/introduction-to-probability/blitzstein-Hwang/9781466575578 10

Course text books William Feller. An introduction to probability theory and its applications, Vol. 1, 3rd Edition, Wiley, 1968. http://www.wiley.com/wileycda/wileytitle/productcd-0471257087.html Géza Schay, Introduction to probability with statistical applications, Birkhauser, Boston, 2007. http://link.springer.com/book/10.1007/978-0-8176-4591-5 11

Grading Midterm exam 15% Final exam 15% Homework assignments 20% One written departmental exam 50% 12

Course Schedule A-17 http://www.cic.ipn.mx/~pescamilla/academy.html 13

Probability 1. What is Probability? 1.1.1. Statistical Probability 1.1.2. Probability as a Measure of Uncertainty 14

What is Probability? 15

What is Probability? In everyday speech, people try to use the concept of probability to discuss uncertain situations: luck, coincidence, randomness, uncertainty, risk, doubt, fortune, chance. These terms are used in a vague, casual way! A first approach to defining probability is in terms of frequency of occurrence, as a percentage of successes. 16

What is Probability? For example, if we toss a coin, and observe whether it lands head (H) or tail (T) up What is the probability of either result? Why? 17

What is Probability? P(A) = (# favorable outcomes) / (# possible outcomes). Example: flip a coin twice. 18
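A minimal sketch in Python (added for illustration; the event "at least one head" is a hypothetical choice, not taken from the slide) showing this counting definition on the two-coin-flip example:

    from itertools import product

    # Sample space of two coin flips: HH, HT, TH, TT
    sample_space = list(product("HT", repeat=2))

    # Illustrative event A: at least one head
    A = [outcome for outcome in sample_space if "H" in outcome]

    # Counting definition: favorable outcomes / possible outcomes
    p_A = len(A) / len(sample_space)
    print(len(sample_space))   # 4
    print(p_A)                 # 0.75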

Sample space Definition 1 (Sample space and event). The sample space S of an experiment is the set of all possible outcomes of the experiment. An event A is a subset of the sample space S, and we say that A occurred if the actual outcome is in A. 19

Sample space Example: the experiment of tossing a coin twice. 20

What is Probability? Probability is a logical framework for quantifying uncertainty and randomness [Blitzstein and Hwang, 2014]. Probability theory is a branch of mathematics that deals with repetitive events whose occurrence or nonoccurrence is subject to chance variation [Schay, 2007]. 21

What is Probability? Provides tools for understanding and explaining variation, separating signal from noise, and modeling complex phenomena. (engineer definition) 22

What is Probability? There are situations where the frequency interpretation is not appropriate. Example: a scholar asserts that the Iliad and the Odyssey were composed by the same person, with probability 90%. This is based on the scholar's subjective belief. 23

What is Probability? The theory of probability is useful in a broad variety of contexts and applications: Statistics, Physics, Biology, Computer Science, Meteorology, Gambling, Finance, Political Science, Medicine, Life. Assignment 1a: Give an example of the application of probability theory in each area Assignment 1b: Read math review: http://projects.iq.harvard.edu/files/stat110/files/math_rev iew_handout.pdf 24

Probabilistic Model 25

Elements of a Probabilistic Model The sample space S, which is the set of all possible outcomes of an experiment. The probability law, which assigns to a set A of possible outcomes (also called an event) a nonnegative number P(A) (called the probability of A) that encodes our knowledge or belief about the collective likelihood of the elements of A. The probability law must satisfy certain properties. 26

Experiments and events The experiment will produce exactly one out of several possible outcomes. A subset of the sample space, that is, a collection of possible outcomes, is called an event. This means that any collection of possible outcomes, including the entire sample space S and its complement, the empty set, may qualify as an event. Strictly speaking, however, some sets have to be excluded. In particular, when dealing with probabilistic models involving an uncountably infinite sample space, there are certain unusual subsets with which one cannot associate meaningful probabilities. 27

Experiments and events There is no restriction on what constitutes an experiment. The events to be considered can be described by such statements as "a toss of a given coin results in head", "a card drawn at random from a regular 52-card deck is an Ace", or "this book is green". Associated with each statement there is a set S of possibilities, or possible outcomes. 28

Experiments and events Examples of experiments and events: Tossing a Coin. For a coin toss, S may be taken to consist of two possible outcomes, which we may abbreviate as H and T for head and tail. We say that H and T are the members, elements or points of S, and write S = {H, T}. Tossing Two Coins. In this case S = {HH, HT, TH, TT}. Here, for instance, the outcome "the first coin shows H" is represented by the set {HH, HT}, that is, this statement is true if we obtain HH or HT and false if we obtain TH or TT. 29

Experiments and events Tossing a Coin Until an H is Obtained. If we toss a coin until an H is obtained, we cannot say in advance how many tosses will be required, and so the natural sample space is S = {H, TH, TTH, TTTH,... }, an infinite set. We can use, of course, many other sample spaces as well, for instance, we may be interested only in whether we had to toss the coin more than twice or not, in which case S = {1 or 2, more than 2} is adequate. Selecting a Number from an Interval. Sometimes, we need an uncountable set for a sample space. For instance, if the experiment consists of choosing a random number between 0 and 1, we may use S = {x : 0 < x < 1}. 30

The probability law Specifies the likelihood of any outcome, or of any set of possible outcomes. Assigns to every event A, a number P(A), called the probability of A. 31

Probability Space [Schay 2007] Given a sample space S and a certain collection F of its subsets, called events, an assignment P of a number P(A) to each event A in F is called a probability measure, and P(A) the probability of A, if P has the following properties: 1. P(A) ≥ 0 for every A, 2. P(S) = 1, and 3. P(A_1 ∪ A_2 ∪ ⋯) = P(A_1) + P(A_2) + ⋯ for any finite or countably infinite set of mutually exclusive events A_1, A_2, ... Then, the sample space S together with F and P is called a probability space. 32

Probability Axioms [Bertsekas and Tsitsiklis, 2008] Normalization axiom: P(S) = 1. 33

Probability Space [Blitzstein and Hwang, 2015] Definition 1.6.1 (General definition of probability). A probability space consists of a sample space S and a probability function P which takes an event A ⊆ S as input and returns P(A), a real number between 0 and 1, as output. The function P must satisfy the following axioms: 1. P(∅) = 0, P(S) = 1. 2. If A_1, A_2, ... are disjoint events, then P(A_1 ∪ A_2 ∪ ⋯) = P(A_1) + P(A_2) + ⋯ (Saying that these events are disjoint means that they are mutually exclusive: A_i ∩ A_j = ∅ for i ≠ j.) 34

Properties of probabilities The Probability of the Empty Set Is 0. In any probability space, P(∅) = 0. Proof: 1 = P(S) = P(S ∪ ∅) = P(S) + P(∅) = 1 + P(∅), so P(∅) = 0. 35

Properties of probabilities The Probability of the Union of Two Events. For any two events A and B, P(A ∪ B) = P(A) + P(B) − P(A ∩ B). Proof: write A ∪ B as the union of the disjoint events A and B ∩ A^c, and B as the union of the disjoint events A ∩ B and B ∩ A^c, then apply the additivity axiom and combine. 36
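A small added check in Python (an illustration only; the events A = "even outcome" and B = "outcome at least 4" are hypothetical choices for a fair die):

    die = {1, 2, 3, 4, 5, 6}                 # equally likely outcomes
    A = {x for x in die if x % 2 == 0}       # even: {2, 4, 6}
    B = {x for x in die if x >= 4}           # at least 4: {4, 5, 6}

    def p(event):
        """Uniform probability law on the die outcomes."""
        return len(event) / len(die)

    lhs = p(A | B)                           # P(A ∪ B) = 4/6
    rhs = p(A) + p(B) - p(A & B)             # 1/2 + 1/2 - 2/6
    print(lhs, rhs)                          # both approximately 0.6667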

Properties of probabilities Probability of Complements. For any event A, P(A^c) = 1 − P(A). Proof: A^c ∩ A = ∅ and A^c ∪ A = S by the definition of A^c. Thus, by Axiom 3, P(S) = P(A^c ∪ A) = P(A^c) + P(A). Now, Axiom 2 says that P(S) = 1, and so, comparing these two values of P(S), we obtain P(A^c) + P(A) = 1. 37

Properties of probabilities Probability of Subsets. If A ⊆ B, then P(A) ≤ P(B). Proof: If A ⊆ B, then we can write B as the union of A and B ∩ A^c, where B ∩ A^c is the part of B not also in A. Since A and B ∩ A^c are disjoint, we can apply the second axiom: P(B) = P(A ∪ (B ∩ A^c)) = P(A) + P(B ∩ A^c). Probability is nonnegative, so P(B ∩ A^c) ≥ 0, proving that P(B) ≥ P(A). 38

Properties of probabilities Inclusion-exclusion. For any events A_1, ..., A_n, P(A_1 ∪ ⋯ ∪ A_n) = Σ_i P(A_i) − Σ_{i<j} P(A_i ∩ A_j) + Σ_{i<j<k} P(A_i ∩ A_j ∩ A_k) − ⋯ + (−1)^{n+1} P(A_1 ∩ ⋯ ∩ A_n). 39

Properties of probabilities Example: 40

Properties of Probability Laws 41

Discrete Probability Law 42

Discrete Uniform Probability Law In the special case where the probabilities P(s_1), ..., P(s_n) are all the same, by necessity equal to 1/n in view of the normalization axiom, we obtain: P(A) = (number of elements of A) / n. 43

Discrete Uniform Probability Law 44

Discrete Uniform Probability Law 45

Counting The calculation of probabilities often involves counting the number of outcomes in various events. When the sample space S has a finite number of equally likely outcomes, the discrete uniform probability law applies, and the probability of any event A is given by: P(A) = (number of elements of A) / (number of elements of S) = k/n. When we want to calculate the probability of an event A with a finite number of equally likely outcomes, each of which has an already known probability p, the probability of A is given by: P(A) = p · (number of elements of A). 46

Basic Counting Principle In how many ways can you dress today if you find 4 shirts, 3 ties and 2 jackets in your closet? 47

The Multiplication Principle Consider a process that consists of r stages. Suppose that: a) There are n_1 possible results at the first stage. b) For every possible result at the first stage, there are n_2 possible results at the second stage. c) More generally, for any sequence of possible results at the first i − 1 stages, there are n_i possible results at the ith stage. Then, the total number of possible results of the r-stage process is: n_1 · n_2 ⋯ n_r. 48
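A minimal sketch (for illustration) enumerating the outfits from the closet example with Python's itertools.product, confirming 4 · 3 · 2 = 24:

    from itertools import product

    shirts  = [f"shirt{i}"  for i in range(1, 5)]   # 4 shirts
    ties    = [f"tie{i}"    for i in range(1, 4)]   # 3 ties
    jackets = [f"jacket{i}" for i in range(1, 3)]   # 2 jackets

    # Each outfit is one result of a 3-stage process
    outfits = list(product(shirts, ties, jackets))
    print(len(outfits))    # 24
    print(4 * 3 * 2)       # 24, by the multiplication principle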

The Multiplication Principle 49

The Multiplication Principle Example 1. The number of telephone numbers. A local telephone company number is a 7-digit sequence, but the first digit has to be different from 0 or 1. How many distinct telephone numbers are there? 50
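By the multiplication principle, the first digit can be any of the 8 values 2 through 9, and each of the remaining six digits can be any of 10 values, so there are 8 · 10^6 = 8,000,000 distinct telephone numbers.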

The Multiplication Principle Example 2. The number of subsets of an n-element set. Consider an n-element set {s_1, s_2, ..., s_n}. How many subsets does it have, including itself and the empty set? For example, how many subsets does the set {1, 2, 3} have? 51

The Multiplication Principle This is a sequential process where we take in turn each of the n elements and decide whether to include it in the desired subset or not. Thus we have n steps, and at each step two choices, namely yes or no to the question of whether the element belongs to the desired subset. Therefore the number of subsets is 2 · 2 ⋯ 2 = 2^n. (For n = 1, the two subsets are ∅ and {s_1}.) 52

Number of subsets Example 3. Drawing three cards. What is the number of ways three cards can be drawn one after the other from a regular 52-card deck without replacement? n_1 = 52, n_2 = 51, n_3 = 50, giving 52 · 51 · 50 ways. What is this number if we replace each card before the next one is drawn? n_1 = n_2 = n_3 = 52, giving 52^3 ways. 53

Permutation and Combination Both involve the selection of k objects out of a collection of n objects. If the order of selection matters, the selection is called a permutation. If the order of selection does not matter, the selection is called a combination. 54

Permutation k-permutations. Assume there are n distinct objects, and let k be some positive integer with k ≤ n. We want to count the number of different ways that we can pick k out of these n objects and arrange them in a sequence, i.e., the number of distinct k-object sequences. 55

Permutation In place 1 we can put any of the n objects, which we can write as n = n − 1 + 1; in place 2 we can put n − 1 = n − 2 + 1 objects; and so on. Thus the kth factor will be n − k + 1, and so, for any two positive integers n and k ≤ n: n(n − 1)(n − 2) ⋯ (n − k + 1) = P_{n,k}. In the special case where k = n: n(n − 1)(n − 2) ⋯ 3 · 2 · 1 = n!, and the possible sequences are simply called permutations. 56

Permutation From the definitions of n!, (n − k)! and P_{n,k} we can obtain the following relation: n! = [n(n − 1)(n − 2) ⋯ (n − k + 1)][(n − k)(n − k − 1) ⋯ 2 · 1] = P_{n,k} (n − k)!, and so: P_{n,k} = n! / (n − k)!, with 0! = 1. 57
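As an illustrative sketch (not part of the slides), the formula P_{n,k} = n!/(n − k)! can be checked in Python against the standard library's math.perm:

    import math

    def k_permutations(n: int, k: int) -> int:
        """Number of ordered selections of k objects out of n: n!/(n-k)!."""
        return math.factorial(n) // math.factorial(n - k)

    print(k_permutations(52, 3))   # 132600 = 52 * 51 * 50
    print(math.perm(52, 3))        # same value via the standard library
    print(k_permutations(6, 6))    # 720 = 6!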

Probability calculation Example 4. Six rolls of a die. Find the probability that six rolls of a (six-sided) die all give different numbers. Assume all outcomes are equally likely. P(all six rolls give different numbers) = ? Recall: P(A) = (number of elements of A) / (number of elements of S) = k/n, and P(A) = p · (number of elements of A), where p is the probability of each equally likely outcome in A. 58

Probability calculation Example 4 (continued). With A = {all six rolls give different numbers}: P(A) = (number of elements of A) / (number of elements of S) = P_{6,6} / (number of elements of S) = 6! / 6^6. Equivalently, P(A) = p · (number of elements of A) = (1/6^6) · 6!, where p = 1/6^6 is the probability of each equally likely outcome. 59
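A quick numerical check (a sketch, not on the slides): compute 6!/6^6 directly and confirm it by enumerating all 6^6 equally likely outcomes:

    import math
    from itertools import product

    # Direct formula: the favorable outcomes are the 6! orderings of {1,...,6}
    p_formula = math.factorial(6) / 6**6

    # Brute force over the 46656 equally likely outcomes
    outcomes = list(product(range(1, 7), repeat=6))
    favorable = sum(1 for roll in outcomes if len(set(roll)) == 6)
    p_enum = favorable / len(outcomes)

    print(p_formula)   # 0.015432098765432098
    print(p_enum)      # same value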

Permutation Example 5. Dealing Three Cards. In how many ways can three cards be dealt from a regular deck of 52 cards? 60

Permutation Example 5. Dealing Three Cards. In how many ways can three cards be dealt from a regular deck of 52 cards? P_{52,3} = n! / (n − k)! = 52 · 51 · 50 = 132,600. 61

Permutation Example 6. Birthday problem. There are k people in a room. Assume each person s birthday is equally likely to be any of the 365 days of the year (we exclude February 29), and that people s birthdays are independent (we assume there are no twins in the room). What is the probability that two or more people in the group have the same birthday? 62

Permutation This amounts to sampling the 365 days of the year without replacement, so the number of ways to assign distinct birthdays is: 365 · 364 · 363 ⋯ (365 − k + 1), for k ≤ 365. Therefore the probability of no birthday match in a group of k people is: P(no match) = 365 · 364 ⋯ (365 − k + 1) / 365^k, and the probability of at least one birthday match is: P(at least one match) = 1 − P(no match). 63

Permutation Probability that in a room of k people, at least two were born on the same day. This probability first exceeds 0.5 when k = 23. 64
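A minimal sketch (added for illustration) computing this curve and locating the 0.5 crossing:

    def p_birthday_match(k: int) -> float:
        """Probability that at least two of k people share a birthday (365-day year)."""
        p_no_match = 1.0
        for i in range(k):
            p_no_match *= (365 - i) / 365
        return 1.0 - p_no_match

    print(round(p_birthday_match(22), 4))   # 0.4757
    print(round(p_birthday_match(23), 4))   # 0.5073, first value above 0.5
    first_k = next(k for k in range(1, 366) if p_birthday_match(k) > 0.5)
    print(first_k)                          # 23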

Combinations The number of possible unordered selections of k different things out of n different ones is denoted by C_{n,k}, and each such selection is called a combination of the given things. If we select k things out of n without regard to order, this can be done in C_{n,k} ways. In each case we have k things, which can be ordered in k! ways. Thus, by the multiplication principle, the number of ordered selections is C_{n,k} · k!. On the other hand, this number is, by definition, P_{n,k}. Therefore C_{n,k} · k! = P_{n,k}, and so: C_{n,k} = P_{n,k} / k! = n! / (k! (n − k)!). 65

Combinations The quantity on the right-hand side is usually abbreviated as (n choose k), and is called a binomial coefficient. Thus, for any positive integer n and k = 1, 2, ..., n: C_{n,k} = (n choose k) = n(n − 1)(n − 2) ⋯ (n − k + 1) / k! = n! / (k! (n − k)!), using n! = [n(n − 1)(n − 2) ⋯ (n − k + 1)][(n − k)(n − k − 1) ⋯ 2 · 1]. 66

Combinations 67

Binomial probabilities Binomial coefficient (n choose k). Binomial probabilities: n ≥ 1 independent coin tosses with P(H) = p; P(k heads) = ? Example: P(HTTTHH) = ? P(particular sequence) = ? P(particular k-head sequence) = ? 68
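A sketch added for illustration (the answers themselves are standard): any particular sequence with k heads in n tosses has probability p^k (1 − p)^(n − k), and there are (n choose k) such sequences, so P(k heads) = (n choose k) p^k (1 − p)^(n − k). The following Python check compares the formula with brute-force enumeration for a small n:

    import math
    from itertools import product

    def binom_pmf(n: int, k: int, p: float) -> float:
        """P(k heads in n independent tosses with P(H) = p)."""
        return math.comb(n, k) * p**k * (1 - p)**(n - k)

    n, p = 6, 0.3
    # Brute force: sum the probabilities of all sequences with exactly k heads
    for k in range(n + 1):
        total = sum(
            p**seq.count("H") * (1 - p)**seq.count("T")
            for seq in product("HT", repeat=n)
            if seq.count("H") == k
        )
        assert abs(total - binom_pmf(n, k, p)) < 1e-12
    print(binom_pmf(6, 3, 0.5))   # P(3 heads in 6 fair tosses) = 0.3125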

Partitions A combination can be seen as a partition of the set in two: one part contains k elements and the other contains the remaining n − k elements. Given an n-element set and nonnegative integers n_1, n_2, ..., n_r whose sum is equal to n, consider partitions of the set into r disjoint subsets, with the ith subset containing exactly n_i elements. In how many ways can this be done? 69

Partitions There are (n choose n_1) ways of forming the first subset. Having formed the first subset, there are n − n_1 elements left. We need to choose n_2 of them in order to form the second subset, and have (n − n_1 choose n_2) choices, and so on. Thus, using the Counting Principle, the number of partitions is: (n choose n_1)(n − n_1 choose n_2) ⋯ (n − n_1 − ⋯ − n_{r−1} choose n_r). 70

Partitions As several terms cancel, the result is n! / (n_1! n_2! ⋯ n_r!). This is called the multinomial coefficient and is usually denoted by (n choose n_1, n_2, ..., n_r). 71

Partitions 72

Partitions Example 7. Each person gets an ace. There is a 52- card deck, dealt (fairly) to four players. What is the probability of each player getting an ace? 73

Partitions Example 7. Each person gets an ace. There is a 52-card deck, dealt (fairly) to four players. What is the probability of each player getting an ace? The size of the sample space is the multinomial coefficient: 52! / (13! 13! 13! 13!). Constructing an outcome with one ace for each person: o number of different ways of distributing the 4 aces to the 4 players: 4!; o number of ways of distributing the remaining 48 cards, 12 to each player: 48! / (12! 12! 12! 12!). 74
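Putting the pieces together (a sketch added here, not on the slide): the desired probability is 4! · 48!/(12!)^4 divided by 52!/(13!)^4, which can be evaluated exactly with Python's fractions module:

    from fractions import Fraction
    from math import factorial

    sample_space = Fraction(factorial(52), factorial(13)**4)        # 52!/(13!)^4
    favorable = factorial(4) * Fraction(factorial(48), factorial(12)**4)

    p = favorable / sample_space
    print(p)           # 2197/20825
    print(float(p))    # about 0.1055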

Summary of Counting Results 75

Conditional Probability Conditional probability provides us with a way to reason about the outcome of an experiment, based on partial information. Examples: A) In an experiment involving two successive rolls of a die, you are told that the sum of the two rolls is 9. How likely is it that the first roll was a 6? B) In a word guessing game, the first letter of the word is a "t". What is the likelihood that the second letter is an "h"? 76

Conditional Probability C) How likely is it that a person has a certain disease given that a medical test was negative? D) A spot shows up on a radar screen. How likely is it to correspond to an aircraft? 77

Conditional Probability Given: an experiment, a corresponding sample space, and a probability law. Suppose we know that the outcome is within some given event B. We want to quantify the likelihood that the outcome also belongs to some other given event A. 78

Conditional Probability Construct a new probability law that takes into account the available knowledge: a probability law that, for any event A, specifies the conditional probability of A given B, P(A | B). The conditional probabilities P(A | B) of different events A should satisfy the probability axioms. 79

Conditional Probability Example: Suppose that all six possible outcomes of a fair die roll are equally likely. If the outcome is even, then there are only three possible outcomes: 2, 4 and 6. What is the probability of the outcome being 6 given that the outcome is even? 80

Conditional Probability If all possible outcomes are equally likely: P(A | B) = (number of elements of A ∩ B) / (number of elements of B). Conditional probability definition: P(A | B) = P(A ∩ B) / P(B), with P(B) > 0. Out of the total probability of the elements of B, P(A | B) is the fraction that is assigned to possible outcomes that also belong to A. 81
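A small sketch (added for illustration) for the fair-die example above, using the counting form of the definition:

    die = {1, 2, 3, 4, 5, 6}            # equally likely outcomes
    B = {x for x in die if x % 2 == 0}  # conditioning event: outcome is even
    A = {6}                             # event of interest

    # P(A | B) = |A ∩ B| / |B| when all outcomes are equally likely
    print(len(A & B) / len(B))          # 0.3333..., i.e. 1/3

    # Equivalent form: P(A | B) = P(A ∩ B) / P(B)
    def p(event):
        """Uniform probability law on the die outcomes."""
        return len(event) / len(die)

    print(p(A & B) / p(B))              # same value, 1/3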

Conditional Probability The probability law formed by the conditional probabilities satisfies the three axioms: 1. P(A | B) ≥ 0 for every event A, 2. P(S | B) = 1, 3. P(A_1 ∪ A_2 ∪ ⋯ | B) = P(A_1 | B) + P(A_2 | B) + ⋯ for any finite or countably infinite number of mutually exclusive events A_1, A_2, .... 82

Conditional Probability Proofs: 1. In the definition of P(A | B) the numerator is nonnegative by Axiom 1, and the denominator is positive by assumption. Thus, the fraction is nonnegative. 2. Taking A = S in the definition of P(A | B), we get: P(S | B) = P(S ∩ B) / P(B) = P(B) / P(B) = 1. 83

Conditional Probability 3. 84

Conditional Probability Knowledge that event B has occurred implies that the outcome of the experiment is in the set B. In computing P(A | B) we can therefore view the experiment as now having the reduced sample space B. The event A occurs in the reduced sample space if and only if the outcome ζ is in A ∩ B. The equation simply renormalizes the probability of events that occur jointly with B. 85

Conditional Probability Suppose that we learn that B occurred. Upon obtaining this information, we get rid of all the pebbles in B^c because they are incompatible with the knowledge that B has occurred. Then P(A ∩ B) is the total mass of the pebbles remaining in A. Finally, we renormalize, that is, divide all the masses by a constant so that the new total mass of the remaining pebbles is 1. This is achieved by dividing by P(B), the total mass of the pebbles in B. The updated mass of the outcomes corresponding to event A is the conditional probability P(A | B) = P(A ∩ B)/P(B). 86

Conditional Probability If we interpret probability as relative frequency: P(A | B) should be the relative frequency of the event A ∩ B in experiments where B occurred. Suppose that the experiment is performed n times, and suppose that event B occurs n_B times, and that event A ∩ B occurs n_{A∩B} times. The relative frequency of interest is then: n_{A∩B} / n_B = (n_{A∩B}/n) / (n_B/n), which tends to P(A ∩ B) / P(B), where we have implicitly assumed that P(B) > 0. 87

Conditional Probability Example 1. Given the figure below, obtain P(A | B). 88

Conditional Probability Example 2. A ball is selected from an urn containing two black balls, numbered 1 and 2, and two white balls, numbered 3 and 4. The number and color of the ball is noted, so the sample space is {(1,b), (2,b), (3,w), (4,w)}. Assuming that the four outcomes are equally likely, find P(A | B) and P(A | C), where A, B, and C are the following events: 89

Conditional Probability Example 3. From all families with three children, we select one family at random. What is the probability that the children are all boys, if we know that a) the first one is a boy, and b) at least one is a boy? (Assume that each child is a boy or a girl with probability 1/2, independently of each other.) 90

Conditional Probability Example 4. A card is drawn at random from a deck of 52 cards. What is the probability that it is a King or a 2, given that it is a face card (J, Q, K)? 91

Total Probability Theorem and Bayes Rule If we multiply both sides of the definition of P(A | B) by P(B) we obtain: P(A ∩ B) = P(A | B) P(B). Similarly, if we multiply both sides of the definition of P(B | A) by P(A) we obtain: P(A ∩ B) = P(B | A) P(A). 92

Total Probability Theorem and Bayes Rule Joint Probability of Two Events. For any events A and B with positive probabilities: P(A ∩ B) = P(B) P(A | B) = P(A) P(B | A). Joint Probability of Three Events: P(A ∩ B ∩ C) = P(A) P(B | A) P(C | A ∩ B), that is, P(A_1 ∩ A_2 ∩ A_3) = P(A_1) P(A_2 | A_1) P(A_3 | A_1 ∩ A_2). 93

Total Probability Theorem and Bayes Rule Applying this repeatedly, we can generalise to the intersection of n events: P(A_1 ∩ A_2 ∩ ⋯ ∩ A_n) = P(A_1) P(A_2 | A_1) ⋯ P(A_n | A_1 ∩ ⋯ ∩ A_{n−1}). 94

Total Probability Theorem and Bayes Rule 95

Total Probability Theorem Total Probability Theorem: 96

Total Probability Theorem P(B) = P(A_1) P(B | A_1) + ⋯ + P(A_n) P(B | A_n). The probability that B occurs is a weighted average of its conditional probability under each scenario, where each scenario is weighted according to its (unconditional) probability. The A_i partition the sample space, and P(B) is given by the sum above. 97

Total Probability Theorem 98

Total Probability Theorem Example 1. Radar detection. If an aircraft is present in a certain area, a radar detects it and generates an alarm signal with probability 0.99. If an aircraft is not present, the radar generates a (false) alarm with probability 0.10. We assume that an aircraft is present with probability 0.05. What is the probability of no aircraft presence and a false alarm? What is the probability of aircraft presence and no detection? 99

Total Probability Theorem Sequential representation in a tree diagram 100

Total Probability Theorem Sequential Representation in a tree diagram 101
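A short sketch (added, not from the slides) that walks the two branches of the tree diagram using the multiplication rule P(A ∩ B) = P(A) P(B | A):

    p_present = 0.05                 # P(aircraft present)
    p_alarm_given_present = 0.99     # detection probability
    p_alarm_given_absent = 0.10      # false alarm probability

    # P(no aircraft and false alarm) = P(absent) * P(alarm | absent)
    p_false_alarm = (1 - p_present) * p_alarm_given_absent
    # P(aircraft present and no detection) = P(present) * P(no alarm | present)
    p_missed_detection = p_present * (1 - p_alarm_given_present)

    print(round(p_false_alarm, 4))        # 0.095
    print(round(p_missed_detection, 4))   # 0.0005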

Total Probability Theorem Example 2. Picking Balls from Urns. Suppose we have two urns, with the first one containing 2 white and 6 black balls, and the second one containing 2 white and 2 black balls. We pick an urn at random, and then pick a ball from the chosen urn at random. What is the probability of picking a white ball? 102

Total Probability Theorem Tree diagram What is the probability of picking a black ball? 103
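A minimal sketch (added for illustration) applying the total probability theorem to the two urns:

    # Each urn is chosen with probability 1/2
    urns = {
        "urn1": {"white": 2, "black": 6},
        "urn2": {"white": 2, "black": 2},
    }

    def p_color(color):
        """P(color) = sum over urns of P(urn) * P(color | urn)."""
        total = 0.0
        for counts in urns.values():
            p_urn = 1 / len(urns)
            p_color_given_urn = counts[color] / sum(counts.values())
            total += p_urn * p_color_given_urn
        return total

    print(p_color("white"))   # 0.375 = 1/2 * 2/8 + 1/2 * 2/4
    print(p_color("black"))   # 0.625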

Total Probability Theorem Dealing Three Cards. From a deck of 52 cards three are drawn without replacement. What is the probability of the event E of getting two Aces and one King in any order? Denote the relevant outcomes by A, K and O (for "other"). 104

Total Probability Theorem 105

Total Probability Theorem 106

Bayes Rule 107

Bayes Rule To verify Bayes' rule, note that by the definition of conditional probability: P(A_i | B) P(B) = P(A_i ∩ B) = P(A_i) P(B | A_i), and so P(A_i | B) = P(A_i) P(B | A_i) / P(B), where P(B) = Σ_j P(A_j) P(B | A_j) follows from the total probability theorem. 108

Bayes Rule 109

Bayes Rule 110

Bayes Rule Example 1. Rare disease. A test for a rare disease is assumed to be correct 95% of the time: if a person has the disease, the test results are positive with probability 0.95, and if the person does not have the disease, the results are negative with probability 0.95. A random person drawn from a certain population has probability 0.001 of having the disease. Given that the person just tested positive, what is the probability of having the disease? A = {the person has the disease}, B = {the test results are positive}, P(A | B) = ? 111

Bayes Rule For such a rare disease we need a much more accurate test: the probability of a false positive result must be of a lower order of magnitude than the fraction of people with the disease. 112
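A short sketch (added for illustration) evaluating Bayes' rule for this example; the smallness of the posterior is what motivates the remark above:

    p_disease = 0.001                  # P(A): prior probability of the disease
    p_pos_given_disease = 0.95         # P(B | A)
    p_pos_given_healthy = 0.05         # P(B | A^c): false positive rate

    # Total probability theorem: P(B)
    p_positive = (p_disease * p_pos_given_disease
                  + (1 - p_disease) * p_pos_given_healthy)

    # Bayes' rule: P(A | B) = P(A) P(B | A) / P(B)
    p_disease_given_positive = p_disease * p_pos_given_disease / p_positive
    print(round(p_disease_given_positive, 4))   # 0.0187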

Bayes Rule Example 2. Random coin. You have one fair coin, and one biased coin which lands Heads with probability 3/4. You pick one of the coins at random and flip it three times. It lands Heads all three times. Given this information, what is the probability that the coin you picked is the fair one? 113

Bayes Rule Before flipping the coin, we thought we were equally likely to have picked the fair coin as the biased coin: P(F) = P(F^c) = 1/2. Upon observing three Heads, however, it becomes more likely that we've chosen the biased coin than the fair coin, so P(F | A) is only about 0.23. 114
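A sketch of the computation (added, not from the slides), with F = "picked the fair coin" and A = "three Heads observed":

    from fractions import Fraction

    p_fair = Fraction(1, 2)                 # P(F): prior on picking the fair coin
    p_heads3_fair = Fraction(1, 2) ** 3     # P(A | F)   = 1/8
    p_heads3_biased = Fraction(3, 4) ** 3   # P(A | F^c) = 27/64

    # P(A) by the total probability theorem
    p_heads3 = p_fair * p_heads3_fair + (1 - p_fair) * p_heads3_biased

    # Bayes' rule: P(F | A)
    p_fair_given_heads3 = p_fair * p_heads3_fair / p_heads3
    print(p_fair_given_heads3)          # 8/35
    print(float(p_fair_given_heads3))   # about 0.2286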

Independence Independence of two events. Events A and B are independent if P(A ∩ B) = P(A) P(B). If P(A) > 0 and P(B) > 0, then this is equivalent to P(A | B) = P(A), and also equivalent to P(B | A) = P(B). 115

Independence Two events are independent if we can obtain the probability of their intersection by multiplying their individual probabilities. Alternatively, A and B are independent if learning that B occurred gives us no information that would change our probabilities for A occurring (and vice versa). Independence is a symmetric relation: if A is independent of B, then B is independent of A. 116

Independence Independence is completely different from disjointness. If A and B are disjoint, then P(A ∩ B) = 0, so disjoint events can be independent only if P(A) = 0 or P(B) = 0. Knowing that A occurs tells us that B definitely did not occur, so A clearly conveys information about B, meaning the two events are not independent (except if A or B already has zero probability). 117

Independence If A and B are independent, then A and B^c are independent, A^c and B are independent, and A^c and B^c are independent. Proof. Let A and B be independent. Then P(B^c | A) = 1 − P(B | A) = 1 − P(B) = P(B^c), so A and B^c are independent. Swapping the roles of A and B, we have that A^c and B are independent. Using the fact that A, B independent implies A, B^c independent, with A^c playing the role of A, we also have that A^c and B^c are independent. 118

Independence Independence of three events. Events A, B, and C are said to be independent if all of the following equations hold: P(A ∩ B) = P(A)P(B), P(A ∩ C) = P(A)P(C), P(B ∩ C) = P(B)P(C), P(A ∩ B ∩ C) = P(A)P(B)P(C). 119

Independence 120

Independence Independence of many events. For n events A_1, A_2, ..., A_n to be independent, we require any pair to satisfy P(A_i ∩ A_j) = P(A_i)P(A_j) (for i ≠ j), any triplet to satisfy P(A_i ∩ A_j ∩ A_k) = P(A_i)P(A_j)P(A_k) (for i, j, k distinct), and similarly for all quadruplets, quintuplets, and so on. For infinitely many events, we say that they are independent if every finite subset of the events is independent. 121

Conditional independence Given an event C, the events A and B are said to be conditionally independent if: P(A ∩ B | C) = P(A | C) P(B | C). 122

Conditional independence The previous relation states that if C is known to have occurred, the additional knowledge that B also occurred does not change the probability of A. The independence of two events A and B with respect to the unconditional probability law does not imply conditional independence, and vice versa. 123

Independence Example 2. Reliability. p_i: probability that unit i is up; u_i: event that the ith unit is up; u_1, u_2, ..., u_n are independent; f_i: event that the ith unit is down; the f_i are independent. P(system is up) = ? 124
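The system layout referred to here appears in a figure that is not part of this transcription; as a hedged sketch, the following assumes the two basic configurations (units connected in series and in parallel, with hypothetical reliabilities) and uses independence to combine the unit probabilities:

    from math import prod

    def p_series_up(p):
        """Series connection: the system is up only if every unit is up."""
        return prod(p)

    def p_parallel_up(p):
        """Parallel connection: the system is up unless every unit is down."""
        return 1 - prod(1 - pi for pi in p)

    p = [0.9, 0.8, 0.95]                 # hypothetical unit reliabilities p_i
    print(round(p_series_up(p), 3))      # 0.684
    print(round(p_parallel_up(p), 3))    # 0.999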