Probability Theory and Applications


Videos of the topics covered in this manual are available at the following links:

Lesson 4 Probability I: http://faculty.citadel.edu/silver/ba205/online course/lesson 04.wmv
Lesson 5 Probability II (Conditional Probability, Independence, Bayes): http://faculty.citadel.edu/silver/ba205/online course/lesson 05.wmv

Introduction to the theory of probability.

The probability of any outcome of an experiment is defined to be the relative likelihood that it will occur, expressed as a number between 0 and 1. Think of a trial of an experiment, for example tossing a coin. Either a head or a tail will occur (we assume the chance that the coin lands on its edge is 0, meaning it cannot occur). If we assign the value 0 to a tail and 1 to a head, then we can add up the number of heads, i.e. the 1's, and divide by the total number of tosses n. What we get is the proportion of heads in the n tosses of our experiment; this is the proportion of heads in our sample, or our sample proportion. [We say more about the sample proportion in the statistics text in the discussion of the binomial distribution.] Now we can conduct this experiment over and over again, add the number of 1's in the next set of trials to the number of 1's already accumulated, and divide by the accumulated number of trials. As we do this, the proportion of heads in our experiment approaches the true proportion of heads for the coin, which, if it is a fair coin, is 1/2. Thus, we may think of the probability of an outcome or event (to be defined shortly) as the proportion of times it would occur if the experiment were conducted an infinite number of times.

An experiment with only two possible outcomes, such as a coin toss, is called a binomial, or Bernoulli, experiment. In general, however, an experiment may have more than two possible outcomes, in which case we call it a multinomial experiment. [In the statistics notes to this course, we do not cover probability theory for multinomial experiments; the treatment of such cases is found in more advanced statistics texts.] We can list these outcomes as O₁, O₂, …, Oₙ. The sample space S is then defined to be the set of all these possible outcomes, S = {O₁, O₂, …, Oₙ}. An event in S is defined as any subset of S. For example, if S = {H, T} for the experiment of a single coin toss, then Head = {H} is a (proper) subset of S; therefore Head is both an outcome and an event in S. Another possible event in S is the null, or empty, set Φ. For example, obtaining both a head and a tail on a single toss is not possible.
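The long-run relative-frequency idea described above is easy to see in a short simulation. The Python sketch below is our own illustration, not part of the original manual (the function name and toss counts are arbitrary choices): it tosses a simulated fair coin repeatedly and tracks the running proportion of heads, which settles near 1/2 as the number of tosses grows.

```python
import random

def running_proportion_of_heads(n_tosses, seed=0):
    """Simulate n_tosses fair-coin tosses (1 = head, 0 = tail) and
    return the running proportion of heads after each toss."""
    rng = random.Random(seed)
    heads = 0
    proportions = []
    for i in range(1, n_tosses + 1):
        heads += rng.randint(0, 1)   # each toss is 0 or 1 with equal chance
        proportions.append(heads / i)
    return proportions

props = running_proportion_of_heads(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>6} tosses: proportion of heads = {props[n - 1]:.4f}")
# The printed proportions drift toward 0.5 as n grows, illustrating the
# long-run relative-frequency interpretation of probability.
```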

The head-and-tail example above, in which two outcomes would have to occur at the same time, introduces the concept of intersection. The event "A and B," written A ∩ B and read "A intersect B," is the event containing all outcomes that are in both A and B. Because any outcome outside either of the two events A and B lies outside A ∩ B, A ∩ B is rather exclusive; in fact, this event is often empty. In that case we say A and B are mutually exclusive, or disjoint; that is, there is no overlap between the two events.

Another event is "A or B," written A ∪ B and read "A union B," which is the event that includes all outcomes in S that are in either A or B. This event is relatively large in that any outcome that is not outside both A and B is in A ∪ B. In particular, A ∩ B lies in A ∪ B; in fact, if A ∩ B = A ∪ B, then A = B. We also need an event to represent all outcomes in S not in A; we call this event "A complement," or Aᶜ. [Some texts use other notation, such as Ā or A′.]

From the figure below, known as a Venn diagram, we see that all of S can be partitioned into four mutually exclusive events, namely A ∩ B, A ∩ Bᶜ, Aᶜ ∩ B, and Aᶜ ∩ Bᶜ. Now consider Aᶜ ∩ Bᶜ, which contains all outcomes that are not in A and not in B. Any outcome in either A or B lies outside this event; any other outcome lies inside it. Thus an outcome is in Aᶜ ∩ Bᶜ exactly when it is outside A ∪ B, that is, when it is in (A ∪ B)ᶜ. So Aᶜ ∩ Bᶜ = (A ∪ B)ᶜ. Also consider the event Aᶜ ∪ Bᶜ, which contains all outcomes in S that are not in A or not in B. It therefore contains exactly the outcomes outside "A and B," that is, outside A ∩ B. Therefore Aᶜ ∪ Bᶜ = (A ∩ B)ᶜ. The two rules Aᶜ ∩ Bᶜ = (A ∪ B)ᶜ and Aᶜ ∪ Bᶜ = (A ∩ B)ᶜ are called De Morgan's rules and are very useful in probability theory; we will often use them in calculating probabilities of events. Looking at the two formulas, we see that we take the complements of the two events, reverse the union/intersection sign, and obtain the complement of the original combined event.

Let us now do a simple example applying these concepts. Take the experiment of rolling a single die. Then S = {1, 2, 3, 4, 5, 6} is a listing of the possible outcomes. One event in S is rolling an even number, Even = {2, 4, 6}; another is rolling an odd number, Odd = {1, 3, 5}. Obviously Even ∩ Odd = Φ, and Even ∪ Odd = S.
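These set identities are easy to check mechanically. The short Python sketch below is our own illustration (not part of the manual): it represents the die sample space and the Even and Odd events as sets and verifies De Morgan's rules for them.

```python
# Die-roll sample space and the two events from the example above, as Python sets.
S = {1, 2, 3, 4, 5, 6}
Even = {2, 4, 6}
Odd = {1, 3, 5}

def complement(event, sample_space=S):
    """All outcomes in the sample space that are not in the event."""
    return sample_space - event

print(Even & Odd)        # intersection: set(), the empty event
print(Even | Odd)        # union: {1, 2, 3, 4, 5, 6}, i.e. all of S
print(complement(Even))  # {1, 3, 5}, i.e. Odd

# De Morgan's rules on this example:
print(complement(Even) & complement(Odd) == complement(Even | Odd))  # True
print(complement(Even) | complement(Odd) == complement(Even & Odd))  # True
```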

What about rolling an even number, R(E), and rolling a number greater than 3, R(>3)? The only outcomes in both events are 4 and 6, so R(E) ∩ R(>3) = {4, 6}. What about rolling an even number or rolling a number greater than three? This includes 2, 4, 5, and 6, so R(E) ∪ R(>3) = {2, 4, 5, 6}.

Now suppose we roll two dice. The outcomes are ordered pairs of numbers; for example, (1, 2) is the outcome of rolling a 1 on the first die and a 2 on the second. Then S = {(1, 1), (1, 2), …, (1, 6), (2, 1), …, (2, 6), …, (6, 1), …, (6, 6)}. One game that is played by rolling ("shooting") two dice is craps. In this game we add the values showing on the two dice; for example, (1, 6) yields a 7. We can divide S into mutually exclusive events based on the sum of the two values. Thus R(2) = the set of outcomes adding to 2 = {(1, 1)}, and R(7) = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}. We can list these events, and their probabilities, as follows:

Events and Their Probabilities for the Game of Craps

Event (X)   Outcomes in event X                           P(X)    Cumulative P(X)
R(2)        (1,1)                                         1/36    1/36
R(3)        (1,2), (2,1)                                  2/36    3/36
R(4)        (1,3), (2,2), (3,1)                           3/36    6/36
R(5)        (1,4), (2,3), (3,2), (4,1)                    4/36    10/36
R(6)        (1,5), (2,4), (3,3), (4,2), (5,1)             5/36    15/36
R(7)        (1,6), (2,5), (3,4), (4,3), (5,2), (6,1)      6/36    21/36
R(8)        (2,6), (3,5), (4,4), (5,3), (6,2)             5/36    26/36
R(9)        (3,6), (4,5), (5,4), (6,3)                    4/36    30/36
R(10)       (4,6), (5,5), (6,4)                           3/36    33/36
R(11)       (5,6), (6,5)                                  2/36    35/36
R(12)       (6,6)                                         1/36    36/36
Total       36 outcomes, each with p = 1/36               1

As a side note, the game is played as follows: the shooter rolls the two dice; if they sum to 7 or 11, the shooter wins the pot. If not, the shooter continues to roll the dice until one of two events occurs: he rolls a 7 and loses, or he rolls the same value as on the first roll and wins. In Las Vegas there are some additional rules to make the game close to fair, so that the house and the shooter have about an equal chance of winning: 2, 3, and 12 on the first roll are automatic losses.

Now let us see how to calculate the probability that the shooter wins the game. There is a 6/36 (rolls a 7) + 2/36 (rolls an 11) = 8/36 = 2/9 chance the shooter wins on the first roll. There is a 3/36 chance of rolling a 4 on the first roll and then a 3/9 probability of rolling another 4 before rolling a 7. Why? Because after the first roll there are exactly 9 outcomes that can end the game: six of them roll a 7 and three of them roll a 4. Since each outcome is equally likely and three of the nine are favorable, the probability of a win for the shooter after rolling an initial 4 is 3/9. Thus the shooter has a (3/36)·(3/9) = 1/36 chance of rolling an initial 4 and winning the game. More will be said about these calculations when we discuss independence and the chain rule later in this chapter. As an exercise, show that the probability the shooter wins is 0.49293.
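Both the table above and the shooter's overall chance of winning can be checked numerically. The Python sketch below is our own illustration (not part of the manual): it enumerates the 36 outcomes to reproduce the table, then combines the first-roll and point-roll probabilities exactly as described above, which can be used to verify the exercise answer of 0.49293.

```python
from fractions import Fraction
from itertools import product

# Distribution of the sum of two dice: counts of each total out of 36 outcomes.
counts = {}
for d1, d2 in product(range(1, 7), repeat=2):
    counts[d1 + d2] = counts.get(d1 + d2, 0) + 1

cumulative = 0
for total in range(2, 13):
    cumulative += counts[total]
    print(f"R({total:>2}): {counts[total]}/36   cumulative {cumulative}/36")

# Shooter's overall win probability under the rules described above:
# win outright on 7 or 11; lose outright on 2, 3, 12; otherwise the first
# total becomes the "point" and the shooter wins by re-rolling it before a 7.
win = Fraction(counts[7] + counts[11], 36)
for point in (4, 5, 6, 8, 9, 10):
    p_point = Fraction(counts[point], 36)
    p_point_before_7 = Fraction(counts[point], counts[point] + counts[7])
    win += p_point * p_point_before_7
print(win, float(win))   # 244/495, approximately 0.49293
```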

Now returning to the Venn diagram, suppose we are given the following probabilities: P(A) = 0.6, P(B) = 0.4, P(A ∩ B) = 0.3. How would we calculate P(A ∪ B), P(Aᶜ ∩ B), and P(Aᶜ ∪ B)? We need some additional theory to help us here. Looking at the Venn diagram, suppose we drew events A and B so that the area of each event relative to the area of S equals the probability of the event. Thus A occupies 60% of the area of S, B occupies 40%, and A ∩ B occupies 30%. If we add the areas of A and B we get 100%, but clearly the area of A ∪ B is not all of S. The reason is that part of A overlaps with B, and we have double counted that overlap when adding the two areas; we need to subtract the overlap once to avoid the double count. So the area of A ∪ B is 0.6 + 0.4 − 0.3 = 0.7 = P(A ∪ B). Thus P(A ∪ B) = P(A) + P(B) − P(A ∩ B). This is often referred to as the addition rule of probability. Now P(Aᶜ ∩ B) = P(B) − P(A ∩ B) = 0.4 − 0.3 = 0.1, and P(Aᶜ ∪ B) = P(Aᶜ) + P(B) − P(Aᶜ ∩ B) = (1 − 0.6) + 0.4 − 0.1 = 0.7. Using De Morgan we get Aᶜ ∪ B = (A ∩ Bᶜ)ᶜ, so P(Aᶜ ∪ B) = 1 − P(A ∩ Bᶜ) = 1 − [P(A) − P(A ∩ B)] = 1 − (0.6 − 0.3) = 0.7. So we have checked the answer by using the addition rule together with De Morgan's rules. And what about P(Aᶜ ∪ Bᶜ)? Using De Morgan, we get 1 − P(A ∩ B) = 1 − 0.3 = 0.7. Using the addition rule, we get P(Aᶜ) + P(Bᶜ) − P(Aᶜ ∩ Bᶜ) = (1 − 0.6) + (1 − 0.4) − [1 − P(A ∪ B)] = 0.4 + 0.6 − 0.3 = 0.7. Again the same both ways.

Conditional Probability, Independence, and the Chain Rule. Suppose two events A and B are mutually exclusive, so that there is no overlap of the circles representing the two events. Now what is the answer to the following question: what is the probability of A given that B has occurred? Obviously the answer is 0, since A cannot occur if B has occurred. What if A and B do overlap? In this case we need to ask how likely it is for A to occur knowing that we are inside circle B. But the only way A can occur inside B is if A overlaps with B, since the part of A outside B is not relevant. In this case the answer is the area of A ∩ B divided by the area of B, and since the areas of the events are proportional to their probabilities, the answer is P(A given B) = P(A ∩ B)/P(B). For brevity we write P(A given B) as P(A|B). The chain rule follows directly by multiplying both sides of this relationship by P(B): P(A ∩ B) = P(B)·P(A|B). In other words, A and B both occur if B occurs and then A occurs given that B has occurred. Of course, we could equally well say that A and B both occur if first A occurs and then B occurs given that A has occurred; that is, P(A ∩ B) = P(A)·P(B|A).
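The worked numbers above are easy to verify by direct calculation. The following Python sketch is our own check (variable names are arbitrary): it reproduces the addition-rule and De Morgan calculations exactly, and confirms the two chain-rule factorizations of P(A ∩ B), using exact fractions to avoid rounding noise.

```python
from fractions import Fraction as F

# Worked numbers from the example above: P(A) = 0.6, P(B) = 0.4, P(A ∩ B) = 0.3.
p_A, p_B, p_A_and_B = F(6, 10), F(4, 10), F(3, 10)

p_A_or_B = p_A + p_B - p_A_and_B                   # addition rule: 7/10
p_Ac_and_B = p_B - p_A_and_B                       # 1/10
p_Ac_or_B = (1 - p_A) + p_B - p_Ac_and_B           # addition rule: 7/10
p_Ac_or_B_check = 1 - (p_A - p_A_and_B)            # De Morgan, 1 - P(A ∩ Bᶜ): 7/10
p_Ac_or_Bc = 1 - p_A_and_B                         # De Morgan: 7/10

p_A_given_B = p_A_and_B / p_B                      # P(A|B) = 3/4
p_B_given_A = p_A_and_B / p_A                      # P(B|A) = 1/2
print(p_A_or_B, p_Ac_and_B, p_Ac_or_B, p_Ac_or_B_check, p_Ac_or_Bc)
print(p_B * p_A_given_B, p_A * p_B_given_A)        # chain rule: both equal 3/10
```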

Returning to our craps example, the probability of rolling a 4 on the first roll and winning the game is P[R(4)] × P[rolling another 4 before a 7 | R(4) on the first roll]. But the rolls are independent of one another, meaning that the outcome of a later roll does not depend on the outcome of the first roll. So the probability of rolling a 4 before rolling a 7 on the later rolls equals the number of outcomes that roll a 4 (three) divided by the number of outcomes that roll either a 4 or a 7 (nine), that is, 3/9. Thus P[R(4)] × P[rolling another 4 before a 7, given that the first roll is a 4] = (3/36)·(3/9) = 1/36. [You now have all you need to calculate the probability of winning at the craps table. Give it a try!]

Formalizing the concept of independence, we say two events A and B are independent if and only if P(A|B) = P(A). But since P(A|B) = P(A ∩ B)/P(B), if A and B are independent we see that P(A|B) = P(A ∩ B)/P(B) = P(A). Multiplying both sides of this last equation by P(B), we get P(A ∩ B) = P(A)·P(B). It is also the case that if P(A ∩ B) = P(A)·P(B), then A and B are independent; in fact, some texts define independence in just this way.

Bayesian inference. The idea is that as new information arises, our prior probability estimates for the state of the world must be revised. As a simple illustration, suppose I believe that a coin is fair; that is, P(H) = P(T) = 0.5, where H is a head and T is a tail. Now suppose I toss the coin three times and a head appears every time. Am I likely to continue to believe that the coin is fair? Suppose I am allowed to choose between two coins to toss, one of which is fair and the other of which has two heads. I choose a coin at random and then toss it three times. If any of the three tosses is a tail, I am certain the coin is the fair one. But what is the probability that the coin is fair given that all three tosses are heads? To answer this question we turn to Bayes. Let 3H be the event of tossing three straight heads, F = fair coin, and B = biased coin. We know that P(F) = P(B) = 0.5, since I chose the coin at random. And P(F|3H) = P(F ∩ 3H)/P(3H) = P(F ∩ 3H)/[P(F ∩ 3H) + P(B ∩ 3H)]. Now P(F ∩ 3H) = P(F)·P(3H|F) = 0.5 × 0.125 = 0.0625, and P(B ∩ 3H) = P(B)·P(3H|B) = 0.5 × 1 = 0.5. Thus P(F|3H) = 0.0625/(0.0625 + 0.5) = 0.0625/0.5625 = 1/9 ≈ 0.111. So there is an 8/9 probability the coin is biased.

Now let us generalize Bayes' theorem. Let there be n possible states of nature, S₁, S₂, …, Sₙ, and k possible outcomes/events that can occur under each state, say O₁, O₂, …, Oₖ. Suppose we conduct our experiment and Oᵢ occurs. The question then is: what is the probability that nature is in state Sⱼ, given that Oᵢ occurred? Using the logic we just used to solve the problem above, we find P(Sⱼ|Oᵢ) = P(Sⱼ ∩ Oᵢ)/P(Oᵢ) = P(Sⱼ ∩ Oᵢ)/[P(S₁ ∩ Oᵢ) + P(S₂ ∩ Oᵢ) + … + P(Sₙ ∩ Oᵢ)] = P(Sⱼ)·P(Oᵢ|Sⱼ)/[P(S₁)·P(Oᵢ|S₁) + P(S₂)·P(Oᵢ|S₂) + … + P(Sₙ)·P(Oᵢ|Sₙ)].
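The generalized formula is straightforward to code. Here is a minimal Python sketch, our own illustration (the function name posterior and its argument layout are not from the manual): it implements the formula for a single observed outcome and reproduces the 1/9 answer from the fair-versus-biased coin example.

```python
def posterior(priors, likelihoods):
    """Bayes' theorem for a single observed outcome O:
    priors[j]      = P(S_j)
    likelihoods[j] = P(O | S_j)
    Returns the list of posteriors P(S_j | O)."""
    joints = [p * l for p, l in zip(priors, likelihoods)]   # P(S_j ∩ O)
    total = sum(joints)                                     # P(O)
    return [joint / total for joint in joints]

# Fair-vs-biased coin example: S_1 = fair, S_2 = two-headed; O = three heads.
print(posterior([0.5, 0.5], [0.125, 1.0]))   # [0.111..., 0.888...], i.e. 1/9 and 8/9
```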

An easy way to work this type of problem is to set up a contingency table, as follows.

Contingency Table for Solving a Bayesian Problem

          S₁                S₂                ...   Sⱼ                ...   Sₙ                Row sum
O₁        P(S₁)·P(O₁|S₁)    P(S₂)·P(O₁|S₂)    ...   P(Sⱼ)·P(O₁|Sⱼ)    ...   P(Sₙ)·P(O₁|Sₙ)    P(O₁)
O₂        P(S₁)·P(O₂|S₁)    P(S₂)·P(O₂|S₂)    ...   P(Sⱼ)·P(O₂|Sⱼ)    ...   P(Sₙ)·P(O₂|Sₙ)    P(O₂)
...
Oᵢ        P(S₁)·P(Oᵢ|S₁)    P(S₂)·P(Oᵢ|S₂)    ...   P(Sⱼ)·P(Oᵢ|Sⱼ)    ...   P(Sₙ)·P(Oᵢ|Sₙ)    P(Oᵢ)
...
Oₖ        P(S₁)·P(Oₖ|S₁)    P(S₂)·P(Oₖ|S₂)    ...   P(Sⱼ)·P(Oₖ|Sⱼ)    ...   P(Sₙ)·P(Oₖ|Sₙ)    P(Oₖ)
Col sum   P(S₁)             P(S₂)             ...   P(Sⱼ)             ...   P(Sₙ)             1

If we want P(Sⱼ|Oᵢ), we divide the entry in the i-th row and j-th column by the i-th row sum, P(Oᵢ).

Below we present historical weather data for three cities: New York City, Miami, and Atlanta. Each cell is the probability of the weather condition in the row, given the city in the column. Thus the probability of rain in NY is 0.1, and the probability of clear skies in Atlanta is 0.4. Now suppose our prior probabilities of being in each city are P(NY) = 0.3, P(Miami) = 0.5, and P(Atlanta) = 0.2. In the second chart we present the joint probabilities of each weather condition and each city. For example, P(Miami and Rain) = P(Miami)·P(Rain|Miami) = 0.5 × 0.4 = 0.2.

Conditional probabilities of each condition, given the city

                      NY     Miami   Atlanta
Rain                  0.1    0.4     0.2
Cloudy but no rain    0.3    0.4     0.4
Clear skies           0.6    0.2     0.4

Joint probabilities of each condition and each city

                      NY     Miami   Atlanta   Row sum
Rain                  0.03   0.2     0.04      0.27
Cloudy but no rain    0.09   0.2     0.08      0.37
Clear skies           0.18   0.1     0.08      0.36
Column sum            0.3    0.5     0.2       1

Now the probability that I am in Miami, given that the weather is cloudy, is P(Miami ∩ Cloudy)/P(Cloudy) = 0.2/0.37 ≈ 0.54, which is slightly greater than the a priori probability of my being in Miami, 0.5.
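The same contingency-table calculation can be written in a few lines of Python. The sketch below is our own illustration (the dictionary names are arbitrary): it builds the joint table from the priors and the conditional table above and recovers P(Miami | Cloudy) ≈ 0.54.

```python
# Priors over cities and the conditional weather probabilities given each city,
# taken from the tables above.
priors = {"NY": 0.3, "Miami": 0.5, "Atlanta": 0.2}
conditional = {
    "Rain":               {"NY": 0.1, "Miami": 0.4, "Atlanta": 0.2},
    "Cloudy but no rain": {"NY": 0.3, "Miami": 0.4, "Atlanta": 0.4},
    "Clear skies":        {"NY": 0.6, "Miami": 0.2, "Atlanta": 0.4},
}

# Joint probabilities P(city ∩ condition) and marginals P(condition).
joint = {cond: {city: priors[city] * p for city, p in row.items()}
         for cond, row in conditional.items()}
marginal = {cond: sum(row.values()) for cond, row in joint.items()}

# Posterior P(Miami | Cloudy) = P(Miami ∩ Cloudy) / P(Cloudy), about 0.54.
cond = "Cloudy but no rain"
print(joint[cond]["Miami"] / marginal[cond])
```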