Random processes. Lecture 17: Probability, Part 1. Probability. Law of large numbers

Random processes Lecture 17: Probability, Part 1 Statistics 10 Colin Rundel March 26, 2012 A random process is a situation in which we know what outcomes could happen, but we don t know which particular outcome will happen. Examples: coin tosses, die rolls, itunes shuffle, whether the stock market goes up or down tomorrow, etc. It can be helpful to model a process as random even if it is not truly random. Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 2 / 21 Probability Law of large numbers There are several possible interpretations of probability but they (almost) completely agree on the mathematical rules probability must follow. P(A) = Probability of event A 0 P(A) 1 Frequentist interpretation: The probability of an outcome is the proportion of times the outcome would occur if we observed the random process an infinite number of times. Single main stream school until recently. Bayesian interpretation: A Bayesian interprets probability as a subjective degree of belief: For the same event, two separate people could have differing probabilities. Largely popularized by advances in computational technology and methods during the last twenty years. Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 3 / 21 Law of large numbers states that as more observations are collected, the proportion of occurrences with a particular outcome, ˆp n, converges to the probability of that outcome, p. The plot below shows the cumulative proportion of 1s in n rolls of a die. The probability settles around 0.167 ( 1 6 ), which is in fact the probability of rolling a 1. p^n 0.30 0.25 0.20 0.15 0.10 0.05 0.00 1 10 100 1,000 10,000 100,000 Statistics 10 (Colin Rundel) n Lecture (number 17 of rolls) March 26, 2012 4 / 21

Disjoint (mutually exclusive) outcomes Law of large numbers (cont.) Disjoint (mutually exclusive) When tossing a fair coin, if heads comes up on each of the first 10 tosses, what do you think the chance is that another head will come up on the next toss? 0.5, less than 0.5, or more than 0.5? H H H H H H H H H H? The probability is still 0.5, or there is still a 50% chance that another head will come up on the next toss. P(H on 11 th toss) = P(T on 11 th toss) = 0.5 The coin is not due for a tail. The common (mis)understanding of the LLN is that random processes are supposed to compensate for whatever happened in the past; this is just not true and is also called gambler s fallacy (or law of averages). Two outcomes are called disjoint or mutually exclusive if they cannot both happen. The outcome of a single coin toss cannot be a head and a tail. A student cannot fail and pass a class. A card drawn from a deck cannot be an ace and a queen. When events are disjoint, it s easy to calculate the probability of one event or the other happening. The probability of rolling a 1 or a 2: P(1) + P(2) = 1 6 + 1 6 = 2 6 0.33 The probability of tossing a head or a tail: P(H) + P(T ) = 0.5 + 0.5 = 1 Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 5 / 21 Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 6 / 21 Disjoint (mutually exclusive) outcomes Probabilities when events are not disjoint Disjoint events Non-disjoint events What is the probability that a randomly selected student is from the Northeast or the South? What is the probability of drawing a jack or a red card from a well shuffled full deck? 0.30 0.25 0.20 0.15 0.10 0.05 0.00 int. midwest northeast south west Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 7 / 21 P(jack or red) = P(jack) + P(red) P(jack and red) = 4 52 + 26 52 2 52 = 28 52 General addition rule - P(A or B) = P(A) + P(B) P(A and B) Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 8 / 21

Probabilities when events are not disjoint Example - Tenting Sample space What is the probability that a randomly sampled student is a sophomore or is planning on tenting during their Duke career? tenting no yes Total first-year 17 62 79 class sophomore 37 40 77 junior 22 26 48 senior 3 4 7 Total 79 132 211 Sample space is the collection of all possible outcomes of a trial. A couple has one child, what is the sample space for the gender of the child? S = {B, G} A couple has two children, what is the sample space for the gender of these children? P(sophomore or tenting) = P(sophomore) + P(tenting) P(sophomore and tenting) = 77/211 + 132/211 40/211 = 169/211 Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 9 / 21 Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 10 / 21 Probability distributions Complementary events A probability distribution lists all possible events and the probabilities with which they occur. The probability distribution for the gender of one kid: Event B G Probability 0.5 0.5 A couple has one kid. If we know that the child is not a boy, what is gender of the child? Rules for probability distributions: 1 The events listed must be disjoint 2 Each probability must be between 0 and 1 3 The probabilities must total 1 A couple has two kids, if we know that they are not both girls, what are the possible gender combinations for these kids? The probability distribution for the genders of two children: Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 11 / 21 Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 12 / 21

Example - Political Parties Disjoint vs. complementary In a survey, 52% of respondents said they are Democrats. What is the probability that a randomly selected respondent from this sample is a Republican? (a) 0.48 (b) > 0.48 (c) < 0.48 (d) cannot calculate using the information given Do the sum of probabilities of two disjoint events always add up to 1? Do the sum of probabilities of two complementary events always add up to 1? Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 13 / 21 Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 14 / 21 Independent events Two processes are independent if knowing the outcome of one provides no useful information about the outcome of the other. Knowing that the coin landed on a head on the first toss does not provide any useful information for determining what the coin will land on in the second toss since coin tosses are independent. Outcomes of two tosses of a coin are independent. On the other hand, knowing that the first card drawn from a deck is an ace does provide useful information for determining the probability of drawing an ace in the second draw. Outcomes of two draws from a deck of cards (without replacement) are dependent. You toss a coin twice, what is the sample space for the outcomes of these tosses? You toss a coin twice, what is the probability of getting two tails in a row? While it is possible to calculate this probability by first obtaining the sample space, this would be an inefficient way if the number of trials was much higher. Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 15 / 21 Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 16 / 21

Product rule for independent events Example - Insurer Drivers Product rule for independent events P(A and B) = P(A) P(B) Or more generally, P(A 1 and and A k ) = P(A 1 ) P(A k ) You toss a coin twice, what is the probability of getting two tails in a row? A recent Gallup poll report shows that 17.7% of Americans were uninsured as of December 2011. Assuming that the uninsured rate stayed constant, what is the probability that two randomly selected Americans are both uninsured? P(T on the first toss) P(T on the second toss) = 1 2 1 2 = 1 4 Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 17 / 21 Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 18 / 21 P(at least 1 success) P(at least 1 success) (cont.) If we were to randomly select 5 Americans, what is the probability that at least one is uninsured? If we were to randomly select 5 people, the sample space for the number of applicants who are uninsured would be: S = {0, 1, 2, 3, 4, 5} We are interested in instances where at least one person is uninsured: Since the probability of the sample space must add up to 1: P(at least 1 uninsured) = 1 Prob(none uninsured) = 1 [(1 0.177) 5 ] = 1 0.823 5 = 1 0.3776 = 0.6224 S = {0, 1, 2, 3, 4, 5} So we can divide up the sample space intro two categories: At least 1 S = {0, at least one} P(at least one) = 1 P(none) Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 19 / 21 Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 20 / 21

Example - Duke Vegetarians Roughly 20% of Duke undergraduates are vegetarian or vegan (estimated based on the class survey). What is the probability that among a random sample of 3 Duke undergraduates at least one is vegetarian or vegan? Statistics 10 (Colin Rundel) Lecture 17 March 26, 2012 21 / 21