I. Probability basics (Sections 4.1 and 4.2)

Flip a fair coin (probability of HEADS is 1/2) ten times.
What is the probability of getting exactly 5 HEADS?
What is the probability of getting exactly 10 HEADS?
What is the probability of getting more than 8 HEADS?
What is the probability that the first three tosses are HEADS and that the total number of HEADS is less than 7?

Why should you care?

Intellectual curiosity.
Gambling.
Mathematically the same as the ESP decision problem we discussed in Week 4. If we can answer the coin-tossing questions, we can answer the ESP questions.
Mathematically quite similar to many problems in polling. Answering the coin-tossing questions will lead us to a method for answering polling questions.
It's a good example for learning about sample spaces, probabilities, and events.

What are the possible outcomes?

If we really care about all ten tosses, then outcomes are things like HHHHHHHHHH, HTHTHTHTHT, ... Here H stands for HEADS and T stands for TAILS. We don't need to list them all: there are 1024 possible outcomes! But we will need to count certain types of outcomes; more on this later.
If we only care about the number of HEADS (or TAILS), the only outcomes are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
The set of all possible outcomes is called the sample space. We usually use the letter S to denote the sample space.

Another example: Choose two people at random and ask them whether they are Republicans, Democrats, or neither. The sample space is S = {RR, DD, NN, RD, DR, RN, NR, ND, DN}. For example, RD stands for the outcome that the first person is a Republican and the second is a Democrat.
By itself a sample space is of limited use. We also need to assign probabilities to all the outcomes.

Probabilities

Let's look at a simpler version of the coin-tossing example, with only three coins tossed. The sample space is S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.
If the coins are fair, it makes sense to assign equal probabilities to each of the 8 outcomes. So we can use P(HHH) = P(HHT) = ... = P(TTT) = 1/8.
Probabilities will always be nonnegative, and the sum of the probabilities of all the outcomes will always be one.

Another legal assignment of probabilities, since they're all nonnegative and add to one: All probabilities are zero except P(HHH) = P(TTT) = 1/2. But it certainly doesn't accurately model the experiment we described.
The following isn't a legal assignment of probabilities, since they don't add to one: P(HHH) = P(TTT) = 1/2 and all other probabilities are 1/4.
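The two rules for a legal assignment can be checked mechanically. The sketch below is only an illustration: the helper name is_legal_assignment is made up here and is not from the text. It uses exact fractions so the sum can be compared to one without rounding trouble.

```python
from fractions import Fraction

def is_legal_assignment(probs):
    """True if every probability is nonnegative and they add to one.

    Hypothetical helper for illustration; `probs` maps outcome -> probability.
    """
    return all(p >= 0 for p in probs.values()) and sum(probs.values()) == 1

outcomes = ["HHH", "HHT", "HTH", "HTT", "THH", "THT", "TTH", "TTT"]

# Fair-coin model: every outcome gets 1/8.
fair = {o: Fraction(1, 8) for o in outcomes}

# Legal but unrealistic: everything is zero except HHH and TTT.
lopsided = {o: Fraction(0) for o in outcomes}
lopsided["HHH"] = lopsided["TTT"] = Fraction(1, 2)

# Not legal: HHH and TTT get 1/2 and the other six get 1/4, so the total is 5/2.
illegal = {o: Fraction(1, 4) for o in outcomes}
illegal["HHH"] = illegal["TTT"] = Fraction(1, 2)

print(is_legal_assignment(fair))      # True
print(is_legal_assignment(lopsided))  # True
print(is_legal_assignment(illegal))   # False
```

Note that the check says nothing about whether a model is realistic; the second assignment passes even though it is a poor description of three fair coins.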

Let's stick with the reasonable model with all probabilities equal to 1/8. It's now easy to assign probabilities if we only want to know how many HEADS we got:
Zero HEADS = {TTT}, so P(0) = P(TTT) = 1/8.
One HEADS = {HTT, THT, TTH}, so P(1) = 3/8.
Two HEADS = {HHT, HTH, THH}, so P(2) = 3/8.
Three HEADS = {HHH}, so P(3) = 1/8.
It's convenient to summarize this in a table.

# HEADS    Probability
0          1/8
1          3/8
2          3/8
3          1/8

NOTE: The same technique will work for tossing 10 or 20 or 100 or 1000 coins. (And for the ESP problem, and polling, etc.) But we would need to count a lot of outcomes. We'll learn how to do this later.
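For ten coins a computer can still do the counting by brute force. The sketch below is an illustration only, not the counting method developed later in the course: it lists all 1024 outcomes, counts the ones in each event, and divides. The function name prob and its arguments are made up for this example.

```python
from fractions import Fraction
from itertools import product

def prob(event, n_tosses=10):
    """Probability of `event` when all outcomes of n fair tosses are equally likely.

    `event` is a function taking an outcome string such as "HTHH..." and
    returning True or False. Brute force: only practical for small n_tosses.
    """
    outcomes = ["".join(t) for t in product("HT", repeat=n_tosses)]
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

# The four questions about ten tosses of a fair coin.
exactly_5 = prob(lambda o: o.count("H") == 5)
exactly_10 = prob(lambda o: o.count("H") == 10)
more_than_8 = prob(lambda o: o.count("H") > 8)
first3_and_lt7 = prob(lambda o: o.startswith("HHH") and o.count("H") < 7)

print(exactly_5)       # 63/256  (252 of the 1024 outcomes)
print(exactly_10)      # 1/1024
print(more_than_8)     # 11/1024
print(first3_and_lt7)  # 1/16

# The same function reproduces the three-coin table.
print([prob(lambda o, k=k: o.count("H") == k, n_tosses=3) for k in range(4)])
```

Listing outcomes stops being feasible long before 100 or 1000 coins, which is why a counting method is needed.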

Events

Quite often we're interested in events that are made up of several outcomes. The event "Two HEADS" is made up of the outcomes HHT, HTH, THH.
It's easy to compute the probability of an event: Just add the probabilities of each outcome in the event.
We'll usually use capital letters like A, B, C to denote events.

Example: Roll two fair dice. The sample space is all 36 pairs of numbers from 1 to 6: (1,1), (1,2), ... It's reasonable to make each outcome equally likely, so P((1,1)) = P((1,2)) = ... = 1/36.
Let A stand for "both rolls the same." There are six outcomes in A: (1,1), (2,2), (3,3), (4,4), (5,5), (6,6). Since each has probability 1/36, the probability of A is P(A) = 6/36.

Let B stand for "the rolls are different." There are thirty outcomes in B. Why?
Hard way: Count them: (1,2), (1,3), etc.
Easy way: Event B is everything that's not in event A. Since A has six outcomes, B has the other 30.
So P(B) = 30/36. Note that P(B) = 1 - P(A). We'll return to this later.
Let C = "first roll is not 1." P(C) = 30/36. (Check this!)
Let D = "sum is 6." The event D contains 5 outcomes: (1,5), (2,4), (3,3), (4,2), (5,1). Since all have probability 1/36, P(D) = 5/36.
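One way to do the "Check this!" is to let a computer list the 36 outcomes and count. This is a small sketch under the equally-likely model; the function name P and the event definitions are written just for this illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered pairs (first roll, second roll).
S = list(product(range(1, 7), repeat=2))

def P(event):
    """Add up 1/36 for each outcome in the event (equally likely outcomes)."""
    return Fraction(sum(1 for o in S if event(o)), len(S))

p_A = P(lambda o: o[0] == o[1])      # both rolls the same
p_B = P(lambda o: o[0] != o[1])      # rolls are different
p_C = P(lambda o: o[0] != 1)         # first roll is not 1
p_D = P(lambda o: o[0] + o[1] == 6)  # sum is 6

# Fractions print in lowest terms, so 6/36 shows as 1/6 and 30/36 as 5/6.
print(p_A, p_B, p_C, p_D)  # 1/6 5/6 5/6 5/36
print(p_B == 1 - p_A)      # True
```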

The complement law

Let A be an event. All the outcomes not in A make up an event called the complement of A. The complement of A is denoted A^c.
The complement law: P(A^c) = 1 - P(A).
Note: This should be clear: All it says is that P(A) + P(A^c) = 1.
We noted this before, with the event A = "the rolls are the same." Then A^c = "the rolls are different." (We called this B earlier.)

Why is the complement law useful?

Toss a fair coin 700 times. We want to compute the probability of A = "more than 1 HEADS."
Hard way: Compute P(2 HEADS), P(3 HEADS), ..., P(700 HEADS) and add.
Easy way: Compute P(A^c) = P(0 HEADS) + P(1 HEADS). Then compute P(A) = 1 - P(A^c).
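The easy way fits in a few lines. The sketch below assumes the equally-likely model with 2^n outcomes for n tosses and uses two counts that are not spelled out above: exactly one outcome has 0 HEADS (all TAILS), and exactly n outcomes have 1 HEADS (the single HEADS can land on any of the n tosses). The function name is made up for the illustration.

```python
from fractions import Fraction

def p_more_than_one_head(n):
    """P(more than 1 HEADS in n fair tosses), computed via the complement law."""
    total = 2 ** n                # number of equally likely outcomes
    p_zero = Fraction(1, total)   # only the all-TAILS outcome
    p_one = Fraction(n, total)    # the lone HEADS can be in any of n positions
    p_complement = p_zero + p_one
    return 1 - p_complement

# Sanity check with three coins: P(2 HEADS) + P(3 HEADS) = 3/8 + 1/8 = 1/2.
print(p_more_than_one_head(3))  # 1/2

# For 700 tosses the complement is 701 / 2^700, a fantastically small number.
p_A = p_more_than_one_head(700)
print(float(1 - p_A))
```

Two terms are computed instead of 699, which is the whole point of the complement law here.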

Conditional Probability

Main question: How to update probabilities to incorporate new information. For example, P(get 4.0 in STT 201) might change as some quiz and lab project grades become available.
Back to the dice example: Let A = "both rolls same" and D = "sum is 6" as before. We know P(D) = 5/36 and P(A) = 6/36.
I tell you that A has happened. What should we now use for the probability that D happens?
We'll call this the conditional probability of D given A. We'll denote this P(D | A).
Still the question: What's P(D | A)?

(Intuitive method) Event A contains the outcomes (1,1), (2,2), (3,3), (4,4), (5,5), (6,6). Since there are six outcomes in A and one of them has a sum of 6, P(D | A) = 1/6.
There's a formula that's not intuitive but is more useful (see p. 248 of the text): P(D | A) = P(D and A) / P(A).
In our case P(D and A) = 1/36 and P(A) = 6/36. Plug into the formula to get P(D | A) = (1/36) / (1/6) = 1/6.
We'll use the formula to compute conditional probabilities.

Still in the dice example. Let E = "sum is 7" and F = "first roll is 3." It's easy to see that
P(E) = 6/36, since E = {(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)}.
P(F) = 6/36, since F = {(3,1), (3,2), (3,3), (3,4), (3,5), (3,6)}.
P(E and F) = 1/36, since E and F = {(3,4)}.
So P(E | F) = P(E and F) / P(F) = (1/36) / (1/6) = 1/6.

Independence

In the first example, P(D) was 5/36, but P(D | A) was 1/6. So knowing that A occurred changed the probability that D would occur.
In the second example, P(E) was 6/36 and P(E | F) was 1/6 (which is the same). So knowing that F occurred did not change the probability that E would occur.
Events E and F are independent. Events D and A are not.
Formally, two events A and B are independent if knowing that one occurred does not change the probability that the other will occur.
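Both dice examples can be reproduced by counting outcomes and applying the conditional probability formula. The following is an illustrative sketch; the helper names P and P_given are made up here, and independence is tested by comparing the conditional probability with the unconditional one, as described above.

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair dice: 36 equally likely ordered pairs.
S = list(product(range(1, 7), repeat=2))

def P(event):
    """Probability of an event under the equally-likely model."""
    return Fraction(sum(1 for o in S if event(o)), len(S))

def P_given(event, given):
    """Conditional probability: P(event and given) / P(given)."""
    return P(lambda o: event(o) and given(o)) / P(given)

def A(o): return o[0] == o[1]      # both rolls same
def D(o): return o[0] + o[1] == 6  # sum is 6
def E(o): return o[0] + o[1] == 7  # sum is 7
def F(o): return o[0] == 3         # first roll is 3

print(P_given(D, A), P(D))  # 1/6 5/36  -> different, so D and A are not independent
print(P_given(E, F), P(E))  # 1/6 1/6   -> equal, so E and F are independent
```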

Multiplication Law

Remember the formula for conditional probability: P(A | B) = P(A and B) / P(B).
Use algebra to get P(A and B) = P(A | B) P(B). This is called the multiplication law.
If A and B are independent, then P(A | B) = P(A), so P(A and B) = P(A) P(B).

Using the multiplication law

Example: Sampling without replacement: A group of 50 people contains 35 females and 15 males. Two people are chosen at random.

What is the probability that both are female?

Let F1 stand for the event that the first person chosen is female. Let F2 stand for the event that the second person chosen is female. We're interested in P(F1 and F2).
We know P(F1) = 35/50.
P(F2 | F1) = 34/49, because if F1 happens then there are 34 females out of the 49 people left.
P(F1 and F2) = P(F2 | F1) P(F1). (This is the multiplication law.)
So we just plug in the values we know: P(F1 and F2) = P(F2 | F1) P(F1) = (34/49)(35/50) = ...

What is the probability that exactly one person chosen is female?

Let M1 and M2 stand for the events that the first and second person chosen, respectively, is male.
Be careful: We need to compute P(F1 and M2) and P(M1 and F2) and add.
As before, P(F1 and M2) = P(M2 | F1) P(F1) = (15/49)(35/50), and P(M1 and F2) = P(F2 | M1) P(M1) = (35/49)(15/50).
So the answer is (15/49)(35/50) + (35/49)(15/50) = ...

Are F1 and F2 independent?

One way to check is to compute P(F2 | F1) and P(F2). If these probabilities are equal, then F1 and F2 are independent; otherwise they are not.
We already know that P(F2 | F1) = 34/49.
To compute P(F2) we can compute P(F2 and F1) and P(F2 and M1) and add.
We already computed P(F2 and M1) = (35/49)(15/50) = 3/14.
We already computed P(F2 and F1) = (34/49)(35/50) = 17/35.
So (omitting a bit of arithmetic) P(F2) = 3/14 + 17/35 = 35/50.
Since this isn't equal to P(F2 | F1) = 34/49, the events are not independent.
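The multiplication-law answers can be cross-checked by brute force. The sketch below is an illustration only: it treats the 50 people as numbered 0 to 49 (the first 35 female, an arbitrary labeling chosen for this example), lists every ordered pair of two different people, and treats all pairs as equally likely.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical labeling: people 0..34 are female, 35..49 are male.
people = ["F"] * 35 + ["M"] * 15

# All ordered (first, second) choices without replacement: 50 * 49 = 2450 pairs.
pairs = list(permutations(range(50), 2))

def P(event):
    """Probability of an event about (first person, second person)."""
    hits = sum(1 for i, j in pairs if event(people[i], people[j]))
    return Fraction(hits, len(pairs))

p_F1_and_F2 = P(lambda a, b: a == "F" and b == "F")
p_exactly_one = P(lambda a, b: (a == "F") != (b == "F"))
p_F1 = P(lambda a, b: a == "F")
p_F2 = P(lambda a, b: b == "F")
p_F2_given_F1 = p_F1_and_F2 / p_F1

print(p_F1_and_F2)          # 17/35
print(p_exactly_one)        # 3/7
print(p_F2, p_F2_given_F1)  # 7/10 34/49 -> not equal, so F1 and F2 are not independent
```

The counts agree with the multiplication law, for example (34/49)(35/50) = 17/35.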

Drug testing

A company decides to institute a policy of randomly testing employees for drug use. The test has the following properties:
If a person uses the drug being tested for, the test comes back positive with probability 0.98.
If the person does not use the drug being tested for, the test comes back negative with probability 0.97.
Also, 1% of the employees actually use the drug being tested for.
Question: An employee's drug test comes back positive. What is the probability that he is a drug user?

Let D represent the event that the employee is a drug user. Let A represent the event that the test comes back positive. We want P(D | A).
We know P(A | D) = 0.98; P(A^c | D^c) = 0.97; and P(D) = 0.01.
We also know that P(D | A) = P(D and A) / P(A).
The numerator is easy: P(D and A) = P(A | D) P(D) = (0.98)(0.01) = 0.0098.

We'll compute the denominator by computing P(A and D) and P(A and D^c) and adding. The first is 0.0098.
For the second, note that P(A | D^c) = 1 - P(A^c | D^c) = 0.03 and P(D^c) = 1 - P(D) = 0.99, so P(A and D^c) = P(A | D^c) P(D^c) = (0.03)(0.99) = 0.0297.
So the denominator is 0.0098 + 0.0297 = 0.0395, and the answer is 0.0098 / 0.0395 ≈ 0.25.
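The calculation can be packaged as a small function. This is a sketch of the steps above using exact fractions; the function and parameter names are made up for the illustration.

```python
from fractions import Fraction

def p_user_given_positive(p_user, p_pos_given_user, p_neg_given_nonuser):
    """P(D | A) = P(D and A) / P(A), with P(A) = P(A and D) + P(A and D^c)."""
    p_pos_given_nonuser = 1 - p_neg_given_nonuser             # P(A | D^c)
    p_user_and_pos = p_pos_given_user * p_user                # P(A and D)
    p_nonuser_and_pos = p_pos_given_nonuser * (1 - p_user)    # P(A and D^c)
    return p_user_and_pos / (p_user_and_pos + p_nonuser_and_pos)

answer = p_user_given_positive(
    p_user=Fraction(1, 100),                # 1% of employees use the drug
    p_pos_given_user=Fraction(98, 100),     # P(A | D) = 0.98
    p_neg_given_nonuser=Fraction(97, 100),  # P(A^c | D^c) = 0.97
)
print(answer)                   # 98/395, i.e. 0.0098 / 0.0395
print(round(float(answer), 2))  # 0.25
```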

This is disturbing! We have a good test, yet out of those who test positive, only about 25% are drug users. The other 75% are falsely accused of being drug users.

Retesting the positives

What if we retest the people who got a positive test?
Only change: Now P(D) = 0.25.
So P(D and A) = P(A | D) P(D) = (0.98)(0.25) = 0.245.
P(D^c and A) = P(A | D^c) P(D^c) = (0.03)(0.75) = 0.0225.
So the denominator is 0.245 + 0.0225 = 0.2675, and the answer is 0.245 / 0.2675 ≈ 0.92.

Important point: The proportion of drug users in the group has a huge effect on the probability of false accusations. Think about the relevance to medical tests for HIV and other conditions.
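To see the effect of the proportion of users, the same calculation can be repeated for several values of P(D) while keeping the test fixed at P(A | D) = 0.98 and P(A | D^c) = 0.03. Only 0.01 and 0.25 come from the examples above; the other proportions in the list are arbitrary values chosen for illustration, and the function name is made up.

```python
def p_user_given_positive(p_user, p_pos_given_user=0.98, p_pos_given_nonuser=0.03):
    """P(D | A) for a group in which a fraction p_user actually uses the drug."""
    p_user_and_pos = p_pos_given_user * p_user
    p_nonuser_and_pos = p_pos_given_nonuser * (1 - p_user)
    return p_user_and_pos / (p_user_and_pos + p_nonuser_and_pos)

# 0.01 is the original workforce; 0.25 is the retested group.
# The remaining proportions are illustrative assumptions.
for p_user in [0.001, 0.01, 0.10, 0.25, 0.50]:
    p = p_user_given_positive(p_user)
    print(f"P(D) = {p_user:5.3f}   P(D | A) = {p:.3f}   falsely accused = {1 - p:.3f}")
```

The test itself never changes in this loop; only the group being tested does.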