Chapter 2.5 Random Variables and Probability: The Modern View (cont.)

I. Statistical Independence

A crucially important idea in probability and statistics is the concept of statistical independence. Suppose that you have two random variables X1 and X2. These two random variables are pairwise statistically independent if the value that X1 takes on does not affect the probability of X2 taking on any possible value, and vice versa. Loosely speaking, movements in X1 cannot affect movements in X2. If the realized value of X1 somehow affected the probability that X2 would take on a particular value, then the two variables would be dependent. Independence greatly simplifies probability calculations for X1, since we need not worry about what is happening to X2.

This is still somewhat murky, since we have not made clear how we know that two random variables are independent. But a simple theorem called the factorization theorem (or factorization principle) makes things much simpler. Let A and B be random events. Then P(A and B) = P(A)P(B) if and only if A and B are independent.

So, suppose that we are flipping a coin. This is a Bernoulli event. Assume the probability of a head is p, or P[H] = p. Now, flip it twice. What is the probability of first getting a head and then getting another head? That's easy. Since the two flips are independent, P[H1 and H2] = p^2. But this is just P[H1]P[H2], so P[H1 and H2] = P[H1]P[H2] and factorization works. Now, try a tail on the first flip and a tail on the second. This is P[T1 and T2] = (1-p)^2 = P[T1]P[T2]; factorization occurs again. What about a head on the first flip and a tail on the second? This will also show factorization, as will a tail on the first and a head on the second. Factorization is a very useful way of looking at statistical independence.
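To see factorization at work numerically, here is a minimal simulation sketch in Python (the code, the helper name estimate, and the illustrative values p = 0.6 and 100,000 trials are assumptions for this example, not part of the original notes). It flips two independent coins repeatedly and compares the empirical frequency of two heads with the product of the individual head frequencies.

    import random

    def estimate(p=0.6, trials=100_000, seed=0):
        # flip two independent coins many times and tally the outcomes
        rng = random.Random(seed)
        h1 = h2 = both = 0
        for _ in range(trials):
            first = rng.random() < p    # head on the first flip
            second = rng.random() < p   # head on the second flip, drawn separately
            h1 += first
            h2 += second
            both += first and second
        # with independent flips, P[H1 and H2] should be close to P[H1] * P[H2]
        print("P[H1 and H2] =~", both / trials)
        print("P[H1] * P[H2] =~", (h1 / trials) * (h2 / trials))

    estimate()

Because the two flips come from separate, unrelated draws, the two printed numbers agree up to simulation noise, which is exactly what the factorization theorem predicts.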
Next, consider flipping a coin twice again, with P[Head] = p. The first flip is independent of the second flip. Let A = 1 if the first flip is a head, and zero otherwise. Let B = 1 if the second flip is a head, and zero otherwise. Now, how do we compute the probability distribution of A+B? Obviously, A+B can be 0, 1, or 2. To get A+B = 0, both A and B must be zero. Therefore, by the factorization theorem we multiply to get P[A+B=0] = P[A = 0 and B = 0] = (1-p)^2. Similarly, for A+B to be 2, both A = 1 and B = 1 must occur. Again employing the factorization idea, we get P[A+B=2] = P[A=1 and B=1] = p^2. However, there are two separate ways to get A+B = 1. The first way is A = 1 and B = 0, while the second way is A = 0 and B = 1. These are two mutually exclusive ways of getting A+B = 1. In this case, we ADD the probabilities of the two different, mutually exclusive ways together. Thus,

P[A+B = 1] = P[A = 1 and B = 0] + P[A = 0 and B = 1] = P[A=1]P[B=0] + P[A=0]P[B=1] = p(1-p) + (1-p)p = 2p(1-p).

The probabilities of mutually exclusive ways of an event occurring add together (called the addition principle), while the probabilities of independent events multiply together (called the factorization principle).
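The same bookkeeping can be written as a short sketch in Python (the helper name pmf_of_sum and the illustrative choice p = 0.5 are assumptions, not part of the notes): probabilities multiply within an outcome because the flips are independent, and add across the mutually exclusive outcomes that give the same sum.

    def pmf_of_sum(p=0.5):
        # pmf of A+B when A and B are independent Bernoulli(p) indicators
        pmf = {0: 0.0, 1: 0.0, 2: 0.0}
        for a in (0, 1):
            for b in (0, 1):
                # factorization principle: independent events multiply
                prob = (p if a == 1 else 1 - p) * (p if b == 1 else 1 - p)
                # addition principle: mutually exclusive ways of reaching the same sum add
                pmf[a + b] += prob
        return pmf

    print(pmf_of_sum(0.5))   # {0: 0.25, 1: 0.5, 2: 0.25}

With p = 0.5 this reproduces the familiar pattern 1/4, 1/2, 1/4 for sums of 0, 1, and 2, in agreement with (1-p)^2, 2p(1-p), and p^2.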
II. Statistical Dependence

Sometimes we must deal with two or more random variables that are not statistically independent. The probability of one is affected by what happens to the other. Let's try to understand this with a simple example. Think of the following highly contrived set of circumstances for random variables A and B.

A: You flip a coin. A = 1 if a head, zero otherwise, where P[A=1] = p.

B: If A is a head, you flip the coin again, and B = 1 if that second flip is a head, zero otherwise. If A is a tail, you roll a die, and B is the number showing, where P[B = i] = 1/6 for i = 1,...,6. This is the second event.

Calculate the pdf for A+B.

Here is the way we can calculate the pdf of A+B. The important point to note is that the probability that B equals 3 or 4 or 1 depends on the value A assumes. For example, if A = 1, then P[B=3] = 0, whereas if A = 0, then P[B=3] = 1/6. For sure, the world is complicated, and these dependent random variables show that randomness can be complicated also. Not all sets of random variables follow simple rules of combination. Some require us to think carefully about how probability is determined. Sometimes this involves counting arguments. In all cases we look at two important criteria to help us in calculating probability. First, what are the different, mutually exclusive ways of getting a particular outcome? Second, for each of these ways, are they composed of independent events?

Consider the outcome A+B = 1 and note that there are two ways this can happen: A = 1 and B = 0, and also A = 0 and B = 1. Note that P[A = 1 and B = 0] = p(1-p) and P[A = 0 and B = 1] = (1-p)/6. These are two mutually exclusive ways of getting A+B = 1, so we add them together to get P[A+B = 1] = p(1-p) + (1-p)/6. It will be useful for you to work your way through the remaining values of A+B in the same way.
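As a check on this reasoning, here is a minimal sketch in Python (the helper name pdf_of_sum, the use of exact fractions, and the illustrative choice p = 1/2 are assumptions, not part of the notes) that builds the full pdf of A+B by conditioning on A: each branch multiplies the probability of the A-value by the conditional probability of B, and the mutually exclusive branches leading to the same sum are added.

    from fractions import Fraction

    def pdf_of_sum(p):
        # pdf of A+B for the coin-then-coin-or-die example, with P[A = 1] = p
        pdf = {}
        # Branch A = 1 (probability p): B is a second coin flip, equal to 1 or 0.
        for b, prob_b in [(1, p), (0, 1 - p)]:
            pdf[1 + b] = pdf.get(1 + b, 0) + p * prob_b
        # Branch A = 0 (probability 1 - p): B is a die roll, equal to 1 through 6.
        for b in range(1, 7):
            pdf[b] = pdf.get(b, 0) + (1 - p) * Fraction(1, 6)
        return dict(sorted(pdf.items()))

    result = pdf_of_sum(Fraction(1, 2))
    print(result)                # P[A+B = k] for k = 1, ..., 6
    print(sum(result.values()))  # 1, so the probabilities form a valid pdf

With p = 1/2 the sketch gives P[A+B = 1] = P[A+B = 2] = 1/3 and P[A+B = k] = 1/12 for k = 3,...,6, matching the formula p(1-p) + (1-p)/6 in the A+B = 1 case.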
Questions:

#1. Consider the case where you draw two cards from a 52-card deck without replacement. Now consider the random event that you have drawn two red cards. Explain why the first and second draws are not statistically independent.

#2. Assume A and B are independent random variables. Therefore, we know that P[A and B] = ?

#3. Flip a coin twice. Let E1 be the outcome from the first flip and E2 be the outcome from the second flip. Use the factorization principle to prove that E1 and E2 are statistically independent.

#4. Let A and B be random variables. But suppose B becomes certain and is no longer random. Show that A and B must now be statistically independent.

#5. Consider the following Venn diagram showing the probabilities of A and B. Is the equation as it is written correct? Draw the diagram again assuming A and B are independent. How does the equation below the diagram change?

#6. Use the following Venn diagram to calculate P[A], P[B], P[A & B], and P[A or B].
#7. Use the following Venn diagram to calculate P[A], P[B], P[C], P[A & B & C], P[A & B or C], and P[A & C].

#8. In blackjack (or the card game 21), it is claimed that people can raise the odds of winning by counting cards. Does this mean that draws at a blackjack table are not independent? Explain.