When Are Two Random Variables Independent?

1 Introduction.

Almost all of the mathematics of inferential statistics and sampling theory is based on the behavior of mutually independent random variables, for these reasons:

1. In most instances, for all practical purposes, sampling is performed with replacement.[1]

2. As a consequence, in an experiment that consists of sampling $n$ members of a population and performing the same measurement on each of them: if $Y_j$ is the result of measuring the $j$th member of the sample, then the random variables $\{Y_1, Y_2, \dots, Y_n\}$ are independent and identically distributed.

Fortunately, joint distributions of independent random variables are the easiest ones to analyze. This handout focuses on how to characterize independence in the case of two random variables; characterizing independence for three or more random variables is more complicated.

The intuitive, unmathematicized idea to be rendered precise is this: random variables $X$ and $Y$ are independent if they have nothing to do with each other, in the sense that the probabilities associated with $X$ remain the same regardless of what the value of $Y$ is (and vice versa).

I will need the following useful result about joint CDFs (true regardless of whether or not $X$ and $Y$ are independent). Recall that $F_X(x) = P(X \le x)$, $F_Y(y) = P(Y \le y)$, and $F_{XY}(x, y) = P(X \le x,\ Y \le y)$.

Theorem 1. If $X$ and $Y$ are random variables defined on the same sample space, then for any $a < b$ and $c < d$,

    $P(a < X \le b,\ c < Y \le d) = F_{XY}(b, d) - F_{XY}(b, c) - F_{XY}(a, d) + F_{XY}(a, c)$.    (1)

Exercise 1. Prove equation (1).

[1] Because the population is generally much larger than the sample size; the population is often theoretically infinite.
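As a quick sanity check on equation (1), here is a minimal numeric sketch in Python. It assumes, purely for illustration, that $X$ and $Y$ are independent Uniform(0,1) variables, so that $F_{XY}(x, y) = xy$ on the unit square; the names F_XY and rect_prob are chosen for this example, not part of the handout's notation.

```python
# Equation (1): the probability of a rectangle, computed from the joint CDF.
# Assumed example: X, Y independent Uniform(0,1), so F_XY(x, y) = x * y.

def F_XY(x, y):
    """Joint CDF of two (assumed) independent Uniform(0,1) variables."""
    clamp = lambda t: max(0.0, min(1.0, t))
    return clamp(x) * clamp(y)

def rect_prob(a, b, c, d):
    """P(a < X <= b, c < Y <= d) via the four-term formula in equation (1)."""
    return F_XY(b, d) - F_XY(b, c) - F_XY(a, d) + F_XY(a, c)

# For independent uniforms, the probability of a rectangle is its area:
print(rect_prob(0.2, 0.5, 0.1, 0.4))  # 0.3 * 0.3 = 0.09, up to float rounding
```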
2 The Discrete Case.

As discussed in class and in handout #5:

Definition 1 (discrete case). Let $X$ and $Y$ be two discrete random variables defined on the same sample space. $X$ and $Y$ are independent iff for every value $x_i$ in the distribution of $X$ and every value $y_j$ in the distribution of $Y$,

    $P(X = x_i,\ Y = y_j) = P(X = x_i)\,P(Y = y_j)$; or, equivalently, $p(x_i, y_j) = p_X(x_i)\,p_Y(y_j)$.    (2)

Equation (2) works perfectly for discrete random variables, but it does not work for random variables in general. Our first task is to establish a condition which is equivalent to (2) for discrete random variables but which will also apply to other types of random variables.
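Equation (2) is easy to check mechanically on a finite joint pmf. The following Python sketch does so for a small made-up table (the pmf values are assumptions for illustration, not taken from class or from handout #5):

```python
# Definition 1's test: p(x_i, y_j) = p_X(x_i) * p_Y(y_j) at EVERY pair of values.

from itertools import product

joint = {  # assumed joint pmf p(x_i, y_j) for X in {0, 1} and Y in {0, 1}
    (0, 0): 0.12, (0, 1): 0.28,
    (1, 0): 0.18, (1, 1): 0.42,
}

xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})
p_X = {x: sum(joint[(x, y)] for y in ys) for x in xs}  # marginal pmf of X
p_Y = {y: sum(joint[(x, y)] for x in xs) for y in ys}  # marginal pmf of Y

independent = all(
    abs(joint[(x, y)] - p_X[x] * p_Y[y]) < 1e-12  # tolerance for float rounding
    for x, y in product(xs, ys)
)
print(independent)  # True: e.g. 0.12 = 0.4 * 0.3 and 0.42 = 0.6 * 0.7
```

Moving probability mass between entries (so the total stays 1) generally breaks the factorization, and the check prints False.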
Theorem 2. Let $X$ and $Y$ be discrete random variables defined on the same sample space. Then

    $X$ and $Y$ are independent $\iff$ $F_{XY}(x, y) = F_X(x)\,F_Y(y)$ for all points $(x, y)$ in the plane.    (3)

Proof (⇒): Assuming $X$ and $Y$ to be independent, we can calculate as follows. The proof is based on the fact that set intersection distributes over set union in the same way that multiplication distributes over addition (see Exercise 2 below).

    $F_X(x)\,F_Y(y) = P(X \le x)\,P(Y \le y)$
    $= \Big( \sum_{x_i \le x} P(X = x_i) \Big) \Big( \sum_{y_j \le y} P(Y = y_j) \Big)$
    $= \sum_{x_i \le x} \Big[ P(X = x_i) \sum_{y_j \le y} P(Y = y_j) \Big]$    (partially regroup)
    $= \sum_{x_i \le x} \sum_{y_j \le y} P(X = x_i)\,P(Y = y_j)$    (distribute)
    $= \sum_{x_i \le x} \sum_{y_j \le y} P\big( (X = x_i) \cap (Y = y_j) \big)$    (by equation (2))
    $= \sum_{x_i \le x} P\Big( (X = x_i) \cap \bigcup_{y_j \le y} (Y = y_j) \Big)$    (events nonoverlapping; Exercise 2)
    $= \sum_{x_i \le x} P\big( (X = x_i) \cap (Y \le y) \big)$
    $= P\Big( \bigcup_{x_i \le x} \big[ (X = x_i) \cap (Y \le y) \big] \Big)$    (events nonoverlapping)
    $= P\big( (X \le x) \cap (Y \le y) \big)$    (by Exercise 2)
    $= F_{XY}(x, y)$.

The proof in the other direction will be simpler if I allow all integers (positive, negative, or zero) as indices for the values of $X$ and $Y$ and assume that larger indices go with larger values:[2]

    $\cdots < x_{-3} < x_{-2} < x_{-1} < x_0 < x_1 < x_2 < x_3 < \cdots$ and $\cdots < y_{-3} < y_{-2} < y_{-1} < y_0 < y_1 < y_2 < y_3 < \cdots$.

Arranging things in this way gives formula (4), which streamlines the proof:

    $P(X = x_i) = F_X(x_i) - F_X(x_{i-1})$ and $P(Y = y_j) = F_Y(y_j) - F_Y(y_{j-1})$.    (4)

Proof (⇐):

    $P(X = x_i)\,P(Y = y_j) = \big( F_X(x_i) - F_X(x_{i-1}) \big) \big( F_Y(y_j) - F_Y(y_{j-1}) \big)$
    $= F_X(x_i)\,F_Y(y_j) - F_X(x_{i-1})\,F_Y(y_j) - F_X(x_i)\,F_Y(y_{j-1}) + F_X(x_{i-1})\,F_Y(y_{j-1})$    (multiply out)
    $= F_{XY}(x_i, y_j) - F_{XY}(x_{i-1}, y_j) - F_{XY}(x_i, y_{j-1}) + F_{XY}(x_{i-1}, y_{j-1})$    (by assumption)
    $= P(x_{i-1} < X \le x_i,\ y_{j-1} < Y \le y_j)$    (by Exercise 1)
    $= P(X = x_i,\ Y = y_j)$,

where the last equality holds because $x_i$ is the only value of $X$ in the interval $(x_{i-1}, x_i]$ and $y_j$ is the only value of $Y$ in $(y_{j-1}, y_j]$.

Exercise 2. Show, for any events $\{B, A_1, A_2, A_3, \dots\}$, that

    $\Big( \bigcup_j A_j \Big) \cap B = \bigcup_j \big( A_j \cap B \big)$.

[2] Full disclosure department: even with this modification, this proof does not work for all discrete random variables, because it is not always possible to arrange their values consecutively.

3 The General Case.

Theorem 2 for discrete random variables indicates how to go about characterizing independence in general. Equation (5) gives a characterization of independence that is valid in all cases (including the discrete case, by Theorem 2).

Definition 2 (general case). Let $X$ and $Y$ be any random variables defined on the same sample space. Then

    $X$ and $Y$ are independent $\iff$ $F_{XY}(x, y) = F_X(x)\,F_Y(y)$ for all points $(x, y)$ in the plane.    (5)
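Definition 2 can also be checked symbolically when the joint CDF has a closed form. Here is a small sketch for one assumed example: two independent Exponential(1) variables, whose joint CDF on $x, y > 0$ is $(1 - e^{-x})(1 - e^{-y})$. The marginal CDFs are recovered as limits of the joint CDF, and then the factorization in (5) is verified.

```python
# Definition 2's test on an assumed joint CDF (independent Exponential(1)s).

import sympy as sp

x, y = sp.symbols('x y', positive=True)
F_XY = (1 - sp.exp(-x)) * (1 - sp.exp(-y))  # assumed joint CDF for x, y > 0

F_X = sp.limit(F_XY, y, sp.oo)  # F_X(x) = lim_{y -> oo} F_XY(x, y)
F_Y = sp.limit(F_XY, x, sp.oo)  # F_Y(y) = lim_{x -> oo} F_XY(x, y)

print(sp.simplify(F_XY - F_X * F_Y) == 0)  # True: equation (5) holds
```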
4 The Continuous Case.

Now suppose that $X$ and $Y$ have continuous joint density[3] $f_{XY}(x, y)$, so that $X$ and $Y$ separately have marginal densities

    $f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dy$ and $f_Y(y) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx$.

I will need the following fact for the proof of the (⇒) direction of Theorem 4.

Theorem 3.

    $\dfrac{\partial}{\partial x} \Big( \dfrac{\partial}{\partial y} F_{XY}(x, y) \Big) = f_{XY}(x, y)$.    (6)

Proof. Step 1. For fixed $x$, put[4]

    $g_x(\hat y) := \int_{-\infty}^{x} f_{XY}(\hat x, \hat y)\,d\hat x$.

Then

    $\dfrac{\partial}{\partial y} F_{XY}(x, y) = \dfrac{\partial}{\partial y} \int_{-\infty}^{y} \underbrace{\int_{-\infty}^{x} f_{XY}(\hat x, \hat y)\,d\hat x}_{g_x(\hat y)}\,d\hat y = \dfrac{\partial}{\partial y} \int_{-\infty}^{y} g_x(\hat y)\,d\hat y = g_x(y) = \int_{-\infty}^{x} f_{XY}(\hat x, y)\,d\hat x$    (FTC).

Step 2. Then

    $\dfrac{\partial}{\partial x} \Big( \dfrac{\partial}{\partial y} F_{XY}(x, y) \Big) = \dfrac{\partial}{\partial x} \int_{-\infty}^{x} f_{XY}(\hat x, y)\,d\hat x$    (by Step 1)
    $= f_{XY}(x, y)$    (FTC).

[3] In general, of course, $f_{XY}(x, y)$ need not be continuous, but by assuming continuity I can simplify the proofs. (Theorems 4 and 5 are true even if $f_{XY}(x, y)$ is not continuous.)

[4] The variables $\hat x$ and $\hat y$ are variables of integration. They are logically distinct from the variables $x$ and $y$, which appear as limits on the integral signs.
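Equation (6) is easy to verify symbolically for a concrete joint CDF. The sketch below reuses the assumed example from the previous snippet (independent Exponential(1) variables): differentiating $F_{XY}$ once in each variable recovers the joint density.

```python
# Theorem 3's mixed partial, checked on an assumed joint CDF.

import sympy as sp

x, y = sp.symbols('x y', positive=True)
F_XY = (1 - sp.exp(-x)) * (1 - sp.exp(-y))  # assumed joint CDF for x, y > 0

f_XY = sp.diff(F_XY, x, y)  # d/dx (d/dy F_XY), as in equation (6)
print(sp.simplify(f_XY))    # exp(-x - y) = e^(-x) * e^(-y), the joint density
```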
There are two ways to determine from the joint density function whether or not $X$ and $Y$ are independent. The first of these (Theorem 4) parallels equation (2); the second characterization (Theorem 5) generalizes the first one.

Theorem 4. For continuous random variables $X$ and $Y$ with continuous joint density $f_{XY}(x, y)$:

    $X$ and $Y$ are independent $\iff$ $f_{XY}(x, y) = f_X(x)\,f_Y(y)$ for all $(x, y)$.    (7)

Proof (⇐):

    $F_{XY}(x, y) = \int_{-\infty}^{y} \int_{-\infty}^{x} f_{XY}(\hat x, \hat y)\,d\hat x\,d\hat y = \int_{-\infty}^{y} \int_{-\infty}^{x} f_X(\hat x)\,f_Y(\hat y)\,d\hat x\,d\hat y = \Big( \int_{-\infty}^{x} f_X(\hat x)\,d\hat x \Big) \Big( \int_{-\infty}^{y} f_Y(\hat y)\,d\hat y \Big) = F_X(x)\,F_Y(y)$.

Proof (⇒):

    $f_{XY}(x, y) = \dfrac{\partial}{\partial x} \Big( \dfrac{\partial}{\partial y} F_{XY}(x, y) \Big)$    (by equation (6))
    $= \dfrac{\partial}{\partial x} \Big( \dfrac{\partial}{\partial y} F_X(x)\,F_Y(y) \Big)$    (by independence)
    $= \dfrac{\partial}{\partial x} \Big( F_X(x)\,\dfrac{d}{dy} F_Y(y) \Big)$
    $= \dfrac{\partial}{\partial x} \big( F_X(x)\,f_Y(y) \big)$
    $= f_Y(y)\,\dfrac{d}{dx} F_X(x)$
    $= f_Y(y)\,f_X(x)$.

As mentioned above, Theorem 5 generalizes Theorem 4. The difference between the theorems is that in equation (8), the functions $g(x)$ and $h(y)$ do not need to be the marginal density functions $f_X(x)$ and $f_Y(y)$.

Theorem 5. For continuous random variables $X$ and $Y$ with continuous joint density $f_{XY}(x, y)$:

    $X$ and $Y$ are independent $\iff$ there exist continuous nonnegative functions $g(x)$ and $h(y)$ such that $f_{XY}(x, y) = g(x)\,h(y)$.    (8)

Proof (⇒): This is immediate from Theorem 4: take $g(x) = f_X(x)$ and $h(y) = f_Y(y)$.

Proof (⇐): Let $f_{XY}(x, y) = g(x)\,h(y)$. I will show it is also true that $f_{XY}(x, y) = f_X(x)\,f_Y(y)$, so that $X$ and $Y$ are independent by Theorem 4.

Step 1. We have

    $f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dy = \int_{-\infty}^{\infty} g(x)\,h(y)\,dy = g(x) \int_{-\infty}^{\infty} h(y)\,dy$,    (9)

so that

    $f_{XY}(x, y) = g(x)\,h(y) = \Big( g(x) \int_{-\infty}^{\infty} h(\hat y)\,d\hat y \Big) \Big( \dfrac{h(y)}{\int_{-\infty}^{\infty} h(\hat y)\,d\hat y} \Big) = f_X(x)\,\dfrac{h(y)}{\int_{-\infty}^{\infty} h(\hat y)\,d\hat y}$.

In Step 2, I will show that $h(y) \big/ \int_{-\infty}^{\infty} h(\hat y)\,d\hat y = f_Y(y)$ (thus completing the proof).

Step 2. A calculation parallel to that in equation (9) shows that

    $f_Y(y) = h(y) \int_{-\infty}^{\infty} g(x)\,dx$;

I will complete the proof by showing that $\int_{-\infty}^{\infty} g(x)\,dx = 1 \big/ \int_{-\infty}^{\infty} h(y)\,dy$:

    $1 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx\,dy = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x)\,h(y)\,dx\,dy = \Big( \int_{-\infty}^{\infty} h(y)\,dy \Big) \Big( \int_{-\infty}^{\infty} g(x)\,dx \Big)$.
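The bookkeeping in this proof can be seen concretely with a symbolic example. In the sketch below, $g$ and $h$ are chosen by hand (they are assumptions for illustration): they multiply to a genuine joint density, but neither is itself a marginal density. Rescaling by the constant $\int h(y)\,dy$ from equation (9) recovers the true marginals.

```python
# Theorem 5's normalization, on an assumed factorization f_XY = g * h.

import sympy as sp

x, y = sp.symbols('x y', positive=True)
g = 3 * sp.exp(-2 * x)  # integrates to 3/2, so g is NOT a density
h = 2 * sp.exp(-3 * y)  # integrates to 2/3, so h is NOT a density

f_XY = g * h  # 6 e^(-2x - 3y): a genuine joint density (integrates to 1)
H = sp.integrate(h, (y, 0, sp.oo))  # the constant from equation (9); here 2/3

print(sp.simplify(g * H))  # 2*exp(-2*x) = f_X(x), the true marginal of X
print(sp.simplify(h / H))  # 3*exp(-3*y) = f_Y(y), the true marginal of Y
```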