EE385 Class Notes 9/7/0 John Stensby

Chapter 3: Multiple Random Variables

Let X and Y denote two random variables. The joint distribution of these random variables is defined as

    F_XY(x,y) = P[X ≤ x, Y ≤ y].    (3-1)

This is the probability that (X,Y) lies in the shaded region (below y and to the left of x) depicted on Figure 3-1.

Elementary Properties of the Joint Distribution

As x and/or y approach minus infinity, the distribution approaches zero; that is,

    F_XY(−∞, y) = 0  and  F_XY(x, −∞) = 0.    (3-2)

To show this, note that {X = −∞, Y ≤ y} ⊂ {X = −∞}, but P[X = −∞] = 0, so F_XY(−∞, y) = 0. Similar reasoning can be given to show that F_XY(x, −∞) = 0.

As x and y both approach infinity (simultaneously, and in any order), the distribution approaches unity; that is,

    F_XY(∞, ∞) = 1.    (3-3)

[Figure 3-1: Region included in the definition of F(x,y).]

Updates at http://www.ece.uah.edu/courses/ee385/
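As a quick numerical illustration of definition (3-1) (a sketch, assuming independent Uniform(0,1) random variables, so that F_XY(x,y) = xy on the unit square; the evaluation point is an arbitrary choice):

```python
import numpy as np

# Monte Carlo check of definition (3-1): for independent Uniform(0,1) X and Y,
# F_XY(x,y) = P[X <= x, Y <= y] = x*y on the unit square.
rng = np.random.default_rng(0)
N = 200_000
X = rng.uniform(size=N)
Y = rng.uniform(size=N)

x, y = 0.7, 0.5
F_empirical = np.mean((X <= x) & (Y <= y))   # fraction of samples in the region
F_exact = x * y                              # = 0.35 for this joint distribution
```

The empirical fraction converges to F_XY(x,y) as the sample size grows, and the same code also exhibits properties (3-2) and (3-3): pushing x or y far negative drives the fraction to 0, pushing both far positive drives it to 1.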
[Figure 3-2: Region x_1 < X ≤ x_2, Y ≤ y on the plane.]

This follows easily by noting that {X ≤ ∞, Y ≤ ∞} = S and P(S) = 1.

In many applications, the identities

    P[x_1 < X ≤ x_2, Y ≤ y] = F_XY(x_2, y) − F_XY(x_1, y)    (3-4)
    P[X ≤ x, y_1 < Y ≤ y_2] = F_XY(x, y_2) − F_XY(x, y_1)    (3-5)

are useful. To show (3-4), note that P[x_1 < X ≤ x_2, Y ≤ y] is the probability that the pair (X, Y) is in the shaded region D depicted by Figure 3-2. Now, it is easily seen that P[X ≤ x_2, Y ≤ y] = P[X ≤ x_1, Y ≤ y] + P[x_1 < X ≤ x_2, Y ≤ y], which is equivalent to F_XY(x_2, y) = F_XY(x_1, y) + P[x_1 < X ≤ x_2, Y ≤ y]. This leads to (3-4). A similar development leads to (3-5).

Joint Density

The joint density of X and Y is defined as the function
    f_XY(x,y) = ∂²F_XY(x,y) / ∂x∂y.    (3-6)

Integrate this result to obtain

    F_XY(x,y) = ∫_{−∞}^{y} ∫_{−∞}^{x} f_XY(u,v) du dv.    (3-7)

Density f_XY(x,y) and distribution F_XY(x,y) describe the joint statistics of the random variables X and Y. On the other hand, f_X(x) and F_X(x) describe the marginal statistics of X.

Let D denote a region of the x-y plane such that {(X,Y) ∈ D} = {ρ ∈ S : (X(ρ), Y(ρ)) ∈ D} is an event (see Fig. 3-3). The probability of this event is

    P[(X,Y) ∈ D] = ∫∫_D f_XY(x,y) dx dy.    (3-8)

Note that

    P[−∞ < X, Y < ∞] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f_XY(x,y) dx dy = 1.    (3-9)

[Figure 3-3: Region used in development of P[(X,Y) ∈ D].]
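The region probability (3-8) can be approximated numerically. A sketch, assuming independent standard Gaussian X and Y (so the answer can be checked against a product of one-dimensional CDF values) and a rectangular region D with illustrative limits:

```python
import numpy as np
from math import erf, sqrt

# Eq. (3-8): P[(X,Y) in D] as a double integral of the joint density over D,
# here for independent standard normals and the rectangle D = [a,b] x [c,d].
a, b, c, d = -1.0, 1.0, 0.0, 2.0
nx = ny = 800
xs = a + (np.arange(nx) + 0.5) * (b - a) / nx          # midpoints in x
ys = c + (np.arange(ny) + 0.5) * (d - c) / ny          # midpoints in y
X, Y = np.meshgrid(xs, ys)
f = np.exp(-(X**2 + Y**2) / 2.0) / (2.0 * np.pi)       # joint density f_XY(x,y)
prob = f.sum() * ((b - a) / nx) * ((d - c) / ny)       # midpoint-rule double integral

def Phi(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))            # standard normal CDF

exact = (Phi(b) - Phi(a)) * (Phi(d) - Phi(c))          # product form (independence)
```

For an unbounded or irregular region D, the same idea applies with the grid adapted to D; Monte Carlo sampling is a common alternative when D is awkward to discretize.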
That is, there is one unit of volume under the joint density function.

Marginal descriptions can be obtained from joint descriptions. We claim that

    F_X(x) = F_XY(x, ∞)  and  F_Y(y) = F_XY(∞, y).    (3-10)

To see this, note that {X ≤ x} = {X ≤ x, Y ≤ ∞} and {Y ≤ y} = {X ≤ ∞, Y ≤ y}. Take the probability of these events to obtain the desired results.

Other relationships are important as well. For example, marginal density f_X(x) can be obtained from the joint density f_XY(x,y) by using

    f_X(x) = ∫_{−∞}^{∞} f_XY(x,y) dy.    (3-11)

To see this, use Leibniz's rule to take the partial derivative, with respect to x, of the distribution

    F_XY(x,y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f_XY(u,v) dv du

to obtain

    ∂F_XY(x,y)/∂x = ∫_{−∞}^{y} f_XY(x,v) dv.    (3-12)

Now, let y go to infinity, and use F_X(x) = F_XY(x, ∞) to get the desired result

    f_X(x) = ∂F_XY(x, ∞)/∂x = ∫_{−∞}^{∞} f_XY(x,v) dv.    (3-13)

A similar development leads to the conclusion that

    f_Y(y) = ∫_{−∞}^{∞} f_XY(x,y) dx.    (3-14)
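The marginalization in (3-11) is easy to demonstrate numerically. A sketch using the illustrative joint density f_XY(x,y) = x + y on the unit square (chosen because it integrates to one and has the simple marginal f_X(x) = x + 1/2):

```python
import numpy as np

# Eq. (3-11): recover a marginal density by integrating the joint density
# over y.  The joint density below is an illustrative choice.
def f_XY(x, y):
    return x + y                              # a valid density on [0,1] x [0,1]

x0 = 0.3
ny = 100_000
ys = (np.arange(ny) + 0.5) / ny               # midpoints of [0,1]
f_marginal = np.sum(f_XY(x0, ys)) / ny        # midpoint-rule integral over y
f_exact = x0 + 0.5                            # analytic marginal f_X(x0)
```

Here the integration range is just [0,1] because the density vanishes outside the unit square; for a density supported on the whole plane, the numerical integral would be truncated at limits where the density is negligible.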
Special Case: Jointly Gaussian Random Variables

Random variables X and Y are jointly Gaussian (a.k.a. jointly normal) if their joint density has the form

    f_XY(x,y) = [1/(2π σ_x σ_y √(1 − r²))] exp{ −[1/(2(1 − r²))] [ (x − η_x)²/σ_x² − 2r(x − η_x)(y − η_y)/(σ_x σ_y) + (y − η_y)²/σ_y² ] },    (3-15)

where η_x = E[X], η_y = E[Y], σ_y² = Var[Y], σ_x² = Var[X], and r is a parameter known as the correlation coefficient (r lies in the range −1 ≤ r ≤ 1). The marginal densities for X and Y are given by

    f_X(x) = [1/(√(2π) σ_x)] exp[ −(x − η_x)²/(2σ_x²) ]  and  f_Y(y) = [1/(√(2π) σ_y)] exp[ −(y − η_y)²/(2σ_y²) ].

Independence

Random variables X and Y are said to be independent if all events of the form {X ∈ A} and {Y ∈ B}, where A and B are sets of real numbers, are independent. Apply this to the events {X ≤ x} and {Y ≤ y} to see that if X and Y are independent then

    F_XY(x,y) = P[X ≤ x, Y ≤ y] = P[X ≤ x] P[Y ≤ y] = F_X(x) F_Y(y)

    f_XY(x,y) = ∂²F_XY(x,y)/∂x∂y = (dF_X(x)/dx)(dF_Y(y)/dy) = f_X(x) f_Y(y).    (3-16)

The converse of this can be shown as well. Hence, X and Y are independent if, and only if, their joint density factors into a product of marginal densities; a similar statement can be made for distribution functions. This result generalizes to more than two random variables; n random
variables are independent if, and only if, their joint density (alternatively, joint distribution) factors into a product of marginal densities (alternatively, marginal distributions).

Example 3-1: Consider X and Y as jointly Gaussian. The only way you can get the equality

    [1/(2π σ_x σ_y √(1 − r²))] exp{ −[1/(2(1 − r²))] [ (x − η_x)²/σ_x² − 2r(x − η_x)(y − η_y)/(σ_x σ_y) + (y − η_y)²/σ_y² ] }
        = {[1/(√(2π) σ_x)] exp[ −(x − η_x)²/(2σ_x²) ]} {[1/(√(2π) σ_y)] exp[ −(y − η_y)²/(2σ_y²) ]}    (3-17)

is to have the correlation coefficient r = 0. Hence, Gaussian X and Y are independent if and only if r = 0.

Many problems become simpler if their random variables are (or can be assumed to be) independent. For example, when dealing with independent random variables, the expected value of a product (of independent random variables) can be expressed as a product of expected values. Also, the variance of a sum of independent random variables is the sum of the variances. These two simplifications are discussed next.

Expectation of a Product of Independent Random Variables

Independence of random variables can simplify many calculations. As an example, let X and Y be random variables. Clearly, the product Z = XY is a random variable (review the definition of a random variable given in Chapter 1). As we will discuss in Chapter 5, the expected value of Z = XY can be computed as

    E[XY] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy f_XY(x,y) dx dy,    (3-18)

where f_XY(x,y) is the joint density of X and Y. Suppose X and Y are independent random variables. Then (3-16) and (3-18) yield
    E[XY] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy f_XY(x,y) dx dy = [∫_{−∞}^{∞} x f_X(x) dx] [∫_{−∞}^{∞} y f_Y(y) dy] = E[X]E[Y].    (3-19)

This result generalizes to more than two random variables; for n independent random variables, the expected value of a product is the product of the expected values.

The converse of (3-19) is not true, in general. That is, if E[XY] = E[X]E[Y], it does not necessarily follow that X and Y are independent.

Variance of a Sum of Independent Random Variables

For a second example where independence simplifies a calculation, let X and Y be independent random variables, and compute the variance of their sum. The variance of X+Y is given by

    Var[X + Y] = E[ {(X + Y) − (E[X] + E[Y])}² ] = E[ {(X − E[X]) + (Y − E[Y])}² ]
               = E[{X − E[X]}²] + 2E[{X − E[X]}{Y − E[Y]}] + E[{Y − E[Y]}²].    (3-20)

Since X and Y are independent, we have E[{X − E[X]}{Y − E[Y]}] = E[X − E[X]] E[Y − E[Y]] = 0, and (3-20) becomes

    Var[X + Y] = E[{X − E[X]}²] + E[{Y − E[Y]}²] = Var[X] + Var[Y].    (3-21)

That is, for independent random variables, the variance of the sum is the sum of the variances (this applies to two or more random variables). In general, if random variables X and Y are dependent, then (3-21) is not true.

Example 3-2: The result just given can be used to simplify the calculation of variance in some cases. Consider the binomial random variable X, the number of successes out of n independent trials. As used in an example that was discussed in Chapter 1 (where we showed
that E[X] = np), we can express binomial X as

    X = X_1 + X_2 + ··· + X_n,    (3-22)

where X_i, 1 ≤ i ≤ n, are random variables defined by

    X_i = 1, if the i-th trial is a "success"
        = 0, otherwise,    1 ≤ i ≤ n.    (3-23)

Note that all n of the X_i are independent, they have identical mean p, and they have identical variance

    Var[X_i] = E[X_i²] − (E[X_i])² = p − p² = pq,    (3-24)

where p is the probability of success on any trial, and q = 1 − p. Hence, we can express the variance of Binomial X as

    Var[X] = Var[X_1 + X_2 + ··· + X_n] = Var[X_1] + Var[X_2] + ··· + Var[X_n] = npq.    (3-25)

Hence, for the Binomial random variable X, we know that E[X] = np and Var[X] = npq.

Random Vectors: Vector-Valued Mean and Covariance Matrix

Let X_1, X_2, ..., X_n denote a set of n random variables. In this subsection, we use vector and matrix techniques to simplify working with multiple random variables. Denote the vector-valued random variable

    X = [X_1 X_2 X_3 ··· X_n]^T.    (3-26)
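The Bernoulli trials of Example 3-2 give a concrete random vector to experiment with: stacking X_1, ..., X_n into X as in (3-26), the sample mean of X estimates the mean vector, and the variance of the component sum reproduces npq. A sketch (the values of n and p are illustrative choices):

```python
import numpy as np

# Random-vector view of Example 3-2: each row below is one realization of
# X = [X_1 ... X_n]^T, a vector of independent Bernoulli(p) trials.
rng = np.random.default_rng(2)
n, p = 20, 0.3
X_vec = (rng.uniform(size=(200_000, n)) < p).astype(float)

mean_vector = X_vec.mean(axis=0)       # estimates E[X] = [p p ... p]^T
X_binomial = X_vec.sum(axis=1)         # Binomial(n,p) = X_1 + ... + X_n
mean_X = X_binomial.mean()             # should be near np  = 6
var_X = X_binomial.var()               # should be near npq = 4.2
```

The sample variance lands near npq = 4.2 precisely because the trials are independent, so (3-21) applies term by term.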
Clearly, using vector notation is helpful; writing X is much easier than writing out the n random variables X_1, X_2, ..., X_n.

The mean of X is a constant vector η = E[X] with components equal to the means of the X_i. We write

    η = E[X] = E[ [X_1 X_2 X_3 ··· X_n]^T ] = [ E[X_1] E[X_2] E[X_3] ··· E[X_n] ]^T = [ η_1 η_2 η_3 ··· η_n ]^T,    (3-27)

where η_i = E[X_i], 1 ≤ i ≤ n.

The covariance of X_i and X_j is defined as

    σ_ij = E[(X_i − η_i)(X_j − η_j)],   1 ≤ i, j ≤ n.    (3-28)

Note that σ_ij = σ_ji. Use these n² covariance values to form the covariance matrix

    Λ = [ σ_11  σ_12  ···  σ_1n ]
        [ σ_21  σ_22  ···  σ_2n ]
        [  ⋮     ⋮          ⋮   ]
        [ σ_n1  σ_n2  ···  σ_nn ].    (3-29)

Note that this matrix is symmetric; that is, note that Λ = Λ^T. Finally, we can write

    Λ = E[ (X − η)(X − η)^T ].    (3-30)

Equation (3-30) provides a compact, simple definition for Λ.

Symmetric Positive Semi-Definite and Positive Definite Matrices

A real-valued, symmetric matrix Q is positive semi-definite (sometimes called
nonnegative definite) if U^T Q U ≥ 0 for all real-valued vectors U. A real-valued, symmetric matrix Q is positive definite if U^T Q U > 0 for all real-valued vectors U ≠ 0. A real-valued, symmetric, positive semi-definite matrix may (or may not) be singular. However, a positive definite symmetric matrix is always nonsingular.

Theorem 3-1: The covariance matrix Λ is positive semi-definite.

Proof: Let U = [u_1 u_2 ··· u_n]^T be an arbitrary, real-valued vector. Now, define the scalar

    Y = Σ_{j=1}^{n} u_j (X_j − η_j).    (3-31)

Clearly, E[Y²] ≥ 0. However,

    E[Y²] = E[ (Σ_{j=1}^{n} u_j (X_j − η_j)) (Σ_{k=1}^{n} u_k (X_k − η_k)) ]
          = Σ_{j=1}^{n} Σ_{k=1}^{n} u_j E[(X_j − η_j)(X_k − η_k)] u_k
          = Σ_{j=1}^{n} Σ_{k=1}^{n} u_j σ_jk u_k
          = U^T Λ U ≥ 0.    (3-32)

Hence, U^T Λ U ≥ 0 for all U, and Λ is positive semi-definite.

Matrix Λ is positive definite in almost all practical applications. That is, U^T Λ U > 0 for all U ≠ 0. If Λ is not positive definite, then at least one of the X_i can be expressed as a linear combination of the remaining n − 1 random variables, and the problem can be simplified by reducing the number of random variables. If Λ is positive definite, then |Λ| > 0, Λ is nonsingular, and Λ⁻¹ exists (|Λ| denotes the determinant of the covariance matrix).
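Equation (3-30) and Theorem 3-1 can both be illustrated numerically: estimate Λ as the sample average of (X − η)(X − η)^T, then check that the quadratic form U^T Λ U is nonnegative. A sketch (the three-variable distribution below is an illustrative choice):

```python
import numpy as np

# Eq. (3-30): estimate the covariance matrix as the average of the outer
# products (X - eta)(X - eta)^T over many realizations of the random vector.
rng = np.random.default_rng(3)
samples = rng.multivariate_normal(mean=[1.0, -1.0, 0.0],
                                  cov=[[2.0, 0.5, 0.0],
                                       [0.5, 1.0, 0.3],
                                       [0.0, 0.3, 1.5]],
                                  size=100_000)        # rows: realizations of X^T

eta_hat = samples.mean(axis=0)                         # mean vector, Eq. (3-27)
centered = samples - eta_hat
Lambda_hat = centered.T @ centered / len(samples)      # E[(X-eta)(X-eta)^T]

# Theorem 3-1: U^T Lambda U >= 0 for every real U; equivalently, all
# eigenvalues of the (symmetric) covariance matrix are nonnegative.
quad_min = min(u @ Lambda_hat @ u for u in rng.standard_normal((1000, 3)))
eig_min = np.linalg.eigvalsh(Lambda_hat).min()
```

The estimate matches numpy's built-in `np.cov(samples.T, bias=True)`, and with 100,000 samples of a full-rank distribution the estimated Λ is in fact positive definite, so every quadratic form and every eigenvalue comes out strictly positive.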
Uncorrelated Random Variables

Suppose we are given n random variables X_i, 1 ≤ i ≤ n. We say that the X_i are uncorrelated if, for 1 ≤ i, j ≤ n,

    E[X_i X_j] = E[X_i] E[X_j],   i ≠ j.    (3-33)

For uncorrelated X_i, 1 ≤ i ≤ n, we have

    σ_ij = E[(X_i − η_i)(X_j − η_j)] = E[X_i X_j] − η_i E[X_j] − η_j E[X_i] + η_i η_j
         = E[X_i]E[X_j] − η_i η_j − η_j η_i + η_i η_j = 0,   i ≠ j,    (3-34)

and this leads to

    σ_ij = σ_i²,  i = j
         = 0,    i ≠ j.    (3-35)

Hence, for uncorrelated random variables, matrix Λ is diagonal and of the form (the variances are on the diagonal)

    Λ = [ σ_1²   0    ···   0   ]
        [  0    σ_2²  ···   0   ]
        [  ⋮          ⋱     ⋮   ]
        [  0     0    ···  σ_n² ].    (3-36)

If X_i and X_j are independent, they are also uncorrelated, a conclusion that follows from (3-19). However, the converse is not true, in general. Uncorrelated random variables may be dependent.
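A standard illustration of the last point (not from the notes): X uniform on (−1,1) and Y = X² satisfy E[XY] = E[X³] = 0 = E[X]E[Y], so they are uncorrelated, yet Y is completely determined by X. A numerical sketch:

```python
import numpy as np

# X ~ Uniform(-1,1) and Y = X^2: uncorrelated, but clearly dependent --
# the event {Y <= 0.25} is exactly the event {|X| <= 0.5}.
rng = np.random.default_rng(5)
X = rng.uniform(-1.0, 1.0, size=1_000_000)
Y = X**2

covariance = np.mean(X * Y) - np.mean(X) * np.mean(Y)      # near zero: uncorrelated
p_joint = np.mean((Y <= 0.25) & (np.abs(X) > 0.5))         # 0: the events exclude each other
p_product = np.mean(Y <= 0.25) * np.mean(np.abs(X) > 0.5)  # about 0.25
```

Since P[Y ≤ 0.25, |X| > 0.5] = 0 while the product of the individual probabilities is about 0.25, the joint probability does not factor, so X and Y are dependent despite being uncorrelated.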
Multivariable Gaussian Density

Let X_1, X_2, ..., X_n be jointly Gaussian random variables. Let Λ denote the covariance matrix for the random vector X = [X_1 X_2 ... X_n]^T, and denote η = E[X]. The joint density of X_1, X_2, ..., X_n can be expressed as a density for the vector X. This density is denoted as f(X), and it is given by

    f(X) = [1/((2π)^{n/2} |Λ|^{1/2})] exp[ −(1/2)(X − η)^T Λ⁻¹ (X − η) ].    (3-37)

When n = 1 (or 2), this result yields the expressions given in class for the first- (second-) order case.

With (3-37), we have perpetuated a common abuse of notation. We have used X = [X_1 X_2 ··· X_n]^T to denote a vector of random variables. However, in (3-37), X is a vector of algebraic variables. This unfortunate dual use of a symbol is common in the literature. This ambiguity should present no real problem; from context, the exact interpretation of X should be clear.

Example 3-3: Let X = [X_1 X_2]^T be a Gaussian random vector. Then X_1 and X_2 are Gaussian random variables with joint density of the form (3-15). Let η_1 = E[X_1] and η_2 = E[X_2] denote the means of X_1 and X_2, respectively; likewise, let σ_1² = Var[X_1] and σ_2² = Var[X_2]. Finally, let r denote the correlation coefficient in the joint density. Find an expression for covariance matrix Λ in terms of these quantities.

This can be accomplished by comparing the exponents of (3-37) and (3-15). For the exponent of (3-37), let Q = Λ⁻¹ and write

    (X − η)^T Q (X − η) = [X_1 − η_1  X_2 − η_2] [ q_11  q_12 ] [ X_1 − η_1 ]
                                                 [ q_12  q_22 ] [ X_2 − η_2 ]
                        = q_11(X_1 − η_1)² + 2q_12(X_1 − η_1)(X_2 − η_2) + q_22(X_2 − η_2)²,    (3-38)
where we have used the fact that Q is symmetric (Q^T = Q). Now, compare (3-38) with the exponent of (3-15) (where X_1 and X_2 are used instead of x and y) and write

    q_11(X_1 − η_1)² + 2q_12(X_1 − η_1)(X_2 − η_2) + q_22(X_2 − η_2)²
        = [1/(1 − r²)] [ (X_1 − η_1)²/σ_1² − 2r(X_1 − η_1)(X_2 − η_2)/(σ_1 σ_2) + (X_2 − η_2)²/σ_2² ].    (3-39)

Equate like terms on both sides of (3-39) and obtain

    q_11 = 1/(σ_1²(1 − r²))
    q_12 = −r/(σ_1 σ_2 (1 − r²))    (3-40)
    q_22 = 1/(σ_2²(1 − r²)).

Finally, take the inverse of the matrix Q defined by (3-40) and obtain

    Λ = Q⁻¹ = [ σ_1²       r σ_1 σ_2 ]
              [ r σ_1 σ_2  σ_2²      ].    (3-41)

The matrix on the right-hand side of (3-41) shows the general form of the covariance matrix for a two-dimensional Gaussian random vector. From (3-41) and the discussion before (3-29), we note that E[(X_1 − η_1)(X_2 − η_2)] = r σ_1 σ_2. Hence, the correlation coefficient r can be written as
    r = E[(X_1 − η_1)(X_2 − η_2)] / (σ_1 σ_2) = (E[X_1 X_2] − η_1 η_2) / (σ_1 σ_2),    (3-42)

the covariance normalized by the product σ_1 σ_2. When working problems, Equation (3-42) is an important, very useful formula for correlation coefficient r.
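Tying (3-41) and (3-42) together numerically: build Λ as in (3-41) from a chosen r, draw samples from the corresponding two-dimensional Gaussian, and then recover r from the sample covariance via (3-42). Parameter values are illustrative:

```python
import numpy as np

# Build Lambda as in Eq. (3-41) from chosen sigma_1, sigma_2, and r, sample
# from the corresponding Gaussian vector, and estimate r via Eq. (3-42).
rng = np.random.default_rng(6)
r_true, sig1, sig2 = 0.8, 2.0, 0.5
eta = [1.0, -1.0]
Lam = np.array([[sig1**2,          r_true*sig1*sig2],
                [r_true*sig1*sig2, sig2**2         ]])
X = rng.multivariate_normal(eta, Lam, size=500_000)

# Eq. (3-42): covariance normalized by the product of standard deviations.
cov12 = np.mean((X[:, 0] - X[:, 0].mean()) * (X[:, 1] - X[:, 1].mean()))
r_hat = cov12 / (X[:, 0].std() * X[:, 1].std())
```

With half a million samples the estimate r_hat sits within a small fraction of a percent of the chosen r, illustrating why (3-42) is the workhorse formula for estimating correlation from data.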