Chapter 5: Two Random Variables

In a practical engineering problem, there is almost always a causal relationship between different events. Some relationships are determined by physical laws, e.g., voltage and current, while others are abstracted from the problem, e.g., the probability of passing a class and the probability of graduating. Whenever we need to handle the relationship between two or more events, we need mathematical tools to describe the probabilistic phenomenon. The objective of this chapter is to present the concepts of joint distributions.

5.1 Joint PMF and Joint PDF

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows.

Definition 1. Let $X$ and $Y$ be two discrete random variables. The joint PMF of $X$ and $Y$ is defined as
$$p_{X,Y}(x,y) = \mathbb{P}[X = x \cap Y = y]. \tag{5.1}$$

The interpretation of a joint PMF is that the sample space is now the Cartesian product $\Omega_X \times \Omega_Y$, where $\Omega_X$ is the sample space of $X$ and $\Omega_Y$ is the sample space of $Y$. Pictorially, this means that the sample space of the joint PMF is a two-dimensional plane $(X, Y)$. We stress the importance of this two-dimensional sample space, because every outcome of a joint variable is a point in the two-dimensional space, i.e., $(X, Y)$. Therefore, $\mathbb{P}[X \in A \cap Y \in B]$ for sets $A$ and $B$ can be interpreted as
$$\mathbb{P}[X \in A \cap Y \in B] = \mathbb{P}\left[\{(\xi, \zeta) \mid \xi \in X^{-1}(A), \text{ and } \zeta \in Y^{-1}(B)\}\right]. \tag{5.2}$$

For discrete random variables, the PMF $p_{X,Y}(x,y)$ can be considered as a set of delta functions in the two-dimensional space.
Example. Let $X$ be a coin flip and $Y$ be the roll of a die. Find the joint PMF of $X$ and $Y$.

Solution. The joint PMF is
$$p_{X,Y}(x,y) = \frac{1}{12}, \qquad x = 0, 1, \quad y = 1, 2, 3, 4, 5, 6.$$
Pictorially, the joint PMF is given by the following table.

|         | $Y=1$ | $Y=2$ | $Y=3$ | $Y=4$ | $Y=5$ | $Y=6$ |
|---------|-------|-------|-------|-------|-------|-------|
| $X=0$   | 1/12  | 1/12  | 1/12  | 1/12  | 1/12  | 1/12  |
| $X=1$   | 1/12  | 1/12  | 1/12  | 1/12  | 1/12  | 1/12  |

In this example, we observe that if $X$ and $Y$ are not interacting (formally, we call them independent, which we will discuss later), then the joint PMF is the product of the two individual probabilities.

The continuous version of the joint PMF is called the joint PDF.

Definition 2. Let $X$ and $Y$ be two continuous random variables. The joint PDF of $X$ and $Y$ is a function $f_{X,Y}(x,y)$ that can be integrated to yield a probability:
$$\mathbb{P}[a \le X \le b \cap c \le Y \le d] = \int_c^d \int_a^b f_{X,Y}(x,y)\, dx\, dy. \tag{5.3}$$

Like PDFs for single random variables, a joint PDF is a density which can be integrated to obtain a probability. Note also that in this definition, the events $\{a \le X \le b\}$ and $\{c \le Y \le d\}$ are related using a logical AND.

Example. Consider a uniform joint PDF $f_{X,Y}(x,y) = 1$ defined on $[0,1]^2$, as shown in Figure 5.1. The shaded area corresponds to
$$\mathbb{P}[a \le X \le b \cap c \le Y \le d] = \int_c^d \int_a^b f_{X,Y}(x,y)\, dx\, dy = \int_c^d \int_a^b 1\, dx\, dy = (d-c)(b-a).$$
In general, when $f_{X,Y}(x,y)$ is not uniform, we have to integrate $f_{X,Y}(x,y)$ over the interval specified.

Figure 5.1: The joint PDF $f_{X,Y}(x,y)$ is a two-dimensional function. Integrating over the rectangle $[a,b] \times [c,d]$ returns the probability $\mathbb{P}[a \le X \le b \cap c \le Y \le d]$. (a) General $f_{X,Y}(x,y)$. (b) Example.
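As a quick numerical check of the coin-and-die example (an added illustration, not part of the original notes), the sketch below stores the joint PMF as a 2-D array, verifies that it sums to one, and confirms that it equals the product of the two individual probabilities.

```python
import numpy as np

# Joint PMF of a fair coin flip X (rows: x = 0, 1) and a fair die roll Y
# (columns: y = 1, ..., 6). Every entry equals (1/2) * (1/6) = 1/12.
p_XY = np.full((2, 6), 1.0 / 12.0)

# Normalization: summing over the whole two-dimensional sample space gives 1.
print(p_XY.sum())                              # 1.0

# Product structure: p_{X,Y}(x, y) = p_X(x) * p_Y(y) for this example.
p_X = np.full(2, 1.0 / 2.0)
p_Y = np.full(6, 1.0 / 6.0)
print(np.allclose(p_XY, np.outer(p_X, p_Y)))   # True
```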
Normalization

The normalization property of a two-dimensional PMF or PDF states that by enumerating over all outcomes of the sample space we obtain 1.

Theorem 1. All joint PMFs and joint PDFs satisfy
$$\sum_x \sum_y p_{X,Y}(x,y) = 1 \qquad \text{or} \qquad \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X,Y}(x,y)\, dx\, dy = 1. \tag{5.4}$$

Example. Consider a joint uniform PDF defined on the shaded area $\Omega$ with the PDF defined below. Find the constant $c$.
$$f_{X,Y}(x,y) = \begin{cases} c, & \text{if } (x,y) \in \Omega, \\ 0, & \text{otherwise.} \end{cases}$$

Solution. To find the constant $c$, we note that
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X,Y}(x,y)\, dx\, dy = 1.$$
The left-hand side of this equation is precisely $c$ times the area of $\Omega$, i.e., $c\,|\Omega|$. Therefore, we have $c = 1/|\Omega|$.
Marginal PMF and Marginal PDF

If we only sum / integrate with respect to one random variable, we obtain the PMF / PDF of the other random variable. The resulting PMF / PDF is called the marginal PMF / PDF.

Definition 3. The marginal PMF is defined as
$$p_X(x) = \sum_y p_{X,Y}(x,y) \qquad \text{and} \qquad p_Y(y) = \sum_x p_{X,Y}(x,y). \tag{5.5}$$

Definition 4. The marginal PDF is defined as
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\, dy \qquad \text{and} \qquad f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\, dx. \tag{5.6}$$

Since $f_{X,Y}(x,y)$ is a two-dimensional function, when integrating over $y$ from $-\infty$ to $\infty$, we project $f_{X,Y}(x,y)$ onto the $x$-axis. Therefore, the resulting function depends on $x$ only.

Example. Consider the joint PDF $f_{X,Y}(x,y)$ shown in Figure 5.2. Find the marginal PDFs.

Solution. Integrating the joint PDF over $y$ gives the marginal $f_X(x)$, and integrating over $x$ gives $f_Y(y)$. Because the joint PDF is uniform on its support, both marginals are the piecewise-constant functions shown in Figure 5.2.

Figure 5.2: Example of a joint uniform PDF $f_{X,Y}(x,y)$ and the corresponding marginal PDFs.
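The same projection idea applies to discrete tables: summing a joint PMF along one axis produces the marginal of the other variable. The short sketch below (an added illustration, using the coin-and-die PMF from the previous section as the assumed example table) does exactly that.

```python
import numpy as np

# Joint PMF of the coin flip X (rows) and the die roll Y (columns).
p_XY = np.full((2, 6), 1.0 / 12.0)

# Marginalization = summing out the other variable (Definition 3).
p_X = p_XY.sum(axis=1)   # sum over y -> marginal PMF of X
p_Y = p_XY.sum(axis=0)   # sum over x -> marginal PMF of Y

print(p_X)               # [0.5 0.5]
print(p_Y)               # six entries of 1/6
```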
Example. Consider a 2D Gaussian PDF as shown in Figure 5.3. The PDF of the joint Gaussian is
$$f_{X,Y}(x,y) = \frac{1}{2\pi\sigma^2} \exp\left\{ -\frac{(x-\mu_X)^2 + (y-\mu_Y)^2}{2\sigma^2} \right\}.$$
Find the marginal PDFs $f_X(x)$ and $f_Y(y)$.

Solution.
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\, dy = \int_{-\infty}^{\infty} \frac{1}{2\pi\sigma^2} \exp\left\{ -\frac{(x-\mu_X)^2 + (y-\mu_Y)^2}{2\sigma^2} \right\} dy$$
$$= \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(x-\mu_X)^2}{2\sigma^2} \right\} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(y-\mu_Y)^2}{2\sigma^2} \right\} dy = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(x-\mu_X)^2}{2\sigma^2} \right\}.$$
Similarly, we have
$$f_Y(y) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(y-\mu_Y)^2}{2\sigma^2} \right\}.$$

The result of this example shows that the marginalization of a 2D Gaussian is a 1D Gaussian along the vertical and the horizontal axes. Thus, we can think of marginalization as a projection.

Figure 5.3: Marginalization is equivalent to projection. A joint PDF shown in this figure can be marginalized onto the $x$- or the $y$-axis.

Independence of Random Variables

Finally, we say that two random variables are independent if the joint PMF or PDF can be factorized as a product of the marginal PMFs / PDFs:

Definition 5. If two random variables $X$ and $Y$ are independent, then
$$p_{X,Y}(x,y) = p_X(x)\, p_Y(y), \qquad \text{and} \qquad f_{X,Y}(x,y) = f_X(x)\, f_Y(y).$$
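Definition 5 gives a direct numerical test for finite tables: a joint PMF describes independent variables exactly when it equals the outer product of its own marginals. The sketch below is an added illustration (the two tables are made-up examples, not from the notes).

```python
import numpy as np

def is_independent(p_xy, tol=1e-12):
    """Check whether a joint PMF table factorizes as p_X(x) * p_Y(y)."""
    p_x = p_xy.sum(axis=1)
    p_y = p_xy.sum(axis=0)
    return np.allclose(p_xy, np.outer(p_x, p_y), atol=tol)

# Independent example: fair coin x fair die (every entry 1/12).
print(is_independent(np.full((2, 6), 1 / 12)))   # True

# Dependent example: probability mass only on the "diagonal" outcomes.
p = np.array([[0.5, 0.0],
              [0.0, 0.5]])
print(is_independent(p))                          # False
```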
To see why this definition is consistent with the definition of independence of two events, we recall that two events $A$ and $B$ are independent if $\mathbb{P}[A \cap B] = \mathbb{P}[A]\,\mathbb{P}[B]$. Letting $A = \{X = x\}$ and $B = \{Y = y\}$, we see that if $A$ and $B$ are independent then
$$\mathbb{P}[X = x \cap Y = y] = \mathbb{P}[X = x]\,\mathbb{P}[Y = y].$$
This is precisely the relationship $p_{X,Y}(x,y) = p_X(x)\,p_Y(y)$.

Independence is an important statistical property. If there are many random variables $X_1, X_2, \ldots, X_N$, the joint PDF $f_{X_1,\ldots,X_N}(x_1,\ldots,x_N)$ is an $N$-dimensional function which could be computationally intractable. However, if we assume all these random variables are independent, then the joint PDF becomes
$$f_{X_1,\ldots,X_N}(x_1,\ldots,x_N) = \prod_{n=1}^N f_{X_n}(x_n),$$
which is often manageable. As a special case of independent random variables, we define the notion of independent and identically distributed (i.i.d.) random variables.

Definition 6 (Independent and Identically Distributed (i.i.d.)). A collection of random variables $X_1, \ldots, X_N$ are called independent and identically distributed (i.i.d.) if

- all $X_1, \ldots, X_N$ are independent;
- all $X_1, \ldots, X_N$ have the same distribution, i.e., $f_{X_1}(x) = \cdots = f_{X_N}(x)$.

If $X_1, \ldots, X_N$ are i.i.d., we have that
$$f_{X_1,\ldots,X_N}(x,\ldots,x) = \prod_{n=1}^N f_{X_n}(x) = \left[f_{X_1}(x)\right]^N, \tag{5.7}$$
where the particular choice of $X_1$ is unimportant because $f_{X_1}(x) = \cdots = f_{X_N}(x)$.
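The product form for independent variables is what makes likelihood computations tractable: the joint density of $N$ i.i.d. samples is a product of one-dimensional densities, usually evaluated as a sum of logarithms for numerical stability. A small added sketch (the Gaussian model and sample values are assumptions chosen for illustration, not from the notes):

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

# N i.i.d. samples from an assumed N(mu, sigma^2) model.
rng = np.random.default_rng(0)
mu, sigma = 1.0, 2.0
x = rng.normal(mu, sigma, size=5)

# Joint density of the sample = product of the marginals (independence),
# computed both directly and via the log for numerical stability.
joint = np.prod(gaussian_pdf(x, mu, sigma))
log_joint = np.sum(np.log(gaussian_pdf(x, mu, sigma)))
print(joint, np.exp(log_joint))   # the two agree
```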
5.2 Joint CDF

As in Chapters 3 and 4, we need to understand the cumulative distribution function (CDF) for the multi-variable case.

Definition 7. Let $X$ and $Y$ be two random variables. The joint CDF of $X$ and $Y$ is the function $F_{X,Y}(x,y)$ such that
$$F_{X,Y}(x,y) = \mathbb{P}[X \le x \cap Y \le y]. \tag{5.8}$$

From this definition, we can explicitly write out the probability as follows.

Definition 8. If $X$ and $Y$ are discrete, then
$$F_{X,Y}(x,y) = \sum_{y' \le y} \sum_{x' \le x} p_{X,Y}(x', y'). \tag{5.9}$$
If $X$ and $Y$ are continuous, then
$$F_{X,Y}(x,y) = \int_{-\infty}^{y} \int_{-\infty}^{x} f_{X,Y}(x', y')\, dx'\, dy'. \tag{5.10}$$

Note that since $F_{X,Y}(x,y)$ is the integration from $-\infty$ to $x$ (and $y$), we have
$$F_{X,Y}(-\infty, y) = \int_{-\infty}^{y} \int_{-\infty}^{-\infty} f_{X,Y}(x', y')\, dx'\, dy' = \int_{-\infty}^{y} 0\, dy' = 0.$$
Similarly, we have $F_{X,Y}(x, -\infty) = 0$ and $F_{X,Y}(-\infty, -\infty) = 0$. The CDF evaluated at $x = \infty$ and $y = \infty$ is
$$F_{X,Y}(\infty, \infty) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y}(x', y')\, dx'\, dy' = 1.$$
If only $x$ or $y$ is at $\infty$, we obtain the marginal CDF.

Proposition 1. Let $X$ and $Y$ be two random variables. Then the marginal CDFs can be obtained from
$$F_X(x) = F_{X,Y}(x, \infty), \qquad F_Y(y) = F_{X,Y}(\infty, y).$$

To see these results, we note that
$$F_{X,Y}(x, \infty) = \int_{-\infty}^{x} \left( \int_{-\infty}^{\infty} f_{X,Y}(x', y')\, dy' \right) dx' = \int_{-\infty}^{x} f_X(x')\, dx' = F_X(x).$$
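For a finite joint PMF table, the joint CDF of Definition 8 is just a double cumulative sum, and Proposition 1 can be checked by reading off the last row or column. A small added sketch (using the coin-and-die table again as the assumed example):

```python
import numpy as np

p_XY = np.full((2, 6), 1.0 / 12.0)        # coin flip x die roll

# Joint CDF on the grid: cumulative sums along both axes (Definition 8).
F_XY = p_XY.cumsum(axis=0).cumsum(axis=1)

# Proposition 1: letting the other argument go to "infinity" means taking
# the last row/column of the grid, which recovers the marginal CDFs.
F_X = F_XY[:, -1]            # F_X(x) = F_{X,Y}(x, +inf)
F_Y = F_XY[-1, :]            # F_Y(y) = F_{X,Y}(+inf, y)
print(F_X)                   # [0.5 1.0]
print(F_Y)                   # [1/6 2/6 ... 1.0]
print(F_XY[-1, -1])          # 1.0
```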
By the fundamental theorem of calculus, we can derive the PDF from the CDF.

Definition 9. Let $F_{X,Y}(x,y)$ be the joint CDF of $X$ and $Y$. Then, the joint PDF can be obtained through
$$f_{X,Y}(x,y) = \frac{\partial^2}{\partial y\, \partial x} F_{X,Y}(x,y).$$
The order of the partial derivatives can be switched, yielding a symmetric result:
$$f_{X,Y}(x,y) = \frac{\partial^2}{\partial x\, \partial y} F_{X,Y}(x,y).$$

5.3 Conditional PMF and PDF

Conditional PMF

Definition 10. Let $X$ and $Y$ be two discrete random variables. The conditional PMF of $X$ given $Y$ is
$$p_{X|Y}(x|y) = \frac{p_{X,Y}(x,y)}{p_Y(y)}. \tag{5.11}$$

By the definition of conditional probability, we can also write $p_{X|Y}(x|y) = \mathbb{P}[X = x \mid Y = y]$, because
$$p_{X|Y}(x|y) = \frac{p_{X,Y}(x,y)}{p_Y(y)} = \frac{\mathbb{P}[X = x \cap Y = y]}{\mathbb{P}[Y = y]} = \mathbb{P}[X = x \mid Y = y].$$

It is important to understand the randomness exhibited in a conditional PMF. In $p_{X|Y}(x|y)$, the random variable $Y$ is fixed to a specific value $Y = y$. The randomness of $Y$ has been taken care of by the denominator $p_Y(y)$ in Equation (5.11). Therefore, there is no randomness associated with $Y$; the variable $x$ in $p_{X|Y}(x|y)$ describes the randomness. In particular, we have that
$$\sum_x p_{X|Y}(x|y) = \sum_x \frac{p_{X,Y}(x,y)}{p_Y(y)} = \frac{\sum_x p_{X,Y}(x,y)}{p_Y(y)} = \frac{p_Y(y)}{p_Y(y)} = 1,$$
but
$$\sum_y p_{X|Y}(x|y) = \sum_y \frac{p_{X,Y}(x,y)}{p_Y(y)} \ne 1 \text{ in general}.$$
Therefore, $p_{X|Y}(x|y)$ is a probability of $X$, not of $Y$.

Unlike a marginal PMF, which is a function of either $x$ or $y$ (e.g., $p_X(x)$ or $p_Y(y)$), a conditional PMF can be a function of both $x$ and $y$.
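Numerically, Definition 10 amounts to dividing each column of a joint table by its total mass. The sketch below (an added illustration with a made-up 2x3 joint table) also confirms the point just made: each conditional PMF sums to 1 over $x$, while summing the same quantity over $y$ need not give 1.

```python
import numpy as np

# A made-up joint PMF table: rows index x, columns index y.
p_XY = np.array([[0.10, 0.20, 0.05],
                 [0.30, 0.05, 0.30]])

p_Y = p_XY.sum(axis=0)                 # marginal of Y
p_XgY = p_XY / p_Y                     # p_{X|Y}(x|y), one column per y

print(p_XgY.sum(axis=0))               # [1. 1. 1.]  -> sums to 1 over x
print(p_XgY.sum(axis=1))               # generally not equal to 1 over y
```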
For example, $p_{X|Y}(x|y)$ is the conditional probability of having the random variable $X = x$, given that $Y$ is at a fixed value $y$. Thus $p_{X|Y}(x|y)$ depends on both $x$ and $y$.

Example. Consider a joint PMF $p_{X,Y}(x,y)$ given by a table in which $X$ takes the values $1, 2, 3, 4$ (rows) and $Y$ takes the values $1, 2, 3, 4$ (columns). Find the conditional PMF $p_{X|Y}(x|1)$ and the marginal PMF $p_X(x)$.

Solution. To find the marginal PMF, we need to sum over all the $y$ for every $x$:
$$x = 1: \quad p_X(1) = \sum_{y=1}^{4} p_{X,Y}(1, y), \qquad x = 2: \quad p_X(2) = \sum_{y=1}^{4} p_{X,Y}(2, y),$$
$$x = 3: \quad p_X(3) = \sum_{y=1}^{4} p_{X,Y}(3, y), \qquad x = 4: \quad p_X(4) = \sum_{y=1}^{4} p_{X,Y}(4, y).$$
Hence, the marginal PMF $p_X(x)$ is the vector of row sums of the table. The conditional PMF $p_{X|Y}(x|1)$ is
$$p_{X|Y}(x|1) = \frac{p_{X,Y}(x, 1)}{p_Y(1)},$$
i.e., the first column of the table divided by its column sum.

Example. Consider two random variables $X$ and $Y$ defined as follows:
$$Y = \begin{cases} 10^2, & \text{with prob } 5/6, \\ 10^4, & \text{with prob } 1/6, \end{cases} \qquad X = \begin{cases} 10^{-4}\, Y, & \text{with prob } 1/2, \\ 10^{-3}\, Y, & \text{with prob } 1/3, \\ 10^{-2}\, Y, & \text{with prob } 1/6. \end{cases}$$
Find $p_{X|Y}(x|y)$, $p_X(x)$ and $p_{X,Y}(x,y)$.

Solution. Since $Y$ takes two different states, we can enumerate $Y = 10^2$ and $Y = 10^4$. This gives us
$$p_{X|Y}(x \mid 10^2) = \begin{cases} 1/2, & \text{if } x = 0.01, \\ 1/3, & \text{if } x = 0.1, \\ 1/6, & \text{if } x = 1, \end{cases} \qquad \text{and} \qquad p_{X|Y}(x \mid 10^4) = \begin{cases} 1/2, & \text{if } x = 1, \\ 1/3, & \text{if } x = 10, \\ 1/6, & \text{if } x = 100. \end{cases}$$
The joint PMF $p_{X,Y}(x,y)$ can be found as
$$p_{X,Y}(x, 10^2) = p_{X|Y}(x \mid 10^2)\, p_Y(10^2) = \begin{cases} \left(\tfrac{1}{2}\right)\left(\tfrac{5}{6}\right), & x = 0.01, \\ \left(\tfrac{1}{3}\right)\left(\tfrac{5}{6}\right), & x = 0.1, \\ \left(\tfrac{1}{6}\right)\left(\tfrac{5}{6}\right), & x = 1, \end{cases}$$
$$p_{X,Y}(x, 10^4) = p_{X|Y}(x \mid 10^4)\, p_Y(10^4) = \begin{cases} \left(\tfrac{1}{2}\right)\left(\tfrac{1}{6}\right), & x = 1, \\ \left(\tfrac{1}{3}\right)\left(\tfrac{1}{6}\right), & x = 10, \\ \left(\tfrac{1}{6}\right)\left(\tfrac{1}{6}\right), & x = 100. \end{cases}$$
Therefore, the joint PMF is given by the following table.

|              | $x=0.01$ | $x=0.1$ | $x=1$ | $x=10$ | $x=100$ |
|--------------|----------|---------|-------|--------|---------|
| $y = 10^2$   | 5/12     | 5/18    | 5/36  | 0      | 0       |
| $y = 10^4$   | 0        | 0       | 1/12  | 1/18   | 1/36    |

The marginal PMF $p_X(x)$ is
$$p_X(x) = \sum_y p_{X,Y}(x,y) = \left[\tfrac{5}{12}, \ \tfrac{5}{18}, \ \tfrac{2}{9}, \ \tfrac{1}{18}, \ \tfrac{1}{36}\right] \qquad \text{over } x = 0.01, 0.1, 1, 10, 100.$$
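The chain $p_{X,Y} = p_{X|Y}\, p_Y$ followed by a sum over $y$ is easy to mirror in code. The sketch below (added for illustration) rebuilds the joint table of this example from $p_Y$ and $p_{X|Y}$ and then recovers the marginal $p_X$.

```python
import numpy as np
from fractions import Fraction as F

x_vals = [0.01, 0.1, 1, 10, 100]
p_Y = {10**2: F(5, 6), 10**4: F(1, 6)}
p_X_given_Y = {
    10**2: {0.01: F(1, 2), 0.1: F(1, 3), 1: F(1, 6)},
    10**4: {1: F(1, 2), 10: F(1, 3), 100: F(1, 6)},
}

# Joint PMF p_{X,Y}(x, y) = p_{X|Y}(x|y) * p_Y(y).
p_XY = {(x, y): p_X_given_Y[y].get(x, F(0)) * p_Y[y]
        for y in p_Y for x in x_vals}

# Marginal p_X(x) = sum over y of the joint PMF.
p_X = {x: sum(p_XY[(x, y)] for y in p_Y) for x in x_vals}
print(p_X)   # {0.01: 5/12, 0.1: 5/18, 1: 2/9, 10: 1/18, 100: 1/36}
```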
Conditional PDF

Definition 11. Let $X$ and $Y$ be two continuous random variables. The conditional PDF of $X$ given $Y$ is
$$f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}. \tag{5.12}$$

Example. Let $X$ and $Y$ be two continuous random variables with a joint PDF
$$f_{X,Y}(x,y) = \begin{cases} 2e^{-x}e^{-y}, & 0 \le y \le x < \infty, \\ 0, & \text{otherwise.} \end{cases}$$
Find the conditional PDFs $f_{X|Y}(x|y)$ and $f_{Y|X}(y|x)$.

Solution. In order to find the conditional PDFs, we first find the marginal PDFs:
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\, dy = \int_0^x 2e^{-x}e^{-y}\, dy = 2e^{-x}(1 - e^{-x}),$$
$$f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\, dx = \int_y^{\infty} 2e^{-x}e^{-y}\, dx = 2e^{-2y}.$$
Therefore, the conditional PDFs are
$$f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)} = \frac{2e^{-x}e^{-y}}{2e^{-2y}} = e^{-(x-y)}, \qquad x \ge y,$$
$$f_{Y|X}(y|x) = \frac{f_{X,Y}(x,y)}{f_X(x)} = \frac{2e^{-x}e^{-y}}{2e^{-x}(1 - e^{-x})} = \frac{e^{-y}}{1 - e^{-x}}, \qquad 0 \le y \le x.$$

Example. This example considers a classical detection problem. Let $X$ be a random bit such that
$$X = \begin{cases} +1, & \text{with prob } 1/2, \\ -1, & \text{with prob } 1/2. \end{cases}$$
Suppose that $X$ is transmitted over a noisy channel so that the observed signal is $Y = X + N$, where $N \sim \mathcal{N}(0, 1)$ is noise which is independent of the signal $X$. Suppose that we observe $Y > 0$; is the signal more likely to be $X = +1$ or $X = -1$?

Solution. First of all, we know that
$$f_{Y|X}(y \mid +1) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(y-1)^2}{2}}, \qquad f_{Y|X}(y \mid -1) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(y+1)^2}{2}}.$$
Therefore, given $Y > 0$, we need to find
$$\mathbb{P}[X = +1 \mid Y > 0] = \frac{\mathbb{P}[Y > 0 \mid X = +1]\, \mathbb{P}[X = +1]}{\mathbb{P}[Y > 0]}.$$
It holds that
$$\mathbb{P}[Y > 0 \mid X = +1] = \int_0^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(y-1)^2}{2}}\, dy = 1 - \Phi\!\left(\frac{0 - 1}{1}\right) = 1 - \Phi(-1).$$
Similarly, we have
$$\mathbb{P}[Y > 0 \mid X = -1] = 1 - \Phi(+1).$$
By the law of total probability, we have that
$$\mathbb{P}[Y > 0] = \mathbb{P}[Y > 0 \mid X = +1]\,\mathbb{P}[X = +1] + \mathbb{P}[Y > 0 \mid X = -1]\,\mathbb{P}[X = -1] = 1 - \tfrac{1}{2}\left(\Phi(+1) + \Phi(-1)\right) = \tfrac{1}{2},$$
because $\Phi(+1) + \Phi(-1) = \Phi(+1) + 1 - \Phi(+1) = 1$. Therefore,
$$\mathbb{P}[X = +1 \mid Y > 0] = \frac{\left(1 - \Phi(-1)\right)\left(\tfrac{1}{2}\right)}{\tfrac{1}{2}} = 1 - \Phi(-1) \approx 0.8413.$$
The implication is that if $Y > 0$, the posterior probability is $\mathbb{P}[X = +1 \mid Y > 0] \approx 0.8413$. The complement of this result gives $\mathbb{P}[X = -1 \mid Y > 0] \approx 0.1587$. Therefore, $X = +1$ is more likely.
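A Monte Carlo simulation is a quick sanity check on this posterior. The sketch below (added here, not part of the original notes) draws many $(X, Y)$ pairs from the model, keeps the trials with $Y > 0$, and compares the empirical fraction with $X = +1$ against $1 - \Phi(-1)$.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 1_000_000

# Model: X is +/-1 with equal probability, Y = X + N with N ~ N(0, 1).
x = rng.choice([+1, -1], size=n)
y = x + rng.standard_normal(n)

# Empirical posterior P[X = +1 | Y > 0] vs. the analytical answer.
mask = y > 0
print((x[mask] == 1).mean())        # close to 0.8413
print(1 - norm.cdf(-1))             # 0.8413...
```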
5.4 Joint Expectation, Moment, and Covariance

Joint Expectation and Joint Moment

Definition 12. Let $X$ and $Y$ be two random variables. The joint expectation is
$$\mathbb{E}[XY] = \sum_y \sum_x x\, y\, p_{X,Y}(x,y) \tag{5.13}$$
if $X$ and $Y$ are discrete, or
$$\mathbb{E}[XY] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x\, y\, f_{X,Y}(x,y)\, dx\, dy \tag{5.14}$$
if $X$ and $Y$ are continuous. Joint expectation is also called correlation.

Theorem 2. If $X$ and $Y$ are independent, then
$$\mathbb{E}[XY] = \mathbb{E}[X]\,\mathbb{E}[Y]. \tag{5.15}$$

Proof. We only prove the discrete case because the continuous case can be proved similarly. If $X$ and $Y$ are independent, we have $p_{X,Y}(x,y) = p_X(x)p_Y(y)$. Therefore,
$$\mathbb{E}[XY] = \sum_y \sum_x x\, y\, p_{X,Y}(x,y) = \sum_y \sum_x x\, y\, p_X(x) p_Y(y) = \left( \sum_x x\, p_X(x) \right)\left( \sum_y y\, p_Y(y) \right) = \mathbb{E}[X]\,\mathbb{E}[Y].$$

In general, for any two independent random variables and two functions $f$ and $g$, it holds that
$$\mathbb{E}[f(X)g(Y)] = \mathbb{E}[f(X)]\,\mathbb{E}[g(Y)]. \tag{5.16}$$
Of particular interest are the functions $f(X) = X^k$ and $g(Y) = Y^l$, which give the definition of joint moments.

Definition 13. Let $X$ and $Y$ be two random variables. The joint moment is
$$\mathbb{E}[X^k Y^l] = \sum_y \sum_x x^k y^l\, p_{X,Y}(x,y) \tag{5.17}$$
if $X$ and $Y$ are discrete, or
$$\mathbb{E}[X^k Y^l] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^k y^l\, f_{X,Y}(x,y)\, dx\, dy \tag{5.18}$$
if $X$ and $Y$ are continuous.

Covariance

The concept of covariance can be considered as a generalization of the concept of variance. Instead of measuring $(X - \mu_X)^2$, the covariance of two random variables measures $(X - \mu_X)(Y - \mu_Y)$. Thus, while the variance is always non-negative, a covariance can be negative.

Definition 14. Let $X$ and $Y$ be two random variables. The covariance is
$$\mathrm{Cov}(X, Y) = \mathbb{E}[(X - \mu_X)(Y - \mu_Y)], \tag{5.19}$$
where $\mu_X = \mathbb{E}[X]$ and $\mu_Y = \mathbb{E}[Y]$.

The following theorem illustrates a few important properties of the covariance.

Theorem 3. The following results hold:

a. $\mathrm{Cov}(X, Y) = \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y]$.
b. $X$ and $Y$ are independent $\Rightarrow$ $\mathrm{Cov}(X, Y) = 0$.
c. $\mathrm{Cov}(X, Y) = 0$ does not imply that $X$ and $Y$ are independent.

Remark: If $Y = X$, then $\mathrm{Cov}(X, Y) = \mathbb{E}[X^2] - \mathbb{E}[X]^2 = \mathrm{Var}[X]$.

Proof. The proof of part (a) is straightforward:
$$\mathrm{Cov}(X, Y) = \mathbb{E}[(X - \mu_X)(Y - \mu_Y)] = \mathbb{E}[XY - X\mu_Y - Y\mu_X + \mu_X \mu_Y] = \mathbb{E}[XY] - \mu_X \mu_Y.$$
The proof of part (b) follows from Equation (5.15). If $X$ and $Y$ are independent, then $\mathbb{E}[XY] = \mathbb{E}[X]\mathbb{E}[Y]$. In this case,
$$\mathrm{Cov}(X, Y) = \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y] = \mathbb{E}[X]\mathbb{E}[Y] - \mathbb{E}[X]\mathbb{E}[Y] = 0.$$

The proof of part (c) requires a counterexample. Consider a discrete random variable $Z$ with PMF
$$p_Z(z) = \left[\tfrac{1}{4}, \ \tfrac{1}{4}, \ \tfrac{1}{4}, \ \tfrac{1}{4}\right] \qquad \text{over } z = 0, 1, 2, 3.$$
Let $X$ and $Y$ be $X = \cos\frac{\pi Z}{2}$ and $Y = \sin\frac{\pi Z}{2}$. Then, we can show that $\mathbb{E}[X] = 0$ and $\mathbb{E}[Y] = 0$. The covariance is
$$\mathrm{Cov}(X, Y) = \mathbb{E}[(X - 0)(Y - 0)] = \mathbb{E}\left[\cos\frac{\pi Z}{2}\sin\frac{\pi Z}{2}\right] = \mathbb{E}\left[\tfrac{1}{2}\sin \pi Z\right] = \frac{1}{2}\left[\frac{\sin(\pi \cdot 0)}{4} + \frac{\sin(\pi \cdot 1)}{4} + \frac{\sin(\pi \cdot 2)}{4} + \frac{\sin(\pi \cdot 3)}{4}\right] = 0.$$

Our next goal is to show that $X$ and $Y$ are dependent. To this end, we only need to show that $p_{X,Y}(x,y) \ne p_X(x)p_Y(y)$. The joint PMF $p_{X,Y}(x,y)$ can be found by noting that
$$Z = 0 \Rightarrow X = 1, Y = 0; \qquad Z = 1 \Rightarrow X = 0, Y = 1; \qquad Z = 2 \Rightarrow X = -1, Y = 0; \qquad Z = 3 \Rightarrow X = 0, Y = -1.$$
Thus, the joint PMF places probability $\tfrac{1}{4}$ on each of the four points $(1, 0)$, $(0, 1)$, $(-1, 0)$, $(0, -1)$, and zero elsewhere. The marginal PMFs are
$$p_X(x) = \left[\tfrac{1}{4}, \ \tfrac{1}{2}, \ \tfrac{1}{4}\right], \qquad p_Y(y) = \left[\tfrac{1}{4}, \ \tfrac{1}{2}, \ \tfrac{1}{4}\right] \qquad \text{over the values } -1, 0, 1.$$
The product $p_X(x)p_Y(y)$ puts mass $\tfrac{1}{16}$ on $(1, 1)$, for example, where the joint PMF is $0$. Therefore, $p_{X,Y}(x,y) \ne p_X(x)p_Y(y)$, although $\mathbb{E}[XY] = \mathbb{E}[X]\mathbb{E}[Y]$.
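This counterexample is easy to verify numerically. The added sketch below enumerates the four equally likely values of $Z$, checks that the covariance of $(X, Y)$ is zero, and shows that the joint PMF does not factor into the product of its marginals.

```python
import numpy as np

z = np.array([0, 1, 2, 3])                 # each with probability 1/4
x = np.cos(np.pi * z / 2).round()          # values  1, 0, -1, 0
y = np.sin(np.pi * z / 2).round()          # values  0, 1,  0, -1

# Covariance: E[XY] - E[X]E[Y] with each z equally likely.
cov = (x * y).mean() - x.mean() * y.mean()
print(cov)                                  # 0.0

# Joint PMF over the support {-1, 0, 1} x {-1, 0, 1}.
vals = [-1, 0, 1]
p_xy = np.zeros((3, 3))
for xi, yi in zip(x, y):
    p_xy[vals.index(xi), vals.index(yi)] += 1 / 4

p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
print(np.allclose(p_xy, np.outer(p_x, p_y)))   # False -> X and Y are dependent
```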
The next theorem applies to random variables that are not necessarily independent.

Theorem 4. For any $X$ and $Y$ (not necessarily independent),

a. $\mathbb{E}[X + Y] = \mathbb{E}[X] + \mathbb{E}[Y]$.
b. $\mathrm{Var}[X + Y] = \mathrm{Var}[X] + 2\,\mathrm{Cov}(X, Y) + \mathrm{Var}[Y]$.

Of course, if $X$ and $Y$ are independent, then $\mathrm{Cov}(X, Y) = 0$ and hence $\mathrm{Var}[X + Y] = \mathrm{Var}[X] + \mathrm{Var}[Y]$.

Proof. Proof of (a). Recall the definition of joint expectation:
$$\mathbb{E}[X + Y] = \sum_y \sum_x (x + y)\, p_{X,Y}(x,y) = \sum_y \sum_x x\, p_{X,Y}(x,y) + \sum_y \sum_x y\, p_{X,Y}(x,y)$$
$$= \sum_x x \left( \sum_y p_{X,Y}(x,y) \right) + \sum_y y \left( \sum_x p_{X,Y}(x,y) \right) = \sum_x x\, p_X(x) + \sum_y y\, p_Y(y) = \mathbb{E}[X] + \mathbb{E}[Y].$$

Proof of (b).
$$\mathrm{Var}[X + Y] = \mathbb{E}[(X + Y)^2] - \mathbb{E}[X + Y]^2 = \mathbb{E}[(X + Y)^2] - (\mu_X + \mu_Y)^2$$
$$= \mathbb{E}[X^2 + 2XY + Y^2] - (\mu_X^2 + 2\mu_X \mu_Y + \mu_Y^2)$$
$$= \mathbb{E}[X^2] - \mu_X^2 + \mathbb{E}[Y^2] - \mu_Y^2 + 2\left(\mathbb{E}[XY] - \mu_X \mu_Y\right) = \mathrm{Var}[X] + 2\,\mathrm{Cov}(X, Y) + \mathrm{Var}[Y].$$

Correlation Coefficient

Definition 15. Let $X$ and $Y$ be two random variables. The correlation coefficient is
$$\rho = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}[X]\,\mathrm{Var}[Y]}}. \tag{5.20}$$

The correlation coefficient provides a convenient way of assessing the relationship between two random variables. The following theorem outlines its properties.
Theorem 5. The correlation coefficient $\rho$ has the following properties:

- When $X = Y$ (fully correlated), $\rho = +1$.
- When $X = -Y$ (negatively correlated), $\rho = -1$.
- When $X$ and $Y$ are independent, $\rho = 0$. However, if $\rho = 0$, it does not imply that $X$ and $Y$ are independent.

Proof. When $X = Y$,
$$\rho = \frac{\mathrm{Var}[X]}{\sqrt{\mathrm{Var}[X]\,\mathrm{Var}[X]}} = 1.$$
When $X = -Y$,
$$\rho = \frac{\mathbb{E}[X(-X)] - \mathbb{E}[X]\mathbb{E}[-X]}{\sqrt{\mathrm{Var}[X]\,\mathrm{Var}[-X]}} = \frac{-\mathrm{Var}[X]}{\mathrm{Var}[X]} = -1.$$
When $X$ and $Y$ are independent, then $\mathrm{Cov}(X, Y) = 0$, so $\rho = 0$. A counterexample for the converse can be found in the proof of Theorem 3(c).

In general, a correlation coefficient is always bounded between $-1$ and $1$.

Theorem 6. The correlation coefficient always satisfies
$$-1 \le \rho \le 1. \tag{5.21}$$

Proof. We prove this result by the Cauchy-Schwarz inequality, which states that
$$\mathbb{E}[XY]^2 \le \mathbb{E}[X^2]\,\mathbb{E}[Y^2].$$
Therefore, we have
$$\mathrm{Cov}(X, Y)^2 = \mathbb{E}[(X - \mu_X)(Y - \mu_Y)]^2 \le \mathbb{E}[(X - \mu_X)^2]\,\mathbb{E}[(Y - \mu_Y)^2] = \mathrm{Var}[X]\,\mathrm{Var}[Y].$$
Hence, we have $|\mathrm{Cov}(X, Y)| \le \sqrt{\mathrm{Var}[X]\,\mathrm{Var}[Y]}$, and therefore $|\rho| \le 1$.
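As a numerical illustration (added here, with an assumed linear-plus-noise model), the sample correlation coefficient stays within $[-1, 1]$ and approaches $+1$ or $-1$ as the linear relationship between the two variables becomes stronger.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(100_000)

# Y = slope * X + noise; vary the slope and the noise level.
for slope, noise in [(1.0, 0.0), (2.0, 1.0), (-3.0, 2.0), (0.0, 1.0)]:
    y = slope * x + noise * rng.standard_normal(x.size)
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    rho = cov / np.sqrt(x.var() * y.var())
    print(slope, noise, round(rho, 3))     # always between -1 and 1
```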
5.5 Conditional Expectation

When dealing with two dependent random variables, sometimes we would like to determine the expectation of a random variable when the second random variable takes a particular state. The conditional expectation is a formal way of doing so.

Definition 16. The conditional expectation of $X$ given $Y = y$ is
$$\mathbb{E}[X \mid Y = y] = \sum_x x\, p_{X|Y}(x|y) \tag{5.22}$$
for discrete random variables, and
$$\mathbb{E}[X \mid Y = y] = \int_{-\infty}^{\infty} x\, f_{X|Y}(x|y)\, dx \tag{5.23}$$
for continuous random variables.

There are a few points to note here:

- In $\mathbb{E}[X \mid Y = y]$, the expectation is taken over $X$. In other words, we are exploring the randomness of $X$. To evaluate the conditional expectation, the PDF is $f_{X|Y}(x|y)$.
- The random variable $Y$ is fixed at $Y = y$. Thus, there is no randomness associated with $Y$.
- The resulting object $\mathbb{E}[X \mid Y = y]$ is a function of $y$, because the random variable $X$ has been eliminated by the expectation.
- Conditional expectation is meaningful only when $X$ and $Y$ are dependent. If $X$ and $Y$ are independent, then $f_{X|Y}(x|y) = f_X(x)$ and so $\mathbb{E}[X \mid Y = y] = \mathbb{E}[X]$. That is, the conditional expectation does not really depend on $y$.
- If we do not specify a particular value that $Y$ takes, then we refer to $\mathbb{E}[X \mid Y]$, which is a random variable in $Y$.

One of the most useful results in conditional expectation is the following theorem.

Theorem 7 (Law of Total Expectation).
$$\mathbb{E}[X] = \sum_y \mathbb{E}[X \mid Y = y]\, p_Y(y), \qquad \text{or} \qquad \mathbb{E}[X] = \int_{-\infty}^{\infty} \mathbb{E}[X \mid Y = y]\, f_Y(y)\, dy. \tag{5.24}$$
Proof. We only prove the discrete case, as the continuous case can be proved by replacing the summation with an integration.
$$\mathbb{E}[X] = \sum_x x\, p_X(x) = \sum_x x \left( \sum_y p_{X,Y}(x,y) \right) = \sum_x \sum_y x\, p_{X|Y}(x|y)\, p_Y(y) = \sum_y \left( \sum_x x\, p_{X|Y}(x|y) \right) p_Y(y) = \sum_y \mathbb{E}[X \mid Y = y]\, p_Y(y).$$

Corollary 1. Let $X$ and $Y$ be two random variables. Then,
$$\mathbb{E}[X] = \mathbb{E}\left[ \mathbb{E}[X \mid Y] \right]. \tag{5.25}$$

Proof. The previous theorem states that $\mathbb{E}[X] = \sum_y \mathbb{E}[X \mid Y = y]\, p_Y(y)$. If we treat $\mathbb{E}[X \mid Y = y]$ as a function of $y$, e.g., $h(y)$, then
$$\mathbb{E}[X] = \sum_y \mathbb{E}[X \mid Y = y]\, p_Y(y) = \sum_y h(y)\, p_Y(y) = \mathbb{E}[h(Y)] = \mathbb{E}\left[ \mathbb{E}[X \mid Y] \right].$$

Remark: To be slightly more clear, the two expectations in Equation (5.25) are
$$\mathbb{E}[X] = \mathbb{E}_Y\left[ \mathbb{E}_{X|Y}[X \mid Y] \right],$$
i.e., the inner expectation is taken over $f_{X|Y}$, whereas the outer expectation is taken over $f_Y$.

Example. Consider the joint PMF given by the following table (the $10^2$/$10^4$ example of Section 5.3). Find $\mathbb{E}[X \mid Y = 10^2]$ and $\mathbb{E}[X \mid Y = 10^4]$.

|              | $x=0.01$ | $x=0.1$ | $x=1$ | $x=10$ | $x=100$ |
|--------------|----------|---------|-------|--------|---------|
| $y = 10^2$   | 5/12     | 5/18    | 5/36  | 0      | 0       |
| $y = 10^4$   | 0        | 0       | 1/12  | 1/18   | 1/36    |

Solution. To find the conditional expectations, we first need to know the conditional PMFs. Over the values $x = 0.01, 0.1, 1, 10, 100$,
$$p_{X|Y}(x \mid 10^2) = \left[\tfrac{1}{2}, \ \tfrac{1}{3}, \ \tfrac{1}{6}, \ 0, \ 0\right], \qquad p_{X|Y}(x \mid 10^4) = \left[0, \ 0, \ \tfrac{1}{2}, \ \tfrac{1}{3}, \ \tfrac{1}{6}\right].$$
Therefore, the conditional expectations are
$$\mathbb{E}[X \mid Y = 10^2] = (0.01)\left(\tfrac{1}{2}\right) + (0.1)\left(\tfrac{1}{3}\right) + (1)\left(\tfrac{1}{6}\right) = 0.205,$$
$$\mathbb{E}[X \mid Y = 10^4] = (1)\left(\tfrac{1}{2}\right) + (10)\left(\tfrac{1}{3}\right) + (100)\left(\tfrac{1}{6}\right) = 20.5.$$
From the conditional expectations we can also find $\mathbb{E}[X]$:
$$\mathbb{E}[X] = \mathbb{E}[X \mid Y = 10^2]\, p_Y(10^2) + \mathbb{E}[X \mid Y = 10^4]\, p_Y(10^4) = (0.205)\left(\tfrac{5}{6}\right) + (20.5)\left(\tfrac{1}{6}\right) = 3.5875.$$

Example. Consider two random variables $X$ and $Y$. The random variable $X$ is Gaussian distributed with $X \sim \mathcal{N}(\mu, \sigma^2)$. The random variable $Y$ has a conditional distribution $Y \mid X \sim \mathcal{N}(X, X^2)$. Find $\mathbb{E}[Y]$.

Solution. We know that the two PDFs are
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad f_{Y|X}(y|x) = \frac{1}{\sqrt{2\pi x^2}}\, e^{-\frac{(y-x)^2}{2x^2}}.$$
The conditional expectation of $Y$ given $X$ is
$$\mathbb{E}[Y \mid X = x] = \int_{-\infty}^{\infty} y\, f_{Y|X}(y|x)\, dy = \int_{-\infty}^{\infty} y\, \frac{1}{\sqrt{2\pi x^2}}\, e^{-\frac{(y-x)^2}{2x^2}}\, dy = x.$$
The last equality holds because we are computing the expectation of a Gaussian random variable with mean $x$. Finally, applying the law of total expectation, we can show that
$$\mathbb{E}[Y] = \int_{-\infty}^{\infty} \mathbb{E}[Y \mid X = x]\, f_X(x)\, dx = \int_{-\infty}^{\infty} x\, \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx = \mu.$$

Application: MMSE Estimator (Optional). Consider a pair of random variables $(X, Y)$. We observe this pair of random variables. Can we determine the relationship between them? That is, can we design a function $g$ such that we minimize the error
$$\min_g \ \mathbb{E}\left[(Y - g(X))^2\right]?$$
We may assume that we know the distributions $f_X(x)$, $f_Y(y)$, and $f_{Y|X}(y|x)$. The solution to this problem is called the minimum mean squared error (MMSE) estimator.

Theorem 8. The MMSE estimator is the function $g$ which minimizes the mean squared error,
$$g^* = \operatorname*{argmin}_g \ \mathbb{E}\left[(Y - g(X))^2\right],$$
and is given by
$$g^*(x) = \mathbb{E}[Y \mid X = x]. \tag{5.26}$$

Proof. By the law of total expectation, we have that
$$\mathbb{E}\left[(Y - g(X))^2\right] = \int_{-\infty}^{\infty} \mathbb{E}\left[(Y - g(X))^2 \mid X = x\right] f_X(x)\, dx.$$
Since all terms in this integration are non-negative, we can minimize the overall expectation by minimizing the inner expectation $\mathbb{E}[(Y - g(X))^2 \mid X = x]$ for every $x$. When conditioned on $X = x$, the function value $g(X) = g(x)$ is a fixed number and is independent of $Y$. Therefore, we can treat $g(x) = c$ for some constant $c$ and try to determine $c$. This means that we want to find $c$ to minimize
$$c^* = \operatorname*{argmin}_c \ \mathbb{E}\left[(Y - c)^2 \mid X = x\right] = \operatorname*{argmin}_c \ \int_{-\infty}^{\infty} (y - c)^2 f_{Y|X}(y|x)\, dy.$$
Taking the derivative with respect to $c$ and setting it to zero yields
$$\frac{d}{dc}\left( \int_{-\infty}^{\infty} (y - c)^2 f_{Y|X}(y|x)\, dy \right) = 0 \quad \Longrightarrow \quad \int_{-\infty}^{\infty} (y - c) f_{Y|X}(y|x)\, dy = 0,$$
which implies that
$$c = \int_{-\infty}^{\infty} y\, f_{Y|X}(y|x)\, dy = \mathbb{E}[Y \mid X = x].$$
Therefore, the inner expectation is minimized when $g(x) = \mathbb{E}[Y \mid X = x]$.
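To see Theorem 8 in action, the added sketch below uses the earlier model $Y \mid X \sim \mathcal{N}(X, X^2)$ with $X \sim \mathcal{N}(\mu, \sigma^2)$, for which $\mathbb{E}[Y \mid X = x] = x$. On simulated data, the MMSE estimator $g^*(X) = X$ achieves a lower mean squared error than the other candidate estimators tried (the particular candidates and parameter values are arbitrary choices for illustration).

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n = 1.0, 0.5, 500_000

x = rng.normal(mu, sigma, size=n)
y = rng.normal(loc=x, scale=np.abs(x))      # Y | X ~ N(X, X^2)

candidates = {
    "g(x) = x (MMSE)": x,
    "g(x) = 0.5 x":    0.5 * x,
    "g(x) = E[Y]":     np.full(n, y.mean()),
}
for name, pred in candidates.items():
    print(name, np.mean((y - pred) ** 2))   # smallest MSE for g(x) = x
```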
5.6 Sum of Two Random Variables

One typical problem we encounter in engineering is: given two random variables $X$ and $Y$, what is the PDF of the sum $X + Y$? Such a problem arises naturally when we want to evaluate the average of a number of random variables, e.g., the sample mean of a collection of data points. In this section we will discuss a general principle for determining the PDF of a sum of two random variables.

To start with, we consider two random variables $X$ and $Y$ with PDFs $f_X(x)$ and $f_Y(y)$, respectively. Let us define the sum as $Z = X + Y$. Our goal is to determine the PDF of $Z$.

Theorem 9. Let $X$ and $Y$ be two independent random variables with PDFs $f_X(x)$ and $f_Y(y)$, respectively. Let $Z = X + Y$. The PDF of $Z$ is given by
$$f_Z(z) = (f_X * f_Y)(z) = \int_{-\infty}^{\infty} f_X(z - y)\, f_Y(y)\, dy, \tag{5.27}$$
where $*$ denotes the convolution.

Proof. Let us start by analyzing the CDF of $Z$. The CDF of $Z$ is
$$F_Z(z) = \mathbb{P}[Z \le z] = \int_{-\infty}^{\infty} \int_{-\infty}^{z - y} f_X(x)\, f_Y(y)\, dx\, dy,$$
where the integration limits can be seen from Figure 5.4. Then, by the fundamental theorem of calculus, we can show that
$$f_Z(z) = \frac{d}{dz} F_Z(z) = \frac{d}{dz} \int_{-\infty}^{\infty} \int_{-\infty}^{z - y} f_X(x)\, f_Y(y)\, dx\, dy = \int_{-\infty}^{\infty} \left( \frac{d}{dz} \int_{-\infty}^{z - y} f_X(x)\, dx \right) f_Y(y)\, dy = \int_{-\infty}^{\infty} f_X(z - y)\, f_Y(y)\, dy = (f_X * f_Y)(z),$$
where $*$ denotes the convolution.

The result of this derivation shows that the PDF of $X + Y$ is the convolution of $f_X(x)$ and $f_Y(y)$. The following example illustrates how we can compute the convolution.

Example. Let $X$ and $Y$ be independent, and let
$$f_X(x) = \begin{cases} x e^{-x}, & x \ge 0, \\ 0, & x < 0, \end{cases} \qquad f_Y(y) = \begin{cases} y e^{-y}, & y \ge 0, \\ 0, & y < 0. \end{cases}$$
Figure 5.4: The shaded region highlights the set $\{X + Y \le z\}$.

Find the PDF of $Z = X + Y$.

Solution. Using the results derived above, we see that
$$f_Z(z) = \int_{-\infty}^{\infty} f_X(z - y)\, f_Y(y)\, dy = \int_0^z f_X(z - y)\, f_Y(y)\, dy,$$
where the upper limit $z$ comes from the fact that $x \ge 0$: since $Z = X + Y$, we must have $Z - Y = X \ge 0$ and so $Y \le Z$. Substituting the PDFs into the integration yields
$$f_Z(z) = \int_0^z (z - y) e^{-(z - y)}\, y\, e^{-y}\, dy = e^{-z} \int_0^z (z - y)\, y\, dy = \frac{z^3}{6}\, e^{-z}, \qquad z \ge 0.$$
For $z < 0$, $f_Z(z) = 0$.
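The convolution in Theorem 9 can also be carried out numerically on a grid. The added sketch below discretizes the two PDFs of this example, convolves them with np.convolve, and compares the result against the closed form $z^3 e^{-z}/6$.

```python
import numpy as np

dz = 0.001
t = np.arange(0, 20, dz)
f = t * np.exp(-t)                  # f_X = f_Y = t e^{-t} for t >= 0

# Discrete approximation of (f_X * f_Y)(z): np.convolve times the grid step.
f_Z = np.convolve(f, f)[: t.size] * dz

closed_form = t ** 3 / 6 * np.exp(-t)
print(np.max(np.abs(f_Z - closed_form)))   # small discretization error
```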
In general, a function of two random variables is not limited to summation. The following example illustrates the case of a product of two random variables.

Example. Let $X$ and $Y$ be two independent random variables such that
$$f_X(x) = \begin{cases} 2x, & \text{if } 0 \le x \le 1, \\ 0, & \text{otherwise,} \end{cases} \qquad f_Y(y) = \begin{cases} 1, & \text{if } 0 \le y \le 1, \\ 0, & \text{otherwise.} \end{cases}$$
Let $Z = XY$. Find $f_Z(z)$.

Solution. The CDF of $Z$ can be evaluated as
$$F_Z(z) = \mathbb{P}[Z \le z] = \mathbb{P}[XY \le z] = \int_{-\infty}^{\infty} \int_{-\infty}^{z/y} f_X(x)\, f_Y(y)\, dx\, dy.$$
Taking the derivative yields
$$f_Z(z) = \frac{d}{dz} F_Z(z) = \frac{d}{dz} \int_{-\infty}^{\infty} \int_{-\infty}^{z/y} f_X(x)\, f_Y(y)\, dx\, dy \overset{(a)}{=} \int_{-\infty}^{\infty} \frac{1}{y}\, f_X\!\left(\frac{z}{y}\right) f_Y(y)\, dy,$$
where $(a)$ holds by the fundamental theorem of calculus. The upper and lower limits of this integration can be determined by noting that $0 \le \frac{z}{y} \le x \le 1$, which implies that $y \ge z$. Since $y \le 1$, we have that $z \le y \le 1$. Therefore, the PDF is
$$f_Z(z) = \int_z^1 \frac{1}{y}\, f_X\!\left(\frac{z}{y}\right) f_Y(y)\, dy = \int_z^1 \frac{2z}{y^2}\, dy = 2(1 - z), \qquad 0 \le z \le 1.$$
For $z < 0$, $f_Z(z) = 0$.

5.7 Two-dimensional Gaussian

Covariance Matrix and Joint Gaussian PDF

Among many joint distributions, the joint Gaussian is of particular interest because of its usefulness. To define a joint Gaussian distribution, we first define a few notations:
$$\mathbf{X} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}, \qquad \boldsymbol{\mu} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \qquad \boldsymbol{\Sigma} = \begin{bmatrix} \mathrm{Var}(X_1) & \mathrm{Cov}(X_1, X_2) \\ \mathrm{Cov}(X_2, X_1) & \mathrm{Var}(X_2) \end{bmatrix}.$$
The vector $\boldsymbol{\mu}$ is called the mean vector, and the matrix $\boldsymbol{\Sigma}$ is called the covariance matrix. It is not difficult to show that the covariance matrix can be defined in the following way.

Theorem 10. The covariance matrix $\boldsymbol{\Sigma}$ is equivalent to
$$\boldsymbol{\Sigma} = \mathbb{E}\left[(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})^T\right]. \tag{5.28}$$

Proof. For a two-dimensional random vector, the theorem holds because
$$\mathbb{E}\left[(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})^T\right] = \mathbb{E}\left[ \begin{bmatrix} X_1 - \mu_1 \\ X_2 - \mu_2 \end{bmatrix} \begin{bmatrix} X_1 - \mu_1 & X_2 - \mu_2 \end{bmatrix} \right] = \mathbb{E}\begin{bmatrix} (X_1 - \mu_1)^2 & (X_1 - \mu_1)(X_2 - \mu_2) \\ (X_2 - \mu_2)(X_1 - \mu_1) & (X_2 - \mu_2)^2 \end{bmatrix} = \begin{bmatrix} \mathrm{Var}(X_1) & \mathrm{Cov}(X_1, X_2) \\ \mathrm{Cov}(X_2, X_1) & \mathrm{Var}(X_2) \end{bmatrix}.$$
Clearly, the definition can be extended to random vectors of any finite dimension. We can also prove the following property of the covariance matrix.

Theorem 11. The covariance matrix $\boldsymbol{\Sigma}$ is symmetric positive semi-definite, i.e., $\boldsymbol{\Sigma}^T = \boldsymbol{\Sigma}$, and $\mathbf{v}^T \boldsymbol{\Sigma} \mathbf{v} \ge 0$ for all $\mathbf{v} \in \mathbb{R}^d$.

Proof. Symmetry is immediate from the definition, because $\mathrm{Cov}(X_i, X_j) = \mathrm{Cov}(X_j, X_i)$. The positive semi-definiteness comes from the fact that
$$\mathbf{v}^T \boldsymbol{\Sigma} \mathbf{v} = \mathbf{v}^T \mathbb{E}\left[(\mathbf{X} - \boldsymbol{\mu}_X)(\mathbf{X} - \boldsymbol{\mu}_X)^T\right] \mathbf{v} = \mathbb{E}\left[\mathbf{v}^T (\mathbf{X} - \boldsymbol{\mu}_X)(\mathbf{X} - \boldsymbol{\mu}_X)^T \mathbf{v}\right] = \mathbb{E}[u^T u] = \mathbb{E}[|u|^2] \ge 0, \qquad \text{where } u = (\mathbf{X} - \boldsymbol{\mu}_X)^T \mathbf{v}.$$

With these tools in hand, we can now define a joint Gaussian. The PDF of a multi-dimensional Gaussian is given by the following definition.

Definition 17. A $d$-dimensional joint Gaussian has the PDF
$$f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^d\, |\boldsymbol{\Sigma}|}} \exp\left\{ -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right\}, \tag{5.29}$$
where $d$ denotes the dimensionality of the vector $\mathbf{x}$. In this course, we are mostly interested in the case $d = 2$.

As a special case, if we assume that $X_1$ and $X_2$ are independent, then we can show the following result.

Theorem 12. Let $\mathbf{x} = [x_1, x_2]^T$. If $X_1$ and $X_2$ are independent, then
$$f_{\mathbf{X}}(\mathbf{x}) = \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\left\{ -\frac{(x_1 - \mu_1)^2}{2\sigma_1^2} \right\} \cdot \frac{1}{\sqrt{2\pi}\,\sigma_2} \exp\left\{ -\frac{(x_2 - \mu_2)^2}{2\sigma_2^2} \right\}, \tag{5.30}$$
i.e., the product of two 1D Gaussians.
Proof. To show this result, we note that if $X_1$ and $X_2$ are independent, then
$$\boldsymbol{\Sigma} = \begin{bmatrix} \mathrm{Var}(X_1) & \mathrm{Cov}(X_1, X_2) \\ \mathrm{Cov}(X_2, X_1) & \mathrm{Var}(X_2) \end{bmatrix} = \begin{bmatrix} \mathrm{Var}(X_1) & 0 \\ 0 & \mathrm{Var}(X_2) \end{bmatrix} = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}.$$
The determinant is $|\boldsymbol{\Sigma}| = \sigma_1^2 \sigma_2^2$. Therefore,
$$(\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) = \begin{bmatrix} x_1 - \mu_1 & x_2 - \mu_2 \end{bmatrix} \begin{bmatrix} \sigma_1^{-2} & 0 \\ 0 & \sigma_2^{-2} \end{bmatrix} \begin{bmatrix} x_1 - \mu_1 \\ x_2 - \mu_2 \end{bmatrix} = \frac{(x_1 - \mu_1)^2}{\sigma_1^2} + \frac{(x_2 - \mu_2)^2}{\sigma_2^2}.$$
Substituting these results into Equation (5.29) yields the desired result.

Geometric Interpretation

Geometrically, the mean $\boldsymbol{\mu}$ and the covariance matrix $\boldsymbol{\Sigma}$ can be interpreted as the center and the radius of the ellipse representing the Gaussian. Figure 5.5 illustrates three examples. As one can observe in these examples, the mean vector $\boldsymbol{\mu}$ controls the center of the Gaussian, while the radius and orientation of the Gaussian are controlled by the covariance matrix.

Figure 5.5: The center and the radius of the ellipse are determined by $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$.
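The added sketch below evaluates the joint Gaussian density of Definition 17 directly from Equation (5.29) and checks it against sampling: the empirical mean and covariance of draws from the model recover $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$. The particular $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ values are arbitrary choices for illustration.

```python
import numpy as np

mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

def gaussian2d_pdf(x, mu, Sigma):
    """Evaluate Equation (5.29) for d = 2."""
    d = mu.size
    diff = x - mu
    quad = diff @ np.linalg.inv(Sigma) @ diff
    norm_const = np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma))
    return np.exp(-0.5 * quad) / norm_const

print(gaussian2d_pdf(np.array([0.0, 0.0]), mu, Sigma))

# Sampling check: empirical mean/covariance approach mu and Sigma.
rng = np.random.default_rng(3)
samples = rng.multivariate_normal(mu, Sigma, size=200_000)
print(samples.mean(axis=0))           # close to mu
print(np.cov(samples, rowvar=False))  # close to Sigma
```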
The precise relation between the radii and the orientation of the Gaussian is determined by the eigenvectors and eigenvalues of $\boldsymbol{\Sigma}$.

Definition 18. The covariance matrix $\boldsymbol{\Sigma}$ can be decomposed as
$$\boldsymbol{\Sigma} = \mathbf{U} \boldsymbol{\Lambda} \mathbf{U}^T \tag{5.31}$$
for some unitary matrix $\mathbf{U}$ and diagonal matrix $\boldsymbol{\Lambda}$. The columns of $\mathbf{U}$ are called the eigenvectors, and the diagonal entries of $\boldsymbol{\Lambda}$ are called the eigenvalues.

If we write out the definition of the eigenvectors and eigenvalues, we can see that (at least for the two-dimensional case):
$$\boldsymbol{\Sigma} = \mathbf{U} \boldsymbol{\Lambda} \mathbf{U}^T = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \begin{bmatrix} \mathbf{u}_1^T \\ \mathbf{u}_2^T \end{bmatrix}.$$
The column vector $\mathbf{u}_1$ defines the direction of the major axis, and $\mathbf{u}_2$ defines the direction of the minor axis. The values $\lambda_1$ and $\lambda_2$ define the radii of the axes, respectively. See Figure 5.6 for an illustration.

Figure 5.6: The center and the radius of the ellipse are determined by $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$.
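The decomposition in Definition 18 is one call to a symmetric eigensolver. The added sketch below factors an example covariance matrix (the same arbitrary Σ as before) and prints the axis directions and the corresponding eigenvalues of the ellipse.

```python
import numpy as np

Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

# Symmetric eigendecomposition: Sigma = U diag(lam) U^T.
lam, U = np.linalg.eigh(Sigma)

# eigh returns eigenvalues in ascending order; the largest one corresponds
# to the major axis of the ellipse.
print("eigenvalues:", lam)
print("major axis direction:", U[:, np.argmax(lam)])
print("reconstruction ok:", np.allclose(U @ np.diag(lam) @ U.T, Sigma))
```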
Maximum-a-Posteriori Classifier

Consider a dataset of two classes $C_1$ and $C_2$. We assume that all data within each class follow a Gaussian distribution. More specifically, we assume that
$$\mathbf{X} \mid C_1 \sim \mathcal{N}(\boldsymbol{\mu}_1, \boldsymbol{\Sigma}_1), \qquad \mathbf{X} \mid C_2 \sim \mathcal{N}(\boldsymbol{\mu}_2, \boldsymbol{\Sigma}_2).$$
Suppose we are given a testing data point $\mathbf{x}$; how do we design a classifier to classify this data point?

To answer this question, we first need to determine the two PDFs. Assume that the probability of obtaining $C_1$ is $\pi_1$ and the probability of obtaining $C_2$ is $\pi_2$. That is, $f_C(C_1) = \pi_1$ and $f_C(C_2) = \pi_2$, with $\pi_1 + \pi_2 = 1$. The conditional PDFs are given by
$$f_{\mathbf{X}|C}(\mathbf{x} \mid C_1) = \frac{1}{\sqrt{(2\pi)^d\, |\boldsymbol{\Sigma}_1|}} \exp\left\{ -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_1)^T \boldsymbol{\Sigma}_1^{-1} (\mathbf{x} - \boldsymbol{\mu}_1) \right\},$$
$$f_{\mathbf{X}|C}(\mathbf{x} \mid C_2) = \frac{1}{\sqrt{(2\pi)^d\, |\boldsymbol{\Sigma}_2|}} \exp\left\{ -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_2)^T \boldsymbol{\Sigma}_2^{-1} (\mathbf{x} - \boldsymbol{\mu}_2) \right\}.$$
One possible way of designing a classifier is to test the posterior distribution, and check whether
$$f_{C|\mathbf{X}}(C_1 \mid \mathbf{x}) \ \ge \ f_{C|\mathbf{X}}(C_2 \mid \mathbf{x}). \tag{5.32}$$
If $f_{C|\mathbf{X}}(C_1 \mid \mathbf{x}) \ge f_{C|\mathbf{X}}(C_2 \mid \mathbf{x})$, we claim that the class is $C_1$; otherwise it is $C_2$. By Bayes' theorem, we can rewrite the posterior comparison as
$$f_{\mathbf{X}|C}(\mathbf{x} \mid C_1)\, f_C(C_1) \ \ge \ f_{\mathbf{X}|C}(\mathbf{x} \mid C_2)\, f_C(C_2).$$
Substituting the Gaussians, we have
$$\frac{\pi_1}{\sqrt{(2\pi)^d\, |\boldsymbol{\Sigma}_1|}}\, e^{-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu}_1)^T \boldsymbol{\Sigma}_1^{-1}(\mathbf{x} - \boldsymbol{\mu}_1)} \ \ge \ \frac{\pi_2}{\sqrt{(2\pi)^d\, |\boldsymbol{\Sigma}_2|}}\, e^{-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu}_2)^T \boldsymbol{\Sigma}_2^{-1}(\mathbf{x} - \boldsymbol{\mu}_2)}.$$
The comparison defined by this posterior distribution is called the maximum-a-posteriori (MAP) classification.

Definition 19. The maximum-a-posteriori (MAP) classification is a test to check whether
$$f_{\mathbf{X}|C}(\mathbf{x} \mid C_1)\, f_C(C_1) \ \ge \ f_{\mathbf{X}|C}(\mathbf{x} \mid C_2)\, f_C(C_2). \tag{5.33}$$

To demonstrate how the MAP classification can be used in practice, we consider the special case where $\boldsymbol{\Sigma}_1 = \boldsymbol{\Sigma}_2$ and $\pi_1 = \pi_2 = 1/2$.

Theorem 13. Let $\mathbf{X} \mid C_1 \sim \mathcal{N}(\boldsymbol{\mu}_1, \boldsymbol{\Sigma}_1)$ and $\mathbf{X} \mid C_2 \sim \mathcal{N}(\boldsymbol{\mu}_2, \boldsymbol{\Sigma}_2)$. Suppose that $\boldsymbol{\Sigma}_1 = \boldsymbol{\Sigma}_2 = \boldsymbol{\Sigma}$ and $\pi_1 = \pi_2 = 1/2$. Then the MAP classifier of $C_1$ and $C_2$ is the test
$$\mathbf{w}^T \mathbf{x} + x_0 \ \ge \ 0, \tag{5.34}$$
where $\mathbf{w} = \boldsymbol{\Sigma}^{-1}(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)$ and $x_0 = -\frac{1}{2}\left\{ \boldsymbol{\mu}_1^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_1 - \boldsymbol{\mu}_2^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_2 \right\}$.

Proof. When $\boldsymbol{\Sigma}_1 = \boldsymbol{\Sigma}_2$ and $\pi_1 = \pi_2 = 1/2$, the MAP classifier can be simplified as
$$e^{-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu}_1)^T \boldsymbol{\Sigma}^{-1}(\mathbf{x} - \boldsymbol{\mu}_1)} \ \ge \ e^{-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu}_2)^T \boldsymbol{\Sigma}^{-1}(\mathbf{x} - \boldsymbol{\mu}_2)}, \tag{5.35}$$
which implies that
$$(\mathbf{x} - \boldsymbol{\mu}_1)^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}_1) \ \le \ (\mathbf{x} - \boldsymbol{\mu}_2)^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}_2). \tag{5.36}$$
Note that the sign is flipped because there is a $-\frac{1}{2}$ term in the exponential. Rewriting the terms, we obtain the equivalent expression
$$\mathbf{x}^T \boldsymbol{\Sigma}^{-1}(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2) \ \ge \ \frac{1}{2}\left\{ \boldsymbol{\mu}_1^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_1 - \boldsymbol{\mu}_2^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_2 \right\}. \tag{5.37}$$
If we define $\mathbf{w} = \boldsymbol{\Sigma}^{-1}(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)$ and $x_0 = -\frac{1}{2}\left\{ \boldsymbol{\mu}_1^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_1 - \boldsymbol{\mu}_2^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_2 \right\}$, the above expression can be simplified as
$$\mathbf{w}^T \mathbf{x} + x_0 \ \ge \ 0. \tag{5.38}$$

The result above is a linear classifier. Given a data point $\mathbf{x}$, all we need to do is to project $\mathbf{x}$ onto $\mathbf{w}$, and then check whether $\mathbf{w}^T \mathbf{x} + x_0$ is less than or greater than $0$. If it is less than $0$, then we claim that the class is $C_2$.

Figure 5.7: Classifying two classes of data points.
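A minimal implementation of Theorem 13 is shown below (an added sketch with arbitrary example parameters): it builds $\mathbf{w}$ and $x_0$ from the two class means and the shared covariance, then labels a few test points.

```python
import numpy as np

mu1 = np.array([2.0, 0.0])
mu2 = np.array([-1.0, 1.0])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])          # shared covariance, equal priors

# Linear MAP classifier of Theorem 13: w^T x + x0 >= 0  =>  class C1.
Sigma_inv = np.linalg.inv(Sigma)
w = Sigma_inv @ (mu1 - mu2)
x0 = -0.5 * (mu1 @ Sigma_inv @ mu1 - mu2 @ Sigma_inv @ mu2)

def classify(x):
    return "C1" if w @ x + x0 >= 0 else "C2"

for x in [np.array([2.0, 0.5]), np.array([-1.5, 1.0]), np.array([0.5, 0.5])]:
    print(x, classify(x))
```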