
STAT/MATH 395 A - PROBABILITY II
UW Winter Quarter 2017
Néhémy Lim

Bivariate Distributions

1 Distributions of Two Random Variables

Definition 1.1. Let X and Y be two real-valued random variables (rrvs) on a probability space (Ω, A, P). A two-dimensional random variable (X, Y) is a function mapping (X, Y) : Ω → R², such that for any numbers x, y ∈ R:

    {ω ∈ Ω | X(ω) ≤ x and Y(ω) ≤ y} ∈ A    (1)

Definition 1.2 (Joint cumulative distribution function). Let X and Y be two rrvs on probability space (Ω, A, P). The joint (cumulative) distribution function (joint c.d.f.) of X and Y is the function F_XY given by

    F_XY(x, y) = P({X ≤ x} ∩ {Y ≤ y}) = P(X ≤ x, Y ≤ y),  for x, y ∈ R    (2)

We first consider the case of two discrete rrvs.

1.1 Distributions of Two Discrete Random Variables

Example 1. Here is an excerpt of an article published in Time and entitled "Why Gen Y Loves Restaurants And Restaurants Love Them Even More":

    According to a new report from the research firm Technomic, 42% of millennials say they visit upscale casual-dining restaurants at least once a month. That's a higher percentage than Gen X (33%) and Baby Boomers (24%) who go to such restaurants once or more monthly.
    Time, Aug. 15, 2012, By Brad Tuttle

We can find the population of each category as provided by the US Census Bureau:

    Age                      Population
    0-19                     83,267,556
    20-34 (Millennials)      62,649,947
    35-49 (Gen X)            63,779,197
    50-69 (Baby Boomers)     71,216,117
    70+                      27,830,721

    Table 1: Demography in the US by age group [US Census data]

Let X and Y be two random variables defined as follows:

X = 1 if a person between 20 and 69 visits upscale restaurants at least once a month, and X = 0 otherwise
Y = 1 if a person between 20 and 69 is a Millennial
Y = 2 if a person between 20 and 69 is a Gen Xer
Y = 3 if a person between 20 and 69 is a Baby Boomer

We can translate the statement of the article into the contingency table shown below:

    X \ Y      1         2         3
    0          36,337    42,732    47,030
    1          26,300    21,047    24,200

    Table 2: Count Data (in thousands)

The probability that the couple (X, Y) takes on particular values can be found by dividing each cell by the total population of people between 20 and 69.

    X \ Y      1         2         3
    0          0.184     0.216     0.238
    1          0.133     0.106     0.122

    Table 3: Joint Probability Mass Function p_XY(x, y)

Now, let us define formally the joint probability mass function of two discrete random variables X and Y.
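Dividing each count by the grand total reproduces Table 3. A minimal Python sketch of that computation (counts in thousands, as in Table 2):

    counts = {  # (x, y) -> count in thousands, from Table 2
        (0, 1): 36337, (0, 2): 42732, (0, 3): 47030,
        (1, 1): 26300, (1, 2): 21047, (1, 3): 24200,
    }
    total = sum(counts.values())  # about 197,646 thousand people aged 20-69
    p_XY = {xy: round(c / total, 3) for xy, c in counts.items()}
    print(p_XY)  # {(0, 1): 0.184, (0, 2): 0.216, (0, 3): 0.238, ...}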

Definition 1.3. Let X and Y be two discrete rrvs on probability space (Ω, A, P). The joint pmf of X and Y, denoted by p_XY, is defined as follows:

    p_XY(x, y) = P({X = x} ∩ {Y = y}) = P(X = x, Y = y),    (3)

for x ∈ X(Ω) and y ∈ Y(Ω).

Property 1.1. Let X and Y be two discrete rrvs on probability space (Ω, A, P) with joint pmf p_XY. Then the following holds:

    p_XY(x, y) ≥ 0, for x ∈ X(Ω) and y ∈ Y(Ω)
    ∑_{x ∈ X(Ω)} ∑_{y ∈ Y(Ω)} p_XY(x, y) = 1

Property 1.2. Let X and Y be two discrete rrvs on probability space (Ω, A, P) with joint pmf p_XY. Then, for any subset S ⊆ X(Ω) × Y(Ω),

    P((X, Y) ∈ S) = ∑_{(x,y) ∈ S} p_XY(x, y)

The above property tells us that in order to determine the probability of the event {(X, Y) ∈ S}, you simply sum up the probabilities of the events {X = x, Y = y} over the values (x, y) in S.

Definition 1.4 (Marginal distributions). Let X and Y be two discrete rrvs on probability space (Ω, A, P) with joint pmf p_XY. Then the pmf of X alone is called the marginal probability mass function of X and is defined by:

    p_X(x) = P(X = x) = ∑_{y ∈ Y(Ω)} p_XY(x, y),  for x ∈ X(Ω)    (4)

Similarly, the pmf of Y alone is called the marginal probability mass function of Y and is defined by:

    p_Y(y) = P(Y = y) = ∑_{x ∈ X(Ω)} p_XY(x, y),  for y ∈ Y(Ω)    (5)

Definition 1.5 (Independence). Let X and Y be two discrete rrvs on probability space (Ω, A, P) with joint pmf p_XY. Let p_X and p_Y be the respective marginal pmfs of X and Y. Then X and Y are said to be independent if and only if:

    p_XY(x, y) = p_X(x) p_Y(y),  for all x ∈ X(Ω) and y ∈ Y(Ω)    (6)

Definition 1.6 (Expected Value). Let X and Y be two discrete rrvs on probability space (Ω, A, P) with joint pmf p_XY and let g : R² → R be a bounded piecewise continuous function. Then the mathematical expectation of g(X, Y), if it exists, is denoted by E[g(X, Y)] and is defined as follows:

    E[g(X, Y)] = ∑_{x ∈ X(Ω)} ∑_{y ∈ Y(Ω)} g(x, y) p_XY(x, y)    (7)
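These definitions translate directly into code. The sketch below (hypothetical helper names, not a library API) stores a joint pmf as a Python dictionary mapping pairs (x, y) to probabilities:

    def marginal_X(p_XY):
        """p_X(x) = sum over y of p_XY(x, y), eq. (4)."""
        p_X = {}
        for (x, y), p in p_XY.items():
            p_X[x] = p_X.get(x, 0) + p
        return p_X

    def marginal_Y(p_XY):
        """p_Y(y) = sum over x of p_XY(x, y), eq. (5)."""
        p_Y = {}
        for (x, y), p in p_XY.items():
            p_Y[y] = p_Y.get(y, 0) + p
        return p_Y

    def expectation(g, p_XY):
        """E[g(X, Y)] = sum of g(x, y) * p_XY(x, y), eq. (7)."""
        return sum(g(x, y) * p for (x, y), p in p_XY.items())

    def are_independent(p_XY, tol=1e-12):
        """Check p_XY(x, y) == p_X(x) * p_Y(y) for all pairs, eq. (6)."""
        p_X, p_Y = marginal_X(p_XY), marginal_Y(p_XY)
        return all(abs(p_XY.get((x, y), 0) - p_X[x] * p_Y[y]) <= tol
                   for x in p_X for y in p_Y)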

Example 2. Consider the following joint probability mass function:

    p_XY(x, y) = (x y²)/13 · 1_S(x, y),  with S = {(1, 1), (1, 2), (2, 2)}

1. Show that p_XY is a valid joint probability mass function.

Answer. The joint probability mass function of X and Y is given by the following table:

    X \ Y      1       2
    1          1/13    4/13
    2          0       8/13

We first note that p_XY(x, y) ≥ 0 for all x, y ∈ {1, 2}. Second,

    ∑_{x=1}^{2} ∑_{y=1}^{2} p_XY(x, y) = 1/13 + 4/13 + 0 + 8/13 = 1

Hence, p_XY is indeed a valid joint probability mass function.

2. What is P(X + Y ≤ 3)?

Answer. Let us denote by B the set of values such that X + Y ≤ 3:

    B = {(1, 1), (1, 2), (2, 1)}

Therefore,

    P(X + Y ≤ 3) = ∑_{(x,y) ∈ B} p_XY(x, y) = p_XY(1, 1) + p_XY(1, 2) + p_XY(2, 1) = 1/13 + 4/13 + 0 = 5/13

3. Give the marginal probability mass functions of X and Y.

Answer. The marginal probability mass function of X is given by:

    p_X(1) = ∑_{y=1}^{2} p_XY(1, y) = p_XY(1, 1) + p_XY(1, 2) = 1/13 + 4/13 = 5/13
    p_X(2) = ∑_{y=1}^{2} p_XY(2, y) = p_XY(2, 1) + p_XY(2, 2) = 0 + 8/13 = 8/13

Similarly, the marginal probability mass function of Y is given by:

    p_Y(1) = ∑_{x=1}^{2} p_XY(x, 1) = p_XY(1, 1) + p_XY(2, 1) = 1/13 + 0 = 1/13
    p_Y(2) = ∑_{x=1}^{2} p_XY(x, 2) = p_XY(1, 2) + p_XY(2, 2) = 4/13 + 8/13 = 12/13

4. What are the expected values of X and Y?

Answer. The expected value of X is given by:

    E[X] = ∑_{x=1}^{2} ∑_{y=1}^{2} x p_XY(x, y)
         = ∑_{x=1}^{2} x { ∑_{y=1}^{2} p_XY(x, y) }
         = ∑_{x=1}^{2} x p_X(x)
         = 1 · 5/13 + 2 · 8/13 = 21/13

Similarly, the expected value of Y is given by:

    E[Y] = ∑_{x=1}^{2} ∑_{y=1}^{2} y p_XY(x, y)
         = ∑_{y=1}^{2} y { ∑_{x=1}^{2} p_XY(x, y) }
         = ∑_{y=1}^{2} y p_Y(y)
         = 1 · 1/13 + 2 · 12/13 = 25/13

5. Are X and Y independent?

Answer. It is easy to see that X and Y are not independent, since, for example:

    p_X(2) p_Y(1) = (8/13)(1/13) = 8/169 ≠ 0 = p_XY(2, 1)
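All of Example 2 can be double-checked with exact arithmetic, for instance with Python's fractions module:

    from fractions import Fraction

    p = {(x, y): Fraction(x * y**2, 13) for (x, y) in [(1, 1), (1, 2), (2, 2)]}

    assert sum(p.values()) == 1                              # valid pmf
    print(sum(q for (x, y), q in p.items() if x + y <= 3))   # P(X+Y <= 3) = 5/13
    print(sum(x * q for (x, y), q in p.items()))             # E[X] = 21/13
    print(sum(y * q for (x, y), q in p.items()))             # E[Y] = 25/13
    # Not independent: p_X(2) * p_Y(1) = (8/13)(1/13) = 8/169, but p(2, 1) = 0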

1.2 The Trinomial Distribution (Optional section)

Example 3. Suppose n = 20 students are selected at random:

Let A be the event that a randomly selected student went to the football game on Saturday. Let P(A) = 0.2 = p₁.
Let B be the event that a randomly selected student watched the football game on TV on Saturday. Let P(B) = 0.5 = p₂.
Let C be the event that a randomly selected student completely ignored the football game on Saturday. Let P(C) = 0.3 = 1 − p₁ − p₂.

One possible outcome is:

    BBCABBAACABBBCCBCBCB

That is, the first two students watched the game on TV, the third student ignored the game, the fourth student went to the game, and so on. By independence, the probability of observing that particular outcome is:

    p₁⁴ p₂¹⁰ (1 − p₁ − p₂)^(20−4−10)

Now, let us define the three following random variables:

X is the number in the sample who went to the football game on Saturday,
Y is the number in the sample who watched the football game on TV on Saturday,
Z is the number in the sample who completely ignored the football game.

Then in this case: X = 4, Y = 10, and thus Z = 20 − X − Y = 6. One way of having (X = 4, Y = 10) is given by the above outcome. Actually, there are many ways of observing (X = 4, Y = 10). We find all the possible outcomes by giving all the permutations of the A's, B's and C's of the above ordered sequence. The number of ways we can divide 20 items into 3 distinct groups of respective sizes 4, 10 and 6 is the multinomial coefficient:

    (20 choose 4, 10, 6) = 20! / (4! 10! 6!)

Thus, the joint probability of getting (X = 4, Y = 10) is:

    [20! / (4! 10! (20 − 4 − 10)!)] · p₁⁴ p₂¹⁰ (1 − p₁ − p₂)^(20−4−10)

Let us generalize our findings.

Definition 1.7 (Trinomial process). A trinomial process has the following features:
1. We repeat n ∈ N identical trials.
2. A trial can result in exactly one of three mutually exclusive and exhaustive outcomes, that is, events E₁, E₂ and E₃ occur with respective probabilities p₁, p₂ and p₃ = 1 − p₁ − p₂. In other words, E₁, E₂ and E₃ form a partition of Ω.
3. p₁, p₂ (thus p₃) remain constant trial after trial. In this case, the process is said to be stationary.
4. The trials are mutually independent.

Definition 1.8 (Trinomial Distribution). Let (Ω, A, P) be a probability space. Let E₁, E₂ and E₃ be three mutually exclusive and exhaustive events that occur with respective probabilities p₁, p₂ and p₃ = 1 − p₁ − p₂. Assume n ∈ N trials are performed according to a trinomial process. Let X, Y and Z denote the numbers of times events E₁, E₂ and E₃ occur respectively. Then the random variables X, Y are said to follow a trinomial distribution Trin(n, p₁, p₂) and their joint probability mass function is given by:

    p_XY(x, y) = [n! / (x! y! (n − x − y)!)] p₁^x p₂^y (1 − p₁ − p₂)^(n−x−y)    (8)

Remark. Note that the joint distribution of X and Y is exactly the same as the joint distribution of X and Z, or the joint distribution of Y and Z, since X + Y + Z = n.
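Equation (8) is easy to evaluate in code. A small sketch, applied to the student example above:

    from math import factorial

    def trinomial_pmf(x, y, n, p1, p2):
        """Joint pmf of Trin(n, p1, p2), eq. (8)."""
        if x < 0 or y < 0 or x + y > n:
            return 0.0
        coef = factorial(n) // (factorial(x) * factorial(y) * factorial(n - x - y))
        return coef * p1**x * p2**y * (1 - p1 - p2)**(n - x - y)

    # The student example: n = 20, p1 = 0.2, p2 = 0.5
    print(trinomial_pmf(4, 10, 20, 0.2, 0.5))  # P(X = 4, Y = 10), about 0.044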

1.3 Distributions of Two Continuous Random Variables

Definition 1.9 (Joint probability density function). Let X and Y be two continuous rrvs on probability space (Ω, A, P). X and Y are said to be jointly continuous if there exists a function f_XY such that, for any Borel set B of R²:

    P((X, Y) ∈ B) = ∫∫_B f_XY(x, y) dx dy    (9)

The function f_XY is then called the joint probability density function of X and Y. Moreover, if the joint distribution function F_XY is of class C², then the joint pdf of X and Y can be expressed in terms of partial derivatives:

    f_XY(x, y) = ∂²F_XY(x, y) / ∂x ∂y    (10)

Property 1.3. Let X and Y be two continuous rrvs on probability space (Ω, A, P) with joint pdf f_XY. Then the following holds:

    f_XY(x, y) ≥ 0, for x, y ∈ R

    ∫_R ∫_R f_XY(x, y) dx dy = 1

Definition 1.10 (Marginal distributions). Let X and Y be two continuous rrvs on probability space (Ω, A, P) with joint pdf f_XY. Then the pdf of X alone is called the marginal probability density function of X and is defined by:

    f_X(x) = ∫_R f_XY(x, y) dy,  for x ∈ R    (11)

Similarly, the pdf of Y alone is called the marginal probability density function of Y and is defined by:

    f_Y(y) = ∫_R f_XY(x, y) dx,  for y ∈ R    (12)

Definition 1.11 (Independence). Let X and Y be two continuous rrvs on probability space (Ω, A, P) with joint pdf f_XY. Let f_X and f_Y be the respective marginal pdfs of X and Y. Then X and Y are said to be independent if and only if:

    f_XY(x, y) = f_X(x) f_Y(y),  for all x, y ∈ R    (13)

Definition 1.12 (Expected Value). Let X and Y be two continuous rrvs on probability space (Ω, A, P) with joint pdf f_XY and let g : R² → R be a bounded piecewise continuous function. Then the mathematical expectation of g(X, Y), if it exists, is denoted by E[g(X, Y)] and is defined as follows:

    E[g(X, Y)] = ∫_R ∫_R g(x, y) f_XY(x, y) dx dy    (14)

Example 4. Let X and Y be two continuous random variables with joint probability density function:

    f_XY(x, y) = 4xy · 1_{[0,1]²}(x, y)

1. Verify that f_XY is a valid joint probability density function.

Answer. Note that f_XY can be rewritten as follows:

    f_XY(x, y) = 4xy if 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1, and 0 otherwise

It is now clear that f_XY(x, y) ≥ 0 for all x, y ∈ R. Second, let us verify

that the integral of f_XY over R² is 1:

    ∫_R ∫_R f_XY(x, y) dx dy = ∫_0^1 ∫_0^1 4xy dy dx
        = ∫_0^1 4x [y²/2]_0^1 dx
        = ∫_0^1 2x dx
        = [x²]_0^1
        = 1

Hence, f_XY is a valid joint probability density function.

2. What is P(Y < X)?

Answer. The set B in the xy-plane such that (x, y) ∈ [0, 1]² and y < x is:

    B = {(x, y) | 0 ≤ y < x ≤ 1}

Then,

    P(Y < X) = ∫∫_B f_XY(x, y) dx dy = ∫_0^1 ∫_0^x 4xy dy dx
        = ∫_0^1 4x [y²/2]_0^x dx
        = ∫_0^1 2x³ dx
        = [x⁴/2]_0^1
        = 1/2

3. What are the marginal probability density functions of X and Y?

Answer. The marginal probability density function of X, for x ∈ [0, 1], is given by:

    f_X(x) = ∫_R f_XY(x, y) dy = ∫_0^1 4xy dy = 4x [y²/2]_0^1 = 2x

And f_X(x) = 0 otherwise. In a nutshell,

    f_X(x) = 2x · 1_[0,1](x)

Similarly, we find that the marginal probability density function of Y is:

    f_Y(y) = 2y · 1_[0,1](y)

4. Are X and Y independent?

Answer. Yes, since for all x, y ∈ R:

    f_X(x) f_Y(y) = 4xy · 1_[0,1](x) 1_[0,1](y) = f_XY(x, y)

5. What are the expected values of X and Y?

Answer. The expected value of X can be computed as follows:

    E[X] = ∫_R ∫_R x f_XY(x, y) dy dx
         = ∫_R x { ∫_R f_XY(x, y) dy } dx
         = ∫_R x f_X(x) dx
         = ∫_0^1 2x² dx = [2x³/3]_0^1 = 2/3

Similarly, we find that the expected value of Y is 2/3.
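The integrals of Example 4 can also be verified symbolically, assuming SymPy is available:

    import sympy as sp

    x, y = sp.symbols('x y')
    f = 4 * x * y  # joint pdf on the unit square [0,1] x [0,1]

    print(sp.integrate(f, (y, 0, 1), (x, 0, 1)))   # 1 : valid pdf
    print(sp.integrate(f, (y, 0, x), (x, 0, 1)))   # P(Y < X) = 1/2
    f_X = sp.integrate(f, (y, 0, 1))               # marginal of X: 2*x
    print(sp.integrate(x * f_X, (x, 0, 1)))        # E[X] = 2/3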

Example 5. The joint probability density function of X and Y is given by:

    f_XY(x, y) = (6/7)(x² + xy/2) · 1_S(x, y),  with S = {(x, y) | 0 < x < 1, 0 < y < 2}

1. Show that f_XY is a valid joint probability density function.
2. Compute the marginal probability density function of X.
3. Find P(X > Y).
4. What are the expected values of X and Y?
5. Are X and Y independent?
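No solution is worked out here; one way to check your answers is a symbolic sketch (again assuming SymPy):

    import sympy as sp

    x, y = sp.symbols('x y')
    f = sp.Rational(6, 7) * (x**2 + x * y / 2)     # on 0 < x < 1, 0 < y < 2

    print(sp.integrate(f, (y, 0, 2), (x, 0, 1)))   # 1 : valid pdf
    f_X = sp.expand(sp.integrate(f, (y, 0, 2)))    # marginal: 12*x**2/7 + 6*x/7
    print(f_X)
    print(sp.integrate(f, (y, 0, x), (x, 0, 1)))   # P(X > Y) = 15/56
    print(sp.integrate(x * f_X, (x, 0, 1)))        # E[X] = 5/7
    f_Y = sp.integrate(f, (x, 0, 1))               # marginal of Y: 3*y/14 + 2/7
    print(sp.integrate(y * f_Y, (y, 0, 2)))        # E[Y] = 8/7
    print(sp.simplify(f - f_X * f_Y) == 0)         # False : not independent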

2 The Correlation Coefficient

2.1 Covariance

Definition 2.1 (Covariance). Let X and Y be two rrvs on probability space (Ω, A, P). The covariance of X and Y, denoted by Cov(X, Y), is defined as follows:

    Cov(X, Y) = E[(X − E[X])(Y − E[Y])]    (15)

upon existence of the above expression.

If X and Y are discrete rrvs with joint pmf p_XY, then the covariance of X and Y is:

    Cov(X, Y) = ∑_{x ∈ X(Ω)} ∑_{y ∈ Y(Ω)} (x − E[X])(y − E[Y]) p_XY(x, y)    (16)

If X and Y are continuous rrvs with joint pdf f_XY, then the covariance of X and Y is:

    Cov(X, Y) = ∫_R ∫_R (x − E[X])(y − E[Y]) f_XY(x, y) dx dy    (17)

Theorem 2.1. Let X and Y be two rrvs on probability space (Ω, A, P). The covariance of X and Y can be calculated as:

    Cov(X, Y) = E[XY] − E[X]E[Y]    (18)

Example 6. Suppose that X and Y have the following joint probability mass function:

    X \ Y      1       2       3
    1          0.25    0.25    0
    2          0       0.25    0.25

What is the covariance of X and Y?
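With Theorem 2.1, the covariance for the table above takes a few lines (Python sketch):

    p = {(1, 1): 0.25, (1, 2): 0.25, (2, 2): 0.25, (2, 3): 0.25}

    E_X  = sum(x * q for (x, y), q in p.items())       # 1.5
    E_Y  = sum(y * q for (x, y), q in p.items())       # 2.0
    E_XY = sum(x * y * q for (x, y), q in p.items())   # 3.25
    print(E_XY - E_X * E_Y)                            # Cov(X, Y) = 0.25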

Property.1. Here are some properties of the covariance. For any random variables X and Y, we have : 1. Cov(X, Y ) Cov(Y, X). Cov(X, X) Var(X) 3. Cov(aX, Y ) acov(x, Y ) for a R 4. Let X 1,..., X n be n random variables and Y 1,..., Y m be m random variables. Then : n m n m Cov X i, Cov(X i, Y j ). Correlation i1 j1 Y j i1 j1 Definition. (Correlation). Let X and Y be two rrvs on probability space (Ω, A, P) with respective standard deviations σ X Var(X) and σ Y Var(Y ). The correlation of X and Y, denoted by ρ XY, is defined as follows : upon existence of the above expression ρ XY Cov(X, Y ) σ X σ Y (19) Property.. Let X and Y be two rrvs on probability space (Ω, A, P). 1 ρ XY 1 () Interpretation of Correlation The correlation coefficient of X and Y is interpreted as follows: 1. If ρ XY 1, then X and Y are perfectly, positively, linearly correlated.. If ρ XY 1, then X and Y are perfectly, negatively, linearly correlated. X and Y are also said to be perfectly linearly anticorrelated 3. If ρ XY, then X and Y are completely, linearly uncorrelated. That is, X and Y may be perfectly correlated in some other manner, in a parabolic manner, perhaps, but not in a linear manner. 4. If < ρ XY < 1, then X and Y are positively, linearly correlated, but not perfectly so. 5. If 1 < ρ XY <, then X and Y are negatively, linearly correlated, but not perfectly so. X and Y are also said to be linearly anticorrelated Example. Remember Example 1.1 on dining habits where we defined discrete random variables X and Y as follows : 1

Theorem 2.2. Let X and Y be two rrvs on probability space (Ω, A, P) and let g₁ and g₂ be two bounded piecewise continuous functions. If X and Y are independent, then

    E[g₁(X) g₂(Y)] = E[g₁(X)] E[g₂(Y)]    (21)

provided that the expectations exist.

Proof. Assume that X and Y are discrete random variables with joint probability mass function p_XY and respective marginal probability mass functions p_X and p_Y. Then

    E[g₁(X) g₂(Y)] = ∑_{x ∈ X(Ω)} ∑_{y ∈ Y(Ω)} g₁(x) g₂(y) p_XY(x, y)
        = ∑_{x ∈ X(Ω)} ∑_{y ∈ Y(Ω)} g₁(x) g₂(y) p_X(x) p_Y(y)    since X and Y are independent
        = ( ∑_{x ∈ X(Ω)} g₁(x) p_X(x) ) ( ∑_{y ∈ Y(Ω)} g₂(y) p_Y(y) )
        = E[g₁(X)] E[g₂(Y)]

The result still holds when X and Y are continuous random variables (see Homework 5).

Theorem 2.3. Let X and Y be two rrvs on probability space (Ω, A, P). If X and Y are independent, then

    Cov(X, Y) = ρ_XY = 0    (22)

Note, however, that the converse of the theorem is not necessarily true! That is, zero correlation does not imply independence. Let's take a look at an example that illustrates this claim.

Example 7. Let X and Y be discrete random variables with the following joint probability mass function:

    X \ Y      -1      0       1
    -1         0.2     0.2     0.2
    1          0.2     0       0.2

1. What is the correlation between X and Y?
2. Are X and Y independent?
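A quick check that the correlation is zero while independence fails (Python sketch):

    p = {(-1, -1): 0.2, (-1, 0): 0.2, (-1, 1): 0.2,
         ( 1, -1): 0.2,                ( 1, 1): 0.2}

    E_X  = sum(x * q for (x, y), q in p.items())       # -0.2
    E_Y  = sum(y * q for (x, y), q in p.items())       #  0.0
    E_XY = sum(x * y * q for (x, y), q in p.items())   #  0.0
    print(E_XY - E_X * E_Y)                            # Cov = 0, hence rho = 0

    # Yet X and Y are NOT independent:
    # p_X(-1) * p_Y(-1) = 0.6 * 0.4 = 0.24 != 0.2 = p(-1, -1)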

3 Conditional Distributions

3.1 Discrete case

Example. An international TV network company is interested in the relationship between the region of citizenship of its customers and their favorite sport.

    Sport \ Citizenship   Africa   America   Asia   Europe
    Tennis                0.02     0.07      0.04   0.12
    Basketball            0.03     0.11      0.01   0.04
    Soccer                0.08     0.05      0.04   0.16
    Football              0.01     0.17      0.02   0.03

The company is interested in answering the following questions:

What is the probability that a randomly selected person is African and his/her favorite sport is basketball?
If a randomly selected person's favorite sport is soccer, what is the probability that he/she is European?
If a randomly selected person is American, what is the probability that his/her favorite sport is tennis?

Definition 3.1. Let X and Y be two discrete rrvs on probability space (Ω, A, P) with joint pmf p_XY and respective marginal pmfs p_X and p_Y. Then, for y ∈ Y(Ω), the conditional probability mass function of X given Y = y is defined by:

    p_{X|Y}(x|y) = P(X = x | Y = y) = p_XY(x, y) / p_Y(y)    (23)

provided that p_Y(y) > 0.
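With the table above, the three questions are one joint-probability lookup and two applications of conditioning, as in eq. (23). A Python sketch:

    p = {('tennis', 'Africa'): 0.02, ('tennis', 'America'): 0.07,
         ('tennis', 'Asia'): 0.04, ('tennis', 'Europe'): 0.12,
         ('basketball', 'Africa'): 0.03, ('basketball', 'America'): 0.11,
         ('basketball', 'Asia'): 0.01, ('basketball', 'Europe'): 0.04,
         ('soccer', 'Africa'): 0.08, ('soccer', 'America'): 0.05,
         ('soccer', 'Asia'): 0.04, ('soccer', 'Europe'): 0.16,
         ('football', 'Africa'): 0.01, ('football', 'America'): 0.17,
         ('football', 'Asia'): 0.02, ('football', 'Europe'): 0.03}

    # P(African and basketball): a joint probability, read off the table
    print(p[('basketball', 'Africa')])                 # 0.03

    # P(European | soccer) = p(soccer, Europe) / p(soccer)
    p_soccer = sum(q for (s, r), q in p.items() if s == 'soccer')
    print(p[('soccer', 'Europe')] / p_soccer)          # 0.16/0.33, about 0.485

    # P(tennis | American) = p(tennis, America) / p(America)
    p_america = sum(q for (s, r), q in p.items() if r == 'America')
    print(p[('tennis', 'America')] / p_america)        # 0.07/0.40 = 0.175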

Definition 3.2 (Conditional Expectation). Let X and Y be two discrete rrvs on probability space (Ω, A, P). Then, for y ∈ Y(Ω), the conditional expectation of X given Y = y is defined as follows:

    E[X | Y = y] = ∑_{x ∈ X(Ω)} x p_{X|Y}(x|y)    (24)

Property 3.1 (Linearity of conditional expectation). Let X and Y be two discrete rrvs on probability space (Ω, A, P). Let c₁, c₂ ∈ R and let g₁ : X(Ω) → R and g₂ : X(Ω) → R be piecewise continuous functions. Then, for y ∈ Y(Ω), we have:

    E[c₁ g₁(X) + c₂ g₂(X) | Y = y] = c₁ E[g₁(X) | Y = y] + c₂ E[g₂(X) | Y = y]    (25)

Remark: The above property is also true for continuous rrvs.

Property 3.2. Let X and Y be two discrete rrvs on probability space (Ω, A, P). If X and Y are independent, then, for y ∈ Y(Ω), we have:

    p_{X|Y}(x|y) = p_X(x)  for all x ∈ X(Ω)    (26)

Similarly, for x ∈ X(Ω), we have:

    p_{Y|X}(y|x) = p_Y(y)  for all y ∈ Y(Ω)    (27)

Remark: The above property is also true for continuous rrvs.

Property 3.3. Let X and Y be two discrete rrvs on probability space (Ω, A, P). If X and Y are independent, then, for y ∈ Y(Ω), we have:

    E[X | Y = y] = E[X]    (28)

Similarly, for x ∈ X(Ω), we have:

    E[Y | X = x] = E[Y]    (29)

Remark: The above property is also true for continuous rrvs.

Definition 3.3 (Conditional Variance). Let X and Y be two discrete rrvs on probability space (Ω, A, P). Then, for y ∈ Y(Ω), the conditional variance of X given Y = y is defined as follows:

    Var(X | Y = y) = E[(X − E[X | Y = y])² | Y = y] = ∑_{x ∈ X(Ω)} (x − E[X | Y = y])² p_{X|Y}(x|y)    (30)

Property 3.4. Let X and Y be two discrete rrvs on probability space (Ω, A, P). Then, for y ∈ Y(Ω), the conditional variance of X given Y = y can be calculated as follows:

    Var(X | Y = y) = E[X² | Y = y] − (E[X | Y = y])²
        = ∑_{x ∈ X(Ω)} x² p_{X|Y}(x|y) − ( ∑_{x ∈ X(Ω)} x p_{X|Y}(x|y) )²    (31)
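As an illustration, the conditional pmf, expectation and variance of X given Y = 2 for the pmf of Example 2 can be computed exactly (Python sketch):

    from fractions import Fraction

    p = {(1, 1): Fraction(1, 13), (1, 2): Fraction(4, 13), (2, 2): Fraction(8, 13)}

    y = 2
    p_y = sum(q for (xx, yy), q in p.items() if yy == y)          # p_Y(2) = 12/13
    cond = {xx: q / p_y for (xx, yy), q in p.items() if yy == y}  # {1: 1/3, 2: 2/3}

    E  = sum(xx * q for xx, q in cond.items())                    # E[X | Y=2] = 5/3
    E2 = sum(xx**2 * q for xx, q in cond.items())                 # E[X^2 | Y=2] = 3
    print(E, E2 - E**2)                                           # variance = 2/9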

3.2 Continuous case

Definition 3.4. Let X and Y be two continuous rrvs on probability space (Ω, A, P) with joint pdf f_XY and respective marginal pdfs f_X and f_Y. Then, for y ∈ R, the conditional probability density function of X given Y = y is defined by:

    f_{X|Y}(x|y) = f_XY(x, y) / f_Y(y)    (32)

provided that f_Y(y) > 0.

Definition 3.5 (Conditional Expectation). Let X and Y be two continuous rrvs on probability space (Ω, A, P). Then, for y ∈ R, the conditional expectation of X given Y = y is defined as follows:

    E[X | Y = y] = ∫_R x f_{X|Y}(x|y) dx    (33)

Example. Let X and Y be two continuous random variables with joint probability density function:

    f_XY(x, y) = (3/2) · 1_S(x, y),  with S = {(x, y) | x² < y < 1, 0 < x < 1}

1. Compute the conditional probability density function of Y given X = x for x ∈ (0, 1).

Answer. To compute the conditional pdf, we first need the marginal pdf of X, given, for x ∈ (0, 1), by:

    f_X(x) = ∫_R f_XY(x, y) dy = ∫_{x²}^{1} (3/2) dy = (3/2)(1 − x²)

Hence,

    f_X(x) = (3/2)(1 − x²) · 1_{(0,1)}(x)

Therefore the conditional pdf of Y given X = x is:

    f_{Y|X}(y|x) = f_XY(x, y) / f_X(x)
        = (3/2) 1_{(0,1)}(x) 1_{(x²,1)}(y) / [ (3/2)(1 − x²) 1_{(0,1)}(x) ]
        = [1/(1 − x²)] · 1_{(x²,1)}(y)

for x ∈ (0, 1). Note that we recognize f_{Y|X} as the pdf of a uniform distribution on (x², 1).

2. Find the expected value of Y given X = x for x ∈ (0, 1).

Answer. The conditional expectation is computed as follows: for x ∈ (0, 1),

    E[Y | X = x] = ∫_R y f_{Y|X}(y|x) dy
        = ∫_{x²}^{1} y/(1 − x²) dy
        = [1/(1 − x²)] [y²/2]_{x²}^{1}
        = (1 − x⁴) / (2(1 − x²))
        = (x² + 1)/2

The result should not be that surprising, since the expected value of a uniform distribution is the average of the bounds of the interval.

Definition 3.6 (Conditional Variance). Let X and Y be two continuous rrvs on probability space (Ω, A, P). Then, for y ∈ R, the conditional variance of X given Y = y is defined as follows:

    Var(X | Y = y) = E[(X − E[X | Y = y])² | Y = y] = ∫_R (x − E[X | Y = y])² f_{X|Y}(x|y) dx    (34)

Property 3.5. Let X and Y be two continuous rrvs on probability space (Ω, A, P). Then, for y ∈ R, the conditional variance of X given Y = y can be calculated as follows:

    Var(X | Y = y) = E[X² | Y = y] − (E[X | Y = y])²
        = ∫_R x² f_{X|Y}(x|y) dx − ( ∫_R x f_{X|Y}(x|y) dx )²    (35)
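The conditional expectation just computed, and the matching conditional variance, can be verified symbolically (a sketch assuming SymPy):

    import sympy as sp

    x, y = sp.symbols('x y')
    f_cond = 1 / (1 - x**2)        # f_{Y|X}(y|x): uniform on (x^2, 1)

    E = sp.integrate(y * f_cond, (y, x**2, 1))
    print(sp.simplify(E))          # x**2/2 + 1/2, as computed above

    V = sp.integrate(y**2 * f_cond, (y, x**2, 1)) - E**2
    print(sp.factor(sp.simplify(V)))  # (x - 1)**2*(x + 1)**2/12 = (1 - x**2)**2/12,
                                      # the variance of a uniform on (x**2, 1)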