Conditional distributions. Conditional expectation and conditional variance with respect to a variable

Probability Theory and Stochastic Processes, summer semester 07/08, 8.04.08

Conditional distributions

Conditional distributions of discrete random variables

Recall that if both $X$ and $Y$ are discrete random variables and $A = \{x_1, x_2, \dots\}$, $B = \{y_1, y_2, \dots\}$ are such that $P(X \in A) = 1$ and $P(Y \in B) = 1$, then we define the joint probability mass function of $X$ and $Y$ as
$$A \times B \ni (x, y) \mapsto P(X = x \text{ and } Y = y) \in [0, 1].$$
Assuming that $P(Y = y_j) > 0$ we define the conditional distribution of $X$ given $Y = y_j$ as the distribution with the following probability mass function, called the conditional probability mass function of $X$ given $Y = y_j$:
$$P(X = x_i \mid Y = y_j) = \frac{P(X = x_i \text{ and } Y = y_j)}{P(Y = y_j)}.$$
Similarly, assuming that $P(X = x_i) > 0$ we define the conditional distribution of $Y$ given $X = x_i$ as the distribution with the following probability mass function, called the conditional probability mass function of $Y$ given $X = x_i$:
$$P(Y = y_j \mid X = x_i) = \frac{P(X = x_i \text{ and } Y = y_j)}{P(X = x_i)}.$$

Remark 1. If $X$ and $Y$ are independent then $P(X = x_i \text{ and } Y = y_j) = P(X = x_i)\,P(Y = y_j)$, and an easy calculation yields
$$P(X = x_i \mid Y = y_j) = P(X = x_i) \quad \text{and} \quad P(Y = y_j \mid X = x_i) = P(Y = y_j).$$

Example 1. A hen lays $N$ eggs, where $N$ has the Poisson distribution with parameter $\lambda$. Each egg hatches with probability $p$, independently of the other eggs. Let $K$ be the number of chicks. Find the joint probability mass function of $N$ and $K$.
SOLUTION. Since each egg hatches with probability $p$ independently of the other eggs, the conditional distribution of $K$ given $N = n$, $n \ge 1$, is the binomial distribution with parameters $n$ and $p$, and naturally $P(K = 0 \mid N = 0) = 1$. So for $n = 0, 1, 2, \dots$ and $k = 0, 1, \dots, n$ we have
$$P(N = n \text{ and } K = k) = P(K = k \mid N = n)\,P(N = n) = \binom{n}{k} p^k (1-p)^{n-k}\, e^{-\lambda} \frac{\lambda^n}{n!} = e^{-\lambda} \frac{\lambda^n}{k!\,(n-k)!}\, p^k (1-p)^{n-k} = e^{-\lambda}\, \frac{(\lambda p)^k}{k!} \cdot \frac{(\lambda (1-p))^{n-k}}{(n-k)!},$$
and $P(N = n \text{ and } K = k) = 0$ otherwise.

Conditional distributions of continuous random variables

Let $X$ and $Y$ have the joint probability density function $f_{X,Y}$ and let $f_X$ and $f_Y$ be the marginal densities of $X$ and $Y$ respectively. Assuming that $f_Y(y) > 0$ we define the conditional distribution of $X$ given $Y = y$ as the distribution with the following density function, called the conditional probability density function of $X$ given $Y = y$:
$$f_{X \mid Y = y}(x) = \frac{f_{X,Y}(x, y)}{f_Y(y)}.$$
Similarly, assuming that $f_X(x) > 0$ we define the conditional distribution of $Y$ given $X = x$ as the distribution with the following density function, called the conditional probability density function of $Y$ given $X = x$:
$$f_{Y \mid X = x}(y) = \frac{f_{X,Y}(x, y)}{f_X(x)}.$$

Remark 2. If $X$ and $Y$ are independent then $f_{X,Y}(x, y) = f_X(x) f_Y(y)$, and an easy calculation yields $f_{X \mid Y = y}(x) = f_X(x)$ and $f_{Y \mid X = x}(y) = f_Y(y)$.
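Note that the product form obtained in Example 1 factorizes in $k$ and $n - k$, which shows that $K$ and $N - K$ are independent Poisson variables with parameters $\lambda p$ and $\lambda(1-p)$. As an illustration (this sketch is an addition to the notes and assumes NumPy is available), the joint formula can be checked by simulation:

```python
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(0)
lam, p, trials = 3.0, 0.4, 200_000

# Simulate the hen: N ~ Poisson(lam); given N = n each egg hatches
# independently with probability p, so K | N = n ~ Bin(n, p).
N = rng.poisson(lam, size=trials)
K = rng.binomial(N, p)

def joint_pmf(n, k):
    # P(N = n and K = k) = e^{-lam} (lam p)^k / k! * (lam(1-p))^{n-k} / (n-k)!
    return (exp(-lam) * (lam * p) ** k / factorial(k)
            * (lam * (1 - p)) ** (n - k) / factorial(n - k))

for n, k in [(0, 0), (2, 1), (4, 2), (5, 5)]:
    empirical = np.mean((N == n) & (K == k))
    print(f"P(N={n}, K={k}): simulated {empirical:.4f}, formula {joint_pmf(n, k):.4f}")
```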
Example 2. Let $(X, Y)$ have a bivariate normal distribution with the density
$$f_{X,Y}(x, y) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \frac{(x-\mu_1)^2}{\sigma_1^2} - 2\rho\, \frac{(x-\mu_1)(y-\mu_2)}{\sigma_1 \sigma_2} + \frac{(y-\mu_2)^2}{\sigma_2^2} \right] \right\},$$
where $\sigma_1, \sigma_2 > 0$, $\mu_1, \mu_2 \in \mathbb{R}$ and $\rho \in (-1, 1)$. Calculate the conditional probability density function of $Y$ given $X = x$.

SOLUTION. To calculate $f_{Y \mid X = x}(y)$ let us recall that the marginal distribution of $X$ is $N(\mu_1, \sigma_1^2)$, so
$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\left\{ -\frac{(x-\mu_1)^2}{2\sigma_1^2} \right\}.$$
Now we have
$$f_{Y \mid X = x}(y) = \frac{f_{X,Y}(x, y)}{f_X(x)} = \frac{1}{\sqrt{2\pi}\,\sigma_2 \sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \frac{(x-\mu_1)^2}{\sigma_1^2} - 2\rho\, \frac{(x-\mu_1)(y-\mu_2)}{\sigma_1 \sigma_2} + \frac{(y-\mu_2)^2}{\sigma_2^2} \right] + \frac{(x-\mu_1)^2}{2\sigma_1^2} \right\}.$$
We simplify the expression in the exponent,
$$\frac{(x-\mu_1)^2}{\sigma_1^2} - 2\rho\, \frac{(x-\mu_1)(y-\mu_2)}{\sigma_1 \sigma_2} + \frac{(y-\mu_2)^2}{\sigma_2^2} - (1-\rho^2)\,\frac{(x-\mu_1)^2}{\sigma_1^2} = \rho^2\, \frac{(x-\mu_1)^2}{\sigma_1^2} - 2\rho\, \frac{(x-\mu_1)(y-\mu_2)}{\sigma_1 \sigma_2} + \frac{(y-\mu_2)^2}{\sigma_2^2} = \left( \frac{y-\mu_2}{\sigma_2} - \rho\, \frac{x-\mu_1}{\sigma_1} \right)^2 = \frac{1}{\sigma_2^2} \left( y - \mu_2 - \rho\, \frac{\sigma_2}{\sigma_1}\,(x-\mu_1) \right)^2,$$
and get
$$f_{Y \mid X = x}(y) = \frac{1}{\sqrt{2\pi}\,\sigma_2 \sqrt{1-\rho^2}} \exp\left\{ -\frac{\left( y - \mu_2 - \rho\, \frac{\sigma_2}{\sigma_1}\,(x-\mu_1) \right)^2}{2(1-\rho^2)\,\sigma_2^2} \right\}.$$
Thus we recognize that
$$Y \mid X = x \ \sim\ N\!\left( \mu_2 + \rho\, \frac{\sigma_2}{\sigma_1}\,(x-\mu_1),\ (1-\rho^2)\,\sigma_2^2 \right).$$
The knowledge of $X$ reduces the variance of $Y$ by $\rho^2 \sigma_2^2$.
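As a numerical illustration (again an addition to the notes; the construction of the pair is the standard one and the slab width is an arbitrary choice of ours), one can approximate the conditioning event $\{X = x\}$ by a thin slab and compare with the formulas just derived:

```python
import numpy as np

rng = np.random.default_rng(1)
mu1, mu2, s1, s2, rho = 1.0, -2.0, 2.0, 1.5, 0.7
n = 1_000_000

# Standard construction of a bivariate normal pair with the density above:
# X = mu1 + s1*Z1,  Y = mu2 + s2*(rho*Z1 + sqrt(1 - rho^2)*Z2).
Z1 = rng.standard_normal(n)
Z2 = rng.standard_normal(n)
X = mu1 + s1 * Z1
Y = mu2 + s2 * (rho * Z1 + np.sqrt(1 - rho**2) * Z2)

x = 2.0
slab = np.abs(X - x) < 0.05      # thin slab approximating {X = x}

print("E(Y | X = x):   simulated", Y[slab].mean(),
      " theory", mu2 + rho * s2 / s1 * (x - mu1))
print("Var(Y | X = x): simulated", Y[slab].var(),
      " theory", (1 - rho**2) * s2**2)
```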
Mixed case

Sometimes we will deal with the case when $X$ is continuous and $Y$ is discrete. Let $B = \{y_1, y_2, \dots\}$ be such that $P(Y \in B) = 1$. We assume that we are given functions $f_j : \mathbb{R} \to [0, +\infty)$, $j = 1, 2, \dots$, such that
$$P(X \in dx \text{ and } Y = y_j) = f_j(x)\,dx,$$
which means that for any $-\infty < a < b < +\infty$ we have
$$P(X \in [a, b] \text{ and } Y = y_j) = \int_a^b f_j(x)\,dx.$$
The marginal probability density function of $X$ reads as
$$f_X(x) = \sum_{j=1}^{+\infty} f_j(x),$$
while the marginal probability mass function of $Y$ is given by
$$P(Y = y_j) = \int_{-\infty}^{+\infty} f_j(x)\,dx.$$
The conditional probability density function of $X$ given $Y = y_j$ reads as
$$f_{X \mid Y = y_j}(x) = \frac{f_j(x)}{P(Y = y_j)} = \frac{f_j(x)}{\int_{-\infty}^{+\infty} f_j(z)\,dz},$$
while the conditional probability mass function of $Y$ given $X = x$, for $x$ such that $f_X(x) > 0$, is given by
$$P(Y = y_j \mid X = x) = \frac{f_j(x)}{f_X(x)} = \frac{f_j(x)}{\sum_{k=1}^{+\infty} f_k(x)}.$$
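To make the mixed-case formulas concrete, here is a minimal sketch with a hypothetical two-component example (the values and parameters are chosen for illustration and do not come from the notes): $Y$ takes two values with probabilities $0.3$ and $0.7$, and given $Y = y_j$ the variable $X$ is normal with mean $m_j$ and unit variance, so that $f_j(x) = P(Y = y_j)\,\varphi(x - m_j)$.

```python
import numpy as np

priors = np.array([0.3, 0.7])   # P(Y = y_1), P(Y = y_2)  (hypothetical values)
means = np.array([0.0, 3.0])    # mean of X given Y = y_j, standard deviation 1

def f(x):
    # the vector (f_1(x), f_2(x)); each f_j integrates to P(Y = y_j)
    return priors * np.exp(-(x - means) ** 2 / 2) / np.sqrt(2 * np.pi)

x = 1.0
fj = f(x)
print("f_X(x) = sum_j f_j(x):", fj.sum())       # marginal density of X at x
print("P(Y = y_j | X = x)   :", fj / fj.sum())  # conditional pmf of Y given X = x
```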
Example 3. Let $\Lambda$ be a continuous random variable with the Erlang$(k, \alpha)$ distribution ($k = 1, 2, \dots$, $\alpha > 0$), which means that it has the density
$$f_\Lambda(\lambda) = \begin{cases} \dfrac{\alpha^k}{(k-1)!}\, \lambda^{k-1} e^{-\alpha \lambda} & \text{if } \lambda \ge 0; \\ 0 & \text{if } \lambda < 0. \end{cases}$$
The conditional distribution of the variable $N$ given $\Lambda = \lambda > 0$ is the $\mathrm{Poi}(\lambda)$ distribution. Find the unconditional (marginal) distribution of $N$.

SOLUTION. Since $N \mid \Lambda = \lambda \sim \mathrm{Poi}(\lambda)$ we have
$$P(N = n \mid \Lambda = \lambda) = e^{-\lambda}\, \frac{\lambda^n}{n!}$$
and
$$P(N = n \text{ and } \Lambda \in d\lambda) = P(N = n \mid \Lambda = \lambda)\, f_\Lambda(\lambda)\, d\lambda.$$
The marginal distribution of $N$ reads as
$$P(N = n) = \int_0^{+\infty} e^{-\lambda}\, \frac{\lambda^n}{n!} \cdot \frac{\alpha^k}{(k-1)!}\, \lambda^{k-1} e^{-\alpha \lambda}\, d\lambda = \frac{\alpha^k}{n!\,(k-1)!} \int_0^{+\infty} \lambda^{n+k-1} e^{-(\alpha+1)\lambda}\, d\lambda.$$
Since
$$f(\lambda) = \begin{cases} \dfrac{(\alpha+1)^{n+k}}{(n+k-1)!}\, \lambda^{n+k-1} e^{-(\alpha+1)\lambda} & \text{if } \lambda \ge 0; \\ 0 & \text{if } \lambda < 0 \end{cases}$$
is the density of the Erlang$(n+k, \alpha+1)$ distribution, we get
$$\int_0^{+\infty} \lambda^{n+k-1} e^{-(\alpha+1)\lambda}\, d\lambda = \frac{(n+k-1)!}{(\alpha+1)^{n+k}}$$
and finally
$$P(N = n) = \frac{\alpha^k}{n!\,(k-1)!} \cdot \frac{(n+k-1)!}{(\alpha+1)^{n+k}} = \binom{n+k-1}{k-1} \left( \frac{\alpha}{\alpha+1} \right)^{k} \left( \frac{1}{\alpha+1} \right)^{n}.$$
This is the so-called negative binomial distribution with parameters $k$ and $1/(\alpha+1)$.
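A quick simulation check of this result (an addition to the notes, assuming NumPy; recall that Erlang$(k, \alpha)$ is the Gamma distribution with shape $k$ and scale $1/\alpha$):

```python
import numpy as np
from math import comb

rng = np.random.default_rng(2)
k, alpha, trials = 3, 2.0, 500_000

# Lambda ~ Erlang(k, alpha) = Gamma(shape=k, scale=1/alpha);
# then N | Lambda = lambda ~ Poi(lambda).
Lam = rng.gamma(shape=k, scale=1 / alpha, size=trials)
N = rng.poisson(Lam)

def pmf(n):
    # P(N = n) = C(n+k-1, k-1) (alpha/(alpha+1))^k (1/(alpha+1))^n
    return comb(n + k - 1, k - 1) * (alpha / (alpha + 1)) ** k / (alpha + 1) ** n

for n in range(5):
    print(f"P(N={n}): simulated {np.mean(N == n):.4f}, formula {pmf(n):.4f}")
```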
Conditional expectation

Sometimes we are not interested in the conditional distribution of one variable given another variable, but in some characteristic of this conditional distribution, for example in its expectation, which may be much easier to estimate.

Definition 1. The conditional expectation $E(Y \mid X = x)$ is the expectation of the conditional distribution of $Y$ given $X = x$, if it is well defined.

EXAMPLES
1. In Example 1 we have $K \mid N = n \sim \mathrm{Bin}(n, p)$, thus $E(K \mid N = n) = np$.
2. In Example 2 we have $Y \mid X = x \sim N\left( \mu_2 + \rho\, \frac{\sigma_2}{\sigma_1}(x - \mu_1),\ (1-\rho^2)\sigma_2^2 \right)$, thus $E(Y \mid X = x) = \mu_2 + \rho\, \frac{\sigma_2}{\sigma_1}(x - \mu_1)$.
3. In Example 3 we have $N \mid \Lambda = \lambda \sim \mathrm{Poi}(\lambda)$, thus $E(N \mid \Lambda = \lambda) = \lambda$.

The conditional expectation $E(Y \mid X = x)$ is always a function of $x$. Let us define $E : \mathbb{R} \to \mathbb{R}$,
$$E(x) = \begin{cases} E(Y \mid X = x) & \text{if } E(Y \mid X = x) \text{ is well defined;} \\ 0 & \text{otherwise.} \end{cases} \qquad (1)$$
Now we introduce the conditional expectation of $Y$ given $X$.

Definition 2. For two real variables $X$ and $Y$ let $E : \mathbb{R} \to \mathbb{R}$ be defined as in (1). The conditional expectation $E(Y \mid X)$ is the random variable defined as
$$E(Y \mid X) := E(X). \qquad (2)$$

EXAMPLES
1. In Example 1 we have $E(K \mid N = n) = np$, thus $E(K \mid N) = pN$.
2. In Example 2 we have $E(Y \mid X = x) = \mu_2 + \rho\, \frac{\sigma_2}{\sigma_1}(x - \mu_1)$, thus $E(Y \mid X) = \mu_2 + \rho\, \frac{\sigma_2}{\sigma_1}(X - \mu_1)$.
3. In Example 3 we have $E(N \mid \Lambda = \lambda) = \lambda$, thus $E(N \mid \Lambda) = \Lambda$.

Properties of $E(Y \mid X)$

The conditional expectation $E(Y \mid X)$ has the following properties:
1. $E(Y \mid X)$ is a random variable, which may be expressed as a function of the variable $X$: $E(Y \mid X) = E(X)$.
2. If $EY$ is well defined then $E[E(Y \mid X)]$ is also well defined and $E[E(Y \mid X)] = EY$.
3. If $Y$ is some function of $X$, $Y = F(X)$, then $E(Y \mid X) = F(X)$.
4. If $Y$ is independent of $X$ and $EY$ is well defined, then $E(Y \mid X) = EY$.
5. If $Z$ is some function of $X$ then $E(ZY \mid X) = Z\,E(Y \mid X)$.
6. (Linearity) If $\beta_1$ and $\beta_2$ are some real numbers then $E(\beta_1 Y_1 + \beta_2 Y_2 \mid X) = \beta_1 E(Y_1 \mid X) + \beta_2 E(Y_2 \mid X)$.

We will prove 2 in the case when $X$ and $Y$ have a joint probability density function. Let $I_0$ be the interval (or a more complicated set) of those $x$ for which $f_X(x) = 0$, and $I_1$ the set of those $x$ for which $f_X(x) > 0$. We write
$$EY = \int_{-\infty}^{+\infty} y\, f_Y(y)\, dy = \int_{-\infty}^{+\infty} y \left( \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\, dx \right) dy = \int_{I_0} \left( \int_{-\infty}^{+\infty} y\, f_{X,Y}(x, y)\, dy \right) dx + \int_{I_1} \left( \int_{-\infty}^{+\infty} y\, f_{X,Y}(x, y)\, dy \right) dx$$
$$= 0 + \int_{I_1} \left( \int_{-\infty}^{+\infty} y\, \frac{f_{X,Y}(x, y)}{f_X(x)}\, dy \right) f_X(x)\, dx = \int_{I_1} E(Y \mid X = x)\, f_X(x)\, dx = \int_{-\infty}^{+\infty} E(x)\, f_X(x)\, dx = E\,E(X) = E[E(Y \mid X)],$$
where the integral over $I_0$ vanishes because for $x \in I_0$ we have $f_{X,Y}(x, y) = 0$ for almost all $y$.
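Property 2 is easy to observe numerically; here is a minimal sketch using Example 1, where $E(K \mid N) = pN$ (an addition to the notes, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(3)
lam, p, trials = 3.0, 0.4, 1_000_000

# Example 1: N ~ Poisson(lam), K | N ~ Bin(N, p), so E(K | N) = p * N.
N = rng.poisson(lam, size=trials)
K = rng.binomial(N, p)

print("E(K) directly          :", K.mean())
print("E(E(K | N)) = E(p N)   :", (p * N).mean())
print("exact value p * lambda :", p * lam)
```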
Decomposition of the variance

Let $F : \mathbb{R} \to \mathbb{R}$ and let $X$, $Y$ be two real random variables. We have the following important identity:
$$E(Y - F(X))^2 = E\{(Y - E(Y \mid X))^2\} + E\{(E(Y \mid X) - F(X))^2\}. \qquad (3)$$
To prove (3) we write
$$E(Y - F(X))^2 = E\{(Y - E(Y \mid X)) + (E(Y \mid X) - F(X))\}^2 = E(Y - E(Y \mid X))^2 + 2\,E\{(Y - E(Y \mid X))(E(Y \mid X) - F(X))\} + E(E(Y \mid X) - F(X))^2. \qquad (4)$$
We will prove that $E\{(Y - E(Y \mid X))(E(Y \mid X) - F(X))\} = 0$. Since $E(Y \mid X) - F(X)$ is some function of $X$, we may write $E(Y \mid X) - F(X) = G(X)$, and from properties 1-3 and 5-6 of the conditional expectation we get
$$E\{(Y - E(Y \mid X))\,G(X)\} = E\big[ E\{(Y - E(Y \mid X))\,G(X) \mid X\} \big] = E\{G(X)\, E(Y - E(Y \mid X) \mid X)\} = E\{G(X)\,[E(Y \mid X) - E(E(Y \mid X) \mid X)]\} = E\{G(X)\,[E(Y \mid X) - E(Y \mid X)]\} = E\{G(X) \cdot 0\} = 0.$$
This calculation and (4) give (3).

From (3) it follows that
$$E(Y - F(X))^2 = E\{(Y - E(Y \mid X))^2\} + E\{(E(Y \mid X) - F(X))^2\} \ \ge\ E(Y - E(Y \mid X))^2,$$
and the equality is attained iff $E(E(Y \mid X) - F(X))^2 = 0$, which holds iff $P(F(X) = E(Y \mid X)) = 1$. This may be interpreted as follows: the conditional expectation $E(Y \mid X)$ is the best predictor of $Y$ among functions of $X$, when the error is measured by the mean square error $E(Y - F(X))^2$.

Example 4. If $X$ and $Y$ have the bivariate normal distribution as in Example 2, then the best predictor of $Y$ given $X$, in the sense of the mean square error, is
$$\hat{Y} = E(Y \mid X) = \mu_2 + \rho\, \frac{\sigma_2}{\sigma_1}\,(X - \mu_1).$$
Setting $F(X) = EY$ in (3) we get
$$E(Y - EY)^2 = E\{(Y - E(Y \mid X))^2\} + E\{(E(Y \mid X) - EY)^2\}. \qquad (5)$$
Denoting
$$D^2(Y \mid X) = E\{(Y - E(Y \mid X))^2 \mid X\}$$
and
$$D^2(E(Y \mid X)) = E\{(E(Y \mid X) - E[E(Y \mid X)])^2\} = E\{(E(Y \mid X) - EY)^2\},$$
we get the following decomposition of the variance of $Y$:
$$\sigma_Y^2 = E(Y - EY)^2 = E[D^2(Y \mid X)] + D^2(E(Y \mid X)).$$
The quantity $D^2(Y \mid X)$ is called the conditional variance of $Y$ given $X$.
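The decomposition can again be checked on Example 1, where $E(K \mid N) = pN$ and $D^2(K \mid N) = Np(1-p)$ (the binomial variance), so that $E[D^2(K \mid N)] + D^2(E(K \mid N)) = \lambda p(1-p) + p^2\lambda = \lambda p = \mathrm{Var}(K)$. A minimal sketch (an addition to the notes, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(4)
lam, p, trials = 3.0, 0.4, 1_000_000

# Example 1: N ~ Poisson(lam), K | N ~ Bin(N, p).
N = rng.poisson(lam, size=trials)
K = rng.binomial(N, p)

e_cond_var = (N * p * (1 - p)).mean()   # E[D^2(K | N)]
var_cond_mean = (p * N).var()           # D^2(E(K | N)) = Var(p N)

print("Var(K) directly           :", K.var())
print("E[D^2(K|N)] + D^2(E(K|N)) :", e_cond_var + var_cond_mean)
print("exact value lam * p       :", lam * p)
```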