Chapter 12: Bivariate & Conditional Distributions
James B. Ramsey
March 2007
Introduction

Key relationships between joint, conditional, and marginal distributions.

Joint density:

f_j(X_1, X_2) = f_{2|1}(X_2 | X_1) f_1(X_1) = f_{1|2}(X_1 | X_2) f_2(X_2)

Marginal density:

f_i(X_i) = ∫ f_j(X_1, X_2) dx_j,  j ≠ i

Prob. distributions:

Pr(X_1 ≤ x_0) = ∫^{x_0} f_1(X_1) dx_1 = ∫^{x_0} [ ∫ f_j(X_1, X_2) dx_2 ] dx_1
Cond. prob. distributions:

Pr(X_1 ≤ x_0 | Y = y_0) = ∫^{x_0} f_{1|2}(X_1 | Y = y_0) dx_1

Joint probs.:

Pr(X_1 ≤ x_{1,0}, X_2 ≤ x_{2,0}) = ∫^{x_{1,0}} ∫^{x_{2,0}} f_j(X_1, X_2) dx_2 dx_1

f_1(X_1) = ∫ f_j(X_1, X_2) dx_2 = ∫ f_{1|2}(X_1 | X_2) f_2(X_2) dx_2

Or: the marginal distn. is the weighted sum of the conditional distns., weighted by f_2(X_2).
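The slides contain no code, but the weighted-sum claim is easy to verify numerically. Below is a minimal sketch in which X_2 takes only two values and the conditionals are Gaussian; the two-point distribution for X_2 and the conditional means are made-up illustrative values, not from the text:

```python
import numpy as np

# Marginal as a weighted sum of conditionals:
# f_1(x) = sum_k f_{1|2}(x | X2 = k) * P(X2 = k)

def normal_pdf(x, mu, sigma=1.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)

f2 = {0: 0.3, 1: 0.7}          # hypothetical marginal distn. of X2
cond_mean = {0: -1.0, 1: 2.0}  # hypothetical mean of X1 given X2 = k

x = np.linspace(-12, 14, 26001)
marginal = sum(p * normal_pdf(x, cond_mean[k]) for k, p in f2.items())

dx = x[1] - x[0]
area = marginal.sum() * dx        # the mixture is a genuine density
mean = (x * marginal).sum() * dx  # mean = weighted avg of conditional means
print(round(area, 3), round(mean, 3))  # ≈ 1.0 and 0.3*(-1) + 0.7*2 = 1.1
```

The resulting marginal integrates to one and its mean is the f_2-weighted average of the conditional means, exactly as the identity above requires.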
Association is not Causality

Follows directly from the fact that:

f_j(X_1, X_2) = f_{2|1}(X_2 | X_1) f_1(X_1) = f_{1|2}(X_1 | X_2) f_2(X_2)

Sources of "association":
measuring a flat & a curved plate;
visitors to a real estate agent: buyers and sellers;
individual consumption & income;
individual height & weight;
breakdown of two components of a machine;
compare I.Q. & height;
health & wearing of a top hat.
Buyers & Renters Entering a Real Estate Office

Arrivals: distn. is given by e^{-λ} λ^N / N!; recall the conditions for a Poisson distn.

Given N arrivals, the distn. of Buyers is Binomial:

C(N, B) π^B (1-π)^{N-B},  N = B + R
Replace N by R + B to obtain the joint distn.:

[e^{-λ} λ^{R+B} / (R+B)!] · C(R+B, B) π^B (1-π)^R = [e^{-λ} λ^{R+B} / (R! B!)] π^B (1-π)^R

Is a two-parameter distribution, λ, π;
λ = mean arrivals per hour; π = proportion of Buyers.
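A quick numerical check of this joint mass function, summing over a large grid of (R, B). The parameter values (λ = 4, π = 0.25) and the truncation point are illustrative choices, not from the text. The check also exhibits a known consequence of this "thinning" construction: the number of Buyers B alone is Poisson with mean λπ.

```python
from math import exp, factorial

# Joint distn. of Renters R and Buyers B:
# f(R, B) = e^{-lam} * lam^(R+B) / (R! B!) * pi^B * (1 - pi)^R
def joint_pmf(r, b, lam, pi):
    return exp(-lam) * lam ** (r + b) / (factorial(r) * factorial(b)) \
        * pi ** b * (1 - pi) ** r

lam, pi = 4.0, 0.25  # hypothetical: 4 arrivals/hour, 25% Buyers
N = 60               # truncation point; the tail mass beyond it is negligible

total = sum(joint_pmf(r, b, lam, pi) for r in range(N) for b in range(N))

# Marginal of B (sum over R) evaluated at B = 2, against Poisson(lam*pi):
pB2 = sum(joint_pmf(r, 2, lam, pi) for r in range(N))
poisson_at_2 = exp(-lam * pi) * (lam * pi) ** 2 / factorial(2)
print(round(total, 6), round(pB2 - poisson_at_2, 10))  # ≈ 1.0 and ≈ 0.0
```

Summing e^{-λ} λ^{R+B} π^B (1-π)^R / (R! B!) over R collapses, via the exponential series, to e^{-λπ}(λπ)^B / B!, which is what the second printed difference confirms.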
Derivation of the Bivariate Normal Distn.

Recall the bivariate Gaussian distn. from a pair of independent Gaussian distns.:

φ(X, Y) = φ(X) φ(Y) = [exp{-(1/2)((X-η_x)/σ_x)²} / (√(2π) σ_x)] · [exp{-(1/2)((Y-η_y)/σ_y)²} / (√(2π) σ_y)]
Let η_x, η_y both equal zero and σ_x = σ_y = 1 in order to simplify the algebra:

φ(X, Y) = [exp{-X²/2} / √(2π)] · [exp{-Y²/2} / √(2π)] = exp{-(1/2)[X² + Y²]} / (2π)
But if X, Y are associated, we speculate from the calculation of "r" that for some parameter ρ the quadratic above contains a term like:

exp{-(1/2)[X² + Y² - 2ρXY]}

If correct, integration to 1 implies that:

φ(X, Y) = exp{-[X² + Y² - 2ρXY] / (2(1-ρ²))} / (2π √(1-ρ²))
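A numerical sanity check on the density just written down: for an arbitrary choice of ρ (0.6 below is illustrative) it should integrate to 1 over the plane and give E{XY} = ρ, anticipating the covariance result later in the chapter.

```python
import numpy as np

# Standardized bivariate normal density on a grid; rho = 0.6 is arbitrary.
rho = 0.6
g = np.linspace(-8, 8, 801)      # ±8 standard deviations is ample coverage
X, Y = np.meshgrid(g, g)
phi = np.exp(-(X**2 + Y**2 - 2 * rho * X * Y) / (2 * (1 - rho**2))) \
      / (2 * np.pi * np.sqrt(1 - rho**2))

dA = (g[1] - g[0]) ** 2          # area element of the grid
total = (phi * dA).sum()         # should be ~1
exy = (X * Y * phi * dA).sum()   # should be ~rho
print(round(total, 3), round(exy, 3))  # ≈ 1.0, 0.6
```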
And in general, for non-zero means and non-unitary variances, one has:

φ(X, Y) = exp{-(1/(2(1-ρ²))) [((X-η_x)/σ_x)² + ((Y-η_y)/σ_y)² - 2ρ((X-η_x)/σ_x)((Y-η_y)/σ_y)]} / (2π √(1-ρ²) σ_x σ_y)

If ρ equals zero, we get the joint distn. of a pair of independent variables.

The σ_x, σ_y in the denominator result from transforming from standardized to non-standardized variables, e.g. U, V defined by the transformations:
U = (X - η_x)/σ_x implies: du = (1/σ_x) dx;

And V = (Y - η_y)/σ_y implies: dv = (1/σ_y) dy;

so that the change in "scale" is allowed for in that:

du = (du/dx) dx, and dv = (dv/dy) dy.

We multiply the density in (U, V) by the re-scaling 1/(σ_x σ_y), so that the density integrates to one.

See overheads.
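The re-scaling argument can be checked numerically: with the 1/(σ_x σ_y) factor in place, the general density integrates to one. All parameter values below are arbitrary illustrations.

```python
import numpy as np

# General bivariate normal with the 1/(sigma_x*sigma_y) rescaling factor.
ex, ey, sx, sy, rho = 1.0, -2.0, 2.0, 0.5, 0.4  # illustrative values
g = np.linspace(-10, 10, 601)                   # ±10 sd in standardized units
X, Y = np.meshgrid(ex + sx * g, ey + sy * g)    # grid centered on the means
U, V = (X - ex) / sx, (Y - ey) / sy             # standardize
phi = np.exp(-(U**2 + V**2 - 2 * rho * U * V) / (2 * (1 - rho**2))) \
      / (2 * np.pi * np.sqrt(1 - rho**2) * sx * sy)
dA = (sx * (g[1] - g[0])) * (sy * (g[1] - g[0]))  # area element in (X, Y)
total = (phi * dA).sum()
print(round(total, 3))  # ≈ 1.0
```

Dropping the σ_x σ_y in the denominator would make the sum come out to σ_x σ_y instead of one, which is exactly the point of the Jacobian discussion above.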
The Conditional Normal Density Function

For simplicity let means = 0, variances = 1. The joint distn. is:

φ(X, Y) = exp{-[X² + Y² - 2ρXY] / (2(1-ρ²))} / (2π √(1-ρ²))
which can be rewritten as the product of a conditional & a marginal distn.:

φ(X, Y) = φ(Y|X) φ(X) = [exp{-X²/(2(1-ρ²))} / √(2π)] · [exp{-[Y² - 2ρXY]/(2(1-ρ²))} / (√(2π) √(1-ρ²))]
In [Y² - 2ρXY], complete the square by adding & subtracting ρ²X²; this yields:

[Y² - 2ρXY + ρ²X²] = [Y - ρX]²

and

exp{-X²/(2(1-ρ²))} · exp{ρ²X²/(2(1-ρ²))} = exp{-(X² - ρ²X²)/(2(1-ρ²))} = exp{-X²/2}
φ(X, Y) = {exp{-[Y - ρX]²/(2(1-ρ²))} / (√(2π) √(1-ρ²))} · {exp{-X²/2} / √(2π)}

The conditional distn. is Gaussian with conditional mean ρX and variance (1-ρ²).
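This factorization doubles as a simulation recipe, which we can use to check the stated conditional mean and variance: draw X ~ N(0,1), then Y | X ~ N(ρX, 1-ρ²). The choice ρ = 0.8 and the conditioning window around X = 1 are arbitrary.

```python
import numpy as np

# Simulate the standardized bivariate normal via phi(Y|X) * phi(X).
rng = np.random.default_rng(0)
rho, n = 0.8, 200_000
X = rng.standard_normal(n)
Y = rho * X + np.sqrt(1 - rho**2) * rng.standard_normal(n)

near_1 = Y[np.abs(X - 1.0) < 0.05]  # observations with X close to 1
print(round(np.corrcoef(X, Y)[0, 1], 2))  # ≈ rho = 0.8
print(round(near_1.mean(), 2))            # ≈ conditional mean rho*1 = 0.8
print(round(near_1.var(), 2))             # ≈ conditional var 1 - rho^2 = 0.36
```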
The General Conditional Distribution

If η_x, η_y are non-zero and σ_x, σ_y are non-unitary, then one can show:

σ²_{Y|x_0} = (1-ρ²) σ²_y ;  σ²_{X|y_0} = (1-ρ²) σ²_x
Most important is the conditional mean:

E{Y | X = x_0} = η_y + ρ(σ_y/σ_x)(x_0 - η_x) = [η_y - ρ(σ_y/σ_x)η_x] + ρ(σ_y/σ_x) x_0 = α + β x_0

which is a linear relationship between Y and X; cf. Chapter 5.

Recall that the conditional mean of Y|X is the mean of Y w.r.t. the conditional distribution, f(Y|X).
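Since the regression function here is exactly linear, a least-squares fit on a large bivariate normal sample should recover α and β. A sketch, with all parameter values (η_x, η_y, σ_x, σ_y, ρ) chosen purely for illustration:

```python
import numpy as np

# Illustrative parameters for the general bivariate normal.
eta_x, eta_y, sigma_x, sigma_y, rho = 2.0, 5.0, 1.5, 3.0, 0.6

beta = rho * sigma_y / sigma_x   # slope  = rho * sigma_y / sigma_x
alpha = eta_y - beta * eta_x     # intercept = eta_y - beta * eta_x

# Simulate (X, Y) and fit a straight line by least squares.
rng = np.random.default_rng(1)
n = 200_000
X = eta_x + sigma_x * rng.standard_normal(n)
Y = alpha + beta * X + sigma_y * np.sqrt(1 - rho**2) * rng.standard_normal(n)
b, a = np.polyfit(X, Y, 1)       # returns (slope, intercept)
print(round(alpha, 3), round(beta, 3), round(a, 2), round(b, 2))
```

The fitted intercept and slope land on α = η_y − β η_x and β = ρ σ_y/σ_x up to sampling noise, previewing the link to regression in Chapter 5.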
Moments of Bivariate Distributions

Because f_x(X) = ∫ f_j(X, Y) dy and f_y(Y) = ∫ f_j(X, Y) dx, the univariate moments are all defined as before.

The theoretical analogue to the sample covariance is:

μ_{1,1}(X, Y) = E{(X - η_x)(Y - η_y)} = ∫∫ (X - η_x)(Y - η_y) f(X, Y) dx dy = ∫∫ XY f(X, Y) dx dy - η_x η_y
If X, Y are independently distributed, E{(X - η_x)(Y - η_y)} = 0.

μ_{1,1}(X, Y) = σ_{x,y} = E{(X - η_x)(Y - η_y)}

is the theoretical covariance.
If X and Y are not independent, but jointly Gaussian, with η_x = η_y = 0 and σ_x = σ_y = 1 for convenience, then:

E{XY} = ∫_x ∫_y XY φ(X, Y) dx dy = ∫_x ∫_y XY φ(Y|X) φ(X) dx dy
= ∫_x X [ ∫_{y|x} Y · exp{-[Y - ρX]²/(2(1-ρ²))} / (√(2π) √(1-ρ²)) dy ] · exp{-X²/2}/√(2π) dx

= ∫_x X {ρX} · exp{-X²/2}/√(2π) dx = E{ρX²} = ρ
As the variance of X is unity by assumption, ρ measures the degree of linear association.

If σ_x and σ_y are non-unitary:

E{XY} = ρ σ_x σ_y

is the covariance {units of X times units of Y}, and ρ, the correlation coefficient, is dimensionless.
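A short simulation makes the units point concrete: rescaling the variables rescales the covariance by σ_x σ_y but leaves the dimensionless ρ untouched. The values ρ = 0.5, σ_x = 2, σ_y = 4 are arbitrary.

```python
import numpy as np

# Zero-mean correlated Gaussians with non-unit standard deviations.
rng = np.random.default_rng(2)
rho, sx, sy, n = 0.5, 2.0, 4.0, 500_000
Z1 = rng.standard_normal(n)
Z2 = rng.standard_normal(n)
X = sx * Z1
Y = sy * (rho * Z1 + np.sqrt(1 - rho**2) * Z2)

print(round(np.mean(X * Y), 1))           # ≈ rho*sx*sy = 4.0 (covariance)
print(round(np.corrcoef(X, Y)[0, 1], 2))  # ≈ 0.5 (dimensionless)
```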
The Sampling of Joint & Conditional Distributions

Chapter 9 discussed the sampling of univariate distributions and explored the use of simple random sampling at length.

Sampling for a joint distn. is similar: collect random samples of individuals & measure the joint observations; e.g. sample individuals and measure income & consumption, or height & weight.
Sampling for conditional distributions is different. One can sample for height given weight, or weight given height.

This can be achieved by sampling heights & for each height sampling weights; or sampling weights & for each weight sampling heights.

If using natural experiments, be sure which conditional distn. is being sampled.
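The two-stage scheme just described can be sketched in code. The linear conditional-mean model for weight given height, and all parameter values, are hypothetical; the point is only the order of the two sampling stages.

```python
import numpy as np

# Stage 1: sample heights; Stage 2: for each height, sample a weight
# from the conditional distn. f(weight | height).
rng = np.random.default_rng(3)
heights = rng.normal(170, 10, size=1000)     # heights in cm (made-up distn.)
weights = rng.normal(0.9 * heights - 80, 8)  # weight | height (made-up model)

# This draws from f(weight | height) * f(height), i.e. the joint distn.,
# and so answers questions about weight given height -- not the reverse.
print(weights.shape)
```

Swapping the stages (sample weights first, then heights given weight) samples the other conditional, which is exactly the distinction the slide warns about for natural experiments.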
Consider some examples:
Sampling I.Q. and heights;
Sampling electrical output & fuel inputs;
Incomes & consumption.
End of Chapter 12.