Joint distribution

To specify the joint distribution of n random variables X_1, ..., X_n that take values in the sample spaces E_1, ..., E_n we need a probability measure, P, on

E_1 × ... × E_n = {(x_1, ..., x_n) | x_i ∈ E_i, i = 1, ..., n}.

For A ⊆ E_1 × ... × E_n,

P((X_1, ..., X_n) ∈ A) = P(A),

and for A_1 ⊆ E_1, ..., A_n ⊆ E_n,

P(X_1 ∈ A_1, ..., X_n ∈ A_n) = P(A_1 × ... × A_n).

We say that P is the joint distribution of the bundled variable X = (X_1, ..., X_n).
Marginal distributions

If P is the joint distribution of X = (X_1, ..., X_n) we can get the marginal distribution of X_i as

P_i(A) = P(X_i ∈ A) = P(E_1 × ... × E_{i-1} × A × E_{i+1} × ... × E_n)

for A ⊆ E_i.

Specification of the marginal distributions alone is not enough to specify the joint distribution; we also need to specify how the variables we consider are related.

Independence

Definition: We say that X_1, ..., X_n are independent if

P(X_1 ∈ A_1, ..., X_n ∈ A_n) = P(X_1 ∈ A_1) · ... · P(X_n ∈ A_n).   (1)

If we specify the marginal distributions of X_1, ..., X_n and say that the variables are independent, then we have specified the joint distribution by equation (1).

Transformations

Theorem: If X_1 and X_2 are independent random variables taking values in E_1 and E_2 respectively, and if h_1 : E_1 → E_1′ and h_2 : E_2 → E_2′ are two transformations, then the random variables h_1(X_1) and h_2(X_2) are independent.

Marginal transformations preserve independence.
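Equation (1) can be turned into a small computation: given marginal point probabilities for two discrete variables, the independent joint distribution is the product of the marginals. A minimal sketch with made-up (hypothetical) marginals:

```python
# Hypothetical marginal point probabilities for two independent discrete variables.
p1 = {"H": 0.5, "T": 0.5}   # marginal of X1
p2 = {0: 0.2, 1: 0.8}       # marginal of X2

# Under independence, equation (1) gives the joint point probabilities
# as products of the marginal point probabilities.
joint = {(a, b): p1[a] * p2[b] for a in p1 for b in p2}

# The product measure is again a probability measure.
print(abs(sum(joint.values()) - 1) < 1e-12)  # True
```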
Discrete sample spaces

If E_1, ..., E_n are discrete sample spaces so is E = E_1 × ... × E_n, and the joint distribution is given in terms of point probabilities p(x_1, ..., x_n), x_i ∈ E_i, i = 1, ..., n.

The marginal distribution of X_i has point probabilities

p_i(x_i) = P(X_i = x_i) = Σ_{x_1,...,x_{i-1},x_{i+1},...,x_n} p(x_1, ..., x_{i-1}, x_i, x_{i+1}, ..., x_n)

for x_i ∈ E_i.

Example

Consider E = E_0 × E_0 = {A, C, G, T} × {A, C, G, T}, and let X and Y denote random variables representing two evolutionarily related nucleic acids in a DNA sequence. Let the joint distribution of X and Y have point probabilities

        A       C       G       T
A    0.1272  0.0063  0.0464  0.0051
C    0.0196  0.2008  0.0082  0.0726
G    0.0556  0.0145  0.2151  0.0071
T    0.0146  0.0685  0.0069  0.1315

Let A = {(x, y) ∈ E | x = y} denote the event that the two related nucleic acids are identical; then

P(X = Y) = P(A) = 0.1272 + 0.2008 + 0.2151 + 0.1315 = 0.6746.
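The marginal point probabilities and P(X = Y) in the example can be checked by direct summation; a sketch with the joint table hard-coded:

```python
nucleotides = ["A", "C", "G", "T"]
# Joint point probabilities p(x, y) from the example table.
joint = {
    ("A", "A"): 0.1272, ("A", "C"): 0.0063, ("A", "G"): 0.0464, ("A", "T"): 0.0051,
    ("C", "A"): 0.0196, ("C", "C"): 0.2008, ("C", "G"): 0.0082, ("C", "T"): 0.0726,
    ("G", "A"): 0.0556, ("G", "C"): 0.0145, ("G", "G"): 0.2151, ("G", "T"): 0.0071,
    ("T", "A"): 0.0146, ("T", "C"): 0.0685, ("T", "G"): 0.0069, ("T", "T"): 0.1315,
}

# Marginal point probabilities: sum out the other coordinate.
p_X = {x: sum(joint[(x, y)] for y in nucleotides) for x in nucleotides}
p_Y = {y: sum(joint[(x, y)] for x in nucleotides) for y in nucleotides}

# Probability that the two nucleic acids are identical (sum of the diagonal).
p_identical = sum(joint[(z, z)] for z in nucleotides)
print(round(p_identical, 4))  # 0.6746
```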
Independence and point probabilities

Theorem: If X_1, ..., X_n are random variables with values in discrete sample spaces then they are independent if and only if

P(X_1 = x_1, ..., X_n = x_n) = P(X_1 = x_1) · ... · P(X_n = x_n).

In words, the random variables are independent if and only if the point probabilities for their joint distribution factorize as a product of the point probabilities for their marginal distributions.

Example

X\Y     A       C       G       T
A    0.1272  0.0063  0.0464  0.0051   0.1850
C    0.0196  0.2008  0.0082  0.0726   0.3012
G    0.0556  0.0145  0.2151  0.0071   0.2923
T    0.0146  0.0685  0.0069  0.1315   0.2215
     0.2170  0.2901  0.2766  0.2163

Same example as above but with the point probabilities for the marginal distributions appended. Note that X and Y are not independent! For instance

0.1272 = P((X, Y) = (A, A)) ≠ P(X = A) · P(Y = A) = 0.1850 · 0.2170 = 0.0401.

Example

X\Y     A       C       G       T
A    0.0401  0.0537  0.0512  0.0400   0.1850
C    0.0654  0.0874  0.0833  0.0651   0.3012
G    0.0634  0.0848  0.0809  0.0632   0.2923
T    0.0481  0.0643  0.0613  0.0479   0.2215
     0.2170  0.2901  0.2766  0.2163

Same marginals as above but X and Y are independent in this example.
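The failure of factorization can be checked mechanically: compare each joint point probability with the product of the marginals. A sketch using the joint table from the example:

```python
nucleotides = ["A", "C", "G", "T"]
# Joint point probabilities p(x, y) from the example table.
joint = {
    ("A", "A"): 0.1272, ("A", "C"): 0.0063, ("A", "G"): 0.0464, ("A", "T"): 0.0051,
    ("C", "A"): 0.0196, ("C", "C"): 0.2008, ("C", "G"): 0.0082, ("C", "T"): 0.0726,
    ("G", "A"): 0.0556, ("G", "C"): 0.0145, ("G", "G"): 0.2151, ("G", "T"): 0.0071,
    ("T", "A"): 0.0146, ("T", "C"): 0.0685, ("T", "G"): 0.0069, ("T", "T"): 0.1315,
}
p_X = {x: sum(joint[(x, y)] for y in nucleotides) for x in nucleotides}
p_Y = {y: sum(joint[(x, y)] for x in nucleotides) for y in nucleotides}

# Independence holds iff p(x, y) = p_X(x) * p_Y(y) for every pair.
independent = all(
    abs(joint[(x, y)] - p_X[x] * p_Y[y]) < 1e-6
    for x in nucleotides for y in nucleotides
)
print(independent)  # False: e.g. p(A, A) = 0.1272 but 0.1850 * 0.2170 = 0.0401

# Replacing the joint table by the product of its marginals gives the
# independent distribution with the same marginals (the second table).
product = {(x, y): p_X[x] * p_Y[y] for x in nucleotides for y in nucleotides}
```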
The bivariate normal distribution

The function

f(x, y) = (√(1 − ρ²) / (2π)) exp(−(x² − 2ρxy + y²)/2)

for ρ ∈ (−1, 1) on ℝ² is an example of a bivariate density on ℝ². It satisfies that f(x, y) ≥ 0 and ∫∫ f(x, y) dx dy = 1.

The random variables X and Y have joint distribution with density f if

P(a_1 ≤ X ≤ b_1, a_2 ≤ Y ≤ b_2) = ∫_{a_1}^{b_1} ∫_{a_2}^{b_2} f(x, y) dy dx.

[Figure: surface plots of the density f with ρ = 0 (left) and ρ = 0.75 (right).]
The bivariate normal distribution

We find that the marginal distribution of X is given by

P(a ≤ X ≤ b) = ∫_a^b ∫_{−∞}^{∞} f(x, y) dy dx = ∫_a^b √((1 − ρ²)/(2π)) e^{−(1 − ρ²)x²/2} dx,

which shows that the marginal distribution of X is N(0, (1 − ρ²)^{−1}).

The marginal distribution of Y is likewise N(0, (1 − ρ²)^{−1}), but ρ also determines the dependence between X and Y. The variables X and Y are independent if and only if ρ = 0.
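The marginal-density claim can be sanity-checked numerically: integrate f(x, y) over y on a wide grid and compare with the claimed N(0, (1 − ρ²)^{−1}) density. A sketch using only the standard library (ρ = 0.75 and the evaluation point x = 0.8 are arbitrary choices):

```python
import math

rho = 0.75

def f(x, y):
    # The bivariate density from the slides.
    return math.sqrt(1 - rho ** 2) / (2 * math.pi) * math.exp(-(x * x - 2 * rho * x * y + y * y) / 2)

def marginal(x):
    # Claimed marginal density of X, i.e. the N(0, (1 - rho^2)^(-1)) density.
    return math.sqrt((1 - rho ** 2) / (2 * math.pi)) * math.exp(-(1 - rho ** 2) * x * x / 2)

# Riemann sum approximating the integral of f(x, y) over y in [-20, 20].
x, dy = 0.8, 0.001
num = sum(f(x, -20 + i * dy) * dy for i in range(40000))
print(abs(num - marginal(x)) < 1e-6)  # True
```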
General results

If X_1, ..., X_n are real valued random variables we can specify their joint distribution by a density f : ℝ^n → [0, ∞) such that

P(a_1 ≤ X_1 ≤ b_1, ..., a_n ≤ X_n ≤ b_n) = ∫_{a_1}^{b_1} ... ∫_{a_n}^{b_n} f(x_1, ..., x_n) dx_n ... dx_1,

which can be computed as n successive ordinary integrals (the order does not matter).

The marginal distribution of X_i has density

f_i(x_i) = ∫ ... ∫ f(x_1, ..., x_n) dx_n ... dx_{i+1} dx_{i-1} ... dx_1

(n − 1 successive integrals, over all coordinates except x_i).

The X_i's are independent if

f(x_1, ..., x_n) = f_1(x_1) · ... · f_n(x_n).

The model builder's approach

We want to construct a probabilistic model for the random variables X_1, ..., X_n. Can we assume that the variables are independent? If yes, continue.
The model builder's approach

We want to construct a probabilistic model for the random variables X_1, ..., X_n.

Can we assume that the variables are independent? If yes, continue.

Can we assume that the variables all have the same marginal distribution? If yes, continue.

Then we say: let X_1, ..., X_n be iid = independent and identically distributed. We then need to specify either the point probabilities or the density for their common distribution; the joint distribution is given by products.

Conditional distributions

Definition: The conditional distribution of Y given that X ∈ A is defined as

P(Y ∈ B | X ∈ A) = P(Y ∈ B, X ∈ A) / P(X ∈ A),

provided that P(X ∈ A) > 0.

If X and Y are discrete we can condition on events (X = x) and get conditional distributions in terms of point probabilities

p(y | x) = P(Y = y | X = x) = P(Y = y, X = x) / P(X = x) = p(x, y) / Σ_y p(x, y),

where p(x, y) are the joint point probabilities.
Example

Using

P(Y = y | X = x) = P(X = x, Y = y) / P(X = x) = p(x, y) / Σ_{y ∈ E_0} p(x, y),

we have to divide by precisely the row sums to get the matrix of conditional distributions:

X\Y     A       C       G       T
A    0.6874  0.0343  0.2507  0.0276
C    0.0649  0.6667  0.0273  0.2411
G    0.1904  0.0495  0.7359  0.0242
T    0.0658  0.3093  0.0311  0.5938

The row sums above equal 1, and this is an example of a matrix of transition probabilities.

Systematic specification

Dependence among discrete variables indexed by a time parameter can be treated systematically. We can define a collection of conditional probabilities, P_t(x, y) for t ≥ 0, of Y = y given X = x as the solution to a system of differential equations:

dP_t(x, y)/dt = Σ_z P_t(x, z) λ(z, y)

for λ(z, y) ≥ 0 for z ≠ y and

λ(z, z) = −Σ_{y ≠ z} λ(z, y).

The λ(y, z)'s are called intensities.

Solution

On a finite sample space and with the initial condition P_0(x, x) = 1 the above system of differential equations has a unique solution such that P_t(x, ·) is a (conditional) probability measure for all x.

In general there is no closed form expression for the solution.
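Dividing each row of the joint table by its row sum reproduces the matrix of conditional distributions (up to rounding in the fourth decimal); a sketch:

```python
nucleotides = ["A", "C", "G", "T"]
# Joint point probabilities p(x, y) from the earlier example.
joint = {
    ("A", "A"): 0.1272, ("A", "C"): 0.0063, ("A", "G"): 0.0464, ("A", "T"): 0.0051,
    ("C", "A"): 0.0196, ("C", "C"): 0.2008, ("C", "G"): 0.0082, ("C", "T"): 0.0726,
    ("G", "A"): 0.0556, ("G", "C"): 0.0145, ("G", "G"): 0.2151, ("G", "T"): 0.0071,
    ("T", "A"): 0.0146, ("T", "C"): 0.0685, ("T", "G"): 0.0069, ("T", "T"): 0.1315,
}

# Row sums are the marginal point probabilities of X.
row_sum = {x: sum(joint[(x, y)] for y in nucleotides) for x in nucleotides}

# Conditional point probabilities p(y | x) = p(x, y) / p_X(x).
cond = {(x, y): joint[(x, y)] / row_sum[x] for x in nucleotides for y in nucleotides}

# Each row of the conditional matrix is a probability vector.
ok = all(abs(sum(cond[(x, y)] for y in nucleotides) - 1) < 1e-12 for x in nucleotides)
print(ok, round(cond[("C", "C")], 4))  # True 0.6667
```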
Jukes-Cantor model

Intensities:

        A       C       G       T
A     −3α      α       α       α
C      α     −3α       α       α
G      α       α     −3α       α
T      α       α       α     −3α

The parameter α > 0 tells how many mutations occur per time unit. The solution is

P_t(x, x) = 0.25 + 0.75 exp(−4αt)
P_t(x, y) = 0.25 − 0.25 exp(−4αt),  if x ≠ y.

Kimura model

Intensities:

           A           C           G           T
A     −(α + 2β)        β           α           β
C          β      −(α + 2β)       β           α
G          α           β      −(α + 2β)       β
T          β           α           β      −(α + 2β)

for α, β > 0, and the solution is

P_t(x, x) = 0.25 + 0.25 exp(−4βt) + 0.5 exp(−2(α + β)t)
P_t(x, y) = 0.25 + 0.25 exp(−4βt) − 0.5 exp(−2(α + β)t),  if λ(x, y) = α
P_t(x, y) = 0.25 − 0.25 exp(−4βt),  if λ(x, y) = β.
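The Jukes-Cantor closed form can be checked against the defining system dP_t(x, y)/dt = Σ_z P_t(x, z) λ(z, y) by simple Euler integration starting from the identity matrix. A sketch (α = 0.3 and t = 1 are arbitrary choices):

```python
import math

alpha, t_end, n = 0.3, 1.0, 20000
dt = t_end / n

# Jukes-Cantor intensity matrix (order A, C, G, T): off-diagonal alpha, diagonal -3*alpha.
lam = [[-3 * alpha if i == j else alpha for j in range(4)] for i in range(4)]

# Euler steps for dP/dt = P * lam, starting from the identity (P_0(x, x) = 1).
P = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
for _ in range(n):
    P = [
        [P[i][j] + dt * sum(P[i][k] * lam[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]

# Closed-form solution from the slides.
same = 0.25 + 0.75 * math.exp(-4 * alpha * t_end)
diff = 0.25 - 0.25 * math.exp(-4 * alpha * t_end)
print(abs(P[0][0] - same) < 1e-4, abs(P[0][1] - diff) < 1e-4)  # True True
```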
Linear Regression

We often specify the conditional distribution of a real valued random variable Y given X = x, for another real valued random variable X, by writing

Y = α + βx + ε,

where ε is another mean 0 random variable (noise), which is independent of X. This is a location transformation of the distribution of ε. The conditional mean of Y given X = x is α + βx. If ε ∼ N(0, σ²) then Y | X = x ∼ N(α + βx, σ²).

Conditional densities

Definition: If f is the density for the joint distribution of two random variables X and Y taking values in ℝ^n and ℝ^m, respectively, then with

f_1(x) = ∫_{ℝ^m} f(x, y) dy

we define the conditional distribution of Y given X = x to be the distribution with density

f(y | x) = f(x, y) / f_1(x)

for y ∈ ℝ^m and x ∈ ℝ^n with f_1(x) > 0.

Note the formula f(x, y) = f(y | x) f_1(x), which allows us to specify the joint distribution by specifying the marginal distribution of X and the conditional distribution of Y given X.
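The regression specification is easy to simulate: draw Y = α + βx + ε with normal noise and check that the empirical mean matches the conditional mean α + βx. A sketch with hypothetical parameter values:

```python
import random

random.seed(1)

# Hypothetical parameter values for the sketch.
alpha, beta, sigma = 1.0, 2.0, 0.5
x = 1.5  # condition on X = x

# Y = alpha + beta*x + eps with eps ~ N(0, sigma^2) independent of X,
# so Y | X = x ~ N(alpha + beta*x, sigma^2).
samples = [alpha + beta * x + random.gauss(0, sigma) for _ in range(100000)]
mean = sum(samples) / len(samples)

print(abs(mean - (alpha + beta * x)) < 0.02)  # empirical mean close to alpha + beta*x
```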
Generalized linear models

We consider the setup with the probability measures P_θ on a discrete E_1 ⊆ ℝ given by the point probabilities

p_θ(x) = exp(θx − b(θ) + c(x))

for θ ∈ Θ ⊆ ℝ.

If Y is a real valued random variable we can define the conditional distribution of X given Y = y to be P_{β_0 + β_1 y}. That is, the conditional point probabilities for the distribution of X given Y = y are

p(x | y) = p_{β_0 + β_1 y}(x) = exp((β_0 + β_1 y)x − b(β_0 + β_1 y) + c(x)).

The conditional mean of X given Y = y is b′(β_0 + β_1 y).
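One concrete instance (an assumption for illustration, not stated on the slides) is the Poisson family: with b(θ) = e^θ and c(x) = −log x!, p_θ is the Poisson distribution with mean b′(θ) = e^θ, which makes the setup above Poisson regression. A sketch:

```python
import math

def p_theta(x, theta):
    # Exponential-family point probabilities exp(theta*x - b(theta) + c(x))
    # with b(theta) = e^theta and c(x) = -log(x!): the Poisson(e^theta) pmf.
    return math.exp(theta * x - math.exp(theta) - math.lgamma(x + 1))

# Hypothetical regression coefficients and a value of Y.
beta0, beta1, y = -0.2, 0.4, 1.5
theta = beta0 + beta1 * y  # the linear predictor

xs = range(200)  # E_1 = {0, 1, 2, ...}, truncated far out in the tail
total = sum(p_theta(x, theta) for x in xs)
mean = sum(x * p_theta(x, theta) for x in xs)

# The point probabilities sum to 1 and the conditional mean is b'(theta) = e^theta.
print(abs(total - 1) < 1e-9, abs(mean - math.exp(theta)) < 1e-9)  # True True
```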