Lecture 3: Probability review (cont'd)

STATS 200: Introduction to Statistical Inference, Autumn 2016

3.1 Joint distributions

If random variables $X_1,\ldots,X_k$ are independent, then their distribution may be specified by specifying the individual distribution of each variable. If they are not independent, then we need to specify their joint distribution. In the discrete case, the joint distribution is specified by a joint PMF
\[ f_{X_1,\ldots,X_k}(x_1,\ldots,x_k) = P[X_1 = x_1, \ldots, X_k = x_k]. \]
In the continuous case, it is specified by a joint PDF $f_{X_1,\ldots,X_k}(x_1,\ldots,x_k)$, which satisfies, for any set $A \subseteq \mathbb{R}^k$,
\[ P[(X_1,\ldots,X_k) \in A] = \int_A f_{X_1,\ldots,X_k}(x_1,\ldots,x_k)\,dx_1 \ldots dx_k. \]
When it is clear which random variables are being referred to, we will simply write $f(x_1,\ldots,x_k)$ for $f_{X_1,\ldots,X_k}(x_1,\ldots,x_k)$.

Example 3.1. $(X_1,\ldots,X_k)$ have a multinomial distribution, $(X_1,\ldots,X_k) \sim \text{Multinomial}(n,(p_1,\ldots,p_k))$, if these random variables take nonnegative integer values summing to $n$, with joint PMF
\[ f(x_1,\ldots,x_k) = \binom{n}{x_1,\ldots,x_k}\, p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}. \]
Here, $p_1,\ldots,p_k$ are values in $[0,1]$ that satisfy $p_1+\ldots+p_k = 1$ (representing the probabilities of $k$ different mutually exclusive outcomes), and $\binom{n}{x_1,\ldots,x_k}$ is the multinomial coefficient
\[ \binom{n}{x_1,\ldots,x_k} = \frac{n!}{x_1!\,x_2! \cdots x_k!}. \]
(It is understood that the above formula is only for $x_1,\ldots,x_k \geq 0$ such that $x_1+\ldots+x_k = n$; otherwise $f(x_1,\ldots,x_k) = 0$.) $X_1,\ldots,X_k$ describe the number of samples belonging to each of $k$ different outcomes, if there are $n$ total samples each independently belonging to outcomes $1,\ldots,k$ with probabilities $p_1,\ldots,p_k$. For example, if I roll a standard six-sided die 100 times and let $X_1,\ldots,X_6$ denote the numbers of 1's to 6's obtained, then $(X_1,\ldots,X_6) \sim \text{Multinomial}(100,(\tfrac16,\tfrac16,\tfrac16,\tfrac16,\tfrac16,\tfrac16))$.

A second example of a joint distribution is the Multivariate Normal distribution, discussed in the next section.

The covariance between two random variables $X$ and $Y$ is defined by the two equivalent expressions
\[ \text{Cov}[X,Y] = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]E[Y]. \]
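As a rough numerical illustration of Example 3.1 and of the two equivalent covariance expressions, the following sketch evaluates the multinomial PMF directly from its formula and computes a covariance both ways by exact enumeration. The function names and the small parameter values ($n = 4$, $p = (0.5, 0.3, 0.2)$) are illustrative assumptions, not part of the notes.

```python
import math
from itertools import product


def multinomial_pmf(counts, probs):
    """Joint PMF of Multinomial(n, probs) at `counts`, where n = sum(counts)."""
    n = sum(counts)
    coeff = math.factorial(n)
    for x in counts:
        coeff //= math.factorial(x)  # builds n! / (x_1! x_2! ... x_k!)
    value = float(coeff)
    for x, p in zip(counts, probs):
        value *= p ** x
    return value


def outcomes(n, k):
    """All tuples of k nonnegative integers that sum to n."""
    return [c for c in product(range(n + 1), repeat=k) if sum(c) == n]


# Illustrative parameters (assumed for this sketch, not taken from the notes).
n, probs = 4, (0.5, 0.3, 0.2)
pmf = {c: multinomial_pmf(c, probs) for c in outcomes(n, len(probs))}

# The PMF should sum to 1 over its support.
total = sum(pmf.values())


def expect(g):
    """Exact expectation of g(X_1, ..., X_k) under the PMF above."""
    return sum(g(c) * p for c, p in pmf.items())


mean1 = expect(lambda c: c[0])
mean2 = expect(lambda c: c[1])

# The two equivalent expressions for Cov[X_1, X_2].
cov_centered = expect(lambda c: (c[0] - mean1) * (c[1] - mean2))
cov_product = expect(lambda c: c[0] * c[1]) - mean1 * mean2

print("total probability:", total)
print("Cov via E[(X-EX)(Y-EY)]:", cov_centered)
print("Cov via E[XY]-E[X]E[Y]:", cov_product)
```

The two covariance values should agree up to floating-point rounding, and the covariance is negative here because the counts are constrained to sum to $n$.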

So $\text{Cov}[X,X] = \text{Var}[X]$, and $\text{Cov}[X,Y] = 0$ if $X$ and $Y$ are independent. The covariance is bilinear: For any constants $a_1,\ldots,a_k,b_1,\ldots,b_m \in \mathbb{R}$ and any random variables $X_1,\ldots,X_k$ and $Y_1,\ldots,Y_m$ (not necessarily independent),
\[ \text{Cov}[a_1X_1+\ldots+a_kX_k,\; b_1Y_1+\ldots+b_mY_m] = \sum_{i=1}^k \sum_{j=1}^m a_i b_j\, \text{Cov}[X_i,Y_j]. \]
The correlation between $X$ and $Y$ is their covariance normalized by the product of their standard deviations:
\[ \text{corr}(X,Y) = \frac{\text{Cov}[X,Y]}{\sqrt{\text{Var}[X]}\sqrt{\text{Var}[Y]}}. \]
For any $a,b > 0$, we have $\text{Cov}[aX,bY] = ab\,\text{Cov}[X,Y]$. On the other hand, the correlation is invariant to rescaling: $\text{corr}(aX,bY) = \text{corr}(X,Y)$, and it always satisfies $-1 \leq \text{corr}(X,Y) \leq 1$.

3.2 The Multivariate Normal distribution

The Multivariate Normal distribution of dimension $k$ is a distribution for $k$ random variables $X_1,\ldots,X_k$ which generalizes the normal distribution for a single variable. It is parametrized by a mean vector $\mu \in \mathbb{R}^k$ and a symmetric covariance matrix $\Sigma \in \mathbb{R}^{k \times k}$, and we write $(X_1,\ldots,X_k) \sim \mathcal{N}(\mu,\Sigma)$. Rather than writing down the general formula for its joint PDF (which we will not use in this course), let's define this distribution by the following properties:

Definition 3.2. $(X_1,\ldots,X_k)$ have a multivariate normal distribution if, for every choice of constants $a_1,\ldots,a_k \in \mathbb{R}$, the linear combination $a_1X_1+\ldots+a_kX_k$ has a (univariate) normal distribution. $(X_1,\ldots,X_k)$ have the specific multivariate normal distribution $\mathcal{N}(\mu,\Sigma)$ when, in addition,
1. $E[X_i] = \mu_i$ and $\text{Var}[X_i] = \Sigma_{ii}$ for every $i = 1,\ldots,k$, and
2. $\text{Cov}[X_i,X_j] = \Sigma_{ij}$ for every pair $i \neq j$.

When $(X_1,\ldots,X_k)$ are multivariate normal, each $X_i$ has a (univariate) normal distribution, as may be seen by taking $a_i = 1$ and all other $a_j = 0$ in the above definition. The vector $\mu$ specifies the means of these individual normal variables, the diagonal elements of $\Sigma$ specify their variances, and the off-diagonal elements of $\Sigma$ specify their pairwise covariances.

Example 3.3. If $X_1,\ldots,X_k$ are normal and independent, then $a_1X_1+\ldots+a_kX_k$ has a normal distribution for any $a_1,\ldots,a_k \in \mathbb{R}$. To show this, we can use the MGF: Suppose $X_i \sim \mathcal{N}(\mu_i,\sigma_i^2)$. Then $a_iX_i \sim \mathcal{N}(a_i\mu_i, a_i^2\sigma_i^2)$, so (from an earlier lecture) $a_iX_i$ has MGF
\[ M_{a_iX_i}(t) = e^{a_i\mu_i t + \frac{1}{2}a_i^2\sigma_i^2 t^2}. \]

As $a_1X_1,\ldots,a_kX_k$ are independent, the MGF of their sum is the product of their MGFs:
\[ \begin{aligned} M_{a_1X_1+\ldots+a_kX_k}(t) &= M_{a_1X_1}(t) \times \ldots \times M_{a_kX_k}(t) \\ &= e^{a_1\mu_1 t + \frac{1}{2}a_1^2\sigma_1^2 t^2} \times \ldots \times e^{a_k\mu_k t + \frac{1}{2}a_k^2\sigma_k^2 t^2} \\ &= e^{(a_1\mu_1+\ldots+a_k\mu_k)t + \frac{1}{2}(a_1^2\sigma_1^2+\ldots+a_k^2\sigma_k^2)t^2}. \end{aligned} \]
But this is the MGF of a $\mathcal{N}(a_1\mu_1+\ldots+a_k\mu_k,\; a_1^2\sigma_1^2+\ldots+a_k^2\sigma_k^2)$ random variable! As the MGF uniquely determines the distribution, this implies $a_1X_1+\ldots+a_kX_k$ has this normal distribution. Then by definition, $(X_1,\ldots,X_k)$ are multivariate normal. More specifically, in this case we must have $(X_1,\ldots,X_k) \sim \mathcal{N}(\mu,\Sigma)$ where $\mu_i = E[X_i]$, $\Sigma_{ii} = \text{Var}[X_i]$, and $\Sigma_{ij} = 0$ for all $i \neq j$.

Example 3.4. Suppose $(X_1,\ldots,X_k)$ have a multivariate normal distribution, and $(Y_1,\ldots,Y_m)$ are such that each $Y_j$ ($j = 1,\ldots,m$) is a linear combination of $X_1,\ldots,X_k$:
\[ Y_j = a_{j1}X_1 + \ldots + a_{jk}X_k \]
for some constants $a_{j1},\ldots,a_{jk} \in \mathbb{R}$. Then any linear combination of $(Y_1,\ldots,Y_m)$ is also a linear combination of $(X_1,\ldots,X_k)$, and hence is normally distributed. So $(Y_1,\ldots,Y_m)$ also have a multivariate normal distribution.

For two arbitrary random variables $X$ and $Y$, if they are independent, then $\text{corr}(X,Y) = 0$. The converse is in general not true: $X$ and $Y$ can be uncorrelated without being independent. But this converse is true in the special case of the multivariate normal distribution; more generally, we have the following:

Theorem 3.5. Suppose $X$ is multivariate normal and can be written as $X = (X_1,X_2)$, where $X_1$ and $X_2$ are subvectors of $X$ such that each entry of $X_1$ is uncorrelated with each entry of $X_2$. Then $X_1$ and $X_2$ are independent.

To visualize what the joint PDF of the multivariate normal distribution looks like, let's just consider the two-dimensional setting $k = 2$, where we obtain the special case of a Bivariate Normal distribution for two random variables $X, Y$. In this case, the distribution is specified by the means $\mu_1$ and $\mu_2$ of $X$ and $Y$, the variances $\sigma_1^2$ and $\sigma_2^2$ of $X$ and $Y$, and the correlation $\rho$ between $X$ and $Y$. When $\sigma_1^2 = \sigma_2^2 = 1$ and $\mu_1 = \mu_2 = 0$, the contours of the joint PDF of $X$ and $Y$ are shown below, for $\rho = 0$ on the left and $\rho = 0.75$ on the right:

[Figure: Joint PDFs of two Bivariate Normal distributions. On the left, $X$ and $Y$ are marginally $\mathcal{N}(0,1)$ and have zero correlation. On the right, $X$ and $Y$ are marginally $\mathcal{N}(0,1)$ and have correlation 0.75.]

When $\rho = 0$, $X$ and $Y$ are independent standard normal variables, and these contours are circular; the joint PDF has a peak at 0 and decays radially away from 0. When $\rho = 0.75$, the contours are ellipses. As $\rho$ increases to 1, the contours concentrate more and more around the line $y = x$. (In the general $k$-dimensional setting and for general $\mu$ and $\Sigma$, the joint PDF has a single peak at the mean $\mu \in \mathbb{R}^k$, and it decays away from $\mu$ with contours that are ellipsoids around $\mu$, with their shape depending on $\Sigma$.)

3.3 Statistics

For data $X_1,\ldots,X_n$, a statistic $T(X_1,\ldots,X_n)$ is any real-valued function of the data. In other words, it is any number that you can compute from the data. For example, the sample mean
\[ \bar{X} = \frac{1}{n}(X_1+\ldots+X_n), \]
the sample variance
\[ S^2 = \frac{1}{n-1}\left((X_1-\bar{X})^2+\ldots+(X_n-\bar{X})^2\right), \]
and the range
\[ R = \max(X_1,\ldots,X_n) - \min(X_1,\ldots,X_n) \]
are all statistics. Since the data $X_1,\ldots,X_n$ are realizations of random variables, a statistic is also a (realization of a) random variable. A major use of probability in this course will be to understand the distribution of a statistic, called its sampling distribution, based on the distribution of the original data $X_1,\ldots,X_n$. Let's work through some examples:

Example 3.6 (Sample mean of IID normals). Suppose $X_1,\ldots,X_n \overset{IID}{\sim} \mathcal{N}(\mu,\sigma^2)$. The sample mean $\bar{X}$ is actually a special case of the quantity $a_1X_1+\ldots+a_nX_n$ from Example 3.3, where
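$a_i = \frac{1}{n}$, $\mu_i = \mu$, and $\sigma_i^2 = \sigma^2$ for all $i = 1,\ldots,n$. Then from that Example,
\[ \bar{X} \sim \mathcal{N}\left(\mu, \frac{\sigma^2}{n}\right). \]

Example 3.7 (Chi-squared distribution). Suppose $X_1,\ldots,X_n \overset{IID}{\sim} \mathcal{N}(0,1)$. Let's derive the distribution of the statistic $X_1^2+\ldots+X_n^2$. By independence of $X_1^2,\ldots,X_n^2$,
\[ M_{X_1^2+\ldots+X_n^2}(t) = M_{X_1^2}(t) \times \ldots \times M_{X_n^2}(t). \]
We may compute, for each $X_i^2$, its MGF
\[ M_{X_i^2}(t) = E[e^{tX_i^2}] = \int_{-\infty}^{\infty} e^{tx^2} \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}\,dx = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{(t-\frac{1}{2})x^2}\,dx. \]
If $t \geq \frac{1}{2}$, then $M_{X_i^2}(t) = \infty$. Otherwise,
\[ M_{X_i^2}(t) = \frac{1}{\sqrt{1-2t}} \int_{-\infty}^{\infty} \sqrt{\frac{1-2t}{2\pi}}\, e^{-\frac{1}{2}(1-2t)x^2}\,dx. \]
We recognize the quantity inside this integral as the PDF of the $\mathcal{N}(0,\frac{1}{1-2t})$ distribution, and hence the integral equals 1. Then
\[ M_{X_i^2}(t) = \begin{cases} \infty & t \geq \frac{1}{2} \\ (1-2t)^{-1/2} & t < \frac{1}{2}. \end{cases} \]
This is the MGF of the $\text{Gamma}(\frac{1}{2},\frac{1}{2})$ distribution, so $X_i^2 \sim \text{Gamma}(\frac{1}{2},\frac{1}{2})$. This is also called the chi-squared distribution with 1 degree of freedom, denoted $\chi^2_1$. Going back to the sum,
\[ M_{X_1^2+\ldots+X_n^2}(t) = M_{X_1^2}(t) \times \ldots \times M_{X_n^2}(t) = \begin{cases} \infty & t \geq \frac{1}{2} \\ (1-2t)^{-n/2} & t < \frac{1}{2}. \end{cases} \]
This is the MGF of the $\text{Gamma}(\frac{n}{2},\frac{1}{2})$ distribution, so $X_1^2+\ldots+X_n^2 \sim \text{Gamma}(\frac{n}{2},\frac{1}{2})$. This is called the chi-squared distribution with $n$ degrees of freedom, denoted $\chi^2_n$.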
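The sampling distributions in Examples 3.6 and 3.7 can be sanity-checked by simulation. The sketch below is only a rough Monte Carlo check under assumed values ($n = 5$, $\mu = 2$, $\sigma = 3$, 20000 replications, and a fixed seed, none of which come from the notes). It compares the simulated mean and variance of $\bar{X}$ with $\mu$ and $\sigma^2/n$, and compares an empirical MGF of $X_1^2+\ldots+X_n^2$ at one point $t < \frac{1}{2}$ with $(1-2t)^{-n/2}$.

```python
import math
import random
import statistics

random.seed(0)  # arbitrary seed, only for reproducibility

# Illustrative values (assumptions for this sketch).
n = 5                  # sample size
mu, sigma = 2.0, 3.0   # mean and standard deviation for Example 3.6
reps = 20000           # number of simulated data sets
t = 0.1                # point at which to evaluate the MGF (needs t < 1/2)

sample_means = []
sums_of_squares = []
for _ in range(reps):
    # Example 3.6: sample mean of n IID N(mu, sigma^2) draws.
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    sample_means.append(statistics.fmean(xs))
    # Example 3.7: sum of squares of n IID N(0, 1) draws.
    zs = [random.gauss(0.0, 1.0) for _ in range(n)]
    sums_of_squares.append(sum(z * z for z in zs))

# Simulated mean and variance of the sample mean.
mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.pvariance(sample_means)

# Empirical MGF of the sum of squares at t, versus the derived formula.
empirical_mgf = statistics.fmean(math.exp(t * s) for s in sums_of_squares)
theoretical_mgf = (1 - 2 * t) ** (-n / 2)

print("mean of sample means:", mean_of_means, "vs mu =", mu)
print("variance of sample means:", var_of_means, "vs sigma^2/n =", sigma**2 / n)
print("empirical MGF at t:", empirical_mgf, "vs (1-2t)^(-n/2) =", theoretical_mgf)
```

Agreement is only approximate, since every simulated quantity carries Monte Carlo error that shrinks as the number of replications grows.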