
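Friday's lecture
Joint densities
Marginal distributions
The distribution of a ratio
Problems

Problem solutions

1.
\[ E(X) = \int x f_X(x)\,dx = \int x \int f_{X,Y}(x,y)\,dy\,dx = \iint x f_{X,Y}(x,y)\,dx\,dy \]

2.
\[ E(X) = \sum_{k=1}^{\infty} k\,p_X(k) = p_X(1) + 2p_X(2) + \dots = P(X > 0) + P(X > 1) + P(X > 2) + \dots = \sum_{k=0}^{\infty} P(X > k) \]

3.
\[ P(X < 1, Y < 1) = \int_0^1 \int_0^1 \lambda^2 e^{-\lambda(x+y)}\,dx\,dy = \left(1 - e^{-\lambda}\right)^2 \]
\[ \left(1 - e^{-\lambda}\right)^2 = 0.01 \;\Rightarrow\; \lambda = 0.11 \]

4. From 2, if $X$ and $Y$ are non-negative integer-valued, then $E(X) = \sum_{k=0}^{\infty} (1 - F_X(k))$ and $E(Y) = \sum_{k=0}^{\infty} (1 - F_Y(k))$, so whenever $F_X(k) \le F_Y(k)$ for all $k$ we have
\[ E(X) = \sum_{k=0}^{\infty} (1 - F_X(k)) \ge \sum_{k=0}^{\infty} (1 - F_Y(k)) = E(Y) \]
Solution 2 can be generalized for integer-valued random variables to
\[ E(X) = \sum_{k=0}^{\infty} P(X > k) - \sum_{k=-\infty}^{-1} P(X \le k) \]
and almost the same argument applies.

In the continuous case, if $X \ge 0$, we have
\[ \int_0^{\infty} (1 - F_X(x))\,dx = \int_0^{\infty} \int_x^{\infty} f_X(t)\,dt\,dx = \int_0^{\infty} f_X(t) \int_0^{t} dx\,dt = \int_0^{\infty} t f_X(t)\,dt = E(X) \]
and the same argument as in the first case applies. Finally, the second argument can similarly be extended to the continuous case.

5. (a)
\[ P(S = j \mid N = k) = \binom{k}{j} \pi^j (1-\pi)^{k-j} \]
(b)
\[ P(S = j, N = k) = P(S = j \mid N = k)\,P(N = k) = \binom{k}{j} \pi^j (1-\pi)^{k-j} \binom{n}{k} p^k (1-p)^{n-k} \]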

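(c)
\[ P(S = j) = \sum_{k=j}^{n} P(S = j, N = k) = \sum_{k=j}^{n} \frac{n!}{j!\,(k-j)!\,(n-k)!}\, \pi^j (1-\pi)^{k-j} p^k (1-p)^{n-k} \]
\[ = \binom{n}{j} (\pi p)^j (1-p)^{n-j} \sum_{i=0}^{n-j} \binom{n-j}{i} \left( \frac{(1-\pi)p}{1-p} \right)^{i} = \binom{n}{j} (\pi p)^j (1-p)^{n-j} \left( 1 + \frac{(1-\pi)p}{1-p} \right)^{n-j} = \binom{n}{j} (\pi p)^j (1 - \pi p)^{n-j} \]
(d)
\[ P(N = k \mid S = j) = \frac{P(N = k, S = j)}{P(S = j)} = \binom{n-j}{k-j} \left( \frac{(1-\pi)p}{1 - \pi p} \right)^{k-j} \left( \frac{1-p}{1 - \pi p} \right)^{n-k} \]

A conditional density
If $(X,Y)$ has joint density $f_{X,Y}(x,y)$, we can define a conditional density of $X$, given that $Y = y$, by
\[ f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x,y)}{f_Y(y)} \]
We can then compute
\[ P(X \in A \mid Y = y) = \int_{x \in A} f_{X|Y}(x \mid y)\,dx \]
even though the condition $\{Y = y\}$ has probability 0.
Discrete case?

An example
Let $f_{X,Y}(x,y) = 2$, $x \ge 0$, $y \ge 0$, $x + y \le 1$. Find the conditional density of $Y$ given that $X = x$.

Independence
Two random variables are independent if
\[ P(X \in A, Y \in B) = P(X \in A)\,P(Y \in B) \]
In particular, this holds if
\[ p_{X,Y}(x,y) = p_X(x)\,p_Y(y) \]
or
\[ f_{X,Y}(x,y) = f_X(x)\,f_Y(y) \]
or
\[ f_{X|Y}(x \mid y) = f_X(x) \]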

The addition rule for expectations
\[ E(X + Y) = E(X) + E(Y) \]
NOTE: No assumption of independence. This result holds whenever the expectations exist.
A special case: $E(aX + b) = a\,E(X) + b$

The addition rule for variances
\[ \mathrm{Var}(X+Y) = E((X+Y)^2) - (E(X+Y))^2 \]
\[ = E(X^2) + 2E(XY) + E(Y^2) - (E(X))^2 - 2E(X)E(Y) - (E(Y))^2 \]
\[ = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,(E(XY) - E(X)E(Y)) \]
If $X$ and $Y$ are independent,
\[ E(XY) = \iint xy\, f_X(x) f_Y(y)\,dx\,dy = \int x f_X(x)\,dx \int y f_Y(y)\,dy = E(X)E(Y) \]
so $\mathrm{Var}(X+Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$.

Covariance
The covariance of $X$ and $Y$ is defined as
\[ \mathrm{Cov}(X,Y) = E\{(X - E(X))(Y - E(Y))\} \]
\[ = E(XY) - E(X)E(Y) - E(X)E(Y) + E(X)E(Y) \]
\[ = E(XY) - E(X)E(Y) \]
If $X$ and $Y$ are independent, $\mathrm{Cov}(X,Y) = 0$.
$\mathrm{Cov}(X,X) =$
$\mathrm{Var}(X+Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X,Y)$
$\mathrm{Var}(X-Y) =$
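The identity $\mathrm{Var}(X+Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X,Y)$ can be checked exactly on a small discrete example. The sketch below is only an illustration: the joint pmf is hypothetical (it does not come from the lecture), and exact fractions are used so that the comparison is not affected by rounding.

```python
from fractions import Fraction as F

# Hypothetical joint pmf of (X, Y); the values are illustrative only.
pmf = {
    (0, 0): F(1, 4),
    (0, 1): F(1, 8),
    (1, 0): F(1, 8),
    (1, 1): F(1, 2),
}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

ex = expect(lambda x, y: x)
ey = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - ex) ** 2)
var_y = expect(lambda x, y: (y - ey) ** 2)
cov_xy = expect(lambda x, y: (x - ex) * (y - ey))
var_sum = expect(lambda x, y: (x + y - ex - ey) ** 2)

# The two sides agree exactly because the arithmetic is rational.
print(var_sum, var_x + var_y + 2 * cov_xy)
```

Here the covariance is positive, so the variance of the sum is larger than the sum of the variances.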

Correlation
The units of covariance are the product of the units of $X$ and $Y$. Sometimes one wants a unit-free quantity. To do that we standardize $X$ and $Y$:
\[ X^* = \frac{X - E(X)}{\sqrt{\mathrm{Var}(X)}}, \qquad Y^* = \frac{Y - E(Y)}{\sqrt{\mathrm{Var}(Y)}} \]
Define the correlation coefficient
\[ \rho(X,Y) = \mathrm{Cov}(X^*, Y^*) = \frac{\mathrm{Cov}(X,Y)}{\sigma_X \sigma_Y} \]
where $\sigma_X = \sqrt{\mathrm{Var}(X)}$.

Properties of correlation coefficient
$|\rho(X,Y)| \le 1$
If $|\rho(X,Y)| = 1$ then $Y = aX + b$

Last Monday's lecture
Conditional distribution and density
Independent random variables
The addition rules for expected value and variance
Covariance and correlation

An example
Let $X$ be $-1$, $0$, or $1$ with equal probabilities $1/3$.
$E(X) =$
Let $Y = 1$ if $X = 0$, $0$ otherwise.
$E(Y) =$
$XY =$
$E(XY) =$
$\mathrm{Cov}(X,Y) = E(XY) - E(X)E(Y) =$
Are $X$ and $Y$ independent?
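One way to work through the example above is to enumerate the three outcomes directly. The sketch below uses exact fractions; the helper name `y_of` is introduced here for illustration only.

```python
from fractions import Fraction as F

# X takes -1, 0, 1 with probability 1/3 each; Y = 1 if X == 0, else 0.
pmf_x = {-1: F(1, 3), 0: F(1, 3), 1: F(1, 3)}

def y_of(x):
    """Value of Y determined by X (hypothetical helper name)."""
    return 1 if x == 0 else 0

e_x = sum(p * x for x, p in pmf_x.items())
e_y = sum(p * y_of(x) for x, p in pmf_x.items())
e_xy = sum(p * x * y_of(x) for x, p in pmf_x.items())
cov = e_xy - e_x * e_y

# Independence would require P(X=0, Y=1) == P(X=0) * P(Y=1).
p_joint = pmf_x[0]          # Y = 1 exactly when X = 0
p_product = pmf_x[0] * e_y  # P(Y = 1) equals E(Y) for an indicator

print(cov, p_joint == p_product)  # prints: 0 False
```

The covariance is zero although $Y$ is a function of $X$, so zero covariance does not imply independence.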

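Calculating covariance and correlation
\[ f_{X,Y}(x,y) = 2, \quad 0 < x < y < 1 \]
\[ f_X(x) = 2(1-x), \quad 0 < x < 1 \qquad f_Y(y) = 2y, \quad 0 < y < 1 \]
\[ E(X) = \int_0^1 x \cdot 2(1-x)\,dx = \frac{1}{3} \qquad E(Y) = \int_0^1 y \cdot 2y\,dy = \frac{2}{3} \]
\[ E(XY) = \int_0^1 \int_0^y xy \cdot 2\,dx\,dy = \frac{1}{4} \]
\[ \mathrm{Cov}(X,Y) = \frac{1}{4} - \frac{1}{3} \cdot \frac{2}{3} = \frac{1}{36} \]
\[ \mathrm{Var}(X) = \int_0^1 x^2 \cdot 2(1-x)\,dx - \left(\frac{1}{3}\right)^2 = \frac{1}{18} \qquad \mathrm{Var}(Y) = \int_0^1 y^2 \cdot 2y\,dy - \left(\frac{2}{3}\right)^2 = \frac{1}{18} \]
\[ \mathrm{Corr}(X,Y) = \frac{1/36}{\sqrt{\frac{1}{18} \cdot \frac{1}{18}}} = \frac{1/36}{1/18} = \frac{1}{2} \]

Law of large numbers
Let $X_1, X_2, \dots$ be independent random variables, all with the same distribution having expected value $\mu$ and variance $\sigma^2$. Then
\[ P\left( \left| \frac{1}{n} \sum_{i=1}^{n} X_i - \mu \right| > \varepsilon \right) \to 0 \quad \text{as } n \to \infty \]
$E\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right) =$
$\mathrm{Var}\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right) =$
We write $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$, the sample average.

Estimation
I have studied minimum annual temperatures in Karlstad, Sweden. It has been suggested that
\[ F_X(x; \mu, \sigma, \xi) = \exp\left\{ -\left( 1 + \xi\, \frac{x - \mu}{\sigma} \right)^{-1/\xi} \right\} \]
If we knew the parameters $\xi$, $\mu$ and $\sigma$, we could draw a histogram of the data and plot the corresponding density to see if it is a good fit.

[Figure: Karlstad data. Histogram of minimum temperature (x-axis, from about $-40$ to $-10$) against density (y-axis), with fitted densities for $(\mu, \sigma, \xi) = (-22, 5, -0.25)$ and $(\mu, \sigma, \xi) = (-20.4, 4.9, -0.23)$.]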
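The law of large numbers stated above can be illustrated by simulation. The sketch below is an illustration under assumptions that are not in the lecture: the draws are Exponential(1), so $\mu = 1$ and $\sigma^2 = 1$, and the seed and sample sizes are arbitrary.

```python
import random

def sample_mean(n, rng):
    """Average of n independent Exponential(1) draws (mu = 1, sigma^2 = 1)."""
    return sum(rng.expovariate(1.0) for _ in range(n)) / n

rng = random.Random(0)  # fixed seed so the run is repeatable
means = {n: sample_mean(n, rng) for n in (10, 1_000, 100_000)}

# The sample average should settle near mu = 1 as n grows.
for n, m in means.items():
    print(n, round(m, 4))
```

The standard deviation of the sample average is $\sigma/\sqrt{n}$, so the typical deviation from $\mu$ shrinks by a factor of 10 each time $n$ grows by a factor of 100.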

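Monday's lecture
Concepts
Covariance and correlation
Law of large numbers
Estimation

Sampling distribution
Since $\hat{\theta}(X_1, \dots, X_n)$ is a random variable, we can compute its sampling distribution cdf
\[ P(\hat{\theta}(X_1, \dots, X_n) \le x) \]
and other properties such as
\[ E_\theta(\hat{\theta}(X_1, \dots, X_n)) = \int \hat{\theta}(x_1, \dots, x_n)\, f_X(x_1, \dots, x_n; \theta)\,dx_1 \dots dx_n \]
\[ \mathrm{bias}(\hat{\theta}, \theta) = E_\theta(\hat{\theta}) - \theta \]
\[ \mathrm{Var}_\theta(\hat{\theta}) \]
\[ \mathrm{se}(\hat{\theta}) = \mathrm{sd}_\theta(\hat{\theta}) = h(\theta) \]
\[ \mathrm{ese}(\hat{\theta}) = h(\hat{\theta}) \]

What estimation is all about
In 1918 R. A. Fisher proposed estimating parameters by considering "how likely are the data if $\theta$ is the true parameter?"
Choosing the parameter that makes the observations most likely is formalized using the likelihood function
\[ L(\theta) = \begin{cases} f_{X_1, \dots, X_n}(x_1, \dots, x_n; \theta), & \text{continuous case} \\ p_{X_1, \dots, X_n}(x_1, \dots, x_n; \theta), & \text{discrete case} \end{cases} \]
The data are fixed! The parameter is varying!
[Photo: R. A. Fisher, 1890-1962]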

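The exponential case
Consider $x_1, \dots, x_n$ independent observations from an exponential distribution with parameter $\lambda$.
The likelihood is
\[ L(\lambda) = \prod_{i=1}^{n} \lambda \exp(-\lambda x_i) = \lambda^n \exp\left( -\lambda \sum_{i=1}^{n} x_i \right) \]
[Figure: likelihood plotted against lambda.]

The method of maximum likelihood
Define the mle
\[ \hat{\theta} = \arg\max(L(\theta)) \]
We compute it by setting $L'(\theta) = 0$ and checking that $L''(\theta) < 0$, or that $L'$ has sign change $+\,-$ about the maximum. Alternatively, plot $L(\theta)$ as a function of $\theta$ and find the maximum numerically.
Computational trick: maximize the log likelihood $\ell(\theta) = \log(L(\theta))$.

Exponential case
\[ L(\lambda) = \lambda^n e^{-\lambda \sum x_i} \]
\[ L'(\lambda) = n \lambda^{n-1} e^{-\lambda \sum x_i} - \lambda^n \left( \sum x_i \right) e^{-\lambda \sum x_i} = \lambda^{n-1} e^{-\lambda \sum x_i} \left( n - \lambda \sum x_i \right) \]
\[ \ell(\lambda) = n \log(\lambda) - \lambda \sum x_i \]
\[ \ell'(\lambda) = \frac{n}{\lambda} - \sum x_i = 0 \;\Rightarrow\; \hat{\lambda} = 1/\bar{x} \]
\[ \ell''(\lambda) = -\frac{n}{\lambda^2} < 0 \]

Binomial case
\[ L(p) = \prod_{i=1}^{n} \binom{N}{x_i} p^{x_i} (1-p)^{N - x_i} = \text{const} \cdot p^{\sum x_i} (1-p)^{nN - \sum x_i} \]
\[ \ell(p) = \text{const} + \sum x_i \log\frac{p}{1-p} + nN \log(1-p) \]
\[ \ell'(p) = \frac{\sum x_i}{p(1-p)} - \frac{nN}{1-p} = 0 \;\Rightarrow\; \hat{p} = \frac{\bar{x}}{N} \]
\[ \ell'(p) = \frac{nN(\hat{p} - p)}{p(1-p)} \]
so the sign change is $+\,-$.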
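For the exponential case derived above, the closed form $\hat{\lambda} = 1/\bar{x}$ can be compared with a direct numerical maximization of the log likelihood, as the slides suggest. This is a minimal sketch: the observations are hypothetical, and the grid search is just one simple way to locate the maximum numerically.

```python
import math

# Hypothetical observations, for illustration only.
x = [0.8, 1.7, 0.3, 2.9, 1.1, 0.6]
n = len(x)
total = sum(x)

def loglik(lam):
    """Exponential log likelihood: n*log(lam) - lam*sum(x)."""
    return n * math.log(lam) - lam * total

lam_hat = n / total  # closed form, i.e. 1 / sample mean

# Crude grid search over (0, 5] as a numerical cross-check.
grid = [k / 1000 for k in range(1, 5001)]
lam_grid = max(grid, key=loglik)

print(lam_hat, lam_grid)
```

The two values agree to within the grid spacing of 0.001; working with the log likelihood avoids the very small numbers that the likelihood itself produces for larger samples.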

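Friday's lecture
The method of maximum likelihood
Computational tools
Checking for a maximum
Problems

Problem solutions

1. Note misprint in problem:
\[ E(X \mid Y = y) = \int x f_{X|Y}(x \mid y)\,dx \]
(a)
\[ E(E(X \mid Y)) = \int \left( \int x\, \frac{f_{X,Y}(x,y)}{f_Y(y)}\,dx \right) f_Y(y)\,dy = \iint x f_{X,Y}(x,y)\,dx\,dy = E(X) \]
(b)
\[ \mathrm{Var}(X \mid Y = y) = E(X^2 \mid Y = y) - (E(X \mid Y = y))^2 \]
As in (a), $E(E(X^2 \mid Y = y)) = E(X^2)$. Let $Z = E(X \mid Y = y)$, so
\[ \mathrm{Var}(X) = E(\mathrm{Var}(X \mid Y = y)) + E(Z^2) - [E(X)]^2 \]
But $E(X) = E(E(X \mid Y = y)) = E(Z)$, so $[E(X)]^2 = [E(Z)]^2$, whence
\[ \mathrm{Var}(X) = E(\mathrm{Var}(X \mid Y = y)) + E(Z^2) - [E(Z)]^2 = E(\mathrm{Var}(X \mid Y = y)) + \mathrm{Var}(E(X \mid Y = y)) \]
(c)
\[ E(Y) = E(E(Y \mid X = k)) = E(0.1\,X) = 0.1 \cdot 10 = 1 \]

2.
\[ \mathrm{Cov}(U,V) = \mathrm{Cov}(X+Y, Y+Z) = \mathrm{Cov}(X,Y) + \mathrm{Cov}(X,Z) + \mathrm{Cov}(Y,Y) + \mathrm{Cov}(Y,Z) = \mathrm{Var}(Y) = 144 \]
\[ \mathrm{Var}(U) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X,Y) = 169 \]
\[ \mathrm{Var}(V) = \mathrm{Var}(Y) + \mathrm{Var}(Z) + 2\,\mathrm{Cov}(Y,Z) = 180 \]
\[ \mathrm{Corr}(U,V) = \frac{144}{\sqrt{169 \cdot 180}} = 0.83 \]

3. (a)
\[ E(\hat{\mu}) = E\left( \sum_{i=1}^{n} a_i X_i \right) = \sum_{i=1}^{n} a_i E(X_i) = \mu \sum_{i=1}^{n} a_i = \mu \]
(b)
\[ \mathrm{Var}(\hat{\mu}) = \sum_{i=1}^{n} a_i^2\, \mathrm{Var}(X_i) = \sigma^2 \sum_{i=1}^{n} a_i^2 \]
(c) By symmetry the weights ought to be equal, in which case they each have to be $1/n$. This is indeed optimal, since
\[ \sum_{i=1}^{n} a_i^2 = \sum_{i=1}^{n} \left( a_i - \frac{1}{n} \right)^2 + \frac{2}{n} \sum_{i=1}^{n} a_i - \frac{1}{n} = \sum_{i=1}^{n} \left( a_i - \frac{1}{n} \right)^2 + \frac{1}{n} \]
so
\[ \sum_{i=1}^{n} a_i^2 \ge \frac{1}{n} \]
with equality if and only if each $a_i = 1/n$.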
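The decomposition used in 3(c) can be checked exactly for particular weights. The sketch below is an illustration only: $n = 4$ and the unequal weights are hypothetical choices, constrained to sum to 1 so that the estimator stays unbiased.

```python
from fractions import Fraction as F

def sum_sq(weights):
    """Sum of squared weights; Var(mu_hat) is sigma^2 times this."""
    return sum(a * a for a in weights)

n = 4
equal = [F(1, n)] * n
# Hypothetical unequal weights that still sum to 1.
unequal = [F(1, 2), F(1, 4), F(1, 8), F(1, 8)]

for a in (equal, unequal):
    assert sum(a) == 1
    # Identity from (c): sum a_i^2 = sum (a_i - 1/n)^2 + 1/n
    assert sum_sq(a) == sum((ai - F(1, n)) ** 2 for ai in a) + F(1, n)

print(sum_sq(equal), sum_sq(unequal))  # prints: 1/4 11/32
```

The equal weights attain the lower bound $1/n$, while the unequal weights give a strictly larger sum of squares and hence a larger variance.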

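4.
\[ \mathrm{Cov}(X, aX+b) = a\,\mathrm{Cov}(X,X) = a\,\mathrm{Var}(X) \]
\[ \mathrm{Var}(aX+b) = a^2\,\mathrm{Var}(X) \]
so $\mathrm{Corr}(X, aX+b) = a/|a|$. Conversely, if $\mathrm{Corr}(X,Y) = 1$ write
\[ \mathrm{Var}(Y^* - X^*) = \mathrm{Var}(Y^*) + \mathrm{Var}(X^*) - 2 = 0 \]
so $Y^* - X^* = c$, or
\[ \frac{Y - E(Y)}{\sigma_Y} = \frac{X - E(X)}{\sigma_X} + c \]
so $Y = aX + b$ where
\[ a = \frac{\sigma_Y}{\sigma_X}, \qquad b = E(Y) + \sigma_Y \left( c - \frac{E(X)}{\sigma_X} \right) \]
The case $\mathrm{Corr}(X,Y) = -1$ is similar, starting from $\mathrm{Var}(Y^* + X^*) = 0$.

5. Let $X$ and $Y$ be the respective arrival times. They are independent, $U(9,10)$ (in hours). We need to compute $P(|X - Y| > 10 \text{ min}) = 1 - P(|X - Y| \le 10 \text{ min})$.
The joint distribution is uniform on the square with corners $(9,9)$, $(9,10)$, $(10,9)$ and $(10,10)$. The probability we want is therefore the area of the two corner triangles, each with legs of length $5/6$:
[Figure: the square with the two corner triangles.]
This area is clearly $(5/6)^2 = 25/36$.

HOV lane needed?
The following data are for passenger car occupancy during one hour at Wilshire and Bundy in Los Angeles:

Occupants   Frequency   Predicted
1           678
2           227
3           56
4           28
5           8
6+          14

The geometric distribution $p_X(y) = p(1-p)^{y-1}$ is a reasonable fit. The log likelihood is
\[ \ell(p) = \sum_{i=1}^{n} \left( \log(p) + (y_i - 1) \log(1-p) \right) \]
whence
\[ \ell'(p) = \frac{n}{p} - \frac{\sum_{i=1}^{n} (y_i - 1)}{1-p} = 0 \;\Rightarrow\; \hat{p} = \frac{1}{\bar{y}} \]

HOV, cont.
But we do not have detailed data on the 6+ group. However,
\[ P(Y \ge 6) = \sum_{k=6}^{\infty} p(1-p)^{k-1} = p(1-p)^5 \sum_{k=0}^{\infty} (1-p)^k = (1-p)^5 \]
so the log likelihood, the log probability of what we actually observed, becomes
\[ \ell(p) = \sum_{\{i:\, y_i < 6\}} \left( \log(p) + (y_i - 1) \log(1-p) \right) + \sum_{\{i:\, y_i \ge 6\}} 5 \log(1-p) \]
\[ = (1011 - 14) \log(p) + 455 \log(1-p) + 14 \cdot 5 \log(1-p) \]
\[ = 997 \log(p) + 525 \log(1-p) \]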
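The final censored log likelihood, $997 \log(p) + 525 \log(1-p)$, can be maximized in closed form, and the fitted model then gives the predicted frequencies. The sketch below is one possible way to do this, not necessarily the computation used in the lecture; the names `A`, `B` and `N_CARS` are introduced here for illustration.

```python
import math

# Coefficients of the censored log likelihood: l(p) = A*log(p) + B*log(1-p).
A = 997          # cars with fewer than 6 occupants
B = 525          # 455 from the uncensored terms plus 14 * 5 from the 6+ group
N_CARS = A + 14  # all observed cars

def loglik(p):
    """Censored geometric log likelihood from the slides."""
    return A * math.log(p) + B * math.log(1 - p)

# Setting l'(p) = A/p - B/(1-p) = 0 gives p_hat = A / (A + B).
p_hat = A / (A + B)

# Predicted frequencies under the fitted geometric model.
predicted = {k: N_CARS * p_hat * (1 - p_hat) ** (k - 1) for k in range(1, 6)}
predicted["6+"] = N_CARS * (1 - p_hat) ** 5

print(round(p_hat, 4))
for k, v in predicted.items():
    print(k, round(v, 1))
```

The estimate is about 0.655, and the predicted frequencies sum to the number of cars because the six categories exhaust the possible outcomes.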

The uniform distribution
Let $X_1, \dots, X_n$ be iid $U(0, \theta)$. Then
$L(\theta) = \theta^{-n}$, so
$\ell(\theta) = -n \log(\theta)$ and
$\ell'(\theta) = -n/\theta$
What is the mle?

Reparametrization
A drug reaction surveillance program was carried out in 9 hospitals. Out of 11,526 monitored patients, 3,24 had an adverse reaction.
Model: $X$ = # adverse reactions ~
Log likelihood
Standard error of the mle:
The bootstrap method just plugs the estimate of $p$ into the formula for the standard error:
But the standard error is a function of $p$. What is its mle?
Fact: The mle of $h(\theta)$ is $h(\hat{\theta})$.

Stopping
Flip a coin until the first heads. Suppose it takes 6 tries. The likelihood is
Now suppose we were going to flip the coin 6 times, and happened to get one head. The likelihood is
How are the mles different?
Fact: Changing the likelihood by a constant does not change the mle.
However, the standard errors would be different.
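The stopping example can be explored numerically. The sketch below assumes the standard forms of the two likelihoods, which are not written out above: $p(1-p)^5$ for "first head on flip 6" and $6p(1-p)^5$ for "one head in 6 flips". The grid search is only an illustrative way to find each maximum.

```python
def geometric_lik(p):
    """First head on the 6th flip: five tails, then a head (assumed form)."""
    return (1 - p) ** 5 * p

def binomial_lik(p):
    """Exactly one head in 6 flips: 6 possible positions (assumed form)."""
    return 6 * p * (1 - p) ** 5

# Grid of candidate values; k = 100 corresponds to p = 1/6.
grid = [k / 600 for k in range(1, 600)]
p_geo = max(grid, key=geometric_lik)
p_bin = max(grid, key=binomial_lik)

print(p_geo, p_bin)
```

Both searches return the same grid point, $p = 1/6$, since the two likelihoods differ only by the constant factor 6.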