THE ROYAL STATISTICAL SOCIETY 2009 EXAMINATIONS SOLUTIONS GRADUATE DIPLOMA MODULAR FORMAT MODULE 2 STATISTICAL INFERENCE

The Society provides these solutions to assist candidates preparing for the examinations in future years and for the information of any other persons using the examinations.

The solutions should NOT be seen as "model answers". Rather, they have been written out in considerable detail and are intended as learning aids.

Users of the solutions should always be aware that in many cases there are valid alternative methods. Also, in the many cases where discussion is called for, there may be other valid points that could be made.

While every care has been taken with the preparation of these solutions, the Society will not be responsible for any errors or omissions.

The Society will not enter into any correspondence in respect of these solutions.

Note. In accordance with the convention used in the Society's examination papers, the notation log denotes logarithm to base e. Logarithms to any other base are explicitly identified, e.g. $\log_{10}$.

RSS 2009

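Graduate Diploma, Module 2, 2009

Question 1

(i) We have $E(W_i) = \mu$ and $\mathrm{Var}(W_i) = \sigma^2$, so
$$E(W_i^2) = \mathrm{Var}(W_i) + \{E(W_i)\}^2 = \sigma^2 + \mu^2, \qquad E\left(\frac{1}{n}\sum_{i=1}^{n} W_i^2\right) = \sigma^2 + \mu^2 .$$
For $i = 1, 2, \ldots, n-1$, we have
$$\mathrm{Cov}(W_i, W_{i+1}) = E(W_i W_{i+1}) - E(W_i)E(W_{i+1}) = \rho\sigma^2,$$
and therefore
$$E(W_i W_{i+1}) = \rho\sigma^2 + \mu^2, \qquad E\left(\frac{1}{n-1}\sum_{i=1}^{n-1} W_i W_{i+1}\right) = \rho\sigma^2 + \mu^2,$$
as required.

(ii) Method of moments estimators are obtained as follows.
$$\hat\mu = \bar W .$$
We have $\hat\sigma^2 + \hat\mu^2 = \frac{1}{n}\sum_{i=1}^{n} W_i^2$, so
$$\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} W_i^2 - \bar W^2 .$$
We have $\hat\rho\hat\sigma^2 + \hat\mu^2 = \frac{1}{n-1}\sum_{i=1}^{n-1} W_i W_{i+1}$, so
$$\hat\rho = \frac{\frac{1}{n-1}\sum_{i=1}^{n-1} W_i W_{i+1} - \bar W^2}{\frac{1}{n}\sum_{i=1}^{n} W_i^2 - \bar W^2} .$$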

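(iii) With $n = 2$,
$$E(\hat\sigma^2) = E\left\{\tfrac12\left(W_1^2 + W_2^2\right)\right\} - E\left\{\tfrac14\left(W_1 + W_2\right)^2\right\}
= \tfrac12\left\{E(W_1^2) + E(W_2^2)\right\} - \tfrac14\left\{E(W_1^2) + E(W_2^2) + 2E(W_1 W_2)\right\}$$
$$= \left(\sigma^2 + \mu^2\right) - \tfrac14\left\{2\left(\sigma^2 + \mu^2\right) + 2\left(\rho\sigma^2 + \mu^2\right)\right\} = \tfrac12\sigma^2(1 - \rho) .$$
Hence
$$\text{Bias} = E(\hat\sigma^2) - \sigma^2 = \tfrac12\sigma^2(1 - \rho) - \sigma^2 = -\tfrac12\sigma^2(1 + \rho) .$$

(iv) $\mathrm{Var}(W_1 + W_2) = \mathrm{Var}(W_1) + \mathrm{Var}(W_2) + 2\,\mathrm{Cov}(W_1, W_2) = 2\sigma^2 + 2\rho\sigma^2$, so
$$\mathrm{Var}(\hat\mu) = \mathrm{Var}\left\{\tfrac12(W_1 + W_2)\right\} = \tfrac12\sigma^2(1 + \rho) .$$

(v) If $W_1 = W_2$, $\hat\rho$ is undefined. Assume then that $W_1 \ne W_2$. We then have
$$\hat\rho = \frac{W_1 W_2 - \tfrac14(W_1 + W_2)^2}{\tfrac12\left(W_1^2 + W_2^2\right) - \tfrac14(W_1 + W_2)^2}
= \frac{\tfrac12 W_1 W_2 - \tfrac14\left(W_1^2 + W_2^2\right)}{-\tfrac12 W_1 W_2 + \tfrac14\left(W_1^2 + W_2^2\right)} = -1 .$$
Clearly this is not an estimator to be relied on. It is not possible to obtain a sensible estimate of a correlation based on only two observations.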
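The conclusion of part (v) can be checked numerically. The following is a minimal Python sketch of the method of moments estimators from part (ii); the function name `mom_estimates` and the data pairs are illustrative assumptions, not part of the examination paper.

```python
def mom_estimates(w):
    """Method of moments estimates (mu_hat, sigma2_hat, rho_hat)."""
    n = len(w)
    mean = sum(w) / n
    mean_sq = sum(x * x for x in w) / n
    # Average of the products of neighbouring observations W_i * W_{i+1}.
    mean_lag = sum(w[i] * w[i + 1] for i in range(n - 1)) / (n - 1)
    sigma2 = mean_sq - mean ** 2
    rho = (mean_lag - mean ** 2) / sigma2  # undefined if sigma2 == 0
    return mean, sigma2, rho


# Made-up pairs with W_1 != W_2: rho_hat is -1 (up to rounding) every time.
for pair in [(1.0, 3.0), (-2.5, 4.0), (10.0, 10.5)]:
    print(pair, mom_estimates(pair)[2])
```

Whatever two distinct values are supplied, the estimate of the correlation is -1, which illustrates why the estimator cannot be relied on when n = 2.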

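Graduate Diploma, Module 2, 2009

Question 2

$$P(N = n) = -\frac{p^n}{n\log(1-p)}, \qquad n = 1, 2, \ldots$$

(i)
$$E(N) = \sum_{n=1}^{\infty} n\left\{-\frac{p^n}{n\log(1-p)}\right\} = -\frac{1}{\log(1-p)}\sum_{n=1}^{\infty} p^n = -\frac{p}{(1-p)\log(1-p)} .$$

(ii) For independent observations $N_1, N_2, \ldots, N_{40}$, the likelihood is
$$L(p) = \frac{p^{\sum N_i}}{\left(\prod N_i\right)\{-\log(1-p)\}^{40}},$$
so that
$$\log L(p) = \left(\sum N_i\right)\log p - 40\log\{-\log(1-p)\} - \sum\log N_i,$$
$$\frac{d\log L}{dp} = \frac{\sum N_i}{p} + \frac{40}{(1-p)\log(1-p)} .$$
The maximum likelihood estimator $\hat p$ therefore satisfies
$$\frac{\sum N_i}{\hat p} + \frac{40}{(1-\hat p)\log(1-\hat p)} = 0 .$$

(iii) The Fisher information is
$$I(p) = -E\left(\frac{d^2\log L}{dp^2}\right)$$
(the second derivative is quoted in the question), i.e.
$$I(p) = -E\left[-\frac{\sum N_i}{p^2} + \frac{40\{1 + \log(1-p)\}}{(1-p)^2\{\log(1-p)\}^2}\right]
= -\frac{40}{p(1-p)\log(1-p)} - \frac{40\{1 + \log(1-p)\}}{(1-p)^2\{\log(1-p)\}^2}$$
$$= -\frac{40\left[(1-p)\log(1-p) + p\{1 + \log(1-p)\}\right]}{p(1-p)^2\{\log(1-p)\}^2}
= -\frac{40\{p + \log(1-p)\}}{p(1-p)^2\{\log(1-p)\}^2} .$$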

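Therefore an approximate 95% confidence interval for $p$ is given by
$$\hat p \pm 1.96\sqrt{\frac{\hat p(1-\hat p)^2\{\log(1-\hat p)\}^2}{-40\{\hat p + \log(1-\hat p)\}}} .$$
If $\hat p = 0.8$, this confidence interval is
$$0.8 \pm 1.96\sqrt{\frac{0.8 \times 0.2^2 \times (\log 0.2)^2}{-40(0.8 + \log 0.2)}},$$
i.e. $0.8 \pm (1.96 \times 0.0506)$, i.e. $0.8 \pm 0.099$ or (0.701, 0.899).

(iv) We use the Newton-Raphson method, starting from $\hat p_0 = 0.75$. Iterations continue according to the scheme described below until convergence occurs.
$$\hat p_{r+1} = \hat p_r - \left.\frac{d\log L}{dp}\right|_{p = \hat p_r} \Bigg/ \left.\frac{d^2\log L}{dp^2}\right|_{p = \hat p_r}$$
We have, inserting $\hat p_0 = 0.75$ and $\sum N_i = 100$,
$$\left.\frac{d\log L}{dp}\right|_{p = 0.75} = \frac{100}{0.75} + \frac{40}{0.25\log 0.25} = 17.918$$
and
$$\left.\frac{d^2\log L}{dp^2}\right|_{p = 0.75} = -\frac{100}{0.75^2} + \frac{40(1 + \log 0.25)}{0.25^2(\log 0.25)^2} = -306.42 .$$
Hence
$$\hat p_1 = 0.75 - \frac{17.918}{-306.42} = 0.75 + 0.0585 = 0.808 .$$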
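The Newton-Raphson scheme of part (iv) can be sketched in Python as follows. This is an illustration only: the function names and the stopping rule (`tol`, `max_iter`) are assumptions, and the inputs (starting value 0.75, sum of the observations 100, sample size 40) are the values used in the worked step of the solution.

```python
import math


def score(p, total, n):
    """First derivative of the log likelihood with respect to p."""
    return total / p + n / ((1 - p) * math.log(1 - p))


def second_derivative(p, total, n):
    """Second derivative of the log likelihood with respect to p."""
    log_q = math.log(1 - p)
    return -total / p ** 2 + n * (1 + log_q) / ((1 - p) ** 2 * log_q ** 2)


def newton_raphson(p0, total, n, tol=1e-10, max_iter=50):
    """Iterate p <- p - score / second_derivative until the step is tiny."""
    p = p0
    for _ in range(max_iter):
        step = score(p, total, n) / second_derivative(p, total, n)
        p -= step
        if abs(step) < tol:
            break
    return p


# First iterate from p0 = 0.75, then the value at convergence.
p1 = 0.75 - score(0.75, 100, 40) / second_derivative(0.75, 100, 40)
p_hat = newton_raphson(0.75, 100, 40)
print(round(p1, 4), round(p_hat, 4))
```

The first iterate reproduces the hand calculation (about 0.808); further iterations change the estimate only slightly.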

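Graduate Diploma, Module 2, 2009

Question 3

A random interval $(X_1, X_2)$ is a 95% confidence interval for $\theta$ if $P(X_1 < \theta < X_2) = 0.95$ for all possible values of $\theta$.

(i)
$$f(x_1, \ldots, x_n) = \prod_{i=1}^{n}\frac{x_i^{k-1}e^{-x_i/\alpha}}{(k-1)!\,\alpha^k}
= \frac{e^{-\sum x_i/\alpha}}{\alpha^{nk}} \cdot \frac{\left(\prod x_i\right)^{k-1}}{\{(k-1)!\}^n}
= g\left(\sum x_i;\ \alpha\right) h(x_1, \ldots, x_n) .$$
Since the joint density is the product of a factor not involving the parameter $\alpha$ and a factor only dependent on the observations $x_1, \ldots, x_n$ through $\sum x_i$, it follows by the factorisation theorem that $Y = \sum X_i$ is sufficient for $\alpha$.

(ii) First, the moment generating function (mgf) of $Y = \sum X_i$ is
$$M_Y(t) = (\text{mgf of } X)^n = (1 - \alpha t)^{-nk} \qquad (\text{for } t < 1/\alpha) .$$
Now writing $W = 2Y/\alpha$, we have that the mgf of $W$ is
$$M_W(t) = E\left(e^{tW}\right) = E\left(e^{2tY/\alpha}\right) = M_Y\left(\frac{2t}{\alpha}\right) = (1 - 2t)^{-nk},$$
and this is the mgf of $\chi^2_{2nk}$. Therefore, by the 1:1 correspondence of mgfs and distributions, $W \sim \chi^2_{2nk}$.

(iii) Use the standard $\chi^2$ tables to find $r_1$ satisfying $P(\chi^2_{2nk} < r_1) = 0.025$ and $r_2$ satisfying $P(\chi^2_{2nk} < r_2) = 0.975$. Then we have
$$P\left(r_1 < \frac{2Y}{\alpha} < r_2\right) = 0.95 \quad \text{for all } \alpha .$$
Thus a 95% confidence interval for $\alpha$ is
$$\left(\frac{2Y}{r_2},\ \frac{2Y}{r_1}\right) .$$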

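(iv) $n = 10$, $k = 3$. So the number of degrees of freedom is $2 \times 10 \times 3 = 60$. From $\chi^2$ tables, we have $r_1 = 40.482$ and $r_2 = 83.298$.

Hence the 95% confidence interval for $\alpha$ is
$$\frac{2\sum X_i}{83.298} \text{ to } \frac{2\sum X_i}{40.482}, \quad \text{i.e. } \frac{\sum X_i}{41.649} \text{ to } \frac{\sum X_i}{20.241} .$$

To find the expected length of this interval, we need $E(\sum X_i)$, i.e. $E(Y)$ as defined above. Using the mgf of $Y$, this is the derivative of $M_Y(t)$ at $t = 0$. We have
$$\frac{dM_Y(t)}{dt} = nk\alpha(1 - \alpha t)^{-nk-1},$$
which, on inserting $t = 0$ together with $n = 10$ and $k = 3$, gives simply $30\alpha$.

Hence the expected length of the 95% confidence interval is
$$30\alpha\left(\frac{1}{20.241} - \frac{1}{41.649}\right) = 0.7618\alpha .$$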
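The arithmetic for the expected length can be reproduced with a few lines of Python. This is only a check of the calculation: the variable names are assumptions, and the two percentage points are entered by hand as read from chi-squared tables with 60 degrees of freedom rather than computed.

```python
n, k = 10, 3
r1, r2 = 40.482, 83.298  # 2.5% and 97.5% points of chi-squared(60), from tables

# The interval is (2Y / r2, 2Y / r1) and E(Y) = n * k * alpha, so the
# expected length equals length_factor * alpha.
length_factor = 2 * n * k * (1 / r1 - 1 / r2)
print(round(length_factor, 4))  # approximately 0.7618
```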

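Graduate Diploma, Module 2, 2009

Question 4

Suppose that $\theta$ is the parameter of a distribution. We want to test $H_0: \theta = \theta_0$ against $H_1: \theta = \theta_1$, where $\theta_0$ and $\theta_1$ ($\ne \theta_0$) are given. Let $\alpha$ be the required significance level. The Neyman-Pearson approach is to choose the test with the largest power at $\theta_1$, subject to its size being $\alpha$. The Neyman-Pearson lemma shows that this property is satisfied by a likelihood ratio test.

(i)
$$f(x_i) = \frac{e^{-\mu(1 + a_i\theta)}\{\mu(1 + a_i\theta)\}^{x_i}}{x_i!} .$$
The likelihood is
$$L(\mu, \theta) = \prod_{i=1}^{n} f(x_i) = \frac{e^{-\mu(n + \theta\sum a_i)}\,\mu^{\sum x_i}\prod(1 + a_i\theta)^{x_i}}{\prod x_i!} .$$
The likelihood ratio is
$$\frac{L(10, 1)}{L(10, 0)} = \frac{e^{-10n - 10\sum a_i}\,10^{\sum x_i}\prod(1 + a_i)^{x_i} / \prod x_i!}{e^{-10n}\,10^{\sum x_i} / \prod x_i!} = e^{-10\sum a_i}\prod(1 + a_i)^{x_i} .$$
Therefore the critical region consists of values of the $x_i$ such that
$$e^{-10\sum a_i}\prod(1 + a_i)^{x_i} \ge k, \text{ where } k \text{ is a constant},$$
i.e. such that
$$\sum x_i\log(1 + a_i) \ge c, \text{ where } c \text{ is a constant} .$$

(ii) Under $H_0$ ($\mu = 10$ and $\theta = 0$), we have $E(X_i) = \mathrm{Var}(X_i) = 10$ and thus
$$E\{X_i\log(1 + a_i)\} = 10\log(1 + a_i) \quad \text{and} \quad \mathrm{Var}\{X_i\log(1 + a_i)\} = 10\{\log(1 + a_i)\}^2 .$$
Therefore, by the central limit theorem,
$$\sum X_i\log(1 + a_i) \sim N\left(10\sum\log(1 + a_i),\ 10\sum\{\log(1 + a_i)\}^2\right) \text{ approximately, under } H_0 .$$
We require $P\left(\sum X_i\log(1 + a_i) \ge c \mid H_0\right) = 0.05$. Thus we have
$$c = 10\sum\log(1 + a_i) + 1.645\sqrt{10\sum\{\log(1 + a_i)\}^2} .$$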
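The critical value in part (ii) is straightforward to evaluate once the constants a_i are known. The sketch below is illustrative only: the function name `critical_value`, its argument names and the values in `a_example` are assumptions (no values of a_i are given in the solution).

```python
import math


def critical_value(a, mu0=10.0, z=1.645):
    """Approximate 5% critical value c for sum(x_i * log(1 + a_i))."""
    logs = [math.log(1 + a_i) for a_i in a]
    mean = mu0 * sum(logs)                       # null mean of the statistic
    variance = mu0 * sum(v * v for v in logs)    # null variance
    return mean + z * math.sqrt(variance)


# Hypothetical constants a_i, for illustration only.
a_example = [0.5, 1.0, 1.5, 2.0, 2.5]
print(critical_value(a_example))
```

The test rejects H_0 at the 5% level when the observed value of the statistic is at least the returned value of c.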

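(iii) (a) The likelihood ratio is
$$\frac{L(\mu_1, 0)}{L(10, 0)} = \frac{e^{-n\mu_1}\mu_1^{\sum x_i} / \prod x_i!}{e^{-10n}\,10^{\sum x_i} / \prod x_i!} = e^{-n(\mu_1 - 10)}\left(\frac{\mu_1}{10}\right)^{\sum x_i} .$$
Therefore the critical region consists of values such that
$$e^{-n(\mu_1 - 10)}\left(\frac{\mu_1}{10}\right)^{\sum x_i} \ge c',$$
i.e. such that $\sum x_i \ge c$ (since $\mu_1 > 10$). Thus the same form of test is obtained for all $\mu_1 > 10$, so this test is uniformly most powerful.

(b) The likelihood ratio is
$$\frac{L(10, \theta_1)}{L(10, 0)} = \frac{e^{-10n - 10\theta_1\sum a_i}\,10^{\sum x_i}\prod(1 + a_i\theta_1)^{x_i} / \prod x_i!}{e^{-10n}\,10^{\sum x_i} / \prod x_i!} = e^{-10\theta_1\sum a_i}\prod(1 + a_i\theta_1)^{x_i} .$$
Therefore the critical region consists of values such that
$$\sum x_i\log(1 + a_i\theta_1) \ge c .$$
Different tests are obtained for different values of $\theta_1$, so there is no uniformly most powerful test.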

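Graduate Diploma, Module 2, 2009

Question 5

(i) Let $\hat\theta_{\backslash i}$ be the corresponding estimator based on the $n - 1$ observations with $X_i$ missing, i.e.
$$\hat\theta_{\backslash i} = \hat\theta(X_1, X_2, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n), \quad \text{for } i = 1, 2, \ldots, n .$$
Now define
$$\tilde\theta_i = n\hat\theta - (n-1)\hat\theta_{\backslash i}, \quad \text{for } i = 1, 2, \ldots, n .$$
The jackknife estimator is then given by
$$\tilde\theta_J = \frac{1}{n}\sum_{i=1}^{n}\tilde\theta_i .$$
We have $E(\hat\theta) = \theta + \dfrac{k}{n}$, and so $E(\hat\theta_{\backslash i}) = \theta + \dfrac{k}{n-1}$. Hence
$$E(\tilde\theta_i) = nE(\hat\theta) - (n-1)E(\hat\theta_{\backslash i}) = n\theta + k - (n-1)\theta - k = \theta$$
and therefore
$$E(\tilde\theta_J) = \frac{1}{n}\sum_{i=1}^{n}E(\tilde\theta_i) = \theta,$$
i.e. $\tilde\theta_J$ is an unbiased estimator of $\theta$.

(ii) (a) Writing $T = \sum X_i$ and $U = \sum X_i^2$, we use
$$\hat c = \frac{S}{\bar X} = \frac{\sqrt{\left(U - T^2/n\right)/(n-1)}}{T/n} .$$
For the sample with $X_i$ missing, i.e. $X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n$, the sum of the observations is $T - X_i$ and the sum of the squares of the observations is $U - X_i^2$, so
$$\hat c_{\backslash i} = \frac{\sqrt{\left\{U - X_i^2 - (T - X_i)^2/(n-1)\right\}/(n-2)}}{(T - X_i)/(n-1)} .$$
Therefore the jackknife estimator $\tilde c_J$ is
$$\tilde c_J = \frac{1}{n}\sum_{i=1}^{n}\tilde c_i, \quad \text{where } \tilde c_i = n\hat c - (n-1)\hat c_{\backslash i} .$$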

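(b) Let
$$\hat\sigma^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(\tilde c_i - \tilde c_J\right)^2 .$$
Then an approximate 95% confidence interval for the coefficient of variation is
$$\tilde c_J \pm t\,\frac{\hat\sigma}{\sqrt{n}},$$
where $t$ is the upper 2.5% point of $t_{n-1}$.

(c) Take say 1000 bootstrap samples. In the $i$th sample, sample $n$ values at random with replacement from $X_1, X_2, \ldots, X_n$ and use these values to find
$$c_i^* = \frac{\text{sample sd}}{\text{sample mean}} .$$
Order the 1000 estimates: $c^*_{(1)} < c^*_{(2)} < \cdots < c^*_{(1000)}$. Then an approximate 95% confidence interval is $c^*_{(25)}$ to $c^*_{(975)}$.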
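The jackknife and bootstrap procedures for the coefficient of variation can be sketched in Python using only the standard library. This is an illustration under stated assumptions: the function names, the random seed and the data values are made up, the sample standard deviation is taken with divisor n - 1, and the t percentage point (2.262 for 9 degrees of freedom) is entered by hand from tables.

```python
import math
import random
import statistics


def cv(xs):
    """Coefficient of variation: sample sd (divisor n - 1) over sample mean."""
    return statistics.stdev(xs) / statistics.fmean(xs)


def jackknife_cv(xs):
    """Return the jackknife estimate, its standard error and the pseudo-values."""
    n = len(xs)
    t = sum(xs)
    u = sum(x * x for x in xs)
    c_hat = cv(xs)
    pseudo = []
    for x in xs:
        t_i, u_i = t - x, u - x * x          # totals with this value removed
        sd_i = math.sqrt((u_i - t_i ** 2 / (n - 1)) / (n - 2))
        c_i = sd_i / (t_i / (n - 1))         # leave-one-out estimate
        pseudo.append(n * c_hat - (n - 1) * c_i)
    c_j = sum(pseudo) / n
    se = statistics.stdev(pseudo) / math.sqrt(n)
    return c_j, se, pseudo


def bootstrap_percentile_interval(xs, n_boot=1000, seed=0):
    """25th and 975th ordered bootstrap estimates when n_boot is 1000."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = sorted(cv(rng.choices(xs, k=n)) for _ in range(n_boot))
    return estimates[24], estimates[974]


# Made-up positive data, for illustration only.
data = [12.1, 9.8, 11.4, 10.9, 13.2, 8.7, 10.1, 11.8, 9.5, 12.6]
c_j, se, _ = jackknife_cv(data)
t_point = 2.262  # upper 2.5% point of t with 9 degrees of freedom (tables)
print("jackknife:", c_j - t_point * se, c_j + t_point * se)
print("bootstrap:", bootstrap_percentile_interval(data))
```

Working from the totals T and U avoids recomputing each leave-one-out sample from scratch, which mirrors the algebra in part (ii)(a).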

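Graduate Diploma, Module 2, 2009

Question 6

$$f(y_i) = 2\alpha e^{-2\alpha y_i} \qquad (i = 1, 2, \ldots, m)$$

(i) The likelihood is
$$L(\alpha) = 2^m\alpha^m e^{-2\alpha\sum y_i},$$
so the log likelihood is
$$\log L(\alpha) = m\log 2 + m\log\alpha - 2\alpha\sum y_i .$$
Hence
$$\frac{d\log L}{d\alpha} = \frac{m}{\alpha} - 2\sum y_i,$$
which on setting equal to zero gives solution $\hat\alpha = \dfrac{m}{2\sum y_i}$.

To investigate whether this is a maximum, consider
$$\frac{d^2\log L}{d\alpha^2} = -\frac{m}{\alpha^2} < 0 .$$
Hence $\hat\alpha = m/(2\sum y_i)$ maximises $\log L(\alpha)$; thus $m/(2\sum y_i)$ is the maximum likelihood estimator of $\alpha$.

For $H_0: \alpha = 1$, $H_1: \alpha \ne 1$, the generalised likelihood ratio test has critical region given by
$$2\{\log L(\hat\alpha) - \log L(1)\} \ge k \quad (\text{for some constant } k),$$
i.e.
$$2\left\{m\log\left(\frac{m}{2\sum y_i}\right) - m + 2\sum y_i\right\} \ge k,$$
i.e.
$$2\sum y_i - m\log\left(\sum y_i\right) \ge k' \quad (\text{for some constant } k') .$$

(ii) When $m$ is large, $2\{\log L(\hat\alpha) - \log L(1)\} \sim \chi^2_1$, approximately, under $H_0$. The 95% point of $\chi^2_1$ is 3.84. So choose $k$ in the above equal to 3.84.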

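(iii) $H_0: \alpha = \beta$, $H_1: \alpha \ne \beta$.

As above, $\hat\beta = \dfrac{n}{2\sum w_j}$ and, under $H_0$, the common estimate is
$$\hat\alpha_0 = \hat\beta_0 = \frac{m + n}{2\left(\sum y_i + \sum w_j\right)} .$$
For this generalised likelihood ratio test, the critical region is given by
$$2\left\{\log L(\hat\alpha, \hat\beta) - \log L(\hat\alpha_0, \hat\beta_0)\right\} \ge k,$$
i.e.
$$2\left\{m\log\hat\alpha + n\log\hat\beta - 2\hat\alpha\sum y_i - 2\hat\beta\sum w_j - (m + n)\log\hat\alpha_0 + 2\hat\alpha_0\left(\sum y_i + \sum w_j\right)\right\} \ge k .$$
Since $2\hat\alpha\sum y_i = m$, $2\hat\beta\sum w_j = n$ and $2\hat\alpha_0\left(\sum y_i + \sum w_j\right) = m + n$, the linear terms cancel, and the factors of 2 inside the logarithms also cancel, leaving
$$-2\left\{(m + n)\log\left(\frac{m + n}{\sum y_i + \sum w_j}\right) - m\log\left(\frac{m}{\sum y_i}\right) - n\log\left(\frac{n}{\sum w_j}\right)\right\} \ge k .$$

(iv) There is one constraint under $H_0$, so $k = 3.84$ (as above). Inserting the given values in the left-hand side of the above inequality gives
$$-2\left\{300\log\left(\frac{300}{140}\right) - 100\log\left(\frac{100}{40}\right) - 200\log\left(\frac{200}{100}\right)\right\},$$
which equals 3.233. Since 3.233 < 3.84, there is not significant evidence against $H_0$ at the 5% level.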
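The value of the test statistic in part (iv) can be checked with the short Python sketch below. The function and argument names are assumptions made for illustration, and the inputs (sample sizes 100 and 200 with totals 40 and 100) are entered as they appear in the numerical expression of the solution. A common scale factor applied to both totals cancels from the statistic, so the check does not depend on how the density is parameterised.

```python
import math


def glr_statistic(m, total_y, n, total_w):
    """Generalised likelihood ratio statistic for equal rates in two samples."""
    pooled = (m + n) * math.log((m + n) / (total_y + total_w))
    separate = m * math.log(m / total_y) + n * math.log(n / total_w)
    return 2 * (separate - pooled)


stat = glr_statistic(100, 40, 200, 100)
print(round(stat, 3), stat < 3.84)
```

The statistic is below the 95% point of chi-squared with 1 degree of freedom, in agreement with the conclusion that H_0 is not rejected at the 5% level.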

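Graduate Diploma, Module 2, 2009

Question 7

(a) Prior: $\pi(p) = 6p(1 - p)$ $(0 < p < 1)$.

(i) Let $X$ = number in sample supporting Candidate 1. We have $X \sim B(n, p)$, so
$$f(x \mid p) \propto p^x(1 - p)^{n - x} .$$
The posterior density is
$$f(p \mid x) \propto p^x(1 - p)^{n - x} \cdot 6p(1 - p) \propto p^{x + 1}(1 - p)^{n - x + 1} .$$
We note from the information in the question that this is a beta distribution with $\alpha_1 = x + 2$ and $\alpha_2 = n + 2 - x$.

(ii) For a large sample, the posterior distribution has approximately a Normal distribution. From the information in the question,
$$\text{mean} = \frac{x + 2}{(x + 2) + (n + 2 - x)} = \frac{x + 2}{n + 4}, \qquad \text{variance} = \frac{(x + 2)(n + 2 - x)}{(n + 5)(n + 4)^2} .$$
So an approximate Bayesian 95% interval for $p$ is given by
$$\frac{x + 2}{n + 4} \pm \frac{1.96}{n + 4}\sqrt{\frac{(x + 2)(n + 2 - x)}{n + 5}} .$$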

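(b) Prior: $\pi(p_1, p_2, p_3) \propto p_1$.

(i)
$$P(X_1 = x_1, X_2 = x_2, X_3 = x_3) \propto p_1^{x_1}p_2^{x_2}p_3^{x_3} .$$
The posterior joint distribution is simply proportional to $p_1^{x_1 + 1}p_2^{x_2}p_3^{x_3}$, i.e. from the information in the question it is a Dirichlet distribution with $\alpha_1 = x_1 + 2$, $\alpha_2 = x_2 + 1$ and $\alpha_3 = x_3 + 1$.

(ii) With respect to the posterior, using the information in the question,
$$E(p_1 - p_2) = E(p_1) - E(p_2) = \frac{x_1 + 2}{n + 4} - \frac{x_2 + 1}{n + 4} = \frac{x_1 - x_2 + 1}{n + 4},$$
$$\mathrm{Var}(p_1 - p_2) = \mathrm{Var}(p_1) + \mathrm{Var}(p_2) - 2\,\mathrm{Cov}(p_1, p_2)
= \frac{(x_1 + 2)(x_2 + x_3 + 2) + (x_2 + 1)(x_1 + x_3 + 3) + 2(x_1 + 2)(x_2 + 1)}{(n + 4)^2(n + 5)} .$$
So an approximate Bayesian 95% interval for $p_1 - p_2$ is given by
$$\frac{x_1 - x_2 + 1}{n + 4} \pm 1.96\sqrt{\mathrm{Var}(p_1 - p_2)} .$$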
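Both approximate intervals can be evaluated with a few lines of Python. The sketch below is illustrative only: the function names and the counts used in the example calls are assumptions, and the posterior parameters are those derived above (beta with parameters x + 2 and n + 2 - x in part (a); Dirichlet with parameters x_1 + 2, x_2 + 1 and x_3 + 1 in part (b)).

```python
import math


def beta_interval(x, n, z=1.96):
    """Approximate 95% posterior interval for p in part (a)."""
    a1, a2 = x + 2, n - x + 2
    total = a1 + a2                                  # equals n + 4
    mean = a1 / total
    var = a1 * a2 / (total ** 2 * (total + 1))
    half = z * math.sqrt(var)
    return mean - half, mean + half


def dirichlet_difference_interval(x1, x2, x3, z=1.96):
    """Approximate 95% posterior interval for p1 - p2 in part (b)."""
    a1, a2, a3 = x1 + 2, x2 + 1, x3 + 1
    a0 = a1 + a2 + a3                                # equals n + 4
    denom = a0 ** 2 * (a0 + 1)
    var1 = a1 * (a0 - a1) / denom
    var2 = a2 * (a0 - a2) / denom
    cov = -a1 * a2 / denom
    mean = (a1 - a2) / a0
    half = z * math.sqrt(var1 + var2 - 2 * cov)
    return mean - half, mean + half


# Hypothetical counts, for illustration only.
print(beta_interval(55, 100))
print(dirichlet_difference_interval(45, 35, 20))
```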

Graduate Diploma, Module 2, 2009

Question 8

Decision making: the actions "accept $H_0$" or "accept $H_1$" must be taken after analysing the data. Usually no wider issues are involved; the data are relevant only to the immediate situation (e.g. in quality control, where we either want to stop the production line or let it continue).

Strength of evidence: it is not necessarily expected that the current experiment will lead to immediate actions, rather that it will add to previously gained information. Wider issues are involved and it is often felt important that significant evidence is found in several independent studies (e.g. at independent centres). In principle, p-values can be combined (meta-analysis). One application is in clinical trials.

The contrast should not be taken too far. In the former case, a value near the critical value ("just accept" $H_0$ or "just accept" $H_1$) may lead to a suspension of action until further evidence is obtained. On the other hand, a "very significant" result in the second case may lead to immediate action.

Significance level: in decision making, this will deliberately be chosen to reflect the "cost" of wrongly rejecting $H_0$ (e.g. stopping the production line when nothing is wrong). In the strength of evidence approach, it is customary to use one of the traditional values (e.g. 0.05), or to quote the exact p-value.

Sample size: in decision making, this will be deliberately chosen to reflect the cost of making wrong decisions (e.g. continuing to operate the production line when in fact there is a fault). In the strength of evidence approach, it is common practice to ensure that the sample size is sufficiently large that the power of detecting an effect of practical importance is sufficiently high.