THE ROYAL STATISTICAL SOCIETY 2016 EXAMINATIONS SOLUTIONS GRADUATE DIPLOMA MODULE 2

The Society is providing these solutions to assist candidates preparing for the examinations in 2017. The solutions are intended as learning aids and should not be seen as "model answers". Users of the solutions should always be aware that in many cases there are valid alternative methods. Also, in the many cases where discussion is called for, there may be other valid points that could be made. While every care has been taken with the preparation of these solutions, the Society will not be responsible for any errors or omissions. The Society will not enter into any correspondence in respect of these solutions.

Note that there are half-marks in some questions. Please round up any odd half-mark in a question in the total mark for that question.

1. (i) E( X ) x( ) dx ( ) [ x ] [] ( ) (or by a geometric argument). Method of moments: set the sample mean equal to the population mean and solve for $\theta$, [] so the estimator $\hat\theta$ satisfies ˆ X i.e. ˆ X []

(ii) E( X ) E( X ) [ E( X )], so E( ˆ ) E( X ) ( ) i.e. $\hat\theta$ is unbiased. [] 3 3 x 3 3 E( X ) x ( ) ( ) [ ] ( ) [] 3( ) 3 var( X ) ( ) (4 4 4 3 6 3 ) ( ) ( ) [] Hence ˆ 4( ) ( ) var( ) 4 var( X ). [] 3

(iii) Cramér-Rao lower bound: under certain regularity conditions (may be implicit) [] the variance of any unbiased estimator of a parameter $\theta$ is bounded below by $1 / E[(\partial l/\partial\theta)^2]$ [] ($-1 / E[\partial^2 l/\partial\theta^2]$ also acceptable), where $l = \log(L(\theta))$ and $L(\theta)$ is the likelihood function. One of the regularity conditions for the C-R lower bound is that the range of values for x should not depend on $\theta$. This does not hold here, so the C-R lower bound is not applicable [].

(iv) For y, P( Y y) P( X, X,..., X y) P( X y) [] ( y ) y ( ) [] y F( y) [] so ( y) f( y) ( ) []

(v) Mean square error of the estimator is E E c Y E c Y [( ) ] [] [( ( ) ) ] [( ( )) ] ( ) c( ) E[ Y] c E[( Y) ] [] c( ) ( ) c ( ) ( ) ( ) ( ) [] [Candidates may possibly start from MSE = Variance + Bias$^2$. If they do and succeed in getting the correct final expression they should get 3 marks, with 1 or 2 marks for partially successful attempts.] Differentiate w.r.t. c and equate to zero: ( ) c( ) 0 [] so c=. [] The second derivative is positive, so this corresponds to a minimum. [] [An alternative argument not using calculus would be acceptable.]

2. (i) Suppose that $\hat\theta$ is the MLE of a parameter $\theta$ and $g(\theta)$ is a (1-1) function [] of $\theta$. Then $g(\hat\theta)$ is the MLE of $g(\theta)$. []

(ii) The likelihood function is $L(\pi; y) = \binom{n}{y} \pi^y (1-\pi)^{n-y}$ []. Taking logs, $l = \log(L) = \text{const} + y\log(\pi) + (n-y)\log(1-\pi)$. [] Differentiating and equating to zero, $y/\pi = (n-y)/(1-\pi)$, [] so $\hat\pi = y/n$. [] As $\partial^2 l/\partial\pi^2 < 0$, this is the MLE [].

(iii) T x T P( X T ) e dx [] e [], say, and P( X T). 0 We have a binomial [] with n trials, probability of success $\pi$, and y successes. [] Hence the MLE of $\pi$ is $y/n$ []. T ˆ From part (i) ˆ e, [] so T log( ˆ y ˆ ) log( ) [] and ˆ T log y as required. []

(iv) The approximate distribution of $\hat\theta$ is $N(\theta, I_\theta^{-1})$, where $I_\theta = E[(\partial l/\partial\theta)^2] = -E[\partial^2 l/\partial\theta^2]$ (either expression OK) and l is the log likelihood. [] mark is awarded for normality, [] for the mean, [] for either expression for the variance, and [] for saying that l is the log likelihood. There is no need to use the notation $I_\theta$. The reciprocal of one of the given expressions could be quoted directly as the variance of $\hat\theta$. An approximate 95% confidence interval has end-points $\hat\theta \pm 1.96\, I_\theta^{-1/2}$, []

3. (i) A statistic $T(X_1, X_2, \ldots, X_n)$ is a function of $X_1, X_2, \ldots, X_n$ but not of $\theta$. [] It is sufficient for $\theta$ if the conditional distribution of $X_1, X_2, \ldots, X_n$, given the value of T, does not depend on $\theta$. [] What this means is that T contains all the information about $\theta$ that is available in $X_1, X_2, \ldots, X_n$. []

(ii) The likelihood is $L(\theta; x) = \prod_{i=1}^{n} f(x_i; \theta) = \prod_{i=1}^{n} \exp\{A(\theta)B(x_i) + C(x_i) + D(\theta)\}$ [] $= \exp\{A(\theta)\sum_i B(x_i) + \sum_i C(x_i) + nD(\theta)\}$ [] [This final expression is not strictly needed in answering (ii), but is needed in (iii). It should be awarded a mark whether it appears in the answer to (ii) or to (iii).]

(iii) Write $K_1(T; \theta) = \exp\{A(\theta)\sum_i B(x_i) + nD(\theta)\}$ [] and $K_2(x) = \exp\{\sum_i C(x_i)\}$. [] Then $L(\theta; x) = K_1(T; \theta) K_2(x)$, where $T = \sum_i B(x_i)$ is a sufficient statistic for $\theta$. []

(iv) The likelihood function is x x L( ; x) exp [] exp{ A( ) B( x ) C( x ) D( )} where A( ), $B(x_i) = x_i^2$, $C(x_i) = \log(x_i)$ and D( ) log( ). [] Hence the distribution is a member of the one-parameter exponential family [] and $\sum_i B(x_i) = \sum_i x_i^2$ is a sufficient statistic for $\theta$. []

(v) A prior distribution represents knowledge about the probability distribution of an unknown parameter before any data are considered. [] Combining the prior distribution with the likelihood function gives the posterior distribution. [] A conjugate prior distribution is such that the resulting posterior distribution belongs to the same family as the prior distribution. []

The likelihood function for the Rayleigh distribution can be written as x x L( ; x) exp [] When multiplied by a prior distribution of the given form, we get a posterior distribution proportional to exp( ), where [] and x. [] This is of the same form as the prior distribution with updated parameter values [] so the family of prior distributions given is indeed conjugate. []

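4. (i) Suppose that we are testing a null hypothesis $H_0: \theta \in \omega$ against the alternative $H_1: \theta \in \Omega \setminus \omega$ []. The test statistic for a generalised likelihood ratio test of $H_0$ vs. $H_1$ is $\Lambda = \max_{\theta \in \omega} L(\theta; x) \,/\, \max_{\theta \in \Omega} L(\theta; x)$, [] where $L(\theta; x)$ is the likelihood function. [0.5] Reject $H_0$ for small values of $\Lambda$. [0.5]

(ii) For large samples [0.5] $-2\log\Lambda \sim \chi^2_d$, where d is the difference between the number of freely varying independent parameters in $\Omega$ and $\omega$. [0.5] For $H_0: \theta = \theta_0$ vs. $H_1: \theta \ne \theta_0$, d = 1 [] and $\Lambda = L(\theta_0; x) / L(\hat\theta; x)$, where $\hat\theta$ is the maximum likelihood estimator for $\theta$. [] $-2\log\Lambda = 2[l(\hat\theta; x) - l(\theta_0; x)]$, where $l(\theta; x)$ is the log-likelihood []. Using the approximation $\Pr[2(l(\hat\theta; x) - l(\theta_0; x)) \le \chi^2_{1;\alpha}] \approx 1 - \alpha$, where $\chi^2_{1;\alpha}$ is the upper $\alpha$ critical point of $\chi^2_1$, [] a confidence interval for $\theta$ with confidence coefficient $(1 - \alpha)$ is given by those values of $\theta$ for which $l(\hat\theta; x) - l(\theta; x) \le \tfrac{1}{2}\chi^2_{1;\alpha}$. []

(iii) For a random sample of n observations $x_1, x_2, \ldots, x_n$ from a Poisson distribution with mean $\theta$ the likelihood function is $\prod_{i=1}^{n} e^{-\theta}\theta^{x_i}/x_i! \propto e^{-n\theta}\theta^{\sum x_i}$. [] The log likelihood is $l(\theta; x) = \text{const} - n\theta + \sum x_i \log\theta$. [] Differentiating, equating the derivative to zero, [] and checking the second derivative to confirm a maximum, [0.5] gives the MLE as $\hat\theta = \sum x_i / n = \bar{x}$ []. The generalised likelihood ratio test statistic for testing $H_0: \theta = 5$ against a two-sided alternative is $-2\log\Lambda = 2[l(\hat\theta; x) - l(5; x)]$. []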

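$\hat\theta = \bar{x} = 27/9 = 3$, [0.5] $-2\log\Lambda = 2[27(\log 3 - \log 5) - 9(3 - 5)] = 2[-13.79 + 18] = 8.42$. [] This is well above the 5% critical value for $\chi^2_1$ (3.84) [0.5], so the null hypothesis is rejected. So there is evidence that the accident rate has changed [0.5] (actually decreased).

(iv) From part (ii), the endpoints of the interval satisfy the equation $l(\theta; x) - l(3; x) = -\tfrac{1}{2}\chi^2_{1;0.05}$ [] giving $(27\log\theta - 9\theta) - (27\log 3 - 9 \times 3) = -3.84/2 = -1.92$, [] which simplifies to $3\log\theta - \theta = 0.0825$ as required. []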
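The arithmetic in parts (iii) and (iv) can be checked numerically. The sketch below assumes the summary values reconstructed from the working above (n = 9 observations with total 27, null value 5, and the 5% critical value 3.84). The helper names `loglik` and `bisect` are illustrative only, and the interval endpoints are found by a simple bisection rather than by any method prescribed in the solution.

```python
import math

# Assumed summary statistics, read off the worked solution: n = 9, sum of x = 27.
n, total = 9, 27
theta_hat = total / n   # MLE of the Poisson mean (the sample mean)
theta_0 = 5.0           # value of the mean under H0

def loglik(theta):
    """Poisson log-likelihood, dropping the additive constant."""
    return total * math.log(theta) - n * theta

# Generalised likelihood ratio statistic, -2 log(Lambda).
lr_stat = 2 * (loglik(theta_hat) - loglik(theta_0))

CHI2_1_5PCT = 3.84  # upper 5% point of chi-squared on 1 d.f., as quoted in the text

def g(theta):
    """Zero at the endpoints of the likelihood-based 95% interval."""
    return loglik(theta) - loglik(theta_hat) + CHI2_1_5PCT / 2

def bisect(f, lo, hi, tol=1e-10):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lower = bisect(g, 0.5, theta_hat)    # root below the MLE
upper = bisect(g, theta_hat, 10.0)   # root above the MLE

print(f"-2 log Lambda = {lr_stat:.2f}")
print(f"approximate 95% interval: ({lower:.3f}, {upper:.3f})")
```

Both endpoints satisfy $3\log\theta - \theta = 0.0825$ to within rounding, and the null value 5 lies outside the interval, in agreement with the rejection of $H_0$ in part (iii).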

5. (a) Denote the vector of values $x_1, x_2, \ldots, x_n$ by x. Find a function $g(x; \theta)$ of x and $\theta$ which is monotonic in $\theta$ [] and whose probability distribution is known and does not depend on $\theta$. [] This is called a pivotal quantity. [] Make a probability statement regarding $g(x; \theta)$ i.e. $\Pr[g_1 \le g(x; \theta) \le g_2] = 1 - \alpha$ where $\alpha$ is fixed (typically 0.05 or 0.01) [] and $g_1$, $g_2$ do not depend on $\theta$. Now manipulate the inequalities in the probability statement to get $\theta$ in the middle [] i.e. $\Pr[\theta_1(X) \le \theta \le \theta_2(X)] = 1 - \alpha$. The interval $(\theta_1(X), \theta_2(X))$ is a $100(1-\alpha)\%$ confidence interval for $\theta$. [] [If the monotonicity condition is not mentioned, full marks can still be achieved provided that it is mentioned that the resulting confidence set need not be a single interval.]

(b) In the Bayesian framework, $\theta$ is considered to be a random variable [] and so has a probability distribution. A prior distribution is specified, [] before taking any data into account. The likelihood function represents the information about $\theta$ contained in the data x. [] The posterior distribution for $\theta$ is obtained by taking the product of the prior distribution and the likelihood function and normalising so that it integrates to 1. [] A $100(1-\alpha)\%$ credible interval for $\theta$ is given by $(\theta_1, \theta_2)$ such that $\Pr[\theta_1 \le \theta \le \theta_2] = 1 - \alpha$ according to the posterior distribution. []

(c) Suppose that an estimate $\hat\theta(x)$ based on $x = (x_1, x_2, \ldots, x_n)$ is available for $\theta$. [] Take a sample of size n with replacement from $x_1, x_2, \ldots, x_n$ - call it $x^*$ - and calculate $\hat\theta(x^*)$. [] Repeat the sampling with replacement (a large number) B times, to give B estimates of $\theta$. [] Arrange these B estimates in ascending order to give $\hat\theta^*_{[1]} \le \hat\theta^*_{[2]} \le \ldots \le \hat\theta^*_{[B]}$. [] Let $m = \alpha B/2$ (ideally choose B so that this is an integer). Then a $100(1-\alpha)\%$ bootstrap percentile confidence interval for $\theta$ is [] $(\hat\theta^*_{[m]}, \hat\theta^*_{[B-m+1]})$.

Interpretation of frequentist interval: the parameter is fixed but unknown; the interval is random. [] In many repetitions of finding $100(1-\alpha)\%$ confidence intervals, the intervals will contain the true value of the parameter $100(1-\alpha)\%$ of the time in the long run, but there is no information on whether any individual interval will do so. [] Interpretation of Bayesian interval: $\theta$ is considered to be a random variable so the interval is an interval between two quantiles of its posterior distribution. [] So in this case the end-points of the interval are fixed (not random, unlike the frequentist interval). []
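The bootstrap percentile procedure in part (c) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the data are made up, the sample mean is used as the estimator purely as an example, the function name `bootstrap_percentile_ci` is hypothetical, and the interval uses the ordered estimates numbered m and B - m + 1 (one common convention, with m = αB/2).

```python
import random
import statistics

def bootstrap_percentile_ci(data, estimator, alpha=0.05, B=2000, seed=0):
    """Percentile bootstrap interval for the quantity estimated by `estimator`."""
    rng = random.Random(seed)   # fixed seed so the sketch is reproducible
    n = len(data)
    # B bootstrap estimates, each from a resample of size n drawn with replacement.
    estimates = sorted(
        estimator([rng.choice(data) for _ in range(n)]) for _ in range(B)
    )
    m = round(B * alpha / 2)    # choose B so that B * alpha / 2 is an integer >= 1
    # Order statistics [m] and [B - m + 1] are at 0-based indices m - 1 and B - m.
    return estimates[m - 1], estimates[B - m]

# Made-up data, for illustration only.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]
lower, upper = bootstrap_percentile_ci(sample, statistics.mean)
print(f"95% bootstrap percentile interval for the mean: ({lower:.2f}, {upper:.2f})")
```

With B = 2000 and α = 0.05, m = 50, so the interval runs from the 50th to the 1951st ordered bootstrap estimate.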

6. (i) Let $F_0(x)$ be the c.d.f. of the null distribution. [] If the random sample is ordered as $X_{[1]} \le X_{[2]} \le \ldots \le X_{[n]}$ define $F_n(x) = 0$ if $x < X_{[1]}$; $F_n(x) = k/n$ if $X_{[k]} \le x < X_{[k+1]}$, $k = 1, 2, \ldots, n-1$; $F_n(x) = 1$ if $x \ge X_{[n]}$. [] Then the test statistic for the one-sample Kolmogorov-Smirnov test is $\sup_x |F_0(x) - F_n(x)|$. [] [The null hypothesis will be rejected for large values of the test statistic. If this is stated here by candidates, but not explicitly mentioned in their solution to (ii), 1 additional mark should be awarded here.]

(ii) Recognise that $1 - e^{-x/2}$ is the c.d.f. for an exponential distribution with mean 2 [] and that $|F_0(x) - F_n(x)|$ must reach its maximum value at or immediately below one of the observed values of x. [] It rises to 0.167 just below $x_{[1]}$, then drops to 0.033; it rises to 0.238 just below $x_{[2]}$, then drops to 0.038; rises to 0.359 just below $x_{[3]}$, then drops to 0.159; rises to 0.293 just below $x_{[4]}$, then drops to 0.093; rises to 0.099 just below $x_{[5]}$ before jumping to 0.101, and finally decreasing to zero. [2 marks if all these calculations are correct, 1 mark if method is clearly known but calculations are wrong.] So the test statistic has value 0.359. [] Large values of the test statistic lead to rejection of the null hypothesis [] and 0.359 is well short of the given critical values []. Thus there is insufficient evidence to reject $H_0$ []

(iii) The t-test is a test for the mean [] whereas the sign test is a test for the median. [] The distribution of the test statistic for the t-test assumes approximate normality for the data, which is clearly not the case here. [] The sign test is non-parametric and does not depend on the distribution of the data and so can be used here. []

(iv) If the distribution is exponential with mean 2, then its median m is found by solving $1 - e^{-m/2} = 0.5$ [] so $m = 2\log 2 = 1.386$ []. Hence test $H_0: m = 1.386$ against $H_1: m \ne 1.386$ []

The test statistic S is the number of observations greater than 1.386, which is 3. [] Since this is right in the centre of the distribution of S [B(5, 0.5)], there is clearly no evidence against the null hypothesis. [] [Although I'm not expecting it, I would award full marks if a candidate calculated Pr(X>2) = 0.368, and used a test statistic equal to the number of observations exceeding 2, with null distribution B(5, 0.368)].
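Both test statistics in this question can be reproduced from the null c.d.f. evaluated at the five ordered observations. The observations themselves are not repeated in the solution, so the sketch below assumes the values of $F_0$ at the ordered observations that are implied by the differences quoted in part (ii) (0.167, 0.438, 0.759, 0.893, 0.899). The names `ks_statistic` and `F0_AT_OBS` are illustrative.

```python
from math import comb

# Assumed values of the null c.d.f. at the ordered observations, inferred from
# the differences quoted in the solution; the raw data are not used here.
F0_AT_OBS = [0.167, 0.438, 0.759, 0.893, 0.899]

def ks_statistic(f0_values):
    """sup |F0 - Fn|, checked just below and at each ordered observation."""
    n = len(f0_values)
    d = 0.0
    for k, f0 in enumerate(f0_values, start=1):
        # Just below the k-th observation Fn = (k - 1)/n; at it, Fn = k/n.
        d = max(d, f0 - (k - 1) / n, k / n - f0)
    return d

D = ks_statistic(F0_AT_OBS)

# Sign test: an observation exceeds the null median exactly when F0 > 0.5.
n = len(F0_AT_OBS)
S = sum(f0 > 0.5 for f0 in F0_AT_OBS)

# Upper-tail probability P(S >= observed) under B(n, 0.5).
p_upper = sum(comb(n, k) * 0.5 ** n for k in range(S, n + 1))

print(f"Kolmogorov-Smirnov statistic: {D:.3f}")
print(f"Sign test: S = {S}, P(S >= {S}) = {p_upper:.3f}")
```

The upper-tail probability of one half reflects the remark that S = 3 sits in the centre of the B(5, 0.5) distribution.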

7. (a) If p is the probability of $H_0$ then $p/(1-p)$ is the odds of $H_0$. [] Given two simple hypotheses $H_0$ (null), $H_1$ (alternative), the Bayes factor is the likelihood under $H_0$ divided by the likelihood under $H_1$ [] [The reciprocal of this would also be acceptable, although the wording of the question should steer candidates towards this definition]. In Bayesian inference for a parameter $\theta$, the posterior distribution of $\theta$ is given by $q(\theta \mid x) = L(\theta; x)\,p(\theta)/h(x)$, where $p(\theta)$ is the prior distribution, $L(\theta; x)$ is the likelihood function, and $h(x)$ is the marginal distribution of the data. [] Suppose the simple hypotheses are $H_0: \theta = \theta_0$; $H_1: \theta = \theta_1$. Then Posterior odds $= \dfrac{q(\theta_0 \mid x)}{q(\theta_1 \mid x)} = \dfrac{L(\theta_0; x)\,p(\theta_0)/h(x)}{L(\theta_1; x)\,p(\theta_1)/h(x)}$ [] $= \dfrac{p(\theta_0)}{p(\theta_1)} \times \dfrac{L(\theta_0; x)}{L(\theta_1; x)}$ = prior odds x Bayes factor, as required. []

(b) (i) $\Pr(X = x) = \theta^x e^{-\theta}/x!$ [] so $L(\theta; x) = \theta^x e^{-\theta}/x!$. The Bayes factor is this likelihood evaluated at $\theta = 5$ divided by the likelihood evaluated at $\theta = 10$. The factorial terms cancel leaving $\dfrac{5^x e^{-5}}{10^x e^{-10}}$ [] $= e^{5}(0.5)^x$ []

(ii) Posterior odds will be greater than prior odds if the Bayes factor exceeds 1. [] This occurs if $e^{5}(0.5)^x > 1$ or $(0.5)^x > e^{-5}$; [] $x\log(0.5) > -5$; [] $x < -5/\log(0.5)$; [] $x < 5/\log(2)$, as required []

Prior distribution has p.d.f. 0. 0. e, 0 [0.5] and the likelihood function is $\theta^x e^{-\theta}/x!$ [0.5]. The likelihood for a composite hypothesis such as $H_1$ is obtained by integrating the product of this likelihood function and the prior distribution

over values of $\theta$ contained in the hypothesis, [] so x 0. x e 0.e 0. ( 0.) L( H; x) d e d x! x! [] 0 0 0. x!( 0.) ( x) ( x) ( 0.) ( 0.) e d ( x ) 0 [] The integral is in the form of the gamma function given in the hint with 0. ( x ) 0. and ( x ) [] so L( H ; x) ( x ) x!( 0.) the Bayes factor is x 5 e 5 x x 5 L( H0; x) e (5) L( H ; x) 0. ( x ) ( x ) x!( 0.) ( x ) [] and as required. []
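For the simple-versus-simple comparison in part (b)(i) and (ii), the Bayes factor and the threshold on x can be checked directly. The sketch assumes a single Poisson count x with means 5 under $H_0$ and 10 under $H_1$, as reconstructed from the working; `bayes_factor` is an illustrative name, not notation from the paper.

```python
import math

def bayes_factor(x, theta0=5.0, theta1=10.0):
    """Poisson likelihood at theta0 divided by that at theta1 for one count x.

    The x! terms cancel, leaving (theta0/theta1)**x * exp(theta1 - theta0).
    """
    return (theta0 / theta1) ** x * math.exp(theta1 - theta0)

# Posterior odds exceed prior odds exactly when the Bayes factor exceeds 1,
# which for these two means is equivalent to x < 5 / log(2).
threshold = 5 / math.log(2)

for x in range(6, 10):
    print(x, round(bayes_factor(x), 3), bayes_factor(x) > 1)
print(f"threshold on x: {threshold:.2f}")
```

Counts of 7 or fewer favour $H_0$ and counts of 8 or more favour $H_1$, consistent with the threshold lying between 7 and 8.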

8. (i) A strategy $d_i$ is inadmissible if there is another strategy $d_j$ for which $U(d_i, \theta_k) \le U(d_j, \theta_k)$ for all k, where $U(d_i, \theta_k)$ is the utility of strategy $d_i$ when state of nature $\theta_k$ holds, [] with at least one inequality strict []. Clearly $d_1$ is inadmissible (compare with any other strategy), as is $d_2$ (compare with $d_3$, $d_4$). []. A maximin strategy is one for which the minimum utility is maximised. [] The minimum utility is 1.0, 0.5, -4.5 for $d_3$, $d_4$, $d_5$ respectively (inadmissible strategies don't need to be considered, but candidates will not be penalised if they do so). So $d_3$ is maximin. []

(ii) The Bayes strategy is the one which maximises expected utility, [] where expectation is taken with respect to the prior distribution of the states of nature. [] Prior probabilities of $\theta_1, \theta_2, \theta_3, \theta_4$ are 0.4, 0.4, 0.1, 0.1 respectively [], and the expected utilities for the three admissible strategies are: $d_3$: 0.4x1 + 0.4x1 + 0.1x2 + 0.1x2 = 1.2; $d_4$: 0.4x0.5 + 0.4x0.5 + 0.1x2.5 + 0.1x2.5 = 0.9; $d_5$: -0.4x4.5 + 0.4x1 - 0.1x1.5 + 0.1x4 = -1.15. [] So $d_3$ is the Bayes strategy. []

(iii) Expected utilities are $d_3$: $\pi + 2(1-\pi) = 2 - \pi$; $d_4$: $0.5\pi + 2.5(1-\pi) = 2.5 - 2\pi$; $d_5$: $-1.75\pi + 1.25(1-\pi) = 1.25 - 3\pi$ [] It is fairly clear that the expected utility for $d_5$ is smaller than that for $d_3$ for any value of $\pi$ between 0 and 1, so $d_5$ is never Bayes. [] $d_3$ is better than $d_4$ if $2 - \pi > 2.5 - 2\pi$ i.e. $\pi > 0.5$. So $d_3$ is Bayes when $\pi > 0.5$ and $d_4$ is Bayes for $\pi < 0.5$ [] (Both are equally good at $\pi = 0.5$).

(iv) Posterior probability for $\theta_k$ given advice of small demand, is the product of the prior probability and the probability of small demand, given such advice, divided by the probability of such advice. []

For $\theta_1$ this is 0.4x0.9 = 0.36, divided by the probability of 'small' advice. For $\theta_2$, $\theta_3$, $\theta_4$, the corresponding values are 0.04, 0.09, 0.01, each divided by the same probability. [] For $d_3$ the expected utility is now (0.36+0.04)+2(0.09+0.01) = 0.6, again divided by the probability of 'small' advice. [] Candidates could similarly calculate expected utilities for $d_4$, $d_5$, but it would be equally acceptable to say that, looking at the table of utilities and the dominance of the prior probability for $\theta_1$, it is clear that the expected utilities for $d_4$, $d_5$ will be less than that for $d_3$ [] so $d_3$ is Bayes when the advice is for small demand. [] The expected utility if the advice is used is (Expected utility when advice is 'large')x(probability advice is 'large') + (Expected utility when advice is 'small')x(probability advice is 'small') = 0.6625 + 0.6 = 1.2625, since the probabilities of each type of advice cancel. [] In the absence of advice, part (ii) showed that the optimal strategy was $d_3$ with expected utility 1.2. So the advice increases expected utility by 0.0625 i.e. 62500 compared to a fee of 50000, so the gain from using the consultancy firm's advice is 12500. []
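The maximin and Bayes calculations for the three admissible strategies can be checked with a short script. The utility table is not reproduced in the solution, so the values below are an assumption: they are the utilities implied by the expected-utility arithmetic in parts (i) to (iii), with prior probabilities 0.4, 0.4, 0.1, 0.1. The two contributions 0.6625 and 0.6 to the expected utility with advice are taken from part (iv).

```python
# Assumed utilities U(d, theta_k) for k = 1..4, inferred from the working above.
UTILITIES = {
    "d3": [1.0, 1.0, 2.0, 2.0],
    "d4": [0.5, 0.5, 2.5, 2.5],
    "d5": [-4.5, 1.0, -1.5, 4.0],
}
PRIOR = [0.4, 0.4, 0.1, 0.1]

def expected_utility(utilities, probs):
    """Expected utility of one strategy under the given state probabilities."""
    return sum(p * u for p, u in zip(probs, utilities))

# Maximin: maximise the worst-case utility.
maximin = max(UTILITIES, key=lambda d: min(UTILITIES[d]))

# Bayes: maximise the prior expected utility.
expected = {d: expected_utility(u, PRIOR) for d, u in UTILITIES.items()}
bayes = max(expected, key=expected.get)

# Value of the advice: the two terms quoted in part (iv) minus the best
# expected utility available without advice.
with_advice = 0.6625 + 0.6
gain_in_utility = with_advice - expected[bayes]

print("maximin strategy:", maximin)
print("expected utilities:", {d: round(v, 3) for d, v in expected.items()})
print("Bayes strategy:", bayes)
print("increase in expected utility from advice:", round(gain_in_utility, 4))
```

Under these assumed utilities $d_3$ is both the maximin and the Bayes strategy, and the advice raises the expected utility by 0.0625.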