THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA

THE ROYAL STATISTICAL SOCIETY

EXAMINATIONS SOLUTIONS

GRADUATE DIPLOMA PAPER II

STATISTICAL THEORY & METHODS

The Society provides these solutions to assist candidates preparing for the examinations in future years and for the information of any other persons using the examinations.

The solutions should NOT be seen as "model answers". Rather, they have been written out in considerable detail and are intended as learning aids.

Users of the solutions should always be aware that in many cases there are valid alternative methods. Also, in the many cases where discussion is called for, there may be other valid points that could be made.

While every care has been taken with the preparation of these solutions, the Society will not be responsible for any errors or omissions.

The Society will not enter into any correspondence in respect of these solutions.

RSS

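Graduate Diploma, Statistical Theory & Methods, Paper II, Question 1

(i) Let $X_i$ denote the number of breakages in the $i$th chromosome. Then the likelihood function is

$$L(\theta) = \prod_{i=1}^{33} \frac{e^{-\theta}\theta^{x_i}}{x_i!\,(1-e^{-\theta})} = \frac{e^{-33\theta}\,\theta^{\sum x_i}}{(1-e^{-\theta})^{33}\,\prod_{i=1}^{33} x_i!}, \quad \text{for } \theta > 0.$$

Given that $\sum_{i=1}^{33} x_i = 122$,

$$l = \ln L = -33\theta - 33\ln(1-e^{-\theta}) + 122\ln\theta - \sum_{i=1}^{33}\ln(x_i!).$$

So

$$\frac{dl}{d\theta} = -33 - \frac{33e^{-\theta}}{1-e^{-\theta}} + \frac{122}{\theta} = -\frac{33}{1-e^{-\theta}} + \frac{122}{\theta}.$$

With the usual regularity conditions, the maximum likelihood estimate $\hat\theta$ satisfies $\dfrac{dl}{d\theta} = 0$, i.e.

$$-\frac{33}{1-e^{-\hat\theta}} + \frac{122}{\hat\theta} = 0.$$

(ii)

$$\frac{d^2l}{d\theta^2} = \frac{33e^{-\theta}}{(1-e^{-\theta})^2} - \frac{122}{\theta^2}.$$

An iterative algorithm for finding $\hat\theta$ numerically is given by

$$\theta_{n+1} = \theta_n - \left.\frac{dl}{d\theta}\right|_{\theta=\theta_n} \Big/ \left.\frac{d^2l}{d\theta^2}\right|_{\theta=\theta_n}.$$

An initial estimate could be found by plotting $l(\theta)$ [or $L(\theta)$] against $\theta$. Alternatively, it is often satisfactory to use the estimator for a non-truncated Poisson, which here would be $\theta_0 = 122/33 = 3.70$.

(iii) Using the given value $\hat\theta = 3.6$,

$$P(X = 1) = \frac{3.6e^{-3.6}}{1-e^{-3.6}} = \frac{0.0984}{0.9727} = 0.1011.$$

Since $P(X = k) = \dfrac{(3.6)^k e^{-3.6}}{k!\,(1-e^{-3.6})}$, we have $P(X = 2) = \dfrac{3.6}{2}P(X = 1) = 0.1820$.

Similarly, $P(X = 3) = 0.2184$, $P(X = 4) = 0.1966$, $P(X = 5) = 0.1415$. Hence $P(X \ge 6) = 0.1603$.

[Note. These probabilities are accurate to 4 dp, but there is slight rounding in the expected frequencies below.]

x           1      2      3      4      5      ≥6     TOTAL
observed    11     6      4      5      0      7      33
expected    3.34   6.01   7.21   6.49   4.67   5.28   33

Comparing the observed and expected frequencies, the $\chi^2$ test will have 4 df (6 cells, less 1 for the fixed total and 1 since $\theta$ had to be estimated). The test statistic is

$$X^2 = \frac{(11-3.34)^2}{3.34} + \frac{(6-6.01)^2}{6.01} + \frac{(4-7.21)^2}{7.21} + \frac{(5-6.49)^2}{6.49} + \frac{(0-4.67)^2}{4.67} + \frac{(7-5.28)^2}{5.28} = 24.57.$$

This is very highly significant as an observation from $\chi^2_4$, i.e. there is very strong evidence against the null hypothesis of a truncated Poisson distribution.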
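The Newton-Raphson iteration and the goodness-of-fit calculation in Question 1 can be checked numerically. The following is a minimal Python sketch, not part of the Society's solution. It assumes the figures read from the solution (33 chromosomes, 122 breakages in total, observed frequencies 11, 6, 4, 5, 0, 7); the function names are hypothetical.

```python
import math

N_OBS = 33    # number of chromosomes (figure taken from the solution)
SUM_X = 122   # total number of breakages (figure taken from the solution)

def score(theta):
    """First derivative of the log-likelihood, dl/dtheta."""
    return -N_OBS / (1.0 - math.exp(-theta)) + SUM_X / theta

def score_derivative(theta):
    """Second derivative of the log-likelihood, d2l/dtheta2."""
    e = math.exp(-theta)
    return N_OBS * e / (1.0 - e) ** 2 - SUM_X / theta ** 2

def newton_raphson(theta0, tol=1e-10, max_iter=50):
    """Iterate theta <- theta - l'(theta) / l''(theta) until the step is tiny."""
    theta = theta0
    for _ in range(max_iter):
        step = score(theta) / score_derivative(theta)
        theta -= step
        if abs(step) < tol:
            break
    return theta

def truncated_poisson_pmf(k, theta):
    """P(X = k) for a Poisson distribution truncated at zero, k = 1, 2, ..."""
    return theta ** k * math.exp(-theta) / (
        math.factorial(k) * (1.0 - math.exp(-theta))
    )

# Start from the non-truncated Poisson estimate 122/33.
theta_hat = newton_raphson(SUM_X / N_OBS)

# Goodness of fit at the given value 3.6, cells x = 1, ..., 5 and x >= 6.
observed = [11, 6, 4, 5, 0, 7]
probs = [truncated_poisson_pmf(k, 3.6) for k in range(1, 6)]
probs.append(1.0 - sum(probs))
expected = [N_OBS * p for p in probs]
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(round(theta_hat, 3), [round(e, 2) for e in expected], round(chi_sq, 2))
```

Because the expected frequencies are not rounded to 2 dp here, the statistic differs slightly from the hand-calculated 24.57, but the conclusion is unchanged.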

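Graduate Diploma, Statistical Theory & Methods, Paper II, Question 2

(a) Suppose that the data $\mathbf{x} = (x_1, x_2, \ldots, x_n)^T$ have joint probability density (or mass) function $f(\mathbf{x}, \theta)$, $\theta$ being an unknown parameter. The loss $L\{\delta(\mathbf{x}), \theta\}$ of a decision rule $\delta$ is the loss associated with choosing that decision. The risk of $\delta$ is $R_\delta(\theta) = E_{\mathbf{X}}\left[L\{\delta(\mathbf{X}), \theta\}\right]$; and the Bayes risk is

$$r_\pi(\delta) = E_\pi\left[R_\delta(\theta)\right] = \int R_\delta(\theta)\,\pi(\theta)\,d\theta,$$

in which $\pi$ is the prior distribution of $\theta$.

A prior distribution which leads to posterior distributions in the same family is called conjugate.

(i) The prior distribution of $\theta$ is $\pi(\theta) \propto \theta^{\alpha-1}(1-\theta)^{\beta-1}$, $0 < \theta < 1$. $X$ is binomial, $X$ being the number of seeds germinating out of $n$. Hence the posterior distribution of $\theta$ is

$$\pi(\theta \mid x) \propto \theta^{\alpha-1}(1-\theta)^{\beta-1}\,\theta^{x}(1-\theta)^{n-x} = \theta^{\alpha+x-1}(1-\theta)^{\beta+n-x-1}, \quad 0 < \theta < 1.$$

This is beta with parameters $\alpha + x$ and $\beta + n - x$. Therefore

$$\pi(\theta \mid x) = \frac{\Gamma(\alpha+\beta+n)}{\Gamma(\alpha+x)\,\Gamma(\beta+n-x)}\,\theta^{\alpha+x-1}(1-\theta)^{\beta+n-x-1}, \quad \text{for } 0 < \theta < 1.$$

(ii) With a quadratic loss function, the Bayes estimate of $\theta$ is equal to the mean of $\theta$ under the posterior distribution. This is

$$E(\theta \mid x) = \int_0^1 \frac{\Gamma(\alpha+\beta+n)}{\Gamma(\alpha+x)\,\Gamma(\beta+n-x)}\,\theta^{\alpha+x}(1-\theta)^{\beta+n-x-1}\,d\theta = \frac{\Gamma(\alpha+\beta+n)\,\Gamma(\alpha+x+1)}{\Gamma(\alpha+x)\,\Gamma(\alpha+\beta+n+1)} = \frac{\alpha+x}{\alpha+\beta+n}.$$

(iii) If the decision with loss $c\theta^2$ is taken, the posterior expected loss is

$$cE(\theta^2 \mid x) = c\int_0^1 \frac{\Gamma(\alpha+\beta+n)}{\Gamma(\alpha+x)\,\Gamma(\beta+n-x)}\,\theta^{\alpha+x+1}(1-\theta)^{\beta+n-x-1}\,d\theta = c\,\frac{\Gamma(\alpha+\beta+n)\,\Gamma(\alpha+x+2)}{\Gamma(\alpha+x)\,\Gamma(\alpha+\beta+n+2)} = c\,\frac{(\alpha+x)(\alpha+x+1)}{(\alpha+\beta+n)(\alpha+\beta+n+1)}.$$

Since the loss under the alternative decision is 1, choose the alternative decision if $cE(\theta^2 \mid x) > 1$, i.e. if

$$\frac{\alpha+x}{\alpha+\beta+n} > \frac{\alpha+\beta+n+1}{c(\alpha+x+1)}.$$

A uniform prior has $\alpha = 1$, $\beta = 1$. Hence for $n = 15$, $x = 10$ and $c = 2.5$, choose the alternative decision since $\dfrac{11}{17} > \dfrac{18}{2.5 \times 12}$ $(0.647 > 0.6)$.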
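The comparison of posterior expected losses at the end of Question 2 can be verified with exact arithmetic. This is a small sketch only: it assumes, as read from the solution, a beta-binomial model with a uniform prior, n = 15, x = 10, c = 2.5, one decision with loss c·θ² and the other with constant loss 1. The function and variable names are hypothetical.

```python
from fractions import Fraction

def posterior_mean(alpha, beta, n, x):
    """E(theta | x) for a Beta(alpha + x, beta + n - x) posterior."""
    return Fraction(alpha + x, alpha + beta + n)

def posterior_second_moment(alpha, beta, n, x):
    """E(theta^2 | x) for the same beta posterior."""
    a = alpha + x
    s = alpha + beta + n
    return Fraction(a * (a + 1), s * (s + 1))

# Values as read from the solution (uniform prior: alpha = beta = 1).
alpha, beta = 1, 1
n, x = 15, 10
c = Fraction(5, 2)

bayes_estimate = posterior_mean(alpha, beta, n, x)
expected_loss = c * posterior_second_moment(alpha, beta, n, x)

# The constant-loss decision is preferred when c * E(theta^2 | x) exceeds 1.
choose_constant_loss = expected_loss > 1

print(bayes_estimate, expected_loss, choose_constant_loss)
```

Using `Fraction` avoids any rounding in the comparison, which matters when the two sides are as close as 0.647 and 0.6.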

Graduate Diploma, Statistical Theory & Methods, Paper II, Question 3

(i) Suppose that the data consist of pairs $(x_i, y_i)$ (for $i = 1$ to $n$) of observations taken on $n$ units from a population. Let the ranks of the $\{x_i\}$ be $\{v_i\}$ and those of the $\{y_i\}$ be $\{w_i\}$, for $i = 1$ to $n$. Define $d_i = v_i - w_i$ (for $i = 1$ to $n$).

Spearman's rank correlation coefficient $r_s$ is the product-moment correlation coefficient of the ranks $(v_i, w_i)$ for $i = 1$ to $n$. It may be calculated as

$$r_s = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2-1)}.$$

(ii)

Observation   1  2  3  4  5  6  7
Rank A        1  2  3  4  5  6  7
Rank B        1  2  3  4  5  7  6

$\sum d^2 = 2$.

There are $7!$ possible rankings altogether. We need to find the number of ways in which a value of $\sum d^2 \le 2$ can arise. Keeping the A ranking fixed, the B ranking could be

1 2 3 4 5 6 7
2 1 3 4 5 6 7
1 3 2 4 5 6 7
1 2 4 3 5 6 7
1 2 3 5 4 6 7
1 2 3 4 6 5 7
1 2 3 4 5 7 6

This is 7 ways out of $7!$ for the B ranking, i.e. the probability (p-value) is $\dfrac{7}{7!} = \dfrac{1}{6!} = \dfrac{1}{720}$.

(iii) [Table: ranks of X and Y and their differences d for environments 1 to 8.]

$\sum d^2 = 40$.

$$r_s = 1 - \frac{6 \times 40}{8 \times 63} = 1 - \frac{30}{63} = \frac{33}{63} = 0.5238.$$

The 5% critical value of $r_s$ for $n = 8$ is 0.738. Hence there is no evidence of association (at the 5% level).

[Note. The 5% critical value is wrongly quoted in Table XVI in some copies of the Society's Abridged Tables for Examination Candidates. Candidates were, of course, not penalised in the examination.]
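The exact p-value in Question 3 can be confirmed by brute-force enumeration of all 7! rankings, and the value of r_s by direct substitution. The sketch below is illustrative only; it takes the sum of squared differences (40) and n = 8 from the solution, and the helper names are hypothetical.

```python
from itertools import permutations
from math import factorial

def sum_sq_diff(rank_a, rank_b):
    """Sum of squared rank differences between two rankings."""
    return sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))

def spearman_from_d2(d2, n):
    """Spearman's r_s from the sum of squared rank differences."""
    return 1 - 6 * d2 / (n * (n * n - 1))

# Count the B rankings giving a sum of squared differences of at most 2,
# keeping the A ranking fixed at 1, ..., 7.
n = 7
rank_a = tuple(range(1, n + 1))
count = sum(
    1 for perm in permutations(rank_a) if sum_sq_diff(rank_a, perm) <= 2
)
p_value = count / factorial(n)

# r_s for the second data set, using the summary value from the solution.
r_s = spearman_from_d2(40, 8)

print(count, p_value, round(r_s, 4))
```

The enumeration visits only 5040 permutations, so it runs instantly; for much larger n a tabulated or approximate null distribution would be needed instead.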

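Graduate Diploma, Statistical Theory & Methods, Paper II, Question 4

The power of a test is the probability of rejecting the null hypothesis, expressed as a function of the parameter under investigation. If both the significance level of the test and the power required at a particular value of the parameter are specified, then a lower bound for the necessary sample size can be determined.

(i) The likelihood function is

$$L(\theta) = \prod_{i=1}^{n} 2\theta x_i e^{-\theta x_i^2} = (2\theta)^n \left(\prod_{i=1}^{n} x_i\right) e^{-\theta\sum x_i^2}, \quad \theta > 0,$$

and so the likelihood ratio for testing $H_0: \theta = \theta_0$ against $H_1: \theta = \theta_1$ (where $\theta_0 > \theta_1$) is

$$\Lambda = \frac{L(\theta_0)}{L(\theta_1)} = \left(\frac{\theta_0}{\theta_1}\right)^n \exp\left\{-(\theta_0-\theta_1)\sum_{i=1}^{n} x_i^2\right\}.$$

By the Neyman-Pearson lemma, the most powerful test has critical region $C = \left\{\mathbf{x} : \sum_{i=1}^{n} x_i^2 \ge k\right\}$, $k$ being chosen to give significance level $\alpha$ for the test.

(ii)

$$P(X > x) = \int_x^\infty 2\theta t e^{-\theta t^2}\,dt = \left[-e^{-\theta t^2}\right]_x^\infty = e^{-\theta x^2}, \quad x > 0.$$

Hence $P(X^2 > x) = P(X > x^{1/2}) = e^{-\theta x}$, $x > 0$, so $X^2 \sim \text{Exp}(\theta)$.

(iii) Using the given result, under $H_0$ $[\theta = 0.5]$, $2\theta\sum_{i=1}^{50} X_i^2 = \sum_{i=1}^{50} X_i^2 \sim \chi^2_{100}$, with 1% point 135.81. Thus $P\left(\sum_{i=1}^{50} X_i^2 \ge 135.81 \mid \theta = 0.5\right) = 0.01$, and the test therefore rejects $H_0$ if $\sum_{i=1}^{50} x_i^2 \ge 135.81$.

(iv) Under $H_1$ $[\theta = 0.25]$, $0.5\sum_{i=1}^{50} X_i^2 \sim \chi^2_{100}$, and so the power of the test is

$$P\left(\sum_{i=1}^{50} X_i^2 \ge 135.81 \,\Big|\, \theta = 0.25\right) = P\left(0.5\sum_{i=1}^{50} X_i^2 > 67.905\right) = P\left(\chi^2_{100} > 67.905\right) \approx 0.995.$$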
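The size and power in Question 4 are chi-squared tail probabilities, which for even degrees of freedom can be computed exactly from a finite Poisson sum. The sketch below is a check under stated assumptions, not the Society's working: it assumes n = 50 observations, a null value of 0.5, an alternative value of 0.25 and the critical value 135.81, as read from the solution. The function name is hypothetical.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(chi-squared with df degrees of freedom > x), valid for even df only.

    Uses the identity P(chi2_{2m} > x) = sum_{j=0}^{m-1} exp(-x/2) (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = math.exp(-half)      # j = 0 term of the Poisson sum
    total = term
    for j in range(1, df // 2):
        term *= half / j        # builds (x/2)^j / j! recursively
        total += term
    return total

n = 50
theta0, theta1 = 0.5, 0.25     # values assumed from the solution
k = 135.81                     # critical value for the sum of squares

# 2 * theta * (sum of squares) has a chi-squared distribution with 2n df.
size = chi2_upper_tail_even_df(2 * theta0 * k, 2 * n)
power = chi2_upper_tail_even_df(2 * theta1 * k, 2 * n)

print(round(size, 4), round(power, 4))
```

The exact power comes out a little below the tabulated 0.995, which is consistent with reading the value from percentage-point tables.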

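Graduate Diploma, Statistical Theory & Methods, Paper II, Question 5

Given a random sample $\mathbf{x} = (x_1, x_2, \ldots, x_n)^T$ from a distribution whose pdf contains a parameter $\theta$, the likelihood function for this sample is $L(\theta) = \prod_{i=1}^{n} f(x_i, \theta)$, considered as a function of $\theta$. The maximum likelihood estimator, $\hat\theta$, of $\theta$ is the value of $\theta$ that maximises $L$.

For large samples, under standard regularity conditions, $\hat\theta \sim \text{approx } N\left(\theta, \dfrac{1}{I(\theta)}\right)$, where $I(\theta) = E\left[-\dfrac{d^2l}{d\theta^2}\right]$ is Fisher's "information function" and $l(\theta) = \ln L(\theta)$. [$1/I(\theta)$ is the Cramér-Rao lower bound for the variance of an unbiased estimator.] $\hat\theta$ is consistent and asymptotically unbiased.

(i) $E(\bar X) = E(X) = \sqrt\theta$ and $\text{Var}(\bar X) = \dfrac{\text{Var}(X)}{n} = \dfrac{\sqrt\theta}{n}$. So

$$E(\bar X^2) = \text{Var}(\bar X) + \{E(\bar X)\}^2 = \frac{\sqrt\theta}{n} + \theta.$$

Hence $E(\hat\theta) = \theta + \dfrac{\sqrt\theta}{n} - \dfrac{\sqrt\theta}{n} = \theta$, and $\hat\theta$ is an unbiased estimator.

(ii)

$$L(\theta) = \prod_{i=1}^{n} \frac{e^{-\sqrt\theta}(\sqrt\theta)^{x_i}}{x_i!} = \frac{e^{-n\sqrt\theta}(\sqrt\theta)^{\sum x_i}}{\prod x_i!},$$

so

$$l(\theta) = \ln L(\theta) = -n\sqrt\theta + \sum x_i \ln(\sqrt\theta) - \sum \ln(x_i!)$$

and

$$\frac{dl}{d\theta} = -\frac{n}{2\sqrt\theta} + \frac{\sum x_i}{2\theta}, \quad \theta > 0.$$

$$\frac{d^2l}{d\theta^2} = \frac{n}{4\theta^{3/2}} - \frac{\sum x_i}{2\theta^2} \quad \text{and} \quad E\left[-\frac{d^2l}{d\theta^2}\right] = -\frac{n}{4\theta^{3/2}} + \frac{nE(X)}{2\theta^2} = \frac{n}{4\theta^{3/2}}.$$

Thus $I(\theta) = \dfrac{n}{4\theta^{3/2}}$ and so the Cramér-Rao lower bound is $\dfrac{4\theta^{3/2}}{n}$.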

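(iii) When $n = 1$, $\hat\theta = X^2 - X$.

$$E(\hat\theta^2) = E\left[(X^2 - X)^2\right] = \sum_{k=0}^{\infty} (k^2-k)^2\,\frac{e^{-\sqrt\theta}(\sqrt\theta)^k}{k!} = \sum_{k=2}^{\infty} k(k-1)\,\frac{e^{-\sqrt\theta}(\sqrt\theta)^k}{(k-2)!}$$

$$= \theta\sum_{j=0}^{\infty} (j+2)(j+1)\,\frac{e^{-\sqrt\theta}(\sqrt\theta)^j}{j!} = \theta\,E\left[X^2 + 3X + 2\right], \quad \text{putting } j = k - 2.$$

Since $E[X] = \sqrt\theta$ and $E[X^2] = \sqrt\theta + \theta$,

$$E(\hat\theta^2) = \theta\left(\theta + \sqrt\theta + 3\sqrt\theta + 2\right) = \theta^2 + 4\theta^{3/2} + 2\theta,$$

so that $\text{Var}(\hat\theta) = 4\theta^{3/2} + 2\theta$.

Hence the efficiency of $\hat\theta$ is

$$\frac{\text{CRLB}}{\text{Var}(\hat\theta)} = \frac{4\theta^{3/2}}{4\theta^{3/2} + 2\theta} = \frac{2\sqrt\theta}{2\sqrt\theta + 1};$$

it $\to 1$ as $\theta \to \infty$.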
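The variance and efficiency obtained for n = 1 in Question 5 can be checked by summing the Poisson series numerically. This sketch assumes, as reconstructed from the solution, that X is Poisson with mean equal to the square root of θ and that the estimator is X² − X; the function names and the truncation point `k_max` are hypothetical choices.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability P(X = k), computed on the log scale for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def estimator_mean_and_variance(theta, k_max=200):
    """Mean and variance of X^2 - X when X is Poisson with mean sqrt(theta).

    The infinite series is truncated at k_max, which is ample for moderate theta.
    """
    lam = math.sqrt(theta)
    mean = sum(k * (k - 1) * poisson_pmf(k, lam) for k in range(k_max))
    second = sum((k * (k - 1)) ** 2 * poisson_pmf(k, lam) for k in range(k_max))
    return mean, second - mean ** 2

def efficiency(theta):
    """Closed-form efficiency 2*sqrt(theta) / (2*sqrt(theta) + 1) for n = 1."""
    root = math.sqrt(theta)
    return 2 * root / (2 * root + 1)

theta = 4.0
mean, variance = estimator_mean_and_variance(theta)
crlb = 4 * theta ** 1.5        # Cramer-Rao lower bound with n = 1

print(mean, variance, crlb / variance, efficiency(theta))
```

At θ = 4 the series should reproduce the closed forms: the mean equals θ (unbiasedness) and the ratio of the bound to the variance matches the efficiency formula.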

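Graduate Diploma, Statistical Theory & Methods, Paper II, Question 6

The opening part of this question is standard bookwork regarding the relationship between statistical tests and confidence sets.

(i) Given

$$f(x, w) = \frac{n(n-1)}{\theta^2}\,e^{-n(x-\mu)/\theta}\,e^{-w/\theta}\left(1-e^{-w/\theta}\right)^{n-2}$$

[where $x \equiv x_{(1)}$] for $\mu < x < \infty$ and $0 < w < \infty$, we have

$$f_W(w) = \frac{n(n-1)}{\theta^2}\,e^{-w/\theta}\left(1-e^{-w/\theta}\right)^{n-2}\int_\mu^\infty e^{-n(x-\mu)/\theta}\,dx$$

$$= \frac{n-1}{\theta^2}\,e^{-w/\theta}\left(1-e^{-w/\theta}\right)^{n-2}\int_0^\infty e^{-v/\theta}\,dv, \quad \text{putting } v = n(x-\mu), \text{ so } dv = n\,dx,$$

$$= \frac{n-1}{\theta^2}\,e^{-w/\theta}\left(1-e^{-w/\theta}\right)^{n-2}\left[-\theta e^{-v/\theta}\right]_{v=0}^{\infty} = \frac{n-1}{\theta}\,e^{-w/\theta}\left(1-e^{-w/\theta}\right)^{n-2}.$$

Therefore

$$P(W \le w) = \int_0^w \frac{n-1}{\theta}\,e^{-y/\theta}\left(1-e^{-y/\theta}\right)^{n-2}dy = \left[\left(1-e^{-y/\theta}\right)^{n-1}\right]_0^w = \left(1-e^{-w/\theta}\right)^{n-1}, \quad 0 < w < \infty.$$

(ii) Let $Z = W/\theta$. Then

$$F_Z(z) = P(Z \le z) = P(W \le \theta z) = \left(1-e^{-z}\right)^{n-1}, \quad 0 < z < \infty.$$

$Z$ is a function of $\theta$ whose distribution does not depend on $\theta$. Hence it is a pivotal quantity.

(iii) Choose any interval $[z_1, z_2]$, where $z_1 < z_2$, such that $\int_{z_1}^{z_2} f_Z(z)\,dz = 1 - \alpha$ for $0 < \alpha < 1$. Then, given the range $W = w$, we have $z_1 \le \dfrac{w}{\theta} \le z_2$, and a $100(1-\alpha)\%$ confidence interval for $\theta$ is $\left(\dfrac{w}{z_2}, \dfrac{w}{z_1}\right)$.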
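One concrete choice of interval in Question 6 is the equal-tailed one, for which the limits come from inverting the distribution function of the pivot. The sketch below assumes the pivot Z = W/θ has distribution function (1 − e^(−z))^(n−1), which is the standard result for the range of n exponential observations; the sample values w = 3.0 and n = 10 are hypothetical and are not taken from the question.

```python
import math

def z_quantile(p, n):
    """Quantile of Z = W/theta, whose cdf is (1 - exp(-z)) ** (n - 1), n >= 2."""
    return -math.log(1.0 - p ** (1.0 / (n - 1)))

def range_confidence_interval(w, n, alpha=0.05):
    """Equal-tailed 100(1 - alpha)% interval (w / z2, w / z1) for theta."""
    z1 = z_quantile(alpha / 2, n)       # lower alpha/2 point of the pivot
    z2 = z_quantile(1 - alpha / 2, n)   # upper alpha/2 point of the pivot
    return w / z2, w / z1

# Hypothetical illustration: observed range 3.0 from a sample of size 10.
lower, upper = range_confidence_interval(3.0, 10)
print(round(lower, 3), round(upper, 3))
```

The equal-tailed choice is only one valid pair of limits; any pair enclosing probability 1 − α gives a valid interval, and a shorter interval may exist.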

Graduate Diploma, Statistical Theory & Methods, Paper II, Question 7

Classical (or "frequentist")

A null hypothesis will specify a model for data, based on a distribution in which there is an unknown parameter; an alternative hypothesis uses the same distribution with different values for the parameter. For example, a null hypothesis can use the model $N(\mu_0, 1)$ with the alternative $N(\mu_1, 1)$.

Given the model, a test can be set up with a given probability of rejecting the null hypothesis, for example if a sample mean is "unlikely" to take the value it did in the data, where "unlikely" might mean a probability of less than 0.05. In this case the alternative hypothesis is automatically accepted (even when the null hypothesis is in fact true).

The null hypothesis is never "proved", and even with large samples of data there is a measurable chance of making Types I and II errors. It is evidence, not proof, for or against a null hypothesis that is obtained in this method, and misinterpretation is easy in unskilled hands.

This remains the most commonly used method of hypothesis testing.

Bayesian

It is unusual to test a simple null hypothesis. But after calculating a confidence interval, a testing process may be carried out by rejecting a null hypothesis that $\theta = \theta_0$ if a $100(1-\alpha)\%$ confidence interval for $\theta$ does not contain $\theta_0$.

Probabilities can be assigned to opposing hypotheses, and costs can be introduced into this process, much more easily than in others.

Likelihood

If $\theta_0$ does not have a likelihood within a certain distance of the maximum likelihood (i.e. the likelihood for the maximum likelihood estimator $\hat\theta$) found from the sample data, the null hypothesis that $\theta = \theta_0$ is rejected.

This method depends on using likelihood as a measure of how plausible various values of $\theta$ are. The distance from the maximum is sometimes chosen rather arbitrarily.

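Graduate Diploma, Statistical Theory & Methods, Paper II, Question 8

(i) The pdf of $X$ and $Y$ is

$$f(x, y) = \frac{1}{\theta\phi}\,e^{-\left(\frac{x}{\theta} + \frac{y}{\phi}\right)} \quad \text{for } x, y > 0.$$

So

$$P(Y < X) = \int_0^\infty\!\!\int_y^\infty f(x, y)\,dx\,dy = \int_0^\infty \frac{1}{\phi}\,e^{-y/\phi}\,e^{-y/\theta}\,dy = \int_0^\infty \frac{1}{\phi}\,e^{-\frac{(\theta+\phi)y}{\theta\phi}}\,dy = \frac{\theta}{\theta+\phi}.$$

(ii) The likelihood function based on $n$ observations of $w$ is

$$L(\psi) = \prod_{i=1}^{n} \psi^{w_i}(1-\psi)^{1-w_i} = \psi^{\sum w_i}(1-\psi)^{n-\sum w_i}.$$

The likelihood ratio is

$$\lambda_n = \frac{L(0.5)}{L(0.7)} = \left(\frac{0.5}{0.7}\right)^{\sum w_i}\left(\frac{0.5}{0.3}\right)^{n-\sum w_i} = 0.714^{\sum w_i} \times 1.667^{\,n-\sum w_i}.$$

The SPR test with the given values of $\alpha$ and $\beta$ is to

continue sampling while $A < \lambda_n < B$,
accept $H_0$ if $\lambda_n \ge B$,
accept $H_1$ if $\lambda_n \le A$,

where $A = \dfrac{\alpha}{1-\beta} = \dfrac{0.05}{0.95} = \dfrac{1}{19}$ and $B = \dfrac{1-\alpha}{\beta} = \dfrac{0.95}{0.05} = 19$.

Continue sampling while

$$\ln A < \sum w_i \ln(0.714) + \left(n - \sum w_i\right)\ln(1.667) < \ln B,$$

i.e. $0.603n - 3.47 < \sum w_i < 0.603n + 3.47$.

(iii) Plot $\sum w_i$ against $n$ and stop sampling as soon as the sample path crosses one of the boundary lines of the "continue sampling" region.

[Figure: plot of $\sum w_i$ against $n$, with regions "Accept $H_1$" (above the upper line), "Continue sampling" (between the lines) and "Accept $H_0$" (below the lower line).]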

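(iv) Let

$$z_i = \ln\frac{p_0(w_i)}{p_1(w_i)} = w_i \ln(0.714) + (1 - w_i)\ln(1.667) \quad \text{for } i = 1, 2, \ldots$$

Then, when $H_1$ is true, $E[Z_i] = 0.7\ln 0.714 + 0.3\ln 1.667 = -0.0825$, and so when $H_1$ is true the expected sample size is approximately equal to

$$\frac{(1-\beta)\ln A + \beta\ln B}{E[Z_i]} = \frac{-0.95\ln 19 + 0.05\ln 19}{-0.0825} = 32.1.$$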
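The boundary lines and the approximate expected sample size for the sequential test in Question 8 follow mechanically from the two hypothesised probabilities and the error rates. The sketch below assumes the values read from the solution (0.5 under the null, 0.7 under the alternative, both error rates 0.05) and the convention that the likelihood ratio has the null hypothesis in the numerator. The function names are hypothetical.

```python
import math

def sprt_lines(p0, p1, alpha, beta):
    """Slope and intercepts of the 'continue sampling' band for sum(w) against n.

    The log likelihood ratio ln{L(p0)/L(p1)} equals
    sum(w) * up + (n - sum(w)) * down, with up and down as below.
    """
    log_a = math.log(alpha / (1 - beta))
    log_b = math.log((1 - alpha) / beta)
    up = math.log(p0 / p1)                  # contribution of an observation w = 1
    down = math.log((1 - p0) / (1 - p1))    # contribution of an observation w = 0
    width = down - up
    slope = down / width
    lower_intercept = -log_b / width        # below this line: accept H0
    upper_intercept = -log_a / width        # above this line: accept H1
    return slope, lower_intercept, upper_intercept

def expected_sample_size_h1(p0, p1, alpha, beta):
    """Wald's approximation to E(N) when the alternative hypothesis is true."""
    log_a = math.log(alpha / (1 - beta))
    log_b = math.log((1 - alpha) / beta)
    mean_z = p1 * math.log(p0 / p1) + (1 - p1) * math.log((1 - p0) / (1 - p1))
    return ((1 - beta) * log_a + beta * log_b) / mean_z

slope, lower, upper = sprt_lines(0.5, 0.7, 0.05, 0.05)
n_h1 = expected_sample_size_h1(0.5, 0.7, 0.05, 0.05)

print(round(slope, 3), round(lower, 2), round(upper, 2), round(n_h1, 1))
```

Working with unrounded logarithms gives values very slightly different from the hand calculation, which uses ln(0.714) and ln(1.667).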