THE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE

PAPER I STATISTICAL THEORY

The Society provides these solutions to assist candidates preparing for the examinations in future years and for the information of any other persons using the examinations. The solutions should NOT be seen as "model answers". Rather, they have been written out in considerable detail and are intended as learning aids. Users of the solutions should always be aware that in many cases there are valid alternative methods. Also, in the many cases where discussion is called for, there may be other valid points that could be made. While every care has been taken with the preparation of these solutions, the Society will not be responsible for any errors or omissions. The Society will not enter into any correspondence in respect of these solutions.

Note. In accordance with the convention used in the Society's examination papers, the notation log denotes logarithm to base e. Logarithms to any other base are explicitly identified, e.g. $\log_{10}$.

© RSS 2006

Higher Certificate, Paper I, 2006. Question 1

(i) The first place can be occupied by 9 different digits, 1 to 9. Each of the other three places can be occupied by 10 digits, 0 to 9. Hence there are $9 \times 10^3 = 9000$ possible PINs.

(ii) All of the combinations in (i) are allowed except 1111, 2222, ..., 9999, so there are $9000 - 9 = 8991$ possibilities.

(iii) Only the 9 digits 1 to 9 can be used. The first place can be filled in 9 ways, the second in 8, the third in 7 and the last in 6. So there are $9 \times 8 \times 7 \times 6 = 3024$ possibilities.

(iv) With all 10 digits possible in any position, there would be $10^4$ PINs. There are 7 increasing sequences (0123, 1234, ..., 6789) and 7 decreasing sequences (9876, 8765, ..., 3210), which are not allowed. The number of possible PINs is therefore $10^4 - 14 = 9986$.

(v) All of the $10^4$ combinations are allowed except:
(a) the 10 where all 4 digits are the same: 0000, 1111, ..., 9999;
(b) those where one digit occurs three times and another just once. There are $10 \times 9 = 90$ ways of choosing the two digits. But note that, for example, 333x, 33x3, 3x33 and x333 (x being the odd digit out) are four different PINs; whichever two digits occur, the odd one out can be in any of the 4 places in the PIN. Therefore there are $4 \times 90 = 360$ PINs of this sort.
The number of possible PINs is therefore $10^4 - 10 - 360 = 9630$.
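The five counts can be checked by brute force over all 10,000 four-digit strings. The sketch below is only a check, not part of the Society's solution; the restrictions coded for each part are assumptions read off from the counting arguments above (for instance, that part (iii) requires four different non-zero digits), and the helper names `is_run` and `max_repeat` are illustrative.

```python
from itertools import product

DIGITS = "0123456789"


def is_run(pin: str) -> bool:
    """True if the digits are consecutive, all increasing or all decreasing."""
    steps = {int(b) - int(a) for a, b in zip(pin, pin[1:])}
    return steps == {1} or steps == {-1}


def max_repeat(pin: str) -> int:
    """Largest number of times any single digit occurs in the PIN."""
    return max(pin.count(d) for d in set(pin))


all_pins = ["".join(p) for p in product(DIGITS, repeat=4)]

counts = {
    # (i) first digit non-zero (assumed restriction)
    "i": sum(1 for p in all_pins if p[0] != "0"),
    # (ii) as (i), but not all four digits equal
    "ii": sum(1 for p in all_pins if p[0] != "0" and max_repeat(p) < 4),
    # (iii) digits 1 to 9 only, all different
    "iii": sum(1 for p in all_pins if "0" not in p and len(set(p)) == 4),
    # (iv) any digits, but no increasing or decreasing run
    "iv": sum(1 for p in all_pins if not is_run(p)),
    # (v) no digit occurring three or four times
    "v": sum(1 for p in all_pins if max_repeat(p) < 3),
}
print(counts)
```

Enumeration is cheap here because there are only $10^4$ candidates, so it makes a convenient cross-check on each counting argument.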

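Higher Certificate, Paper I, 2006. Question 2

(i)
A: (a) $P(0 \text{ entries}) = \left(\frac12\right)^2 = \frac14 = 0.25$. (b) $P(1 \text{ entry}) = 2 \times \frac12 \times \frac12 = \frac12 = 0.5$.
B: (a) $P(0 \text{ entries}) = \left(\frac34\right)^3 = \frac{27}{64} = 0.4219$. (b) $P(1 \text{ entry}) = 3 \times \frac14 \times \left(\frac34\right)^2 = \frac{27}{64} = 0.4219$.
C: (a) $P(0 \text{ entries}) = \left(\frac45\right)^5 = \frac{1024}{3125} = 0.3277$. (b) $P(1 \text{ entry}) = 5 \times \frac15 \times \left(\frac45\right)^4 = \frac{256}{625} = 0.4096$.

(ii) P(1 entry in total) = P(1 from A, 0 from B and C) + P(1 from B, 0 from A and C) + P(1 from C, 0 from A and B)

$= \frac12 \cdot \frac{27}{64} \cdot \frac{1024}{3125} + \frac{27}{64} \cdot \frac14 \cdot \frac{1024}{3125} + \frac{256}{625} \cdot \frac14 \cdot \frac{27}{64} = \frac{459}{3125}$.

[If worked in decimals, this is 0.1469.]

P(1 from A | 1 in total) = P(1 from A and 1 in total) / P(1 in total) = P(1 from A, 0 from B and C) / P(1 in total)

$= \dfrac{\frac12 \cdot \frac{27}{64} \cdot \frac{1024}{3125}}{\frac{459}{3125}} = \frac{8}{17}$.

(iii) Denote the numbers of entries from A, B, C as (1, 1, 0) etc. Then we need P(1, 1, 0) + P(1, 0, 1) + P(0, 1, 1) + P(2, 0, 0) + P(0, 2, 0) + P(0, 0, 2). Since entries from each group are independent, we have, as an example, P(1, 1, 0) = P(1 from A).P(1 from B).P(0 from C).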
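The arithmetic above can be reproduced exactly with rational numbers. The sketch assumes, as the quoted probabilities suggest, that the numbers of entries from groups A, B and C are independent binomials B(2, 1/2), B(3, 1/4) and B(5, 1/5); these parameters and all function names are inferred for illustration, not taken from the question paper.

```python
from fractions import Fraction
from itertools import product
from math import comb


def binom_pmf(k: int, n: int, p: Fraction) -> Fraction:
    """P(K = k) for K ~ B(n, p), computed exactly."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Assumed group sizes and entry probabilities (inferred, see lead-in).
GROUPS = {
    "A": (2, Fraction(1, 2)),
    "B": (3, Fraction(1, 4)),
    "C": (5, Fraction(1, 5)),
}

p_zero = {g: binom_pmf(0, n, p) for g, (n, p) in GROUPS.items()}
p_one = {g: binom_pmf(1, n, p) for g, (n, p) in GROUPS.items()}


def total_pmf(k: int) -> Fraction:
    """P(total entries = k), summing over all splits (a, b, c) with a+b+c = k."""
    sizes = [range(n + 1) for n, _ in GROUPS.values()]
    total = Fraction(0)
    for split in product(*sizes):
        if sum(split) == k:
            term = Fraction(1)
            for count, (n, p) in zip(split, GROUPS.values()):
                term *= binom_pmf(count, n, p)
            total += term
    return total


p_one_total = total_pmf(1)
# Conditional probability that the single entry came from group A.
p_a_given_one = p_one["A"] * p_zero["B"] * p_zero["C"] / p_one_total

print(p_one_total, float(p_one_total))  # part (ii)
print(p_a_given_one)                    # conditional probability in (ii)
print(total_pmf(2))                     # the six-term sum described in (iii)
```

Working in `Fraction` rather than floats keeps the results directly comparable with the fractions in the written solution.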

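Higher Certificate, Paper I, 2006. Question 3

(i) We have $\int_0^1 k x^2 (1-x)^2 \, dx = 1$, so $k \int_0^1 (x^2 - 2x^3 + x^4) \, dx = 1$. This gives

$1 = k \left[ \frac{x^3}{3} - \frac{2x^4}{4} + \frac{x^5}{5} \right]_0^1 = k \left( \frac13 - \frac12 + \frac15 \right) = \frac{k}{30}$, so $k = 30$.

$f(x) = 0$ at $x = 0$ and at $x = 1$. $f(x)$ is symmetrical about $x = ½$. The sketch is as follows.

[Sketch of f(x) against x, for x from 0 to 1.]

(ii) $E(X) = \frac12$ by symmetry [or by direct integration: $\int_0^1 x f(x) \, dx$].

$E(X^2) = 30 \int_0^1 x^4 (1-x)^2 \, dx = 30 \int_0^1 (x^4 - 2x^5 + x^6) \, dx = 30 \left[ \frac{x^5}{5} - \frac{2x^6}{6} + \frac{x^7}{7} \right]_0^1 = 30 \left( \frac15 - \frac13 + \frac17 \right) = \frac{30}{105} = \frac27$.

$\text{Var}(X) = E(X^2) - \{E(X)\}^2 = \frac27 - \frac14 = \frac{1}{28}$.

$P\left(X \le \frac13\right) = 30 \int_0^{1/3} (x^2 - 2x^3 + x^4) \, dx = 30 \left[ \frac{x^3}{3} - \frac{x^4}{2} + \frac{x^5}{5} \right]_0^{1/3} = 30 \left( \frac{1}{81} - \frac{1}{162} + \frac{1}{1215} \right) = \frac{10}{27} - \frac{5}{27} + \frac{2}{81} = \frac{17}{81}$ (= 0.2099).

(iii) The required probability is $\left(1 - \frac{17}{81}\right)^5 = \left(\frac{64}{81}\right)^5 = 0.3079$.

(iv) The variance of $\bar{X}$ for a sample of size 5 is $\text{Var}(X)/5 = \frac{1/28}{5} = \frac{1}{140} = 0.00714$.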
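Because the density is a polynomial, every integral in this question can be evaluated exactly. The sketch below assumes the density is f(x) = k x²(1 − x)² on 0 < x < 1, the form consistent with the normalising constant and the symmetry noted above; `integrate_poly` is a hypothetical helper written for this check.

```python
from fractions import Fraction


def integrate_poly(coeffs, a, b):
    """Exact integral over [a, b] of sum(coeffs[j] * x**j)."""

    def antiderivative(x):
        return sum(
            Fraction(c) * x ** (j + 1) / (j + 1) for j, c in enumerate(coeffs)
        )

    return antiderivative(Fraction(b)) - antiderivative(Fraction(a))


# x^2 (1 - x)^2 = x^2 - 2x^3 + x^4 (assumed shape of the density)
shape = [0, 0, 1, -2, 1]

k = 1 / integrate_poly(shape, 0, 1)           # normalising constant
density = [k * c for c in shape]

mean = integrate_poly([0] + density, 0, 1)    # integral of x f(x)
second = integrate_poly([0, 0] + density, 0, 1)  # integral of x^2 f(x)
variance = second - mean**2

p_third = integrate_poly(density, 0, Fraction(1, 3))  # P(X <= 1/3)
p_none_below = (1 - p_third) ** 5   # all five observations exceed 1/3
var_of_mean = variance / 5          # variance of the mean of 5 observations

print(k, mean, second, variance)
print(p_third, float(p_none_below), float(var_of_mean))
```

Prepending zeros to the coefficient list multiplies the polynomial by a power of x, which is how the first and second moments are obtained from the same helper.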

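Higher Certificate, Paper I, 2006. Question 4

Let X represent cycling time without delays: $X \sim N(15, 1)$.

(i) $P(X \le 17) = \Phi\left(\frac{17 - 15}{1}\right) = \Phi(2) = 0.9772$.
[Φ denotes the cdf of the standard Normal distribution as usual.]

(ii) Adding in the delay times, also Normally distributed [N(0.7, 0.09)], and letting T denote the total time:

(a) $T \sim N(15.7, 1.09)$, so $P(T \le 17) = \Phi\left(\frac{17 - 15.7}{\sqrt{1.09}}\right) = \Phi(1.245) = 0.8934$;

(b) $T \sim N(16.4, 1.18)$, so $P(T \le 17) = \Phi\left(\frac{17 - 16.4}{\sqrt{1.18}}\right) = \Phi(0.552) = 0.7096$;

(c) $T \sim N(17.1, 1.27)$, so $P(T \le 17) = \Phi\left(\frac{17 - 17.1}{\sqrt{1.27}}\right) = \Phi(-0.0887) = 0.4646$.

(iii) The number of delays is distributed as B(3, ½). Hence the situations in (i), (ii)(a), (ii)(b) and (ii)(c) arise with probabilities 1/8, 3/8, 3/8 and 1/8 respectively, so the (unconditional) mean of the total journey time is

$E(T) = \frac18 \times 15 + \frac38 \times 15.7 + \frac38 \times 16.4 + \frac18 \times 17.1 = \frac{128.4}{8} = 16.05$ minutes.

(iv) Mean time $\bar{T} \sim N\left(16.05, \frac{1.5025}{10}\right)$.

$P(\bar{T} \le 17) = \Phi\left(\frac{17 - 16.05}{\sqrt{1.5025/10}}\right) = \Phi(2.45) = 0.9929$.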
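These Normal probabilities can be checked with the standard library alone. The sketch takes the undelayed time as N(15, 1), each delay as N(0.7, 0.09) and the time limit as 17 minutes, values inferred from the means and variances quoted above. The sample size of 10 in the last step is also an inference from the quoted figures, and the law-of-total-variance step is offered as one way the overall variance could be obtained, not as the solution's stated method.

```python
from math import erf, sqrt


def phi(z: float) -> float:
    """Standard Normal cdf, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


BASE_MEAN, BASE_VAR = 15.0, 1.0      # assumed: time without delays
DELAY_MEAN, DELAY_VAR = 0.7, 0.09    # assumed: each delay
LIMIT = 17.0                         # assumed: time limit in minutes
DELAY_PROBS = [1 / 8, 3 / 8, 3 / 8, 1 / 8]  # B(3, 1/2) for 0..3 delays


def moments(delays: int):
    """Mean and variance of total time given the number of delays."""
    return BASE_MEAN + delays * DELAY_MEAN, BASE_VAR + delays * DELAY_VAR


def p_within_limit(delays: int) -> float:
    mean, var = moments(delays)
    return phi((LIMIT - mean) / sqrt(var))


# Unconditional mean, and variance from E(T^2) - {E(T)}^2.
overall_mean = sum(w * moments(d)[0] for d, w in enumerate(DELAY_PROBS))
second_moment = sum(
    w * (moments(d)[1] + moments(d)[0] ** 2) for d, w in enumerate(DELAY_PROBS)
)
overall_var = second_moment - overall_mean**2

N_JOURNEYS = 10  # assumed sample size for the mean time
p_mean_within = phi((LIMIT - overall_mean) / sqrt(overall_var / N_JOURNEYS))

print([round(p_within_limit(d), 4) for d in range(4)])
print(overall_mean, overall_var, round(p_mean_within, 4))
```

The total time is a mixture of Normal distributions rather than exactly Normal, so the final probability relies on the usual Normal approximation for a sample mean.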

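Higher Certificate, Paper I, 2006. Question 5

(i) $E(X) = \sum_{x=0}^{\infty} x \frac{e^{-\lambda} \lambda^x}{x!} = \lambda e^{-\lambda} \sum_{x=1}^{\infty} \frac{\lambda^{x-1}}{(x-1)!} = \lambda e^{-\lambda} e^{\lambda} = \lambda$.

$E(X^2) = E[X(X-1) + X] = E[X(X-1)] + E[X]$.

$E[X(X-1)] = \sum_{x=0}^{\infty} x(x-1) \frac{e^{-\lambda} \lambda^x}{x!} = \lambda^2 e^{-\lambda} \sum_{x=2}^{\infty} \frac{\lambda^{x-2}}{(x-2)!} = \lambda^2 e^{-\lambda} e^{\lambda} = \lambda^2$.

Hence $E(X^2) = \lambda^2 + \lambda$, and $\text{Var}(X) = E(X^2) - \{E(X)\}^2 = \lambda$.

(ii) $L = \prod_{i=1}^{n} \frac{e^{-\lambda} \lambda^{x_i}}{x_i!}$, and hence $\log L = -n\lambda + \sum x_i \log \lambda + \text{constant}$.

$\frac{d \log L}{d\lambda} = -n + \frac{\sum x_i}{\lambda}$, which on setting equal to zero gives that the maximum likelihood estimate is $\hat{\lambda} = \frac{\sum x_i}{n} = \bar{x}$. [Consideration of $\frac{d^2 \log L}{d\lambda^2}$ confirms that this is a maximum: $\frac{d^2 \log L}{d\lambda^2} = -\frac{\sum x_i}{\lambda^2} < 0$.]

(iii) $\text{Var}(\hat{\lambda}) = \text{Var}(\bar{X}) = \frac{\text{Var}(X)}{n} = \frac{\lambda}{n}$. Thus the maximum likelihood estimator of $\text{Var}(\hat{\lambda})$ is $\hat{\lambda}/n$.

By the central limit theorem, $\hat{\lambda}$ ($= \bar{X}$) is approximately Normally distributed with mean λ and variance λ/n. We estimate the variance by $\hat{\lambda}/n$, so that we have $\hat{\lambda} \sim N\left(\lambda, \frac{\hat{\lambda}}{n}\right)$, approximately. Hence an approximate 95% confidence interval is given by

$0.95 \approx P\left(-1.96 < \frac{\hat{\lambda} - \lambda}{\sqrt{\hat{\lambda}/n}} < 1.96\right)$,

leading to the interval $\left(\hat{\lambda} - 1.96\sqrt{\frac{\hat{\lambda}}{n}},\ \hat{\lambda} + 1.96\sqrt{\frac{\hat{\lambda}}{n}}\right)$.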

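(iv) For the given sample, we have n = 12 and Σx = 48, leading to $\hat{\lambda} = \frac{48}{12} = 4$. The approximate confidence interval is therefore

$4 - 1.96\sqrt{\frac{4}{12}}$ to $4 + 1.96\sqrt{\frac{4}{12}}$, i.e. 2.87 to 5.13.

The sample also gives Σx² = 238; so the sample variance is

$s^2 = \frac{1}{11}\left(238 - \frac{48^2}{12}\right) = \frac{46}{11} = 4.18$.

This is close to the sample mean (4), supporting a Poisson hypothesis for the underlying model.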
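The interval and the sample variance can be recomputed from the summary statistics alone. The sketch below takes n = 12, Σx = 48 and Σx² = 238, the values consistent with the estimate of 4 and the results quoted above; the function names are illustrative.

```python
from math import sqrt


def poisson_wald_interval(total: float, n: int, z: float = 1.96):
    """MLE of a Poisson mean and the approximate interval lam_hat -/+ z*sqrt(lam_hat/n)."""
    lam_hat = total / n
    half_width = z * sqrt(lam_hat / n)
    return lam_hat, lam_hat - half_width, lam_hat + half_width


def sample_variance(total: float, total_sq: float, n: int) -> float:
    """Unbiased sample variance from the sum and sum of squares."""
    return (total_sq - total**2 / n) / (n - 1)


# Summary statistics assumed from the worked solution.
N, SUM_X, SUM_X_SQ = 12, 48, 238

lam_hat, lower, upper = poisson_wald_interval(SUM_X, N)
s_squared = sample_variance(SUM_X, SUM_X_SQ, N)

print(lam_hat, round(lower, 2), round(upper, 2), round(s_squared, 2))
```

For a Poisson model the mean and variance coincide, so comparing `lam_hat` with `s_squared` gives the informal check used at the end of the solution.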

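Higher Certificate, Paper I, 2006. Question 6

$f(x) = \frac{\lambda}{2} e^{-\lambda |x|}, \quad -\infty < x < \infty$.

[Sketch of f(x): symmetrical about x = 0, with peak value λ/2 at x = 0.]

By symmetry, E(X) = 0. Hence

$\text{Var}(X) = E(X^2) = \int_{-\infty}^{\infty} x^2 \frac{\lambda}{2} e^{-\lambda |x|} \, dx = \frac{\lambda}{2} \left\{ \int_{-\infty}^{0} x^2 e^{\lambda x} \, dx + \int_{0}^{\infty} x^2 e^{-\lambda x} \, dx \right\}$.

Substituting u = −x in the first integral gives $\int_0^{\infty} u^2 e^{-\lambda u} \, du$, which is the same as the second. Hence we get, integrating by parts,

$E(X^2) = \int_0^{\infty} x^2 \lambda e^{-\lambda x} \, dx = \left[ -x^2 e^{-\lambda x} \right]_0^{\infty} + \int_0^{\infty} 2x e^{-\lambda x} \, dx = 0 + 2 \left\{ \left[ -\frac{x e^{-\lambda x}}{\lambda} \right]_0^{\infty} + \frac{1}{\lambda} \int_0^{\infty} e^{-\lambda x} \, dx \right\} = 0 + \frac{2}{\lambda} \left[ -\frac{e^{-\lambda x}}{\lambda} \right]_0^{\infty} = \frac{2}{\lambda^2}$.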

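If Q, q are the upper and lower quartiles, we have $\int_0^Q \frac{\lambda}{2} e^{-\lambda x} \, dx = \frac14$, and q will be the same distance below 0 by symmetry.

$\frac14 = \frac12 \left[ -e^{-\lambda x} \right]_0^Q = \frac12 \left( 1 - e^{-\lambda Q} \right)$, giving $\frac12 = e^{-\lambda Q}$. Therefore λQ = log 2. Hence the semi-interquartile range is (log 2)/λ.

$L = \prod_{i=1}^{n} \frac{\lambda}{2} e^{-\lambda |x_i|} = \left(\frac{\lambda}{2}\right)^n e^{-\lambda \sum |x_i|}$, and hence $\log L = \text{constant} + n \log \lambda - \lambda \sum |x_i|$.

$\frac{d \log L}{d\lambda} = \frac{n}{\lambda} - \sum |x_i|$, which on setting equal to zero gives that the maximum likelihood estimate is $\hat{\lambda} = \frac{n}{\sum |x_i|}$. [Consideration of $\frac{d^2 \log L}{d\lambda^2}$ confirms that this is a maximum: $\frac{d^2 \log L}{d\lambda^2} = -\frac{n}{\lambda^2} < 0$.]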
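A numerical check of the three results is sketched below for the double exponential density f(x) = (λ/2) exp(−λ|x|). The value λ = 1.5, the integration range and the small data set are arbitrary illustrative choices, and the midpoint rule merely stands in for the exact integration carried out above.

```python
from math import exp, log


def laplace_pdf(x: float, lam: float) -> float:
    """Density (lam/2) * exp(-lam * |x|)."""
    return 0.5 * lam * exp(-lam * abs(x))


def midpoint_integral(g, a: float, b: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation to the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def laplace_mle(xs) -> float:
    """Maximum likelihood estimate n / sum(|x_i|)."""
    return len(xs) / sum(abs(x) for x in xs)


LAM = 1.5  # illustrative value only

# Variance: E(X^2), since the mean is zero; tails beyond +/-40 are negligible.
variance = midpoint_integral(lambda x: x * x * laplace_pdf(x, LAM), -40.0, 40.0)

# Probability between the quartiles -Q and +Q, with Q = (log 2)/lam.
upper_quartile = log(2) / LAM
central_prob = midpoint_integral(
    lambda x: laplace_pdf(x, LAM), -upper_quartile, upper_quartile
)

hypothetical_sample = [-1.0, 0.5, 2.0, -0.5]  # made-up data
print(variance, 2 / LAM**2)
print(central_prob)
print(laplace_mle(hypothetical_sample))
```

Using an even number of steps on a range symmetric about zero keeps the kink of the density at x = 0 on a cell boundary, which helps the midpoint rule stay accurate.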

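Higher Certificate, Paper I, 2006. Question 7

(i), (ii) The sum of all table entries is 30c. These probabilities must add up to 1, so c = 1/30.

The marginal distributions are given by the row and column totals. Hence:
P(X = 1) = 15c = 1/2; P(X = 2) = 10c = 1/3; P(X = 3) = 5c = 1/6.
Similarly:
P(Y = 1) = 12c = 2/5; P(Y = 2) = 6c = 1/5; P(Y = 3) = 6c = 1/5; P(Y = 4) = 6c = 1/5.

(iii) $E(X) = 1 \times \frac12 + 2 \times \frac13 + 3 \times \frac16 = \frac12 + \frac23 + \frac12 = \frac53$.

$E(X^2) = 1 \times \frac12 + 4 \times \frac13 + 9 \times \frac16 = \frac12 + \frac43 + \frac32 = \frac{10}{3}$.

$\text{Var}(X) = \frac{10}{3} - \left(\frac53\right)^2 = \frac59$.

We also need E(Y) later:

$E(Y) = 1 \times \frac25 + 2 \times \frac15 + 3 \times \frac15 + 4 \times \frac15 = \frac{11}{5}$.

Distribution of XY:

Values of xy    1    2    3    4    6    12
Probability     6c   7c   4c   6c   5c   2c      [c = 1/30, see above]

$E(XY) = 1 \times \frac{6}{30} + 2 \times \frac{7}{30} + 3 \times \frac{4}{30} + 4 \times \frac{6}{30} + 6 \times \frac{5}{30} + 12 \times \frac{2}{30} = \frac{110}{30} = \frac{11}{3}$.

Also we have $E(X)E(Y) = \frac53 \times \frac{11}{5} = \frac{11}{3}$.

$\text{Cov}(X, Y) = E(XY) - E(X)E(Y) = 0$.

(iv) X and Y are not independent [even though Cov(X, Y) = 0 and even though some cells have P(X = x, Y = y) = P(X = x).P(Y = y)]. For example, we have P(X = 1, Y = 4) = 2/15, but P(X = 1).P(Y = 4) = 1/10.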

(v) U = 1 if X = 1 or 3
U = 0 if X = 2
V = 1 if Y = 1 or 3
V = 0 if Y = 2 or 4

Table of joint distribution of U and V, with margins.

                         Values of V
                    0             1             Total
Values of U   0     2c = 1/15     8c = 4/15     10c = 1/3
              1     10c = 1/3     10c = 1/3     20c = 2/3
          Total     12c = 2/5     18c = 3/5     1

Consider the cell with (U, V) = (0, 0). The cell probability is 1/15 but the product of the marginal probabilities is 1/3 × 2/5 = 2/15. So U and V are not independent.
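The moment calculations and both independence checks can be reproduced from the probabilities quoted in the solution, without the full joint table of X and Y. In the sketch below the marginal and product distributions are copied from the working above with c = 1/30; the 0/1 coding of U and V is an assumption, but the independence conclusion does not depend on which labels are used.

```python
from fractions import Fraction

c = Fraction(1, 30)

# Distributions as quoted in the worked solution.
p_x = {1: 15 * c, 2: 10 * c, 3: 5 * c}
p_y = {1: 12 * c, 2: 6 * c, 3: 6 * c, 4: 6 * c}
p_product = {1: 6 * c, 2: 7 * c, 3: 4 * c, 4: 6 * c, 6: 5 * c, 12: 2 * c}


def expectation(dist, g=lambda v: v):
    """E[g(V)] for a discrete distribution given as {value: probability}."""
    return sum(g(v) * p for v, p in dist.items())


e_x = expectation(p_x)
var_x = expectation(p_x, lambda v: v * v) - e_x**2
e_y = expectation(p_y)
e_xy = expectation(p_product)
covariance = e_xy - e_x * e_y

# Joint distribution of (U, V); the 0/1 labels are an assumed coding.
p_uv = {(0, 0): 2 * c, (0, 1): 8 * c, (1, 0): 10 * c, (1, 1): 10 * c}
p_u = {u: sum(p for (a, _), p in p_uv.items() if a == u) for u in (0, 1)}
p_v = {v: sum(p for (_, b), p in p_uv.items() if b == v) for v in (0, 1)}

uv_independent = all(p_uv[u, v] == p_u[u] * p_v[v] for (u, v) in p_uv)

print(e_x, var_x, e_y, e_xy, covariance)
print(p_uv[0, 0], p_u[0] * p_v[0], uv_independent)
```

A zero covariance with a failed factorisation check is exactly the situation the solution describes: uncorrelated but not independent.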

Higher Certificate, Paper I, 2006. Question 8

(i) $Y_i = a + b x_i + e_i$, i = 1, 2, ..., n.

The $\{e_i\}$ are uncorrelated random variables with mean 0 and constant variance σ².

The method of least squares is equivalent to the method of maximum likelihood for estimating the regression coefficients (a and b) if the $\{e_i\}$ are Normally distributed.

[If the analysis is to proceed to inference for the regression coefficients, Normality of the $\{e_i\}$ is required.]

(ii)(a) For $Y_i = \beta x_i + e_i$, we minimise $S = \sum e_i^2 = \sum (y_i - \beta x_i)^2$.

We have $\frac{dS}{d\beta} = -2 \sum x_i (y_i - \beta x_i)$, which on setting equal to zero gives $\sum x_i y_i = \beta \sum x_i^2$, so the least squares estimate is $\hat{\beta} = \frac{\sum x_i y_i}{\sum x_i^2}$. [Consideration of $\frac{d^2 S}{d\beta^2}$ confirms that this is a minimum: $\frac{d^2 S}{d\beta^2} = 2 \sum x_i^2 > 0$.]

(b) See scatter plot at foot of page. It shows an increasing trend, roughly linear; but there seems to be some increase in variability as x increases. There are not enough data points to be sure.

The usual summary statistics (not all required for the zero intercept model) are

n = , Σx = 8, Σy = 4, Σx² = 55, Σy² = 44, Σxy = 55.

$\hat{\beta}$ = 55/55 = .5. So the fitted line is y = .5x.

Hence the estimated expected number of violations for x = is .5 = 4..

Logically, zero traffic flow should imply zero speed violations, so that y should be 0 when x is 0, i.e. the zero intercept model seems reasonable. The scatter plot does not contradict this.

[Scatter plot: Violations against Traffic Flow per minute.]
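The zero intercept estimate derived in (ii)(a) is simple to compute directly. The sketch below implements the ratio of Σxy to Σx² and confirms that the fitted slope satisfies the normal equation; the data are made up purely for illustration and are not the traffic data from the question.

```python
def fit_through_origin(xs, ys) -> float:
    """Least squares slope for y = beta * x: sum(x*y) / sum(x*x)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    if sum_xx == 0:
        raise ValueError("at least one x must be non-zero")
    return sum_xy / sum_xx


def normal_equation_residual(xs, ys, beta: float) -> float:
    """sum of x_i * (y_i - beta * x_i), which is zero at the least squares estimate."""
    return sum(x * (y - beta * x) for x, y in zip(xs, ys))


# Hypothetical data, for illustration only.
flow = [1.0, 2.0, 3.0]
violations = [2.0, 3.0, 7.0]

beta_hat = fit_through_origin(flow, violations)
predicted_at_4 = beta_hat * 4.0  # estimated expected response at x = 4

print(beta_hat, predicted_at_4)
print(normal_equation_residual(flow, violations, beta_hat))
```

With the real summary statistics only Σxy and Σx² would be needed, which is why the solution remarks that not all of the listed totals are required for this model.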