THE ROYAL STATISTICAL SOCIETY

2009 EXAMINATIONS SOLUTIONS

GRADUATE DIPLOMA (MODULAR FORMAT)

MODULE 3: STOCHASTIC PROCESSES AND TIME SERIES

The Society provides these solutions to assist candidates preparing for the examinations in future years and for the information of any other persons using the examinations. The solutions should NOT be seen as "model answers". Rather, they have been written out in considerable detail and are intended as learning aids. Users of the solutions should always be aware that in many cases there are valid alternative methods. Also, in the many cases where discussion is called for, there may be other valid points that could be made.

While every care has been taken with the preparation of these solutions, the Society will not be responsible for any errors or omissions. The Society will not enter into any correspondence in respect of these solutions.

Note. In accordance with the convention used in the Society's examination papers, the notation log denotes logarithm to base e. Logarithms to any other base are explicitly identified, e.g. log10.

RSS 2009
Graduate Diploma, Module 3, 2009. Question 1

(i) G(z) = Σ_{i=0}^∞ p_i z^i.

(ii) G_n(z) = E(z^{X_n}) = Σ_{i=0}^∞ p_i E(z^{X_n} | X_1 = i) = Σ_{i=0}^∞ p_i [G_{n-1}(z)]^i = G(G_{n-1}(z)).

(iii) θ_n = P(X_n = 0) = G_n(0). Setting z = 0 in the relationship of part (ii), we obtain θ_n = G(θ_{n-1}) (n ≥ 2).

(iv) Letting n → ∞ in the result of part (iii), and noting that G is a continuous function of z so that G(θ_{n-1}) → G(θ) as n → ∞, we obtain the equation θ = G(θ).

We now have the special case as identified in the question.

(v) In this special case, quoting the standard result for a binomial distribution, G(z) = (1 − p + pz)^2.

(vi) θ_1 is simply the zero term of the binomial distribution, so θ_1 = (1 − p)^2. Equivalently, θ_1 = G_1(0) = G(0) = (1 − p)^2.

(vii) θ_2 = G(θ_1) = [1 − p + p(1 − p)^2]^2 = (1 − p)^2 [1 + p(1 − p)]^2 = (1 − p)^2 (1 + p − p^2)^2, as required.

(viii) θ is the smallest positive root of the equation θ = G(θ), i.e. of θ = (1 − p + pθ)^2, so we must solve

p^2 θ^2 + (2p − 2p^2 − 1)θ + (1 − p)^2 = 0.

Because θ = 1 is necessarily a root of the equation θ = G(θ), it is easy to factorize the quadratic to give

(θ − 1)[p^2 θ − (1 − p)^2] = 0.

It follows that the extinction probability is given by min{1, [(1 − p)/p]^2}. Hence the extinction probability is 1 if p ≤ ½ and is [(1 − p)/p]^2 if p > ½.
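[Not part of the original solution.] The fixed-point iteration θ_n = G(θ_{n-1}) can be checked numerically. The Python sketch below, with illustrative values of p, iterates from θ_0 = 0 and compares the limit with the closed form min{1, [(1 − p)/p]^2}.

```python
# Numerical check of the extinction probability for the binomial offspring
# case G(z) = (1 - p + p z)^2. The values of p below are illustrative.

def G(z, p):
    """PGF of a Binomial(2, p) offspring distribution."""
    return (1 - p + p * z) ** 2

def extinction_prob(p, n_iter=10_000):
    """Iterate theta_n = G(theta_{n-1}) from theta_0 = 0."""
    theta = 0.0
    for _ in range(n_iter):
        theta = G(theta, p)
    return theta

for p in (0.25, 0.4, 0.75):
    closed_form = min(1.0, ((1 - p) / p) ** 2)
    print(p, extinction_prob(p), closed_form)
```

For p ≤ ½ the iteration converges to 1; for p > ½ it converges to [(1 − p)/p]^2, the smaller root of θ = G(θ).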
Graduate Diploma, Module 3, 2009. Question 2

A Markov chain is said to be irreducible if it is possible, with non-zero probability, to move from any state in the state space to any other state. A chain is said to be recurrent if, starting from any state in the space, the probability of eventually returning to that state is 1. [These explanations may be put more formally in terms of n-step transition probabilities.]

In the present case, because all the transition probabilities are non-zero, it is clearly possible to move from any state to any other state in one step, so the chain is irreducible. It is a general result that all finite irreducible Markov chains are recurrent.

The stationary distribution (π_1, π_2, π_3) is given by the solution of the equations

(2/5)π_1 + (1/5)π_2 + (1/5)π_3 = π_1
(2/5)π_1 + (3/5)π_2 + (2/5)π_3 = π_2
(1/5)π_1 + (1/5)π_2 + (2/5)π_3 = π_3,

which reduce to

3π_1 = π_2 + π_3
2π_2 = 2π_1 + 2π_3
3π_3 = π_1 + π_2,

together with the normalisation condition π_1 + π_2 + π_3 = 1. It readily follows that the solution is (π_1, π_2, π_3) = (¼, ½, ¼).

The probabilities, and hence also the proportions, in the second generation are given by the terms of the matrix product

                 ( 2/5  2/5  1/5 )
(2/5  2/5  1/5)  ( 1/5  3/5  1/5 )
                 ( 1/5  2/5  2/5 )

which gives the vector of probabilities/proportions (7/25, 12/25, 6/25).

The approximate proportions that we would expect to find are the ones given by the stationary distribution (π_1, π_2, π_3) = (1/4, 1/2, 1/4). The reasoning behind this is as follows. Let p_ij(n) represent the n-step transition probability from state i to state j; then, for all i and j, p_ij(n) → π_j as n → ∞. Hence, after a large number, n, of generations, we would expect that p_ij(n) ≈ π_j. In a large population of individuals, each of whom has the same approximate probability π_j of being in state j, π_j is also the approximate proportion of the population who are in state j.
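[Not part of the original solution.] The second-generation proportions and the convergence to the stationary distribution can be checked by direct matrix arithmetic, as in this Python sketch.

```python
# Transition matrix for the 3-state chain; rows are "from", columns are "to".
P = [
    [2/5, 2/5, 1/5],
    [1/5, 3/5, 1/5],
    [1/5, 2/5, 2/5],
]

def step(v, P):
    """One application of the map v -> vP for a row vector v."""
    return [sum(v[i] * P[i][j] for i in range(3)) for j in range(3)]

# Second generation, starting from first-generation proportions (2/5, 2/5, 1/5):
print(step([2/5, 2/5, 1/5], P))   # (7/25, 12/25, 6/25)

# Iterating vP^n from any starting distribution approaches (1/4, 1/2, 1/4):
v = [1.0, 0.0, 0.0]
for _ in range(100):
    v = step(v, P)
print(v)
```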
Graduate Diploma, Module 3, 2009. Question 3

Because of the memory-less property of the exponential distribution, how long a line has been under repair is statistically independent of how much longer it will take to repair it.

Define states as follows.
0: neither line runs
1: line 1 runs, line 2 under repair
2: line 1 under repair, line 2 runs
3: both lines run.

The instantaneous transition rates are as follows.

transition   rate
0 → 1        1/2
1 → 0        1/10
1 → 3        1/2
2 → 0        1/5
2 → 3        1/2
3 → 2        1/30
3 → 1        1/15

The equilibrium equations are

(1/2)π_0 = (1/10)π_1 + (1/5)π_2
(3/5)π_1 = (1/2)π_0 + (1/15)π_3
(7/10)π_2 = (1/30)π_3
(1/10)π_3 = (1/2)π_1 + (1/2)π_2.

These reduce to

25π_0 = 26π_2
5π_1 = 16π_2
π_3 = 21π_2.

Using the normalisation condition π_0 + π_1 + π_2 + π_3 = 1, it follows that (π_0, π_1, π_2, π_3) = k(26/25, 16/5, 1, 21), where 1/k = (26/25) + (16/5) + 1 + 21 = 656/25. So

(π_0, π_1, π_2, π_3) = (13/328, 5/41, 25/656, 525/656) = (0.04, 0.12, 0.04, 0.80).

In particular, the long-term proportion of time that the factory is unable to meet the production target is π_0 = 0.04.
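[Not part of the original solution.] A quick Python check that the stated distribution balances probability flow at every state (global balance), using the transition rates above.

```python
# rates[(i, j)] = instantaneous rate of the transition i -> j
rates = {
    (0, 1): 1/2,
    (1, 0): 1/10, (1, 3): 1/2,
    (2, 0): 1/5,  (2, 3): 1/2,
    (3, 1): 1/15, (3, 2): 1/30,
}

pi = [26/656, 80/656, 25/656, 525/656]   # i.e. (13/328, 5/41, 25/656, 525/656)

# For each state, total probability flow out must equal total flow in.
for j in range(4):
    outflow = pi[j] * sum(r for (a, b), r in rates.items() if a == j)
    inflow = sum(pi[a] * r for (a, b), r in rates.items() if b == j)
    print(j, abs(outflow - inflow) < 1e-12)
```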
Graduate Diploma, Module 3, 2009. Question 4

The state space is the set of all non-negative integers. The instantaneous transition rates are as follows.

transition    rate
i → i + 1     λ   (i ≥ 0)
i → i − 1     μ   (i ≥ 1)

The traffic intensity is defined by ρ = λ/μ. A necessary and sufficient condition for an equilibrium distribution to exist is ρ < 1, i.e. λ < μ.

The detailed balance equations are λπ_{n-1} = μπ_n (n ≥ 1). Thus π_n = ρπ_{n-1} (n ≥ 1) and, using this relation recursively, we find π_n = ρ^n π_0 (n ≥ 0). Using the normalisation condition Σπ_n = 1, we find, using the formula for the sum of a geometric series (or observing that we are dealing with a geometric distribution), that π_n = (1 − ρ)ρ^n (n ≥ 0), as required.

The service time distribution, i.e. here the waiting time distribution, for this model is exponential with parameter μ. The pdf is μe^{−μt} (t ≥ 0).

(v) The arriving customer's waiting time is the sum of n + 1 independently and identically distributed service times, each having an exponential distribution with parameter μ. These are the service times of the n customers ahead of him in the queue plus his own (note that, because of the memory-less property of the exponential distribution, the residual service time of the customer being served at the time of arrival of the new customer is also exponential with parameter μ). Using the note given in the question, the required pdf is μ^{n+1} t^n e^{−μt}/n! (t ≥ 0).

(vi) In equilibrium, from the result above, the probability that an arriving customer finds n customers ahead of him is π_n = (1 − ρ)ρ^n (n ≥ 0). Thus, using the result of part (v), the pdf of his waiting time is given by

Σ_{n=0}^∞ (1 − ρ)ρ^n μ^{n+1} t^n e^{−μt}/n! = (1 − ρ)μ e^{−μt} Σ_{n=0}^∞ (ρμt)^n/n!
= (μ − λ) e^{−μt} e^{λt} = (μ − λ) e^{−(μ−λ)t}   (t ≥ 0),

noting that (1 − ρ)μ = μ − λ and ρμ = λ. This is the pdf of the exponential distribution with parameter μ − λ.
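[Not part of the original solution.] The collapse of the geometric mixture of Erlang densities to an exponential(μ − λ) density can be checked numerically; the rates λ = 2, μ = 5 below are arbitrary illustrative choices.

```python
import math

lam, mu = 2.0, 5.0       # illustrative rates with lam < mu
rho = lam / mu

def mixture_pdf(t, n_terms=200):
    """Sum over n of (1 - rho) * rho^n * mu^(n+1) * t^n * exp(-mu t) / n!."""
    total = 0.0
    acc = 1.0            # holds (rho * mu * t)^n / n!
    for n in range(n_terms):
        total += (1 - rho) * mu * math.exp(-mu * t) * acc
        acc *= rho * mu * t / (n + 1)
    return total

t = 0.7
print(mixture_pdf(t))                          # ~0.3674
print((mu - lam) * math.exp(-(mu - lam) * t))  # same value
```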
Graduate Diploma, Module 3, 2009. Question 5

The autoregressive characteristic equation is 1 − (3/4)z + (1/8)z^2 = 0, which has roots z = 2, 4. Both roots are greater than one in modulus, so the stationarity condition is satisfied.

On substituting Y_t = Σ_{i=0}^∞ ψ_i ε_{t-i}, we obtain

Σ_{i=0}^∞ ψ_i ε_{t-i} = (3/4) Σ_{i=0}^∞ ψ_i ε_{t-1-i} − (1/8) Σ_{i=0}^∞ ψ_i ε_{t-2-i} + ε_t.

Equating coefficients of the ε_{t-i}, we obtain the following.

i = 0:  ψ_0 = 1
i = 1:  ψ_1 = (3/4)ψ_0 = 3/4
i ≥ 2:  ψ_i = (3/4)ψ_{i-1} − (1/8)ψ_{i-2}.

The last of these provides the required set of recurrence relations, for i ≥ 2, and the values for ψ_0 and ψ_1 provide the required initial conditions.

The general solution is of the form ψ_i = A_1 α_1^i + A_2 α_2^i (i ≥ 0), where A_1 and A_2 are arbitrary constants and α_1 and α_2 are the roots of the auxiliary equation α^2 = (3/4)α − (1/8). The roots of the auxiliary equation are [the inverses of the roots of the characteristic equation above] 1/2 and 1/4. Hence the general solution is ψ_i = A_1(1/2)^i + A_2(1/4)^i. Using the initial conditions, A_1 + A_2 = 1 and (1/2)A_1 + (1/4)A_2 = 3/4. Hence A_1 = 2 and A_2 = −1, and the solution for the ψ_i is as stated in the question, ψ_i = 2(1/2)^i − (1/4)^i.

Generally, Var(Y_t) = σ^2 Σ_{i=0}^∞ ψ_i^2. In the present case, this gives

Var(Y_t) = σ^2 Σ_{i=0}^∞ [2(1/2)^i − (1/4)^i]^2 = σ^2 Σ_{i=0}^∞ [4(1/4)^i − 4(1/8)^i + (1/16)^i].

Summing the geometric series in this expression gives

Var(Y_t) = [4/(1 − 1/4) − 4/(1 − 1/8) + 1/(1 − 1/16)] σ^2 = (64/35)σ^2.
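[Not part of the original solution.] The ψ-weight recursion, the closed form 2(1/2)^i − (1/4)^i and the variance factor 64/35 can all be checked numerically:

```python
# psi-weights of the AR(2) model Y_t = (3/4)Y_{t-1} - (1/8)Y_{t-2} + eps_t

def psi_weights(n):
    psi = [1.0, 0.75]                       # initial conditions psi_0, psi_1
    for i in range(2, n):
        psi.append(0.75 * psi[i - 1] - 0.125 * psi[i - 2])
    return psi

psi = psi_weights(200)
closed = [2 * 0.5 ** i - 0.25 ** i for i in range(200)]

print(max(abs(a - b) for a, b in zip(psi, closed)))   # ~0: recursion = closed form
print(sum(p * p for p in psi), 64 / 35)               # variance factor, ~1.8286
```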
Graduate Diploma, Module 3, 2009. Question 6

If the underlying trend is an exponential one, taking logarithms will transform the trend to a linear one, in which case an ARIMA model is likely to provide a better fit. If the variability of the series and, in particular, of any seasonal effects increases with the underlying level of the series, taking logarithms will tend to stabilise the variation, and in this case also an ARIMA model is likely to provide a better fit.

Approximate 95% confidence limits are at ±2/√180 = ±0.149. So any autocorrelation outside these limits differs significantly from zero at the 5% level. We see that a number of autocorrelations lie well outside these limits, notably at lags 1, 6, 12, 18, 24, 30 and 36. This clearly indicates the presence of seasonality of period 12 months and also suggests the presence of trend.

The purpose of taking differences is to eliminate the trend, and the purpose of taking seasonal differences is to eliminate the seasonality. Approximate 95% confidence limits are at ±2/√167 = ±0.155. So any autocorrelation outside these limits differs significantly from zero at the 5% level. Here the only significant autocorrelations are at lag 1 and at lag 12. This shows that any trend and seasonality have been removed by the differencing, to obtain a stationary series, and suggests that the stationary series may be modelled by moving average terms at lags 1 and 12. So a seasonal ARIMA(0,1,1)×(0,1,1)_12 model is suggested.

A seasonal ARIMA(0,1,1)×(0,1,1)_12 model has been fitted. The equation of the fitted model is (see the parameter estimates in the computer output in the question)

(1 − L)(1 − L^12)Y_t = (1 − 0.8759L)(1 − 0.7789L^12)ε_t,

where L is the lag operator (backward shift operator) and {ε_t} is a white noise process, i.e.

(1 − L − L^12 + L^13)Y_t = (1 − 0.8759L − 0.7789L^12 + 0.6822L^13)ε_t

or

Y_t = Y_{t-1} + Y_{t-12} − Y_{t-13} + ε_t − 0.8759ε_{t-1} − 0.7789ε_{t-12} + 0.6822ε_{t-13}.

(v) None of the p-values of the modified Box-Pierce statistics is significant.
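[Not part of the original solution.] The expansion of the product of the two MA operators, including the lag-13 coefficient 0.8759 × 0.7789 ≈ 0.6822, can be checked by multiplying the lag polynomials; `polymul` below is a small helper written for this sketch.

```python
def polymul(a, b):
    """Multiply two lag polynomials given as coefficient lists (index = lag)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# (1 - 0.8759 L) * (1 - 0.7789 L^12)
ma = polymul([1, -0.8759], [1] + [0.0] * 11 + [-0.7789])
print(ma[0], ma[1], ma[12], ma[13])   # 1, -0.8759, -0.7789, 0.6822...
```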
So the residuals of the fitted model appear to come from a white noise process; our model appears to give a good fit to the data.

(vi) The forecast and 95% prediction interval for Y_192 are given by 8.677 and (8.45985, 8.894) respectively. The forecast sales and prediction interval are given by taking exponentials of these values (working from the unrounded limits). This, correct to the nearest litre, gives 5867 litres as the forecast sales for December 1995 and (4721, 7290) as the 95% prediction interval.
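[Not part of the original solution.] The back-transformation from the log scale is simply exponentiation of the quoted values; small rounding differences against the figures above reflect the rounding of the log-scale limits.

```python
import math

# Log-scale forecast and 95% prediction interval quoted in the solution.
log_forecast = 8.677
log_interval = (8.45985, 8.894)

print(math.exp(log_forecast))                     # about 5866
print(tuple(math.exp(x) for x in log_interval))   # about (4721, 7288)
```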
Graduate Diploma, Module 3, 2009. Question 7

The updating equations, with seasonal period p = 12, are as follows.

L_t = α(Y_t / I_{t-p}) + (1 − α)(L_{t-1} + B_{t-1})
B_t = γ(L_t − L_{t-1}) + (1 − γ)B_{t-1}
I_t = δ(Y_t / L_t) + (1 − δ)I_{t-p}

The h-step-ahead forecast made at time T is ŷ_T(h) = (L_T + hB_T)I_{T-p+h}.

We require ŷ_T(1) and ŷ_T(12).

(a) ŷ_T(1) = (311.44 + 1.70)(0.709) = (313.14)(0.709) = 222.02 = 222 to the nearest whole number.

(b) ŷ_T(12) = [311.44 + (12)(1.70)](0.97) = (331.84)(0.97) = 321.88 = 322 to the nearest whole number.

For January 1994, the values are as follows.

Level:    L_t = 0.4(245/0.709) + 0.6(311.44 + 1.70) = 326.11
Trend:    B_t = 0.1(326.11 − 311.44) + 0.9(1.70) = 3.00
Index:    I_t = 0.01(245/326.11) + 0.99(0.709) = 0.709
Fitted:   from (a) above, ŷ_T(1) = 222.02
Residual: Deaths − Fitted = 245 − 222.02 = 22.98

(v) Given some appropriately chosen initial values for the level and the trend and for the first twelve seasonal indices, for any given set of values of the smoothing constants α, γ and δ (each between 0 and 1 inclusive, of course), the numerical values of all the quantities in the table may be calculated for each month in the series. The sum of squares of the residuals (or some other appropriate function of the residuals) may be used as a measure of how well the Holt-Winters method with the chosen values of α, γ and δ performs. By searching over a grid of values of α, γ and δ, or by carrying out a formal optimisation, the values that minimise the sum of squares of the residuals may be found as the best set of values to use.
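[Not part of the original solution.] One multiplicative Holt-Winters update can be sketched in Python, assuming the smoothing constants α = 0.4, γ = 0.1, δ = 0.01 and the January 1994 values read from the table above (observation 245, previous level 311.44, previous trend 1.70, previous January index 0.709).

```python
alpha, gamma, delta = 0.4, 0.1, 0.01          # smoothing constants (assumed)
L_prev, B_prev, I_old = 311.44, 1.70, 0.709   # state before January 1994
y = 245.0                                     # January 1994 observation

# One round of the multiplicative Holt-Winters updating equations:
L = alpha * (y / I_old) + (1 - alpha) * (L_prev + B_prev)
B = gamma * (L - L_prev) + (1 - gamma) * B_prev
I_new = delta * (y / L) + (1 - delta) * I_old

print(round(L, 2), round(B, 2), round(I_new, 3))   # 326.11 3.0 0.709

# Fitted value and residual for January 1994:
fitted = (L_prev + B_prev) * I_old
print(round(fitted, 2), round(y - fitted, 2))      # 222.02 22.98
```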
Graduate Diploma, Module 3, 2009. Question 8

(i) {Y_t} is an ARIMA(0,2,1) process.

(ii) Y_t = 2Y_{t-1} − Y_{t-2} + ε_t − θε_{t-1}.

(iii) ŷ_T(1) = E(Y_{T+1} | H_T) = E(2Y_T − Y_{T-1} + ε_{T+1} − θε_T | H_T) = 2Y_T − Y_{T-1} − θε_T.

[Note that these are conditional expectations given the entire history of the process up to and including time T. So Y_T and Y_{T-1} are known. Further, ε_T (sometimes called the "innovation" at time T) can be found using the one-step-ahead prediction at time T − 1 and the observed Y_T; this is as shown in part (iv), replacing T + 1 by T. ε_{T+1} cannot be found, of course, and so has expectation 0 as a white noise term.]

(iv) Y_{T+1} − ŷ_T(1) = ε_{T+1}.

(v) ŷ_T(2) = E(Y_{T+2} | H_T) = E(2Y_{T+1} − Y_T + ε_{T+2} − θε_{T+1} | H_T) = 2ŷ_T(1) − Y_T = 3Y_T − 2Y_{T-1} − 2θε_T (substituting from the result of part (iii)).

(vi) For h ≥ 3, setting t = T + h in the model equation,

ŷ_T(h) = E(Y_{T+h} | H_T) = E(2Y_{T+h-1} − Y_{T+h-2} + ε_{T+h} − θε_{T+h-1} | H_T)
= 2E(Y_{T+h-1} | H_T) − E(Y_{T+h-2} | H_T) + 0 = 2ŷ_T(h − 1) − ŷ_T(h − 2).

(vii) The general form of the solution of the difference equation of part (vi) (in the examination, this could be quoted or easily found) is ŷ_T(h) = A + Bh (h ≥ 1). To determine A and B, the initial conditions of parts (iii) and (v) give

A + B = 2Y_T − Y_{T-1} − θε_T,
A + 2B = 3Y_T − 2Y_{T-1} − 2θε_T.

Hence B = b_T = Y_T − Y_{T-1} − θε_T and A = Y_T.

(viii) Using the result of part (iv), replacing T by T − 1, we have Y_T − ŷ_{T-1}(1) = ε_T. Note also that ŷ_{T-1}(1) = Y_{T-1} + b_{T-1}. Substituting into the expression for b_T in part (vii),

b_T = Y_T − Y_{T-1} − θ(Y_T − ŷ_{T-1}(1)) = Y_T − Y_{T-1} − θ(Y_T − Y_{T-1} − b_{T-1}) = (1 − θ)(Y_T − Y_{T-1}) + θb_{T-1}.
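[Not part of the original solution.] The forecast function ŷ_T(h) = Y_T + h·b_T can be checked against the recursion ŷ_T(h) = 2ŷ_T(h−1) − ŷ_T(h−2); the values of Y_T, Y_{T−1}, θ and ε_T below are arbitrary illustrative choices.

```python
Y_T, Y_Tm1, theta, eps_T = 10.0, 9.2, 0.6, 0.3   # made-up illustrative values

b_T = Y_T - Y_Tm1 - theta * eps_T                # slope of the forecast function

def forecast(h):
    """Linear forecast function y_hat_T(h) = A + B h with A = Y_T, B = b_T."""
    return Y_T + h * b_T

# h = 1 and h = 2 agree with the directly derived expressions:
print(forecast(1), 2 * Y_T - Y_Tm1 - theta * eps_T)
print(forecast(2), 3 * Y_T - 2 * Y_Tm1 - 2 * theta * eps_T)

# and for h >= 3 the second-order recursion holds exactly:
for h in range(3, 8):
    assert abs(forecast(h) - (2 * forecast(h - 1) - forecast(h - 2))) < 1e-12
print("recursion verified")
```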