THE ROYAL STATISTICAL SOCIETY 2009 EXAMINATIONS SOLUTIONS GRADUATE DIPLOMA MODULAR FORMAT MODULE 3 STOCHASTIC PROCESSES AND TIME SERIES


The Society provides these solutions to assist candidates preparing for the examinations in future years and for the information of any other persons using the examinations.

The solutions should NOT be seen as "model answers". Rather, they have been written out in considerable detail and are intended as learning aids. Users of the solutions should always be aware that in many cases there are valid alternative methods. Also, in the many cases where discussion is called for, there may be other valid points that could be made.

While every care has been taken with the preparation of these solutions, the Society will not be responsible for any errors or omissions. The Society will not enter into any correspondence in respect of these solutions.

Note. In accordance with the convention used in the Society's examination papers, the notation log denotes logarithm to base e. Logarithms to any other base are explicitly identified, e.g. log_10.

RSS 2009

Graduate Diploma, Module 3, 2009. Question 1

(i) The probability generating function of the offspring distribution is G(z) = Σ_{i=0}^∞ p_i z^i.

(ii) Conditioning on the size of the first generation,

G_n(z) = E(z^{X_n}) = Σ_{i=0}^∞ p_i E(z^{X_n} | X_1 = i) = Σ_{i=0}^∞ p_i [G_{n−1}(z)]^i = G(G_{n−1}(z)).

(iii) θ_n = P(X_n = 0) = G_n(0). Setting z = 0 in the relationship of part (ii), we obtain θ_n = G(θ_{n−1}) (n ≥ 2).

(iv) Letting n → ∞ in the result of part (iii), and noting that G is a continuous function of z so that G(θ_{n−1}) → G(θ) as n → ∞, we obtain the equation θ = G(θ).

We now have the special case as identified in the question.

(v) In this special case, quoting the standard result for a binomial distribution, G(z) = (1 − p + pz)^2.

(vi) θ_1 is simply the zero term of the binomial distribution, so θ_1 = (1 − p)^2. Equivalently, θ_1 = G_1(0) = G(0) = (1 − p)^2.

(vii) θ_2 = G(θ_1) = [1 − p + p(1 − p)^2]^2 = (1 − p)^2 [1 + p(1 − p)]^2 = (1 − p)^2 (1 + p − p^2)^2, as required.

(viii) θ is the smallest positive root of the equation θ = G(θ), i.e. of θ = (1 − p + pθ)^2, so we must solve p^2 θ^2 + (2p − 2p^2 − 1)θ + (1 − p)^2 = 0. Because θ = 1 is necessarily a root of the equation θ = G(θ), it is easy to factorise the quadratic to give (θ − 1)[p^2 θ − (1 − p)^2] = 0. It follows that the extinction probability is given by min{1, [(1 − p)/p]^2}. Hence the extinction probability is 1 if p ≤ ½ and is [(1 − p)/p]^2 if p > ½.
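As a further learning aid (not part of the Society's solution), the fixed-point iteration of part (iii) is easy to check numerically. The minimal Python sketch below assumes the binomial(2, p) offspring distribution of part (v), iterates θ_n = G(θ_{n−1}) from θ_1 = G(0), and compares the limit with the closed form of part (viii).

def G(z, p):
    """Offspring pgf for a binomial(2, p) family-size distribution."""
    return (1 - p + p * z) ** 2

def extinction_probability(p, n_iter=200):
    """Iterate theta_n = G(theta_{n-1}) starting from theta_0 = 0."""
    theta = 0.0
    for _ in range(n_iter):
        theta = G(theta, p)
    return theta

for p in (0.3, 0.5, 0.7):
    closed_form = min(1.0, ((1 - p) / p) ** 2)
    print(p, extinction_probability(p), closed_form)

For p = 0.7 both methods give (0.3/0.7)^2 ≈ 0.1837; for p ≤ ½ the iteration converges to 1, as the solution states.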

Graduate Diploma, Module 3, 2009. Question 2

A Markov chain is said to be irreducible if it is possible, with non-zero probability, to move from any state in the state space to any other state. A chain is said to be recurrent if, starting from any state in the space, the probability of eventually returning to that state is 1. [These explanations may be put more formally in terms of n-step transition probabilities.]

In the present case, because all the transition probabilities are non-zero, it is clearly possible to move from any state to any other state in one step, so the chain is irreducible. It is a general result that all finite irreducible Markov chains are recurrent.

The stationary distribution (π_1, π_2, π_3) is given by the solution of the equations

(2/5)π_1 + (1/5)π_2 + (1/5)π_3 = π_1
(2/5)π_1 + (3/5)π_2 + (2/5)π_3 = π_2
(1/5)π_1 + (1/5)π_2 + (2/5)π_3 = π_3,

which reduce to

3π_1 = π_2 + π_3
π_2 = π_1 + π_3
3π_3 = π_1 + π_2,

together with the normalisation condition π_1 + π_2 + π_3 = 1. It readily follows that the solution is (π_1, π_2, π_3) = (¼, ½, ¼).

The probabilities, and hence also the proportions, in the second generation are given by the terms of the matrix product

                 ( 2/5  2/5  1/5 )
(2/5  2/5  1/5)  ( 1/5  3/5  1/5 )
                 ( 1/5  2/5  2/5 )

which gives the vector of probabilities/proportions (7/25, 12/25, 6/25).

The approximate proportions that we would expect to find are the ones given by the stationary distribution (π_1, π_2, π_3) = (1/4, 1/2, 1/4). The reasoning behind this is as follows. Let p_ij(n) represent the n-step transition probability from state i to state j; then, for all i and j, p_ij(n) → π_j as n → ∞. Hence, after a large number, n, of generations, we would expect that p_ij(n) ≈ π_j. In a large population of individuals, each of whom has the same approximate probability π_j of being in state j, π_j is also the approximate proportion of the population who are in state j.
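Again as a learning aid (not part of the official solution), both the stationary distribution and the second-generation proportions can be verified numerically; the sketch below uses numpy and the transition matrix written out above.

import numpy as np

P = np.array([[2/5, 2/5, 1/5],
              [1/5, 3/5, 1/5],
              [1/5, 2/5, 2/5]])

# Solve pi P = pi with sum(pi) = 1, replacing one redundant balance
# equation by the normalisation condition.
A = np.vstack([(P.T - np.eye(3))[:-1], np.ones(3)])
pi = np.linalg.solve(A, [0, 0, 1])
print(pi)                      # [0.25, 0.5, 0.25]

first_gen = np.array([2/5, 2/5, 1/5])
print(first_gen @ P)           # [0.28, 0.48, 0.24] = (7/25, 12/25, 6/25)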

Graduate Diploma, Module 3, 2009. Question 3

Because of the memoryless property of the exponential distribution, how long a line has been under repair is statistically independent of how much longer it will take to repair it.

Define states as follows.

0: no line runs
1: line 1 runs, line 2 under repair
2: line 1 under repair, line 2 runs
3: both lines run.

The instantaneous transition rates are as follows.

transition    rate
0 → 1         1/2
1 → 0         1/10
1 → 3         1/2
2 → 0         1/5
2 → 3         1/2
3 → 2         1/30
3 → 1         1/15

The equilibrium equations are

(1/2)π_0 = (1/10)π_1 + (1/5)π_2
(3/5)π_1 = (1/2)π_0 + (1/15)π_3
(7/10)π_2 = (1/30)π_3
(1/10)π_3 = (1/2)π_1 + (1/2)π_2.

These reduce to

25π_0 = 26π_2
5π_1 = 16π_2
π_3 = 21π_2.

Using the normalisation condition π_0 + π_1 + π_2 + π_3 = 1, it follows that (π_0, π_1, π_2, π_3) = k(26/25, 16/5, 1, 21), where 1/k = (26/25) + (16/5) + 1 + 21 = 656/25. So (π_0, π_1, π_2, π_3) = (13/328, 5/41, 25/656, 525/656) = (0.04, 0.12, 0.04, 0.80). In particular the long-term proportion of time that the factory is unable to meet the production target is π_0 = 0.04.
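The equilibrium equations are routine to verify numerically. The following sketch (an illustration, not part of the official solution) builds the generator matrix from the rate table above and solves πQ = 0 subject to normalisation.

import numpy as np

states = 4
Q = np.zeros((states, states))
rates = {(0, 1): 1/2, (1, 0): 1/10, (1, 3): 1/2, (2, 0): 1/5,
         (2, 3): 1/2, (3, 2): 1/30, (3, 1): 1/15}
for (i, j), r in rates.items():
    Q[i, j] = r
np.fill_diagonal(Q, -Q.sum(axis=1))  # diagonal entries make rows sum to zero

# Replace one balance equation of pi Q = 0 by the normalisation condition.
A = np.vstack([Q.T[:-1], np.ones(states)])
b = np.array([0.0] * (states - 1) + [1.0])
pi = np.linalg.solve(A, b)
print(pi)   # approx [0.0396, 0.1220, 0.0381, 0.8003], i.e. (26, 80, 25, 525)/656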

Graduate Diploma, Module 3, 2009. Question 4

(i) The state space is the set of all non-negative integers. The instantaneous transition rates are as follows.

transition    rate
i → i + 1     λ (i ≥ 0)
i → i − 1     μ (i ≥ 1)

(ii) The traffic intensity is defined by ρ = λ/μ. A necessary and sufficient condition for an equilibrium distribution to exist is ρ < 1, i.e. λ < μ.

(iii) The detailed balance equations are λπ_{n−1} = μπ_n (n ≥ 1). Thus π_n = ρπ_{n−1} (n ≥ 1) and, using this relation recursively, we find π_n = ρ^n π_0 (n ≥ 0). Using the normalisation condition Σπ_n = 1, we find, using the formula for the sum of a geometric series (or observing that we are dealing with a geometric distribution), that π_n = (1 − ρ)ρ^n (n ≥ 0), as required.

(iv) The service time distribution, i.e. here the waiting time distribution, for this model is exponential with parameter μ. The pdf is μe^{−μt} (t ≥ 0).

(v) The arriving customer's waiting time is the sum of n + 1 independently and identically distributed service times, each having an exponential distribution with parameter μ. These are the service times of the n customers ahead of him in the queue plus his own (note that, because of the memoryless property of the exponential distribution, the residual service time of the customer being served at the time of arrival of the new customer is also exponential with parameter μ). Using the note given in the question, the required pdf is μ^{n+1} t^n e^{−μt}/n! (t ≥ 0).

(vi) In equilibrium, from part (iii), the probability that an arriving customer finds n customers ahead of him in the queue is given by π_n = (1 − ρ)ρ^n (n ≥ 0). Thus, using the result of part (v), the pdf of his waiting time is given by

Σ_{n=0}^∞ (1 − ρ)ρ^n μ^{n+1} t^n e^{−μt}/n! = (1 − ρ)μ e^{−μt} Σ_{n=0}^∞ (ρμt)^n/n! = (μ − λ) e^{−μt} Σ_{n=0}^∞ (λt)^n/n! = (μ − λ) e^{−(μ−λ)t}   (t ≥ 0),

which is the pdf of the exponential distribution with parameter μ − λ.
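The mixture argument of part (vi) can also be illustrated by simulation. The sketch below (not part of the official solution) draws the geometric number found ahead of an arrival as in part (iii), sums N + 1 exponential service times, and checks that the mean waiting time is 1/(μ − λ); the rates λ = 2 and μ = 5 are illustrative choices.

import numpy as np

rng = np.random.default_rng(0)
lam, mu = 2.0, 5.0                  # illustrative rates, rho = 0.4 < 1
rho = lam / mu

# Number found ahead: P(N = n) = (1 - rho) rho^n on n = 0, 1, 2, ...
n_ahead = rng.geometric(1 - rho, size=50_000) - 1
waits = np.array([rng.exponential(1 / mu, n + 1).sum() for n in n_ahead])

print(waits.mean(), 1 / (mu - lam))   # both close to 1/3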

Graduate Diploma, Module 3, 2009. Question 5

(i) The autoregressive characteristic equation is 1 − (3/4)z + (1/8)z^2 = 0, which has roots z = 2, 4. Both roots are greater than one in modulus, so the stationarity condition is satisfied.

(ii) On substituting Y_t = Σ_{i=0}^∞ ψ_i ε_{t−i}, we obtain

Σ_{i=0}^∞ ψ_i ε_{t−i} = (3/4) Σ_{i=0}^∞ ψ_i ε_{t−1−i} − (1/8) Σ_{i=0}^∞ ψ_i ε_{t−2−i} + ε_t.

Equating coefficients of the ε_{t−i}, we obtain the following.

i = 0:  ψ_0 = 1
i = 1:  ψ_1 = (3/4)ψ_0 = 3/4
i ≥ 2:  ψ_i = (3/4)ψ_{i−1} − (1/8)ψ_{i−2}.

The last of these provides the required set of recurrence relations, for i ≥ 2, and the values for ψ_0 and ψ_1 provide the required initial conditions.

(iii) The general solution is of the form ψ_i = A_1 α_1^i + A_2 α_2^i (i ≥ 0), where A_1 and A_2 are arbitrary constants and α_1 and α_2 are the roots of the auxiliary equation α^2 = (3/4)α − (1/8). The roots of the auxiliary equation are [the inverses of the roots of the characteristic equation of part (i)] 1/2 and 1/4. Hence the general solution is ψ_i = A_1 (1/2)^i + A_2 (1/4)^i. Using the initial conditions, A_1 + A_2 = 1 and (1/2)A_1 + (1/4)A_2 = 3/4. Hence A_1 = 2 and A_2 = −1, and the solution for the ψ_i is as stated in the question.

(iv) Generally, Var(Y_t) = σ^2 Σ_{i=0}^∞ ψ_i^2. In the present case, this gives

Var(Y_t) = σ^2 Σ_{i=0}^∞ [2(1/2)^i − (1/4)^i]^2 = σ^2 Σ_{i=0}^∞ [4(1/4)^i − 4(1/8)^i + (1/16)^i].

Summing the geometric series in this expression gives

Var(Y_t) = [4/(1 − 1/4) − 4/(1 − 1/8) + 1/(1 − 1/16)] σ^2 = (64/35) σ^2.
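As a quick numerical check (not part of the official solution), the recurrence of part (ii) can be compared directly with the closed form of part (iii), and the sum Σψ_i^2 of part (iv) accumulated:

# Generate psi-weights from psi_i = (3/4) psi_{i-1} - (1/8) psi_{i-2},
# compare with psi_i = 2 (1/2)^i - (1/4)^i, and accumulate sum(psi_i^2),
# which should approach 64/35 (the variance in units of sigma^2).
psi = [1.0, 0.75]
for i in range(2, 50):
    psi.append(0.75 * psi[-1] - 0.125 * psi[-2])

closed = [2 * 0.5**i - 0.25**i for i in range(50)]
assert max(abs(a - b) for a, b in zip(psi, closed)) < 1e-12

print(sum(p * p for p in psi), 64 / 35)   # both approx 1.828571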

Graduate Diploma, Module 3, 2009. Question 6

(i) If the underlying trend is an exponential one, taking logarithms will transform the trend to a linear one, in which case an ARIMA model is likely to provide a better fit. If the variability of the series and, in particular, of any seasonal effects increases with increase in the underlying level of the series, taking logarithms will tend to stabilise the variation, and in this case also an ARIMA model is likely to provide a better fit.

(ii) Approximate 95% confidence limits are at ±2/√180 = ±0.149. So any autocorrelation outside these limits differs significantly from zero at the 5% level. We see that a number of autocorrelations lie well outside these limits, notably at lags 1, 6, 12, 18, 24, 30 and 36. This clearly indicates the presence of seasonality of period 12 months and also suggests the presence of trend.

(iii) The purpose of taking differences is to eliminate the trend, and the purpose of taking seasonal differences is to eliminate the seasonality. Approximate 95% confidence limits are at ±2/√167 = ±0.155. So any autocorrelation outside these limits differs significantly from zero at the 5% level. Here the only significant autocorrelations are at lag 1 and at lag 12. This shows that any trend and seasonality have been removed by the differencing to obtain a stationary series, and suggests that the stationary series may be modelled by moving average terms at lags 1 and 12. So a seasonal ARIMA(0,1,1)×(0,1,1)_12 model is suggested.

(iv) A seasonal ARIMA(0,1,1)×(0,1,1)_12 has been fitted. The equation of the fitted model is (see the parameter estimates in the computer output in the question)

(1 − L)(1 − L^12)Y_t = (1 − 0.8759L)(1 − 0.7789L^12)ε_t,

where L is the lag operator (backward shift operator) and {ε_t} is a white noise process, i.e.

(1 − L − L^12 + L^13)Y_t = (1 − 0.8759L − 0.7789L^12 + 0.6822L^13)ε_t

or

Y_t = Y_{t−1} + Y_{t−12} − Y_{t−13} + ε_t − 0.8759ε_{t−1} − 0.7789ε_{t−12} + 0.6822ε_{t−13}.

(v) None of the p-values of the modified Box-Pierce statistics is significant. So the residuals of the fitted model appear to come from a white noise process, and our model appears to give a good fit to the data.

(vi) The forecast and 95% prediction interval for Y_192 are given by 8.677 and (8.45985, 8.894) respectively. The forecast sales and prediction interval are obtained by taking exponentials of these values. This, correct to the nearest litre, gives 5867 as the forecast sales for December 1995 and approximately (4721, 7290) as the 95% prediction interval.
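For readers who wish to reproduce an analysis of this kind, the sketch below (not part of the official solution) fits the suggested ARIMA(0,1,1)×(0,1,1)_12 model with the statsmodels SARIMAX class and produces a forecast with its 95% interval on the original scale. Since the examination data are not reproduced here, the monthly series used is a synthetic stand-in.

import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Synthetic stand-in for the exam's 180-observation monthly sales series.
rng = np.random.default_rng(1)
t = np.arange(180)
season = 0.2 * np.sin(2 * np.pi * t / 12)
sales = pd.Series(
    np.exp(8.0 + 0.004 * t + season + 0.05 * rng.standard_normal(180)),
    index=pd.date_range("1980-01", periods=180, freq="MS"))

log_sales = np.log(sales)   # model the logged series, as in part (i)
res = SARIMAX(log_sales, order=(0, 1, 1), seasonal_order=(0, 1, 1, 12)).fit(disp=False)

fc = res.get_forecast(steps=12)              # 12 steps ahead, as in part (vi)
lo, hi = fc.conf_int(alpha=0.05).iloc[-1]
print(np.exp(fc.predicted_mean.iloc[-1]), np.exp(lo), np.exp(hi))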

Graduate Diploma, Module 3, 2009. Question 7

(i) The updating equations are as follows.

L_t = α(Y_t / I_{t−p}) + (1 − α)(L_{t−1} + B_{t−1})
B_t = γ(L_t − L_{t−1}) + (1 − γ)B_{t−1}
I_t = δ(Y_t / L_t) + (1 − δ)I_{t−p}

(ii) ŷ_T(h) = (L_T + hB_T) I_{T−p+h}.

(iii) We require ŷ_T(1) and ŷ_T(12).

(a) ŷ_T(1) = (311.44 + 1.7)(0.709) = (313.14)(0.709) = 222.02 = 222 to the nearest whole number.

(b) ŷ_T(12) = [311.44 + (12)(1.7)](0.97) = (331.84)(0.97) = 321.88 = 322 to the nearest whole number.

(iv) For January 1994, the values are as follows.

Level:    L_t = 0.4(245/0.709) + 0.6(311.44 + 1.7) = 326.11
Trend:    B_t = 0.1(326.11 − 311.44) + 0.9(1.7) = 3.00
Index:    I_t = 0.01(245/326.11) + 0.99(0.709) = 0.709
Fitted:   from part (iii)(a), ŷ_T(1) = 222.02
Residual: Deaths − Fitted = 245 − 222.02 = 22.98

(v) Given some appropriately chosen initial values for the level and the trend and for the first twelve seasonal indices, for any given set of values of the smoothing constants α, γ and δ (each between 0 and 1 inclusive, of course), the numerical values of all the quantities in the table may be calculated for each month in the series. The sum of squares of the residuals (or some other appropriate function of the residuals) may be used as a measure of how well the Holt-Winters method with the chosen values of α, γ and δ performs. By searching over a grid of values of α, γ and δ, or by carrying out a formal optimisation, the values that minimise the sum of squares of the residuals may be found as the best set of values to use.
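The updating arithmetic of part (iv) is easy to mechanise. The sketch below (not part of the official solution) applies one multiplicative Holt-Winters step using the equations of part (i), with the December 1993 state and the January 1994 observation used above.

def holt_winters_step(y, level, trend, index, alpha, gamma, delta):
    """One seasonal Holt-Winters update; `index` is the old I_{t-p}."""
    new_level = alpha * (y / index) + (1 - alpha) * (level + trend)
    new_trend = gamma * (new_level - level) + (1 - gamma) * trend
    new_index = delta * (y / new_level) + (1 - delta) * index
    return new_level, new_trend, new_index

L, B, I = 311.44, 1.7, 0.709           # state at December 1993
fitted = (L + 1 * B) * I                # y-hat_T(1) = 222.02
L1, B1, I1 = holt_winters_step(245, L, B, I, alpha=0.4, gamma=0.1, delta=0.01)
print(fitted, 245 - fitted)             # 222.02 and residual 22.98
print(L1, B1, I1)                       # 326.11, 3.00, 0.709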

Graduate Diploma, Module 3, 2009. Question 8

(i) {Y_t} is an ARIMA(0,2,1) process.

(ii) Y_t = 2Y_{t−1} − Y_{t−2} + ε_t − θε_{t−1}.

(iii) ŷ_T(1) = E(Y_{T+1} | H_T) = E(2Y_T − Y_{T−1} + ε_{T+1} − θε_T | H_T) = 2Y_T − Y_{T−1} − θε_T.

[Note that these are conditional expectations given the entire history of the process up to and including time T. So Y_T and Y_{T−1} are known. Further, ε_T (sometimes called the "innovation" at time T) can be found using the one-step-ahead prediction at time T − 1 and the observed Y_T; this is as shown in part (iv), replacing T + 1 by T. ε_{T+1} cannot be found, of course, and so has expectation 0 as a white noise term.]

(iv) Y_{T+1} − ŷ_T(1) = ε_{T+1}.

(v) ŷ_T(2) = E(Y_{T+2} | H_T) = E(2Y_{T+1} − Y_T + ε_{T+2} − θε_{T+1} | H_T) = 2ŷ_T(1) − Y_T = 3Y_T − 2Y_{T−1} − 2θε_T (substituting from the result of part (iii)).

(vi) For h ≥ 3, setting t = T + h in the model equation,

ŷ_T(h) = E(Y_{T+h} | H_T) = E(2Y_{T+h−1} − Y_{T+h−2} + ε_{T+h} − θε_{T+h−1} | H_T) = 2E(Y_{T+h−1} | H_T) − E(Y_{T+h−2} | H_T) + 0 = 2ŷ_T(h − 1) − ŷ_T(h − 2).

(vii) The general form of the solution of the difference equation of part (vi) (in the examination, this could be quoted or easily found) is ŷ_T(h) = A + Bh (h ≥ 1). To determine A and B, the initial conditions of parts (iii) and (v) give

A + B = 2Y_T − Y_{T−1} − θε_T,
A + 2B = 3Y_T − 2Y_{T−1} − 2θε_T.

Hence B = b_T = Y_T − Y_{T−1} − θε_T and A = Y_T.

(viii) Using the result of part (iv), replacing T by T − 1, we have Y_T − ŷ_{T−1}(1) = ε_T. Note also that ŷ_{T−1}(1) = Y_{T−1} + b_{T−1}. Substituting into the expression for b_T in part (vii),

b_T = Y_T − Y_{T−1} − θ(Y_T − ŷ_{T−1}(1)) = Y_T − Y_{T−1} − θ(Y_T − Y_{T−1} − b_{T−1}) = (1 − θ)(Y_T − Y_{T−1}) + θb_{T−1}.
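The recursions of parts (vii) and (viii) amount to a linear predictor ŷ_T(h) = Y_T + h b_T whose slope is an exponentially smoothed difference. The sketch below (not part of the official solution) simulates an ARIMA(0,2,1) series with an illustrative θ = 0.6 and applies the recursion.

import numpy as np

rng = np.random.default_rng(2)
theta = 0.6
eps = rng.standard_normal(300)

# Simulate y_t = 2 y_{t-1} - y_{t-2} + eps_t - theta * eps_{t-1}.
y = np.zeros(300)
for t in range(2, 300):
    y[t] = 2 * y[t - 1] - y[t - 2] + eps[t] - theta * eps[t - 1]

# Slope recursion of part (viii): b_T = (1-theta)(Y_T - Y_{T-1}) + theta b_{T-1}.
b = 0.0
for t in range(1, 300):
    b = (1 - theta) * (y[t] - y[t - 1]) + theta * b

for h in (1, 2, 3):
    print(h, y[-1] + h * b)   # y-hat_T(h) = A + Bh with A = Y_T, B = b_T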