STAT 3911 - STOCHASTIC PROCESSES

ANDREW TULLOCH

Contents

1. Stochastic Processes
2. Classification of states
3. Limit theorems for Markov chains
4. First step analysis
5. Branching processes
6. Poisson process
7. Sojourn and waiting times
8. Continuous-time Markov chains
9. Differential equations for marginal probabilities
10. Birth and death processes
11. Time Series Analysis
12. ARMA(p, q) Models
13. ARIMA Models
14. Model selection
15. Prediction using ARIMA(p, d, q) models

1. Stochastic Processes

Definition 1.1 (Markov process). A process X = \{X_t : t \in T\} is a Markov process if

  P(X(t) \le x \mid X(t_1) = x_1, X(t_2) = x_2, \dots, X(t_n) = x_n) = P(X(t) \le x \mid X(t_n) = x_n)

whenever t_1 < t_2 < \dots < t_n < t. In other words, a Markov process is a stochastic process satisfying the Markov property: the conditional distribution of future values given the present is independent of the past.

Definition 1.2 (Markov chain). A Markov chain is a stochastic process with a finite or countable state space that satisfies the Markov property. Define the transition probabilities P_{ij} by

  P_{ij} = P(X_{n+1} = j \mid X_n = i).

Theorem 1.3 (Chapman-Kolmogorov equation). Let P_{ij}^n denote the probability of going from state i to state j in n steps, that is,

  P_{ij}^n = P(X_{n+m} = j \mid X_m = i) = P(X_n = j \mid X_0 = i).

Then, for any m, n = 0, 1, 2, \dots and i, j \in S,

  P_{ij}^{n+m} = \sum_{k \in S} P_{ik}^m P_{kj}^n.

In terms of matrix multiplication, this is equivalent to P^{n+m} = P^n P^m.

Proposition 1.4 (Initial distribution). At time n, let \pi_n denote the row vector with entries \pi_n(j) = P(X_n = j) for all j \in S. Then

  \pi_n = \pi_0 P^n.

2. Classification of states

Definition 2.1 (Classification of states). State j is accessible from state i if P_{ij}^n > 0 for some n \ge 0. Two states i and j which are accessible from each other are said to communicate, and we write i \leftrightarrow j.

Lemma 2.2. Communication is an equivalence relation on the set of states. If two states communicate with each other we say they belong to the same communicating class.

Definition 2.3 (Irreducible). A Markov chain is said to be irreducible if there is only one class, i.e., all states communicate with each other. A class C is closed if no state outside C can be reached from any state in C. A state i is called absorbing if the set \{i\} is a closed class, i.e., P_{ii} = 1.
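The Chapman-Kolmogorov identity can be checked numerically; the two-state transition matrix below is a hypothetical example, not taken from the notes.

```python
import numpy as np

# Hypothetical two-state chain; rows of P sum to 1.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

# n-step transition probabilities are matrix powers of P.
P2 = P @ P
P3 = P2 @ P

# Entry-wise Chapman-Kolmogorov: P_01^(2) = sum_k P_0k * P_k1.
p01_2 = sum(P[0, k] * P[k, 1] for k in range(2))
```

Here p01_2 equals P2[0, 1] (namely 0.09 + 0.06 = 0.15), and P3 equals P @ P2, which is the matrix form P^{n+m} = P^n P^m.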

Definition 2.4 (Period of a state). A state i is said to have period d_i if P_{ii}^n = 0 except when n = d_i, 2d_i, 3d_i, \dots (that is, d_i is the greatest common divisor of \{n : P_{ii}^n > 0\}).

Proposition 2.5. If i \leftrightarrow j then d_i = d_j.

Definition 2.6 (First transition time). For any states i and j, define f_{ij}^n as the probability that, starting in i, the first transition into j occurs at time n, that is,

  f_{ij}^0 = 0,
  f_{ij}^n = P(X_n = j, X_k \ne j \text{ for } k = 1, 2, \dots, n-1 \mid X_0 = i).

Let T_j be the random time of the first transition into state j. Thus we have

  P(T_j = n \mid X_0 = i) = f_{ij}^n.

If X_0 = j then T_j is the first return time to j. If X_0 = i \ne j then T_j is the first hitting time of j. We define the quantity f_{ij} by

  f_{ij} = \sum_{n=1}^\infty f_{ij}^n.

Then f_{ij} is the probability of ever making a transition into state j, given that the process starts in state i. We also have that f_{ij} > 0 if and only if state j is accessible from state i.

Definition 2.7 (Recurrent states). A state i is recurrent if

  P(X_n = i \text{ for some } n \ge 1 \mid X_0 = i) = 1.

A state i is recurrent if and only if f_{ii} = 1, and transient if and only if f_{ii} < 1.

Proposition 2.8. A state i is recurrent if and only if

  \sum_{n=1}^\infty P_{ii}^n = \infty,

and transient if and only if

  \sum_{n=1}^\infty P_{ii}^n < \infty.

Corollary. If i is recurrent and i \leftrightarrow j then state j is recurrent.

Corollary. If i \leftrightarrow j and j is recurrent then f_{ij} = 1.
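Proposition 2.8 can be illustrated on a small chain; the two-state example below, with state 0 absorbing and state 1 leaking into it, is a hypothetical illustration.

```python
import numpy as np

# Hypothetical two-state chain: state 0 is absorbing, state 1 stays put
# with probability 1/2 and falls into 0 otherwise.
P = np.array([[1.0, 0.0],
              [0.5, 0.5]])

# P_11^n = 0.5^n, so sum_n P_11^n = 1 < infinity: state 1 is transient
# by Proposition 2.8.
total = sum(np.linalg.matrix_power(P, n)[1, 1] for n in range(1, 200))

# The first-return probability f_11 is the chance of staying put for one
# step (once the chain leaves 1 it is absorbed at 0), so f_11 = 0.5 < 1.
f11 = P[1, 1]
```

The partial sum stabilises at 1, matching the identity \sum_n P_{11}^n = f_{11}/(1 - f_{11}) for a transient state.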

3. Limit theorems for Markov chains

Let m_{jj} be the mean recurrence time, that is, the expected number of transitions until a Markov chain starting in j returns to that state:

  m_{jj} = \infty if j is transient,
  m_{jj} = \sum_{n=1}^\infty n f_{jj}^n if j is recurrent.

Definition 3.1 (Ergodic state). Let j be recurrent. Then j is called positive recurrent if m_{jj} < \infty and null recurrent if m_{jj} = \infty. A positive recurrent, aperiodic state is called ergodic. We have 1/m_{jj} = 0 if state j is either transient or null recurrent; if j is positive recurrent then 0 < 1/m_{jj} < \infty.

Theorem 3.2. Assume i \leftrightarrow j. If state j is aperiodic, then

  \lim_{n \to \infty} P_{ij}^n = \frac{1}{m_{jj}}.

If state j has period d_j, then

  \lim_{n \to \infty} P_{jj}^{n d_j} = \frac{d_j}{m_{jj}}.

Definition 3.3 (Stationary distributions). A probability distribution \pi is called stationary for a Markov chain X with transition matrix P if

  \pi P = \pi.

We note that the stationary distribution \pi associated with a Markov chain is not necessarily unique, nor does it always exist.

Theorem 3.4 (Irreducible aperiodic Markov chains). Let X be an irreducible aperiodic Markov chain. Then either all states are transient or all are null recurrent; in that case \lim_{n \to \infty} P_{ij}^n = 0 for all i, j and there exists no stationary distribution. Or else all states are positive recurrent; in that case \Pi_j = \lim_{n \to \infty} P_{ij}^n > 0 for all i, j. Moreover \Pi is a stationary distribution, it is the unique stationary distribution, and for all j,

  \Pi_j = \frac{1}{m_{jj}}.

We note that in a finite-state, irreducible, aperiodic Markov chain all states are positive recurrent (indeed ergodic), so the stationary distribution exists and is unique. In a finite-state, irreducible, periodic Markov chain all states are still positive recurrent and the stationary distribution still exists and is unique, although the limits \lim_{n \to \infty} P_{ij}^n need not exist.
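For a finite irreducible aperiodic chain, the unique stationary distribution of Theorem 3.4 can be found by solving \pi P = \pi with \sum_j \pi_j = 1. A minimal sketch with a hypothetical two-state chain:

```python
import numpy as np

# Hypothetical irreducible aperiodic two-state chain.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

# Solve pi P = pi together with sum(pi) = 1: drop one redundant balance
# equation and replace it with the normalisation constraint.
n = P.shape[0]
A = np.vstack([(P.T - np.eye(n))[:-1], np.ones(n)])
b = np.zeros(n)
b[-1] = 1.0
pi = np.linalg.solve(A, b)
```

For this chain pi works out to (0.8, 0.2); it is fixed by P and agrees with the rows of large powers of P, as Theorem 3.4 predicts.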

4. First step analysis

Consider the Markov chain X with transition probability matrix

  P = \begin{pmatrix} 1 & 0 & 0 \\ \alpha & \beta & \gamma \\ 0 & 0 & 1 \end{pmatrix}.

We ask the following questions. In which state will the process ultimately be absorbed? How long, on average, will it take to reach one of the absorbing states?

Let T = \min\{n \ge 0 : X_n = 0 \text{ or } X_n = 2\} be the random time of absorption for the process X, and set

  u = P(X_T = 0 \mid X_0 = 1), \qquad v = E[T \mid X_0 = 1].

Thus u is the probability of being absorbed in state 0 given we start in state 1, and v is the expected time to absorption given we start in state 1. Conditioning on the first step, we have

  u = \alpha \cdot 1 + \beta \cdot u + \gamma \cdot 0,

and similarly

  v = 1 + \alpha \cdot 0 + \beta \cdot v + \gamma \cdot 0,

so that u = \alpha/(1 - \beta) and v = 1/(1 - \beta).

5. Branching processes

Definition 5.1 (Branching process). Let X_0 = 1 and for every n = 0, 1, 2, \dots define

  X_{n+1} = \sum_{i=1}^{X_n} Z_i,

where the Z_i are independent, identically distributed random variables with probability distribution P(Z = k) = P_k, k = 0, 1, 2, \dots. Then X = \{X_n, n \ge 0\} is a branching process.

We have the following formulas.

Lemma 5.2. With \mu = E[Z], we have

  E[X_n] = \mu E[X_{n-1}] = \mu^n.
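Solving the first-step equations of Section 4 gives u = \alpha/(1 - \beta) and v = 1/(1 - \beta). The sketch below checks this numerically for hypothetical values of \alpha, \beta, \gamma.

```python
import numpy as np

# Hypothetical parameters for the absorbing chain of Section 4
# (states {0, 1, 2}; states 0 and 2 are absorbing).
alpha, beta, gamma = 0.3, 0.5, 0.2
P = np.array([[1.0, 0.0, 0.0],
              [alpha, beta, gamma],
              [0.0, 0.0, 1.0]])

u = alpha / (1 - beta)   # P(absorbed at 0 | X_0 = 1), from u = alpha + beta*u
v = 1 / (1 - beta)       # E[T | X_0 = 1], from v = 1 + beta*v

# Large powers of P give the absorption probabilities directly:
# row 1 of P^n converges to (u, 0, 1 - u).
Pn = np.linalg.matrix_power(P, 200)
```

With these values u = 0.6 and v = 2, matching the long-run transition matrix.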

The variance of X_n is

  Var(X_n) = \sigma^2 \mu^{n-1} \frac{\mu^n - 1}{\mu - 1} if \mu \ne 1,
  Var(X_n) = n \sigma^2 if \mu = 1.

Proposition 5.3 (Extinction probabilities). Define u_n to be the probability of extinction at or prior to the n-th generation, assuming X_0 = 1, so that

  u_n = P(X_n = 0 \mid X_0 = 1).

Then we have

  u_n = \sum_{k=0}^\infty P_k (u_{n-1})^k,

where P_k = P(Z = k).

Proposition 5.4. Let \Pi_0 = \lim_{n \to \infty} P(X_n = 0) be the probability of eventual extinction, and define the probability generating functions G_n(s) = E[s^{X_n}] = \sum_{j=0}^\infty s^j P(X_n = j) and G(s) = \sum_{j=0}^\infty s^j P_j. Then

  \Pi_0 = G(\Pi_0).

We have \Pi_0 = 1 if and only if \mu \le 1; if \mu > 1 then \Pi_0 < 1 is the smallest positive number p satisfying G(p) = p.

6. Poisson process

Definition 6.1 (Poisson process). A process N = \{N(t), t \ge 0\} is a Poisson process if N(0) = 0, N is a process of independent increments, and the number of events in any interval of length t has a Poisson distribution with parameter \lambda t, that is,

  P(N(t + s) - N(s) = n) = e^{-\lambda t} \frac{(\lambda t)^n}{n!}.

The positive parameter \lambda > 0 is called the intensity of the process. We have E[N(t)] = \lambda t and Var(N(t)) = \lambda t.

Lemma 6.2. The sum of two independent Poisson processes of intensities \lambda and \mu is a Poisson process of intensity \lambda + \mu.

7. Sojourn and waiting times

Definition 7.1 (Sojourn time). For n = 0, 1, 2, \dots, the sojourn time in state n equals

  S_n = \inf\{t \in [0, \infty) : N(t + S_{n-1}) - N(S_{n-1}) = 1\}.
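Lemma 7.2 below states that the sojourn times are independent Exp(\lambda) variables; taking that as given, a Poisson process can be simulated by cumulative sums of exponential sojourns. A quick sanity check (hypothetical parameter values) that E[N(t)] = Var(N(t)) = \lambda t:

```python
import numpy as np

# Simulate a Poisson process of intensity lam via iid Exp(lam) sojourns:
# W_n = S_0 + ... + S_{n-1} are the arrival times, and N(t) counts the
# arrivals with W_n <= t.
rng = np.random.default_rng(0)
lam, t, reps = 2.0, 10.0, 20000

sojourns = rng.exponential(scale=1 / lam, size=(reps, 100))
arrival_times = np.cumsum(sojourns, axis=1)   # waiting times W_1, W_2, ...
counts = (arrival_times <= t).sum(axis=1)     # N(t), one per replication

mean_count = counts.mean()
var_count = counts.var()
```

Both the sample mean and the sample variance of N(10) come out close to \lambda t = 20, as Definition 6.1 requires.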

Lemma 7.2. The sojourn times S_n are independent, identically distributed exponential random variables with parameter \lambda.

Definition 7.3 (Waiting time). For n = 0, 1, 2, \dots, the waiting time until the n-th event equals

  W_n = \inf\{t \in [0, \infty) : N(t) = n\}.

Lemma 7.4. If the X_i are independent, identically distributed exponential random variables with parameter \lambda, then \sum_{i=1}^n X_i has a gamma distribution with parameters n and \lambda, that is, \Gamma(n, \lambda).

Proposition 7.5. Given N(t) = 1, the waiting time W_1 = S_0 = T_1 has the uniform distribution on [0, t].

Theorem 7.6. Given that N(t) = n, the waiting times (W_1, \dots, W_n) have the same conditional joint probability density as the order statistics of n independent, identically distributed uniform random variables on the interval [0, t].

Lemma 7.7. If X_i, i = 1, 2, are independent exponential random variables with parameters \lambda_i, then

  P(X_1 < X_2) = \frac{\lambda_1}{\lambda_1 + \lambda_2}.

8. Continuous-time Markov chains

Consider a continuous-time Markov process X = \{X(t), t \ge 0\} whose transition probabilities

  P_{ij}(t) = P(X(t + s) = j \mid X(s) = i)

are independent of s. Such a process is called time-homogeneous, or stationary.

Definition 8.1 (Pure birth process). A pure birth process is a Markov process satisfying the following conditions:

  P_{k,k+1}(h) = \lambda_k h + o(h),
  P_{k,k}(h) = 1 - \lambda_k h + o(h),
  P(X(t + h) - X(t) < 0 \mid X(t) = k) = 0,

with X(0) = 0 or X(0) = 1.

Example 8.2. Let us review some interesting special cases of birth intensities \lambda_k.

Poisson process: \lambda_k = \lambda for all k.

Simple birth: \lambda_k = k\beta. This model gives the probabilities

  P(X(t + h) - X(t) = m \mid X(t) = k) =
    1 - k\beta h + o(h) if m = 0,
    k\beta h + o(h) if m = 1,
    o(h) otherwise.

Simple birth with immigration: \lambda_k = k\beta + \nu.

9. Differential equations for marginal probabilities

Consider a pure birth process and let P_n(t) = P(X(t) = n). Assume P_0(0) = 1.

Theorem 9.1. The functions P_n satisfy the following system of ordinary differential equations:

  P_0'(t) = -\lambda_0 P_0(t),
  P_n'(t) = -\lambda_n P_n(t) + \lambda_{n-1} P_{n-1}(t).

Lemma 9.2. The marginal probability functions P_n satisfy

  P_0(t) = e^{-\lambda_0 t}

and

  P_n(t) = \lambda_{n-1} e^{-\lambda_n t} \int_0^t e^{\lambda_n s} P_{n-1}(s) \, ds.

As above, define S_k to be the time between the k-th and (k+1)-th birth. Then we have the following lemma.

Lemma 9.3. The random variables S_k are independent and S_k has an exponential distribution with parameter \lambda_k.

10. Birth and death processes

Definition 10.1 (Birth and death process). A birth and death process is a Markov process satisfying the following conditions:

  P_{k,k+1}(h) = \lambda_k h + o(h),
  P_{k,k}(h) = 1 - (\lambda_k + \mu_k) h + o(h),
  P_{k,k-1}(h) = \mu_k h + o(h),

with X(0) = 0 or X(0) = 1 (so jumps of size two or more have probability o(h)).

Proposition 10.2 (Chapman-Kolmogorov equation).

  P_{ij}(t + s) = \sum_{k=0}^\infty P_{ik}(t) P_{kj}(s).

Proposition 10.3 (Sojourn times). We wish to calculate the sojourn time in state i, denoted S_i. Defining D_i(t) = P(S_i > t), we have

  D_i(t) = e^{-(\lambda_i + \mu_i) t}.

Thus S_i is exponentially distributed with parameter \lambda_i + \mu_i.
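For the constant-rate case \lambda_n = \lambda (the Poisson process), the system in Theorem 9.1 can be integrated numerically and compared with the known solution P_n(t) = e^{-\lambda t}(\lambda t)^n/n!. A minimal Euler sketch, truncated at N states (step size and truncation level are illustrative choices):

```python
import numpy as np
from math import exp, factorial

# Euler integration of P_0' = -lam*P_0, P_n' = -lam*P_n + lam*P_{n-1}
# for constant rates lam_n = lam, truncated at N states.
lam, T, steps, N = 1.0, 1.0, 20000, 30
dt = T / steps

P = np.zeros(N)
P[0] = 1.0                     # initial condition P_0(0) = 1
for _ in range(steps):
    dP = -lam * P
    dP[1:] += lam * P[:-1]     # inflow from state n-1
    P = P + dt * dP

# Exact marginals for comparison: P_n(t) = e^{-lam t} (lam t)^n / n!.
exact = np.array([exp(-lam * T) * (lam * T) ** n / factorial(n)
                  for n in range(N)])
err = float(np.max(np.abs(P - exact)))
```

The Euler error is O(dt), so the integrated marginals match the Poisson probabilities to a few parts in ten thousand here, and total probability is conserved up to the (negligible) truncation leakage.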

Proposition 10.4. The process stays in a given state i for a random length of time whose distribution is exponential with parameter \lambda_i + \mu_i. The process then leaves state i by entering state i + 1 or i - 1 with probabilities \frac{\lambda_i}{\lambda_i + \mu_i} and \frac{\mu_i}{\lambda_i + \mu_i} respectively.

Proposition 10.5. The transition probabilities P_{ij}(t) satisfy a system of ordinary differential equations known as the backward Kolmogorov equations. For a fixed j,

  P_{0j}'(t) = -\lambda_0 P_{0j}(t) + \lambda_0 P_{1j}(t),
  P_{ij}'(t) = \mu_i P_{i-1,j}(t) - (\lambda_i + \mu_i) P_{ij}(t) + \lambda_i P_{i+1,j}(t),

with initial conditions P_{ij}(0) = \delta_{ij}. In matrix notation,

  P'(t) = A P(t), \quad P(0) = I.

Proposition 10.6. The transition probabilities P_{ij}(t) also satisfy the forward Kolmogorov equations. For a fixed i,

  P_{i0}'(t) = -\lambda_0 P_{i0}(t) + \mu_1 P_{i1}(t),
  P_{ij}'(t) = \lambda_{j-1} P_{i,j-1}(t) - (\lambda_j + \mu_j) P_{ij}(t) + \mu_{j+1} P_{i,j+1}(t),

with initial conditions P_{ij}(0) = \delta_{ij}. In matrix notation,

  P'(t) = P(t) A, \quad P(0) = I.

From the forward Kolmogorov equations we can derive a system of ordinary differential equations for the marginal probabilities P_j(t) = P(X(t) = j). We have

  P_0'(t) = -\lambda_0 P_0(t) + \mu_1 P_1(t),
  P_j'(t) = \lambda_{j-1} P_{j-1}(t) - (\lambda_j + \mu_j) P_j(t) + \mu_{j+1} P_{j+1}(t),

with P_j(0) = \pi_0(j). In matrix notation,

  \pi_t' = \pi_t A.
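The matrix A above is the generator of the birth and death process: \lambda_j on the superdiagonal, \mu_j on the subdiagonal, and -(\lambda_j + \mu_j) on the diagonal. A minimal sketch with constant, hypothetical rates (a truncated M/M/1-style chain), checking that the detailed-balance distribution is stationary, i.e. the time-independent case \pi A = 0 of \pi_t' = \pi_t A:

```python
import numpy as np

# Generator A of a birth and death chain truncated to states 0..N,
# with constant birth rate lam and death rate mu (hypothetical values).
lam, mu, N = 1.0, 2.0, 20
A = np.zeros((N + 1, N + 1))
for j in range(N + 1):
    if j < N:
        A[j, j + 1] = lam       # birth rate lambda_j
    if j > 0:
        A[j, j - 1] = mu        # death rate mu_j
    A[j, j] = -A[j].sum()       # generator rows sum to zero

# Detailed balance pi_j * lam = pi_{j+1} * mu gives pi_j ~ (lam/mu)^j;
# this pi solves the stationarity condition pi A = 0.
pi = (lam / mu) ** np.arange(N + 1)
pi /= pi.sum()
resid = pi @ A
```

The residual \pi A vanishes to machine precision, including at the truncated boundary states.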

11. Time Series Analysis

Definition 11.1 (Time series). Let \{X_t\} be a time series with E[X_t^2] < \infty. The mean function of \{X_t\} is

  \mu_X(t) = E[X_t].

The covariance function of \{X_t\} is

  \gamma_X(r, s) = Cov(X_r, X_s) = E[(X_r - \mu_X(r))(X_s - \mu_X(s))].

Definition 11.2 (Stationary). A time series \{X_t\} is weakly stationary if \mu_X(t) is independent of t, and \gamma_X(t + h, t) is independent of t for each h.

Definition 11.3 (Autocovariance function). Let \{X_t\} be a stationary time series. The autocovariance function (ACVF) of \{X_t\} at lag h is

  \gamma_X(h) = Cov(X_{t+h}, X_t).

The autocorrelation function (ACF) of \{X_t\} at lag h is

  \rho_X(h) = \frac{\gamma_X(h)}{\gamma_X(0)} = Cor(X_{t+h}, X_t).

Definition 11.4 (Sample autocovariance function). The sample autocovariance function is

  \hat\gamma(h) = \frac{1}{n} \sum_{t=1}^{n-|h|} (x_{t+|h|} - \bar{x})(x_t - \bar{x}).

The sample autocorrelation function is

  \hat\rho(h) = \frac{\hat\gamma(h)}{\hat\gamma(0)}.

12. ARMA(p, q) Models

Definition 12.1 (ARMA(p, q) process). A time series \{X_t\} is an ARMA(p, q) process if \{X_t\} is stationary and if for every t,

  X_t - \phi_1 X_{t-1} - \dots - \phi_p X_{t-p} = Z_t + \theta_1 Z_{t-1} + \dots + \theta_q Z_{t-q},

where \{Z_t\} \sim WN(0, \sigma^2) and the polynomials \phi(z) and \theta(z) have no common factors.

Theorem 12.2 (Stationarity). A stationary solution \{X_t\} of the ARMA equations exists, and is the unique stationary solution, if and only if

  \phi(z) = 1 - \phi_1 z - \dots - \phi_p z^p \ne 0 \quad \text{for all } |z| = 1.
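The sample ACVF and ACF of Definition 11.4 can be implemented directly; the AR(1) simulation below (parameter values are illustrative assumptions) checks the estimate against the true ACF \rho(h) = \phi^{|h|}.

```python
import numpy as np

def sample_acvf(x, h):
    """Sample autocovariance gamma_hat(h), h >= 0, with the 1/n convention."""
    x = np.asarray(x, dtype=float)
    n, xbar = len(x), x.mean()
    return float(((x[h:] - xbar) * (x[:n - h] - xbar)).sum() / n)

def sample_acf(x, h):
    """Sample autocorrelation rho_hat(h) = gamma_hat(h) / gamma_hat(0)."""
    return sample_acvf(x, h) / sample_acvf(x, 0)

# Sanity check on a simulated AR(1), whose true ACF is phi^|h|.
rng = np.random.default_rng(1)
phi, n = 0.7, 50000
z = rng.standard_normal(n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + z[t]
```

With 50,000 observations the sample ACF at small lags lands within a couple of standard errors of \phi^h.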

Theorem 12.3 (Causality). An ARMA(p, q) process is causal if there exist constants \{\psi_j\} with \sum_{j=0}^\infty |\psi_j| < \infty and

  X_t = \sum_{j=0}^\infty \psi_j Z_{t-j} \quad \text{for all } t.

Causality is equivalent to the condition

  \phi(z) = 1 - \phi_1 z - \dots - \phi_p z^p \ne 0 \quad \text{for all } |z| \le 1.

The coefficients \psi_j can be determined from the identity

  \psi(z) = \psi_0 + \psi_1 z + \dots = \frac{\theta(z)}{\phi(z)}.

Definition 12.4 (Invertibility). An ARMA(p, q) process \{X_t\} is invertible if there exist constants \{\pi_j\} with \sum_{j=0}^\infty |\pi_j| < \infty such that

  Z_t = \sum_{j=0}^\infty \pi_j X_{t-j} \quad \text{for all } t.

Invertibility is equivalent to the condition

  \theta(z) = 1 + \theta_1 z + \dots + \theta_q z^q \ne 0 \quad \text{for all } |z| \le 1.

The coefficients \pi_j can be determined from the identity

  \pi(z) = \pi_0 + \pi_1 z + \dots = \frac{\phi(z)}{\theta(z)}.

Definition 12.5 (Partial autocorrelation function). The partial autocorrelation function \alpha(\cdot) satisfies

  \alpha(0) = 1, \quad \alpha(1) = \rho(1), \quad \alpha(2) = \frac{\rho(2) - \rho(1)^2}{1 - \rho(1)^2},

and in general \alpha(h) is the last component of

  \phi_h = \Gamma_h^{-1} \gamma_h,

where \Gamma_h is the covariance matrix and \gamma_h is the vector of covariances (\gamma(1), \dots, \gamma(h))'.

Theorem 12.6. The PACF of an AR(p) process satisfies \alpha(p) = \phi_p

and \alpha(h) = 0 for h > p.

Theorem 12.7 (Stationary AR(2) process). An AR(2) process X_t = \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + Z_t is stationary if and only if the following all hold:

  \alpha_1 + \alpha_2 < 1, \quad \alpha_1 - \alpha_2 > -1, \quad \alpha_2 > -1.

13. ARIMA Models

Definition 13.1 (ARIMA process). If d is a nonnegative integer, then \{X_t\} is an ARIMA(p, d, q) process if Y_t = (1 - B)^d X_t is a causal ARMA(p, q) process.

14. Model selection

To select a model, we first transform the data until we obtain a stationary process. Then we consider the ACF and PACF. A few general points:

- An ACF decaying slowly (slower than exponential decay) indicates that the data needs to be differenced, i.e., that an ARIMA(p, d, q) model should be considered.
- A PACF that cuts off sharply after lag p indicates an AR(p) model. This can be quantified by Quenouille's test.

Theorem 14.1 (Quenouille's test). If a process is AR(m) and N is large, then

  \hat\pi_k \approx N(0, 1/N), \quad k > m.

Thus any \hat\pi_k satisfying |\hat\pi_k| > 2/\sqrt{N} for k > m indicates disagreement with H_0 at (approximately) the 5% significance level.

- An MA(1) process must have |\rho(1)| \le 0.5.
- An MA(q) process must have \rho(k) = 0 for all k > q.
- An AR(p) process has an ACF exhibiting exponential decay.

To test the fit of a model, we use the Box-Pierce test and the AIC statistic.

Theorem 14.2 (Box-Pierce test). Assume an ARIMA(p, d, q) model has been fitted to a time series, and consider the series of residuals \{s_t\}. Then, for a fixed maximum lag K, we consider the statistic

  BP = N \sum_{k=1}^K \hat{r}_k^2

where \hat{r}_k^2 is the squared sample autocorrelation coefficient at lag k of the residual series. Under the null hypothesis, BP is approximately \chi^2-distributed with K - p - q degrees of freedom. The statistic is used to test the hypothesis that none of the autocorrelation coefficients of the residuals differ from zero against the alternative that at least one is nonzero.

Theorem 14.3 (AIC). The AIC statistic for a fitted model is the sum of -2 times the maximised log-likelihood (a measure of model fit) and a penalty proportional to the number of parameters (p and q). The best model is the one that minimises the AIC.

15. Prediction using ARIMA(p, d, q) models

Definition 15.1 (Best linear predictor). The best linear predictor of Y in terms of X_i, i = 1, 2, \dots, n, in the least squares sense, is the linear function

  \lambda(X) = \beta_0 + \sum_{i=1}^n \beta_i X_i,

where \beta_i, i = 0, 1, \dots, n, are chosen to minimise

  E\left[\left(Y - \beta_0 - \sum_{i=1}^n \beta_i X_i\right)^2\right].

Theorem 15.2 (Properties of the best linear predictor). If \hat\lambda(X) = \hat\beta_0 + \sum_{i=1}^n \hat\beta_i X_i is the best linear predictor of Y in terms of X_i, i = 1, \dots, n, then the residual U = Y - \hat\lambda(X) has expectation 0 and is uncorrelated with every linear function \lambda(X) of X_1, \dots, X_n.

Theorem 15.3 (Uniqueness of the best linear predictor). Let \hat\lambda(X) = \hat\beta_0 + \sum_{i=1}^n \hat\beta_i X_i be the best linear predictor and U = Y - \hat\lambda(X). Suppose \lambda^*(X) = \beta_0^* + \sum_{i=1}^n \beta_i^* X_i and U^* = Y - \lambda^*(X) satisfy:

  (1) E[U^*] = 0,
  (2) U^* is uncorrelated with every linear function of X_1, X_2, \dots, X_n.

Then \lambda^*(X) = \hat\lambda(X).

Example 15.4 (Best linear predictor for AR(p)). Let \{X_t\} be an AR(p) process. Then

  X_{t-1}(1) = \mu + \alpha_1 X_{t-1} + \dots + \alpha_p X_{t-p}

is the best linear predictor of X_t in terms of X_{t-1}, \dots, X_{t-p}.
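Returning to Theorem 14.2, the Box-Pierce statistic is straightforward to compute from a residual series. A minimal sketch on simulated white-noise residuals (for which BP should look like a \chi^2_K draw, with mean K):

```python
import numpy as np

def box_pierce(residuals, K):
    """Box-Pierce statistic BP = N * sum_{k=1}^K r_k^2 on a residual
    series; compare against chi^2 with K - p - q degrees of freedom."""
    x = np.asarray(residuals, dtype=float)
    N, xbar = len(x), x.mean()
    denom = float(((x - xbar) ** 2).sum())
    bp = 0.0
    for k in range(1, K + 1):
        r_k = float(((x[k:] - xbar) * (x[:-k] - xbar)).sum()) / denom
        bp += r_k ** 2
    return N * bp

# Pure white noise: BP should be an unremarkable chi^2_10-sized value.
rng = np.random.default_rng(2)
bp = box_pierce(rng.standard_normal(5000), K=10)
```

A large BP (far in the upper tail of \chi^2_{K-p-q}) would indicate residual autocorrelation and hence a poorly fitting model.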

Example 15.5 (Best linear predictor for ARMA(p, q)). Let \{X_t\} be an ARMA(p, q) process. Then

  X_{t-1}(1) = \mu + \alpha_1 X_{t-1} + \dots + \alpha_p X_{t-p} + \beta_1 Z_{t-1} + \dots + \beta_q Z_{t-q}.

Since

  X_{N+1} - X_N(1) = Z_{N+1} \sim N(0, \sigma_Z^2),

a (1 - \alpha) \cdot 100\% confidence interval for X_{N+1} is given by

  X_N(1) \pm \Phi^{-1}\left(1 - \frac{\alpha}{2}\right) \sigma_Z.

Example 15.6 (Two-step predictors for an AR(p) process). The best linear predictor X_{t-2}(2) is given by

  X_{t-2}(2) = \mu + \alpha_1 X_{t-2}(1) + \alpha_2 X_{t-2} + \dots + \alpha_p X_{t-p},

where X_{t-2}(1) is the one-step predicted value for the AR(p) process above. The corresponding (1 - \alpha) \cdot 100\% confidence interval for X_t is

  X_{t-2}(2) \pm \Phi^{-1}\left(1 - \frac{\alpha}{2}\right) \sqrt{(1 + \alpha_1^2) \sigma_Z^2}.
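The prediction formulas of Examples 15.4-15.6 specialise neatly to an AR(1); the sketch below uses hypothetical parameter values (\mu, \phi, \sigma_Z and the last observation are illustrative assumptions).

```python
import math

# One- and two-step forecasts with 95% prediction intervals for a
# hypothetical AR(1): X_t = mu + phi * X_{t-1} + Z_t.
mu, phi, sigma_z = 0.0, 0.6, 1.0
x_t = 2.0                            # last observed value

pred1 = mu + phi * x_t               # one-step predictor X_t(1)
pred2 = mu + phi * pred1             # two-step predictor X_t(2)

z = 1.96                             # approx. Phi^{-1}(0.975)
half1 = z * sigma_z                  # one-step interval half-width
half2 = z * math.sqrt((1 + phi**2) * sigma_z**2)   # two-step half-width
ci2 = (pred2 - half2, pred2 + half2)
```

As expected the two-step interval is wider than the one-step interval by the factor \sqrt{1 + \phi^2}, reflecting the extra innovation accumulated over the second step.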