STAT 443 (Winter 2014) Forecasting


STAT 443 (Winter 2014) Forecasting
Prof. R. Ramezan, University of Waterloo
LaTeXer: W. Kong
Last Revision: September 3, 2014

Table of Contents

1 Introduction
2 Time Series Models
  2.1 Zero Mean Models
  2.2 Models with Trend
  2.3 Models with Seasonal Component
3 Stationary Models
  3.1 Moving Averages
  3.2 The Sample Autocorrelation Function
  3.3 Linear Regression
4 Prediction
  4.1 Model Selection
  4.2 Interpolation vs Extrapolation
  4.3 Tests on Residuals
5 Smoothing Methods
  5.1 Trend Estimation
  5.2 Estimating Seasonality
  5.3 Modeling Residuals
6 Stationary and Linear Processes
  6.1 Linear Prediction
  6.2 Linear Processes
  6.3 Box-Jenkins Models
  6.4 Invertibility
  6.5 Partial Autocorrelation Function (PACF)
  6.6 ARIMA(p, d, q) Processes
  6.7 SARIMA(p, d, q) × (P, D, Q)_S Processes
7 Box-Jenkins Methodology
8 Parameter Estimation in ARMA Processes
  8.1 Yule-Walker Methods
9 Likelihood Models
10 Forecasting
  10.1 Forecasting AR(p)
  10.2 Forecasting MA(q)
  10.3 Forecasting ARMA(p, q) Processes

These notes are currently a work in progress, and as such may be incomplete or contain errors.

ACKNOWLEDGMENTS: Special thanks to Michael Baker and his LaTeX-formatted notes. They were the inspiration for the structure of these notes.

Abstract. The purpose of these notes is to provide the reader with a secondary reference to the material covered in STAT 443. The formal prerequisite to this course is STAT 331, but this author believes that the overlap between the two courses is less than 10%. Readers should have a good background in linear algebra, basic statistics, and calculus before enrolling in this course.

1 Introduction

A time series is a sub-class of stochastic processes which are indexed by time and can be represented by {X_t : t ∈ T}. Let T be an index set. The sequence of random variables {X_t : t ∈ T} is a stochastic process if X_t is a random variable for all t ∈ T. If T is a set of time points, then {X_t} is a time series. In this course, we will assume that T indexes time points. If T is a discrete (continuous) set, then the time series {X_t} is said to be a discrete (continuous) time series. The main focus of this course is to develop models for discrete time series.

Example 1.1. When we write x_5 = 10, we mean that the value of x at time 5 is equal to 10.

2 Time Series Models

Our interest lies in modeling and analyzing data collected over time (time series). Ideally, given a discrete stochastic process {X_t}, we want the joint distribution of X_1, X_2, ..., X_n for all n ∈ N. In real-world applications this is generally not possible because we do not have enough information to fully specify the joint distribution. The good news is that in most cases the information about the joint distribution is captured by the first two moments and the covariances between pairs of random variables. In other words, E(X_t), E(X_t^2) and E(X_t X_{t'}) for any t, t' ∈ N summarize most of the information content of the process. If the joint distribution is multivariate normal, then the three expectations above fully specify the joint distribution. Recall that the multivariate normal distribution is written as N_p(µ, Σ), where p is the dimension, µ is the mean vector, and Σ is the p × p variance-covariance matrix. It is easy to see that µ and Σ are parametrized by the three expectations above, which we now call (*). Since (*) contains a fair amount of information, instead of working with the joint distribution we will work with time series models which employ (*).

Definition 2.1. A time series model for observed data {x_t} is a specification of the joint distributions (or possibly only of (*)) of a sequence of random variables {X_t} of which {x_t} is postulated to be a realization.

2.1 Zero Mean Models

(1) iid noise: If {X_1, ..., X_k} are iid random variables, then

P(X_1 ≤ x_1, ..., X_k ≤ x_k) = ∏_{n=1}^{k} P(X_n ≤ x_n) = ∏_{n=1}^{k} P(X_1 ≤ x_n),

so the joint is defined by one marginal distribution with zero mean. Observe that, using the independence assumption,

P(X_{n+h} ≤ x | X_1 ≤ x_1, ..., X_n ≤ x_n) = P(X_{n+h} ≤ x).

(2) Random walk: {S_t, t ∈ N}, starting at S_0 = 0, is a random walk if S_t = ∑_{k=1}^{t} X_k, where the X_k are white noise random variables.

(3) White noise (zero mean): A white noise process is a sequence of uncorrelated random variables {X_t}, each with constant mean 0 and constant variance σ^2. We denote this by {X_t} ∼ WN(0, σ^2).
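The following is a minimal R sketch, not part of the original notes, that simulates the three zero-mean models above (iid noise, which is also a white noise, and the random walk built from it) so that their sample paths can be compared.

set.seed(443)
n <- 200
Z <- rnorm(n)      # iid N(0,1) noise, which is also a white noise WN(0,1)
S <- cumsum(Z)     # random walk S_t = X_1 + ... + X_t, with S_0 = 0

op <- par(mfrow = c(2, 1))
plot.ts(Z, main = "iid / white noise: constant mean and variance")
plot.ts(S, main = "random walk: variance grows with t, not stationary")
par(op)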

2.2 Models with Trend

Consider the model X_t = m_t + Y_t, where m_t is a slowly changing function called the trend and Y_t has zero mean. We have E(X_t) = m_t for all t. Notice that m_t is a non-random function of time t. This trend component can be linear, quadratic, or any kind of arbitrary function.

Example 2.1. Consider X_t = m_t + Y_t where m_t = 2 + t and Y_t ∼ N(0, 1). This is a linear trend with a perturbation modeled as a standard normal random variable.

2.3 Models with Seasonal Component

In a similar setup to the previous case (models with trend) we can write X_t = s_t + Y_t, where E(Y_t) = 0 for all t and s_t is a periodic function (the seasonal component) with period d (s_t = s_{t+d} for all t). In a sense, s_t is a particular kind of trend. An example would be the seasonal component s_t = α_0 + α_1 cos(α_2 t). We can also use indicator functions (on certain months) for the seasonal component. In both models X_t = m_t + Y_t and X_t = s_t + Y_t, the parameters α_0, α_1, α_2, ... are usually estimated by maximum likelihood or least squares methods.

Notice that in the general case we can use the model X_t = m_t + s_t + Y_t, which contains both the seasonal and the trend component. This is called the classical decomposition and will be frequently referred to in this course. We can use regression models to estimate m_t and s_t.

Example 2.2. Consider the average seasonal temperature over many years, where Z_1 = Spring of 2004, Z_2 = Summer of 2004, ..., up to Z_20. Suppose that we want to fit a model of the form X_t = m_t + s_t + Y_t where m_t is polynomial in t. We use the dummy coding contrasts matrix in the form [I 0]^T, in the order Spring, Summer, Fall, Winter. Suppose that X_1, X_2, X_3, X_4 represent the categories of the seasons. Then we use the general model

Z_t = ∑_{i=0}^{p} β_i t^i  (the trend m_t)  +  ∑_{j=1}^{3} α_j X_j  (the seasonal component s_t)  +  Y_t  (the error).

We should use the rule that if a periodic trend with period d is being modeled through regression analysis, d − 1 binary variates should be introduced into the model.

Now consider Example 2 (generated from R code on the UW Learn page). In this example, we fitted the model

ln(Y_t) = ∑_{i=1}^{3} β_i t^i + ∑_{j=3}^{11} β_j x_{j−2} + R_t,

with R_t being the random component, which is iid N(0, σ^2). Although the model is good in terms of fit, it does not satisfy the fundamental assumption of independent residuals. Therefore, if interest lies in forecasting, this model fails. To be able to check the independence of residuals, as well as to introduce a new class of time series models, the concept of stationarity should be introduced.

3 Stationary Models

Definition 3.1. The time series {X_t : t ∈ T} is called strictly (strongly) stationary if the joint distribution of (X_{t_1}, X_{t_2}, ..., X_{t_n}) is the same as that of (X_{t_1−k}, X_{t_2−k}, ..., X_{t_n−k}) for all n, t_1, ..., t_n, k ∈ N. In other words, {X_t} is strictly stationary if all of its statistical properties remain the same under time shifts.

In practice, strict stationarity is too limiting an assumption and rarely holds true. We mentioned earlier that a lot of the information about the joint distributions is provided by the moments E[X_t], E[X_t^2] and E[X_t X_{t'}] for all t, t'. This motivates introducing a type of stationarity based on these lower-order moments, which we will call weak stationarity. To introduce weak stationarity, we need some more definitions first.

Definition 3.2. Let {X_t} be a time series with E[X_t^2] < ∞. The mean function of {X_t} is µ_X(t) = µ_t = E[X_t], and the covariance function of {X_t} is γ_X(r, s) = Cov(X_r, X_s) = E[(X_r − µ_X(r))(X_s − µ_X(s))].

Definition 3.3. The time series {X_t} with E[X_t^2] < ∞ is said to be weakly stationary if:

1. µ_X(t) = E[X_t] is independent of t.
2. γ_X(t, t + h) = Cov(X_t, X_{t+h}) is independent of t for all h; the covariance depends only on the lag h and not on t.

3. E[X_t^2] < ∞ is also one of the conditions for weak stationarity.

In view of the second condition above, when we use the term covariance function with reference to a stationary time series {X_t}, we shall mean the function of one variable defined by γ_X(h) := γ_X(h, 0) = γ_X(t + h, t) = γ_X(t, t + h).

Exercise 3.1. If E(X_t^2) < ∞, show that strict stationarity implies weak stationarity.

Definition 3.4. Let {X_t} be a stationary time series. The autocovariance function (ACVF) of {X_t} at lag h is γ_X(h) = Cov(X_{t+h}, X_t). The autocorrelation function (ACF) of {X_t} at lag h is ρ_X(h) = γ_X(h)/γ_X(0) = Corr(X_{t+h}, X_t).

Example 3.1. Investigate the stationarity of white noise. Let {X_t} ∼ WN(0, σ^2). Automatically σ^2 < ∞ so Var(X_t) < ∞, E[X_t] = 0 does not depend on t, and Cov(X_t, X_{t+h}) = σ^2 δ_{h0}, where δ is the Kronecker delta (the covariance is σ^2 at lag 0 and 0 otherwise). Hence, white noise is weakly stationary.

Example 3.2. A random walk {S_t} is not stationary because Var(S_t) = tσ^2.

Notation 1. Whenever we refer to a stationary time series from now on (since Jan 16, 2014), we mean weakly stationary unless otherwise specified.

3.1 Moving Averages

Example 3.3. Consider the process X_t = Z_t + θZ_{t−1}, where t ∈ Z and {Z_t} ∼ WN(0, σ^2). This process is called the first-order moving average [MA(1)]. Show that {X_t} is stationary. It can be shown that Var(X_t) = σ^2(1 + θ^2) < ∞, E(X_t) = 0, and

Cov(X_t, X_{t+h}) = σ^2(1 + θ^2) if h = 0;  θσ^2 if |h| = 1;  0 if |h| > 1,

and hence, because all of these are independent of t and the variance is finite, X_t is stationary. We can also derive the ACF as

ρ(h) = γ(h)/γ(0) = 1 if h = 0;  θ/(1 + θ^2) if |h| = 1;  0 if |h| > 1.

Note that this illustrates that γ(h) is an even function.

Example 3.4. Let {X_t} be a stationary time series satisfying the equations X_t = φX_{t−1} + Z_t for t ∈ Z, where |φ| < 1 and {Z_t} ∼ WN(0, σ^2). Also let Z_t and X_s be uncorrelated for each s < t. Then the time series {X_t} is called an autoregressive process of order 1 [AR(1)]. Note that because {X_t} is stationary, we have E[X_t] = µ = φµ, so µ = 0 for any t. We also have γ(0) = φ^2 γ(0) + σ^2, so γ(0) = σ^2/(1 − φ^2). Now if h > 0, multiply both sides of the defining equation for X_t by X_{t−h} and take expectations to get

E[X_t X_{t−h}] = φ E[X_{t−1} X_{t−h}] + E[Z_t X_{t−h}],  i.e.  γ(h) = φγ(h − 1).

Repeating the same trick for h < 0 gives γ(h) = φ^{|h|} γ(0) = φ^{|h|} σ^2/(1 − φ^2), and hence the ACF is ρ(h) = φ^{|h|}.
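A short, hedged R illustration of Examples 3.3 and 3.4: the theoretical ACFs ρ(1) = θ/(1 + θ^2) for MA(1) and ρ(h) = φ^|h| for AR(1), compared with sample ACFs of simulated series. The parameter values θ = 0.6 and φ = 0.7 are chosen here purely for illustration.

set.seed(1)
theta <- 0.6; phi <- 0.7; n <- 500

x_ma <- arima.sim(model = list(ma = theta), n = n)   # MA(1)
x_ar <- arima.sim(model = list(ar = phi),  n = n)    # AR(1)

# theoretical ACFs up to lag 10
ARMAacf(ma = theta, lag.max = 10)   # lag-1 value is theta/(1+theta^2), 0 after
ARMAacf(ar = phi,  lag.max = 10)    # values are phi^h

# sample ACFs; bars outside +/- 1.96/sqrt(n) are "significant"
op <- par(mfrow = c(1, 2))
acf(x_ma, main = "MA(1): cuts off after lag 1")
acf(x_ar, main = "AR(1): decays geometrically")
par(op)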

3.2 The Sample Autocorrelation Function

What we have seen so far on the ACF is based on given models (theoretical). In practice, based on the observed data {x_1, x_2, ..., x_n}, we use the sample ACF to assess the degree of dependence in the data. The sample ACF is the estimate of the theoretical ACF (under stationarity).

Definition 3.5. Let x_1, ..., x_n be observations of a time series. The sample mean of x_1, ..., x_n is x̄ = (1/n) ∑_{i=1}^{n} x_i. The sample autocovariance function is

γ̂(h) = (1/n) ∑_{t=1}^{n−|h|} (x_{t+|h|} − x̄)(x_t − x̄),  −n < h < n.

The sample autocorrelation function is

ρ̂(h) = γ̂(h)/γ̂(0),  −n < h < n.

The convention that we will use is the hat notation for estimates and the tilde notation (e.g. γ̃(h)) for estimators. Variables without a hat or a tilde are theoretical fixed values. In summary:

γ(h): theoretical; fixed but unknown.
γ̃(h): the estimator; a random variable.
γ̂(h): the realization of γ̃(h) based on a sample.

The sample ACF measures the correlation in the data (under stationarity). Therefore, it can be used to assess the uncorrelatedness of the residuals of a regression model. Note that if we have Gaussian residuals, independent ⟺ not correlated. It can also be shown that for iid noise with finite variance,

ρ̃(h) ≈ N(0, 1/n)

for large values of n, where n is the sample size. Therefore, for data from such processes (iid noise) we expect about 95% of the sample ACFs to fall between ±1.96/√n. That is,

P(−1.96/√n < ρ̃(h) < 1.96/√n) ≈ 0.95.

Based on the trends in the plot of the sample ACF (ρ̂(h) vs h), we will decide on different models for the data (to be described later).

Remark 3.1. For the observed data {x_1, ..., x_n}: if the data contain a trend (non-constant mean), ρ̂(h) will exhibit a slow (roughly linear) decay as h increases; if the data contain a substantial deterministic periodic term, ρ̂(h) will exhibit similar periodic behaviour with the same period.

3.3 Linear Regression

This was just a review of STAT 331 and STAT 371 done in two lectures. Nothing to see here; carry on.

4 Prediction

Suppose that we have the model Y_i = α + βx_i + R_i, with R_i ∼ N(0, σ^2) iid. We want to predict Y_new for a new value x = x_new. Setting α' = α + βx̄, we can rewrite the model as Y_i = α' + β(x_i − x̄) + R_i, R_i ∼ N(0, σ^2) iid.

It can be shown that

α̂' ∼ G(α', σ/√n)  and  β̂ ∼ G(β, σ/√S_XX),

which are independent, and the estimator of µ(x_new) = E[Y | X = x_new] is

µ̂(x_new) = α̂' + β̂(x_new − x̄) ∼ N(α' + β(x_new − x̄), σ^2 (1/n + (x_new − x̄)^2 / S_XX)),

where

E[µ̂(x_new)] = α' + β(x_new − x̄) + E[R_new] = α' + β(x_new − x̄)  (since E[R_new] = 0),
Var[µ̂(x_new)] = Var(α̂') + (x_new − x̄)^2 Var(β̂) = σ^2 (1/n + (x_new − x̄)^2 / S_XX).

Since Y_new ∼ N(α' + β(x_new − x̄), σ^2), we have Y_new − µ̂_new ∼ N(0, σ^2 (1 + 1/n + (x_new − x̄)^2 / S_XX)) and

(Y_new − µ̂_new) / (σ̂ √(1 + 1/n + (x_new − x̄)^2 / S_XX)) ∼ t_{n−2}.

If c = F^{−1}_{t_{n−2}}(0.975), then the 95% prediction interval is

µ̂(x_new) ± c σ̂ √(1 + 1/n + (x_new − x̄)^2 / S_XX),

and the prediction interval for multiple linear regression is

µ̂(x_new) ± c σ̂ √(1 + x_new^T (X^T X)^{−1} x_new),

where c is from a t_{n−p−1} distribution.

Note 1. In prediction, we must always consider the bias-variance tradeoff. That is, as the model becomes more flexible (less bias and more variance), the less prediction power it has because it does not capture the underlying pattern as well.

4.1 Model Selection

We just give the formulas for the various information criteria here:

AIC = −2l(θ̂) + 2N_p
AICc = AIC + 2N_p(N_p + 1)/(n − N_p − 1)
BIC = −2l(θ̂) + N_p log(n)

where l(θ̂) is the log-likelihood function and N_p is the number of parameters. Also, here is the formula for the PRESS statistic:

PRESS = ∑_{y ∈ validation set} (y − ŷ)^2.

Here are some strategies involving these statistics:

Use a stepwise strategy.
Build up by adding one variable at a time.
Build down by subtracting one variable at a time.

Use a mixed strategy.

4.2 Interpolation vs Extrapolation

If we let h_max = max(h_ii), where H = X(X^T X)^{−1} X^T, then if the point x satisfies x^T (X^T X)^{−1} x ≤ h_max, estimating y at x is an interpolation problem; otherwise it is extrapolation (cf. Montgomery and E.A. Peck).

4.3 Tests on Residuals

The Shapiro-Wilk test is as follows: H_0: Y_1, ..., Y_n come from a Gaussian distribution. Reject H_0 if the p-value of this test is small. In R, if the data are stored in the vector y, use the command shapiro.test(y).

The difference-sign test is as follows: count the number S of values such that y_i − y_{i−1} > 0. For large iid sequences, S is approximately N(µ_S, σ_S^2) with

µ_S = E[S] = (n − 1)/2,  σ_S^2 = (n + 1)/12,  W = (S − µ_S)/σ_S ≈ N(0, 1).

A large positive (negative) value of S − µ_S indicates the presence of an increasing (decreasing) trend. We reject (H_0: the data are random) if |W| > z_{1−α/2}, but this may not work for seasonal data.

The runs test is as follows: estimate the median and call it m. Let n_1 be the number of observations > m and n_2 the number of observations < m. Let R be the number of runs, i.e. of maximal blocks of consecutive observations which are all smaller (or all larger) than m. For a large number of observations from an iid sequence,

µ_R = E[R] = 1 + 2n_1 n_2/(n_1 + n_2),  σ_R^2 = (µ_R − 1)(µ_R − 2)/(n_1 + n_2 − 1),  (R − µ_R)/σ_R ≈ N(0, 1).
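A rough R sketch of the three residual checks above, applied to simulated iid data. shapiro.test is the built-in Shapiro-Wilk test; the difference-sign and runs statistics are coded directly from the formulas above, since they are not built-in R tests.

set.seed(2)
y <- rnorm(100); n <- length(y)

# Shapiro-Wilk test of Gaussianity
shapiro.test(y)

# Difference-sign test: S = #{i : y_i - y_{i-1} > 0}
S     <- sum(diff(y) > 0)
mu_S  <- (n - 1) / 2
var_S <- (n + 1) / 12
W     <- (S - mu_S) / sqrt(var_S)
2 * (1 - pnorm(abs(W)))            # approximate two-sided p-value

# Runs test about the sample median
m  <- median(y)
s  <- sign(y - m); s <- s[s != 0]  # drop ties with the median
R  <- 1 + sum(diff(s) != 0)        # number of runs above/below the median
n1 <- sum(s > 0); n2 <- sum(s < 0)
mu_R  <- 1 + 2 * n1 * n2 / (n1 + n2)
var_R <- (mu_R - 1) * (mu_R - 2) / (n1 + n2 - 1)
2 * (1 - pnorm(abs((R - mu_R) / sqrt(var_R))))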

5 Smoothing Methods

Recall the classical decomposition X_t = m_t + s_t + Y_t with period d, noise Y_t and trend m_t. For identifiability, we need ∑_{t=1}^{d} s_t = 0 and E[Y_t] = 0. Here, the assumption of additivity is strong, in the sense that it may or may not hold. Our goal is to estimate and extract m_t and s_t and hope that the random component Y_t will turn out to be a stationary time series.

5.1 Trend Estimation

Consider a non-seasonal model with a trend and a stochastic component. If E[Y_t] ≠ 0, we define m'_t = m_t + E[Y_t] and Y'_t = Y_t − E[Y_t] to create new variables which follow our classical assumptions. There are many ways to estimate the trend in this model, which we list below:

1. (Finite moving average filter) Let q be a non-negative integer and consider the two-sided moving average of the series X_t. We have

m̂_t = (1/(2q+1)) ∑_{j=−q}^{q} X_{t−j} = (1/(2q+1)) ∑_{j=−q}^{q} m_{t−j} + (1/(2q+1)) ∑_{j=−q}^{q} Y_{t−j} ≈ m_t,

since the average of the Y_{t−j} is approximately 0.

2. (Exponential smoothing) For fixed α ∈ [0, 1], define the recursion m̂_t = αX_t + (1 − α)m̂_{t−1} with initial condition m̂_1 = X_1. This gives an exponentially decreasing weighted moving average, where in the general case, for t ≥ 2,

m̂_t = ∑_{j=0}^{t−2} α(1 − α)^j X_{t−j} + (1 − α)^{t−1} X_1.

Note that a smaller α creates a smoother plot compared to a larger α.

3. (Polynomial regression) This is just developing a parametric polynomial form of m_t, namely m_t = ∑_{i=0}^{k} β_i t^i, where k is chosen arbitrarily.

4. We can also eliminate the trend through differencing, where

∇X_t = X_t − X_{t−1} = (1 − B)X_t

and ∇, B are known as the differencing and backshift operators respectively. Exponentiating these operators is equivalent to function composition. In this case, we are applying differencing to get a stationary process (by eliminating the trend).

Example 5.1. Consider X_t = α + βt + Y_t. Since E[X_t] depends on time, this series is non-stationary. Under differencing,

∇X_t = β + Y_t − Y_{t−1} = β + ∇Y_t,

where ∇Y_t is stationary. Similarly, if X_t = α + βt + γt^2 + Y_t, then ∇X_t = β + 2γt − γ + ∇Y_t is still non-stationary, but it can be shown that ∇^2 X_t IS stationary.
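A small R sketch of the trend estimation and removal methods above, on a series with a linear trend: the two-sided moving average filter, the exponential smoothing recursion, and first differencing. The simulated series and the choices q = 5, α = 0.2 are illustrative only.

set.seed(3)
t <- 1:200
x <- 2 + 0.05 * t + rnorm(200)          # X_t = m_t + Y_t with a linear trend

# (1) finite moving average filter with q = 5 (window of length 2q + 1 = 11)
q     <- 5
m_hat <- stats::filter(x, rep(1 / (2 * q + 1), 2 * q + 1), sides = 2)

# (2) exponential smoothing: m_1 = X_1, m_t = alpha*X_t + (1 - alpha)*m_{t-1}
alpha <- 0.2
m_exp <- Reduce(function(m, xt) alpha * xt + (1 - alpha) * m, x, accumulate = TRUE)

# (4) differencing: (1 - B)X_t removes the linear trend
dx <- diff(x)

plot.ts(x); lines(m_hat, col = 2); lines(m_exp, col = 4)
acf(dx)   # the differenced series should look stationary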

5.2 Estimating Seasonality

Suppose that m̂_t is a moving average filter estimate of the trend of the data. For each k = 1, ..., d, estimate w_k as

w_k = average of {x_{k+jd} − m̂_{k+jd} : q < k + jd ≤ n − q},

i.e. the average of the detrended values belonging to season k. Normalize to get

ŝ_k = w_k − (1/d) ∑_{i=1}^{d} w_i,

so that ∑_{j=1}^{d} ŝ_j = 0. Note that ŝ_k = ŝ_{k−d} for k > d.

5.3 Modeling Residuals

Holt-Winters. This generalizes exponential smoothing to the case where there is a trend and seasonality. Following Chatfield and Yar (1988), we define trend as the long-term change in the mean level per unit time. We have a local linear trend, where the mean level at time t is µ_t = L_t + T_t t, with L_t and T_t varying slowly across time; L_t is the level and T_t is the slope of the trend at time t. Holt's idea is that

Ŷ_{t+h | Y_1, ..., Y_t} = L_t + h T_t.

There are two forms of seasonality to be added to Holt's model: additive and multiplicative.

Holt-Winters (additive case). Define the level, trend, and seasonal index at time t by L_t, T_t, I_t, where the seasonal effect has period p. The update rules are

L_t = α(X_t − I_{t−p}) + (1 − α)(L_{t−1} + T_{t−1})
T_t = β(L_t − L_{t−1}) + (1 − β)T_{t−1}
I_t = γ(X_t − L_t) + (1 − γ)I_{t−p}.

The forecast h periods ahead is L_t + hT_t + I_{t−p+h}.

Holt-Winters (multiplicative case). Define the level, trend, and seasonal index at time t by L_t, T_t, I_t, where the seasonal effect has period p. The update rules are

L_t = α(X_t / I_{t−p}) + (1 − α)(L_{t−1} + T_{t−1})
T_t = β(L_t − L_{t−1}) + (1 − β)T_{t−1}
I_t = γ(X_t / L_t) + (1 − γ)I_{t−p}.

The forecast h periods ahead is (L_t + hT_t) · I_{t−p+h}.

Holt-Winters (algorithm). This is a recursive algorithm, so you will need to provide initial values for L_t, T_t, I_t at the beginning of the series. Values will also need to be provided for α, β, γ (R minimizes the squared one-step prediction error to estimate these parameters). You then need to choose between the additive and multiplicative models (in R, this is the only step that you execute).

Holt-Winters (special cases). In the case that β = γ = 0, we have no trend or seasonal updates in the H-W algorithm. Here we have L_t = αX_t + (1 − α)L_{t−1}, which is exactly (simple) exponential smoothing with parameter α. In the case that γ = 0, we have no seasonal component and there are two H-W equations, for updating L_t and T_t; we call this case double exponential smoothing.

Example 5.2 (Inventory prediction). This example has a trend but no seasonality. Use the H-W algorithm with γ = 0 (double exponential smoothing). In the case of simple exponential smoothing, m_t = αY_t + (1 − α)m_{t−1}, where the forecaster is Ŷ_{t+1} = m_t. This is why the exponential smoother looks like it is one step behind. We can rewrite this equation as

Ŷ_{t+1} = m_{t−1} + α(Y_t − m_{t−1}) = m_{t−1} + α(Y_t − Ŷ_t),

which is a trend plus a weighted forecasting error. This case is β = γ = 0 in the H-W algorithm.

Remark 5.1. Double exponential smoothing captures more of the underlying trend than simple exponential smoothing.

[ CHECK OUT THE LECTURE SLIDES FOR MORE EXAMPLES (for the midterm)! ]
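A brief, hedged example of the Holt-Winters recursions using R's built-in HoltWinters() on the built-in AirPassengers series (monthly, p = 12). As noted above, R estimates α, β, γ by minimizing the squared one-step prediction errors; choosing additive versus multiplicative seasonality is the only modeling step left to the user.

hw_add  <- HoltWinters(AirPassengers, seasonal = "additive")
hw_mult <- HoltWinters(AirPassengers, seasonal = "multiplicative")

hw_mult$alpha; hw_mult$beta; hw_mult$gamma   # estimated smoothing weights

# h-step forecasts L_t + h*T_t plus (or times) I_{t-p+h}; here 24 months ahead
predict(hw_mult, n.ahead = 24)

# special cases from the notes:
HoltWinters(AirPassengers, gamma = FALSE)                 # double exponential smoothing
HoltWinters(AirPassengers, beta = FALSE, gamma = FALSE)   # simple exponential smoothing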

6 Stationary and Linear Processes

To perform any form of forecasting, there must be an assumption that some things are the same in the future as in the past. The idea of being constant over time is central to stationary processes. Therefore, we will use stationary processes as the main framework to develop forecasting models. In this chapter/module, we will talk about moving average (MA(q)) and autoregressive (AR(p)) processes, and will look at the connection between the two. We will also develop forecasting methods within stationary processes.

Definition 6.1. A process {X_t} is called a moving average process of order q if

X_t = Z_t + θ_1 Z_{t−1} + ... + θ_q Z_{t−q},

where {Z_t} ∼ WN(0, σ^2) and θ_1, ..., θ_q are constants. Sometimes Z_t is referred to as the innovation. Notice that these innovations are uncorrelated, have constant variance and zero mean. Deriving the mean and autocovariance function of MA(q), it is easy to see that this process is stationary.

Definition 6.2. We say that a process {X_t} is q-dependent if X_t and X_s are independent whenever |t − s| > q. That is, dependence can occur only within q steps. Similarly, we say that a stationary time series is q-correlated if γ(h) = 0 whenever |h| > q.

Example 6.1. It is easy to show that the MA(q) process is q-correlated. The converse of this statement is also true:

Proposition 6.1. If {X_t} is a stationary q-correlated time series with mean 0, then it can be represented as an MA(q) process. (ON MIDTERM?)

Definition 6.3. Consider the process {X_t} defined by X_t = φX_{t−1} + Z_t for t = 0, ±1, ±2, ..., where {Z_t} ∼ WN(0, σ^2). This process is called the first-order autoregressive process, or AR(1). We can write this process as (1 − φB)X_t = Z_t. Notice that if φ = 1, then {X_t} forms a random walk, which is not stationary. Therefore, depending on the value of φ, {X_t} may or may not be stationary.

Remark 6.1. Consider the AR(1) process with the condition |φ| < 1. We have, by repeated substitution (induction),

X_t = Z_t + φZ_{t−1} + φ^2 Z_{t−2} + ... = ∑_{j=0}^{∞} φ^j Z_{t−j}.

Defining θ_j = φ^j, we have written X_t as an MA(∞) process.

Definition 6.4. A process {X_t} is called an autoregressive process of order p if

X_t = φ_1 X_{t−1} + ... + φ_p X_{t−p} + Z_t,

where {Z_t} ∼ WN(0, σ^2) and φ_1, ..., φ_p are constants.

Definition 6.5. {X_t} is called a Gaussian time series if all its joint distributions are multivariate normal. That is, for any set i_1, ..., i_m with each index in N, the random vector (X_{i_1}, ..., X_{i_m}) follows a multivariate normal distribution.

Note 2. If (X_1, X_2) ∼ MVN((µ_1, µ_2)^T, [σ_1^2, σ_12; σ_12, σ_2^2]), then

X_1 | X_2 = x_2 ∼ N(µ_1 + (σ_12/σ_2^2)(x_2 − µ_2), σ_1^2 − σ_12^2/σ_2^2) = N(µ_1 + ρ(σ_1/σ_2)(x_2 − µ_2), σ_1^2(1 − ρ^2)),  ρ = σ_12/(σ_1 σ_2).

Example 6.2. Consider a stationary Gaussian time series {X_t}. Suppose X_n has been observed and we want to forecast X_{n+h} using m(X_n), a function of X_n. Let us measure the quality of the forecast by

MSE = E([X_{n+h} − m(X_n)]^2 | X_n).

It can be shown that the m(·) which minimizes the MSE in the general case is m(X_n) = E(X_{n+h} | X_n). In this case, stationarity implies that E[X_{n+h}] = E[X_n] = µ, Cov(X_{n+h}, X_{n+h}) = γ(0) = Var(X_n) = σ^2, and Corr(X_{n+h}, X_n) = γ(h)/γ(0) = ρ(h). If this process is a Gaussian time series, then

(X_n, X_{n+h}) ∼ MVN((µ, µ)^T, σ^2 [1, ρ(h); ρ(h), 1]),
X_{n+h} | X_n = x_n ∼ N(µ + ρ(h)(x_n − µ), σ^2(1 − ρ(h)^2)),

so that

m(X_n) = E[X_{n+h} | X_n]   (1)   = µ + ρ(h)(X_n − µ)   (2)

and

MSE = E([X_{n+h} − m(X_n)]^2 | X_n) = Var(X_{n+h} | X_n) = σ^2(1 − ρ(h)^2).

Even if normality does not hold, we can still look at the linear predictor m(X_n) = aX_n + b, where a and b are derived from

min_{a,b} E[X_{n+h} − (aX_n + b)]^2.

Equation (1) will be shown later. Equation (2) is the best form of m(x) such that the MSE is minimized under the Gaussian process assumption.

Note that there is no conditioning in the final statement because

E[(X_{n+h} − m(X_n))^2] = E{ E[(X_{n+h} − m(X_n))^2 | X_n] },

so the function that minimizes the conditional MSE for every value of X_n, namely m(X_n) = E(X_{n+h} | X_n), also minimizes the unconditional MSE E[(X_{n+h} − m(X_n))^2].

6.1 Linear Prediction

We now consider the problem of predicting X_{n+h}, h > 0, for a stationary time series with known mean µ and ACVF γ(·), based on the previous values {X_n, ..., X_1}, denoting the linear predictor of X_{n+h} by P_n X_{n+h}. We are interested in

P_n X_{n+h} = a_0 + a_1 X_n + a_2 X_{n−1} + ... + a_n X_1,

which minimizes S(a_0, ..., a_n) = E[(X_{n+h} − P_n X_{n+h})^2]. To get a_0, a_1, ..., a_n we need to solve the system ∂S/∂a_j = 0 for j = 0, 1, ..., n. Doing so, we get

a_0 = µ(1 − ∑_{i=1}^{n} a_i),  Γ_n a_n = γ_n(h),

where a_n = (a_1, a_2, ..., a_n)^T, γ_n(h) = (γ(h), γ(h+1), ..., γ(n+h−1))^T, and

Γ_n = [γ(0) γ(1) ... γ(n−1); γ(1) γ(0) ... γ(n−2); ... ; γ(n−1) γ(n−2) ... γ(0)],

which implies

P_n X_{n+h} = a_0 + ∑_{i=1}^{n} a_i X_{n−i+1} = µ(1 − ∑_{i=1}^{n} a_i) + ∑_{i=1}^{n} a_i X_{n−i+1} = µ + ∑_{i=1}^{n} a_i (X_{n−i+1} − µ).

Note 3. Here are some properties from the above:

P_n X_{n+h} is determined by µ and γ(·).
It can be shown that E[(X_{n+h} − P_n X_{n+h})^2] = γ(0) − a_n^T γ_n(h).
E(X_{n+h} − P_n X_{n+h}) = 0.
E[(X_{n+h} − P_n X_{n+h}) X_j] = 0 for j = 1, 2, ..., n.

In a more general set-up, suppose that Y and W_1, ..., W_n are any random variables with finite second moments and means µ_Y = E(Y), µ_i = E(W_i), and that Cov(Y, Y), Cov(Y, W_i), Cov(W_i, W_j) are all known for i, j = 1, ..., n. Define W = (W_n, ..., W_1)^T and µ_W = (µ_n, ..., µ_1)^T. Then (midterm content ends here)

γ = Cov(Y, W) = (Cov(Y, W_n), ..., Cov(Y, W_1))^T,
Γ = Cov(W, W) = [Cov(W_{n+1−i}, W_{n+1−j})]_{i,j=1}^{n} ∈ R^{n×n}.

Now, by the same argument used in the derivation of P_n X_{n+h}, the best linear predictor of Y in terms of {W_n, ..., W_1} is

P_W Y = P(Y | W) = µ_Y + a_n^T (W − µ_W),

where a_n is the solution of Γ a_n = γ. Also, the MSE of this predictor is E[(Y − P_W Y)^2] = Var(Y) − a_n^T γ.

Note 4. In this setting, we have the following properties. Suppose that E[U^2] < ∞, E[V^2] < ∞, Γ = Cov(W, W), and β, α_1, ..., α_n are constants. Then:

1. P_W U = E[U] + ã_n^T (W − µ_W), where Γ ã_n = Cov(U, W).
2. E[(U − P_W U) W] = 0 and E[U − P_W U] = 0.
3. E[(U − P_W U)^2] = Var(U) − ã_n^T Cov(U, W).
4. P_W[α_1 U + α_2 V + β] = α_1 P_W U + α_2 P_W V + β.
5. P_W[∑_{i=1}^{n} α_i W_i + β] = ∑_{i=1}^{n} α_i W_i + β.
6. P_W U = E[U] if Cov(U, W) = 0.

Exercise 6.1. What is the best linear predictor of X_{n+1} in an AR(p) process, based on X_1, ..., X_n for n > p?

Example 6.3. Derive the one-step prediction for the AR(1) model (here h = 1). Suppose X_t = φX_{t−1} + Z_t, where |φ| < 1 and {Z_t} ∼ WN(0, σ^2). In a previous example, we showed that γ(h) = φ^{|h|} γ(0) for h = ±1, ±2, ..., with γ(0) = σ^2/(1 − φ^2). Also, E[X_t] = µ = 0. To find the linear predictor, we need to solve Γ_n a_n = γ_n(1), or equivalently, dividing both sides by γ(0),

[1 φ ... φ^{n−1}; φ 1 ... φ^{n−2}; ... ; φ^{n−1} φ^{n−2} ... 1] (a_1, a_2, ..., a_n)^T = (φ, φ^2, ..., φ^n)^T.

An obvious solution is a_n = (φ, 0, ..., 0)^T. Note that

P_n X_{n+1} = µ + ∑_{i=1}^{n} a_i (X_{n+1−i} − µ) = ∑_{i=1}^{n} a_i X_{n+1−i} = a_1 X_n + 0 = φX_n,

and

MSE = E[(X_{n+1} − P_n X_{n+1})^2] = E[(X_{n+1} − φX_n)^2] = E[Z_{n+1}^2] = σ^2.

Alternatively, you can use the formula for the MSE to get

MSE = γ(0) − a_n^T γ_n(1) = γ(0) − φγ(1) = γ(0) − φ^2 γ(0) = γ(0)[1 − φ^2] = σ^2.

6.2 Linear Processes

We have discussed linear prediction, in which future values are predicted by linear combinations of historical values. This section focuses on a class of linear time series which provides a general framework for studying stationary processes.

Definition 6.6. The time series {X_t} is a linear process if X_t = ∑_{j=−∞}^{∞} ψ_j Z_{t−j} for all t, where {Z_t} ∼ WN(0, σ^2) and {ψ_j} is a sequence of constants such that ∑_{j=−∞}^{∞} |ψ_j| < ∞.

Example 6.4. Show that AR(1) with |φ| < 1 is a linear process. We know that X_t = φX_{t−1} + Z_t with {Z_t} ∼ WN(0, σ^2), and we showed before that X_t = ∑_{j=0}^{∞} φ^j Z_{t−j}. Since |φ| < 1, setting ψ_j = φ^j for j ≥ 0 (and ψ_j = 0 for j < 0) gives ∑_{j=−∞}^{∞} |ψ_j| < ∞, and therefore all assumptions in the definition above are satisfied. So AR(1) is a linear process.

For prediction purposes, we may not want to have dependence on the future innovations (the Z_t's). However, the general form of a linear process involves future innovations.

Definition 6.7. A linear process ∑_{j=−∞}^{∞} ψ_j Z_{t−j} is causal, or future-independent, if ψ_j = 0 for all j < 0.

Example 6.5. Both AR(1) and MA(q) are causal, since

X_t^{AR(1)} = ∑_{j=0}^{∞} φ^j Z_{t−j},  X_t^{MA(q)} = Z_t + ∑_{j=1}^{q} θ_j Z_{t−j}.

6.3 Box-Jenkins Models

The Box-Jenkins methodology uses ARMA and ARIMA models for forecasting. The class of ARMA models tries to balance goodness of fit with a limited number of parameters. Whenever the series is not stationary, ARIMA models (ARMA with differencing) are used. When a seasonal effect is present, the more general SARIMA model is used. All of these models are identified with the help of two key functions: the ACF and the PACF.

Definition 6.8. {X_t, t ∈ T} is an ARMA(p, q) process if

1) {X_t, t ∈ T} is stationary;
2) X_t − φ_1 X_{t−1} − φ_2 X_{t−2} − ... − φ_p X_{t−p} = Z_t + θ_1 Z_{t−1} + ... + θ_q Z_{t−q}, where {Z_t} ∼ WN(0, σ^2);
3) the polynomials (1 − φ_1 z − ... − φ_p z^p) and (1 + θ_1 z + ... + θ_q z^q) have no common factors/roots. (IMPORTANT FOR THE FINAL!)

We say that {X_t, t ∈ T} is an ARMA process with mean µ if {X_t − µ} is an ARMA(p, q) process. Recall the backward shift operator BX_t = X_{t−1}. With this operator, we can rewrite the ARMA process as

(1)  φ(B) X_t = θ(B) Z_t,  where φ(B) = 1 − φ_1 B − φ_2 B^2 − ... − φ_p B^p and θ(B) = 1 + θ_1 B + θ_2 B^2 + ... + θ_q B^q.

The general model above has a unique stationary solution X_t iff φ(z) ≠ 0 for all complex z ∈ C such that |z| = 1. If φ(z) ≠ 0 for all z with |z| = 1, then there exists δ > 0 such that

(2)  1/φ(z) = ∑_{j=−∞}^{∞} χ_j z^j,  1 − δ < |z| < 1 + δ,  with ∑_{j=−∞}^{∞} |χ_j| < ∞.

Under this condition, 1/φ(B) = ∑_{j=−∞}^{∞} χ_j B^j is a linear filter, and substituting (2) into (1) we get

X_t = (1/φ(B)) θ(B) Z_t.

Since 1/φ(B) is a power series in B and θ(B) is a polynomial in B, ψ(B) = (1/φ(B)) θ(B) is also a power series in B. We then have

X_t = ψ(B) Z_t = ∑_{j=−∞}^{∞} ψ_j Z_{t−j},

where ψ(B) is in general of infinite degree and ∑_{j=−∞}^{∞} |ψ_j| < ∞. (Why?)

Definition 6.9. An ARMA(p, q) process φ(B)X_t = θ(B)Z_t, where {Z_t} ∼ WN(0, σ^2), is causal if there exist constants {ψ_j} such that ∑_{j=0}^{∞} |ψ_j| < ∞ and X_t = ∑_{j=0}^{∞} ψ_j Z_{t−j} for all t. This condition is equivalent to

φ(z) = 1 − φ_1 z − φ_2 z^2 − ... − φ_p z^p ≠ 0

for all z ∈ C such that |z| ≤ 1.

Remark 6.2. If the condition above holds true, then ψ(z) = θ(z)/φ(z), and matching coefficients in θ(z) = φ(z)ψ(z), i.e.

1 + θ_1 z + ... + θ_q z^q = (1 − φ_1 z − ... − φ_p z^p)(ψ_0 + ψ_1 z + ...),

gives 1 = ψ_0, θ_1 = ψ_1 − φ_1 ψ_0, and so on.

Note 5. We try a few special cases:

1) If φ(z) = 1, then φ(B)X_t = θ(B)Z_t reduces to X_t = θ(B)Z_t = Z_t + θ_1 Z_{t−1} + ... + θ_q Z_{t−q}, which is an MA(q) process.
2) If θ(z) = 1, we have φ(B)X_t = Z_t, i.e. X_t − φ_1 X_{t−1} − ... − φ_p X_{t−p} = Z_t, which is an AR(p) process.

From here, we can see that AR(p) and MA(q) are special cases of ARMA(p, q) processes, with AR(p) = ARMA(p, 0) and MA(q) = ARMA(0, q).

6.4 Invertibility

An ARMA(p, q) process {X_t} is invertible if there exist constants {Π_j} such that ∑_{j=0}^{∞} |Π_j| < ∞ and Z_t = ∑_{j=0}^{∞} Π_j X_{t−j} for all t. Invertibility is equivalent to the condition

θ(z) = 1 + θ_1 z + ... + θ_q z^q ≠ 0

for all z ∈ C such that |z| ≤ 1. Using the same coefficient-matching method as above (now in φ(z) = θ(z)Π(z)), one can get Π_0 = 1, −φ_1 = Π_0 θ_1 + Π_1, and so on.

Example 6.6. Consider {X_t, t ∈ T} satisfying X_t − 0.5X_{t−1} = Z_t + 0.4Z_{t−1}, where {Z_t} ∼ WN(0, σ^2). Investigate the causality and invertibility of X_t. If the series is causal (invertible), then provide the causal (invertible) solution. These are called the MA(∞) and AR(∞) representations.

[Causality] We have φ(z) = 1 − 0.5z = 0 ⟹ z = 2 ⟹ |z| > 1. Since the root is outside the unit circle, X_t is causal. We then have

1 + 0.4z = (1 − 0.5z)(ψ_0 + ψ_1 z + ψ_2 z^2 + ...) ⟹ ψ_0 = 1, ψ_1 − 0.5ψ_0 = 0.4, ψ_2 − 0.5ψ_1 = 0, ...
⟹ ψ_0 = 1, ψ_1 = 0.9, ψ_2 = 0.9(0.5), ψ_3 = 0.9(0.5)^2, ...

We can see the pattern (and prove it by induction):

ψ_j = 1 if j = 0;  ψ_j = 0.9(0.5)^{j−1} if j ≥ 1  ⟹  X_t = Z_t + 0.9 ∑_{j=1}^{∞} (0.5)^{j−1} Z_{t−j}.

[Invertibility] We have θ(z) = 1 + 0.4z = 0 ⟹ z = −10/4 ⟹ |z| > 1. Since the root is outside the unit circle, X_t is invertible. We then have, as above,

1 − 0.5z = (1 + 0.4z)(Π_0 + Π_1 z + ...) ⟹ Π_0 = 1, 0.4Π_0 + Π_1 = −0.5, 0.4Π_1 + Π_2 = 0, ...
⟹ Π_0 = 1, Π_1 = −0.9, Π_2 = −0.9(−0.4), Π_3 = −0.9(−0.4)^2, ...

We can see the pattern (and prove it by induction):

Π_j = 1 if j = 0;  Π_j = −0.9(−0.4)^{j−1} if j ≥ 1  ⟹  Z_t = X_t − 0.9 ∑_{j=1}^{∞} (−0.4)^{j−1} X_{t−j}.
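A numerical check of Example 6.6, not part of the original notes: the roots of φ(z) and θ(z) are found with polyroot(), and the MA(∞) weights ψ_j are computed with ARMAtoMA(), which should match the closed form ψ_j = 0.9(0.5)^{j−1}.

phi <- 0.5; theta <- 0.4

Mod(polyroot(c(1, -phi)))    # root of 1 - 0.5z: |z| = 2   > 1  =>  causal
Mod(polyroot(c(1,  theta)))  # root of 1 + 0.4z: |z| = 2.5 > 1  =>  invertible

psi <- ARMAtoMA(ar = phi, ma = theta, lag.max = 6)   # psi_1, ..., psi_6
cbind(ARMAtoMA = psi, closed_form = 0.9 * 0.5^(0:5))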

Remark 6.3 (ACVF of ARMA processes). Consider a causal, stationary process φ(B)X_t = θ(B)Z_t with {Z_t} ∼ WN(0, σ^2). The MA(∞) representation of X_t is X_t = ∑_{j=0}^{∞} ψ_j Z_{t−j}, where E[X_t] = 0. We have

γ(h) = E[X_t X_{t+h}] − E[X_t]E[X_{t+h}] = E[(∑_{j=0}^{∞} ψ_j Z_{t−j})(∑_{j=0}^{∞} ψ_j Z_{t+h−j})].

Noticing that E[Z_t Z_s] = 0 when t ≠ s, we then have

γ(h) = σ^2 ∑_{j=0}^{∞} ψ_j ψ_{j+|h|}.

Example 6.7. Derive the ACVF for the following ARMA(1, 1) process:

X_t − φX_{t−1} = Z_t + θZ_{t−1},

where {Z_t} ∼ WN(0, σ^2) and |φ| < 1. Note that the process is causal because 1 − φz = 0 ⟹ |z| = 1/|φ| > 1. It can be shown, with methods similar to those above, that

ψ_j = 1 if j = 0;  ψ_j = φ^{j−1}(φ + θ) if j ≥ 1.

Now if h = 0, then

γ(0) = σ^2 ∑_{j=0}^{∞} ψ_j^2 = σ^2 [1 + (θ + φ)^2 ∑_{i=0}^{∞} φ^{2i}] = σ^2 [1 + (θ + φ)^2/(1 − φ^2)].

If h ≠ 0, then

γ(h) = σ^2 ∑_{j=0}^{∞} ψ_j ψ_{j+|h|} = σ^2 [ψ_0 ψ_{|h|} + ∑_{j=1}^{∞} ψ_j ψ_{j+|h|}]
     = σ^2 [φ^{|h|−1}(θ + φ) + (θ + φ)^2 φ^{|h|} ∑_{i=0}^{∞} φ^{2i}]
     = σ^2 [φ^{|h|−1}(θ + φ) + (θ + φ)^2 φ^{|h|}/(1 − φ^2)].

6.5 Partial Autocorrelation Function (PACF)

The ACF measures the correlation between X_n and X_{n+h}. This correlation can be due to a direct connection, or indirect, through the intermediate steps X_{n+1}, X_{n+2}, ..., X_{n+h−1}. The PACF looks at the correlation between X_n and X_{n+h} once the effect of the intermediate steps has been removed. We remove the effect of the intermediate steps by deriving the linear predictors P(X_n | X_{n+1}, ..., X_{n+h−1}) and P(X_{n+h} | X_{n+1}, ..., X_{n+h−1}).

Definition 6.10. The partial autocorrelation function (PACF) is denoted by α(h) and is defined as

α(0) = 1,
α(1) = Corr(X_n, X_{n+1}) = ρ(1),
α(h) = Corr[X_n − P(X_n | X_{n+1}, ..., X_{n+h−1}), X_{n+h} − P(X_{n+h} | X_{n+1}, ..., X_{n+h−1})] otherwise.

Example 6.8. Derive the PACF for an AR(1) process with |φ| < 1. We saw in Example 6.3 that P(X_{n+1} | X_n) = φX_n, where X_t = φX_{t−1} + Z_t is an AR(1) process. We then have α(0) = 1 and α(1) = ρ(1) = φ. For h = 2 we have

Corr[X_t − P(X_t | X_{t+1}), X_{t+2} − P(X_{t+2} | X_{t+1})] = Corr[X_t − P(X_t | X_{t+1}), Z_{t+2}] = 0,

since X_{t+2} − P(X_{t+2} | X_{t+1}) = X_{t+2} − φX_{t+1} = Z_{t+2}, and X_t − P(X_t | X_{t+1}) is a function of X_t and X_{t+1}, which are uncorrelated with Z_{t+2}.

Similarly, α(h) = 0 for any h > 2. This is fairly similar to the ACF of an MA(1) process, where the values cut off after lag 1. Also notice that, as with the ACF, the PACF is symmetric in h, so h < 0 is omitted from the derivations above.

Theorem 6.1. {X_t, t ∈ T} is a causal AR(p) process if and only if its PACF satisfies:

1) α(p) ≠ 0;
2) α(h) = 0 for h > p.

Furthermore, α(p) = φ_p.

This theorem shows that the PACF is a powerful tool for identifying AR(p) processes. In fact, the ACF is to MA(q) what the PACF is to AR(p) from the visual point of view. In summary:

            ACF                       PACF
MA(q)       zero after lag q          decays exponentially
AR(p)       decays exponentially      zero after lag p

In the general case of ARMA processes, the PACF is defined as α(0) = 1 and α(h) = Φ_hh for h ≥ 1, where Φ_hh is the last component of the vector Φ_h = Γ_h^{−1} γ_h, in which

Γ_h = [γ(0) γ(1) ... γ(h−1); γ(1) γ(0) ... γ(h−2); ... ; γ(h−1) γ(h−2) ... γ(0)],  γ_h = (γ(1), γ(2), ..., γ(h))^T.

Definition 6.11. Based on observations {x_1, ..., x_n} (not all equal), the sample PACF α̂(h) is given by α̂(0) = 1 and α̂(h) = Φ̂_hh for h ≥ 1, where Φ̂_hh is the last component of Φ̂_h = Γ̂_h^{−1} γ̂_h and the terms on the right are the sample estimates.

Example 6.9. Calculate α(2) for an MA(1) process X_t = Z_t + θZ_{t−1}, {Z_t} ∼ WN(0, σ^2). We have shown before that

γ(h) = (1 + θ^2)σ^2 if h = 0;  θσ^2 if |h| = 1;  0 if |h| ≥ 2.

We have Φ_h = Γ_h^{−1} γ_h, and α(h) is the last element of Φ_h:

h = 1:  Φ_11 = γ(0)^{−1} γ(1) = γ(1)/γ(0) = θ/(1 + θ^2).

h = 2:  Φ_2 = [(1 + θ^2)σ^2, θσ^2; θσ^2, (1 + θ^2)σ^2]^{−1} (θσ^2, 0)^T, whose last element, in reduced form, is

α(2) = Φ_22 = −θ^2/(1 + θ^2 + θ^4).

It can be shown, in general, that

α(h) = Φ_hh = −(−θ)^h / ∑_{i=0}^{h} θ^{2i}.
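A quick numerical check of the ACF/PACF summary table above, using the built-in ARMAacf(); the MA(1) and AR(2) parameter values below are illustrative only.

ARMAacf(ma = 0.7, lag.max = 6)                 # MA(1) ACF: zero after lag 1
ARMAacf(ma = 0.7, lag.max = 6, pacf = TRUE)    # MA(1) PACF: decays, alternating sign

ARMAacf(ar = c(0.6, 0.2), lag.max = 6)               # AR(2) ACF: geometric decay
ARMAacf(ar = c(0.6, 0.2), lag.max = 6, pacf = TRUE)  # AR(2) PACF: zero after lag 2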

6.6 ARIMA(p, d, q) Processes

Definition 6.12. Let d be a non-negative integer. {X_t, t ∈ T} is an ARIMA(p, d, q) process if Y_t = (1 − B)^d X_t is a causal ARMA(p, q) process.

The definition above means that {X_t, t ∈ T} satisfies an equation of the form

φ*(B)X_t ≡ φ(B)(1 − B)^d X_t = θ(B)Z_t,  {Z_t} ∼ WN(0, σ^2).

Note that when d ≥ 1 we have φ*(1) = 0, so X_t is not stationary unless d = 0. Therefore, {X_t} is stationary iff d = 0, in which case it reduces to an ARMA(p, q) process as in the previous section.

Recall that if {X_t} exhibits a polynomial trend of the form m(t) = α_0 + α_1 t + ... + α_d t^d, then (1 − B)^d X_t will not have that trend any more. Therefore, ARIMA models (with d ≠ 0) are appropriate when the trend in the data is well approximated by a polynomial of degree d.

Example 6.10. Consider the process X_t = 0.8X_{t−1} + 2t + Z_t, where {Z_t} ∼ WN(0, σ^2). Write this process in ARIMA(p, d, q) format. To get rid of the linear trend 2t, we perform differencing once:

(1 − B)X_t = 0.8(X_{t−1} − X_{t−2}) + 2 + Z_t − Z_{t−1}.

If Y_t = (1 − B)X_t, then the above can be written as

Y_t − 0.8Y_{t−1} = 2 + Z_t − Z_{t−1},  i.e.  (Y_t − 10) − 0.8(Y_{t−1} − 10) = Z_t − Z_{t−1}.

Therefore {Y_t − 10} is an ARMA(1, 1) process; hence Y_t is an ARMA(1, 1) process with mean 10, and X_t is an ARIMA(1, 1, 1) process.

We have seen how differencing can be used to remove a trend. Seasonality is a particular type of trend which can be removed by a particular type of differencing. This will be discussed under SARIMA (seasonal ARIMA) models.

6.7 SARIMA(p, d, q) × (P, D, Q)_S Processes

Recall the backshift operator B, where B^k X_t = X_{t−k}. Clearly (1 − B^k) and (1 − B)^k are different filters: the latter performs differencing k times, while the former differences once at lag k. In R, we write

diff(x, differences = k) for (1 − B)^k X_t,
diff(x, lag = k) for (1 − B^k) X_t.

As an example, consider a process {X_t} where t represents the month. If there exists a seasonal effect, i.e. s(t) = s(t + 12), then the effect of the seasonal trend on X_t and on X_{t−12} should be the same. That is, Y_t = X_t − X_{t−12} should not exhibit any seasonal trend. Therefore, if we apply differencing at lag 12, we can (in theory) remove the effect of the seasonal trend. Fitting an ARMA(p, q) model to the differenced series Y_t = (1 − B^S)X_t is then the same as fitting the model φ(B)(1 − B^S)X_t = θ(B)Z_t, where S represents the season. This is a special case of the SARIMA models.

Definition 6.13. If d, D are non-negative integers, then {X_t, t ∈ T} is a seasonal ARIMA(p, d, q) × (P, D, Q)_S process with period S if the differenced series

Y_t = ∇^d ∇_S^D X_t = (1 − B)^d (1 − B^S)^D X_t

is a causal ARMA process defined by

φ(B)Φ(B^S)Y_t = θ(B)Θ(B^S)Z_t,  {Z_t} ∼ WN(0, σ^2),

where

φ(z) = 1 − φ_1 z − ... − φ_p z^p,   Φ(z) = 1 − Φ_1 z − ... − Φ_P z^P,
θ(z) = 1 + θ_1 z + ... + θ_q z^q,   Θ(z) = 1 + Θ_1 z + ... + Θ_Q z^Q.

Remark 6.4. Notice that the process {X_t, t ∈ T} is causal iff φ(z) ≠ 0 and Φ(z) ≠ 0 for all z with |z| ≤ 1.

Remark 6.5. In practice D is rarely more than 1, and P and Q are typically less than 3.

Example 6.11. Write down the equation form of the ARMA(1, 1)_12 process. This is equivalent to

ARMA(1, 1)_12 = SARIMA(0, 0, 0) × (1, 0, 1)_12,

and the general form is

(1 − Φ_1 B^12) X_t = (1 + Θ_1 B^12) Z_t,  {Z_t} ∼ WN(0, σ^2).

If d ≠ 0 or D ≠ 0, then SARIMA models are not stationary. This model, ARMA(1, 1)_12, looks like an ARMA(1, 1); in fact, it is an ARMA(1, 1) sitting on the season s = 12.

Example 6.12. Derive the ACF of SARIMA(0, 0, 0) × (0, 0, 1)_12, i.e. MA(1)_12. The general form is

X_t = Z_t + Θ_1 Z_{t−12},  {Z_t} ∼ WN(0, σ^2).

Show, as an exercise, that

γ(h) = Cov(X_t, X_{t+h}) = (1 + Θ_1^2)σ^2 if h = 0;  Θ_1 σ^2 if |h| = 12;  0 otherwise,

ρ(h) = γ(h)/γ(0) = 1 if h = 0;  Θ_1/(1 + Θ_1^2) if |h| = 12;  0 otherwise.
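A hedged R sketch of the lag-k versus k-fold differencing distinction and of fitting a seasonal model with arima(). The series (the built-in AirPassengers data, logged) and the SARIMA orders are illustrative choices, not values prescribed by the notes.

x <- log(AirPassengers)

d1  <- diff(x, differences = 1)   # (1 - B) x_t
d12 <- diff(x, lag = 12)          # (1 - B^12) x_t, removes the seasonal trend

# SARIMA(0,1,1) x (0,1,1)_12, the classical "airline" model, as an illustration
fit <- arima(x, order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))
fit
predict(fit, n.ahead = 12)$pred   # forecasts for the next 12 months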

7 Box-Jenkins Methodology

To use the Box-Jenkins methodology you do the following:

1. Check for seasonal and non-seasonal trends (stationarity).
2. Use differencing to make the process stationary.
3. Identify p, q, P, Q visually from the ACF and PACF, or with formal model selection methods.
4. Forecast the future with the appropriate model.

(See slides for more info.)

8 Parameter Estimation in ARMA Processes

This section concentrates on the estimation of the parameters φ_i, i = 1, ..., p, and θ_j, j = 1, ..., q, as well as σ^2, the variance of the white noise, in the ARMA(p, q) process φ(B)X_t = θ(B)Z_t, where {Z_t} ∼ WN(0, σ^2). We assume that p and q are correctly specified. If the mean of the series is not zero, we use the model φ(B)(X_t − µ) = θ(B)Z_t, where µ = E[X_t] for all t, and estimate µ by µ̂ = X̄ = (1/n) ∑_{i=1}^{n} X_i. The common parameter estimation methods are maximum likelihood, least squares, Yule-Walker, the innovations algorithm, and the Durbin-Levinson method. We will only focus on the first two for the remainder of this course.

8.1 Yule-Walker Methods

Consider a causal AR(p) model

(1)  X_t − φ_1 X_{t−1} − ... − φ_p X_{t−p} = Z_t,

with causal solution X_t = ∑_{j=0}^{∞} ψ_j Z_{t−j}, where {Z_t} ∼ WN(0, σ^2). Multiplying both sides of (1) by X_{t−j} for j = 0, 1, 2, ..., p and taking expectations gives

E[X_t X_{t−j}] − φ_1 E[X_{t−1} X_{t−j}] − ... − φ_p E[X_{t−p} X_{t−j}] = E[Z_t X_{t−j}],
i.e.  γ(j) − φ_1 γ(j − 1) − ... − φ_p γ(j − p) = E[Z_t X_{t−j}].

We then have

E[Z_t X_{t−j}] = E[Z_t X_t] = E[Z_t ∑_{k=0}^{∞} ψ_k Z_{t−k}] = E[Z_t^2] = σ^2 if j = 0,
E[Z_t X_{t−j}] = 0 if j > 0.

So the original equations reduce to

γ(0) − φ_1 γ(1) − ... − φ_p γ(p) = σ^2  (j = 0),
γ(j) − φ_1 γ(|j − 1|) − ... − φ_p γ(|j − p|) = 0  (j ≠ 0).

These are called the Yule-Walker equations. They can easily be written in matrix form as Γ_p φ = γ_p. Based on a sample {x_1, x_2, ..., x_n}, the parameters φ and σ^2 can be estimated by

φ̂ = Γ̂_p^{−1} γ̂_p,

where the matrices are defined in a similar fashion to the best linear predictor section. This system is called the sample Yule-Walker equations. We can write the Yule-Walker equations in terms of the ACF too: explicitly, dividing γ̂_p by γ̂(0) and scaling Γ̂_p by γ̂(0),

φ̂ = R̂_p^{−1} ρ̂_p,  where R̂_p = Γ̂_p / γ̂(0),  ρ̂_p = γ̂_p / γ̂(0),  and  σ̂^2 = γ̂(0)[1 − φ̂^T ρ̂_p].

Notice that γ̂(0) is the sample variance of {x_1, ..., x_n}. Based on a sample {x_1, ..., x_n}, the above equations provide the parameter estimates. Using advanced probability theory, it can be shown that

φ̃ = (φ̃_1, ..., φ̃_p)^T ≈ MVN(φ, (σ^2/n) Γ_p^{−1})

for large n. If we replace σ^2 and Γ_p by their sample estimates σ̂^2 and Γ̂_p, we can use this result to build large-sample confidence intervals for the parameters φ_1, ..., φ_p.
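A small R sketch of Yule-Walker estimation on simulated AR(2) data: the built-in ar() with method "yule-walker", and essentially the same estimate obtained by solving the sample equations R̂_p φ = ρ̂_p directly. The true parameters (0.6, 0.3) are chosen only for this simulation.

set.seed(4)
x <- arima.sim(model = list(ar = c(0.6, 0.3)), n = 500)

fit <- ar(x, order.max = 2, aic = FALSE, method = "yule-walker")
fit$ar         # phi-hat
fit$var.pred   # sigma^2-hat

# by hand: phi-hat = R_p^{-1} rho_p
r   <- drop(acf(x, lag.max = 2, plot = FALSE)$acf)   # rho(0), rho(1), rho(2)
R_p <- matrix(c(1, r[2], r[2], 1), 2, 2)
solve(R_p, r[2:3])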

Example 8.1. Based on a given table of sample ACF and PACF values, an AR(2) model has been proposed for the data. Provide the Yule-Walker estimates of the parameters, as well as 95% confidence intervals for the components of φ. The data were collected over a window of 200 points with sample variance γ̂(0) = 3.69. We want to estimate φ_1 and φ_2 in

X_t = φ_1 X_{t−1} + φ_2 X_{t−2} + Z_t,  {Z_t} ∼ WN(0, σ^2).

Solving the sample Yule-Walker system φ̂ = R̂_2^{−1} ρ̂_2 with the given values of ρ̂(1) and ρ̂(2) yields

φ̂ = (0.594, 0.276)^T,  σ̂^2 = γ̂(0)[1 − φ̂_1 ρ̂(1) − φ̂_2 ρ̂(2)] = 1.112.

Therefore the estimated model is

X_t = 0.594 X_{t−1} + 0.276 X_{t−2} + Z_t,  {Z_t} ∼ WN(0, 1.112).

Now φ̃ ≈ N(φ, (σ^2/n) Γ_2^{−1}), and replacing the unknowns by their sample estimates, the 95% confidence intervals for φ_1 and φ_2 are

φ̂_1 ± 1.96 √(Var̂(φ̃_1)) = 0.594 ± 0.139 = (0.455, 0.733),
φ̂_2 ± 1.96 √(Var̂(φ̃_2)) = 0.276 ± 0.139 = (0.137, 0.415).

(Johnson & Wichern discuss ellipsoidal confidence regions in Applied Multivariate Statistical Analysis.)

9 Likelihood Models

To use likelihood models, we have to make some distributional assumptions. Consider {X_t, t ∈ T} to be a Gaussian process; then Z_t in φ(B)X_t = θ(B)Z_t is iid G(0, σ). Based on the observations x_1, ..., x_n at times 1, 2, ..., n, the likelihood function of the parameters φ, θ and σ^2 is

L(θ, φ, σ^2) = (2π)^{−n/2} |Γ_n|^{−1/2} exp(−(1/2) x^T Γ_n^{−1} x).

Notice that it is assumed that E[X_t] = 0 for all t. To estimate φ, θ and σ^2, we maximize the likelihood function above. Usually it is easier to maximize the log of L(θ, φ, σ^2), which is called the log-likelihood. In this likelihood function, γ(h) depends on the parameters θ, φ and σ^2. Furthermore, as the dataset gets larger (n increases), the inversion of Γ_n can be computationally challenging. Therefore, efficient computational methods are needed for likelihood estimation.

10 Forecasting

We discuss how forecasting works under our studied processes.

10.1 Forecasting AR(p)

Let X_t = ∑_{j=1}^{p} φ_j X_{t−j} + Z_t, {Z_t} ∼ WN(0, σ^2), be a causal AR(p) process. We have, for h > 0,

X̂_{n+h} = E[X_{n+h} | X_1, ..., X_n]
        = E[∑_{j=1}^{h−1} φ_j X_{n+h−j} | X_1, ..., X_n] + E[∑_{j=h}^{p} φ_j X_{n+h−j} | X_1, ..., X_n] + E[Z_{n+h} | X_1, ..., X_n],

where the last term is 0 due to the uncorrelatedness of Z_{n+h} with X_1, ..., X_n.

If h = 1, then the above equation becomes X̂_{n+1} = ∑_{j=1}^{p} φ_j X_{n+1−j}.

If h = 2, 3, ..., p, then remark that j < h ⟹ n + h − j > n and j ≥ h ⟹ n + h − j ≤ n, and so

X̂_{n+h} = ∑_{j=h}^{p} φ_j X_{n+h−j} + ∑_{j=1}^{h−1} φ_j E(X_{n+h−j} | X_1, ..., X_n) = ∑_{j=1}^{h−1} φ_j X̂_{n+h−j} + ∑_{j=h}^{p} φ_j X_{n+h−j}.

If h > p, then n + h − j > n for all j, and X̂_{n+h} = ∑_{j=1}^{p} φ_j E(X_{n+h−j} | X_1, ..., X_n) = ∑_{j=1}^{p} φ_j X̂_{n+h−j}.

In summary, for a causal AR(p), the h-step predictor is

X̂_{n+h} = ∑_{j=1}^{p} φ_j X_{n+1−j} if h = 1;
X̂_{n+h} = ∑_{j=1}^{h−1} φ_j X̂_{n+h−j} + ∑_{j=h}^{p} φ_j X_{n+h−j} if h = 2, 3, ..., p;
X̂_{n+h} = ∑_{j=1}^{p} φ_j X̂_{n+h−j} if h > p.

In AR(p), the h-step prediction is a linear combination of the previous steps. We either have the previous p values among X_1, ..., X_n, in which case we substitute the observed values (as in the h = 1 case), or we do not have all or some of them, in which case we predict recursively. Given a dataset, the φ_j can be estimated and X̂_{n+h} computed, as in the sketch below.

Example 10.1. Based on the annual sales data of a chain store, an AR(2) model with parameters φ̂_1 = 1 and φ̂_2 = −0.21 has been fitted. The total sales of the last 3 years have been 9, 11 and 10 million dollars (X_2012 = 9, X_2011 = 11, X_2010 = 10). Forecast this year's total sales (2013) as well as those of 2015. We have

X_t = X_{t−1} − 0.21 X_{t−2} + Z_t,  {Z_t} ∼ WN(0, σ^2).

Now

X̂_2013 = X_2012 − 0.21 X_2011 = 9 − 0.21(11) = 6.69,
X̂_2015 = X̂_2014 − 0.21 X̂_2013 = X̂_2014 − 0.21(6.69),

and since

X̂_2014 = X̂_2013 − 0.21 X_2012 = 6.69 − 0.21(9) = 4.8,

then

X̂_2015 = 4.8 − 0.21(6.69) ≈ 3.4.
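A minimal R sketch, not from the notes, of the recursive h-step AR(p) predictor summarized above: observed values are used while the required index stays within the data, and earlier forecasts are substituted once it moves past the data. The call at the end mirrors Example 10.1 under the reading X_2010 = 10, X_2011 = 11, X_2012 = 9.

forecast_ar <- function(x, phi, h) {
  p <- length(phi)
  z <- c(x, rep(NA_real_, h))          # extend the series with h empty slots
  n <- length(x)
  for (k in 1:h) {
    past <- z[(n + k - 1):(n + k - p)] # X_{n+k-1}, ..., X_{n+k-p}, observed or forecast
    z[n + k] <- sum(phi * past)
  }
  z[(n + 1):(n + h)]
}

# last three annual totals, oldest to newest: 10, 11, 9
forecast_ar(c(10, 11, 9), phi = c(1, -0.21), h = 3)   # approx. 6.69, 4.80, 3.40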

10.2 Forecasting MA(q)

MA processes are linear combinations of white noise. To do forecasting in MA(q), we need to estimate θ_1, ..., θ_q as well as approximate the innovations Z_t, Z_{t−1}, .... First, consider the very simple case of MA(1), where X_t = Z_t + θZ_{t−1}, {Z_t} ∼ WN(0, σ^2). We have

X̂_{n+h} = E[X_{n+h} | X_1, ..., X_n] = E[Z_{n+h} | X_1, ..., X_n] + θ E[Z_{n+h−1} | X_1, ..., X_n].

If h = 1, then the above equation is

X̂_{n+1} = E[Z_{n+1} | X_1, ..., X_n] + θ E[Z_n | X_1, ..., X_n] = 0 + θ E[Z_n | X_1, ..., X_n] = θZ_n,

and if h > 1, then both innovation indices exceed n and the equation becomes

X̂_{n+h} = E[Z_{n+h}] + θ E[Z_{n+h−1} | X_1, ..., X_n] = 0.

Now we need to plug in a value for Z_n. We approximate the Z_t's by U_t's as follows. Let U_0 = 0 and estimate recursively from the fact that Z_t = X_t − θZ_{t−1}:

Ẑ_t = U_t = X_t − θU_{t−1},  U_0 = 0,

so that

U_0 = 0,  U_1 = X_1,  U_2 = X_2 − θX_1,  U_3 = X_3 − θX_2 + θ^2 X_1,  ...

Notice that as t → ∞, the U_t will need a convergence condition, for which |θ| < 1 is sufficient; this was the invertibility condition for MA(1). We see that the U_t's are recursively calculable, and for an invertible MA(1) process we have

X̂_{n+h} = θU_n if h = 1;  0 if h > 1,  where U_t = X_t − θU_{t−1}, U_0 = 0.

Now consider an MA(q) process X_t = Z_t + θ_1 Z_{t−1} + ... + θ_q Z_{t−q}. We have

X̂_{n+h} = E[X_{n+h} | X_1, ..., X_n] = E[Z_{n+h} | X_1, ..., X_n] + θ_1 E[Z_{n+h−1} | X_1, ..., X_n] + ... + θ_q E[Z_{n+h−q} | X_1, ..., X_n].

If h > q, then the value of the above equation is zero, since n + h − q > n. If 0 < h ≤ q, then at least some of the terms above are non-zero. In particular,

X̂_{n+h} = ∑_{j=h}^{q} θ_j E[Z_{n+h−j} | X_1, ..., X_n],

and for j = h, h + 1, ..., q we take E[Z_{n+h−j} | X_1, ..., X_n] = Z_{n+h−j}, hence

X̂_{n+h} = ∑_{j=h}^{q} θ_j Z_{n+h−j}.

Similar to MA(1), we approximate the Z_t's by U_t's, provided the MA(q) process is invertible, that is, θ(z) = 1 + θ_1 z + ... + θ_q z^q ≠ 0 for all |z| ≤ 1. Therefore, assuming that U_0 = U_{−1} = U_{−2} = ... = 0, we set U_t = X_t − ∑_{j=1}^{q} θ_j U_{t−j}, and

U_1 = X_1,  U_2 = X_2 − θ_1 X_1,  U_3 = X_3 − θ_1 X_2 + (θ_1^2 − θ_2) X_1,  ...

In summary, for an invertible MA(q) process, we have

X̂_{n+h} = ∑_{j=h}^{q} θ_j U_{n+h−j} if 1 ≤ h ≤ q;  0 if h > q,

where U_i = 0 for i ≤ 0 and U_t = X_t − ∑_{j=1}^{q} θ_j U_{t−j} for t = 1, 2, 3, ....

Example 10.2. Consider the MA(1) process X_t = Z_t + 0.5Z_{t−1}, where {Z_t} ∼ WN(0, σ^2). If X_1 = 0.3, X_2 = −0.1, X_3 = 0.1, predict X_4 and X_5. Notice that X̂_5 = X̂_{3+2}, which is a 2-step prediction based on the history X_1, X_2, X_3. Since this is an MA(1) model, hence 1-correlated, X̂_5 = 0. For X_4 we have

X̂_4 = ∑_{j=1}^{1} θ_j U_{3+1−j} = θ_1 U_3 = 0.5 U_3,

where

U_0 = 0,
U_1 = X_1 − 0.5U_0 = X_1 = 0.3,
U_2 = X_2 − 0.5U_1 = −0.1 − (0.5)(0.3) = −0.25,
U_3 = X_3 − 0.5U_2 = 0.1 − (0.5)(−0.25) = 0.225,

and hence X̂_4 = 0.5(0.225) = 0.1125.
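The truncated-innovation recursion of Example 10.2, written out as a short R sketch: U_0 = 0, U_t = X_t − θU_{t−1}, and the one-step forecast X̂_{n+1} = θU_n.

theta <- 0.5
x     <- c(0.3, -0.1, 0.1)

U <- 0                                   # U_0 = 0
for (xt in x) U <- c(U, xt - theta * tail(U, 1))
U                                        # 0, 0.300, -0.250, 0.225

xhat4 <- theta * tail(U, 1)              # 0.5 * 0.225 = 0.1125
xhat5 <- 0                               # h > 1 forecasts are zero for MA(1)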

Example 10.3. Consider the MA(1) process X_t = Z_t + θZ_{t−1} with {Z_t} ∼ WN(0, σ^2) and |θ| < 1. Show that the one-step predictor X̂_{n+1} = θU_n is equal to the predictor

X̃_{n+1} = −∑_{j=1}^{n} (−θ)^j X_{n−j+1}.

This follows from the definition of U_n, for which we can write the closed form

U_n = X_n + ∑_{i=1}^{n−1} (−θ)^i X_{n−i},  n ≥ 2,

and hence

X̂_{n+1} = θU_n = θX_n + ∑_{i=1}^{n−1} θ(−θ)^i X_{n−i} = −∑_{i=0}^{n−1} (−θ)^{i+1} X_{n−i} = −∑_{j=1}^{n} (−θ)^j X_{n−j+1} = X̃_{n+1}.

Clearly for n = 0, 1 we have X̂_{n+1} = X̃_{n+1} as well. This shows that even in the MA process, the predictor may be written as a linear function of the history.

10.3 Forecasting ARMA(p, q) Processes

For the causal and invertible ARMA(p, q) process φ(B)X_t = θ(B)Z_t, where {Z_t} ∼ WN(0, σ^2), the predictors for MA(q) and AR(p) are combined. We will not go into the theory of forecasting in ARMA processes and will use R for that matter. For example,

X̂_{n+1} = ∑_{j=1}^{p} φ_j X_{n+1−j} + ∑_{i=1}^{q} θ_i U_{n+1−i},

subject to some conditions.
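As the notes suggest, ARMA forecasting is done in R in practice. The following hedged sketch fits an ARMA(1, 1) by maximum likelihood to simulated data with arima() and produces h-step forecasts with prediction standard errors; the simulation parameters are illustrative only.

set.seed(5)
x   <- arima.sim(model = list(ar = 0.5, ma = 0.4), n = 300)
fit <- arima(x, order = c(1, 0, 1), include.mean = FALSE)
fit$coef                     # phi-hat and theta-hat

fc <- predict(fit, n.ahead = 5)
fc$pred                      # point forecasts X-hat_{n+h}
fc$se                        # forecast standard errors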


More information

Univariate Time Series Analysis; ARIMA Models

Univariate Time Series Analysis; ARIMA Models Econometrics 2 Fall 24 Univariate Time Series Analysis; ARIMA Models Heino Bohn Nielsen of4 Outline of the Lecture () Introduction to univariate time series analysis. (2) Stationarity. (3) Characterizing

More information

Ch 4. Models For Stationary Time Series. Time Series Analysis

Ch 4. Models For Stationary Time Series. Time Series Analysis This chapter discusses the basic concept of a broad class of stationary parametric time series models the autoregressive moving average (ARMA) models. Let {Y t } denote the observed time series, and {e

More information

STAD57 Time Series Analysis. Lecture 8

STAD57 Time Series Analysis. Lecture 8 STAD57 Time Series Analysis Lecture 8 1 ARMA Model Will be using ARMA models to describe times series dynamics: ( B) X ( B) W X X X W W W t 1 t1 p t p t 1 t1 q tq Model must be causal (i.e. stationary)

More information

TIME SERIES AND FORECASTING. Luca Gambetti UAB, Barcelona GSE Master in Macroeconomic Policy and Financial Markets

TIME SERIES AND FORECASTING. Luca Gambetti UAB, Barcelona GSE Master in Macroeconomic Policy and Financial Markets TIME SERIES AND FORECASTING Luca Gambetti UAB, Barcelona GSE 2014-2015 Master in Macroeconomic Policy and Financial Markets 1 Contacts Prof.: Luca Gambetti Office: B3-1130 Edifici B Office hours: email:

More information

Lecture 2: Univariate Time Series

Lecture 2: Univariate Time Series Lecture 2: Univariate Time Series Analysis: Conditional and Unconditional Densities, Stationarity, ARMA Processes Prof. Massimo Guidolin 20192 Financial Econometrics Spring/Winter 2017 Overview Motivation:

More information

Ch 6. Model Specification. Time Series Analysis

Ch 6. Model Specification. Time Series Analysis We start to build ARIMA(p,d,q) models. The subjects include: 1 how to determine p, d, q for a given series (Chapter 6); 2 how to estimate the parameters (φ s and θ s) of a specific ARIMA(p,d,q) model (Chapter

More information

Time Series 2. Robert Almgren. Sept. 21, 2009

Time Series 2. Robert Almgren. Sept. 21, 2009 Time Series 2 Robert Almgren Sept. 21, 2009 This week we will talk about linear time series models: AR, MA, ARMA, ARIMA, etc. First we will talk about theory and after we will talk about fitting the models

More information

STAT 248: EDA & Stationarity Handout 3

STAT 248: EDA & Stationarity Handout 3 STAT 248: EDA & Stationarity Handout 3 GSI: Gido van de Ven September 17th, 2010 1 Introduction Today s section we will deal with the following topics: the mean function, the auto- and crosscovariance

More information

Nonlinear time series

Nonlinear time series Based on the book by Fan/Yao: Nonlinear Time Series Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna October 27, 2009 Outline Characteristics of

More information

EASTERN MEDITERRANEAN UNIVERSITY ECON 604, FALL 2007 DEPARTMENT OF ECONOMICS MEHMET BALCILAR ARIMA MODELS: IDENTIFICATION

EASTERN MEDITERRANEAN UNIVERSITY ECON 604, FALL 2007 DEPARTMENT OF ECONOMICS MEHMET BALCILAR ARIMA MODELS: IDENTIFICATION ARIMA MODELS: IDENTIFICATION A. Autocorrelations and Partial Autocorrelations 1. Summary of What We Know So Far: a) Series y t is to be modeled by Box-Jenkins methods. The first step was to convert y t

More information

Discrete time processes

Discrete time processes Discrete time processes Predictions are difficult. Especially about the future Mark Twain. Florian Herzog 2013 Modeling observed data When we model observed (realized) data, we encounter usually the following

More information

Prof. Dr. Roland Füss Lecture Series in Applied Econometrics Summer Term Introduction to Time Series Analysis

Prof. Dr. Roland Füss Lecture Series in Applied Econometrics Summer Term Introduction to Time Series Analysis Introduction to Time Series Analysis 1 Contents: I. Basics of Time Series Analysis... 4 I.1 Stationarity... 5 I.2 Autocorrelation Function... 9 I.3 Partial Autocorrelation Function (PACF)... 14 I.4 Transformation

More information

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley Time Series Models and Inference James L. Powell Department of Economics University of California, Berkeley Overview In contrast to the classical linear regression model, in which the components of the

More information

Stat 5100 Handout #12.e Notes: ARIMA Models (Unit 7) Key here: after stationary, identify dependence structure (and use for forecasting)

Stat 5100 Handout #12.e Notes: ARIMA Models (Unit 7) Key here: after stationary, identify dependence structure (and use for forecasting) Stat 5100 Handout #12.e Notes: ARIMA Models (Unit 7) Key here: after stationary, identify dependence structure (and use for forecasting) (overshort example) White noise H 0 : Let Z t be the stationary

More information

Econ 623 Econometrics II Topic 2: Stationary Time Series

Econ 623 Econometrics II Topic 2: Stationary Time Series 1 Introduction Econ 623 Econometrics II Topic 2: Stationary Time Series In the regression model we can model the error term as an autoregression AR(1) process. That is, we can use the past value of the

More information

STA 6857 Autocorrelation and Cross-Correlation & Stationary Time Series ( 1.4, 1.5)

STA 6857 Autocorrelation and Cross-Correlation & Stationary Time Series ( 1.4, 1.5) STA 6857 Autocorrelation and Cross-Correlation & Stationary Time Series ( 1.4, 1.5) Outline 1 Announcements 2 Autocorrelation and Cross-Correlation 3 Stationary Time Series 4 Homework 1c Arthur Berg STA

More information

ITSM-R Reference Manual

ITSM-R Reference Manual ITSM-R Reference Manual George Weigt February 11, 2018 1 Contents 1 Introduction 3 1.1 Time series analysis in a nutshell............................... 3 1.2 White Noise Variance.....................................

More information

MAT3379 (Winter 2016)

MAT3379 (Winter 2016) MAT3379 (Winter 2016) Assignment 4 - SOLUTIONS The following questions will be marked: 1a), 2, 4, 6, 7a Total number of points for Assignment 4: 20 Q1. (Theoretical Question, 2 points). Yule-Walker estimation

More information

Lecture 1: Fundamental concepts in Time Series Analysis (part 2)

Lecture 1: Fundamental concepts in Time Series Analysis (part 2) Lecture 1: Fundamental concepts in Time Series Analysis (part 2) Florian Pelgrin University of Lausanne, École des HEC Department of mathematics (IMEA-Nice) Sept. 2011 - Jan. 2012 Florian Pelgrin (HEC)

More information

Introduction to Stochastic processes

Introduction to Stochastic processes Università di Pavia Introduction to Stochastic processes Eduardo Rossi Stochastic Process Stochastic Process: A stochastic process is an ordered sequence of random variables defined on a probability space

More information

Empirical Market Microstructure Analysis (EMMA)

Empirical Market Microstructure Analysis (EMMA) Empirical Market Microstructure Analysis (EMMA) Lecture 3: Statistical Building Blocks and Econometric Basics Prof. Dr. Michael Stein michael.stein@vwl.uni-freiburg.de Albert-Ludwigs-University of Freiburg

More information

NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER II EXAMINATION MAS451/MTH451 Time Series Analysis TIME ALLOWED: 2 HOURS

NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER II EXAMINATION MAS451/MTH451 Time Series Analysis TIME ALLOWED: 2 HOURS NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER II EXAMINATION 2012-2013 MAS451/MTH451 Time Series Analysis May 2013 TIME ALLOWED: 2 HOURS INSTRUCTIONS TO CANDIDATES 1. This examination paper contains FOUR (4)

More information

LECTURES 2-3 : Stochastic Processes, Autocorrelation function. Stationarity.

LECTURES 2-3 : Stochastic Processes, Autocorrelation function. Stationarity. LECTURES 2-3 : Stochastic Processes, Autocorrelation function. Stationarity. Important points of Lecture 1: A time series {X t } is a series of observations taken sequentially over time: x t is an observation

More information

7. Forecasting with ARIMA models

7. Forecasting with ARIMA models 7. Forecasting with ARIMA models 309 Outline: Introduction The prediction equation of an ARIMA model Interpreting the predictions Variance of the predictions Forecast updating Measuring predictability

More information

Problem Set 2 Solution Sketches Time Series Analysis Spring 2010

Problem Set 2 Solution Sketches Time Series Analysis Spring 2010 Problem Set 2 Solution Sketches Time Series Analysis Spring 2010 Forecasting 1. Let X and Y be two random variables such that E(X 2 ) < and E(Y 2 )

More information

Introduction to ARMA and GARCH processes

Introduction to ARMA and GARCH processes Introduction to ARMA and GARCH processes Fulvio Corsi SNS Pisa 3 March 2010 Fulvio Corsi Introduction to ARMA () and GARCH processes SNS Pisa 3 March 2010 1 / 24 Stationarity Strict stationarity: (X 1,

More information

Module 3. Descriptive Time Series Statistics and Introduction to Time Series Models

Module 3. Descriptive Time Series Statistics and Introduction to Time Series Models Module 3 Descriptive Time Series Statistics and Introduction to Time Series Models Class notes for Statistics 451: Applied Time Series Iowa State University Copyright 2015 W Q Meeker November 11, 2015

More information

Time Series Solutions HT 2009

Time Series Solutions HT 2009 Time Series Solutions HT 2009 1. Let {X t } be the ARMA(1, 1) process, X t φx t 1 = ɛ t + θɛ t 1, {ɛ t } WN(0, σ 2 ), where φ < 1 and θ < 1. Show that the autocorrelation function of {X t } is given by

More information

Forecasting using R. Rob J Hyndman. 2.4 Non-seasonal ARIMA models. Forecasting using R 1

Forecasting using R. Rob J Hyndman. 2.4 Non-seasonal ARIMA models. Forecasting using R 1 Forecasting using R Rob J Hyndman 2.4 Non-seasonal ARIMA models Forecasting using R 1 Outline 1 Autoregressive models 2 Moving average models 3 Non-seasonal ARIMA models 4 Partial autocorrelations 5 Estimation

More information

Ch 5. Models for Nonstationary Time Series. Time Series Analysis

Ch 5. Models for Nonstationary Time Series. Time Series Analysis We have studied some deterministic and some stationary trend models. However, many time series data cannot be modeled in either way. Ex. The data set oil.price displays an increasing variation from the

More information

Generalised AR and MA Models and Applications

Generalised AR and MA Models and Applications Chapter 3 Generalised AR and MA Models and Applications 3.1 Generalised Autoregressive Processes Consider an AR1) process given by 1 αb)x t = Z t ; α < 1. In this case, the acf is, ρ k = α k for k 0 and

More information

Classic Time Series Analysis

Classic Time Series Analysis Classic Time Series Analysis Concepts and Definitions Let Y be a random number with PDF f Y t ~f,t Define t =E[Y t ] m(t) is known as the trend Define the autocovariance t, s =COV [Y t,y s ] =E[ Y t t

More information

Ross Bettinger, Analytical Consultant, Seattle, WA

Ross Bettinger, Analytical Consultant, Seattle, WA ABSTRACT DYNAMIC REGRESSION IN ARIMA MODELING Ross Bettinger, Analytical Consultant, Seattle, WA Box-Jenkins time series models that contain exogenous predictor variables are called dynamic regression

More information

Ch. 14 Stationary ARMA Process

Ch. 14 Stationary ARMA Process Ch. 14 Stationary ARMA Process A general linear stochastic model is described that suppose a time series to be generated by a linear aggregation of random shock. For practical representation it is desirable

More information

University of Oxford. Statistical Methods Autocorrelation. Identification and Estimation

University of Oxford. Statistical Methods Autocorrelation. Identification and Estimation University of Oxford Statistical Methods Autocorrelation Identification and Estimation Dr. Órlaith Burke Michaelmas Term, 2011 Department of Statistics, 1 South Parks Road, Oxford OX1 3TG Contents 1 Model

More information

Circle a single answer for each multiple choice question. Your choice should be made clearly.

Circle a single answer for each multiple choice question. Your choice should be made clearly. TEST #1 STA 4853 March 4, 215 Name: Please read the following directions. DO NOT TURN THE PAGE UNTIL INSTRUCTED TO DO SO Directions This exam is closed book and closed notes. There are 31 questions. Circle

More information

Introduction to Time Series Analysis. Lecture 7.

Introduction to Time Series Analysis. Lecture 7. Last lecture: Introduction to Time Series Analysis. Lecture 7. Peter Bartlett 1. ARMA(p,q) models: stationarity, causality, invertibility 2. The linear process representation of ARMA processes: ψ. 3. Autocovariance

More information

Module 4. Stationary Time Series Models Part 1 MA Models and Their Properties

Module 4. Stationary Time Series Models Part 1 MA Models and Their Properties Module 4 Stationary Time Series Models Part 1 MA Models and Their Properties Class notes for Statistics 451: Applied Time Series Iowa State University Copyright 2015 W. Q. Meeker. February 14, 2016 20h

More information

at least 50 and preferably 100 observations should be available to build a proper model

at least 50 and preferably 100 observations should be available to build a proper model III Box-Jenkins Methods 1. Pros and Cons of ARIMA Forecasting a) need for data at least 50 and preferably 100 observations should be available to build a proper model used most frequently for hourly or

More information

Exercises - Time series analysis

Exercises - Time series analysis Descriptive analysis of a time series (1) Estimate the trend of the series of gasoline consumption in Spain using a straight line in the period from 1945 to 1995 and generate forecasts for 24 months. Compare

More information

Financial Time Series Analysis Week 5

Financial Time Series Analysis Week 5 Financial Time Series Analysis Week 5 25 Estimation in AR moels Central Limit Theorem for µ in AR() Moel Recall : If X N(µ, σ 2 ), normal istribute ranom variable with mean µ an variance σ 2, then X µ

More information

Univariate ARIMA Models

Univariate ARIMA Models Univariate ARIMA Models ARIMA Model Building Steps: Identification: Using graphs, statistics, ACFs and PACFs, transformations, etc. to achieve stationary and tentatively identify patterns and model components.

More information

Next tool is Partial ACF; mathematical tools first. The Multivariate Normal Distribution. e z2 /2. f Z (z) = 1 2π. e z2 i /2

Next tool is Partial ACF; mathematical tools first. The Multivariate Normal Distribution. e z2 /2. f Z (z) = 1 2π. e z2 i /2 Next tool is Partial ACF; mathematical tools first. The Multivariate Normal Distribution Defn: Z R 1 N(0,1) iff f Z (z) = 1 2π e z2 /2 Defn: Z R p MV N p (0, I) if and only if Z = (Z 1,..., Z p ) (a column

More information

Lecture 3: Autoregressive Moving Average (ARMA) Models and their Practical Applications

Lecture 3: Autoregressive Moving Average (ARMA) Models and their Practical Applications Lecture 3: Autoregressive Moving Average (ARMA) Models and their Practical Applications Prof. Massimo Guidolin 20192 Financial Econometrics Winter/Spring 2018 Overview Moving average processes Autoregressive

More information

MAT 3379 (Winter 2016) FINAL EXAM (SOLUTIONS)

MAT 3379 (Winter 2016) FINAL EXAM (SOLUTIONS) MAT 3379 (Winter 2016) FINAL EXAM (SOLUTIONS) 15 April 2016 (180 minutes) Professor: R. Kulik Student Number: Name: This is closed book exam. You are allowed to use one double-sided A4 sheet of notes.

More information

6.3 Forecasting ARMA processes

6.3 Forecasting ARMA processes 6.3. FORECASTING ARMA PROCESSES 123 6.3 Forecasting ARMA processes The purpose of forecasting is to predict future values of a TS based on the data collected to the present. In this section we will discuss

More information

Characteristics of Time Series

Characteristics of Time Series Characteristics of Time Series Al Nosedal University of Toronto January 12, 2016 Al Nosedal University of Toronto Characteristics of Time Series January 12, 2016 1 / 37 Signal and Noise In general, most

More information

Identifiability, Invertibility

Identifiability, Invertibility Identifiability, Invertibility Defn: If {ǫ t } is a white noise series and µ and b 0,..., b p are constants then X t = µ + b 0 ǫ t + b ǫ t + + b p ǫ t p is a moving average of order p; write MA(p). Q:

More information

1. Stochastic Processes and Stationarity

1. Stochastic Processes and Stationarity Massachusetts Institute of Technology Department of Economics Time Series 14.384 Guido Kuersteiner Lecture Note 1 - Introduction This course provides the basic tools needed to analyze data that is observed

More information

Class 1: Stationary Time Series Analysis

Class 1: Stationary Time Series Analysis Class 1: Stationary Time Series Analysis Macroeconometrics - Fall 2009 Jacek Suda, BdF and PSE February 28, 2011 Outline Outline: 1 Covariance-Stationary Processes 2 Wold Decomposition Theorem 3 ARMA Models

More information

Time Series Outlier Detection

Time Series Outlier Detection Time Series Outlier Detection Tingyi Zhu July 28, 2016 Tingyi Zhu Time Series Outlier Detection July 28, 2016 1 / 42 Outline Time Series Basics Outliers Detection in Single Time Series Outlier Series Detection

More information

The autocorrelation and autocovariance functions - helpful tools in the modelling problem

The autocorrelation and autocovariance functions - helpful tools in the modelling problem The autocorrelation and autocovariance functions - helpful tools in the modelling problem J. Nowicka-Zagrajek A. Wy lomańska Institute of Mathematics and Computer Science Wroc law University of Technology,

More information

Econometrics II Heij et al. Chapter 7.1

Econometrics II Heij et al. Chapter 7.1 Chapter 7.1 p. 1/2 Econometrics II Heij et al. Chapter 7.1 Linear Time Series Models for Stationary data Marius Ooms Tinbergen Institute Amsterdam Chapter 7.1 p. 2/2 Program Introduction Modelling philosophy

More information

ECON 616: Lecture 1: Time Series Basics

ECON 616: Lecture 1: Time Series Basics ECON 616: Lecture 1: Time Series Basics ED HERBST August 30, 2017 References Overview: Chapters 1-3 from Hamilton (1994). Technical Details: Chapters 2-3 from Brockwell and Davis (1987). Intuition: Chapters

More information

2. An Introduction to Moving Average Models and ARMA Models

2. An Introduction to Moving Average Models and ARMA Models . An Introduction to Moving Average Models and ARMA Models.1 White Noise. The MA(1) model.3 The MA(q) model..4 Estimation and forecasting of MA models..5 ARMA(p,q) models. The Moving Average (MA) models

More information

Chapter 9: Forecasting

Chapter 9: Forecasting Chapter 9: Forecasting One of the critical goals of time series analysis is to forecast (predict) the values of the time series at times in the future. When forecasting, we ideally should evaluate the

More information

Lesson 9: Autoregressive-Moving Average (ARMA) models

Lesson 9: Autoregressive-Moving Average (ARMA) models Lesson 9: Autoregressive-Moving Average (ARMA) models Dipartimento di Ingegneria e Scienze dell Informazione e Matematica Università dell Aquila, umberto.triacca@ec.univaq.it Introduction We have seen

More information

Covariance Stationary Time Series. Example: Independent White Noise (IWN(0,σ 2 )) Y t = ε t, ε t iid N(0,σ 2 )

Covariance Stationary Time Series. Example: Independent White Noise (IWN(0,σ 2 )) Y t = ε t, ε t iid N(0,σ 2 ) Covariance Stationary Time Series Stochastic Process: sequence of rv s ordered by time {Y t } {...,Y 1,Y 0,Y 1,...} Defn: {Y t } is covariance stationary if E[Y t ]μ for all t cov(y t,y t j )E[(Y t μ)(y

More information

Stochastic processes: basic notions

Stochastic processes: basic notions Stochastic processes: basic notions Jean-Marie Dufour McGill University First version: March 2002 Revised: September 2002, April 2004, September 2004, January 2005, July 2011, May 2016, July 2016 This

More information

Gaussian processes. Basic Properties VAG002-

Gaussian processes. Basic Properties VAG002- Gaussian processes The class of Gaussian processes is one of the most widely used families of stochastic processes for modeling dependent data observed over time, or space, or time and space. The popularity

More information

Stationary Stochastic Time Series Models

Stationary Stochastic Time Series Models Stationary Stochastic Time Series Models When modeling time series it is useful to regard an observed time series, (x 1,x,..., x n ), as the realisation of a stochastic process. In general a stochastic

More information

Econometric Forecasting

Econometric Forecasting Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna October 1, 2014 Outline Introduction Model-free extrapolation Univariate time-series models Trend

More information

Lecture 1: Stationary Time Series Analysis

Lecture 1: Stationary Time Series Analysis Syllabus Stationarity ARMA AR MA Model Selection Estimation Lecture 1: Stationary Time Series Analysis 222061-1617: Time Series Econometrics Spring 2018 Jacek Suda Syllabus Stationarity ARMA AR MA Model

More information

Review Session: Econometrics - CLEFIN (20192)

Review Session: Econometrics - CLEFIN (20192) Review Session: Econometrics - CLEFIN (20192) Part II: Univariate time series analysis Daniele Bianchi March 20, 2013 Fundamentals Stationarity A time series is a sequence of random variables x t, t =

More information

Moving Average (MA) representations

Moving Average (MA) representations Moving Average (MA) representations The moving average representation of order M has the following form v[k] = MX c n e[k n]+e[k] (16) n=1 whose transfer function operator form is MX v[k] =H(q 1 )e[k],

More information

STAT 720 TIME SERIES ANALYSIS. Lecture Notes. Dewei Wang. Department of Statistics. University of South Carolina

STAT 720 TIME SERIES ANALYSIS. Lecture Notes. Dewei Wang. Department of Statistics. University of South Carolina STAT 720 TIME SERIES ANALYSIS Spring 2015 Lecture Notes Dewei Wang Department of Statistics University of South Carolina 1 Contents 1 Introduction 1 1.1 Some examples........................................

More information

X t = a t + r t, (7.1)

X t = a t + r t, (7.1) Chapter 7 State Space Models 71 Introduction State Space models, developed over the past 10 20 years, are alternative models for time series They include both the ARIMA models of Chapters 3 6 and the Classical

More information

Stochastic Modelling Solutions to Exercises on Time Series

Stochastic Modelling Solutions to Exercises on Time Series Stochastic Modelling Solutions to Exercises on Time Series Dr. Iqbal Owadally March 3, 2003 Solutions to Elementary Problems Q1. (i) (1 0.5B)X t = Z t. The characteristic equation 1 0.5z = 0 does not have

More information