Time Series Analysis - Part 1


1 Time Series Analysis - Part 1. Dr. Esam Mahdi, Islamic University of Gaza - Department of Mathematics. April 19.

2 What is a Time Series? A collection of data {X_t} recorded over a period of time (daily, weekly, monthly, quarterly, yearly, etc.), analyzed to understand the past in order to predict the future (forecast), helping managers and policy makers make well-informed and sound decisions. An important feature of most time series is that observations close together in time tend to be correlated (serially dependent). Time can be discrete, t = 1, 2, ..., or continuous, t > 0.

3 The Nature of Time Series Data - Example 1. Quarterly earnings per share of the U.S. company Johnson & Johnson, Inc., from 1960Q1 to 1980Q4 (upward trend). [Figure: quarterly earnings per share plotted against time.]

4 The Nature of Time Series Data - Example 2. A small 0.1-second (1000 points) sample of recorded speech for the phrase "aaa...hhh" (regular repetition of small wavelets). [Figure: speech amplitude plotted against time.]

5 The Nature of Time Series Data - Example 3. Annual numbers of lynx trappings on the McKenzie River in the Northwest Territories of Canada (aperiodic cycles of approximately 10 years). [Figure: lynx trappings plotted against time.]

6 The Nature of Time Series Data - Example 4. Monthly airline passenger numbers (the seasonality appears to increase with the general trend). [Figure: AirPassengers series plotted against time.]

7 The Nature of Time Series Data - Example 5. Returns of the New York Stock Exchange (NYSE) from February 2, 1984 to December 31 (average return of approximately zero; however, the volatility (or variability) of the data changes over time). [Figure: NYSE returns plotted against time.]

8 Components of a Time Series. In general, a time series can be decomposed into 4 components: secular trend, seasonal variation, cyclical variation, and irregular variation, which can be modelled deterministically with mathematical functions of time (see the next slide). General trend: the smooth long-term direction (upward or downward) of a time series. Seasonal variation: patterns of change in a time series within a year which tend to repeat each year. Cyclical variation: the rise and fall of a time series over periods longer than one year. Irregular variation: random movements that follow no regularity in their occurrence pattern.

9 Three Types of Time Series Decomposition.
1. The additive decomposition model: X_t = T_t + S_t + C_t + I_t, where X_t = original data at time t, T_t = trend value at time t, S_t = seasonal fluctuation at time t, C_t = cyclical fluctuation at time t, and I_t = irregular variation at time t.
2. The multiplicative decomposition model: X_t = T_t × S_t × C_t × I_t.
3. The mixed decomposition model: for example, if T_t and C_t are correlated with each other but independent of S_t and I_t, the model will be X_t = T_t × C_t + S_t + I_t.

10 Secular (General) Trend: Sample Chart. The increase or decrease in the movements of a time series that does not appear to be periodic is known as a trend.

11 Seasonal Variation: Sample Chart. Short-term fluctuations in a time series which occur periodically within a year. These continue to repeat year after year, although the term is applied more generally to repeating patterns within any fixed period, such as restaurant bookings on different days of the week.

12 Cyclical Variation: Sample Chart. Recurrent upward or downward movements in a time series where the period of the cycle is greater than a year. These variations are also not as regular as seasonal variation.

13 Time Series Decomposition in R. In R, the function decompose() estimates trend, seasonal, and irregular effects using the moving averages (MA) method. In this case, the series is decomposed into 3 components: a trend-cycle component, a seasonal component, and an irregular component. 1. The additive decomposition model: X_t = T_t + S_t + I_t. 2. The multiplicative decomposition model: X_t = T_t × S_t × I_t, where T_t is the trend-cycle component (containing both the trend and cycle components). (Continued on the next slide.)

14 The Function decompose() in R - Example. Atmospheric concentrations of CO2 (monthly from 1959 to 1997).
R> D <- decompose(co2)
R> season.term <- D$figure; trend.term <- D$trend
R> random.term <- D$random; plot(D)
[Figure: decomposition of the additive time series into observed, trend, seasonal, and random components.]

15 Procedure for Decomposition.
1. Decompose the series into its components (say the multiplicative model X_t = T_t × S_t × C_t × I_t).
2. Create the centered moving averages series (MA_t = T_t × C_t).
3. Deseasonalize the series (deseasonalizing means removing the seasonal effects when a seasonal pattern of period m exists). This is done as follows:
   - Find the ratio X_t / MA_t = S_t × I_t.
   - Take the average of each corresponding season to eliminate the irregular term (randomness) and get the seasonal indices.
   - Normalize the seasonal indices to get SI_t (divide each seasonal index's value by the sum of all index values and then multiply by m) so that they average to 1 (sum = m).
   - Find the ratio X_t / SI_t (this is the deseasonalized series).
4. Estimate the trend for the deseasonalized series.
5. Forecast future values of each component (X̂_t).
6. Reseasonalize the predictions (ResP = SI_t × X̂_t).
(A small R sketch of steps 2-3 follows.)
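As a rough illustration of steps 2-3 (a sketch, not from the original slides), the ratio-to-moving-average idea can be coded directly for a quarterly series; the series x below is a hypothetical example.
R> ## Minimal sketch of steps 2-3 for a quarterly series (m = 4); x is hypothetical
R> x <- ts(c(6,10,8,4, 7,11,9,5, 8,12,10,6), frequency = 4)
R> m <- frequency(x)
R> MA <- filter(x, c(0.5, rep(1, m - 1), 0.5) / m, sides = 2)  # centered moving average (T*C)
R> ratio <- x / MA                                             # S*I
R> raw.index <- tapply(ratio, cycle(x), mean, na.rm = TRUE)    # average over each season
R> SI <- raw.index * m / sum(raw.index)                        # normalize so the indices sum to m
R> deseasonalized <- x / SI[cycle(x)]                          # deseasonalized series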

16 Estimating the Seasonality: The Moving Averages Method. The moving averages (MA) method is used to smooth out short-term fluctuations and identify the long-term trend in the original series (i.e., to create a smooth series that reduces the random fluctuations in the original series and estimates the trend-cycle component). The moving average of length m ∈ Z is obtained by taking the average of the initial subset of m consecutive observations (usually m denotes the seasonal pattern of m periods). The subset is then modified by "shifting forward"; that is, excluding the first observation of the series and including the observation that follows the initial subset. This creates a new subset of m consecutive observations, which is averaged. The shifting-forward step is repeated until the moving averages of span m are obtained for the whole series. (Continued on the next slide.)

17 Centered Moving Averages (Centered MA). If m is odd, the MA is termed a centered moving average. For example, for the observations a, b, c, d, e, f, g, the moving averages with span 3 (or length 3) are
{ (a+b+c)/3, (b+c+d)/3, (c+d+e)/3, (d+e+f)/3, (e+f+g)/3 }.
If m is even, apply a moving average of span 2 to the series created by the moving average of even span m to get the centered moving averages series. For example, the moving averages with span 4 are
m_1 = (a+b+c+d)/4, m_2 = (b+c+d+e)/4, m_3 = (c+d+e+f)/4, m_4 = (d+e+f+g)/4,
so that the centered moving averages are
{ (m_1+m_2)/2, (m_2+m_3)/2, (m_3+m_4)/2 }.
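For reference, the same centered moving averages can be computed in one line with the base-R filter() function; this is a small sketch, not part of the original slides, and uses the data of the next slide (the values match the loop computations two slides below, up to alignment).
R> x <- c(10,14,11,21,11,16,10,22,14,18,13,22,13)
R> ma3 <- filter(x, rep(1/3, 3), sides = 2)                  # span-3 moving average (already centered)
R> ma4.centered <- filter(x, c(0.5, 1, 1, 1, 0.5)/4, sides = 2)  # 2-of-4 centered moving average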

18 Example: Moving Averages Method. For the yearly data X_t = 10, 14, 11, 21, 11, 16, 10, 22, 14, 18, 13, 22, 13 (years 1986-1998):
Three-year moving averages: (10+14+11)/3 = 11.7, (14+11+21)/3 = 15.3, (11+21+11)/3 = 14.3, (21+11+16)/3 = 16.0, (11+16+10)/3 = 12.3, (16+10+22)/3 = 16.0, (10+22+14)/3 = 15.3, (22+14+18)/3 = 18.0, (14+18+13)/3 = 15.0, (18+13+22)/3 = 17.7, (13+22+13)/3 = 16.0.
Four-year moving averages: (10+14+11+21)/4 = 14.00, (14+11+21+11)/4 = 14.25, (11+21+11+16)/4 = 14.75, (21+11+16+10)/4 = 14.50, (11+16+10+22)/4 = 14.75, (16+10+22+14)/4 = 15.50, (10+22+14+18)/4 = 16.00, (22+14+18+13)/4 = 16.75, (14+18+13+22)/4 = 16.75, (18+13+22+13)/4 = 16.50.
Centered (two-of-four) moving averages: (14.00+14.25)/2 = 14.125, (14.25+14.75)/2 = 14.500, (14.75+14.50)/2 = 14.625, (14.50+14.75)/2 = 14.625, (14.75+15.50)/2 = 15.125, (15.50+16.00)/2 = 15.750, (16.00+16.75)/2 = 16.375, (16.75+16.75)/2 = 16.750, (16.75+16.50)/2 = 16.625.

19 R-code: The Centered Moving Averages Series.
R> x <- c(10,14,11,21,11,16,10,22,14,18,13,22,13)
R> y <- numeric(); z <- numeric(); w <- numeric()
R> ## Compute three-years moving averages (say: y)
R> for (i in 1:(length(x)-2)) {y[i] <- sum(x[i:(i+2)])/3}
R> ## Compute four-years moving averages (say: z)
R> for (i in 1:(length(x)-3)) {z[i] <- sum(x[i:(i+3)])/4}
R> ## Compute two-years moving averages of z (say: w)
R> for (i in 1:(length(z)-1)) {w[i] <- sum(z[i:(i+1)])/2}
R> round(y,2); round(z,2); round(w,2)
[1] 11.67 15.33 14.33 16.00 12.33 16.00 15.33 18.00 15.00 17.67 16.00
[1] 14.00 14.25 14.75 14.50 14.75 15.50 16.00 16.75 16.75 16.50
[1] 14.12 14.50 14.62 14.62 15.12 15.75 16.38 16.75 16.62

20 R> plot(ts(x,start=1986),ylab="original series and MA")
R> points(ts(y,start=1987),pch=1,type="o",col="red")
R> points(ts(w,start=1988),pch=8,type="b",col="blue")
R> legend(x="bottomright",c("original series","MA (m=3)",
+ "MA (m=4)"),col=c(1,2,4),text.col="green4",
+ pch=c(46,1,8),lty=c(1,19,15),merge=TRUE,bg='gray95')
R> title(main="Smoothing curve: centered moving averages")
[Figure: the original series with the centered moving averages MA (m=3) and MA (m=4) overlaid.]

21 Example - Deseasonalizing Using the Moving Averages Method. [Worked table from the original slides not reproduced.]

22 Example: Computing Moving Averages. [Worked table from the original slides not reproduced.]

23 Example: Centered Moving Averages. [Worked table from the original slides not reproduced.]

24 Example: Computing Seasonal Ratios. [Worked table from the original slides not reproduced.]

25 Example: Calculating Raw Seasonal Indices. [Worked table from the original slides not reproduced.]

26 Example: Normalizing Seasonal Indices (Make the Sum of the SI Equal 4). [Worked table from the original slides not reproduced.]

27 Example: Deseasonalizing the Raw Data. [Worked table from the original slides not reproduced.]

28 Estimating the Trend. In the absence of the seasonal effect, the time series model may be written in the Wold representation X_t = µ_t + ε_t, where µ_t = T_t is a deterministic component (trend), ε_t = I_t is an independent and identically distributed (i.i.d.) stochastic component, µ_t and ε_t are uncorrelated, and µ_t can be modeled in many ways: Linear: µ_t = a + bt. Quadratic: µ_t = a + bt + ct². There are two methods that can be used to estimate µ_t: Method 1: the least squares method. Method 2: smoothing by means of moving averages.

29 Estimating the Trend - Least Squares Method. The long-term trend of many time series often approximates a straight line. If T_t is the dependent variable and t is the independent variable (the time index), the least squares fitted line (simple linear regression) is
T̂_t = β̂_0 + β̂_1 t, where
β̂_1 = (Σ_{i=1}^{n} t_i T_i − n t̄ T̄) / (Σ_{i=1}^{n} t_i² − n t̄²),  β̂_0 = T̄ − β̂_1 t̄.
The residual is the difference between the true and predicted value: ê_t = T_t − T̂_t.

30 Least Squares Method - Root Mean Square Error. The root mean square error is a goodness-of-fit measure defined as the square root of the mean of the squared residuals:
RMSE = sqrt( (1/n) Σ_{t=1}^{n} ê_t² ).
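As a small illustration (not from the original slides), the trend line and its RMSE can be obtained in R for any numeric series x; the series below is hypothetical.
R> x <- c(12, 15, 14, 18, 21, 20, 24, 27)   # hypothetical series
R> t <- seq_along(x)
R> fit <- lm(x ~ t)                         # least squares trend line
R> rmse <- sqrt(mean(resid(fit)^2))         # root mean square error of the fit
R> rmse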

31 [Worked table: Year, Quarter, deseasonalized revenue X_t, t, tX_t, and t² for the 20 quarters, with column sums.] From the column sums, β̂_1 = 56.93 and β̂_0 = x̄ − β̂_1 t̄, giving the regression line fitted to the deseasonalized data: X̂_t = β̂_0 + 56.93 t.

32 R> t <- 1:20
R> x <- c(1440.2,1494.4,1598.5,1553.6,1645.1,1767.6,1820.3,
+ ,1805.1,1863.8,1959.6,2114.0,2052.2,2054.9,
+ ,2280.7,2493.3,2559.0,2626.0,2242.0)   # two values lost in the transcription
R> trend.line <- lm(x~t)
R> trend.line
Call:
lm(formula = x ~ t)
Coefficients:
(Intercept)            t

33 [Figure: Regression Line for the Deseasonalized Data - the deseasonalized revenue plotted against time with the fitted line X̂_t = β̂_0 + 56.93 t.]

34 [Worked table: for each quarter t, the revenue T_t, seasonal index SI, trend forecast X̂_t, reseasonalized prediction ResP = SI × X̂_t, and squared error (T_t − ResP)²; the squared errors are summed to give RMSE = sqrt(Σ(T_t − ResP)²/20).]

35 Plot the Forecast - 2 Years (8 Quarters) Ahead. [Figure: Toys R US revenue by year with the reseasonalized forecasts for the next 8 quarters.]

36 Differencing to Detrend and Deseasonalize a Series. Sometimes the first difference operator, ∇X_t = X_t − X_{t-1}, is used to remove the trend of the series (as we will discuss later). For example, if X_t = β_0 + β_1 t then X_{t-1} = β_0 + β_1(t − 1), so that ∇X_t = β_1 (a constant). Similarly, a seasonal component with a span period m ∈ Z⁺ can be removed by differencing the series at lag m; that is, ∇_m X_t = X_t − X_{t-m} (as we will discuss later).

37 Eliminating the Cycle Component (Multiplicative Models). After creating the moving averages (MA_t = T_t × C_t) and estimating the trend series T_t, we can estimate the cycle component from the equation C_t = MA_t / T_t.

38 Forecasts Using the Exponential Smoothing Method. Definition. For a sequence of observations {X_1, X_2, ...} (assuming there are no systematic trend or seasonal effects), the exponential smoothing series {X̂_t : t ≥ 1} is given by the formula
X̂_{t+1} = X̂_t + α(X_t − X̂_t), or equivalently, X̂_{t+1} = αX_t + (1 − α)X̂_t,
where 0 < α < 1 is the exponentially weighted moving average (smoothing) parameter, X_t is the actual observed value at time t, and X̂_{t+1} is the one-step-ahead forecast value, with X̂_1 = X_1.

39 Exponential Smoothing as an Exponentially Weighted MA (EWMA). Whereas smoothing based on the moving averages method puts equal weights on the past observations, the exponential smoothing method assigns exponentially decreasing weights over time. The value of α determines the amount of smoothing. A value of α near 1 gives little smoothing for estimating the mean level and X̂_{t+1} ≈ X_t. A value of α near 0 gives highly smoothed estimates of the mean level and takes little account of the most recent observation. A typical value for α is 0.2. In practice, we determine the value of α for which the mean squared error MSE = Σ_{t=2}^{n} (X_t − X̂_t)² / n is minimized.

40 Example - Forecasts Using Exponential Smoothing. For the following daily observations, use the exponential smoothing method with two different smoothing parameters, α = 0.2 and α = 0.4, to forecast 1 day ahead. What is the optimal value of α?
Day:         1   2   3   4   5   6   7
Sales (X_t): 39  44  40  45  38  43  39

41 Example - Forecasts Using Exponential Smoothing. [Table: for each day t, the sales X_t, and for α = 0.2 and α = 0.4 the forecast X̂_t and squared error e².]
For α = 0.2: X̂_{t+1} = 0.2X_t + 0.8X̂_t.
X̂_1 = X_1 = 39
X̂_2 = 0.2(39) + 0.8(39) = 39
X̂_3 = 0.2(44) + 0.8(39) = 40
X̂_4 = 0.2(40) + 0.8(40) = 40
X̂_5 = 0.2(45) + 0.8(40) = 41
X̂_6 = 0.2(38) + 0.8(41) = 40.4
X̂_7 = 0.2(43) + 0.8(40.4) = 40.92
X̂_8 = 0.2(39) + 0.8(40.92) = 40.54
For α = 0.4: X̂_{t+1} = 0.4X_t + 0.6X̂_t.
X̂_1 = X_1 = 39
X̂_2 = 0.4(39) + 0.6(39) = 39
X̂_3 = 0.4(44) + 0.6(39) = 41
X̂_4 = 0.4(40) + 0.6(41) = 40.6
X̂_5 = 0.4(45) + 0.6(40.6) = 42.36
X̂_6 = 0.4(38) + 0.6(42.36) = 40.616
X̂_7 = 0.4(43) + 0.6(40.616) = 41.57
X̂_8 = 0.4(39) + 0.6(41.57) ≈ 40.54
MSE(α = 0.2) is smaller than MSE(α = 0.4), so α = 0.2 is the optimal choice of the two.
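A short R check of these recursions and of the MSE comparison (a sketch, not from the original slides; the MSE here averages the squared one-step errors for t = 2, ..., 7):
R> x <- c(39, 44, 40, 45, 38, 43, 39)
R> ewma <- function(x, alpha) {
+   xhat <- numeric(length(x) + 1)
+   xhat[1] <- x[1]                      # initialize with the first observation
+   for (t in 1:length(x)) xhat[t + 1] <- alpha * x[t] + (1 - alpha) * xhat[t]
+   xhat                                 # xhat[t] is the forecast of x[t]; the last entry is the 1-step-ahead forecast
+ }
R> mse <- function(x, alpha) { f <- ewma(x, alpha); mean((x[-1] - f[2:length(x)])^2) }
R> mse(x, 0.2); mse(x, 0.4)              # the smaller MSE identifies the better alpha (0.2 here)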

42 Exponential Smoothing in R.
HoltWinters(x, alpha = NULL, beta = NULL, gamma = NULL,
  seasonal = c("additive", "multiplicative"), start.periods = 2,
  l.start = NULL, b.start = NULL, s.start = NULL,
  optim.start = c(alpha = 0.3, beta = 0.1, gamma = 0.1),
  optim.control = list())
Exponential smoothing is a special case of the Holt-Winters algorithm, implemented in the R function HoltWinters(), where the additional parameters beta and gamma are set to FALSE. If we do not specify a value for alpha, the function HoltWinters() will estimate the value that minimizes the MSE.

43 Example - Forecasts Using Exponential Smoothing in R.
R> x <- c(27,34,31,24,18,19,17,12,26,14,18,33,31,31,19,
+ 17,10,25,26,18,18,10,4,20,28,21,18,23,19,16)
R> y <- ts(x)
R> ## Consider alpha=0.2
R> out1 <- HoltWinters(y,alpha=0.2,beta=FALSE,gamma=FALSE)
R> #out1$fitted ## output: 1st column is the smoothing series
R> ## Let R estimate alpha
R> out2 <- HoltWinters(y,beta=FALSE,gamma=FALSE)
R> ## The following code plots the original series and the EWMA
R> plot(y)
R> lines(out1$fitted[,1],col="red")    # alpha=0.2
R> lines(out2$fitted[,1],col="blue")   # estimated alpha

44 Example - Plot the Forecasts with Confidence Intervals.
R> plot(out1,xlim=c(2,38))
R> pred <- predict(out1,n.ahead=6,prediction.interval=T)
R> lines(pred[,1],col="blue")
R> lines(pred[,2],col="green")
R> lines(pred[,3],col="green")
[Figure: Holt-Winters filtering - observed and fitted values with the exponentially weighted moving average forecasts and prediction intervals.]

45 The Mean and the Autocovariance Functions. Definition. The mean function of a time series {X_t} is defined to be µ_t = E(X_t), whereas the variance function is defined to be Var(X_t) = E[(X_t − µ_t)²]. The function µ_t specifies the first-order properties of the time series. Definition. The autocovariance function of a time series {X_t} is defined to be γ_X(s, t) = Cov(X_s, X_t) = E[(X_s − µ_s)(X_t − µ_t)], for any two time points s and t. The function γ(s, t) specifies the second-order properties of the time series.

46 Remarks.
If h = t − s, the parameter γ_X(h) is called the h-th order or lag-h autocovariance of {X_t}. Thus γ_X(0) = Var(X_t). The difference between two points in time is called the lag.
For all h, γ_X(h) = Cov(X_t, X_{t+h}) = Cov(X_t, X_{t-h}) = γ_X(−h).
For all h, the autocorrelation satisfies ρ_X(h) = ρ_X(−h).
If the a_i are constants and the X_i and Y_j are random variables, i = 1, ..., n, j = 1, ..., m, then
Var(Σ_{i=1}^{n} a_i X_i) = Σ_{i=1}^{n} a_i² Var(X_i) + 2 Σ_{i<j} a_i a_j Cov(X_i, X_j),
Cov(Σ_{i=1}^{n} X_i, Σ_{j=1}^{m} Y_j) = Σ_{i=1}^{n} Σ_{j=1}^{m} Cov(X_i, Y_j).

47 The Plots of the Autocovariance and Autocorrelation (Correlogram) Functions. The plot of γ(h) against the lag h is called the autocovariance function (ACVF). The plot of ρ(h) against the lag h is called the autocorrelation function (ACF). Note that ρ(0) = 1 and −1 ≤ ρ(h) ≤ 1 for all h. [Figure: a sample autocovariance function and autocorrelation function plotted against lag.]

48 Sample Autocovariance and Autocorrelation. The lag-h sample autocovariance is defined as
γ̂_X(h) = (1/n) Σ_{t=h+1}^{n} (x_t − x̄)(x_{t-h} − x̄), or equivalently γ̂_X(h) = (1/n) Σ_{t=1}^{n-h} (x_t − x̄)(x_{t+h} − x̄),
where x̄ = (1/n) Σ_{t=1}^{n} x_t. The lag-h sample autocorrelation is defined as
ρ̂_X(h) = γ̂_X(h) / γ̂_X(0) = Σ_{t=1}^{n-h} (x_t − x̄)(x_{t+h} − x̄) / Σ_{t=1}^{n} (x_t − x̄)².
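As a quick sketch (not from the original slides), the lag-h sample autocovariance can be coded directly from this definition and compared with R's acf(); the data are those of the worked example two slides below.
R> sample.acov <- function(x, h) {
+   n <- length(x); xbar <- mean(x)
+   sum((x[1:(n - h)] - xbar) * (x[(1 + h):n] - xbar)) / n   # divide by n, as in the definition
+ }
R> x <- c(2,1,3,5,1,6,2,6,2,4,3,7,8,1)
R> sample.acov(x, 1); sample.acov(x, 1) / sample.acov(x, 0)  # gamma-hat(1) and rho-hat(1)
R> acf(x, type = "covariance", lag.max = 1, plot = FALSE)    # same values from the built-in function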

49 Example - Sample Autocorrelation. Find ρ̂(h) for h = 1, 2, and 3. [Data table from the original slides not reproduced.]

50 Example - Sample Autocovariance and Autocorrelation. Find γ̂_X(h) and ρ̂_X(h) for h = 0, 1, and 3, for the data x_t: 2, 1, 3, 5, 1, 6, 2, 6, 2, 4, 3, 7, 8, 1 (n = 14, x̄ = 3.64).
γ̂_X(0) = (1/14) Σ_{t=1}^{14} (x_t − x̄)² = 5.23
γ̂_X(1) = (1/14) Σ_{t=2}^{14} (x_t − x̄)(x_{t-1} − x̄) = −1.15
γ̂_X(3) = (1/14) Σ_{t=4}^{14} (x_t − x̄)(x_{t-3} − x̄) = −0.96
ρ̂_X(1) = γ̂_X(1)/γ̂_X(0) = −0.22
ρ̂_X(3) = γ̂_X(3)/γ̂_X(0) = −0.18

51 R-Code for the Previous Example.
R> x <- c(2,1,3,5,1,6,2,6,2,4,3,7,8,1)
R> Cov <- acf(x, type ="covariance",lag.max=3,plot=F)
R> Cor <- acf(x, lag.max=3,plot=FALSE)
R> Cov
Autocovariances of series 'x', by lag
     0      1      2      3
 5.230 -1.152  0.456 -0.961
R> Cor
Autocorrelations of series 'x', by lag
     0      1      2      3
 1.000 -0.220  0.087 -0.184

52 Stationary Models. Definition. Stationary models assume that the process remains in statistical equilibrium, with probabilistic properties that do not change over time; in particular, it varies about a fixed constant mean level and with constant variance.

53 Why does a time series have to be stationary? Stationarity is defined uniquely, so there is only one way for time series data to be stationary, but many ways for it to be non-stationary; thus stationarity is needed to model the dependence structure uniquely. It is preferred that the estimators of parameters such as the mean and variance, if they exist, do not change over time. In most cases, stationary data can be approximated with a stationary ARMA model, as we will discuss later. Stationary processes avoid the problem of spurious regression.

54 Strong (Strictly) Stationarity. Definition. The joint distribution of X_{t_1}, X_{t_2}, ..., X_{t_n} is defined as
F_{X_{t_1},...,X_{t_n}}(x_1, x_2, ..., x_n) = P(X_{t_1} ≤ x_1, X_{t_2} ≤ x_2, ..., X_{t_n} ≤ x_n),
where x_1, x_2, ..., x_n are any real numbers.
Definition. A time series {X_t} is said to be strongly (or strictly) stationary if for any time points t_1, t_2, ..., t_n ∈ Z, where n ≥ 1, and any shift (lag) h ∈ Z, the joint distribution of X_{t_1}, X_{t_2}, ..., X_{t_n} is the same as the joint distribution of X_{t_1+h}, X_{t_2+h}, ..., X_{t_n+h}; i.e.,
F_{X_{t_1},...,X_{t_n}}(x_1, ..., x_n) = F_{X_{t_1+h},...,X_{t_n+h}}(x_1, ..., x_n).

55 Weak (Covariance) Stationarity. Definition. Time-invariant process: a process is time invariant if its properties do not depend on time. Definition. A time series {X_t} is said to be weakly stationary (or covariance stationary, or second-order stationary) if: (1) the mean µ_t = E(X_t) = µ is independent of t; and (2) for all t and h, γ_X(h) = Cov(X_t, X_{t+h}) = E[(X_t − µ)(X_{t+h} − µ)] is time-invariant; i.e., the covariance function depends only on the time separation h and not on the actual time t.

56 Remarks. An independent and identically distributed (i.i.d.) stochastic process satisfies γ(h) = ρ(h) = 0 for all h ≠ 0. Because the joint distribution of X_t and X_{t+h} determines µ_t and γ_X(h) (if both exist), strict stationarity implies weak stationarity, while the converse is not true in general. If {X_t} is a Gaussian stochastic process, then weak stationarity is equivalent to strong stationarity (why?). A stationary time series {X_t} is ergodic if sample moments converge in probability (→p) to population moments; i.e., if x̄ →p µ, γ̂_X(h) →p γ_X(h), and ρ̂_X(h) →p ρ_X(h).

57 Some Useful Trigonometric Functions.
cos(x ± y) = cos(x)cos(y) ∓ sin(x)sin(y)
sin(x ± y) = sin(x)cos(y) ± cos(x)sin(y)
sin(2x) = 2sin(x)cos(x), cos(2x) = cos²(x) − sin²(x)
2sin(x)sin(y) = cos(x − y) − cos(x + y)
2cos(x)cos(y) = cos(x − y) + cos(x + y)
sin(2πk + x) = sin(x), where k = 0, ±1, ±2, ±3, ...
cos(2πk + x) = cos(x), where k = 0, ±1, ±2, ±3, ...
sin(π − x) = +sin(x), sin(π + x) = −sin(x)
cos(π − x) = −cos(x), cos(π + x) = −cos(x)
sin(π/2 − x) = +cos(x), sin(π/2 + x) = +cos(x)
cos(π/2 − x) = +sin(x), cos(π/2 + x) = −sin(x)

58 Stationary and Non-stationary Examples. Example 1: Which of the following sequences is weakly stationary?
(a) X_t = ε_t, where ε_t ~ i.i.d.(0, 1) for all t. Solution: E(X_t) = E(ε_t) = 0, which is independent of t; and γ_X(h) = E(ε_t ε_{t+h}) = 0 for all h ≠ 0, with γ_X(0) = 1, which is also independent of t. Hence the process is weakly stationary.
(b) X_t = t + ε_t, where ε_t ~ i.i.d.(0, 1) for all t. Solution: E(X_t) = E(t + ε_t) = t, which depends on t. Hence the process is not stationary. Note that γ_X(h) is independent of t, because γ_X(h) = E([t + ε_t − t][t + h + ε_{t+h} − t − h]) = 0 for all h ≠ 0, with γ_X(0) = 1.

59 Stationary and Non-stationary Examples. Example 2: Check whether the following model is weakly stationary or not: X_t = A sin(t + B), where A is a random variable with zero mean and unit variance, and B is a random variable with a Uniform(−π, π) distribution, independent of A.
Solution: E(X_t) = E(A sin(t + B)) = E(A)E(sin(t + B)) = 0, which is independent of t.
γ_X(h) = E(X_t X_{t+h}) = E{A sin(t + B) · A sin(t + h + B)} = E(A²) E{(1/2)(cos(h) − cos(2t + 2B + h))} (see the next slide).

60 Stationary and Non-stationary Examples (Cont.). Recall that Y ~ Uniform(a, b) if its probability density function is f_Y(y) = 1/(b − a) for a ≤ y ≤ b and 0 otherwise, with E(Y) = (a + b)/2 and Var(Y) = (b − a)²/12. Then
γ_X(h) = (1/2)cos(h) − (1/2) E{cos(2t + 2B + h)}
       = (1/2)cos(h) − (1/2) ∫_{−π}^{π} cos(2t + 2B + h) · (1/2π) dB
       = (1/2)cos(h) − (1/8π) [sin(2t + 2B + h)]_{−π}^{π}
       = (1/2)cos(h),
which is independent of t. Hence the process is second-order stationary.
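A quick simulation check of this result (a sketch, not from the original slides): with A standard normal and B uniform on (−π, π), the lag-1 autocovariance should be close to cos(1)/2 ≈ 0.27 regardless of the time point chosen.
R> set.seed(1)
R> n <- 100000; t0 <- 3                    # any fixed time point t0
R> A <- rnorm(n); B <- runif(n, -pi, pi)   # A has mean 0 and variance 1, B ~ Uniform(-pi, pi)
R> X.t  <- A * sin(t0 + B)
R> X.t1 <- A * sin(t0 + 1 + B)
R> mean(X.t * X.t1)                        # Monte Carlo estimate of gamma(1)
R> cos(1) / 2                              # theoretical value, about 0.2702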

61 White Noise Process. Definition. The stochastic process {ε_t} is said to be a strong white noise process with mean zero and variance σ²_ε, written ε_t ~ i.i.d. WN(0, σ²_ε), if and only if it is independent and identically distributed (i.i.d.) with zero mean and covariance function
γ_ε(h) = E(ε_t ε_{t+h}) = σ²_ε for h = 0, and 0 for h ≠ 0.
Thus an i.i.d. white noise process has: a constant mean (usually this constant equals zero); a constant variance; and zero autocovariance, except at lag zero.

62 White Noise Process. Definition. For second-order properties, the assumption of i.i.d. is not required; only the absence of correlation is needed. In this case the process {ε_t} is said to be a weak white noise process. Remarks: If ε_t ~ N(0, σ²_ε), then {ε_t} is a Gaussian white noise process. Two random variables X and Y are uncorrelated when their covariance is zero. Two random variables X and Y are independent when their joint probability distribution is the product of their marginal probability distributions: P_{X,Y}(x, y) = P_X(x) P_Y(y) for all x, y. If X and Y are independent, then they are also uncorrelated; however, in general, the converse is not true.

63 A Simulated Gaussian White Noise Series.
R> set.seed(646)
R> white.noise <- rnorm(200)
R> ts.wn <- ts(white.noise)
R> par(mfrow=c(1,3))
R> plot(ts.wn,ylab="Gaussian Series",main="White Noise")
R> acf(ts.wn, main="Autocorrelation Function")
R> acf(ts.wn, type="partial", main="Partial ACF")
[Figure: the simulated white noise series with its ACF and partial ACF.]

64 Backward (Lag), Forward (Lead) and Difference Operators. The lag, lead and difference notations are a very convenient way to write linear time series models and to characterize their properties. Definition. The lag (backshift) operator, denoted by B(·), applied to an element of a time series produces the previous element. Definition. The lead (forward) operator, denoted by F(·), applied to an element of a time series shifts the time index forward by one unit. Definition. The difference operator, denoted by ∇, expresses the difference between two consecutive random variables (or their realizations).

65 B X_t = X_{t-1}, B²X_t = B(B X_t) = B X_{t-1} = X_{t-2}; more generally, B^h X_t = X_{t-h}, where h is any integer.
F X_t = X_{t+1}, F²X_t = F(F X_t) = F X_{t+1} = X_{t+2}; more generally, F^h X_t = X_{t+h}, where h is any integer.
∇X_t = X_t − X_{t-1} = (1 − B)X_t; ∇²X_t = ∇(∇X_t) = ∇X_t − ∇X_{t-1} = (1 − B)∇X_t = (1 − B)²X_t = X_t − 2X_{t-1} + X_{t-2}; more generally, ∇^d X_t = (1 − B)^d X_t, where d is any integer.
Note that positive values of h define lags, while negative values define leads: B^{-h} X_t = F^h X_t = X_{t+h}, and B⁰X_t = F⁰X_t = ∇⁰X_t = X_t.

66 Backshift, Forward and Difference Operators in R. To get (1 − B^h)X_t = X_t − X_{t-h} for any integer lag 1 ≤ h < t, type the R-code: > diff(x, lag=h). Note that the opposite of diff() is cumsum(). To get ∇^d X_t = (1 − B)^d X_t for any integer order of difference 1 ≤ d < t, type the R-code: > diff(x, differences = d). In general, to get (1 − B^h)^d X_t for integer lag and difference orders, type the R-code: > diff(x, lag=h, differences = d). Note that the R-code > diff(diff(x)) produces the same result as > diff(x, differences = 2).

67 Transforming Data to Stationarity: Detrending vs. Differencing. Example: global temperature data. The model is X_t = β_0 + β_1 t + ε_t. Detrending gives ê_t = X_t − β̂_0 − β̂_1 t. Differencing gives ∇X_t = X_t − X_{t-1}. [Figure: the global temperature series with the fitted line, the detrended series, and the first differences.]
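A small sketch of the two transformations in R (not from the original slides); gtemp here stands in for whatever numeric temperature series is at hand, and the built-in nhtemp data are used only as a placeholder.
R> gtemp <- as.numeric(nhtemp)        # any temperature series will do; nhtemp is a built-in example
R> t <- seq_along(gtemp)
R> fit <- lm(gtemp ~ t)               # fit the linear trend
R> detrended <- resid(fit)            # detrending: residuals from the fitted line
R> differenced <- diff(gtemp)         # differencing: first differences
R> par(mfrow = c(1, 2)); plot.ts(detrended); plot.ts(differenced)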

68 Detrending vs. Differencing. [Figure: correlograms (ACF vs. lag) of the global temperature series, the detrended series, and the first differences.]

69 Linear Processes. For a stationary time series without trends or seasonal effects (that is, if necessary, any trends, seasonal or cyclical effects have already been removed from the series), we can construct a linear model for a time series with autocorrelation. The most important special cases of linear processes are: the autoregressive model (AR), the moving-average model (MA), the autoregressive moving average model (ARMA), and the autoregressive-integrated-moving average model (ARIMA). These models can be used for predicting the future development (forecasting) of a time series.

70 Autoregressive (AR) Models With Zero Mean. Definition. A time series {X_t}, with zero mean, satisfying
X_t = φ_1 X_{t-1} + φ_2 X_{t-2} + ... + φ_p X_{t-p} + ε_t
(where ε_t is white noise and the φ_i, i = 1, 2, ..., p, are parameters) is called an autoregressive process of order p, denoted AR(p). With the backshift operator, the process is Φ_p(B)X_t = ε_t, where Φ_p(B) = 1 − φ_1 B − φ_2 B² − ... − φ_p B^p is a polynomial of degree p. The AR(p) model, p ≥ 1, is stationary if and only if all roots of 1 − φ_1 B − φ_2 B² − ... − φ_p B^p = 0 lie outside the unit circle (absolute value greater than 1).
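The root condition is easy to check numerically with polyroot(), which is used later in these notes for MA models; the sketch below (not from the original slides) checks the AR(2) with φ_1 = 1.1 and φ_2 = −0.3 that is simulated a few slides further on.
R> phi <- c(1.1, -0.3)
R> roots <- polyroot(c(1, -phi))    # coefficients of 1 - 1.1B + 0.3B^2 in increasing order
R> Mod(roots)                       # both moduli exceed 1, so this AR(2) is stationary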

71 Autoregressive (AR) Models With Mean µ. If the mean µ of X_t is not zero, replace X_t by X_t − µ to get
X_t − µ = φ_1(X_{t-1} − µ) + φ_2(X_{t-2} − µ) + ... + φ_p(X_{t-p} − µ) + ε_t,
or write X_t = α + φ_1 X_{t-1} + φ_2 X_{t-2} + ... + φ_p X_{t-p} + ε_t, where α = µ(1 − φ_1 − φ_2 − ... − φ_p). For example, the AR(2) model X_t = 1.5 + 1.2X_{t-1} − 0.5X_{t-2} + ε_t is X_t − µ = 1.2(X_{t-1} − µ) − 0.5(X_{t-2} − µ) + ε_t, where 1.5 = µ(1 − 1.2 − (−0.5)). Thus the model has mean µ = 5:
X_t − 5 = 1.2(X_{t-1} − 5) − 0.5(X_{t-2} − 5) + ε_t.

72 A Simulated AR(1) Series: X_t = 0.8X_{t-1} + ε_t.
R> set.seed(12345)
R> ar1.sim <- arima.sim(n=100,list(order=c(1,0,0),ar=0.8))
R> par(mfrow=c(1,3))
R> plot(ar1.sim, ylab="X(t)", main="A simulated AR(1)")
R> acf(ar1.sim, main="Autocorrelation Function")
R> acf(ar1.sim, type="partial", main="Partial ACF")
[Figure: the simulated series with its ACF and partial ACF.]

73 A Simulated AR(1) Series: X_t = −0.8X_{t-1} + ε_t.
R> set.seed(12345)
R> ar11.sim <- arima.sim(n=100,list(order=c(1,0,0),ar=-0.8))
R> par(mfrow=c(1,3))
R> plot(ar11.sim, ylab="X(t)", main="A simulated AR(1)")
R> acf(ar11.sim, main="Autocorrelation Function")
R> acf(ar11.sim, type="partial", main="Partial ACF")
[Figure: the simulated series with its ACF and partial ACF.]

74 A Simulated AR(2) Series.
R> set.seed(1965)
R> ar2 <- arima.sim(n=200,list(order=c(2,0,0),ar=c(1.1,-0.3)))
R> par(mfrow=c(1,3))
R> plot(ar2, ylab="X(t)", main="A simulated AR(2)")
R> acf(ar2, main="Autocorrelation Function")
R> acf(ar2, type="partial", main="Partial ACF")
Figure: A simulated AR(2) process: X_t = 1.1X_{t-1} − 0.3X_{t-2} + ε_t.

75 Moving Average (MA) Models. Definition. A time series {X_t}, with zero mean, satisfying
X_t = ε_t + θ_1 ε_{t-1} + θ_2 ε_{t-2} + ... + θ_q ε_{t-q}
(where ε_t is white noise and the θ_j, j = 1, 2, ..., q, are parameters) is called a moving average process of order q, denoted MA(q). With the backshift operator, the process is X_t = Θ_q(B)ε_t, where Θ_q(B) = 1 + θ_1 B + θ_2 B² + ... + θ_q B^q is a polynomial of degree q. The MA(q) process, q ≥ 1, is always stationary (regardless of the values of θ_j, j = 1, 2, ..., q). If all roots of 1 + θ_1 B + θ_2 B² + ... + θ_q B^q = 0 lie outside the unit circle (absolute value greater than 1), the MA(q) process is said to be invertible.

76 A Simulated MA(1) Series: X_t = ε_t + 0.8ε_{t-1}.
R> set.seed(6436)
R> ma1.sim <- arima.sim(n=200,list(order=c(0,0,1),ma=0.8))
R> par(mfrow=c(1,3))
R> plot(ma1.sim, ylab="X(t)", main="A simulated MA(1)")
R> acf(ma1.sim, main="Autocorrelation Function")
R> acf(ma1.sim, type="partial", main="Partial ACF")
[Figure: the simulated series with its ACF and partial ACF.]

77 A Simulated MA(1) Series: X_t = ε_t − 0.8ε_{t-1}.
R> set.seed(6436)
R> ma11.sim <- arima.sim(n=200,list(order=c(0,0,1),ma=-0.8))
R> par(mfrow=c(1,3))
R> plot(ma11.sim, ylab="X(t)", main="A simulated MA(1)")
R> acf(ma11.sim, main="Autocorrelation Function")
R> acf(ma11.sim, type="partial", main="Partial ACF")
[Figure: the simulated series with its ACF and partial ACF.]

78 A Simulated MA(2) Series.
R> set.seed(14762)
R> ma2 <- arima.sim(n=200,list(order=c(0,0,2),ma=c(0.8,0.6)))
R> par(mfrow=c(1,3))
R> plot(ma2, ylab="X(t)", main="A simulated MA(2)")
R> acf(ma2, main="Autocorrelation Function")
R> acf(ma2, type="partial", main="Partial ACF")
Figure: A simulated MA(2) process: X_t = ε_t + 0.8ε_{t-1} + 0.6ε_{t-2}.

79 Autoregressive-Moving Average (ARMA) Models. Definition. A time series {X_t}, with zero mean, is called an autoregressive-moving average process of order (p, q), denoted ARMA(p, q), if it can be written as
X_t = Σ_{i=1}^{p} φ_i X_{t-i} (the autoregressive part) + Σ_{j=0}^{q} θ_j ε_{t-j} (the moving average part),
where ε_t is white noise, the φ_i and θ_j, for i = 1, 2, ..., p and j = 0, 1, ..., q, are the parameters of the autoregressive and moving average parts respectively, and θ_0 = 1.

80 Autoregressive-Moving Average (ARMA) Models. With the backshift operator, the process can be rewritten as Φ_p(B)X_t = Θ_q(B)ε_t, where Φ_p(B) = 1 − Σ_{i=1}^{p} φ_i B^i and Θ_q(B) = 1 + Σ_{j=1}^{q} θ_j B^j. If the mean µ of X_t is not zero, replace X_t by X_t − µ to get Φ_p(B)(X_t − µ) = Θ_q(B)ε_t, which can also be written as
X_t = α + φ_1 X_{t-1} + ... + φ_p X_{t-p} + ε_t + θ_1 ε_{t-1} + ... + θ_q ε_{t-q},
where α = µ(1 − φ_1 − φ_2 − ... − φ_p) and µ is called the intercept obtained from the output of the R function arima().

81 Three Conditions for ARMA Models. The ARMA model is assumed to be stationary, invertible and identifiable, where: The condition for stationarity is the same as for the pure AR(p) process; i.e., the roots of 1 − φ_1 B − φ_2 B² − ... − φ_p B^p = 0 lie outside the unit circle. The condition for invertibility is the same as for the pure MA(q) process; i.e., the roots of 1 + θ_1 B + θ_2 B² + ... + θ_q B^q = 0 lie outside the unit circle. The identifiability condition means that the model is not redundant; i.e., Φ_p(B) = 0 and Θ_q(B) = 0 have no common roots. (See the next slide.)

82 Example: Model Redundancy (Shared Common Roots in ARMA Models). Consider the ARMA(1,2) model X_t = 0.2X_{t-1} + ε_t − 1.1ε_{t-1} + 0.18ε_{t-2}. This model can be written as (1 − 0.2B)X_t = (1 − 1.1B + 0.18B²)ε_t, or equivalently (1 − 0.2B)X_t = (1 − 0.2B)(1 − 0.9B)ε_t. Cancelling (1 − 0.2B) from both sides gives X_t = (1 − 0.9B)ε_t. Thus the process is not really an ARMA(1,2); it is an MA(1) ≡ ARMA(0,1).
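A quick numerical check of the shared root (a sketch, not from the original slides): the AR and MA polynomials both vanish at B = 1/0.2 = 5.
R> polyroot(c(1, -0.2))         # root of the AR polynomial 1 - 0.2B
R> polyroot(c(1, -1.1, 0.18))   # roots of the MA polynomial 1 - 1.1B + 0.18B^2
R> ## Both polynomials share the root B = 5, so the common factor (1 - 0.2B) cancels.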

83 A Simulated ARMA(1,1) Series.
R> set.seed(6436); par(mfrow=c(1,3))
R> arma1.sim <- arima.sim(n=200,list(order=c(1,0,1),
+ ar=0.8,ma=-0.6))
R> plot(arma1.sim,ylab="X(t)",main="A simulated ARMA(1,1)")
R> acf(arma1.sim, main="Autocorrelation Function")
R> acf(arma1.sim, type="partial", main="Partial ACF")
Figure: A simulated ARMA(1,1) process: X_t = 0.8X_{t-1} + ε_t − 0.6ε_{t-1}.

84 A Simulated ARMA(2,1) Series.
R> set.seed(6436); par(mfrow=c(1,3))
R> arma21.sim <- arima.sim(n=200,list(order=c(2,0,1),
+ ar=c(0.8,-0.4),ma=-0.6))
R> plot(arma21.sim,ylab="X(t)",main="A simulated ARMA(2,1)")
R> acf(arma21.sim, main="Autocorrelation Function")
R> acf(arma21.sim, type="partial", main="Partial ACF")
Figure: A simulated process: X_t = 0.8X_{t-1} − 0.4X_{t-2} + ε_t − 0.6ε_{t-1}.

85 A Simulated ARMA(1,2) Series.
R> set.seed(6436); par(mfrow=c(1,3))
R> arma12.sim <- arima.sim(n=200,list(order=c(1,0,2),
+ ar=0.8,ma=c(-0.6,0.4)))
R> plot(arma12.sim,ylab="X(t)",main="A simulated ARMA(1,2)")
R> acf(arma12.sim, main="Autocorrelation Function")
R> acf(arma12.sim, type="partial", main="Partial ACF")
Figure: A simulated process: X_t = 0.8X_{t-1} + ε_t − 0.6ε_{t-1} + 0.4ε_{t-2}.

86 Some Examples of ARMA(p, q) Processes.
AR(1) ≡ ARMA(1,0): X_t = φX_{t-1} + ε_t, or (1 − φB)X_t = ε_t.
AR(2) ≡ ARMA(2,0): X_t = φ_1 X_{t-1} + φ_2 X_{t-2} + ε_t, or (1 − φ_1 B − φ_2 B²)X_t = ε_t.
MA(1) ≡ ARMA(0,1): X_t = ε_t + θε_{t-1}, or X_t = (1 + θB)ε_t.
MA(2) ≡ ARMA(0,2): X_t = ε_t + θ_1 ε_{t-1} + θ_2 ε_{t-2}, or X_t = (1 + θ_1 B + θ_2 B²)ε_t.

87 Some Examples of ARMA(p, q) Processes.
ARMA(1,1): X_t = φX_{t-1} + ε_t + θε_{t-1}, or (1 − φB)X_t = (1 + θB)ε_t.
ARMA(2,1): X_t = φ_1 X_{t-1} + φ_2 X_{t-2} + ε_t + θε_{t-1}, or (1 − φ_1 B − φ_2 B²)X_t = (1 + θB)ε_t.
ARMA(1,2): X_t = φX_{t-1} + ε_t + θ_1 ε_{t-1} + θ_2 ε_{t-2}, or (1 − φB)X_t = (1 + θ_1 B + θ_2 B²)ε_t.
ARMA(2,2): X_t = φ_1 X_{t-1} + φ_2 X_{t-2} + ε_t + θ_1 ε_{t-1} + θ_2 ε_{t-2}, or Φ_2(B)X_t = Θ_2(B)ε_t, where Φ_2(B) = 1 − φ_1 B − φ_2 B² and Θ_2(B) = 1 + θ_1 B + θ_2 B².

88 Autoregressive-Integrated-Moving Average (ARIMA) Models. Definition. A time series {X_t}, with zero mean, is called an autoregressive-integrated-moving average process of order (p, d, q), denoted ARIMA(p, d, q), if the d-th differences of the {X_t} series form an ARMA(p, q) process. With the backshift operator the process may be written as
Φ_p(B)(1 − B)^d X_t = Θ_q(B)ε_t,
where Φ_p(B) = 1 − φ_1 B − φ_2 B² − ... − φ_p B^p is a polynomial of degree p and Θ_q(B) = 1 + θ_1 B + θ_2 B² + ... + θ_q B^q is a polynomial of degree q.

89 Remarks. ARIMA models are applied in cases where the data show evidence of non-stationarity; an initial differencing step can be applied one or more times to eliminate the non-stationarity. AR(p) ≡ ARIMA(p, 0, 0), MA(q) ≡ ARIMA(0, 0, q), ARI(p, d) ≡ ARIMA(p, d, 0), IMA(d, q) ≡ ARIMA(0, d, q), ARMA(p, q) ≡ ARIMA(p, 0, q), WN ≡ ARIMA(0, 0, 0), I(d) ≡ ARIMA(0, d, 0). Note that I(1) is the well-known random walk, X_t = X_{t-1} + ε_t, which is widely used to describe the behavior of a stock price series (as we will discuss later).
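A tiny illustration of the I(1) random walk (a sketch, not from the original slides): a random walk is the cumulative sum of white noise, and first differencing recovers the white noise.
R> set.seed(1)
R> eps <- rnorm(200)             # white noise
R> rw <- cumsum(eps)             # random walk: X_t = X_{t-1} + eps_t
R> wn.back <- diff(rw)           # first differences give back the white noise
R> all.equal(wn.back, eps[-1])   # TRUE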

90 Some Examples of ARIMA(p, d, q) Processes.
ARI(1,1) ≡ ARIMA(1,1,0): X_t = φX_{t-1} + X_{t-1} − φX_{t-2} + ε_t, i.e., (1 − φB)(1 − B)X_t = ε_t.
ARIMA(1,1,1): (1 − φB)(1 − B)X_t = (1 + θB)ε_t.
ARIMA(2,1,1): (1 − φ_1 B − φ_2 B²)(1 − B)X_t = (1 + θB)ε_t.
ARIMA(1,2,2): (1 − φB)(1 − B)²X_t = (1 + θ_1 B + θ_2 B²)ε_t.

91 A Simulated ARIMA(2,1,1) Series.
R> set.seed(6436); par(mfrow=c(1,3))
R> arima.sim <- arima.sim(n=200,list(order=c(2,1,1),
+ ar=c(0.8,-0.4),ma=-0.6))
R> plot(arima.sim,ylab="X(t)",main="A simulated ARIMA(2,1,1)")
R> acf(arima.sim, main="Autocorrelation Function")
R> acf(arima.sim, type="partial", main="Partial ACF")
Figure: (1 − 0.8B + 0.4B²)(X_t − X_{t-1}) = ε_t − 0.6ε_{t-1}.

92 Seasonal ARIMA (SARIMA) Models. A non-stationary process often possesses a seasonal component that repeats itself after a regular period of time; the smallest such period, denoted by s, is called the seasonal period. Monthly observations: s = 12 (12 observations per year). Quarterly observations: s = 4 (4 observations per year). Daily observations: s = 365 (365 observations per year). Weekly observations: s = 52 (52 observations per year). Daily observations by week: s = 5 (5 working days).

93 Seasonal ARIMA (SARIMA) Models. Definition. The multiplicative seasonal ARIMA (SARIMA) model, denoted ARIMA(p, d, q) × (P, D, Q)_s, where s is the number of seasons, is
Φ_p(B) Φ_P(B^s) (1 − B)^d (1 − B^s)^D X_t = Θ_q(B) Θ_Q(B^s) ε_t,
where Φ_p(B) = 1 − Σ_{i=1}^{p} φ_i B^i is a polynomial in B of degree p, Θ_q(B) = 1 + Σ_{i=1}^{q} θ_i B^i is a polynomial in B of degree q, Φ_P(B^s) = 1 − Σ_{i=1}^{P} Φ_i B^{is} is a polynomial in B^s of degree P, and Θ_Q(B^s) = 1 + Σ_{i=1}^{Q} Θ_i B^{is} is a polynomial in B^s of degree Q, with no common roots between Φ_P(B^s) and Θ_Q(B^s). Here p, d, and q are the orders of the non-seasonal AR model, MA model and ordinary differencing respectively, whereas P, D, and Q are the orders of the seasonal autoregressive (SAR) model, seasonal moving average (SMA) model, and seasonal differencing respectively.
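For later reference, base R can fit such a model with arima() and its seasonal argument; the sketch below (not part of the original slides) fits an ARIMA(0,1,1) × (0,1,1)_12, the so-called airline model, to the AirPassengers data used earlier, purely as an illustration.
R> fit <- arima(log(AirPassengers), order = c(0, 1, 1),
+               seasonal = list(order = c(0, 1, 1), period = 12))
R> fit    # estimated theta, Theta and the innovation variance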

94 Remarks. The idea is that SARIMA models are ARIMA(p, d, q) models whose residuals ε_t are ARIMA(P, D, Q), with operators defined on B^s and its successive powers, where p, q, and d are the non-seasonal AR order, non-seasonal MA order, and non-seasonal differencing respectively, while P, Q and D are the seasonal AR (SAR) order, seasonal MA (SMA) order, and seasonal differencing at lag s respectively. Seasonal differencing, ∇_s X_t = (1 − B^s)X_t = X_t − X_{t-s}, will remove seasonality in the same way that ordinary differencing, ∇X_t = X_t − X_{t-1}, will remove a polynomial trend.

95 Some Examples of SARIMA(p, d, q) × (P, D, Q)_s Processes.
ARIMA(0,1,0) × (0,1,0)_5: (1 − B)(1 − B⁵)X_t = ε_t.
ARIMA(0,1,0) × (0,1,1)_4: (1 − B)(1 − B⁴)X_t = (1 + ΘB⁴)ε_t.
ARIMA(1,0,0) × (0,1,1)_12: (1 − φB)(1 − B¹²)X_t = (1 + ΘB¹²)ε_t.
ARIMA(0,1,1) × (0,1,1)_12: (1 − B)(1 − B¹²)X_t = (1 + θB)(1 + ΘB¹²)ε_t.

96 Seasonal Patterns and the Correlogram. [Figure: International air passenger bookings in the US, in thousands of passengers, with the correlogram used to detect autocorrelations.] There is a clear seasonal variation. At the time, bookings were highest during the summer months June-August and lowest during the autumn month of November and the winter month of February.

97 Remove Trend and Seasonality Using decompose().
R> par(mfrow=c(1,2))
R> AP.decom <- decompose(AirPassengers, "multiplicative")
R> plot(ts(AP.decom$random[7:138]))
R> acf(AP.decom$random[7:138]); par(mfrow=c(1,1))
[Figure: the random component of the decomposition and its correlogram.]
The correlogram suggests either an MA(2) or that the seasonal adjustment has not been entirely effective. (Try SARIMA models!)

98 Example of a SARIMA(p, d, q) × (P, D, Q)_s Process. The plots of the average monthly sea level at Darwin, Australia, from January 1988 to December 1999 (in metres), available from climatology/darwinsl.txt, show a consistent seasonal pattern.

99 A Simulated SARIMA(2, 0, 0) × (0, 1, 1)_12 Series (portes). Source: time-series-analysis-stat3305/lectures/
R> library("portes")  # try sim_sarima in the sarima package!
R> set.seed(13); phi <- c(1.3,-0.35); theta.season <- 0.8
R> Z1 <- varima.sim(list(ar=phi,d=0,ma.season=theta.season,
+ d.season=1,period=12),n=100,trunc.lag=50)
R> par(mfrow=c(1,2)); plot(Z1); acf(Z1,lag.max=40)
[Figure: the simulated series and its ACF.]
(1 − 1.3B + 0.35B²)(1 − B¹²)X_t = ε_t + 0.8ε_{t-12}.

100 Second-Order Properties of MA(1) Models. For the MA(1) process X_t = ε_t + θε_{t-1}, where ε_t ~ WN(0, σ²):
The mean of X_t: E(X_t) = 0 (independent of t).
The variance: γ_X(0) = E(ε_t² + 2θε_t ε_{t-1} + θ²ε²_{t-1}) = (1 + θ²)σ².
The autocovariance and the autocorrelation are independent of t without any restrictions on the parameter θ:
γ_X(h) = E(ε_t ε_{t+h}) + θE(ε_{t-1} ε_{t+h}) + θE(ε_t ε_{t+h-1}) + θ²E(ε_{t-1} ε_{t+h-1})
       = (1 + θ²)σ² for h = 0; θσ² for h = ±1; 0 for h = ±2, ±3, ...
ρ_X(h) = γ_X(h)/γ_X(0) = 1 for h = 0; θ/(1 + θ²) for h = ±1; 0 for h = ±2, ±3, ...

101 Second-Order Properties of MA(q) Models. For the MA(q) process X_t = Σ_{i=0}^{q} θ_i ε_{t-i}, with θ_0 = 1 and ε_t ~ WN(0, σ²):
The mean of X_t: E(X_t) = 0 (independent of t).
The variance: γ_X(0) = (1 + θ_1² + θ_2² + ... + θ_q²)σ² = σ² Σ_{i=0}^{q} θ_i².
The autocovariance and the autocorrelation are independent of t without any restrictions on the parameters:
γ_X(h) = σ² Σ_{i=|h|}^{q} θ_i θ_{i-|h|} for h = 0, ±1, ±2, ..., ±q, and 0 for |h| > q;
ρ_X(h) = γ_X(h)/γ_X(0) = Σ_{i=|h|}^{q} θ_i θ_{i-|h|} / Σ_{i=0}^{q} θ_i² for h = 0, ±1, ±2, ..., ±q, and 0 for |h| > q.

102 ACF for Simulated MA(1) and MA(2) Processes. [Figure: sample ACFs of simulated MA(1) series with theta = 0.7 and simulated MA(2) series with theta1 = 0.7 and theta2 = 0.3.]

103 Example: The ACF of MA(q) Models. Example: Find the first and second moments of the process X_t = ε_t + 0.6ε_{t-1} − 0.3ε_{t-2}, where ε_t ~ WN(0, σ²).
Solution: The process is MA(2) with θ_0 = 1, θ_1 = 0.6, and θ_2 = −0.3.
E(X_t) = 0
Var(X_t) = γ_X(0) = (1 + θ_1² + θ_2²)σ² = 1.45σ²
γ_X(1) = σ² Σ_{i=1}^{2} θ_i θ_{i-1} = (θ_1 θ_0 + θ_2 θ_1)σ² = 0.42σ²
γ_X(2) = σ² Σ_{i=2}^{2} θ_i θ_{i-2} = θ_2 θ_0 σ² = −0.30σ²
ρ_X(1) = γ_X(1)/γ_X(0) = 0.29,  ρ_X(2) = γ_X(2)/γ_X(0) = −0.21,  ρ_X(h) = 0 for |h| > 2.
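These theoretical values can be confirmed with the built-in ARMAacf() function (a quick check, not part of the original slides):
R> round(ARMAacf(ma = c(0.6, -0.3), lag.max = 3), 2)
R> ## lags 0 to 3 give 1.00, 0.29, -0.21 and 0.00, matching the hand computation above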

104 Invertibility of MA(q) Models. If we replace θ by 1/θ, the autocorrelation function of an MA(1), ρ(1) = θ/(1 + θ²), is unchanged. There are thus two processes, X_t = ε_t + θε_{t-1} and X_t = ε_t + (1/θ)ε_{t-1}, that show an identical autocorrelation pattern, and hence the MA(1) coefficient is not uniquely identified. For the general moving average process MA(q) there is a similar identifiability problem. The problem can be resolved by requiring the operator 1 + θ_1 B + θ_2 B² + ... + θ_q B^q to be invertible; i.e., all roots of Θ_q(B) = 1 + θ_1 B + θ_2 B² + ... + θ_q B^q = 0 must lie outside the unit circle.

105 Invertibility of MA(q) Models. Definition. The MA(q) process is invertible if it can be represented as a convergent infinite AR form, AR(∞). The MA(q) process X_t = Θ_q(B)ε_t, where Θ_q(B) = 1 + Σ_{j=1}^{q} θ_j B^j and ε_t ~ WN(0, σ²), has the infinite AR representation AR(∞):
Θ_q(B)^{-1} X_t = Θ_q(B)^{-1} Θ_q(B) ε_t = ε_t, i.e., Π(B)X_t = ε_t,
where Π(B) = Θ_q(B)^{-1} = 1 − Σ_{i=1}^{∞} π_i B^i = −Σ_{i=0}^{∞} π_i B^i with π_0 = −1, and the π_i satisfy Σ_{i=0}^{∞} |π_i| < ∞. Note that the condition of a finite sum (Σ |π_i| < ∞) ensures that the AR(∞) series converges (i.e., the MA(q) is invertible).

106 MA(1) in an Infinite AR Representation AR(∞). Consider the MA(1) process X_t = (1 + θB)ε_t. Multiplying both sides by (1 + θB)^{-1} gives
ε_t = (1 + θB)^{-1} X_t = Π(B)X_t, where Π(B) = (1 + θB)^{-1} = 1 − Σ_{i=1}^{∞} π_i B^i = −Σ_{i=0}^{∞} π_i B^i.
Recall that the Taylor series of 1/(1 − x), for |x| < 1, is the geometric series 1 + x + x² + x³ + ..., which implies
1/(1 + θB) = 1/(1 − (−θB)) = 1 − θB + θ²B² − θ³B³ + ... = Σ_{i=0}^{∞} (−θ)^i B^i.
So that (see the next slide)

107 MA(1) in an Infinite AR Representation AR(∞).
Σ_{i=0}^{∞} (−θ)^i B^i = −Σ_{i=0}^{∞} π_i B^i.
By equating the coefficients of the corresponding powers B^i we get π_i = (−1)^{i+1} θ^i for i ≥ 0. The MA(1) process X_t = (1 + θB)ε_t in the AR(∞) representation is therefore
X_t = Σ_{i=1}^{∞} π_i B^i X_t + ε_t = Σ_{i=1}^{∞} π_i X_{t-i} + ε_t, where π_i = (−1)^{i+1} θ^i, i = 1, 2, ....
Note that the root of 1 + θB = 0 has absolute value |B| = 1/|θ| > 1, equivalently |θ| < 1, which ensures that the MA(1) series is invertible.

108 Examples - Invertibility of MA(1) Models.
1.) The MA(1) process X_t = ε_t + 0.4ε_{t-1} is an invertible process because |θ| = 0.4 < 1 (or equivalently the root of 1 + 0.4B = 0 has absolute value |B| = 1/0.4 = 2.5 > 1). This process can be written in the AR(∞) representation
X_t = Σ_{i=1}^{∞} π_i X_{t-i} + ε_t, where π_i = (−1)^{i+1}(0.4)^i, i = 1, 2, ..., i.e.,
X_t = ε_t + 0.4X_{t-1} − (0.4)²X_{t-2} + (0.4)³X_{t-3} − ...
2.) The MA(1) process X_t = ε_t + 1.8ε_{t-1} is not an invertible process because the root of 1 + 1.8B = 0 has absolute value |B| = 1/1.8 ≈ 0.56 < 1. Thus we cannot write this process in an AR(∞) representation.

109 MA(q) in an Infinite AR Representation AR(∞). Because Π(B) = Θ_q(B)^{-1}, the coefficients π_i can be obtained by equating coefficients in the relation Θ_q(B)Π(B) = 1. Thus
Θ_q(B)Π(B) = (1 + θ_1 B + ... + θ_q B^q)(1 − π_1 B − π_2 B² − ...)
           = 1 − (π_1 − θ_1)B − (π_2 + θ_1 π_1 − θ_2)B² − ... − (π_j + θ_1 π_{j-1} + ... + θ_{q-1} π_{j-q+1} + θ_q π_{j-q})B^j − ...
Equating the coefficients of the various powers B^j in Θ_q(B)Π(B) = 1 to zero, for j = 1, 2, ..., gives
π_j = −θ_1 π_{j-1} − θ_2 π_{j-2} − ... − θ_{q-1} π_{j-q+1} − θ_q π_{j-q},
where π_0 = −1 and π_j = 0 for j < 0.

110 Example: MA(2) in an AR(∞) Representation. Example: An example of an invertible MA(2) process is X_t = ε_t − 0.1ε_{t-1} − 0.42ε_{t-2}. The roots of (1 − 0.1B − 0.42B²) = (1 − 0.7B)(1 + 0.6B) = 0 are B = 1/0.7 ≈ 1.43 and B = −1/0.6 ≈ −1.67, both greater than 1 in absolute value. Thus the process is invertible. The process in the AR(∞) representation is written as
X_t = Σ_{i=1}^{∞} π_i X_{t-i} + ε_t,
where π_j = −θ_1 π_{j-1} − θ_2 π_{j-2} − ... − θ_q π_{j-q}, with q = 2, θ_1 = −0.1, θ_2 = −0.42, θ_j = 0 for j > q, π_0 = −1, and π_j = 0 for j < 0. (See the next slide.)

111 Continue with the Previous Example.
π_j = −θ_1 π_{j-1} − θ_2 π_{j-2} − ... − θ_q π_{j-q}, where q = 2, θ_1 = −0.1, θ_2 = −0.42, θ_j = 0 for j > q, π_0 = −1, and π_j = 0 for j < 0.
π_1 = −θ_1 π_0 = −(−0.1)(−1) = −0.1
π_2 = −θ_1 π_1 − θ_2 π_0 = −(−0.1)(−0.1) − (−0.42)(−1) = −0.01 − 0.42 = −0.43
π_3 = −θ_1 π_2 − θ_2 π_1 = −(−0.1)(−0.43) − (−0.42)(−0.1) = −0.043 − 0.042 = −0.085
π_4 = −θ_1 π_3 − θ_2 π_2 = −(−0.1)(−0.085) − (−0.42)(−0.43) = −0.0085 − 0.1806 = −0.189
...
Thus, the MA(2) process in the AR(∞) representation is
X_t = ε_t − 0.1X_{t-1} − 0.43X_{t-2} − 0.085X_{t-3} − 0.189X_{t-4} − ...
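The π-weights can be checked numerically (a sketch, not from the original slides): the coefficients of Θ(B)^{-1} = 1/(1 − 0.1B − 0.42B²) are the ψ-weights of an AR(2) with coefficients 0.1 and 0.42, which stats::ARMAtoMA computes, and the π-weights are their negatives.
R> psi <- ARMAtoMA(ar = c(0.1, 0.42), lag.max = 4)   # coefficients of 1/(1 - 0.1B - 0.42B^2)
R> round(psi, 4)
R> ## 0.1, 0.43, 0.085, 0.1891 -- so pi_1, ..., pi_4 are -0.1, -0.43, -0.085, -0.189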

112 Example: MA(2) in an AR(∞) Representation. Example: Consider the MA(2) process X_t = ε_t − ε_{t-1} + 0.5ε_{t-2}. Recall that the roots of the quadratic equation ax² + bx + c = 0 are x = (−b ± sqrt(b² − 4ac)) / (2a). Thus the roots of 1 − B + 0.5B² = 0 are
B = (1 ± sqrt((−1)² − 4(0.5)(1))) / (2(0.5)) = 1 ± i, where i = sqrt(−1).
Recall that the absolute value of a complex number a + ib is sqrt(a² + b²), so the absolute values of the two roots are sqrt(2) ≈ 1.41 > 1; hence the process is invertible. (See the next slide.)

Continue with the Previous Example
Now the process in an AR(∞) representation is written as follows:
X_t = Σ_{i=1}^∞ π_i X_{t-i} + ɛ_t,
π_j = -θ_1π_{j-1} - θ_2π_{j-2}, with θ_1 = -1, θ_2 = 0.5, π_0 = -1 and π_{-1} = 0.
π_1 = -θ_1π_0 = -(-1)(-1) = -1
π_2 = -θ_1π_1 - θ_2π_0 = -(-1)(-1) - (0.5)(-1) = -1 + 0.5 = -0.5
π_3 = -θ_1π_2 - θ_2π_1 = -(-1)(-0.5) - (0.5)(-1) = -0.5 + 0.5 = 0
π_4 = -θ_1π_3 - θ_2π_2 = -(-1)(0) - (0.5)(-0.5) = 0.25
...
Thus, the MA(2) process in the AR(∞) representation is
X_t = -X_{t-1} - 0.5X_{t-2} + 0.25X_{t-4} + ... + ɛ_t.
113 of 189

Finding the Roots in R
Example
In R, the function polyroot(a), where a is the vector of polynomial coefficients in increasing order, can be used to find the zeros of a real or complex polynomial of degree n - 1,
p(x) = a_1 + a_2x + ... + a_nx^{n-1}.
The function Mod() can be used to check whether the roots of p(x) have modulus > 1 or not. The MA(4) process X_t = ɛ_t - 0.3ɛ_{t-1} + 0.7ɛ_{t-2} - 1.2ɛ_{t-3} + 0.1ɛ_{t-4} is not invertible, as
R> roots <- polyroot(c(1, -0.3, 0.7, -1.2, 0.1))
R> Mod(roots)
returns moduli that are not all greater than 1 (the numerical output is not reproduced here).
114 of 189
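As a cross-check of the complex-root example two slides back (X_t = ɛ_t - ɛ_{t-1} + 0.5ɛ_{t-2}), the same two functions confirm that both roots have modulus √2 ≈ 1.41 (a small sketch, not in the original slides):
R> Mod(polyroot(c(1, -1, 0.5)))   # both moduli equal sqrt(2) = 1.4142 > 1, so the MA(2) is invertible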

AR(1) in MA(∞) Representation
Let X_t be an AR(1) process, X_t = φX_{t-1} + ɛ_t, where ɛ_t ∼ WN(0, σ²). By repeated substitution,
X_t = φ(φX_{t-2} + ɛ_{t-1}) + ɛ_t = ɛ_t + φɛ_{t-1} + φ²X_{t-2}
    = ɛ_t + φɛ_{t-1} + φ²(φX_{t-3} + ɛ_{t-2}) = ɛ_t + φɛ_{t-1} + φ²ɛ_{t-2} + φ³X_{t-3}
    = ...
    = Σ_{j=0}^∞ φ^j ɛ_{t-j},
which is in the MA(∞) representation. Note that (apart from the ɛ_{t-j}) this representation may be seen as a geometric series, which converges if and only if |φ| < 1 (see the next slide).
115 of 189

AR(1) in MA(∞) Representation - Second Order Properties of AR(1) Models
The AR(1) process with the backshift operator may be written as
(1 - φB)X_t = ɛ_t, where ɛ_t ∼ WN(0, σ²), so X_t = ɛ_t/(1 - φB).
If |φB| < 1, the Taylor series of 1/(1 - φB) is 1 + φB + φ²B² + ..., hence
X_t = (1 + φB + φ²B² + φ³B³ + ...)ɛ_t = Σ_{j=0}^∞ φ^j ɛ_{t-j},
which is the MA(∞) Wold representation. (See the next slide.)
116 of 189

The AR(1) process in MA(∞) representation: X_t = Σ_{i=0}^∞ φ^i ɛ_{t-i}.
The mean of X_t: E(X_t) = Σ_{i=0}^∞ φ^i E(ɛ_{t-i}) = 0 (independent of t).
The variance: γ_X(0) = E(Σ_{i=0}^∞ φ^{2i} ɛ²_{t-i}) = σ² Σ_{i=0}^∞ φ^{2i} = σ²/(1 - φ²).
The autocovariance is independent of t, as follows:
γ_X(h) = E(X_t X_{t+h}) = Σ_{i=0}^∞ Σ_{j=h}^∞ φ^i φ^j E(ɛ_{t-i} ɛ_{t+h-j}) = σ² Σ_{i=0}^∞ φ^i φ^{i+h} = σ² φ^h Σ_{i=0}^∞ φ^{2i} = σ² φ^h/(1 - φ²).
The autocorrelation: ρ_X(h) = γ_X(h)/γ_X(0) = φ^h (independent of t).
117 of 189
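The closed-form ACF ρ_X(h) = φ^h can be cross-checked against the theoretical ACF that R computes for an ARMA model (a small sketch, not in the original slides; ARMAacf() is in the stats package):
R> phi <- 0.8
R> round(ARMAacf(ar = phi, lag.max = 5), 4)   # 1 0.8 0.64 0.512 0.4096 0.3277
R> round(phi^(0:5), 4)                        # identical, as rho(h) = phi^h predicts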

Autocorrelation Function of AR(1) Model - Example
R> set.seed(125)
R> ar.1 <- arima.sim(n=200, list(order=c(1,0,0), ar=0.8))
R> ar.2 <- arima.sim(n=200, list(order=c(1,0,0), ar=-0.5))
R> par(mfrow=c(1,2))
R> acf(ar.1, main="phi = 0.8"); acf(ar.2, main="phi = -0.5")
R> par(mfrow=c(1,1))
[Figure: sample ACF of the two simulated AR(1) series; left panel "phi = 0.8", right panel "phi = -0.5".]
118 of 189

Example: The Autocorrelation of AR(1)
Example: The autocorrelations of the stationary AR(1) process X_t = 0.6X_{t-1} + ɛ_t, ɛ_t ∼ WN(0, 4), are ρ_X(h) = φ^h = 0.6^h, h ≥ 0. Thus ρ_X(0) = 1, ρ_X(1) = 0.6, ρ_X(2) = 0.6² = 0.36, ....
R> Lag <- 0:8
R> rho <- 0.6^Lag
R> round(rho, 3)
[1] 1.000 0.600 0.360 0.216 0.130 0.078 0.047 0.028 0.017
119 of 189

Second Order Properties of AR(p) Models
Consider the AR(p) model:
X_t = Σ_{i=1}^p φ_i X_{t-i} + ɛ_t, where ɛ_t ∼ WN(0, σ²).
The expectation of X_t is E(X_t) = 0 (Why?).
The variance of X_t is Var(X_t) = γ_X(0) = Σ_{i=1}^p φ_i γ_X(i) + σ², where γ_X(i) = Cov(X_t, X_{t-i}). Note that the variance of X_t is obtained by multiplying both sides of the above equation by X_t and taking expectations, where E(X_t ɛ_t) = σ² (Why?). (see the next slide)!
120 of 189

Second Order Properties of AR(p) Models (Cont.)
Multiplying both sides of the equation X_t = Σ_{i=1}^p φ_i X_{t-i} + ɛ_t by X_{t-h} and taking expectations gives
γ_X(h) = Σ_{i=1}^p φ_i γ_X(h-i), h > 0.
Now divide both sides by γ_X(0) to get the autocorrelations
ρ_X(h) = Σ_{i=1}^p φ_i ρ_X(h-i), h > 0.
These are the Yule-Walker equations that we will discuss later (see the Partial Autocorrelation Function (PACF) section). The population autocorrelations ρ_X(h) can be found recursively by solving the Yule-Walker equations given the values of φ_i. In practice, as we will see later, we use the sample autocorrelation coefficients to estimate the values of φ_i.
121 of 189

Stationarity of AR(1) Models
Definition
The condition for the autoregressive process of order 1, AR(1): X_t = φX_{t-1} + ɛ_t, to be stationary is that the absolute value of the root of (1 - φB) = 0 must lie outside the unit circle. That is, the AR(1) is stationary if |B| = 1/|φ| > 1, or equivalently, if |φ| < 1.
(1 - 0.4B)X_t = ɛ_t is a stationary process because the absolute value of the root of (1 - 0.4B) = 0 is |B| = 1/0.4 = 2.5 > 1.
(1 - 1.8B)X_t = ɛ_t is not a stationary process because the root of (1 - 1.8B) = 0 is |B| = 1/1.8 = 0.56 < 1.
X_t = 0.5X_{t-1} + ɛ_t is a stationary process because |φ| = 0.5 < 1.
122 of 189

Stationarity of AR(2) Models
Theorem
For the AR(2) model, X_t = φ_1X_{t-1} + φ_2X_{t-2} + ɛ_t, the necessary and sufficient conditions for stationarity are:
|φ_2| < 1, φ_1 + φ_2 < 1, φ_2 - φ_1 < 1.
Examples (a simple R check of these conditions follows this slide)
X_t = 1.1X_{t-1} - 0.4X_{t-2} + ɛ_t is stationary.
X_t = 0.6X_{t-1} - 1.3X_{t-2} + ɛ_t is not stationary (|φ_2| ≥ 1).
X_t = 0.6X_{t-1} + ...X_{t-2} + ɛ_t is not stationary (φ_1 + φ_2 ≥ 1).
X_t = -0.4X_{t-1} + ...X_{t-2} + ɛ_t is not stationary (φ_2 - φ_1 ≥ 1).
123 of 189
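A tiny helper (hypothetical, not part of the original slides) that checks the three AR(2) conditions:
R> ar2.stationary <- function(phi1, phi2) {
+    abs(phi2) < 1 && phi1 + phi2 < 1 && phi2 - phi1 < 1
+  }
R> ar2.stationary(1.1, -0.4)   # TRUE
R> ar2.stationary(0.6, -1.3)   # FALSE: |phi2| >= 1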

Stationarity of AR(p) Models
Definition
The condition for the autoregressive process of order p, AR(p): X_t = φ_1X_{t-1} + ... + φ_pX_{t-p} + ɛ_t, to be stationary is that the absolute values of the roots of (1 - φ_1B - φ_2B² - ... - φ_pB^p) = 0 must all lie outside the unit circle. (The same checks can be done numerically with polyroot(); see the sketch after this slide.)
X_t = 0.4X_{t-1} + 0.21X_{t-2} + ɛ_t is a stationary process, as the roots of (1 - 0.4B - 0.21B²) = (1 - 0.7B)(1 + 0.3B) = 0 are B = 1/0.7 and B = -1/0.3, with absolute values 1.4 > 1 and 3.3 > 1.
X_t = 1.7X_{t-1} - 0.42X_{t-2} + ɛ_t is not a stationary process because the absolute value of one of the two roots of (1 - 1.7B + 0.42B²) = (1 - 0.3B)(1 - 1.4B) = 0 lies inside the unit circle: |B| = 1/1.4 = 0.71 < 1.
X_t = -0.25X_{t-2} + ɛ_t is a stationary process because the roots of 1 + 0.25B² = 0 are ±2i, with |±2i| = 2 > 1, where i = √(-1).
124 of 189
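A quick numerical verification of these three examples (a sketch, not in the original slides):
R> Mod(polyroot(c(1, -0.4, -0.21)))   # 1.43 and 3.33: all roots outside the unit circle, stationary
R> Mod(polyroot(c(1, -1.7, 0.42)))    # 0.71 and 3.33: one root inside, not stationary
R> Mod(polyroot(c(1, 0, 0.25)))       # 2 and 2: stationary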

Causality of AR(p) in an Infinite MA(∞) Representation
The AR(p) process Φ_p(B)X_t = ɛ_t, where Φ_p(B) = 1 - Σ_{j=1}^p φ_jB^j and ɛ_t ∼ WN(0, σ²), is a causal process if it can be written in an infinite MA representation MA(∞):
X_t = Φ_p(B)^{-1} ɛ_t = Ψ(B)ɛ_t = Σ_{i=0}^∞ ψ_i ɛ_{t-i},
where Ψ(B) = Φ_p(B)^{-1} = 1 + Σ_{i=1}^∞ ψ_iB^i and the ψ_i satisfy Σ_{i=0}^∞ |ψ_i| < ∞ with ψ_0 = 1. (see the next slide).
125 of 189

AR(p) in an Infinite MA(∞) Representation
Because Ψ(B) = Φ_p(B)^{-1}, the coefficients ψ_i can be obtained by equating the coefficients in the relation Φ_p(B)Ψ(B) = 1. Thus,
Φ_p(B)Ψ(B) = (1 - φ_1B - ... - φ_pB^p)(1 + ψ_1B + ψ_2B² + ...)
= 1 + (ψ_1 - φ_1)B + (ψ_2 - φ_1ψ_1 - φ_2)B² + ... + (ψ_j - φ_1ψ_{j-1} - ... - φ_pψ_{j-p})B^j + ...
By equating the coefficients of the various powers B^j in the relation Φ_p(B)Ψ(B) = 1 to zero for j = 1, 2, ..., we have
ψ_j = φ_1ψ_{j-1} + φ_2ψ_{j-2} + ... + φ_{p-1}ψ_{j-p+1} + φ_pψ_{j-p},
where ψ_0 = 1 and ψ_j = 0 for j < 0.
126 of 189

Example: AR(p) in an Infinite MA(∞) Representation
Example: An example of a stationary AR(2) process is
X_t = 0.1X_{t-1} + 0.42X_{t-2} + ɛ_t,
where p = 2, φ_1 = 0.1, φ_2 = 0.42, and φ_j = 0 for j > p. The roots of (1 - 0.1B - 0.42B²) = (1 - 0.7B)(1 + 0.6B) = 0 are B = 1/0.7 ≈ 1.43 > 1 and B = -1/0.6, with |B| = 1/0.6 ≈ 1.67 > 1. Thus, the process is stationary-causal. Now the process in an MA(∞) representation is written as follows:
X_t = Σ_{i=0}^∞ ψ_i ɛ_{t-i},
where ψ_j = φ_1ψ_{j-1} + φ_2ψ_{j-2} + ... + φ_{p-1}ψ_{j-p+1} + φ_pψ_{j-p}, with ψ_0 = 1 and ψ_j = 0 for j < 0. (see the next slide).
127 of 189

Continue with the Previous Example
Use ψ_j = φ_1ψ_{j-1} + φ_2ψ_{j-2} + ... + φ_pψ_{j-p}, where p = 2, φ_1 = 0.1, φ_2 = 0.42, φ_j = 0 for j > p, ψ_0 = 1 and ψ_j = 0 for j < 0:
ψ_1 = φ_1ψ_0 = (0.1)(1) = 0.1
ψ_2 = φ_1ψ_1 + φ_2ψ_0 = (0.1)(0.1) + (0.42)(1) = 0.43
ψ_3 = φ_1ψ_2 + φ_2ψ_1 = (0.1)(0.43) + (0.42)(0.1) = 0.085
ψ_4 = φ_1ψ_3 + φ_2ψ_2 = (0.1)(0.085) + (0.42)(0.43) ≈ 0.189
...
Thus, the AR(2) process in the MA(∞) representation is
X_t = ɛ_t + 0.1ɛ_{t-1} + 0.43ɛ_{t-2} + 0.085ɛ_{t-3} + 0.189ɛ_{t-4} + ....
128 of 189
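stats::ARMAtoMA() computes exactly these ψ weights, so the hand calculation can be checked in one line (a sketch, not in the original slides):
R> round(ARMAtoMA(ar = c(0.1, 0.42), lag.max = 4), 3)   # 0.100 0.430 0.085 0.189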

Impulse Response Sequence of ARMA(p, q) Models
Recall that the ARMA(p, q) models may be represented as
Φ_p(B)X_t = Θ_q(B)ɛ_t, with Φ_p(B) = 1 - Σ_{i=1}^p φ_iB^i and Θ_q(B) = 1 + Σ_{j=1}^q θ_jB^j.
Under the stationarity conditions, we have the MA(∞) form
X_t = Ψ(B)ɛ_t, where Ψ(B) = Φ_p(B)^{-1} Θ_q(B) = Σ_{i=0}^∞ ψ_iB^i.
Ψ is known as the impulse response sequence. (see next page)
129 of 189

Impulse Response Sequence of ARMA(p, q) Models
Similar to the pure AR model situation, the coefficients ψ_i can be derived by equating the coefficients of the relation Φ_p(B)Ψ(B) = Θ_q(B). The ψ_j satisfy
ψ_j = φ_1ψ_{j-1} + φ_2ψ_{j-2} + ... + φ_pψ_{j-p} + θ_j, j = 1, 2, ...,
where ψ_0 = 1, ψ_j = 0 for j < 0, and θ_j = 0 for j > q. (see next page)
130 of 189
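For example, for the ARMA(1,1) model X_t = 0.9X_{t-1} + ɛ_t + 0.5ɛ_{t-1} (the model simulated a few slides below), the recursion gives ψ_1 = φ + θ = 1.4 followed by geometric decay at rate φ; a one-line R check (not in the original slides):
R> round(ARMAtoMA(ar = 0.9, ma = 0.5, lag.max = 5), 3)   # 1.400 1.260 1.134 1.021 0.919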

Impulse Response Sequence of ARMA(p, q) Models
Under the invertibility conditions, we have the convergent infinite AR representation
ɛ_t = Π(B)X_t, where Π(B) = Θ_q(B)^{-1} Φ_p(B) = 1 - Σ_{i=1}^∞ π_iB^i.
Similar to the pure MA model situation, the coefficients π_i can be determined by equating the coefficients of the relation Θ_q(B)Π(B) = Φ_p(B). The π_j satisfy
π_j = -θ_1π_{j-1} - θ_2π_{j-2} - ... - θ_qπ_{j-q} + φ_j, j = 1, 2, ...,
where π_0 = -1, π_j = 0 for j < 0, and φ_j = 0 for j > p.
131 of 189

ACF of ARMA(1, 1) Models
Consider the causal ARMA(1,1) model X_t = φX_{t-1} + ɛ_t + θɛ_{t-1}, where |φ| < 1. Then
γ(h) = Cov(X_{t+h}, X_t) = E(X_{t+h}X_t) = E[(φX_{t+h-1} + ɛ_{t+h} + θɛ_{t+h-1})X_t]
     = φE(X_{t+h-1}X_t) + E(ɛ_{t+h}X_t) + θE(ɛ_{t+h-1}X_t).   (1)
Recall that X_t = Σ_{j=0}^∞ ψ_jɛ_{t-j}, where the coefficients ψ_j can be calculated from the relation Ψ(B) = Φ_p(B)^{-1} Θ_q(B) = (1 + θB)/(1 - φB) (see the impulse response sequence of ARMA(p, q) models), and ψ_0 = 1 and ψ_1 = φ + θ for the ARMA(1,1) case.
132 of 189

ACF of ARMA(1, 1) Models
E(ɛ_{t+h}X_t) = E(ɛ_{t+h} Σ_{j=0}^∞ ψ_jɛ_{t-j}) = Σ_{j=0}^∞ ψ_j E(ɛ_{t+h}ɛ_{t-j}) = ψ_0σ² for h = 0; 0 for h ≥ 1.   (2)
E(ɛ_{t+h-1}X_t) = E(ɛ_{t+h-1} Σ_{j=0}^∞ ψ_jɛ_{t-j}) = Σ_{j=0}^∞ ψ_j E(ɛ_{t+h-1}ɛ_{t-j}) = ψ_1σ² for h = 0; ψ_0σ² for h = 1; 0 for h ≥ 2.   (3)
Furthermore, ψ_0 = 1 and ψ_1 = φ + θ.
133 of 189

ACF of ARMA(1, 1) Models
Putting Equations (1)-(3) together we obtain
γ(h) = φE(X_{t+h-1}X_t) + E(ɛ_{t+h}X_t) + θE(ɛ_{t+h-1}X_t)
     = φγ(1) + σ²(1 + φθ + θ²), for h = 0;
     = φγ(0) + σ²θ, for h = 1;
     = φγ(h-1), for h ≥ 2.
Note that γ(h) = φγ(h-1), h ≥ 2, has an iterative form:
γ(2) = φγ(1), γ(3) = φγ(2) = φ²γ(1), ..., γ(h) = φ^{h-1}γ(1),
with initial conditions γ(0) = φγ(1) + σ²(1 + φθ + θ²) and γ(1) = φγ(0) + σ²θ.
134 of 189

ACF of ARMA(1, 1) Models
Solving for γ(0) and γ(1) we obtain
γ(0) = σ²(1 + 2φθ + θ²)/(1 - φ²),
γ(1) = σ²(1 + φθ)(φ + θ)/(1 - φ²).
This gives us
γ(h) = σ²(1 + φθ)(φ + θ)φ^{h-1}/(1 - φ²), for h ≥ 1.
Finally, dividing by γ(0) gives the ACF of the ARMA(1,1):
ρ(h) = (1 + φθ)(φ + θ)φ^{h-1}/(1 + 2φθ + θ²), for h ≥ 1.
135 of 189
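The closed form can be checked against R's theoretical ACF (a sketch, not in the original slides), e.g. for φ = 0.9 and θ = 0.5:
R> phi <- 0.9; theta <- 0.5
R> h <- 1:5
R> round((1 + phi*theta)*(phi + theta)/(1 + 2*phi*theta + theta^2) * phi^(h - 1), 3)
R> round(ARMAacf(ar = phi, ma = theta, lag.max = 5)[-1], 3)   # same values: 0.944 0.850 0.765 0.688 0.619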

An ARMA(1,1) Simulated Process - Example
R> set.seed(13675)
R> sim <- arima.sim(n=200, list(order=c(1,0,1), ar=0.9, ma=0.5))
R> par(mfrow=c(1,2))
R> plot(sim, ylab=""); acf(sim, lag.max=85)
R> par(mfrow=c(1,1))
[Figure: time plot of the simulated series (left) and its sample ACF up to lag 85 (right).]
Figure: A simulated ARMA(1,1) process: X_t = 0.9X_{t-1} + ɛ_t + 0.5ɛ_{t-1}.
136 of 189

ACF for ARMA(p, q) Models
Assume that the ARMA(p, q) model Φ_p(B)X_t = Θ_q(B)ɛ_t is causal-stationary, that is, the roots of Φ_p(B) are outside the unit circle. Then we can write
X_t = Σ_{j=0}^∞ ψ_jɛ_{t-j},
where the coefficients ψ_j can be calculated from the relation Ψ(B) = Φ_p(B)^{-1} Θ_q(B). It follows that E(X_t) = 0. Note that X_t and ɛ_{t+j} are uncorrelated for j > 0 and ψ_j = 0 for j < 0. As in the ARMA(1,1) case, we can obtain a homogeneous difference equation in terms of γ(h), with some initial conditions, as follows.
137 of 189

ACF for ARMA(p, q) Models
For an ARMA model X_t = Σ_{j=1}^p φ_jX_{t-j} + Σ_{j=0}^q θ_jɛ_{t-j}, with θ_0 = 1,
γ(h) = Cov(X_{t+h}, X_t) = E(X_{t+h}X_t)
     = E[(Σ_{j=1}^p φ_jX_{t+h-j} + Σ_{j=0}^q θ_jɛ_{t+h-j}) X_t]
     = Σ_{j=1}^p φ_j E[X_{t+h-j}X_t] + Σ_{j=0}^q θ_j E[ɛ_{t+h-j} Σ_{i=0}^∞ ψ_iɛ_{t-i}]
     = Σ_{j=1}^p φ_jγ(h-j) + σ² Σ_{j=h}^q θ_jψ_{j-h}, for h ≥ 0.   (4)
This gives the general homogeneous difference equation for γ(h). (see the next slide!)
138 of 189

ACF for ARMA(p, q) Models
γ(h) - φ_1γ(h-1) - ... - φ_pγ(h-p) = 0, for h ≥ max(p, q+1),
with initial conditions
γ(h) - Σ_{j=1}^p φ_jγ(h-j) = σ² Σ_{j=h}^q θ_jψ_{j-h}, for 0 ≤ h < max(p, q+1).
Dividing these two equations through by γ(0) will allow us to solve for the ACF of ARMA(p, q) models, ρ(h) = γ(h)/γ(0):
ρ(h) - φ_1ρ(h-1) - ... - φ_pρ(h-p) = 0, for h ≥ max(p, q+1),
with initial conditions
ρ(h) - Σ_{j=1}^p φ_jρ(h-j) = (σ²/γ(0)) Σ_{j=h}^q θ_jψ_{j-h}, for 0 ≤ h < max(p, q+1).
139 of 189

ACF of an AR(p) Model
For a causal AR(p), it follows from the previous slide that
ρ(h) - φ_1ρ(h-1) - ... - φ_pρ(h-p) = 0, for h ≥ p,
with initial conditions
ρ(0) - φ_1ρ(-1) - φ_2ρ(-2) - ... - φ_pρ(-p) = σ²/γ(0)
ρ(1) - φ_1ρ(0) - φ_2ρ(-1) - ... - φ_pρ(1-p) = 0
ρ(2) - φ_1ρ(1) - φ_2ρ(0) - ... - φ_pρ(2-p) = 0
...
ρ(p-1) - φ_1ρ(p-2) - φ_2ρ(p-3) - ... - φ_pρ(-1) = 0
ρ(p) - φ_1ρ(p-1) - φ_2ρ(p-2) - ... - φ_pρ(0) = 0
where ρ(-h) = ρ(h), for h = 1, 2, ..., p.
140 of 189

Example - ACF of an AR(2) Model
Consider the AR(2) case. Then
ρ(h) - φ_1ρ(h-1) - φ_2ρ(h-2) = 0, for h ≥ 2,
with initial conditions
ρ(0) - φ_1ρ(-1) - φ_2ρ(-2) = σ²/γ(0) (what is the value of γ(0)?)
ρ(1) - φ_1ρ(0) - φ_2ρ(-1) = 0
ρ(2) - φ_1ρ(1) - φ_2ρ(0) = 0 (you may neglect this condition!)
where ρ(-h) = ρ(h) for all h. Hence,
ρ(0) = 1, ρ(1) = φ_1/(1 - φ_2), and ρ(h) = φ_1ρ(h-1) + φ_2ρ(h-2), for h ≥ 2.
141 of 189

Cramer's Rule
Definition
Consider a system of n linear equations in n unknowns, represented in matrix multiplication form as AX = b, where the n × n matrix A has a nonzero determinant and X = (x_1, x_2, ..., x_n)^T is the column vector of unknowns. Then the system has a unique solution, whose individual values are given by
x_i = det(A_i)/det(A), i = 1, 2, ..., n,
where A_i is the matrix formed by replacing the i-th column of A by the column vector b.
142 of 189

Partial Autocorrelation Function (PACF)
The partial autocorrelation function (PACF), φ_{h,h}, is a useful tool for identifying the order of an AR(p) process.
The partial autocorrelation measures the correlation between the two random variables X_t and X_{t+h} at lag h after removing the linear dependence on the intervening variables X_{t+1}, ..., X_{t+h-1}. The PACF thus represents the sequence of conditional correlations:
φ_{h,h} = Corr(X_t, X_{t+h} | X_{t+1}, ..., X_{t+h-1}), h = 1, 2, ....
(see the next slide for the definition of the conditional correlation!)
The autocorrelation function (ACF) between X_t and X_{t+h} at lag h does not adjust for the influence of the intervening lags: the ACF thus represents the sequence of unconditional correlations.
143 of 189

Partial Autocorrelation Function (PACF)
φ_{h,h} = Corr(X_t, X_{t+h} | X_{t+1}, ..., X_{t+h-1})
        = Cov[(X_t | X_{t+1}, ..., X_{t+h-1}), (X_{t+h} | X_{t+1}, ..., X_{t+h-1})] / √(Var(X_t | X_{t+1}, ..., X_{t+h-1}) Var(X_{t+h} | X_{t+1}, ..., X_{t+h-1}))
        = Cov[(X_t - X̂_t), (X_{t+h} - X̂_{t+h})] / √(Var(X_t - X̂_t) Var(X_{t+h} - X̂_{t+h})),
where
X̂_t = α_1X_{t+1} + α_2X_{t+2} + ... + α_{h-1}X_{t+h-1},
X̂_{t+h} = β_1X_{t+1} + β_2X_{t+2} + ... + β_{h-1}X_{t+h-1},
and α_i, β_i (1 ≤ i ≤ h-1) are the linear regression coefficients obtained by minimizing the mean squared errors E(X_t - X̂_t)² and E(X_{t+h} - X̂_{t+h})², respectively.
144 of 189

Yule-Walker Equations and PACF for AR(p) Process
The Yule-Walker equations can be used to derive the partial autocorrelation coefficients at lags 1, 2, ..., h as follows:
1 Fit the regression model in which the dependent variable X_t, from a zero-mean stationary process, is regressed on the h lagged variables X_{t-1}, X_{t-2}, ..., X_{t-h}, i.e.
X_t = φ_{h,1}X_{t-1} + φ_{h,2}X_{t-2} + ... + φ_{h,h}X_{t-h} + ɛ_t,
where φ_{h,h} denotes the h-th regression parameter and ɛ_t is an error term with mean 0 that is uncorrelated with the lagged variables X_{t-j}, j ≥ 1.
2 Multiply this equation by X_{t-1}, take expectations and divide the result by the variance of X_t. Do the same operation with X_{t-2}, X_{t-3}, ..., X_{t-h} successively to get the following set of h Yule-Walker equations. (see the next slide)!
The Yule-Walker equations are a technique that can be used to estimate the autoregressive parameters of the AR(h) model X_t = Σ_{i=1}^h φ_iX_{t-i} + ɛ_t from data.
145 of 189

Yule-Walker Equations and PACF for AR(p) Process
The h × h Yule-Walker equations associated with the AR(h) model:
ρ_1 = φ_{h,1} + φ_{h,2}ρ_1 + φ_{h,3}ρ_2 + ... + φ_{h,h}ρ_{h-1}
ρ_2 = φ_{h,1}ρ_1 + φ_{h,2} + φ_{h,3}ρ_1 + ... + φ_{h,h}ρ_{h-2}
...
ρ_h = φ_{h,1}ρ_{h-1} + φ_{h,2}ρ_{h-2} + ... + φ_{h,h-1}ρ_1 + φ_{h,h}
These may be represented in matrix form as AX = b:
| 1        ρ_1      ρ_2      ...  ρ_{h-1} |  | φ_{h,1} |   | ρ_1 |
| ρ_1      1        ρ_1      ...  ρ_{h-2} |  | φ_{h,2} |   | ρ_2 |
| ρ_2      ρ_1      1        ...  ρ_{h-3} |  | φ_{h,3} | = | ρ_3 |
| ...      ...      ...      ...  ...     |  |   ...   |   | ... |
| ρ_{h-1}  ρ_{h-2}  ρ_{h-3}  ...  1       |  | φ_{h,h} |   | ρ_h |
X is the column vector of unknowns (φ_{h,1}, φ_{h,2}, ..., φ_{h,h})^T and b is the column vector (ρ_1, ρ_2, ..., ρ_h)^T. (see the next slide)!
146 of 189

Yule-Walker Equations and PACF for AR(p) Process
Use Cramer's rule successively for h = 1, 2, ... to get:
φ_{1,1} = ρ_1
φ_{2,2} = | 1  ρ_1 ; ρ_1  ρ_2 | / | 1  ρ_1 ; ρ_1  1 | = (ρ_2 - ρ_1²)/(1 - ρ_1²)
φ_{3,3} = | 1  ρ_1  ρ_1 ; ρ_1  1  ρ_2 ; ρ_2  ρ_1  ρ_3 | / | 1  ρ_1  ρ_2 ; ρ_1  1  ρ_1 ; ρ_2  ρ_1  1 |
        = [ρ_3(1 - ρ_1²) + ρ_1ρ_2(ρ_2 - 2) + ρ_1³] / [1 - 2ρ_1² + 2ρ_1²ρ_2 - ρ_2²]
...
(see next page)!
147 of 189

Yule-Walker Equations and PACF for AR(p) Process
In general,
φ_{h,h} = det(A_h*) / det(A_h),
where A_h is the h × h autocorrelation matrix
| 1        ρ_1      ρ_2      ...  ρ_{h-1} |
| ρ_1      1        ρ_1      ...  ρ_{h-2} |
| ...                                     |
| ρ_{h-1}  ρ_{h-2}  ρ_{h-3}  ...  1       |
and A_h* is the same matrix with its last column replaced by the column vector (ρ_1, ρ_2, ..., ρ_h)^T:
| 1        ρ_1      ...  ρ_{h-2}  ρ_1 |
| ρ_1      1        ...  ρ_{h-3}  ρ_2 |
| ...                                 |
| ρ_{h-1}  ρ_{h-2}  ...  ρ_1      ρ_h |
148 of 189

A Numerical Example: How to Calculate PACF
Suppose that ρ_1 = 0.7, ρ_2 = 0.5, and ρ_3 = 0.2. Then
φ_{1,1} = ρ_1 = 0.7
φ_{2,2} = (ρ_2 - ρ_1²)/(1 - ρ_1²) = (0.5 - 0.49)/(1 - 0.49) ≈ 0.02
φ_{3,3} = | 1  0.7  0.7 ; 0.7  1  0.5 ; 0.5  0.7  0.2 | / | 1  0.7  0.5 ; 0.7  1  0.7 ; 0.5  0.7  1 | = -0.08/0.26 ≈ -0.31
(A short R verification follows this slide.)
149 of 189
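The same numbers can be obtained by solving the Yule-Walker system directly with solve() instead of Cramer's rule (a sketch, not in the original slides; pacf.from.rho is a hypothetical helper):
R> pacf.from.rho <- function(rho) {
+    h <- length(rho)
+    A <- diag(h)
+    for (i in 1:h) for (j in 1:h) if (i != j) A[i, j] <- rho[abs(i - j)]
+    phi <- solve(A, rho)   # Yule-Walker solution (phi_{h,1}, ..., phi_{h,h})
+    phi[h]                 # the last coefficient is the PACF at lag h
+  }
R> rho <- c(0.7, 0.5, 0.2)
R> round(sapply(1:3, function(h) pacf.from.rho(rho[1:h])), 3)   # 0.700 0.020 -0.308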

Remarks
The partial autocorrelation function φ_{h,h} is a function of the autocorrelations ρ_1, ..., ρ_h. Thus -1 ≤ φ_{h,h} ≤ 1 for h > 0.
If {ɛ_t} is a white noise process, then the partial autocorrelation function φ_{h,h} = 0 for all h > 0, whereas φ_{0,0} = ρ_0 = 1.
If the underlying process is AR(p), then φ_{h,h} = 0 for all h > p, so the plot of the PACF should show a cutoff after lag p (see the next slide!).
Replacing ρ_h (population autocorrelations) by ρ̂_h (sample autocorrelations) for all h gives the sample PACF φ̂_{h,h} (see the Levinson-Durbin recursion method).
150 of 189

PACF for Simulated AR(1) and AR(2) Processes
[Figure: sample PACF plots of four simulated series; panel titles: "AR(1), phi = 0.7", "AR(1), phi = -0.7", "AR(2), phi1 = 0.4 and phi2 = 0.5", "AR(2), phi1 = 1.1 and phi2 = -0.4".]
151 of 189

Sample PACF: Levinson-Durbin Recursive Method
In practice the sample PACF is obtained by the Levinson-Durbin recursion, starting with φ̂_{1,1} = ρ̂_1, as follows:
φ̂_{h+1,h+1} = (ρ̂_{h+1} - Σ_{j=1}^h φ̂_{h,j} ρ̂_{h+1-j}) / (1 - Σ_{j=1}^h φ̂_{h,j} ρ̂_j),
and φ̂_{h+1,j} = φ̂_{h,j} - φ̂_{h+1,h+1} φ̂_{h,h+1-j}, for j = 1, 2, ..., h.
Hence,
φ̂_{1,1} = ρ̂_1,
φ̂_{2,2} = (ρ̂_2 - ρ̂_1²)/(1 - ρ̂_1²), φ̂_{2,1} = φ̂_{1,1} - φ̂_{2,2}φ̂_{1,1},
φ̂_{3,3} = (ρ̂_3 - φ̂_{2,1}ρ̂_2 - φ̂_{2,2}ρ̂_1)/(1 - φ̂_{2,1}ρ̂_1 - φ̂_{2,2}ρ̂_2), φ̂_{3,2} = φ̂_{2,2} - φ̂_{3,3}φ̂_{2,1}, φ̂_{3,1} = φ̂_{2,1} - φ̂_{3,3}φ̂_{2,2};
other φ̂_{h,h}, φ̂_{h,j} can be calculated similarly.
152 of 189

Levinson-Durbin Algorithm for PACF in AR Models
For AR(1): φ̂_{1,1} = ρ̂_1, and φ̂_{h,h} = 0 for h > 1.
For AR(2): φ̂_{1,1} = ρ̂_1, φ̂_{2,2} = (ρ̂_2 - ρ̂_1²)/(1 - ρ̂_1²), φ̂_{2,1} = φ̂_{1,1} - φ̂_{2,2}φ̂_{1,1}, and φ̂_{h,h} = 0 for h > 2.
For AR(3): φ̂_{1,1} = ρ̂_1, φ̂_{2,2} = (ρ̂_2 - ρ̂_1²)/(1 - ρ̂_1²), φ̂_{2,1} = φ̂_{1,1} - φ̂_{2,2}φ̂_{1,1},
φ̂_{3,3} = (ρ̂_3 - φ̂_{2,1}ρ̂_2 - φ̂_{2,2}ρ̂_1)/(1 - φ̂_{2,1}ρ̂_1 - φ̂_{2,2}ρ̂_2), φ̂_{3,2} = φ̂_{2,2} - φ̂_{3,3}φ̂_{2,1}, φ̂_{3,1} = φ̂_{2,1} - φ̂_{3,3}φ̂_{2,2},
and φ̂_{h,h} = 0 for h > 3.
153 of 189

A Numerical Example: How to Calculate PACF
Suppose that ρ̂_1 = -0.188, ρ̂_2 = -0.201, and ρ̂_3 = 0.181. Then
φ̂_{1,1} = ρ̂_1 = -0.188,
φ̂_{2,2} = (ρ̂_2 - ρ̂_1²)/(1 - ρ̂_1²) = (-0.201 - (-0.188)²)/(1 - (-0.188)²) = -0.245,
φ̂_{2,1} = φ̂_{1,1} - φ̂_{2,2}φ̂_{1,1} = -0.188 - (-0.245)(-0.188) = -0.234,
φ̂_{3,3} = (ρ̂_3 - φ̂_{2,1}ρ̂_2 - φ̂_{2,2}ρ̂_1)/(1 - φ̂_{2,1}ρ̂_1 - φ̂_{2,2}ρ̂_2)
        = (0.181 - (-0.234)(-0.201) - (-0.245)(-0.188)) / (1 - (-0.234)(-0.188) - (-0.245)(-0.201)) = 0.097,
φ̂_{3,2} = φ̂_{2,2} - φ̂_{3,3}φ̂_{2,1} = -0.245 - (0.097)(-0.234) = -0.222,
φ̂_{3,1} = φ̂_{2,1} - φ̂_{3,3}φ̂_{2,2} = -0.234 - (0.097)(-0.245) = -0.210.
154 of 189
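A minimal Levinson-Durbin sketch in R (not from the original slides; levinson.pacf is a hypothetical helper) reproduces φ̂_{1,1}, φ̂_{2,2}, φ̂_{3,3} for these sample autocorrelations:
R> levinson.pacf <- function(rho) {
+    H <- length(rho)
+    out <- numeric(H)
+    phi <- numeric(0)                # phi_{h,1..h} at the current order h
+    for (h in 1:H) {
+      if (h == 1) {
+        phi <- rho[1]
+      } else {
+        phi.hh <- (rho[h] - sum(phi * rho[(h - 1):1])) / (1 - sum(phi * rho[1:(h - 1)]))
+        phi <- c(phi - phi.hh * rev(phi), phi.hh)
+      }
+      out[h] <- phi[h]
+    }
+    out
+  }
R> round(levinson.pacf(c(-0.188, -0.201, 0.181)), 3)   # -0.188 -0.245  0.097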

The PACF of Invertible MA and ARMA Models
For an MA(1), X_t = ɛ_t + θɛ_{t-1}, |θ| < 1, one can show that
φ_{h,h} = -(-θ)^h (1 - θ²)/(1 - θ^{2(h+1)}), h ≥ 1.
For an invertible MA(q), no finite AR representation exists; hence, the PACF will never cut off, as it does in the case of an AR(p).
Because an invertible ARMA model has an infinite AR representation, the PACF tails off (does not cut off).
The PACF for MA models behaves much like the ACF for AR models (tails off). Also, the PACF for AR models behaves much like the ACF for MA models: they cut off, after lag p for the AR(p) and after lag q for the MA(q), respectively.
155 of 189

Testing Randomness Based on Individual Autocorrelations
To determine whether the values of the ACF, or the PACF, are negligible, we can use the approximation that they both have standard deviation 1/√n. At the 5% significance level, the approximate confidence bounds are ±1.96/√n ≈ ±2/√n; values outside this range can be regarded as significant.
In R, the acf() and pacf() functions can compute and plot the sample ACF and the sample PACF respectively, where the confidence bounds are shown as blue dotted lines.
Testing randomness based on individual ACF or PACF values is misleading because the ρ(h), h = 1, 2, ..., are not independent. This may produce a large number of ρ̂(h) exceeding ±2/√n even if the underlying time series is a white noise series. We prefer the portmanteau test statistic for testing randomness (as we will see later).
156 of 189

Plot of Autocorrelation & Partial Autocorrelation Functions
Consider the regular time series data available in R under the name lh. The data are 48 samples giving the luteinizing hormone in blood samples taken at 10-minute intervals from a human female.
R> par(mfrow=c(1,2))
R> acf(lh)  ## Try this code: acf(lh, plot=FALSE)
R> pacf(lh) ## Try this code: pacf(lh, plot=FALSE)
[Figure: sample ACF (left) and sample PACF (right) of the lh series.]
157 of 189

Random Walk Process
Definition
The process X_t = X_{t-1} + ɛ_t, where ɛ_t is a white noise WN(0, σ²) for all t ≥ 1 (i.e., an AR(1) process with φ = 1), is called a random walk (unit-root) process.
In words, the value of X in period t is equal to its value in period t-1, plus a random step due to the white-noise shock ɛ_t.
158 of 189

Random Walk Process is Not Stationary
The process X_t = X_{t-1} + ɛ_t is not stationary, as seen below. Consider X_0 = a, where a is a constant:
X_1 = X_0 + ɛ_1 = a + ɛ_1
X_2 = X_1 + ɛ_2 = a + ɛ_1 + ɛ_2
X_3 = X_2 + ɛ_3 = a + ɛ_1 + ɛ_2 + ɛ_3
...
X_t = X_{t-1} + ɛ_t = a + Σ_{i=1}^t ɛ_i
Although the expected value E(X_t) = a is independent of time, the autocovariance and autocorrelation are functions of time (see the next slide!).
159 of 189

Random Walk Process is Not Stationary
The autocovariance function is:
γ_X(h) = Cov(X_t, X_{t+h}) = Cov(a + Σ_{i=1}^t ɛ_i, a + Σ_{j=1}^{t+h} ɛ_j) = tσ².
The autocorrelation function is:
ρ_X(h) = Cov(X_t, X_{t+h}) / √(Var(X_t)Var(X_{t+h})) = tσ² / √(tσ² (t+h)σ²) = 1/√(1 + h/t).
Note that, for large t with h considerably less than t, ρ_X(h) is nearly 1; hence, the correlogram for a random walk is characterised by positive autocorrelations with very slow decay down from unity.
160 of 189
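The very slow decay is easy to see by simulation (a small sketch, not in the original slides):
R> set.seed(42)
R> rw <- cumsum(rnorm(500))   # random walk: X_t = X_{t-1} + e_t, starting from 0
R> acf(rw, lag.max = 50)      # sample ACF starts near 1 and decays very slowly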

Test for Non-stationarity: Dickey-Fuller Test Statistics
For a simple AR(1) model X_t = φX_{t-1} + ɛ_t, a unit root is present if φ = 1. The model would be non-stationary in this case. The regression model can be written as
∆X_t = (φ - 1)X_{t-1} + ɛ_t = δX_{t-1} + ɛ_t,
where ∆ is the first difference operator. This model can be estimated, and testing for a unit root is equivalent to testing
H_0: δ = 0 (or φ = 1) vs. H_1: δ < 0 (or φ < 1).
The test is a one-sided left-tail test.
161 of 189

Dickey-Fuller Unit Root Test Statistics
The Dickey-Fuller test statistics were derived (a modified t-distribution) to test whether a unit root is present in an AR model. There are 3 versions of the Dickey-Fuller unit root test:
1 Test for a unit root without drift (constant) and without trend: ∆X_t = δX_{t-1} + ɛ_t.
2 Test for a unit root with drift only (no trend): ∆X_t = a_0 + δX_{t-1} + ɛ_t.
3 Test for a unit root with drift and deterministic time trend: ∆X_t = a_0 + a_1 t + δX_{t-1} + ɛ_t.
162 of 189
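In R, one commonly used implementation is ur.df() from the urca package, whose type argument ("none", "drift", "trend") corresponds to the three regressions above; tseries::adf.test() is an alternative. A hedged sketch, assuming urca is installed (not part of the original slides):
R> library(urca)
R> set.seed(7)
R> rw <- cumsum(rnorm(250))             # simulated random walk (true unit root)
R> summary(ur.df(rw, type = "none"))    # compare the tau statistic with the tabulated critical values
R> summary(ur.df(rw, type = "drift"))
R> summary(ur.df(rw, type = "trend"))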

Test for Non-stationarity: Dickey-Fuller Test Statistic
If X_t is stationary (i.e., |φ| < 1), then it can be shown that φ̂ ≈ N(φ, (1 - φ²)/n).
Under the null hypothesis this result gives φ̂ ≈ N(1, 0), which clearly does not make any sense. Phillips (1987) showed that the sample moments of X_t converge to random functions of Brownian motion, so that
n(φ̂ - 1) →_d [∫_0^1 W(r) dW(r)] / [∫_0^1 W(r)² dr],
i.e. φ̂ is not asymptotically normally distributed. Consequently, the critical values of the test were computed by simulation. We reject H_0 if n(φ̂ - 1) is less than the tabulated critical value.
163 of 189

Critical Values for the Dickey-Fuller Unit Root Statistics
[Table of Dickey-Fuller critical values not reproduced in this transcription.]
164 of 189


More information

STAT 436 / Lecture 16: Key

STAT 436 / Lecture 16: Key STAT 436 / 536 - Lecture 16: Key Modeling Non-Stationary Time Series Many time series models are non-stationary. Recall a time series is stationary if the mean and variance are constant in time and the

More information

MATH 5075: Time Series Analysis

MATH 5075: Time Series Analysis NAME: MATH 5075: Time Series Analysis Final For the entire test {Z t } WN(0, 1)!!! 1 1) Let {Y t, t Z} be a stationary time series with EY t = 0 and autocovariance function γ Y (h). Assume that a) Show

More information

MAT 3379 (Winter 2016) FINAL EXAM (SOLUTIONS)

MAT 3379 (Winter 2016) FINAL EXAM (SOLUTIONS) MAT 3379 (Winter 2016) FINAL EXAM (SOLUTIONS) 15 April 2016 (180 minutes) Professor: R. Kulik Student Number: Name: This is closed book exam. You are allowed to use one double-sided A4 sheet of notes.

More information

5 Transfer function modelling

5 Transfer function modelling MSc Further Time Series Analysis 5 Transfer function modelling 5.1 The model Consider the construction of a model for a time series (Y t ) whose values are influenced by the earlier values of a series

More information

Lecture 1: Fundamental concepts in Time Series Analysis (part 2)

Lecture 1: Fundamental concepts in Time Series Analysis (part 2) Lecture 1: Fundamental concepts in Time Series Analysis (part 2) Florian Pelgrin University of Lausanne, École des HEC Department of mathematics (IMEA-Nice) Sept. 2011 - Jan. 2012 Florian Pelgrin (HEC)

More information

Financial Time Series Analysis Week 5

Financial Time Series Analysis Week 5 Financial Time Series Analysis Week 5 25 Estimation in AR moels Central Limit Theorem for µ in AR() Moel Recall : If X N(µ, σ 2 ), normal istribute ranom variable with mean µ an variance σ 2, then X µ

More information

covariance function, 174 probability structure of; Yule-Walker equations, 174 Moving average process, fluctuations, 5-6, 175 probability structure of

covariance function, 174 probability structure of; Yule-Walker equations, 174 Moving average process, fluctuations, 5-6, 175 probability structure of Index* The Statistical Analysis of Time Series by T. W. Anderson Copyright 1971 John Wiley & Sons, Inc. Aliasing, 387-388 Autoregressive {continued) Amplitude, 4, 94 case of first-order, 174 Associated

More information

Decision 411: Class 3

Decision 411: Class 3 Decision 411: Class 3 Discussion of HW#1 Introduction to seasonal models Seasonal decomposition Seasonal adjustment on a spreadsheet Forecasting with seasonal adjustment Forecasting inflation Poor man

More information

Regression of Time Series

Regression of Time Series Mahlerʼs Guide to Regression of Time Series CAS Exam S prepared by Howard C. Mahler, FCAS Copyright 2016 by Howard C. Mahler. Study Aid 2016F-S-9Supplement Howard Mahler hmahler@mac.com www.howardmahler.com/teaching

More information

MCMC analysis of classical time series algorithms.

MCMC analysis of classical time series algorithms. MCMC analysis of classical time series algorithms. mbalawata@yahoo.com Lappeenranta University of Technology Lappeenranta, 19.03.2009 Outline Introduction 1 Introduction 2 3 Series generation Box-Jenkins

More information

6 NONSEASONAL BOX-JENKINS MODELS

6 NONSEASONAL BOX-JENKINS MODELS 6 NONSEASONAL BOX-JENKINS MODELS In this section, we will discuss a class of models for describing time series commonly referred to as Box-Jenkins models. There are two types of Box-Jenkins models, seasonal

More information