Stochastic processes
Models for time series

- Time series are an example of a stochastic (random) process.
- A stochastic process is 'a statistical phenomenon that evolves in time according to probabilistic laws'.
- Mathematically, a stochastic process is an indexed collection of random variables {X_t : t in T}.

Spring 3 Searching Big Data

Specification of a process

- To describe a stochastic process fully, we must specify all finite-dimensional distributions, i.e. the joint distribution of the random variables for any finite set of times {t_1, t_2, t_3, ..., t_n}.
- A simpler approach is to specify only the moments; this is sufficient if all the joint distributions are normal (which requires only the first two moments).
- The mean and variance functions are given by

    µ_t = E(X_t)   and   σ²_t = Var(X_t)
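The mean and variance functions above can be estimated from an ensemble of independent realizations of a process. A minimal numpy sketch (our own illustration, not from the slides), using a random walk so that the variance function visibly depends on t:

```python
import numpy as np

# Estimate the mean and variance functions mu_t = E(X_t) and sigma_t^2 = Var(X_t)
# from an ensemble of independent realizations.  The process here is a random
# walk, whose variance function grows with t.
rng = np.random.default_rng(0)
reps, T = 4000, 50
paths = np.cumsum(rng.normal(size=(reps, T)), axis=1)

mu_t = paths.mean(axis=0)    # mean function: near 0 for every t
var_t = paths.var(axis=0)    # variance function: grows roughly like t
```

Averaging across realizations (axis 0) estimates the moment functions at each fixed t, which is exactly what the moment specification describes.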
(Auto)covariance

- Because the random variables comprising the process are not independent, we can compute their covariance:

    γ_{t1,t2} = Cov(X_{t1}, X_{t2})

Stationarity

- Modeling is simpler when a process is stationary: its joint distribution does not change over time. This is strict stationarity.
- A process is weakly stationary if its mean and covariance do not change over time.

Weak stationarity

- The mean and variance are constant, and the (auto)covariance depends only on the time difference, or lag, between the two time points involved:

    µ_t = µ,   σ²_t = σ²,   and
    γ_{t1,t2} = Cov(X_{t1}, X_{t2}) = Cov(X_{t1+τ}, X_{t2+τ}) = γ_{t1+τ,t2+τ},
    so γ_{t1,t2} depends only on the lag t2 - t1

(Auto)correlation

- It may be useful to standardize the (auto)covariance function (acvf). The autocorrelation function (acf) is

    ρ_τ = γ_τ / γ_0

- For Gaussian processes, weak stationarity implies strict stationarity.
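For a weakly stationary series, the acvf and acf can be estimated from a single realization. A minimal sketch (the helper name `sample_acf` and the choice of the standard biased estimator are ours, not from the slides):

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation rho_k = gamma_k / gamma_0 for k = 0..max_lag,
    using the standard biased acvf estimator with the sample mean removed."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    gamma0 = d @ d / n
    return np.array([(d[: n - k] @ d[k:]) / n / gamma0 for k in range(max_lag + 1)])

rng = np.random.default_rng(1)
r = sample_acf(rng.normal(size=5000), 3)   # white noise: acf ~ 0 beyond lag 0
```

By construction r[0] = 1, and for an independent sequence the remaining lags should be close to zero.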
White noise

- A sequence of independent and identically distributed random variables with zero mean. Gaussian white noise is a special case.
- Thus,

    γ_k = Cov(Z_t, Z_{t+k}) = 0 for k ≠ 0,   and   ρ_k = 1 if k = 0, ρ_k = 0 if k ≠ 0

[Figure: Simulated Gaussian white noise time series.]

Random walk

- Start with {Z_t} being white noise. {X_t} is a random walk if X_0 = 0 and

    X_t = X_{t-1} + Z_t = Σ_{k=1}^{t} Z_k

- A random walk is not stationary:

    E(X_t) = 0,   Var(X_t) = tσ²

- First differences are stationary:

    X_t - X_{t-1} = Z_t
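A quick numerical illustration of the random-walk facts above (variance growing with t, so the process is nonstationary, while first differences recover the white noise); a sketch of ours, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(42)
reps, t = 2000, 200
z = rng.normal(size=(reps, t))        # white noise, sigma = 1
x = np.cumsum(z, axis=1)              # random walks: X_t = X_{t-1} + Z_t

var_end = x[:, -1].var()              # theory: Var(X_t) = t * sigma^2 = 200
diff = np.diff(x, axis=1)             # first differences give back the noise
ok = np.allclose(diff, z[:, 1:])
```

The cumulative sum is exactly the partial-sum definition of the walk, so differencing undoes it exactly; the end-point variance across the ensemble should sit near t * σ².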
Moving Average Processes

- Let {Z_t} be white noise with mean zero and standard deviation σ.
- {X_t} is a moving average process of order q (written MA(q)) if, for some constants θ_1, ..., θ_q, we have

    X_t = Z_t + θ_1 Z_{t-1} + ... + θ_q Z_{t-q}

- For Gaussian white noise, this is also strictly stationary.

[Figure 3.5: Two simulated MA(1) processes, both from the white noise shown earlier, but with different parameter sets.]
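A compact way to simulate an MA(q) process is to convolve the white noise with the coefficient vector (1, θ_1, ..., θ_q). A sketch (the name `simulate_ma` is ours):

```python
import numpy as np

def simulate_ma(theta, n, rng):
    """Simulate n observations of x_t = z_t + theta_1 z_{t-1} + ... + theta_q z_{t-q}
    driven by standard Gaussian white noise."""
    theta = np.asarray(theta, dtype=float)
    q = len(theta)
    z = rng.normal(size=n + q)                    # q extra values for the initial lags
    # Convolving z with (1, theta_1, ..., theta_q) applies the MA filter;
    # mode="valid" keeps only the n fully-determined outputs.
    return np.convolve(z, np.r_[1.0, theta], mode="valid")

rng = np.random.default_rng(1)
x = simulate_ma([0.5], 10000, rng)                # MA(1): Var = (1 + 0.5^2) = 1.25
```

With θ = (0.5,), the sample variance should land near (1 + θ²)σ² = 1.25, matching the variance formula on the next slide.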
Moving Average processes

- The mean and variance are given by

    E(X_t) = 0,   Var(X_t) = (1 + θ_1² + ... + θ_q²) σ²

- The process is weakly stationary because the mean is constant and the covariance does not depend on t.

Proof for MA(1)

    cov(X_t, X_{t+τ}) = cov(Z_t + θZ_{t-1}, Z_{t+τ} + θZ_{t+τ-1})
      = E{(Z_t + θZ_{t-1})(Z_{t+τ} + θZ_{t+τ-1})} - E(Z_t + θZ_{t-1}) E(Z_{t+τ} + θZ_{t+τ-1})
      = E(Z_t Z_{t+τ}) + θ E(Z_t Z_{t+τ-1}) + θ E(Z_{t-1} Z_{t+τ}) + θ² E(Z_{t-1} Z_{t+τ-1})

  Now, taking various values for the lag τ, we obtain

    cov(X_t, X_{t+τ}) = E(Z_t²) + θ² E(Z_{t-1}²) = (1 + θ²)σ²   if τ = 0
                      = θ E(Z_t²) = θσ²                          if τ = ±1
                      = 0                                        if |τ| > 1

- How about the autocorrelation? Dividing by γ(0) gives ρ(1) = θ/(1 + θ²), ρ(τ) = 0 for |τ| > 1.

Inverting Moving Average processes

- Can we find the Z's by observing the X's for an MA process? Yes, if we impose the condition of invertibility.
- For the MA(1) process, the condition is |θ| < 1.
- For the general case MA(q), introduce the backward shift operator B:

    B^j X_t = X_{t-j}

  Then the MA(q) process is given by

    X_t = (1 + θ_1 B + θ_2 B² + ... + θ_q B^q) Z_t = θ(B) Z_t
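The MA(1) covariances derived above are easy to check empirically. A sketch assuming θ = 0.5, for which ρ(1) = θ/(1 + θ²) = 0.4 and γ(k) = 0 beyond lag 1:

```python
import numpy as np

theta, sigma, n = 0.5, 1.0, 200000
rng = np.random.default_rng(2)
z = rng.normal(scale=sigma, size=n + 1)
x = z[1:] + theta * z[:-1]             # MA(1): x_t = z_t + theta * z_{t-1}

def acvf(x, k):
    """Sample autocovariance at lag k (biased estimator)."""
    d = x - x.mean()
    return (d[: len(x) - k] @ d[k:]) / len(x)

rho1 = acvf(x, 1) / acvf(x, 0)         # theory: theta / (1 + theta^2) = 0.4
gamma2 = acvf(x, 2)                    # theory: 0 for |lag| > 1
```

The abrupt cut-off of the acvf after lag q is the signature of an MA(q) process.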
Inverting Moving Average processes

- The general condition for invertibility is that all the roots of θ(B) lie outside the unit circle.
- An MA process is inverted into an AR process, and vice versa.

Autoregressive Processes

- The autoregressive process of order p, AR(p):

    X_t = φ_1 X_{t-1} + ... + φ_p X_{t-p} + Z_t

- Assume for simplicity that the mean of X_t is 0; otherwise, a constant is added to the right-hand side.

AR(1)

    X_t = φ X_{t-1} + Z_t
    (1 - φB) X_t = Z_t
    X_t = (1 - φB)^{-1} Z_t = (1 + φB + φ²B² + ...) Z_t = Z_t + φ Z_{t-1} + φ² Z_{t-2} + ...

- B is the backshift operator. In general,

    φ(B) = 1 - φ_1 B - φ_2 B² - ...,   φ(B) X_t = Z_t,   X_t = φ(B)^{-1} Z_t

- Provided |φ| < 1, the AR(1) process can be written as an infinite-order MA process.
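Invertibility in practice: for an MA(1) with |θ| < 1, the shocks can be recovered from the observations by the recursion z_t = x_t - θ z_{t-1}. A small sketch (our own illustration, taking z_{-1} = 0):

```python
import numpy as np

theta, n = 0.5, 1000
rng = np.random.default_rng(3)
z = rng.normal(size=n)

# MA(1) with z_{-1} taken as 0: x_t = z_t + theta * z_{t-1}.
x = z.copy()
x[1:] += theta * z[:-1]

# Invert recursively: z_t = x_t - theta * z_{t-1}.
# |theta| < 1 keeps the recursion from amplifying errors.
z_hat = np.empty(n)
z_hat[0] = x[0]
for t in range(1, n):
    z_hat[t] = x[t] - theta * z_hat[t - 1]

recovered = np.allclose(z_hat, z)
```

With |θ| > 1 the same recursion would blow up any small error geometrically, which is exactly why invertibility requires the root of θ(B) to lie outside the unit circle.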
AR(1)

    E(X_t) = 0
    γ(τ) = σ² Σ_{j=0}^{∞} φ^j φ^{j+τ} = σ² φ^τ Σ_{j=0}^{∞} φ^{2j} = σ² φ^τ / (1 - φ²)
    γ(0) = σ² / (1 - φ²)
    ρ(τ) = γ(τ) / γ(0) = φ^τ

[Figure 3.7: Simulated AR(1) processes for φ = .9 (top) and φ = -.9 (bottom).]
[Figure 3.8: Sample ACF for (a) x_t = .9 x_{t-1} + z_t and (b) x_t = -.9 x_{t-1} + z_t.]
[Figure 3.9: Simulated AR(1) processes for φ = .5 (top) and φ = -.5 (bottom).]
[Figure: Sample ACF for (a) x_t = .5 x_{t-1} + z_t and (b) x_t = -.5 x_{t-1} + z_t.]

AR(p)

    X_t = φ_1 X_{t-1} + ... + φ_p X_{t-p} + Z_t
    Z_t = (1 - φ_1 B - φ_2 B² - ... - φ_p B^p) X_t = φ(B) X_t

- An AR(p) process can be inverted into an MA process provided the roots λ_1, λ_2, ... of φ(B) lie outside the unit circle. As a result, E(X_t) = 0.
- Find the autocovariances using recurrences called the Yule-Walker equations.
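The AR(1) formulas can be verified by simulation. A sketch with φ = 0.9 and σ = 1, so γ(0) = 1/(1 - 0.81) ≈ 5.26, ρ(1) = 0.9, ρ(2) = 0.81 (seed and setup are ours):

```python
import numpy as np

rng = np.random.default_rng(4)
phi, sigma, n = 0.9, 1.0, 300000
z = rng.normal(scale=sigma, size=n)

# Simulate AR(1), starting from the stationary distribution so the whole
# series is stationary from t = 0.
x = np.empty(n)
x[0] = z[0] / np.sqrt(1 - phi ** 2)
for t in range(1, n):
    x[t] = phi * x[t - 1] + z[t]

v = x.var()                       # theory: sigma^2 / (1 - phi^2) ~= 5.26
d = x - x.mean()
r1 = (d[:-1] @ d[1:]) / (d @ d)   # theory: rho(1) = phi = 0.9
r2 = (d[:-2] @ d[2:]) / (d @ d)   # theory: rho(2) = phi^2 = 0.81
```

The geometric decay ρ(τ) = φ^τ, in contrast to the hard cut-off of an MA acf, is what the sample ACF plots in the figures above display.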
Autoregressive and Moving Average (ARMA) Processes

ARMA processes

- Combine AR and MA processes. In general, for an ARMA(p,q) process, there will be p lagged values of X and q lagged values of Z:

    ARMA(1,1):   X_t = φ_1 X_{t-1} + θ_1 Z_{t-1} + Z_t
    ARMA(p,q):   X_t = φ_1 X_{t-1} + ... + φ_p X_{t-p} + θ_1 Z_{t-1} + ... + θ_q Z_{t-q} + Z_t

- ARMA(p, 0) = AR(p), ARMA(0, q) = MA(q).

Alternative formulation of ARMA processes

- Alternative expression using the backshift operator:

    φ(B) X_t = θ(B) Z_t,  where
    φ(B) = 1 - φ_1 B - ... - φ_p B^p
    θ(B) = 1 + θ_1 B + ... + θ_q B^q

ARMA(1,1)

    (1 - φB) X_t = (1 + θB) Z_t
    X_t = (1 - φB)^{-1} (1 + θB) Z_t
        = (1 + φB + φ²B² + ...)(1 + θB) Z_t,  assuming |φ| < 1
        = (1 + (φ+θ)B + (φ+θ)φB² + (φ+θ)φ²B³ + (φ+θ)φ³B⁴ + ...) Z_t
        = Z_t + (φ+θ) Σ_{j=1}^{∞} φ^{j-1} Z_{t-j}

- This is an (infinite-order) MA process.
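The MA(∞) expansion above, with ψ_0 = 1 and ψ_j = (φ+θ)φ^{j-1} for j ≥ 1, can be checked against the recursive definition directly. A sketch of ours, taking z_j = 0 for j < 0:

```python
import numpy as np

phi, theta, n = 0.5, 0.3, 500
rng = np.random.default_rng(5)
z = rng.normal(size=n)

# Recursive ARMA(1,1): x_t = phi * x_{t-1} + theta * z_{t-1} + z_t,
# with z_j = 0 for j < 0, so x_0 = z_0.
x = np.zeros(n)
x[0] = z[0]
for t in range(1, n):
    x[t] = phi * x[t - 1] + theta * z[t - 1] + z[t]

# MA(infinity) form: psi_0 = 1 and psi_j = (phi + theta) * phi**(j-1) for j >= 1.
psi = np.r_[1.0, (phi + theta) * phi ** np.arange(n - 1)]
x_ma = np.array([psi[: t + 1] @ z[t::-1] for t in range(n)])

recovered = np.allclose(x, x_ma)
```

The two constructions agree term by term, which is exactly what the backshift-operator expansion asserts.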
Inverting ARMA(1,1)

    (1 - φB) X_t = (1 + θB) Z_t
    Z_t = (1 + θB)^{-1} (1 - φB) X_t
        = (1 - θB + θ²B² - ...)(1 - φB) X_t,  assuming |θ| < 1
        = (1 - (φ+θ)B + (φ+θ)θB² - (φ+θ)θ²B³ + (φ+θ)θ³B⁴ - ...) X_t
        = X_t - (φ+θ) Σ_{j=1}^{∞} (-θ)^{j-1} X_{t-j}

- Thus, we need both |φ| and |θ| to be < 1.

ARMA(1,1)

    E(X_t) = 0
    γ(0) = σ² [1 + (φ+θ)² / (1 - φ²)]
    γ(1) = σ² [(φ+θ) + (φ+θ)² φ / (1 - φ²)]
    γ(τ) = φ^{τ-1} γ(1),  τ ≥ 1
    ρ(1) = γ(1)/γ(0) = (φ+θ)(1+φθ) / (1 + 2φθ + θ²)
    ρ(τ) = φ^{τ-1} ρ(1),  τ ≥ 1

[Figure: Simulated ARMA(1,1) processes for φ = ±.9 and θ = ±.5.]

Autoregressive Integrated Moving Average (ARIMA) Processes
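The ρ(1) formula can also be confirmed by simulation. A sketch assuming φ = 0.5 and θ = 0.3, for which ρ(1) = (0.8)(1.15)/1.39 ≈ 0.662:

```python
import numpy as np

phi, theta, sigma = 0.5, 0.3, 1.0
rng = np.random.default_rng(6)
n = 200000
z = rng.normal(scale=sigma, size=n)

# ARMA(1,1): x_t = phi * x_{t-1} + theta * z_{t-1} + z_t.
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + theta * z[t - 1] + z[t]

d = x - x.mean()
rho1 = (d[:-1] @ d[1:]) / (d @ d)

rho1_theory = (phi + theta) * (1 + phi * theta) / (1 + 2 * phi * theta + theta ** 2)
```

Beyond lag 1 the acf decays like φ^{τ-1} ρ(1), i.e. AR(1)-style geometric decay after an MA-style adjustment at lag 1.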
Autoregressive Integrated Moving Average Model

- To use an ARMA model, the time series must be stationary.
- Many series must be integrated (differenced) to make them stationary. We write these series as I(d), where d = number of differences needed to achieve stationarity.
- If we model the differenced series as an ARMA(p,q) model, we get an ARIMA(p,d,q) model, where p = order of the autoregressive part, d = degree of integration, and q = order of the moving average part.

ARIMA processes

- Call the differenced process W_t. Then W_t is an ARMA process, and

    W_t = ∇^d X_t = (1 - B)^d X_t

- This is called an ARIMA(p,d,q) process. d is often 1.
- A random walk is ARIMA(0,1,0).

Seasonal ARIMA Models

- For some nonstationary series, plain ARIMA models cannot be used. The most common such series have seasonal trends.
- Example: the data for a particular hour in a month-long trace is typically correlated with the hours preceding it as well as with the same hour in preceding days.
- We can deal with them using seasonal ARIMA models, referred to as ARIMA(p,d,q) x (P,D,Q)_s.
- Idea: fit two models, one for the entire time series and another only for data points that are s units apart.

Box-Jenkins Methodology

- Identification: find the values of p, d, q for the series (using the autocorrelation and partial autocorrelation functions).
- Estimation: estimate the parameters of the model.
- Diagnostic checking: how well does the model fit the series? (Check that the residuals resemble white noise.)
- Forecasting: usually good for short-term forecasting.
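Differencing in code: (1 - B)^d corresponds to `np.diff(x, n=d)`. A short sketch of ours showing that one difference turns a random walk (ARIMA(0,1,0)) back into white noise, while a quadratic trend needs d = 2:

```python
import numpy as np

rng = np.random.default_rng(7)
z = rng.normal(size=1000)
x = np.cumsum(z)                       # random walk: an ARIMA(0,1,0) process

# One difference, w_t = (1 - B) x_t, recovers the white noise.
w = np.diff(x)
ok1 = np.allclose(w, z[1:])

# A quadratic trend needs d = 2: (1 - B)^2 removes it completely.
trend = 0.5 * np.arange(1000.0) ** 2
ok2 = np.allclose(np.diff(trend, n=2), 1.0)
```

In practice d is chosen as the smallest number of differences after which the series looks stationary; over-differencing only inflates the variance.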
Overfitting

- Suppose an AR model of lower order is appropriate. If we fit an AR(3) model, we obtain a better fit to the training data; however, the AR(3) model may be worse for prediction. Models need to be parsimonious.
- The Akaike Information Criterion (AIC) is minus twice the maximized log-likelihood plus twice the number of parameters. The number-of-parameters term penalizes models with too many parameters.

References

Shumway, R. and Stoffer, D. Time Series Analysis and Its Applications: With R Examples, 3rd ed. Springer, 2011.
Hannan, E. J. Multiple Time Series. Wiley Series in Probability and Mathematical Statistics. New York: John Wiley and Sons, 1970.
Whittle, P. Hypothesis Testing in Time Series Analysis. Almqvist and Wiksell, 1951.
Whittle, P. Prediction and Regulation. English Universities Press, 1963.
Hannan, E. J. and Deistler, M. The Statistical Theory of Linear Systems. Wiley Series in Probability and Mathematical Statistics. New York: John Wiley and Sons, 1988.
Box, G., Jenkins, G. M., and Reinsel, G. C. Time Series Analysis: Forecasting and Control, 3rd ed. Prentice-Hall, 1994.
Brockwell, P. J. and Davis, R. A. Time Series: Theory and Methods, 2nd ed. Springer, 1991.
Percival, D. B. and Walden, A. T. Spectral Analysis for Physical Applications. Cambridge University Press, 1993.
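A rough sketch of AIC-based order selection. Everything here is our own illustration: the fit is conditional least squares, and the score is the common Gaussian approximation n·log(RSS/n) + 2(p+1), not the exact maximum-likelihood AIC:

```python
import numpy as np

def ar_aic(x, p):
    """Fit AR(p) by conditional least squares and return an AIC-style score:
    m * log(RSS / m) + 2 * (p + 1).  This is the usual Gaussian approximation,
    not the exact maximum-likelihood AIC."""
    n = len(x)
    # Design matrix of lagged values: column k holds x_{t-k-1} for t = p..n-1.
    X = np.column_stack([x[p - k - 1 : n - k - 1] for k in range(p)])
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ coef) ** 2))
    m = len(y)
    return m * np.log(rss / m) + 2 * (p + 1)

rng = np.random.default_rng(8)
n = 5000
z = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.6 * x[t - 1] + z[t]          # true model is AR(1)

scores = {p: ar_aic(x, p) for p in (1, 2, 3)}
best = min(scores, key=scores.get)        # AIC usually favours the parsimonious AR(1)
```

Higher orders always reduce the residual sum of squares on the training data; the 2(p+1) penalty is what usually stops AIC from rewarding that in-sample improvement.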