STAT 248: EDA & Stationarity
Handout 3
GSI: Gido van de Ven
September 17th, 2010

1 Introduction

In today's section we will deal with the following topics: the mean function, the auto- and cross-covariance functions, stationarity, the sample acf and the correlogram. Useful readings are chapters 1 and 2 of Shumway and Stoffer's book, and chapters 2 and 3 of Cryer and Chan's book. Note that both books are freely available online through the university library system.

2 Stationarity

Sometimes we may wish to specify the collection of joint distributions of all finite-dimensional vectors (X_{t_1}, X_{t_2}, ..., X_{t_n}), t = (t_1, ..., t_n) ∈ T^n, n ∈ {1, 2, ...}. In such a case we need to be sure that a stochastic process with the specified distributions really does exist. Kolmogorov's theorem guarantees that this is true under minimal conditions on the specified distribution functions.

We are going to describe a time series by its first and second moments: the mean function and the covariance function, respectively. You can think of the covariance function as the average cross-product relative to the joint distribution of (X_r, X_s).

Definition: [mean function] The mean function, provided it exists, is defined as

    µ_{Xt} = E(X_t) = ∫ x f_t(x) dx,    (1)

where F_t(x) = P(X_t ≤ x) and f_t(x) = ∂F_t(x)/∂x.

Definition: [autocovariance function (acvf)] If {X_t, t ∈ T} is a process such that Var(X_t) < ∞ for each t ∈ T, then its autocovariance function γ_X(·, ·) is defined as

    γ_X(r, s) = Cov(X_r, X_s) = E[(X_r − E(X_r))(X_s − E(X_s))],    r, s ∈ T.    (2)

The autocovariance function measures the linear dependence between two points on the same series observed at different times. If X_r and X_s are independent then γ_X(r, s) = 0. However, if γ_X(r, s) = 0 then X_r and X_s are NOT necessarily independent.

Remark: What happens when s = r?
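The warning after equation (2), that zero covariance does not imply independence, is easy to see numerically. Below is a minimal sketch in Python with NumPy (used here purely for illustration; the handout's own examples are in R, and the variable names are mine): a symmetric random variable and its own square are perfectly dependent, yet their covariance is essentially zero.

```python
import numpy as np

# Illustrative example (not from the handout): take X symmetric around 0
# and Y = X^2. Y is a deterministic function of X, so the two are clearly
# dependent -- yet Cov(X, Y) = E[X^3] = 0 by symmetry.
rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)
y = x**2

cov_xy = np.cov(x, y)[0, 1]
print(round(cov_xy, 3))  # essentially zero, despite full dependence
```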
Definition: [(weak) stationarity] The time series {X_t, t ∈ Z} with index set Z = {0, ±1, ±2, ...} is said to be (weakly) stationary if:

1. E|X_t|² < ∞ for all t ∈ Z
2. E(X_t) = µ for all t ∈ Z
3. γ_X(r, s) = γ_X(r + t, s + t) for all r, s, t ∈ Z

Intuitively, a time series is stationary if it has finite variance, its mean is constant and does not depend on time, and its covariance function depends on r and s only through the difference r − s. Stationary processes play a crucial role in the analysis of time series. Of course, many observed time series are decidedly non-stationary in appearance. Frequently such data sets can be transformed into series which can reasonably be modelled as realizations of some stationary process. The theory of stationary processes is then used for the analysis, fitting and prediction of the resulting series. In all of this the autocovariance function is a primary tool.

Remark 1: Stationarity as defined here is frequently referred to in the literature as weak stationarity, covariance stationarity, stationarity in the wide sense, or second-order stationarity. When we refer to stationarity, we will mean the three properties in the definition above.

Remark 2: If {X_t, t ∈ Z} is stationary then γ_X(r, s) = γ_X(r − s, 0) for all r, s ∈ Z. It is therefore convenient to redefine the autocovariance function of a stationary process as a function of just one variable:

    γ_X(h) = γ_X(h, 0) = Cov(X_{t+h}, X_t),    t, h ∈ Z.    (3)

The function γ_X(·) will be referred to as the autocovariance function of {X_t}, and γ_X(h) as its value at lag h. Note that γ_X(s, t) = γ_X(t, s) for all points s and t.

Elementary properties of the autocovariance function. If γ(·)
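To make condition 2 of the definition concrete, here is a small simulation sketch (Python/NumPy, purely illustrative; the setup is my own, not from the handout). It estimates E(X_t) at each time point across many independent realizations: white noise passes the constant-mean check, while the same noise plus a deterministic trend has a mean function that depends on t, so it cannot be stationary.

```python
import numpy as np

rng = np.random.default_rng(1)

# Estimate E(X_t) across many independent realizations (rows = realizations,
# columns = time points), rather than along a single path.
n_reps, n_time = 20_000, 6
z = rng.standard_normal((n_reps, n_time))  # white noise: stationary
x = np.arange(n_time) + z                  # deterministic trend added: NOT stationary

mean_z = z.mean(axis=0)  # roughly (0, 0, ..., 0): constant in t
mean_x = x.mean(axis=0)  # roughly (0, 1, 2, ...): depends on t
print(np.round(mean_z, 2))
print(np.round(mean_x, 2))
```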
is the autocovariance function of a stationary process {X_t, t ∈ Z}, then:

    γ(0) ≥ 0
    |γ(h)| ≤ γ(0) for all h ∈ Z
    γ(h) = γ(−h) for all h ∈ Z

Definition: [strict stationarity] The time series {X_t, t ∈ Z} is said to be strictly stationary if the joint distributions of (X_{t_1}, ..., X_{t_k}) and (X_{t_1+h}, ..., X_{t_k+h}) are the same for all positive integers k and for all t_1, ..., t_k, h ∈ Z.

Strict stationarity means intuitively that the graphs over two equal-length time intervals of a realization of the time series should exhibit similar statistical characteristics. For example, the proportion of ordinates not exceeding a given level x should be roughly the same for both intervals.

Remark 1: The previous definition is equivalent to the statement that (X_1, ..., X_k) and (X_{1+h}, ..., X_{k+h}) have the same joint distribution for all positive integers k and integers h.

Remark 2: A strictly stationary process with finite second moments is stationary. The converse is not necessarily true.
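The three elementary properties above can be checked numerically on any stationary series. Below is a sketch (Python/NumPy; the MA(1)-style series and the function name are my own illustration, not from the handout) using X_t = Z_t + 0.5 Z_{t−1}, whose theoretical acvf is γ(0) = 1.25, γ(±1) = 0.5 and γ(h) = 0 for |h| ≥ 2.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated stationary series: X_t = Z_t + 0.5 * Z_{t-1}, Z_t white noise.
z = rng.standard_normal(200_001)
x = z[1:] + 0.5 * z[:-1]

def sample_acvf(x, h):
    """Sample autocovariance at lag h (dividing by n; symmetric in h)."""
    n, xbar, h = len(x), x.mean(), abs(h)
    return float(np.sum((x[h:] - xbar) * (x[: n - h] - xbar)) / n)

g0, g1 = sample_acvf(x, 0), sample_acvf(x, 1)

# All three elementary properties hold:
print(g0 >= 0)                   # gamma(0) >= 0
print(abs(g1) <= g0)             # |gamma(h)| <= gamma(0)
print(sample_acvf(x, -1) == g1)  # gamma(h) = gamma(-h)
```

With this many observations, g0 and g1 also land close to the theoretical values 1.25 and 0.5.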
Remark 3: If {X_t, t ∈ Z} is a stationary Gaussian process, then {X_t} is strictly stationary, since for all n ∈ {1, 2, ...} and for all h, t_1, t_2, ..., t_n ∈ Z the random vectors (X_{t_1}, ..., X_{t_n}) and (X_{t_1+h}, ..., X_{t_n+h}) have the same mean and covariance matrix, and hence the same distribution.

Definition: [autocorrelation function (acf)] The autocorrelation function of a stationary time series is the function whose value at lag h is:

    ρ_X(h) = γ_X(h) / γ_X(0) = Corr(X_{t+h}, X_t),    t, h ∈ Z.

The Cauchy-Schwarz inequality shows that −1 ≤ ρ(h) ≤ 1 for all h. Further, ρ_X(h) = 0 if X_t and X_{t+h} are uncorrelated, and ρ_X(h) = ±1 if X_{t+h} = α_0 + α_1 X_t. The value ρ_X(h) is a rough measure of the ability to forecast the series at time t + h from the value at time t.

Definition: [cross-covariance function (ccvf)] If {X_t, t ∈ T} and {Y_t, t ∈ T} are processes such that Var(X_t) < ∞ and Var(Y_t) < ∞ for each t ∈ T, then the cross-covariance function γ_XY(·, ·) is defined as:

    γ_XY(r, s) = Cov(X_r, Y_s) = E[(X_r − E(X_r))(Y_s − E(Y_s))],    r, s ∈ T.    (4)

Of course there is also a scaled version of the cross-covariance function:

Definition: [cross-correlation function (ccf)] The cross-correlation function of {X_t, t ∈ T} and {Y_t, t ∈ T} is defined as:

    ρ_XY(s, t) = γ_XY(s, t) / √(γ_X(s, s) γ_Y(t, t)),    s, t ∈ Z.

3 Estimation of Mean and Correlation Function

As we have just one realization of our time series, the assumption of stationarity becomes critical. Somehow, we must use averages over this single realization to estimate the population mean and covariance functions. If a time series is stationary, the mean function is constant: µ_t = µ. In that case we can estimate it by the sample mean:

    µ̂ = x̄ = (1/n) Σ_{t=1}^{n} x_t.

Assuming stationarity, the autocovariance and autocorrelation functions can be estimated using the sample versions defined below.
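Before the formal definitions, the estimation idea can be sketched directly in code (Python/NumPy rather than R, purely for illustration; R's acf() below does the same job). The sketch computes the sample acf of a single white-noise realization and checks it against the rough ±2/√n significance band discussed later in this section.

```python
import numpy as np

rng = np.random.default_rng(3)

n = 400
z = rng.standard_normal(n)  # a single realization of white noise

def sample_acf(x, h):
    """Sample acf rho_hat(h) = gamma_hat(h) / gamma_hat(0), with n in the denominator."""
    xbar = x.mean()
    def gamma_hat(k):
        return np.sum((x[k:] - xbar) * (x[: len(x) - k] - xbar)) / len(x)
    return float(gamma_hat(h) / gamma_hat(0))

bound = 2 / np.sqrt(n)  # rough 95% band; here 2/sqrt(400) = 0.1
inside = [abs(sample_acf(z, h)) < bound for h in range(1, 21)]
print(sum(inside), "of 20 lags inside the band")  # typically around 19 of 20
```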
Definition: The sample autocovariance function is defined as:

    γ̂(h) = (1/n) Σ_{t=1}^{n−h} (x_{t+h} − x̄)(x_t − x̄),

with γ̂(−h) = γ̂(h), for h = 0, 1, ..., n − 1. Dividing by n (rather than by n − h) ensures that the function is nonnegative definite, although it is then not an unbiased estimate of γ(h).

Definition: The sample autocorrelation function is defined, analogously, as:

    ρ̂(h) = γ̂(h) / γ̂(0).

[Figure 1: Left: sample acf (correlogram) of Gaussian white noise. Right: sample acf of the series generated by X_t = t + Z_t, where Z_t is Gaussian white noise (i.e. X_t is white noise with a deterministic trend t). Both panels plot the ACF against lags 0 to 20.]

> white.noise = as.ts(rnorm(100))
> acf(white.noise, main = "Correlogram of 'white noise'")
> t = 1:100
> Xt = t + white.noise
> acf(Xt, main = "Correlogram of 'white noise with trend'")

The sample autocorrelation function has a sampling distribution that allows us to assess whether the data come from a completely random (white) series, or whether correlations are statistically significant at some lags.

Large-sample distribution of the acf: Under general conditions, if x_t is white noise, then for large n the sample acf ρ̂_X(h), for h = 1, 2, ..., H, where H is fixed but arbitrary, is approximately normally distributed with zero mean and standard deviation given by:

    σ_{ρ̂_X(h)} = 1/√n.

Remark 1: Based on this result, we obtain a rough method of assessing whether peaks in ρ̂(h) are significant by determining whether the observed peak is outside the interval ±2/√n; for a white noise sequence, approximately 95% of the sample acf values should be within these limits. After trying to reduce a time series to a white noise series, the acfs of the residuals should then lie roughly within the limits given above.

Remark 2: The sample autocovariance and autocorrelation functions can be computed for non-stationary processes. For data containing a trend, ρ̂(h) will exhibit slow decay as h increases, and for data with a substantial deterministic periodic component, ρ̂(h) will exhibit similar behaviour with the same periodicity.

Remark 3: The sample cross-covariance and sample cross-correlation functions are defined analogously to the sample autocovariance and sample autocorrelation functions.

3.1 R functions

mean(), acf(), acf(type = "covariance"), ccf(), ccf(type = "covariance").

4 Bibliography

This handout is based on handouts prepared by Irma Hernandez-Magallanes, a previous GSI for this course. Additional sources that were used, and that could be useful for you:

Time Series: Data Analysis and Theory by David R. Brillinger
Time Series: Theory and Methods by Peter Brockwell & Richard Davis
Time Series Analysis and Its Applications: With R Examples by Robert Shumway & David Stoffer