Lecture 11: Spectral Analysis
Methods for Estimating the Spectrum

Walid Sharabati, Purdue University
Latest update: October 27, 2016

Professor Sharabati (Purdue University), Time Series Analysis, October 27, 2016
Methods for Estimating the Spectrum: Fourier Analysis

The approximation of a function by a sum of sine and cosine terms is called the Fourier series representation. For a function $f(t)$ defined on $[-\pi, \pi]$, it is given by

\[ f(t) = \frac{a_0}{2} + \sum_{k=1}^{\infty} \left( a_k \cos kt + b_k \sin kt \right), \]

where

\[ a_0 = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t)\,dt, \qquad a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos kt \,dt, \qquad b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin kt \,dt, \qquad k = 1, 2, \ldots \]

The Fourier series converges to $f(t)$ as $k \to \infty$, except at points of discontinuity, where it converges to halfway up the step change, $\frac{1}{2}\left[ f(t-0) + f(t+0) \right]$ (the average of the limit from below and the limit from above).
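The convergence statement can be checked numerically. The following is a minimal sketch in R, assuming a square wave as the test function (this example and the helper name `fourier_partial_sum` are illustrative and not part of the lecture). For this function $a_k = 0$ and $b_k = 4/(k\pi)$ for odd $k$.

```r
# Partial Fourier sum for the square wave f(t) = -1 on (-pi, 0), +1 on (0, pi).
# For this f, a_k = 0 for all k, and b_k = 4 / (k * pi) for odd k (0 for even k).
fourier_partial_sum <- function(t, K) {
  k <- seq(1, K, by = 2)                        # odd harmonics only
  sapply(t, function(u) sum(4 / (k * pi) * sin(k * u)))
}

fourier_partial_sum(pi / 2, 2001)   # close to f(pi/2) = 1
fourier_partial_sum(0, 2001)        # 0, halfway between -1 and +1
```

At the jump $t = 0$ every sine term vanishes, so the partial sums equal the midpoint of the step for any number of terms.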
Methods for Estimating the Spectrum: Traditional Method

Consider a simple deterministic sinusoidal function at frequency $\omega$, together with a random error term $Z_t$:

\[ X_t = \mu + \alpha \cos \omega t + \beta \sin \omega t + Z_t, \]

where $Z_t$ is a purely random process. Then

\[ X_1 = \mu + \alpha \cos \omega + \beta \sin \omega + Z_1, \]
\[ X_2 = \mu + \alpha \cos 2\omega + \beta \sin 2\omega + Z_2, \]
\[ \vdots \]
\[ X_N = \mu + \alpha \cos N\omega + \beta \sin N\omega + Z_N, \]

where $N$ is the total number of observations.
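Methods for Estimating the Spectrum: Traditional Method

The model has the form of a multiple linear regression, $Y_i = \beta_0 + \beta_1 X_{1,i} + \beta_2 X_{2,i} + \epsilon_i$. In matrix notation, $\mathbf{X} = A\theta + \mathbf{Z}$, with

\[ \mathbf{X} = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{pmatrix}, \qquad A = \begin{pmatrix} 1 & \cos \omega & \sin \omega \\ 1 & \cos 2\omega & \sin 2\omega \\ \vdots & \vdots & \vdots \\ 1 & \cos N\omega & \sin N\omega \end{pmatrix}, \qquad \theta = \begin{pmatrix} \mu \\ \alpha \\ \beta \end{pmatrix}. \]

The least squares estimator is

\[ \hat{\theta} = \left( A^T A \right)^{-1} A^T \mathbf{X}. \]

These formulae hold for any value of $\omega$, but only make practical sense for $\omega$ that are not too high or too low.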
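A minimal sketch of this least squares fit in R, on simulated data. The frequency, the true parameter values and the noise level are assumptions chosen for illustration; the frequency is assumed known. The matrix formula is compared against `lm()`, which fits the same regression.

```r
set.seed(1)
N     <- 100
tt    <- 1:N
omega <- 0.7                                   # assumed known frequency (radians)
X     <- 5 + 2 * cos(omega * tt) + 3 * sin(omega * tt) + rnorm(N)

A         <- cbind(1, cos(omega * tt), sin(omega * tt))  # N x 3 design matrix
theta_hat <- solve(t(A) %*% A, t(A) %*% X)               # (A'A)^{-1} A'X

# The same fit through lm(); the two columns should agree.
fit <- lm(X ~ cos(omega * tt) + sin(omega * tt))
cbind(matrix_formula = drop(theta_hat), lm = unname(coef(fit)))
```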
Methods for Estimating the Spectrum: Types of Frequencies

Nyquist frequency. The highest frequency we can uniquely fit to the data is the Nyquist frequency, $\omega = \pi$, which completes one cycle every two observations ($T = 2$):

\[ \omega = \pi, \qquad f = \frac{\omega}{2\pi} = \frac{1}{2}, \qquad T = \frac{2\pi}{\omega} = \frac{1}{f} = 2. \]

Fundamental frequency. The frequency at which the whole length of the time series completes only one cycle, i.e. $T = N$, is called the fundamental frequency:

\[ T = N, \qquad f = \frac{1}{T} = \frac{1}{N}, \qquad \omega = \frac{2\pi}{T} = 2\pi f = \frac{2\pi}{N}. \]

For simplicity, we restrict $\omega$ to one of the harmonic values that follow.
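Methods for Estimating the Spectrum: Harmonics

\[ \omega_p = \frac{2\pi p}{N}, \qquad p = 1, \ldots, \frac{N}{2}. \]

This means that $\omega_p$ lies between $2\pi/N$ and $\pi$, and the values are equally spaced. The model becomes

\[ X_t = \mu + \alpha \cos \omega_p t + \beta \sin \omega_p t + Z_t. \]

For $p \neq N/2$, the least squares estimates are

\[ \hat{\mu} = \bar{X}, \qquad \hat{\alpha} = \frac{2}{N} \sum_{t=1}^{N} X_t \cos \omega_p t, \qquad \hat{\beta} = \frac{2}{N} \sum_{t=1}^{N} X_t \sin \omega_p t. \]

If $p = N/2$, then $\omega_p = \pi$ and $\sin \pi t = 0$ for every integer $t$, so

\[ \hat{\mu} = \bar{X}, \qquad \hat{\alpha} = \frac{1}{N} \sum_{t=1}^{N} (-1)^t X_t, \qquad \hat{\beta} = 0. \]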
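At a harmonic frequency $\omega_p = 2\pi p/N$ (with $N$ the number of observations and $p \neq N/2$) the regressors are orthogonal, so the closed-form estimates coincide with the general least squares solution. A small sketch on simulated data; the values of `N`, `p` and the coefficients are assumptions for illustration.

```r
set.seed(2)
N       <- 100
tt      <- 1:N
p       <- 6
omega_p <- 2 * pi * p / N
X       <- 1 + 2 * cos(omega_p * tt) + 3 * sin(omega_p * tt) + rnorm(N)

# Closed-form estimates, valid for p different from N/2.
mu_hat    <- mean(X)
alpha_hat <- 2 / N * sum(X * cos(omega_p * tt))
beta_hat  <- 2 / N * sum(X * sin(omega_p * tt))

# General least squares fit for comparison.
ls_fit <- unname(coef(lm(X ~ cos(omega_p * tt) + sin(omega_p * tt))))
rbind(closed_form = c(mu_hat, alpha_hat, beta_hat), least_squares = ls_fit)
```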
Methods for Estimating the Spectrum: Time series with three harmonic components

    # Three sinusoids at frequencies 6/100, 10/100 and 40/100 cycles per observation
    x1 <- 2 * cos(2*pi * 1:100 * 6/100)  + 3 * sin(2*pi * 1:100 * 6/100)
    x2 <- 4 * cos(2*pi * 1:100 * 10/100) + 5 * sin(2*pi * 1:100 * 10/100)
    x3 <- 6 * cos(2*pi * 1:100 * 40/100) + 7 * sin(2*pi * 1:100 * 40/100)
    x  <- x1 + x2 + x3

    # A^2 in each title is the squared amplitude: 2^2 + 3^2, 4^2 + 5^2, 6^2 + 7^2
    par(mfrow = c(2, 2))
    plot.ts(x1, ylim = c(-10, 10), main = expression(omega == 6/100 ~~~ A^2 == 13))
    plot.ts(x2, ylim = c(-10, 10), main = expression(omega == 10/100 ~~~ A^2 == 41))
    plot.ts(x3, ylim = c(-10, 10), main = expression(omega == 40/100 ~~~ A^2 == 85))
    plot.ts(x,  ylim = c(-16, 16), main = "sum")
Methods for Estimating the Spectrum: Nyquist vs Fundamental Frequency

If the observations are taken at equal intervals of time $\Delta t$, then:

The Nyquist (highest) frequency is $\omega = \pi/\Delta t$, which corresponds to $f = \omega/2\pi = 1/(2\Delta t)$ and $T = 2\Delta t$. It completes one cycle every two observations.

The fundamental (lowest) frequency is $\omega = 2\pi/(N\Delta t)$, which corresponds to $f = 1/(N\Delta t)$ and $T = N\Delta t$. It takes the whole length of the time series to complete one cycle.

The harmonics are $\omega_p = 2\pi p/(N\Delta t)$, $p = 1, 2, \ldots, N/2$.

Figure: Frequency vs Cycle.
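As a quick illustration, the harmonic frequencies can be tabulated for a given number of observations $N$ and sampling interval $\Delta t$. The helper `harmonic_frequencies` below is a hypothetical name introduced for this sketch, and the monthly example is an assumption.

```r
# Table of harmonics omega_p = 2 * pi * p / (N * dt), p = 1, ..., N/2.
harmonic_frequencies <- function(N, dt = 1) {
  p     <- 1:floor(N / 2)
  omega <- 2 * pi * p / (N * dt)      # angular frequency, radians per unit time
  data.frame(p = p, omega = omega, f = omega / (2 * pi), period = 2 * pi / omega)
}

# Example: 10 years of monthly data, with time measured in years (dt = 1/12).
h <- harmonic_frequencies(N = 120, dt = 1 / 12)
h[c(1, nrow(h)), ]   # first row: fundamental (period N * dt); last row: Nyquist (period 2 * dt)
```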
Methods for Estimating the Spectrum: Nyquist vs Fundamental Frequency

The Nyquist frequency $\omega = \pi/\Delta t$ depends only on the sampling interval. The higher the frequency of interest, the more frequently observations must be taken.

Example: A daily temperature series cannot show the variation of temperature within a day.

The fundamental frequency $\omega = 2\pi/(N\Delta t)$ depends on $N$ and $\Delta t$ through the length $N\Delta t$ of the series. The lower the frequency of interest, the longer the period over which observations must be taken.

Example: Half a year (6 months) of winter temperatures cannot help determine the trend over the year.
Periodic Function

$f(t)$ is periodic with period $T$ if $f(t + nT) = f(t)$ for all integers $n$. If $f(t)$ is periodic, then $f = 1/T$ (equivalently $\omega = 2\pi/T$) is the fundamental frequency and $\omega_p = 2\pi p/T$ are the harmonics.

The Fourier series representation of a periodic function $f(t)$ is a sum of harmonics:

\[ f(t) = X_t = a_0 + \sum_{p=1}^{N/2} \left( a_p \cos \frac{2\pi p}{N} t + b_p \sin \frac{2\pi p}{N} t \right), \]

where $2\pi p/N = \omega_p$, and $a_p$, $b_p$ correspond to the estimates $\hat{\alpha}_p$, $\hat{\beta}_p$.

Alternatively, we may get this representation from the regression model, as follows.
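\[ X_t = \mu + \alpha_p \cos \frac{2\pi p}{N} t + \beta_p \sin \frac{2\pi p}{N} t + Z_t, \qquad \omega_p = \frac{2\pi p}{N}. \]

Repeat the analysis to find $\alpha_p$, $\beta_p$ at all frequencies $2\pi/N, 4\pi/N, \ldots, \pi$, for $p = 1, 2, \ldots, N/2$. We end up with the finite Fourier series representation of $X_t$:

\[ X_t = a_0 + \sum_{p=1}^{N/2 - 1} \left( a_p \cos \frac{2\pi p}{N} t + b_p \sin \frac{2\pi p}{N} t \right) + a_{N/2} \cos \pi t, \qquad t = 1, 2, \ldots, N, \]

where $a_p$ and $b_p$ are the estimates $\hat{\alpha}_p$ and $\hat{\beta}_p$. If $p = N/2$, then $\omega_p = \pi$ and $\sin \pi t = 0$. The coefficients are

\[ a_0 = \bar{X}, \qquad a_{N/2} = \frac{1}{N} \sum_{t=1}^{N} (-1)^t X_t, \]

\[ a_p = \frac{2}{N} \sum_{t=1}^{N} X_t \cos\left( \frac{2\pi p t}{N} \right), \qquad b_p = \frac{2}{N} \sum_{t=1}^{N} X_t \sin\left( \frac{2\pi p t}{N} \right), \qquad p = 1, 2, \ldots, \frac{N}{2} - 1. \]

This is called Fourier analysis or harmonic analysis.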
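The finite Fourier series uses $N$ coefficients to represent $N$ observations, so it reproduces the data exactly. A short sketch that computes the coefficients and rebuilds the series, assuming an even number of observations $N$ and simulated data:

```r
set.seed(3)
N  <- 12                       # even N, as the formulas assume
tt <- 1:N
X  <- rnorm(N)

a0     <- mean(X)
p      <- 1:(N / 2 - 1)
a      <- sapply(p, function(j) 2 / N * sum(X * cos(2 * pi * j * tt / N)))
b      <- sapply(p, function(j) 2 / N * sum(X * sin(2 * pi * j * tt / N)))
a_half <- sum((-1)^tt * X) / N                 # coefficient at omega = pi

# Rebuild X_t from its harmonics.
X_rebuilt <- a0 + a_half * cos(pi * tt) +
  sapply(tt, function(s) sum(a * cos(2 * pi * p * s / N) + b * sin(2 * pi * p * s / N)))

max(abs(X - X_rebuilt))        # numerically zero
```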
Harmonic Analysis

The overall effect of the Fourier analysis is to partition the variability of the series into components at frequencies $2\pi/N, 4\pi/N, \ldots, \pi$. The $p$-th harmonic is given by

\[ a_p \cos \omega_p t + b_p \sin \omega_p t = R_p \cos(\omega_p t + \phi_p), \]

where $R_p = \sqrt{a_p^2 + b_p^2}$ is the amplitude and $\phi_p = \tan^{-1}(-b_p / a_p)$ is the phase of the $p$-th harmonic, so that

\[ R_p^2 = a_p^2 + b_p^2. \]
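The amplitude and phase form can be verified numerically. This sketch assumes the sign convention $\tan \phi_p = -b_p/a_p$, which is the one that matches $R_p \cos(\omega_p t + \phi_p)$, and uses `atan2` so the phase falls in the correct quadrant. The coefficient values are illustrative.

```r
a     <- 2; b <- 3                 # illustrative coefficients a_p, b_p
omega <- 2 * pi * 6 / 100
tt    <- 1:100

R   <- sqrt(a^2 + b^2)             # amplitude
phi <- atan2(-b, a)                # phase, with tan(phi) = -b / a

lhs <- a * cos(omega * tt) + b * sin(omega * tt)
rhs <- R * cos(omega * tt + phi)
max(abs(lhs - rhs))                # numerically zero
```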
Scaled Periodogram

We define a scaled periodogram as

\[ P\left( \frac{p}{N} \right) = \left[ \frac{2}{N} \sum_{t=1}^{N} X_t \cos\left( \frac{2\pi p t}{N} \right) \right]^2 + \left[ \frac{2}{N} \sum_{t=1}^{N} X_t \sin\left( \frac{2\pi p t}{N} \right) \right]^2 = a_p^2 + b_p^2. \]

Think of it as a measure of the squared correlation between the data and the sinusoids oscillating at the frequency $p/N$.

$P(p/N) = P(1 - p/N)$, the mirror effect; that is why $\frac{1}{2}$ is the highest (folding) frequency.

It is computed using the fast Fourier transform (FFT).
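A brief sketch of the FFT computation in R, using a sum of three sinusoids with known coefficients as an assumed test series. Base R's `fft()` returns the unnormalized discrete Fourier transform, so scaling by $2/N$ and taking the squared modulus gives $a_p^2 + b_p^2$.

```r
N  <- 100
tt <- 1:N
x  <- 2 * cos(2 * pi * tt * 6 / N)  + 3 * sin(2 * pi * tt * 6 / N) +
      4 * cos(2 * pi * tt * 10 / N) + 5 * sin(2 * pi * tt * 10 / N) +
      6 * cos(2 * pi * tt * 40 / N) + 7 * sin(2 * pi * tt * 40 / N)

P <- Mod(2 * fft(x) / N)^2           # P[j + 1] is the scaled periodogram at j / N
round(P[c(6, 10, 40) + 1], 6)        # 13 41 85, i.e. a_p^2 + b_p^2 for each component

# Mirror effect: the ordinate at j / N equals the ordinate at 1 - j / N.
all.equal(P[6 + 1], P[(N - 6) + 1])
```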
Periodogram

Theorem (Parseval's theorem). The total variance of $X_t$ is given by

\[ \frac{1}{N} \sum_{t=1}^{N} (X_t - \bar{X})^2 = \sum_{p=1}^{N/2 - 1} \frac{R_p^2}{2} + a_{N/2}^2. \]

$\frac{1}{2} R_p^2$ is the contribution of the $p$-th harmonic to the variance. The total variance is partitioned.

Plot $\frac{1}{2} R_p^2$ against $\omega_p = 2\pi p / N$ to obtain the line spectrum.
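Parseval's identity can be confirmed on any series of even length $N$. The sketch below uses simulated data and the coefficient formulas $a_p = \frac{2}{N}\sum X_t \cos(2\pi p t/N)$, $b_p = \frac{2}{N}\sum X_t \sin(2\pi p t/N)$ and $a_{N/2} = \frac{1}{N}\sum (-1)^t X_t$; note that the variance uses divisor $N$, not $N - 1$.

```r
set.seed(4)
N  <- 20
tt <- 1:N
X  <- rnorm(N)

p      <- 1:(N / 2 - 1)
a      <- sapply(p, function(j) 2 / N * sum(X * cos(2 * pi * j * tt / N)))
b      <- sapply(p, function(j) 2 / N * sum(X * sin(2 * pi * j * tt / N)))
a_half <- sum((-1)^tt * X) / N

total_variance <- mean((X - mean(X))^2)           # divisor N, not N - 1
by_harmonic    <- c((a^2 + b^2) / 2, a_half^2)    # R_p^2 / 2, then the term at omega = pi

c(total = total_variance, sum_of_parts = sum(by_harmonic))   # the two agree
```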
Take $\frac{1}{2} R_p^2$ as the contribution to the variance in the range $\omega_p \pm \pi/N$.

Figure: Last rectangle is $a_{N/2}^2$.

\[ \text{area of histogram rectangle} = I(\omega_p) \cdot \frac{2\pi}{N} = \frac{1}{2} R_p^2, \]

where $I(\omega_p)$ denotes the height of the histogram at $\omega_p$. So,

\[ I(\omega_p) = \frac{\frac{1}{2} R_p^2}{2\pi / N} = \frac{N R_p^2}{4\pi}, \qquad p = 1, 2, \ldots, \frac{N}{2} - 1. \]
For $p = N/2$, $\omega_p = \pi$, and

\[ I(\omega_p) \cdot \frac{\pi}{N} = a_{N/2}^2 \]

is the contribution to variance in the range $\left[ \frac{(N-1)\pi}{N}, \pi \right]$, so that

\[ I\left( \omega_{N/2} \right) = I(\pi) = \frac{N a_{N/2}^2}{\pi}. \]

The plot of $I(\omega_p)$ vs $\omega_p$ is called the periodogram.

Figure: Periodogram. Total area = Var($X_t$).
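The statement that the total area equals the variance follows by adding up the rectangles. A short worked derivation, using $I(\omega_p) = N R_p^2 / (4\pi)$ with width $2\pi/N$ for $p < N/2$, the last rectangle $I(\pi) = N a_{N/2}^2 / \pi$ with width $\pi/N$, and Parseval's theorem for the final step:

```latex
\begin{align*}
\text{Total area}
  &= \sum_{p=1}^{N/2-1} I(\omega_p)\,\frac{2\pi}{N} \;+\; I(\pi)\,\frac{\pi}{N} \\
  &= \sum_{p=1}^{N/2-1} \frac{N R_p^2}{4\pi}\cdot\frac{2\pi}{N}
     \;+\; \frac{N a_{N/2}^2}{\pi}\cdot\frac{\pi}{N} \\
  &= \sum_{p=1}^{N/2-1} \frac{R_p^2}{2} \;+\; a_{N/2}^2
   \;=\; \frac{1}{N}\sum_{t=1}^{N} \left( X_t - \bar{X} \right)^2 .
\end{align*}
```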
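Periodogram

\[ I(\omega_p) = \frac{N R_p^2}{4\pi} = \frac{N (a_p^2 + b_p^2)}{4\pi} = \frac{1}{N\pi} \left\{ \left[ \sum_{t=1}^{N} X_t \cos\left( \frac{2\pi p t}{N} \right) \right]^2 + \left[ \sum_{t=1}^{N} X_t \sin\left( \frac{2\pi p t}{N} \right) \right]^2 \right\}, \]

for $p = 1, 2, \ldots, N/2$ (at $p = N/2$ the last expression applies, with the sine sum equal to zero).

$I(\omega_p)$ appears to be a natural way of estimating $f(\omega)$.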
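Relationship Between $I(\omega_p)$ and $c_k$

Let $c_k$ be the empirical autocovariance function (acv.f.). Then

\[ I(\omega_p) = \frac{1}{\pi} \left( c_0 + 2 \sum_{k=1}^{N-1} c_k \cos \omega_p k \right), \qquad f(\omega) = \frac{1}{\pi} \left( \gamma_0 + 2 \sum_{k=1}^{\infty} \gamma_k \cos \omega k \right). \]

$I(\omega_p)$ and $f(\omega)$ are similar: in a sense, $\gamma_k$ is replaced by $c_k$ for $k$ up to $N - 1$.

Although $E[I(\omega)] \to f(\omega)$ as $N \to \infty$, which means $I(\omega)$ is asymptotically unbiased, $\mathrm{Var}[I(\omega)]$ does not tend to 0 as $N \to \infty$.

Compare with the sample mean: if $\bar{X}$ estimates $\mu$, we have $E(\bar{X}) = \mu$ and, for independent observations, $\mathrm{Var}(\bar{X}) = \sigma^2 / N \to 0$ as $N \to \infty$. So $\bar{X}$ is a good unbiased estimator for $\mu$.

Thus, $I(\omega)$ is not a consistent estimator of $f(\omega)$; it is a bad estimator. Consistent estimators can be obtained by smoothing the periodogram (averaging).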
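The identity between the periodogram and the sample autocovariances can be checked numerically at the harmonic frequencies $\omega_p = 2\pi p/N$. This is a sketch on a simulated AR(1) series (an assumed example); it relies on `acf(..., type = "covariance")`, which uses divisor $N$ and removes the sample mean. The function names are hypothetical.

```r
set.seed(5)
N  <- 50
tt <- 1:N
X  <- as.numeric(arima.sim(list(ar = 0.6), n = N))

# Sample autocovariances c_0, ..., c_{N-1} (divisor N, mean removed).
c_k <- drop(acf(X, lag.max = N - 1, type = "covariance", plot = FALSE)$acf)

periodogram_direct <- function(p) {
  w <- 2 * pi * p / N
  (sum(X * cos(w * tt))^2 + sum(X * sin(w * tt))^2) / (N * pi)
}
periodogram_acvf <- function(p) {
  w <- 2 * pi * p / N
  (c_k[1] + 2 * sum(c_k[-1] * cos(w * (1:(N - 1))))) / pi
}

p <- 1:(N / 2)
max(abs(sapply(p, periodogram_direct) - sapply(p, periodogram_acvf)))  # numerically zero
```

The direct form works on the raw series because the cosine and sine sums of a constant vanish at these frequencies, so subtracting the mean changes nothing.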
Spectral Density Estimation: Transforming the Truncated ACF

For a set of weights $\lambda_k$ and a number $M < N$, introduce

\[ \hat{f}(\omega) = \frac{1}{\pi} \left\{ \lambda_0 c_0 + 2 \sum_{k=1}^{M} \lambda_k c_k \cos \omega k \right\}. \]

The $c_k$ are progressively downweighted as the lag $k$ approaches $M$, where $M \to \infty$ but $M/N \to 0$ as $N \to \infty$.

Possible windows:

Tukey window: $\lambda_k = \frac{1}{2} \left( 1 + \cos \frac{\pi k}{M} \right)$.

Parzen window: $\lambda_k = 1 - 6 \left( \frac{k}{M} \right)^2 + 6 \left( \frac{k}{M} \right)^3$ if $0 \le k \le \frac{M}{2}$, and $\lambda_k = 2 \left( 1 - \frac{k}{M} \right)^3$ if $\frac{M}{2} \le k \le M$.

Bartlett window: $\lambda_k = 1 - \frac{k}{M}$.

Tukey is probably the most commonly used, while Bartlett is the least common today.

The choice of $M$ relies on the balance of resolution vs variance: the larger $M$, the rougher the result. Commonly, $M$ is selected to be $M = 2\sqrt{N}$.
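A minimal sketch of a lag window estimator in R. The function names are hypothetical, the weights follow the Tukey, Parzen and Bartlett formulas for lags $k = 0, \ldots, M$, and the truncation point $M = 2\sqrt{N}$ is used as an assumed rule of thumb. White noise with unit variance serves as a test case, since its spectrum under this normalization is flat at $1/\pi$.

```r
# Lag window weights lambda_0, ..., lambda_M.
tukey_weights <- function(M) 0.5 * (1 + cos(pi * (0:M) / M))
parzen_weights <- function(M) {
  u <- (0:M) / M
  ifelse(u <= 0.5, 1 - 6 * u^2 + 6 * u^3, 2 * (1 - u)^3)
}
bartlett_weights <- function(M) 1 - (0:M) / M

# f_hat(omega) = (lambda_0 c_0 + 2 * sum_k lambda_k c_k cos(omega k)) / pi
lag_window_estimate <- function(x, omega, M, weights = tukey_weights) {
  c_k    <- drop(acf(x, lag.max = M, type = "covariance", plot = FALSE)$acf)
  lambda <- weights(M)
  sapply(omega, function(w)
    (lambda[1] * c_k[1] + 2 * sum(lambda[-1] * c_k[-1] * cos(w * (1:M)))) / pi)
}

set.seed(6)
x     <- rnorm(400)                           # white noise, variance 1
M     <- 2 * floor(sqrt(length(x)))           # assumed rule of thumb
f_hat <- lag_window_estimate(x, omega = seq(0.1, pi, length.out = 50), M = M)
range(f_hat)                                  # values scatter around 1 / pi
```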
Spectral Density Estimation: Smoothing the Periodogram

Basic smoothing:

\[ \hat{f}(\omega) = \frac{1}{m} \sum_{p} I(\omega_p), \]

where $\omega_p = 2\pi p / N$ and $p$ ranges over $m$ consecutive integers, chosen so that the $\omega_p$ are symmetric around $\omega$, the frequency of interest.

The choice of the group size $m$ is also about the balance of bias vs variance, only now the larger $m$ is, the smoother the result. Popular value of m = 2.

A somewhat better option is to use a weighted average, e.g.

\[ \hat{f}(\omega) = \sum_{k=-m}^{m} w_k \, I\left( \omega_p + \frac{k}{n} \right), \]

with positive weights $w_k$ such that $\sum_{k=-m}^{m} w_k = 1$; the weights usually decrease as the distance from the center weight $w_0$ increases.
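Basic smoothing amounts to a centered moving average of the periodogram ordinates. The sketch below computes $I(\omega_p)$ directly for simulated white noise and averages groups of $m$ adjacent ordinates with `stats::filter`; the group size `m = 9` is an arbitrary assumption for illustration.

```r
set.seed(7)
N  <- 256
x  <- rnorm(N)                                 # white noise, variance 1
tt <- 1:N

# Raw periodogram at omega_p = 2 * pi * p / N, p = 1, ..., N/2 - 1.
p   <- 1:(N / 2 - 1)
I_p <- sapply(p, function(j) {
  w <- 2 * pi * j / N
  (sum(x * cos(w * tt))^2 + sum(x * sin(w * tt))^2) / (N * pi)
})

m     <- 9                                     # group size (assumed, odd)
f_hat <- as.numeric(stats::filter(I_p, rep(1 / m, m), sides = 2))  # NA at both ends

c(raw = var(I_p), smoothed = var(f_hat, na.rm = TRUE))   # smoothing cuts the variability
```

Replacing `rep(1 / m, m)` with a set of positive weights that sum to one and decrease away from the center gives the weighted version.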
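Confidence Intervals for the Spectrum

For the lag window estimators, define the number of degrees of freedom

\[ df = \frac{2N}{\sum_{k=-M}^{M} \lambda_k^2}. \]

Then, the $100(1 - \alpha)\%$ confidence interval for $f(\omega)$ is

\[ \left( \frac{df \, \hat{f}(\omega)}{\chi^2_{df, \alpha/2}}, \; \frac{df \, \hat{f}(\omega)}{\chi^2_{df, 1 - \alpha/2}} \right), \]

where $\chi^2_{df, \alpha/2}$ is the value exceeded with probability $\alpha/2$ by a chi-square variable with $df$ degrees of freedom, so the first endpoint is the lower limit.

For the smoothed periodogram, the situation is similar except that the number of degrees of freedom is $df = 2m$.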
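The interval can be wrapped in a small helper. `spectrum_ci` is a hypothetical function written for this sketch; note that R's `qchisq(q, df)` returns lower-tail quantiles, so the lower limit divides by `qchisq(1 - alpha / 2, df)`.

```r
# Chi-square confidence interval for the spectrum at one frequency.
spectrum_ci <- function(f_hat, df, alpha = 0.05) {
  lower <- df * f_hat / qchisq(1 - alpha / 2, df)   # divide by the larger quantile
  upper <- df * f_hat / qchisq(alpha / 2, df)       # divide by the smaller quantile
  c(lower = lower, upper = upper)
}

ci_raw    <- spectrum_ci(f_hat = 1, df = 2)    # raw periodogram ordinate: very wide
ci_smooth <- spectrum_ci(f_hat = 1, df = 18)   # more degrees of freedom: narrower
rbind(ci_raw, ci_smooth)
```

The comparison shows why smoothing matters: with only 2 degrees of freedom the interval spans more than two orders of magnitude.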
Computing and Plotting the Periodogram in R

    library(astsa)
    par(mfrow = c(2, 1))
    soi.per <- spec.pgram(soi, taper = 0, log = "no")
    abline(v = 1/4, lty = "dotted")
    rec.per <- spec.pgram(rec, taper = 0, log = "no")
    abline(v = 1/4, lty = "dotted")

    # Periodogram ordinates at two frequencies of interest
    soi.per$spec[40]
    soi.per$spec[10]

    # A raw periodogram ordinate has 2 degrees of freedom
    U <- qchisq(0.025, 2)   # lower chi-square quantile, gives the upper limit
    L <- qchisq(0.975, 2)   # upper chi-square quantile, gives the lower limit

    # 95% confidence intervals: (lower, upper) for each ordinate
    2*soi.per$spec[10]/L
    2*soi.per$spec[10]/U
    2*soi.per$spec[40]/L
    2*soi.per$spec[40]/U
Confidence Intervals for the Smoothed Periodogram

    library(astsa)
    par(mfrow = c(2, 1))
    k <- kernel("daniell", 4)                       # Daniell kernel, 2*4 + 1 = 9 ordinates
    soi.ave <- spec.pgram(soi, k, taper = 0, log = "no")
    abline(v = c(0.25, 1, 3), lty = 12)
    ## Repeat using rec instead of soi

    df <- soi.ave$df        # degrees of freedom of the smoothed estimate
    U <- qchisq(0.025, df)  # lower chi-square quantile, gives the upper limit
    L <- qchisq(0.975, df)  # upper chi-square quantile, gives the lower limit
    soi.ave$spec[10]
    soi.ave$spec[40]

    ## Intervals (R is case-sensitive: use L and U as defined above)
    df*soi.ave$spec[10]/L
    df*soi.ave$spec[10]/U
    df*soi.ave$spec[40]/L
    df*soi.ave$spec[40]/U