Time Series. Anthony Davison. ©


Contents

Periodogram: Motivation; Luteinizing hormone data; Periodogram; Example: Sine wave with white noise (two slides); Example: AR(1) (two slides); Example; Comments; A trigonometric lemma; Properties of the periodogram; Reminder: Multivariate normal distribution.

Smoothing: Motivation; Moving averages; Polynomial regression; Local polynomial regression; Local linear polynomial smoother; Comments; STL decomposition; Summary.

Week 3: Periodogram, Smoothing (slide 75)

Time Series, Autumn 2008

Periodogram (slide 76)

Motivation (slide 77)

- Many series have periodic structure (e.g. sunspots, CO2 data, ...), but we may not know the frequencies before looking at the data.
- The periodogram is a summary description based on representing the observed series as a superposition of sine and cosine waves of various frequencies.
- The idea is that the periodogram will tell us which frequencies are most important.
- Consider first the simple model

  $$Y_t = \alpha\cos(\omega t) + \beta\sin(\omega t) + \varepsilon_t, \quad t = 1,\ldots,n, \qquad (1)$$

  where $\{\varepsilon_t\}$ is white noise, $\omega = 2\pi/p$ is the known frequency of the fluctuations, $p$ is their known period, and $\alpha, \beta$ are unknown parameters to be estimated by least squares.
- As we can write

  $$\alpha\cos(\omega t) + \beta\sin(\omega t) = (\alpha^2 + \beta^2)^{1/2}\sin(\omega t + \gamma), \quad \gamma = \tan^{-1}(\alpha/\beta),$$

  the systematic part of equation (1) can represent any sinusoidal function with frequency $\omega$, whatever its amplitude and phase.

Luteinizing hormone data (slide 78)

Data on luteinizing hormone in n = 48 successive blood samples from a woman, taken at 10-minute intervals. Fourier series for n = 48 are shown in the lower panel.

[Figure: upper panel, the lh series; lower panel, Fourier series.]
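The least-squares fit of model (1) for a known frequency can be sketched as follows. This is a minimal illustration in plain Python, not code from the course: the function names `fit_sinusoid` and `amplitude_phase` and the example values (period 12, α = 1.5, β = 2) are assumptions made for the example. It solves the 2×2 normal equations directly and then converts the estimates to the amplitude and phase form given above.

```python
import math

def fit_sinusoid(y, omega):
    """Least-squares estimates of (alpha, beta) in
    y_t = alpha*cos(omega*t) + beta*sin(omega*t) + noise, for t = 1..n."""
    n = len(y)
    c = [math.cos(omega * t) for t in range(1, n + 1)]
    s = [math.sin(omega * t) for t in range(1, n + 1)]
    # Entries of the 2x2 normal equations (X^T X) b = X^T y.
    cc = sum(ci * ci for ci in c)
    ss = sum(si * si for si in s)
    cs = sum(ci * si for ci, si in zip(c, s))
    cy = sum(ci * yi for ci, yi in zip(c, y))
    sy = sum(si * yi for si, yi in zip(s, y))
    det = cc * ss - cs * cs
    alpha = (ss * cy - cs * sy) / det
    beta = (cc * sy - cs * cy) / det
    return alpha, beta

def amplitude_phase(alpha, beta):
    """Rewrite alpha*cos(wt) + beta*sin(wt) as R*sin(wt + gamma)."""
    r = math.hypot(alpha, beta)
    gamma = math.atan2(alpha, beta)  # equals atan(alpha/beta) when beta > 0
    return r, gamma

# Hypothetical noise-free data with period p = 12 (values chosen for illustration).
omega = 2 * math.pi / 12
y = [1.5 * math.cos(omega * t) + 2.0 * math.sin(omega * t) for t in range(1, 49)]
alpha_hat, beta_hat = fit_sinusoid(y, omega)
r, gamma = amplitude_phase(alpha_hat, beta_hat)
print(alpha_hat, beta_hat, r, gamma)
```

Using `atan2` rather than a plain arctangent keeps the phase correct when β is negative or zero; for β > 0 it agrees with the formula on the slide.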

Periodogram (slide 79)

Definition 12
(a) If $y_1,\ldots,y_n$ is an equally-spaced time series, its periodogram ordinate for $\omega$ is defined as

$$I(\omega) = n^{-1}\left[\left\{\sum_{t=1}^n y_t\sin(\omega t)\right\}^2 + \left\{\sum_{t=1}^n y_t\cos(\omega t)\right\}^2\right], \quad 0 < \omega \le \pi.$$

(b) The periodogram is a plot of $I(2\pi j/n)$ for the Fourier frequencies $2\pi j/n$, for $j = 1,\ldots,m = \lfloor (n-1)/2 \rfloor$; $I(\pi)$ is included only if $n$ is even. By default R plots the log periodogram, $\log I(2\pi j/n)$, against $j/n$.
(c) The cumulative periodogram is a plot of $C_1,\ldots,C_m$ against the frequencies $j/n$ for $j = 1,\ldots,m$, where

$$C_r = \frac{\sum_{j=1}^r I(2\pi j/n)}{\sum_{l=1}^m I(2\pi l/n)}, \quad r = 1,\ldots,m.$$

Example: Sine wave with white noise (slide 80)

Top: data from a simulated sine wave with added white noise. Bottom: log periodogram with red horizontal line showing noise variance $\sigma^2 = 0.25$, and a green vertical line showing the signal frequency 1/200. The blue line shows the width of a 95% confidence interval for the true value at each point.

[Figure: simulated series (top); log periodogram, spectrum against frequency (bottom).]
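Definition 12 and the sine-wave example can be reproduced in a small sketch. It is written in plain Python rather than R, evaluates the definition directly (no FFT), and the function names, the seed and the simulation settings (n = 400, period 20, unit noise variance) are illustrative assumptions rather than the settings behind the figure.

```python
import math
import random

def periodogram_ordinate(y, omega):
    """I(omega) = n^{-1} [ (sum y_t sin(omega t))^2 + (sum y_t cos(omega t))^2 ], t = 1..n."""
    n = len(y)
    s = sum(yt * math.sin(omega * t) for t, yt in enumerate(y, start=1))
    c = sum(yt * math.cos(omega * t) for t, yt in enumerate(y, start=1))
    return (s * s + c * c) / n

def periodogram(y):
    """Ordinates at the Fourier frequencies 2*pi*j/n, j = 1..m, with m = floor((n-1)/2)."""
    n = len(y)
    m = (n - 1) // 2
    return [periodogram_ordinate(y, 2 * math.pi * j / n) for j in range(1, m + 1)]

def cumulative_periodogram(ordinates):
    """C_r = (I_1 + ... + I_r) / (I_1 + ... + I_m), r = 1..m."""
    total = sum(ordinates)
    out, running = [], 0.0
    for value in ordinates:
        running += value
        out.append(running / total)
    return out

# Simulated sine wave plus Gaussian white noise (illustrative settings).
random.seed(1)
n, period, sigma = 400, 20, 1.0
y = [math.sin(2 * math.pi * t / period) + random.gauss(0.0, sigma) for t in range(1, n + 1)]

I = periodogram(y)
C = cumulative_periodogram(I)
j_peak = max(range(len(I)), key=I.__getitem__) + 1  # Fourier index of the largest ordinate
print("largest ordinate at frequency j/n =", j_peak / n)
```

The largest ordinate falls at j/n = 1/period, which is the behaviour the green vertical line marks in the example slides.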

Example: Sine wave with white noise (slide 81)

Top: data from a simulated sine wave with added white noise. Bottom: log periodogram with red horizontal line showing noise variance $\sigma^2 = 1$, and a green vertical line showing the signal frequency 1/20. The blue line shows the width of a 95% confidence interval for the true value at each point.

[Figure: simulated series (top); log periodogram, spectrum against frequency (bottom).]

Example: AR(1), 0.9 (slide 82)

Data from a simulated AR(1) model with parameter 0.9, with log periodogram and theoretical value (in red). The blue line shows the width of a 95% confidence interval for the true value at each point. Because the vertical axis is on a log scale, the periodogram itself changes by a very large amount across frequencies.

[Figure: simulated AR(1) series y (top); log periodogram, spectrum against frequency (bottom).]

Example: AR(1), 0.9 (slide 83)

Data from a simulated AR(1) model with parameter 0.9, with log periodogram and theoretical value (in red).

[Figure: simulated AR(1) series y (top); log periodogram, spectrum against frequency (bottom).]

Example (slide 84)

Data on luteinizing hormone in n = 48 blood samples at 10-minute intervals from a human female. Top left: data; top right: periodogram; bottom left: possible Fourier series for n = 48; bottom right: cumulative periodogram.

[Figure: four panels: lh series; spectrum against frequency; Fourier series; cumulative periodogram against frequency.]
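The AR(1) examples compare the periodogram with a theoretical curve whose formula is not stated on the slides. The sketch below assumes the standard result that, for a stationary AR(1) process with parameter φ and innovation variance σ², the expected periodogram ordinate (in the scaling of Definition 12) is approximately σ²/(1 − 2φ cos ω + φ²). The function names, the seed, the burn-in length and the sample size are illustrative assumptions.

```python
import math
import random

def ar1_spectrum(omega, phi, sigma2=1.0):
    """Assumed theoretical curve: sigma^2 / (1 - 2*phi*cos(omega) + phi^2)."""
    return sigma2 / (1.0 - 2.0 * phi * math.cos(omega) + phi * phi)

def simulate_ar1(n, phi, sigma=1.0, burn_in=200):
    """Simulate y_t = phi*y_{t-1} + e_t with Gaussian e_t, discarding a burn-in period."""
    y, prev = [], 0.0
    for i in range(n + burn_in):
        prev = phi * prev + random.gauss(0.0, sigma)
        if i >= burn_in:
            y.append(prev)
    return y

def periodogram(y):
    """Periodogram ordinates at the Fourier frequencies 2*pi*j/n, j = 1..floor((n-1)/2)."""
    n = len(y)
    out = []
    for j in range(1, (n - 1) // 2 + 1):
        omega = 2 * math.pi * j / n
        s = sum(yt * math.sin(omega * t) for t, yt in enumerate(y, start=1))
        c = sum(yt * math.cos(omega * t) for t, yt in enumerate(y, start=1))
        out.append((s * s + c * c) / n)
    return out

random.seed(3)
phi, n = 0.9, 256
I = periodogram(simulate_ar1(n, phi))

# Compare the average ordinate over the lowest and highest thirds of the frequencies.
third = len(I) // 3
low_mean = sum(I[:third]) / third
high_mean = sum(I[-third:]) / third
print("low-frequency mean:", low_mean, "high-frequency mean:", high_mean)
print("theory at omega = 0 and pi:", ar1_spectrum(0.0, phi), ar1_spectrum(math.pi, phi))
```

With φ = 0.9 the theoretical curve falls from 100 at ω = 0 to about 0.28 at ω = π, which is why the slide stresses how much the periodogram itself changes when plotted on a log scale.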

Comments (slide 85)

- Low frequency variation (trend) appears at the left of the periodogram, and high frequency variation (rapid oscillations) appears at the right.
- The rationale for considering only the frequencies $\omega = 2\pi j/n$ is that

  $$\sum_{t=1}^n y_t^2 = I(0) + 2\sum_{j=1}^m I(2\pi j/n) + I(\pi), \qquad (2)$$

  with $I(\pi)$ included only if $n$ is even. Thus the periodogram decomposes the total variability $\sum y_t^2$ of the data into components associated with each of these frequencies, plus one for the grand mean, $I(0) = n\bar{y}^2$, which we ignore because it is not periodic.
- The rationale for plotting the log periodogram is that the periodogram ordinates are (roughly) exponentially distributed, and the log transformation is variance-stabilising for the exponential distribution. A rough significance scale for the log periodogram is shown by the vertical line on its right.
- The cumulative periodogram provides a visual test of whether the series is white noise. We compare $C_r$ with its expected value $r/m$: a large value of the Kolmogorov-Smirnov statistic $D = \max_r |C_r - r/m|$ suggests that the underlying series is not white noise. The test involves seeing whether the cumulative periodogram falls outside a diagonal band, whose width determines the size of the test.

A trigonometric lemma (slide 86)

Lemma 13
(a) Let $\omega_j = 2\pi j/n$, for $n \in \mathbb{N}$ and positive integer $j < n/2$, and write $c_t = \cos(t\omega_j)$, $s_t = \sin(t\omega_j)$. Then, with all sums over $t = 1,\ldots,n$,

$$\sum c_t = \sum s_t = \sum s_t c_t = 0, \quad \sum s_t^2 = \sum c_t^2 = n/2.$$

(b) If $\omega_1 = 2\pi j_1/n$, $\omega_2 = 2\pi j_2/n$ for positive integers $j_1 \ne j_2 < n/2$, and we write $c_{kt} = \cos(t\omega_k)$, $s_{kt} = \sin(t\omega_k)$ for $k = 1, 2$, then

$$\sum s_{1t}s_{2t} = \sum s_{1t}c_{2t} = \sum c_{1t}c_{2t} = 0.$$

(c) If $n$ is odd, and we write $s_{tj} = \sin(t\omega_j)$, $c_{tj} = \cos(t\omega_j)$, $\omega_j = 2\pi j/n$, for $j = 1,\ldots,m = \lfloor n/2 \rfloor$, $t = 1,\ldots,n$, then the columns of the $n \times n$ matrix

$$Q = \begin{pmatrix} 1 & s_{11} & c_{11} & s_{12} & c_{12} & \cdots & s_{1m} & c_{1m} \\ 1 & s_{21} & c_{21} & s_{22} & c_{22} & \cdots & s_{2m} & c_{2m} \\ \vdots & \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots \\ 1 & s_{n1} & c_{n1} & s_{n2} & c_{n2} & \cdots & s_{nm} & c_{nm} \end{pmatrix}$$

are orthogonal, with $Q^T Q = \mathrm{diag}(n, n/2, \ldots, n/2) = D$, say.
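Identity (2) and the Kolmogorov-Smirnov statistic D can both be checked numerically. The sketch below is an illustration in plain Python: the function names, the seed and the two test series (Gaussian white noise, and the same noise plus a sinusoid of period 10 and amplitude 3) are assumptions chosen for the example, and no critical values for the test band are computed.

```python
import math
import random

def ordinate(y, omega):
    """Periodogram ordinate I(omega) for observations y_1..y_n."""
    n = len(y)
    s = sum(yt * math.sin(omega * t) for t, yt in enumerate(y, start=1))
    c = sum(yt * math.cos(omega * t) for t, yt in enumerate(y, start=1))
    return (s * s + c * c) / n

def decomposition(y):
    """Return (sum of y_t^2, right-hand side of identity (2))."""
    n = len(y)
    m = (n - 1) // 2
    rhs = ordinate(y, 0.0)  # I(0) = n * ybar^2
    rhs += 2.0 * sum(ordinate(y, 2 * math.pi * j / n) for j in range(1, m + 1))
    if n % 2 == 0:
        rhs += ordinate(y, math.pi)  # included only when n is even
    return sum(v * v for v in y), rhs

def ks_statistic(y):
    """D = max_r |C_r - r/m| computed from the cumulative periodogram."""
    n = len(y)
    m = (n - 1) // 2
    I = [ordinate(y, 2 * math.pi * j / n) for j in range(1, m + 1)]
    total = sum(I)
    running, d = 0.0, 0.0
    for r, value in enumerate(I, start=1):
        running += value
        d = max(d, abs(running / total - r / m))
    return d

random.seed(4)
noise = [random.gauss(0.0, 1.0) for _ in range(200)]
sine = [3.0 * math.sin(2 * math.pi * t / 10) + e for t, e in enumerate(noise, start=1)]

lhs_even, rhs_even = decomposition(noise)        # n = 200, even
lhs_odd, rhs_odd = decomposition(noise[:199])    # n = 199, odd
d_noise, d_sine = ks_statistic(noise), ks_statistic(sine)
print(lhs_even, rhs_even)
print("D for white noise:", d_noise, " D with a sinusoid added:", d_sine)
```

The two sides of (2) agree to rounding error for both even and odd n, and D is much larger once a strong periodic component is present, because the cumulative periodogram then jumps at the signal frequency instead of following the diagonal.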

Properties of the periodogram (slide 87)

Theorem 14 If $Y_1,\ldots,Y_n \overset{\mathrm{iid}}{\sim} N(\mu,\sigma^2)$, then all the periodogram ordinates are independent, and
(a) the $I(2\pi j/n)$, for $j = 1,\ldots,m$, are exponential random variables with common mean $\sigma^2$;
(b) if $n$ is even, $I(\pi) \sim \sigma^2\chi^2_1$; finally,
(c) the cumulative periodogram ordinate

$$C_r = \frac{\sum_{j=1}^r I(2\pi j/n)}{\sum_{l=1}^m I(2\pi l/n)}, \quad r = 1,\ldots,m,$$

has a beta$(r, m-r)$ distribution, and so has mean $r/m$ and variance $r(m-r)/\{m^2(m+1)\}$.

Part (a) of this result tells us that Gaussian white noise has a flat spectrum. In fact the assumption of Gaussianity is needed only for the independence, and the spectrum of non-Gaussian white noise is also flat.

Reminder: Multivariate normal distribution (slide 88)

Definition 15 The vector random variable $Y = (Y_1,\ldots,Y_n)^T$ is said to have the multivariate normal distribution with mean vector $\mu_{n\times 1} = (\mu_1,\ldots,\mu_n)^T$, with $i$th element $\mu_i = E(Y_i)$, and (co)variance matrix $\Omega_{n\times n}$, with $(i,j)$ element $\omega_{ij} = \mathrm{cov}(Y_i, Y_j)$, written $Y \sim N_n(\mu,\Omega)$, if its density function is

$$f(y;\mu,\Omega) = \frac{1}{(2\pi)^{n/2}|\Omega|^{1/2}}\exp\left\{-\tfrac{1}{2}(y-\mu)^T\Omega^{-1}(y-\mu)\right\}, \quad y \in \mathbb{R}^n,\ \mu \in \mathbb{R}^n,$$

where $\Omega$ is a symmetric positive definite matrix. In this case its moment-generating function is

$$E(e^{t^T Y}) = \exp\left(t^T\mu + \tfrac{1}{2}t^T\Omega t\right), \quad t \in \mathbb{R}^n.$$

In particular, if $Y_1,\ldots,Y_n \overset{\mathrm{iid}}{\sim} N(\mu,\sigma^2)$, then the mean vector and variance matrix are $\mu 1_n$ and $\sigma^2 I_n$.

Lemma 16 If $Y \sim N_n(\mu,\Omega)$, and $a_{m\times 1}$ and $B_{m\times n}$ are constant, with $B$ of rank $m < n$, then $a + BY \sim N_m(a + B\mu, B\Omega B^T)$.
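Part (a) of Theorem 14 can be illustrated by simulation. The sketch below pools periodogram ordinates from repeated Gaussian white-noise series and compares their average with σ², and the fraction exceeding σ² with exp(−1), the value implied by an exponential distribution with mean σ². The seed, μ = 5, σ = 2, n = 101 and the number of replicates are arbitrary choices for illustration, and this is a sanity check rather than a proof.

```python
import math
import random

def periodogram(y):
    """Periodogram ordinates at the Fourier frequencies 2*pi*j/n, j = 1..floor((n-1)/2)."""
    n = len(y)
    out = []
    for j in range(1, (n - 1) // 2 + 1):
        omega = 2 * math.pi * j / n
        s = sum(yt * math.sin(omega * t) for t, yt in enumerate(y, start=1))
        c = sum(yt * math.cos(omega * t) for t, yt in enumerate(y, start=1))
        out.append((s * s + c * c) / n)
    return out

random.seed(2008)
mu, sigma, n, replicates = 5.0, 2.0, 101, 60

# Pool the ordinates from many independent white-noise series.
ordinates = []
for _ in range(replicates):
    y = [random.gauss(mu, sigma) for _ in range(n)]
    ordinates.extend(periodogram(y))

mean_ordinate = sum(ordinates) / len(ordinates)
frac_above = sum(v > sigma ** 2 for v in ordinates) / len(ordinates)
print("mean ordinate:", mean_ordinate, " sigma^2:", sigma ** 2)
print("fraction above sigma^2:", frac_above, " exp(-1):", math.exp(-1))
```

The non-zero mean μ does not affect the ordinates at the Fourier frequencies, because the sine and cosine terms sum to zero there by Lemma 13(a).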

Smoothing (slide 89)

Motivation (slide 90)

- Underlying model is $Y_t = \mu(t) + \varepsilon_t$, where $\mu(t)$ is a smooth function of $t$ and $\{\varepsilon_t\}$ is stationary.
- Differencing removes (some) trend to give a (roughly) stationary series.
- Sometimes we want to examine the trend by smoothing the time series.
- Approaches:
  - moving average (simple, related to differencing)
  - polynomial (simple, doesn't work very well)
  - local polynomial (simple, easy to robustify)
  - spline (simple, similar to local polynomial)
  - STL decomposition (robust fitting of local polynomial, with seasonal effects)

Moving averages (slide 91)

- Classical approach to smoothing: given data $y_1,\ldots,y_n$, replace $y_t$ by $(y_{t+1} + y_t + y_{t-1})/3$, or in general construct the moving average of order $2p+1$,

  $$s_t = \sum_{j=-p}^{p} w_j y_{t+j}, \quad t = p+1,\ldots,n-p,$$

  for $p \in \mathbb{N}$ and weights $w_j$, with $\sum_j w_j = 1$ and (usually) $w_j > 0$ and $w_j = w_{-j}$.
- This is an example of a linear filter.
- Fixes are possible near the ends, but usually $p \ll n$, so the details are unimportant.
- Choose weights by
  - iterating simple (equally-weighted) smoothers (example);
  - choosing higher order to remove (or at least decrease) seasonality, for example taking $p = 6$, $w_6 = w_{-6} = 1/24$ and all other $w_j = 1/12$;
  - taking smaller order to highlight seasonality.
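The moving-average filter and the two ways of choosing weights mentioned above can be sketched as follows. The helper names `moving_average` and `convolve`, and the artificial monthly series (a linear trend plus a period-12 pattern that sums to zero), are assumptions made for illustration. Iterating an equally-weighted 12-term smoother with a 2-term one reproduces the weights $w_{\pm 6} = 1/24$, $w_j = 1/12$ otherwise.

```python
def moving_average(y, weights):
    """s_t = sum_{j=-p}^{p} w_j * y_{t+j} at the interior points t = p+1, ..., n-p.

    `weights` lists w_{-p}, ..., w_p; the result has len(y) - 2p values."""
    if len(weights) % 2 == 0:
        raise ValueError("weights must have odd length 2p + 1")
    p = len(weights) // 2
    return [
        sum(w * y[t + j] for j, w in zip(range(-p, p + 1), weights))
        for t in range(p, len(y) - p)
    ]

def convolve(a, b):
    """Weights of the filter obtained by applying filter a and then filter b."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# Iterating two equally-weighted smoothers gives the 13-term seasonal filter.
weights_12 = convolve([1 / 12] * 12, [1 / 2, 1 / 2])

# Artificial monthly series: linear trend plus a zero-sum seasonal pattern.
pattern = [3, 1, -2, -4, -1, 2, 5, 3, -1, -3, -2, -1]
y = [2 + 0.5 * t + pattern[t % 12] for t in range(60)]

smooth = moving_average(y, weights_12)   # seasonality removed, trend kept
simple = moving_average(y, [1 / 3] * 3)  # order 3: seasonality still visible
print(weights_12)
print(smooth[:3])
```

Because the weights are symmetric and sum to one, a linear trend passes through unchanged, while any period-12 pattern averages to zero, so `smooth` recovers the trend exactly at the interior points.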

Polynomial regression (slide 92)

- Fit a polynomial of degree $k$ to the data; assume that

  $$Y_t = s(t) + \varepsilon_t = \beta_0 + \beta_1 t + \cdots + \beta_k t^k + \varepsilon_t,$$

  where $\{\varepsilon_t\}$ is a stationary series.
- Choose parameters $\beta_0,\ldots,\beta_k$ to minimise the sum of squares

  $$\sum_t \{y_t - s(t)\}^2 = \sum_t \left\{y_t - (\beta_0 + \beta_1 t + \cdots + \beta_k t^k)\right\}^2,$$

  giving $\hat\beta_{(k+1)\times 1} = (X^T X)^{-1}X^T y$, where $y^T = (y_1,\ldots,y_n)$ and the $(t,j)$ element of the $n \times (k+1)$ matrix $X$ is $t^{j-1}$.
- Comments:
  - sensitivity to observations at the extremities of the series often leads to poor fit
  - usually doesn't work well because polynomials are too restrictive
  - may need orthogonal polynomials to avoid numerical problems if $n$, $k$ large
  - easily copes with missing values/unequally spaced observations

Example: Northern hemisphere temperatures (slide 93)

Temperature anomaly (°C) relative to instrumental average, with polynomials of degree k = 3 (blue), 10 (red), 20 (cyan).

[Figure: moberg series with fitted polynomials.]
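The estimator $\hat\beta = (X^T X)^{-1}X^T y$ can be sketched in plain Python by forming the normal equations and solving them with Gaussian elimination. The function names `solve` and `polyfit` and the test polynomial are assumptions for illustration. The sketch uses raw powers of $t$, so, as the comments on the slide warn, it is only suitable for small $n$ and $k$; orthogonal polynomials would be needed otherwise.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def polyfit(y, degree):
    """Least-squares coefficients beta_0..beta_degree for y_t, t = 1..n.

    The design matrix has rows (1, t, t^2, ..., t^degree)."""
    n = len(y)
    X = [[float(t) ** j for j in range(degree + 1)] for t in range(1, n + 1)]
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(degree + 1)]
           for i in range(degree + 1)]
    xty = [sum(row[i] * yt for row, yt in zip(X, y)) for i in range(degree + 1)]
    return solve(xtx, xty)

def polyval(beta, t):
    """Evaluate s(t) = beta_0 + beta_1 t + ... + beta_k t^k."""
    return sum(b * t ** j for j, b in enumerate(beta))

# Noise-free quadratic, so the fit should reproduce the coefficients.
y = [1.0 - 0.3 * t + 0.02 * t * t for t in range(1, 21)]
beta_hat = polyfit(y, 2)
print(beta_hat)
```

Missing values are handled by simply omitting the corresponding rows of $X$ and entries of $y$, which is the sense in which the method copes easily with unequally spaced observations.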

Local polynomial regression (slide 94)

- Fit a polynomial of degree $k = 0, 1$ or so to the data, but locally. See picture on next slide.
- Use kernel weights $w(t - t_0)$ that downweight observations far from $t_0$, and minimise the weighted sum of squares

  $$\sum_t w(t - t_0)\left[y_t - \left\{\beta_0 + \beta_1(t - t_0) + \cdots + \beta_k(t - t_0)^k\right\}\right]^2,$$

  giving $\hat\beta(t_0) = (X^T W X)^{-1}X^T W y$, where $y$ is as before, the columns of $X$ now contain powers of $t - t_0$, and the $n \times n$ diagonal matrix $W$ contains the weights.
- Use $\hat\beta_0(t_0)$, the first element of $\hat\beta(t_0)$, to estimate the curve at $t_0$.
- Refit for numerous $1 \le t_0 \le n$, and interpolate the fitted values.
- Can robustify by downweighting observations with large residuals in an initial fit.
- Lowess (locally weighted scatterplot smoother) uses a nearest-neighbourhood smoother, with $p = 2/3$, which uses the 2/3 of the data nearest to $t_0$.
- Automatic choice of bandwidth (or of the equivalent degrees of freedom, which play the role of the degree of the polynomial) for the kernel tends to be too small, owing to autocorrelation of time series.

Local linear polynomial smoother (slide 95)

Left: observations in the shaded part of the panel are weighted using the kernel shown at the foot, with h = 0.8, and the solid straight line is fitted by weighted least squares. The local estimate is the fitted value when $t = t_0$, shown by the vertical line. Two hundred local estimates formed using equi-spaced $t_0$ were interpolated to give the dotted line, which is the estimate of $g(t)$. Right: local linear smoothers with h = 0.2 (solid) and h = 5 (dots).

[Figure: left panel, local linear fit at one t0; right panel, local linear smoothers for two bandwidths.]
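A local linear fit (degree $k = 1$) reduces to a 2×2 weighted least-squares problem at each $t_0$, which the sketch below solves in closed form. The slides do not say which kernel is used, so the Gaussian kernel here is an assumption, as are the function names, the bandwidths and the artificial data; no robustification step is included.

```python
import math

def gaussian_kernel(u):
    """Assumed kernel shape; the slides do not specify one."""
    return math.exp(-0.5 * u * u)

def local_linear(times, y, t0, h, kernel=gaussian_kernel):
    """Fit a straight line in (t - t0) by weighted least squares, weights kernel((t - t0)/h).

    Returns the intercept of the local fit, i.e. the fitted value at t0."""
    w = [kernel((t - t0) / h) for t in times]
    d = [t - t0 for t in times]
    # Entries of X^T W X and X^T W y for the design (1, t - t0).
    s0 = sum(w)
    s1 = sum(wi * di for wi, di in zip(w, d))
    s2 = sum(wi * di * di for wi, di in zip(w, d))
    r0 = sum(wi * yi for wi, yi in zip(w, y))
    r1 = sum(wi * di * yi for wi, di, yi in zip(w, d, y))
    return (s2 * r0 - s1 * r1) / (s0 * s2 - s1 * s1)

def roughness(values):
    """Sum of squared second differences: a crude measure of wiggliness."""
    return sum((values[i + 1] - 2 * values[i] + values[i - 1]) ** 2
               for i in range(1, len(values) - 1))

times = list(range(1, 51))

# Exactly linear data are reproduced whatever the bandwidth, even at the boundary.
line = [3.0 + 2.0 * t for t in times]
at_edge = local_linear(times, line, 1.0, 0.8)
at_middle = local_linear(times, line, 25.5, 0.8)

# A trend plus a rapidly alternating disturbance: small h follows it, large h removes it.
wiggly = [t / 10 + 0.3 * (-1) ** t for t in times]
fit_small = [local_linear(times, wiggly, t0, 0.3) for t0 in times]
fit_large = [local_linear(times, wiggly, t0, 5.0) for t0 in times]
print(at_edge, at_middle)
print(roughness(fit_small), roughness(fit_large))
```

The roughness comparison mirrors the right-hand panel described above: a small bandwidth gives a wiggly fit and a large bandwidth a smooth one.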

Example: Northern hemisphere temperatures (slide 96)

Temperature anomaly (°C) relative to instrumental average, with smoothing splines with degrees of freedom k = 3 (blue), 10 (red), 20 (cyan), and the automatically chosen (and much too big) value 158 (green).

[Figure: moberg series with fitted smoothing splines.]

Comments (slide 97)

- Local polynomial smoothing is an example of nonparametric smoothing.
- Another example is the use of smoothing splines.
- Such methods allow for local behaviour of the series, and so are preferable to general fitting of polynomials.
- They all
  - use a local fit;
  - depend on a bandwidth, related to the equivalent degrees of freedom: high bandwidth gives low degrees of freedom and a smooth fit, and small bandwidth gives high degrees of freedom and a wiggly fit;
  - have approaches to choosing the bandwidth automatically, but usually for time series this gives a fit that is too wiggly;
  - can be robustified, so that outliers have less impact on the fitted curves.

STL decomposition (slide 98)

- An approach to removing overall trend and seasonal components, robust and (in principle) copes with missing data (but the R function stl does not!).
- Underlying model is $Y_t = U(t) + S(t) + \varepsilon_t$, where $U(t)$ is trend, $S(t)$ is seasonal variation, and $\{\varepsilon_t\}$ is stationary.
- Can fit a single seasonal component (next slide) or a slowly-varying one (next slide but one).
- Note how the seasonal component gradually increases in amplitude in the second plot: why?
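The additive model $Y_t = U(t) + S(t) + \varepsilon_t$ can be illustrated with a much simpler procedure than STL. The sketch below is not STL and is not robust: it is a classical decomposition that estimates the trend with a centred moving average and the seasonal component by averaging the detrended values season by season, so the seasonal component is a single fixed pattern. The function names and the artificial monthly series are assumptions for illustration, and the period is assumed to be even.

```python
def centred_moving_average(y, period):
    """Trend from the 2 x period moving average (even period); None where undefined."""
    p = period // 2
    weights = [0.5 / period] + [1.0 / period] * (period - 1) + [0.5 / period]
    trend = [None] * len(y)
    for t in range(p, len(y) - p):
        trend[t] = sum(w * y[t - p + k] for k, w in enumerate(weights))
    return trend

def decompose(y, period):
    """Split y into (trend, seasonal, remainder) under Y_t = U(t) + S(t) + e_t.

    Needs at least two full periods so that every season has a detrended value."""
    trend = centred_moving_average(y, period)
    # Average the detrended values separately for each position in the cycle.
    sums, counts = [0.0] * period, [0] * period
    for t, (yt, ut) in enumerate(zip(y, trend)):
        if ut is not None:
            sums[t % period] += yt - ut
            counts[t % period] += 1
    means = [s / c for s, c in zip(sums, counts)]
    centre = sum(means) / period  # make the seasonal effects sum to zero
    seasonal = [means[t % period] - centre for t in range(len(y))]
    remainder = [None if ut is None else yt - ut - st
                 for yt, ut, st in zip(y, trend, seasonal)]
    return trend, seasonal, remainder

# Artificial monthly series: linear trend plus a fixed zero-sum seasonal pattern.
pattern = [3, 1, -2, -4, -1, 2, 5, 3, -1, -3, -2, -1]
y = [10 + 0.1 * t + pattern[t % 12] for t in range(60)]
trend, seasonal, remainder = decompose(y, 12)
print(trend[6], seasonal[:12], remainder[6])
```

With this noise-free input the trend and seasonal pattern are recovered exactly at the interior points and the remainder is zero there; real series, and the slowly-varying seasonal component that STL can fit, need the more elaborate local fitting described on the slide.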

Example: Mauna Loa data (slide 99)

[Figure: STL decomposition with panels data, seasonal, trend and remainder, plotted against time.]

Example: Mauna Loa data (slide 100)

[Figure: STL decomposition with panels data, seasonal, trend and remainder, plotted against time.]

Summary

Today we talked about
- the periodogram, which
  - decomposes total variation into frequency components;
  - provides a test of white noise based on the cumulative periodogram;
- trend/seasonality estimation by smoothing:
  - moving average
  - polynomial fitting
  - local polynomial fitting
  - robust local polynomial fitting
  - STL decomposition

Next time: detailed consideration of AR(1) model


More information



More information

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω ECO 513 Spring 2015 TAKEHOME FINAL EXAM (1) Suppose the univariate stochastic process y is ARMA(2,2) of the following form: y t = 1.6974y t 1.9604y t 2 + ε t 1.6628ε t 1 +.9216ε t 2, (1) where ε is i.i.d.

More information
