An Introduction to Nonstationary Time Series Analysis

Similar documents
Bickel Rosenblatt test

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon

Statistics of stochastic processes

A nonparametric test for seasonal unit roots

If we want to analyze experimental or simulated data we might encounter the following tasks:

Christopher Dougherty London School of Economics and Political Science

ECONOMICS 7200 MODERN TIME SERIES ANALYSIS Econometric Theory and Applications

Issues on quantile autoregression

Testing parametric assumptions of trends of a nonstationary time series

Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017

Non-Stationary Time Series and Unit Root Testing

STAT 443 Final Exam Review. 1 Basic Definitions. 2 Statistical Tests. L A TEXer: W. Kong

Week 5 Quantitative Analysis of Financial Markets Characterizing Cycles

On a Nonparametric Notion of Residual and its Applications

Semi-parametric estimation of non-stationary Pickands functions

PREWHITENING-BASED ESTIMATION IN PARTIAL LINEAR REGRESSION MODELS: A COMPARATIVE STUDY

Non-Stationary Time Series and Unit Root Testing

Econ 424 Time Series Concepts

An application of the GAM-PCA-VAR model to respiratory disease and air pollution data

STAT 4385 Topic 01: Introduction & Review

SUPPLEMENT TO PARAMETRIC OR NONPARAMETRIC? A PARAMETRICNESS INDEX FOR MODEL SELECTION. University of Minnesota

Ch3. TRENDS. Time Series Analysis

Wavelet Methods for Time Series Analysis. Motivating Question

LIST OF PUBLICATIONS. 1. J.-P. Kreiss and E. Paparoditis, Bootstrap for Time Series: Theory and Applications, Springer-Verlag, New York, To appear.

Time Series Analysis. James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY

Wavelet Methods for Time Series Analysis. Part IX: Wavelet-Based Bootstrapping

Introduction to bivariate analysis

Chapter 12: An introduction to Time Series Analysis. Chapter 12: An introduction to Time Series Analysis

A nonparametric method of multi-step ahead forecasting in diffusion processes

Introduction to bivariate analysis

Ch 5. Models for Nonstationary Time Series. Time Series Analysis

Is the Basis of the Stock Index Futures Markets Nonlinear?

A Guide to Modern Econometric:

Introduction to Econometrics

Non-Stationary Time Series and Unit Root Testing

HETEROSCEDASTICITY AND AUTOCORRELATION ROBUST STRUCTURAL CHANGE DETECTION. By Zhou Zhou 1 University of Toronto February 1, 2013.

Local Whittle Likelihood Estimators and Tests for non-gaussian Linear Processes

(x t. x t +1. TIME SERIES (Chapter 8 of Wilks)

STAT 135 Lab 5 Bootstrapping and Hypothesis Testing

Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods

STT 843 Key to Homework 1 Spring 2018

SMOOTHED BLOCK EMPIRICAL LIKELIHOOD FOR QUANTILES OF WEAKLY DEPENDENT PROCESSES

ECON3327: Financial Econometrics, Spring 2016

7 Introduction to Time Series

STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots. March 8, 2015

CHANGE DETECTION IN TIME SERIES

Nonstationary time series models

7. Integrated Processes

Lecture 2: ARMA(p,q) models (part 2)

The increasing intensity of the strongest tropical cyclones

A Diagnostic for Seasonality Based Upon Autoregressive Roots

Robust Backtesting Tests for Value-at-Risk Models

of the 7 stations. In case the number of daily ozone maxima in a month is less than 15, the corresponding monthly mean was not computed, being treated

1.4 Properties of the autocovariance for stationary time-series

The Restricted Likelihood Ratio Test at the Boundary in Autoregressive Series

The Bootstrap: Theory and Applications. Biing-Shen Kuo National Chengchi University

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han

covariance function, 174 probability structure of; Yule-Walker equations, 174 Moving average process, fluctuations, 5-6, 175 probability structure of

Goodness-of-Fit Tests for Time Series Models: A Score-Marked Empirical Process Approach

7. Integrated Processes

Empirical likelihood and self-weighting approach for hypothesis testing of infinite variance processes and its applications

Time Series Analysis. James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY

Bootstrap. Director of Center for Astrostatistics. G. Jogesh Babu. Penn State University babu.

Thomas J. Fisher. Research Statement. Preliminary Results

Improving linear quantile regression for

7 Introduction to Time Series Time Series vs. Cross-Sectional Data Detrending Time Series... 15

Lecture 2 APPLICATION OF EXREME VALUE THEORY TO CLIMATE CHANGE. Rick Katz

Mohsen Pourahmadi. 1. A sampling theorem for multivariate stationary processes. J. of Multivariate Analysis, Vol. 13, No. 1 (1983),

Arma-Arch Modeling Of The Returns Of First Bank Of Nigeria

Section 7: Local linear regression (loess) and regression discontinuity designs

A Test of Cointegration Rank Based Title Component Analysis.

1. Stochastic Processes and Stationarity

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao

Math 494: Mathematical Statistics

Testing Restrictions and Comparing Models

Department of Economics, Vanderbilt University While it is known that pseudo-out-of-sample methods are not optimal for

Three Papers by Peter Bickel on Nonparametric Curve Estimation

Parametric Signal Modeling and Linear Prediction Theory 1. Discrete-time Stochastic Processes (cont d)

Testing Equality of Nonparametric Quantile Regression Functions

University, Tempe, Arizona, USA b Department of Mathematics and Statistics, University of New. Mexico, Albuquerque, New Mexico, USA

Hypothesis Testing One Sample Tests

[1] Thavaneswaran, A.; Heyde, C. C. A note on filtering for long memory processes. Stable non-gaussian models in finance and econometrics. Math.

Chapter 2. Some basic tools. 2.1 Time series: Theory Stochastic processes

G. S. Maddala Kajal Lahiri. WILEY A John Wiley and Sons, Ltd., Publication

Appendix A: The time series behavior of employment growth

A review of some semiparametric regression models with application to scoring

FORECASTING AN INDEX OF THE MADDEN-OSCILLATION

Quantile Regression for Residual Life and Empirical Likelihood

Threshold estimation in marginal modelling of spatially-dependent non-stationary extremes

Linear Model Under General Variance Structure: Autocorrelation

Applied time-series analysis

Does k-th Moment Exist?

Ch 6. Model Specification. Time Series Analysis

LECTURES 2-3 : Stochastic Processes, Autocorrelation function. Stationarity.

Inference for Non-stationary Time Series Auto Regression. By Zhou Zhou 1 University of Toronto January 21, Abstract

Part II. Time Series

Inference of the Trend in a Partially Linear Model

ECON/FIN 250: Forecasting in Finance and Economics: Section 6: Standard Univariate Models

Some Time-Series Models

Nonlinear Time Series

Transcription:

An Introduction to Analysis Ting Zhang 1 tingz@bu.edu Department of Mathematics and Statistics Boston University August 15, 2016 Boston University/Keio University Workshop 2016 A Presentation Friendly for Graduate Students 1 I am grateful for the support of NSF Grant DMS-1461796. Some of the figures used in this presentation are from Wikipedia or other websites.

Welcome to Boston

Introduction A time series is a sequence of of measurements collected over time, and examples include electroencephalogram (EEG); stock price; The first human EEG recording obtained by Hans Berger in 1924. Upper: EEG. Lower: timing signal. temperature series; and many others.

An important feature of time series is the temporal dependence. Observations collected at different time points depend on each other. The common assumption of independence no longer holds. Example: Let X 1,..., X n be random variables, sharing common marginal distribution N(µ, σ 2 ). If independent, then X n = X 1 + + X n n N ) (µ, σ2, n and a (1 α)-th confidence interval for µ is given by [ X n ± q 1 α/2 σ n ]. Under dependence, the above constructed confidence interval may no longer preserve the desired nominal size, as the distribution of Xn can be different.

Example (Continued): We shall here consider an illustrative dependence case. Let ɛ 0,..., ɛ n be independent standard normal random variables, and set X i = µ + σ(ɛ i + ɛ i 1 )/ 2, then X i N(µ, σ 2 ) has the same marginal distribution, but not independent. In fact, cor(x i, X i 1 ) = 0.5. In this case, it can be shown that (exercise?) X n = X 1 + + X n n N { } µ, σ2 (2n 1) n 2, and a (1 α)-th (asymptotic) confidence interval for µ is given by [ σ X n ± q 1 α/2 n ] 2. Dependence makes a difference! Will the magic 2 adjustment work for other dependent data? Generally not!

To incorporate the dependence, parametric models have been widely used, and popular ones are: Autoregressive (AR) models: Moving average (MA) models: Threshold AR models: X i = a 1 X i 1 + + a p X i p + ɛ i ; X i = ɛ i + a 1 ɛ i 1 + + a q ɛ i q ; X i = ρ X i 1 + ɛ i ; and many others. When using parametric models, the dependence structure is fully described except for a few unknown parameters, which makes the inference procedure easier and likelihood-based methods are often used. Model misspecification issues!

Stationarity: Another Common Assumption A process is said to be stationary when its joint probability distribution does not change when shifted in time. Implying weak stationarity where E(X i ) = E(X 0 ), i = 1,..., n; cov(x i, X i + k) = cov(x 0, X k ), i = 1,..., n. Under (weak) stationarity, it makes sense to estimate the mean as a single parameter. The same holds for the marginal variance and autocorrelation. However, stationarity is a strong assumption and can be violated in practice. The first human EEG recording obtained by Hans Berger in 1924. Upper: EEG. Lower: timing signal.

Analysis of Nonstationary time series analysis has been a challenging but active area of research. When the assumption of stationarity fails, parameters of interest may no longer be a constant. In this case, they are naturally modeled as functions of time, which are infinite dimensional objects. Questions of interests: How to estimate those functions? Can parametric models be used to describe the time-varying pattern? How to make statistical inference, including hypothesis testing and constructing simultaneous confidence bands? Is it possible to provide a rigorous theoretical justification for those methods? Can we use the developed results to better address some real life problems?

Testing Parametric Assumptions on Trends of A Sample Problem

Motivating Examples (Tropical Cyclone Data) Maximum wind speed (m/s) 200 150 100 50 0 500 1000 1500 2000 Tropical cyclone number Figure: Satellite-derived lifetime-maximum wind speeds of tropical cyclones during 1981 2006 In atmospheric science (constancy or not?): Global warming has an impact on tropical cyclones (Emanuel, 1991, Holland, 1997 and Bengtsson et al., 2007); Linear trends for quantiles (Elsner, Kossin and Jagger, 2008, Nature). Is the mean really a constant?

Motivating Examples (The CET Data) 11 Temperature (celsius) 10 9 8 7 1659 1709 1759 1809 1859 1909 1959 2009 Time (years) Figure: Annual central England temperature series from 1659 to 2009 In climate science (various models proposed): Linear (Jones and Hulme, 1997); Quadratic (Benner, 1999, Int. J. Climatol.); Local polynomial (Harvey and Mills, 2003). Which model should we use?

Suppose we observe: y i = µ(i/n) + e i, i = 1,..., n, µ(t), t [0, 1], is an unknown smooth trend function; (e i ) is the error process (dependent and nonstationary). Motivated by the CET data, we want to test: H 0 : µ(t) = f(θ, t). Examples: f(θ, t) = θ 0 or f(θ, t) = θ 0 + θ 1 t. A natural strategy: Nonparametric : ˆµ n (t) v.s Parametric : f(ˆθ n, t). We form the L 2 -distance = 1 and reject H 0 if is large. Ref: Fan and Gijbels (1996) 0 {ˆµ n (t) f(ˆθ n, t)} 2 dt,

Locally fits a linear line : ˆµ n (t) = n i=1 y iw i (t). Bias: O(b 2 n); Variance: O(1/ nb n ). The window size b n satisfies b n 0 and nb n ; Ref: Fan and Gijbels (1996)

Recall the test statistic = 1 0 {ˆµ n (t) f(ˆθ n, t)} 2 dt. In order to develop a rigorous statistical test, we need an asymptotic theory on the closely related integrated squared error: ise = 1 0 {ˆµ n (t) µ(t)} 2 dt. Reason: the error of ˆµ n (t) dominates that of f(ˆθ n, t). In the literature: Independent errors: Bickel and Rosenblatt (1973, Ann. Statist.); Hall (1984, Ann. Statist.); Härdle and Mammen (1993, Ann. Statist.). Stationary linear error processes: González-Manteiga and Vilar Fernández (1995); Biedermann and Dette (2000). Nonstationary nonlinear error processes:??? Need a good framework.

Time-Varying Causal Representation The Framework

We assume that the error process has the causal representation: e i = H(i/n; F i ), F i = (..., ɛ i 1, ɛ i ), for some measurable function H : [0, 1] R R. A time-varying function of past innovations Examples: Stationary causal processes: e i = H(F i) = H(..., ɛ i 1, ɛ i); Include popular linear and nonlinear time series as special cases Nonstationary linear processes: e i = ɛ i + a 1(i/n)ɛ i 1 +. Nonstationary generalization by allowing time-varying parameters Easy to work with, and makes developing asymptotic theory of complicated statistics possible. Under this framework, an asymptotic theory was developed in Zhang and Wu (2011, Biometrika) for the test statistic, and the cut-off value can then be obtained.

Tropical Cyclone Data Maximum wind speed (m/s) 200 150 100 50 0 500 1000 1500 2000 Tropical cyclone number Figure: Satellite-derived lifetime-maximum wind speeds of tropical cyclones during 1981 2006 In the literature: Linear trends for quantiles (Elsner et al., 2008, Nature); L -based test: accept the mean constancy at the 5% level (Zhou, 2010, Ann. Statist.). Our procedure: Reject the mean constancy at the 5% level. L 2 -based tests can be more powerful than L -based ones

Central England Temperature Data Temperature (celsius) 11 10 9 8 7 1659 1709 1759 1809 1859 1909 1959 2009 Time (years) Figure: Annual central England temperature series from 1659 to 2009 with dashed curve representing the global cubic trend In climate science: Linear trend (Jones and Hulme, 1997); Quadratic trend (Benner, 1999, Int. J. Climatol.); Local polynomial trend (Harvey and Mills, 2003). Our procedure: Quadratic trend (Reject with p-value 0.00); Cubic trend (Accept with p-value 0.47). Ref: Jones and Bradley (1992)

Other Interesting Problems

Regression Setting 450 Hospital admission 350 250 150 0 100 200 300 400 500 600 700 Time (days) Sulphur dioxide (µg/m 3 ) 90 50 10 Nitrogen dioxide (µg/m 3 ) 100 60 20 0 100 200 300 400 500 600 700 Time (days) 0 100 200 300 400 500 600 700 Time (days) 140 120 Dust (µg/m 3 ) 80 Ozone (µg/m 3 ) 80 40 20 0 100 200 300 400 500 600 700 Time (days) 0 0 100 200 300 400 500 600 700 Time (days) Figure: Daily hospital admissions (top) and measurements of pollutants in Hong Kong between January 1, 1994 and December 31, 1995. Ref: Zhang and Wu (2012, Ann. Statist.) and Zhang (2015, J. Econometrics)

Multivariate Setting E 27 E 24 E 21 E 20 E 15 E 13 Degree Celcius E 11 E 09 E 08 Monitoring site E 07 E 06 E 05 E 04 E 03 E 01 10/10 10/15 10/20 10/25 10/30 Time (day) Figure: Air temperature measurements at 15 measurement facilities in the Southern Great Plains region of the United States from 10/06/2005 to 10/30/2005. Ref: Degras et al. (2012, IEEE), Zhang (2013, JASA) and Zhang (2016, Sinica)

Potential Jump Setting Temperature anoamalies (Celsius) 0.5 0.0 0.5 1.0 1850 1900 1950 2000 Time (year) Figure: Monthly global temperature anomalies in Celsius from 01/1850 to 12/2012. The period between the two dashed lines corresponds to 11/1944 12/1946. Ref: Zhang (2016, Electron. J. Stat)

Questions? Thank You! Name: Address: E-mail: Ting Zhang Department of Mathematics and Statistics Boston University Boston, MA 02215, U.S.A. tingz@bu.edu