Nonlinear Time Series


Nonlinear Time Series

Recall that a linear time series $\{X_t\}$ is one that follows the relation
$$X_t = \mu + \sum_{i=0}^{\infty} \psi_i A_{t-i},$$
where $\{A_t\}$ is iid with mean 0 and finite variance. A linear time series is stationary if $\sum_{i=0}^{\infty} \psi_i^2 < \infty$. A time series that cannot be put in this form is nonlinear.

Tests for Nonlinearity

What kinds of statistics would be useful in testing for nonlinearity? The null hypothesis is that the data follow a linear time series model. One approach is to fit some kind of general linear model to the data, and then use some statistic computed from the residuals. Another approach is based on comparing transforms of the data with the known properties of transforms of data following the hypothesized model. For tests regarding time series specifically (and maybe a few other types of data), the transform could be into the frequency domain. A different approach is to specify an alternative hypothesis and test against it specifically.

Tests for Autocorrelations of Squared Residuals

Residuals of what? An ARMA(p, q) model is a pretty good approximation for a linear time series. Before attempting to fit an ARMA model, we should do some exploratory analysis to make sure we're even in the right ballpark. Are there any obvious departures from stationarity? Trends? Would differencing help? When it appears that we may have a stationary process, we go through the usual motions to fit an ARMA(p, q) model. Is the model a good fit? What could go wrong? There may be an ARCH effect.

Tests for Autocorrelations of Squared Residuals

The ARCH effect shows up in autocorrelations of the squared residuals from an ARMA model. The simplest test for autocorrelations is based on the asymptotic normality of $\hat{\rho}(h)$ under the null hypothesis of 0 autocorrelation at lag $h$. (Recall that the test would be a t test, where the denominator is
$$\sqrt{\left(1 + 2\sum_{i=1}^{h-1} \hat{\rho}^2(i)\right)/n}.$$
The denominator is not obvious.)

Tests for Autocorrelations of Squared Residuals

Of course, if $\hat{\rho}(h)$ is asymptotically normal, then $\hat{\rho}^2(h)$, properly normalized, is asymptotically chi-squared, and if the $\hat{\rho}^2(i)$ for $i = 1, \ldots, m$ are independent, then the sum of them, each properly normalized, is asymptotically chi-squared with m degrees of freedom. These facts led to the $Q^*(m)$ portmanteau test of Box and Pierce, and then to the modified portmanteau test of Ljung and Box, using the statistic
$$Q(m) = n(n+2) \sum_{i=1}^{m} \frac{\hat{\rho}^2(i)}{n - i}.$$
This is asymptotically chi-squared with m degrees of freedom.
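
The statistic is easy to compute directly, and R's built-in Box.test reports the same quantity. A minimal sketch; the white-noise series x and the choice m = 10 are placeholders, not from the text:

```r
set.seed(42)
x <- rnorm(500)   # placeholder series; in practice, use model residuals
n <- length(x)
m <- 10

# Sample autocorrelations at lags 1, ..., m (drop lag 0)
rho <- acf(x, lag.max = m, plot = FALSE)$acf[-1]

# Ljung-Box statistic: Q(m) = n(n+2) sum_i rho^2(i)/(n-i)
Q <- n * (n + 2) * sum(rho^2 / (n - seq_len(m)))
pchisq(Q, df = m, lower.tail = FALSE)   # p-value

# The same test via the built-in function
Box.test(x, lag = m, type = "Ljung-Box")
```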

Tests for Autocorrelations of Squared Residuals

As we have seen, the Q test applied to squared residuals can be used to detect an ARCH effect, as suggested by McLeod and Li. We choose a value of m. So this is one test for nonlinearity. A related test is the F test suggested by Engle. This is the usual F test of
$$H_0: \beta_1 = \cdots = \beta_m = 0$$
in the linear regression model
$$a_t^2 = \beta_0 + \beta_1 a_{t-1}^2 + \cdots + \beta_m a_{t-m}^2 + e_t,$$
where the $a_t$ are the residuals from the fitted ARMA model.
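
Engle's test is just an F test in an auxiliary regression, so it can be carried out with lm and anova. A sketch, assuming res holds the residuals from a fitted ARMA model (simulated here for illustration):

```r
set.seed(1)
res <- rnorm(500)   # placeholder for ARMA residuals
m   <- 5
a2  <- res^2

# Lagged regressors a2_{t-1}, ..., a2_{t-m}
emb <- embed(a2, m + 1)   # column 1 is a2_t; columns 2..m+1 are its lags
df  <- data.frame(y = emb[, 1], emb[, -1])

fit0 <- lm(y ~ 1, data = df)   # restricted model: all beta_i = 0
fit1 <- lm(y ~ ., data = df)   # unrestricted model
anova(fit0, fit1)              # the usual F test of H0
```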

Stationary Processes in the Frequency Domain

Time series models of the form
$$X_t = f(X_{t-1}, X_{t-2}, \ldots, A_t, A_{t-1}, \ldots; \beta)$$
are said to be represented in the time domain. Representations of a time series as a composition of periodic behaviors are said to be in the frequency domain. Processes with strong periodic behavior, and periodic processes with a small number of periodicities (audio signals, for example), are usually modeled better in the frequency domain than in the time domain. Financial time series are best analyzed in the time domain. Stationary processes (and of course, there aren't many of those in financial time series!) have an important relationship between a time-domain measure and a frequency-domain function.

Spectral Representation of the ACVF

If we have a stationary process with autocovariance $\gamma(h)$, then there exists a unique monotonically increasing function $F(\omega)$ on the closed interval $[-1/2, 1/2]$, such that $F(-1/2) = 0$, $F(1/2) = \gamma(0)$, and
$$\gamma(h) = \int_{-1/2}^{1/2} e^{2\pi i \omega h} \, dF(\omega).$$
The function $F(\omega)$ is called the spectral distribution function. The proof of this theorem, the spectral representation theorem, is available in many books, but we will not prove it in this class. Note that my notation for Fourier transforms may differ slightly from that in Tsay; the difference is whether frequencies are measured in cycles or in radians.

Spectral Density

The derivative of the spectral distribution function $F(\omega)$, which we write as $f(\omega)$, is a measure of the intensity of any periodic component at the frequency $\omega$. We call $f(\omega)$ the spectral density. The ACVF is essentially the Fourier transform of the spectral density. By the inversion theorem for the Fourier transform, we have, for $-1/2 \le \omega \le 1/2$,
$$f(\omega) = \sum_{h=-\infty}^{\infty} \gamma(h) e^{-2\pi i \omega h},$$
or, for a zero-mean process,
$$f(\omega) = \sum_{h=-\infty}^{\infty} E(X_t X_{t+h}) e^{-2\pi i \omega h}.$$

Bispectral Density

Now, if the third moment $E(X_t X_{t+u} X_{t+v})$ exists and is finite, then in the linear time series
$$X_t = \mu + \sum_{i=0}^{\infty} \psi_i A_{t-i}$$
we have
$$E(X_t X_{t+u} X_{t+v}) = E\left(A_t^3\right) \sum_{i=0}^{\infty} \psi_i \psi_{i+u} \psi_{i+v}.$$
Now, by analogy, we call the double Fourier transform the bispectral density:
$$b(\omega_1, \omega_2) = \sum_{u=-\infty}^{\infty} \sum_{v=-\infty}^{\infty} E(X_t X_{t+u} X_{t+v}) e^{-2\pi i (\omega_1 u + \omega_2 v)}.$$

Spectral Densities

Letting $\Psi$ represent the polynomial formed in the usual way from the $\psi_i$ in the linear time series model, we have for the spectral density
$$f(\omega) = E\left(A_t^2\right) \Psi\left(e^{2\pi i \omega}\right) \Psi\left(e^{-2\pi i \omega}\right);$$
and for the bispectral density,
$$b(\omega_1, \omega_2) = E\left(A_t^3\right) \Psi\left(e^{2\pi i \omega_1}\right) \Psi\left(e^{2\pi i \omega_2}\right) \Psi\left(e^{-2\pi i (\omega_1 + \omega_2)}\right).$$
Now, note in this case that
$$\frac{|b(\omega_1, \omega_2)|^2}{f(\omega_1)\, f(\omega_2)\, f(\omega_1 + \omega_2)} = \frac{\left(E\left(A_t^3\right)\right)^2}{\left(E\left(A_t^2\right)\right)^3},$$
which is constant.

Bispectral Test

The constancy of the ratio on the previous slide provides the basis for a test of nonlinearity. How would you do that? Compute the ratio for several subsequences. There are various nonparametric tests for constancy, and consequently there are various bispectral tests. Notice also that the numerator of the test statistic is 0 if the time series is linear and the errors have a normal distribution.

BDS Test

The BDS test is named after Brock, Dechert, and Scheinkman, who proposed it. The null hypothesis is that the error process is iid. For the data $x_1, \ldots, x_n$, the test is based on normalized counts of close subsequences $X_i^m$ and $X_j^m$, where $X_i^m = (x_i, \ldots, x_{i+m-1})$. For fixed $\delta > 0$, closeness is measured by how many subsequences are within $\delta$ of each other in the sup norm. We define
$$I_\delta(X_i^m, X_j^m) = \begin{cases} 1 & \text{if } \|X_i^m - X_j^m\|_\infty \le \delta \\ 0 & \text{otherwise.} \end{cases}$$

BDS Test

We compare the counts for subsequences of length 1 and k:
$$C_1(\delta, n) = \frac{2}{n(n-1)} \sum_{i<j} I_\delta(X_i^1, X_j^1)$$
and
$$C_k(\delta, n) = \frac{2}{(n-k+1)(n-k)} \sum_{i<j} I_\delta(X_i^k, X_j^k).$$
In the iid case, $C_k(\delta, n) \approx \left(C_1(\delta, n)\right)^k$, and asymptotically
$$\sqrt{n}\left(C_k(\delta, n) - \left(C_1(\delta, n)\right)^k\right)$$
is normal with mean 0 and known variance (see Tsay, page 208). The null hypothesis that the errors are iid, which is one of the properties of a linear time series, is tested using extreme quantiles of the normal distribution.

BDS Test

Notice that the BDS test depends on two quantities, δ and k. Obviously, k should be small relative to n. There is an R function, bds.test, in the tseries package that performs the computations for the BDS test.
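
A minimal sketch of its use; the simulated AR(1) series and the choice m = 4 (the k above) are placeholders:

```r
library(tseries)

set.seed(7)
x   <- arima.sim(model = list(ar = 0.6), n = 500)  # placeholder linear series
fit <- arima(x, order = c(1, 0, 0))

# Apply the BDS test to the residuals; m is the maximum embedding
# dimension, and eps (left at its default here) plays the role of delta
bds.test(residuals(fit), m = 4)
```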

RESET Test

The Regression Equation Specification Error Test (RESET) of Ramsey is a general test for misspecification of a linear model (not just a linear time series). It may detect omitted variables, incorrect functional form, and heteroscedasticity. For applications of the RESET test to linear time series, we assume that the linear model is an AR model. The test statistic is an F statistic computed from the residuals of a fitted AR model (see equation (4.44) in Tsay). Because of the omnibus nature of the alternative hypothesis, the performance of the test is highly variable, and it often has very low power. There is an R function, resettest, in the lmtest package that performs the computations for the RESET test.
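
Since resettest works on a linear-model formula, one way to apply it to a time series is to cast an AR(p) model as a regression on lagged values. A sketch; the simulated series and the AR(2) choice are placeholders:

```r
library(lmtest)

set.seed(11)
x <- arima.sim(model = list(ar = c(0.5, -0.2)), n = 400)

# Cast an AR(2) model as a regression on lagged values
emb <- embed(as.numeric(x), 3)   # columns: x_t, x_{t-1}, x_{t-2}
df  <- data.frame(y = emb[, 1], y1 = emb[, 2], y2 = emb[, 3])

# RESET: add powers of the fitted values and test their significance
resettest(y ~ y1 + y2, power = 2:3, type = "fitted", data = df)
```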

F Tests

There are several variations on the F statistic used in the RESET test. Tsay mentions some of these, and you can probably think of other modifications of the basic ideas.

Threshold Test

There are various types of tests that could be constructed based on dividing the time series into different regimes. Simple approaches would be based on regimes that are separated by fixed time points. Other approaches could be based on regimes in which either the observed values or the fitted residuals appear to be different. Obviously, if the data are used to identify possible thresholds, the significance level of a test must take that fact into consideration. In general, the form of the test statistics, however they are constructed, is that of F statistics.

Time Series Models

I don't know of any other area of statistics that has as many different models as the time domain of time series analysis. Each model has its own name, and sometimes more than one name. The common linear models are of the AR and MA types. We combine AR and MA to get ARMA. Then we difference a time series to get an ARMA model, and call the complete model ARIMA. Next, we form AR and MA relations at multiples of longer lags; this yields seasonal ARMA and ARIMA models. Most of the linear models in the time domain are of these types. Then we have the nonlinear time series models.

Nonlinear Time Series

Four general types that are useful:

- models of squared quantities such as variances; these are often coupled with other models to allow stochastic volatility (ARIMA+GARCH, for example)
- bilinear models,
$$X_t = c + \sum_{i=1}^{p} \phi_i X_{t-i} - \sum_{j=1}^{q} \theta_j A_{t-j} + \sum_{i=1}^{m} \sum_{j=1}^{s} \beta_{ij} X_{t-i} A_{t-j} + A_t$$
- random coefficients models,
$$X_t = \sum_{i=1}^{p} \left(\phi_i + U_t^{(i)}\right) X_{t-i} + A_t$$
- threshold models

Tsay describes a number of these in Chapters 3 and 4. Another general source of nonlinearity is local fitting of a general model.
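
As a concrete illustration, a minimal sketch simulating a simple bilinear model; the coefficients are arbitrary choices, not from the text:

```r
set.seed(3)
n <- 500
a <- rnorm(n)
x <- numeric(n)

# Bilinear model: X_t = 0.5 X_{t-1} + 0.4 X_{t-1} A_{t-1} + A_t
for (t in 2:n) {
  x[t] <- 0.5 * x[t - 1] + 0.4 * x[t - 1] * a[t - 1] + a[t]
}

# A linear AR fit leaves structure in the squared residuals,
# which the McLeod-Li test picks up
res <- residuals(arima(x, order = c(1, 0, 0)))
Box.test(res^2, lag = 5, type = "Ljung-Box")
```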

Nonlinear Time Series

The area of nonlinear time series models is where the small modifications with their specific names really proliferate. First, we have the basic ones that account for conditional heteroscedasticity: ARCH and GARCH. Then the modifications (Chapter 3): IGARCH, GARCH-M, EGARCH, TGARCH (also GJR), CHARMA (or RCA), LMSV. Then further modifications (Chapter 4): TAR (similar to TGARCH, but for linear terms), SETAR ("self-exciting" TAR; the regime depends on a lagged value), STAR, MSA (or MSAR), NAAR. Other models are local regression models. Finally, we have algorithmic models, such as neural nets. I am not going to consider all of these little variations. The most common method of fitting these models is maximum likelihood. There are R functions for many of them.

Time Series Models

The names of the wide variety of time series models that evolved from the basic ARCH and GARCH models can be rather confusing. Some models go by different names. Tsay sometimes refers to the TGARCH(m, s) model as the GJR model (see p. 149). ("GJR" is not in the index of his book.) Most of the models that Tsay uses are special cases of the APARCH model of Ding, Granger, and Engle (1993). This model starts as the basic ARCH model,
$$A_t = \sigma_t \epsilon_t, \qquad (1)$$
and
$$\sigma_t^\delta = \alpha_0 + \sum_{i=1}^{m} \alpha_i \left(|A_{t-i}| - \gamma_i A_{t-i}\right)^\delta + \sum_{j=1}^{s} \beta_j \sigma_{t-j}^\delta. \qquad (2)$$
This model includes several of the other variations on the GARCH model.
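
A hedged sketch of fitting an APARCH model with garchFit from the fGarch package; the simulated returns and the (1, 1) order are placeholders:

```r
library(fGarch)

set.seed(5)
r <- rnorm(1000) * 0.01   # placeholder return series

# Fit an APARCH(1,1); the power delta and the leverage terms gamma_i
# in equation (2) are estimated along with the other parameters
fit <- garchFit(~ aparch(1, 1), data = r, include.delta = TRUE, trace = FALSE)
summary(fit)
```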

Transition or Threshold Models

A two-regime transition model is of the general form
$$X_t = \begin{cases} g_1(x_{t-1}, \ldots, x_{t-p}, a_{t-1}, \ldots, a_{t-q}; \beta_1) + A_t & \text{if condition 1} \\ g_2(x_{t-1}, \ldots, x_{t-p}, a_{t-1}, \ldots, a_{t-q}; \beta_2) + A_t & \text{otherwise.} \end{cases}$$
A threshold model usually depends on the past state, and so is of the general form
$$X_t = \begin{cases} g_1(x_{t-1}, \ldots, x_{t-p}, a_{t-1}, \ldots, a_{t-q}; \beta_1) + A_t & \text{if } (x_{t-1}, \ldots, x_{t-p}) \in R_1 \\ g_2(x_{t-1}, \ldots, x_{t-p}, a_{t-1}, \ldots, a_{t-q}; \beta_2) + A_t & \text{otherwise.} \end{cases}$$
For the specific case of an AR model, the $g_i$ functions above are linear functions of $x_{t-1}, \ldots, x_{t-p}$. Also, the condition $(x_{t-1}, \ldots, x_{t-p}) \in R_1$ is usually simplified to a simple form $x_{t-d} \in R_1$.
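
For instance, a minimal sketch simulating a two-regime SETAR(1) model with threshold 0 and delay d = 1; the coefficients are arbitrary illustrations:

```r
set.seed(9)
n <- 500
a <- rnorm(n)
x <- numeric(n)

# Two-regime SETAR(1): the AR coefficient depends on the sign of x_{t-1}
for (t in 2:n) {
  if (x[t - 1] <= 0) {
    x[t] <- -1.5 * x[t - 1] + a[t]   # regime 1
  } else {
    x[t] <-  0.5 * x[t - 1] + a[t]   # regime 2
  }
}
```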

Smooth Transition Autoregressive (STAR) Model

An obvious modification is to make the transition smooth by using a smooth weighting function. If the linear functions are AR relationships, we have a simple instance, namely, the STAR(p) model:
$$X_t = \phi_{1,0} + \sum_{i=1}^{p} \phi_{1,i} x_{t-i} + F\left(\frac{x_{t-d} - \Delta}{s}\right)\left(\phi_{2,0} + \sum_{i=1}^{p} \phi_{2,i} x_{t-i}\right) + A_t,$$
where Δ and s are location and scale parameters, and $F(\cdot)$ is a smooth function going from 0 to 1. Tsay gives an R function to fit a STAR model on page 186. I could not find this code anywhere, but if someone will key it in and send it to me, I'll post it.

Markov Switching Model

Another simple transition model in which the underlying components are AR is the Markov switching model (MSA). Here, the regime is chosen by a Markov process. The model for a two-state MSA, as before, is
$$X_t = \begin{cases} \phi_{1,0} + \sum_{i=1}^{p} \phi_{1,i} x_{t-i} + A_{1t} & \text{if state 1} \\ \phi_{2,0} + \sum_{i=1}^{p} \phi_{2,i} x_{t-i} + A_{2t} & \text{otherwise.} \end{cases}$$
All we need to specify are the transition probabilities $\Pr(S_t \mid s_{t-1})$. Fitting this is a little harder, but again, it can be done by maximum likelihood. The transition probabilities can be estimated by MCMC or by the EM algorithm.
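
A minimal sketch simulating a two-state Markov switching AR(1) model; the coefficients and transition probabilities are arbitrary:

```r
set.seed(13)
n <- 500
P <- matrix(c(0.95, 0.05,   # transition probabilities; row = current state
              0.10, 0.90), nrow = 2, byrow = TRUE)
phi   <- c(0.8, -0.3)       # AR(1) coefficient in each state
state <- numeric(n); state[1] <- 1
x     <- numeric(n)

for (t in 2:n) {
  state[t] <- sample(1:2, 1, prob = P[state[t - 1], ])
  x[t]     <- phi[state[t]] * x[t - 1] + rnorm(1)
}
```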

The following slides are preliminary versions of the material we will discuss on April 10. 26

Kernel Regression

Local regression is another type of nonlinear modeling. A simple form of local regression is to use a filter or kernel function to provide local weighting of the observed data. This approach ensures that, at a given point, the observations close to that point influence the estimate at the point more strongly than more distant observations do. A standard method in this approach is to convolve the observations with a unimodal function that decreases rapidly away from a central point. This function is the filter or the kernel. A kernel function has two arguments representing the two points in the convolution, but we typically use a single argument that represents the distance between the two points.

Choice of Kernels

Standard normal densities have the properties described above, so the kernel is often chosen to be the standard normal density. As it turns out, the kernel density estimator is not very sensitive to the form of the kernel. Although the kernel may be from a parametric family of distributions, in kernel density estimation we do not estimate those parameters; hence, the kernel method is a nonparametric method.

Choice of Kernels

Sometimes, a kernel with finite support is easier to work with. In the univariate case, a useful general form of a compact kernel is
$$K(t) = \kappa_{rs} \left(1 - |t|^r\right)^s I_{[-1,1]}(t),$$
where
$$\kappa_{rs} = \frac{r}{2\, B(1/r,\; s+1)}, \quad \text{for } r > 0,\ s \ge 0,$$
and $B(a, b)$ is the complete beta function.

Choice of Kernels

This general form leads to several simple specific cases:

- for r = 1 and s = 0, it is the rectangular or uniform kernel;
- for r = 1 and s = 1, it is the triangular kernel;
- for r = 2 and s = 1 ($\kappa_{rs} = 3/4$), it is the Epanechnikov kernel, which yields the optimal rate of convergence of the MISE;
- for r = 2 and s = 2 ($\kappa_{rs} = 15/16$), it is the biweight kernel.

If r = 2 and $s \to \infty$, we have the Gaussian kernel (with some rescaling).
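
A small sketch implementing this family in R; the function name kcompact is mine, not from the text:

```r
# Compact kernel K(t) = kappa_rs * (1 - |t|^r)^s on [-1, 1]
kcompact <- function(t, r = 2, s = 1) {
  kappa <- r / (2 * beta(1 / r, s + 1))   # normalizing constant kappa_rs
  ifelse(abs(t) <= 1, kappa * (1 - abs(t)^r)^s, 0)
}

# The Epanechnikov case (r = 2, s = 1) integrates to 1, approximately:
t <- seq(-1, 1, by = 0.01)
sum(kcompact(t, r = 2, s = 1)) * 0.01
```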

Kernel Methods

In kernel methods, the locality of influence is controlled by a window around the point of interest. The choice of the size of the window, or the bandwidth, is the most important issue in the use of kernel methods. In univariate applications, the window size is just a length, usually denoted by h (except maybe in time series applications). In practice, for a given choice of the size of the window, the argument of the kernel function is transformed to reflect the size. The transformation is accomplished using a positive definite matrix, V, whose determinant measures the volume (size) of the window.

Local Linear Regression

Use of the kernel function is simple. When least squares is the basic criterion, the kernel just becomes the weight.
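
A minimal sketch of this idea: a local linear fit at a point x0 is just weighted least squares with kernel weights. The Gaussian kernel, the bandwidth, and the test function are arbitrary choices:

```r
# Local linear fit at a single point x0 via kernel-weighted least squares
locallin <- function(x0, x, y, h = 0.5) {
  w   <- dnorm((x - x0) / h)              # kernel weights
  fit <- lm(y ~ I(x - x0), weights = w)   # weighted LS, centered at x0
  unname(coef(fit)[1])                    # intercept = estimate at x0
}

set.seed(2)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.3)
yhat <- sapply(seq(0, 10, by = 0.1), locallin, x = x, y = y)
```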

Choice of Bandwidth

There are two ways to choose a bandwidth. One is based on the mean integrated squared error (MISE). In this method, the MISE for an assumed model is determined, and then the bandwidth that minimizes it is chosen. The other method is data-based: we use cross-validation to determine the optimal bandwidth. In cross-validation, for a given bandwidth, we fit a model using all of the data except for a few points ("leave-out-d"), then determine the SSE over all of the data, each point being predicted from a fit that excludes it. We do this over a grid of bandwidths, and we do it multiple times ("k-fold cross-validation"). The best bandwidth is the one that minimizes the SSE (from all data).
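
A minimal leave-one-out cross-validation sketch for the bandwidth of a kernel (Nadaraya-Watson) regression estimator; the grid of bandwidths and the test data are arbitrary:

```r
# Leave-one-out SSE for a Nadaraya-Watson estimator with bandwidth h
cvsse <- function(h, x, y) {
  n   <- length(x)
  err <- numeric(n)
  for (i in 1:n) {
    w      <- dnorm((x[-i] - x[i]) / h)        # Gaussian kernel weights
    err[i] <- y[i] - sum(w * y[-i]) / sum(w)   # leave-one-out error
  }
  sum(err^2)
}

set.seed(4)
x <- runif(150, 0, 10)
y <- sin(x) + rnorm(150, sd = 0.3)

grid <- seq(0.1, 2, by = 0.1)
sse  <- sapply(grid, cvsse, x = x, y = y)
grid[which.min(sse)]   # CV choice of bandwidth
```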

Nonparametric Smoothing

Kernel methods may be parametric or nonparametric. In nonparametric methods, the kernels are generally simple. There are various methods, such as running medians or running (weighted) means. Running means are moving averages. The R function lowess does locally weighted smoothing using weighted running means. These methods are widely used for smoothing time series. The emphasis is on prediction, rather than model building.

General Additive Time Series Models

A model of the form
$$y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_m x_{mi} + \epsilon_i$$
can be generalized by replacing the constant (but unknown) coefficients by unknown functions (with specified forms):
$$y_i = f_1(x) x_{1i} + \cdots + f_m(x) x_{mi} + \epsilon_i.$$
Hastie and Tibshirani have written extensively on such models.

Neural Networks

When the emphasis is on prediction, we can form a black-box algorithm that accepts a set of input values x, combines them into intermediate values (in a "hidden layer"), and then combines the values of the hidden layer into a single output y. In a time series application, we have data $r_1, \ldots, r_n$, and for $i = k+1, \ldots, n$, we choose a subsequence $x_i = (r_{i-1}, \ldots, r_{i-k})$ as an input to produce an output $o_i$ as a predictor of $r_i$. We train the neural net so as to minimize $\sum_i (o_i - r_i)^2$. The R function nnet in the nnet package can be used to do this. See Appendix B of Chapter 4 of Tsay. Watch out for the assignment statements! Never write R or S-Plus code like that!
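
A hedged sketch of a one-hidden-layer net for one-step-ahead prediction with nnet; the lag length, layer size, and decay are arbitrary choices:

```r
library(nnet)

set.seed(8)
r <- as.numeric(arima.sim(model = list(ar = 0.5), n = 500))
k <- 3

# Inputs: the k previous values; target: the current value
emb <- embed(r, k + 1)
X   <- emb[, -1]
y   <- emb[, 1]

# One hidden layer with 4 units; linout = TRUE for a regression output
fit  <- nnet(X, y, size = 4, linout = TRUE, decay = 1e-3, trace = FALSE)
pred <- predict(fit, X)
mean((pred - y)^2)   # in-sample MSE of the one-step-ahead predictions
```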

Monte Carlo Forecasting

Monte Carlo methods can be used for forecasting in any time series model (a "parametric bootstrap"). At forecast origin t, we forecast at the horizon t + h by use of the fitted (or assumed) model and simulated errors (or "innovations"). Doing this many times, we get a sample of $\hat{r}^{(j)}_{t+h}$. The mean of this sample is the estimator, $\hat{r}_{t+h}$, and the sample quantiles provide confidence limits.
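
A minimal sketch for a fitted AR(1) model, resampling the model residuals as the simulated innovations; the horizon and replication count are arbitrary:

```r
set.seed(6)
r   <- as.numeric(arima.sim(model = list(ar = 0.7), n = 300))
fit <- arima(r, order = c(1, 0, 0))
phi <- unname(coef(fit)["ar1"])
mu  <- unname(coef(fit)["intercept"])
res <- as.numeric(residuals(fit))

h <- 5
B <- 2000
paths <- replicate(B, {
  x <- tail(r, 1)
  for (j in 1:h) x <- mu + phi * (x - mu) + sample(res, 1)
  x
})

mean(paths)                        # point forecast at horizon h
quantile(paths, c(0.025, 0.975))   # Monte Carlo forecast limits
```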

Fitting Time Series Models in R

There are a number of R functions that perform the computations to fit various time series models.

- ARMA / ARIMA: arima (stats)
- ARMA order determination: autofit (itsmr)
- ARMA + GARCH: garchFit (fGarch)
- APARCH: garchFit (fGarch)

The APARCH model includes the TGARCH and GJR models, among others; see equation (2). Also see the help page for fGarch-package in fGarch.

Other R Functions for Time Series Models

There are R functions for forecasting using different time series models that have been fitted.

- ARMA/ARIMA: predict.Arima (stats)
- APARCH (including ARMA + GARCH): predict (fGarch)

There are also R functions for simulating data from different time series models.

- ARMA/ARIMA: arima.sim (stats)
- APARCH (including ARMA + GARCH): garchSim (fGarch)