Nonlinear Autoregressive Processes with Optimal Properties

F. Blasques, S.J. Koopman, A. Lucas
VU University Amsterdam, Tinbergen Institute, CREATES
OxMetrics User Conference, September 2014, Cass Business School, London
Blasques, Koopman and Lucas: Nonlinear Autoregressive Processes

Motivation
Examples of nonlinear autoregressive (nonlinear AR) models:
Threshold AR (TAR): Tong (1983)
\[ y_t = \gamma_1 y_{t-1} + \gamma_2 I(y_{t-2} < \gamma_3)\, y_{t-1} + u_t \]
Smooth transition AR (STAR): Chan & Tong (1986) and Teräsvirta (1994)
\[ y_t = \gamma_4 x_{t-2}(\gamma_6)\, y_{t-1} + \gamma_5 [1 - x_{t-2}(\gamma_6)]\, y_{t-1} + u_t \]
where $\gamma_i$ is an unknown coefficient for $i = 1, \ldots, 6$, $I(\cdot)$ is an indicator function and $x_{t-2}(\gamma) = 1 / [1 + \exp(-\gamma y_{t-2})]$.
Both are examples of nonlinear AR(2) models.

Motivation
Nonlinear AR models can also be written as linear AR models with an observation-driven, time-varying temporal dependence:
Threshold AR (TAR): Tong (1983)
\[ y_t = \rho_t y_{t-1} + u_t, \qquad \rho_t = \gamma_1 + \gamma_2 I(y_{t-2} < \gamma_3). \]
Smooth transition AR (STAR): Chan & Tong (1986) and Teräsvirta (1994)
\[ y_t = \rho_t y_{t-1} + u_t, \qquad \rho_t = \gamma_4 x_{t-2}(\gamma_6) + \gamma_5 [1 - x_{t-2}(\gamma_6)], \]
where $\gamma_i$ is an unknown coefficient for $i = 1, \ldots, 6$, $I(\cdot)$ is an indicator function and $x_{t-2}(\gamma) = 1 / [1 + \exp(-\gamma y_{t-2})]$.
This is an interesting feature.
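As a small illustration of the time-varying coefficient form above, the following Python sketch evaluates $\rho_t$ for the TAR and STAR cases. The function names and the parameter values are hypothetical and chosen only for illustration; they are not taken from the slides.

```python
import math

def rho_tar(y_lag2, g1, g2, g3):
    """TAR coefficient: rho_t = g1 + g2 * I(y_{t-2} < g3)."""
    return g1 + g2 * (1.0 if y_lag2 < g3 else 0.0)

def rho_star(y_lag2, g4, g5, g6):
    """STAR coefficient: logistic mix of g4 and g5 driven by y_{t-2}."""
    x = 1.0 / (1.0 + math.exp(-g6 * y_lag2))
    return g4 * x + g5 * (1.0 - x)

# Hypothetical parameter values, for illustration only.
print(rho_tar(-1.0, 0.5, 0.3, 0.0))   # indicator is on: rho_t = g1 + g2
print(rho_tar(1.0, 0.5, 0.3, 0.0))    # indicator is off: rho_t = g1
print(rho_star(0.0, 0.9, 0.1, 2.0))   # x = 0.5: rho_t is the midpoint of g4 and g5
```

In both cases the observation equation is the same, $y_t = \rho_t y_{t-1} + u_t$; only the rule that maps $y_{t-2}$ into $\rho_t$ differs.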

Motivation
More generally, we can consider a general nonlinear AR model
\[ y_t = \phi(y^{t-1}; \theta) + u_t, \qquad u_t \sim p_u(\theta), \]
for some "conveniently selected" function $\phi(\cdot)$ of the infinite past $y^{t-1} := (y_{t-1}, y_{t-2}, \ldots)$ and parameter vector $\theta$.
The linear AR(1) model with time-varying dependence $\rho_t$ is
\[ y_t = \rho_t y_{t-1} + u_t, \qquad u_t \sim p_u(\theta), \qquad \rho_t = h(y^{t-1}; \theta). \]
When $h(y^{t-1}; \theta)$ is appropriately chosen, the two model equations are a.s. the same:
\[ h(y^{t-1}; \theta) = \phi(y^{t-1}; \theta) / y_{t-1}. \]
An alternative is to base the equivalence on (Taylor) expansions.

Motivation
What do we take from this equivalence?
Rather than focussing on "some" general function $\phi(y^{t-1}; \theta)$ in the nonlinear AR model $y_t = \phi(y^{t-1}; \theta) + u_t$, we consider the linear AR(1) model $y_t = \rho_t y_{t-1} + u_t$, where $\rho_t$ is an observation-driven time-varying coefficient.
This is, perhaps, more practical and empirically more convenient.
New nonlinear AR model formulations with empirical relevance may arise.

Score updating
How do we let the coefficient $\rho_t$ change in the AR(1) model?
We use the local information from the likelihood function to adapt the value of $\rho_t$.
For this purpose, we use the score function of the predictive log-density function at time $t$:
\[ \frac{\partial \log p(y_t \mid y^{t-1}; \theta)}{\partial \rho_t}. \]
For a given value of $\rho_{t-1}$, the score information is useful for determining the new value $\rho_t$ when the new observation $y_t$ becomes available.
This requires a specification of the predictive density function $p(y_t \mid y^{t-1}; \theta)$; we can depart from Gaussian assumptions.

Our paper
We present an observation-driven model specification for the time-varying dependence in autoregressive models.
For the AR(1) case we have
\[ y_t = h(f_t; \theta)\, y_{t-1} + u_t, \qquad u_t \sim p_u(u_t; \theta), \qquad f_t = \varphi(y^{t-1}, f^{t-1}; \theta), \]
where $h(\cdot)$ and $\varphi(\cdot)$ are fixed functions, both possibly depending on the fixed parameter vector $\theta$, with $x^t = \{x_t, x_{t-1}, x_{t-2}, \ldots\}$ for $x = f, y$.
The AR(1) model is general and flexible, but we need to specify $\varphi(y^{t-1}, f^{t-1}; \theta)$, $p_u(u_t; \theta)$ and $h(f_t; \theta)$.

Time-varying temporal dependence in AR(1) model
For the AR(1) model
\[ y_t = h(f_t; \theta)\, y_{t-1} + u_t, \qquad u_t \sim p_u(u_t; \theta), \qquad f_t = \varphi(y^{t-1}, f^{t-1}; \theta), \]
we take the linear updating equation
\[ \varphi(y^{t-1}, f^{t-1}; \theta) = \omega + \alpha s_{t-1} + \beta f_{t-1}, \qquad s_t = s(y_t, f_t; \theta), \]
where $\omega$, $\alpha$ and $\beta$ are fixed coefficients and $s_{t-1}$ is a deterministic function of past observations.
We take $s_t$ as the score function of the conditional or predictive log-density function of $y_t$,
\[ \log p(y_t \mid f_t, y^{t-1}; \theta) \equiv \log p_u(u_t; \theta), \qquad u_t = y_t - h(f_t; \theta)\, y_{t-1}, \]
with respect to $f_t$.

Time-varying temporal dependence in AR(1) model
Our time-varying temporal dependence AR(1) model is given by
\[ y_t = h(f_t; \theta)\, y_{t-1} + u_t, \qquad u_t \sim p_u(u_t; \theta), \qquad f_t = \omega + \alpha s(y_{t-1}, f_{t-1}; \theta) + \beta f_{t-1}, \]
with score function
\[ s_t \equiv s(y_t, f_t; \theta) = \frac{\partial \log p(y_t \mid f_t, y^{t-1}; \theta)}{\partial f_t}. \]
In the spirit of score models: Creal, Koopman & Lucas (2011, 2013) and Harvey (2013).
Why the score? It provides optimality properties in a Kullback-Leibler framework (see later).
But what about the choice of $p_u(u_t; \theta)$ and $h(f_t; \theta)$?

Time-varying temporal dependence in AR(1) model
Our time-varying temporal dependence AR(1) model is given by
\[ y_t = h(f_t; \theta)\, y_{t-1} + u_t, \qquad u_t \sim p_u(u_t; \theta), \qquad f_t = \omega + \alpha s_{t-1} + \beta f_{t-1}, \]
where the score function
\[ s_t = \frac{\partial \log p_u(u_t; \theta)}{\partial f_t} \]
depends on the choice of
- the link function $h(f_t; \theta)$: $f_t$ or $\mathrm{logit}(f_t)$;
- the error density $p_u(u_t; \theta)$: Normal or Student's t.
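To make the dependence of the score on $h$ and $p_u$ concrete, here is a minimal Python sketch that approximates $s_t$ by a central finite difference of $\log p_u(y_t - h(f_t) y_{t-1})$ with respect to $f_t$. The function names, the step size `eps` and the logistic form chosen for the second link are assumptions made for illustration; a numerical derivative is used here only as a check, and analytic scores would normally be preferred.

```python
import math

def numerical_score(log_density, link, y, y_prev, f, eps=1e-6):
    """Central-difference approximation to d log p_u(y - h(f) y_prev) / d f."""
    def loglik(f_val):
        return log_density(y - link(f_val) * y_prev)
    return (loglik(f + eps) - loglik(f - eps)) / (2.0 * eps)

def gaussian_logpdf(u, sigma2=1.0):
    """Log-density of N(0, sigma2) evaluated at u."""
    return -0.5 * math.log(2.0 * math.pi * sigma2) - 0.5 * u * u / sigma2

def identity_link(f):
    return f

def logistic_link(f):
    # Assumed form of the bounded link; keeps h(f) inside (0, 1).
    return 1.0 / (1.0 + math.exp(-f))

y, y_prev, f = 1.0, 2.0, 0.3   # hypothetical values
# Identity link: analytic score is (y - f * y_prev) * y_prev / sigma2.
print(numerical_score(gaussian_logpdf, identity_link, y, y_prev, f))
# Logistic link: the chain rule adds the factor h(f) * (1 - h(f)).
print(numerical_score(gaussian_logpdf, logistic_link, y, y_prev, f))
```

Swapping in another log-density or link changes the score, and hence the updating equation, without changing the rest of the recursion.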

Basic Model I
The linear Gaussian updating case is
\[ y_t = f_t y_{t-1} + u_t, \qquad u_t \sim N(0, \sigma_u^2), \qquad f_t = \omega + \alpha s_{t-1} + \beta f_{t-1}, \]
with score function
\[ s_t = \frac{\partial \left[ c - 0.5\, (y_t - f_t y_{t-1})^2 / \sigma_u^2 \right]}{\partial f_t} = (y_t - f_t y_{t-1})\, \frac{y_{t-1}}{\sigma_u^2} = \frac{u_t y_{t-1}}{\sigma_u^2}. \]
The updating equation for the time-varying autoregressive parameter is
\[ f_t = \omega + \alpha\, \frac{u_{t-1} y_{t-2}}{\sigma_u^2} + \beta f_{t-1}. \]
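The recursion of Model I can be sketched in a few lines of Python. The code below simulates a path and then recovers $f_t$ from the observations alone, which is what "observation-driven" means in practice. The function names, the start-up values (`y0`, zero initial score, $f$ started at $\omega / (1 - \beta)$) and the parameter values are assumptions for illustration, not choices documented in the slides.

```python
import random

def simulate_model_one(n, omega, alpha, beta, sigma2, y0=0.1, seed=0):
    """Simulate y_t = f_t y_{t-1} + u_t with the Gaussian score update for f_t."""
    rng = random.Random(seed)
    f = omega / (1.0 - beta)         # assumed start value (requires beta != 1)
    y_prev, score_prev = y0, 0.0     # arbitrary start-up values
    ys, fs = [], []
    for _ in range(n):
        f = omega + alpha * score_prev + beta * f    # f_t from s_{t-1} and f_{t-1}
        u = rng.gauss(0.0, sigma2 ** 0.5)
        y = f * y_prev + u
        score_prev = u * y_prev / sigma2             # s_t = u_t y_{t-1} / sigma_u^2
        ys.append(y)
        fs.append(f)
        y_prev = y
    return ys, fs

def filter_model_one(ys, omega, alpha, beta, sigma2, y0=0.1):
    """Recover the path of f_t from the observations and the parameters only."""
    f = omega / (1.0 - beta)
    y_prev, score_prev = y0, 0.0
    fs = []
    for y in ys:
        f = omega + alpha * score_prev + beta * f
        u = y - f * y_prev                           # implied error u_t
        score_prev = u * y_prev / sigma2
        fs.append(f)
        y_prev = y
    return fs

# Hypothetical parameter values, for illustration only.
ys, fs = simulate_model_one(200, omega=0.05, alpha=0.01, beta=0.9, sigma2=1.0)
fs_filtered = filter_model_one(ys, omega=0.05, alpha=0.01, beta=0.9, sigma2=1.0)
print(max(abs(a - b) for a, b in zip(fs, fs_filtered)))  # rounding error only
```

Because $f_t$ is a deterministic function of past observations, the filter reproduces the simulated path exactly (up to rounding) when it is given the same parameters and start-up values.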

Basic time-varying temporal dependence
We have the model
\[ y_t = f_t y_{t-1} + u_t, \qquad f_t = \omega + \alpha\, \frac{u_{t-1} y_{t-2}}{\sigma_u^2} + \beta f_{t-1}. \]
Interesting interpretation:
- the update of $f_t$ reacts to the error $u_{t-1}$ multiplied by $y_{t-2}$ and scaled by $1/\sigma_u^2$;
- the role of $y_{t-2}$ is to signal whether $f_t$ is below or above its mean;
- the update distinguishes the role of the observed past data from that of the past parameter value.
More interesting and more intricate updating equations arise for other choices of $p_u$ and $h$.

Models I, II and III
Model I is based on a Gaussian $p_u$ and the identity function $h(f) = f$.
Model II is based on a Gaussian $p_u$ and a logistic function for $h(f)$.
Model III is based on a Student's t $p_u$ and the identity function $h(f) = f$.

Basic time-varying temporal dependence
Figure: Updating for $f_t$ with $h(f) = f$ and $p_u$ either Normal or Student's t.
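The difference between the Normal and Student's t updates can be sketched numerically. The code below assumes the identity link and a Student's t density with $\nu$ degrees of freedom and scale $\sigma_u$, so that $\log p_u(u) = c - \tfrac{\nu + 1}{2} \log\{1 + u^2 / (\nu \sigma_u^2)\}$ and the score is $s_t = (\nu + 1)\, u_t y_{t-1} / (\nu \sigma_u^2 + u_t^2)$. This parameterization and the numerical values are assumptions for illustration; the slides do not state them.

```python
def gaussian_score(u, y_prev, sigma2):
    """Score of the Gaussian log-density with identity link: u * y_prev / sigma2."""
    return u * y_prev / sigma2

def student_t_score(u, y_prev, sigma2, nu):
    """Score of the (assumed) scaled Student's t log-density with identity link."""
    return (nu + 1.0) * u * y_prev / (nu * sigma2 + u * u)

# Hypothetical values: unit scale, nu = 5, y_{t-1} = 1.
for u in (0.1, 1.0, 10.0):
    print(u, gaussian_score(u, 1.0, 1.0), student_t_score(u, 1.0, 1.0, 5.0))
```

Under these assumptions the Gaussian score grows linearly in the error $u_t$, whereas the Student's t score is damped for large $|u_t|$, so a single outlier moves $f_t$ far less.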

Nonlinear AR specification
Our AR(1) model implies that
\[ h(f_t) = \frac{y_t - u_t}{y_{t-1}}. \]
For the identity function $h(f) = f$, we have $f_t = (y_t - u_t)\, y_{t-1}^{-1}$, and the score-driven updating equation becomes
\[ f_t = \omega + \alpha s_{t-1} + \beta\, \frac{y_{t-1} - u_{t-1}}{y_{t-2}}. \]
Substituting this expression into $y_t = f_t y_{t-1} + u_t$, we obtain
\[ y_t = \omega y_{t-1} + \alpha s_{t-1} y_{t-1} + \beta\, \frac{y_{t-1} - u_{t-1}}{y_{t-2}}\, y_{t-1} + u_t. \]

Nonlinear AR specification
Our time-varying temporal dependence Model I,
\[ y_t = f_t y_{t-1} + u_t, \qquad f_t = \omega + \alpha\, \frac{u_{t-1} y_{t-2}}{\sigma_u^2} + \beta f_{t-1}, \]
with $f_{t-1} = (y_{t-1} - u_{t-1})\, y_{t-2}^{-1}$, can be rewritten as
\[ y_t = \omega y_{t-1} + \alpha\, \frac{y_{t-1} y_{t-2} u_{t-1}}{\sigma_u^2} + \beta\, \frac{y_{t-1} - u_{t-1}}{y_{t-2}}\, y_{t-1} + u_t. \]
It is a nonlinear ARMA(2, 1)!
Similar results can be obtained for Models II and III, but the expressions become more intricate.
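The reduced form can be checked numerically against the recursive form. The sketch below generates $y_t$ from the Model I recursion and, at each step, also evaluates the nonlinear ARMA(2, 1) expression using only $y_{t-1}$, $y_{t-2}$, $u_{t-1}$ and $u_t$. The function name, the start-up values and the parameter values are hypothetical, and the check presumes $y_{t-2} \neq 0$.

```python
import random

def max_reduced_form_gap(n=200, omega=0.05, alpha=0.01, beta=0.9,
                         sigma2=1.0, seed=1):
    """Largest absolute gap between the recursive and reduced forms of y_t."""
    rng = random.Random(seed)
    sd = sigma2 ** 0.5
    y_lag2, f_lag1 = 0.3, 0.5                # arbitrary start-up values
    u_lag1 = rng.gauss(0.0, sd)
    y_lag1 = f_lag1 * y_lag2 + u_lag1        # so f_{t-1} = (y_{t-1} - u_{t-1}) / y_{t-2}
    gap = 0.0
    for _ in range(n):
        u = rng.gauss(0.0, sd)
        # Recursive form: update f_t, then generate y_t.
        f = omega + alpha * u_lag1 * y_lag2 / sigma2 + beta * f_lag1
        y_recursive = f * y_lag1 + u
        # Reduced form: no reference to f_t or f_{t-1}.
        y_reduced = (omega * y_lag1
                     + alpha * y_lag1 * y_lag2 * u_lag1 / sigma2
                     + beta * (y_lag1 - u_lag1) / y_lag2 * y_lag1
                     + u)
        gap = max(gap, abs(y_recursive - y_reduced))
        y_lag2, y_lag1, u_lag1, f_lag1 = y_lag1, y_recursive, u, f
    return gap

print(max_reduced_form_gap())   # differs from zero by rounding error only
```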

Nonlinear ARMA models
We have shown that our basic time-varying temporal dependence model is a nonlinear ARMA model.
But what is new? Nonlinear ARMA models have been formulated before!
Threshold AR: Tong (1991)
\[ y_t = \phi_t y_{t-1} + u_t, \qquad \phi_t = \phi + \phi^{*} I(y_{t-2} < \gamma), \]
Smooth Transition: Chan & Tong (1986), Teräsvirta (1994)
\[ y_t = \phi_t y_{t-1} + u_t, \qquad \phi_t = \gamma_1 x_{t-2} + \gamma_2 (1 - x_{t-2}), \]
where $x_t = [1 + \exp(-\gamma_3 y_t)]^{-1}$.

Comparison with TAR and STAR

Comparison with our basic model

Properties
- Score function: a familiar entity in econometrics with nice properties.
- Stationarity and ergodicity: conditions can be established for both $y_t$ and $f_t$.
- Maximum likelihood: conditions for consistency and asymptotic normality can be established.
- Optimality: updating with the score provides a step closer to the true path of the time-varying parameter; this is optimality in the Kullback-Leibler sense.
See Blasques, Koopman and Lucas (2014).

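Definition I: Realized KL Divergence
The KL divergence between $p(\cdot \mid f_t)$ and $\tilde{p}(\cdot \mid \tilde{f}_{t+1}; \theta)$ is given by
\[ D_{KL}\big(p(\cdot \mid f_t),\, \tilde{p}(\cdot \mid \tilde{f}_{t+1}; \theta)\big) = \int p(y_t \mid f_t)\, \ln \frac{p(y_t \mid f_t)}{\tilde{p}(y_t \mid \tilde{f}_{t+1}; \theta)}\, dy_t. \]
Here $p(\cdot \mid f_t)$ is the true conditional density and $\tilde{p}(\cdot \mid \tilde{f}_{t+1}; \theta)$ is the model density evaluated at the updated parameter.
The realized KL variation $\Delta^{t-1}_{RKL}$ of a parameter update from $\tilde{f}_t$ to $\tilde{f}_{t+1}$ is defined as
\[ \Delta^{t-1}_{RKL} = D_{KL}\big(p(\cdot \mid f_t),\, \tilde{p}(\cdot \mid \tilde{f}_{t+1}; \theta)\big) - D_{KL}\big(p(\cdot \mid f_t),\, \tilde{p}(\cdot \mid \tilde{f}_t; \theta)\big). \]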
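For intuition, the realized KL variation has a closed form in a simple special case. The Python sketch below assumes that both the true and the model density are Gaussian with the same variance $\sigma_u^2$ and conditional means $f\, y_{t-1}$, so that each KL divergence equals the squared difference in means divided by $2 \sigma_u^2$. This Gaussian setting, the function names and the numerical values are assumptions for illustration only.

```python
def kl_gaussian_same_variance(mean_true, mean_model, sigma2):
    """KL divergence between N(mean_true, sigma2) and N(mean_model, sigma2)."""
    return (mean_true - mean_model) ** 2 / (2.0 * sigma2)

def realized_kl_variation(f_true, f_old, f_new, y_prev, sigma2):
    """D_KL after the update minus D_KL before it; negative means an improvement."""
    d_new = kl_gaussian_same_variance(f_true * y_prev, f_new * y_prev, sigma2)
    d_old = kl_gaussian_same_variance(f_true * y_prev, f_old * y_prev, sigma2)
    return d_new - d_old

# Hypothetical values: the update moves the filtered value from 0.4 towards the true 0.6.
print(realized_kl_variation(f_true=0.6, f_old=0.4, f_new=0.5, y_prev=2.0, sigma2=1.0))
# Moving away from the true value increases the divergence instead.
print(realized_kl_variation(f_true=0.6, f_old=0.4, f_new=0.3, y_prev=2.0, sigma2=1.0))
```

In this special case the sign of the variation depends only on whether the updated value is closer to the true parameter than the previous one.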

Definition II: Conditionally Expected KL Divergence
An optimal updating scheme, while subject to randomness, should have a tendency to move in the correct direction: on average, the KL divergence should decrease.
The conditionally expected KL (CKL) variation of a parameter update from $\tilde{f}_t \in \tilde{F}$ to $\tilde{f}_{t+1} \in \tilde{F}$ is given by
\[ \Delta^{t-1}_{CKL} = \int_{\tilde{F}} q(\tilde{f}_{t+1} \mid \tilde{f}_t, f_t; \theta) \left[ \int_{Y} p(y \mid f_t)\, \ln \frac{\tilde{p}(y \mid \tilde{f}_t; \theta)}{\tilde{p}(y \mid \tilde{f}_{t+1}; \theta)}\, dy \right] d\tilde{f}_{t+1}, \]
where $q(\tilde{f}_{t+1} \mid \tilde{f}_t, f_t; \theta)$ denotes the density of $\tilde{f}_{t+1}$ conditional on both $\tilde{f}_t$ and $f_t$.
For a given $p_t$, an update is CKL optimal if and only if $\Delta^{t-1}_{CKL} \le 0$.

Condition for RKL
For our basic time-varying temporal dependence model
\[ y_t = f_t y_{t-1} + u_t, \qquad f_t = \omega + \alpha\, \frac{u_{t-1} y_{t-2}}{\sigma_u^2} + \beta f_{t-1}, \]
we obtain RKL optimality under the condition
\[ \alpha > \sigma_u^2\, \frac{\omega + (\beta - 1) f_t}{(y_{t-1} - f_{t-1} y_{t-2})\, y_{t-2}}. \]
The new score information should have a locally sufficient impact on the update of $f_t$.
A similar but different condition is derived for CKL optimality.

Empirical illustration: Unemployment Insurance Claims
We analyze the growth rate of US seasonally adjusted weekly Unemployment Insurance Claims (UIC) over roughly the last five decades.
Meyer (1995), Anderson & Meyer (1997, 2000), Hopenhayn & Nicolini (1997) and Ashenfelter (2005) have studied the UIC series.
The importance of forecasting UIC has been highlighted by Gavin & Kliesen (2002): UIC is a leading indicator of several labor market conditions and can be used to forecast GDP growth rates.
Here we consider various models and compare them.

Empirical illustration
Unemployment Insurance Claims: Model Comparison
        Model I     TAR    STAR   AR(2)   AR(5)
LL         6744    6736    6737    6439    6968
AIC      -13478  -13462  -13464  -12870  -13921
RMSE      0.750   0.752   0.752   0.848    1.20

Empirical results

Conclusions
We have introduced time-varying temporal dependence in the AR(1) model
\[ y_t = f_t y_{t-1} + u_t, \qquad f_t = \omega + \alpha\, \frac{u_{t-1} y_{t-2}}{\sigma_u^2} + \beta f_{t-1}, \]
- an observation-driven approach to a time-varying autoregressive coefficient: the GAS model is effective!
- the reduced form is a nonlinear ARMA model
- the models can be compared with TAR and STAR models
- the filtered estimate $f_t$ has optimality properties in the KL sense when based on the score function!
- we provide some Monte Carlo evidence
- an empirical illustration for UIC is presented