The Hodrick-Prescott Filter

Similar documents
Saddlepoint-Based Bootstrap Inference in Dependent Data Settings

Likelihood Ratio Tests. that Certain Variance Components Are Zero. Ciprian M. Crainiceanu. Department of Statistical Science

Exact Likelihood Ratio Tests for Penalized Splines

Andreas Blöchl: Trend Estimation with Penalized Splines as Mixed Models for Series with Structural Breaks

Nonparametric Small Area Estimation Using Penalized Spline Regression

Math 494: Mathematical Statistics

Restricted Likelihood Ratio Tests in Nonparametric Longitudinal Models

Some properties of Likelihood Ratio Tests in Linear Mixed Models

Penalized Splines, Mixed Models, and Recent Large-Sample Results

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Model Selection and Geometry

Bootstrap. Director of Center for Astrostatistics. G. Jogesh Babu. Penn State University babu.

Better Bootstrap Confidence Intervals

High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data

Now consider the case where E(Y) = µ = Xβ and V (Y) = σ 2 G, where G is diagonal, but unknown.

Integrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University

Independent and conditionally independent counterfactual distributions

Two Applications of Nonparametric Regression in Survey Estimation

Bayesian estimation of the discrepancy with misspecified parametric models

Some Theories about Backfitting Algorithm for Varying Coefficient Partially Linear Model

TREND ESTIMATION AND THE HODRICK-PRESCOTT FILTER

9. Model Selection. statistical models. overview of model selection. information criteria. goodness-of-fit measures

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω

Simultaneous Confidence Bands for the Coefficient Function in Functional Regression

interval forecasting

Spatially Adaptive Smoothing Splines

Hypothesis Testing in Smoothing Spline Models

Likelihood ratio testing for zero variance components in linear mixed models

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon

Testing Restrictions and Comparing Models

RIDGE REGRESSION REPRESENTATIONS OF THE GENERALIZED HODRICK-PRESCOTT FILTER

Chapter 3: Regression Methods for Trends

Issues on quantile autoregression

AR, MA and ARMA models

1 General problem. 2 Terminalogy. Estimation. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ).

Econ 424 Time Series Concepts

REGRESSION WITH SPATIALLY MISALIGNED DATA. Lisa Madsen Oregon State University David Ruppert Cornell University

Model comparison and selection

ST495: Survival Analysis: Hypothesis testing and confidence intervals

Research Projects. Hanxiang Peng. March 4, Department of Mathematical Sciences Indiana University-Purdue University at Indianapolis

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30

Nonparametric Small Area Estimation via M-quantile Regression using Penalized Splines

Statistics of stochastic processes

Time Series and Forecasting Lecture 4 NonLinear Time Series

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models

Estimation of cumulative distribution function with spline functions

Terminology Suppose we have N observations {x(n)} N 1. Estimators as Random Variables. {x(n)} N 1

Can we do statistical inference in a non-asymptotic way? 1

11. Bootstrap Methods

Statistics 203: Introduction to Regression and Analysis of Variance Course review

Wavelet Methods for Time Series Analysis. Part IV: Wavelet-Based Decorrelation of Time Series

One-stage dose-response meta-analysis

Lecture 3: Statistical sampling uncertainty

Econometric Forecasting

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.

RLRsim: Testing for Random Effects or Nonparametric Regression Functions in Additive Mixed Models

Likelihood-based inference with missing data under missing-at-random

Nonparametric Small Area Estimation Using Penalized Spline Regression

Regularization in Cox Frailty Models

Empirical Market Microstructure Analysis (EMMA)

STAT 512 sp 2018 Summary Sheet

AFT Models and Empirical Likelihood

Econometrics of Panel Data

1 Mixed effect models and longitudinal data analysis

Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions

Econometrics of Panel Data

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

What s New in Econometrics. Lecture 13

10. Time series regression and forecasting

A NOTE ON A NONPARAMETRIC REGRESSION TEST THROUGH PENALIZED SPLINES

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Illustration of the Varying Coefficient Model for Analyses the Tree Growth from the Age and Space Perspectives

STAT Financial Time Series

The Restricted Likelihood Ratio Test at the Boundary in Autoregressive Series

Uncertainty Quantification for Inverse Problems. November 7, 2011

Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D.

Bootstrap and Parametric Inference: Successes and Challenges

Prof. Dr. Roland Füss Lecture Series in Applied Econometrics Summer Term Introduction to Time Series Analysis

COMS 4721: Machine Learning for Data Science Lecture 1, 1/17/2017

COMPARING PARAMETRIC AND SEMIPARAMETRIC ERROR CORRECTION MODELS FOR ESTIMATION OF LONG RUN EQUILIBRIUM BETWEEN EXPORTS AND IMPORTS

Masters Comprehensive Examination Department of Statistics, University of Florida

STAT 443 Final Exam Review. 1 Basic Definitions. 2 Statistical Tests. L A TEXer: W. Kong

Generalized Linear Models. Kurt Hornik

Nonparametric Tests for Multi-parameter M-estimators

A Diagnostic for Seasonality Based Upon Autoregressive Roots

Do Markov-Switching Models Capture Nonlinearities in the Data? Tests using Nonparametric Methods

Statistics 910, #5 1. Regression Methods

Econ 623 Econometrics II Topic 2: Stationary Time Series

Robustness to Parametric Assumptions in Missing Data Models

Alternative implementations of Monte Carlo EM algorithms for likelihood inferences

Nonparametric Bayesian Methods - Lecture I

MIT Spring 2015

Review of Statistics

Transcription:

The Hodrick-Prescott Filter A Special Case of Penalized Spline Smoothing Alex Trindade Dept. of Mathematics & Statistics, Texas Tech University Joint work with Rob Paige, Missouri University of Science and Technology Funded in part by the National Security Agency Grants: H98230-09-1-0071 (Paige) & H98230-08-1-0071 (Trindade) June 2012 SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University JuneJoint 2012work with 1 / 38 Rob

Outline 1 Motivation Examples: the need for a semiparametric smoothing approach... 2 Semiparametric Models Partially linear models & penalized spline models Penalized splines as linear mixed models (LMMs) 3 Result #2: Estimation of Smoothing Parameter Estimators as roots of quadratic estimating equations (QEEs) Saddlepoint-based bootstrap (SPBB) inference for QEEs Exact ML & REML Inference in LMMs 4 Result #1: The Hodrick-Prescott Filter (HPF) HPF solution coincides with linear penalized spline 5 Examples and Simulations U.S. Gross National Product (GNP) Returns of GNP SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University JuneJoint 2012work with 2 / 38 Rob

The LIDAR Data (Ruppert, Wand, & Carroll, 2003) Model: y = µ(x) + error. Goal: estimate mean function µ(x), i.e. smooth data. log ratio 0.8 0.6 0.4 0.2 0.0 400 450 500 550 600 650 700 range SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University JuneJoint 2012work with 3 / 38 Rob

Standard Parametric Approaches Fail Polynomial regression: µ(x) = β 0 + β 1 x + + β p x p p = 3, 4: don t capture sharp drop (x 600) p = 10: µ(x) too wiggly; µ (x) too variable (and changes sign!) Nonlinear regression, e.g. µ(x) = β 0 e β 1x, requires thinking about correct (nonlinear) relationship... Can often transform x and/or y such that get linear relationship, e.g. take logs above: log µ(x) = log(β 0 ) + β 1 x No transform here seems to help in linearizing (and stabilizing variance of) relationship. SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University JuneJoint 2012work with 4 / 38 Rob

The Fossil Data (Ruppert, Wand, & Carroll, 2003) strontium ratio 0.70720 0.70730 0.70740 0.70750 dip here? obvious dip! 95 100 105 110 115 120 age (millions of years) SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University JuneJoint 2012work with 5 / 38 Rob

U.S.A. Detrended Log(GNP) Quarterly Data (Hodrick & Prescott, 1997) A cycle of long duration? 0.15 0.10 0.05 0.00 0.05 0.10 0.15 1950 1960 1970 1980 1990 Time SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University JuneJoint 2012work with 6 / 38 Rob

A Simple Spline Model (LIDAR Data) Linear spline model with knots at κ 1 = 575 and κ 2 = 600 is promising µ(x) = β 0 + β 1 x + u 1 (x 575) + + u 2 (x 600) + Piecewise linear function, called spline (thin strip of flexible wood). A spline model is a flexible model. log ratio 0.8 0.6 0.4 0.2 0.0 400 450 500 550 600 650 700 range SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University JuneJoint 2012work with 7 / 38 Rob

Polynomial Spline Models In general, have polynomial spline model of degree p with K knots µ(x) = β 0 + β 1 x + + β p x p + K k=1 u k (x κ k ) p + Given pairs of obs (x i, y i ), i = 1,..., n, can write mean in matrix form µ = X β + Zu Bθ Can also add vector of covariates w entering mean linearly µ = W γ }{{} + }{{} Bθ linear part nonlinear part Model can allow for autocorrelation, R, in residuals (e.g. time series): y = µ + ε, ε (0, σ 2 ε R) SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University JuneJoint 2012work with 8 / 38 Rob

Penalized Spline Models Polynomial spline model Estimate θ by minimizing µ = X β + Zu Bθ ˆθ PS = arg min θ { (y Bθ) R 1 (y Bθ) + αθ Dθ } D: a non-negative definite penalty matrix (usually θ Dθ = u u) α is a smoothing parameter controlling balance between: LIDAR: linear spline fits with max and min smoothing (24 knots) fidelity to data (α = 0) smoothness of fit (α = ) log ratio 0.8 0.6 0.4 0.2 0.0 α = 0 α = 400 450 500 550 600 650 700 range SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University JuneJoint 2012work with 9 / 38 Rob

Linear Mixed Model (LMM) Formulation With assumptions: [ ] u E = ε [ ] 0, Cov 0 [ ] [ ] u σ 2 = u I K 0 ε 0 σε 2 R Penalized spline can be recast as LMM with one variance component (Brumback, Ruppert, & Wand, 1999) y = X β }{{} + Zu }{{ + ε } fixed effects random effects BLUP of y in this context is ỹ = B θ, where θ = arg min θ { (y Bθ) R 1 (y Bθ) + σ2 ε σu 2 u u }. SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work10 with / 38 Rob

Smoothing Parameter is BLUP in LMM Formulation Define: α = σε 2 /σu 2 Let D be zero with I K as lower diagonal block: [ ] 0 0 D = J 0 I p+1,k = θ Dθ = u u K We obtain that: θ = ˆθ PS Fitted penalized spline for mean of y coincides with BLUP of y in LMM defined above. Implies BLUP-optimal value for α is: α = σ 2 ε /σ 2 u SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work11 with / 38 Rob

Estimation of Smoothing Parameter Since α is ratio of variance components in LMM many methods available in this context; typically require normal errors. Call these parametric. (SAS: proc mixed; R: lme.) Methods that do not require LMM representation and specification of error distribution: nonparametric. Examples (Parametric) Maximum Likelihood (ML) REstricted Maximum Likelihood (REML) Examples (Nonparametric) Akaike s Information Criterion (AIC) Generalized Cross-Validation (GCV) SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work12 with / 38 Rob

A Unified View of Smoothing Parameter Estimators (New) Above estimators can be viewed as roots of a quadratic estimating equation (QEE) in normal random variables Q(α) = y A α y The n n matrix A α has a (complicated, but) closed-form expression in each case... Example (A α for GCV estimator) Define: S α = B ( B R 1 B + αd ) 1 B R 1, and Ṡ α = S α / α. Then: A α = (I n S α )Ṡ α {1 Tr(S α )/n} (I n S α ) 2 Tr(S α )/n SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work13 with / 38 Rob

Example of QEE Based Inference: REML Example (A α for REML estimator) Assume: y u N(X β + Zu, σε 2 R), u N(0, σui 2 K ) Differentiation of restricted log-likelihood in β and σε 2 leads to profile REML(α, R) criterion, and A α =... too long to fit here... Krivobokova & Kauermann (2007): REML less sensitive to misspecification of residual correlation than AIC or GCV. Inspect ACF/PACF plots for ideas on what R should be... SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work14 with / 38 Rob

Saddlepoint-Based Bootstrap (SPBB) Inference for QEEs Technique pioneered by Paige, Trindade, & Fernando (2009): Assume y N(µ, Σ); want confidence interval (CI) for scalar α from Q(α). Estimator ˆα is (appropriate) root of Q(α). Q(α) usually obtained by differentiation of (profile) log likelihood (nuisance parameters replaced by conditional MLEs). Can express MGF of Q(α) in closed-form. Use MGF to saddlepoint approximate CDF of Q(α). Under monotonicity of Q(α) in α, relate saddlepoint approximations: CDF of Q(α) to CDF of ˆα. CI (α L, α U ) obtained by pivoting CDF of ˆα (inverting 2-sided test). SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work15 with / 38 Rob

SPBB Details: Slide 1 Approx (1 ζ)100% CI obtained by solving: Fˆα (ˆαobs α L, ˆθ(α L ), ˆσ 2 ε (α L ) ) = 1 ζ/2 Fˆα (ˆαobs α U, ˆθ(α U ), ˆσ 2 ε (α U ) ) = ζ/2 2nd order accurate: coverage prob is (1 ζ) + O(n 1 ) (DiCiccio & Romano, 1995). Fˆα ( ) is difficult to compute..., but if Q(α) is monotone (e.g. decreasing) in α: (Daniels, 1983). Fˆα (α) = P(ˆα α) = P(Q(α) 0) = F Q (0) F Q(α) ( ) also difficult to compute..., but have closed-form expression for its MGF (linear combo of indep noncentral χ 2 r.v. s). SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work16 with / 38 Rob

SPBB Details: Slide 2 Saddlepoint approximate CDF of Q(α): F Q(α) ( ) ˆF Q(α) ( ) (Lugannani & Rice, 1980). Saddlepoint approximations extremely accurate (3rd order accurate over sets of bounded central tendency), e.g. Butler (2007). (1 ζ)100% CI obtained by solving (numerically over a grid): ( ˆF Q(ˆαobs ) 0 αl, ˆθ(α L ), ˆσ ε 2 (α L ) ) = 1 ζ/2 ( ˆF Q(ˆαobs ) 0 αu, ˆθ(α U ), ˆσ ε 2 (α U ) ) = ζ/2 Still retain 2nd order accuracy in coverage prob (Paige, Trindade, & Fernando, 2009). SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work17 with / 38 Rob

SPBB Summary Method is similar to a parametric bootstrap: Use plug-in estimator ˆα; Usually sample from bootstrap distribution Fˆα ( ) via Monte Carlo simulation; BUT, replace simulation with saddlepoint approximation of Fˆα ( ). ( Bootstrap is in the name, but there s no re-sampling...) What is a saddlepoint approximation? Accurate method to approx CDF & PDF of X from its MGF, M X ( ). Alternative to Fourier inversion of characteristic function (FFT). Most computationally expensive step: for each x X, solve for ŝ log M X (s) s = x s=ŝ Original paper: Daniels (1954). Recent book: Butler (2007). SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work18 with / 38 Rob

Graphical Overview of SPBB (Result #2) Intractable! (Except via parametric bootstrap... slow.) Fˆα (ˆα obs ) (α L, α U ) ˆα solves pivot Q(α) = 0 ˆF Q(ˆαobs )(0) Q(α) monotone Fˆα (ˆα obs ) = F Q(ˆαobs )(0) saddlepoint approx via MGF of Q(α) SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work19 with / 38 Rob

Exact ML & REML Inference for α Exact finite sample inference for α = σ 2 ε /σ 2 u in LMMs with one variance component (Crainiceanu, Ruppert, Claeskens, & Wand, 2005): Note: asymptotic χ 2 dist is poor approx in finite samples due to substantial point mass at 0 (Crainiceanu & Ruppert, 2004). Assume y N(µ, Σ). Invert (restricted) likelihood ratio test corresponding to (REML-based) ML-based inference. Grid search needed to locate endpoints of CI (α L, α U ): At each grid value of α, use fast algorithm to draw large sample from H 0 distribution of RLRT/LRT statistic in order to (empirically) calculate p-value(α) p-value(α L ) = 0.95 and p-value(α U ) = 0.95 SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work20 with / 38 Rob

Comparison of SPBB and Exact Method Exact y normal. α: ratio of vars in LMM. Dataset-specific (needs manual tuning). LIDAR: 4 hours/ci. Performance: exact. Estimators: ML & REML only. SPBB y normal. α: root of QEE. QEE-specific (needs derivatives of QEE). LIDAR: 10 mins/ci. Performance: near-exact. Estimators: many (ML, REML, AIC, GCV, etc.) Fact SPBB is able to furnish a CI when no other computationally feasible method, exact or otherwise, exists, e.g. AIC & GCV. SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work21 with / 38 Rob

The Hodrick-Prescott Filter (Hodrick & Prescott, 1997) Smoothing technique for series y t : viewed as sum of trend µ t and residual component ε t y t = µ t + ε t, t = 1,..., n For choice of smoothing parameter α, trend fitted by minimizing weighted sum of squares ŷ HP = arg min µ R n n t=1 balance (y t µ t ) 2 {}}{ + α }{{} fidelity to data n t=1 (µ t 2µ t 1 + µ t 2 ) 2 }{{} smoothness of fit = (I n + α 2 2 ) 1 y, 2 = function({0, 1, 2}) SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work22 with / 38 Rob

HPF is Linear Penalized Spline (Result #1) Theorem Consider penalized spline model of degree p = 1, explanatory variable time x t = t, K = n 2 equi-spaced knots at 2,..., n 1, and uncorrelated residuals R = I n : n 1 y t = β 0 + β 1 t + u k (t k) + + ε t, t = 1,..., n. k=2 Then HPF solution coincides with penalized spline solution: ŷ PS = B PS (B PS B PS + αj 2,n 2 ) 1 B PS y = (I n + α 2 2 ) 1 y = ŷ HP SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work23 with / 38 Rob

HPF Coincides With Linear Penalized Spline: Proof Proof. HPF can be formulated as LMM (Schlicht, 2005): y = X HP β + Z HP u + ε B HP θ + ε where Z HP = 2 ( 2 2 ) 1, and X HP is any n 2 matrix satisfying: 2 X HP = 0, and X HP X HP = I 2 LMM formulation implies HPF solution is BLUP of y: ŷ HP = B HP (B HP B HP + αj 2,n 2 ) 1 B HP y SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work24 with / 38 Rob

Proof (continued...) Defining L = B 1 PS B HP, transform basis functions (but preserve degree and knot locations): ŷ HP = B HP [B HP B HP + αj 2,n 2 ] 1 B HP [ ( y = B PS L L B PS B PS + αl T J 2,n 2 L 1) 1 L] L B PS y = B PS [B PS B PS + αl T J 2,n 2 L 1] 1 B PS y =... = B PS (B PS B PS + αj 2,n 2 ) 1 B PS y = ŷ PS Equivalence follows if we can show in step above that L T J 2,n 2 L 1 = J 2,n 2 Decomposing L 1 into block structure, result follows...

Remarks Schlicht (2005): calls LMM representation of HPF a stochastic model, unaware that it is in fact a LMM; proposes maximum (Gaussian) likelihood estimator (MLE) for α; proposes also a method of moments estimate. Successive econometrics papers have built on this; propose new estimators e.g. Dermoune, Djehiche, and Rahmania (2008). But, there is already a vast existing body of work on the estimation of fixed and random components in LMMs... e.g. Christensen (1996). Fact LMM formulation of penalized spline endows it, and therefore HPF, with rich variety of existing estimation methods, e.g. REML (a parametric choice) and GCV (a nonparametric choice). SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work26 with / 38 Rob

Example 1: U.S. Gross National Product (GNP) Quarterly (seasonally adjusted) GNP for period 1947.1 to 1993.4 (billions of dollars). Have n = 189 pairs of (t, y t ) values: y t = log GNP, t = time (quarter since 1947.1). Strong linearly increasing trend, remove it! Hodrick & Prescott (1997): smooth via HPF with choice α = 1, 600 advocated for quarterly data (the HPF smooth). Theorem: HPF equivalent to 187-knot (K = 187) linear (p = 1) penalized spline model with uncorrelated residuals (R = I n )... Alternatives to HPF α = 1, 600 choice: REML and GCV. SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work27 with / 38 Rob

Example 2: Log Returns of GNP Analyze also first differences, {y t y t 1 } of log GNP (log returns), corresponds to a growth rate. Have n = 188 pairs of (t, y t ) values. Again compare smooths: HPF, REML, GCV. Theorem: HPF equivalent to 186-knot (K = 186) linear (p = 1) penalized spline modell with uncorrelated residuals (R = I n ). SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work28 with / 38 Rob

Log GNP and Log Returns of GNP Data Detrended Log GNP Log Returns of GNP 0.15 0.00 0.10 1950 1960 1970 1980 1990 Time 0.02 0.02 0.06 1950 1960 1970 1980 1990 Time ACF of Detrended Log GNP ACF of Log Returns of GNP 1.0 0.0 0.5 1.0 1.0 0.0 0.5 1.0 0 10 20 30 40 Lag 0 10 20 30 40 Lag SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work29 with / 38 Rob

Results: Fitted Models Log GNP: ACF/PACF of REML-IID & GCV-IID (R = I n ) residuals suggests AR(3) autocorrelation structure... so fit REML-AR(3). Log Returns GNP: REML-IID & GCV-IID (R = I n ) residuals white. Dataset Method α 95% Estimate (lo, pt, hi) Log GNP Log Returns GNP HPF (?, 1600,?) Exact-REML-IID (0.07, 0.16, 0.32) SPBB-GCV-IID (0.04, 0.15, 0.59) SPBB-REML-AR(3) (97.6, 337.3, 871.4) HPF (?, 1600,?) Exact-REML-IID (8, 595, 65, 302, 330, 436) SPBB-REML-IID (5, 572, 65, 302, 463, 185) SPBB-GCV-IID (6, 022, 79, 164, ) SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work30 with / 38 Rob

Smoothed Log GNP: REML & GCV with R = I n underfit... Detrended Log GNP 0.15 0.10 0.05 0.00 0.05 0.10 0.15 HPF REML IID & GCV IID 0 50 100 150 Quarter since 1947 SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work31 with / 38 Rob

Smoothed Log GNP: REML-AR(3) coincides with HPF! Detrended Log GNP 0.15 0.10 0.05 0.00 0.05 0.10 0.15 HPF REML AR(3) 0 50 100 150 Quarter since 1947 SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work32 with / 38 Rob

Smoothed Returns of Log GNP: HPF underfits? Log Returns of GNP 0.02 0.00 0.02 0.04 0.06 REML IID & GCV IID HPF 0 50 100 150 Quarter since 1947 SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work33 with / 38 Rob

Simulations: Performance of Exact vs. SPBB CI s for α 1,000 reps from REML-fitted model to Log Returns of GNP. Performance of SPBB is near-exact! SPBB: only method to get CI in case of GCV or REML (non-iid). Estimation Median Underage Coverage Overage Method CI Length Prob. Prob. Prob. Exact-REML 374,800 0.000 0.981 0.019 SPBB-REML 466,300 0.023 0.977 0.000 SPBB-GCV 0.016 0.984 0.000 SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work34 with / 38 Rob

Summary Avoid blind usage of HPF: equivalent to linear penalized spline with equi-spaced knots and uncorrelated residuals. Advocate structured penalized spline approach: pay attention to degree of polynomial (p), number of knots (K), residual autocorrelaton (R). Use data-driven method for estimation: REML is less sensitive to misspecification of R than GCV or AIC. Use SPBB for α CI s: if desired; gives range of plausible smooths. SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work35 with / 38 Rob

Application: The Fossil Data strontium ratio 0.70720 0.70730 0.70740 0.70750 Point and 95% Interval Smoothing Estimates REML & GCV point estimate (solid) Exact REML lower bound (dash) Exact REML upper bound (dot) SPBB GCV lower bound (dot dash) SPBB GCV upper bound (long dash) 95 100 105 110 115 120 age (millions of years) SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work36 with / 38 Rob

Acknowledgements Reviewers: valuable suggestions. Ciprian Crainiceanu: exact ML & REML code. High Performance Computing Center, Texas Tech University: supercomputing. National Security Agency: not having to teach in summer. SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work37 with / 38 Rob

Key References Crainiceanu, C., Ruppert, D., Claeskens, G., and Wand, M. (2005), Exact likelihood ratio tests for penalized splines, Biometrika, 92, 91-103. Hodrick, R.J., and Prescott, E.C. (1997), Postwar U.S. business cycles: an empirical investigation, J. Money, Credit, and Banking, 29, 1-16. Krivobokova, T., and Kauermann, G. (2007), A Note on Penalized Spline Smoothing with Correlated Errors, J. Amer. Statist. Assoc., 102, 1328-1337. Paige, R.L., Trindade, A.A. and Fernando, P.H. (2009), Saddlepoint-based bootstrap inference for quadratic estimating equations, Scand. J. Stat., 36, 98-111. Paige, R.L. and Trindade, A.A. (2010), The Hodrick-Prescott Filter: A Special Case of Penalized Spline Smoothing, Electronic J. Statist., 4, 856-874. Paige, R.L., and Trindade, A.A. (2012), Fast and Accurate Inference for the Smoothing Parameter in Semiparametric Models, Austral. N. Zeal. J. Statist., (to appear). Ruppert, D., Wand, M.P. and Carroll, R.J. (2003), Semiparametric Regression, London: Cambridge. Schlicht, E. (2005), Estimating the smoothing parameter in the so-called Hodrick-Prescott filter, J. Japan Statist. Soc., 35, 99-119. SPBB (Dept. Inference of Mathematics for HPF & Statistics, Texas Tech University June 2012 Joint work38 with / 38 Rob